Metadata-Version: 2.1
Name: qrl-graph
Version: 0.0.10
Summary: Reinforcement Learning algorithms for quantum speedup in graphs
Home-page: https://github.com/JiahaoYao/QRL_graph
Author: Jimmy
Author-email: jiahaoyao.math@gmail.com
License: MIT
Keywords: quantum computing,variational quantum algorithms,quantum machine learning,Reinforcement Learning,Quspin,Graph
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.6
Description-Content-Type: text/markdown
License-File: LICENSE

# QRL_graph
Reinforcement Learning for quantum speedup on graphs

Given a graph, we compute the classical and quantum critical times. The critical time is defined as the hitting time of the endpoint, i.e. the first time the walker is found at the endpoint with probability greater than $p_0$.
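
The internals of the `Graph` environment are not shown in this README; as a rough illustration of the definition only, the sketch below evolves a continuous-time walk generated by the graph Laplacian and scans for the first time the endpoint probability exceeds $p_0$. The function name, dynamics conventions, and step sizes are assumptions, not the package's actual implementation.

```python
import numpy as np
from scipy.linalg import expm

def critical_time(L, start, end, p0, quantum=False, t_max=10.0, dt=0.01):
    """First time the walker is found at `end` with probability > p0.

    Classical: continuous-time random walk   dp/dt   = -L p
    Quantum:   continuous-time quantum walk  dpsi/dt = -i L psi
    (Illustrative only; conventions inside qrl_graph may differ.)
    """
    n = L.shape[0]
    state = np.zeros(n, dtype=complex if quantum else float)
    state[start] = 1.0
    # propagator for one small time step dt
    U = expm((-1j if quantum else -1.0) * L * dt)
    t = 0.0
    while t < t_max:
        prob = np.abs(state[end]) ** 2 if quantum else state[end]
        if prob > p0:
            return t
        state = U @ state
        t += dt
    return None  # threshold never reached within t_max
```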

### Install

```bash
pip install qrl_graph==0.0.10
```


### Usage

```python
import numpy as np

from qrl_graph.graph_env.graph import Graph

# adjacency matrix of a 4-node cycle graph
g = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])

g_env = Graph(g=g)
print('Laplacian matrix:\n', g_env.laplacian)

# critical (hitting) times for threshold probability p0
t_cl = g_env.get_classical_time(p0=0.1)
t_q = g_env.get_quantum_time(p0=0.1)

print('Classical time:', t_cl)
print('Quantum time:', t_q)
print('Speed up:', t_cl / t_q)


# uncomment to show the graph
# g_env.show_graph()
```


The results are:
```text
Laplacian matrix:
 [[ 2 -1 -1  0]
 [-1  2  0 -1]
 [-1  0  2 -1]
 [ 0 -1 -1  2]]
Classical time: 0.25000000000000006
Quantum time: 0.6000000000000003
Speed up: 0.4166666666666665
```
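
As a sanity check, the Laplacian printed above can be reproduced with SciPy's `laplacian` helper (degree matrix minus adjacency matrix):

```python
import numpy as np
from scipy.sparse.csgraph import laplacian

# same adjacency matrix as in the usage example
g = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 1],
              [0, 1, 1, 0]])

# degree matrix minus adjacency matrix, identical to g_env.laplacian above
print(laplacian(g))
```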



### Side notes 


For the policy gradient, the action probability is parametrized by the sigmoid $$\sigma(x) = \frac{1}{1+e^{-x}},$$ whose derivative is $$\sigma'(x) = \sigma(x)\,(1 - \sigma(x)).$$

The policy gradient of the expected reward is then $$\mathbb{E}\big[\,\sigma(\theta)\,(1 - \sigma(\theta))\, r\,\big].$$
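
As a toy numerical illustration of this estimator (not part of `qrl_graph`; the single-parameter Bernoulli policy, reward, and learning rate below are placeholders), one gradient-ascent loop could look like:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
theta = 0.0   # single scalar policy parameter
lr = 0.1      # learning rate (placeholder)

for step in range(1000):
    p = sigmoid(theta)              # probability of taking action 1
    action = rng.random() < p
    r = 1.0 if action else 0.0      # toy reward: action 1 is rewarded
    # sample estimate of E[ sigma(theta) (1 - sigma(theta)) r ]
    grad = p * (1.0 - p) * r
    theta += lr * grad              # gradient ascent on the expected reward

print('learned probability of action 1:', sigmoid(theta))
```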
