clay-curry/flapPy-RL
About
An RL algorithm for solving Flappy Bird. With the return R defined as the number of obstacles cleared before crashing, the optimal action-value function q* : S × A → ℝ gives the expected return E(R) from each state-action pair (s, a) when acting optimally afterwards. Experiments support the conjecture that a tabular n-step Sarsa algorithm converges to a policy π that clears arbitrarily many obstacles (confirmed up to 1,000,000).
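
To make the idea concrete, here is a minimal sketch of tabular n-step Sarsa in Python. It is not this repository's implementation: `ToyFlappy` is a hypothetical, deliberately tiny stand-in for the game (five rows, a fixed gap, one obstacle every two steps), and the function names and hyperparameters (`n=3`, `alpha=0.1`, `epsilon=0.1`) are assumptions chosen for illustration. The reward is 1 per obstacle cleared and the discount is 1, so the return equals the number of obstacles cleared, as in the description above.

```python
import random
from collections import defaultdict


class ToyFlappy:
    """Hypothetical toy environment, not the repository's game."""
    HEIGHT, GAP, SPACING = 5, 2, 2

    def __init__(self, max_obstacles=20):
        self.max_obstacles = max_obstacles  # cap so episodes always end

    def reset(self):
        self.y, self.d, self.cleared = self.GAP, self.SPACING, 0
        return (self.y, self.d)

    def step(self, action):
        """Action 1 flaps (up one row); action 0 falls (down one row)."""
        self.y += 1 if action == 1 else -1
        self.d -= 1
        if not 0 <= self.y < self.HEIGHT:
            return (self.y, self.d), 0.0, True  # hit floor or ceiling
        if self.d == 0:
            if self.y != self.GAP:
                return (self.y, self.d), 0.0, True  # hit the obstacle
            self.cleared += 1
            self.d = self.SPACING
            return (self.y, self.d), 1.0, self.cleared >= self.max_obstacles
        return (self.y, self.d), 0.0, False


def epsilon_greedy(q, state, epsilon, rng):
    if rng.random() < epsilon:
        return rng.randrange(2)
    return max((0, 1), key=lambda a: q[(state, a)])  # ties go to action 0


def n_step_sarsa(env, episodes=500, n=3, alpha=0.1, gamma=1.0,
                 epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # tabular action values, default 0
    for _ in range(episodes):
        states = [env.reset()]
        actions = [epsilon_greedy(q, states[0], epsilon, rng)]
        rewards = [0.0]  # placeholder so rewards[t] is the reward at time t
        T = float("inf")  # episode length, unknown until termination
        t = 0
        while True:
            if t < T:
                s_next, r, done = env.step(actions[t])
                states.append(s_next)
                rewards.append(r)
                if done:
                    T = t + 1
                else:
                    actions.append(epsilon_greedy(q, s_next, epsilon, rng))
            tau = t - n + 1  # time step whose estimate is updated now
            if tau >= 0:
                last = min(tau + n, T)
                G = sum(gamma ** (i - tau - 1) * rewards[i]
                        for i in range(tau + 1, last + 1))
                if tau + n < T:  # bootstrap from the stored estimate
                    G += gamma ** n * q[(states[tau + n], actions[tau + n])]
                key = (states[tau], actions[tau])
                q[key] += alpha * (G - q[key])
            if tau == T - 1:
                break
            t += 1
    return q


def greedy_score(env, q):
    """Play one episode greedily and return the obstacles cleared."""
    state, done, score = env.reset(), False, 0.0
    while not done:
        action = max((0, 1), key=lambda a: q[(state, a)])
        state, reward, done = env.step(action)
        score += reward
    return score


if __name__ == "__main__":
    q = n_step_sarsa(ToyFlappy())
    print(greedy_score(ToyFlappy(), q))
```

The toy environment caps each episode at `max_obstacles` so that training terminates; the real game has no such cap, which is why the description speaks of clearing arbitrarily many obstacles.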