This repository was archived by the owner on May 6, 2021. It is now read-only.
* add experiment of snake game
* sync
* add Experiment for Minimax
* add Experiment for CFRPolicy
* fix CFR
* add more experiments
* add more experiments
* update dependency
* bump version
* add more info in README.md
* bugfix
* minor bugfix
* automatically decrease steps in CI
* increase steps in JuliaRL_TabularCFR_OpenSpiel
README.md: 6 additions & 1 deletion
@@ -24,6 +24,8 @@ This project aims to provide some implementations of the most typical reinforcement learning algorithms
 - PPO
 - DDPG
 - SAC
+- CFR
+- Minimax

 If you are looking for tabular reinforcement learning algorithms, you may refer to [ReinforcementLearningAnIntroduction.jl](https://github.com/JuliaReinforcementLearning/ReinforcementLearningAnIntroduction.jl).
@@ -45,6 +47,9 @@ Some built-in experiments are exported to help new users to easily run benchmarks
 - ``E`JuliaRL_SAC_Pendulum` `` (Thanks to [@rbange](https://github.com/rbange))
 - ``E`JuliaRL_BasicDQN_MountainCar` `` (Thanks to [@felixchalumeau](https://github.com/felixchalumeau))
 - ``E`JuliaRL_DQN_MountainCar` `` (Thanks to [@felixchalumeau](https://github.com/felixchalumeau))
+- ``E`JuliaRL_Minimax_OpenSpiel(tic_tac_toe)` ``
+- ``E`JuliaRL_TabularCFR_OpenSpiel(kuhn_poker)` ``
+- ``E`JuliaRL_DQN_SnakeGame` ``
 - ``E`Dopamine_DQN_Atari(pong)` ``
 - ``E`Dopamine_Rainbow_Atari(pong)` ``
 - ``E`Dopamine_IQN_Atari(pong)` ``
@@ -56,7 +61,7 @@ Some built-in experiments are exported to help new users to easily run benchmarks
 - Experiments on `CartPole` usually run faster with CPU only, due to the overhead of sending data between CPU and GPU.
 - It shouldn't surprise you that our experiments on `CartPole` are much faster than those written in Python. The secret is that our environment is written in Julia!
 - Remember to set `JULIA_NUM_THREADS` to enable multi-threading when using algorithms like `A2C` and `PPO`.
-- Experiments on `Atari` are only available when you have `ArcadeLearningEnvironment.jl` installed and `using ArcadeLearningEnvironment`.
+- Experiments on `Atari` (`OpenSpiel`, `SnakeGame`) are only available after you have `ArcadeLearningEnvironment.jl` (`OpenSpiel.jl`, `SnakeGame.jl`) installed and have run `using ArcadeLearningEnvironment` (`using OpenSpiel`, `using SnakeGame`).
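The notes above can be combined into a single session: set the thread count before Julia starts (it cannot be changed afterwards), load the optional backend package, then run the experiment. This is a hedged sketch; the package name `ReinforcementLearningZoo` and the `run(E`...`)` invocation are assumed from the experiment list above.

```shell
# JULIA_NUM_THREADS must be set before Julia launches; 4 threads is an arbitrary example.
export JULIA_NUM_THREADS=4

# Load the optional backend first (here OpenSpiel), then run the matching built-in experiment.
julia -e 'using OpenSpiel, ReinforcementLearningZoo; run(E`JuliaRL_TabularCFR_OpenSpiel(kuhn_poker)`)'
```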
The minimax algorithm with [Alpha-beta pruning](https://en.wikipedia.org/wiki/Alpha-beta_pruning).

## Keyword Arguments

- `maximum_depth::Int=30`, the maximum depth of the search.
- `value_function=nothing`, estimates the value of `env` (`value_function(env) -> Number`). It is only called when the search reaches `maximum_depth` and `env` has not terminated yet.
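To make the two keyword arguments concrete, here is a minimal, self-contained sketch of depth-limited minimax with alpha-beta pruning. The game-tree representation (nested vectors with numeric leaf payoffs) and the function signature are hypothetical illustrations, not the package's actual implementation; only the roles of `maximum_depth` and `value_function` mirror the documentation above.

```julia
# Leaves are plain numbers (terminal payoffs); internal nodes are vectors of children.
is_leaf(node) = node isa Number

function minimax(node; maximum_depth=30, maximizing=true,
                 alpha=-Inf, beta=Inf, value_function=nothing)
    is_leaf(node) && return float(node)
    if maximum_depth == 0
        # Only consulted when the depth limit is hit before a terminal state,
        # matching the `value_function` contract described above.
        value_function === nothing && error("depth limit reached without a value_function")
        return value_function(node)
    end
    if maximizing
        best = -Inf
        for child in node
            best = max(best, minimax(child; maximum_depth=maximum_depth - 1,
                                     maximizing=false, alpha=alpha, beta=beta,
                                     value_function=value_function))
            alpha = max(alpha, best)
            beta <= alpha && break  # beta cut-off: minimizer will avoid this branch
        end
        return best
    else
        best = Inf
        for child in node
            best = min(best, minimax(child; maximum_depth=maximum_depth - 1,
                                     maximizing=true, alpha=alpha, beta=beta,
                                     value_function=value_function))
            beta = min(beta, best)
            beta <= alpha && break  # alpha cut-off: maximizer will avoid this branch
        end
        return best
    end
end

# The maximizer picks the branch whose worst case (minimizer's reply) is best:
# min(3,5)=3, min(2,9)=2, min(12,5)=5, so the root value is 5.
minimax([[3, 5], [2, 9], [12, 5]])  # → 5.0
```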