Using the DDDQN (Dueling Double Deep Q-Learning) algorithm with Ray RLlib on the gym-super-mario-bros environment to make the Mario character finish the game by itself.
What is Ray? : Ray provides a simple, universal API for building distributed applications. Ray accomplishes this mission by:
- Providing simple primitives for building and running distributed applications.
- Enabling end users to parallelize single-machine code with little to zero code changes.
- Including a large ecosystem of applications, libraries, and tools on top of the core Ray to enable complex applications.
What is RLlib? : RLlib is an open-source library for scalable reinforcement learning that offers a unified API for a variety of applications. RLlib natively supports TensorFlow, TensorFlow Eager, and PyTorch, but most of its internals are framework agnostic.
What is gym-super-mario-bros? : An OpenAI Gym environment for Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on the Nintendo Entertainment System (NES) using the nes-py emulator.
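What is DDDQN? : The two ideas behind the name can be sketched in a few lines of plain Python. This is a minimal illustration of the standard textbook formulation, not the code RLlib runs internally; the function names `dueling_q_values` and `double_q_target` are hypothetical and chosen only for this example.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_q_target(
    reward: float,
    done: bool,
    gamma: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
) -> float:
    """Double Q-learning target.

    The online network picks the best next action, and the target
    network supplies the value of that action.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Toy numbers, chosen only for illustration.
q_values = dueling_q_values(1.0, [0.5, -0.5, 0.0])
target = double_q_target(1.0, False, 0.5, [0.2, 0.8, 0.5], [1.0, 2.0, 3.0])
print(q_values, target)
```

In the toy call, the online estimates select action 1, so the target uses the target network's value for action 1 rather than its own maximum. Decoupling selection from evaluation this way is what reduces the overestimation of plain Q-learning.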
Python3 libraries
sudo apt-get install python3-pip
sudo apt-get install python3-dev
pip3 install tensorflow-gpu
pip3 install ray
pip3 install 'ray[rllib]'
pip3 install gym
pip3 install gym-super-mario-bros
NVIDIA CUDA : See the setup document. Warning: if TensorFlow cannot establish a GPU connection via CUDA, the code will run on the CPU.
The following command starts training with the DDDQN (Dueling Double Deep Q-Learning) algorithm in the "SuperMarioBros-v0" environment, which is the default set in the configPy.py file. To change any setting, edit configPy.py.
python3 train.py
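The contents of configPy.py are not reproduced here, so the following is only a sketch of what a DDDQN setup could look like using the older dict-style RLlib configuration. The keys `dueling`, `double_q`, `framework`, `num_gpus`, `num_workers`, and `gamma` are RLlib DQN settings in that API style, but every value below, and the name `dddqn_config`, is an assumption for illustration and may differ from the project's real settings or from newer RLlib versions.

```python
# Hypothetical configuration; the real values live in configPy.py.
dddqn_config = {
    "env": "super_mario_bros",  # assumed registered environment name
    "framework": "tf",          # TensorFlow backend
    "dueling": True,            # separate value and advantage streams
    "double_q": True,           # online net selects, target net evaluates
    "gamma": 0.99,              # discount factor (assumed value)
    "num_gpus": 1,              # set to 0 to train on the CPU
    "num_workers": 1,           # rollout workers (assumed value)
}

# "Dueling" plus "Double" on top of DQN is what the DDDQN name refers to.
is_dddqn = dddqn_config["dueling"] and dddqn_config["double_q"]
print("DDDQN enabled:", is_dddqn)
```

A dict like this would typically be handed to RLlib's DQN trainer together with the registered environment. The exact class and entry point depend on the installed Ray version.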
Agent | Iteration | Steps | Max Reward | Min Reward | Mean Reward |
---|---|---|---|---|---|
DQN | 616 | 617000 | 17504 | -5708 | 9048.6 |
DQN | 617 | 618000 | 19847 | -5708 | 9628.2 |
DQN | ... | ... | x | y | z |
gym env | State | World | gif |
---|---|---|---|
SuperMarioBros-v0 | Training process | 1-1 | |
You can test the trained weights in the "SuperMarioBros-v0" environment with the command below.
python3 test.py checkpoint-xxxx --env super_mario_bros --steps y
Sample command
python3 test.py /home/demir/Desktop/rl/root/checkpoint_001501/checkpoint-1501 --env super_mario_bros --steps 2000
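Typing the checkpoint path by hand gets tedious after many iterations. The helper below is a small sketch, assuming checkpoints follow the `checkpoint_001501/checkpoint-1501` layout seen in the sample command; `latest_checkpoint` and `build_test_command` are hypothetical names and are not part of this repository.

```python
import re
from pathlib import Path
from typing import List, Optional

# Matches directory names such as "checkpoint_001501".
_CHECKPOINT_DIR = re.compile(r"^checkpoint_(\d+)$")


def latest_checkpoint(root: str) -> Optional[Path]:
    """Return the checkpoint file with the highest iteration under root, or None."""
    best_iteration = -1
    best_file: Optional[Path] = None
    for entry in Path(root).iterdir():
        match = _CHECKPOINT_DIR.match(entry.name)
        if not (match and entry.is_dir()):
            continue
        iteration = int(match.group(1))
        # The file name drops the zero padding, e.g. "checkpoint-1501".
        candidate = entry / f"checkpoint-{iteration}"
        if iteration > best_iteration and candidate.exists():
            best_iteration, best_file = iteration, candidate
    return best_file


def build_test_command(checkpoint: Path, steps: int = 2000) -> List[str]:
    """Assemble the argument list for test.py, mirroring the sample command."""
    return [
        "python3", "test.py", str(checkpoint),
        "--env", "super_mario_bros",
        "--steps", str(steps),
    ]
```

The returned list can be passed to `subprocess.run`, or joined with spaces and pasted into a shell.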
gym env | State | World | Video |
---|---|---|---|
SuperMarioBros-v0 | Test | 1-1 | |
SuperMarioBros-v0 | Test | 1-2 | None |
SuperMarioBros-v0 | Test | 1-3 | None |
SuperMarioBros-v0 | Test | 1-4 | None |