# Project Navigation - Udacity Deep Reinforcement Learning

[//]: # (Image References)

[video_random]: https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/results/random_agent.gif "Random Agent"

[video_trained]: https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/results/trained_agent.gif "Trained Agent"

# Project 1: Navigation

### Introduction

For this project, you will train an agent to navigate (and collect bananas!) in a large, square world.

| Random agent | Trained agent |
|:-------------------------:|:-------------------------:|
| ![Random Agent][video_random] | ![Trained Agent][video_trained] |

A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions. Four discrete actions are available, corresponding to:
- **`0`** - move forward.
- **`1`** - move backward.
- **`2`** - turn left.
- **`3`** - turn right.

The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
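
For concreteness, "solved" can be checked with a rolling window over the 100 most recent episode scores. A minimal sketch (the `run_episode` helper is hypothetical, standing in for whatever training loop you use):

```python
from collections import deque

import numpy as np

scores_window = deque(maxlen=100)  # scores of the 100 most recent episodes

for i_episode in range(1, 2001):
    score = run_episode()  # hypothetical: runs one episode, returns its total reward
    scores_window.append(score)
    # The environment counts as solved once the 100-episode average reaches +13.
    if len(scores_window) == 100 and np.mean(scores_window) >= 13.0:
        print(f"Environment solved in {i_episode} episodes!")
        break
```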

### Getting Started

1. Download the environment from one of the links below. You need only select the environment that matches your operating system:
    - Linux: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux.zip)
    - Mac OSX: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana.app.zip)
    - Windows (32-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Windows_x86.zip)
    - Windows (64-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Windows_x86_64.zip)

    (_For Windows users_) Check out [this link](https://support.microsoft.com/en-us/help/827218/how-to-determine-whether-a-computer-is-running-a-32-bit-version-or-64) if you need help determining whether your computer is running a 32-bit or 64-bit version of the Windows operating system.

    (_For AWS_) If you'd like to train the agent on AWS (and have not [enabled a virtual screen](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-on-Amazon-Web-Service.md)), then please use [this link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux_NoVis.zip) to obtain the environment.

2. Place the file in this folder, unzip (or decompress) it, and then set the correct path in the argument for creating the environment in the notebook `Navigation_solution.ipynb`:

    ```python
    env = UnityEnvironment(file_name="Banana.app")
    ```

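Once the environment is created, one episode of interaction with a random policy looks roughly like the sketch below (this follows the `unityagents` API used in the course notebooks; adjust `file_name` to the environment you downloaded for your operating system):

```python
import numpy as np
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Banana.app")  # as in step 2; adjust for your OS

brain_name = env.brain_names[0]                # the default brain controls the agent
env_info = env.reset(train_mode=False)[brain_name]
state = env_info.vector_observations[0]        # 37-dimensional state vector
score = 0
while True:
    action = np.random.randint(4)              # pick one of the 4 discrete actions at random
    env_info = env.step(action)[brain_name]    # send the action to the environment
    state = env_info.vector_observations[0]
    score += env_info.rewards[0]
    if env_info.local_done[0]:                 # episode has ended
        break
print(f"Score: {score}")

env.close()
```
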
### Description

- `dqn_agent.py`: code for the agent used in the environment
- `model.py`: code containing the Q-Network used as the function approximator by the agent
- `dqn.pth`: saved model weights for the original DQN model
- `ddqn.pth`: saved model weights for the Double DQN model
- `dddqn.pth`: saved model weights for the Dueling Double DQN model
- `Navigation_exploration.ipynb`: notebook for exploring the Unity environment
- `Navigation_solution.ipynb`: notebook containing the solution
- `Navigation_Pixels.ipynb`: notebook containing the code for the learning-from-pixels problem (see below)

### Instructions

Follow the instructions in `Navigation_solution.ipynb` to get started with training your own agent!
To watch a trained agent, follow the instructions below; a minimal loading sketch comes after the list:

- **DQN**: To run the original DQN algorithm, load the trained model from the checkpoint `dqn.pth`. Also, set the `qnetwork` parameter to `QNetwork` and the `update_type` parameter to `dqn` when defining the agent.
- **Double DQN**: To run the Double DQN algorithm, load the trained model from the checkpoint `ddqn.pth`. Also, set the `qnetwork` parameter to `QNetwork` and the `update_type` parameter to `double_dqn` when defining the agent.
- **Dueling Double DQN**: To run the Dueling Double DQN algorithm, load the trained model from the checkpoint `dddqn.pth`. Also, set the `qnetwork` parameter to `DuelingQNetwork` and the `update_type` parameter to `double_dqn` when defining the agent.
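
For example, loading the Double DQN agent might look like the sketch below. The exact `Agent` constructor lives in `dqn_agent.py`; the `state_size`, `action_size`, and `seed` arguments and the `qnetwork_local` attribute are assumptions based on the standard course implementation, not confirmed from this repository:

```python
import torch
from dqn_agent import Agent          # agent defined in this repository
from model import QNetwork           # use DuelingQNetwork for the dueling variant

# 37-dimensional state, 4 discrete actions, as described above.
# Constructor arguments other than qnetwork/update_type are assumed.
agent = Agent(state_size=37, action_size=4, seed=0,
              qnetwork=QNetwork, update_type='double_dqn')
agent.qnetwork_local.load_state_dict(
    torch.load('ddqn.pth', map_location='cpu'))  # attribute name assumed
```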

### Enhancements

Several enhancements to the original DQN algorithm have also been incorporated; an illustrative sketch of the Double DQN update follows the list:

- Double DQN [[Paper](https://arxiv.org/abs/1509.06461)] [[Code](https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/dqn_agent.py#L94)]
- Prioritized Experience Replay [[Paper](https://arxiv.org/abs/1511.05952)] (WIP)
- Dueling DQN [[Paper](https://arxiv.org/abs/1511.06581)] [[Code](https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/model.py)]
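
In Double DQN, the online network selects the greedy next action while the target network evaluates it, reducing the overestimation bias of vanilla DQN. A minimal PyTorch sketch of the target computation (tensor names are illustrative, not the repository's exact code):

```python
import torch

def double_dqn_targets(qnetwork_local, qnetwork_target,
                       rewards, next_states, dones, gamma=0.99):
    """Compute TD targets with the Double DQN rule.

    rewards and dones have shape (batch, 1); dones is a 0/1 float tensor.
    """
    with torch.no_grad():
        # Online network picks the argmax action for each next state...
        best_actions = qnetwork_local(next_states).argmax(dim=1, keepdim=True)
        # ...and the target network supplies that action's value.
        q_next = qnetwork_target(next_states).gather(1, best_actions)
    # Bootstrapped target; terminal transitions contribute only the reward.
    return rewards + gamma * q_next * (1 - dones)
```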

### Results

The plots below show the score per episode over the course of training. The environment was solved in **377** episodes (currently).

| Double DQN | DQN | Dueling DQN |
|:-------------------------:|:-------------------------:|:-------------------------:|
| ![double-dqn-scores](https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/results/ddqn_new_scores.png) | ![dqn-scores](https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/results/dqn_new_scores.png) | ![dueling-double-dqn-scores](https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/results/dddqn_new_scores.png) |

### Challenge: Learning from Pixels

In the project, your agent learned from information such as its velocity, along with ray-based perception of objects around its forward direction. A more challenging task would be to learn directly from pixels!

To solve this harder task, you'll need to download a new Unity environment. This environment is almost identical to the project environment, the only difference being that the state is an 84 x 84 RGB image, corresponding to the agent's first-person view. (**Note**: Udacity students should not submit a project with this new environment.)

You need only select the environment that matches your operating system:

- Linux: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana_Linux.zip)
- Mac OSX: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana.app.zip)
- Windows (32-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana_Windows_x86.zip)
- Windows (64-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana_Windows_x86_64.zip)

Then, place the file in this folder and unzip (or decompress) it. Next, open `Navigation_Pixels.ipynb` and follow the instructions to learn how to use the Python API to control the agent.

(_For AWS_) If you'd like to train the agent on AWS, you must follow the instructions to [set up X Server](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-on-Amazon-Web-Service.md), and then download the environment for the **Linux** operating system above.
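
Because the pixel-based state is an 84 x 84 RGB image rather than a 37-dimensional vector, the fully connected Q-network must be replaced with a convolutional one. A rough sketch of such an architecture (layer sizes follow the classic DQN Nature paper and are illustrative, not taken from `Navigation_Pixels.ipynb`):

```python
import torch
import torch.nn as nn

class ConvQNetwork(nn.Module):
    """Maps an 84x84 RGB frame to Q-values for the 4 discrete actions."""

    def __init__(self, action_size=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),   # 84 -> 20
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),  # 20 -> 9
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),  # 9 -> 7
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, action_size),
        )

    def forward(self, x):
        # x: batch of frames with shape (N, 3, 84, 84), values scaled to [0, 1]
        return self.head(self.conv(x))
```
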
### Dependencies

Use the `requirements.txt` file (in the [main](https://github.com/dalmia/udacity-deep-reinforcement-learning) folder) to install the required dependencies via `pip`:

```
pip install -r requirements.txt
```
