Commit 5afe6d2

Author: Thota Vivek HDR
2 parents: 57db1e4 + 62dc6cc


README.md

Lines changed: 106 additions & 1 deletion

[//]: # (Image References)

[video_random]: https://github.com/vivekthota16/Project-Navigation-Udacity-Deep-Reinforcement-Learning/blob/master/Training-Results/random_agent.gif "Random Agent"

[video_trained]: https://github.com/vivekthota16/Project-Navigation-Udacity-Deep-Reinforcement-Learning/blob/master/Training-Results/trained_agent.gif "Trained Agent"

# Project 1: Navigation - Udacity Deep Reinforcement Learning

### Introduction

For this project, you will train an agent to navigate (and collect bananas!) in a large, square world.

| Random agent | Trained agent |
|:-------------------------:|:-------------------------:|
| ![Random Agent][video_random] | ![Trained Agent][video_trained] |

A reward of +1 is provided for collecting a yellow banana, and a reward of -1 is provided for collecting a blue banana. Thus, the goal of your agent is to collect as many yellow bananas as possible while avoiding blue bananas.

The state space has 37 dimensions and contains the agent's velocity, along with ray-based perception of objects around the agent's forward direction. Given this information, the agent has to learn how to best select actions. Four discrete actions are available, corresponding to:
- **`0`** - move forward.
- **`1`** - move backward.
- **`2`** - turn left.
- **`3`** - turn right.

The task is episodic, and in order to solve the environment, your agent must get an average score of +13 over 100 consecutive episodes.
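
For orientation, the loop below runs a single episode with a uniformly random policy (the behaviour shown in the "Random Agent" clip above). It is a minimal sketch assuming the `unityagents` package used by the Udacity environments and a macOS build named `Banana.app`; see Getting Started below for downloading the environment.

```python
import numpy as np
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Banana.app")   # path depends on your OS build
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

env_info = env.reset(train_mode=False)[brain_name]
state = env_info.vector_observations[0]          # 37-dimensional state
score = 0
while True:
    action = np.random.randint(brain.vector_action_space_size)  # pick one of the 4 actions at random
    env_info = env.step(action)[brain_name]
    state = env_info.vector_observations[0]      # next state
    reward = env_info.rewards[0]                 # +1 yellow banana, -1 blue banana
    score += reward
    if env_info.local_done[0]:                   # episode finished
        break
print("Score:", score)
env.close()
```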

### Getting Started

1. Download the environment from one of the links below. You need only select the environment that matches your operating system:

    - Linux: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux.zip)
    - Mac OSX: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana.app.zip)
    - Windows (32-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Windows_x86.zip)
    - Windows (64-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Windows_x86_64.zip)

    (_For Windows users_) Check out [this link](https://support.microsoft.com/en-us/help/827218/how-to-determine-whether-a-computer-is-running-a-32-bit-version-or-64) if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.

    (_For AWS_) If you'd like to train the agent on AWS (and have not [enabled a virtual screen](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-on-Amazon-Web-Service.md)), then please use [this link](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/Banana_Linux_NoVis.zip) to obtain the environment.

2. Place the file in this folder, unzip (or decompress) it, and then pass the correct path as the `file_name` argument when creating the environment in the notebook `Navigation_solution.ipynb`:

    ```python
    env = UnityEnvironment(file_name="Banana.app")
    ```
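
After creating the environment, it is worth confirming that the state and action sizes match the description above. A minimal check, assuming the `unityagents` package used throughout the Udacity notebooks:

```python
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Banana.app")   # adjust the path to your download

brain_name = env.brain_names[0]                  # the default brain controls the agent
brain = env.brains[brain_name]

env_info = env.reset(train_mode=True)[brain_name]
print("Number of agents:", len(env_info.agents))             # 1
print("Action size:", brain.vector_action_space_size)        # 4
print("State size:", len(env_info.vector_observations[0]))   # 37
```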

### Description

- `dqn_agent.py`: code for the agent used in the environment
- `model.py`: code containing the Q-Network used as the function approximator by the agent (an illustrative sketch follows this list)
- `dqn.pth`: saved model weights for the original DQN model
- `ddqn.pth`: saved model weights for the Double DQN model
- `dddqn.pth`: saved model weights for the Dueling Double DQN model
- `Navigation_solution.ipynb`: notebook containing the solution
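
For reference, a typical fully connected Q-network for the 37-dimensional state looks roughly like the sketch below. The exact architecture lives in `model.py`; the layer sizes here are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):
    """Maps a 37-dimensional state to Q-values for the 4 actions."""
    def __init__(self, state_size=37, action_size=4, seed=0, hidden=64):
        super().__init__()
        torch.manual_seed(seed)
        self.fc1 = nn.Linear(state_size, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.fc3 = nn.Linear(hidden, action_size)

    def forward(self, state):
        x = F.relu(self.fc1(state))
        x = F.relu(self.fc2(x))
        return self.fc3(x)
```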

### Instructions

Follow the instructions in `Navigation_solution.ipynb` to get started with training your own agent!

To watch a trained agent, follow the instructions below (a loading sketch follows the list):

- **DQN**: To run the original DQN algorithm, load the trained model from the checkpoint `dqn.pth`, and set the parameter `qnetwork` to `QNetwork` and the parameter `update_type` to `dqn` when defining the agent.
- **Double DQN**: To run the Double DQN algorithm, load the trained model from the checkpoint `ddqn.pth`, and set the parameter `qnetwork` to `QNetwork` and the parameter `update_type` to `double_dqn` when defining the agent.
- **Dueling Double DQN**: To run the Dueling Double DQN algorithm, load the trained model from the checkpoint `dddqn.pth`, and set the parameter `qnetwork` to `DuelingQNetwork` and the parameter `update_type` to `double_dqn` when defining the agent.
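
As a concrete illustration of the Double DQN case, loading a trained agent might look like the sketch below. The class name `Agent`, the constructor arguments `state_size`, `action_size`, and `seed`, and the attribute `qnetwork_local` are assumptions here; check `dqn_agent.py` for the actual signature.

```python
import torch
from dqn_agent import Agent        # hypothetical class name; see dqn_agent.py
from model import QNetwork         # use DuelingQNetwork for the dueling variant

# Double DQN configuration: plain QNetwork + double_dqn updates.
agent = Agent(state_size=37, action_size=4, seed=0,
              qnetwork=QNetwork, update_type='double_dqn')

# Load the trained weights saved in the checkpoint.
agent.qnetwork_local.load_state_dict(torch.load('ddqn.pth'))

# Acting greedily with the loaded network (method name assumed):
# action = agent.act(state, eps=0.0)
```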

### Enhancements

Several enhancements to the original DQN algorithm have also been incorporated (a sketch of the Double DQN target computation follows the list):

- Double DQN [[Paper](https://arxiv.org/abs/1509.06461)] [[Code](https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/dqn_agent.py#L94)]
- Prioritized Experience Replay [[Paper](https://arxiv.org/abs/1511.05952)] (to be worked out)
- Dueling DQN [[Paper](https://arxiv.org/abs/1511.06581)] [[Code](https://github.com/dalmia/udacity-deep-reinforcement-learning/blob/master/2%20-%20Value-based%20methods/Project-Navigation/model.py)]
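
The core of the Double DQN change is only a few lines: the online network selects the greedy next action and the target network evaluates it. A minimal PyTorch sketch, assuming batched tensors sampled from a replay buffer:

```python
import torch

def double_dqn_targets(qnetwork_local, qnetwork_target, rewards, next_states, dones, gamma=0.99):
    """Compute TD targets the Double DQN way.

    rewards, dones: shape (batch, 1); next_states: shape (batch, state_size).
    """
    with torch.no_grad():
        # Online network chooses the greedy action...
        best_actions = qnetwork_local(next_states).argmax(dim=1, keepdim=True)
        # ...target network evaluates it.
        q_next = qnetwork_target(next_states).gather(1, best_actions)
    return rewards + gamma * q_next * (1 - dones)
```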

### Results

The plots below show the score per episode over all training episodes. The environment was solved in **361** episodes, i.e., the agent achieved an average score of +13 over 100 consecutive episodes (with Double DQN).

| Double DQN | DQN | Dueling DQN |
|:-------------------------:|:-------------------------:|:-------------------------:|
| ![double-dqn-scores](https://github.com/vivekthota16/Project-Navigation-Udacity-Deep-Reinforcement-Learning/blob/master/Training-Results/ddqn_new_scores.png) | ![dqn-scores](https://github.com/vivekthota16/Project-Navigation-Udacity-Deep-Reinforcement-Learning/blob/master/Training-Results/dqn_new_scores.png) | ![dueling-double-dqn-scores](https://github.com/vivekthota16/Project-Navigation-Udacity-Deep-Reinforcement-Learning/blob/master/Training-Results/dddqn_new_scores.png) |

### Challenge: Learning from Pixels

In the project, your agent learned from information such as its velocity, along with ray-based perception of objects around its forward direction. A more challenging task would be to learn directly from pixels!

To solve this harder task, you'll need to download a new Unity environment. This environment is almost identical to the project environment, where the only difference is that the state is an 84 x 84 RGB image, corresponding to the agent's first-person view. (**Note**: Udacity students should not submit a project with this new environment.)
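
This repository does not include a pixel-based solution, but for a sense of scale, a Q-network for the visual environment typically replaces the fully connected input layers with a small convolutional stack. A purely illustrative sketch for single 84 x 84 RGB frames (frame stacking and preprocessing are omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelQNetwork(nn.Module):
    """Illustrative CNN Q-network for 84x84 RGB frames (not part of this repo)."""
    def __init__(self, action_size=4):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=8, stride=4)   # -> 32 x 20 x 20
        self.conv2 = nn.Conv2d(32, 64, kernel_size=4, stride=2)  # -> 64 x 9 x 9
        self.conv3 = nn.Conv2d(64, 64, kernel_size=3, stride=1)  # -> 64 x 7 x 7
        self.fc1 = nn.Linear(64 * 7 * 7, 512)
        self.fc2 = nn.Linear(512, action_size)

    def forward(self, x):           # x: (batch, 3, 84, 84), values scaled to [0, 1]
        x = F.relu(self.conv1(x))
        x = F.relu(self.conv2(x))
        x = F.relu(self.conv3(x))
        x = F.relu(self.fc1(x.flatten(start_dim=1)))
        return self.fc2(x)
```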

You need only select the environment that matches your operating system:

- Linux: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana_Linux.zip)
- Mac OSX: [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana.app.zip)
- Windows (32-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana_Windows_x86.zip)
- Windows (64-bit): [click here](https://s3-us-west-1.amazonaws.com/udacity-drlnd/P1/Banana/VisualBanana_Windows_x86_64.zip)

Then, place the file in this folder, and unzip (or decompress) the file. Next, open `Navigation_Pixels.ipynb` and follow the instructions to learn how to use the Python API to control the agent.

(_For AWS_) If you'd like to train the agent on AWS, you must follow the instructions to [set up X Server](https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Training-on-Amazon-Web-Service.md), and then download the environment for the **Linux** operating system above.

### Dependencies

Use the `requirements.txt` file (in the [main](https://github.com/vivekthota16/Project-Navigation-Udacity-Deep-Reinforcement-Learning) folder) to install the required dependencies via `pip`:

```
pip install -r requirements.txt
```
