Commit 368883c

standard readme
1 parent 399680c commit 368883c

1 file changed, +41 -0 lines changed

README.md

Lines changed: 41 additions & 0 deletions
@@ -20,10 +20,51 @@ This is a selenium tasks to crawl Ask.fm because of correcting QA list for Machi

## Background

One of my machine learning tasks was to build a bot that responds to natural language using an LSTM (a kind of RNN).

That task requires a large conversation corpus, but I could not find a good one, so I decided to build one by crawling Ask.fm question-and-answer lists with Selenium (Google Chrome).

I use Selenium for Python because Python is my favorite programming language.

## Install

### Prerequisites

- Python 3.6+
- Google Chrome
- [Google Chrome WebDriver](https://sites.google.com/a/chromium.org/chromedriver/downloads)
  - Check your Chrome version and install the matching driver version.
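
The version check above can be automated. The following is a minimal sketch, not part of this repository: it assumes both tools print a dotted four-part version when run with `--version`, and that the binaries are named `google-chrome` and `chromedriver` on your `PATH` (names differ by operating system and install method). `major_version` and `tool_version` are hypothetical helper names.

```python
import re
import shutil
import subprocess


def major_version(version_output):
    """Extract the major version from text such as 'ChromeDriver 114.0.5735.90'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in {!r}".format(version_output))
    return int(match.group(1))


def tool_version(binary):
    """Run `<binary> --version`; return its major version, or None if unavailable."""
    if shutil.which(binary) is None:
        return None
    try:
        result = subprocess.run(
            [binary, "--version"],
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )
        return major_version(result.stdout)
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    # Binary names are assumptions; adjust them for your system.
    chrome = tool_version("google-chrome")
    driver = tool_version("chromedriver")
    if chrome is None or driver is None:
        print("Chrome or ChromeDriver was not found on PATH.")
    elif chrome == driver:
        print("Major versions match: {}".format(chrome))
    else:
        print("Mismatch: Chrome {}, ChromeDriver {}".format(chrome, driver))
```

If the script reports a mismatch, download the driver release that corresponds to your installed Chrome major version.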

### PIP

Install dependencies.

```
pip install -r requirements.txt
```

## Usage

### Create face list (account list)

The QA crawler works from a face list (a list of accounts), so create it before building the conversation corpus.

The first argument is the number of loop iterations.

```
python src/get_faces.py 100
```

After the script runs, the face list is written to `data/face_list.txt`.
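
To sanity-check the result before crawling, you could load the file and count its unique entries. This is a rough sketch that assumes the face list holds one account name per line in UTF-8; the actual file format is not documented here and may differ. `load_face_list` is a hypothetical helper, not a function from this repository.

```python
from pathlib import Path


def load_face_list(path="data/face_list.txt"):
    """Return unique, non-empty lines from the face list, in first-seen order.

    Assumes one account name per line; the real file format may differ.
    """
    seen = set()
    accounts = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        # Skip blank lines and names that were already collected.
        if name and name not in seen:
            seen.add(name)
            accounts.append(name)
    return accounts
```

For example, `len(load_face_list())` gives the number of distinct accounts collected so far, which can help you decide whether to run `get_faces.py` again with a larger loop count.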

### Create conversation corpus

```
python src/main.py
```

After the script runs, the conversation corpus is written to `data/askfm_data/foobar.txt`.

## Contributing

See [the contributing file](CONTRIBUTING.md)!
