Commit c3d57d7

Update README.md
1 parent aa8858a commit c3d57d7

1 file changed

+9
-2
lines changed

README.md

Lines changed: 9 additions & 2 deletions
@@ -29,9 +29,16 @@ Below examples include the intense usage of industry-hot frameworks (i.e. Pytorc

 ### 2020 Edition

-#### [Multi-class Text Classification Problem with PySpark and MLlib](https://github.com/hyunjoonbok/Python-Projects/blob/master/vanilla/Multi-class%20Text%20Classification%20Problem%20with%20PySpark%20and%20MLlib.ipynb):
+
+#### [Multi-Class Text Classification 1 (with PySpark and Doc2Vec)](https://github.com/hyunjoonbok/natural-language-processing/blob/master/07_Multi-Class_Text_Classification_with_PySpark_and_Doc2Vec.ipynb):
+<p>
+In this notebook, we use Apache Spark's machine learning library (MLlib) with PySpark to tackle an NLP problem and show how to simulate Doc2Vec inside a Spark environment. Apache Spark is a well-known distributed computing system for scaling up data processing solutions. Spark also provides a machine learning library called 'MLlib'. We use Spark MLlib to look at 3297 labeled sentences and classify them into 5 different categories.
+</p>
+Jul 7, 2020
+
+#### [Multi-class Text Classification 2 (with PySpark, MLlib, SparkSQL)](https://github.com/hyunjoonbok/Python-Projects/blob/master/vanilla/Multi-class%20Text%20Classification%20Problem%20with%20PySpark%20and%20MLlib.ipynb):
 <p>
-Apache Spark is quickly gaining steam both in the headlines and real-world adoption, mainly because of its ability to process streaming data. With so much data being processed on a daily basis, it has become essential for us to be able to stream and analyze it in real time. We use Spark Machine Learning Library (Spark MLlib) to solve this multi-class text classification problem.
+Apache Spark is quickly gaining steam both in the headlines and real-world adoption, mainly because of its ability to process streaming data. With so much data being processed on a daily basis, it has become essential for us to be able to stream and analyze it in real time. We use Spark Machine Learning Library (Spark MLlib) to classify crime descriptions into 33 categories.
 </p>
 Jul 6, 2020
