Welcome to the Statistics for Data Science repository! This repository aims to provide a collection of resources, Jupyter notebook tutorials, and examples covering statistical techniques and their application in data science. Whether you are a beginner or an experienced data scientist, you'll find material here to deepen your understanding of statistics and its role in data-driven decision-making.
- Descriptive Statistics
  - Mean
  - Median
  - Mode
  - Standard Deviation
  - Variance
  - Covariance
  - Correlation
  - Quartiles
  - IQR
  - Box Plots
- Inferential Statistics
- Probability Distributions
  - Binomial
  - Poisson
  - Normal
- Sampling
  - Confidence Intervals
  - Sample Size Selection
  - Statistical Significance
- Hypothesis Testing
  - Z-tests
  - T-tests
  - ANOVA
  - Chi-Square tests
- Regression
  - Simple Linear Regression
  - Multiple Linear Regression
  - Polynomial Regression
  - Logistic Regression
- Time Series
  - Trend
  - Seasonality
  - Autocorrelation
  - ARIMA Models
- Statistical Visualization
  - Histograms
  - Density Plots
  - Q-Q Plots
  - Pair Plots
  - Correlation Heatmaps
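
As a rough illustration of the descriptive statistics topics listed above, here is a minimal sketch that uses only Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration. The notebooks in this repository may take a different approach, for example with NumPy or pandas.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)            # arithmetic mean
median = statistics.median(values)        # middle value of the sorted data
mode = statistics.mode(values)            # most frequent value
variance = statistics.pvariance(values)   # population variance
std_dev = statistics.pstdev(values)       # population standard deviation

# Quartile cut points using the "inclusive" method, then the interquartile range.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("variance:", variance)
print("standard deviation:", std_dev)
print("IQR:", iqr)
```

Note that `pvariance` and `pstdev` treat the data as a full population. For a sample, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.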
Some useful resources for statistics in data science:
- Kaggle Learn Statistics Course
- Khan Academy Statistics and Probability
- Introduction to Statistical Thought by Michael Lavine
- Think Stats by Allen B. Downey
- Krish Naik (YouTube channel)
- Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron
- StatQuest with Josh Starmer
This repository is open source, and contributions are welcome. If you have ideas for hacks or tips, or if you find an error, please open an issue or submit a pull request.
This repository is licensed under the MIT License.



