Commit 7c3d93d

2 parents b04b5e8 + 588cef0 commit 7c3d93d

8 files changed

+2967
-0
lines changed

Lines changed: 283 additions & 0 deletions
@@ -0,0 +1,283 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "f6dc47ed-b5c4-4e23-8af3-365fa5aea930",
6+
"metadata": {},
7+
"source": [
8+
"# Basic descriptive statistics\n",
9+
"The term [descriptive statistics](https://en.wikipedia.org/wiki/Descriptive_statistics) refers to methods for summarizing collections of data. To demonstrate the most important ones, we first define a dataset."
10+
]
11+
},
12+
{
13+
"cell_type": "code",
14+
"execution_count": 1,
15+
"id": "52c2d369-3c34-42f1-860d-4b15aa0a56fb",
16+
"metadata": {},
17+
"outputs": [],
18+
"source": [
19+
"measurements = [5, 2, 6, 4, 8, 6, 2, 5, 1, 3, 3, 6]"
20+
]
21+
},
22+
{
23+
"cell_type": "markdown",
24+
"id": "495cce67-9f45-4b36-842d-404747ba43a1",
25+
"metadata": {},
26+
"source": [
27+
"## Measures of central tendency\n",
28+
"We can quantify the _location_ (the typical value) of our `measurements` using [NumPy's statistics functions](https://numpy.org/doc/stable/reference/routines.statistics.html) and Python's [statistics module](https://docs.python.org/3/library/statistics.html)."
29+
]
30+
},
31+
{
32+
"cell_type": "code",
33+
"execution_count": 2,
34+
"id": "afe45a7b-d925-46b6-b2c3-8fdcbdafae5c",
35+
"metadata": {},
36+
"outputs": [],
37+
"source": [
38+
"import numpy as np\n",
39+
"import statistics as st"
40+
]
41+
},
42+
{
43+
"cell_type": "code",
44+
"execution_count": 3,
45+
"id": "ae6d6f1d-da15-4fb3-9021-bfffa18b6e76",
46+
"metadata": {},
47+
"outputs": [
48+
{
49+
"data": {
50+
"text/plain": [
51+
"4.25"
52+
]
53+
},
54+
"execution_count": 3,
55+
"metadata": {},
56+
"output_type": "execute_result"
57+
}
58+
],
59+
"source": [
60+
"np.mean(measurements)"
61+
]
62+
},
63+
{
64+
"cell_type": "code",
65+
"execution_count": 4,
66+
"id": "5c8785a5-8f3e-43f3-abb6-ab798d2d0e8e",
67+
"metadata": {},
68+
"outputs": [
69+
{
70+
"data": {
71+
"text/plain": [
72+
"4.5"
73+
]
74+
},
75+
"execution_count": 4,
76+
"metadata": {},
77+
"output_type": "execute_result"
78+
}
79+
],
80+
"source": [
81+
"np.median(measurements)"
82+
]
83+
},
84+
{
85+
"cell_type": "code",
86+
"execution_count": 5,
87+
"id": "068fadac-1e6d-4449-b074-6b320ff85bbf",
88+
"metadata": {},
89+
"outputs": [
90+
{
91+
"data": {
92+
"text/plain": [
93+
"6"
94+
]
95+
},
96+
"execution_count": 5,
97+
"metadata": {},
98+
"output_type": "execute_result"
99+
}
100+
],
101+
"source": [
102+
"st.mode(measurements)"
103+
]
104+
},
105+
{
106+
"cell_type": "markdown",
107+
"id": "b14d6e91-b0b8-4acf-8fa4-616aca040bb8",
108+
"metadata": {},
109+
"source": [
110+
"## Measures of spread\n",
111+
"NumPy also allows measuring the spread of `measurements`."
112+
]
113+
},
114+
{
115+
"cell_type": "code",
116+
"execution_count": 6,
117+
"id": "21cc5cfa-6e48-481b-a9d1-a51e114976b5",
118+
"metadata": {},
119+
"outputs": [
120+
{
121+
"data": {
122+
"text/plain": [
123+
"2.0052015692526606"
124+
]
125+
},
126+
"execution_count": 6,
127+
"metadata": {},
128+
"output_type": "execute_result"
129+
}
130+
],
131+
"source": [
132+
"np.std(measurements)"
133+
]
134+
},
135+
{
136+
"cell_type": "code",
137+
"execution_count": 7,
138+
"id": "90a53582-7f8b-4c46-9360-224ea142acc3",
139+
"metadata": {},
140+
"outputs": [
141+
{
142+
"data": {
143+
"text/plain": [
144+
"4.020833333333333"
145+
]
146+
},
147+
"execution_count": 7,
148+
"metadata": {},
149+
"output_type": "execute_result"
150+
}
151+
],
152+
"source": [
153+
"np.var(measurements)"
154+
]
155+
},
156+
{
157+
"cell_type": "code",
158+
"execution_count": 8,
159+
"id": "959c06f8-05eb-481c-9fbd-32b965e7cd40",
160+
"metadata": {},
161+
"outputs": [
162+
{
163+
"data": {
164+
"text/plain": [
165+
"(1, 8)"
166+
]
167+
},
168+
"execution_count": 8,
169+
"metadata": {},
170+
"output_type": "execute_result"
171+
}
172+
],
173+
"source": [
174+
"np.min(measurements), np.max(measurements)"
175+
]
176+
},
177+
{
178+
"cell_type": "code",
179+
"execution_count": 9,
180+
"id": "b2501f96-4f16-488f-9b9b-74d62309d471",
181+
"metadata": {},
182+
"outputs": [
183+
{
184+
"data": {
185+
"text/plain": [
186+
"array([2.75, 4.5 , 6. ])"
187+
]
188+
},
189+
"execution_count": 9,
190+
"metadata": {},
191+
"output_type": "execute_result"
192+
}
193+
],
194+
"source": [
195+
"np.percentile(measurements, [25, 50, 75])"
196+
]
197+
},
198+
{
199+
"cell_type": "markdown",
200+
"id": "c5465b23-6818-4456-b07b-42e479431a96",
201+
"metadata": {},
202+
"source": [
203+
"## Exercise\n",
204+
"Find out whether the median of a sample is always a value contained in the sample. Use the following three examples to explore this:"
205+
]
206+
},
207+
{
208+
"cell_type": "code",
209+
"execution_count": 10,
210+
"id": "acc54c8e-5b77-4f56-8bee-c6d0c5a13bbb",
211+
"metadata": {},
212+
"outputs": [],
213+
"source": [
214+
"example1 = [3, 4, 5]"
215+
]
216+
},
217+
{
218+
"cell_type": "code",
219+
"execution_count": null,
220+
"id": "0789f045-0782-4587-ab26-7ea96a38b662",
221+
"metadata": {},
222+
"outputs": [],
223+
"source": []
224+
},
225+
{
226+
"cell_type": "code",
227+
"execution_count": 11,
228+
"id": "bd27b8de-db9b-48f9-b6c9-78dea2958319",
229+
"metadata": {},
230+
"outputs": [],
231+
"source": [
232+
"example2 = [3, 4, 4, 5]"
233+
]
234+
},
235+
{
236+
"cell_type": "code",
237+
"execution_count": null,
238+
"id": "1a4cfe20-e174-43e6-8bdd-7bb0d70c2474",
239+
"metadata": {},
240+
"outputs": [],
241+
"source": []
242+
},
243+
{
244+
"cell_type": "code",
245+
"execution_count": 12,
246+
"id": "9b679e86-3ffb-49ce-ac36-a2bcb2607004",
247+
"metadata": {},
248+
"outputs": [],
249+
"source": [
250+
"example3 = [3, 4, 5, 6]"
251+
]
252+
},
253+
{
254+
"cell_type": "code",
255+
"execution_count": null,
256+
"id": "56d926f8-81b9-4b22-9392-12c2fa568adb",
257+
"metadata": {},
258+
"outputs": [],
259+
"source": []
260+
}
261+
],
262+
"metadata": {
263+
"kernelspec": {
264+
"display_name": "Python 3 (ipykernel)",
265+
"language": "python",
266+
"name": "python3"
267+
},
268+
"language_info": {
269+
"codemirror_mode": {
270+
"name": "ipython",
271+
"version": 3
272+
},
273+
"file_extension": ".py",
274+
"mimetype": "text/x-python",
275+
"name": "python",
276+
"nbconvert_exporter": "python",
277+
"pygments_lexer": "ipython3",
278+
"version": "3.9.7"
279+
}
280+
},
281+
"nbformat": 4,
282+
"nbformat_minor": 5
283+
}
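
Note on the spread cells in this notebook: `np.std` and `np.var` use `ddof=0` by default, so the values shown (2.005... and 4.020...) are the population standard deviation and variance, which divide by n. If `measurements` is treated as a sample drawn from a larger population, the n - 1 versions are often preferred. The following is a minimal sketch of the difference, not part of the commit. It uses only Python's `statistics` module, and the variable names are chosen here for illustration.

```python
import statistics as st

measurements = [5, 2, 6, 4, 8, 6, 2, 5, 1, 3, 3, 6]

# Population formulas divide by n; these reproduce the np.var / np.std
# values recorded in the notebook outputs.
population_var = st.pvariance(measurements)
population_std = st.pstdev(measurements)

# Sample formulas divide by n - 1 (Bessel's correction), so they are
# slightly larger for the same data.
sample_var = st.variance(measurements)
sample_std = st.stdev(measurements)

print("population:", population_var, population_std)
print("sample:    ", sample_var, sample_std)
```

With NumPy, the sample versions can be requested by passing `ddof=1`, for example `np.std(measurements, ddof=1)`.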
