|
7 | 7 | "source": [ |
8 | 8 | "## Permutation Tests\n", |
9 | 9 | "### Exact Tests\n", |
10 | | - "Consider the following experiment from [An Introduction to the Bootstrap](https://books.google.com/books?id=MWC1DwAAQBAJ&printsec=frontcover). A new medical treatment is intended to prolong life after a form of surgery. Sixteen mice are randomly assigned to either a treatment group or control group under the constraint that only seven treatments are available. All mice receive the surgery, but only the treatment group will receive the treatment being studied. The survival time of each mouse after surgery is recorded below." |
| 10 | + "Consider the following experiment from Efron and Tibshirani's [An Introduction to the Bootstrap](https://books.google.com/books?id=MWC1DwAAQBAJ&printsec=frontcover). A new medical treatment is intended to prolong life after a form of surgery. Sixteen mice are randomly assigned to either a treatment group or control group under the constraint that only seven treatments are available. All mice receive the surgery, but only the treatment group will receive the treatment being studied. The survival time of each mouse after surgery is recorded below." |
11 | 11 | ] |
12 | 12 | }, |
13 | 13 | { |
|
28 | 28 | "id": "e19f78a9-e1c0-4d8a-aeb6-3fa944752948", |
29 | 29 | "metadata": {}, |
30 | 30 | "source": [ |
31 | | - "The difference in the mean life after treatment between the two groups suggests that the treatment has a prolonging effect, as hypothesized." |
| 31 | + "The difference in mean lifetime after treatment between the two groups suggests that the treatment has a prolonging effect, as hypothesized." |
32 | 32 | ] |
33 | 33 | }, |
34 | 34 | { |
|
88 | 88 | "source": [ |
89 | 89 | "The probability of observing such an extreme test statistic under the null hypothesis (due to chance alone) is greater than 14%, so these data do not seem inconsistent with the null hypothesis. The *point estimate* of the statistic (~30 days) suggested a life-prolonging effect, but such a value of the statistic could quite easily have been observed due to chance alone.\n", |
90 | 90 | "\n", |
91 | | - "Although the t-test tends to be rather robust to violations of its underlying assumptions (e.g., $X$ and $Y$ do not need to be strictly normally distributed for the test to be reasonably accurate), it is possible to perform a hypothesis test which requires no such assumptions at all. \n", |
| 91 | + "Although the t-test tends to be rather robust to violations of its underlying assumptions (e.g., $X$ and $Y$ do not need to be strictly normally distributed for the test to be reasonably accurate), it is possible to perform a hypothesis test which requires almost no such assumptions at all. \n", |
92 | 92 | "\n", |
93 | | - "Instead, let the null hypothesis be that the samples `x` and `y` are drawn a single distribution ($X = Y = Z$), and test this against the alternative that the two sample are drawn from distributions which would tend to produce greater values of `statistic`. \n", |
| 93 | + "Instead, let the null hypothesis be that the observations from the samples `x` and `y` were all drawn independently<a name=\"cite_ref-2\"></a>[<sup>**†**</sup>](#cite_note-2) from a single distribution ($X = Y = Z$), and test this against the alternative that the two samples were drawn from distinct distributions that would tend to produce a greater value of `statistic` (in this case such that $\\mu_x > \\mu_y$).\n", |
94 | 94 | "\n", |
| 95 | + "<a name=\"cite_ref-2\"></a>[<sup>**†**</sup>](#cite_note-2) Actually, only exchangeability is required [[4]](https://en.wikipedia.org/wiki/Exchangeable_random_variables)." |
| 96 | + ] |
| 97 | + }, |
| 98 | + { |
| 99 | + "cell_type": "markdown", |
| 100 | + "id": "96079705", |
| 101 | + "metadata": {}, |
| 102 | + "source": [ |
95 | 103 | "The complete population of mice survival times in the study is really:" |
96 | 104 | ] |
97 | 105 | }, |
|
119 | 127 | "id": "6204af8f-b858-4d5a-8bed-b4ef1bc00c30", |
120 | 128 | "metadata": {}, |
121 | 129 | "source": [ |
122 | | - "Since the mice were randomly divided into the two groups under the constraint that there were only seven treatments available, any selection of seven mice from `z` to form the treatment group `x` was equally likely; the remaining mice would form the control group `y`. Furthermore, if the null hypothesis is true, the mice survival times would be *unaffected by the grouping*. Therefore, each value of the statistic obtained from the possible groupings is equaly likely.\n", |
| 130 | + "Since the mice were randomly divided into the two groups under the constraint that there were only seven treatments available, any selection of seven mice from `z` to form the treatment group `x` was equally likely; the remaining mice would form the control group `y`. Furthermore, if the null hypothesis is true, the mice survival times would be *unaffected by the grouping*. Therefore, each value of the statistic obtained from the possible groupings is equally likely.\n", |
123 | 131 | "\n", |
124 | | - "We begin our hypothesis test by calculating the value of `statistic` for all possible *permutations*<a name=\"cite_ref-2\"></a>[<sup>[2]</sup>](#cite_note-2) of mice into the two groups, forming an exact null distribution.\n", |
| 132 | + "We begin our hypothesis test by calculating the value of `statistic` for all possible *permutations*<a name=\"cite_ref-3\"> </a>[<sup>**‡**</sup>](#cite_note-3) of mice into the two groups, forming an exact null distribution.\n", |
125 | 133 | "\n", |
126 | | - "<a name=\"cite_ref-2\"></a>[<sup>[2]</sup>](#cite_note-2) Here and below, we will refer to the ways of rearranging samples as \"permutations\" even when the word is not strictly appropriate in the technical sense. " |
| 134 | + "<a name=\"cite_ref-3\"></a>[<sup>**‡**</sup>](#cite_note-3) Here and below, we will refer to the ways of rearranging samples as \"permutations\" even when the word is not strictly appropriate in the technical sense. " |
127 | 135 | ] |
128 | 136 | }, |
129 | 137 | { |
|
261 | 269 | "id": "ae116861-ded9-4728-a1b1-4c9c34c50fdd", |
262 | 270 | "metadata": {}, |
263 | 271 | "source": [ |
264 | | - "Note that the exact $p$-value from the permutation test matches the $p$-value from the t-test quite closely. (As we shall see, Ronald Fisher introduced permutation tests primarily to support the use of the t-test in applications where the underlying normality assumptions were not strictly true [[4](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2458144/)].)" |
| 272 | + "Note that the exact $p$-value from the permutation test matches the $p$-value from the t-test quite closely. (As we shall see, Ronald Fisher introduced permutation tests primarily to support the use of the t-test in applications where the underlying normality assumptions were not strictly true [[5]](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2458144/).)" |
265 | 273 | ] |
266 | 274 | }, |
267 | 275 | { |
|
350 | 358 | "id": "459d7ca3-f719-4b8e-a855-8e47846c3ec0", |
351 | 359 | "metadata": {}, |
352 | 360 | "source": [ |
353 | | - "Note that `1` is added to both the numerator and denominator when performing the randomized test [[3]](https://www.degruyter.com/document/doi/10.2202/1544-6115.1585/html). This can be thought of as including the observed value of the test statistic in the null distribution, and it ensures that the $p$-value of a randomized test is never zero." |
| 361 | + "Note that `1` is added to both the numerator and denominator when performing the randomized test [[6]](https://www.degruyter.com/document/doi/10.2202/1544-6115.1585/html). This can be thought of as including the observed value of the test statistic in the null distribution, and it ensures that the $p$-value of a randomized test is never zero." |
354 | 362 | ] |
355 | 363 | }, |
356 | 364 | { |
|
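The markdown cells touched above describe two procedures: enumerating every way of assigning observations to the two groups to form an exact null distribution, and a randomized variant in which `1` is added to both the numerator and denominator of the $p$-value. The following is a minimal sketch of both, not the notebook's own code. The function names `exact_pvalue` and `randomized_pvalue`, the tolerance `TOL`, and the small arrays `x_demo` and `y_demo` are hypothetical and chosen only for illustration; `statistic` is assumed to be the difference in sample means.

```python
from itertools import combinations

import numpy as np

# Hypothetical tolerance so that ties are not lost to floating-point rounding.
TOL = 1e-12


def statistic(x, y):
    """Difference in sample means (assumed form of the test statistic)."""
    return np.mean(x) - np.mean(y)


def exact_pvalue(x, y):
    """One-sided exact p-value from all groupings of the pooled sample."""
    z = np.concatenate([x, y])
    n_x = len(x)
    observed = statistic(x, y)
    null = []
    # Each choice of n_x indices defines one possible "treatment" group;
    # the remaining observations form the "control" group.
    for idx in combinations(range(len(z)), n_x):
        mask = np.zeros(len(z), dtype=bool)
        mask[list(idx)] = True
        null.append(statistic(z[mask], z[~mask]))
    null = np.asarray(null)
    # Fraction of groupings at least as extreme as the observed one.
    return np.mean(null >= observed - TOL), len(null)


def randomized_pvalue(x, y, n_resamples=9999, seed=0):
    """One-sided p-value from random groupings, with the +1 adjustment."""
    rng = np.random.default_rng(seed)
    z = np.concatenate([x, y])
    n_x = len(x)
    observed = statistic(x, y)
    count = 0
    for _ in range(n_resamples):
        perm = rng.permutation(z)
        if statistic(perm[:n_x], perm[n_x:]) >= observed - TOL:
            count += 1
    # Adding 1 to numerator and denominator counts the observed grouping
    # itself, so the result can never be zero.
    return (count + 1) / (n_resamples + 1)


# Hypothetical data, not the mouse survival times from the notebook.
x_demo = np.array([5.0, 6.0, 7.0])
y_demo = np.array([1.0, 2.0, 3.0, 4.0])
p_exact, n_groupings = exact_pvalue(x_demo, y_demo)
p_random = randomized_pvalue(x_demo, y_demo)
```

With seven observations split three and four, there are 35 groupings, and only the observed one attains the largest difference in means, so the exact $p$-value here is 1/35. The randomized value approximates it and is bounded below by `1 / (n_resamples + 1)`.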