-
Notifications
You must be signed in to change notification settings - Fork 12
[CORE] remove one_minus_pval attribute, add fwer_selection
#541
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
one_minus_pval attribute, add fwer_selection
|
I have been encountering several failures in the tests for DL, CluDL, and EnCluDL. I replaced a single test FDP < FDR with the mean FDP over 10 repeats. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #541 +/- ##
==========================================
- Coverage 98.36% 98.25% -0.12%
==========================================
Files 23 23
Lines 1591 1604 +13
==========================================
+ Hits 1565 1576 +11
- Misses 26 28 +2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
bthirion
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx ! This looks good. Just provided a few suggestions.
| selected_dl = (dl.pvalues_ < (fwer_target / 2) / n_features) | ( | ||
| dl.one_minus_pvalues_ < (fwer_target / 2) / n_features | ||
| ) | ||
| selected_dl = dl.fwer_selection(fwer=fwer_target / 2, n_tests=n_features) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In examples, you don't want to trick the thresholds. Rather have an explicit two_sided=True and let the method deal with the threshold adaptation.
| ) | ||
|
|
||
|
|
||
| pval_cludl, _, one_minus_pval_cludl, _ = pval_from_two_sided_pval_and_sign( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we avoid the one_minus_pval_cludl here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| if alternative_hypothesis is None or alternative_hypothesis: | ||
| threshold_one_minus_pvalues = fdr_threshold( | ||
| self.one_minus_pvalues_, | ||
| one_minus_pval, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we get rid of it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| procedure : {'bonferroni'}, default='bonferroni' | ||
| The FWER control method to use: | ||
| - 'bonferroni': Bonferroni correction | ||
| n_tests : int or None, default=None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would n_tests differ from len(self.pvalues_) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For CluDL, for instance, it is the number of clusters, although the p-values have the same shape as the input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, you're right.
| n_tests = self.importances_.shape[0] | ||
| threshold_pvalue = fwer / n_tests | ||
| selected = (self.pvalues_ < threshold_pvalue).astype(int) | ||
| if two_tailed_test: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we should set threshold_pvalue to fwer / (2 * n_tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
| target_fdr = 0.1 | ||
| test_tol = 0.1 | ||
|
|
||
| for _ in range(500): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, that's a lot !
| def test_fwer_selection(): | ||
| """ | ||
| Test that the FWER selection procedure achieves the desired guarantees. | ||
| For 500 draws of p-values with 10 important ones the rest drawn from a uniform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's limit it to a small number. 50 ?
|
|
||
| assert_almost_equal(importances, beta, decimal=1) | ||
| selected = desparsified_lasso.fwer_selection(fwer=alpha, two_tailed_test=False) | ||
| fdp, power = fdp_power(np.where(selected)[0], np.where(important)[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you control the FWER, the FDP should be 0 with high probability
Changes in the core API
one_minus_pvalfromBaseVariableImportance, it is now only computed as an intermediate step offdr_selectionwhentwo_tailed_test=Trueone_minus_pvalwas used in some examples for FWER control. I also added a methodfwer_selectionwhich currently only supports the Bonferroni procedure.two_tailed_test=True, -1 for negative effects.BaseVariableImportance,two_tailed_test=False. For DL, CluDL, EnCluDL, this default is overwritten toTrueOther changes (most of the diff)
one_minus_pval. Replace withfwer_selection