@@ -85,24 +85,25 @@ and others are preferred, but if desired, usage follows the same pattern:
 )
 
 model = SomeSKLearnModel()
-scorer = SupervisedScorer("accuracy", test_data, default=0)
-utility = ModelUtility(model, scorer, ...)
+scorer = SupervisedScorer("accuracy", test_data, default=0.0)
+utility = ModelUtility(model, scorer)
 sampler = UniformSampler(seed=42)
-stopping = MaxSamples(5000)
+stopping = MaxSamples(sampler, 5000)
 valuation = ShapleyValuation(utility, sampler, stopping)
 with parallel_config(n_jobs=16):
     valuation.fit(training_data)
 result = valuation.values()
 ```
 
-The DataFrames returned by most Monte Carlo methods will contain approximate
-standard errors as an additional column, in this case named `cmc_stderr`.
-
-Note the usage of the object [MaxUpdates][pydvl.value.stopping.MaxUpdates] as the
-stop condition. This is an instance of a
-[StoppingCriterion][pydvl.value.stopping.StoppingCriterion]. Other examples are
-[MaxTime][pydvl.value.stopping.MaxTime] and
-[AbsoluteStandardError][pydvl.value.stopping.AbsoluteStandardError].
+Note the use of [MaxSamples][pydvl.value.stopping.MaxSamples] as the stopping
+condition: it takes the sampler as an argument. This is a special instance of a
+[StoppingCriterion][pydvl.value.stopping.StoppingCriterion]. Other examples,
+not tied to the sampler, are [MaxTime][pydvl.value.stopping.MaxTime] (stops
+after a given wall-clock time), [MinUpdates][pydvl.value.stopping.MinUpdates]
+(tracks the number of updates to the individual values), and
+[AbsoluteStandardError][pydvl.value.stopping.AbsoluteStandardError] (not very
+reliable as a stopping criterion), among others.
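Conceptually, a stopping criterion is a predicate queried after value updates, and criteria can be combined with `|`. The following self-contained sketch shows the idea driving a permutation-based Monte Carlo Shapley loop. The classes here are toy stand-ins whose names mirror pyDVL's criteria, not the library's actual implementation:

```python
import random
import time

# Toy stand-ins for stopping criteria (illustrative only, not pyDVL's API):
# each is a predicate on the number of value updates so far, and `|` builds a
# criterion that stops as soon as either of its operands does.
class OrCriterion:
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __call__(self, n_updates):
        return self.a(n_updates) or self.b(n_updates)


class MinUpdates:
    def __init__(self, n):
        self.n = n

    def __call__(self, n_updates):
        return n_updates >= self.n

    def __or__(self, other):
        return OrCriterion(self, other)


class MaxTime:
    def __init__(self, seconds):
        self.deadline = time.monotonic() + seconds

    def __call__(self, n_updates):
        return time.monotonic() >= self.deadline

    def __or__(self, other):
        return OrCriterion(self, other)


def shapley_fit(utility, n, is_done, seed=42):
    """Permutation Monte Carlo Shapley: average marginal utilities over random
    permutations of the index set until the stopping criterion triggers."""
    rng = random.Random(seed)
    sums, counts = [0.0] * n, [0] * n
    n_updates = 0
    idx = list(range(n))
    while not is_done(n_updates):
        rng.shuffle(idx)
        u_prev, coalition = 0.0, set()
        for i in idx:
            coalition.add(i)
            u_curr = utility(coalition)
            sums[i] += u_curr - u_prev  # marginal utility of i
            counts[i] += 1
            n_updates += 1
            u_prev = u_curr
    return [s / c for s, c in zip(sums, counts)]


# Additive toy utility: the exact Shapley value of i is just w[i].
w = [2.0, 1.0, 0.5]
values = shapley_fit(lambda s: sum(w[i] for i in s), 3, MinUpdates(300) | MaxTime(60))
```

Because the toy utility is additive, every permutation contributes the exact marginal `w[i]`, so the estimates coincide with the true Shapley values.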
 
 
 ## A stratified approach { #stratified-shapley-value }
@@ -151,13 +152,17 @@ a way to reduce the variance of the estimator.
151152 [ ShapleyValuation] [ pydvl.valuation.methods.shapley.ShapleyValuation ] with a
152153 custom sampler, for instance
153154 [ VRDSSampler] [ pydvl.valuation.samplers.stratified.VRDSSampler ] .
+Note the use of the [History][pydvl.value.stopping.History] object, a stopping
+criterion which never triggers, but records the trace of value updates in a
+rolling memory. These data can then be used to check for convergence, for
+debugging, for plotting, etc.
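The rolling-memory idea can be sketched in a few lines. `RollingHistory` below is a hypothetical stand-in, not pyDVL's `History` class:

```python
from collections import deque

# Toy stand-in for a history-recording criterion (not pyDVL's History class):
# it never triggers stopping, but keeps the last `n_steps` snapshots of the
# values in a rolling buffer for convergence checks, debugging or plotting.
class RollingHistory:
    def __init__(self, n_steps):
        self.data = deque(maxlen=n_steps)  # old snapshots are evicted

    def __call__(self, values):
        self.data.append(list(values))  # record a snapshot of the values
        return False  # never stops the computation on its own


history = RollingHistory(n_steps=3)
for step in range(5):  # simulate five value updates
    history([step, 2 * step])
# history.data now holds only the last three snapshots:
# [[2, 4], [3, 6], [4, 8]]
```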
 
 ```python
 from pydvl.valuation import StratifiedShapleyValuation, MinUpdates, History
 training_data, test_data = Dataset.from_arrays(...)
 model = ...
 scorer = SupervisedScorer(model, test_data, default=..., range=...)
-utility = ModelUtility(model, scorer,
+utility = ModelUtility(model, scorer)
 valuation = StratifiedShapleyValuation(
     utility=utility,
     is_done=MinUpdates(min_updates) | History(n_steps=min_updates),
@@ -232,21 +237,29 @@ You can see this method in action in
 
 
 ??? example "Truncated Monte Carlo Shapley values"
+    Use of this object follows the same pattern as the previous examples,
+    except that the sampler no longer needs to be instantiated separately.
+    This has the drawback that
+    [MaxSamples][pydvl.value.stopping.MaxSamples] can no longer be used as
+    the stopping criterion, since it requires the sampler. To work around
+    this, use
+    [ShapleyValuation][pydvl.valuation.methods.shapley.ShapleyValuation]
+    directly.
+
     ```python
     from pydvl.valuation import (
-        TMCShapleyValuation,
+        MinUpdates,
         ModelUtility,
-        SupervisedScorer,
         PermutationSampler,
+        SupervisedScorer,
         RelativeTruncation,
-        MaxSamples
+        TMCShapleyValuation,
     )
 
     model = SomeSKLearnModel()
     scorer = SupervisedScorer("accuracy", test_data, default=0)
     utility = ModelUtility(model, scorer, ...)
     truncation = RelativeTruncation(rtol=0.05)
-    stopping = MaxSamples(5000)
+    stopping = MinUpdates(5000)
     valuation = TMCShapleyValuation(utility, truncation, stopping)
     with parallel_config(n_jobs=16):
         valuation.fit(training_data)
@@ -259,25 +272,24 @@ As already mentioned, with the architecture of
 [ShapleyValuation][pydvl.valuation.methods.shapley.ShapleyValuation] it is
 possible to try different importance-sampling schemes by swapping the sampler.
 Besides TMCS we also have [Owen sampling][owen-shapley-intro]
-[@okhrati_multilinear_2021], and [Beta
-Shapley][beta-shapley-intro] [@kwon_beta_2022] when $\alpha = \beta = 1.$
+[@okhrati_multilinear_2021], and [Beta Shapley][beta-shapley-intro]
+[@kwon_beta_2022] when $\alpha = \beta = 1.$
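The idea behind Owen sampling can be sketched as follows: the Shapley value of a point equals the integral over $q \in [0, 1]$ of its expected marginal utility when every other point joins the coalition independently with probability $q$ (the multilinear extension). This is an illustrative, self-contained estimator, not pyDVL's implementation:

```python
import random

# Sketch of the Owen sampling idea (illustrative, not pyDVL's code): integrate
# the expected marginal utility of i over the inclusion probability q.
def owen_shapley(n, utility, n_q=50, n_inner=10, seed=0):
    rng = random.Random(seed)
    values = [0.0] * n
    for i in range(n):
        total = 0.0
        for j in range(n_q):  # midpoint rule on the integration variable q
            q = (j + 0.5) / n_q
            for _ in range(n_inner):  # Monte Carlo over coalitions at this q
                s = {k for k in range(n) if k != i and rng.random() < q}
                total += utility(s | {i}) - utility(s)
        values[i] = total / (n_q * n_inner)
    return values


# Additive toy utility: marginals are constant, so the estimate is exact.
w = [2.0, -1.0, 0.5]
values = owen_shapley(3, lambda s: sum(w[k] for k in s))
```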
 
 A different approach is via a SAT problem, as done in [Group Testing
 Shapley][group-testing-shapley-intro] [@jia_efficient_2019].
 
-Yet another, which is applicable to any utility-based valuation method, is
-[Data Utility Learning][data-utility-learning-intro]
-[@wang_improving_2022]. This method learns a model of the utility function
-during a warmup phase, and then uses it to speed up marginal utility
-computations.
+Yet another, which is applicable to any utility-based valuation method, is [Data
+Utility Learning][data-utility-learning-intro] [@wang_improving_2022]. This
+method learns a model of the utility function during a warmup phase, and then
+uses it to speed up marginal utility computations.
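A toy sketch of the idea, with an additive stand-in for the expensive utility and a hand-rolled linear surrogate (pyDVL's actual implementation differs):

```python
import random

# Toy sketch of Data Utility Learning (hypothetical, self-contained): evaluate
# the expensive utility on sampled subsets during a warmup phase, fit a cheap
# surrogate on membership indicators, then answer further queries from it.
def true_utility(subset, w=(2.0, -1.0, 0.5, 3.0)):
    return sum(w[i] for i in subset)  # stands in for an expensive retraining


rng = random.Random(0)
n = 4
warmup = []  # (membership indicators, utility) pairs collected during warmup
for _ in range(100):
    s = {i for i in range(n) if rng.random() < 0.5}
    x = [1.0 if i in s else 0.0 for i in range(n)]
    warmup.append((x, true_utility(s)))

# Fit a linear surrogate by plain stochastic gradient descent on squared error.
coef = [0.0] * n
for _ in range(2000):
    for x, y in warmup:
        err = sum(c * xi for c, xi in zip(coef, x)) - y
        coef = [c - 0.1 * err * xi for c, xi in zip(coef, x)]


def surrogate_utility(subset):
    # Cheap approximation queried instead of the true utility after warmup.
    return sum(coef[i] for i in subset)
```

Since the toy utility is additive and noiseless, the linear surrogate recovers the per-point contributions almost exactly; in practice the surrogate only needs to be good enough to rank marginal utilities.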
 
 
 ## Model-specific methods
 
 Shapley values can have a closed form expression or a simpler approximation
 scheme when the model class is restricted. The prime example is
 [kNN-Shapley][knn-shapley-intro] [@jia_efficient_2019a], which is exact for the
-kNN model, and is $O(n_test n \log n).$
+kNN model, and is $O(n_\text{test}\ n \log n).$
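For $K = 1$ the exact computation is short enough to sketch: after sorting the training labels by distance to a test point (the $O(n \log n)$ step), a single backward pass yields all values. This is an illustrative sketch assuming $u(\emptyset) = 0$, not pyDVL's implementation; the general-$K$ recursion in the paper is analogous:

```python
# Exact Shapley values for a 1-NN classifier (the K = 1 case of kNN-Shapley),
# for a single test point, assuming u of the empty set is 0.
def nn_shapley(labels_sorted, y_test):
    # labels_sorted: training labels ordered from nearest to farthest
    n = len(labels_sorted)
    s = [0.0] * n
    # farthest point: its value is match / n
    s[n - 1] = (1.0 if labels_sorted[n - 1] == y_test else 0.0) / n
    for i in range(n - 2, -1, -1):  # backward recursion over positions
        match_i = 1.0 if labels_sorted[i] == y_test else 0.0
        match_next = 1.0 if labels_sorted[i + 1] == y_test else 0.0
        # position i is 0-indexed, hence the divisor i + 1
        s[i] = s[i + 1] + (match_i - match_next) / (i + 1)
    return s


# Labels ordered from nearest to farthest; the test label is "a".
values = nn_shapley(["a", "b", "a", "b"], "a")
```

The values sum to the utility of the full training set (efficiency), and points farther than the nearest match of the correct class contribute little or negatively.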
 
 [^not1]: The quantity $u(S_{+i}) − u(S)$ is called the
     [marginal utility][glossary-marginal-utility] of the sample $x_i$ (with