 LOO is the simplest approach to valuation. Let $D$ be the training set, and
 $D_{-i}$ be the training set without the sample $x_i$. Assume some utility
-function $u(S)$ that measures the performance of a model trained on
+function $u(S)$ that measures the performance of a model trained on
 $S \subseteq D$.

-LOO assigns to each sample its *marginal utility* as value:
+LOO assigns to each sample its *marginal utility* as value:

 $$ v_\text{loo}(i) = u(D) - u(D_{-i}), $$

@@ -20,13 +20,13 @@ method. In pyDVL it is available as
 [LOOValuation][pydvl.valuation.methods.loo.LOOValuation].

 For the purposes of data valuation, this is rarely useful beyond serving as a
-baseline for benchmarking. Although it can perform astonishingly well on
-occasion.
+baseline for benchmarking (although it can perform astonishingly well on
+occasion).

 One particular weakness is that it does not necessarily correlate with an
 intrinsic value of a sample: since it is a marginal utility, it is affected by
-_diminishing returns_. Often, the training set is large enough for a single sample
-not to have any significant effect on training performance, despite any
+_diminishing returns_. Often, the training set is large enough for a single
+sample not to have any significant effect on training performance, despite any
 qualities it may possess. Whether this is indicative of low value or not depends
 on one's goals and definitions, but other methods are typically preferable.

@@ -46,3 +46,8 @@ on one's goals and definitions, but other methods are typically preferable.

 Strictly speaking, LOO can be seen as a [semivalue][semi-values-intro] where
 all the coefficients are zero except for $k=|D|-1.$
+
+!!! tip "Connection to the influence function"
+    With a slight change of perspective, the _influence function_ can be seen as
+    a first order approximation to the Leave-One-Out values. See [Approximating
+    the influence of a point][influence-of-a-point].
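
For readers of this diff, the definition $v_\text{loo}(i) = u(D) - u(D_{-i})$ can be illustrated with a minimal, library-free sketch. It does not use pyDVL or `LOOValuation`. The names `loo_values` and `toy_utility`, the toy "model" (the mean of the subset) and the target value are all assumptions made for illustration only.

```python
from typing import Callable, List, Sequence


def loo_values(
    data: Sequence[float],
    utility: Callable[[Sequence[float]], float],
) -> List[float]:
    """Return v_loo(i) = u(D) - u(D_{-i}) for every index i (hypothetical helper)."""
    full_utility = utility(data)  # u(D), computed once
    values = []
    for i in range(len(data)):
        # D_{-i}: the training set without the i-th sample
        without_i = [x for j, x in enumerate(data) if j != i]
        values.append(full_utility - utility(without_i))
    return values


def toy_utility(subset: Sequence[float], target: float = 2.0) -> float:
    """Toy utility (an assumption): the 'model' is the mean of the subset,
    scored by the negative squared error against a fixed target."""
    prediction = sum(subset) / len(subset)
    return -((prediction - target) ** 2)


D = [1.0, 2.0, 3.0, 10.0]
values = loo_values(D, toy_utility)

for x, v in zip(D, values):
    print(f"sample {x:5.1f} -> LOO value {v:+.3f}")
```

In this toy setup the outlier `10.0` receives a negative value, because removing it improves the utility. The sketch also shows the cost of the method: one utility evaluation (that is, one retraining) per sample, plus one for the full set.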