Variance reduction #1018
s3alfisc
commented
Sep 2, 2025
- Add a tutorial section
- Write notebook on variance reduction with pyfixest
btw one way to distinguish this from CUPED might be to allow for (1) heterogeneity in the autoregressive parameter in the DGP, which I remember messing with when I originally wrote this DGP to test things out, and (2) bigger variance in the time FEs, which CUPED averages over but 2WFE captures correctly.
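A rough sketch of what such a DGP could look like, for anyone who wants to play with it. This is not the DGP from the notebook: the function name `simulate_panel`, all parameter names and defaults, and the pre/post split via `n_pre` are assumptions made up for illustration. It draws a unit-specific AR(1) coefficient per subject and lets you scale the spread of the time fixed effects.

```python
import numpy as np


def simulate_panel(n_units=200, n_periods=30, n_pre=15, rho_mean=0.5,
                   rho_sd=0.2, time_fe_sd=2.0, effect=0.5, seed=0):
    """Hypothetical panel DGP with heterogeneous AR(1) errors and noisy time FEs."""
    rng = np.random.default_rng(seed)
    unit_fe = rng.normal(0.0, 1.0, n_units)
    # Larger time_fe_sd means more period-to-period variation shared by all units.
    time_fe = rng.normal(0.0, time_fe_sd, n_periods)
    # One autoregressive parameter per unit, clipped to keep the process stationary.
    rho = np.clip(rng.normal(rho_mean, rho_sd, n_units), -0.95, 0.95)
    treated = rng.integers(0, 2, n_units)

    # Unit-level AR(1) errors: eps[i, t] = rho[i] * eps[i, t-1] + shock[i, t].
    shocks = rng.normal(0.0, 1.0, (n_units, n_periods))
    eps = np.zeros((n_units, n_periods))
    eps[:, 0] = shocks[:, 0]
    for t in range(1, n_periods):
        eps[:, t] = rho * eps[:, t - 1] + shocks[:, t]

    # Assumption: treatment only shifts outcomes in periods t >= n_pre.
    post = (np.arange(n_periods) >= n_pre).astype(float)
    y = (unit_fe[:, None] + time_fe[None, :]
         + effect * treated[:, None] * post[None, :] + eps)
    return y, treated, rho
```

Setting `rho_sd=0` and a small `time_fe_sd` should recover the homogeneous case, which gives a baseline to compare against.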
Nice, I'll play around with it & add a section =)
Small typo
Should be
Oh sweet, thank you!
Any other thoughts on the vignette? @apoorvalal suggested comparing CUPED with panel regressions when the errors are autocorrelated, since panel regression should look better in that case?
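For reference, a minimal sketch of the CUPED side of that comparison, using the standard adjustment with a single pre-period covariate. The helper name `cuped_adjust` and the toy data are assumptions for illustration, not code from the notebook.

```python
import numpy as np


def cuped_adjust(y_post, x_pre):
    """Return the CUPED-adjusted outcome y - theta * (x - mean(x))."""
    y = np.asarray(y_post, dtype=float)
    x = np.asarray(x_pre, dtype=float)
    # theta = cov(x, y) / var(x), i.e. the OLS slope of y on x.
    theta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())


# Toy data: the post-period outcome is correlated with the pre-period covariate.
rng = np.random.default_rng(0)
x_pre = rng.normal(size=500)
y_post = 0.8 * x_pre + rng.normal(size=500)
y_adj = cuped_adjust(y_post, x_pre)
```

The adjusted outcome keeps the same mean as `y_post` but has lower variance, with the reduction governed by the squared correlation between the pre-period covariate and the outcome.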
One for me below, curious what @juanitorduz thinks since I think this PR is related to #1017?

The example shown has 30 observations per subject. I think that would correspond to a case where you randomize subjects over T days and only analyze the experiment after T+30 days (waiting for the subjects randomized on the last day to complete their 30-day observation period). In reality, we may want to do inference before all subjects have completed their 30-day observation period (depends on the design, I suppose).

In the case where not all subjects have the same number of observations, we would need to use a ratio metric plus the delta method to get the right point estimate and standard error. I don't think there is (yet) a CUPED interpretation for ratio metrics as a subject-level outcome regression, which makes it hard to use in this case. My understanding is that we could instead use a two-way fixed effects model (or some other approach, I'm not completely sure). Do you think it might be worthwhile to explore a case in which the experiment does not have the same observation length for all subjects?
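To make the ratio metric point concrete, here is a small sketch of the delta method standard error for one experiment arm, treating subjects as the independent units. The function name `ratio_with_delta_se` and its inputs (per-subject outcome totals and per-subject observation counts) are assumptions for illustration.

```python
import math
from statistics import fmean, variance


def ratio_with_delta_se(totals, counts):
    """Ratio of sums and its delta-method standard error.

    totals[i] is the summed outcome for subject i and counts[i] is the
    number of observations for subject i.
    """
    k = len(totals)
    ratio = sum(totals) / sum(counts)
    # Linearization: the ratio behaves like the mean of (s_i - ratio * n_i) / mean(n).
    resid = [s - ratio * n for s, n in zip(totals, counts)]
    var = variance(resid) / (k * fmean(counts) ** 2)
    return ratio, math.sqrt(var)
```

When every subject has the same number of observations, this reduces to the usual standard error of a mean of subject-level averages. For a treatment versus control comparison with independent arms, the standard error of the difference would be the square root of the sum of the two squared standard errors.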