
Conversation

@s3alfisc
Member

@s3alfisc s3alfisc commented Sep 2, 2025

  • Add a tutorial section
  • Write notebook on variance reduction with pyfixest


@s3alfisc s3alfisc marked this pull request as draft September 2, 2025 20:46
@codecov

codecov bot commented Sep 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag            Coverage  Δ
core-tests      76.78%    <ø> (ø)
tests-extended  ?
tests-vs-r      15.81%    <ø> (ø)

Flags with carried forward coverage won't be shown.
see 7 files with indirect coverage changes


@apoorvalal
Member

btw one way to distinguish from CUPED might be to allow for (1) heterogeneity in the autoregressive parameter in the DGP, which I remember messing with when I originally wrote this DGP to test things out, and (2) bigger variance in the time FEs, which CUPED averages over but 2WFE captures correctly.
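For reference, the two ingredients above could be sketched roughly like this in numpy. This is an illustrative DGP, not the notebook's actual one; all names and parameter values (`rho`, `tau`, the FE variances) are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 500, 30

# (1) Unit-specific AR(1) coefficients: heterogeneity that a single
# CUPED pre-period adjustment coefficient cannot fully absorb.
rho = rng.uniform(0.2, 0.9, size=n_units)

# (2) Time fixed effects with large variance: CUPED averages over
# these, while a 2WFE regression absorbs them directly.
time_fe = rng.normal(0.0, 3.0, size=n_periods)
unit_fe = rng.normal(0.0, 1.0, size=n_units)

treat = rng.integers(0, 2, size=n_units)   # cross-sectional randomization
post = np.arange(n_periods) >= n_periods // 2
tau = 0.5                                   # illustrative treatment effect

y = np.zeros((n_units, n_periods))
for t in range(1, n_periods):
    eps = rng.normal(0.0, 1.0, size=n_units)
    y[:, t] = (unit_fe + time_fe[t]
               + rho * y[:, t - 1]          # heterogeneous persistence
               + tau * treat * post[t]
               + eps)
```

Running CUPED (residualizing on the pre-period mean) and a 2WFE regression on draws from a DGP like this should make the variance gap between the two visible.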

@s3alfisc
Member Author

s3alfisc commented Sep 4, 2025

Nice, I'll play around with it & add a section =)

@Dpananos
Contributor

Dpananos commented Nov 8, 2025

Small typo

Because linear regression is just a fency way to compute and compare differences, we could also have estimated this via pf.feols():

Should be

Because linear regression is just a fancy way to compute and compare differences, we could also have estimated this via pf.feols():

@s3alfisc
Member Author

s3alfisc commented Nov 8, 2025

Oh sweet, thank you!

@s3alfisc
Member Author

s3alfisc commented Nov 8, 2025

Any other thoughts on the vignette? @apoorvalal suggested comparing CUPED with panel regressions when there are autocorrelated errors, as panel regression should look better in this case?

@Dpananos
Contributor

Dpananos commented Nov 9, 2025

One for me below, curious what @juanitorduz thinks since I think this PR is related to #1017?

The example shown has 30 observations per subject. I think that would correspond to a case where you randomize subjects over T days, and then only analyze the experiment after T+30 days (waiting for the subjects on the last day of randomization to complete their 30 day observation period).

In reality, we may want to do inference before all subjects have completed their 30 day observation period (depends on the design, I suppose). In the case where not all subjects have the same number of observations, we would need to use a ratio metric + the delta method to get the right point estimate and standard error. I don't think there is (yet) a CUPED interpretation for ratio metrics as a subject-level outcome regression, which makes it hard to use in this case.

My understanding is that we could instead use a two way fixed effect (or some other approach, I'm not completely sure).

Do you think it might be worthwhile to explore a case in which the experiment does not have the same observation length for all subjects?
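To illustrate why an unbalanced panel is not a problem for 2WFE: below is a hand-rolled sketch (plain numpy/pandas, not pyfixest's implementation) where subjects have ragged post-randomization windows and the treatment effect is recovered via the two-way within transformation. All names and parameters (`T0`, the effect of 0.3, the trend) are invented for the example:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
T0 = 10  # illustrative pre-period length, shared by all subjects
rows = []
for unit in range(300):
    treat = unit % 2
    n_post = int(rng.integers(1, 21))  # ragged post-periods: 1..20 observed days
    for day in range(T0 + n_post):
        d = treat * (day >= T0)        # treatment switches on after day T0
        rows.append({"unit": unit, "day": day, "d": d,
                     "y": 0.3 * d + 0.05 * day + rng.normal()})
df = pd.DataFrame(rows)

# Two-way within transformation: alternately demean by unit and by day.
# Packages like pyfixest do this exactly via alternating projections;
# a few dozen iterations suffice here.
def demean(s, by):
    return s - s.groupby(by).transform("mean")

yd, dd = df["y"], df["d"]
for _ in range(50):
    yd = demean(demean(yd, df["unit"]), df["day"])
    dd = demean(demean(dd, df["unit"]), df["day"])

# TWFE coefficient on the treatment dummy
beta = float(np.dot(dd, yd) / np.dot(dd, dd))
```

Each subject contributes however many days it has been observed, and the unit and day fixed effects soak up the staggered observation windows without any per-subject averaging.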

