Add 1st stage regression in Feiv class #525

Merged
merged 4 commits into from
Jun 27, 2024

Jayhyung
Member

Hi Alex @s3alfisc! More details with this PR will follow soon.

codecov bot commented Jun 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
pyfixest/estimation/FixestMulti_.py 82.54% <ø> (ø)
pyfixest/estimation/feiv_.py 97.61% <100.00%> (ø)
tests/test_iv.py 100.00% <100.00%> (ø)

... and 27 files with indirect coverage changes

@s3alfisc
Member

Very cool! Looking forward to it =)

@Jayhyung
Member Author

Jayhyung commented Jun 23, 2024

I have a few concerns from working on this issue.

  1. Does the Feiv class in the codebase allow multiple endogenous variables? I ask because the IV test cases in the test code (lines 97 to 117 in test_vs_fixest.py) only cover the single-variable case. The first stage in my code also handles only a single endogenous variable, but if you would like to generalize the code to allow several, I'm happy to work more on this!
  2. I defined several attributes within the Feiv class related to the first-stage regression (_pi_hat: estimated coefficient, X_hat: predicted values, _v_hat: residuals). Please let me know if you want me to define more or fewer attributes related to the first stage.
  3. I would add test code that checks whether the result of the first-stage regression (e.g. _pi_hat) matches the result from feols. Do you think this is a reasonable way to test the first-stage part of the get_fit method in the Feiv class? Also, where exactly should I add the test code in test_vs_fixest.py?

Thanks!
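
For readers following along, the three first-stage quantities named above can be illustrated with a minimal plain-Python sketch. It assumes a single endogenous variable, a single instrument, and an intercept; the function name `first_stage` and the toy data are hypothetical and are not taken from the pyfixest code.

```python
def first_stage(endog, instrument):
    """Regress the endogenous variable on an intercept and one instrument.

    Hypothetical helper for illustration only; returns the analogues of
    _pi_hat (coefficients), X_hat (fitted values) and _v_hat (residuals).
    """
    n = len(endog)
    z_bar = sum(instrument) / n
    x_bar = sum(endog) / n
    # OLS slope = cov(z, x) / var(z); the intercept follows from the means.
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in zip(instrument, endog))
    s_zz = sum((z - z_bar) ** 2 for z in instrument)
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    pi_hat = (intercept, slope)
    x_hat = [intercept + slope * z for z in instrument]
    v_hat = [x - xh for x, xh in zip(endog, x_hat)]
    return pi_hat, x_hat, v_hat


# Toy data, made up for this sketch.
z = [0.0, 1.0, 2.0, 3.0, 4.0]
x = [1.1, 2.9, 5.2, 6.8, 9.0]
pi_hat, x_hat, v_hat = first_stage(x, z)
```

By construction, the residuals `v_hat` sum to zero and are orthogonal to the instrument, which is a cheap sanity check on any first-stage implementation.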

@s3alfisc
Member

Hi @Jayhyung - sorry for my delayed response, I have been swamped with work over the last few days :/

> Does the Feiv class in the codebase allow multiple endogenous variables? I ask because the IV test cases in the test code (lines 97 to 117 in test_vs_fixest.py) only cover the single-variable case. The first stage in my code also handles only a single endogenous variable, but if you would like to generalize the code to allow several, I'm happy to work more on this!

Yes, only one endogenous variable is allowed. It would be quite cool to allow for more, but this would require reworking the FixestFormulaParser, and it might not be a small endeavor. So for now, I would recommend keeping this as a separate issue.

> I defined several attributes within the Feiv class related to the first-stage regression (_pi_hat: estimated coefficient, X_hat: predicted values, _v_hat: residuals). Please let me know if you want me to define more or fewer attributes related to the first stage.

For now, I think that the coefficient, predicted values and residuals are all that we need, right? So this sounds good to me =)

> I would add test code that checks whether the result of the first-stage regression (e.g. _pi_hat) matches the result from feols. Do you think this is a reasonable way to test the first-stage part of the get_fit method in the Feiv class? Also, where exactly should I add the test code in test_vs_fixest.py?

Sounds like a good strategy =) I think that you could just create a new test file, e.g. test_iv.py?
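
The testing strategy agreed on here (compare the stored first-stage coefficients against an independent OLS fit) could look roughly like the sketch below. It is written in plain Python so that it runs without pyfixest: `TinyIV` and `ols` are hypothetical stand-ins for the IV class and for feols, and only the attribute name `_pi_hat` is borrowed from the discussion. The actual tests in tests/test_iv.py may be organized differently.

```python
import math


def ols(y, x):
    """Reference fit of y = a + b*x via the covariance formula (stand-in for feols)."""
    n = len(y)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    return (y_bar - b * x_bar, b)


class TinyIV:
    """Hypothetical stand-in for an IV estimator that stores its first stage."""

    def __init__(self, endogvar, instrument):
        self.endogvar = endogvar
        self.instrument = instrument

    def get_fit(self):
        # First stage via the 2x2 normal equations, solved with Cramer's rule,
        # so the test below compares two independent code paths.
        x, z = self.endogvar, self.instrument
        n = len(x)
        s_z, s_x = sum(z), sum(x)
        s_zz = sum(zi * zi for zi in z)
        s_zx = sum(zi * xi for zi, xi in zip(z, x))
        det = n * s_zz - s_z * s_z
        a = (s_x * s_zz - s_z * s_zx) / det
        b = (n * s_zx - s_z * s_x) / det
        self._pi_hat = (a, b)


def test_first_stage_matches_ols():
    # Toy data, made up for this sketch.
    z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    x = [0.5, 1.4, 2.6, 3.3, 4.7, 5.4]
    model = TinyIV(endogvar=x, instrument=z)
    model.get_fit()
    expected = ols(x, z)
    for got, want in zip(model._pi_hat, expected):
        assert math.isclose(got, want, rel_tol=1e-9, abs_tol=1e-12)


test_first_stage_matches_ols()
```

The point of the design is that the reference fit and the estimator under test do not share code, so a regression in either one makes the comparison fail.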

@Jayhyung
Member Author

Thanks for the feedback! I will work on my next iteration. Stay tuned. :)

@Jayhyung
Member Author

Just added the error testing file! Please let me know if anything extra should be done.

@@ -133,6 +139,28 @@ def get_fit(self) -> None:
_Z = self._Z
_Y = self._Y

# Start First Stage

model1 = Feols(
Member

@s3alfisc s3alfisc Jun 25, 2024

I think it is correct that the endogenous variable is always the first one in _X. But it would likely be safer to use the endogvar created via the model_matrix_fixest function?

This would require adding an endogvar argument to Feiv.__init__ and passing it to Feiv.get_fit().

We would also have to demean endogvar in FixestMulti.

Alternatively, we could also keep this as is, as the unit tests you've implemented would certainly catch it if we broke this logic. =)

What do you think @Jayhyung ?
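
To make the suggestion concrete: the idea is to hand the endogenous variable to the first stage explicitly, after demeaning it by the fixed effects, instead of relying on its column position in _X. The plain-Python sketch below illustrates this under simplifying assumptions (one fixed effect, one instrument). The helper names `demean_by_group` and `first_stage_slope` are hypothetical and do not correspond to pyfixest functions.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean, as one-way fixed-effect demeaning does."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def first_stage_slope(endogvar, instrument):
    """Slope of demeaned endogvar on demeaned instrument (no intercept needed)."""
    num = sum(z * x for z, x in zip(instrument, endogvar))
    den = sum(z * z for z in instrument)
    return num / den


# Toy data, made up for this sketch: endogvar = 2 * z + group effect (0 or 10).
fe = ["a", "a", "a", "b", "b", "b"]
z = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
endogvar = [2.0, 4.0, 6.0, 12.0, 14.0, 16.0]

# The endogenous variable is passed explicitly rather than read from the
# first column of a design matrix, and it is demeaned like every other variable.
z_dm = demean_by_group(z, fe)
endogvar_dm = demean_by_group(endogvar, fe)
slope = first_stage_slope(endogvar_dm, z_dm)
```

Demeaning removes the group effect, so the first-stage slope recovers the coefficient on the instrument without the regression having to estimate the fixed effects directly.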

Member Author

I think your approach is more sensible, as it covers more general cases. I will take this into account in the next iteration!

demean "endogva" in FixestMulti.py
@Jayhyung
Member Author

I added endogvar, as you suggested, in the latest iteration. Let me know if anything else needs to be done!

@@ -31,7 +33,8 @@ class Feiv(Feols):
Names of the coefficients of Z.
collin_tol : float
Tolerance for collinearity check.
weights_name : Optional[str]
weights_name : Op endgvar : np.ndarray
Member

Small formatting issue here =)

@Jayhyung
Member Author

Thanks! Just fixed the formatting issue. Let me know if anything comes up!

@s3alfisc s3alfisc merged commit 88c87cd into py-econometrics:master Jun 27, 2024
7 checks passed
@s3alfisc
Member

Perfect. It's merged =) thanks @Jayhyung!

2 participants