Unexpected behavior with resids and predict post feols #596

Closed
daltonm-bls opened this issue Aug 30, 2024 · 5 comments · Fixed by #597

Comments

@daltonm-bls

Python Version: 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]
Operating System: Linux-4.18.0-553.8.1.el8_10.x86_64-x86_64-with-glibc2.28
pyfixest Version: 0.24.0

Expected behavior:
model.resid() should give an array of floats that represent the residuals
model.predict() should give an array of floats that represent the predicted values

Actual behavior:
In a regression with only absorbed fixed effects (all I care about is the residual, so this is a plausible model), the residuals and the predicted values appear to be swapped: model.resid() returns the predictions and model.predict() returns the residuals. Furthermore, model.resid() returns an array of arrays rather than an array of floats.

Recreated:

import pandas as pd
import pyfixest as pf

data = pf.get_data()


yv = 'Y'
fe_v = ['f1', ]

dfr = data[[yv]+fe_v].dropna()


####### THIS PRODUCES WRONG RESULT
t_equation = f'{yv} ~ 1 | {" + ".join(fe_v)}'
# Running a fixed effects regression
model = pf.feols(t_equation, data=dfr,
)

print('this is an array, instead of a float')
print(type(model.resid()[0]))
resids = [i[0] for i in model.resid()]
dfr['yv_hat'] = model.predict()
dfr['resid'] = resids 


print('residuals are clearly the predicted values, and predicted values are the residuals')
print(dfr[yv].mean())
print(dfr['yv_hat'].mean())
print(dfr['resid'].mean())

####### THIS PRODUCES CORRECT RESULT
### moves the fixed effect from the absorbed part of the formula to a right-hand-side variable
t_equation = f'{yv} ~ C({") + C(".join(fe_v)}) '
# Running a fixed effects regression
model = pf.feols(t_equation, data=dfr,
)

print('this is a float')
print(type(model.resid()[0]))
resids = model.resid()
dfr['yv_hat'] = model.predict()
dfr['resid'] = resids 


print('correct predicted value and residuals')
print(dfr[yv].mean())
print(dfr['yv_hat'].mean())
print(dfr['resid'].mean())
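
The mean comparison printed above can be turned into an explicit check. The following is a minimal sketch in plain Python, not part of pyfixest: the helper name `diagnose_fit` and the toy numbers are made up for illustration. It relies on two facts that hold for a least-squares fit containing an intercept or fixed effects: the residuals average to zero, and the predictions average to the mean of the outcome. Note that `y == y_hat + resid` still holds when the two arrays are swapped, so only the mean checks reveal the swap.

```python
from statistics import fmean


def diagnose_fit(y, y_hat, resid, tol=1e-8):
    """Hypothetical helper: check basic OLS identities on plain lists of floats."""
    return {
        # y = y_hat + resid holds even if the two arrays are swapped
        "identity_holds": all(
            abs(a - (b + c)) <= tol for a, b, c in zip(y, y_hat, resid)
        ),
        # with an intercept or fixed effects, residuals average to zero
        "resid_mean_zero": abs(fmean(resid)) <= tol,
        # ... and predictions average to the mean of the outcome
        "pred_mean_matches_y": abs(fmean(y_hat) - fmean(y)) <= tol,
    }


# Toy data (illustrative only): two groups with means 2 and 12.
y = [1.0, 3.0, 10.0, 14.0]
y_hat = [2.0, 2.0, 12.0, 12.0]
resid = [-1.0, 1.0, -2.0, 2.0]

correct = diagnose_fit(y, y_hat, resid)
swapped = diagnose_fit(y, resid, y_hat)  # arrays passed in the wrong roles
```

With real model output, the arrays would first need flattening to one dimension, as done with `[i[0] for i in model.resid()]` above.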
@s3alfisc
Member

Hi @daltonm-bls, thanks for reporting this! I can confirm:

%load_ext autoreload
%autoreload 2

import pandas as pd
import pyfixest as pf

data = pf.get_data()
dfr = data[["Y", "f1"]].dropna()
fml_1 = "Y ~ 1 | f1"
fml_2 = "Y ~ C(f1)"

model_1 = pf.feols(fml = fml_1, data=dfr)
model_2 = pf.feols(fml = fml_2, data = dfr)

model_1.resid().flatten()[0:5]
# array([-1.38608153,  2.05742899,  0.10149807,  1.31744198, -2.12153977])
model_2.resid().flatten()[0:5]
# array([-0.07256172,  1.26208442,  0.03292194, -1.59579222,  0.60175012])
model_1.predict().flatten()[0:5]
# array([-0.07256172,  1.26208442,  0.03292194, -1.59579222,  0.60175012])
model_2.predict().flatten()[0:5]
# array([-1.38608153,  2.05742899,  0.10149807,  1.31744198, -2.12153977])

Will tackle this tomorrow!

@s3alfisc
Member

s3alfisc commented Aug 30, 2024

This is the problematic line:

FIT._u_hat = Y.to_numpy() - Yd_array
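
To see why this line swaps the two quantities, here is a minimal pure-Python sketch. It assumes that `Yd_array` holds the outcome after the fixed effects have been demeaned out, which the name suggests but this thread does not state. The helper names `group_means` and `demean` and the toy data are hypothetical and are not pyfixest code. In a model such as `Y ~ 1 | f1` there are no remaining covariates, so the demeaned outcome already is the residual, and subtracting it from `Y` yields the fitted values (the group means) instead.

```python
from collections import defaultdict


def group_means(y, groups):
    """Return, for each observation, the mean of y within its group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, g in zip(y, groups):
        totals[g] += value
        counts[g] += 1
    return [totals[g] / counts[g] for g in groups]


def demean(y, groups):
    """Subtract the group mean from each observation (one-way fixed effect)."""
    return [value - m for value, m in zip(y, group_means(y, groups))]


# Toy data (illustrative only): group "a" has mean 2, group "b" has mean 12.
y = [1.0, 3.0, 10.0, 14.0]
f1 = ["a", "a", "b", "b"]

y_demeaned = demean(y, f1)

# With only fixed effects, the demeaned outcome is the residual.
correct_resid = y_demeaned

# Mirrors the shape of `Y - Yd_array`: this is the fitted value, not the residual.
buggy_resid = [value - d for value, d in zip(y, y_demeaned)]
```

Here `buggy_resid` equals the per-observation group means, which matches the symptom reported above: `resid()` returned what `predict()` should have returned.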

@s3alfisc
Member

@all-contributors please add @daltonm-bls for bug

Contributor

@s3alfisc

I've put up a pull request to add @daltonm-bls! 🎉

@s3alfisc
Member

Fixed in version 0.24.2, now available via PyPI. Thanks for reporting @daltonm-bls!

import pyfixest as pf

data = pf.get_data()
fit = pf.feols("Y ~1 |f1", data=data)
fitC = pf.feols("Y ~ C(f1)", data=data)

print(fit.resid()[0:5])
# [-0.07256172  1.26208442  0.03292194 -1.59579222  0.60175012]
print(fitC.resid()[0:5])
#[-0.07256172  1.26208442  0.03292194 -1.59579222  0.60175012]

print(fit.predict()[0:5])
# [-1.38608153  2.05742899  0.10149807  1.31744198 -2.12153977]
print(fitC.predict()[0:5])
# [-1.38608153  2.05742899  0.10149807  1.31744198 -2.12153977]
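
Since the fix shipped in 0.24.2, a script that depends on `resid()` and `predict()` could guard against older installs. The sketch below uses only the standard library. The names `FIXED_IN`, `release_tuple` and `has_resid_fix` are hypothetical, and the parser is deliberately naive: it reads only the leading `X.Y.Z` numbers. It says nothing about which releases before 0.24.2 were actually affected; the thread only confirms the bug for 0.24.0.

```python
import re

# Release that contains the fix, according to this thread.
FIXED_IN = (0, 24, 2)


def release_tuple(version):
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints.

    Naive on purpose: suffixes such as 'rc1' or '.dev0' are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_resid_fix(version):
    """True if the given version is at least the release that contains the fix."""
    return release_tuple(version) >= FIXED_IN
```

The installed version string can be obtained with `importlib.metadata.version("pyfixest")` and passed to `has_resid_fix`.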
