GLM Support #758

Open. Wants to merge 16 commits into base: master.

@s3alfisc (Member) commented Dec 22, 2024

This PR

  • Adds an abstract Feglm class from which the different GLM families can inherit.
  • Adds a pf.feglm(..., family = "logit") front end to call these classes.
  • The implementation closely follows this paper by Stammann: https://arxiv.org/pdf/1707.01815. The equations and the algorithm are implemented "verbatim", so it should be easy to match the code to the paper.
  • Families to support for now: (unconditional) logit, probit, and an alternative Poisson implementation.
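
To illustrate the "abstract class plus families" design, here is a minimal sketch. It is not the code in this PR: the names `GlmFamily`, `LogitFamily`, `PoissonFamily`, `irls_weights`, and `working_response` are hypothetical, and the sketch only assumes the standard GLM quantities (inverse link, its derivative, and the variance function) that an IWLS routine needs from each family.

```python
from abc import ABC, abstractmethod

import numpy as np


class GlmFamily(ABC):
    """Hypothetical base class: each family supplies the pieces IWLS needs."""

    @abstractmethod
    def mean(self, eta: np.ndarray) -> np.ndarray:
        """Inverse link: map the linear predictor eta to the mean mu."""

    @abstractmethod
    def dmu_deta(self, eta: np.ndarray) -> np.ndarray:
        """Derivative of the mean with respect to eta."""

    @abstractmethod
    def variance(self, mu: np.ndarray) -> np.ndarray:
        """Variance function V(mu)."""

    def irls_weights(self, eta: np.ndarray) -> np.ndarray:
        """Working weights w = (dmu/deta)^2 / V(mu)."""
        return self.dmu_deta(eta) ** 2 / self.variance(self.mean(eta))

    def working_response(self, y: np.ndarray, eta: np.ndarray) -> np.ndarray:
        """Working response z = eta + (y - mu) / (dmu/deta)."""
        return eta + (y - self.mean(eta)) / self.dmu_deta(eta)


class LogitFamily(GlmFamily):
    """Binary outcome with the logistic link."""

    def mean(self, eta):
        return 1.0 / (1.0 + np.exp(-eta))

    def dmu_deta(self, eta):
        mu = self.mean(eta)
        return mu * (1.0 - mu)

    def variance(self, mu):
        return mu * (1.0 - mu)


class PoissonFamily(GlmFamily):
    """Count outcome with the log link."""

    def mean(self, eta):
        return np.exp(eta)

    def dmu_deta(self, eta):
        return np.exp(eta)

    def variance(self, mu):
        return mu
```

With this layout, a shared fitting loop only calls `irls_weights` and `working_response`, so adding a family (probit, say) would mean implementing the three abstract methods and nothing else.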

Other PR

  • Later: anything else of interest, e.g. conditional logit (clogit, a different estimator) or negative binomial (negbin)?
  • Bias corrections for the incidental parameters problem

TODO:

  • Implement an abstract GLM base class (inheriting from Fepois for now; later this should be reversed) with the IWLS algorithm and step halving
  • Implement the Logit class based on Stammann (not yet converging, there is a bug somewhere)
  • Store the IWLS parameters so that post-estimation methods can use them (e.g. to compute variance-covariance matrices). See the Fepois.get_fit() method for details.
  • Add tests against fixest::feglm()
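
For readers unfamiliar with IWLS and step halving, the following is a rough sketch of the idea for a logit model without fixed effects. It is not the implementation in this PR: the function names, the tolerance, and the convergence rule are assumptions chosen for illustration, and the fixed-effects handling is left out entirely.

```python
import numpy as np


def logit_deviance(y, eta):
    """Binomial deviance (-2 * log-likelihood) for a 0/1 outcome."""
    # log(1 + exp(eta)) is computed stably via logaddexp
    return 2.0 * np.sum(np.logaddexp(0.0, eta) - y * eta)


def fit_logit_iwls(X, y, tol=1e-8, max_iter=25, max_halvings=10):
    """Illustrative IWLS for logit (no fixed effects) with step halving."""
    beta = np.zeros(X.shape[1])
    dev = logit_deviance(y, X @ beta)
    for _ in range(max_iter):
        eta = X @ beta
        mu = 1.0 / (1.0 + np.exp(-eta))
        w = mu * (1.0 - mu)          # working weights
        z = eta + (y - mu) / w       # working response
        # weighted least squares step: solve (X'WX) b = X'Wz
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)

        # step halving: shrink the update while it increases the deviance
        step = beta_new - beta
        dev_new = logit_deviance(y, X @ beta_new)
        halvings = 0
        while dev_new > dev and halvings < max_halvings:
            step = 0.5 * step
            beta_new = beta + step
            dev_new = logit_deviance(y, X @ beta_new)
            halvings += 1

        # assumed stopping rule: relative change in the deviance
        converged = abs(dev - dev_new) / (0.1 + abs(dev_new)) < tol
        beta, dev = beta_new, dev_new
        if converged:
            break
    return beta
```

A quick sanity check on any fit is that the score `X.T @ (y - mu)` is numerically zero at the returned coefficients, which is a useful diagnostic when an implementation fails to converge.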

Goal:

import numpy as np
import pyfixest as pf

data = pf.get_data(model="Fepois")
# binarize the count outcome so it can serve as a logit response
data["Y"] = np.where(data.Y > 0, 1, 0)
fit = pf.feglm("Y ~ X1 | f1", data=data, family="logit")

Should all be doable before the year ends (hopefully!)

FYI @leostimpfle @apoorvalal @gbekes.

codecov bot commented Dec 22, 2024

Codecov Report

Attention: Patch coverage is 32.80757% with 213 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pyfixest/estimation/feglm_.py | 23.95% | 146 Missing ⚠️ |
| pyfixest/estimation/estimation.py | 5.00% | 19 Missing ⚠️ |
| pyfixest/estimation/fegaussian_.py | 50.00% | 16 Missing ⚠️ |
| pyfixest/estimation/FixestMulti_.py | 33.33% | 12 Missing ⚠️ |
| pyfixest/estimation/felogit_.py | 60.00% | 10 Missing ⚠️ |
| pyfixest/estimation/feprobit_.py | 61.53% | 10 Missing ⚠️ |

| Flag | Coverage Δ |
|---|---|
| core-tests | 75.00% <32.80%> (-3.10%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| pyfixest/__init__.py | 81.81% <ø> (ø) |
| pyfixest/estimation/__init__.py | 100.00% <100.00%> (ø) |
| pyfixest/estimation/feiv_.py | 98.18% <ø> (ø) |
| pyfixest/estimation/feols_.py | 83.57% <100.00%> (-0.07%) ⬇️ |
| pyfixest/estimation/fepois_.py | 87.26% <ø> (ø) |
| pyfixest/estimation/felogit_.py | 60.00% <60.00%> (ø) |
| pyfixest/estimation/feprobit_.py | 61.53% <61.53%> (ø) |
| pyfixest/estimation/FixestMulti_.py | 76.27% <33.33%> (-3.93%) ⬇️ |
| pyfixest/estimation/fegaussian_.py | 50.00% <50.00%> (ø) |
| pyfixest/estimation/estimation.py | 81.51% <5.00%> (-15.46%) ⬇️ |
| ... and 1 more | |

@s3alfisc (Member, Author) commented

Status: the implementations seem to work for logit and probit without fixed effects (the coefficients match statsmodels; the standard errors are another story):

import statsmodels.api as sm
import pyfixest as pf 

spector_data = sm.datasets.spector.load_pandas()
spector_data.exog = sm.add_constant(spector_data.exog)

# fit via statsmodels
logit_mod = sm.Logit(spector_data.endog, spector_data.exog)
probit_mod = sm.Probit(spector_data.endog, spector_data.exog)
logit_res = logit_mod.fit()
probit_res = probit_mod.fit()

logit_params = logit_res.params
probit_params = probit_res.params
print(logit_params)
#          Iterations 6
# const   -13.021347
# GPA       2.826113
# TUCE      0.095158
# PSI       2.378688
print(probit_params)
# const   -7.452320
# GPA      1.625810
# TUCE     0.051729
# PSI      1.426332


# fit via pyfixest 
fit_logit = pf.feglm("GRADE ~ GPA + TUCE + PSI", data = spector_data.data, family = "logit")
fit_probit = pf.feglm("GRADE ~ GPA + TUCE + PSI", data = spector_data.data, family = "probit")

pf.etable([fit_logit, fit_probit], digits = 6)

image

@s3alfisc linked an issue Dec 22, 2024 that may be closed by this pull request
@s3alfisc mentioned this pull request Dec 22, 2024
7 tasks
Development

Successfully merging this pull request may close these issues.

Logistic Regression