
Add updates to wald_test method and unit testing files #536

Merged · 8 commits · Jul 11, 2024

Conversation

Jayhyung
Member

@Jayhyung Jayhyung commented Jul 3, 2024

Hi @s3alfisc!

More details on this PR will follow soon, but I have one question before I proceed.

If you look at the first commit of this PR, I added # type: ignore to several spots in feols_.py to suppress some error messages. I don't know why, but I encountered incompatible dataframe type errors like the following:

pyfixest/estimation/feols_.py:411: error: Argument "data" to "get_cluster_df" has incompatible type "pandas.core.frame.DataFrame | polars.dataframe.frame.DataFrame"; expected "pandas.core.frame.DataFrame" [arg-type]
pyfixest/estimation/feols_.py:413: error: Argument "data" to "_check_cluster_df" has incompatible type "pandas.core.frame.DataFrame | polars.dataframe.frame.DataFrame"; expected "pandas.core.frame.DataFrame" [arg-type]
Found 2 errors in 1 file (checked 1 source file)

Fixing this issue is not hard: converting the data dataframe into a pandas DataFrame if it is a polars DataFrame fixes the issue. But I was wondering whether this is okay to do in feols_.py. I thought any dataframe that the variable data takes is already processed into a pandas DataFrame by the utility files dev_utils.py and vcov_utils.py.
How should I deal with this?

Thanks!


codecov bot commented Jul 3, 2024

Codecov Report

Attention: Patch coverage is 98.62069% with 2 lines in your changes missing coverage. Please review.

Files                           Coverage            Δ
tests/test_errors.py            98.78% <100.00%>    +1.69% ⬆️
tests/test_wald_test.py         100.00% <100.00%>   +44.00% ⬆️
pyfixest/estimation/feols_.py   91.09% <93.93%>     (ø)

... and 26 files with indirect coverage changes

@s3alfisc
Member

s3alfisc commented Jul 3, 2024

Should I just ignore this error message and keep working on wald_test method? Thanks!

I think the alert might be raised because the vcov method allows a DataFrameType, which can be either a pl.DataFrame or a pd.DataFrame. If the input is a polars DataFrame, it is never converted to pandas, so when it reaches

self._cluster_df = _get_cluster_df(

the type checker complains because it expects a pd.DataFrame.

I think if we apply _polars_to_pandas(data) at the beginning of the vcov method, the error might vanish? Though admittedly I am slightly puzzled why this issue arises only now 🤔

Anyway, if that is not it, I would suggest you continue with # type: ignore and I'll try to debug this in the next few days!

@Jayhyung
Member Author

Jayhyung commented Jul 3, 2024

Yes, this certainly fixes the issue. It also puzzles me, as I didn't encounter this error before ;). Stay tuned for my next update!

@s3alfisc
Member

s3alfisc commented Jul 3, 2024

Stay tuned for my next update!

Looking forward to it! 😄

@Jayhyung
Member Author

Jayhyung commented Jul 5, 2024

A few points are worth mentioning regarding the Wald test.

  1. I tried to make the wald_test method as comprehensive as possible. For example, the F statistic is computed differently depending on whether users set up clusters for computing standard errors, and the test statistic switches to chi-square if the R matrix or q vector are not the identity matrix and zero vector. I think this is more natural, as the F test is a special case of the Wald test when the R matrix and q vector satisfy certain conditions.
  2. I updated the testing scripts to cover 1) the single-equation case (where R.shape[0] = 1) and 2) the multiple-equation case (where R.shape[0] = n), with and without cluster-robust standard errors. All test statistics (F statistic and p-value) are tested against the R package clubSandwich. Also, all error flags are tested in test_errors.py.

One concern I have is that I didn't test the case with q != 0. Only a zero vector for q is allowed in clubSandwich, as far as I understand (they have an arg rhs (right-hand side) to specify q, but setting a different q vector never changes the result in my testing). Their documentation was also not very informative on this (Link), so we may have to compare against a different stats package. Do you have any idea which package handles the Wald test well?

Thanks!
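The F/chi-square switch described in point 1 can be sketched with a small example (the statistic and dof values below are made up for illustration; scipy supplies the distribution functions):

```python
from scipy import stats

# Hypothetical Wald statistic W for q restrictions, with df2 residual dof
W, q, df2 = 7.4, 3, 96

# F form: divide the Wald statistic by the number of restrictions
f_stat = W / q
p_f = stats.f.sf(f_stat, q, df2)

# Chi-square form: use W directly with q degrees of freedom
p_chi2 = stats.chi2.sf(W, q)

print(p_f, p_chi2)
```

The two p-values converge as df2 grows, which is why the F test can be seen as the finite-sample refinement of the asymptotic chi-square Wald test.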

@s3alfisc
Member

s3alfisc commented Jul 7, 2024

Hi @Jayhyung, thanks for the PR! I have started with a first review. Overall, this looks very good! =)

I am really surprised that clubSandwich does not allow users to vary q - I really think it is one of the best packages for Wald testing in R. It adds a lot of very nice convenience features, e.g. it allows users to easily create constraint matrices (as described in the vignette).

One way to test different values of q is to internally test hypotheses that are "identical" but "numerically different".

I.e. we could test that H_{0}: np.eye(3) @ beta = 1 vs H_{1}: np.eye(3) @ beta != 1 produces identical p-values to the test of H_{0}: (np.eye(3) * 2) @ beta = 2 vs H_{1}: (np.eye(3) * 2) @ beta != 2?
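This scaling argument can be checked numerically with a minimal Wald statistic (beta and V below are made-up values, not pyfixest output):

```python
import numpy as np

rng = np.random.default_rng(0)
beta = rng.normal(size=3)   # pretend coefficient estimates
V = np.eye(3) * 0.5         # pretend variance-covariance matrix

def wald_stat(R, q, beta, V):
    # W = (R beta - q)' (R V R')^{-1} (R beta - q)
    diff = R @ beta - q
    return float(diff @ np.linalg.solve(R @ V @ R.T, diff))

w1 = wald_stat(np.eye(3), np.ones(3), beta, V)
w2 = wald_stat(2 * np.eye(3), 2 * np.ones(3), beta, V)  # same hypothesis, rescaled
assert np.isclose(w1, w2)  # identical statistic, hence identical p-value
```

The scale factor cancels because it enters the quadratic form as c² in the numerator and c² in the inverted middle matrix, so any common rescaling of R and q leaves the statistic unchanged.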

@s3alfisc
Member

s3alfisc commented Jul 7, 2024

On the dof correction fixest employs for its wald function:

Here is the code for fixest::wald():

wald = function (x, keep = NULL, drop = NULL, print = TRUE, vcov, se, 
          cluster, ...) 
{
  check_arg(x, "class(fixest)")
  check_arg(keep, drop, "NULL character vector no na")
  if (isTRUE(x$onlyFixef)) 
    return(NA)
  dots = list(...)
  if (!isTRUE(x$summary) || !missing(vcov) || !missing(se) || 
      !missing(cluster) || ...length() > 0) {
    x = summary(x, vcov = vcov, se = se, cluster = cluster, 
                ...)
  }
  if (missing(keep) && missing(drop)) {
    drop = "\\(Intercept\\)"
  }
  coef_name = names(x$coefficients)
  coef_name = keep_apply(coef_name, keep)
  coef_name = drop_apply(coef_name, drop)
  if (length(coef_name) == 0) 
    return(NA)
  qui = names(x$coefficients) %in% coef_name
  my_coef = x$coefficients[qui]
  df1 = length(my_coef)
  df2 = x$nobs - x$nparams
  stat = drop(my_coef %*% solve(x$cov.scaled[qui, qui]) %*% 
                my_coef)/df1
  p = pf(stat, df1, df2, lower.tail = FALSE)
  vcov = attr(x$cov.scaled, "type")
  vec = list(stat = stat, p = p, df1 = df1, df2 = df2, vcov = vcov)
  if (print) {
    cat("Wald test, H0: ", ifsingle(coef_name, "", "joint "), 
        "nullity of ", enumerate_items(coef_name), "\n", 
        sep = "")
    cat(" stat = ", numberFormatNormal(stat), ", p-value ", 
        ifelse(p < 2.2e-16, "< 2.2e-16", paste0("= ", numberFormatNormal(p))), 
        ", on ", numberFormatNormal(df1), " and ", numberFormatNormal(df2), 
        " DoF, ", "VCOV: ", vcov, ".", sep = "")
    return(invisible(vec))
  }
  else {
    return(vec)
  }
}

The relevant part of the code is this one

  df1 = length(my_coef)
  df2 = x$nobs - x$nparams

nparams appears to combine the number of covariates plus the number of swept-out dummies:

library(fixest)

data(mtcars)

fit = feols(mpg ~ cyl + as.factor(hp), data = mtcars)
my_coef = fit$coefficients
df1 = length(my_coef)
df2 = fit$nobs - fit$nparams
df1 # 23
df2 # 9
fit$nparams # 23 -> covariates + swept-out fixed effects


fit = feols(mpg ~ cyl | hp, data = mtcars)
my_coef = fit$coefficients
df1 = length(my_coef)
df2 = fit$nobs - fit$nparams

df1 # 1
df2 # 9
fit$nparams # 23

For pyfixest, this would then translate to

dfn = _k
dfd = _N - _k_fe - _k

which is already what is implemented =)

Interestingly, there is no special logic for clustered errors, i.e. dfd is just _N - _k_fe - _k and not _G - 1. To align with fixest, I think we can then delete the

        if self._is_clustered:
            self._dfd = np.min(np.array(self._G)) - 1

clause?

Additionally, would you mind adding an exact test of fixest::wald() against our implementation @Jayhyung? Note that despite being called wald, it only ever runs an F-Test =)
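The fixest-style dof computation above can be sketched as follows (the numbers mirror the mtcars example from the thread; scipy supplies the F distribution):

```python
from scipy import stats

# mtcars example: 32 obs, 1 covariate (cyl), 22 swept-out hp dummies
N, k, k_fe = 32, 1, 22
df1 = k              # numerator dof: number of tested coefficients
df2 = N - k - k_fe   # denominator dof: residual degrees of freedom
print(df1, df2)      # → 1 9

# p-value for a hypothetical F statistic under these dof
p = stats.f.sf(1.5, df1, df2)
```

This reproduces the df1 = 1, df2 = 9 pair that fixest reports for the `mpg ~ cyl | hp` specification.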

@s3alfisc
Member

s3alfisc commented Jul 7, 2024

Oh, and of course there is another way to test against clubSandwich: by simply transforming the null!

Suppose we want to test the null of

$$ R \hat{\beta} - d = 0 $$

with $d \in \mathbb{R}^{k}$.

Then we can define the augmented matrix

$$ \tilde{R} = (R, -I_{k}) $$

with $I_{k}$ being the identity matrix, and the augmented coefficient vector

$$ \tilde{\beta} = (\hat{\beta}^{T}, d^{T})^{T}, $$

and then we can simply test the null of $\tilde{R} \tilde{\beta} = 0$.
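The transformation can be verified numerically (a minimal sketch with made-up values):

```python
import numpy as np

rng = np.random.default_rng(1)
beta = rng.normal(size=3)           # pretend coefficient estimates
R = np.eye(3)
d = np.array([0.5, -1.0, 2.0])      # non-zero right-hand side

# Augmented form: R_tilde = (R, -I_k), beta_tilde = (beta, d)
R_tilde = np.hstack([R, -np.eye(3)])
beta_tilde = np.concatenate([beta, d])

# R_tilde @ beta_tilde reproduces R @ beta - d, so testing
# R_tilde beta_tilde = 0 is equivalent to testing R beta = d
assert np.allclose(R_tilde @ beta_tilde, R @ beta - d)
```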

@Jayhyung
Member Author

Jayhyung commented Jul 9, 2024

Thanks for the comment!

  1. About testing a different q vector by augmenting R as R = (R, -I_{k}): it still eludes me, even though I read the vignette. First, the vignette only shows cases where a linear combination of specified coefficients equals 0. Second, Wald_test in the clubSandwich package throws an error if the number of columns of the R matrix does not equal the number of estimated coefficients (including the constant term). Hmm, I'm not sure, but we can still test q != 0 cases using wald from fixest. If there is no objection, I will proceed in this direction. Please let me know if anything comes up! (I may reach out to the author of the vignette about this.)
  2. Actually, the denominator degrees of freedom (i.e. dfd) are adjusted in the clubSandwich package according to whether the users specify clusters or not. [Screenshot of the Wald_test source in clubSandwich; line 429 shows how the p-value is computed.]
     The denominator degrees of freedom in the F test are set to J - 1 (if test is set to "Naive-F"), where J equals the number of clusters if clusters are specified in the regression. Also, setting dfd to G - 1 when there are clusters produces a p-value that is much closer to the clubSandwich result. The theoretical foundation can be found in the guide (page 350, "A. Cluster-Robust F-tests" is the relevant section; sorry for just putting the link here, as this is the published version of this famous guide for empirical economists ;)).
     I can just erase the part as you suggested, but then the F statistic and p-value would not be adjusted for clusters. Please let me know how you want me to proceed.
  3. Yes, we only tested the F-test ;). I will add more test cases that cover the Wald test.

Thanks for the comments!
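The clustered denominator-dof convention under discussion can be sketched as follows (hypothetical values; the G - 1 rule mirrors the snippet quoted from pyfixest in this thread):

```python
import numpy as np
from scipy import stats

N, k, k_fe = 1000, 3, 0
G = np.array([40])              # hypothetical cluster counts per cluster variable
is_clustered = True

# Denominator dof: smallest cluster count minus one under clustering,
# residual degrees of freedom otherwise
dfd = int(np.min(G)) - 1 if is_clustered else N - k - k_fe
print(dfd)  # → 39

p = stats.f.sf(2.5, k, dfd)     # p-value for a hypothetical F statistic
```

With few clusters, dfd = G - 1 is much smaller than N - k, which fattens the tails of the reference F distribution and yields the more conservative p-values Cameron & Miller recommend.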

@s3alfisc
Member

The denominator degrees of freedom in the F test are set to J - 1 (if test is set to "Naive-F"), where J equals the number of clusters if clusters are specified in the regression. Also, setting dfd to G - 1 when there are clusters produces a p-value that is much closer to the clubSandwich result. The theoretical foundation can be found in the guide (page 350, "A. Cluster-Robust F-tests" is the relevant section).

Ah, great! Then let's follow what Cameron & Miller recommend. Even better if this aligns closely with the implementation in clubSandwich =)

@s3alfisc
Member

s3alfisc commented Jul 10, 2024

Second, Wald_test in the clubSandwich package throws an error if the number of columns of the R matrix does not equal the number of estimated coefficients (including the constant term).

Hm, in this case, the described method indeed does not work with clubSandwich. And Feols.wald_test() would also throw an error if the length of the R vector does not equal the number of coefficients, right?

Alternatively, we could test against R's car::linearHypothesis function, which is nice because the car package is included in R's base distribution.

linearHypothesis(model, hypothesis.matrix, rhs = NULL,
    test = c("F", "Chisq"), vcov. = NULL,
    white.adjust = c(FALSE, TRUE, "hc3", "hc0", "hc1", "hc2", "hc4"),
    singular.ok = FALSE, ...)

It's only available for objects of type lm, but we can provide a custom vcov matrix (it's iid by default). As this is only about testing that the offset q works correctly, it should be sufficient to test only the iid case (given that we test the hetero and clustered vcovs against clubSandwich).

This is how it would work via rpy2:

import numpy as np
import pyfixest as pf
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri

# rpy2 imports
from rpy2.robjects.packages import importr

pandas2ri.activate()

fixest = importr("fixest")
stats = importr("stats")
broom = importr("broom")
car = importr("car")

data = pf.get_data()
fit_r = stats.lm("Y ~ X1 + X2", data = data)
R = ro.r.matrix(np.array([0, 1, 1]), nrow=1)
car.linearHypothesis(fit_r, R, 1)

@Jayhyung
Member Author

I just added wald test cases to the testing scripts. Let me know if something extra should be done!

Member

@s3alfisc s3alfisc left a comment

Looks phenomenal @Jayhyung. I particularly dig the super thorough unit tests! Thank you 🎉

    ], "distribution must be either 'F' or 'chi2'."
    if self._is_clustered:
        self._dfd = np.min(np.array(self._G)) - 1
    else:
Member

Is it standard to include the fixed effects in the dof computation? I am actually not sure about this and will take a look at the fixest docs!

Member Author

Oh, sorry for the late reply, I didn't notice your comment until now. From my reading of the clubSandwich package, the fixed effects didn't change anything related to the inference part (it was just like demeaning the whole dataset, estimating on the demeaned data, and then doing inference). I will take a close look at this and update if serious corrections are necessary (I will open a PR if they really are).

Member

No worries - this is at least what fixest seems to be doing (see the function I posted somewhere in the PR) =) I think clubSandwich does not support fixed effects / fixest as far as I am aware (due to difficulties computing CRV2 and CRV3 vcovs with fixed effects, I believe) - so I am afraid your chances of finding anything there are probably not super big :/

Member Author

Ah, I actually meant to say fixest. Thanks for letting me know. :)

r_fstat = pd.DataFrame(r_wald).T[1].values[0]
r_pvalue = pd.DataFrame(r_wald).T[5].values[0]

np.testing.assert_allclose(f_stat, r_fstat, rtol=1e-02, atol=1e-02)
Member

Do the tests still pass if we tighten the error tolerance to e.g. 1e-04? At some point, I think not exactly matching the clubSandwich dof defaults might bite us?

Member Author

This is the test case for regression specifications without clusters. I just tested with different error tolerances: it survives 1e-05 for the relative tolerance but fails at 1e-03 for the absolute tolerance. Here is a comparison of the statistics between the two packages:

Python f_stat: 1.3298150228659409, R f_stat: 1.3220891299061177
Python p_stat: 0.2628217235735699, R p_stat: 0.2653441458766282

Reasonably close, but there is a bit of discrepancy. The following is the case when I set the constant term to 0.1:

Python f_stat: 15.748365910640432, R f_stat: 15.708691053689362
Python p_stat: 3.67809116497142e-10, R p_stat: 3.8949330392594636e-10

Do you think this is reasonable enough? I may take another look at this in a different PR if it is not.

Thanks!
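For reference, the relative discrepancy in the F statistics quoted above can be checked directly:

```python
py_f, r_f = 1.3298150228659409, 1.3220891299061177

rel_diff = abs(py_f - r_f) / abs(r_f)
print(f"{rel_diff:.4%}")  # roughly 0.58%

# Passes a 1e-02 relative tolerance but would fail a 1e-03 one
assert rel_diff < 1e-2
assert rel_diff > 1e-3
```

So the two implementations agree to within about half a percent here, consistent with the dof conventions differing slightly between the packages rather than a bug in either.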

@s3alfisc s3alfisc merged commit ca1200e into py-econometrics:master Jul 11, 2024
7 checks passed
@Jayhyung Jayhyung deleted the dev2 branch July 11, 2024 21:53