Take the dispersion into account for GLMM families with a dispersion param #291
Codecov Report: patch coverage has no change; project coverage change:

@@            Coverage Diff             @@
##             main     #291      +/-   ##
==========================================
- Coverage   95.83%   92.74%    -3.10%
==========================================
  Files          35       23       -12
  Lines        3267     1722     -1545
==========================================
- Hits         3131     1597     -1534
+ Misses        136      125       -11
For tinkering and comparison to lme4, look at this gist.
After pulling my hair out for far too long trying to figure out why fast fits worked but 'slow' ones didn't, I tried a different optimizer. It turns out that I accidentally stumbled across a case where BOBYQA fails dramatically. The old two-stage optimization for slow fits didn't have this problem because it started from the fast fit's result.

julia> bobyqa = GeneralizedLinearMixedModel(@formula(yield ~ 1 + (1|batch)), MixedModels.dataset(:dyestuff), Normal(), SqrtLink());
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563

julia> fit!(bobyqa)
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
yield ~ 1 + (1 | batch)
Distribution: Normal{Float64}
Link: SqrtLink()
Deviance: 58893.7814
Variance components:
Column Variance Std.Dev.
batch (Intercept) 1954.5903 44.21075
Residual 2178.8889 46.67857
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate Std.Error z value P(>|z|)
──────────────────────────────────────────────────
(Intercept) 39.0793 18.0493 2.17 0.0304
──────────────────────────────────────────────────
julia> bobyqa.optsum
Initial parameter vector: [39.08324449172631, 1.0]
Initial objective value: 58893.79526421842
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [-Inf, 0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10]
initial_step: [39.08324449172631, 0.75]
maxfeval: -1
Function evaluations: 26
Final parameter vector: [39.0788080375932, 0.998348758382195]
Final objective value: 58893.7814171595
Return code: FTOL_REACHED
julia> neldermead = GeneralizedLinearMixedModel(@formula(yield ~ 1 + (1|batch)), MixedModels.dataset(:dyestuff), Normal(), SqrtLink());
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
┌ Warning: Model has not been fit
└ @ MixedModels ~/Work/MixedModels.jl/src/generalizedlinearmixedmodel.jl:563
julia> neldermead.optsum.optimizer = :LN_NELDERMEAD;
julia> fit!(neldermead)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
yield ~ 1 + (1 | batch)
Distribution: Normal{Float64}
Link: SqrtLink()
Deviance: 58890.8511
Variance components:
Column Variance Std.Dev.
batch (Intercept) 599.565 24.486016
Residual 2178.889 46.678570
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate Std.Error z value P(>|z|)
──────────────────────────────────────────────────
(Intercept) 39.0793 9.99691 3.91 <1e-4
──────────────────────────────────────────────────
julia> neldermead.optsum
Initial parameter vector: [39.08324449172631, 1.0]
Initial objective value: 58893.79526421842
Optimizer (from NLopt): LN_NELDERMEAD
Lower bounds: [-Inf, 0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10]
initial_step: [39.08324449172631, 0.75]
maxfeval: -1
Function evaluations: 80
Final parameter vector: [39.07930704835212, 0.5529134950119795]
Final objective value: 58890.851102667264
Return code: FTOL_REACHED

For reference, here's the fast fit:

julia> fastfit = fit(MixedModel, @formula(yield ~ 1 + (1|batch)), MixedModels.dataset(:dyestuff), Normal(), SqrtLink(), fast=true)
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 1)
yield ~ 1 + (1 | batch)
Distribution: Normal{Float64}
Link: SqrtLink()
Deviance: 58890.8511
Variance components:
Column Variance Std.Dev.
batch (Intercept) 599.6439 24.487628
Residual 2178.8889 46.678570
Number of obs: 30; levels of grouping factors: 6
Fixed-effects parameters:
──────────────────────────────────────────────────
Estimate Std.Error z value P(>|z|)
──────────────────────────────────────────────────
(Intercept) 39.0793 9.99757 3.91 <1e-4
──────────────────────────────────────────────────
julia> fastfit.optsum
Initial parameter vector: [1.0]
Initial objective value: 58893.79517237531
Optimizer (from NLopt): LN_BOBYQA
Lower bounds: [0.0]
ftol_rel: 1.0e-12
ftol_abs: 1.0e-8
xtol_rel: 0.0
xtol_abs: [1.0e-10]
initial_step: [0.75]
maxfeval: -1
Function evaluations: 26
Final parameter vector: [0.55294989485893]
Final objective value: 58890.85110260193
Return code: FTOL_REACHED
Based on tinkering elsewhere (e.g. toying with fitting a Gamma model with identity and log links), I don't know if the optimizer issues are particular to the datasets I've been testing on (which aren't ideal for the models I've been fitting) or indicative of a larger issue in optimization for GLMMs with dispersion parameters. But given all that, I think it's ready for an initial review.
Do you want to continue to use MersenneTwister?
For tests, yes, because there could in theory be further optimizations to the MersenneTwister (or more specifically, methods that interpret the stream produced by MT). On the PRNG front: I'm still looking at how we can take advantage of Xoshiro and related improvements for our embarrassingly parallel things.
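The reproducibility argument above can be illustrated directly: an explicitly constructed, seeded MersenneTwister yields the same stream on every run, independent of Julia's default task-local RNG (which switched to Xoshiro in Julia 1.7). A minimal sketch, not taken from the package's test suite:

```julia
# Minimal sketch: seeding an explicit MersenneTwister pins the random
# stream that tests consume, regardless of what the global default RNG is.
using Random

rng1 = MersenneTwister(42)
rng2 = MersenneTwister(42)

draws1 = rand(rng1, 3)
draws2 = rand(rng2, 3)

draws1 == draws2  # same seed, same stream → reference values stay stable
```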
Closes #206

- varest (this is handled implicitly)
- deviance / loglikelihood (contra)
- deviance / loglikelihood (cbpp, verbagg)
- deviance / loglikelihood
- deviance / loglikelihood
- deviance / loglikelihood (grouseticks)
- deviance / loglikelihood
Note (to self) that some distributions (e.g. Gamma) require changing the linear predictor + dispersion into a different parameterization (e.g. shape, scale).
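As a sketch of that note (a hypothetical helper, not MixedModels.jl internals): for a Gamma response with mean μ from the linear predictor and dispersion φ, the shape/scale parameterization satisfies shape = 1/φ and scale = μφ, so that shape·scale = μ and shape·scale² = φμ² (the GLM variance function for Gamma):

```julia
# Hypothetical helper (not part of MixedModels.jl): convert the mean μ and
# dispersion φ of a Gamma GLMM into the (shape, scale) parameterization,
# where mean = shape*scale = μ and variance = shape*scale^2 = φ*μ^2.
function gamma_shape_scale(μ::Real, φ::Real)
    shape = inv(φ)   # shape k = 1/φ
    scale = μ * φ    # scale θ = μφ, so that k*θ = μ
    return (shape, scale)
end

k, θ = gamma_shape_scale(2.0, 0.5)
# k * θ recovers the mean (2.0); k * θ^2 recovers the variance φμ² (2.0)
```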