Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault with TweedieDistribution log_likelihood #628

Closed
lorentzenchr opened this issue Apr 7, 2023 · 5 comments · Fixed by #648
Closed

Segmentation fault with TweedieDistribution log_likelihood #628

lorentzenchr opened this issue Apr 7, 2023 · 5 comments · Fixed by #648

Comments

@lorentzenchr
Copy link
Contributor

import numpy as np
from glum._distribution import TweedieDistribution

td = TweedieDistribution(power=1.5)

td.log_likelihood(
    y=np.array([0.1, 0.2], dtype=float),
    mu=np.array([1, 2], dtype=float),
    dispersion=1.0,
)

Results in

segmentation fault  ipython

version:
Python 3.9.15
glum: 2.1.2
numpy: 1.23.5
MacOS 13.3, Intel Core i7

@jtilly
Copy link
Member

jtilly commented Apr 8, 2023

Thanks for the bug report. Unfortunately, I can't reproduce this. But that doesn't mean that we don't have a bug. I tried in CI and on my M1.

Did you install from conda-forge of from pypi? Could you please provide more details on your environment (e.g., the output from conda list or pip list)?

There aren't really all that many places where this could segfault. Must be somewhere in

def tweedie_log_likelihood(
const_floating1d y,
const_floating1d weights,
const_floating1d mu,
floating p,
floating dispersion,
):
cdef int i # loop counter
cdef int n = y.shape[0] # loop length
cdef floating ll = 0.0 # output
for i in prange(n, nogil=True):
ll += weights[i] * _tweedie_unit_loglikelihood(y[i], mu[i], p, dispersion)
return ll
cdef floating _tweedie_unit_loglikelihood(floating y, floating mu, floating power, floating dispersion) nogil:
cdef floating kappa, normalization, theta
if y == 0:
return -(mu ** (2 - power)) / (dispersion * (2 - power))
else:
theta = mu ** (1 - power)
kappa = mu * theta / (2 - power)
theta = theta / (1 - power)
normalization = _tweedie_normalization(y, power, dispersion)
return (theta * y - kappa) / dispersion + normalization
cdef floating _tweedie_normalization(floating y, floating power, floating dispersion) nogil:
cdef int j, j_lower, j_upper
cdef floating j_max = exp((2 - power) * log(y) - log(dispersion) - log(2 - power))
cdef floating w_max = _log_w_j(y, power, dispersion, j_max)
cdef floating w_summand = 0.0
j_lower, j_upper = _sum_limits(y, power, dispersion, j_max)
for j in range(j_lower, j_upper + 1):
w_summand += exp(_log_w_j(y, power, dispersion, j) - w_max)
return w_max + log(w_summand) - log(y)
cdef (int, int) _sum_limits(floating y, floating power, floating dispersion, floating j_max) nogil:
cdef floating w_lower
cdef floating j_lower = 1.0
cdef floating j_upper = ceil(j_max)
cdef floating w_upper = _log_w_j(y, power, dispersion, j_upper)
cdef floating w_crt = _log_w_j(y, power, dispersion, j_max) - 37
cdef floating w_one = _log_w_j(y, power, dispersion, 1)
if w_one <= w_crt:
j_lower = floor(j_max)
w_lower = _log_w_j(y, power, dispersion, j_lower)
while w_lower >= w_crt:
j_lower -= 1
w_lower = _log_w_j(y, power, dispersion, j_lower)
while w_upper >= w_crt:
j_upper += 1
w_upper = _log_w_j(y, power, dispersion, j_upper)
return int(j_lower), int(j_upper)
cdef floating _log_w_j(floating y, floating power, floating dispersion, numeric j) nogil:
cdef floating alpha = (2 - power) / (1 - power)
return j * _log_z(y, power, dispersion) - lgamma(1 + j) - lgamma(-alpha * j)
cdef floating _log_z(floating y, floating power, floating dispersion) nogil:
cdef floating alpha = (2 - power) / (1 - power)
return (
alpha * log(power - 1)
- alpha * log(y)
- (1 - alpha) * log(dispersion)
- log(2 - power)
)

@lorentzenchr
Copy link
Contributor Author

It is a pip virtualenv.

pip list
Package                       Version
----------------------------- -----------
absl-py                       1.2.0
aeppl                         0.0.38
aesara                        2.8.7
aiofiles                      22.1.0
aiosqlite                     0.18.0
alabaster                     0.7.12
altair                        4.2.0
anyio                         3.6.1
appnope                       0.1.3
argon2-cffi                   21.3.0
argon2-cffi-bindings          21.2.0
arrow                         1.2.3
arviz                         0.12.1
astor                         0.8.1
astroid                       2.12.11
asttokens                     2.0.8
astunparse                    1.6.3
async-generator               1.10
atpublic                      3.1.1
attrs                         21.4.0
autopep8                      2.0.2
Babel                         2.10.3
backcall                      0.2.0
bandit                        1.7.5
beautifulsoup4                4.11.1
benchopt                      1.2.0
beniget                       0.4.1
black                         22.6.0
blackjax                      0.8.2
bleach                        5.0.1
boost-histogram               1.3.1
Bottleneck                    1.3.5
cachetools                    5.2.0
category-encoders             2.5.0
certifi                       2022.5.18.1
cffi                          1.15.1
cfgv                          3.3.1
cftime                        1.6.1
charset-normalizer            2.0.12
click                         8.1.3
cloudpickle                   2.1.0
colorama                      0.4.5
commonmark                    0.9.1
cons                          0.4.5
contourpy                     1.0.5
cryptography                  38.0.1
cycler                        0.11.0
Cython                        0.29.33
dask                          2022.5.0
dask-glm                      0.2.0
dask-ml                       2022.5.27
debugpy                       1.6.0
decorator                     5.1.1
defusedxml                    0.7.1
deprecat                      2.1.1
Deprecated                    1.2.13
descartes                     1.1.0
dill                          0.3.5.1
distlib                       0.3.5
distributed                   2022.5.0
docutils                      0.16
duckdb                        0.7.1
duckdb-engine                 0.6.8
entrypoints                   0.4
etils                         0.7.1
etuples                       0.3.5
exceptiongroup                1.0.0rc9
executing                     0.8.3
fastjsonschema                2.15.3
fastprogress                  1.0.2
filelock                      3.7.0
flake8                        4.0.1
flatbuffers                   22.12.6
fonttools                     4.33.3
formulaic                     0.4.0
fqdn                          1.5.1
fsspec                        2022.7.1
future                        0.18.2
gast                          0.4.0
gitdb                         4.0.9
GitPython                     3.1.27
glum                          2.1.2
google-auth                   2.15.0
google-auth-oauthlib          1.0.0
google-pasta                  0.2.0
greenlet                      1.1.2
grpcio                        1.51.1
h11                           0.13.0
h2o                           3.36.1.4
h5py                          3.7.0
HeapDict                      1.0.1
ibis-framework                4.1.0
identify                      2.5.3
idna                          3.3
imagesize                     1.4.1
importlib-metadata            4.12.0
importlib-resources           5.9.0
iniconfig                     1.1.1
interface-meta                1.3.0
ipykernel                     6.16.0
ipython                       8.5.0
ipython-genutils              0.2.0
ipywidgets                    7.7.2
isoduration                   20.11.0
isort                         5.10.1
jaraco.classes                3.2.3
jax                           0.3.23
jaxlib                        0.3.22
jedi                          0.18.1
Jinja2                        3.1.2
joblib                        1.2.0
json5                         0.9.10
jsonpointer                   2.3
jsonschema                    4.17.3
jupyter-book                  0.13.1
jupyter-cache                 0.4.3
jupyter_client                7.4.2
jupyter-core                  4.11.1
jupyter-events                0.6.3
jupyter-server                1.21.0
jupyter_server_fileid         0.8.0
jupyter-server-mathjax        0.2.6
jupyter_server_ydoc           0.8.0
jupyter-sphinx                0.3.2
jupyter-ydoc                  0.2.3
jupyterlab                    3.6.3
jupyterlab-pygments           0.2.2
jupyterlab_server             2.22.0
jupyterlab-widgets            1.1.1
jupytext                      1.14.1
keras                         2.12.0
kiwisolver                    1.4.4
latexcodec                    2.0.1
lazy-object-proxy             1.7.1
libclang                      14.0.6
line-profiler                 3.5.1
linkify-it-py                 1.0.3
llvmlite                      0.39.1
locket                        1.0.0
logical-unification           0.4.5
lxml                          4.9.1
Mako                          1.2.3
Markdown                      3.4.1
markdown-it-py                1.1.0
MarkupSafe                    2.1.1
matplotlib                    3.6.1
matplotlib-inline             0.1.6
mccabe                        0.6.1
mdit-py-plugins               0.2.8
mdurl                         0.1.2
memory-profiler               0.61.0
miniKanren                    1.0.3
mistune                       0.8.4
mizani                        0.8.1
model-diagnostics             0.2.0
more-itertools                8.14.0
mpmath                        1.2.1
msgpack                       1.0.4
multipledispatch              0.6.0
mypy-extensions               0.4.3
myst-nb                       0.13.2
myst-parser                   0.15.2
nb-black                      1.0.7
nbclassic                     0.4.5
nbclient                      0.5.13
nbconvert                     6.5.4
nbdime                        3.1.1
nbformat                      5.7.0
nest-asyncio                  1.5.6
netCDF4                       1.6.1
networkx                      2.8.7
neurtu                        0.3.0
nodeenv                       1.7.0
notebook                      6.5.1
notebook-shim                 0.1.0
numba                         0.56.4
numexpr                       2.8.3
numpy                         1.23.5
numpyro                       0.10.1
oauthlib                      3.2.2
objprint                      0.2.2
opt-einsum                    3.3.0
outcome                       1.2.0
packaging                     21.3
palettable                    3.3.0
pandas                        1.5.3
pandocfilters                 1.5.0
parso                         0.8.3
parsy                         2.0
partd                         1.3.0
pathspec                      0.10.1
patsy                         0.5.3
pbr                           5.11.1
pexpect                       4.8.0
pickleshare                   0.7.5
Pillow                        9.2.0
pip                           23.0.1
pipdeptree                    2.3.3
platformdirs                  2.5.2
plotly                        5.10.0
plotnine                      0.10.1
pluggy                        1.0.0
ply                           3.11
polars                        0.16.18
pre-commit                    2.20.0
prometheus-client             0.15.0
prompt-toolkit                3.0.31
protobuf                      4.22.1
psutil                        5.9.2
ptyprocess                    0.7.0
pure-eval                     0.2.2
py                            1.11.0
pyarrow                       11.0.0
pyasn1                        0.4.8
pyasn1-modules                0.2.8
pyaxis                        0.3.4
pybtex                        0.24.0
pybtex-docutils               1.0.2
pycodestyle                   2.8.0
pycparser                     2.21
pydata-sphinx-theme           0.8.1
pyflakes                      2.4.0
PyGithub                      1.56
Pygments                      2.13.0
pyjstat                       2.3.0
PyJWT                         2.5.0
pylint                        2.15.4
pymc                          4.2.2
PyNaCl                        1.5.0
pynvml                        11.4.1
pyOpenSSL                     22.1.0
pyparsing                     3.0.9
pyrsistent                    0.18.1
PySocks                       1.7.1
pytest                        7.1.3
python-dateutil               2.8.2
python-json-logger            2.0.7
pythran                       0.12.0
pytz                          2022.7.1
PyYAML                        6.0
pyzmq                         24.0.1
requests                      2.28.1
requests-oauthlib             1.3.1
requests-unixsocket           0.3.0
rfc3339-validator             0.1.4
rfc3986-validator             0.1.1
rich                          12.6.0
rsa                           4.9
scalene                       1.5.13
scikit-learn                  1.2.2
scipy                         1.10.1
seaborn                       0.12.0
selenium                      4.5.0
Send2Trash                    1.8.0
setuptools                    65.5.0
six                           1.16.0
smmap                         5.0.0
sniffio                       1.3.0
snowballstemmer               2.2.0
sortedcontainers              2.4.0
soupsieve                     2.3.2.post1
Sphinx                        4.5.0
sphinx-book-theme             0.3.3
sphinx-comments               0.0.3
sphinx-copybutton             0.5.0
sphinx_design                 0.1.0
sphinx-external-toc           0.2.4
sphinx-jupyterbook-latex      0.4.7
sphinx-multitoc-numbering     0.1.3
sphinx-panels                 0.6.0
sphinx-thebe                  0.1.2
sphinx-togglebutton           0.3.2
sphinxcontrib-applehelp       1.0.2
sphinxcontrib-bibtex          2.5.0
sphinxcontrib-devhelp         1.0.2
sphinxcontrib-htmlhelp        2.0.0
sphinxcontrib-jsmath          1.0.1
sphinxcontrib-qthelp          1.0.3
sphinxcontrib-serializinghtml 1.1.5
SQLAlchemy                    1.4.41
sqlalchemy-views              0.3.1
sqlglot                       10.5.10
stack-data                    0.5.1
statsmodels                   0.13.2
stevedore                     5.0.0
sympy                         1.11.1
tabmat                        3.1.2
tabulate                      0.9.0
tblib                         1.7.0
tenacity                      8.1.0
tensorboard                   2.12.1
tensorboard-data-server       0.7.0
tensorboard-plugin-wit        1.8.1
tensorflow                    2.12.0
tensorflow-estimator          2.12.0
tensorflow-io-gcs-filesystem  0.28.0
termcolor                     2.1.1
terminado                     0.16.0
Theano-PyMC                   1.1.2
threadpoolctl                 3.1.0
tinycss2                      1.1.1
toml                          0.10.2
tomli                         2.0.1
tomlkit                       0.11.5
toolz                         0.12.0
tornado                       6.2
tox                           3.26.0
tqdm                          4.64.1
traitlets                     5.4.0
trio                          0.22.0
trio-websocket                0.9.2
typing_extensions             4.4.0
uc-micro-py                   1.0.1
uri-template                  1.2.0
urllib3                       1.26.12
virtualenv                    20.16.5
viztracer                     0.15.4
wcwidth                       0.2.5
webcolors                     1.13
webencodings                  0.5.1
websocket-client              1.4.1
Werkzeug                      2.2.2
wheel                         0.37.1
widgetsnbextension            3.6.1
wrapt                         1.14.1
wsproto                       1.2.0
xarray                        2023.2.0
xarray-einstats               0.3.0
y-py                          0.5.9
ypy-websocket                 0.8.2
zict                          2.2.0
zipp                          3.9.0

@jtilly
Copy link
Member

jtilly commented Jun 2, 2023

The segfault only occurs on macOS with Intel architecture and when installing from the wheels. The conda builds just work. This looks like something that scikit-learn went through as well, see scikit-learn/scikit-learn#21227 (comment). I used the same "fix" as scikit-learn and pinned llvm-openmp=11 for the build. This is part of our 2.5.2 release.

Feel free to re-open the issue if the problem pops up again.

@lorentzenchr
Copy link
Contributor Author

Thanks for fixing it!

@lorentzenchr
Copy link
Contributor Author

lorentzenchr commented Oct 30, 2023

For your information:

  • With glum 2.2.1 to 2.4.0, I get
    TypeError: C function sklearn.utils._cython_blas.__pyx_fuse_0_dot has wrong signature (expected float (int, float *, int, float *, int), got float (int, float const *, int, float const *, int))
  • With glum 2.5.0 and 2.5.1 I get (again)
    zsh: segmentation fault  ipython
  • With glum 2.5.2 and 2.6.0, I get (finally)
    Out[1]: -1.9685325337344186

So it seems, somewhere between version 2.5.1 and 2.5.2, maybe in #648, this bug was solved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants