SDM model predictability issue #362
-
A challenge in working with delta models is that you need to interpret both the presence-absence and positive components of the model. For the presence-absence component, rather than correlations, it might be a good idea to look at the classification error or AUC. For the positive model, it's OK to use correlations: if you used a Gamma family with a log link, it might be worth looking at the correlation between log(response) and the predictions from your model. You could start with a null model (no covariates) and build up, adding covariates like temperature to see whether that increases the correlation. Ideally this would be done with some sort of cross-validation.
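As a rough illustration of those two checks, here is a minimal base-R sketch. It does not use sdmTMB: two plain GLMs stand in for the delta model's components, the data are simulated, and every name (`dat`, `fit_pa`, `fit_pos`, `auc`) is hypothetical. A single train/test split stands in for proper cross-validation.

```r
# Simulated stand-in data; none of this comes from the original analysis.
set.seed(1)
n <- 500
dat <- data.frame(temperature = rnorm(n))
dat$present <- rbinom(n, 1, plogis(-0.5 + 1.2 * dat$temperature))
mu <- exp(1 + 0.5 * dat$temperature)
dat$biomass <- dat$present * rgamma(n, shape = 2, rate = 2 / mu)

# Simple hold-out split (an assumption; k-fold CV would be more thorough).
train <- dat[1:350, ]
test <- dat[351:500, ]

# Stand-ins for the two components: binomial and Gamma(log) GLMs.
fit_pa <- glm(present ~ temperature, family = binomial(), data = train)
fit_pos <- glm(biomass ~ temperature, family = Gamma(link = "log"),
               data = train[train$biomass > 0, ])

# AUC via the rank (Mann-Whitney) formulation.
auc <- function(obs, pred) {
  r <- rank(pred)
  n1 <- sum(obs == 1)
  n0 <- sum(obs == 0)
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Presence-absence component: out-of-sample AUC.
p_hat <- predict(fit_pa, newdata = test, type = "response")
auc_pa <- auc(test$present, p_hat)

# Positive component: correlation between log(response) and the
# link-scale (log) predictions, using positive observations only.
test_pos <- test[test$biomass > 0, ]
eta_pos <- predict(fit_pos, newdata = test_pos, type = "link")
cor_pos <- cor(log(test_pos$biomass), eta_pos)
```

Refitting with and without a covariate and comparing `auc_pa` and `cor_pos` would be one way to see whether that covariate adds predictive skill.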
-
Thank you so much for your helpful answer. I was initially assessing correlations using the combined estimates instead of focusing on just the gamma estimates. The correlations are much better now.

If I may ask another quick question: I'm projecting biomass to the end of the century using climate model scenarios and running 100 simulations to account for uncertainty, as follows:

    preds_future_IPSL126 <- predict(fit_hist, IPSL_grid126, nsim = 100L)
    IPSL_grid126$se <- apply(preds_future_IPSL126, 1, sd)

My question is: when I calculate these values, I end up with a single set of estimates (est) rather than separate estimates for the binomial (est1) and gamma (est2) components. I'm unsure whether I need to transform these new estimates and the standard errors. Should I apply exp() since the gamma model uses a log link function, or should I leave them as they are? I really appreciate your help with this.
-
Yes -- good question. For delta models, you want to use the
-
@Raquel-RuizDiaz in addition to being able to predict the two parts separately (and then combine them if you'd like), in answer to this:
Yes, the overall values from a delta-gamma model are returned in log space by default. You can exp() them and take quantiles (or add and subtract SDs in log space, then exponentiate) if you want to turn those into biomass or density estimates.
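A minimal sketch of both options in base R, assuming the simulation output is a matrix with one row per prediction-grid cell and one column per draw (consistent with the `apply(..., 1, sd)` call in the question). The matrix here is simulated, all names (`sims_log`, `est`, `lwr`, `upr`) are hypothetical, and the 95% interval is just an illustrative choice.

```r
set.seed(42)
n_cells <- 5
nsim <- 100

# Hypothetical stand-in for the log-space simulation draws:
# rows are grid cells, columns are draws.
log_mean <- c(0.5, 1.0, 1.5, 2.0, 2.5)
sims_log <- matrix(rnorm(n_cells * nsim, mean = log_mean, sd = 0.3),
                   nrow = n_cells, ncol = nsim)

# Option 1: exponentiate the draws, then summarise per cell.
sims_nat <- exp(sims_log)
est <- apply(sims_nat, 1, median)
lwr <- apply(sims_nat, 1, quantile, probs = 0.025)
upr <- apply(sims_nat, 1, quantile, probs = 0.975)

# Option 2: add and subtract SDs in log space, then exponentiate.
mean_log <- rowMeans(sims_log)
sd_log <- apply(sims_log, 1, sd)
z <- qnorm(0.975)  # multiplier for an approximate 95% interval (assumption)
lwr_sd <- exp(mean_log - z * sd_log)
upr_sd <- exp(mean_log + z * sd_log)
```

Note that the SD is computed on the log-space draws in option 2; an SD of log-space values is not itself a standard error in biomass units.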
-
Hi!
I’m building delta-gamma species distribution models using sdmTMB for three species. The residual distributions look good, and the models explain about 65% of the deviance compared to a null model. However, predictive performance is quite low for two of the species. When assessing it, I found correlations between predictions and observations for the same period of 0.3 and 0.4 for two models and 0.7 for the third. These correlations decrease further under out-of-sample cross-validation.
My main question is: what correlation level is generally considered acceptable for species distribution models?
The binomial component of my model performs well, with correlations around 0.8. It seems that the gamma component might be the issue, which makes sense given that predicting biomass with a model that only includes temperature and depth may not yield high correlations. I’m wondering if these correlation values are good enough, or if I should consider using a binomial model instead.
Thanks!