Recent Releases of sdmTMB
sdmTMB - sdmTMB 1.0.0
- Switch the recommended citation from the preprint to the Journal of
Statistical Software version. See citation("sdmTMB").
Minor improvements and fixes
- Fix crash with betabinomial and binomial when a response value was NA but the
corresponding weights argument (turned into 'size') was not NA.
- Add simulate_new() as a synonym for sdmTMB_simulate(). simulate_new()
is recommended going forward because it is more expressive.
Biosphere - Species Distribution Modeling
- R
Published by seananderson 22 days ago
sdmTMB - sdmTMB 0.8.0
New features
- sdmTMB_cv() now supports the weights argument. User-supplied weights are
combined with the internal fold-assignment mechanism (held-out data are
assigned weight 0). Weights must be positive (> 0). #486
- Add experimental collapse_spatial_variance option to sdmTMBcontrol() to
automatically detect and turn off spatial and/or spatiotemporal random fields
when their SD parameters are estimated to be very small (below a threshold).
When enabled, the model will automatically refit with the appropriate fields
turned off if SD parameters fall below collapse_threshold (default 0.01).
This can help avoid boundary issues and improve model stability when random
fields are not needed. Set to FALSE by default. #263
- Add new experimental function get_range_edge() to calculate range edges as
density-weighted quantiles along a spatial axis (e.g., latitude, coastal distance).
Range edges are calculated as positions where cumulative density equals specified
quantiles. Uses linear interpolation for accurate
quantile estimation and simulation from the joint precision matrix for uncertainty
quantification. Implements a similar approach to VAST range edge calculations
following Fredston et al. (2021) https://doi.org/10.1111/gcb.15614.
See the new range edge vignette at https://sdmTMB.github.io/sdmTMB/articles/
- The Student-t degrees of freedom parameter is now
estimated by default in student(). Previously it was fixed at 3. To fix it
at a specific value, set df to a numeric value (e.g., student(df = 3)).
To estimate it (new default), set df = NULL or omit the argument. The
parameter is constrained to be > 1. The function now prints an informative
message about the df parameter behavior.
- Add deviance residuals for left-out data in the cross validation output.
This can be used to calculate deviance explained of left-out data.
See the cv_deviance_resid column in the data element of the output of
sdmTMB_cv().
- Add save_models argument to sdmTMB_cv() (defaults to TRUE). Setting
save_models = FALSE prevents storing fitted model objects for each fold,
which can substantially reduce memory usage for large datasets or many folds.
When FALSE, functions requiring model access (tidy(), cv_to_waywiser())
will error with informative messages. Essential CV metrics (predictions, log
likelihoods, deviance residuals, convergence info) remain available.
- Extend sdmTMB_simulate() to support time-varying effects with vector sigma_V
inputs and AR1 correlation (rho_time). #447
- Add new function cv_to_waywiser() to convert cross-validation results to sf
format for use with the waywiser package. This enables multi-scale spatial
assessment of model predictions. #193
- Add vignette demonstrating how to fit zero-one-inflated beta (ZOIB) models
by fitting three separate model components and combining predictions. #440 #441
- Add argument to fix probability of extreme events for *_mix() families.
Note that the internal parameter name has also changed from p_mix to
p_extreme and from logit_p_mix to logit_p_extreme. #318 #474
- Add beta-binomial family (betabinomial()) for modeling overdispersed binomial
data. Supports logit and cloglog links, and includes proper residuals support.
- Add new function get_weighted_average() to calculate biomass-weighted averages
of user-supplied vectors (e.g., depth, temperature). This function follows the
same pattern as get_cog() but allows users to specify any variable for
weighted averaging. Supports bias correction and area weighting.
- Add vignette on age (or length) composition standardization. To help with this,
add a new experimental function make_category_svc().
- Add emmeans support for delta/hurdle models. Use model = 1 for the binomial
component or model = 2 for the positive component when calling emmeans().
Example: emmeans(fit, ~ predictor, model = 2). #247 #249
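The range-edge idea described for get_range_edge() (positions where cumulative density reaches given quantiles, found by linear interpolation) can be sketched in base R. This is an illustration only, not the package's implementation: the function name range_edge_quantiles() and the inputs lat and dens are hypothetical, and the sketch leaves out the simulation-based uncertainty that the release note mentions.

```r
# Illustrative only: density-weighted quantiles along one spatial axis.
# `axis` and `density` are hypothetical inputs, not sdmTMB objects.
range_edge_quantiles <- function(axis, density, probs = c(0.05, 0.5, 0.95)) {
  stopifnot(length(axis) == length(density), all(density >= 0))
  ord <- order(axis)
  axis <- axis[ord]
  # Cumulative share of total density reached at each position
  cum_share <- cumsum(density[ord]) / sum(density)
  # Linearly interpolate the position at which each quantile is reached;
  # rule = 2 clamps quantiles outside the observed range to the end points
  stats::approx(x = cum_share, y = axis, xout = probs,
                ties = "ordered", rule = 2)$y
}

lat <- c(48, 49, 50, 51, 52)   # hypothetical latitudes
dens <- c(1, 3, 4, 1, 1)       # hypothetical predicted densities
edges <- range_edge_quantiles(lat, dens)
print(edges)
```

The lower and upper quantiles play the role of trailing and leading range edges; the median is a central position along the axis.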
Minor improvements and fixes
- Fix barrier model implementation. The SPDE input matrices for the barrier
model from INLA and INLAspacetime had changed. sdmTMB now appropriately
uses these new matrices and unit tests in sdmTMBextra should catch such a
change in the future. #457
- Fix Student-t deviance residuals, which were incorrectly returning NaNs.
- Fix sign() utility function to avoid NaN when x = 0. Now returns
standard mathematical sign function behavior: sign(0) = 0.
- Fix bug in sdmTMB_cv() automatic fold generation that could result in
unbalanced folds with duplicate and missing fold IDs. The bug was most severe
with large k_folds values (e.g., leave-one-out cross-validation with
k_folds = nrow(data)), which could cause errors when folds had no data.
User-supplied fold_ids were not affected.
- Add check if newdata has been filtered after prediction and before passing
to a get_*() function.
- Fix tidy() to only include the model column for delta models. For non-delta
models, the model column is no longer included in the output for
effects = "ran_pars" and effects = "ran_vals", making the output cleaner
and more consistent.
- Update package logo.
- Add residuals for truncated negative binomial families. #481
Thanks to @Joseph-Barss
- Fix an issue with residuals for delta models by consistently using get_par(),
and another issue specifically for delta truncated negative binomial models by
replacing NaN with NA. #484
- Fix tidy() with effects = "ran_pars" to report min/max anisotropic ranges
(e.g., range_min, range_max) for models fit with anisotropy = TRUE,
matching the values shown in print_anisotropy(). Standard errors and
confidence intervals are set to NA since uncertainty in both the range parameter
and H matrix cannot be easily propagated.
- Fix issue with ggeffects with multiple smoothers + offsets. #450
- Improve t2() printing and appearance in tidy.sdmTMB(). #415 #472
- Fix emmeans support for models with smoothers (s() terms). Previously,
emmeans would fail with "Non-conformable elements in reference grid" when
smoothers were included in the model formula.
Published by seananderson 3 months ago
sdmTMB - sdmTMB 0.7.4
Minor improvements and fixes
- Let simulate.sdmTMB() work with binomial GLMs with size specified via
weights and newdata supplied. #465
- Fix issue with fold logic in LFO (leave-future-out) cross validation
for lfo_forecast > 1. #454 Thanks to @Joseph-Barss.
- Add update.sdmTMB() so that the mesh argument doesn't have to be
specified if model is loaded in a fresh session. #461
- Change default in get_index() etc. to bias_correct = TRUE. This is the
recommended setting for final inference and speed improvements within TMB
have made it more viable to include the bias correction by default. #456
- Only retain Newton update parameters if they improve the objective function.
#455
- Only run Newton updates if maximum absolute gradient is >= 1e-9 to save
time. #455
- Suppress nlminb() warnings by default, which can usually be ignored by the
user and may be confusing. This can be controlled via
sdmTMB(..., control = sdmTMBcontrol(suppress_nlminb_warnings = FALSE)).
This option now mirrors tinyVAST.
- Round time-varying AR(1) rho to 2 decimals in model printing/summary.
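The two Newton-update rules above (skip the update when the maximum absolute gradient is already below 1e-9, and keep an update only if it lowers the objective) can be sketched in base R on a toy problem. Everything here is hypothetical and for illustration: obj(), grad(), hess(), and newton_polish() are made-up names, and this is not sdmTMB's internal code.

```r
# Hypothetical objective with analytic gradient and Hessian (not sdmTMB internals)
obj  <- function(p) sum((p - c(1, 2))^2) + 0.1 * sum(p^4)
grad <- function(p) 2 * (p - c(1, 2)) + 0.4 * p^3
hess <- function(p) diag(2 + 1.2 * p^2, nrow = length(p))

newton_polish <- function(p, n_loops = 1, grad_tol = 1e-9) {
  for (i in seq_len(n_loops)) {
    g <- grad(p)
    # Skip further updates when the gradient is already tiny
    if (max(abs(g)) < grad_tol) break
    p_new <- p - solve(hess(p), g)
    # Retain the update only if it lowers the objective
    if (obj(p_new) < obj(p)) p <- p_new
  }
  p
}

start <- c(0.9, 1.5)   # stand-in for parameters returned by the optimizer
polished <- newton_polish(start, n_loops = 3)
print(max(abs(grad(polished))))
```

With this acceptance check, a Newton step that would move to a worse point is simply discarded, so the polished parameters are never worse than the starting ones.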
Published by seananderson 7 months ago
sdmTMB - sdmTMB 0.7.2
New features
- Add deviance residuals (residuals(fit, type = "deviance")) and
deviance.sdmTMB() method (deviance(fit)). Proportion deviance explained
can be calculated as 1 - deviance(fit) / deviance(fit_null) where
fit_null is a null model, e.g., fit with formula = ~ 1 and turning
off any random fields as desired
(e.g., spatial = "off", spatiotemporal = "off").
- Add observation_error argument to simulate.sdmTMB() to allow
turning off observation error simulation. The intended use-case
is for simulating from random effects but not adding observation
error. #431
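The deviance-explained formula above can be tried out without fitting a spatial model. The sketch below uses stats::glm() purely as a stand-in, since deviance() works the same way on its fits; the simulated data frame d and its columns are hypothetical. With sdmTMB fits, the same final line would apply to fit and fit_null as described in the release note.

```r
set.seed(1)
# Hypothetical count data with one covariate
d <- data.frame(x = rnorm(200))
d$y <- rpois(200, lambda = exp(0.5 + 0.8 * d$x))

# Stand-in models: a fitted model and an intercept-only null model
fit      <- glm(y ~ x, data = d, family = poisson())
fit_null <- glm(y ~ 1, data = d, family = poisson())

# Proportion of deviance explained relative to the null model
dev_explained <- 1 - deviance(fit) / deviance(fit_null)
print(dev_explained)
```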
Minor improvements and fixes
- Change the default in dharma_residuals() to
test_uniformity = FALSE. Based on simulation testing, we
generally do not recommend using these p-values to reject models.
- Fix a bug introduced in version 0.7.0 where printing of the 2nd
linear predictor smoother fixed effects (bs) was accidentally a copy
of the 1st linear predictor smoother fixed effects.
- Fix bug in simulation with time-varying AR(1) when using the
project() function. Thanks to A. Allyn for pointing out the bug.
- Fix reporting of converged models with sdmTMB_cv(). A recent change
resulted in reporting only 1 model converged if all models converged.
- Remove warning about old default residuals type.
- Fix project() and simulate.sdmTMB(..., newdata = ...) when
random intercepts/slopes are present. #431
- Remove extra TMB data slots for project() and
simulate.sdmTMB(..., newdata = ...) to save memory. #431
Published by seananderson 8 months ago
sdmTMB - sdmTMB 0.7.0
New features
- Add option for random slopes or random intercepts to be passed in as
lme4-style formulas, e.g., density ~ (1 | fyear) or density ~ (depth | fyear).
Output matches that of lme4 and glmmTMB and is summarized with tidy().
- Add project() experimental function.
- Add get_eao() to calculate effective area occupied.
- Allow predicting on new data with t2() smoothers. #413
- Add priors for breakpt() and logistic() parameters. #403
- Add priors on time-varying SD parameters (sigma_V).
- Add cAIC() for calculating conditional AIC. Theory based on
https://arxiv.org/abs/2411.14185; also see
https://doi.org/10.1002/ecy.4327. J.T. Thorson wrote the function code.
EDF (effective degrees of freedom) will ultimately be further split
(e.g., split by smoothers) and added to summary.sdmTMB(). #383 #387
- Add EDF (effective degrees of freedom) printing to smoothers with
print.sdmTMB() and summary.sdmTMB(). Set argument edf = TRUE.
E.g., print(fit, edf = TRUE). #383 #387
- Add experimental function get_index_split(), which takes care of
splitting a prediction grid by time, doing the prediction and
area-integration index calculations for each chunk to save memory.
- Add newdata argument to simulate.sdmTMB(). This enables simulating on
a new data frame similar to how one would predict on new data.
- Add mle_mvn_samples argument to simulate.sdmTMB(). Defaults to "single".
If "multiple", then a sample from the random effects is taken for each
simulation iteration.
- Allow for specifying only lower or upper limits. #394
- sdmTMB_cv() gains a tidy() and print() method for output. #319
- simulate.sdmTMB() method now has a return_tmb_report argument.
New vignettes/articles
- Add forecasting and presence-only article vignettes. See
https://pbs-assess.github.io/sdmTMB/articles/
- Add vignette on multispecies models with sdmTMB (or any case where one wants
additional spatial and/or spatiotemporal fields by some group).
See https://pbs-assess.github.io/sdmTMB/articles/
Minor improvements and fixes
- Add a useful error if memory error occurs on index calculation.
- Fix bug in a check in make_mesh() around if coordinates look
overly large. #427
- Re-enable bias correction for get_cog() (get center of gravity).
- Add check for Inf/-Inf values before fitting. #408
- Add linear component of smoothers to tidy(). #90
- Add time varying AR(1) correlation to tidy() and print(). #374
- Warn if parameter limits are set with newton_loops > 0. #394
- Fix bug in est column when predicting on new data with Poisson-link
delta models with type = "link" and re_form = NA. #389
- Fix bug in s95 parameter reporting from the tidy() method. s95 is
present in the logistic threshold models. The model itself was fine but the
s95 parameter was supposed to be reported by tidy() as a combination of two
other parameters. This also affected the output in print()/summary().
- Add progress bar to simulate.sdmTMB(). #346
- Add AUC and TSS examples to cross validation vignette. #268
- Add model (linear predictor number) argument to coef() method. Also,
write documentation for ?coef.sdmTMB. #351
- Add helpful error message if some coordinates in make_mesh() are NA. #365
- Add informative message if fitting with an offset but predicting with offset
argument left at NULL on newdata. #372
- Fix passing of offset argument through in sdmTMB_cv(). Before it was being
omitted in the prediction (i.e., set to 0). #372
- Fix bug in exponentiate argument for tidy(). Set conf.int = TRUE as
default. #353
- Fix bug in prediction from delta_truncated_nbinom1() and
delta_truncated_nbinom2() families. The positive component
needs to be transformed to represent the mean of the untruncated
distribution first before multiplying by the probability of a non-zero.
Thanks to @tom-peatman #350
- Add option for area to be passed in as the name of a column in the
data frame to be used for area weighting. Used in get_index(),
get_cog(), get_eao(), etc.
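To show what area weighting by a named column amounts to, here is a minimal base R sketch that computes an area-integrated total and a biomass-weighted center of gravity from a small grid. The data frame grid, its column names, and the helper area_weighted_summary() are hypothetical; the real get_index() and get_cog() work on model predictions and also return uncertainty, which this sketch ignores.

```r
# Hypothetical prediction grid: one row per cell
grid <- data.frame(
  X = c(0, 1, 2, 3),          # coordinate along one axis
  density = c(2, 4, 4, 2),    # predicted density per unit area
  cell_area = c(1, 1, 2, 2)   # area of each grid cell
)

# The area column is looked up by name, mirroring the idea of passing a column name
area_weighted_summary <- function(dat, density = "density",
                                  coord = "X", area = "cell_area") {
  w <- dat[[area]] * dat[[density]]        # biomass in each cell
  list(
    index = sum(w),                        # area-integrated total
    cog = sum(dat[[coord]] * w) / sum(w)   # biomass-weighted mean coordinate
  )
}

out <- area_weighted_summary(grid)
print(out)
```

Keeping the area in the data frame means it stays aligned with the rows when the grid is filtered or reordered.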
Published by seananderson 11 months ago
sdmTMB - sdmTMB 0.6.0
- Pass several arguments to DHARMa::plotQQunif().
- Add silent option in simulate.sdmTMB(). Setting it to FALSE allows
monitoring simulations from larger models.
- Fix bug in est_non_rf1 and est_non_rf2 columns when all the following
conditions were true:
  - predicting on new data
  - using a delta model
  - including IID random intercepts or time-varying coefficients
  See #342. Thanks to @tom-peatman for the issue report.
- Fix delta-gamma binomial link printing for type = 'poisson-link'. #340
- Add suggestion to use an optimized BLAS library to README.
- Add warning if it's detected that there were problems reloading (e.g., with
readRDS()) a fitted model. Simultaneously revert the approach to
how reloaded models are reattached.
- Move log_ratio_mix parameter to 2nd phase with starting value of -1 instead
of 0 to improve convergence.
- Fix bugs for nbinom1() and nbinom2_mix() simulation.
- Allow profile argument in the control list to take a character vector of
parameters. This moves these parameters from the outer optimization problem to
the inner problem (but omits them from the Laplace approximation). See
documentation in TMB. This can considerably speed up fitting models with many
fixed effects.
- Add theoretical quantile residuals for the generalized gamma distribution.
Thanks to J.C. Dunic. #333
- Add "poisson-link" option to delta-mixture lognormal.
- Fix bug in simulation from Poisson-link delta models.
- Simplify the internal treatment of extra time slices (extra_time). #329
This is much less bug prone and also fixes a recently introduced bug. #335
This can slightly affect model results compared to the previous approach if
extra time was used along with smoothers since the 'fake' extra data
previously used was included when mgcv determined knot locations for
smoothers.
Published by seananderson over 1 year ago
sdmTMB - sdmTMB 0.5.0
- Overhaul residuals vignette ('article')
https://pbs-assess.github.io/sdmTMB/articles/web_only/residual-checking.html
including brief intros to randomized quantile residuals, simulation-based
residuals, 'one-sample' residuals, and uniform vs. Gaussian residuals.
- Add check if prediction coordinates appear outside of fitted coordinates. #285
- Fix memory issue with Tweedie family on large datasets. #302
- Add experimental option to return standard normal residuals from
dharma_residuals().
- Make simulate.sdmTMB() not include extra_time elements.
- Improve re-initialization of saved fitted model objects in new sessions.
- Fix important bug in simulate.sdmTMB() method for delta families where
the positive linear predictor was only getting simulated for observations
present in the fitted data.
- Add new "mle-mvn" type to residuals.sdmTMB() and make it the default.
This is a fast option for evaluating goodness of fit that should be better
than the previous default. See the details section in ?residuals.sdmTMB.
The previous default is now called "mvn-eb" but is not
recommended.
- Bring dharma_residuals() back over from sdmTMBextra to sdmTMB. Add a new
option in the type argument ("mle-mvn") that should make the
simulation residuals consistent with the expected distribution.
See the same new documentation in ?residuals.sdmTMB. The examples
in ?dharma_residuals illustrate suggested use.
- Fix bug in sanity() where gradient checks were missing abs() such that
large negative gradients weren't getting caught. #324
- Return offset vector in fitted object as an element. Ensure any extra time
rows of data in the data element of the fitted object do not include the
extra time slices.
- Add experimental residuals option "mle-mvn" where a single approximate
posterior sample of the random effects is drawn and these are combined
with the MLE fixed effects to produce residuals. This may become the
default option.
- Add the generalized gamma distribution (thanks to J.T. Thorson with additional
work by J.C. Dunic). See gengamma(). This distribution is still in a testing
phase and is not recommended for applied use yet. #286
- Detect possible issue with factor(time) in formula if same column name is used
for time and extra_time is specified. #320
- Improve sanity() check output when there are NA fixed effect standard
errors.
- Set intern = FALSE within index bias correction, which seems to be
considerably faster when testing with most models.
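The sanity() gradient fix above comes down to comparing absolute values against the threshold. A minimal base R illustration follows; the gradient values and the threshold of 0.001 are assumptions chosen for the example, and this is not the package's actual check.

```r
gradients <- c(0.0002, -0.5, 0.0001)  # hypothetical final gradients
gradient_thresh <- 0.001              # assumed threshold for illustration

# Without abs(), a large negative gradient passes the check unnoticed
flag_without_abs <- any(gradients > gradient_thresh)

# With abs(), the -0.5 gradient is flagged
flag_with_abs <- any(abs(gradients) > gradient_thresh)

print(c(without_abs = flag_without_abs, with_abs = flag_with_abs))
```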
Published by seananderson almost 2 years ago
sdmTMB - sdmTMB 0.4.3
- Fix a bug likely introduced in July 2023 that caused issues when
extra_time was specified. This is an important bug and models fit with
extra_time between that date (if using the GitHub version) and v0.4.2.9004
(2024-02-24) should be checked against a current version of sdmTMB
(v0.4.2.9005 or greater). On CRAN, this affected v0.4.0 (2023-10-20) to
v0.4.2. Details:
  - The essence of the bug was that extra_time works by padding the data
  with a fake row of data for every extra time element (using the first row of
  data as the template). This is supposed to then be omitted from the
  likelihood so it has no impact on model fitting beyond spacing
  time-series processes appropriately and setting up internal structures for
  forecasting. Unfortunately, a bug was introduced that caused these fake data
  (1 per extra time element) to be included in the likelihood.
- Issue error if time column has NAs. #298 #299
- Fix bug in get_cog(..., format = "wide") where the time column was
hardcoded to "year" by accident.
- Poisson-link delta models now use a type argument in delta_gamma() and
delta_lognormal(). delta_poisson_link_gamma() and
delta_poisson_link_lognormal() are deprecated. #290
- Delta families can now pass links that are different from the default
"logit" and "log". #290
Published by seananderson almost 2 years ago
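The extra_time issue described in the 0.4.3 notes can be illustrated with a toy example in base R: the data are padded with one fake row per extra time element, and those rows should contribute nothing to the likelihood. The data, the is_fake flag, and the fixed-parameter Gaussian likelihood are all hypothetical simplifications, not sdmTMB's internal structures.

```r
set.seed(42)
# Hypothetical data with a gap in the time series
dat <- data.frame(year = c(2010, 2010, 2012, 2012), y = rnorm(4, mean = 5))
dat$is_fake <- FALSE
extra_time <- 2011

# Pad with one fake row per extra time element, using the first row as template
fake <- dat[rep(1, length(extra_time)), ]
fake$year <- extra_time
fake$is_fake <- TRUE
padded <- rbind(dat, fake)

# Toy Gaussian log likelihood per row with a fixed mean and SD
row_loglik <- function(d) dnorm(d$y, mean = 5, sd = 1, log = TRUE)

ll_real     <- sum(row_loglik(dat))
ll_intended <- sum(row_loglik(padded)[!padded$is_fake])  # fake rows omitted
ll_buggy    <- sum(row_loglik(padded))                   # fake rows included

print(c(real = ll_real, intended = ll_intended, buggy = ll_buggy))
```

When the fake rows are left in, the total log likelihood shifts by one extra contribution per padded time step, which is why affected fits should be rechecked.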
sdmTMB - sdmTMB 0.4.2
- Force rebuild of CRAN binaries to fix issue with breaking Matrix ABI change
causing NaN gradient errors. #288 #287
- Fix crash if sdmTMB(..., do_index = TRUE) and extra_time were supplied along
with predict_args = list(newdata = ...) that lacked extra_time elements.
- Allow get_index() to work with missing time elements.
- Add the ability to pass a custom randomized quantile function qres_func
to residuals.sdmTMB().
- Add check for factor random intercept columns in newdata to avoid a crash.
#278 #280
- Improve warnings/errors around use of do_index = TRUE and get_index()
if newdata = NULL. #276
- Fix prediction with offset when newdata is NULL but offset is
specified. #274
- Fix prediction failure when both offset and nsim are provided and
model includes extra_time. #273
Published by seananderson about 2 years ago
sdmTMB - sdmTMB 0.4.1
- Fix memory issues detected by CRAN 'Additional issues' clang-UBSAN, valgrind.
- Fix a bug predicting on new data with a specified offset and extra_time.
#270
- Add warning around non-factor handling of the spatial_varying formula. #269
- Add experimental set_delta_model() for plotting delta models with
ggeffects::ggpredict() (GitHub version only until next CRAN version).
Published by seananderson over 2 years ago
sdmTMB - sdmTMB 0.4.0
- Move add_barrier_mesh() to sdmTMBextra to avoid final INLA dependency.
https://github.com/pbs-assess/sdmTMBextra
- Switch to using the new fmesher package for all mesh/SPDE calculations. INLA
is no longer a dependency.
- Switch to diagonal.penalty = FALSE in mgcv::smoothCon().
This changes the scale of the linear component of the smoother, but
should result in the same model.
https://github.com/glmmTMB/glmmTMB/issues/928#issuecomment-1642862066
- Implement cross validation for delta models. #239
- Remove ELPD from cross validation output. Use sum_loglik instead. #235
- Turn on Newton optimization by default. #182
- print() now checks sanity() and issues a warning if there may be issues. #176
- Poisson-link delta models and censored likelihood distributions have been made
considerably more robust. #186
- Standard errors are now available on SD parameters etc. in tidy(). #240
- Fix bug in print()/tidy() for delta-model positive model component sigma_E.
A recently introduced bug was causing sigma_E for the 2nd model to be reported
as the 1st model component sigma_E.
- Add new anisotropy plotting function.
- Add anisotropic range printing. #149 by @jdunic
Published by seananderson over 2 years ago
sdmTMB - sdmTMB 0.3.0
- Create the sdmTMBextra package to remove rstan/tmbstan helpers, which
were causing memory sanitizer errors on CRAN.
https://github.com/pbs-assess/sdmTMBextra
- The following functions are affected:
  - predict.sdmTMB() now takes mcmc_samples, which is output from
  sdmTMBextra::extract_mcmc().
  - simulate.sdmTMB() now takes mcmc_samples, which is output from
  sdmTMBextra::extract_mcmc().
  - residuals.sdmTMB() now takes mcmc_samples, which is output from
  sdmTMBextra::predict_mle_mcmc(). This only affects
  residuals(..., type = "mle-mcmc").
- Move dharma_residuals() to sdmTMBextra to reduce heavy dependencies.
- See examples in the Bayesian and residuals vignettes or in the help files for
those functions within sdmTMBextra.
- Various fixes to pass CRAN checks. #158
- Fix memory issue highlighted by Additional issues CRAN checks. #158
- 'offset' argument can now be a character value indicating a column name. This
is the preferred way of using an offset with parallel cross validation. #165
- Fix parallel cross validation when using an offset vector. #165
- Add leave-future-out cross validation functionality. #156
- Example data qcs_grid is no longer replicated by year to save package
space. #158
- Add message with tidy(fit, "ran_pars") about why SEs are NA.
- Add anisotropy to print(). #157
- Fix predict(..., type = "response", se_fit = TRUE), which involves issuing
a warning and sticking to link space. #140
Published by seananderson about 3 years ago