Project Details
Increasing the validity of statistical analyses with the R package “DHARMa”
Applicant
Professor Dr. Florian Hartig
Subject Area
Ecology and Biodiversity of Plants and Ecosystems
Epidemiology and Medical Biometry/Statistics
Term
since 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 528747641
The statistical analysis of observational or experimental data is a central process in the empirical sciences. An important problem in this process is that conclusions (inferences) drawn from data using a statistical model depend on the specific assumptions of that model. Statistical results are generally only reliable if these assumptions are consistent with the underlying data-generating process. For this reason, introductory statistics textbooks emphasize the need to validate statistical models by analyzing residuals. In recent years, the complexity of statistical models used in ecology and many related empirical sciences has steadily increased. Analyses using simple linear regressions have become rare. Most empirical analyses in the field use the framework of generalized linear mixed models (GLMM), which allow flexibility in modeling both the distribution of the data and its structure (clusters, covariances, heteroscedasticity). However, the naive residuals of these models cannot be interpreted directly, which leaves many researchers unsure how to validate their statistical models. The R package 'DHARMa' solves this problem by using a simulation-based approach to produce easily interpretable scaled (quantile) residuals for fitted (generalized) linear mixed models. It supports many of the common regression packages in the R environment and can also be coupled with external frequentist and Bayesian software, provided that software can produce simulations from the fitted model. The DHARMa package now has a large user community from all empirical sciences, though still with a focus on ecology and evolutionary biology.
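The idea behind simulation-based scaled (quantile) residuals can be illustrated with a minimal sketch. It is written in plain Python rather than R and is not DHARMa's own code: the function names, the Poisson example, and the "fitted" means are all hypothetical and chosen only for illustration. For each observation, new data are simulated from the fitted model, and the residual is the position of the observed value in the empirical distribution of those simulations, with ties randomized so that count data also yield continuous residuals. If the model is correctly specified, these residuals should be approximately uniform on [0, 1].

```python
import math
import random


def simulate_poisson(mean, rng):
    """Draw one Poisson variate (Knuth's multiplication method; fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def scaled_residual(observed, simulations, rng):
    """Randomized empirical-CDF position of `observed` among `simulations`."""
    n = len(simulations)
    below = sum(s < observed for s in simulations) / n   # share of simulations below
    ties = sum(s == observed for s in simulations) / n   # share of simulations equal
    return below + rng.random() * ties                   # spread ties uniformly


rng = random.Random(42)

# Hypothetical "fitted" means; here the data really are generated from the same
# Poisson model, so the model is correct by construction.
fitted_means = [rng.uniform(0.5, 5.0) for _ in range(500)]
observed = [simulate_poisson(m, rng) for m in fitted_means]

# One scaled residual per observation, each based on 250 simulations.
residuals = [
    scaled_residual(y, [simulate_poisson(m, rng) for _ in range(250)], rng)
    for y, m in zip(observed, fitted_means)
]
mean_residual = sum(residuals) / len(residuals)
print(f"mean scaled residual: {mean_residual:.3f}")
```

Because the simulated data in this sketch come from the assumed model itself, the mean residual should lie close to 0.5; systematic deviations from uniformity would point to a misspecified model. In R, the corresponding step is carried out by DHARMa's `simulateResiduals()` function on a fitted model object.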
The goal of the proposed project is to increase the interoperability and user-friendliness of DHARMa (RFP goal: Usability and Impact); to strengthen the valid application of DHARMa through unit tests, numerical and statistical robustness tests, and improved reports (RFP goal: Quality assurance); and to implement further testing procedures in collaboration with the R community and other package developers (RFP goal: Further development).
DFG Programme
Research Grants