Project Details

Inference methods for multivariate and high-dimensional data

Subject Area Mathematics
Epidemiology and Medical Biometry/Statistics
Term from 2016 to 2020
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 282140603
 
With greatly advanced computational resources, statistical data analysis and modeling now extend to pressing new areas of application for modern multivariate inference methods. Analyzing multivariate data usually faces several challenges, partly due to possibly complex dependence structures between the different variables. Additionally, the endpoints are typically not measured on the same scale, so assumptions of specific covariance structures are inadequate. Inference is particularly difficult for multivariate data in which one or more endpoints are ordinal, because methods assuming multivariate normality are inappropriate for such data. Skewed or discrete data are likewise not appropriately described by a multivariate normal model. Moreover, data are usually collected in elaborate factorial settings, and the complexity increases if the number of endpoints is greater than the number of independent experimental units (high-dimensional data). Among the main questions arising in such studies is which endpoints, group levels, or combinations of these are responsible for statistically significant results. To answer these questions, it is desirable to have powerful procedures available that do not make restrictive model assumptions. Central themes of this project are the derivation of
1. asymptotically valid tests based on a semiparametric location model without normality assumption,
2. rank-based inference methods using a purely nonparametric model framework,
3. approximations and adjustments to 1.-2. for small sample sizes or high-dimensional observations, based on different bootstrap, randomization, or moment approximation techniques,
4. multiple testing procedures to investigate "local" questions after having performed "global" tests,
and, in case 2., also
5. extensions of the above methods to censored data, and
6. extensions of the above methods to detect specific patterns of alternatives.
In the first topic, powerful inference tools for possibly high-dimensional multivariate data are derived, based on expected values, while the second topic considers generalizations of Wilcoxon-Mann-Whitney type tests to multivariate layouts, using a different hypothesis formulation. In the third topic, approximate inferential solutions are developed, using resampling and other techniques. The fourth topic provides the logical next step from global decisions toward detecting the relevant variables or factor level combinations that are responsible for significant results. The fifth topic addresses data that are observed incompletely due to censoring, which occurs frequently in real data collection. Finally, in the sixth topic, inference methods are devised to be more powerful for the detection of alternatives specified a priori, for example increasing or decreasing trends. The results promise to have wide application and to broadly enhance the role of statistical science.
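To make the combination of rank-based statistics and randomization concrete, the following is a minimal illustrative sketch, not one of the project's procedures. For two groups it estimates, per endpoint, the Wilcoxon-Mann-Whitney type relative effect P(X1 < X2) + 0.5 P(X1 = X2) from pooled midranks, and assesses the sum of squared deviations from 0.5 with a permutation test. The function names, the choice of statistic, and the toy data are assumptions made for illustration. The permutation step is valid only under the null hypothesis that both groups share the same joint distribution.

```python
import random

def midranks(values):
    """Return midranks of a list of numbers (tied values share the average rank)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def relative_effects(group1, group2):
    """Per-endpoint estimate of P(X1 < X2) + 0.5 * P(X1 = X2) from pooled midranks."""
    n1, n2 = len(group1), len(group2)
    effects = []
    for d in range(len(group1[0])):
        pooled = [row[d] for row in group1] + [row[d] for row in group2]
        r = midranks(pooled)
        mean1 = sum(r[:n1]) / n1
        mean2 = sum(r[n1:]) / n2
        effects.append((mean2 - mean1) / (n1 + n2) + 0.5)
    return effects

def statistic(group1, group2):
    """Illustrative global statistic: squared distance of the effects from 0.5."""
    return sum((p - 0.5) ** 2 for p in relative_effects(group1, group2))

def permutation_test(group1, group2, n_perm=2000, seed=0):
    """Permute whole observation vectors so dependence between endpoints is kept."""
    rng = random.Random(seed)
    observed = statistic(group1, group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:n1], pooled[n1:]) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: three units per group, two endpoints per unit.
g1 = [(1, 1), (2, 3), (3, 2)]
g2 = [(4, 6), (5, 4), (6, 5)]
effects = relative_effects(g1, g2)
observed, p_value = permutation_test(g1, g2)
```

Because only ranks enter the statistic, the same sketch applies unchanged to ordinal endpoints coded as ordered numbers. With so few units the permutation distribution is very coarse, so the p-value here is for illustration only.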
DFG Programme Research Grants
International Connection Austria
Co-Investigator Professor Dr. Arne Bathke
 
 

