
Mapping climate change awareness through spatial hierarchical clustering arxiv.org/abs/2409.13760 .AP

Climate change is a critical issue that will remain on the political agenda for decades to come. While it is important for this topic to be discussed at higher levels, it is also of paramount importance that populations become aware of the problem. As different countries may face more or less severe repercussions, it is also useful to understand the degree of awareness of specific populations. In this paper, we present a geographically-informed hierarchical clustering analysis aimed at identifying groups of countries with a similar level of climate change awareness. We employ a Ward-like clustering algorithm that combines information pertaining to climate change awareness, socio-economic factors, and climate-related characteristics of different countries with the physical distances between countries. To choose suitable values for the clustering hyperparameters, we propose a customized algorithm that takes into account within-cluster homogeneity and between-cluster separation, and that explicitly compares the geographically-informed and non-geographical partitionings. The results show that the geographically-informed clustering provides more stable partitions and leads to interpretable, geographically-compact aggregations compared to a clustering in which the geographical component is absent. In particular, we identify a clear contrast between Western countries, characterized by high and compact awareness, and Asian, African, and Middle Eastern countries, which show greater variability but lower awareness.
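
A minimal sketch of the general idea of blending a feature-based dissimilarity with a geographical one before Ward-style agglomeration (in the spirit of ClustGeo-type geographically constrained clustering), not the paper's actual pipeline; the data, the mixing weight alpha, and the number of clusters below are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): blend a feature-based
# dissimilarity with a geographical one, then run Ward-like hierarchical
# clustering on the mixture. Data, alpha, and cluster count are assumptions.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
n_countries = 50
awareness = rng.normal(size=(n_countries, 4))          # awareness + socio-economic indicators
coords = rng.uniform(-90, 90, size=(n_countries, 2))   # stand-in coordinates

d_feat = squareform(pdist(awareness))                  # feature dissimilarity
d_geo = squareform(pdist(coords))                      # geographical distance

# Normalize both matrices, then mix them with a hyperparameter alpha in [0, 1].
d_feat /= d_feat.max()
d_geo /= d_geo.max()
alpha = 0.3                                            # weight on the geographical component
d_mix = (1 - alpha) * d_feat + alpha * d_geo

# Ward-like agglomeration on the blended dissimilarity.
Z = linkage(squareform(d_mix, checks=False), method="ward")
labels = fcluster(Z, t=5, criterion="maxclust")
print(np.bincount(labels))                             # cluster sizes
```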


Supervised low-rank approximation of high-dimensional multivariate functional data via tensor decomposition arxiv.org/abs/2409.13819 .ME

Motivated by the challenges of analyzing high-dimensional ($p \gg n$) sequencing data from longitudinal microbiome studies, where samples are collected at multiple time points from each subject, we propose supervised functional tensor singular value decomposition (SupFTSVD), a novel dimensionality reduction method that leverages auxiliary information when reducing high-dimensional functional tensors. Although multivariate functional principal component analysis is a natural choice for dimensionality reduction of multivariate functional data, it becomes computationally burdensome in high-dimensional settings. Low-rank tensor decomposition is a feasible alternative and has gained popularity in recent literature, but existing methods in this realm are often incapable of simultaneously utilizing the temporal structure of the data and subject-level auxiliary information. SupFTSVD overcomes these limitations by generating low-rank representations of high-dimensional functional tensors while incorporating subject-level auxiliary information and accounting for the functional nature of the data. Moreover, SupFTSVD produces low-dimensional representations of subjects, features, and time, as well as subject-specific trajectories, providing valuable insights into the biological significance of variations within the data. In simulation studies, we demonstrate that our method achieves notable improvement in tensor approximation accuracy and loading estimation by utilizing auxiliary information. Finally, we apply SupFTSVD to two longitudinal microbiome studies, revealing biologically meaningful patterns in the data.
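
For intuition only, the sketch below runs a bare-bones rank-1 CP decomposition of a subjects x features x time array by alternating least squares and then relates the subject scores to covariates by ordinary least squares; this is a loose analogue of the low-rank structure SupFTSVD builds on, not the method itself, and all shapes and data are assumptions.

```python
# Minimal sketch (not SupFTSVD): rank-1 CP decomposition of a
# subjects x features x time tensor via alternating least squares,
# followed by a naive regression of subject scores on covariates.
import numpy as np

rng = np.random.default_rng(1)
n_subj, n_feat, n_time = 30, 200, 10
X = rng.normal(size=(n_subj, n_feat, n_time))   # stand-in longitudinal data
Z = rng.normal(size=(n_subj, 2))                # subject-level covariates

a = rng.normal(size=n_subj)                     # subject loadings
b = rng.normal(size=n_feat)                     # feature loadings
c = rng.normal(size=n_time)                     # time loadings

for _ in range(50):                             # alternating least-squares updates
    a = np.einsum("ijk,j,k->i", X, b, c) / ((b @ b) * (c @ c))
    b = np.einsum("ijk,i,k->j", X, a, c) / ((a @ a) * (c @ c))
    c = np.einsum("ijk,i,j->k", X, a, b) / ((a @ a) * (b @ b))

# A supervised variant could additionally relate the subject scores `a`
# to the covariates Z, e.g. via least squares:
beta, *_ = np.linalg.lstsq(Z, a, rcond=None)
print(a.shape, b.shape, c.shape, beta.shape)
```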


Jointly modeling time-to-event and longitudinal data with individual-specific change points: a case study in modeling tumor burden arxiv.org/abs/2409.13873 .ME .AP

In oncology clinical trials, tumor burden (TB) stands as a crucial longitudinal biomarker, reflecting the toll a tumor takes on a patient's prognosis. With certain treatments, the disease's natural progression shows the tumor burden initially receding before rising once more. Biologically, the point of change may differ between individuals and must have occurred between the patient's baseline measurement and progression time, implying a random-effects model obeying a bound constraint. However, in practice, patients may drop out of the study due to progression or death, presenting a non-ignorable missing data problem. In this paper, we introduce a novel joint model that combines time-to-event data and longitudinal data, where the latter is parameterized by a random change point augmented by random pre-slope and post-slope dynamics. Importantly, the model is equipped to incorporate covariates in both the longitudinal and survival submodels, adding significant flexibility. Adopting a Bayesian approach, we propose an efficient Hamiltonian Monte Carlo algorithm for parameter inference. We demonstrate the superiority of our approach compared to a longitudinal-only model via simulations and apply our method to an oncology data set.
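
As a hedged illustration of the data structure only, the sketch below simulates piecewise-linear trajectories with an individual-specific change point constrained to lie between baseline and an individual progression time, with random pre- and post-slopes; it does not implement the paper's joint model or its Hamiltonian Monte Carlo inference, and all distributions and values are assumptions.

```python
# Minimal sketch (not the paper's joint model): simulate tumor-burden-like
# trajectories with an individual-specific change point bounded between
# baseline and an individual progression time.
import numpy as np

rng = np.random.default_rng(2)
n = 8
t_prog = rng.uniform(6, 18, size=n)              # progression times (months)
tau = rng.uniform(0.0, 1.0, size=n) * t_prog     # change point in (0, t_prog)
intercept = rng.normal(50, 10, size=n)           # baseline tumor burden
pre_slope = -rng.uniform(1, 3, size=n)           # shrinkage before the change point
post_slope = rng.uniform(0.5, 2, size=n)         # regrowth after the change point

def trajectory(i, t):
    """Piecewise-linear mean tumor burden for subject i at times t."""
    before = intercept[i] + pre_slope[i] * np.minimum(t, tau[i])
    after = post_slope[i] * np.maximum(t - tau[i], 0.0)
    return before + after

t_grid = np.linspace(0, 18, 37)
y = np.array([trajectory(i, t_grid) + rng.normal(0, 2, size=t_grid.size)
              for i in range(n)])
print(y.shape)   # (8, 37): noisy observations a joint model would take as input
```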


High-dimensional learning of narrow neural networks arxiv.org/abs/2409.13904 cond-mat.dis-nn .ML .LG

Recent years have been marked by the fast-paced diversification and increasing ubiquity of machine learning applications. Yet, a firm theoretical understanding of the surprising efficiency of neural networks in learning from high-dimensional data still proves largely elusive. In this endeavour, analyses inspired by statistical physics have proven instrumental, enabling the tight asymptotic characterization of the learning of neural networks in high dimensions, for a broad class of solvable models. This manuscript reviews the tools and ideas underlying recent progress in this line of work. We introduce a generic model -- the sequence multi-index model -- which encompasses numerous previously studied models as special instances. This unified framework covers a broad class of machine learning architectures with a finite number of hidden units, including multi-layer perceptrons, autoencoders, and attention mechanisms, and tasks including (un)supervised learning, denoising, and contrastive learning, in the limit of large data dimension and a comparably large number of samples. We explicate in full detail the analysis of the learning of sequence multi-index models, using statistical physics techniques such as the replica method and approximate message-passing algorithms. This manuscript thus provides a unified presentation of analyses reported in several previous works, and a detailed overview of central techniques in the field of statistical physics of machine learning. This review should be a useful primer for machine learning theoreticians curious about statistical physics approaches; it should also be of value to statistical physicists interested in the transfer of such ideas to the study of neural networks.
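
To make the model class concrete, the sketch below generates data from a toy (non-sequential) multi-index model, in which the label depends on the high-dimensional input only through a few linear projections; this is a simplified special case for illustration, and the dimensions and nonlinearity are assumptions.

```python
# Minimal sketch: data from a toy multi-index model, where the label depends
# on the high-dimensional input only through K linear projections
# ("hidden units"). A simplified, non-sequential instance of the model class.
import numpy as np

rng = np.random.default_rng(3)
d, n, K = 1000, 2000, 3                          # dimension, samples, hidden units
W = rng.normal(size=(K, d)) / np.sqrt(d)         # teacher weight vectors

X = rng.normal(size=(n, d))                      # high-dimensional Gaussian inputs
H = X @ W.T                                      # K-dimensional "local fields"
y = np.sign(np.tanh(H).sum(axis=1))              # label through a nonlinearity

print(X.shape, y.shape)                          # data for a narrow-network learner
```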


Chauhan Weighted Trajectory Analysis of combined efficacy and safety outcomes for risk-benefit analysis arxiv.org/abs/2409.13946 .ME

Analyzing and effectively communicating the efficacy and toxicity of treatment is the basis of risk-benefit analysis (RBA). More efficient and objective tools are needed. We apply Chauhan Weighted Trajectory Analysis (CWTA) to perform RBA with superior objectivity, power, and clarity. We used CWTA to perform 1000-fold simulations of RCTs using ordinal endpoints for both treatment efficacy and toxicity. RCTs were simulated with 1:1 allocation at defined sample sizes and hazard ratios. We studied the simplest case of 3 levels each of toxicity and efficacy and the general case of the advanced cancer trial, with efficacy graded by five RECIST 1.1 health statuses and toxicity by the six-point CTCAE scale (6 x 5 matrix). The latter model was applied to a real-world dose-escalation phase I trial in advanced cancer. Simulations in both the 3 x 3 and the 6 x 5 advanced cancer matrix confirmed that drugs with both superior efficacy and toxicity profiles synergize for greater statistical power with CWTA-RBA. The CWTA-RBA 6 x 5 matrix reduced sample size requirements over CWTA efficacy-only analysis. Application to the dose-finding phase I clinical trial provided objective, statistically significant validation for the selected dose. CWTA-RBA, by incorporating both drug efficacy and toxicity, provides a single test statistic and plot that analyzes and effectively communicates therapeutic risks and benefits. CWTA-RBA requires fewer patients than CWTA efficacy-only analysis when the experimental drug is both more effective and less toxic. CWTA-RBA facilitates the objective and efficient assessment of new therapies throughout the drug development pathway. Furthermore, several advantages over competing tests in communicating risk-benefit will assist regulatory review, clinical adoption, and understanding of therapeutic risks and benefits by clinicians and patients alike.
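
The sketch below is not Chauhan Weighted Trajectory Analysis; it only simulates the 3 x 3 ordinal efficacy/toxicity setting for a two-arm trial and compares a crude composite score with a standard rank test, to make the kind of joint efficacy/toxicity data CWTA-RBA operates on concrete. The outcome probabilities and the composite score are assumptions.

```python
# Minimal sketch: simulate 3 x 3 ordinal efficacy/toxicity outcomes for a
# two-arm trial and compare a crude "benefit minus harm" composite with a
# rank test. NOT CWTA-RBA; purely an illustration of the data structure.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(4)
n_per_arm = 100
efficacy_levels = np.array([0, 1, 2])            # worse -> better response
toxicity_levels = np.array([0, 1, 2])            # none -> severe toxicity

def simulate_arm(p_eff, p_tox, n):
    eff = rng.choice(efficacy_levels, size=n, p=p_eff)
    tox = rng.choice(toxicity_levels, size=n, p=p_tox)
    return eff - tox                             # crude composite: benefit minus harm

control = simulate_arm([0.5, 0.3, 0.2], [0.2, 0.4, 0.4], n_per_arm)
treated = simulate_arm([0.2, 0.3, 0.5], [0.4, 0.4, 0.2], n_per_arm)

stat, p = mannwhitneyu(treated, control, alternative="greater")
print(f"rank-test p-value: {p:.4f}")
```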


Functional Factor Modeling of Brain Connectivity arxiv.org/abs/2409.13963 .ME .AP

Many fMRI analyses examine functional connectivity, or statistical dependencies among remote brain regions. Yet popular methods for studying whole-brain functional connectivity often yield results that are difficult to interpret. Factor analysis offers a natural framework in which to study such dependencies, particularly given its emphasis on interpretability. However, multivariate factor models break down when applied to functional and spatiotemporal data, like fMRI. We present a factor model for discretely-observed multidimensional functional data that is well-suited to the study of functional connectivity. Unlike classical factor models which decompose a multivariate observation into a "common" term that captures covariance between observed variables and an uncorrelated "idiosyncratic" term that captures variance unique to each observed variable, our model decomposes a functional observation into two uncorrelated components: a "global" term that captures long-range dependencies and a "local" term that captures short-range dependencies. We show that if the global covariance is smooth with finite rank and the local covariance is banded with potentially infinite rank, then this decomposition is identifiable. Under these conditions, recovery of the global covariance amounts to rank-constrained matrix completion, which we exploit to formulate consistent loading estimators. We study these estimators, and their more interpretable post-processed counterparts, through simulations, then use our approach to uncover a rich covariance structure in a collection of resting-state fMRI scans.
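
A small sketch of the covariance structure underlying the identifiability argument: a smooth, low-rank "global" covariance plus a banded "local" covariance, whose off-band entries coincide with the global part alone. This is not the paper's estimator, and the grid size, rank, and bandwidth are illustrative assumptions.

```python
# Minimal sketch (not the paper's estimator): construct a covariance as the
# sum of a smooth low-rank "global" part and a banded "local" part, and check
# that off-band entries come from the global part alone -- the fact that
# rank-constrained matrix completion exploits.
import numpy as np

p, rank, bandwidth = 60, 3, 2
t = np.linspace(0, 1, p)

# Smooth low-rank global covariance from a few smooth basis functions.
basis = np.column_stack([np.sin((k + 1) * np.pi * t) for k in range(rank)])
global_cov = basis @ basis.T

# Banded local covariance (short-range dependence only).
local_cov = np.exp(-3.0 * np.abs(t[:, None] - t[None, :]))
band_mask = np.abs(np.subtract.outer(np.arange(p), np.arange(p))) <= bandwidth
local_cov = local_cov * band_mask

total_cov = global_cov + local_cov

# Off-band entries of the total covariance equal the global covariance there,
# so low-rank completion of the off-band part recovers the global term.
off_band = ~band_mask
print(np.allclose(total_cov[off_band], global_cov[off_band]))  # True
```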


Batch Predictive Inference arxiv.org/abs/2409.13990 .ME

Constructing prediction sets with coverage guarantees for unobserved outcomes is a core problem in modern statistics. Methods for predictive inference have been developed for a wide range of settings, but usually only consider test data points one at a time. Here we study the problem of distribution-free predictive inference for a batch of multiple test points, aiming to construct prediction sets for functions -- such as the mean or median -- of any number of unobserved test datapoints. This setting includes constructing simultaneous prediction sets with a high probability of coverage, and selecting datapoints satisfying a specified condition while controlling the number of false claims. For the general task of predictive inference on a function of a batch of test points, we introduce a methodology called batch predictive inference (batch PI), and provide a distribution-free coverage guarantee under exchangeability of the calibration and test data. Batch PI requires the quantiles of a rank ordering function defined on certain subsets of ranks. While computing these quantiles is NP-hard in general, we show that it can be done efficiently in many cases of interest, most notably for batch score functions with a compositional structure -- which includes examples of interest such as the mean -- via a dynamic programming algorithm that we develop. Batch PI has advantages over naive approaches (such as partitioning the calibration data or directly extending conformal prediction) in many settings, as it can deliver informative prediction sets even using small calibration sample sizes. We illustrate that our procedures provide informative inference across the use cases mentioned above, through experiments on both simulated data and a drug-target interaction dataset.
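
For context, the sketch below computes standard split-conformal intervals for individual test points and then naively averages the per-point bounds to get a heuristic interval for the batch mean; this illustrates the kind of naive baseline the abstract contrasts with batch PI, not the batch PI procedure itself, and the model and data are assumptions.

```python
# Minimal sketch: split-conformal intervals per test point, plus a naive
# averaged interval for the batch mean. NOT the batch PI algorithm.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(5)
def make_data(n):
    X = rng.normal(size=(n, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=1.0, size=n)
    return X, y

X_train, y_train = make_data(500)
X_cal, y_cal = make_data(200)
X_test, y_test = make_data(10)                   # a batch of 10 test points

model = LinearRegression().fit(X_train, y_train)

# Split conformal: calibrate the absolute-residual quantile.
alpha = 0.1
scores = np.abs(y_cal - model.predict(X_cal))
k = int(np.ceil((1 - alpha) * (len(scores) + 1)))
q = np.sort(scores)[k - 1]

pred = model.predict(X_test)
lower, upper = pred - q, pred + q                # marginal per-point intervals

# Naive heuristic for the batch mean: average the per-point bounds (no
# finite-sample guarantee in general -- the kind of naive extension the
# abstract contrasts with batch PI).
print(lower.mean(), y_test.mean(), upper.mean())
```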


Forecasting Causal Effects of Future Interventions: Confounding and Transportability Issues arxiv.org/abs/2409.13060 .ME

Recent developments in causal inference allow us to transport a causal effect of a time-fixed treatment from a randomized trial to a target population across space but within the same time frame. In contrast to transportability across space, transporting causal effects across time or forecasting causal effects of future interventions is more challenging due to time-varying confounders and time-varying effect modifiers. In this article, we seek to formally clarify the causal estimands for forecasting causal effects over time and the structural assumptions required to identify these estimands. Specifically, we develop a set of novel nonparametric identification formulas--g-computation formulas--for these causal estimands, and lay out the conditions required to accurately forecast causal effects from a past observed sample to a future population in a future time window. Our overarching objective is to leverage modern causal inference theory to provide a theoretical framework for investigating whether the effects seen in a past sample would carry over to a new future population. Throughout the article, a working example addressing the effect of public policies or social events on COVID-related deaths is considered to contextualize the development of the analytical results.
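
As a concrete reference point, the sketch below performs textbook g-computation (standardization) for a time-fixed binary treatment with a single confounder; it does not implement the paper's forecasting or transportability estimands, and the simulated data-generating process is an assumption.

```python
# Minimal sketch: textbook g-computation (standardization) for a time-fixed
# binary treatment, to make the "g-computation formula" concrete.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(6)
n = 5000
L = rng.normal(size=n)                           # baseline confounder
A = rng.binomial(1, 1 / (1 + np.exp(-L)))        # treatment depends on L
Y = 2.0 * A + 1.5 * L + rng.normal(size=n)       # outcome; true effect = 2

# Fit an outcome model E[Y | A, L], then standardize over the observed L.
X = np.column_stack([A, L])
outcome_model = LinearRegression().fit(X, Y)

Y1 = outcome_model.predict(np.column_stack([np.ones(n), L]))   # set A = 1
Y0 = outcome_model.predict(np.column_stack([np.zeros(n), L]))  # set A = 0
print("g-computation estimate of the causal effect:", (Y1 - Y0).mean())
```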


Stochastic mirror descent for nonparametric adaptive importance sampling arxiv.org/abs/2409.13272 .ST .TH

This paper addresses the problem of approximating an unknown probability distribution with density $f$ -- which can only be evaluated up to an unknown scaling factor -- with the help of a sequential algorithm that produces at each iteration $n\geq 1$ an estimated density $q_n$. The proposed method optimizes the Kullback-Leibler divergence using a mirror descent (MD) algorithm directly on the space of density functions, while a stochastic approximation technique helps to manage the trade-off between algorithm complexity and variability. One of the key innovations of this work is the theoretical guarantee that is provided for an algorithm with a fixed MD learning rate $\eta \in (0,1)$. The main result is that the sequence $q_n$ converges almost surely to the target density $f$ uniformly on compact sets. Through numerical experiments, we show that fixing the learning rate $\eta \in (0,1)$ significantly improves the algorithm's performance, particularly in the context of multi-modal target distributions, where a small value of $\eta$ increases the chance of finding all modes. Additionally, we propose a particle subsampling method to enhance computational efficiency and compare our method against other approaches through numerical experiments.
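
For intuition, the sketch below iterates an idealized entropic mirror-descent update for minimizing KL(q || f), namely q_{n+1} proportional to q_n^{1-eta} * f^eta, on a 1-D grid with a bimodal target known only up to a constant; the paper's algorithm is a stochastic, particle-based version of such a scheme, and the target, grid, and eta here are assumptions.

```python
# Minimal sketch: idealized entropic mirror-descent update for minimizing
# KL(q || f) on a 1-D grid. Each step applies q <- q^(1-eta) * f^eta and
# renormalizes; the stochastic/particle aspects of the paper are omitted.
import numpy as np

x = np.linspace(-8, 8, 2001)
dx = x[1] - x[0]

def unnormalized_target(x):
    # Bimodal target known only up to a constant.
    return np.exp(-0.5 * (x - 3) ** 2) + 0.5 * np.exp(-0.5 * (x + 3) ** 2)

f = unnormalized_target(x)
q = np.exp(-0.5 * x**2)                          # initial density (standard normal)
q /= (q * dx).sum()

eta = 0.3
for n in range(50):
    q = q ** (1 - eta) * f ** eta                # mirror-descent step
    q /= (q * dx).sum()                          # renormalize on the grid

f_norm = f / (f * dx).sum()
print("L1 distance to target:", (np.abs(q - f_norm) * dx).sum())
```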


Validity of Feature Importance in Low-Performing Machine Learning for Tabular Biomedical Data arxiv.org/abs/2409.13342 .ML .LG

In tabular biomedical data analysis, tuning models to high accuracy is considered a prerequisite for discussing feature importance, as medical practitioners expect the validity of feature importance to correlate with performance. In this work, we challenge the prevailing belief, showing that low-performing models may also be used for feature importance. We propose experiments to observe changes in feature rank as performance degrades sequentially. Using three synthetic datasets and six real biomedical datasets, we compare the rank of features from full datasets to those with reduced sample sizes (data cutting) or fewer features (feature cutting). In synthetic datasets, feature cutting does not change feature rank, while data cutting shows higher discrepancies with lower performance. In real datasets, feature cutting shows similar or smaller changes than data cutting, though some datasets exhibit the opposite. When feature interactions are controlled by removing correlations, feature cutting consistently shows better stability. By analyzing the distribution of feature importance values and theoretically examining the probability that the model cannot distinguish feature importance between features, we reveal that models can still distinguish feature importance despite performance degradation through feature cutting, but not through data cutting. We conclude that the validity of feature importance can be maintained even at low performance levels provided the data size is adequate; inadequate data size is itself a significant factor contributing to suboptimal performance in tabular medical data analysis. This paper demonstrates the potential of using feature importance analysis alongside statistical analysis to compare features relatively, even when classifier performance is not satisfactory.
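
A minimal sketch of the kind of comparison described above: feature-importance ranks from a random forest on the full data versus on a sample-reduced ("data cutting") and a feature-reduced ("feature cutting") version, compared by rank correlation. The synthetic data, model, and cut fractions are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch: compare feature-importance rank stability under data
# cutting (fewer samples) versus feature cutting (fewer features).
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=8,
                           random_state=0)

def importances(X, y):
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
    return rf.feature_importances_

full = importances(X, y)

# Data cutting: keep 20% of the samples, all features.
idx = np.random.default_rng(0).choice(len(y), size=len(y) // 5, replace=False)
data_cut = importances(X[idx], y[idx])

# Feature cutting: keep the first 10 features, all samples.
feat_cut = importances(X[:, :10], y)

rho_data, _ = spearmanr(full, data_cut)
rho_feat, _ = spearmanr(full[:10], feat_cut)
print("data cutting rank correlation:", rho_data)
print("feature cutting rank correlation (shared features):", rho_feat)
```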
