The Statistical Accuracy of Neural Posterior and Likelihood Estimation arxiv.org/abs/2411.12068 .ML .ST .CO .TH .LG

A Comparison of Zero-Inflated Models for Modern Biomedical Data arxiv.org/abs/2411.12086 .ME .AP

Asymptotics in Multiple Hypotheses Testing under Dependence: beyond Normality arxiv.org/abs/2411.12119 .ST .ME .TH

Exact Risk Curves of signSGD in High-Dimensions: Quantifying Preconditioning and Noise-Compression Effects arxiv.org/abs/2411.12135 .ML .LG

Tangential Randomization in Linear Bandits (TRAiL): Guaranteed Inference and Regret Bounds arxiv.org/abs/2411.12154 .ML .SY .LG

Sensor-fusion based Prognostics Framework for Complex Engineering Systems Exhibiting Multiple Failure Modes arxiv.org/abs/2411.12159 .ML .SY .AP .LG

Inference for overparametrized hierarchical Archimedean copulas arxiv.org/abs/2411.10615 .ME

Hierarchical Archimedean copulas (HACs) are multivariate uniform distributions constructed by nesting Archimedean copulas into one another, and they provide a flexible approach to modeling non-exchangeable data. However, this flexibility in the model structure may lead to over-fitting when the model estimation procedure is not performed properly. In this paper, we examine the problem of structure estimation and, more generally, the selection of a parsimonious model from the hypothesis testing perspective. Formal tests for structural hypotheses concerning HACs have been lacking so far, most likely due to the restrictions on their associated parameter space, which hinder the use of standard inference methodology. Building on previously developed asymptotic methods for these non-standard parameter spaces, we provide an asymptotic stochastic representation for the maximum likelihood estimators of (potentially) overparametrized HACs, which we then use to formulate a likelihood ratio test for certain common structural hypotheses. Additionally, we derive analytical expressions for the first- and second-order partial derivatives of two-level HACs based on Clayton and Gumbel generators, as well as general numerical approximation schemes for the Fisher information matrix.
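The numerical Fisher-information schemes are not spelled out in the abstract; as a hedged illustration of the generic Monte Carlo approach, the sketch below estimates the (scalar) Fisher information of a plain, non-hierarchical bivariate Clayton copula by averaging the squared score, with the score taken by central differences of the log-density. All function names, the step size h, and the sample size are illustrative choices, not the authors' scheme.

```python
import numpy as np

def clayton_logpdf(u, v, theta):
    # Log-density of the bivariate Clayton copula, theta > 0:
    # c(u,v) = (1+theta) (uv)^(-theta-1) (u^-theta + v^-theta - 1)^(-1/theta - 2)
    s = u**(-theta) + v**(-theta) - 1.0
    return (np.log1p(theta)
            - (theta + 1.0) * (np.log(u) + np.log(v))
            - (1.0 / theta + 2.0) * np.log(s))

def clayton_sample(n, theta, rng):
    # Marshall-Olkin sampler: V ~ Gamma(1/theta), U_i = (1 + E_i / V)^(-1/theta)
    v = rng.gamma(1.0 / theta, size=n)
    e = rng.exponential(size=(n, 2))
    return (1.0 + e / v[:, None]) ** (-1.0 / theta)

def fisher_info_mc(theta, n=200_000, h=1e-4, seed=0):
    # I(theta) = E[score^2]; score approximated by a central difference.
    rng = np.random.default_rng(seed)
    uv = clayton_sample(n, theta, rng)
    score = (clayton_logpdf(uv[:, 0], uv[:, 1], theta + h)
             - clayton_logpdf(uv[:, 0], uv[:, 1], theta - h)) / (2.0 * h)
    return np.mean(score**2)

print(fisher_info_mc(2.0))  # Monte Carlo Fisher information at theta = 2
```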

Wasserstein Spatial Depth arxiv.org/abs/2411.10646 .ST .ME .TH

Modeling observations as random distributions embedded within Wasserstein spaces is becoming increasingly popular across scientific fields, as it captures the variability and geometric structure of the data more effectively. However, the distinct geometry and unique properties of Wasserstein space pose challenges to the application of conventional statistical tools, which are primarily designed for Euclidean spaces. Consequently, adapting and developing new methodologies for analysis within Wasserstein spaces has become essential. The space of distributions on $\mathbb{R}^d$ with $d>1$ is not linear, and only ``mimics'' the geometry of a Riemannian manifold. In this paper, we extend the concept of statistical depth to distribution-valued data, introducing the notion of {\it Wasserstein spatial depth}. This new measure provides a way to rank and order distributions, enabling the development of order-based clustering techniques and inferential tools. We show that Wasserstein spatial depth (WSD) preserves critical properties of conventional statistical depths: it ranges within $[0,1]$, is transformation invariant, vanishes at infinity, reaches its maximum at the geometric median, and is continuous. Additionally, the population WSD has a straightforward plug-in estimator based on sampled empirical distributions. We establish the estimator's consistency and asymptotic normality. Extensive simulations and a real-data application showcase the practical efficacy of WSD.
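For intuition only: in one dimension, Wasserstein-2 space embeds isometrically into $L^2(0,1)$ via quantile functions, so the classical spatial depth computed in that embedding gives a cheap stand-in for a Wasserstein spatial depth. The sketch below is this hedged one-dimensional analogue, not the paper's intrinsic definition for $d>1$; the quantile grid, the toy Gaussian family, and all names are assumptions.

```python
import numpy as np
from scipy.stats import norm

def spatial_depth_quantile(q0, qs):
    # Spatial depth of quantile function q0 among the rows of qs, using the
    # discrete L2(0,1) norm ||f|| ~ sqrt(mean(f^2)) on a uniform grid.
    diffs = qs - q0
    norms = np.sqrt((diffs**2).mean(axis=1))
    keep = norms > 1e-12                      # drop exact ties with q0
    units = diffs[keep] / norms[keep, None]   # unit "directions" in L2(0,1)
    mean_dir = units.mean(axis=0)
    return 1.0 - np.sqrt((mean_dir**2).mean())

p = np.linspace(0.01, 0.99, 99)               # quantile grid on (0, 1)
qs = np.array([norm.ppf(p, loc=m) for m in np.linspace(-2, 2, 21)])
print(spatial_depth_quantile(norm.ppf(p, loc=0.0), qs))  # central: depth near 1
print(spatial_depth_quantile(norm.ppf(p, loc=5.0), qs))  # remote: depth near 0
```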

Series Expansion of Probability of Correct Selection for Improved Finite Budget Allocation in Ranking and Selection arxiv.org/abs/2411.10695 .ML .OC .LG

This paper addresses the challenge of improving finite sample performance in Ranking and Selection by developing a Bahadur-Rao type expansion for the Probability of Correct Selection (PCS). While traditional large deviations approximations capture PCS behavior in the asymptotic regime, they can lack precision in finite sample settings. Our approach enhances PCS approximation under limited simulation budgets, providing a more accurate characterization of optimal sampling ratios and budget-dependent optimality conditions. Algorithmically, we propose a novel finite budget allocation (FCBA) policy, which sequentially estimates the optimality conditions and balances the sampling ratios accordingly. We illustrate numerically on toy examples that our FCBA policy achieves superior PCS performance compared to the traditional methods tested. As an extension, we note that the non-monotonic PCS behavior described in the literature for low-confidence scenarios can be attributed to neglecting simultaneous incorrect binary comparisons in PCS approximations. We provide a refined expansion and a tailored allocation strategy to handle low-confidence scenarios, addressing the non-monotonicity issue.
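To make the objective concrete: PCS is the probability that the system with the highest true mean also has the highest sample mean once the budget is spent. The sketch below is a minimal Monte Carlo evaluation of PCS for Gaussian systems under a fixed allocation vector; the means, variances, and allocations are arbitrary toy values, and this is the quantity a policy like FCBA tunes, not the paper's expansion or policy itself.

```python
import numpy as np

def pcs_mc(means, sds, alloc, budget, reps=20_000, seed=0):
    # Probability of Correct Selection: spend `budget` samples across the
    # systems in proportions `alloc`, pick the largest sample mean, and
    # count how often the true best system wins.
    rng = np.random.default_rng(seed)
    n = np.maximum(1, np.round(budget * np.asarray(alloc)).astype(int))
    best = int(np.argmax(means))
    wins = 0
    for _ in range(reps):
        # sample mean of k draws from N(m, s^2) is N(m, s^2 / k)
        xbar = [rng.normal(m, s / np.sqrt(k)) for m, s, k in zip(means, sds, n)]
        wins += int(np.argmax(xbar) == best)
    return wins / reps

means, sds = [1.0, 0.8, 0.5], [1.0, 1.0, 1.0]
print(pcs_mc(means, sds, [1/3, 1/3, 1/3], budget=90))  # equal allocation
print(pcs_mc(means, sds, [0.3, 0.5, 0.2], budget=90))  # tilt toward the close rival
```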

Variance bounds and robust tuning for pseudo-marginal Metropolis--Hastings algorithms arxiv.org/abs/2411.10785 .ST .CO .TH

The general applicability and ease of use of the pseudo-marginal Metropolis--Hastings (PMMH) algorithm, and particle Metropolis--Hastings in particular, make it a popular method for inference on discretely observed Markovian stochastic processes. The performance of these algorithms and, in the case of particle Metropolis--Hastings, the trade-off between improved mixing through increased accuracy of the estimator and the computational cost were investigated independently in two papers, both published in 2015. Each suggested choosing the number of particles so that the variance of the logarithm of the estimator of the posterior at a fixed, sensible parameter value is approximately 1. This advice has been widely and successfully adopted. We provide new, remarkably simple upper and lower bounds on the asymptotic variance of PMMH algorithms. The bounds explain how blindly following the 2015 advice can hide serious issues with the algorithm, and they strongly suggest an alternative criterion. In most situations our guidelines and those from 2015 closely coincide; however, when the two differ it is safer to follow the new guidance. An extension of one of our bounds shows how the use of correlated proposals can fundamentally shift the properties of pseudo-marginal algorithms, so that asymptotic variances that were infinite under the PMMH kernel become finite.
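For readers who have not met the algorithm: PMMH runs a standard Metropolis--Hastings chain but plugs an unbiased, noisy estimate of the likelihood into the acceptance ratio, keeping the current estimate fixed until a proposal is accepted. The sketch below uses a toy Gaussian model in which the log-estimator noise is exactly Gaussian with mean $-\sigma^2/2$, so the estimator is unbiased on the natural scale; sigma stands in for the tunable noise level that the 2015 advice sets to about 1. A minimal sketch under those assumptions, not the paper's bounds.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(0.5, 1.0, size=50)           # toy data, true mean 0.5

def loglik(theta):
    return -0.5 * np.sum((data - theta) ** 2)  # N(theta, 1) model, up to a constant

def noisy_loglik(theta, sigma):
    # log L_hat = log L + N(-sigma^2/2, sigma^2)  =>  E[L_hat] = L (unbiased)
    return loglik(theta) + rng.normal(-0.5 * sigma**2, sigma)

def pmmh(n_iter=20_000, step=0.3, sigma=1.0):
    theta = 0.0
    log_lhat = noisy_loglik(theta, sigma)      # current estimate is kept, not refreshed
    chain = np.empty(n_iter)
    for i in range(n_iter):
        prop = theta + step * rng.normal()
        log_lhat_prop = noisy_loglik(prop, sigma)
        # flat prior, symmetric proposal: accept on the ratio of *estimates*
        if np.log(rng.uniform()) < log_lhat_prop - log_lhat:
            theta, log_lhat = prop, log_lhat_prop
        chain[i] = theta
    return chain

chain = pmmh(sigma=1.0)   # sigma = 1 mimics the var(log-estimator) ~ 1 tuning rule
print(chain[5_000:].mean(), chain[5_000:].std())
```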

Power and Sample Size Calculations for Cluster Randomized Hybrid Type 2 Effectiveness-Implementation Studies arxiv.org/abs/2411.08929 .ME

Hybrid studies allow investigators to simultaneously study an intervention effectiveness outcome and an implementation research outcome. In particular, type 2 hybrid studies support research that places equal importance on both outcomes, rather than focusing on one and only secondarily on the other (as in type 1 and type 3 studies). Hybrid type 2 studies introduce the statistical issue of multiple testing, complicated by the fact that they are typically also cluster randomized trials. Standard statistical methods do not apply in this scenario. Here, we describe the design methodologies available for validly powering hybrid type 2 studies and producing reliable sample size calculations in a cluster-randomized design, with a focus on binary outcomes. Through a literature search, 18 publications were identified that included methods relevant to the design of hybrid type 2 studies. Five methods were identified, two of which did not account for clustering but are extended in this article to do so: the combined outcomes approach and the single 1-degree-of-freedom (1-DF) combined test. Procedures for powering hybrid type 2 studies using these five methods are described and illustrated using input parameters inspired by a study from the Community Intervention to Reduce CardiovascuLar Disease in Chicago (CIRCL-Chicago) Implementation Research Center. In this illustrative example, the intervention effectiveness outcome was controlled blood pressure and the implementation outcome was reach. The conjunctive test resulted in higher power than the popular p-value adjustment methods, and the newly extended combined outcomes approach and single 1-DF test were found to be the most powerful among all of the tests.
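As a hedged back-of-the-envelope companion: a conjunctive (intersection-union) test rejects only if both outcome tests reject, each at the full level alpha, and clustering is commonly folded in through the design effect 1 + (m-1)*ICC. The sketch below computes approximate conjunctive power under the simplifying assumption that the two test statistics are independent; every input number is hypothetical, not taken from the CIRCL-Chicago example, and this is not the authors' procedure.

```python
import numpy as np
from scipy.stats import norm

def power_two_prop(p1, p0, n_per_arm, m, icc, alpha):
    # Approximate power of a two-sided z-test for a difference in proportions
    # in a cluster-randomized design: shrink n by the design effect.
    deff = 1.0 + (m - 1) * icc        # m = cluster size, icc = intracluster corr.
    n_eff = n_per_arm / deff          # effective subjects per arm
    se = np.sqrt(p1 * (1 - p1) / n_eff + p0 * (1 - p0) / n_eff)
    return norm.cdf(abs(p1 - p0) / se - norm.ppf(1 - alpha / 2))

# Hypothetical inputs: effectiveness (blood pressure control) and
# implementation (reach) proportions under intervention vs. control.
pw_eff  = power_two_prop(0.60, 0.50, n_per_arm=500, m=25, icc=0.05, alpha=0.05)
pw_impl = power_two_prop(0.55, 0.40, n_per_arm=500, m=25, icc=0.05, alpha=0.05)
print(pw_eff, pw_impl, pw_eff * pw_impl)  # product = conjunctive power if independent
```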

Using Principal Progression Rate to Quantify and Compare Disease Progression in Comparative Studies arxiv.org/abs/2411.08984 .ME

In comparative studies of progressive diseases, such as randomized controlled trials (RCTs), the mean Change From Baseline (CFB) of a continuous outcome at a pre-specified follow-up time across subjects in the target population is a standard estimand used to summarize the overall disease progression. Despite its simplicity of interpretation, the mean CFB may not efficiently capture important features of the trajectory of the mean outcome relevant to the evaluation of the treatment effect of an intervention. Additionally, the estimation of the mean CFB does not use all longitudinal data points. To address these limitations, we propose a class of estimands called the Principal Progression Rate (PPR). The PPR is a weighted average of the local, or instantaneous, slope of the trajectory of the population mean during the follow-up. The flexibility of the weight function allows the PPR to cover a broad class of intuitive estimands, including the mean CFB, the slope of the ordinary least-squares fit to the trajectory, and the area under the curve. We show that properly chosen PPRs can enhance statistical power over the mean CFB by amplifying the signal of the treatment effect and/or improving estimation precision. We evaluate different versions of the PPR and the performance of their estimators through numerical studies. A real dataset is analyzed to demonstrate the advantage of using an alternative PPR over the mean CFB.
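Two of the named special cases are easy to verify numerically. If $m(t)$ is the mean trajectory on $[0,T]$, the weight $w(t)=1$ turns the PPR integral $\int_0^T w(t)\,m'(t)\,dt$ into $m(T)-m(0)$, the mean CFB, and (by integration by parts) $w(t)=6t(T-t)/T^3$ recovers the OLS slope of $m(t)$ on $t$. The quadratic trajectory below is hypothetical, chosen only for the check; none of this is the paper's data or notation.

```python
import numpy as np

T = 12.0
t = np.linspace(0.0, T, 2001)
dt = t[1] - t[0]
m = 2.0 * t + 0.05 * t**2        # hypothetical mean trajectory m(t)
dm = np.gradient(m, t)           # instantaneous slope m'(t)

# Weight w(t) = 1: the PPR integral collapses to the mean CFB, m(T) - m(0).
print(np.sum(dm) * dt, m[-1] - m[0])

# Weight w(t) = 6 t (T - t) / T^3: the PPR equals the OLS slope of m(t) on t.
w = 6.0 * t * (T - t) / T**3
print(np.sum(w * dm) * dt, np.polyfit(t, m, 1)[0])
```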
