Attention Mechanism for Lithium-Ion Battery Lifespan Prediction: Temporal and Cyclic Attention. (arXiv:2311.10792v1 [cs.LG]) arxiv.org/abs/2311.10792

Accurately predicting the lifespan of lithium-ion batteries (LIBs) is pivotal for optimizing usage and preventing accidents. Previous prediction models often relied on inputs that are challenging to measure during real-time operation and failed to comprehensively capture intra-cycle and inter-cycle data patterns, features essential for accurate prediction. In this study, we employ attention mechanisms (AM) to develop data-driven models for predicting LIB lifespan from easily measurable inputs such as voltage, current, temperature, and capacity data. The developed model integrates recurrent neural network (RNN) and convolutional neural network (CNN) components and features two types of attention: temporal attention (TA) and cyclic attention (CA). TA identifies important time steps within each cycle by scoring the hidden states of the RNN, whereas CA captures key features of inter-cycle correlations through self-attention (SA). This improves model accuracy and elucidates the critical features in the input data. To validate the method, we apply it to publicly available cycling data consisting of three batches of cycling modes. The calculated TA scores highlight the rest phase as a key characteristic distinguishing LIB data among batches, and the CA scores reveal variations in the importance of cycles across batches. Leveraging the CA scores, we explore reducing the number of cycles in the input data: single-head and multi-head attention allow the input dimension to be reduced from 100 cycles to 50 and 30 cycles, respectively.
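
To make the temporal-attention idea concrete, here is a minimal sketch (not the authors' exact architecture) of an RNN whose hidden states are scored by a learned attention layer; the four input channels stand in for voltage, current, temperature, and capacity, and the GRU, hidden size, and layer shapes are illustrative assumptions.

```python
# Minimal sketch of temporal attention over RNN hidden states within one cycle.
# The GRU, hidden size, and input layout are assumptions, not the paper's exact model.
import torch
import torch.nn as nn

class TemporalAttentionRNN(nn.Module):
    def __init__(self, n_features=4, hidden_size=32):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden_size, batch_first=True)
        self.score = nn.Linear(hidden_size, 1)          # one scalar score per time step

    def forward(self, x):                                # x: (batch, time, n_features)
        h, _ = self.rnn(x)                               # h: (batch, time, hidden)
        alpha = torch.softmax(self.score(h), dim=1)      # attention weights over time steps
        context = (alpha * h).sum(dim=1)                 # attention-weighted summary of the cycle
        return context, alpha.squeeze(-1)

# Example: a batch of 8 cycles, 200 time steps each, 4 measured signals per step.
x = torch.randn(8, 200, 4)
context, weights = TemporalAttentionRNN()(x)
print(context.shape, weights.shape)                      # (8, 32) and (8, 200)
```

The per-step weights play the role of the TA scores discussed above; a cyclic-attention stage would apply self-attention across the per-cycle context vectors in the same spirit.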

Addressing Population Heterogeneity for HIV Incidence Estimation Based on Recency Test. (arXiv:2311.10848v1 [stat.ME]) arxiv.org/abs/2311.10848

Cross-sectional HIV incidence estimation leverages recency test results to determine the HIV incidence of a population of interest, where the recency test uses biomarker profiles to infer whether an HIV-positive individual was "recently" infected. This approach has an obvious advantage over the conventional cohort follow-up method, since it avoids longitudinal follow-up and repeated HIV testing. In this manuscript, we extend cross-sectional incidence estimation to a different target population, addressing potential population heterogeneity. We propose a general framework that covers two settings: one in which the target population is a subset of the population with cross-sectional recency-testing data (e.g., leveraging recency-testing data from screening in an active-arm trial design), and one with an external target population. We also propose a method to incorporate HIV subtype, a special covariate that modifies the properties of the recency test, into our framework. Through extensive simulation studies and a data application, we demonstrate the excellent performance of the proposed methods. We conclude with a discussion of sensitivity analysis and future work to improve our framework.
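
For orientation, the sketch below implements a standard adjusted cross-sectional ("snapshot"-type) incidence estimator of the kind this line of work builds on; the exact adjustment shown, and all numbers in the example, are my assumptions rather than the paper's proposed heterogeneity-aware estimator.

```python
# Standard adjusted cross-sectional incidence estimator (an assumed baseline,
# not the paper's proposed method). All quantities refer to one survey.
def cross_sectional_incidence(n_neg, n_pos, n_recent, mdri_yrs, frr, cutoff_yrs=2.0):
    """n_neg: HIV-negative count; n_pos: HIV-positive count; n_recent: positives
    classified 'recent'; mdri_yrs: mean duration of recent infection (years);
    frr: false-recent rate; cutoff_yrs: recency cut-off time."""
    return (n_recent - frr * n_pos) / (n_neg * (mdri_yrs - frr * cutoff_yrs))

# Illustrative numbers only: roughly 3.5% incidence per person-year.
print(cross_sectional_incidence(n_neg=8000, n_pos=2000, n_recent=120,
                                mdri_yrs=130 / 365.25, frr=0.015))
```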

Covariate adjustment in randomized experiments with missing outcomes and covariates. (arXiv:2311.10877v1 [stat.ME]) arxiv.org/abs/2311.10877

Covariate adjustment can improve precision in estimating treatment effects from randomized experiments. With fully observed data, regression adjustment and propensity score weighting are two asymptotically equivalent methods for covariate adjustment in randomized experiments. We show that this equivalence breaks down in the presence of missing outcomes, with regression adjustment no longer ensuring efficiency gain when the true outcome model is not linear in covariates. Propensity score weighting, in contrast, still guarantees efficiency over unadjusted analysis, and including more covariates in adjustment never harms asymptotic efficiency. Moreover, we establish the value of using partially observed covariates to secure additional efficiency. Based on these findings, we recommend a simple double-weighted estimator for covariate adjustment with incomplete outcomes and covariates: (i) impute all missing covariates by zero, and use the union of the completed covariates and corresponding missingness indicators to estimate the probability of treatment and the probability of having observed outcome for all units; (ii) estimate the average treatment effect by the coefficient of the treatment from the least-squares regression of the observed outcome on the treatment, where we weight each unit by the inverse of the product of these two estimated probabilities.
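
The double-weighted recipe in steps (i)-(ii) translates almost directly into code. The sketch below is one way to implement it, with logistic regression for both probability models and the usual convention of weighting each unit by the probability of the treatment it actually received; those modeling choices are assumptions on my part.

```python
# Sketch of the double-weighted estimator described above. The logistic models
# and the treatment-received weighting convention are assumptions, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

def double_weighted_ate(X, Z, Y, R):
    """X: covariates with NaN where missing; Z: treatment indicator (0/1);
    Y: outcome (only meaningful where observed); R: 1 if Y is observed, else 0."""
    # (i) zero-impute covariates and append missingness indicators
    M = np.isnan(X).astype(float)
    W = np.hstack([np.nan_to_num(X, nan=0.0), M])

    e_hat = LogisticRegression(max_iter=1000).fit(W, Z).predict_proba(W)[:, 1]     # P(Z=1 | W)
    WZ = np.hstack([W, Z[:, None]])
    r_hat = LogisticRegression(max_iter=1000).fit(WZ, R).predict_proba(WZ)[:, 1]   # P(R=1 | W, Z)

    # (ii) weighted least squares of the observed outcome on (1, treatment)
    w = 1.0 / (np.where(Z == 1, e_hat, 1.0 - e_hat) * r_hat)
    obs = R == 1
    D = np.column_stack([np.ones(obs.sum()), Z[obs]])
    wo = w[obs]
    beta = np.linalg.solve(D.T @ (D * wo[:, None]), D.T @ (Y[obs] * wo))
    return beta[1]                                     # coefficient on treatment = ATE estimate
```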

Short-term Volatility Estimation for High Frequency Trades using Gaussian processes (GPs). (arXiv:2311.10935v1 [q-fin.ST]) arxiv.org/abs/2311.10935

A fundamental premise of financial markets is that stock prices are intrinsically complex and stochastic. One source of this complexity is the volatility associated with stock prices. Volatility is a tendency for prices to change unexpectedly [1]. Price volatility is often detrimental to returns, so investors should factor it into investment decisions and temporary or permanent portfolio moves. It is therefore crucial to make accurate, regular short- and long-term stock price volatility forecasts for the safety and economics of investors' returns. Different models and methods, such as ARCH and GARCH models, have been implemented to make such forecasts. However, these traditional approaches fail to capture short-term volatility effectively. This paper therefore investigates and implements a combination of numeric and probabilistic models for short-term volatility and return forecasting in high-frequency trading. One-day-ahead volatility forecasts were made with Gaussian processes (GPs) applied to the outputs of a numerical market prediction (NMP) model. First, the stock price data from the NMP model were corrected by a GP. Since it is not easy to set price limits in a free and random market, a censored GP was then used to model the relationship between the corrected stock prices and returns. Forecasting errors were evaluated using the implied and estimated data.
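
As a rough illustration of the GP step only, the sketch below fits a Gaussian process to recent realized volatilities and produces one-day-ahead forecasts with uncertainty bands; the input series stands in for the NMP model's corrected output, and the censored-GP return model is omitted, so everything here is an assumed simplification.

```python
# Minimal GP sketch for one-day-ahead volatility (assumed simplification: the
# series below stands in for NMP-corrected data; the censored GP is not shown).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
vol = np.abs(rng.normal(0.01, 0.004, size=300))        # stand-in daily realized volatility

window = 5                                              # use the last 5 days as features
X = np.array([vol[t - window:t] for t in range(window, len(vol))])
y = vol[window:]                                        # next-day volatility targets

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X[:250], y[:250])
mean, std = gp.predict(X[250:], return_std=True)        # forecasts with predictive std. dev.
print(mean[:3], std[:3])
```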

Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes. (arXiv:2311.10972v1 [cs.LG]) arxiv.org/abs/2311.10972

We investigate the complexity of training a two-layer ReLU neural network with weight decay regularization. Previous research has shown that the optimal solution of this problem can be found by solving a standard cone-constrained convex program. Using this convex formulation, we prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it. In particular, when $ε \leq \sqrt{84/83} - 1 \approx 0.006$, we show that it is NP-hard to find an approximate global optimizer of the ReLU network objective with relative error $ε$ with respect to the objective value. Moreover, we develop a randomized algorithm which mirrors the Goemans-Williamson rounding of semidefinite Max-Cut relaxations. To provide polynomial-time approximations, we classify training datasets into three categories: (i) For orthogonal separable datasets, a precise solution can be obtained in polynomial time. (ii) When there is a negative correlation between samples of different classes, we give a polynomial-time approximation with relative error $\sqrt{π/2} - 1 \approx 0.253$. (iii) For general datasets, the degree to which the problem can be approximated in polynomial time is governed by a geometric factor that controls the diameter of two zonotopes intrinsic to the dataset. To our knowledge, these results present the first polynomial-time approximation guarantees, along with the first hardness-of-approximation results, for regularized ReLU networks.
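
The randomized algorithm is described as mirroring Goemans-Williamson rounding, so for context the sketch below shows the textbook GW procedure for Max-Cut (SDP relaxation plus random-hyperplane rounding); it is not the paper's ReLU-specific construction, and the use of cvxpy is my choice.

```python
# Textbook Goemans-Williamson rounding for Max-Cut (context only; not the paper's
# ReLU-training algorithm). Requires cvxpy with an SDP-capable solver such as SCS.
import numpy as np
import cvxpy as cp

def gw_max_cut(W, n_trials=20, seed=0):
    """W: symmetric nonnegative edge-weight matrix with zero diagonal."""
    n = W.shape[0]
    X = cp.Variable((n, n), PSD=True)
    # SDP relaxation: maximize (1/4) * sum_ij w_ij (1 - X_ij) subject to X_ii = 1
    cp.Problem(cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4),
               [cp.diag(X) == 1]).solve()

    # Factor X ~= V V^T and round with random hyperplanes, keeping the best cut.
    vals, vecs = np.linalg.eigh(X.value)
    V = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None)))
    rng, best = np.random.default_rng(seed), 0.0
    for _ in range(n_trials):
        signs = np.sign(V @ rng.normal(size=n))
        best = max(best, np.sum(W * (1 - np.outer(signs, signs))) / 4)
    return best
```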

A Graphical Model of Hurricane Evacuation Behaviors. (arXiv:2311.10228v1 [cs.AI]) arxiv.org/abs/2311.10228

Natural disasters such as hurricanes are increasing and causing widespread devastation. People's decisions and actions regarding whether or not to evacuate are critical and have a large impact on emergency planning and response. Our interest lies in computationally modeling the complex relationships among the various factors influencing evacuation decisions. We conducted a study of evacuation behavior during Hurricane Irma of the 2017 Atlantic hurricane season. The study was guided by Protection Motivation Theory (PMT), a widely used framework for understanding people's responses to potential threats. Graphical models were constructed to represent the complex relationships among the factors involved and the evacuation decision. We evaluated different graphical structures based on conditional independence tests using the Irma data. The final model largely aligns with PMT. It shows that both risk perception (threat appraisal) and difficulties in evacuation (coping appraisal) influence evacuation decisions directly and independently. Certain information received from the media was found to influence risk perception and, through it, to influence evacuation behavior indirectly. In addition, several variables were found to influence both risk perception and evacuation behavior directly, including family and friends' suggestions, neighbors' evacuation behavior, and evacuation notices from officials.
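
As a hint of what evaluating graphical structures with conditional independence tests can look like in practice, here is a simple sketch that tests X ⟂ Y | Z for categorical survey variables by pooling chi-square statistics across strata of Z; the variable names and the specific test are hypothetical choices, not the paper's implementation.

```python
# Simple conditional-independence check for categorical survey variables
# (hypothetical illustration, not the paper's procedure).
import pandas as pd
from scipy.stats import chi2, chi2_contingency

def conditional_independence_test(df, x, y, z):
    """Test x independent of y given z by summing per-stratum chi-square statistics."""
    stat, dof = 0.0, 0
    for _, stratum in df.groupby(z):
        table = pd.crosstab(stratum[x], stratum[y])
        if table.shape[0] < 2 or table.shape[1] < 2:
            continue                                    # stratum carries no information
        s, _, d, _ = chi2_contingency(table)
        stat, dof = stat + s, dof + d
    return stat, dof, chi2.sf(stat, dof)                # small p-value: dependence remains given z

# Hypothetical usage on survey columns:
# stat, dof, p = conditional_independence_test(survey, "risk_perception", "evacuated", "official_notice")
```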

Multiscale Hodge Scattering Networks for Data Analysis. (arXiv:2311.10270v1 [cs.LG]) arxiv.org/abs/2311.10270

We propose new scattering networks for signals measured on simplicial complexes, which we call \emph{Multiscale Hodge Scattering Networks} (MHSNs). Our construction is based on multiscale basis dictionaries on simplicial complexes, i.e., the $κ$-GHWT and $κ$-HGLET, which we recently developed for simplices of dimension $κ \in \mathbb{N}$ in a given simplicial complex by generalizing the node-based Generalized Haar-Walsh Transform (GHWT) and Hierarchical Graph Laplacian Eigen Transform (HGLET). The $κ$-GHWT and the $κ$-HGLET both form redundant sets (i.e., dictionaries) of multiscale basis vectors and the corresponding expansion coefficients of a given signal. Our MHSNs use a layered structure analogous to a convolutional neural network (CNN) to cascade the moments of the modulus of the dictionary coefficients. The resulting features are invariant to reordering of the simplices (i.e., node permutation of the underlying graphs). Importantly, the use of multiscale basis dictionaries in our MHSNs admits a natural pooling operation, akin to local pooling in CNNs, which may be performed either locally or per scale. Such pooling operations are harder to define in both traditional scattering networks based on Morlet wavelets and geometric scattering networks based on diffusion wavelets. As a result, we are able to extract a rich set of descriptive yet robust features that can be used along with very simple machine learning methods (i.e., logistic regression or support vector machines) to achieve high-accuracy classification systems with far fewer parameters to train than most modern graph neural networks. Finally, we demonstrate the usefulness of our MHSNs in three distinct types of problems: signal classification, domain (i.e., graph/simplex) classification, and molecular dynamics prediction.
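
To convey the cascading-of-moments idea in code, the sketch below applies a list of per-scale analysis bases to a signal, takes the modulus of the coefficients, collects low-order moments at each layer, and feeds the modulus forward; it assumes each per-scale basis is a square matrix (so outputs can be re-analyzed) and is a generic stand-in rather than the κ-GHWT/κ-HGLET dictionaries.

```python
# Generic scattering-moment sketch (stand-in for the MHSN construction; the
# square per-scale bases and moment orders are assumptions).
import numpy as np

def scattering_moments(signal, bases, n_layers=2, moments=(1, 2, 3)):
    """signal: values on the simplices; bases: list of (n x n) analysis matrices,
    one per scale. Returns permutation-invariant moment features."""
    feats, layer = [], [signal]
    for _ in range(n_layers):
        next_layer = []
        for u in layer:
            for B in bases:
                c = np.abs(B @ u)                                   # modulus of expansion coefficients
                feats.extend(np.mean(c ** q) for q in moments)      # global moment "pooling" per scale
                next_layer.append(c)
        layer = next_layer
    return np.array(feats)

# Hypothetical usage: 3 random orthonormal "scales" on a signal over 50 simplices.
rng = np.random.default_rng(0)
bases = [np.linalg.qr(rng.normal(size=(50, 50)))[0] for _ in range(3)]
print(scattering_moments(rng.normal(size=50), bases).shape)         # (36,) = (3 + 9) paths x 3 moments
```

Features of this kind could then be fed to logistic regression or a support vector machine, as the abstract suggests.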

Differentially private analysis of networks with covariates via a generalized $\beta$-model. (arXiv:2311.10279v1 [stat.ME]) arxiv.org/abs/2311.10279

How to achieve the tradeoff between privacy and utility is one of the fundamental problems in private data analysis. In this paper, we give a rigorous differential privacy analysis of networks in the presence of covariates via a generalized $β$-model, which has an $n$-dimensional degree parameter $β$ and a $p$-dimensional homophily parameter $γ$. Under $(k_n, ε_n)$-edge differential privacy, we use the popular Laplace mechanism to release the network statistics. The method of moments is used to estimate the unknown model parameters. We establish conditions guaranteeing consistency of the differentially private estimators $\widehatβ$ and $\widehatγ$ as the number of nodes $n$ goes to infinity, which reveal an interesting tradeoff between the privacy parameter and the model parameters. Consistency is shown by applying a two-stage Newton's method to obtain an upper bound on the error between $(\widehatβ, \widehatγ)$ and the true value $(β, γ)$ in the $\ell_\infty$ distance, with a convergence rate of rough order $1/n^{1/2}$ for $\widehatβ$ and $1/n$ for $\widehatγ$, respectively. Further, we derive the asymptotic normality of $\widehatβ$ and $\widehatγ$, whose asymptotic variances are the same as those of the non-private estimators under some conditions. Our paper sheds light on how to explore asymptotic theory under differential privacy in a principled manner; these principled methods should be applicable to a class of network models with covariates beyond the generalized $β$-model. Numerical studies and a real data analysis demonstrate our theoretical findings.
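
The release step via the Laplace mechanism is straightforward to sketch: under edge differential privacy, toggling a single edge changes two entries of the degree sequence by one, so an l1-sensitivity of 2 (scaled by k_n for (k_n, ε_n)-edge privacy) is a natural calibration. Treating it that way is my reading, not necessarily the paper's exact noise scale, and the method-of-moments fitting of (β, γ) is not shown.

```python
# Laplace-mechanism release of a degree sequence under edge differential privacy
# (illustrative calibration; the paper's exact sensitivity analysis may differ).
import numpy as np

def release_degrees(adjacency, k_n=1, eps_n=1.0, seed=0):
    rng = np.random.default_rng(seed)
    degrees = adjacency.sum(axis=1)
    sensitivity = 2.0 * k_n                        # l1 change when up to k_n edges are toggled
    noise = rng.laplace(scale=sensitivity / eps_n, size=degrees.shape)
    return degrees + noise                         # noisy degrees fed to the moment equations

# Hypothetical usage on a small random graph:
rng = np.random.default_rng(1)
A = (rng.random((30, 30)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T
print(release_degrees(A, k_n=1, eps_n=0.5)[:5])
```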
