Attention Mechanism for Lithium-Ion Battery Lifespan Prediction: Temporal and Cyclic Attention. (arXiv:2311.10792v1 [cs.LG]) arxiv.org/abs/2311.10792

Accurately predicting the lifespan of lithium-ion batteries (LIBs) is pivotal for optimizing usage and preventing accidents. Previous prediction models often relied on inputs that are challenging to measure during real-time operation and failed to comprehensively capture intra-cycle and inter-cycle data patterns, features essential for accurate prediction. In this study, we employ attention mechanisms (AM) to develop data-driven models for predicting LIB lifespan from easily measurable inputs such as voltage, current, temperature, and capacity data. The developed model integrates recurrent neural network (RNN) and convolutional neural network (CNN) components and features two types of attention: temporal attention (TA) and cyclic attention (CA). TA identifies important time steps within each cycle by scoring the hidden states of the RNN, whereas CA captures key features of inter-cycle correlations through self-attention (SA). This improves model accuracy and elucidates the critical features in the input data. To validate the method, we apply it to publicly available cycling data consisting of three batches of cycling modes. The calculated TA scores highlight the rest phase as a key characteristic distinguishing LIB data among batches, and the CA scores reveal variations in the importance of cycles across batches. Leveraging the CA scores, we explore reducing the number of cycles in the input data: single-head and multi-head attention allow the input dimension to be reduced from 100 cycles to 50 and 30 cycles, respectively.
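
To make the temporal-attention idea concrete, here is a minimal sketch (not the authors' exact architecture) of an RNN whose hidden states are scored by a learned attention layer; the four input channels stand in for voltage, current, temperature, and capacity, and the GRU, hidden size, and layer shapes are illustrative assumptions.

```python
# Minimal sketch of temporal attention over RNN hidden states within one cycle.
# The GRU, hidden size, and input layout are assumptions, not the paper's exact model.
import torch
import torch.nn as nn

class TemporalAttentionRNN(nn.Module):
    def __init__(self, n_features=4, hidden_size=32):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden_size, batch_first=True)
        self.score = nn.Linear(hidden_size, 1)          # one scalar score per time step

    def forward(self, x):                                # x: (batch, time, n_features)
        h, _ = self.rnn(x)                               # h: (batch, time, hidden)
        alpha = torch.softmax(self.score(h), dim=1)      # attention weights over time steps
        context = (alpha * h).sum(dim=1)                 # attention-weighted summary of the cycle
        return context, alpha.squeeze(-1)

# Example: a batch of 8 cycles, 200 time steps each, 4 measured signals per step.
x = torch.randn(8, 200, 4)
context, weights = TemporalAttentionRNN()(x)
print(context.shape, weights.shape)                      # (8, 32) and (8, 200)
```

The per-step weights play the role of the TA scores discussed above; a cyclic-attention stage would apply self-attention across the per-cycle context vectors in the same spirit.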

Addressing Population Heterogeneity for HIV Incidence Estimation Based on Recency Test. (arXiv:2311.10848v1 [stat.ME]) arxiv.org/abs/2311.10848

Cross-sectional HIV incidence estimation leverages recency test results to determine the HIV incidence of a population of interest, where the recency test uses biomarker profiles to infer whether an HIV-positive individual was "recently" infected. This approach has an obvious advantage over the conventional cohort follow-up method, since it avoids longitudinal follow-up and repeated HIV testing. In this manuscript, we extend cross-sectional incidence estimation to a different target population, addressing potential population heterogeneity. We propose a general framework that covers two settings: one in which the target population is a subset of the population with cross-sectional recency-testing data (e.g., leveraging recency-testing data from screening in an active-arm trial design), and one with an external target population. We also propose a method to incorporate HIV subtype, a special covariate that modifies the properties of the recency test, into our framework. Through extensive simulation studies and a data application, we demonstrate the excellent performance of the proposed methods. We conclude with a discussion of sensitivity analysis and future work to improve our framework.
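
For orientation, the sketch below implements a standard adjusted cross-sectional ("snapshot"-type) incidence estimator of the kind this line of work builds on; the exact adjustment shown, and all numbers in the example, are my assumptions rather than the paper's proposed heterogeneity-aware estimator.

```python
# Standard adjusted cross-sectional incidence estimator (an assumed baseline,
# not the paper's proposed method). All quantities refer to one survey.
def cross_sectional_incidence(n_neg, n_pos, n_recent, mdri_yrs, frr, cutoff_yrs=2.0):
    """n_neg: HIV-negative count; n_pos: HIV-positive count; n_recent: positives
    classified 'recent'; mdri_yrs: mean duration of recent infection (years);
    frr: false-recent rate; cutoff_yrs: recency cut-off time."""
    return (n_recent - frr * n_pos) / (n_neg * (mdri_yrs - frr * cutoff_yrs))

# Illustrative numbers only: roughly 3.5% incidence per person-year.
print(cross_sectional_incidence(n_neg=8000, n_pos=2000, n_recent=120,
                                mdri_yrs=130 / 365.25, frr=0.015))
```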

Covariate adjustment in randomized experiments with missing outcomes and covariates. (arXiv:2311.10877v1 [stat.ME]) arxiv.org/abs/2311.10877

Covariate adjustment can improve precision in estimating treatment effects from randomized experiments. With fully observed data, regression adjustment and propensity score weighting are two asymptotically equivalent methods for covariate adjustment in randomized experiments. We show that this equivalence breaks down in the presence of missing outcomes, with regression adjustment no longer ensuring efficiency gain when the true outcome model is not linear in covariates. Propensity score weighting, in contrast, still guarantees efficiency over unadjusted analysis, and including more covariates in adjustment never harms asymptotic efficiency. Moreover, we establish the value of using partially observed covariates to secure additional efficiency. Based on these findings, we recommend a simple double-weighted estimator for covariate adjustment with incomplete outcomes and covariates: (i) impute all missing covariates by zero, and use the union of the completed covariates and corresponding missingness indicators to estimate the probability of treatment and the probability of having observed outcome for all units; (ii) estimate the average treatment effect by the coefficient of the treatment from the least-squares regression of the observed outcome on the treatment, where we weight each unit by the inverse of the product of these two estimated probabilities.
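
The double-weighted recipe in steps (i)-(ii) translates almost directly into code. The sketch below is one way to implement it, with logistic regression for both probability models and the usual convention of weighting each unit by the probability of the treatment it actually received; those modeling choices are assumptions on my part.

```python
# Sketch of the double-weighted estimator described above. The logistic models
# and the treatment-received weighting convention are assumptions, not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

def double_weighted_ate(X, Z, Y, R):
    """X: covariates with NaN where missing; Z: treatment indicator (0/1);
    Y: outcome (only meaningful where observed); R: 1 if Y is observed, else 0."""
    # (i) zero-impute covariates and append missingness indicators
    M = np.isnan(X).astype(float)
    W = np.hstack([np.nan_to_num(X, nan=0.0), M])

    e_hat = LogisticRegression(max_iter=1000).fit(W, Z).predict_proba(W)[:, 1]     # P(Z=1 | W)
    WZ = np.hstack([W, Z[:, None]])
    r_hat = LogisticRegression(max_iter=1000).fit(WZ, R).predict_proba(WZ)[:, 1]   # P(R=1 | W, Z)

    # (ii) weighted least squares of the observed outcome on (1, treatment)
    w = 1.0 / (np.where(Z == 1, e_hat, 1.0 - e_hat) * r_hat)
    obs = R == 1
    D = np.column_stack([np.ones(obs.sum()), Z[obs]])
    wo = w[obs]
    beta = np.linalg.solve(D.T @ (D * wo[:, None]), D.T @ (Y[obs] * wo))
    return beta[1]                                     # coefficient on treatment = ATE estimate
```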

Short-term Volatility Estimation for High Frequency Trades using Gaussian processes (GPs). (arXiv:2311.10935v1 [q-fin.ST]) arxiv.org/abs/2311.10935

A fundamental premise of financial markets is that stock prices are intrinsically complex and stochastic. One source of this complexity is the volatility associated with stock prices. Volatility is a tendency for prices to change unexpectedly [1]. Price volatility is often detrimental to returns, so investors should factor it into investment decisions and temporary or permanent portfolio moves. It is therefore crucial to make accurate, regular short- and long-term stock price volatility forecasts for the safety and economics of investors' returns. Different models and methods, such as ARCH and GARCH models, have been implemented to make such forecasts. However, these traditional approaches fail to capture short-term volatility effectively. This paper therefore investigates and implements a combination of numeric and probabilistic models for short-term volatility and return forecasting in high-frequency trading. One-day-ahead volatility forecasts were made with Gaussian processes (GPs) applied to the outputs of a numerical market prediction (NMP) model. First, the stock price data from the NMP model were corrected by a GP. Since it is not easy to set price limits in a free and random market, a censored GP was then used to model the relationship between the corrected stock prices and returns. Forecasting errors were evaluated using the implied and estimated data.
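
As a rough illustration of the GP step only, the sketch below fits a Gaussian process to recent realized volatilities and produces one-day-ahead forecasts with uncertainty bands; the input series stands in for the NMP model's corrected output, and the censored-GP return model is omitted, so everything here is an assumed simplification.

```python
# Minimal GP sketch for one-day-ahead volatility (assumed simplification: the
# series below stands in for NMP-corrected data; the censored GP is not shown).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
vol = np.abs(rng.normal(0.01, 0.004, size=300))        # stand-in daily realized volatility

window = 5                                              # use the last 5 days as features
X = np.array([vol[t - window:t] for t in range(window, len(vol))])
y = vol[window:]                                        # next-day volatility targets

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X[:250], y[:250])
mean, std = gp.predict(X[250:], return_std=True)        # forecasts with predictive std. dev.
print(mean[:3], std[:3])
```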

Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes. (arXiv:2311.10972v1 [cs.LG]) arxiv.org/abs/2311.10972

We investigate the complexity of training a two-layer ReLU neural network with weight decay regularization. Previous research has shown that the optimal solution of this problem can be found by solving a standard cone-constrained convex program. Using this convex formulation, we prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it. In particular, when $ε \leq \sqrt{84/83} - 1 \approx 0.006$, we show that it is NP-hard to find an approximate global optimizer of the ReLU network objective with relative error $ε$ with respect to the objective value. Moreover, we develop a randomized algorithm which mirrors the Goemans-Williamson rounding of semidefinite Max-Cut relaxations. To provide polynomial-time approximations, we classify training datasets into three categories: (i) For orthogonal separable datasets, a precise solution can be obtained in polynomial time. (ii) When there is a negative correlation between samples of different classes, we give a polynomial-time approximation with relative error $\sqrt{π/2} - 1 \approx 0.253$. (iii) For general datasets, the degree to which the problem can be approximated in polynomial time is governed by a geometric factor that controls the diameter of two zonotopes intrinsic to the dataset. To our knowledge, these results present the first polynomial-time approximation guarantees, along with the first hardness-of-approximation results, for regularized ReLU networks.
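
The randomized algorithm is described as mirroring Goemans-Williamson rounding, so for context the sketch below shows the textbook GW procedure for Max-Cut (SDP relaxation plus random-hyperplane rounding); it is not the paper's ReLU-specific construction, and the use of cvxpy is my choice.

```python
# Textbook Goemans-Williamson rounding for Max-Cut (context only; not the paper's
# ReLU-training algorithm). Requires cvxpy with an SDP-capable solver such as SCS.
import numpy as np
import cvxpy as cp

def gw_max_cut(W, n_trials=20, seed=0):
    """W: symmetric nonnegative edge-weight matrix with zero diagonal."""
    n = W.shape[0]
    X = cp.Variable((n, n), PSD=True)
    # SDP relaxation: maximize (1/4) * sum_ij w_ij (1 - X_ij) subject to X_ii = 1
    cp.Problem(cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4),
               [cp.diag(X) == 1]).solve()

    # Factor X ~= V V^T and round with random hyperplanes, keeping the best cut.
    vals, vecs = np.linalg.eigh(X.value)
    V = vecs @ np.diag(np.sqrt(np.clip(vals, 0, None)))
    rng, best = np.random.default_rng(seed), 0.0
    for _ in range(n_trials):
        signs = np.sign(V @ rng.normal(size=n))
        best = max(best, np.sum(W * (1 - np.outer(signs, signs))) / 4)
    return best
```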

A Graphical Model of Hurricane Evacuation Behaviors. (arXiv:2311.10228v1 [cs.AI]) arxiv.org/abs/2311.10228

Natural disasters such as hurricanes are increasing and causing widespread devastation. People's decisions and actions regarding whether or not to evacuate are critical and have a large impact on emergency planning and response. Our interest lies in computationally modeling the complex relationships among the various factors influencing evacuation decisions. We conducted a study of evacuation behavior during Hurricane Irma of the 2017 Atlantic hurricane season. The study was guided by Protection Motivation Theory (PMT), a widely used framework for understanding people's responses to potential threats. Graphical models were constructed to represent the complex relationships among the factors involved and the evacuation decision. We evaluated different graphical structures based on conditional independence tests using the Irma data. The final model largely aligns with PMT. It shows that both risk perception (threat appraisal) and difficulties in evacuation (coping appraisal) influence evacuation decisions directly and independently. Certain information received from the media was found to influence risk perception and, through it, to influence evacuation behavior indirectly. In addition, several variables were found to influence both risk perception and evacuation behavior directly, including family and friends' suggestions, neighbors' evacuation behavior, and evacuation notices from officials.
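
As a hint of what evaluating graphical structures with conditional independence tests can look like in practice, here is a simple sketch that tests X ⟂ Y | Z for categorical survey variables by pooling chi-square statistics across strata of Z; the variable names and the specific test are hypothetical choices, not the paper's implementation.

```python
# Simple conditional-independence check for categorical survey variables
# (hypothetical illustration, not the paper's procedure).
import pandas as pd
from scipy.stats import chi2, chi2_contingency

def conditional_independence_test(df, x, y, z):
    """Test x independent of y given z by summing per-stratum chi-square statistics."""
    stat, dof = 0.0, 0
    for _, stratum in df.groupby(z):
        table = pd.crosstab(stratum[x], stratum[y])
        if table.shape[0] < 2 or table.shape[1] < 2:
            continue                                    # stratum carries no information
        s, _, d, _ = chi2_contingency(table)
        stat, dof = stat + s, dof + d
    return stat, dof, chi2.sf(stat, dof)                # small p-value: dependence remains given z

# Hypothetical usage on survey columns:
# stat, dof, p = conditional_independence_test(survey, "risk_perception", "evacuated", "official_notice")
```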

Multiscale Hodge Scattering Networks for Data Analysis. (arXiv:2311.10270v1 [cs.LG]) arxiv.org/abs/2311.10270

We propose new scattering networks for signals measured on simplicial complexes, which we call \emph{Multiscale Hodge Scattering Networks} (MHSNs). Our construction is based on multiscale basis dictionaries on simplicial complexes, i.e., the $κ$-GHWT and $κ$-HGLET, which we recently developed for simplices of dimension $κ \in \mathbb{N}$ in a given simplicial complex by generalizing the node-based Generalized Haar-Walsh Transform (GHWT) and Hierarchical Graph Laplacian Eigen Transform (HGLET). The $κ$-GHWT and the $κ$-HGLET both form redundant sets (i.e., dictionaries) of multiscale basis vectors and the corresponding expansion coefficients of a given signal. Our MHSNs use a layered structure analogous to a convolutional neural network (CNN) to cascade the moments of the modulus of the dictionary coefficients. The resulting features are invariant to reordering of the simplices (i.e., node permutation of the underlying graphs). Importantly, the use of multiscale basis dictionaries in our MHSNs admits a natural pooling operation, akin to local pooling in CNNs, which may be performed either locally or per scale. Such pooling operations are harder to define in both traditional scattering networks based on Morlet wavelets and geometric scattering networks based on diffusion wavelets. As a result, we are able to extract a rich set of descriptive yet robust features that can be used along with very simple machine learning methods (i.e., logistic regression or support vector machines) to achieve high-accuracy classification systems with far fewer parameters to train than most modern graph neural networks. Finally, we demonstrate the usefulness of our MHSNs in three distinct types of problems: signal classification, domain (i.e., graph/simplex) classification, and molecular dynamics prediction.
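
To convey the cascading-of-moments idea in code, the sketch below applies a list of per-scale analysis bases to a signal, takes the modulus of the coefficients, collects low-order moments at each layer, and feeds the modulus forward; it assumes each per-scale basis is a square matrix (so outputs can be re-analyzed) and is a generic stand-in rather than the κ-GHWT/κ-HGLET dictionaries.

```python
# Generic scattering-moment sketch (stand-in for the MHSN construction; the
# square per-scale bases and moment orders are assumptions).
import numpy as np

def scattering_moments(signal, bases, n_layers=2, moments=(1, 2, 3)):
    """signal: values on the simplices; bases: list of (n x n) analysis matrices,
    one per scale. Returns permutation-invariant moment features."""
    feats, layer = [], [signal]
    for _ in range(n_layers):
        next_layer = []
        for u in layer:
            for B in bases:
                c = np.abs(B @ u)                                   # modulus of expansion coefficients
                feats.extend(np.mean(c ** q) for q in moments)      # global moment "pooling" per scale
                next_layer.append(c)
        layer = next_layer
    return np.array(feats)

# Hypothetical usage: 3 random orthonormal "scales" on a signal over 50 simplices.
rng = np.random.default_rng(0)
bases = [np.linalg.qr(rng.normal(size=(50, 50)))[0] for _ in range(3)]
print(scattering_moments(rng.normal(size=50), bases).shape)         # (36,) = (3 + 9) paths x 3 moments
```

Features of this kind could then be fed to logistic regression or a support vector machine, as the abstract suggests.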

Differentially private analysis of networks with covariates via a generalized $\beta$-model. (arXiv:2311.10279v1 [stat.ME]) arxiv.org/abs/2311.10279

How to achieve the tradeoff between privacy and utility is one of the fundamental problems in private data analysis. In this paper, we give a rigorous differential privacy analysis of networks in the presence of covariates via a generalized $β$-model, which has an $n$-dimensional degree parameter $β$ and a $p$-dimensional homophily parameter $γ$. Under $(k_n, ε_n)$-edge differential privacy, we use the popular Laplace mechanism to release the network statistics. The method of moments is used to estimate the unknown model parameters. We establish conditions guaranteeing consistency of the differentially private estimators $\widehatβ$ and $\widehatγ$ as the number of nodes $n$ goes to infinity, which reveal an interesting tradeoff between the privacy parameter and the model parameters. Consistency is shown by applying a two-stage Newton's method to obtain an upper bound on the error between $(\widehatβ, \widehatγ)$ and the true value $(β, γ)$ in the $\ell_\infty$ distance, with a convergence rate of rough order $1/n^{1/2}$ for $\widehatβ$ and $1/n$ for $\widehatγ$, respectively. Further, we derive the asymptotic normality of $\widehatβ$ and $\widehatγ$, whose asymptotic variances are the same as those of the non-private estimators under some conditions. Our paper sheds light on how to explore asymptotic theory under differential privacy in a principled manner; these principled methods should be applicable to a class of network models with covariates beyond the generalized $β$-model. Numerical studies and a real data analysis demonstrate our theoretical findings.
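
The release step via the Laplace mechanism is straightforward to sketch: under edge differential privacy, toggling a single edge changes two entries of the degree sequence by one, so an l1-sensitivity of 2 (scaled by k_n for (k_n, ε_n)-edge privacy) is a natural calibration. Treating it that way is my reading, not necessarily the paper's exact noise scale, and the method-of-moments fitting of (β, γ) is not shown.

```python
# Laplace-mechanism release of a degree sequence under edge differential privacy
# (illustrative calibration; the paper's exact sensitivity analysis may differ).
import numpy as np

def release_degrees(adjacency, k_n=1, eps_n=1.0, seed=0):
    rng = np.random.default_rng(seed)
    degrees = adjacency.sum(axis=1)
    sensitivity = 2.0 * k_n                        # l1 change when up to k_n edges are toggled
    noise = rng.laplace(scale=sensitivity / eps_n, size=degrees.shape)
    return degrees + noise                         # noisy degrees fed to the moment equations

# Hypothetical usage on a small random graph:
rng = np.random.default_rng(1)
A = (rng.random((30, 30)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T
print(release_degrees(A, k_n=1, eps_n=0.5)[:5])
```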
