Transfer Learning for Individualized Treatment Rules: Application to Sepsis Patients Data from eICU-CRD and MIMIC-III Databases arxiv.org/abs/2501.02128 .AP

Modern precision medicine aims to use real-world data to provide the best treatment for an individual patient. An individualized treatment rule (ITR) maps each patient's characteristics to a recommended treatment that maximizes the patient's expected outcome. One challenge precision medicine faces is population heterogeneity: studies of treatment effects are often conducted on source populations that differ from the population of interest in the distribution of patient characteristics. Our goal is to develop a transfer learning algorithm that addresses this heterogeneity and yields targeted, optimal, and interpretable ITRs. The algorithm incorporates a calibrated augmented inverse probability weighting (CAIPW) estimator of the average treatment effect (ATE) and maximizes the value function over the target population with a genetic algorithm (GA) to produce the desired ITR. To demonstrate its practical utility, we apply this transfer learning algorithm to two large medical databases, the Electronic Intensive Care Unit Collaborative Research Database (eICU-CRD) and the Medical Information Mart for Intensive Care III (MIMIC-III). We first identify the important covariates, treatment options, and outcomes of interest in the two databases, and then estimate optimal linear ITRs for patients with sepsis. This work introduces and applies new data fusion techniques to obtain data-driven ITRs tailored to patients' individual medical needs in a population of interest. By emphasizing generalizability and personalized decision-making, the methodology extends beyond medicine to fields such as marketing, technology, the social sciences, and education.
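
As a rough illustration of the estimation step described above, the sketch below searches over linear-ITR coefficients with a small hand-rolled genetic algorithm, scoring each candidate rule by a calibration-weighted AIPW value estimate. The calibration weights `w`, propensity scores `ps`, and outcome-model predictions `mu1`/`mu0` are assumed to be fitted elsewhere, and every name and GA setting here is illustrative; this is a minimal sketch, not the paper's CAIPW procedure.

```python
# Minimal sketch (assumed interfaces, not the paper's implementation):
# maximize a calibration-weighted AIPW value estimate over linear ITRs
# d(x) = 1{x @ theta > 0} using a tiny genetic algorithm.
import numpy as np

def aipw_value(theta, X, A, Y, ps, mu1, mu0, w):
    """Calibration-weighted AIPW estimate of the value of the rule 1{X @ theta > 0}."""
    d = (X @ theta > 0).astype(float)           # treatment recommended by the rule
    mu_d = d * mu1 + (1 - d) * mu0              # outcome-model prediction under the rule
    p_d = d * ps + (1 - d) * (1 - ps)           # probability of receiving the recommended arm
    follow = (A == d)                           # observed treatment agrees with the rule
    pseudo = mu_d + follow * (Y - mu_d) / p_d   # doubly robust pseudo-outcome
    return np.average(pseudo, weights=w)        # calibration weights re-target the population

def ga_maximize(value_fn, dim, pop_size=50, n_gen=200, sigma=0.1, seed=0):
    """Tiny genetic algorithm: truncation selection, mean crossover, Gaussian mutation."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(size=(pop_size, dim))
    for _ in range(n_gen):
        fitness = np.array([value_fn(theta) for theta in pop])
        elite = pop[np.argsort(fitness)[-pop_size // 2:]]          # keep the top half
        parents = elite[rng.integers(len(elite), size=(pop_size, 2))]
        children = parents.mean(axis=1)                            # crossover
        pop = children + sigma * rng.normal(size=children.shape)   # mutation
        pop[0] = elite[-1]                                         # elitism: keep the best rule
    fitness = np.array([value_fn(theta) for theta in pop])
    return pop[np.argmax(fitness)]
```

A call such as `ga_maximize(lambda th: aipw_value(th, X, A, Y, ps, mu1, mu0, w), dim=X.shape[1])` would return the coefficient vector of the estimated linear rule; the linear form is what keeps the resulting ITR interpretable.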

Majorization-Minimization Dual Stagewise Algorithm for Generalized Lasso arxiv.org/abs/2501.02197 .ML .CO .LG

The generalized lasso is a natural generalization of the celebrated lasso for structured regularization problems. Many important methods and applications fall into this framework, including the fused lasso, clustered lasso, and constrained lasso. To improve its effectiveness in large-scale problems, extensive research has been conducted on computational strategies for the generalized lasso; to our knowledge, however, most of this work addresses the linear setup, with limited advances for non-Gaussian and non-linear models. We propose a majorization-minimization dual stagewise (MM-DUST) algorithm to efficiently trace out the full solution paths of the generalized lasso problem. The majorization technique handles different convex loss functions through their quadratic majorizers. Exploiting the connection between the primal and dual problems and the "slow-brewing" idea from stagewise learning, the minimization step is carried out in the dual space through a sequence of simple coordinate-wise updates on the dual coefficients with a small step size. The step size thus controls a trade-off between statistical accuracy and computational efficiency. We analyze the computational complexity of MM-DUST and establish the uniform convergence of the approximated solution paths. Extensive simulation studies and applications to regularized logistic regression and the Cox model demonstrate the effectiveness of the proposed approach.
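
To make the "slow-brewing" dual update concrete, here is a minimal sketch of a single MM iteration for the problem min_beta f(beta) + lam * ||D beta||_1: the convex loss f is majorized at the current iterate by a quadratic with curvature constant L, and the resulting subproblem is solved by coordinate-wise moves of a fixed small size eps on the box-constrained dual variable. The dual derivation and all names are illustrative assumptions; this toy version is not the paper's MM-DUST algorithm.

```python
# Toy sketch of one MM step for the generalized lasso (assumed setup, not MM-DUST itself).
# The quadratic majorizer is minimized in the dual:
#   min_{|u_j| <= lam}  ||D' u||^2 / (2L) - u' D z,   with  beta = z - D' u / L,
# where z = beta_t - grad / L minimizes the unpenalized majorizer.
import numpy as np

def mm_dual_stagewise_step(beta_t, grad, L, D, lam, eps=1e-3, n_steps=5000):
    z = beta_t - grad / L                      # minimizer of the quadratic majorizer alone
    u = np.zeros(D.shape[0])                   # dual variable, kept in the box [-lam, lam]
    for _ in range(n_steps):
        g = D @ (D.T @ u) / L - D @ z          # gradient of the dual objective
        move = -np.sign(g)                     # coordinate-wise descent direction
        feasible = np.abs(u + eps * move) <= lam + 1e-12
        score = np.where(feasible, np.abs(g), -np.inf)
        j = int(np.argmax(score))              # steepest feasible coordinate
        if score[j] <= 1e-10:                  # no useful feasible move left
            break
        u[j] += eps * move[j]                  # small fixed-size "slow-brewing" update
    return z - D.T @ u / L                     # recover the primal update
```

A smaller `eps` follows the dual path more faithfully at the cost of more coordinate updates, which mirrors the accuracy-versus-efficiency trade-off described above.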

Efficient estimation of average treatment effects with unmeasured confounding and proxies arxiv.org/abs/2501.02214 .ME .EM

One approach to estimating the average treatment effect of a binary treatment under unmeasured confounding is proximal causal inference, which assumes the availability of outcome- and treatment-confounding proxies. The key identifying result relies on the existence of a so-called bridge function. A parametric specification of the bridge function is usually postulated and estimated using standard techniques, and the estimated bridge function is then plugged in to estimate the average treatment effect. This approach can incur two efficiency losses. First, the bridge function may not be efficiently estimated, since it solves an integral equation. Second, the sequential procedure may fail to account for the correlation between the two steps. This paper proposes approximating the integral equation with an increasing number of moment restrictions and jointly estimating the bridge function and the average treatment effect. Under sufficient conditions, we show that the proposed estimator is efficient. To assist implementation, we propose a data-driven procedure for selecting the tuning parameter (the number of moment restrictions). Simulation studies show that the proposed method performs well in finite samples, and an application to the right heart catheterization data from the SUPPORT study demonstrates its practical value.
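
To give a sense of what the joint estimation could look like, the sketch below stacks moment restrictions for a linear outcome bridge function with the ATE moment and minimizes an identity-weighted GMM criterion. It assumes scalar proxies W (outcome-confounding) and Z (treatment-confounding), a scalar covariate X, and a small fixed basis; the basis choice, the linear bridge, and all names are illustrative assumptions rather than the paper's specification.

```python
# Illustrative sketch (assumed model, not the paper's estimator): jointly solve
#   E[ g_k(Z, A, X) * (Y - h(W, A, X; gamma)) ] = 0   for k = 1, ..., K,
#   E[ h(W, 1, X; gamma) - h(W, 0, X; gamma) - ATE ]  = 0,
# with a linear bridge h and an identity-weighted GMM objective.
import numpy as np
from scipy.optimize import minimize

def stacked_moments(params, Y, A, W, Z, X):
    *gamma, ate = params
    gamma = np.asarray(gamma)
    ones = np.ones_like(Y)
    H = np.column_stack([ones, A, W, X])                  # linear bridge design
    resid = Y - H @ gamma                                 # Y - h(W, A, X; gamma)
    basis = np.column_stack([ones, A, Z, X])              # K = 4 instruments in (Z, A, X)
    m_bridge = basis * resid[:, None]                     # bridge-function restrictions
    h1 = np.column_stack([ones, ones, W, X]) @ gamma      # h(W, 1, X)
    h0 = np.column_stack([ones, 0 * ones, W, X]) @ gamma  # h(W, 0, X)
    m_ate = (h1 - h0 - ate)[:, None]                      # ATE moment
    return np.hstack([m_bridge, m_ate]).mean(axis=0)

def gmm_ate(Y, A, W, Z, X):
    def objective(p):
        m = stacked_moments(p, Y, A, W, Z, X)
        return m @ m                                      # identity-weighted GMM criterion
    fit = minimize(objective, np.zeros(5), method="BFGS")
    return fit.x[-1]                                      # last parameter is the ATE
```

In the approach described above, the number of moment restrictions grows with the sample size and is chosen by a data-driven rule; here it is fixed at four for brevity, while the joint optimization already avoids the two-step correlation issue by estimating the bridge function and the ATE together.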

Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance arxiv.org/abs/2501.02298 .ML .LG

Score-based Generative Models (SGMs) aim to sample from a target distribution by learning score functions using samples perturbed by Gaussian noise. Existing convergence bounds for SGMs in the $\mathcal{W}_2$-distance rely on stringent assumptions about the data distribution. In this work, we present a novel framework for analyzing $\mathcal{W}_2$-convergence in SGMs, significantly relaxing traditional assumptions such as log-concavity and score regularity. Leveraging the regularization properties of the Ornstein-Uhlenbeck (OU) process, we show that weak log-concavity of the data distribution evolves into log-concavity over time. This transition is rigorously quantified through a PDE-based analysis of the Hamilton-Jacobi-Bellman equation governing the log-density of the forward process. Moreover, we establish that the drift of the time-reversed OU process alternates between contractive and non-contractive regimes, reflecting the dynamics of concavity. Our approach circumvents the need for stringent regularity conditions on the score function and its estimators, relying instead on milder, more practical assumptions. We demonstrate the wide applicability of this framework through explicit computations on Gaussian mixture models, illustrating its versatility and potential for broader classes of data distributions.
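
For reference, the display below writes out the objects being analyzed under one common normalization: the forward Ornstein-Uhlenbeck noising process and its Gaussian transition kernel, the score-driven time-reversed dynamics used for sampling, and the HJB-type equation satisfied by the negative log-density of the forward marginals. The unit-drift, sqrt(2)-diffusion convention is an assumption and may differ from the paper's; the last line follows from the Fokker-Planck equation under that convention.

```latex
% One common OU normalization (assumed convention, not necessarily the paper's).
\begin{align*}
  \mathrm{d}X_t &= -X_t\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}B_t,
  \qquad X_t \mid X_0 = x \ \sim\ \mathcal{N}\!\bigl(e^{-t}x,\ (1-e^{-2t})I_d\bigr),\\
  \mathrm{d}Y_t &= \bigl(Y_t + 2\,\nabla \log p_{T-t}(Y_t)\bigr)\,\mathrm{d}t
                   + \sqrt{2}\,\mathrm{d}\bar{B}_t,
  \qquad Y_0 \sim p_T \approx \mathcal{N}(0, I_d),\\
  \partial_t v_t &= \Delta v_t - \lVert \nabla v_t \rVert^2 + x \cdot \nabla v_t - d,
  \qquad v_t := -\log p_t, \quad p_t = \operatorname{Law}(X_t).
\end{align*}
```

With this convention the stationary law of the forward process is the standard Gaussian, which is why the reverse dynamics can be initialized from N(0, I_d); "weak log-concavity evolving into log-concavity" is then a statement about the Hessian of v_t becoming positive definite as t grows.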

Statistical Demography Meets Ministry of Health: The Case of the Family Planning Estimation Tool arxiv.org/abs/2501.00007 .AP

High-Dimensional Markov-switching Ordinary Differential Processes arxiv.org/abs/2501.00087 .ME .ST .AP .TH .LG

A portmanteau test for multivariate non-stationary functional time series with an increasing number of lags arxiv.org/abs/2501.00118 .ME

Post Launch Evaluation of Policies in a High-Dimensional Setting arxiv.org/abs/2501.00119 .ML .AP .ME .LG

Competitiveness of Formula 1 championship from 2012 to 2022 as measured by Kendall corrected evolutive coefficient arxiv.org/abs/2501.00126 .AP

Denoising Data with Measurement Error Using a Reproducing Kernel-based Diffusion Model arxiv.org/abs/2501.00212 .ME
