Single Cells Are Spatial Tokens: Transformers for Spatial Transcriptomic Data Imputation. (arXiv:2302.03038v1 [q-bio.GN]) arxiv.org/abs/2302.03038

Spatially resolved transcriptomics brings exciting breakthroughs to single-cell analysis by providing physical locations along with gene expression. However, as a cost of the extremely high spatial resolution, cellular-level spatial transcriptomic data suffer significantly from missing values. While a standard solution is to perform imputation on the missing values, most existing methods either overlook spatial information or only incorporate localized spatial context without the ability to capture long-range spatial information. Using multi-head self-attention mechanisms and positional encoding, transformer models can readily grasp the relationship between tokens and encode location information. In this paper, by treating single cells as spatial tokens, we study how to leverage transformers to facilitate spatial transcriptomics imputation. In particular, we investigate the following two key questions: (1) $\textit{how to encode spatial information of cells in transformers}$, and (2) $\textit{how to train a transformer for transcriptomic imputation}$. By answering these two questions, we present a transformer-based imputation framework, SpaFormer, for cellular-level spatial transcriptomic data. Extensive experiments demonstrate that SpaFormer outperforms existing state-of-the-art imputation algorithms on three large-scale datasets.

Temporal and probabilistic forecasts of epidemic interventions. (arXiv:2302.03210v1 [q-bio.PE]) arxiv.org/abs/2302.03210

Forecasting disease spread is a critical tool to help public health officials design and plan public health interventions. However, the expected future state of an epidemic is not necessarily well defined, as disease spread is inherently stochastic, contact patterns within a population are heterogeneous, and behaviors change. In this work, we use time-dependent probability generating functions (PGFs) to capture these characteristics by modeling a stochastic branching process of the spread of a disease over a network of contacts in which public health interventions are introduced over time. To achieve this, we define a general transmissibility equation to account for varying transmission rates (e.g. masking), recovery rates (e.g. treatment), contact patterns (e.g. social distancing) and percentage of the population immunized (e.g. vaccination). The resulting framework allows for a temporal and probabilistic analysis of an intervention's impact on disease spread, with results that match continuous-time stochastic simulations that are much more computationally expensive. To aid policy making, we then define several metrics over which temporal and probabilistic intervention forecasts can be compared: the expected number of cases and the worst-case scenario over time, as well as the probability of reaching a critical level of cases and of not seeing any improvement following an intervention. Given that epidemics do not always follow their average expected trajectories and that the underlying dynamics can change over time, our work paves the way for more detailed short-term forecasts of disease spread and more informed comparison of intervention strategies.
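
As a rough illustration of how a PGF turns a branching process into probabilistic forecasts, the sketch below computes extinction probabilities for a single seed case. It is not the paper's network-based transmissibility framework: the Poisson offspring distribution, the function names, and the per-generation means are all assumptions chosen for illustration.

```python
import math

def poisson_pgf(s, mean_offspring):
    """PGF of a Poisson offspring distribution: G(s) = exp(R * (s - 1))."""
    return math.exp(mean_offspring * (s - 1.0))

def extinction_probability(mean_offspring, tol=1e-12, max_iter=100_000):
    """Smallest root of s = G(s) in [0, 1], by fixed-point iteration from 0."""
    s = 0.0
    for _ in range(max_iter):
        s_next = poisson_pgf(s, mean_offspring)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s

def extinct_by_generation(means):
    """P(no cases in generation n) for n = 1..len(means), one seed case.

    means[g] is the (assumed) mean offspring number of the cases that
    produce generation g + 1, so an intervention appears as a drop in
    later entries. The PGF of generation n is the composition
    G_1(G_2(...G_n(s))), evaluated here at s = 0.
    """
    probs = []
    for n in range(1, len(means) + 1):
        s = 0.0
        for mean in reversed(means[:n]):
            s = poisson_pgf(s, mean)
        probs.append(s)
    return probs

# Hypothetical scenarios: no action versus an intervention after two generations.
no_action = extinct_by_generation([2.0] * 6)
intervention = extinct_by_generation([2.0, 2.0, 0.5, 0.5, 0.5, 0.5])
print(no_action[-1], intervention[-1])
```

Comparing the two lists generation by generation gives a toy version of the abstract's probabilistic metrics, for instance the probability that an outbreak has died out by a given time with and without an intervention.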

A Critical Review of the Impact of Candidate Copy Number Variants on Autism Spectrum Disorders. (arXiv:2302.03211v1 [q-bio.GN]) arxiv.org/abs/2302.03211

Autism spectrum disorder (ASD) is a heterogeneous neurodevelopmental disorder (NDD) that is caused by genetic, epigenetic, and environmental factors. Recent advances in genomic analysis have uncovered numerous candidate genes with common and/or rare mutations that increase susceptibility to ASD. In addition, there is increasing evidence that copy number variations (CNVs), single nucleotide polymorphisms (SNPs), and unusual de novo variants negatively affect neurodevelopment pathways in various ways. The overall rate of copy number variants found in patients with autism is 10%-20%, of which 3%-7% can be detected cytogenetically. Although the role of submicroscopic CNVs in ASD has been studied recently, their association with genomic loci and genes has not been properly studied. In this review, we focus on 47 ASD-associated CNV regions and their related genes. Here, we identify 1,632 protein-coding genes and long non-coding RNAs (lncRNAs) within these regions. Among them, 552 are significantly expressed in the brain. Using a list of ASD-associated genes from SFARI, we detect 17 regions containing at least one known ASD-associated protein-coding gene. Of the remaining 30 regions, we identify 24 regions containing at least one protein-coding gene with brain-enriched expression and a nervous system phenotype in mouse mutants, and one lncRNA with both brain-enriched expression and upregulation in iPSC-to-neuron differentiation. Our analyses highlight the diversity of genetic lesions of CNV regions that contribute to ASD and provide new genetic evidence that lncRNA genes may contribute to the etiology of ASD. In addition, the discovered CNVs will be a valuable resource for diagnostic facilities, therapeutic strategies, and research in terms of variation priority.

Automatic Sleep Stage Classification with Cross-modal Self-supervised Features from Deep Brain Signals. (arXiv:2302.03227v1 [cs.LG]) arxiv.org/abs/2302.03227

The detection of human sleep stages is widely used in the diagnosis and intervention of neurological and psychiatric diseases. Some patients with implanted deep brain stimulators can have their neural activity recorded from the deep brain. Sleep stage classification based on deep brain recordings has great potential to provide more precise treatment for patients. The accuracy and generalizability of existing sleep stage classifiers based on local field potentials are still limited. We proposed an applicable cross-modal transfer learning method for sleep stage classification with implanted devices. This end-to-end deep learning model contained cross-modal self-supervised feature representation, self-attention, and a classification framework. We tested the model with deep brain recording data from 12 patients with Parkinson's disease. The best total accuracy reached 83.2% for sleep stage classification. Results showed that speech self-supervised features capture the transition patterns of sleep stages effectively. We provide a new method for transfer learning from acoustic signals to local field potentials. This method offers an effective solution for the insufficient scale of clinical data. This sleep stage classification model could be adapted to chronic and continuous sleep monitoring for Parkinson's patients in daily life, and potentially utilized for more precise treatment in deep brain-machine interfaces, such as closed-loop deep brain stimulation.

Network-based Statistics Distinguish Anomic and Broca Aphasia. (arXiv:2302.03250v1 [q-bio.NC]) arxiv.org/abs/2302.03250

Aphasia is a speech-language impairment commonly caused by damage to the left hemisphere. Due to the complexity of speech-language processing, the neural mechanisms that underpin the differing symptoms of the various types of aphasia are still not fully understood. We used the network-based statistic method to identify distinct subnetwork(s) of connections differentiating the resting-state functional networks of the anomic and Broca groups. We identified one such subnetwork that mainly involved the brain regions in the premotor, primary motor, primary auditory, and primary sensory cortices in both hemispheres. The majority of connections in the subnetwork were weaker in the Broca group than in the anomic group. The network properties of the subnetwork were examined through complex network measures, which indicated that the regions in the superior temporal gyrus and auditory cortex bilaterally exhibit intensive interaction, and that the primary motor, premotor and primary sensory cortices in the left hemisphere play an important role in information flow and overall communication efficiency. These findings underlie the articulatory difficulties and reduced repetition performance in Broca aphasia, which are rarely observed in anomic aphasia. This research provides novel insights into the resting-state brain network differences between groups of individuals with anomic and Broca aphasia. We identified a subnetwork of connections, rather than isolated connections, that statistically differentiates the resting-state brain networks of the two groups, in comparison with standard lesion symptom mapping results that yield isolated connections.

Scalable Gaussian process regression enables accurate prediction of protein and small molecule properties with uncertainty quantitation. (arXiv:2302.03294v1 [cs.LG]) arxiv.org/abs/2302.03294

Gaussian processes (GPs) are Bayesian models which provide several advantages for regression tasks in machine learning, such as reliable quantitation of uncertainty and improved interpretability. Their adoption has been precluded by their excessive computational cost and by the difficulty in adapting them for analyzing sequences (e.g. amino acid and nucleotide sequences) and graphs (e.g. ones representing small molecules). In this study, we develop efficient and scalable approaches for fitting GP models as well as fast convolution kernels which scale linearly with graph or sequence size. We implement these improvements by building an open-source Python library called xGPR. We compare the performance of xGPR with the reported performance of various deep learning models on 20 benchmarks, including small molecule, protein sequence and tabular data. We show that xGPR achieves highly competitive performance with much shorter training time. Furthermore, we also develop new kernels for sequence and graph data and show that xGPR generally outperforms convolutional neural networks on predicting key properties of proteins and small molecules. Importantly, xGPR provides uncertainty information not available from typical deep learning models. Additionally, xGPR provides a representation of the input data that can be used for clustering and data visualization. These results demonstrate that xGPR provides a powerful and generic tool that can be broadly useful in protein engineering and drug discovery.
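
To make the "uncertainty information" point concrete, here is a minimal sketch of exact GP regression on one-dimensional inputs using only the Python standard library. It does not use xGPR or its API, and it is the textbook cubic-cost method rather than the scalable approach the abstract describes. The RBF kernel, the hyperparameter values, and every function name are assumptions made for illustration.

```python
import math

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel with unit signal variance (an assumption)."""
    return math.exp(-0.5 * ((x1 - x2) / lengthscale) ** 2)

def cholesky(a):
    """Lower-triangular L with L @ L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def forward_solve(low, b):
    """Solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y

def backward_solve(low, y):
    """Solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def gp_predict(x_train, y_train, x_new, lengthscale=1.0, noise=1e-2):
    """Posterior mean and standard deviation of the latent function at x_new."""
    n = len(x_train)
    gram = [[rbf(x_train[i], x_train[j], lengthscale) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    low = cholesky(gram)
    alpha = backward_solve(low, forward_solve(low, y_train))
    k_star = [rbf(x, x_new, lengthscale) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = forward_solve(low, k_star)
    var = rbf(x_new, x_new, lengthscale) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

# Toy data: ten noiseless samples of sin(x) on [0, 5].
xs = [5.0 * k / 9.0 for k in range(10)]
ys = [math.sin(x) for x in xs]
near_mean, near_std = gp_predict(xs, ys, 2.5)   # inside the training range
far_mean, far_std = gp_predict(xs, ys, 20.0)    # far from any training point
print(near_mean, near_std, far_mean, far_std)
```

The predictive standard deviation is small where training data are dense and reverts to the prior far away from them, which is the kind of signal a point-estimate model does not provide by default.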

Monitoring oligomerization dynamics of individual human neurotensin receptors 1 in living cells and in SMALP nanodiscs. (arXiv:2302.02416v1 [q-bio.BM]) arxiv.org/abs/2302.02416

The human neurotensin receptor 1 (NTSR1) is a G protein-coupled receptor. The receptor is activated by the small peptide ligand neurotensin. NTSR1 can be expressed in HEK cells by stable transfection. Previously we used the fluorescent protein markers mRuby3 or mNeonGreen fused to NTSR1 for EMCCD-based structured illumination microscopy (SIM) in living HEK cells. Ligand binding induced conformational changes in NTSR1 which triggered the intracellular signaling processes. Recent single-molecule studies revealed a dynamic monomer/dimer equilibrium of this receptor in artificial lipid bilayers. Here we report on the oligomerization state of human NTSR1 from living cells by trapping the receptors in lipid nanodiscs. Briefly, SMALPs (styrene-maleic acid copolymer lipid nanoparticles) were produced directly from the plasma membranes of living HEK293T FlpIn cells. SMALPs with a diameter of 15 nm were soluble and stable. NTSR1 in SMALPs was analyzed by single-molecule intensity measurements one membrane patch at a time using a custom-built confocal anti-Brownian electrokinetic trap (ABEL trap) microscope. We found oligomerization changes before and after stimulation of the receptor with its ligand neurotensin.

Towards inferring network properties from epidemic data. (arXiv:2302.02470v1 [q-bio.QM]) arxiv.org/abs/2302.02470

Epidemic propagation on networks represents an important departure from traditional mass-action models. However, the high dimensionality of the exact models poses a challenge to both mathematical analysis and parameter inference. By using mean-field models, such as the pairwise model (PWM), the complexity becomes tractable. While such models have been used extensively for model analysis, there is limited work in the context of statistical inference. In this paper, we explore the extent to which the PWM with the susceptible-infected-recovered (SIR) epidemic can be used to infer disease- and network-related parameters. The widely used MLE approach exhibits several issues pertaining to parameter unidentifiability and a lack of robustness to exact knowledge about key quantities such as population size and/or proportion of underreporting. As an alternative, we considered the recently developed dynamical survival analysis (DSA). For scenarios in which there is no model mismatch, such as when data are generated via simulations, both methods perform well despite strong dependence between parameters. However, for real-world data, such as foot-and-mouth, H1N1 and COVID-19, the DSA method appears more robust to potential model mismatch and the parameter estimates appear more epidemiologically plausible. Taken together, however, our findings suggest that network-based mean-field models can be used to formulate approximate likelihoods which, coupled with an efficient inference scheme, make it possible to not only learn about the parameters of the disease dynamics but also those of the underlying network.
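
For readers unfamiliar with the pairwise model, the sketch below integrates the standard pairwise SIR equations for a regular network with forward Euler. The abstract does not state the equations, so the closure used here (the usual one for a degree-regular network), the parameter values, and the function name are assumptions. No inference, MLE or DSA, is performed.

```python
def pairwise_sir(n_nodes, degree, tau, gamma, initial_infected,
                 t_end=40.0, dt=0.01):
    """Forward-Euler integration of the pairwise SIR model.

    Tracks the expected counts [S], [I] and the ordered pair counts
    [SS], [SI], closing triples with
    [ASI] ~ (degree - 1) / degree * [AS][SI] / [S].
    tau is the per-link transmission rate, gamma the recovery rate.
    """
    s = float(n_nodes - initial_infected)
    i = float(initial_infected)
    ss = degree * s * s / n_nodes   # random initial placement of infections
    si = degree * s * i / n_nodes
    kappa = (degree - 1) / degree
    history = [(0.0, s, i)]
    steps = int(round(t_end / dt))
    for step in range(1, steps + 1):
        ssi = kappa * ss * si / s
        isi = kappa * si * si / s
        ds = -tau * si
        di = tau * si - gamma * i
        dss = -2.0 * tau * ssi
        dsi = tau * (ssi - isi - si) - gamma * si
        s, i = s + dt * ds, i + dt * di
        ss, si = ss + dt * dss, si + dt * dsi
        history.append((step * dt, s, i))
    return history

# Hypothetical parameters: 1000 nodes, every node has 6 contacts.
outbreak = pairwise_sir(1000, 6, tau=1.0, gamma=1.0, initial_infected=10)
fizzle = pairwise_sir(1000, 6, tau=0.1, gamma=1.0, initial_infected=10)
print(outbreak[-1], fizzle[-1])
```

The network enters only through the degree and the pair counts, which is why fitting such a model to incidence data can, in principle, say something about network parameters as well as disease parameters.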

Offline Learning of Closed-Loop Deep Brain Stimulation Controllers for Parkinson Disease Treatment. (arXiv:2302.02477v1 [cs.LG]) arxiv.org/abs/2302.02477

Deep brain stimulation (DBS) has shown great promise toward treating motor symptoms caused by Parkinson's disease (PD), by delivering electrical pulses to the Basal Ganglia (BG) region of the brain. However, DBS devices approved by the U.S. Food and Drug Administration (FDA) can only deliver continuous DBS (cDBS) stimuli at a fixed amplitude; this energy-inefficient operation reduces battery lifetime of the device, cannot adapt treatment dynamically for activity, and may cause significant side-effects (e.g., gait impairment). In this work, we introduce an offline reinforcement learning (RL) framework, allowing the use of past clinical data to train an RL policy to adjust the stimulation amplitude in real time, with the goal of reducing energy use while maintaining the same level of treatment (i.e., control) efficacy as cDBS. Moreover, clinical protocols require the safety and performance of such RL controllers to be demonstrated ahead of deployments in patients. Thus, we also introduce an offline policy evaluation (OPE) method to estimate the performance of RL policies using historical data, before deploying them on patients. We evaluated our framework on four PD patients equipped with the RC+S DBS system, employing the RL controllers during monthly clinical visits, with the overall control efficacy evaluated by severity of symptoms (i.e., bradykinesia and tremor), changes in PD biomarkers (i.e., local field potentials), and patient ratings. The results from clinical experiments show that our RL-based controller maintains the same level of control efficacy as cDBS, but with significantly reduced stimulation energy. Further, the OPE method is shown to be effective in accurately estimating and ranking the expected returns of RL controllers.

A three-state coupled Markov switching model for COVID-19 outbreaks across Quebec based on hospital admissions. (arXiv:2302.02488v1 [stat.AP]) arxiv.org/abs/2302.02488

Recurrent COVID-19 outbreaks have placed immense strain on the hospital system in Quebec. We develop a Bayesian three-state coupled Markov switching model to analyze COVID-19 outbreaks across Quebec based on admissions in the 30 largest hospitals. Within each catchment area we assume the existence of three states for the disease: absence (a new state meant to account for the many zeroes in some of the smaller areas), endemic, and outbreak. Then we assume the disease switches between the three states in each area through a series of coupled nonhomogeneous hidden Markov chains. Unlike previous approaches, the transition probabilities may depend on covariates and the occurrence of outbreaks in neighboring areas, to account for geographical outbreak spread. Additionally, to prevent rapid switching between endemic and outbreak periods, we introduce clone states into the model which enforce minimum endemic and outbreak durations. We find, for example, that mobility in retail and recreation venues had a strong positive association with the development and persistence of new COVID-19 outbreaks in Quebec. Based on model comparison, our contributions show promise in improving state estimation retrospectively and in real time, especially when there are smaller areas and highly spatially synchronized outbreaks, and they offer new and interesting epidemiological interpretations.
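
The generative side of such a model can be sketched as a forward simulation for one area: a three-state chain whose endemic-to-outbreak probability depends on whether a neighboring area is in outbreak, with admissions drawn conditionally on the state. This is only a toy version, not the authors' model. The clone states and the Bayesian fitting are omitted, and every transition probability, coefficient, and admission mean below is a made-up assumption.

```python
import math
import random

STATES = ("absence", "endemic", "outbreak")

def transition_matrix(neighbor_outbreak):
    """Row-stochastic matrix; neighbor_outbreak is 0 or 1.

    The endemic -> outbreak probability is a logistic function of the
    neighbor indicator (hypothetical coefficients -3.0 and 2.0).
    """
    p_out = 1.0 / (1.0 + math.exp(-(-3.0 + 2.0 * neighbor_outbreak)))
    return [
        [0.90, 0.10, 0.00],            # absence cannot jump straight to outbreak
        [0.05, 0.95 - p_out, p_out],
        [0.00, 0.10, 0.90],
    ]

def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate(neighbor_outbreaks, seed=0, mean_admissions=(0.0, 2.0, 10.0)):
    """Simulate hidden states and admissions, starting from 'absence'."""
    rng = random.Random(seed)
    state, path, admissions = 0, [], []
    for flag in neighbor_outbreaks:
        row = transition_matrix(flag)[state]
        state = rng.choices(range(3), weights=row)[0]
        path.append(state)
        # The absence state always emits zero admissions.
        count = 0 if state == 0 else sample_poisson(rng, mean_admissions[state])
        admissions.append(count)
    return path, admissions

# Hypothetical neighbor history: quiet for 30 weeks, then an outbreak next door.
neighbors = [0] * 30 + [1] * 30
path, admissions = simulate(neighbors, seed=1)
print([STATES[s] for s in path[:10]], admissions[:10])
```

In the model described above the states are hidden and must be inferred from the admission counts, so a simulation like this is mainly useful for checking an inference procedure against known ground truth.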
