
Distribution law of the COVID-19 number through different temporal stages and geographic scales. (arXiv:2208.06435v1 [physics.soc-ph]) arxiv.org/abs/2208.06435

Heavy-tailed distributions of confirmed COVID-19 cases and deaths in spatiotemporal space

This paper conducts a systematic statistical analysis of the geographical empirical distributions of both cumulative and daily confirmed COVID-19 cases and deaths at county, city, and state levels over a time span from January 2020 to June 2022. Heavy-tailed distributions can be used to fit the empirical distributions observed at different temporal stages and geographical scales. Estimates of the shape parameter of the tail distributions using the Generalized Pareto Distribution also support the observation of heavy tails. According to the characteristics of the heavy-tailed distributions, the evolution of the geographical empirical distributions can be divided into three distinct phases, namely the power-law phase, the lognormal phase I, and the lognormal phase II. These three phases could serve as an indicator of the severity of the COVID-19 pandemic within an area. The empirical results suggest important intrinsic dynamics of the spread of a human infectious virus through the complex, interconnected physical network of humans. The findings extend previous empirical studies and could provide stricter constraints for current mathematical and physical modeling studies, such as the SIR model and its variants based on the theory of complex networks.
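As a rough illustration of the tail-shape estimation mentioned above, the sketch below fits a Generalized Pareto Distribution (GPD) to excesses over a threshold. The abstract does not say which estimator the authors used; this example assumes a simple method-of-moments estimator (only meaningful when the shape parameter is below 1/2), and the function names, threshold, and synthetic data are all hypothetical. A positive shape estimate indicates a heavy tail.

```python
def gpd_quantile(p, xi, sigma):
    """Quantile function of a GPD with shape xi != 0 and scale sigma."""
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


def gpd_moment_fit(values, threshold):
    """Method-of-moments estimate of GPD shape and scale.

    Uses mean = sigma / (1 - xi) and var = sigma^2 / ((1 - xi)^2 (1 - 2 xi)),
    which hold for the excesses over the threshold when xi < 1/2.
    """
    excesses = [v - threshold for v in values if v > threshold]
    n = len(excesses)
    if n < 2:
        raise ValueError("need at least two values above the threshold")
    mean = sum(excesses) / n
    var = sum((e - mean) ** 2 for e in excesses) / (n - 1)
    ratio = mean * mean / var          # equals 1 - 2*xi in expectation
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (ratio + 1.0)
    return xi, sigma


# Synthetic data (NOT COVID-19 counts): a deterministic quantile grid drawn
# from a GPD with xi = 0.2, sigma = 1.0, shifted above a threshold of 10.
n = 100_000
threshold = 10.0
sample = [threshold + gpd_quantile((i + 0.5) / n, 0.2, 1.0) for i in range(n)]
xi_hat, sigma_hat = gpd_moment_fit(sample, threshold)
```

For real, strongly heavy-tailed data a maximum-likelihood fit is usually preferred, since the moment estimator breaks down as the shape parameter approaches 1/2.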

arxiv.org

Sequence-based deep learning antibody design for in silico antibody affinity maturation. (arXiv:2103.03724v2 [q-bio.BM] UPDATED) arxiv.org/abs/2103.03724

Sequence-based deep learning antibody design for in silico antibody affinity maturation

Antibody therapeutics have been extensively studied in drug discovery and development within the past decades. One increasingly popular focus in the antibody discovery pipeline is the optimization step for therapeutic leads. Both traditional methods and in silico approaches aim to generate candidates with high binding affinity against specific target antigens. Traditional in vitro approaches use hybridoma or phage display for candidate selection, and surface plasmon resonance (SPR) for evaluation, while in silico computational approaches aim to reduce the high cost and improve efficiency by incorporating mathematical algorithms and computational processing power in the design process. In the present study, we investigated different graph-based designs for depicting antibody-antigen interactions in terms of antibody affinity prediction using deep learning techniques. While other in silico computations require experimentally determined crystal structures, our study focused on the capability of sequence-based models for in silico antibody maturation. Our preliminary studies achieved satisfactory prediction accuracy on binding affinities compared to conventional approaches and other deep learning approaches. To further study antibody-antigen binding specificity, and to simulate the optimization process in a real-world scenario, we introduced a pairwise prediction strategy. We performed analysis based on both baseline and pairwise prediction results. The resulting prediction accuracy and efficiency demonstrate the feasibility and computational efficiency of sequence-based methods for adoption as a scalable industry practice.


Sequential attractors in combinatorial threshold-linear networks. (arXiv:2107.10244v4 [q-bio.NC] UPDATED) arxiv.org/abs/2107.10244

Sequential attractors in combinatorial threshold-linear networks

Sequences of neural activity arise in many brain areas, including cortex, hippocampus, and central pattern generator circuits that underlie rhythmic behaviors like locomotion. While network architectures supporting sequence generation vary considerably, a common feature is an abundance of inhibition. In this work, we focus on architectures that support sequential activity in recurrently connected networks with inhibition-dominated dynamics. Specifically, we study emergent sequences in a special family of threshold-linear networks, called combinatorial threshold-linear networks (CTLNs), whose connectivity matrices are defined from directed graphs. Such networks naturally give rise to an abundance of sequences whose dynamics are tightly connected to the underlying graph. We find that architectures based on generalizations of cycle graphs produce limit cycle attractors that can be activated to generate transient or persistent (repeating) sequences. Each architecture type gives rise to an infinite family of graphs that can be built from arbitrary component subgraphs. Moreover, we prove a number of graph rules for the corresponding CTLNs in each family. The graph rules allow us to strongly constrain, and in some cases fully determine, the fixed points of the network in terms of the fixed points of the component subnetworks. Finally, we also show how the structure of certain architectures gives insight into the sequential dynamics of the corresponding attractor.
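To make the phrase "connectivity matrices are defined from directed graphs" concrete, here is a minimal sketch of a CTLN built from a 3-cycle. The abstract does not spell out the construction; the weight rule used here (W_ij = -1 + eps if j -> i, -1 - delta otherwise, zero diagonal), the dynamics dx/dt = -x + [Wx + theta]_+, and the parameter values eps = 0.25, delta = 0.5, theta = 1 are assumptions taken from the standard CTLN definition in the authors' related work. Function names, the Euler step size, and the initial condition are hypothetical.

```python
def ctln_weights(n, edges, eps=0.25, delta=0.5):
    """Build the CTLN weight matrix for a directed graph on n nodes.

    `edges` holds pairs (j, i) meaning j -> i. Assumed rule:
    W[i][j] = -1 + eps if j -> i, -1 - delta if not, and 0 on the diagonal.
    """
    edge_set = set(edges)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = -1.0 + eps if (j, i) in edge_set else -1.0 - delta
    return W


def simulate(W, x0, theta=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of dx/dt = -x + max(0, W x + theta)."""
    n = len(x0)
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        drive = [sum(W[i][j] * x[j] for j in range(n)) + theta for i in range(n)]
        x = [x[i] + dt * (-x[i] + max(0.0, drive[i])) for i in range(n)]
        trajectory.append(list(x))
    return trajectory


# A 3-cycle: 0 -> 1 -> 2 -> 0.
cycle_edges = [(0, 1), (1, 2), (2, 0)]
W = ctln_weights(3, cycle_edges)
trajectory = simulate(W, [0.1, 0.0, 0.0])
```

Plotting the three coordinates of `trajectory` against time is one way to inspect whether the neurons activate in sequence around the cycle.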


Core motifs predict dynamic attractors in combinatorial threshold-linear networks. (arXiv:2109.03198v2 [q-bio.NC] UPDATED) arxiv.org/abs/2109.03198

Core motifs predict dynamic attractors in combinatorial threshold-linear networks

Combinatorial threshold-linear networks (CTLNs) are a special class of inhibition-dominated TLNs defined from directed graphs. Like more general TLNs, they display a wide variety of nonlinear dynamics including multistability, limit cycles, quasiperiodic attractors, and chaos. In prior work, we have developed a detailed mathematical theory relating stable and unstable fixed points of CTLNs to graph-theoretic properties of the underlying network. Here we find that a special type of fixed points, corresponding to core motifs, are predictive of both static and dynamic attractors. Moreover, the attractors can be found by choosing initial conditions that are small perturbations of these fixed points. This motivates us to hypothesize that dynamic attractors of a network correspond to unstable fixed points supported on core motifs. We tested this hypothesis on a large family of directed graphs of size $n=5$, and found remarkable agreement. Furthermore, we discovered that core motifs with similar embeddings give rise to nearly identical attractors. This allowed us to classify attractors based on structurally-defined graph families. Our results suggest that graphical properties of the connectivity can be used to predict a network's complex repertoire of nonlinear dynamics.


Retinotopic Mechanics derived using classical physics. (arXiv:2109.11632v2 [q-bio.NC] UPDATED) arxiv.org/abs/2109.11632

Retinotopic Mechanics derived using classical physics

The concept of a cell's receptive field is a bedrock in systems neuroscience, and the classical static description of the receptive field has had enormous success in explaining the fundamental mechanisms underlying visual processing. Borne out by the spatio-temporal dynamics of visual sensitivity to probe stimuli in primates, I build on top of this static account with the introduction of a new computational field of research, retinotopic mechanics. At its core, retinotopic mechanics assumes that during active sensing receptive fields are not static but can shift beyond their classical extent. Specifically, the canonical computations and the neural architecture that supports these computations are inherently mediated by a neurobiologically inspired force field (e.g., $R_s \propto \sim 1/\Delta M$). For example, when the retina is displaced because of a saccadic eye movement from one point in space to another, cells across retinotopic brain areas are tasked with discounting the retinal disruptions such active surveillance inherently introduces. This neural phenomenon is known as spatial constancy. Using retinotopic mechanics, I propose that to achieve spatial constancy or any active visually mediated task, retinotopic cells, namely their receptive fields, are constrained by eccentricity dependent elastic fields. I propose that elastic fields are self-generated by the visual system and allow receptive fields the ability to predictively shift beyond their classical extent to future post-saccadic location such that neural sensitivity which would otherwise support intermediate eccentric locations likely to contain retinal disruptions is transiently blunted.


Conditional Antibody Design as 3D Equivariant Graph Translation. (arXiv:2208.06073v1 [q-bio.BM]) arxiv.org/abs/2208.06073

Conditional Antibody Design as 3D Equivariant Graph Translation

Antibody design is valuable for therapeutic usage and biological research. Existing deep-learning-based methods encounter several key issues: 1) incomplete context for Complementarity-Determining Regions (CDRs) generation; 2) inability to capture the entire 3D geometry of the input structure; 3) inefficient prediction of the CDR sequences in an autoregressive manner. In this paper, we propose Multi-channel Equivariant Attention Network (MEAN) to co-design 1D sequences and 3D structures of CDRs. To be specific, MEAN formulates antibody design as a conditional graph translation problem by importing extra components including the target antigen and the light chain of the antibody. Then, MEAN resorts to E(3)-equivariant message passing along with a proposed attention mechanism to better capture the geometrical correlation between different components. Finally, it outputs both the 1D sequences and 3D structure via a multi-round progressive full-shot scheme, which is more efficient and precise than previous autoregressive approaches. Our method significantly surpasses state-of-the-art models in sequence and structure modeling, antigen-binding CDR design, and binding affinity optimization. Specifically, the relative improvement over baselines is about 23% in antigen-binding CDR design and 34% for affinity optimization.


Feature-Based Time-Series Analysis in R using the theft Package. (arXiv:2208.06146v1 [stat.ML]) arxiv.org/abs/2208.06146

Feature-Based Time-Series Analysis in R using the theft Package

Time series are measured and analyzed across the sciences. One method of quantifying the structure of time series is by calculating a set of summary statistics or 'features', and then representing a time series in terms of its properties as a feature vector. The resulting feature space is interpretable and informative, and enables conventional statistical learning approaches, including clustering, regression, and classification, to be applied to time-series datasets. Many open-source software packages for computing sets of time-series features exist across multiple programming languages, including catch22 (22 features: Matlab, R, Python, Julia), feasts (42 features: R), tsfeatures (63 features: R), Kats (40 features: Python), tsfresh (779 features: Python), and TSFEL (390 features: Python). However, there are several issues: (i) a singular access point to these packages is not currently available; (ii) to access all feature sets, users must be fluent in multiple languages; and (iii) these feature-extraction packages lack extensive accompanying methodological pipelines for performing feature-based time-series analysis, such as applications to time-series classification. Here we introduce a solution to these issues in an R software package called theft: Tools for Handling Extraction of Features from Time series. theft is a unified and extendable framework for computing features from the six open-source time-series feature sets listed above. It also includes a suite of functions for processing and interpreting the performance of extracted features, including extensive data-visualization templates, low-dimensional projections, and time-series classification operations. With an increasing volume and complexity of time-series datasets in the sciences and industry, theft provides a standardized framework for comprehensively quantifying and interpreting informative structure in time series.
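The idea of representing a time series as a feature vector can be sketched in a few lines. This is not the theft API (theft is an R package and its functions are not shown in the abstract), and the three features below are not claimed to match any of the listed feature sets; the function name and feature keys are hypothetical, chosen only to show the general idea.

```python
import math


def feature_vector(series):
    """Summarise a time series by three simple features.

    Returns the mean, the population standard deviation, and the
    lag-1 autocorrelation (0.0 for a constant series).
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    centered = [v - mean for v in series]
    sum_sq = sum(c * c for c in centered)
    std = math.sqrt(sum_sq / n)
    if sum_sq > 0:
        acf_lag1 = sum(centered[t] * centered[t + 1] for t in range(n - 1)) / sum_sq
    else:
        acf_lag1 = 0.0
    return {"mean": mean, "std": std, "acf_lag1": acf_lag1}


features = feature_vector([1.0, 2.0, 3.0, 4.0, 5.0])
```

Stacking such vectors for many series gives a matrix on which ordinary clustering, regression, or classification methods can operate.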


An Empirical Exploration of Cross-domain Alignment between Language and Electroencephalogram. (arXiv:2208.06348v1 [q-bio.NC]) arxiv.org/abs/2208.06348

An Empirical Exploration of Cross-domain Alignment between Language and Electroencephalogram

Electroencephalography (EEG) and language have been widely explored independently for many downstream tasks (e.g., sentiment analysis, relation detection, etc.). Multimodal approaches that study both domains have not been well explored, even though in recent years, multimodal learning has been seen to be more powerful than its unimodal counterparts. In this study, we want to explore the relationship and dependency between EEG and language, i.e., how one domain reflects and represents the other. To study the relationship at the representation level, we introduced MTAM, a Multimodal Transformer Alignment Model, to observe coordinated representations between the two modalities, and thus employ the transformed representations for downstream applications. We used various relationship alignment-seeking techniques, such as Canonical Correlation Analysis and Wasserstein Distance, as loss functions to transform low-level language and EEG features into high-level transformed features. On two downstream applications, sentiment analysis and relation detection, we achieved new state-of-the-art results on two datasets, ZuCo and K-EmoCon. Our method achieved an F1-score improvement of 16.5% on sentiment analysis for K-EmoCon, 26.6% on sentiment analysis of ZuCo, and 31.1% on relation detection of ZuCo. In addition, we provide interpretation of the performance improvement by: (1) visualizing the original feature distribution and the transformed feature distribution, showing the effectiveness of the alignment module for discovering and encoding the relationship between EEG and language; (2) visualizing word-level and sentence-level EEG-language alignment weights, showing the influence of different language semantics as well as EEG frequency features; and (3) visualizing brain topographical maps to provide an intuitive demonstration of the connectivity of EEG and language response in the brain regions.
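For intuition about one of the alignment measures named above, the following sketch computes the Wasserstein-1 distance between two equal-size sets of scalar values. It is only a one-dimensional illustration: the abstract does not describe how MTAM applies the distance to high-dimensional EEG and language features, and the function name and inputs are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size empirical samples.

    In one dimension the optimal transport plan matches sorted values,
    so the distance is the mean absolute difference of the order statistics.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)


# Shifting every value by 1 moves the distribution a distance of exactly 1.
shifted = wasserstein_1d([0.0, 1.0, 2.0], [3.0, 1.0, 2.0])
identical = wasserstein_1d([0.5, 0.1], [0.1, 0.5])
```

Used as a loss, a smaller value means the two feature distributions lie closer together.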


3D Graph Contrastive Learning for Molecular Property Prediction. (arXiv:2208.06360v1 [q-bio.BM]) arxiv.org/abs/2208.06360

3D Graph Contrastive Learning for Molecular Property Prediction

Self-supervised learning (SSL) is a method that learns the data representation by utilizing supervision inherent in the data. This learning method is in the spotlight in the drug field, which lacks annotated data because experiments are time-consuming and expensive. SSL using enormous unlabeled data has shown excellent performance for molecular property prediction, but a few issues exist. (1) Existing SSL models are large-scale; this limits the use of SSL where computing resources are insufficient. (2) In most cases, they do not utilize 3D structural information for molecular representation learning. The activity of a drug is closely related to the structure of the drug molecule. Nevertheless, most current models do not use 3D information or use it only partially. (3) Previous models that apply contrastive learning to molecules use augmentations that permute atoms and bonds. Therefore, molecules having different characteristics can be treated as the same positive samples. We propose a novel contrastive learning framework, small-scale 3D Graph Contrastive Learning (3DGCL) for molecular property prediction, to solve the above problems. 3DGCL learns the molecular representation by reflecting the molecule's structure through a pre-training process that does not change the semantics of the drug. Using only 1,128 samples for pre-training data and 1 million model parameters, we achieved state-of-the-art or comparable performance in four regression benchmark datasets. Extensive experiments demonstrate that 3D structural information based on chemical knowledge is essential to molecular representation learning for property prediction.


Hyperbolic Molecular Representation Learning for Drug Repositioning. (arXiv:2208.06361v1 [q-bio.BM]) arxiv.org/abs/2208.06361

Hyperbolic Molecular Representation Learning for Drug Repositioning

Learning accurate drug representations is essential for tasks such as computational drug repositioning. A drug hierarchy is a valuable source that encodes knowledge of relations among drugs in a tree-like structure where drugs that act on the same organs, treat the same disease, or bind to the same biological target are grouped together. However, its utility in learning drug representations has not yet been explored, and currently described drug representations cannot place novel molecules in a drug hierarchy. Here, we develop a semi-supervised drug embedding that incorporates two sources of information: (1) underlying chemical grammar that is inferred from chemical structures of drugs and drug-like molecules (unsupervised), and (2) hierarchical relations that are encoded in an expert-crafted hierarchy of approved drugs (supervised). We use the Variational Auto-Encoder (VAE) framework to encode the chemical structures of molecules and use the drug-drug similarity information obtained from the hierarchy to induce the clustering of drugs in hyperbolic space. The hyperbolic space is amenable to encoding hierarchical relations. Our qualitative results support that the learned drug embedding can induce the hierarchical relations among drugs. We demonstrate that the learned drug embedding can be used for drug repositioning.
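Why hyperbolic space suits hierarchies can be seen from its distance function: points near the boundary are far from each other even when they look close in Euclidean terms, which leaves room for the many leaves of a tree. The sketch below assumes the Poincaré ball model; the abstract does not state which model of hyperbolic space the authors use, and the function name and sample points are hypothetical.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincare ball.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    norm_u = sum(a * a for a in u)
    norm_v = sum(b * b for b in v)
    if norm_u >= 1.0 or norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - norm_u) * (1.0 - norm_v)))


# Same Euclidean gap (0.1), very different hyperbolic distances.
near_origin = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))
from_origin = poincare_distance((0.0, 0.0), (0.5, 0.0))
```

In a hierarchy embedding of this kind, general categories tend to sit near the origin and specific items near the boundary.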


Hybrid Approach to Identify Druglikeness Leading Compounds against SARS 3CL Protease. (arXiv:2208.06362v1 [q-bio.BM]) arxiv.org/abs/2208.06362

Hybrid Approach to Identify Druglikeness Leading Compounds against SARS 3CL Protease

SARS-CoV-2 is a positive-sense single-stranded RNA virus that had caused the death of more than 6.3 million people as of June 2022. Moreover, by disrupting global supply chains through lockdowns, the virus has indirectly caused devastating damage to the global economy. It is vital to design and develop drugs for this virus and its variants. In this paper, we used an in silico study framework to repurpose existing therapeutic agents and find drug-like bioactive molecules that could cure COVID-19. We applied the Lipinski rules to molecules retrieved from the ChEMBL database to find 133 drug-like bioactive molecules against SARS coronavirus 3CL Protease. On the basis of standard IC50 values, the dataset was divided into three classes: active, inactive, and intermediate. Our comparative analysis demonstrated that the proposed Extra Tree Regressor (ETR) ensemble model predicts the bioactivity of chemical compounds more accurately than other state-of-the-art machine learning models. Using ADMET analysis, we identified 13 novel bioactive molecules having ChEMBL IDs 187460, 190743, 222234, 222628, 222735, 222769, 222840, 222893, 225515, 358279, 363535, 365134 and 426898. We found that these molecules are highly suitable drug candidates for SARS-CoV-2 3CL Protease. These candidate molecules were further investigated for binding affinities. For this purpose, we performed molecular docking and shortlisted six bioactive molecules having ChEMBL IDs 187460, 222769, 225515, 358279, 363535, and 365134. These molecules can be suitable drug candidates for SARS-CoV-2. It is anticipated that the pharmacologist community may use these promising compounds for further in vitro analysis.
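The Lipinski filtering step can be sketched as a simple check on four molecular descriptors: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors. The abstract does not say how many violations the authors tolerated, so the `max_violations` parameter is an assumption, as are the function names and dictionary keys. The two example molecules are made-up descriptor values for illustration, not compounds from ChEMBL.

```python
def lipinski_violations(mol):
    """Count rule-of-five violations for a dict of precomputed descriptors."""
    checks = [
        mol["mol_weight"] > 500.0,   # molecular weight in daltons
        mol["logp"] > 5.0,           # octanol-water partition coefficient
        mol["h_donors"] > 5,         # hydrogen-bond donors
        mol["h_acceptors"] > 10,     # hydrogen-bond acceptors
    ]
    return sum(checks)


def passes_lipinski(mol, max_violations=0):
    """Return True if the molecule has at most `max_violations` violations."""
    return lipinski_violations(mol) <= max_violations


# Hypothetical descriptor values, for illustration only.
candidates = [
    {"name": "example_a", "mol_weight": 342.4, "logp": 2.1, "h_donors": 2, "h_acceptors": 5},
    {"name": "example_b", "mol_weight": 712.9, "logp": 6.3, "h_donors": 4, "h_acceptors": 12},
]
drug_like = [m["name"] for m in candidates if passes_lipinski(m)]
```

In practice the descriptors would be computed from chemical structures by a cheminformatics toolkit before a filter like this is applied.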
