
Boosting Convolutional Neural Networks' Protein Binding Site Prediction Capacity Using SE(3)-invariant transformers, Transfer Learning and Homology-based Augmentation. (arXiv:2303.08818v1 [q-bio.QM]) arxiv.org/abs/2303.08818

Figuring out small-molecule binding sites in target proteins, at the resolution of either pocket or residue, is critical in many virtual and real drug-discovery scenarios. Since it is not always easy to find such binding sites based on domain knowledge or traditional methods, different deep learning methods that predict binding sites from protein structures have been developed in recent years. Here we present a new such deep learning algorithm, which significantly outperformed all state-of-the-art baselines at both resolutions, pocket and residue. This good performance was also demonstrated in a case study involving the protein human serum albumin and its binding sites. Our algorithm included new ideas both in the model architecture and in the training method. For the model architecture, it incorporated SE(3)-invariant geometric self-attention layers that operate on top of residue-level CNN outputs. This residue-level processing allowed transfer learning between the two resolutions, which turned out to significantly improve binding pocket prediction. Moreover, we developed a novel augmentation method based on protein homology, which prevented our model from over-fitting. Overall, we believe that our contribution to the literature is twofold. First, we provided a new computational method for binding site prediction that is relevant to real-world applications, as shown by its good performance on different benchmarks and in the case study. Second, the novel ideas in our method (the model architecture, transfer learning and homology-based augmentation) could serve as useful components in future work.
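
The architectural idea above (SE(3)-invariant geometric self-attention on top of residue-level CNN outputs) can be illustrated with a minimal sketch. It assumes PyTorch and uses pairwise C-alpha distances as the only geometric input, since distances are unchanged by rotations and translations; the layer name, sizes and the distance-to-bias mapping are illustrative, not the paper's implementation:

```python
import torch
import torch.nn as nn

class InvariantSelfAttention(nn.Module):
    """Self-attention over residues whose geometric input is only the
    pairwise distance matrix, hence invariant to rigid motions (SE(3))."""
    def __init__(self, dim: int, n_heads: int = 4):
        super().__init__()
        self.n_heads = n_heads
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.dist_to_bias = nn.Linear(1, n_heads)   # learned per-head bias from distance

    def forward(self, feats: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # feats:  (B, N, dim) residue-level CNN outputs
        # coords: (B, N, 3)   residue coordinates (e.g. C-alpha atoms)
        d = torch.cdist(coords, coords).unsqueeze(-1)      # (B, N, N, 1), SE(3)-invariant
        bias = self.dist_to_bias(d).permute(0, 3, 1, 2)    # (B, heads, N, N)
        bias = bias.reshape(-1, d.shape[1], d.shape[2])    # (B*heads, N, N) additive mask
        out, _ = self.attn(feats, feats, feats, attn_mask=bias)
        return out

layer = InvariantSelfAttention(dim=64)
out = layer(torch.randn(2, 50, 64), torch.randn(2, 50, 3))  # toy batch: 2 proteins, 50 residues
```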

Nonequilibrium calcium dynamics optimizes the energetic efficiency of mitochondrial metabolism. (arXiv:2303.08822v1 [q-bio.MN]) arxiv.org/abs/2303.08822

Living organisms continuously harness energy to perform complex functions for their adaptation and survival while part of that energy is dissipated in the form of heat or chemical waste. Determining the energetic cost and the efficiency of specific cellular processes remains a largely open problem. Here, we analyze the efficiency of mitochondrial adenosine triphosphate (ATP) production through the tricarboxylic acid (TCA) cycle and oxidative phosphorylation that generates most of the cellular chemical energy in eukaryotes. The regulation of this pathway by calcium signaling represents a well-characterized example of a regulatory cross-talk that can affect the energetic output of a metabolic pathway, but its concrete energetic impact remains elusive. On the one hand, calcium enhances ATP production by activating key enzymes of the TCA cycle, but on the other hand calcium homeostasis depends on ATP availability. To evaluate how calcium signaling impacts the efficiency of mitochondrial metabolism, we propose a detailed kinetic model describing the calcium-mitochondria cross-talk and we analyze it using a nonequilibrium thermodynamic approach: after identifying the effective reactions driving mitochondrial metabolism out of equilibrium, we quantify the thermodynamic efficiency of the metabolic machinery for different physiological conditions. We find that calcium oscillations increase the efficiency with a maximum close to substrate-limited conditions, suggesting a compensatory effect of calcium signaling on the energetics of mitochondrial metabolism.
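
The abstract does not state the efficiency formula; for reference, the standard definition of thermodynamic efficiency for a machine in which a driving reaction powers a driven one (as when substrate oxidation drives ATP synthesis) reads as follows, in notation assumed here rather than taken from the paper:

```latex
% Driving process: flux J_in running down its affinity, A_in > 0.
% Driven process:  flux J_out pushed against its affinity, so J_out * A_out < 0.
\eta \;=\; \frac{-\,J_{\mathrm{out}}\,A_{\mathrm{out}}}{J_{\mathrm{in}}\,A_{\mathrm{in}}},
\qquad
\dot{\Sigma} \;=\; J_{\mathrm{in}}A_{\mathrm{in}} + J_{\mathrm{out}}A_{\mathrm{out}} \;\ge\; 0
\;\;\Rightarrow\;\; 0 \le \eta \le 1 \ \text{at steady state.}
```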

ROSE: A Neurocomputational Architecture for Syntax. (arXiv:2303.08877v1 [cs.CL]) arxiv.org/abs/2303.08877

A comprehensive model of natural language processing in the brain must accommodate four components: representations, operations, structures and encoding. It further requires a principled account of how these components mechanistically, and causally, relate to one another. While previous models have isolated regions of interest for structure-building and lexical access, many gaps remain with respect to bridging distinct scales of neural complexity. By expanding existing accounts of how neural oscillations can index various linguistic processes, this article proposes a neurocomputational architecture for syntax, termed the ROSE model (Representation, Operation, Structure, Encoding). Under ROSE, the basic data structures of syntax are atomic features, types of mental representations (R), and are coded at the single-unit and ensemble level. Elementary computations (O) that transform these units into manipulable objects accessible to subsequent structure-building levels are coded via high frequency gamma activity. Low frequency synchronization and cross-frequency coupling code for recursive categorial inferences (S). Distinct forms of low frequency coupling and phase-amplitude coupling (delta-theta coupling via pSTS-IFG; theta-gamma coupling via IFG to conceptual hubs) then encode these structures onto distinct workspaces (E). Causally connecting R to O is spike-phase/LFP coupling; connecting O to S is phase-amplitude coupling; connecting S to E is a system of frontotemporal traveling oscillations; connecting E to lower levels is low-frequency phase resetting of spike-LFP coupling. ROSE is reliant on neurophysiologically plausible mechanisms, is supported at all four levels by a range of recent empirical research, and provides an anatomically precise and falsifiable grounding for the basic property of natural language syntax: hierarchical, recursive structure-building.

LRDB: LSTM Raw data DNA Base-caller based on long-short term models in an active learning environment. (arXiv:2303.08915v1 [q-bio.GN]) arxiv.org/abs/2303.08915

The first important step in extracting DNA characters is using the output of MinION devices, which comes in the form of electrical current signals. Various cutting-edge base callers use these signals to detect the DNA characters. In this paper, we discuss several shortcomings of prior base callers in the case of time-critical applications, privacy-aware design, and the problem of catastrophic forgetting. Next, we propose the LRDB model, a lightweight open-source model for private development with a better read identity (0.35% increase) for the target bacterial samples in the paper. We have limited the extent of training data and benefited from a transfer learning algorithm to make active usage of LRDB viable in critical applications. Hence, less training time is needed to adapt to new DNA samples (in our case, bacterial samples). Furthermore, LRDB can be modified to match user constraints, as the results show a negligible accuracy loss when using fewer parameters. We have also assessed the noise-tolerance property, which shows about a 1.439% decline in accuracy for a 15 dB noise injection, and the performance metrics show that the model executes in a medium speed range compared with current cutting-edge models.
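
The abstract does not describe the network in detail; as a hedged, generic sketch of the model class it names (an LSTM over raw current signal), the PyTorch snippet below pairs a bidirectional LSTM with a CTC-style output head. All layer names and sizes are assumptions, not LRDB's actual configuration:

```python
# Generic LSTM basecaller sketch (not LRDB itself): a bidirectional LSTM reads
# windows of raw MinION current samples and a CTC head emits per-timestep
# probabilities over {blank, A, C, G, T}. Sizes are illustrative.
import torch
import torch.nn as nn

class LSTMBasecaller(nn.Module):
    def __init__(self, hidden: int = 128, n_layers: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden,
                            num_layers=n_layers, bidirectional=True,
                            batch_first=True)
        self.head = nn.Linear(2 * hidden, 5)   # blank + 4 bases

    def forward(self, signal: torch.Tensor) -> torch.Tensor:
        # signal: (batch, time, 1) normalized raw current
        out, _ = self.lstm(signal)
        return self.head(out).log_softmax(-1)  # (batch, time, 5), suitable for CTC loss

model = LSTMBasecaller()
logp = model(torch.randn(4, 400, 1))           # toy batch of 4 reads
# Training would pair these log-probabilities with nn.CTCLoss against known base
# sequences; transfer learning as in the abstract would fine-tune on a small new sample set.
```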

Folding@home: achievements from over twenty years of citizen science herald the exascale era. (arXiv:2303.08993v1 [q-bio.BM]) arxiv.org/abs/2303.08993

Simulations of biomolecules have enormous potential to inform our understanding of biology but require extremely demanding calculations. For over twenty years, the Folding@home distributed computing project has pioneered a massively parallel approach to biomolecular simulation, harnessing the resources of citizen scientists across the globe. Here, we summarize the scientific and technical advances this perspective has enabled. As the project's name implies, the early years of Folding@home focused on driving advances in our understanding of protein folding by developing statistical methods for capturing long-timescale processes and facilitating insight into complex dynamical processes. Success laid a foundation for broadening the scope of Folding@home to address other functionally relevant conformational changes, such as receptor signaling, enzyme dynamics, and ligand binding. Continued algorithmic advances, hardware developments such as GPU-based computing, and the growing scale of Folding@home have enabled the project to focus on new areas where massively parallel sampling can be impactful. While previous work sought to expand toward larger proteins with slower conformational changes, new work focuses on large-scale comparative studies of different protein sequences and chemical compounds to better understand biology and inform the development of small molecule drugs. Progress on these fronts enabled the community to pivot quickly in response to the COVID-19 pandemic, expanding to become the world's first exascale computer and deploying this massive resource to provide insight into the inner workings of the SARS-CoV-2 virus and aid the development of new antivirals. This success provides a glimpse of what's to come as exascale supercomputers come online, and Folding@home continues its work.

Machine Learning for Flow Cytometry Data Analysis. (arXiv:2303.09007v1 [cs.LG]) arxiv.org/abs/2303.09007

Flow cytometry is mainly used for detecting the characteristics of a number of biochemical substances based on the expression of specific markers in cells. It is particularly useful for detecting membrane surface receptors, antigens, ions, or DNA/RNA expression. Not only can it be employed as a biomedical research tool for recognising distinctive types of cells in mixed populations, but it can also be used as a diagnostic tool for classifying abnormal cell populations connected with disease. Modern flow cytometers can rapidly analyse tens of thousands of cells at the same time while also measuring multiple parameters from a single cell. However, the rapid development of flow cytometers makes it challenging for conventional analysis methods to interpret flow cytometry data. Researchers need to be able to distinguish interesting-looking cell populations manually in multi-dimensional data collected from millions of cells. Thus, it is essential to find a robust approach for analysing flow cytometry data automatically, specifically for identifying cell populations automatically. This thesis mainly concerns discovering the potential shortcomings of current automated-gating algorithms on both real and synthetic datasets. Three representative automated clustering algorithms are selected to be applied, compared and evaluated by completely and partially automated gating. A subspace clustering algorithm, ProClus, is also implemented in this thesis. The performance of ProClus on flow cytometry data is not good, but it is still a useful algorithm for detecting noise.
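
The three clustering algorithms compared in the thesis are not named in the abstract; as a hedged illustration of what "automated gating" means in practice, the snippet below clusters synthetic two-channel events with a Gaussian mixture (a stand-in, not one of the evaluated methods):

```python
# Automated gating illustrated as unsupervised clustering: events are rows,
# fluorescence channels are columns; each mixture component plays the role
# of one gated population. All data here are synthetic.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
pop_a = rng.normal([2.0, 5.0], 0.4, size=(5000, 2))   # population A
pop_b = rng.normal([6.0, 1.5], 0.6, size=(3000, 2))   # population B
noise = rng.uniform(0, 8, size=(500, 2))              # debris / noise events
events = np.vstack([pop_a, pop_b, noise])

gmm = GaussianMixture(n_components=3, random_state=0).fit(events)
labels = gmm.predict(events)          # automated "gates"
print(np.bincount(labels))            # number of events assigned to each cluster
```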

The measurement of bovine pericardium density and its implications on leaflet stress distribution in bioprosthetic heart valves. (arXiv:2303.09094v1 [q-bio.TO]) arxiv.org/abs/2303.09094

Purpose: Bioprosthetic Heart Valves (BHVs) are currently in widespread use with promising outcomes. Computational modeling provides a framework for quantitatively describing BHVs in the preclinical phase. To obtain reliable solutions in computational modeling, it is essential to consider accurate leaflet properties such as mechanical properties and density. Bovine pericardium (BP) is widely used as BHV leaflets. Previous computational studies assume BP density to be close to the density of water or blood. However, BP leaflets undergo multiple treatments such as fixation and anti-calcification. The present study aims to measure the density of the BP used in BHVs and determine its effect on leaflet stress distribution. Methods: We determined the density of eight square BP samples laser cut from Edwards BP patches. The weight of the specimens was measured using an A&D Analytical Balance, and volume was measured by high-resolution imaging. Finite element models of a BHV similar to PERIMOUNT Magna were developed in ABAQUS. Results: The average density value of the BP samples was 1410 kg/m3. In the acceleration phase of a cardiac cycle, the maximum stress value reached 1.89 MPa for a density value of 1410 kg/m3, and 2.47 MPa for a density of 1000 kg/m3 (30.7% difference). In the deceleration phase, the maximum stress value reached 713 and 669 kPa, respectively. Conclusion: Stress distribution and deformation of BHV leaflets are dependent upon the magnitude of density. Ascertaining an accurate value for the density of BHV leaflets is essential for computational models.
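
A quick sanity check of the quoted stress difference, using only the numbers stated in the abstract:

```python
# Values taken directly from the abstract above.
density_bp = 1410.0       # kg/m^3, measured bovine pericardium density
density_water = 1000.0    # kg/m^3, common modeling assumption
peak_stress_bp = 1.89     # MPa, acceleration phase, density 1410 kg/m^3
peak_stress_water = 2.47  # MPa, acceleration phase, density 1000 kg/m^3

rel_diff = (peak_stress_water - peak_stress_bp) / peak_stress_bp
print(f"relative stress difference: {rel_diff:.1%}")   # ~30.7%, matching the abstract
```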

Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer's Disease Detection. (arXiv:2303.08216v1 [eess.IV]) arxiv.org/abs/2303.08216

Neuroimaging of large populations is valuable to identify factors that promote or resist brain disease, and to assist diagnosis, subtyping, and prognosis. Data-driven models such as convolutional neural networks (CNNs) have increasingly been applied to brain images to perform diagnostic and prognostic tasks by learning robust features. Vision transformers (ViT) - a new class of deep learning architectures - have emerged in recent years as an alternative to CNNs for several computer vision applications. Here we tested variants of the ViT architecture on neuroimaging downstream tasks of varying difficulty, in this case sex and Alzheimer's disease (AD) classification, based on 3D brain MRI. In our experiments, two vision transformer architecture variants achieved an AUC of 0.987 for sex classification and 0.892 for AD classification, respectively. We independently evaluated our models on data from two benchmark AD datasets. We achieved a performance boost of 5% and 9-10% upon fine-tuning vision transformer models pre-trained on synthetic (generated by a latent diffusion model) and real MRI scans, respectively. Our main contributions include testing the effects of different ViT training strategies, including pre-training, data augmentation and learning rate warm-ups followed by annealing, as pertaining to the neuroimaging domain. These techniques are essential for training ViT-like models for neuroimaging applications where training data is usually limited. We also analyzed the effect of the amount of training data utilized on the test-time performance of the ViT via data-model scaling curves.
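
One of the training strategies mentioned above, a learning-rate warm-up followed by annealing, can be sketched directly with stock PyTorch schedulers; the step counts, learning rate and the cosine form of the annealing are illustrative assumptions, not the paper's settings:

```python
# Linear warm-up followed by cosine annealing, chained with SequentialLR.
import torch

model = torch.nn.Linear(16, 2)                       # stand-in for a ViT
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

warmup_steps, total_steps = 500, 10000
warmup = torch.optim.lr_scheduler.LinearLR(
    opt, start_factor=0.01, total_iters=warmup_steps)
anneal = torch.optim.lr_scheduler.CosineAnnealingLR(
    opt, T_max=total_steps - warmup_steps)
sched = torch.optim.lr_scheduler.SequentialLR(
    opt, schedulers=[warmup, anneal], milestones=[warmup_steps])

for step in range(total_steps):
    # ... forward/backward on a 3D-MRI batch would go here ...
    opt.step()        # placeholder step so the scheduler sketch runs end to end
    sched.step()
```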

Few-Shot Classification of Autism Spectrum Disorder using Site-Agnostic Meta-Learning and Brain MRI. (arXiv:2303.08224v1 [eess.IV]) arxiv.org/abs/2303.08224

For machine learning applications in medical imaging, the availability of training data is often limited, which hampers the design of radiological classifiers for subtle conditions such as autism spectrum disorder (ASD). Transfer learning is one method to counter this problem of low training data regimes. Here we explore the use of meta-learning for very low data regimes in the context of having prior data from multiple sites - an approach we term site-agnostic meta-learning. Inspired by the effectiveness of meta-learning for optimizing a model across multiple tasks, here we propose a framework to adapt it to learn across multiple sites. We tested our meta-learning model for classifying ASD versus typically developing controls in 2,201 T1-weighted (T1-w) MRI scans collected from 38 imaging sites as part of the Autism Brain Imaging Data Exchange (ABIDE) [age: 5.2-64.0 years]. The method was trained to find a good initialization state for our model that can quickly adapt to data from new unseen sites by fine-tuning on the limited data that is available. The proposed method achieved an ROC-AUC=0.857 on 370 scans from 7 unseen sites in ABIDE using a few-shot setting of 2-way 20-shot, i.e., 20 training samples per site. Our results outperformed a transfer learning baseline, as well as other related prior work, by generalizing across a wider range of sites. We also tested our model in a zero-shot setting on an independent test site without any additional fine-tuning. Our experiments show the promise of the proposed site-agnostic meta-learning framework for challenging neuroimaging tasks involving multi-site heterogeneity with limited availability of training data.
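
The abstract describes learning an initialization that adapts quickly to unseen sites via fine-tuning; a generic MAML-style inner/outer loop over per-site support and query sets is sketched below. This is an assumed, textbook formulation, not the authors' code, and the helper names, model and data are invented:

```python
# MAML-style meta-update over "sites": adapt a copy of the parameters on each
# site's support set, evaluate on its query set, and update the shared
# initialization from the summed query losses. Requires a recent PyTorch (torch.func).
import torch
import torch.nn as nn

def maml_step(model, site_batches, inner_lr=1e-2, outer_opt=None,
              loss_fn=nn.BCEWithLogitsLoss()):
    """site_batches: list of ((xs, ys), (xq, yq)) support/query pairs, one per site."""
    meta_loss = 0.0
    for (xs, ys), (xq, yq) in site_batches:
        # Inner loop: one gradient step on this site's support set.
        fast = {n: p.clone() for n, p in model.named_parameters()}
        support_loss = loss_fn(torch.func.functional_call(model, fast, (xs,)), ys)
        grads = torch.autograd.grad(support_loss, list(fast.values()), create_graph=True)
        fast = {n: p - inner_lr * g for (n, p), g in zip(fast.items(), grads)}
        # Outer loop: evaluate the adapted parameters on the site's query set.
        meta_loss = meta_loss + loss_fn(torch.func.functional_call(model, fast, (xq,)), yq)
    outer_opt.zero_grad()
    meta_loss.backward()
    outer_opt.step()
    return meta_loss.item()

# Toy usage: a linear model over made-up features; 3 "sites" per meta-batch.
model = nn.Linear(32, 1)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
def fake_site():
    xs, xq = torch.randn(20, 32), torch.randn(20, 32)
    ys, yq = torch.randint(0, 2, (20, 1)).float(), torch.randint(0, 2, (20, 1)).float()
    return (xs, ys), (xq, yq)
print(maml_step(model, [fake_site() for _ in range(3)], outer_opt=opt))
```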

Using birth-death processes to infer tumor subpopulation structure from live-cell imaging drug screening data. (arXiv:2303.08245v1 [q-bio.PE]) arxiv.org/abs/2303.08245

Tumor heterogeneity is a complex and widely recognized trait that poses significant challenges in developing effective cancer therapies. Characterizing the heterogeneous subpopulation structure within a tumor will enable a more precise and successful treatment strategy for cancer therapy. A possible strategy to uncover the structure of subpopulations is to examine how they respond differently to different drugs. For instance, PhenoPop was proposed as a means to unravel the subpopulation structure within a tumor from high-throughput drug screening data. However, the deterministic nature of PhenoPop restricts the model fit and the information it can extract from the data. As an advancement, we proposed a stochastic model based on the linear birth-death process to address this limitation. Our model captures a dynamic variance along the time horizon of the experiment, so that it uses more information from the data and provides a more robust estimation. In addition, the newly proposed model can be readily adapted to situations where the experimental data exhibit a positive time correlation. We concluded our study by testing our model on simulated data (in silico) and experimental data (in vitro), which supports our argument about its advantages.
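
As a hedged illustration of the stochastic model class named above (not the authors' estimator), the snippet below runs an exact Gillespie simulation of a linear birth-death process for two subpopulations whose death rates differ under drug; all rates and counts are invented:

```python
# Exact simulation of a linear birth-death process for a tumor made of two
# subpopulations (drug-sensitive vs. drug-resistant). Repeating the run shows
# how the variance of the total cell count evolves over the experiment.
import numpy as np

def simulate_birth_death(n0, birth, death, t_end, rng):
    """Return the population size of one subpopulation at time t_end."""
    n, t = n0, 0.0
    while n > 0:
        rate = n * (birth + death)
        t += rng.exponential(1.0 / rate)
        if t > t_end:
            break
        n += 1 if rng.random() < birth / (birth + death) else -1
    return n

rng = np.random.default_rng(1)
# Sensitive clone dies faster under drug than the resistant clone (made-up rates).
runs = [simulate_birth_death(500, 0.03, 0.05, 72.0, rng) +
        simulate_birth_death(500, 0.03, 0.01, 72.0, rng) for _ in range(200)]
print(np.mean(runs), np.var(runs))   # mean and (dynamic) variance of the total count
```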

Pesticide Mediated Critical Transition in Plant-Pollinator Networks. (arXiv:2303.08495v1 [q-bio.PE]) arxiv.org/abs/2303.08495

Mutually beneficial interactions between plants and pollinators play an essential role in biodiversity, the stability of the ecosystem and crop production. Despite their immense importance, rapid pollinator decline events have become common worldwide in recent decades. Excessive use of chemical pesticides is one of the most important threats to pollination in the current era of anthropogenic change. Pesticides are applied to plants to increase their growth by killing harmful pests, and pollinators accumulate toxic pesticides from the interacting plants directly through the nectar and pollen. This has a significant adverse effect on pollinator growth and on the mutualism, which in turn can cause an abrupt collapse of the community; however, predicting the fate of such community dynamics remains unclear under the alarming rise in dependence on chemical pesticides. We mathematically modeled the influence of pesticides on a multispecies mutualistic community and used 105 real plant-pollinator networks sampled worldwide, as well as simulated networks, to assess their detrimental effect on plant-pollinator mutualistic networks. Our results indicate that the persistence of the community is strongly influenced by the level of pesticide, and that catastrophic and irreversible community collapse may occur due to pesticides. Furthermore, a species-rich, highly nested community with low connectance and modularity has greater potential to function under the influence of pesticides. We finally propose a realistic intervention strategy that involves managing the pesticide level of one targeted plant from the community. We show that our intervention strategy can significantly delay the collapse of the community. Overall, our study can be considered a first attempt to understand the consequences of chemical pesticides on a plant-pollinator mutualistic community.
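
The paper's model equations are not given in the abstract; purely as an illustration of the modeling setup (mutualistic dynamics with an extra pesticide-induced mortality on pollinators), here is a minimal two-species sketch using a common logistic-plus-saturating-mutualism form with invented parameters:

```python
# Toy mutualistic community: one plant and one pollinator, with logistic growth,
# a saturating mutualistic benefit, and a pesticide-dependent extra mortality on
# the pollinator. Raising that mortality can drive the pollinator to collapse.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [1.0, 0.0]])        # plant-pollinator adjacency
r = np.array([0.3, 0.2])                       # intrinsic growth rates
s = np.array([1.0, 1.0])                       # self-limitation
gamma, h = 0.8, 0.2                            # mutualism strength, handling time
pesticide_mortality = np.array([0.0, 0.4])     # acts on the pollinator only

def rhs(t, x):
    m = A @ x
    mutualism = gamma * m / (1.0 + h * gamma * m)
    return x * (r - s * x + mutualism - pesticide_mortality)

sol = solve_ivp(rhs, (0.0, 200.0), [0.5, 0.5])
print(sol.y[:, -1])    # final abundances for this pesticide level
```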

Artificial Psychophysics questions Hue Cancellation Experiments. (arXiv:2303.08496v1 [q-bio.NC]) arxiv.org/abs/2303.08496

We show that in conventional hue cancellation experiments human-like opponent curves emerge even if the task is done by trivial (identity) artificial networks. Specifically, opponent spectral sensitivities always emerge as long as (i) the retina converts the input radiation into a tristimulus-like representation (any basis is equally valid), and (ii) the post-retinal network solves the standard hue cancellation task, e.g. it looks for the weights of the conventional cancelling lights so that every monochromatic stimulus plus the weighted cancelling lights matches a grey reference in the (arbitrary) color representation used by the network. In fact, the selection of the cancellation lights is key to obtaining human-like curves: results show that the classical choice of the cancellation lights is the one that leads to the best (most human-like) opponent result, and any other choices lead to progressively different spectral sensitivities. We show this in two different ways: through artificial hue cancellation experiments for a range of cancellation lights, and through a change-of-basis analogy of the experiments. These results suggest that the opponent curves of the standard hue cancellation experiment are just a by-product of the front-end photoreceptors and of a very specific experimental choice, but they do not inform us about the downstream color representation. In fact, the architecture of the post-retinal network (signal recombination or internal color space) seems irrelevant for the emergence of the curves in the classical experimental setting. This questions the conventional interpretation of the classical result of Jameson and Hurvich.
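
The hue cancellation task as described above reduces, in a linear tristimulus-like representation, to a small linear solve: find weights of the cancelling lights so that the test stimulus plus the weighted cancelling lights matches a grey reference. A minimal sketch with random stand-in vectors (not the experiment's actual primaries or spectra):

```python
# Hue cancellation as a least-squares problem: C @ w ~= grey - stimulus.
import numpy as np

rng = np.random.default_rng(0)
C = rng.random((3, 4))             # columns: tristimulus values of 4 cancelling lights
grey = np.array([0.5, 0.5, 0.5])   # grey reference in the same representation
stimulus = rng.random(3)           # one monochromatic test stimulus

w, *_ = np.linalg.lstsq(C, grey - stimulus, rcond=None)
print(w)   # repeating this across wavelengths traces the "opponent" weight curves
```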

Combined effects of STDP and homeostatic structural plasticity on coherence resonance. (arXiv:2303.08530v1 [q-bio.NC]) arxiv.org/abs/2303.08530

Efficient processing and transfer of information in neurons have been linked to noise-induced resonance phenomena such as coherence resonance (CR), and adaptive rules in neural networks have been mostly linked to two prevalent mechanisms: spike-timing-dependent plasticity (STDP) and homeostatic structural plasticity (HSP). Thus, this paper investigates CR in small-world and random adaptive networks of Hodgkin-Huxley neurons driven by STDP and HSP. Our numerical study indicates that the degree of CR strongly depends, and in different ways, on the adjusting rate parameter $P$, which controls STDP, on the characteristic rewiring frequency parameter $F$, which controls HSP, and on the parameters of the network topology. In particular, we found two robust behaviors: (i) Decreasing $P$ (which enhances the weakening effect of STDP on synaptic weights) and decreasing $F$ (which slows down the swapping rate of synapses between neurons) always leads to higher degrees of CR in small-world and random networks, provided that the synaptic time delay parameter $\tau_c$ has some appropriate values. (ii) Increasing the synaptic time delay $\tau_c$ induces multiple CR (MCR) -- the occurrence of multiple peaks in the degree of coherence as $\tau_c$ changes -- in small-world and random networks, with MCR becoming more pronounced at smaller values of $P$ and $F$. Our results imply that STDP and HSP can jointly play an essential role in enhancing the time precision of firing necessary for optimal information processing and transfer in neural systems and could thus have applications in designing networks of noisy artificial neural circuits engineered to use CR to optimize information processing and transfer.
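
For reference, the pair-based STDP rule in its textbook exponential form is sketched below; the abstract does not give the exact window used, and the amplitudes and time constants here are illustrative, as is any correspondence with the adjusting-rate parameter $P$:

```python
# Standard pair-based STDP: potentiate when the presynaptic spike precedes the
# postsynaptic spike, depress otherwise, with exponentially decaying windows.
import numpy as np

A_plus, A_minus = 0.01, 0.012      # potentiation / depression amplitudes (made up)
tau_plus, tau_minus = 20.0, 20.0   # time constants in ms (made up)

def stdp_dw(dt_ms: float) -> float:
    """Weight change for a pre/post spike pair; dt_ms = t_post - t_pre."""
    if dt_ms > 0:                                   # pre before post -> potentiate
        return A_plus * np.exp(-dt_ms / tau_plus)
    return -A_minus * np.exp(dt_ms / tau_minus)     # post before pre -> depress

print(stdp_dw(+5.0), stdp_dw(-5.0))
```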

EGFR mutation prediction using F18-FDG PET-CT based radiomics features in non-small cell lung cancer. (arXiv:2303.08569v1 [q-bio.QM]) arxiv.org/abs/2303.08569

Lung cancer is the leading cause of cancer death in the world. Accurate determination of EGFR (epidermal growth factor receptor) mutation status is highly relevant for the proper treatment of these patients. Purpose: The aim of this study was to predict the mutational status of EGFR in non-small cell lung cancer patients using radiomics features extracted from PET-CT images. Methods: This was a retrospective study involving 34 patients with lung cancer confirmed by histology and with EGFR mutation status assessment. A total of 2,205 radiomics features were extracted from manual segmentations of the PET-CT images using the pyradiomics library. Both computed tomography and positron emission tomography images were used. All images were acquired with intravenous iodinated contrast and F18-FDG. Preprocessing included resampling, normalization, and discretization of the pixel intensities. Three methods were used for the feature selection process: backward selection (set 1), forward selection (set 2), and feature importance analysis of a random forest model (set 3). Nine machine learning methods were used for radiomics model building. Results: 35.2% of patients had an EGFR mutation, without significant differences in age, gender, tumor size and SUVmax. After the feature selection process, 6, 7 and 17 radiomics features were selected, respectively, in each group. The best performances were obtained by Ridge Regression in set 1: AUC of 0.826 (95% CI, 0.811 - 0.839), Random Forest in set 2: AUC of 0.823 (95% CI, 0.808 - 0.838) and Neural Network in set 3: AUC of 0.821 (95% CI, 0.808 - 0.835). Conclusion: Radiomics feature analysis has the potential to predict clinically relevant mutations in lung cancer patients through a non-invasive methodology.
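
As a hedged sketch of the kind of pipeline described (forward feature selection followed by a ridge classifier, evaluated by cross-validated AUC), using scikit-learn on a random stand-in feature table rather than the study's pyradiomics features:

```python
# Forward feature selection + ridge classifier, scored by cross-validated ROC-AUC.
# The feature matrix and labels are random placeholders; all settings are illustrative.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(34, 200))        # 34 patients x (subset of) radiomics features
y = rng.integers(0, 2, size=34)       # stand-in EGFR mutation labels

clf = make_pipeline(
    StandardScaler(),
    SequentialFeatureSelector(RidgeClassifier(), n_features_to_select=7,
                              direction="forward", cv=3),
    RidgeClassifier(),
)
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(auc.mean())
```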
