LRBmat: A Novel Gut Microbial Interaction and Individual Heterogeneity Inference Method for Colorectal Cancer. (arXiv:2303.07498v1 [q-bio.QM]) arxiv.org/abs/2303.07498

Many diseases are considered to be closely related to changes in the gut microbial community, including colorectal cancer (CRC), which is one of the most common cancers in the world. The diagnostic classification and etiological analysis of CRC are two critical issues worthy of attention. Many methods use gut microbiota to address them, but few simultaneously take into account the complex interactions and individual heterogeneity of gut microbiota, which are two common and important issues in genetics and intestinal microbiology, especially in high-dimensional cases. In this paper, a novel method with a Binary matrix based on Logistic Regression (LRBmat) is proposed to deal with the above problem. The binary matrix can directly weaken or avoid the influence of heterogeneity, and also contains information about gut microbial interactions of any order. Moreover, LRBmat generalizes well: it can be combined with any machine learning method to enhance it. The real data analysis on CRC validates the proposed method, which has the best classification performance compared with the state-of-the-art. Furthermore, the association rules extracted from the binary matrix of the real data align well with the biological properties and existing literature, which is helpful for the etiological analysis of CRC. The source code for LRBmat is available at https://github.com/tsnm1/LRBmat.

SuperMask: Generating High-resolution object masks from multi-view, unaligned low-resolution MRIs. (arXiv:2303.07517v1 [eess.IV]) arxiv.org/abs/2303.07517

Three-dimensional segmentation in magnetic resonance images (MRI), which reflects the true shape of the objects, is challenging since high-resolution isotropic MRIs are rare and typical MRIs are anisotropic, with the out-of-plane dimension having a much lower resolution. A potential remedy to this issue lies in the fact that often multiple sequences are acquired on different planes. However, in practice, these sequences are not orthogonal to each other, limiting the applicability of many previous solutions to reconstruct higher-resolution images from multiple lower-resolution ones. We propose a weakly-supervised deep learning-based solution to generating high-resolution masks from multiple low-resolution images. Our method combines segmentation and unsupervised registration networks by introducing two new regularizations to make registration and segmentation reinforce each other. Finally, we introduce a multi-view fusion method to generate high-resolution target object masks. The experimental results on two datasets show the superiority of our methods. Importantly, the advantage of not using high-resolution images in the training process makes our method applicable to a wide variety of MRI segmentation tasks.

Fractional dynamics foster deep learning of COPD stage prediction. (arXiv:2303.07537v1 [cs.LG]) arxiv.org/abs/2303.07537

Chronic obstructive pulmonary disease (COPD) is one of the leading causes of death worldwide. Current COPD diagnosis (i.e., spirometry) can be unreliable because the test depends on adequate effort from both the tester and the testee. Moreover, the early diagnosis of COPD is challenging. We address COPD detection by constructing two novel physiological signal datasets (4432 records from 54 patients in the WestRo COPD dataset and 13824 medical records from 534 patients in the WestRo Porti COPD dataset). We demonstrate their complex coupled fractal dynamical characteristics and perform a fractional-order dynamics deep learning analysis to diagnose COPD. We find that fractional-order dynamical modeling can extract distinguishing signatures from the physiological signals across patients with all COPD stages, from stage 0 (healthy) to stage 4 (very severe). We use the fractional signatures to develop and train a deep neural network that predicts COPD stages based on the input features (such as thorax breathing effort, respiratory rate, or oxygen saturation). We show that the fractional dynamic deep learning model (FDDLM) achieves a COPD prediction accuracy of 98.66% and can serve as a robust alternative to spirometry. The FDDLM also has high accuracy when validated on a dataset with different physiological signals.

Tensor-based Multimodal Learning for Prediction of Pulmonary Arterial Wedge Pressure from Cardiac MRI. (arXiv:2303.07540v1 [cs.LG]) arxiv.org/abs/2303.07540

Heart failure is a serious and life-threatening condition that can lead to elevated pressure in the left ventricle. Pulmonary Arterial Wedge Pressure (PAWP) is an important surrogate marker indicating high pressure in the left ventricle. PAWP is determined by Right Heart Catheterization (RHC), but this is an invasive procedure. A non-invasive method is useful for quickly identifying high-risk patients in a large population. In this work, we develop a tensor learning-based pipeline for identifying PAWP from multimodal cardiac Magnetic Resonance Imaging (MRI). This pipeline extracts spatial and temporal features from high-dimensional scans. For quality control, we incorporate an epistemic uncertainty-based binning strategy to identify poor-quality training samples. To improve performance, we learn complementary information by integrating features from multimodal data: cardiac MRI with short-axis and four-chamber views, and Electronic Health Records. The experimental analysis on a large cohort of $1346$ subjects who underwent the RHC procedure for PAWP estimation indicates that the proposed pipeline has diagnostic value and can produce promising performance with significant improvement over the baseline in clinical practice (i.e., $Δ$AUC $=0.10$, $Δ$Accuracy $=0.06$, and $Δ$MCC $=0.39$). The decision curve analysis further confirms the clinical utility of our method.

Recovering Arrhythmic EEG Transients from Their Stochastic Interference. (arXiv:2303.07683v1 [q-bio.NC]) arxiv.org/abs/2303.07683

Traditionally, the neuronal dynamics underlying electroencephalograms (EEG) have been understood as arising from rhythmic oscillators with varying degrees of synchronization. This dominant metaphor employs frequency domain EEG analysis to identify the most prominent populations of neuronal current sources in terms of their frequency and spectral power. However, emerging perspectives on EEG highlight its arrhythmic nature, which is primarily inferred from broadband EEG properties like the ubiquitous $1/f$ spectrum. In the present study, we use an arrhythmic superposition of pulses as a metaphor to explain the origin of EEG. This conceptualization has a fundamental problem because the interference produced by the superpositions of pulses generates colored Gaussian noise, masking the temporal profile of the generating pulse. We solved this problem by developing a mathematical method involving the derivative of the autocovariance function to recover excellent approximations of the underlying pulses, significantly extending the analysis of this type of stochastic process. When the method is applied to spontaneous mouse EEG sampled at $5$ kHz during the sleep-wake cycle, specific patterns -- called $Ψ$-patterns -- characterizing NREM sleep, REM sleep, and wakefulness are revealed. $Ψ$-patterns can be understood theoretically as power density in the time domain and correspond to combinations of generating pulses at different time scales. Remarkably, we report the first EEG wakefulness-specific feature, which corresponds to an ultra-fast ($\sim 1$ ms) transient component of the observed patterns. By shifting the paradigm of EEG genesis from oscillators to random pulse generators, our theoretical framework pushes the boundaries of traditional Fourier-based EEG analysis, paving the way for new insights into the arrhythmic components of neural dynamics.

Resource saving taxonomy classification with k-mer distributions and machine learning. (arXiv:2303.06154v1 [q-bio.GN]) arxiv.org/abs/2303.06154

Modern high throughput sequencing technologies like metagenomic sequencing generate millions of sequences which have to be classified based on their taxonomic rank. Modern approaches either apply local alignment and comparison to existing data sets like MMseqs2 or use deep neural networks as it is done in DeepMicrobes and BERTax. Alignment-based approaches are costly in terms of runtime, especially since databases get larger and larger. For the deep learning-based approaches, specialized hardware is necessary for a computation, which consumes large amounts of energy. In this paper, we propose to use $k$-mer distributions obtained from DNA as features to classify its taxonomic origin using machine learning approaches like the subspace $k$-nearest neighbors algorithm, neural networks or bagged decision trees. In addition, we propose a feature space data set balancing approach, which allows reducing the data set for training and improves the performance of the classifiers. By comparing performance, time, and memory consumption of our approach to those of state-of-the-art algorithms (BERTax and MMseqs2) using several datasets, we show that our approach improves the classification on the genus level and achieves comparable results for the superkingdom and phylum level. Link: https://es-cloud.cs.uni-tuebingen.de/d/8e2ab8c3fdd444e1a135/?p=%2FTaxonomyClassification&mode=list
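To make the feature idea concrete, here is a minimal sketch of turning a DNA sequence into a $k$-mer distribution that a classifier could consume. It is an illustration only, not the authors' implementation: the function name `kmer_distribution`, the default `k`, and the choice to skip windows containing non-ACGT characters are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_distribution(sequence: str, k: int = 3) -> dict[str, float]:
    """Return the relative frequency of every possible DNA k-mer.

    Hypothetical helper for illustration. Windows containing characters
    other than A, C, G, T (e.g. N) are skipped, which is an assumption.
    """
    alphabet = "ACGT"
    sequence = sequence.upper()
    # Count every overlapping window of length k that is made of valid bases.
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= set(alphabet)
    )
    total = sum(counts.values())
    # Emit all 4**k k-mers in a fixed order so every sequence maps
    # to a feature vector of the same length.
    all_kmers = ("".join(p) for p in product(alphabet, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


features = kmer_distribution("ACGTACGTAC", k=2)
print(len(features))               # number of features: 4**2
print(round(features["AC"], 3))    # relative frequency of the 2-mer "AC"
```

The values of such a dictionary, taken in a fixed key order, form a vector of length 4^k per sequence, which is the kind of input that nearest-neighbor, tree-based, or neural classifiers expect.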

Molecular characterization of wild Pleurotus ostreatus (MW457626) and evaluation of $\beta$-glucans polysaccharide activities. (arXiv:2303.06187v1 [q-bio.GN]) arxiv.org/abs/2303.06187

Pleurotus ostreatus is a commonly cultivated edible mushroom worldwide. The fruiting bodies of P. ostreatus are a rich source of $β$-glucans polysaccharide. The current study aimed to investigate the effectiveness of $β$-glucans, a natural polysaccharide produced by P. ostreatus, as an antioxidant, antimicrobial, and anticancer agent. The molecular identification of the P. ostreatus isolate was confirmed by its Internal Transcribed Spacer (ITS) sequence. The sequence alignment and phylogenetic analysis of the studied ITS sequence were performed against deposited sequences in GenBank. High-performance liquid chromatography (HPLC) and Fourier transform infrared spectroscopy (FTIR) confirmed the presence of $β$-glucans polysaccharide in the tested samples. The percentage of antioxidant activity of $β$-glucans showed a gradual increase from 8.59% to 12.36%, 18.56%, 23.69%, 44.66%, and 80.36% at concentrations of 31.2, 64.4, 125, 250, 500, and 800 $μ$g/ml, respectively. In addition, all concentrations of $β$-glucans showed higher antioxidant activities when compared with the standard antioxidant (Vitamin C). The highest antimicrobial activity of $β$-glucans polysaccharide was against P. aeruginosa, with a zone of inhibition of 45 mm, while the lowest activity was against S. aureus (13 mm), both at 100 mg/mL. The percentages of growth inhibition of the MCF-7 human breast cancer cell line and the normal WRL-68 cell line by $β$-glucans were determined by the 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay.

A high yield method for protoplast isolation and ease detection of rol B and C genes in the hairy roots of cauliflower (Brassica oleracea L.) inoculated with Agrobacterium rhizogenes. (arXiv:2303.06194v1 [q-bio.SC]) arxiv.org/abs/2303.06194

Protoplasts represent a unique experimental system for the circulation and formation of genetically modified plants. Here, protoplasts were isolated from genetically modified hairy root tissues of Brassica oleracea L. induced by the Agrobacterium rhizogenes strain (ATCC13332). The enzyme solution used for protoplast isolation contained 1.5% Cellulase YC and 0.1% Pectolyase Y23 in 13% mannitol solution, which gave highly efficient isolation within 8 hours, with a protoplast yield of 2 × 10⁴ cells ml⁻¹ and a viability of 72%. Each protoplast has one nucleus with a nucleation of 48%. A polymerase chain reaction (PCR) assay verified the presence of rol B and rol C genes in hairy root tissues by detaching a single bundle of DNA replication from these roots using a specific pair of primers. The current study demonstrated that the A. rhizogenes strain (ATCC13332) is a vector for the incorporation of T-DNA genes into cauliflower plants, and that the hairy roots successfully retain the rol B and rol C genes transferred to them.

Enhancing Protein Language Models with Structure-based Encoder and Pre-training. (arXiv:2303.06275v1 [q-bio.QM]) arxiv.org/abs/2303.06275

Protein language models (PLMs) pre-trained on large-scale protein sequence corpora have achieved impressive performance on various downstream protein understanding tasks. Despite the ability to implicitly capture inter-residue contact information, transformer-based PLMs cannot encode protein structures explicitly for better structure-aware protein representations. Besides, the power of pre-training on available protein structures has not been explored for improving these PLMs, though structures are important in determining functions. To tackle these limitations, in this work, we enhance PLMs with a structure-based encoder and pre-training. We first explore feasible model architectures to combine the advantages of a state-of-the-art PLM (i.e., ESM-1b) and a state-of-the-art protein structure encoder (i.e., GearNet). We empirically verify that ESM-GearNet, which connects the two encoders in series, is the most effective combination model. To further improve the effectiveness of ESM-GearNet, we pre-train it on massive unlabeled protein structures with contrastive learning, which aligns representations of co-occurring subsequences so as to capture their biological correlation. Extensive experiments on EC and GO protein function prediction benchmarks demonstrate the superiority of ESM-GearNet over previous PLMs and structure encoders, and clear performance gains are further achieved by structure-based pre-training upon ESM-GearNet. Our implementation is available at https://github.com/DeepGraphLearning/GearNet.

Intelligent diagnostic scheme for lung cancer screening with Raman spectra data by tensor network machine learning. (arXiv:2303.06340v1 [q-bio.QM]) arxiv.org/abs/2303.06340

Artificial intelligence (AI) has had a tremendous impact on the biomedical sciences, from academic research to clinical applications such as biomarker detection and diagnosis, optimization of treatment, and identification of new therapeutic targets in drug discovery. However, contemporary AI technologies, particularly deep machine learning (ML), suffer severely from non-interpretability, which might uncontrollably lead to incorrect predictions. Interpretability is particularly crucial to ML for clinical diagnosis, as the consumers must gain the necessary sense of security and trust from firm grounds or convincing interpretations. In this work, we propose a tensor-network (TN)-ML method to reliably predict lung cancer patients and their stages by screening Raman spectra data of volatile organic compounds (VOCs) in exhaled breath, which are generally suitable as biomarkers and are considered an ideal route to non-invasive lung cancer screening. The prediction of TN-ML is based on the mutual distances of the breath samples mapped to the quantum Hilbert space. Thanks to the quantum probabilistic interpretation, the certainty of the predictions can be quantitatively characterized. The accuracy on the samples with high certainty is almost 100%. The incorrectly classified samples exhibit obviously lower certainty, and thus can be identified as anomalies, which will be handled by human experts to guarantee high reliability. Our work sheds light on shifting "AI for biomedical sciences" from the conventional non-interpretable ML schemes to interpretable human-ML interactive approaches, for the purpose of high accuracy and reliability.

A PDMP to model the stochastic influence of quiescence dynamics in blood cancers. (arXiv:2303.06412v1 [math.PR]) arxiv.org/abs/2303.06412

In this article, we present a new approach to study the impact of a small microscopic population of cancer cells on a macroscopic population of healthy cells, with an example inspired by pathological hematopoiesis. Hematopoiesis is the biological phenomenon of blood cell production by differentiation of cells called hematopoietic stem cells (HSCs). We study the dynamics of a stochastic $4$-dimensional process describing the evolution over time of the number of healthy and cancer stem cells and the number of healthy and mutant red blood cells. The model takes into account the amplification between stem cells and red blood cells as well as the regulation of this amplification as a function of the number of red blood cells (healthy and mutant). A single cancer HSC is considered while the other populations are in large numbers. We assume that the unique cancer HSC randomly switches between an active and a quiescent state. We show the convergence in law of this process towards a piecewise deterministic Markov process (PDMP) when the population size goes to infinity. We then study the long time behaviour of this limit process. We show the existence and uniqueness of an invariant probability measure, absolutely continuous with respect to the Lebesgue measure, for the limit PDMP obtained previously. We describe the support of the invariant probability and show that the process converges in total variation towards it, using theory developed by M. Benaim et al. We finally identify the invariant probability using its infinitesimal generator. Thanks to this probabilistic approach, we obtain a stationary system of partial differential equations describing the impact of cancer HSC quiescent phases and regulation on the cell density of the hematopoietic system studied.

Semi-Quantitative Analysis and Serological Evidence of Hepatitis A Virus IgG Antibody among children in Rumuewhor, Emuoha, Rivers State, Nigeria. (arXiv:2303.06503v1 [q-bio.OT]) arxiv.org/abs/2303.06503

Hepatitis A virus (HAV) infection has been greatly reduced in most developed countries through the use of vaccines and improved hygienic conditions. However, the magnitude of the problem is underestimated and there are no well-established Hepatitis A virus prevention and control strategies in Nigeria. The aim of this study was to determine the prevalence of Hepatitis A virus infection among children aged 2 to 9 years in Rumuewhor, Emuoha LGA, Rivers State, Nigeria. Blood samples were collected from the 89 children enrolled in this study and analyzed for the presence of HAV IgG antibodies using ELISA techniques. Of the 89 participants, 22 (24.7%) tested positive for HAV IgG antibodies, while 67 (75.3%) were negative. Children aged 4 to 6 years had the highest seropositivity rate (33.3%), while those less than 4 years old had the lowest (22.4%). The prevalence rate ratio of males to females was 1:1.3. There was no significant difference (p between IgG seropositivity and age groups and gender. However, there was a statistical association of IgG seropositivity rates with respect to immunization. The seroprevalence rate recorded in this study was significant, indicating that the virus is endemic in this study area. Proper awareness, health education and vaccination are imperative to controlling and preventing HAV infection in Rumuewhor, Emuoha, Rivers State, Nigeria.
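The headline percentages follow directly from the reported counts. As a small sanity check, the sketch below recomputes them; the helper name `percent` is hypothetical, and only the counts 22 and 89 come from the abstract.

```python
def percent(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


total_children = 89      # enrolled participants (from the abstract)
seropositive = 22        # HAV IgG positive (from the abstract)
seronegative = total_children - seropositive

print(seronegative)                              # 67
print(percent(seropositive, total_children))     # 24.7
print(percent(seronegative, total_children))     # 75.3
```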
