SiCmiR Atlas: Single-Cell miRNA Landscapes Reveals Hub-miRNA and Network Signatures in Human Cancers arxiv.org/abs/2508.05692

MicroRNAs are pivotal post-transcriptional regulators whose single-cell behavior has remained largely inaccessible owing to technical barriers in single-cell small-RNA profiling. We present SiCmiR, a two-layer neural network that predicts miRNA expression profiles from only 977 LINCS L1000 landmark genes, reducing sensitivity to dropout in single-cell RNA-seq data. Proof-of-concept analyses illustrate how SiCmiR can uncover candidate hub-miRNAs in bulk-seq cell lines and hepatocellular carcinoma, in scRNA-seq pancreatic ductal carcinoma and ACTH-secreting pituitary adenoma, and in extracellular-vesicle-mediated crosstalk in glioblastoma. Trained on 6462 TCGA paired miRNA-mRNA samples, SiCmiR attains state-of-the-art accuracy on held-out cancers and generalizes to unseen cancer types, drug perturbations and scRNA-seq. We next constructed SiCmiR Atlas, containing 632 public datasets, 9.36 million cells and 726 cell types, which is the first dedicated database of single-cell mature miRNA expression, providing interactive visualization, biomarker identification and cell-type-resolved miRNA-target networks. SiCmiR transforms bulk-derived statistical power into a single-cell view of miRNA biology and provides a community resource, SiCmiR Atlas, for biomarker discovery. SiCmiR Atlas is available at https://awi.cuhk.edu.cn/~SiCmiR/.

arXiv.org

Designing de novo TIM Barrels: Insights into Stabilization, Diversification, and Functionalization Strategies arxiv.org/abs/2508.05699

The TIM-barrel fold is one of the most versatile and ubiquitous protein folds in nature, hosting a wide variety of catalytic activities and functions while serving as a model system in protein biochemistry and engineering. This review explores its role as a key fold model in protein design, particularly in addressing challenges in stabilization and functionalization. We discuss historical and recent advances in de novo TIM barrel design, from the landmark creation of sTIM11 to the development of diversified variants, with a special focus on deepening our understanding of the determinants that modulate the sequence-structure-function relationships of this architecture. We also examine why the diversification of de novo TIM barrels towards functionalization remains a major challenge, given the absence of natural-like active site features. Current approaches have focused on incorporating structural extensions, modifying loops, and using cutting-edge AI-based strategies to create scaffolds with tailored characteristics. Despite significant advances, achieving enzymatically active de novo TIM barrels has proven difficult, with only recent breakthroughs demonstrating functionalized designs. We discuss the limitations of stepwise functionalization approaches and support an integrated approach that simultaneously optimizes scaffold structure and active site shape, using both physics-based and AI-driven methods. By combining computational and experimental insights, we highlight the TIM barrel as a powerful template for custom enzyme design and as a model system to explore the intersection of protein biochemistry, biophysics, and design.

A Physiologically-Constrained Neural Network Digital Twin Framework for Replicating Glucose Dynamics in Type 1 Diabetes arxiv.org/abs/2508.05705

Simulating glucose dynamics in individuals with type 1 diabetes (T1D) is critical for developing personalized treatments and supporting data-driven clinical decisions. Existing models often miss key physiological aspects and are difficult to individualize. Here, we introduce physiologically-constrained neural network (NN) digital twins to simulate glucose dynamics in T1D. To ensure interpretability and physiological consistency, we first build a population-level NN state-space model aligned with a set of ordinary differential equations (ODEs) describing glucose regulation. This model is formally verified to conform to known T1D dynamics. Digital twins are then created by augmenting the population model with individual-specific models, which include personal data, such as glucose management and contextual information, capturing both inter- and intra-individual variability. We validate our approach using real-world data from the T1D Exercise Initiative study. Two weeks of data per participant were split into 5-hour sequences and simulated glucose profiles were compared to observed ones. Clinically relevant outcomes were used to assess similarity via paired equivalence t-tests with predefined clinical equivalence margins. Across 394 digital twins, glucose outcomes were equivalent between simulated and observed data: time in range (70-180 mg/dL) was 75.1$\pm$21.2% (simulated) vs. 74.4$\pm$15.4% (real; P<0.001); time below range (<70 mg/dL) 2.5$\pm$5.2% vs. 3.0$\pm$3.3% (P=0.022); and time above range (>180 mg/dL) 22.4$\pm$22.0% vs. 22.6$\pm$15.9% (P<0.001). Our framework can incorporate unmodeled factors like sleep and activity while preserving key dynamics. This approach enables personalized in silico testing of treatments, supports insulin optimization, and integrates physics-based and data-driven modeling. Code: https://github.com/mosqueralopez/T1DSim_AI
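
The glucose outcomes quoted above (time in, below, and above range) can be illustrated with a minimal Python sketch. It assumes equally spaced glucose readings in mg/dL, so that the fraction of readings approximates the fraction of time; the function name `time_in_ranges` and the returned keys are hypothetical and are not taken from the authors' code.

```python
from typing import Dict, Sequence


def time_in_ranges(glucose_mg_dl: Sequence[float],
                   low: float = 70.0,
                   high: float = 180.0) -> Dict[str, float]:
    """Percent of readings below, inside, and above the [low, high] range.

    Assumes readings are equally spaced in time (an assumption of this sketch).
    """
    n = len(glucose_mg_dl)
    if n == 0:
        raise ValueError("at least one glucose reading is required")
    below = sum(1 for g in glucose_mg_dl if g < low)
    above = sum(1 for g in glucose_mg_dl if g > high)
    in_range = n - below - above  # readings with low <= g <= high
    return {
        "time_below_range": 100.0 * below / n,
        "time_in_range": 100.0 * in_range / n,
        "time_above_range": 100.0 * above / n,
    }
```

Applied to both a simulated and an observed 5-hour sequence, such a function would give the paired outcome values that an equivalence test compares.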

Progress and new challenges in image-based profiling arxiv.org/abs/2508.05800

For over two decades, image-based profiling has revolutionized cellular phenotype analysis. Image-based profiling processes rich, high-throughput, microscopy data into unbiased measurements that reveal phenotypic patterns powerful for drug discovery, functional genomics, and cell state classification. Here, we review the evolving computational landscape of image-based profiling, detailing current procedures, discussing limitations, and highlighting future development directions. Deep learning has fundamentally reshaped image-based profiling, improving feature extraction, scalability, and multimodal data integration. Methodological advancements such as single-cell analysis and batch effect correction, drawing inspiration from single-cell transcriptomics, have enhanced analytical precision. The growth of open-source software ecosystems and the development of community-driven standards have further democratized access to image-based profiling, fostering reproducibility and collaboration across research groups. Despite these advancements, the field still faces significant challenges requiring innovative solutions. By focusing on the technical evolution of image-based profiling rather than the wide-ranging biological applications, our aim with this review is to provide researchers with a roadmap for navigating the progress and new challenges in this rapidly advancing domain.

Optimal trap cropping investments to maximize agricultural yield arxiv.org/abs/2508.05896

Trap cropping is a pest management strategy where a grower plants an attractive "trap crop" alongside the primary crop to divert pests away from it. We propose a simple framework for optimizing the proportion of a grower's field or greenhouse allocated to a main crop and a trap crop to maximize agricultural yield. We implement this framework using a model of pest movement governed by trap crop attractiveness, the potential yield threatened by pests, and functional relationships between yield loss and pest density drawn from the literature. Focusing on a simple case in which pests move freely across the field and are attracted to traps solely by their relative attractiveness, we find that allocating 5-20 percent of the landscape to trap plants is typically required to maximize yield and achieve effective pest control in the absence of pesticides. For highly attractive trap plants, growers can devote less space because they are more effective; less attractive plants are ineffective even in large numbers. Intermediate attractiveness warrants the greatest investment in trap cropping. Our framework offers a transparent and tractable approach for exploring trade-offs in pest management and can be extended to incorporate more complex pest behaviors, crop spatial configurations, and economic considerations.
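
The kind of trade-off described above can be sketched with a toy model. Everything in it is an assumption made for illustration, not the authors' model: pests split between main and trap crop in proportion to area weighted by relative attractiveness, yield loss saturates with pest density through a hypothetical half-loss parameter, and the optimum is found by a plain grid search.

```python
def main_crop_pest_density(trap_fraction, attractiveness, pests_per_area):
    """Pest density on the main crop when pests settle in proportion to
    area times relative attractiveness (trap attractiveness vs. main crop = 1)."""
    weight_main = 1.0 - trap_fraction
    weight_trap = attractiveness * trap_fraction
    return pests_per_area / (weight_main + weight_trap)


def relative_yield(trap_fraction, attractiveness, pests_per_area, half_loss_density):
    """Yield relative to a pest-free field fully planted with the main crop.
    The saturating loss curve d / (d + half_loss_density) is a hypothetical choice."""
    density = main_crop_pest_density(trap_fraction, attractiveness, pests_per_area)
    loss = density / (density + half_loss_density)
    return (1.0 - trap_fraction) * (1.0 - loss)


def best_trap_fraction(attractiveness, pests_per_area=5.0,
                       half_loss_density=1.0, grid=200):
    """Grid search over trap fractions in [0, 1) for the highest relative yield."""
    candidates = [i / grid for i in range(grid)]
    return max(candidates,
               key=lambda p: relative_yield(p, attractiveness,
                                            pests_per_area, half_loss_density))
```

In this toy version a trap plant that is no more attractive than the main crop never pays for the land it occupies, while a strongly attractive one yields an interior optimum; the specific percentages reported in the abstract depend on the authors' own functional forms.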

Diverse Neural Sequences in QIF Networks: An Analytically Tractable Framework for Synfire Chains and Hippocampal Replay arxiv.org/abs/2508.06085

Sequential neural activity is fundamental to cognition, yet how diverse sequences are recalled under biological constraints remains a key question. Existing models often struggle to balance biophysical realism and analytical tractability. We address this problem by proposing a parsimonious network of Quadratic Integrate-and-Fire (QIF) neurons with sequences embedded via a temporally asymmetric Hebbian (TAH) rule. Our findings demonstrate that this single framework robustly reproduces a spectrum of sequential activities, including persistent synfire-like chains and transient, hippocampal replay-like bursts exhibiting intra-ripple frequency accommodation (IFA), all achieved without requiring specialized delay or adaptation mechanisms. Crucially, we derive exact low-dimensional firing-rate equations (FREs) that provide mechanistic insight, elucidating the bifurcation structure governing these distinct dynamical regimes and explaining their stability. The model also exhibits strong robustness to synaptic heterogeneity and memory pattern overlap. These results establish QIF networks with TAH connectivity as an analytically tractable and biologically plausible platform for investigating the emergence, stability, and diversity of sequential neural activity in the brain.
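
For readers unfamiliar with the QIF model, the following sketch simulates a single quadratic integrate-and-fire neuron, dV/dt = V^2 + I, with a finite spike threshold and reset. It is only an illustration of the neuron model, not the network or the firing-rate equations of the paper; the parameter values, the forward-Euler scheme, and the function names are assumptions.

```python
import math


def simulate_qif(current, v_reset=-100.0, v_peak=100.0, dt=1e-4, t_max=20.0):
    """Forward-Euler integration of dV/dt = V**2 + current.

    The neuron is reset to v_reset whenever V reaches v_peak.
    Returns the list of spike times.
    """
    v = v_reset
    spikes = []
    for step in range(int(t_max / dt)):
        v += dt * (v * v + current)
        if v >= v_peak:
            spikes.append((step + 1) * dt)
            v = v_reset
    return spikes


def analytic_period(current, v_reset=-100.0, v_peak=100.0):
    """Exact inter-spike interval for constant current > 0, obtained by
    integrating dV / (V**2 + current) from v_reset to v_peak."""
    root = math.sqrt(current)
    return (math.atan(v_peak / root) - math.atan(v_reset / root)) / root
```

As the threshold and reset go to plus and minus infinity, the analytic period tends to pi / sqrt(I), which is the limit usually taken when deriving low-dimensional firing-rate descriptions of QIF populations.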

Ensemble-Based Graph Representation of fMRI Data for Cognitive Brain State Classification arxiv.org/abs/2508.06118

Understanding and classifying human cognitive brain states based on neuroimaging data remains one of the foremost and most challenging problems in neuroscience, owing to the high dimensionality and intrinsic noise of the signals. In this work, we propose an ensemble-based graph representation method of functional magnetic resonance imaging (fMRI) data for the task of binary brain-state classification. Our method builds the graph by leveraging multiple base machine-learning models: each edge weight reflects the difference in posterior probabilities between two cognitive states, yielding values in the range [-1, 1] that encode confidence in a given state. We applied this approach to seven cognitive tasks from the Human Connectome Project (HCP 1200 Subject Release), including working memory, gambling, motor activity, language, social cognition, relational processing, and emotion processing. Using only the mean incident edge weights of the graphs as features, a simple logistic-regression classifier achieved average accuracies from 97.07% to 99.74%. We also compared our ensemble graphs with classical correlation-based graphs in a classification task with a graph neural network (GNN). In all experiments, the highest classification accuracy was obtained with ensemble graphs. These results demonstrate that ensemble graphs convey richer topological information and enhance brain-state discrimination. Our approach preserves edge-level interpretability of the fMRI graph representation, is adaptable to multiclass and regression tasks, and can be extended to other neuroimaging modalities and pathological-state classification.
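
The two ingredients named above, edge weights as a difference of posterior probabilities and node features as mean incident edge weights, can be sketched as follows. The sketch assumes the per-edge posteriors are already available as symmetric matrices (how the base models produce them is not specified here), and the function names are hypothetical.

```python
def edge_weights(post_a, post_b):
    """Edge weight = P(state A) - P(state B) for each region pair, in [-1, 1].

    post_a and post_b are symmetric n x n lists of posterior probabilities;
    the diagonal is ignored and set to 0.0.
    """
    n = len(post_a)
    return [[post_a[i][j] - post_b[i][j] if i != j else 0.0
             for j in range(n)]
            for i in range(n)]


def mean_incident_weights(weights):
    """One feature per node: the mean weight of the edges touching that node."""
    n = len(weights)
    return [sum(weights[i][j] for j in range(n) if j != i) / (n - 1)
            for i in range(n)]
```

The resulting feature vector has one entry per brain region and could be passed to any standard classifier, such as the logistic regression mentioned in the abstract.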

The Role of Arteriovenous Graft Curvature in Haemodynamics: an Image-Based Approach arxiv.org/abs/2508.06148

Vascular access, such as arteriovenous grafts, is crucial for patients undergoing haemodialysis as part of kidney replacement therapy. One of the primary causes of arteriovenous graft failure and loss of patency is disordered blood flow, as the vein is exposed to the arterial environment with high flow rates and shear stress. We hypothesize that secondary flow downstream of the vein-graft anastomosis plays a critical role in generating low shear regions, thereby promoting neointima hyperplasia. The secondary flow highlighted here also promotes high oscillatory shear index regions downstream of the vein-graft anastomosis, further contributing to graft failure. To prolong the overall graft survival and patency, we aim to develop a strategy to optimise graft configurations with reduced levels of disturbed haemodynamics. We developed an image-based approach to build three-dimensional geometries for subsequent computational fluid dynamics (CFD) numerical simulations. This simple, yet accurate, method allowed us to improve the accuracy of geometries, thus facilitating comparisons between different vein-graft anastomotic angles. Our results reveal that overall graft curvature (looped vs. straight) plays a dominant role in characterising the failure metrics. Looped grafts, particularly at moderate vein-graft anastomotic angles (30°-45°), exhibited the most favourable metrics, including reduced values of low wall shear stress, high wall shear stress, and high oscillatory shear index. These findings provide critical insights to inform medical professionals about graft areas that are subject to high shear stresses due to the oscillating nature of blood flow as well as the graft geometric configuration when performing surgery. The model developed in this work offers a framework enabling personalised vascular access strategies tailored to individual patient needs.
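
The oscillatory shear index (OSI) mentioned above is commonly defined as 0.5 * (1 - |integral of the wall shear stress vector| / integral of its magnitude) over a cardiac cycle. The sketch below evaluates that standard definition at one wall location from uniformly sampled wall shear stress vectors; the uniform sampling and the function name are assumptions, and this is not the authors' post-processing code.

```python
import math


def oscillatory_shear_index(wss_vectors):
    """OSI at one wall location from uniformly sampled WSS vectors.

    Returns 0.0 for purely unidirectional shear and 0.5 for shear that
    fully reverses with zero net direction over the cycle.
    """
    n = len(wss_vectors)
    dim = len(wss_vectors[0])
    # Magnitude of the time-averaged vector.
    mean_vec = [sum(v[k] for v in wss_vectors) / n for k in range(dim)]
    mag_of_mean = math.sqrt(sum(c * c for c in mean_vec))
    # Time average of the vector magnitude (time-averaged WSS).
    mean_of_mag = sum(math.sqrt(sum(c * c for c in v)) for v in wss_vectors) / n
    if mean_of_mag == 0.0:
        return 0.0  # no shear at all; OSI is taken as zero by convention here
    return 0.5 * (1.0 - mag_of_mean / mean_of_mag)
```

Because the time step cancels in the ratio, sums over uniformly spaced samples stand in for the integrals.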

Low dimensional dynamics of a sparse balanced synaptic network of quadratic integrate-and-fire neurons arxiv.org/abs/2508.06253

The kinetics of a balanced network of neurons with a sparse grid of synaptic links is well represented by the stochastic dynamics of a generic neuron subject to an effective shot noise. The rate of delta-pulses of the noise is determined self-consistently from the probability density of the neuron states. Importantly, the most sophisticated (but robust) collective regimes of the network do not allow for the diffusion approximation, which is routinely adopted for a shot noise in mathematical neuroscience. These regimes can be expected to be biologically relevant. For the kinetic equations of the complete mean field theory of a homogeneous inhibitory network of quadratic integrate-and-fire neurons, we introduce circular cumulants of the genuine phase variable and derive a rigorous two-cumulant reduction for both time-independent conditions and modulation of the excitatory current. The low dimensional model is examined with numerical simulations and found to be accurate for time-independent states and dynamic response to a periodic modulation deep into the parameter domain where the diffusion approximation is not applicable. The accuracy of a low dimensional model indicates and explains a low embedding dimensionality of the macroscopic collective dynamics of the network. The reduced model can be instrumental for theoretical studies of inhibitory-excitatory balanced neural networks.

A Novel cVAE-Augmented Deep Learning Framework for Pan-Cancer RNA-Seq Classification arxiv.org/abs/2508.02743

Pan-cancer classification using transcriptomic (RNA-Seq) data can inform tumor subtyping and therapy selection, but is challenging due to extremely high dimensionality and limited sample sizes. In this study, we propose a novel deep learning framework that uses a class-conditional variational autoencoder (cVAE) to augment training data for pan-cancer gene expression classification. Using 801 tumor RNA-Seq samples spanning 5 cancer types from The Cancer Genome Atlas (TCGA), we first perform feature selection to reduce 20,531 gene expression features to the 500 most variably expressed genes. A cVAE is then trained on this data to learn a latent representation of gene expression conditioned on cancer type, enabling the generation of synthetic gene expression samples for each tumor class. We augment the training set with these cVAE-generated samples (doubling the dataset size) to mitigate overfitting and class imbalance. A two-layer multilayer perceptron (MLP) classifier is subsequently trained on the augmented dataset to predict tumor type. The augmented framework achieves high classification accuracy (~98%) on a held-out test set, substantially outperforming a classifier trained on the original data alone. We present detailed experimental results, including VAE training curves, classifier performance metrics (ROC curves and confusion matrix), and architecture diagrams to illustrate the approach. The results demonstrate that cVAE-based synthetic augmentation can significantly improve pan-cancer prediction performance, especially for underrepresented cancer classes.
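
The feature-selection step described above (keeping the most variably expressed genes) can be sketched in a few lines. The sketch assumes that "most variably expressed" means highest variance across samples, which the abstract does not state explicitly; the function names are hypothetical.

```python
from statistics import pvariance


def top_variable_genes(expression, k):
    """Indices of the k genes (columns) with the highest variance across samples.

    expression is a list of samples, each a list of per-gene values.
    The returned indices are sorted in ascending column order.
    """
    n_genes = len(expression[0])
    variances = [pvariance([sample[g] for sample in expression])
                 for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:k])


def select_columns(expression, gene_indices):
    """Reduce every sample to the selected gene columns."""
    return [[sample[g] for g in gene_indices] for sample in expression]
```

With k = 500 this would reduce a 20,531-column matrix to the 500 selected columns before any model is trained; to avoid information leakage, the indices would normally be chosen on the training split only.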

Random Effects Models for Understanding Variability and Association between Brain Functional and Structural Connectivity arxiv.org/abs/2508.02908

The human brain is organized as a complex network, where connections between regions are characterized by both functional connectivity (FC) and structural connectivity (SC). While previous studies have primarily focused on network-level FC-SC correlations (i.e., the correlation between FC and SC across all edges within a predefined network), edge-level correlations (i.e., the correlation between FC and SC across subjects at each edge) have received comparatively little attention. In this study, we systematically analyze both network-level and edge-level FC-SC correlations, demonstrating that they lead to divergent conclusions about the strength of brain function-structure association. To explain these discrepancies, we introduce new random effects models that decompose FC and SC variability into different sources: subject effects, edge effects, and their interactions. Our results reveal that network-level and edge-level FC-SC correlations are influenced by different effects, each contributing differently to the total variability in FC and SC. This modeling framework provides the first statistical approach for disentangling and quantitatively assessing different sources of FC and SC variability and yields new insights into the relationship between functional and structural brain networks.
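
The two correlation types defined above differ only in the axis along which the correlation is taken. The following sketch makes that concrete for FC and SC stored as subjects-by-edges tables; the data layout and function names are assumptions, and the random effects models themselves are not reproduced.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def network_level_correlations(fc, sc):
    """One value per subject: correlation between FC and SC across edges."""
    return [pearson(fc_row, sc_row) for fc_row, sc_row in zip(fc, sc)]


def edge_level_correlations(fc, sc):
    """One value per edge: correlation between FC and SC across subjects."""
    n_edges = len(fc[0])
    return [pearson([row[e] for row in fc], [row[e] for row in sc])
            for e in range(n_edges)]
```

A small constructed example shows how the two can diverge: if strong edge effects are shared by FC and SC while subject effects push them in opposite directions, network-level correlations are close to +1 and edge-level correlations close to -1.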

The Multi-biophysical nature of Computation in brain neural networks arxiv.org/abs/2508.03115

Comprehending the nature of action potentials is fundamental to our understanding of the functioning of nervous systems in general. The ionic mechanisms underlying action potentials in the squid giant axon were first described by Hodgkin and Huxley in 1952 and their findings have formed our orthodox view of how the physiological action potential functions. However, substantial evidence has now accumulated to show that the action potential is accompanied by a synchronized coupled soliton pressure pulse in the cell membrane, the action potential pulse (APPulse) which we have recently shown to have an essential function in computation. Here we explore the interactions between the soliton and the ionic mechanisms known to be associated with the action potential. Computational models of the action potential usually describe it as a binary event, but we have shown that it must be a quantum ternary event known as the computational action potential (CAP), whose temporal fixed point is the threshold of the soliton, rather than the rather plastic action potential peak used in other models to facilitate meaningful computation. We have demonstrated this type of frequency computation for the retina, in detail, and also provided an extensive analysis for computation for other brain neural networks. The CAP accompanies the APPulse and the physiological action potential. Therefore, we conclude that nerve impulses appear to be an ensemble of three inseparable, interdependent, concurrent states: the physiological action potential, the APPulse and the CAP. However, while the physiological action potential is important in terms of neural connectivity, it is irrelevant to computational processes as this is always facilitated by the soliton part of the APPulse.

Fitness and Overfitness: Implicit Regularization in Evolutionary Dynamics arxiv.org/abs/2508.03187

A common assumption in evolutionary thought is that adaptation drives an increase in biological complexity. However, the rules governing evolution of complexity appear more nuanced. Evolution is deeply connected to learning, where complexity is much better understood, with established results on optimal complexity appropriate for a given learning task. In this work, we suggest a mathematical framework for studying the relationship between evolved organismal complexity and environmental complexity by leveraging a mathematical isomorphism between evolutionary dynamics and learning theory, namely between the replicator equation and sequential Bayesian learning, with evolving types corresponding to competing hypotheses and fitness in a given environment to likelihood of observed evidence. In Bayesian learning, implicit regularization prevents overfitting and drives the inference of hypotheses whose complexity matches the learning challenge. We show how these results naturally carry over to the evolutionary setting, where they are interpreted as organism complexity evolving to match the complexity of the environment, with too complex or too simple organisms suffering from \textit{overfitness} and \textit{underfitness}, respectively. Other aspects, peculiar to evolution and not to learning, reveal additional trends. One such trend is that frequently changing environments decrease selected complexity, a result with potential implications for both evolution and learning. Together, our results suggest that the balance between over-adaptation to transient environmental features and insufficient flexibility in responding to environmental challenges drives the emergence of optimal complexity, reflecting environmental structure. This framework offers new ways of thinking about biological complexity, suggesting new potential causes for it to increase or decrease in different environments.
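
The correspondence between the replicator equation and sequential Bayesian learning can be written out in its standard discrete-time form. This is a textbook statement given here for orientation; the symbols (type frequencies x_i, fitness f_i in environment e_t, hypotheses h_i) are chosen for this sketch, and the paper's own formulation may differ, for example by working in continuous time.

```latex
\[
x_i(t+1) \;=\; \frac{x_i(t)\, f_i(e_t)}{\sum_j x_j(t)\, f_j(e_t)}
\qquad\longleftrightarrow\qquad
P(h_i \mid e_{1:t}) \;=\;
\frac{P(h_i \mid e_{1:t-1})\, P(e_t \mid h_i)}
     {\sum_j P(h_j \mid e_{1:t-1})\, P(e_t \mid h_j)}
\]
```

Reading left to right, the frequency of type i plays the role of the current posterior of hypothesis h_i, the fitness of type i in environment e_t plays the role of the likelihood of the evidence under h_i, and mean fitness plays the role of the marginal likelihood.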

Decoding Polyphenol-Protein Interactions with Deep Learning: From Molecular Mechanisms to Food Applications arxiv.org/abs/2508.03456

Polyphenols and proteins are essential biomolecules that influence food functionality and, by extension, human health. Their interactions -- hereafter referred to as PhPIs (polyphenol-protein interactions) -- affect key processes such as nutrient bioavailability, antioxidant activity, and therapeutic efficacy. However, these interactions remain challenging to characterize due to the structural diversity of polyphenols and the dynamic nature of protein binding. Traditional experimental techniques like nuclear magnetic resonance (NMR) and mass spectrometry (MS), along with computational tools such as molecular docking and molecular dynamics (MD), have offered important insights but face constraints in scalability, throughput, and reproducibility. This review explores how deep learning (DL) is reshaping the study of PhPIs by enabling efficient prediction of binding sites, interaction affinities, and MD using high-dimensional bio- and chem-informatics data. While DL enhances prediction accuracy and reduces experimental redundancy, its effectiveness remains limited by data availability, quality, and representativeness, particularly in the context of natural products. We critically assess current DL frameworks for PhPI analysis and outline future directions, including multimodal data integration, improved model generalizability, and development of domain-specific benchmark datasets. This synthesis offers guidance for researchers aiming to apply DL in unraveling structure-function relationships of polyphenols, accelerating discovery in nutritional science and therapeutic development.

Advancing Wildlife Monitoring: Drone-Based Sampling for Roe Deer Density Estimation arxiv.org/abs/2508.03545 .CV

We use unmanned aerial drones to estimate wildlife density in southeastern Austria and compare these estimates to camera trap data. Traditional methods like capture-recapture, distance sampling, or camera traps are well-established but labour-intensive or spatially constrained. Using thermal (IR) and RGB imagery, drones enable efficient, non-intrusive animal counting. Our surveys were conducted during the leafless period on single days in October and November 2024 in three areas of a sub-Illyrian hill and terrace landscape. Flight transects were based on predefined launch points using a 350 m grid and an algorithm that defined the direction of systematically randomized transects. This setup allowed surveying large areas in one day using multiple drones, minimizing double counts. Flight altitude was set at 60 m to avoid disturbing roe deer (Capreolus capreolus) while ensuring detection. Animals were manually annotated in the recorded imagery and extrapolated to densities per square kilometer. We applied three extrapolation methods with increasing complexity: naive area-based extrapolation, bootstrapping, and zero-inflated negative binomial modelling. For comparison, a Random Encounter Model (REM) estimate was calculated using camera trap data from the flight period. The drone-based methods yielded similar results, generally showing higher densities than REM, except in one area in October. We hypothesize that drone-based density reflects daytime activity in open and forested areas, while REM estimates average activity over longer periods within forested zones. Although both approaches estimate density, they offer different perspectives on wildlife presence. Our results show that drones offer a promising, scalable method for wildlife density estimation.
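
The two simpler extrapolation methods named above, naive area-based extrapolation and bootstrapping, can be sketched as follows. The sketch assumes one animal count and one surveyed area (in square kilometres) per flight transect and resamples whole transects with replacement; the function names, the percentile interval, and the default settings are assumptions and not the authors' implementation.

```python
import random


def naive_density(counts, areas_km2):
    """Animals per square kilometre: total count over total surveyed area."""
    return sum(counts) / sum(areas_km2)


def bootstrap_density_ci(counts, areas_km2, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the density, resampling transects."""
    rng = random.Random(seed)
    n = len(counts)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # transects drawn with replacement
        estimates.append(sum(counts[i] for i in idx) /
                         sum(areas_km2[i] for i in idx))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

The zero-inflated negative binomial model mentioned in the abstract addresses what this sketch ignores: many transects contain no animals at all, and counts are more dispersed than a simple resampling scheme suggests.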

Decoding and Engineering the Phytobiome Communication for Smart Agriculture arxiv.org/abs/2508.03584 .SP .AI .ET .NI

Smart agriculture applications, integrating technologies like the Internet of Things and machine learning/artificial intelligence (ML/AI) into agriculture, hold promise to address modern challenges of rising food demand, environmental pollution, and water scarcity. Alongside the concept of the phytobiome, which defines the area including the plant, its environment, and associated organisms, and the recent emergence of molecular communication (MC), there exists an important opportunity to advance agricultural science and practice using communication theory. In this article, we motivate the use of a communication engineering perspective to develop a holistic understanding of phytobiome communication and to bridge the gap between phytobiome communication and smart agriculture. First, an overview of phytobiome communication via molecular and electrophysiological signals is presented and a multi-scale framework modeling the phytobiome as a communication network is conceptualized. Then, we demonstrate with plant experiments how this framework is used to model electrophysiological signals. Furthermore, possible smart agriculture applications, such as smart irrigation and targeted delivery of agrochemicals, through engineering the phytobiome communication are proposed. These applications merge ML/AI methods with the Internet of Bio-Nano-Things enabled by MC and pave the way towards more efficient, sustainable, and eco-friendly agricultural production. Finally, the implementation challenges, open research issues, and industrial outlook for these applications are discussed.
