Exploring Visual Complaints through a test battery in Acquired Brain Injury Patients: A Detailed Analysis of the DiaNAH Dataset arxiv.org/abs/2504.18540

This study investigated visual impairment complaints in a sample of 948 Acquired Brain Injury (ABI) patients using the DiaNAH dataset, emphasizing advanced machine learning techniques for managing missing data. Patients completed a CVS questionnaire capturing eight types of visual symptoms, including blurred vision and altered contrast perception. Due to incomplete data, 181 patients were excluded, resulting in an analytical subset of 767 individuals. To address the challenge of missing data, an automated machine learning (AutoML) approach was employed for data imputation, preserving the distributional characteristics of the original dataset. Patients were grouped according to singular and combined complaint clusters derived from the 40,320 potential combinations identified through the CVS questionnaire. A linear correlation analysis revealed minimal to no direct relationship between patient-reported visual complaints and standard visual perceptual function tests. This study represents an initial systematic attempt to understand the complex relationship between subjective visual complaints and objective visual perceptual assessments in ABI patients. Given the limitations of sample size and variability, further studies with larger populations are recommended to robustly explore these complaint clusters and their implications for visual perception following brain injury.
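
The counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes, as an observation about the numbers rather than something the abstract states, that 40,320 corresponds to 8!, the number of orderings of eight symptom types. For comparison it also computes the number of non-empty unordered subsets of eight items. The function names are illustrative only.

```python
import math

N_PATIENTS = 948   # full sample reported in the abstract
N_EXCLUDED = 181   # patients excluded for incomplete data
N_SYMPTOMS = 8     # symptom types captured by the CVS questionnaire


def analytical_subset(total: int, excluded: int) -> int:
    """Patients remaining after exclusions."""
    return total - excluded


def orderings(n: int) -> int:
    """Number of ordered arrangements (permutations) of n distinct items."""
    return math.factorial(n)


def nonempty_subsets(n: int) -> int:
    """Number of unordered, non-empty combinations of n distinct items."""
    return 2 ** n - 1


print(analytical_subset(N_PATIENTS, N_EXCLUDED))  # 767
print(orderings(N_SYMPTOMS))                      # 40320
print(nonempty_subsets(N_SYMPTOMS))               # 255
```

If complaint clusters are treated as unordered sets of symptoms, the space of possible clusters is 255 rather than 40,320, so the larger figure presumably counts ordered arrangements.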

arXiv.org

Quantum information theoretic approach to the hard problem of consciousness arxiv.org/abs/2504.18550

Functional theories of consciousness, based on emergence of conscious experiences from the execution of a particular function by an insentient brain, face the hard problem of consciousness of explaining why the insentient brain should produce any conscious experiences at all. This problem is exacerbated by the determinism characterizing the laws of classical physics, due to the resulting lack of causal potency of the emergent consciousness, which is not present already as a physical quantity in the deterministic equations of motion of the brain. Here, we present a quantum information theoretic approach to the hard problem of consciousness that avoids all of the drawbacks of emergence. This is achieved through reductive identification of first-person subjective conscious states with unobservable quantum state vectors in the brain, whereas the anatomically observable brain is viewed as a third-person objective construct created by classical bits of information obtained during the measurement of a subset of commuting quantum brain observables by the environment. Quantum resource theory further implies that the quantum features of consciousness granted by quantum no-go theorems cannot be replicated by any classical physical device.

An X-ray absorption spectrum database for iron-containing proteins arxiv.org/abs/2504.18554

Earth-abundant iron is an essential metal in regulating the structure and function of proteins. This study presents the development of a comprehensive X-ray Absorption Spectroscopy (XAS) database focused on iron-containing proteins, addressing a critical gap in available high-quality annotated spectral data. The database integrates detailed XAS spectra with their corresponding local structural data of proteins and enables a direct comparison of spectral features and structural motifs. Utilizing a combination of manual curation and semi-automated data extraction techniques, we developed a comprehensive dataset from an extensive review of the literature, ensuring the quality and accuracy of the data; the dataset contains 437 protein structures and 1954 XAS spectra. Our methods included careful documentation and validation processes to ensure accuracy and reproducibility. This dataset not only centralizes information on iron-containing proteins but also supports advanced data-driven discoveries, such as machine learning, to predict and analyze protein structure and functions. This work underscores the potential of integrating detailed spectroscopic data with structural biology to advance the field of biological chemistry and catalysis.

Photon Absorption Remote Sensing Virtual Histopathology: Diagnostic Equivalence to Gold-Standard H&E Staining in Skin Cancer Excisional Biopsies arxiv.org/abs/2504.18737

Uncovering potential effects of spontaneous waves on synaptic development: the visual system as a model arxiv.org/abs/2504.18991

Spontaneous waves are ubiquitous during early brain development and are hypothesized to drive the development of receptive fields (RFs). Different stages of spontaneous waves in the retina have been observed to coincide with the development of the retinotopic map, ON-OFF segregation, and orientation selectivity in the early visual pathway of mammals, and can be characterized by different activity patterns in the retina and downstream areas. Stage II waves, which occur in rodents right after birth, have been implicated in a possible synaptic pruning process, relating these stage II retinal waves to the refinement of the retinotopic map. However, the mechanisms underlying the activity-dependent effects of retinal waves on primary visual cortex (V1) are poorly understood. In this work, we build a biologically-constrained model of the development of thalamocortical synapses onto neurons in V1 driven by stage II retinal waves using a spike-timing dependent triplet learning rule. Using this model, together with a reduced rate-based model, we propose possible mechanisms underlying such a pruning process and predict how characteristics of the retinal waves may lead to different RF structures, including periodic RFs. We introduce gap junctions into the V1 network and show that such a coupling can serve to promote precise local retinotopy. Finally, we discuss how the spatial distribution of synaptic weights at the end of stage II may affect the emergence of orientation selectivity of V1 neurons during stage III waves. The mechanisms uncovered in this work may be useful in understanding synaptic structures that emerge across cortical regions during development.

PhyloProfile v2 -- Exploring multi-layered phylogenetic profiles at scale arxiv.org/abs/2504.19710

Phylogenetic profiles visualize the presence-absence pattern of genes across taxa and are essential for delineating the evolutionary fate of genes and gene families. Integrating phylogenetic profiles across many genes and taxa reveals patterns of coevolution, aiding the prediction of gene functions and interactions. The surge of genome sequences generated by biodiversity genomics projects makes it possible to compile phylogenetic profiles at an unprecedented scale. PhyloProfile v2 was designed to cope with the novel challenges of visualizing and analyzing phylogenetic profiles comprising millions of pairwise orthology relationships. By providing the ability to interact with the visualization and dynamically filter the data, PhyloProfile v2 facilitates a seamless transition from survey analyses across thousands of genes and taxa down to the feature architecture comparison of two ortholog pairs within the same analysis. As one key innovation, PhyloProfile v2 allows the display of phylogenetic profiles in 2D or 3D using dimensionality reduction techniques. This novel perspective eases, for example, the identification of taxa with similar presence/absence patterns of genes irrespective of their phylogenetic relationships. PhyloProfile v2 is available as an R package at Bioconductor https://doi.org/doi:10.18129/B9.bioc.PhyloProfile. The open-source code and documentation are provided under the MIT license at https://github.com/BIONF/PhyloProfile
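
To make the idea of comparing presence/absence patterns concrete, here is a minimal sketch. It does not use PhyloProfile or its dimensionality reduction; it swaps in a plain pairwise Jaccard distance between taxa. The taxon names, gene names and profile data are hypothetical toy values.

```python
# Hypothetical toy data: taxon -> set of genes with a detected ortholog.
profiles = {
    "taxon_A": {"gene1", "gene2", "gene3"},
    "taxon_B": {"gene1", "gene2", "gene3", "gene4"},
    "taxon_C": {"gene4", "gene5"},
}


def jaccard_distance(a: set, b: set) -> float:
    """1 - |a & b| / |a | b|; 0 means identical presence/absence patterns."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def closest_pair(taxon_profiles: dict) -> tuple:
    """Return (distance, taxon_x, taxon_y) for the most similar pair of taxa."""
    names = sorted(taxon_profiles)
    pairs = [
        (jaccard_distance(taxon_profiles[x], taxon_profiles[y]), x, y)
        for i, x in enumerate(names)
        for y in names[i + 1:]
    ]
    return min(pairs)


print(closest_pair(profiles))  # (0.25, 'taxon_A', 'taxon_B')
```

The comparison uses only the gene sets, so two taxa can come out as similar regardless of where they sit in the species tree, which is the kind of pattern the abstract describes.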

OmicsQ: A User-Friendly Platform for Interactive Quantitative Omics Data Analysis arxiv.org/abs/2504.19813

Motivation: High-throughput omics technologies generate complex datasets with thousands of features that are quantified across multiple experimental conditions, but often suffer from incomplete measurements, missing values and individually fluctuating variances. This requires sophisticated analytical methods for accurate, deep and insightful biological interpretations, capable of dealing with a large variety of data properties and different amounts of completeness. Software to handle such data complexity is rare and mostly relies on programming-based environments, limiting accessibility for researchers without computational expertise. Results: We present OmicsQ, an interactive, web-based platform designed to streamline quantitative omics data analysis. OmicsQ integrates established statistical processing tools with an intuitive, browser-based visualization interface. It provides robust batch correction, automated experimental design annotation, and missing-data handling without imputation, which ensures data integrity and avoids artifacts from a priori assumptions. OmicsQ seamlessly interacts with external applications for statistical testing, clustering, analysis of protein complex behavior, and pathway enrichment, offering a comprehensive and flexible workflow from data import to biological interpretation that is broadly applicable to data from different domains. Availability and Implementation: OmicsQ is implemented in R and R Shiny and is available at https://computproteomics.bmb.sdu.dk/app_direct/OmicsQ. Source code and installation instructions can be found at https://github.com/computproteomics/OmicsQ

Warming demands extensive tropical but minimal temperate management in plant-pollinator networks arxiv.org/abs/2504.19879

Anthropogenic warming impacts ecological communities and disturbs species interactions, particularly in temperature-sensitive plant-pollinator networks. While previous assessments indicate that rising mean temperatures and shifting temporal variability universally elevate pollinator extinction risk, many studies overlook how plant-pollinator networks of different ecoregions require distinct management approaches. Here, we integrate monthly near-surface temperature projections from various Shared Socioeconomic Pathways of CMIP6 Earth System Models with region-specific thermal performance parameters to simulate population dynamics in 11 plant-pollinator networks across tropical, temperate, and Mediterranean ecosystems. Our results show that tropical networks, already near their thermal limits, face pronounced (50 percent) pollinator declines under high-emissions scenarios (SSP5-8.5). Multi-species management targeting keystone plants emerges as a critical strategy for stabilizing these high-risk tropical systems, boosting both pollinator abundance and evenness. In contrast, temperate networks remain well below critical temperature thresholds, with minimal (5 percent) pollinator declines and negligible gains from any intensive management strategy. These findings challenge single-species models and uniform-parameter frameworks, which consistently underestimate tropical vulnerability while overestimating temperate risk. We demonstrate that explicitly incorporating complex network interactions, region-specific thermal tolerances, and targeted multi-species interventions is vital for maintaining pollination services. By revealing when and where limited interventions suffice versus extensive management becomes indispensable, our study provides a clear blueprint for adaptive, ecosystem-specific management under accelerating climate change.

Modelling collective cell migration in a data-rich age: challenges and opportunities for data-driven modelling arxiv.org/abs/2504.19974

Mathematical modelling has a long history in the context of collective cell migration, with applications throughout development, disease and regenerative medicine. The aim of modelling in this context is to provide a framework in which to mathematically encode experimentally derived mechanistic hypotheses, and then to test and validate them to provide new insights and understanding. Traditionally, mathematical models have consisted of systems of partial differential equations that model the evolution of cell density over time, together with the dynamics of any associated biochemical signals or the underlying substrate. The various terms in the model are usually chosen to provide simplified, phenomenological descriptions of the underlying biology, and follow long-standing conventions in the field. However, with the recent development of a plethora of new experimental technologies that provide quantitative data on collective cell migration processes, we now have the opportunity to leverage statistical and machine learning tools to determine mathematical models directly from the data. This perspectives article aims to provide an overview of recently developed data-driven modelling approaches, outlining the main methodologies and the challenges involved in using them to interrogate real-world data relating to collective cell migration.

An Integrated Genomics Workflow Tool: Simulating Reads, Evaluating Read Alignments, and Optimizing Variant Calling Algorithms arxiv.org/abs/2504.17860

Next-generation sequencing (NGS) is a pivotal technique in genome sequencing due to its high throughput, rapid results, cost-effectiveness, and enhanced accuracy. Its significance extends across various domains, playing a crucial role in identifying genetic variations and exploring genomic complexity. NGS finds applications in diverse fields such as clinical genomics, comparative genomics, functional genomics, and metagenomics, contributing substantially to advancements in research, medicine, and scientific disciplines. Within the sphere of genomics data science, the execution of read simulation, mapping, and variant calling holds paramount importance for obtaining precise and dependable results. Given the plethora of tools available for these purposes, each employing distinct methodologies and options, a nuanced understanding of their intricacies becomes imperative for optimization. This research, situated at the intersection of data science and genomics, involves a meticulous assessment of various tools, elucidating their individual strengths and weaknesses through rigorous experimentation and analysis. This comprehensive evaluation has enabled the researchers to pinpoint the most accurate tools, reinforcing the alignment between the established workflow and the demonstrated efficacy of specific tools in the context of genomics data analysis. To meet these requirements, "VarFind", an open-source and freely accessible pipeline tool designed to automate the entire process, has been introduced (VarFind GitHub repository: https://github.com/shanikawm/varfinder).

Seizure duration is associated with multiple timescales in interictal iEEG band power arxiv.org/abs/2504.17888

Background: Seizure severity can change from one seizure to the next within individual people with epilepsy. It is unclear if and how seizure severity is modulated over longer timescales. Characterising seizure severity variability over time could lead to tailored treatments. In this study, we test if continuously-recorded interictal intracranial EEG (iEEG) features encapsulate signatures of such modulations. Methods: We analysed 20 subjects with iEEG recordings of at least one day. We identified cycles on timescales of hours to days embedded in long-term iEEG band power and associated them with seizure severity, which we approximated using seizure duration. In order to quantify these associations, we created linear-circular statistical models of seizure duration that incorporated different band power cycles within each subject. Findings: In most subjects, seizure duration was weakly to moderately correlated with individual band power cycles. Combinations of multiple band power cycles significantly explained most of the variability in seizure duration. Specifically, we found 70% of the models had a higher than 60% adjusted $R^2$ across all subjects. From these models, around 80% were deemed to be above chance-level (p-value < 0.05) based on permutation tests. Models included cycles of ultradian, circadian and slower timescales in a subject-specific manner. Interpretation: These results suggest that seizure severity, as measured by seizure duration, may be modulated over timescales of minutes to days by subject-specific cycles in interictal iEEG signal properties. These cycles likely serve as markers of seizure modulating processes. Future work can investigate biological drivers of these detected fluctuations and may inform novel treatment strategies that minimise seizure severity.
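
A linear-circular model of the kind mentioned in the Methods can be sketched as an ordinary least-squares regression of a linear outcome on the cosine and sine of each cycle's phase. The code below is a minimal illustration under that assumption, not the authors' implementation: the function names, the synthetic data, and the simple label-shuffling permutation test are all hypothetical choices made for the example.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def design_row(phase_row):
    """Build [1, cos(phi_1), sin(phi_1), cos(phi_2), sin(phi_2), ...]."""
    row = [1.0]
    for phi in phase_row:
        row += [math.cos(phi), math.sin(phi)]
    return row


def fit_linear_circular(phases, durations):
    """Least-squares fit; returns (coefficients, adjusted R^2)."""
    X = [design_row(p) for p in phases]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, durations)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    n = len(durations)
    mean_y = sum(durations) / n
    ss_res = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
                 for r, y in zip(X, durations))
    ss_tot = sum((y - mean_y) ** 2 for y in durations)
    r2 = 1.0 - ss_res / ss_tot
    p = k - 1  # number of predictors, excluding the intercept
    return coef, 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def permutation_p_value(phases, durations, n_perm=200, seed=0):
    """Share of label-shuffled fits whose adjusted R^2 reaches the observed one."""
    rng = random.Random(seed)
    _, observed = fit_linear_circular(phases, durations)
    shuffled = list(durations)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if fit_linear_circular(phases, shuffled)[1] >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic example: duration depends on the phases of two hypothetical cycles.
rng = random.Random(1)
phases = [(rng.uniform(0, 2 * math.pi), rng.uniform(0, 2 * math.pi))
          for _ in range(60)]
durations = [30 + 10 * math.cos(a) + 5 * math.sin(b) + rng.gauss(0, 1)
             for a, b in phases]
coef, adj_r2 = fit_linear_circular(phases, durations)
p_value = permutation_p_value(phases, durations)
print(round(adj_r2, 2), round(p_value, 3))
```

Encoding each phase as a cosine and sine pair lets a linear model capture a preferred phase and amplitude without treating the angle as a linear quantity, which would break at the wrap-around from 2π to 0.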

A computational model of infant sensorimotor exploration in the mobile paradigm arxiv.org/abs/2504.17939

We present a computational model of the mechanisms that may determine infants' behavior in the "mobile paradigm". This paradigm has been used in developmental psychology to explore how infants learn the sensory effects of their actions. In this paradigm, a mobile (an articulated and movable object hanging above an infant's crib) is connected to one of the infant's limbs, prompting the infant to preferentially move that "connected" limb. This ability to detect a "sensorimotor contingency" is considered to be a foundational cognitive ability in development. To understand how infants learn sensorimotor contingencies, we built a model that attempts to replicate infant behavior. Our model incorporates a neural network, action-outcome prediction, exploration, motor noise, preferred activity level, and biologically-inspired motor control. We find that simulations with our model replicate the classic findings in the literature showing preferential movement of the connected limb. An interesting observation is that the model sometimes exhibits a burst of movement after the mobile is disconnected, casting light on a similar occasional finding in infants. In addition to these general findings, the simulations also replicate data from two recent more detailed studies using a connection with the mobile that was either gradual or all-or-none. A series of ablation studies further shows that the inclusion of mechanisms of action-outcome prediction, exploration, motor noise, and biologically-inspired motor control was essential for the model to correctly replicate infant behavior. This suggests that these components are also involved in infants' sensorimotor learning.

Modular integration of neural connectomics, dynamics and biomechanics for identification of behavioral sensorimotor pathways in Caenorhabditis elegans arxiv.org/abs/2504.18073

3plex Web: An Interactive Platform for RNA:DNA Triplex Prediction and Analysis arxiv.org/abs/2504.18076

Summary: Long non-coding RNAs (lncRNAs) exert their functions by cooperating with other molecules including proteins and DNA. Triplexes, formed through the interaction between a single-stranded RNA (ssRNA) and a double-stranded DNA (dsDNA), have been consistently described as a mechanism that allows lncRNAs to target specific genomic sequences in vivo. Building on the computational tool 3plex, we developed 3plex Web, an accessible platform that enhances RNA:DNA triplex prediction by integrating interactive visualization, statistical evaluation, and user-friendly downstream analysis workflows. 3plex Web implements new features such as input randomization for statistical assessments, interactive profile plotting for triplex stability, and customizable DNA Binding Domain (DBD) selection. This platform enables rapid analysis through PATO, substantially reducing processing times compared to previous methods, while offering Snakemake workflows to integrate gene expression data and explore lncRNA regulatory mechanisms. Availability and implementation: 3plex Web is freely available at https://3plex.unito.it as an online web service. The source code for 3plex is available at https://github.com/molinerisLab/3plex, paired with a definition file to set up the application into a Singularity image. Contact: ivan.molineris@unito.it Keywords: DNA; RNA; RNA-DNA interaction; triplex; long non-coding RNA; lncRNA; gene regulation; web application

TopSpace: spatial topic modeling for unsupervised discovery of multicellular spatial tissue structures in multiplex imaging arxiv.org/abs/2504.18495

Motivation: Understanding the spatial architecture of tissues is essential for decoding the complex interactions within cellular ecosystems and their implications for disease pathology and clinical outcomes. Recent advances in multiplex imaging technologies have enabled high-resolution profiling of cellular phenotypes and their spatial distributions, revealing critical roles of tissue structures such as tertiary lymphoid structures (TLSs) in shaping immune responses and influencing disease progression. However, existing methods for analyzing spatial tissue structures often rely on hard clustering or adjacency-based spatial models, which are limited in capturing the nuanced and overlapping nature of cellular communities. To address these challenges, we develop a novel spatial topic modeling framework for the unsupervised discovery of spatial tissue structures in multiplex imaging data. Results: We propose TopSpace, a novel Bayesian spatial topic model that integrates Gaussian processes into latent Dirichlet allocation to flexibly model spatial dependencies in tissue microenvironments. By leveraging the Bayesian framework, TopSpace supports multicellular mixed-membership clustering and offers key inferential advantages, including robust uncertainty quantification and data-driven determination of the number of multicellular microenvironments. We demonstrate the utility of TopSpace through simulations and a case study on non-small cell lung cancer (NSCLC) data. Simulations show that TopSpace accurately recovers latent tissue microenvironments and spatial clustering patterns, outperforming existing methods in scenarios with varying spatial dependencies. Applied to NSCLC data, TopSpace successfully identifies TLS and captures their spatial probability distribution, which strongly correlates with patient survival outcomes.

Enhanced Sampling, Public Dataset and Generative Model for Drug-Protein Dissociation Dynamics arxiv.org/abs/2504.18367

Drug-protein binding and dissociation dynamics are fundamental to understanding molecular interactions in biological systems. While many tools for drug-protein interaction studies have emerged, especially artificial intelligence (AI)-based generative models, predictive tools on binding/dissociation kinetics and dynamics are still limited. We propose a novel research paradigm that combines molecular dynamics (MD) simulations, enhanced sampling, and AI generative models to address this issue. We propose an enhanced sampling strategy to efficiently implement the drug-protein dissociation process in MD simulations and estimate the free energy surface (FES). We constructed a program pipeline of MD simulations based on this sampling strategy, thus generating a dataset including 26,612 drug-protein dissociation trajectories containing about 13 million frames. We named this dissociation dynamics dataset DD-13M and used it to train a deep equivariant generative model UnbindingFlow, which can generate collision-free dissociation trajectories. The DD-13M database and UnbindingFlow model represent a significant advancement in computational structural biology, and we anticipate their broad applicability in machine learning studies of drug-protein interactions. Our ongoing efforts focus on expanding this methodology to encompass a broader spectrum of drug-protein complexes and exploring novel applications in pathway prediction.
