Multi-Slice Spatial Transcriptomics Data Integration Analysis with STG3Net arxiv.org/abs/2408.15246 .CV .AI .LG

With the rapid development of Spatially Resolved Transcriptomics (SRT) technology, which maps gene expression within tissue sections, the integrative analysis of multiple SRT datasets has become increasingly important. However, batch effects between slices pose significant challenges in analyzing SRT data. To address these challenges, we have developed a plug-and-play batch correction method called Global Nearest Neighbor (G2N) anchor pairs selection. G2N mitigates batch effects by selecting representative anchor pairs across slices. Building upon G2N, we propose STG3Net, which uses masked graph convolutional autoencoders as backbone modules. These autoencoders, integrated with generative adversarial learning, enable STG3Net to achieve robust multi-slice spatial domain identification and batch correction. We comprehensively evaluate the feasibility of STG3Net on three multi-slice SRT datasets from different platforms, considering accuracy, consistency, and the F1LISI metric (a measure of batch effect correction efficiency). Compared to existing methods, STG3Net achieves the best overall performance while preserving the biological variability and connectivity between slices. Source code and all public datasets used in this paper are available at https://github.com/wenwenmin/STG3Net and https://zenodo.org/records/12737170.
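The abstract does not spell out how G2N chooses its anchor pairs, so the following is only a minimal sketch of the general idea of cross-slice anchors, using plain mutual nearest neighbors rather than the authors' G2N procedure. The function names and the toy 2-D embeddings are assumptions made for illustration.

```python
import math

def nearest(query, candidates):
    """Index of the candidate closest to `query` in Euclidean distance."""
    return min(range(len(candidates)), key=lambda j: math.dist(query, candidates[j]))

def mutual_nearest_pairs(slice_a, slice_b):
    """Return (i, j) pairs where slice_a[i] and slice_b[j] are each other's nearest neighbor."""
    a_to_b = [nearest(p, slice_b) for p in slice_a]
    b_to_a = [nearest(q, slice_a) for q in slice_b]
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]

# Hypothetical low-dimensional embeddings of spots from two slices.
slice_a = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
slice_b = [(0.2, 0.1), (5.1, 4.8), (20.0, 20.0)]

anchors = mutual_nearest_pairs(slice_a, slice_b)
print(anchors)  # [(0, 0), (1, 1)]
```

Spots without a reciprocal match (the third spot in each toy slice) are left unpaired, which is the usual reason for requiring mutual agreement before treating a pair as an anchor.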

Pedestrian Motion Prediction Using Transformer-based Behavior Clustering and Data-Driven Reachability Analysis arxiv.org/abs/2408.15250 .SY .CV .RO .SY

In this work, we present a transformer-based framework for predicting future pedestrian states based on clustered historical trajectory data. Previous studies propose enhancing pedestrian trajectory predictions by using manually crafted labels to categorize pedestrian behaviors and intentions. However, these approaches often capture only a limited range of pedestrian behaviors and introduce human bias into the predictions. To alleviate the dependency on manually crafted labels, we utilize a transformer encoder coupled with hierarchical density-based clustering to automatically identify diverse behavior patterns, and use these clusters in data-driven reachability analysis. By using a transformer-based approach, we seek to enhance the representation of pedestrian trajectories and uncover characteristics or features that are subsequently used to group trajectories into different "behavior" clusters. We show that these behavior clusters can be used with data-driven reachability analysis, yielding an end-to-end data-driven approach to predicting the future motion of pedestrians. We train and evaluate our approach on a real pedestrian dataset, showcasing its effectiveness in forecasting pedestrian movements.

TrajFM: A Vehicle Trajectory Foundation Model for Region and Task Transferability arxiv.org/abs/2408.15251 .CV .LG

Vehicle trajectories provide valuable movement information that supports various downstream tasks and powers real-world applications. A desirable trajectory learning model should transfer between different regions and tasks without retraining, thus improving computational efficiency and effectiveness with limited training data. However, a model's ability to transfer across regions is limited by the unique spatial features and POI arrangements of each region, which are closely linked to vehicle movement patterns and difficult to generalize. Additionally, achieving task transferability is challenging due to the differing generation schemes required for various tasks. Existing efforts towards transferability primarily involve learning embedding vectors for trajectories, which perform poorly in region transfer and still require retraining of prediction modules for task transfer. To address these challenges, we propose TrajFM, a vehicle trajectory foundation model that excels in both region and task transferability. For region transferability, we introduce STRFormer as the main learnable model within TrajFM. It integrates spatial, temporal, and POI modalities of trajectories to effectively manage variations in POI arrangements across regions and includes a learnable spatio-temporal Rotary position embedding module for handling spatial features. For task transferability, we propose a trajectory masking and recovery scheme. This scheme unifies the generation processes of various tasks into the masking and recovery of modalities and sub-trajectories, allowing TrajFM to be pre-trained once and transferred to different tasks without retraining. Experiments on two real-world vehicle trajectory datasets under various settings demonstrate the effectiveness of TrajFM. Code is available at https://anonymous.4open.science/r/TrajFM-30E4.
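For readers unfamiliar with rotary position embeddings, here is a minimal sketch of the standard, non-learnable form for a scalar position. It is not the learnable spatio-temporal module described in the abstract; the function names, the `base` value, and the toy vectors are assumptions for illustration. The point it shows is that the dot product of two rotated vectors depends only on the offset between their positions.

```python
import math

def rotary_embed(vec, position, base=10000.0):
    """Rotate consecutive coordinate pairs of `vec` by position-dependent angles."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector dimension must be even")
    out = []
    for k in range(d // 2):
        # Each pair gets its own frequency, decreasing with k.
        angle = position * base ** (-2.0 * k / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[2 * k], vec[2 * k + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.1]

# Both pairs of positions are 2 apart, so the scores should agree.
score_near = dot(rotary_embed(q, 3), rotary_embed(k, 1))
score_far = dot(rotary_embed(q, 12), rotary_embed(k, 10))
print(abs(score_near - score_far) < 1e-9)  # True
```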

Gravix: Active Learning for Gravitational Waves Classification Algorithms arxiv.org/abs/2408.14483 .LG -qc

This project explores the integration of Bayesian Optimization (BO) algorithms into a base machine learning model, specifically Convolutional Neural Networks (CNNs), for classifying gravitational waves among background noise. The primary objective is to evaluate whether optimizing hyperparameters using Bayesian Optimization enhances the base model's performance. For this purpose, a Kaggle [1] dataset that comprises real background noise (labeled 0) and simulated gravitational wave signals with noise (labeled 1) is used. Data with real noise is collected from three detectors: LIGO Livingston, LIGO Hanford, and Virgo. Through data preprocessing and training, the models effectively classify testing data, predicting the presence of gravitational wave signals with a score of 83.61%. The BO model reaches 84.34%, comparable to the base model and only a marginal improvement. However, the BO model needs additional computational resources and time, because each iteration of hyperparameter optimization requires additional training on the entire dataset. For this reason, the BO model is less resource-efficient than the base model in gravitational wave classification.

Active learning of digenic functions with boolean matrix logic programming arxiv.org/abs/2408.14487 -bio.MN .AI .LG .SC

We apply logic-based machine learning techniques to facilitate cellular engineering and drive biological discovery, based on comprehensive databases of metabolic processes called genome-scale metabolic network models (GEMs). Predicted host behaviours are not always correctly described by GEMs. Learning the intricate genetic interactions within GEMs presents computational and empirical challenges. To address these, we describe a novel approach called Boolean Matrix Logic Programming (BMLP) by leveraging boolean matrices to evaluate large logic programs. We introduce a new system, $BMLP_{active}$, which efficiently explores the genomic hypothesis space by guiding informative experimentation through active learning. In contrast to sub-symbolic methods, $BMLP_{active}$ encodes a state-of-the-art GEM of a widely accepted bacterial host in an interpretable and logical representation using datalog logic programs. Notably, $BMLP_{active}$ can successfully learn the interaction between a gene pair with fewer training examples than random experimentation, overcoming the increase in experimental design space. $BMLP_{active}$ enables rapid optimisation of metabolic models and offers a realistic approach to a self-driving lab for microbial engineering.
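To make "leveraging boolean matrices to evaluate large logic programs" concrete, here is a minimal sketch of one well-known instance: evaluating the recursive datalog program `path(X,Y) :- edge(X,Y).` and `path(X,Y) :- edge(X,Z), path(Z,Y).` by iterating boolean matrix products to a fixpoint. The predicate names and the toy graph are assumptions; how $BMLP_{active}$ actually encodes a GEM is not described in the abstract.

```python
def bool_matmul(a, b):
    """Boolean matrix product: result[i][j] is True if some k has a[i][k] and b[k][j]."""
    inner, cols = len(b), len(b[0])
    return [[any(row[k] and b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def bool_or(a, b):
    """Element-wise OR of two boolean matrices of the same shape."""
    return [[x or y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transitive_closure(edge):
    """Least fixpoint of path = edge OR (edge . path)."""
    path = edge
    while True:
        nxt = bool_or(edge, bool_matmul(edge, path))
        if nxt == path:
            return path
        path = nxt

# Toy facts: edge(0,1), edge(1,2), edge(2,3).
edge = [
    [False, True, False, False],
    [False, False, True, False],
    [False, False, False, True],
    [False, False, False, False],
]

path = transitive_closure(edge)
print(path[0][3], path[3][0])  # True False
```

Each matrix product derives all facts reachable through one more rule application at once, which is why the matrix form can be attractive for large programs.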

Multimodal Methods for Analyzing Learning and Training Environments: A Systematic Literature Review arxiv.org/abs/2408.14491 .LG .MM

Recent technological advancements have enhanced our ability to collect and analyze rich multimodal data (e.g., speech, video, and eye gaze) to better inform learning and training experiences. While previous reviews have focused on parts of the multimodal pipeline (e.g., conceptual models and data fusion), a comprehensive literature review on the methods informing multimodal learning and training environments has not been conducted. This literature review provides an in-depth analysis of research methods in these environments, proposing a taxonomy and framework that encapsulates recent methodological advances in this field and characterizes the multimodal domain in terms of five modality groups: Natural Language, Video, Sensors, Human-Centered, and Environment Logs. We introduce a novel data fusion category -- mid fusion -- and a graph-based technique for refining literature reviews, termed citation graph pruning. Our analysis reveals that leveraging multiple modalities offers a more holistic understanding of the behaviors and outcomes of learners and trainees. Even when multimodality does not enhance predictive accuracy, it often uncovers patterns that contextualize and elucidate unimodal data, revealing subtleties that a single modality may miss. However, there remains a need for further research to bridge the divide between multimodal learning and training studies and foundational AI research.

Evolvable Psychology Informed Neural Network for Memory Behavior Modeling arxiv.org/abs/2408.14492 .LG

Memory behavior modeling is a core issue in cognitive psychology and education. Classical psychological theories typically use memory equations to describe memory behavior, but these equations are insufficiently accurate and remain controversial, while data-driven memory modeling methods often require large amounts of training data and lack interpretability. Knowledge-informed neural network models have shown excellent performance in fields like physics, but there have been few attempts in the domain of behavior modeling. This paper proposes a psychology-theory-informed neural network for memory behavior modeling, named PsyINN, which constructs a framework that combines a neural network with differentiating sparse regression and optimizes them jointly. Specifically, to address the controversies and ambiguity of descriptors in memory equations, a descriptor evolution method based on differentiating operators is proposed to achieve precise characterization of descriptors and the evolution of memory theoretical equations. Additionally, a buffering mechanism for the sparse regression and a multi-module alternating iterative optimization method are proposed, effectively mitigating gradient instability and local optima issues. On four large-scale real-world memory behavior datasets, the proposed method surpasses state-of-the-art methods in prediction accuracy. An ablation study demonstrates the effectiveness of the proposed refinements, and application experiments showcase its potential for inspiring psychological research.

Extraction of Typical Operating Scenarios of New Power System Based on Deep Time Series Aggregation arxiv.org/abs/2408.14493 .SY .LG .SY

Extracting typical operational scenarios is essential for making flexible decisions in the dispatch of a new power system. This study proposes a novel deep time series aggregation scheme (DTSAs) to generate typical operational scenarios from the large amount of historical operational snapshot data. Specifically, DTSAs analyzes the intrinsic mechanisms of switching between different scheduling operational scenarios to mathematically represent typical operational scenarios. A Gramian angular summation field (GASF) based operational scenario image encoder is designed to map operational scenario sequences into high-dimensional spaces. This enables DTSAs to fully capture the spatiotemporal characteristics of new power systems using deep feature iterative aggregation models. The encoder also facilitates the generation of typical operational scenarios that conform to historical data distributions while ensuring the integrity of grid operational snapshots. Case studies demonstrate that the proposed method extracts new fine-grained power system dispatch schemes and outperforms the latest high-dimensional feature screening methods. In addition, experiments with different new energy access ratios were conducted to verify the robustness of the proposed method. DTSAs enables dispatchers to master the operation experience of the power system in advance and to respond actively to dynamic changes in operating scenarios under a high access rate of new energy.
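The Gramian angular summation field mentioned above has a standard definition: rescale the series to [-1, 1], take the angle phi_i = arccos(x_i), and fill the image with cos(phi_i + phi_j). The sketch below implements only that textbook encoding; the function name and the toy load values are assumptions, and the paper's encoder may add steps the abstract does not describe.

```python
import math

def gasf(series):
    """Encode a 1-D series as a Gramian angular summation field (list of rows)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Min-max rescale to [-1, 1], clamping to guard against rounding error.
    scaled = [max(-1.0, min(1.0, 2.0 * (x - lo) / (hi - lo) - 1.0)) for x in series]
    # Polar encoding: each value becomes an angle in [0, pi].
    phi = [math.acos(x) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

# Hypothetical load readings at three time steps.
load = [10.0, 20.0, 30.0]
image = gasf(load)

for row in image:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric, and its diagonal equals 2x^2 - 1 for the rescaled values, so the original series can be recovered up to the rescaling.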
