arXiv Computer Science @arxiv_cs@qoto.org

Bot

I toot the arXiv feed for topics in Computer Science.

#ComputerScience #CS #Programming #SoftwareEngineering #Software #SoftwareDevelopment #Computers #Science #arXiv #News #PeerReview

Joined Jul 2018

2 Following 1.13K Followers

arXiv Computer Science @arxiv_cs@qoto.org

HyperCausalLP: Causal Link Prediction using Hyper-Relational Knowledge Graph https://arxiv.org/abs/2410.14679 #cs.AI

Causal networks are often incomplete, with missing causal links. This is due to various issues, such as missing observation data. Recent approaches to the issue of incomplete causal networks have used knowledge graph link prediction methods to find the missing links. In the causal chain A causes B causes C, the influence of A on C is mediated by B, which is known as a mediator. Existing approaches using knowledge graph link prediction do not consider these mediated causal links. This paper presents HyperCausalLP, an approach designed to find missing causal links within a causal network with the help of mediator links. The problem of missing links is formulated as a hyper-relational knowledge graph completion task. The approach uses a knowledge graph link prediction model trained on a hyper-relational knowledge graph with the mediators. The approach is evaluated on a causal benchmark dataset, CLEVRER-Humans. Results show that including knowledge about mediators in causal link prediction using a hyper-relational knowledge graph improves performance by 5.94% mean reciprocal rank on average.
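
For readers unfamiliar with the metric quoted above, mean reciprocal rank (MRR) averages 1/rank of the correct answer over all test queries. The sketch below is only an illustration of that standard definition; the function name and the example ranks are made up here and do not come from the paper.

```python
from typing import Sequence


def mean_reciprocal_rank(ranks: Sequence[int]) -> float:
    """Return the mean of 1/rank over all queries.

    `ranks` holds the 1-based position of the correct entity in each
    query's ranked candidate list (hypothetical input format).
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Illustrative ranks for three link-prediction queries (not real data).
example_mrr = mean_reciprocal_rank([1, 2, 4])
```

With the illustrative ranks 1, 2 and 4, the result is (1 + 0.5 + 0.25) / 3, so a model that ranks correct links nearer the top scores closer to 1.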

arXiv Computer Science @arxiv_cs@qoto.org

Influence of Backdoor Paths on Causal Link Prediction https://arxiv.org/abs/2410.14680 #cs.AI

The current method for predicting causal links in knowledge graphs uses weighted causal relations. For a given link between cause-effect entities, the presence of a confounder affects the causal link prediction, which can lead to spurious and inaccurate results. We aim to block these confounders using backdoor path adjustment. Backdoor paths are non-causal association flows that connect the cause-entity to the effect-entity through other variables. Removing these paths ensures a more accurate prediction of causal links. This paper proposes CausalLPBack, a novel approach to causal link prediction that eliminates backdoor paths and uses knowledge graph link prediction methods. It extends the representation of causality in a neuro-symbolic framework, enabling the adoption and use of traditional causal AI concepts and methods. We demonstrate our approach using a causal reasoning benchmark dataset of simulated videos. The evaluation involves a unique dataset splitting method, called the Markov-based split, that is relevant for causal link prediction. The evaluation shows that the bias introduced by backdoor paths inflates causal link prediction performance by at least 30% in MRR and 16% in Hits@K, for both baseline and weighted causal relations.
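
As background for the term used above, the textbook backdoor adjustment from causal inference can be written as follows. This is the standard formula, given here only as context; the abstract does not say that CausalLPBack applies it in this exact form. The symbols X (cause), Y (effect), and Z (a set of variables blocking every backdoor path from X to Y) are notation introduced for this illustration.

```latex
P\bigl(Y = y \mid \mathrm{do}(X = x)\bigr)
  \;=\; \sum_{z} P\bigl(Y = y \mid X = x,\, Z = z\bigr)\, P\bigl(Z = z\bigr)
```

Averaging over Z with its marginal distribution, rather than its distribution conditional on X, is what removes the non-causal association carried by the backdoor paths.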

arXiv Computer Science @arxiv_cs@qoto.org

ET-Plan-Bench: Embodied Task-level Planning Benchmark Towards Spatial-Temporal Cognition with Foundation Models https://arxiv.org/abs/2410.14682 #cs.RO #cs.AI

Recent advancements in Large Language Models (LLMs) have spurred numerous attempts to apply these technologies to embodied tasks, particularly focusing on high-level task planning and task decomposition. To further explore this area, we introduce a new embodied task planning benchmark, ET-Plan-Bench, which specifically targets embodied task planning using LLMs. It features a controllable and diverse set of embodied tasks of varying difficulty and complexity, and is designed to evaluate two critical dimensions of LLMs' application in embodied task understanding: spatial understanding (relation constraints, occlusion of target objects) and temporal and causal understanding of the sequence of actions in the environment. By using multi-source simulators as the backend, it can provide immediate environment feedback to LLMs, which enables LLMs to interact dynamically with the environment and re-plan as necessary. We evaluated state-of-the-art open-source and closed-source foundation models, including GPT-4, LLAMA and Mistral, on our proposed benchmark. While they perform adequately on simple navigation tasks, their performance can deteriorate significantly when faced with tasks that require a deeper understanding of spatial, temporal, and causal relationships. Thus, our benchmark distinguishes itself as a large-scale, quantifiable, highly automated, and fine-grained diagnostic framework that presents a significant challenge to the latest foundation models. We hope it can spark and drive further research in embodied task planning using foundation models.

arXiv Computer Science @arxiv_cs@qoto.org

RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph https://arxiv.org/abs/2410.14684 #cs.SE #cs.AI #cs.CL

Large Language Models (LLMs) excel in code generation yet struggle with modern AI software engineering tasks. Unlike traditional function-level or file-level coding tasks, AI software engineering requires not only basic coding proficiency but also advanced skills in managing and interacting with code repositories. However, existing methods often overlook the need for repository-level code understanding, which is crucial for accurately grasping the broader context and developing effective solutions. On this basis, we present RepoGraph, a plug-in module that manages a repository-level structure for modern AI software engineering solutions. RepoGraph offers the desired guidance and serves as a repository-wide navigation for AI software engineers. We evaluate RepoGraph on the SWE-bench by plugging it into four different methods of two lines of approaches, where RepoGraph substantially boosts the performance of all systems, leading to a new state-of-the-art among open-source frameworks. Our analyses also demonstrate the extensibility and flexibility of RepoGraph by testing on another repo-level coding benchmark, CrossCodeEval. Our code is available at https://github.com/ozyyshr/RepoGraph.

arXiv Computer Science @arxiv_cs@qoto.org

Leveraging Event Streams with Deep Reinforcement Learning for End-to-End UAV Tracking https://arxiv.org/abs/2410.14685 #cs.RO #cs.AI #cs.NE

In this paper, we present our proposed approach for active tracking to increase the autonomy of Unmanned Aerial Vehicles (UAVs) using event cameras, low-energy imaging sensors that offer significant advantages in speed and dynamic range. The proposed tracking controller is designed to respond to visual feedback from the mounted event sensor, adjusting the drone movements to follow the target. To leverage the full motion capabilities of a quadrotor and the unique properties of event sensors, we propose an end-to-end deep-reinforcement learning (DRL) framework that maps raw sensor data from event streams directly to control actions for the UAV. To learn an optimal policy under highly variable and challenging conditions, we opt for a simulation environment with domain randomization for effective transfer to real-world environments. We demonstrate the effectiveness of our approach through experiments in challenging scenarios, including fast-moving targets and changing lighting conditions, which result in improved generalization capabilities.

arXiv Computer Science @arxiv_cs@qoto.org

BrainTransformers: SNN-LLM https://arxiv.org/abs/2410.14687 #cs.NE #cs.CL #cs.LG

This study introduces BrainTransformers, an innovative Large Language Model (LLM) implemented using Spiking Neural Networks (SNN). Our key contributions include: (1) designing SNN-compatible Transformer components such as SNNMatmul, SNNSoftmax, and SNNSiLU; (2) implementing an SNN approximation of the SiLU activation function; and (3) developing a Synapsis module to simulate synaptic plasticity. Our 3-billion parameter model, BrainTransformers-3B-Chat, demonstrates competitive performance across various benchmarks, including MMLU (63.2), BBH (54.1), ARC-C (54.3), and GSM8K (76.3), while potentially offering improved energy efficiency and biological plausibility. The model employs a three-stage training approach, including SNN-specific neuronal synaptic plasticity training. This research opens new avenues for brain-like AI systems in natural language processing and neuromorphic computing. Future work will focus on hardware optimization, developing specialized SNN fine-tuning tools, and exploring practical applications in energy-efficient computing environments.
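
For reference, the SiLU activation that the abstract says is approximated with spiking components is conventionally defined as SiLU(x) = x * sigmoid(x). The sketch below implements only that standard definition; it is not the paper's SNN approximation, whose details the abstract does not give, and the function names are chosen here for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def silu(x: float) -> float:
    """Standard SiLU (also called swish): x * sigmoid(x)."""
    return x * sigmoid(x)


# Evaluate the reference activation at a few illustrative points.
samples = [silu(v) for v in (-2.0, 0.0, 2.0)]
```

Any spiking approximation would presumably be judged by how closely it tracks this smooth reference curve.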

arXiv Computer Science @arxiv_cs@qoto.org

A positional $\mathbf{\Pi}^0_3$-complete objective https://arxiv.org/abs/2410.14688 #cs.CC #cs.FL #cs.GT #cs.LO

We study zero-sum turn-based games on graphs. In this note, we show the existence of a game objective that is $\mathbf{\Pi}^0_3$-complete for the Borel hierarchy and that is positional, i.e., for which positional strategies suffice for the first player to win over arenas of arbitrary cardinality. To the best of our knowledge, this is the first known such objective; all previously known positional objectives are in $\mathbf{\Sigma}^0_3$. The objective in question is a qualitative variant of the well-studied total-payoff objective, where the goal is to maximise the sum of weights.

arXiv Computer Science @arxiv_cs@qoto.org

Rethinking VLMs and LLMs for Image Classification https://arxiv.org/abs/2410.14690 #cs.LG #cs.AI #cs.CV

Visual Language Models (VLMs) are now increasingly being merged with Large Language Models (LLMs) to enable new capabilities, particularly in terms of improved interactivity and open-ended responsiveness. While these are remarkable capabilities, the contribution of LLMs to enhancing the longstanding key problem of classifying an image among a set of choices remains unclear. Through extensive experiments involving seven models, ten visual understanding datasets, and multiple prompt variations per dataset, we find that, for object and scene recognition, VLMs that do not leverage LLMs can achieve better performance than VLMs that do. Yet at the same time, leveraging LLMs can improve performance on tasks requiring reasoning and outside knowledge. In response to these challenges, we propose a pragmatic solution: a lightweight fix involving a relatively small LLM that efficiently routes visual tasks to the most suitable model for the task. The LLM router undergoes training using a dataset constructed from more than 2.5 million examples of pairs of visual task and model accuracy. Our results reveal that this lightweight fix surpasses or matches the accuracy of state-of-the-art alternatives, including GPT-4V and HuggingGPT, while improving cost-effectiveness.

arXiv Computer Science @arxiv_cs@qoto.org

Green vehicle routing problem that jointly optimizes delivery speed and routing based on the characteristics of electric vehicles https://arxiv.org/abs/2410.14691 #cs.NE #cs.AI #cs.CE

The abundance of materials and the development of the economy have led to the flourishing of the logistics industry, but have also caused pollution. Research on the green vehicle routing problem (GVRP), which plans vehicle routes during transportation to reduce pollution, is growing accordingly. Further exploration is needed on how to integrate these research findings with real vehicles. This paper establishes an energy consumption model using real electric vehicles, fully considering the physical characteristics of each component of the vehicle, so that distortions in the energy consumption model do not affect the route planning results. The energy consumption model also incorporates the effects of vehicle start/stop, speed, distance, and load on energy consumption. In addition, a load first speed optimization algorithm is proposed, which selects the most suitable speed between every two delivery points while planning the route, in order to further reduce energy consumption while meeting the time window. Finally, an improved Adaptive Genetic Algorithm is used to solve for the most energy-efficient route. The experiment shows that the results obtained with this speed optimization algorithm are generally more energy-efficient than those obtained without it. The average energy consumption of constant-speed delivery at different speeds is 17.16% higher than that after speed optimization. The work provides a method that is closer to reality and easier for logistics companies to use, and it also enriches the GVRP model.

arXiv Computer Science @arxiv_cs@qoto.org

Attribute-Based Semantic Type Detection and Data Quality Assessment https://arxiv.org/abs/2410.14692 #cs.DB #cs.IR

The reliance on data-driven decision-making across sectors highlights the critical need for high-quality data; despite advancements, data quality issues persist, significantly impacting business strategies and scientific research. Current data quality methods fail to leverage the semantic richness embedded in words inside attribute labels (or column names/headers in tables) across diverse datasets and domains, leaving a crucial gap in comprehensive data quality evaluation. This research addresses this gap by introducing an innovative methodology centered around Attribute-Based Semantic Type Detection and Data Quality Assessment. By leveraging semantic information within attribute labels, combined with rule-based analysis and comprehensive Formats and Abbreviations dictionaries, our approach introduces a practical semantic type classification system comprising approximately 23 types, including numerical non-negative, categorical, ID, names, strings, geographical, temporal, and complex formats like URLs, IP addresses, email, and binary values plus several numerical bounded types, such as age and percentage. A comparative analysis with Sherlock, a state-of-the-art Semantic Type Detection system, shows the advantages of our approach in terms of classification robustness and applicability to data quality assessment tasks. Our research focuses on well-known data quality issues and their corresponding data quality dimension violations, grounding our methodology in a robust academic framework. Detailed analysis of fifty distinct datasets from the UCI Machine Learning Repository showcases our method's proficiency in identifying potential data quality issues. Compared to established tools like YData Profiling, our method exhibits superior accuracy, detecting 81 missing values across 922 attributes where YData identified only one.

arXiv Computer Science @arxiv_cs@qoto.org

Classifying Peace in Global Media Using RAG and Intergroup Reciprocity https://arxiv.org/abs/2410.13865 #cs.IR

This paper presents a novel approach to identifying insights of peace in global media using a Retrieval Augmented Generation (RAG) model and concepts of Positive and Negative Intergroup Reciprocity (PIR/NIR). By refining the definitions of PIR and NIR, we offer a more accurate and meaningful analysis of intergroup relations as represented in media articles. Our methodology provides insights into the dynamics that contribute to or detract from peace at a national level.

arXiv Computer Science @arxiv_cs@qoto.org

Stars, Stripes, and Silicon: Unravelling the ChatGPT's All-American, Monochrome, Cis-centric Bias https://arxiv.org/abs/2410.13868 #cs.CY #cs.AI #cs.CL

This paper investigates the challenges associated with bias, toxicity, unreliability, and lack of robustness in large language models (LLMs) such as ChatGPT. It emphasizes that these issues primarily stem from the quality and diversity of data on which LLMs are trained, rather than the model architectures themselves. As LLMs are increasingly integrated into various real-world applications, their potential to negatively impact society by amplifying existing biases and generating harmful content becomes a pressing concern. The paper calls for interdisciplinary efforts to address these challenges. Additionally, it highlights the need for collaboration between researchers, practitioners, and stakeholders to establish governance frameworks, oversight, and accountability mechanisms to mitigate the harmful consequences of biased LLMs. By proactively addressing these challenges, the AI community can harness the enormous potential of LLMs for the betterment of society without perpetuating harmful biases or exacerbating existing inequalities.

arXiv Computer Science @arxiv_cs@qoto.org

A Federated Learning Platform as a Service for Advancing Stroke Management in European Clinical Centers https://arxiv.org/abs/2410.13869 #cs.CY #cs.DC #cs.LG

The rapid evolution of artificial intelligence (AI) technologies holds transformative potential for the healthcare sector. In critical situations requiring immediate decision-making, healthcare professionals can leverage machine learning (ML) algorithms to prioritize and optimize treatment options, thereby reducing costs and improving patient outcomes. However, the sensitive nature of healthcare data presents significant challenges in terms of privacy and data ownership, hindering data availability and the development of robust algorithms. Federated Learning (FL) addresses these challenges by enabling collaborative training of ML models without the exchange of local data. This paper introduces a novel FL platform designed to support the configuration, monitoring, and management of FL processes. This platform operates on Platform-as-a-Service (PaaS) principles and utilizes the Message Queuing Telemetry Transport (MQTT) publish-subscribe protocol. Considering the production readiness and data sensitivity inherent in clinical environments, we emphasize the security of the proposed FL architecture, addressing potential threats and proposing mitigation strategies to enhance the platform's trustworthiness. The platform has been successfully tested in various operational environments using a publicly available dataset, highlighting its benefits and confirming its efficacy.
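
To make "collaborative training without the exchange of local data" concrete, here is a minimal sketch of one aggregation round in the style of federated averaging (FedAvg), a common choice in FL. It is an assumption for illustration only: the abstract does not state which aggregation rule the platform uses, and the function name, the flat-list parameter format, and the example numbers are all hypothetical.

```python
from typing import List, Sequence


def federated_average(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Only parameters and sample counts reach the server; the raw
    training records stay at each client.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must send vectors of equal length")
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clinical sites with 1 and 3 local samples.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a publish-subscribe deployment such as the MQTT-based one described above, the parameter vectors would presumably travel as messages on topics, but the abstract gives no message layout, so none is sketched here.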

arXiv Computer Science @arxiv_cs@qoto.org

Experimental Validation of Light Cable-Driven Elbow-Assisting Device L-CADEL Design https://arxiv.org/abs/2410.13870 #cs.RO #cs.HC

This paper presents a new design of CADEL, a cable-driven elbow-assisting device, with lightweighting and control improvements. The new design is intended to be a more portable and user-oriented solution, offering additional features with respect to the original design. One potential benefit of improved portability is the possibility of home and hospital use that maintains social distancing while allowing rehabilitation treatments even during a pandemic. Specific attention has been devoted to designing the main mechatronic components by developing specific kinematic models. The design process includes an implementation of specific control hardware and software. The kinematic model of the new design is formulated, and its features are evaluated through numerical simulations and experimental tests. A comparison with the original design highlights the proposed improvements, mainly in terms of comfort, portability and user-oriented operation.

arXiv Computer Science @arxiv_cs@qoto.org

Explaining an image classifier with a generative model conditioned by uncertainty https://arxiv.org/abs/2410.13871 #eess.IV #cs.CV #cs.AI

We propose to condition a generative model on the uncertainty of a given image classifier in order to analyze and explain its behavior. Preliminary experiments on synthetic data and a corrupted version of the MNIST dataset illustrate the idea.

arXiv Computer Science @arxiv_cs@qoto.org

BLEND: Behavior-guided Neural Population Dynamics Modeling via Privileged Knowledge Distillation https://arxiv.org/abs/2410.13872 #q-bio.NC #cs.NE #cs.LG

Modeling the nonlinear dynamics of neuronal populations represents a key pursuit in computational neuroscience. Recent research has increasingly focused on jointly modeling neural activity and behavior to unravel their interconnections. Despite significant efforts, these approaches often necessitate either intricate model designs or oversimplified assumptions. Given the frequent absence of perfectly paired neural-behavioral datasets in real-world scenarios when deploying these models, a critical yet understudied research question emerges: how to develop a model that performs well using only neural activity as input at inference, while benefiting from the insights gained from behavioral signals during training? To this end, we propose BLEND, the behavior-guided neural population dynamics modeling framework via privileged knowledge distillation. By considering behavior as privileged information, we train a teacher model that takes both behavior observations (privileged features) and neural activities (regular features) as inputs. A student model is then distilled using only neural activity. Unlike existing methods, our framework is model-agnostic and avoids making strong assumptions about the relationship between behavior and neural activity. This allows BLEND to enhance existing neural dynamics modeling architectures without developing specialized models from scratch. Extensive experiments across neural population activity modeling and transcriptomic neuron identity prediction tasks demonstrate strong capabilities of BLEND, reporting over 50% improvement in behavioral decoding and over 15% improvement in transcriptomic neuron identity prediction after behavior-guided distillation. Furthermore, we empirically explore various behavior-guided distillation strategies within the BLEND framework and present a comprehensive analysis of effectiveness and implications for model performance.

arXiv Computer Science @arxiv_cs@qoto.org

COOL: Efficient and Reliable Chain-Oriented Objective Logic with Neural Networks Feedback Control for Program Synthesis https://arxiv.org/abs/2410.13874 #cs.SE #cs.LG

Program synthesis methods, whether formal or neural-based, lack fine-grained control and flexible modularity, which limits their adaptation to complex software development. These limitations stem from rigid Domain-Specific Language (DSL) frameworks and incorrect neural network predictions. To this end, we propose the Chain of Logic (CoL), which organizes synthesis stages into a chain and provides precise heuristic control to guide the synthesis process. Furthermore, by integrating neural networks with libraries and introducing a Neural Network Feedback Control (NNFC) mechanism, our approach modularizes synthesis and mitigates the impact of neural network mispredictions. Experiments on relational and symbolic synthesis tasks show that CoL significantly enhances the efficiency and reliability of DSL program synthesis across multiple metrics. Specifically, CoL improves accuracy by 70% while reducing tree operations by 91% and time by 95%. Additionally, NNFC further boosts accuracy by 6%, with a 64% reduction in tree operations under challenging conditions such as insufficient training data, increased difficulty, and multidomain synthesis. These improvements confirm COOL as a highly efficient and reliable program synthesis framework.

arXiv Computer Science @arxiv_cs@qoto.org

SpaceRaceEdu: developing an educational multi-player videogame for self-study and assessment https://arxiv.org/abs/2410.13875 #cs.CY

The teaching innovation project SpaceRaceEdu: development of an educational multiplayer video game for self-study and self-assessment has been carried out under the INNOVA call of the Autonomous University of Madrid during the 2022-2023 academic year. In this project, a functional prototype of SpaceRaceEdu has been developed: a multiplayer video game with a social and educational nature, which can be used both by teachers as a training and evaluation activity and by students as a tool for study and evaluation. In SpaceRaceEdu, several student teams try to launch a rocket before everyone else. To meet this objective, they must gather a series of resources by going through a scenario and answering questions of different types. The teachers can introduce these questions according to the contents of their subject. The videogame balances competition and cooperation to promote participation and learning. Competition occurs between teams who strive to answer all their questions correctly before their rivals. In contrast, cooperation occurs between students on the same team who can organize and support each other to be more effective.

arXiv Computer Science @arxiv_cs@qoto.org

Deep Knowledge Tracing for Personalized Adaptive Learning at Historically Black Colleges and Universities https://arxiv.org/abs/2410.13876 #cs.CY #cs.AI

Personalized adaptive learning (PAL) stands out by closely monitoring individual students' progress and tailoring their learning paths to their unique knowledge and needs. A crucial technique for effective PAL implementation is knowledge tracing, which models students' evolving knowledge to predict their future performance. Recent advancements in deep learning have significantly enhanced knowledge tracing through Deep Knowledge Tracing (DKT). However, there is limited research on DKT for Science, Technology, Engineering, and Math (STEM) education at Historically Black Colleges and Universities (HBCUs). This study builds a comprehensive dataset to investigate DKT for implementing PAL in STEM education at HBCUs, utilizing multiple state-of-the-art (SOTA) DKT models to examine knowledge tracing performance. The dataset includes 352,148 learning records for 17,181 undergraduate students across eight colleges at Prairie View A&M University (PVAMU). The SOTA DKT models employed include DKT, DKT+, DKVMN, SAKT, and KQN. Experimental results demonstrate the effectiveness of DKT models in accurately predicting students' academic outcomes. Specifically, the SAKT and KQN models outperform others in terms of accuracy and AUC. These findings have significant implications for faculty members and academic advisors, providing valuable insights for identifying students at risk of academic underperformance before the end of the semester. Furthermore, this allows for proactive interventions to support students' academic progress, potentially enhancing student retention and graduation rates.

arXiv Computer Science @arxiv_cs@qoto.org

Model Validation Practice in Banking: A Structured Approach https://arxiv.org/abs/2410.13877 #stat.AP #cs.CY

This paper presents a comprehensive overview of model validation practices and advancements in the banking industry, based on the experience of managing Model Risk Management (MRM) since the inception of regulatory guidance SR11-7/OCC11-12 over a decade ago. Model validation in banking is a crucial process designed to ensure that predictive models, which are often used for credit risk, fraud detection, and capital planning, operate reliably and meet regulatory standards. This practice ensures that models are conceptually sound, produce valid outcomes, and are consistently monitored over time. Model validation in banking is a multi-faceted process with three key components: conceptual soundness evaluation, outcome analysis, and ongoing monitoring to ensure that the models are not only designed correctly but also perform reliably and consistently in real-world environments. Effective validation helps banks mitigate risks, meet regulatory requirements, and maintain trust in the models that underpin critical business decisions.
