arXiv Computer Science @arxiv_cs@qoto.org

1.13K Followers

Bot

I toot the arXiv feed for topics in Computer Science.

#ComputerScience #CS #Programming #SoftwareEngineering #Software #SoftwareDevelopment #Computers #Science #arXiv #News #PeerReview

Joined Jul 2018

2 Following 1.13K Followers

Posts Posts and replies Media

arXiv Computer Science @arxiv_cs@qoto.org

Fundamental Risks in the Current Deployment of General-Purpose AI Models: What Have We (Not) Learnt From Cybersecurity? https://arxiv.org/abs/2501.01435 #cs.CR #cs.AI

arXiv Computer Science @arxiv_cs@qoto.org

Toi uu hieu suat toc do dong co Servo DC su dung bo dieu khien PID ket hop mang no-ron https://arxiv.org/abs/2501.01438 #cs.RO

Toi uu hieu suat toc do dong co Servo DC su dung bo dieu khien PID ket hop mang no-ron

DC motors have been widely used in many industrial applications, from small jointed robots with multiple degrees of freedom to household appliances and transportation vehicles such as electric cars and trains. The main function of these motors is to ensure stable positioning performance and speed for mechanical systems based on pre-designed control methods. However, achieving optimal speed performance for servo motors faces many challenges due to the impact of internal and external loads, which affect output stability. To optimize the speed performance of DC Servo motors, a control method combining PID controllers and artificial neural networks has been proposed. Traditional PID controllers have the advantage of a simple structure and effective control capability in many systems, but they face difficulties when dealing with nonlinear and uncertain changes. The neural network is integrated to adjust the PID parameters in real time, helping the system adapt to different operating conditions. Simulation and experimental results have demonstrated that the proposed method significantly improves the speed tracking capability and stability of the motor while ensuring quick response, zero steady-state error, and eliminating overshoot. This method offers high potential for application in servo motor control systems requiring high precision and performance.

arXiv Computer Science @arxiv_cs@qoto.org

Probabilistic Mission Design in Neuro-Symbolic Systems https://arxiv.org/abs/2501.01439 #cs.AI #cs.RO

Probabilistic Mission Design in Neuro-Symbolic Systems

Advanced Air Mobility (AAM) is a growing field that demands accurate modeling of legal concepts and restrictions in navigating intelligent vehicles. In addition, any implementation of AAM needs to face the challenges posed by inherently dynamic and uncertain human-inhabited spaces robustly. Nevertheless, the employment of Unmanned Aircraft Systems (UAS) beyond visual line of sight (BVLOS) is an endearing task that promises to enhance significantly today's logistics and emergency response capabilities. To tackle these challenges, we present a probabilistic and neuro-symbolic architecture to encode legal frameworks and expert knowledge over uncertain spatial relations and noisy perception in an interpretable and adaptable fashion. More specifically, we demonstrate Probabilistic Mission Design (ProMis), a system architecture that links geospatial and sensory data with declarative, Hybrid Probabilistic Logic Programs (HPLP) to reason over the agent's state space and its legality. As a result, ProMis generates Probabilistic Mission Landscapes (PML), which quantify the agent's belief that a set of mission conditions is satisfied across its navigation space. Extending prior work on ProMis' reasoning capabilities and computational characteristics, we show its integration with potent machine learning models such as Large Language Models (LLM) and Transformer-based vision models. Hence, our experiments underpin the application of ProMis with multi-modal input data and how our method applies to many important AAM scenarios.

arXiv Computer Science @arxiv_cs@qoto.org

Explanatory Debiasing: Involving Domain Experts in the Data Generation Process to Mitigate Representation Bias in AI Systems https://arxiv.org/abs/2501.01441 #cs.HC #cs.AI

Explanatory Debiasing: Involving Domain Experts in the Data Generation Process to Mitigate Representation Bias in AI Systems

Representation bias is one of the most common types of biases in artificial intelligence (AI) systems, causing AI models to perform poorly on underrepresented data segments. Although AI practitioners use various methods to reduce representation bias, their effectiveness is often constrained by insufficient domain knowledge in the debiasing process. To address this gap, this paper introduces a set of generic design guidelines for effectively involving domain experts in representation debiasing. We instantiated our proposed guidelines in a healthcare-focused application and evaluated them through a comprehensive mixed-methods user study with 35 healthcare experts. Our findings show that involving domain experts can reduce representation bias without compromising model accuracy. Based on our findings, we also offer recommendations for developers to build robust debiasing systems guided by our generic design guidelines, ensuring more effective inclusion of domain experts in the debiasing process.

arXiv Computer Science @arxiv_cs@qoto.org

Feedback Design and Implementation for Integrated Posture Manipulation and Thrust Vectoring https://arxiv.org/abs/2501.01443 #cs.RO

arXiv Computer Science @arxiv_cs@qoto.org

Mathematical modelling of flow and adsorption in a gas chromatograph https://arxiv.org/abs/2501.00001 #physics.chem-ph #cs.CE

Mathematical modelling of flow and adsorption in a gas chromatograph

In this paper, a mathematical model is developed to describe the evolution of the concentration of compounds through a gas chromatography column. The model couples mass balances and kinetic equations for all components. Both single and multiple-component cases are considered with constant or variable velocity. Non-dimensionalisation indicates the small effect of diffusion. The system where diffusion is neglected is analysed using Laplace transforms. In the multiple-component case, it is demonstrated that the competition between the compounds is negligible and the equations may be decoupled. This reduces the problem to solving a single integral equation to determine the concentration profile for all components (since they are scaled versions of each other). For a given analyte, we then only two parameters need to be fitted to the data. To verify this approach, the full governing equations are also solved numerically using the finite difference method and a global adaptive quadrature method to integrate the Laplace transformation. Comparison with the Laplace solution verifies the high degree of accuracy of the simpler Laplace form. The Laplace solution is then verified against experimental data from BTEX chromatography. This novel method, which involves solving a single equation and fitting parameters in pairs for individual components, is highly efficient. It is significantly faster and simpler than the full numerical solution and avoids the computationally expensive methods that would normally be used to fit all curves at the same time.

arXiv Computer Science @arxiv_cs@qoto.org

A QUBO Formulation for the Generalized Takuzu/LinkedIn Tango Game https://arxiv.org/abs/2501.00002 #quant-ph #math.HO #cs.OH

A QUBO Formulation for the Generalized Takuzu/LinkedIn Tango Game

In this paper we present a QUBO formulation for the Takuzu game (or Binairo), for the most recent LinkedIn game, Tango, and for its generalizations. We optimize the number of variables needed to solve the combinatorial problem, making it suitable to be solved by quantum devices with fewer resources.

arXiv Computer Science @arxiv_cs@qoto.org

NewsHomepages: Homepage Layouts Capture Information Prioritization Decisions https://arxiv.org/abs/2501.00004 #cs.IR #cs.AI #cs.CL

NewsHomepages: Homepage Layouts Capture Information Prioritization Decisions

Information prioritization plays an important role in how humans perceive and understand the world. Homepage layouts serve as a tangible proxy for this prioritization. In this work, we present NewsHomepages, a large dataset of over 3,000 new website homepages (including local, national and topic-specific outlets) captured twice daily over a three-year period. We develop models to perform pairwise comparisons between news items to infer their relative significance. To illustrate that modeling organizational hierarchies has broader implications, we applied our models to rank-order a collection of local city council policies passed over a ten-year period in San Francisco, assessing their "newsworthiness". Our findings lay the groundwork for leveraging implicit organizational cues to deepen our understanding of information prioritization.

arXiv Computer Science @arxiv_cs@qoto.org

Special Coverings of Sets and Boolean Functions https://arxiv.org/abs/2501.00008 #cs.CC

Special Coverings of Sets and Boolean Functions

We will study some important properties of Boolean functions based on newly introduced concepts called Special Decomposition of a Set and Special Covering of a Set. These concepts enable us to study important problems concerning Boolean functions represented in conjunctive normal form including the satisfiability problem. Studying the relationship between the Boolean satisfiability problem and the problem of existence of a special covering for set we show that these problems are polynomially equivalent. This means that the problem of existence of a special covering for a set is an NP complete problem. We prove an important theorem regarding the relationship between these problems. The Boolean function in conjunctive normal form is satisfiable if and only if there is a special covering for the set of clauses of this function. The purpose of the article is also to study some important properties of satisfiable Boolean functions using the concepts of special decomposition and special covering of a set. We introduce the concept of generation of satisfiable function by another satisfiable function by means of admissible changes in the clauses of the function. We will prove that if the generation of a function by another function is defined as a binary relation then the set of satisfiable functions of n variables represented in conjunctive normal form with m clauses is partitioned to equivalence classes In addition, extending the rules of admissible changes we prove that arbitrary two satisfiable Boolean functions of n variables represented in conjunctive normal form with m clauses can be generated from each other.

arXiv Computer Science @arxiv_cs@qoto.org

AI Across Borders: Exploring Perceptions and Interactions in Higher Education https://arxiv.org/abs/2501.00017 #cs.CY #cs.HC

arXiv Computer Science @arxiv_cs@qoto.org

SECodec: Structural Entropy-based Compressive Speech Representation Codec for Speech Language Models https://arxiv.org/abs/2501.00018 #eess.AS #cs.SD

SECodec: Structural Entropy-based Compressive Speech Representation Codec for Speech Language Models

With the rapid advancement of large language models (LLMs), discrete speech representations have become crucial for integrating speech into LLMs. Existing methods for speech representation discretization rely on a predefined codebook size and Euclidean distance-based quantization. However, 1) the size of codebook is a critical parameter that affects both codec performance and downstream task training efficiency. 2) The Euclidean distance-based quantization may lead to audio distortion when the size of the codebook is controlled within a reasonable range. In fact, in the field of information compression, structural information and entropy guidance are crucial, but previous methods have largely overlooked these factors. Therefore, we address the above issues from an information-theoretic perspective, we present SECodec, a novel speech representation codec based on structural entropy (SE) for building speech language models. Specifically, we first model speech as a graph, clustering the speech features nodes within the graph and extracting the corresponding codebook by hierarchically and disentangledly minimizing 2D SE. Then, to address the issue of audio distortion, we propose a new quantization method. This method still adheres to the 2D SE minimization principle, adaptively selecting the most suitable token corresponding to the cluster for each incoming original speech node. Furthermore, we develop a Structural Entropy-based Speech Language Model (SESLM) that leverages SECodec. Experimental results demonstrate that SECodec performs comparably to EnCodec in speech reconstruction, and SESLM surpasses VALL-E in zero-shot text-to-speech tasks. Code, demo speeches, speech feature graph, SE codebook, and models are available at https://github.com/wlq2019/SECodec.

arXiv Computer Science @arxiv_cs@qoto.org

Did we miss P In CAP? Partial Progress Conjecture under Asynchrony https://arxiv.org/abs/2501.00021 #cs.DC #cs.DB

Did we miss P In CAP? Partial Progress Conjecture under Asynchrony

Each application developer desires to provide its users with consistent results and an always-available system despite failures. Boldly, the CALM theorem disagrees. It states that it is hard to design a system that is both consistent and available under network partitions; select at most two out of these three properties. One possible solution is to design coordination-free monotonic applications. However, a majority of real-world applications require coordination. We resolve this dilemma by conjecturing that partial progress is possible under network partitions. This partial progress ensures the system appears responsive to a subset of clients and achieves non-zero throughput during failures. To this extent, we present the design of our CASSANDRA consensus protocol that allows partitioned replicas to order client requests.

arXiv Computer Science @arxiv_cs@qoto.org

A Breadth-First Catalog of Text Processing, Speech Processing and Multimodal Research in South Asian Languages https://arxiv.org/abs/2501.00029 #cs.CL #cs.IR #cs.LG

A Breadth-First Catalog of Text Processing, Speech Processing and Multimodal Research in South Asian Languages

We review the recent literature (January 2022- October 2024) in South Asian languages on text-based language processing, multimodal models, and speech processing, and provide a spotlight analysis focused on 21 low-resource South Asian languages, namely Saraiki, Assamese, Balochi, Bhojpuri, Bodo, Burmese, Chhattisgarhi, Dhivehi, Gujarati, Kannada, Kashmiri, Konkani, Khasi, Malayalam, Meitei, Nepali, Odia, Pashto, Rajasthani, Sindhi, and Telugu. We identify trends, challenges, and future research directions, using a step-wise approach that incorporates relevance classification and clustering based on large language models (LLMs). Our goal is to provide a breadth-first overview of the recent developments in South Asian language technologies to NLP researchers interested in working with South Asian languages.

arXiv Computer Science @arxiv_cs@qoto.org

Underutilization of Syntactic Processing by Chinese Learners of English in Comprehending English Sentences, Evidenced from Adapted Garden-Path Ambiguity Experiment https://arxiv.org/abs/2501.00030 #cs.CL

Underutilization of Syntactic Processing by Chinese Learners of English in Comprehending English Sentences, Evidenced from Adapted Garden-Path Ambiguity Experiment

Many studies have revealed that sentence comprehension relies more on semantic processing than on syntactic processing. However, previous studies have predominantly emphasized the preference for semantic processing, focusing on the semantic perspective. In contrast, this current study highlights the under-utilization of syntactic processing, from a syntactic perspective. Based on the traditional garden-path experiment, which involves locally ambiguous but globally unambiguous sentences, this study's empirical experiment innovatively crafted an adapted version featuring semantically ambiguous but syntactically unambiguous sentences to meet its specific research objective. This experiment, involving 140 subjects, demonstrates through descriptive and inferential statistical analyses using SPSS, Graph Pad Prism, and Cursor that Chinese learners of English tend to under-utilize syntactic processing when comprehending English sentences. The study identifies two types of parsing under-utilization: partial and complete. Further exploration reveals that trial and error in syntactic processing contributes to both. Consequently, this study lays a foundation for the development of a novel parsing method designed to fully integrate syntactic processing into sentence comprehension, thereby enhancing the level of English sentence comprehension for Chinese learners of English.

arXiv Computer Science @arxiv_cs@qoto.org

Distilling Large Language Models for Efficient Clinical Information Extraction https://arxiv.org/abs/2501.00031 #cs.CL

Distilling Large Language Models for Efficient Clinical Information Extraction

Large language models (LLMs) excel at clinical information extraction but their computational demands limit practical deployment. Knowledge distillation--the process of transferring knowledge from larger to smaller models--offers a potential solution. We evaluate the performance of distilled BERT models, which are approximately 1,000 times smaller than modern LLMs, for clinical named entity recognition (NER) tasks. We leveraged state-of-the-art LLMs (Gemini and OpenAI models) and medical ontologies (RxNorm and SNOMED) as teacher labelers for medication, disease, and symptom extraction. We applied our approach to over 3,300 clinical notes spanning five publicly available datasets, comparing distilled BERT models against both their teacher labelers and BERT models fine-tuned on human labels. External validation was conducted using clinical notes from the MedAlign dataset. For disease extraction, F1 scores were 0.82 (teacher model), 0.89 (BioBERT trained on human labels), and 0.84 (BioBERT-distilled). For medication, F1 scores were 0.84 (teacher model), 0.91 (BioBERT-human), and 0.87 (BioBERT-distilled). For symptoms: F1 score of 0.73 (teacher model) and 0.68 (BioBERT-distilled). Distilled BERT models had faster inference (12x, 4x, 8x faster than GPT-4o, o1-mini, and Gemini Flash respectively) and lower costs (85x, 101x, 2x cheaper than GPT-4o, o1-mini, and Gemini Flash respectively). On the external validation dataset, the distilled BERT model achieved F1 scores of 0.883 (medication), 0.726 (disease), and 0.699 (symptom). Distilled BERT models were up to 101x cheaper and 12x faster than state-of-the-art LLMs while achieving similar performance on NER tasks. Distillation offers a computationally efficient and scalable alternative to large LLMs for clinical information extraction.

arXiv Computer Science @arxiv_cs@qoto.org

Concrete Branching Bisimilarity for Processes with Time-outs https://arxiv.org/abs/2412.19805 #cs.LO

Concrete Branching Bisimilarity for Processes with Time-outs

This paper provides an adaptation of branching bisimilarity to reactive systems with time-outs that does not enable eliding of time-out transitions. Multiple equivalent definitions are procured, along with a modal characterisation and a proof of its congruence property for a standard process algebra with recursion. The last section presents a complete axiomatisation for guarded processes without infinite sequences of unobservable actions.

arXiv Computer Science @arxiv_cs@qoto.org

Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing https://arxiv.org/abs/2412.19806 #cs.CV #cs.HC

Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing

Recent developments of vision large language models (LLMs) have seen remarkable progress, yet still encounter challenges towards multimodal generalists, such as coarse-grained instance-level understanding, lack of unified support for both images and videos, and insufficient coverage across various vision tasks. In this paper, we present VITRON, a universal pixel-level vision LLM designed for comprehensive understanding, generating, segmenting, and editing of both static images and dynamic videos. Building on top of an LLM backbone, VITRON incorporates encoders for images, videos, and pixel-level regional visuals within its frontend modules, while employing state-of-the-art visual specialists as its backend, via which VITRON supports a spectrum of vision end tasks, spanning visual comprehension to visual generation, from low level to high level. To ensure an effective and precise message passing from LLM to backend modules for function invocation, we propose a novel hybrid method by simultaneously integrating discrete textual instructions and continuous signal embeddings. Further, we design various pixel-level spatiotemporal vision-language alignment learning for VITRON to reach the best fine-grained visual capability. Finally, a cross-task synergy module is advised to learn to maximize the task-invariant fine-grained visual features, enhancing the synergy between different visual tasks. Demonstrated over 12 visual tasks and evaluated across 22 datasets, VITRON showcases its extensive capabilities in the four main vision task clusters. Overall, this work illuminates the great potential of developing a more unified multimodal generalist. Project homepage: https://vitron-llm.github.io/

arXiv Computer Science @arxiv_cs@qoto.org

AI-driven Automation as a Pre-condition for Eudaimonia https://arxiv.org/abs/2412.19808 #cs.CY #cs.AI

AI-driven Automation as a Pre-condition for Eudaimonia

The debate surrounding the 'future of work' is saturated with alarmist warnings about the loss of work as an intrinsically valuable activity. Instead, the present doctoral research approaches this debate from the perspective of human flourishing (eudaimonia). It articulates a neo-Aristotelian interpretation according to which the prospect of mass AI-driven automation, far from being a threat, is rather desirable insofar as it facilitates humans' flourishing and, subsequently, their engagement in leisure. Drawing on virtue jurisprudence, this research further explores what this desirability may imply for the current legal order.

arXiv Computer Science @arxiv_cs@qoto.org

LINKs: Large Language Model Integrated Management for 6G Empowered Digital Twin NetworKs https://arxiv.org/abs/2412.19811 #eess.SY #cs.NI #cs.AI #cs.SY

LINKs: Large Language Model Integrated Management for 6G Empowered Digital Twin NetworKs

In the rapidly evolving landscape of digital twins (DT) and 6G networks, the integration of large language models (LLMs) presents a novel approach to network management. This paper explores the application of LLMs in managing 6G-empowered DT networks, with a focus on optimizing data retrieval and communication efficiency in smart city scenarios. The proposed framework leverages LLMs for intelligent DT problem analysis and radio resource management (RRM) in fully autonomous way without any manual intervention. Our proposed framework -- LINKs, builds up a lazy loading strategy which can minimize transmission delay by selectively retrieving the relevant data. Based on the data retrieval plan, LLMs transform the retrieval task into an numerical optimization problem and utilizing solvers to build an optimal RRM, ensuring efficient communication across the network. Simulation results demonstrate the performance improvements in data planning and network management, highlighting the potential of LLMs to enhance the integration of DT and 6G technologies.

arXiv Computer Science @arxiv_cs@qoto.org

Coverage Path Planning in Precision Agriculture: Algorithms, Applications, and Key Benefits https://arxiv.org/abs/2412.19813 #cs.RO

Coverage Path Planning in Precision Agriculture: Algorithms, Applications, and Key Benefits

Coverage path planning (CPP) is the task of computing an optimal path within a region to completely scan or survey an area of interest using one or multiple mobile robots. Robots equipped with sensors and cameras can collect vast amounts of data on crop health, soil conditions, and weather patterns. Advanced analytics can then be applied to this data to make informed decisions, improving overall farm management. In this paper, we will demonstrate one approach to find the optimal coverage path of an agricultural field using a single robot, and one using multiple robots. For the single robot, we used a wavefront coverage algorithm that generates a sequence of locations that the robot needs to follow. For the multi-robot approach, the proposed approach consists of two steps: dividing the agricultural field into convex polygonal areas to optimally distribute them among the robots, and generating an optimal coverage path to ensure minimum coverage time for each of the polygonal areas.

Bot

I toot the arXiv feed for topics in Computer Science.

#ComputerScience #CS #Programming #SoftwareEngineering #Software #SoftwareDevelopment #Computers #Science #arXiv #News #PeerReview

Joined Jul 2018