Text2Zinc: A Cross-Domain Dataset for Modeling Optimization and Satisfaction Problems in MiniZinc arxiv.org/abs/2503.10642 .CL .AI

Synthetic Categorical Restructuring large Or How AIs Gradually Extract Efficient Regularities from Their Experience of the World arxiv.org/abs/2503.10643 -bio.NC .CL .NE

The Reliability of LLMs for Medical Diagnosis: An Examination of Consistency, Manipulation, and Contextual Awareness arxiv.org/abs/2503.10647 .CL .AI .CY .HC

Hate Speech and Sentiment of YouTube Video Comments From Public and Private Sources Covering the Israel-Palestine Conflict arxiv.org/abs/2503.10648 .CL .CY .LG .SI

Measuring Political Preferences in AI Systems: An Integrative Approach arxiv.org/abs/2503.10649 .CY .AI .CL

Political biases in Large Language Model (LLM)-based artificial intelligence (AI) systems, such as OpenAI's ChatGPT or Google's Gemini, have been previously reported. While several prior studies have attempted to quantify these biases using political orientation tests, such approaches are limited by the tests' potential calibration biases and by constrained response formats that do not reflect real-world human-AI interactions. This study employs a multi-method approach to assess political bias in leading AI systems, integrating four complementary methodologies: (1) linguistic comparison of AI-generated text with the language used by Republican and Democratic U.S. Congress members, (2) analysis of political viewpoints embedded in AI-generated policy recommendations, (3) sentiment analysis of AI-generated text toward politically affiliated public figures, and (4) standardized political orientation testing. Results indicate a consistent left-leaning bias across most contemporary AI systems, with arguably varying degrees of intensity. However, this bias is not an inherent feature of LLMs; prior research demonstrates that fine-tuning with politically skewed data can realign these models across the ideological spectrum. The presence of systematic political bias in AI systems poses risks, including reduced viewpoint diversity, increased societal polarization, and the potential for public mistrust in AI technologies. To mitigate these risks, AI systems should be designed to prioritize factual accuracy while maintaining neutrality on most lawful normative issues. Furthermore, independent monitoring platforms are necessary to ensure transparency, accountability, and responsible AI development.

arXiv.org

AI Enabled User-Specific Cyberbullying Severity Detection with Explainability arxiv.org/abs/2503.10650 .LG .CL .CY

The rise of social media has significantly increased the prevalence of cyberbullying (CB), posing serious risks to both mental and physical well-being. Effective detection systems are essential for mitigating its impact. While several machine learning (ML) models have been developed, few incorporate victims' psychological, demographic, and behavioral factors alongside bullying comments to assess severity. In this study, we propose an AI model integrating user-specific attributes, including psychological factors (self-esteem, anxiety, depression), online behavior (internet usage, disciplinary history), and demographic attributes (race, gender, ethnicity), along with social media comments. Additionally, we introduce a re-labeling technique that categorizes social media comments into three severity levels: Not Bullying, Mild Bullying, and Severe Bullying, considering user-specific factors. Our LSTM model is trained using 146 features, incorporating emotional, topical, and word2vec representations of social media comments as well as user-level attributes, and it outperforms existing baseline models, achieving the highest accuracy of 98% and an F1-score of 0.97. To identify key factors influencing the severity of cyberbullying, we employ explainable AI techniques (SHAP and LIME) to interpret the model's decision-making process. Our findings reveal that, beyond hate comments, victims belonging to specific racial and gender groups are more frequently targeted and exhibit higher incidences of depression, disciplinary issues, and low self-esteem. Additionally, individuals with a prior history of bullying are at a greater risk of becoming victims of cyberbullying.

Evaluating Local and Cloud-Based Large Language Models for Simulating Consumer Choices in Energy Stated Preference Surveys arxiv.org/abs/2503.10652 .CL .AI .CY

Survey research is essential in energy demand studies for capturing consumer preferences and informing policy decisions. Stated preference (SP) surveys, in particular, analyse how individuals make trade-offs in hypothetical scenarios. However, traditional survey methods are costly, time-consuming, and affected by biases and respondent fatigue. Large language models (LLMs) have emerged as a potential tool to address these challenges by generating human-like textual responses. This study investigates the ability of LLMs to simulate consumer choices in energy-related SP surveys. A series of test scenarios evaluated the simulation performance of LLMs at both individual and aggregated levels, considering factors in the prompt, in-context learning (ICL), chain-of-thought (CoT) reasoning, the comparison between local and cloud-based LLMs, integration with traditional choice models, and potential biases. Results indicate that while LLMs achieve an average accuracy of up to 48%, surpassing random guessing, their performance remains insufficient for practical application. Local and cloud-based LLMs perform similarly in simulation accuracy but exhibit differences in adherence to prompt requirements and susceptibility to social desirability biases. Findings suggest that previous SP choices are the most effective input factor, while longer prompts with varied factor formats may reduce accuracy. Furthermore, the traditional mixed logit choice model outperforms LLMs and provides insights for refining LLM prompts. Despite their limitations, LLMs provide scalability and efficiency advantages, requiring minimal historical data compared to traditional survey methods. Future research should refine prompt structures, further investigate CoT reasoning, and explore fine-tuning techniques to improve LLM-based energy survey simulations.

Video Anomaly Detection with Structured Keywords arxiv.org/abs/2503.10653 .CV .AI .LG

This paper focuses on detecting anomalies in surveillance video using keywords by leveraging foundational models' feature representation generalization capabilities. We present a novel, lightweight pipeline for anomaly classification using keyword weights. Our pipeline employs a two-stage process: induction followed by deduction. In induction, descriptions are generated from normal and anomalous frames to identify and assign weights to relevant keywords. In deduction, inference frame descriptions are converted into keyword encodings using induction-derived weights for input into our neural network for anomaly classification. We achieved comparable performance on the three benchmarks UCSD Ped2, Shanghai Tech, and CUHK Avenue, with ROC AUC scores of 0.865, 0.745, and 0.742, respectively. These results are achieved without temporal context, making such a system viable for real-time applications. Our model improves implementation setup, interpretability, and inference speed for surveillance devices on the edge, introducing a performance trade-off against other video anomaly detection systems. As the generalization capabilities of open-source foundational models improve, our model demonstrates that the exclusive use of text for feature representations is a promising direction for efficient real-time interpretable video anomaly detection.
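
The induction and deduction stages described above can be illustrated with a small sketch. The abstract does not say how keyword weights are computed or what the classifier looks like, so everything here is an assumption: the hypothetical `induce_weights` scores each keyword by the difference in document frequency between anomalous and normal descriptions, and the hypothetical `encode` turns a new frame description into a fixed-length vector of those weights. The paper feeds such encodings into a neural network, which is not reproduced here.

```python
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into bare words."""
    return [w.strip(".,").lower() for w in text.split() if w.strip(".,")]


def induce_weights(normal_descs, anomalous_descs):
    """Induction (assumed scheme): weight = share of anomalous descriptions
    containing the keyword minus the share of normal descriptions containing it."""
    def doc_freq(descs):
        counts = Counter()
        for desc in descs:
            counts.update(set(tokenize(desc)))
        return {word: count / len(descs) for word, count in counts.items()}

    freq_normal = doc_freq(normal_descs)
    freq_anomalous = doc_freq(anomalous_descs)
    vocab = sorted(set(freq_normal) | set(freq_anomalous))
    return {w: freq_anomalous.get(w, 0.0) - freq_normal.get(w, 0.0) for w in vocab}


def encode(description, weights):
    """Deduction (assumed scheme): one slot per known keyword, holding its
    weight if the keyword occurs in the description and 0.0 otherwise."""
    present = set(tokenize(description))
    return [weights[w] if w in present else 0.0 for w in sorted(weights)]
```

Under this scheme, keywords that appear mostly in anomalous descriptions get positive weights and keywords typical of normal frames get negative ones, so the encoding stays directly readable, which is in the spirit of the interpretability claim in the abstract.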

Improving RAG Retrieval via Propositional Content Extraction: a Speech Act Theory Approach arxiv.org/abs/2503.10654 .CL .AI .IR

When users formulate queries, they often include not only the information they seek, but also pragmatic markers such as interrogative phrasing or polite requests. Although these speech act indicators communicate the user's intent -- whether it is asking a question, making a request, or stating a fact -- they do not necessarily add to the core informational content of the query itself. This paper investigates whether extracting the underlying propositional content from user utterances -- essentially stripping away the linguistic markers of intent -- can improve retrieval quality in Retrieval-Augmented Generation (RAG) systems. Drawing upon foundational insights from speech act theory, we propose a practical method for automatically transforming queries into their propositional equivalents before embedding. To assess the efficacy of this approach, we conducted an experimental study involving 63 user queries related to a Brazilian telecommunications news corpus with precomputed semantic embeddings. Results demonstrate clear improvements in semantic similarity between query embeddings and document embeddings at top ranks, confirming that queries stripped of speech act indicators more effectively retrieve relevant content.
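
To make the idea of stripping speech act indicators before embedding concrete, here is a minimal sketch. The abstract does not describe how the authors perform the transformation, so this rule-based version is only a stand-in: the marker list `_MARKERS` and the function `strip_speech_act_markers` are hypothetical, English-only, and far from exhaustive.

```python
import re

# Hypothetical, non-exhaustive patterns for greetings, polite requests,
# and interrogative framing. These are illustrative assumptions.
_MARKERS = [
    r"^(hi|hello|hey)[,!.\s]+",
    r"^(please|kindly)[,\s]+",
    r"^(could|can|would) you (please )?(tell me|explain|show me|let me know)\s+",
    r"^i (would like|want|need) to know\s+",
    r"^do you know\s+",
    r"[,\s]*(please|thanks|thank you)[.!?\s]*$",
]


def strip_speech_act_markers(query):
    """Remove leading/trailing intent markers, keeping the propositional core."""
    text = query.strip()
    changed = True
    while changed:  # repeat until no pattern applies, since markers can stack
        changed = False
        for pattern in _MARKERS:
            new_text = re.sub(pattern, "", text, flags=re.IGNORECASE).strip()
            if new_text != text:
                text, changed = new_text, True
    # Drop terminal punctuation that only signals the utterance type.
    return text.rstrip("?!. ").strip()
```

In a retrieval pipeline, the cleaned string would be passed to the embedding model in place of the raw query; a query that contains none of the listed markers passes through unchanged apart from trailing punctuation.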

Adaptive Deadlock Avoidance for Decentralized Multi-agent Systems via CBF-inspired Risk Measurement arxiv.org/abs/2503.09621 .SY .RO .SY

Decentralized safe control plays an important role in multi-agent systems given the scalability and robustness without reliance on a central authority. However, without an explicit global coordinator, the decentralized control methods are often prone to deadlock -- a state where the system reaches equilibrium, causing the robots to stall. In this paper, we propose a generalized decentralized framework that unifies the Control Lyapunov Function (CLF) and Control Barrier Function (CBF) to facilitate efficient task execution and ensure deadlock-free trajectories for the multi-agent systems. As the agents approach the deadlock-related undesirable equilibrium, the framework can detect the equilibrium and drive agents away before that happens. This is achieved by a secondary deadlock resolution design with an auxiliary CBF to prevent the multi-agent systems from converging to the undesirable equilibrium. To avoid dominating effects due to the deadlock resolution over the original task-related controllers, a deadlock indicator function using CBF-inspired risk measurement is proposed and encoded in the unified framework for the agents to adaptively determine when to activate the deadlock resolution. This allows the agents to follow their original control tasks and seamlessly unlock or deactivate deadlock resolution as necessary, effectively improving task efficiency. We demonstrate the effectiveness of the proposed method through theoretical analysis, numerical simulations, and real-world experiments.
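
For readers unfamiliar with Control Barrier Functions, the following sketch shows the basic safety-filter idea in the simplest possible setting. It is not the paper's framework: it assumes a single agent with one-dimensional dynamics x' = u, a hypothetical position limit `x_max`, the barrier h(x) = x_max - x, and a linear class-K term with gain `alpha`. The CBF condition h' >= -alpha * h then reduces to u <= alpha * (x_max - x), so the minimally invasive safe input has a closed form.

```python
def cbf_filter(u_nom, x, x_max, alpha=1.0):
    """Return the input closest to u_nom that satisfies the CBF condition
    u <= alpha * (x_max - x) for the assumed dynamics x' = u."""
    return min(u_nom, alpha * (x_max - x))


def simulate(x0, u_nom, x_max, alpha=1.0, dt=0.01, steps=2000):
    """Forward-Euler rollout of x' = u with the filter applied at every step."""
    x = x0
    for _ in range(steps):
        x += dt * cbf_filter(u_nom, x, x_max, alpha)
    return x
```

Far from the limit the nominal input passes through untouched, and the filter only intervenes as the state approaches `x_max`. In multi-agent settings this kind of filter can cancel the task-driven input entirely, which is one way to picture the deadlock-related equilibria that the abstract refers to.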

Dynamics-Invariant Quadrotor Control using Scale-Aware Deep Reinforcement Learning arxiv.org/abs/2503.09622 .SY .RO .SY

Due to dynamic variations such as changing payload, aerodynamic disturbances, and varying platforms, a robust solution for quadrotor trajectory tracking remains challenging. To address these challenges, we present a deep reinforcement learning (DRL) framework that achieves physical dynamics invariance by directly optimizing force/torque inputs, eliminating the need for traditional intermediate control layers. Our architecture integrates a temporal trajectory encoder, which processes finite-horizon reference positions/velocities, with a latent dynamics encoder trained on historical state-action pairs to model platform-specific characteristics. Additionally, we introduce scale-aware dynamics randomization parameterized by the quadrotor's arm length, enabling our approach to maintain stability across drones spanning from 30g to 2.1kg and outperform other DRL baselines by 85% in tracking accuracy. Extensive real-world validation of our approach on the Crazyflie 2.1 quadrotor, encompassing over 200 flights, demonstrates robust adaptation to wind, ground effects, and swinging payloads while achieving less than 0.05m RMSE at speeds up to 2.0 m/s. This work introduces a universal quadrotor control paradigm that compensates for dynamic discrepancies across varied conditions and scales, paving the way for more resilient aerial systems.

Certainly Bot Or Not? Trustworthy Social Bot Detection via Robust Multi-Modal Neural Processes arxiv.org/abs/2503.09626 .SI .AI .LG

Social bot detection is crucial for mitigating misinformation, online manipulation, and coordinated inauthentic behavior. While existing neural network-based detectors perform well on benchmarks, they struggle with generalization due to distribution shifts across datasets and frequently produce overconfident predictions for out-of-distribution accounts beyond the training data. To address this, we introduce a novel Uncertainty Estimation for Social Bot Detection (UESBD) framework, which quantifies the predictive uncertainty of detectors beyond mere classification. For this task, we propose Robust Multi-modal Neural Processes (RMNP), which aims to enhance the robustness of multi-modal neural processes to modality inconsistencies caused by social bot camouflage. RMNP first learns unimodal representations through modality-specific encoders. Then, unimodal attentive neural processes are employed to encode the Gaussian distribution of unimodal latent variables. Furthermore, because social bots may steal human features to camouflage themselves, causing certain modalities to provide conflicting information, we introduce an evidential gating network to explicitly model the reliability of modalities. The joint latent distribution is learned through the generalized product of experts, which takes the reliability of each modality into consideration during fusion. The final prediction is obtained through Monte Carlo sampling of the joint latent distribution followed by a decoder. Experiments on three real-world benchmarks show the effectiveness of RMNP in classification and uncertainty estimation, as well as its robustness to modality conflicts.
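
The fusion step mentioned above can be illustrated with the textbook generalized product of experts for scalar Gaussians. This is a sketch under assumptions, not the RMNP implementation: the function name `generalized_poe` is hypothetical, the latent variables are treated as one-dimensional, and the reliability weights are passed in by hand, whereas the paper obtains them from an evidential gating network. Each expert's precision is scaled by its reliability before the precisions are summed.

```python
def generalized_poe(means, variances, reliabilities):
    """Fuse scalar Gaussian experts N(mean_i, variance_i), each raised to the
    power reliability_i, into a single Gaussian. Returns (mean, variance).
    At least one reliability must be positive."""
    precisions = [beta / var for beta, var in zip(reliabilities, variances)]
    total_precision = sum(precisions)
    fused_mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    return fused_mean, 1.0 / total_precision
```

With equal reliabilities this reduces to the ordinary product of Gaussians, and a modality with reliability 0 drops out of the fusion entirely, which matches the intuition of down-weighting a camouflaged modality.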
