arXiv Computer Science @arxiv_cs@qoto.org

1.13K Followers

Bot

I toot the arXiv feed for topics in Computer Science.

#ComputerScience #CS #Programming #SoftwareEngineering #Software #SoftwareDevelopment #Computers #Science #arXiv #News #PeerReview

Joined Jul 2018

2 Following 1.13K Followers

Posts Posts and replies Media

arXiv Computer Science @arxiv_cs@qoto.org

What Every Computer Scientist Needs To Know About Parallelization https://arxiv.org/abs/2504.03647 #cs.DC

What Every Computer Scientist Needs To Know About Parallelization

Parallelization has become a cornerstone of modern computing, influencing everything from high performance supercomputers to everyday mobile devices. This paper presents a comprehensive guide on the fundamentals of parallelization that every computer scientist should know, beginning with a historical perspective that traces the evolution from early theoretical models such as PRAM and BSP to today's advanced multicore and heterogeneous architectures. We explore essential theoretical frameworks, practical paradigms, and synchronization mechanisms while discussing implementation strategies using processes, threads, and modern models like the Actor framework. Additionally, we examine how hardware components including CPUs, caches, memory, and accelerators interact with software to impact performance, scalability, and load balancing. This work demystifies parallel programming by integrating historical context, theoretical underpinnings, and practical case studies. It equips readers with the tools to design, optimize, and troubleshoot parallel applications in an increasingly concurrent computing landscape.

arXiv Computer Science @arxiv_cs@qoto.org

AIBrix: Towards Scalable, Cost-Effective Large Language Model Inference Infrastructure https://arxiv.org/abs/2504.03648 #cs.DC #cs.AI

AIBrix: Towards Scalable, Cost-Effective Large Language Model Inference Infrastructure

We introduce AIBrix, a cloud-native, open-source framework designed to optimize and simplify large-scale LLM deployment in cloud environments. Unlike traditional cloud-native stacks, AIBrix follows a co-design philosophy, ensuring every layer of the infrastructure is purpose-built for seamless integration with inference engines like vLLM. AIBrix introduces several key innovations to reduce inference costs and enhance performance including high-density LoRA management for dynamic adapter scheduling, LLM-specific autoscalers, and prefix-aware, load-aware routing. To further improve efficiency, AIBrix incorporates a distributed KV cache, boosting token reuse across nodes, leading to a 50% increase in throughput and a 70% reduction in inference latency. AIBrix also supports unified AI runtime which streamlines model management while maintaining vendor-agnostic engine compatibility. For large-scale multi-node inference, AIBrix employs hybrid orchestration -- leveraging Kubernetes for coarse-grained scheduling and Ray for fine-grained execution -- to balance efficiency and flexibility. Additionally, an SLO-driven GPU optimizer dynamically adjusts resource allocations, optimizing heterogeneous serving to maximize cost efficiency while maintaining service guarantees. Finally, AIBrix enhances system reliability with AI accelerator diagnostic tools, enabling automated failure detection and mock-up testing to improve fault resilience. AIBrix is available at https://github.com/vllm-project/aibrix.

arXiv Computer Science @arxiv_cs@qoto.org

Diagnostic Method for Hydropower Plant Condition-based Maintenance combining Autoencoder with Clustering Algorithms https://arxiv.org/abs/2504.03649 #cs.AI #cs.LG #cs.NE

Diagnostic Method for Hydropower Plant Condition-based Maintenance combining Autoencoder with Clustering Algorithms

The French company EDF uses supervisory control and data acquisition systems in conjunction with a data management platform to monitor hydropower plant, allowing engineers and technicians to analyse the time-series collected. Depending on the strategic importance of the monitored hydropower plant, the number of time-series collected can vary greatly making it difficult to generate valuable information from the extracted data. In an attempt to provide an answer to this particular problem, a condition detection and diagnosis method combining clustering algorithms and autoencoder neural networks for pattern recognition has been developed and is presented in this paper. First, a dimension reduction algorithm is used to create a 2-or 3-dimensional projection that allows the users to identify unsuspected relationships between datapoints. Then, a collection of clustering algorithms regroups the datapoints into clusters. For each identified cluster, an autoencoder neural network is trained on the corresponding dataset. The aim is to measure the reconstruction error between each autoencoder model and the measured values, thus creating a proximity index for each state discovered during the clustering stage.

arXiv Computer Science @arxiv_cs@qoto.org

BoxRL-NNV: Boxed Refinement of Latin Hypercube Samples for Neural Network Verification https://arxiv.org/abs/2504.03650 #cs.LG #cs.AI

BoxRL-NNV: Boxed Refinement of Latin Hypercube Samples for Neural Network Verification

BoxRL-NNV is a Python tool for the detection of safety violations in neural networks by computing the bounds of the output variables, given the bounds of the input variables of the network. This is done using global extrema estimation via Latin Hypercube Sampling, and further refinement using L-BFGS-B for local optimization around the initial guess. This paper presents an overview of BoxRL-NNV, as well as our results for a subset of the ACAS Xu benchmark. A complete evaluation of the tool's performance, including benchmark comparisons with state-of-the-art tools, shall be presented at the Sixth International Verification of Neural Networks Competition (VNN-COMP'25).

arXiv Computer Science @arxiv_cs@qoto.org

Echo: Efficient Co-Scheduling of Hybrid Online-Offline Tasks for Large Language Model Serving https://arxiv.org/abs/2504.03651 #cs.DC #cs.AI #cs.LG

Echo: Efficient Co-Scheduling of Hybrid Online-Offline Tasks for Large Language Model Serving

Large language models have been widely deployed in various applications, encompassing both interactive online tasks and batched offline tasks. Given the burstiness and latency sensitivity of online tasks, over-provisioning resources is common practice. This allows for the integration of latency-insensitive offline tasks during periods of low online load, enhancing resource utilization. However, strategically serving online and offline tasks through a preemption mechanism fails to fully leverage the flexibility of offline tasks and suffers from KV cache recomputation and irregular workloads. In this paper, we introduce Echo, a collaborative online-offline task serving system, including a scheduler, a KV cache manager, and estimation toolkits. The scheduler and KV cache manager work tightly to maximize the throughput of offline tasks, while the estimator further predicts execution time to ensure online task SLOs. The scheduler leverages the batch information of last iteration to reduce the search space for finding the optimal schedule. The KV cache manager sets the priority of the KV cache based on the type of tasks and the opportunity of prefix sharing to reduce the recomputation. Finally, the estimation toolkits predict the execution time, future memory consumption, and the throughput of offline tasks to guide the scheduler, KV cache manager, and the system deployer. Evaluation based on real-world workloads demonstrates that Echo can increase offline task throughput by up to $3.3\times$, while satisfying online task SLOs.

arXiv Computer Science @arxiv_cs@qoto.org

A Modern Approach to Real-Time Air Traffic Management System https://arxiv.org/abs/2504.03652 #cs.DC #cs.LG

A Modern Approach to Real-Time Air Traffic Management System

Air traffic analytics systems are pivotal for ensuring safety, efficiency, and predictability in air travel. However, traditional systems struggle to handle the increasing volume and complexity of air traffic data. This project explores the application of real-time big data processing frameworks like Apache Spark, HDFS, and Spark Streaming for developing a new robust system. By reviewing existing research on real-time systems and analyzing the challenges and opportunities presented by big data technologies, we propose an architecture for a real-time system. Our project pipeline involves real-time data collection from flight information sources through flight API's, ingestion into Kafka, and transmission to Elasticsearch for visualization using Kibana. Additionally, we present a dashboard of U.S. airlines on PowerBI, demonstrating the potential of real-time analytics in revolutionizing air traffic management.

arXiv Computer Science @arxiv_cs@qoto.org

A Survey on Heterogeneous Computing Using SmartNICs and Emerging Data Processing Units (Expanded Preprint) https://arxiv.org/abs/2504.03653 #cs.DC #cs.NI

A Survey on Heterogeneous Computing Using SmartNICs and Emerging Data Processing Units (Expanded Preprint)

The emergence of new, off-path smart network cards (SmartNICs), known generally as Data Processing Units (DPU), has opened a wide range of research opportunities. Of particular interest is the use of these and related devices in tandem with their host's CPU, creating a heterogeneous computing system with new properties and strengths to be explored, capable of accelerating a wide variety of workloads. This survey begins by providing background information to this new field, such as discussing its origins, its motivations and challenges, listing a few of the current market offerings for DPUs, and providing some brief information about the major programming languages and frameworks for using them. Then, we review and categorize a number of recent works in the field, covering a wide variety of studies, benchmarks, and application areas such as in data center infrastructure, commercial uses, and AI and ML acceleration.

arXiv Computer Science @arxiv_cs@qoto.org

PointSplit: Towards On-device 3D Object Detection with Heterogeneous Low-power Accelerators https://arxiv.org/abs/2504.03654 #cs.DC #cs.AI #cs.CV

PointSplit: Towards On-device 3D Object Detection with Heterogeneous Low-power Accelerators

Running deep learning models on resource-constrained edge devices has drawn significant attention due to its fast response, privacy preservation, and robust operation regardless of Internet connectivity. While these devices already cope with various intelligent tasks, the latest edge devices that are equipped with multiple types of low-power accelerators (i.e., both mobile GPU and NPU) can bring another opportunity; a task that used to be too heavy for an edge device in the single-accelerator world might become viable in the upcoming heterogeneous-accelerator world.To realize the potential in the context of 3D object detection, we identify several technical challenges and propose PointSplit, a novel 3D object detection framework for multi-accelerator edge devices that addresses the problems. Specifically, our PointSplit design includes (1) 2D semantics-aware biased point sampling, (2) parallelized 3D feature extraction, and (3) role-based group-wise quantization. We implement PointSplit on TensorFlow Lite and evaluate it on a customized hardware platform comprising both mobile GPU and EdgeTPU. Experimental results on representative RGB-D datasets, SUN RGB-D and Scannet V2, demonstrate that PointSplit on a multi-accelerator device is 24.7 times faster with similar accuracy compared to the full-precision, 2D-3D fusion-based 3D detector on a GPU-only device.

arXiv Computer Science @arxiv_cs@qoto.org

Memory and Bandwidth are All You Need for Fully Sharded Data Parallel https://arxiv.org/abs/2504.03655 #cs.DC #cs.LG

Memory and Bandwidth are All You Need for Fully Sharded Data Parallel

Transformer models have revolutionized a wide spectrum of disciplines, especially in language processing. The recent success has proven that model size scalability is crucial for achieving superior performance metrics. However, training large transformer models is challenging even on modern hardware with powerful GPUs and high-speed interconnects. Existing studies primarily focus on optimizing model training distribution strategies to minimize memory footprint and enhance training speed, often overlooking the scalability challenges related to model size and hardware constraints. To address this oversight, we thoroughly investigate computational, memory, and network demands of training large transformers using the Fully Sharded Data Parallel (FSDP) distributed strategy across different hardware clusters. We explore the intricate relationships between model size and hardware setups to identify configurations that ensure maximum model and hardware efficiency, effective sequence length management, and optimal training throughput. A significant finding of our study is the critical interplay of the cluster's connection bandwidth and GPU memory size compared to the computational performance of GPUs. This interplay limits training efficiency, underscoring the role of both hardware characteristics as a possible bottleneck. By integrating theoretical analysis with simulations and empirical tests, we demonstrate how hardware limitations affect training efficacy, identifying key hardware thresholds and the impact of network connectivity. Our findings prompt a reassessment of training strategies guiding users on the way to finding hardware-optimal FSDP configurations, enhancing training efficiency for large-scale transformer models.

arXiv Computer Science @arxiv_cs@qoto.org

Comparative Analysis of Lightweight Kubernetes Distributions for Edge Computing: Performance and Resource Efficiency https://arxiv.org/abs/2504.03656 #cs.DC

Comparative Analysis of Lightweight Kubernetes Distributions for Edge Computing: Performance and Resource Efficiency

Edge computing environments increasingly rely on lightweight container orchestration platforms to manage resource-constrained devices. This paper provides an empirical analysis of five lightweight kubernetes distributions (KD)(k0s, k3s, KubeEdge, OpenYurt, and Kubernetes (k8s)) focusing on their performance and resource efficiency in edge computing scenarios. We evaluated key metrics such as CPU, memory, disk usage, throughput, and latency under varying workloads, utilizing a testbed of Intel NUCs and Raspberry Pi devices. Our results demonstrate significant differences in performance: k3s exhibited the lowest resource consumption, while k0s and k8s excelled in data plane throughput and latency. Under heavy stress scenarios, k3s and k0s accomplished the same workloads faster than the other distributions. OpenYurt offered balanced performance, suitable for hybrid cloud-edge use cases, but was less efficient in terms of resource usage and scalability compared to k0s, k3s and k8s. KubeEdge, although feature-rich for edge environments, exhibited higher resource consumption and lower scalability. These findings offer valuable insights for developers and operators selecting appropriate KD based on specific performance and resource efficiency requirements for edge computing environments.

arXiv Computer Science @arxiv_cs@qoto.org

A Class of Hierarchical Sliding Mode Control based on Extended Kalman filter for Quadrotor UAVs https://arxiv.org/abs/2504.02851 #eess.SY #cs.RO #cs.SY

A Class of Hierarchical Sliding Mode Control based on Extended Kalman filter for Quadrotor UAVs

This study introduces a novel methodology for controlling Quadrotor Unmanned Aerial Vehicles, focusing on Hierarchical Sliding Mode Control strategies and an Extended Kalman Filter. Initially, an EKF is proposed to enhance robustness in estimating UAV states, thereby reducing the impact of measured noises and external disturbances. By locally linearizing UAV systems, the EKF can mitigate the disadvantages of the Kalman filter and reduce the computational cost of other nonlinear observers. Subsequently, in comparison to other related work in terms of stability and computational cost, the HSMC framework shows its outperformance in allowing the quadrotor UAVs to track the references. Three types of HSMC Aggregated HSMC, Incremental HSMC, and Combining HSMC are investigated for their effectiveness in tracking reference trajectories. Moreover, the stability of the quadrotor UAVs is rigorously analyzed using the Lyapunov stability principle. Finally, experimental results and comparative analyses demonstrate the efficacy and feasibility of the proposed methodologies.

arXiv Computer Science @arxiv_cs@qoto.org

Curvature-Constrained Vector Field for Motion Planning of Nonholonomic Robots https://arxiv.org/abs/2504.02852 #eess.SY #cs.RO #cs.SY

Curvature-Constrained Vector Field for Motion Planning of Nonholonomic Robots

Vector fields are advantageous in handling nonholonomic motion planning as they provide reference orientation for robots. However, additionally incorporating curvature constraints becomes challenging, due to the interconnection between the design of the curvature-bounded vector field and the tracking controller under underactuation. In this paper, we present a novel framework to co-develop the vector field and the control laws, guiding the nonholonomic robot to the target configuration with curvature-bounded trajectory. First, we formulate the problem by introducing the target positive limit set, which allows the robot to converge to or pass through the target configuration, depending on different dynamics and tasks. Next, we construct a curvature-constrained vector field (CVF) via blending and distributing basic flow fields in workspace and propose the saturated control laws with a dynamic gain, under which the tracking error's magnitude decreases even when saturation occurs. Under the control laws, kinematically constrained nonholonomic robots are guaranteed to track the reference CVF and converge to the target positive limit set with bounded trajectory curvature. Numerical simulations show that the proposed CVF method outperforms other vector-field-based algorithms. Experiments on Ackermann UGVs and semi-physical fixed-wing UAVs demonstrate that the method can be effectively implemented in real-world scenarios.

arXiv Computer Science @arxiv_cs@qoto.org

Mapping Technological Futures: Anticipatory Discourse Through Text Mining https://arxiv.org/abs/2504.02853 #cs.SI #cs.CL #cs.CY

Mapping Technological Futures: Anticipatory Discourse Through Text Mining

The volatility and unpredictability of emerging technologies, such as artificial intelligence (AI), generate significant uncertainty, which is widely discussed on social media. This study examines anticipatory discourse surrounding technological futures by analysing 1.5 million posts from 400 key opinion leaders (KOLs) published on the X platform (from 2021 to 2023). Using advanced text mining techniques, including BERTopic modelling, sentiment, emotion, and attitude analyses, the research identifies 100 distinct topics reflecting anticipated tech-driven futures. Our findings emphasize the dual role of KOLs in framing \textit{present futures} -- optimistic visions of transformative technologies like AI and IoT -- and influencing \textit{future presents}, where these projections shape contemporary societal and geopolitical debates. Positive emotions such as Hope dominate, outweighing Anxiety, particularly in topics like ``Machine Learning, Data Science, and Deep Learning,'' while discussions around ``Climate Change'' and ``War, Ukraine, and Trump People'' elicit \textit{Anxiety}. By framing technologies as solutions to societal challenges, KOLs act as mediators of societal narratives, bridging imagined futures and current realities. These insights underscore their pivotal role in directing public attention with emerging technologies during periods of heightened uncertainty, advancing our understanding of anticipatory discourse in technology-mediated contexts.

arXiv Computer Science @arxiv_cs@qoto.org

Exploration of Multi-Element Collaborative Research and Application for Modern Power System Based on Generative Large Models https://arxiv.org/abs/2504.02855 #eess.SY #cs.AI #cs.SY

Exploration of Multi-Element Collaborative Research and Application for Modern Power System Based on Generative Large Models

The transition to intelligent, low-carbon power systems necessitates advanced optimization strategies for managing renewable energy integration, energy storage, and carbon emissions. Generative Large Models (GLMs) provide a data-driven approach to enhancing forecasting, scheduling, and market operations by processing multi-source data and capturing complex system dynamics. This paper explores the role of GLMs in optimizing load-side management, energy storage utilization, and electricity carbon, with a focus on Smart Wide-area Hybrid Energy Systems with Storage and Carbon (SGLSC). By leveraging spatiotemporal modeling and reinforcement learning, GLMs enable dynamic energy scheduling, improve grid stability, enhance carbon trading strategies, and strengthen resilience against extreme weather events. The proposed framework highlights the transformative potential of GLMs in achieving efficient, adaptive, and low-carbon power system operations.

arXiv Computer Science @arxiv_cs@qoto.org

The epistemic dimension of algorithmic fairness: assessing its impact in innovation diffusion and fair policy making https://arxiv.org/abs/2504.02856 #cs.CY #cs.SI

The epistemic dimension of algorithmic fairness: assessing its impact in innovation diffusion and fair policy making

Algorithmic fairness is an expanding field that addresses a range of discrimination issues associated with algorithmic processes. However, most works in the literature focus on analyzing it only from an ethical perspective, focusing on moral principles and values that should be considered in the design and evaluation of algorithms, while disregarding the epistemic dimension related to knowledge transmission and validation. However, this aspect of algorithmic fairness should also be included in the debate, as it is crucial to introduce a specific type of harm: an individual may be systematically excluded from the dissemination of knowledge due to the attribution of a credibility deficit/excess. In this work, we specifically focus on characterizing and analyzing the impact of this credibility deficit or excess on the diffusion of innovations on a societal scale, a phenomenon driven by individual attitudes and social interactions, and also by the strength of mutual connections. Indeed, discrimination might shape the latter, ultimately modifying how innovations spread within the network. In this light, to incorporate, also from a formal point of view, the epistemic dimension in innovation diffusion models becomes paramount, especially if these models are intended to support fair policy design. For these reasons, we formalize the epistemic properties of a social environment, by extending the well-established Linear Threshold Model (LTM) in an epistemic direction to show the impact of epistemic biases in innovation diffusion. Focusing on the impact of epistemic bias in both open-loop and closed-loop scenarios featuring optimal fostering policies, our results shed light on the pivotal role the epistemic dimension might have in the debate of algorithmic fairness in decision-making.

arXiv Computer Science @arxiv_cs@qoto.org

Optimizing Humor Generation in Large Language Models: Temperature Configurations and Architectural Trade-offs https://arxiv.org/abs/2504.02858 #cs.CL #cs.LG

Optimizing Humor Generation in Large Language Models: Temperature Configurations and Architectural Trade-offs

Large language models (LLMs) demonstrate increasing capabilities in creative text generation, yet systematic evaluations of their humor production remain underexplored. This study presents a comprehensive analysis of 13 state-of-the-art LLMs across five architectural families, evaluating their performance in generating technically relevant humor for software developers. Through a full factorial design testing 715 unique configurations of temperature settings and prompt variations, we assess model outputs using five weighted criteria: humor quality, domain relevance, concept originality, tone precision, and delivery efficiency. Our methodology employs rigorous statistical analysis including ANOVA, correlation studies, and quadratic regression to identify optimal configurations and architectural influences. Results reveal significant performance variations across models, with certain architectures achieving 21.8% superiority over baseline systems. Temperature sensitivity analysis demonstrates that 73% of models achieve peak performance at lower stochasticity settings (<= 0.5), though optimal ranges vary substantially by architecture. We identify distinct model clusters: compact high-performers maintaining efficiency-quality balance versus verbose specialists requiring longer outputs for marginal gains. Statistical validation confirms model architecture explains 38.7% of performance variance, with significant correlations between humor quality and concept originality. The study establishes practical guidelines for model selection and configuration, demonstrating how temperature adjustments and architectural considerations impact humor generation effectiveness. These findings advance understanding of LLM capabilities in creative technical writing and provide empirically validated configuration strategies for developers implementing humor-generation systems.

arXiv Computer Science @arxiv_cs@qoto.org

AI-Enhanced Resilience in Power Systems: Adversarial Deep Learning for Robust Short-Term Voltage Stability Assessment under Cyber-Attacks https://arxiv.org/abs/2504.02859 #eess.SY #cs.SY

AI-Enhanced Resilience in Power Systems: Adversarial Deep Learning for Robust Short-Term Voltage Stability Assessment under Cyber-Attacks

In the era of Industry 4.0, ensuring the resilience of cyber-physical systems against sophisticated cyber threats is increasingly critical. This study proposes a pioneering AI-based control framework that enhances short-term voltage stability assessments (STVSA) in power systems under complex composite cyber-attacks. First, by incorporating white-box and black-box adversarial attacks with Denial-of-Service (DoS) perturbations during training, composite adversarial attacks are implemented. Second, the application of Spectral Normalized Conditional Wasserstein Generative Adversarial Network with Gradient Penalty (SNCWGAN-GP) and Fast Gradient Sign Method (FGSM) strengthens the model's resistance to adversarial disturbances, improving data quality and training stability. Third, an assessment model based on Long Short-Term Memory (LSTM)-enhanced Graph Attention Network (L-GAT) is developed to capture dynamic relationships between the post-fault dynamic trajectories and electrical grid topology. Experimental results on the IEEE 39-bus test system demonstrate the efficacy and superiority of the proposed method in composite cyber-attack scenarios. This contribution is pivotal to advancing AI-based resilient control strategies for nonlinear dynamical systems, marking a substantial enhancement in the security of cyber-physical systems.

arXiv Computer Science @arxiv_cs@qoto.org

Computer Vision and Deep Learning for 4D Augmented Reality https://arxiv.org/abs/2504.02860 #cs.CV #cs.AI

Computer Vision and Deep Learning for 4D Augmented Reality

The prospect of 4D video in Extended Reality (XR) platform is huge and exciting, it opens a whole new way of human computer interaction and the way we perceive the reality and consume multimedia. In this thesis, we have shown that feasibility of rendering 4D video in Microsoft mixed reality platform. This enables us to port any 3D performance capture from CVSSP into XR product like the HoloLens device with relative ease. However, if the 3D model is too complex and is made up of millions of vertices, the data bandwidth required to port the model is a severe limitation with the current hardware and communication system. Therefore, in this project we have also developed a compact representation of both shape and appearance of the 4d video sequence using deep learning models to effectively learn the compact representation of 4D video sequence and reconstruct it without affecting the shape and appearance of the video sequence.

arXiv Computer Science @arxiv_cs@qoto.org

Towards Understanding How Knowledge Evolves in Large Vision-Language Models https://arxiv.org/abs/2504.02862 #cs.CV #cs.LG

Towards Understanding How Knowledge Evolves in Large Vision-Language Models

Large Vision-Language Models (LVLMs) are gradually becoming the foundation for many artificial intelligence applications. However, understanding their internal working mechanisms has continued to puzzle researchers, which in turn limits the further enhancement of their capabilities. In this paper, we seek to investigate how multimodal knowledge evolves and eventually induces natural languages in LVLMs. We design a series of novel strategies for analyzing internal knowledge within LVLMs, and delve into the evolution of multimodal knowledge from three levels, including single token probabilities, token probability distributions, and feature encodings. In this process, we identify two key nodes in knowledge evolution: the critical layers and the mutation layers, dividing the evolution process into three stages: rapid evolution, stabilization, and mutation. Our research is the first to reveal the trajectory of knowledge evolution in LVLMs, providing a fresh perspective for understanding their underlying mechanisms. Our codes are available at https://github.com/XIAO4579/Vlm-interpretability.

arXiv Computer Science @arxiv_cs@qoto.org

GS_DravidianLangTech@2025: Women Targeted Abusive Texts Detection on Social Media https://arxiv.org/abs/2504.02863 #cs.CL #cs.SI

GS_DravidianLangTech@2025: Women Targeted Abusive Texts Detection on Social Media

The increasing misuse of social media has become a concern; however, technological solutions are being developed to moderate its content effectively. This paper focuses on detecting abusive texts targeting women on social media platforms. Abusive speech refers to communication intended to harm or incite hatred against vulnerable individuals or groups. Specifically, this study aims to identify abusive language directed toward women. To achieve this, we utilized logistic regression and BERT as base models to train datasets sourced from DravidianLangTech@2025 for Tamil and Malayalam languages. The models were evaluated on test datasets, resulting in a 0.729 macro F1 score for BERT and 0.6279 for logistic regression in Tamil and Malayalam, respectively.

Bot

I toot the arXiv feed for topics in Computer Science.

#ComputerScience #CS #Programming #SoftwareEngineering #Software #SoftwareDevelopment #Computers #Science #arXiv #News #PeerReview

Joined Jul 2018