Identifying and Mitigating Systemic Measurement Bias in Production LLM Inference Benchmarks https://arxiv.org/abs/2605.24217 #cs.AI #cs.DC
QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks https://arxiv.org/abs/2605.24218 #cs.CL
Beyond Final Answers: Auditing Trajectory-Level Hallucinations in Multi-Agent Industrial Workflows https://arxiv.org/abs/2605.24219 #cs.AI
Polar: Agentic RL on Any Harness at Scale https://arxiv.org/abs/2605.24220 #cs.DC
Analyzing the Effects of Two-Stage Peer Evaluation https://arxiv.org/abs/2605.24222 #cs.GT
ECo-MoE: Embodiment-Conditioned Mixture of Experts Increases the Evolvability of Robots https://arxiv.org/abs/2605.24225 #cs.RO
Sketch Bug: Using Sketch-Based Input for Interactive Code Debugging https://arxiv.org/abs/2605.24228 #cs.HC
How Well Do Models Follow Their Constitutions? https://arxiv.org/abs/2605.24229 #cs.AI
Accuracy Analysis of the Proxy Point Method with Applications to Some Toeplitz Matrices https://arxiv.org/abs/2605.24231 #math.NA #cs.NA
Bayesian Rational Search Engine User https://arxiv.org/abs/2605.24233 #econ.TH #cs.IR
An AI-Driven Framework for Energy-Efficient Environmental Monitoring in Smart Cities Using Edge Intelligence https://arxiv.org/abs/2605.22824 #cs.DC #cs.AI
KPI2KVI: A Multi Agent Workflow for Calculating Key Value Indicators from Service Descriptions https://arxiv.org/abs/2605.22825 #cs.DC #cs.AI #cs.ET #cs.PF
Evaluating Large Language Models in a Complex Hidden Role Game https://arxiv.org/abs/2605.22826 #cs.CL #cs.AI #cs.GT #cs.MA
A Survey of Text and Speech Resources for Hausa and Fongbe: Availability, Quality, and Gaps for NLP Development https://arxiv.org/abs/2605.22828 #cs.CL
LFRAG: Layout-oriented Fine-grained Retrieval-Augmented Generation on Multimodal Document Understanding https://arxiv.org/abs/2605.22829 #cs.IR #cs.AI
Intercloud: Eventual Consistency for Decentralised Economies via Chilling-Effect Consensus https://arxiv.org/abs/2605.22830 #cs.DC #cs.CR #cs.GT
Monte Cimone v3: Where RISC-V Stands in High-Performance Computing https://arxiv.org/abs/2605.22831 #cs.DC
Mathematical Foundations for Peer-to-Peer Lattice Computation https://arxiv.org/abs/2605.22832 #math.CO #math.PR #cs.DC #cs.DS
RAG4Outcome: A Retrieval-Augmented Multimodal Framework for Prognostic Prediction in Chronic Osteomyelitis https://arxiv.org/abs/2605.22833 #cs.IR #cs.AI #cs.LG
Query-Adaptive Semantic Chunking for Retrieval-Augmented Generation: A Dynamic Strategy with Contextual Window Expansion https://arxiv.org/abs/2605.22834 #cs.CL #cs.IR
I toot the arXiv feed for topics in Computer Science.