MedPI: Evaluating AI Systems in Medical Patient-facing Interactions https://arxiv.org/abs/2601.04195 #cs.CL #cs.AI
RAGVUE: A Diagnostic View for Explainable and Automated Evaluation of Retrieval-Augmented Generation https://arxiv.org/abs/2601.04196 #cs.CL #cs.IR
Automatic Construction of Chinese Verb Collostruction Database https://arxiv.org/abs/2601.04197 #cs.CL
Identification of a Kalman filter: consistency of local solutions https://arxiv.org/abs/2601.04198 #eess.SY #math.DS #cs.SY
The Forgotten Shield: Safety Grafting in Parameter-Space for Medical MLLMs https://arxiv.org/abs/2601.04199 #cs.LG #cs.AI #cs.CL
Attribute-Aware Controlled Product Generation with LLMs for E-commerce https://arxiv.org/abs/2601.04200 #cs.CL #cs.AI
Collective Narrative Grounding: Community-Coordinated Data Contributions to Improve Local AI Systems https://arxiv.org/abs/2601.04201 #cs.CL #cs.AI #cs.CY #cs.HC
TeleTables: A Benchmark for Large Language Models in Telecom Table Interpretation https://arxiv.org/abs/2601.04202 #cs.CL #cs.AI #cs.LG
FronTalk: Benchmarking Front-End Development as Conversational Code Generation with Multi-Modal Feedback https://arxiv.org/abs/2601.04203 #cs.CL #cs.CV #cs.LG #cs.SE
Enhancing Retrieval-Augmented Generation with Two-Stage Retrieval: FlashRank Reranking and Query Expansion https://arxiv.org/abs/2601.03258 #cs.IR
LLMDiRec: LLM-Enhanced Intent Diffusion for Sequential Recommendation https://arxiv.org/abs/2601.03259 #cs.IR
SciNetBench: A Relation-Aware Benchmark for Scientific Literature Retrieval Agents https://arxiv.org/abs/2601.03260 #cs.CE #cs.CL
DeepResearch-Slice: Bridging the Retrieval-Utilization Gap via Explicit Text Slicing https://arxiv.org/abs/2601.03261 #cs.CL #cs.AI
Roles of MLLMs in Visually Rich Document Retrieval for RAG: A Survey https://arxiv.org/abs/2601.03262 #cs.IR #cs.CL
Internal Reasoning vs. External Control: A Thermodynamic Analysis of Sycophancy in Large Language Models https://arxiv.org/abs/2601.03263 #cs.CL #cs.AI
Jailbreak-Zero: A Path to Pareto Optimal Red Teaming for Large Language Models https://arxiv.org/abs/2601.03265 #cs.CL #cs.CR #cs.LG
Benchmarking and Adapting On-Device Large Language Models for Clinical Decision Support https://arxiv.org/abs/2601.03266 #cs.CL #cs.AI
OpenAI GPT-5 System Card https://arxiv.org/abs/2601.03267 #cs.CL #cs.AI
WRAVAL -- WRiting Assist eVALuation https://arxiv.org/abs/2601.03268 #cs.CL #cs.LG
I toot the arXiv feed for topics in Computer Science.