MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation. (arXiv:2310.14088v2 [cs.CL] UPDATED)
Towards Understanding Sycophancy in Language Models. (arXiv:2310.13548v3 [cs.CL] UPDATED)
Open-ended Commonsense Reasoning with Unrestricted Answer Scope. (arXiv:2310.11672v2 [cs.CL] UPDATED)
VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System. (arXiv:2310.11069v4 [cs.CL] UPDATED)
Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model. (arXiv:2310.09520v3 [cs.CL] UPDATED)
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling. (arXiv:2310.04691v2 [cs.CL] UPDATED)
Minimum Bayes' Risk Decoding for System Combination of Grammatical Error Correction Systems. (arXiv:2309.06520v2 [cs.CL] UPDATED)
Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity. (arXiv:2309.06364v2 [cs.CL] UPDATED)
Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages. (arXiv:2309.04679v2 [cs.CL] UPDATED)
FairMonitor: A Four-Stage Automatic Framework for Detecting Stereotypes and Biases in Large Language Models. (arXiv:2308.10397v2 [cs.CL] UPDATED)
ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP. (arXiv:2308.02122v2 [cs.CR] UPDATED)
Android in the Wild: A Large-Scale Dataset for Android Device Control. (arXiv:2307.10088v2 [cs.LG] UPDATED)
HYTREL: Hypergraph-enhanced Tabular Data Representation Learning. (arXiv:2307.08623v2 [cs.LG] UPDATED)
DebateKG: Automatic Policy Debate Case Creation with Semantic Knowledge Graphs. (arXiv:2307.04090v2 [cs.CL] UPDATED)
Automatic Calibration and Error Correction for Generative Large Language Models via Pareto Optimal Self-Supervision. (arXiv:2306.16564v3 [cs.CL] UPDATED)
YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus. (arXiv:2306.15162v2 [cs.CL] UPDATED)
Block-State Transformers. (arXiv:2306.09539v3 [cs.CL] UPDATED)
Semantic HELM: A Human-Readable Memory for Reinforcement Learning. (arXiv:2306.09312v2 [cs.LG] UPDATED)
PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model. (arXiv:2306.02531v2 [cs.CL] UPDATED)
TIES-Merging: Resolving Interference When Merging Models. (arXiv:2306.01708v2 [cs.LG] UPDATED)
All recent Computation and Language articles from arXiv.org, posted to the Fediverse
Inspired by https://twitter.com/arxiv_cscl