MedEval: A Multi-Level, Multi-Task, and Multi-Domain Medical Benchmark for Language Model Evaluation. (arXiv:2310.14088v2 [cs.CL] UPDATED) 

Towards Understanding Sycophancy in Language Models. (arXiv:2310.13548v3 [cs.CL] UPDATED) 

Open-ended Commonsense Reasoning with Unrestricted Answer Scope. (arXiv:2310.11672v2 [cs.CL] UPDATED) 

VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System. (arXiv:2310.11069v4 [cs.CL] UPDATED) 

Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model. (arXiv:2310.09520v3 [cs.CL] UPDATED) 

EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling. (arXiv:2310.04691v2 [cs.CL] UPDATED) 

Minimum Bayes' Risk Decoding for System Combination of Grammatical Error Correction Systems. (arXiv:2309.06520v2 [cs.CL] UPDATED) 

Framework-Based Qualitative Analysis of Free Responses of Large Language Models: Algorithmic Fidelity. (arXiv:2309.06364v2 [cs.CL] UPDATED) 

Embedding structure matters: Comparing methods to adapt multilingual vocabularies to new languages. (arXiv:2309.04679v2 [cs.CL] UPDATED) 

FairMonitor: A Four-Stage Automatic Framework for Detecting Stereotypes and Biases in Large Language Models. (arXiv:2308.10397v2 [cs.CL] UPDATED) 

ParaFuzz: An Interpretability-Driven Technique for Detecting Poisoned Samples in NLP. (arXiv:2308.02122v2 [cs.CR] UPDATED) 

Android in the Wild: A Large-Scale Dataset for Android Device Control. (arXiv:2307.10088v2 [cs.LG] UPDATED) 

HYTREL: Hypergraph-enhanced Tabular Data Representation Learning. (arXiv:2307.08623v2 [cs.LG] UPDATED) 

DebateKG: Automatic Policy Debate Case Creation with Semantic Knowledge Graphs. (arXiv:2307.04090v2 [cs.CL] UPDATED) 

Automatic Calibration and Error Correction for Generative Large Language Models via Pareto Optimal Self-Supervision. (arXiv:2306.16564v3 [cs.CL] UPDATED) 

YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus. (arXiv:2306.15162v2 [cs.CL] UPDATED) 

Block-State Transformers. (arXiv:2306.09539v3 [cs.CL] UPDATED) 

Semantic HELM: A Human-Readable Memory for Reinforcement Learning. (arXiv:2306.09312v2 [cs.LG] UPDATED) 

PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model. (arXiv:2306.02531v2 [cs.CL] UPDATED) 

TIES-Merging: Resolving Interference When Merging Models. (arXiv:2306.01708v2 [cs.LG] UPDATED) 
