Human-Centered Metrics for Dialog System Evaluation. (arXiv:2305.14757v1 [cs.CL])
Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting. (arXiv:2305.14755v1 [cs.CL])
DialogVCS: Robust Natural Language Understanding in Dialogue System Upgrade. (arXiv:2305.14751v1 [cs.CL])
Mastering the ABCDs of Complex Questions: Answer-Based Claim Decomposition for Fine-grained Self-Evaluation. (arXiv:2305.14750v1 [cs.CL])
ECHo: Event Causality Inference via Human-centric Reasoning. (arXiv:2305.14740v1 [cs.AI])
Trusting Your Evidence: Hallucinate Less with Context-aware Decoding. (arXiv:2305.14739v1 [cs.CL])
Centering the Margins: Outlier-Based Identification of Harmed Populations in Toxicity Detection. (arXiv:2305.14735v1 [cs.CL])
Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation. (arXiv:2305.14734v1 [cs.CL])
SenteCon: Leveraging Lexicons to Learn Human-Interpretable Language Representations. (arXiv:2305.14728v1 [cs.CL])
In-Context Demonstration Selection with Cross Entropy Difference. (arXiv:2305.14726v1 [cs.CL])
AMELI: Enhancing Multimodal Entity Linking with Fine-Grained Attributes. (arXiv:2305.14725v1 [cs.CL])
I Spy a Metaphor: Large Language Models and Diffusion Models Co-Create Visual Metaphors. (arXiv:2305.14724v1 [cs.CL])
CuRIAM: Corpus re Interpretation and Metalanguage in U.S. Supreme Court Opinions. (arXiv:2305.14719v1 [cs.CL])
Improving Language Models with Advantage-based Offline Policy Gradients. (arXiv:2305.14718v1 [cs.CL])
Exploiting Correlations Between Contexts and Definitions with Multiple Definition Modeling. (arXiv:2305.14717v1 [cs.CL])
GlobalBench: A Benchmark for Global Progress in Natural Language Processing. (arXiv:2305.14716v1 [cs.CL])
Gender Biases in Automatic Evaluation Metrics: A Case Study on Image Captioning. (arXiv:2305.14711v1 [cs.CL])
Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models. (arXiv:2305.14710v1 [cs.CL])
The student becomes the master: Matching GPT3 on Scientific Factual Error Correction. (arXiv:2305.14707v1 [cs.CL])
Flan-MoE: Scaling Instruction-Finetuned Language Models with Sparse Mixture of Experts. (arXiv:2305.14705v1 [cs.CL])
All recent Computation and Language articles from arXiv.org, posted for the Fediverse.
Inspired by https://twitter.com/arxiv_cscl