How FaR Are Large Language Models From Agents with Theory-of-Mind? (arXiv:2310.03051v1 [cs.CL])
How Prevalent is Gender Bias in ChatGPT? -- Exploring German and English ChatGPT Responses. (arXiv:2310.03031v1 [cs.CL])
Named Entity Inclusion in Abstractive Text Summarization. (arXiv:2307.02570v1 [cs.CL] CROSS LISTED)
Who's Harry Potter? Approximate Unlearning in LLMs. (arXiv:2310.02238v2 [cs.CL] UPDATED)
OceanGPT: A Large Language Model for Ocean Science Tasks. (arXiv:2310.02031v2 [cs.CL] UPDATED)
Effective and Parameter-Efficient Reusing Fine-Tuned Models. (arXiv:2310.01886v2 [cs.LG] UPDATED)
Preserving Phonemic Distinctions for Ordinal Regression: A Novel Loss Function for Automatic Pronunciation Assessment. (arXiv:2310.01839v2 [eess.AS] UPDATED)
LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples. (arXiv:2310.01469v2 [cs.CL] UPDATED)
The Entity-Deduction Arena: A playground for probing the conversational reasoning and planning capabilities of LLMs. (arXiv:2310.01468v2 [cs.CL] UPDATED)
Fewer is More: Trojan Attacks on Parameter-Efficient Fine-Tuning. (arXiv:2310.00648v2 [cs.CL] UPDATED)
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving. (arXiv:2309.17452v2 [cs.CL] UPDATED)
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models. (arXiv:2309.14509v2 [cs.LG] UPDATED)
Investigating the Catastrophic Forgetting in Multimodal Large Language Models. (arXiv:2309.10313v3 [cs.CL] UPDATED)
Sparse Autoencoders Find Highly Interpretable Features in Language Models. (arXiv:2309.08600v3 [cs.LG] UPDATED)
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness. (arXiv:2308.16175v2 [cs.CL] UPDATED)
Instruction Tuning for Large Language Models: A Survey. (arXiv:2308.10792v3 [cs.CL] UPDATED)
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. (arXiv:2308.08155v2 [cs.AI] UPDATED)
L-Eval: Instituting Standardized Evaluation for Long Context Language Models. (arXiv:2307.11088v3 [cs.CL] UPDATED)
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets. (arXiv:2307.10928v2 [cs.CL] UPDATED)
ValiTex -- a unified validation framework for computational text-based measures of social science constructs. (arXiv:2307.02863v4 [cs.CL] UPDATED)
All recent Computation and Language articles on arXiv.org for the Fediverse
Inspired by https://twitter.com/arxiv_cscl