Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization. (arXiv:2311.09184v1 [cs.CL]) 

ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models. (arXiv:2311.09182v1 [cs.CL]) 

PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers. (arXiv:2311.09180v1 [cs.CL]) 

SiRA: Sparse Mixture of Low Rank Adaptation. (arXiv:2311.09179v1 [cs.CL]) 

AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph. (arXiv:2311.09174v1 [cs.CL]) 

CLEAN-EVAL: Clean Evaluation on Contaminated Large Language Models. (arXiv:2311.09154v1 [cs.CL]) 

Temporal Knowledge Question Answering via Abstract Reasoning Induction. (arXiv:2311.09149v1 [cs.CL]) 

Grounding or Guesswork? Large Language Models are Presumptive Grounders. (arXiv:2311.09144v1 [cs.CL]) 

RRescue: Ranking LLM Responses to Enhance Reasoning Over Context. (arXiv:2311.09136v1 [cs.CL]) 

Aligning Neural Machine Translation Models: Human Feedback in Training and Inference. (arXiv:2311.09132v1 [cs.CL]) 

Social Meme-ing: Measuring Linguistic Variation in Memes. (arXiv:2311.09130v1 [cs.CL]) 

Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark. (arXiv:2311.09122v1 [cs.CL]) 

R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces. (arXiv:2311.09117v1 [cs.CL]) 

Ever: Mitigating Hallucination in Large Language Models through Real-Time Verification and Rectification. (arXiv:2311.09114v1 [cs.CL]) 

Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion? (arXiv:2311.09109v1 [cs.CL])

"We Demand Justice!": Towards Grounding Political Text in Social Context. (arXiv:2311.09106v1 [cs.CL]) 

MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation. (arXiv:2311.09105v1 [cs.CL]) 

Towards A Unified View of Answer Calibration for Multi-Step Reasoning. (arXiv:2311.09101v1 [cs.CL]) 

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization. (arXiv:2311.09096v1 [cs.CL]) 

Social Bias Probing: Fairness Benchmarking for Language Models. (arXiv:2311.09090v1 [cs.CL]) 
