KTRL+F: Knowledge-Augmented In-Document Search. (arXiv:2311.08329v3 [cs.CL] UPDATED) 

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?". (arXiv:2311.07587v2 [cs.CL] UPDATED)

InCA: Rethinking In-Car Conversational System Assessment Leveraging Large Language Models. (arXiv:2311.07469v2 [cs.CL] UPDATED) 

Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse. (arXiv:2311.07468v2 [cs.CL] UPDATED) 

Exploring the Dialogue Comprehension Ability of Large Language Models. (arXiv:2311.07194v2 [cs.CL] UPDATED) 

Comparative Multi-View Language Grounding. (arXiv:2311.06694v2 [cs.CL] UPDATED) 

ALYMPICS: Language Agents Meet Game Theory. (arXiv:2311.03220v2 [cs.CL] UPDATED) 

Incorporating Worker Perspectives into MTurk Annotation Practices for NLP. (arXiv:2311.02802v2 [cs.CL] UPDATED) 

Support or Refute: Analyzing the Stance of Evidence to Detect Out-of-Context Mis- and Disinformation. (arXiv:2311.01766v3 [cs.CL] UPDATED) 

Construction Artifacts in Metaphor Identification Datasets. (arXiv:2311.00790v2 [cs.CL] UPDATED) 

DEFT: Data Efficient Fine-Tuning for Large Language Models via Unsupervised Core-Set Selection. (arXiv:2310.16776v3 [cs.CL] UPDATED) 

Evaluating the Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education. (arXiv:2310.12059v3 [cs.CL] UPDATED) 

Quantifying Self-diagnostic Atomic Knowledge in Chinese Medical Foundation Model: A Computational Analysis. (arXiv:2310.11722v2 [cs.CL] UPDATED) 

Will the Prince Get True Love's Kiss? On the Model Sensitivity to Gender Perturbation over Fairytale Texts. (arXiv:2310.10865v2 [cs.CL] UPDATED) 

Empirical study of pretrained multilingual language models for zero-shot cross-lingual generation. (arXiv:2310.09917v2 [cs.CL] UPDATED) 

Semi-automatic staging area for high-quality structured data extraction from scientific literature. (arXiv:2309.10923v2 [cs.CL] UPDATED) 

LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models. (arXiv:2308.16137v5 [cs.CL] UPDATED) 

WavMark: Watermarking for Audio Generation. (arXiv:2308.12770v2 [cs.SD] UPDATED) 

Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench. (arXiv:2308.03656v2 [cs.CL] UPDATED) 

Towards Generalist Foundation Model for Radiology by Leveraging Web-scale 2D&3D Medical Data. (arXiv:2308.02463v5 [cs.CV] UPDATED) 
