Rethinking the Evaluation for Conversational Recommendation in the Era of Large Language Models. (arXiv:2305.13112v2 [cs.CL] UPDATED) 

ACCENT: An Automatic Event Commonsense Evaluation Metric for Open-Domain Dialogue Systems. (arXiv:2305.07797v2 [cs.CL] UPDATED) 

DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction. (arXiv:2304.11015v3 [cs.CL] UPDATED) 

OpenAGI: When LLM Meets Domain Experts. (arXiv:2304.04370v6 [cs.AI] UPDATED) 

Why think step by step? Reasoning emerges from the locality of experience. (arXiv:2304.03843v3 [cs.AI] UPDATED) 

Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model. (arXiv:2212.09146v3 [cs.CL] UPDATED) 

Towards Abstractive Timeline Summarisation using Preference-based Reinforcement Learning. (arXiv:2211.07596v2 [cs.LG] UPDATED) 

Metaphorical User Simulators for Evaluating Task-oriented Dialogue Systems. (arXiv:2204.00763v5 [cs.CL] UPDATED) 

CoPaSul Manual -- Contour-based parametric and superpositional intonation stylization. (arXiv:1612.04765v12 [cs.CL] UPDATED) 

Grounded Intuition of GPT-Vision's Abilities with Scientific Images. (arXiv:2311.02069v1 [cs.CL]) 

Post Turing: Mapping the landscape of LLM Evaluation. (arXiv:2311.02049v1 [cs.CL]) 

Vicinal Risk Minimization for Few-Shot Cross-lingual Transfer in Abusive Language Detection. (arXiv:2311.02025v1 [cs.CL]) 

ProSG: Using Prompt Synthetic Gradients to Alleviate Prompt Forgetting of RNN-like Language Models. (arXiv:2311.01981v1 [cs.CL]) 

The language of prompting: What linguistic properties make a prompt successful? (arXiv:2311.01967v1 [cs.CL])

Don't Make Your LLM an Evaluation Benchmark Cheater. (arXiv:2311.01964v1 [cs.CL]) 

Too Much Information: Keeping Training Simple for BabyLMs. (arXiv:2311.01955v1 [cs.CL]) 

Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks. (arXiv:2311.01949v1 [cs.CL]) 

Constructing Temporal Dynamic Knowledge Graphs from Interactive Text-based Games. (arXiv:2311.01928v1 [cs.CL]) 

GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling. (arXiv:2311.01927v1 [cs.LG]) 

Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review. (arXiv:2311.01918v1 [cs.CL]) 
