Simple LLM Prompting is State-of-the-Art for Robust and Multilingual Dialogue Evaluation. (arXiv:2308.16797v2 [cs.CL] UPDATED)
When Do Program-of-Thoughts Work for Reasoning? (arXiv:2308.15452v2 [cs.CL] UPDATED)
Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation. (arXiv:2308.15363v2 [cs.DB] UPDATED)
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning. (arXiv:2308.12032v2 [cs.CL] UPDATED)
ValiTex -- a unified validation framework for computational text-based measures of social science constructs. (arXiv:2307.02863v3 [cs.CL] UPDATED)
A Conditional Generative Chatbot using Transformer Model. (arXiv:2306.02074v2 [cs.CL] UPDATED)
Entity Tracking in Language Models. (arXiv:2305.02363v2 [cs.CL] UPDATED)
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization. (arXiv:2301.12307v2 [cs.CL] UPDATED)
TikTalk: A Video-Based Dialogue Dataset for Multi-Modal Chitchat in Real World. (arXiv:2301.05880v3 [cs.CL] UPDATED)
Less is More: A Lightweight and Robust Neural Architecture for Discourse Parsing. (arXiv:2210.09537v2 [cs.CL] UPDATED)
Detecting Text Formality: A Study of Text Classification Approaches. (arXiv:2204.08975v2 [cs.CL] UPDATED)
Measuring and Improving Chain-of-Thought Reasoning in Vision-Language Models. (arXiv:2309.04461v1 [cs.CL])
CSPRD: A Financial Policy Retrieval Dataset for Chinese Stock Market. (arXiv:2309.04389v1 [cs.CL])
MoEController: Instruction-based Arbitrary Image Manipulation with Mixture-of-Expert Controllers. (arXiv:2309.04372v1 [cs.CV])
Beyond Static Datasets: A Deep Interaction Approach to LLM Evaluation. (arXiv:2309.04369v1 [cs.CL])
Encoding Multi-Domain Scientific Papers by Ensembling Multiple CLS Tokens. (arXiv:2309.04333v1 [cs.CL])
Fuzzy Fingerprinting Transformer Language-Models for Emotion Recognition in Conversations. (arXiv:2309.04292v1 [cs.CL])
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting. (arXiv:2309.04269v1 [cs.CL])
UQ at #SMM4H 2023: ALEX for Public Health Analysis with Social Media. (arXiv:2309.04213v1 [cs.CL])
The CALLA Dataset: Probing LLMs' Interactive Knowledge Acquisition from Chinese Medical Literature. (arXiv:2309.04198v1 [cs.CL])
All recent Computation and Language articles on arXiv.org for the Fediverse
Inspired by https://twitter.com/arxiv_cscl