Cognitive Overload: Jailbreaking Large Language Models with Overloaded Logical Thinking. (arXiv:2311.09827v1 [cs.CL]) 

Human Still Wins over LLM: An Empirical Study of Active Learning on Domain-Specific Annotation Tasks. (arXiv:2311.09825v1 [cs.CL]) 

Towards Robust Temporal Reasoning of Large Language Models via a Multi-Hop QA Dataset and Pseudo-Instruction Tuning. (arXiv:2311.09821v1 [cs.CL]) 

SUQL: Conversational Search over Structured and Unstructured Data with Large Language Models. (arXiv:2311.09818v1 [cs.CL]) 

Performance Trade-offs of Watermarking Large Language Models. (arXiv:2311.09816v1 [cs.CL]) 

Large Language Models for Propaganda Span Annotation. (arXiv:2311.09812v1 [cs.CL]) 

PixT3: Pixel-based Table To Text generation. (arXiv:2311.09808v1 [cs.CL]) 

The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text. (arXiv:2311.09807v1 [cs.CL]) 

DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data. (arXiv:2311.09805v1 [cs.CL]) 

Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs. (arXiv:2311.09802v1 [cs.AI]) 

Dial BeInfo for Faithfulness: Improving Factuality of Information-Seeking Dialogue via Behavioural Fine-Tuning. (arXiv:2311.09800v1 [cs.CL])

How Far Can We Extract Diverse Perspectives from Large Language Models? Criteria-Based Diversity Prompting! (arXiv:2311.09799v1 [cs.CL])

KnowledgeMath: Knowledge-Intensive Math Word Problem Solving in Finance Domains. (arXiv:2311.09797v1 [cs.CL]) 

Interpreting User Requests in the Context of Natural Language Standing Instructions. (arXiv:2311.09796v1 [cs.CL]) 

Investigating Data Contamination in Modern Benchmarks for Large Language Models. (arXiv:2311.09783v1 [cs.CL]) 

More Samples or More Prompt Inputs? Exploring Effective In-Context Sampling for LLM Few-Shot Prompt Engineering. (arXiv:2311.09782v1 [cs.CL]) 

HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs. (arXiv:2311.09774v1 [cs.CL]) 

To be or not to be? an exploration of continuously controllable prompt engineering. (arXiv:2311.09773v1 [cs.CL]) 

LLMs as Narcissistic Evaluators: When Ego Inflates Evaluation Scores. (arXiv:2311.09766v1 [cs.CL]) 

Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations. (arXiv:2311.09763v1 [cs.CL]) 
