Can Language Models Employ the Socratic Method? Experiments with Code Debugging. (arXiv:2310.03210v1 [cs.CL]) 

The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices. (arXiv:2310.03193v1 [cs.DL]) 

Retrieval-augmented Generation to Improve Math Question-Answering: Trade-offs Between Groundedness and Human Preference. (arXiv:2310.03184v1 [cs.CL]) 

Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models. (arXiv:2310.03182v1 [cs.CV]) 

$\mathcal{B}$-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis. (arXiv:2310.03173v1 [cs.CL]) 

MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use. (arXiv:2310.03128v1 [cs.SE]) 

Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning. (arXiv:2310.03094v1 [cs.CL]) 

Discovering Knowledge-Critical Subnetworks in Pretrained Language Models. (arXiv:2310.03084v1 [cs.CL]) 

How FaR Are Large Language Models From Agents with Theory-of-Mind? (arXiv:2310.03051v1 [cs.CL]) 

How Prevalent is Gender Bias in ChatGPT? -- Exploring German and English ChatGPT Responses. (arXiv:2310.03031v1 [cs.CL]) 

An Empirical Study of AI Generated Text Detection Tools. (arXiv:2310.01423v1 [cs.CL] CROSS LISTED) 

An Empirical Study on Fertility Proposals Using Multi-Grained Topic Analysis Methods. (arXiv:2307.10025v2 [cs.HC] CROSS LISTED) 

REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction. (arXiv:2306.15724v3 [cs.RO] CROSS LISTED) 

DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for In-Context Learning. (arXiv:2310.02954v2 [cs.CL] UPDATED) 

LC-Score: Reference-less estimation of Text Comprehension Difficulty. (arXiv:2310.02754v2 [cs.CL] UPDATED) 

On the definition of toxicity in NLP. (arXiv:2310.02357v2 [cs.CL] UPDATED) 

Ring Attention with Blockwise Transformers for Near-Infinite Context. (arXiv:2310.01889v2 [cs.CL] UPDATED) 

TADIS: Steering Models for Deep-Thinking about Demonstration Examples. (arXiv:2310.00901v2 [cs.CL] UPDATED) 

DyVal: Graph-informed Dynamic Evaluation of Large Language Models. (arXiv:2309.17167v2 [cs.AI] UPDATED) 
