Can Language Models Employ the Socratic Method? Experiments with Code Debugging. (arXiv:2310.03210v1 [cs.CL])
The Rise of Open Science: Tracking the Evolution and Perceived Value of Data and Methods Link-Sharing Practices. (arXiv:2310.03193v1 [cs.DL])
Retrieval-augmented Generation to Improve Math Question-Answering: Trade-offs Between Groundedness and Human Preference. (arXiv:2310.03184v1 [cs.CL])
Robust and Interpretable Medical Image Classifiers via Concept Bottleneck Models. (arXiv:2310.03182v1 [cs.CV])
$\mathcal{B}$-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis. (arXiv:2310.03173v1 [cs.CL])
MetaTool Benchmark: Deciding Whether to Use Tools and Which to Use. (arXiv:2310.03128v1 [cs.SE])
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning. (arXiv:2310.03094v1 [cs.CL])
Discovering Knowledge-Critical Subnetworks in Pretrained Language Models. (arXiv:2310.03084v1 [cs.CL])
How FaR Are Large Language Models From Agents with Theory-of-Mind?. (arXiv:2310.03051v1 [cs.CL])
How Prevalent is Gender Bias in ChatGPT? -- Exploring German and English ChatGPT Responses. (arXiv:2310.03031v1 [cs.CL])
An Empirical Study of AI Generated Text Detection Tools. (arXiv:2310.01423v1 [cs.CL] CROSS LISTED)
An Empirical Study on Fertility Proposals Using Multi-Grained Topic Analysis Methods. (arXiv:2307.10025v2 [cs.HC] CROSS LISTED)
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction. (arXiv:2306.15724v3 [cs.RO] CROSS LISTED)
DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for In-Context Learning. (arXiv:2310.02954v2 [cs.CL] UPDATED)
LC-Score: Reference-less estimation of Text Comprehension Difficulty. (arXiv:2310.02754v2 [cs.CL] UPDATED)
On the definition of toxicity in NLP. (arXiv:2310.02357v2 [cs.CL] UPDATED)
Ring Attention with Blockwise Transformers for Near-Infinite Context. (arXiv:2310.01889v2 [cs.CL] UPDATED)
Borges and AI. (arXiv:2310.01425v2 [cs.CL] UPDATED)
TADIS: Steering Models for Deep-Thinking about Demonstration Examples. (arXiv:2310.00901v2 [cs.CL] UPDATED)
DyVal: Graph-informed Dynamic Evaluation of Large Language Models. (arXiv:2309.17167v2 [cs.AI] UPDATED)
All recent Computation and Language articles on arXiv.org for the Fediverse
Inspired by https://twitter.com/arxiv_cscl