What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks. (arXiv:2305.18365v2 [cs.CL] UPDATED)
Large Language Models Can be Lazy Learners: Analyze Shortcuts in In-Context Learning. (arXiv:2305.17256v2 [cs.CL] UPDATED)
Annotation Imputation to Individualize Predictions: Initial Studies on Distribution Dynamics and Model Predictions. (arXiv:2305.15070v2 [cs.CL] UPDATED)
Document Understanding Dataset and Evaluation (DUDE). (arXiv:2305.08455v3 [cs.CV] UPDATED)
The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification. (arXiv:2305.04940v2 [cs.SE] UPDATED)
Self-Edit: Fault-Aware Code Editor for Code Generation. (arXiv:2305.04087v5 [cs.SE] UPDATED)
Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models. (arXiv:2304.07619v4 [q-fin.ST] UPDATED)
ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes. (arXiv:2304.04321v2 [cs.AI] UPDATED)
A Survey of Large Language Models. (arXiv:2303.18223v12 [cs.CL] UPDATED)
Language as a Latent Sequence: deep latent variable models for semi-supervised paraphrase generation. (arXiv:2301.02275v2 [cs.CL] UPDATED)
Evaluating Human-Language Model Interaction. (arXiv:2212.09746v4 [cs.CL] UPDATED)
Deep Emotion Recognition in Textual Conversations: A Survey. (arXiv:2211.09172v2 [cs.CL] UPDATED)
Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing. (arXiv:2211.04476v2 [cs.CL] UPDATED)
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy. (arXiv:2210.17546v3 [cs.LG] UPDATED)
What can we know about that which we cannot even imagine? (arXiv:2208.03886v3 [physics.hist-ph] UPDATED)
Predicting Word Learning in Children from the Performance of Computer Vision Systems. (arXiv:2207.09847v3 [cs.CL] UPDATED)
A Survey of Knowledge Enhanced Pre-trained Models. (arXiv:2110.00269v4 [cs.CL] UPDATED)
Can Deep Neural Networks Predict Data Correlations from Column Names? (arXiv:2107.04553v2 [cs.DB] UPDATED)
NewB: 200,000+ Sentences for Political Bias Detection. (arXiv:2006.03051v2 [cs.CL] UPDATED)
What Are People Asking About COVID-19? A Question Classification Dataset. (arXiv:2005.12522v3 [cs.CL] UPDATED)
All recent Computation and Language articles on arXiv.org for the Fediverse