M$^3$IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning. (arXiv:2306.04387v2 [cs.CV] UPDATED)
GPT Self-Supervision for a Better Data Annotator. (arXiv:2306.04349v2 [cs.CL] UPDATED)
Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset. (arXiv:2306.03030v2 [cs.CL] UPDATED)
UNIDECOR: A Unified Deception Corpus for Cross-Corpus Deception Detection. (arXiv:2306.02827v2 [cs.CL] UPDATED)
BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models. (arXiv:2306.01506v2 [cs.CL] UPDATED)
An Empirical Study on Challenging Math Problem Solving with GPT-4. (arXiv:2306.01337v2 [cs.CL] UPDATED)
Supplementary Features of BiLSTM for Enhanced Sequence Labeling. (arXiv:2305.19928v3 [cs.CL] UPDATED)
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets. (arXiv:2305.18486v3 [cs.CL] UPDATED)
Whitening-based Contrastive Learning of Sentence Embeddings. (arXiv:2305.17746v2 [cs.CL] UPDATED)
BUCA: A Binary Classification Approach to Unsupervised Commonsense Question Answering. (arXiv:2305.15932v2 [cs.CL] UPDATED)
Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners. (arXiv:2305.14825v2 [cs.CL] UPDATED)
Sensitivity and Robustness of Large Language Models to Prompt Template in Japanese Text Classification Tasks. (arXiv:2305.08714v2 [cs.CL] UPDATED)
On the Hidden Mystery of OCR in Large Multimodal Models. (arXiv:2305.07895v3 [cs.CV] UPDATED)
Asymmetric feature interaction for interpreting model predictions. (arXiv:2305.07224v2 [cs.CL] UPDATED)
BanglaBook: A Large-scale Bangla Dataset for Sentiment Analysis from Book Reviews. (arXiv:2305.06595v3 [cs.CL] UPDATED)
Controlled Text Generation with Natural Language Instructions. (arXiv:2304.14293v2 [cs.CL] UPDATED)
Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark. (arXiv:2304.03279v3 [cs.LG] UPDATED)
Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer. (arXiv:2301.06735v3 [cs.SD] UPDATED)
Think Twice: A Human-like Two-stage Conversational Agent for Emotional Response Generation. (arXiv:2301.04907v2 [cs.CL] UPDATED)
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning. (arXiv:2212.10773v2 [cs.CL] UPDATED)
All recent Computation and Language (cs.CL) articles from arXiv.org, posted to the Fediverse.
Inspired by https://twitter.com/arxiv_cscl