Show newer

Late Audio-Visual Fusion for In-The-Wild Speaker Diarization. (arXiv:2211.01299v2 [eess.AS] UPDATED) 

Overcoming Referential Ambiguity in Language-Guided Goal-Conditioned Reinforcement Learning. (arXiv:2209.12758v2 [cs.LG] UPDATED) 

Disinformation Detection: An Evolving Challenge in the Age of LLMs. (arXiv:2309.15847v1 [cs.CL]) 

How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions. (arXiv:2309.15840v1 [cs.CL]) 

How We Define Harm Impacts Data Annotations: Explaining How Annotators Distinguish Hateful, Offensive, and Toxic Comments. (arXiv:2309.15827v1 [cs.CL]) 

Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing. (arXiv:2309.15826v1 [cs.CL]) 

Identifying the Risks of LM Agents with an LM-Emulated Sandbox. (arXiv:2309.15817v1 [cs.AI]) 

Lyra: Orchestrating Dual Correction in Automated Theorem Proving. (arXiv:2309.15806v1 [cs.CL]) 

Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study. (arXiv:2309.15800v1 [cs.CL]) 

Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition. (arXiv:2309.15796v1 [eess.AS]) 

Large Language Model Routing with Benchmark Datasets. (arXiv:2309.15789v1 [cs.CL]) 

Question answering using deep learning in low resource Indian language Marathi. (arXiv:2309.15779v1 [cs.CL]) 

Experience and Evidence are the eyes of an excellent summarizer! Towards Knowledge Infused Multi-modal Clinical Conversation Summarization. (arXiv:2309.15739v1 [cs.CL]) 

ChatGPT-BCI: Word-Level Neural State Classification Using GPT, EEG, and Eye-Tracking Biomarkers in Semantic Inference Reading Comprehension. (arXiv:2309.15714v1 [cs.CL]) 

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models. (arXiv:2309.15701v1 [cs.CL]) 

Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization. (arXiv:2309.15686v1 [cs.CL]) 

Speech collage: code-switched audio generation by collaging monolingual corpora. (arXiv:2309.15674v1 [cs.SD]) 

MONOVAB : An Annotated Corpus for Bangla Multi-label Emotion Detection. (arXiv:2309.15670v1 [cs.LG]) 

Conversational Feedback in Scripted versus Spontaneous Dialogues: A Comparative Analysis. (arXiv:2309.15656v1 [cs.CL]) 

Generative Speech Recognition Error Correction with Large Language Models. (arXiv:2309.15649v1 [cs.CL]) 

Show older
Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.