Extending Logic Explained Networks to Text Classification. (arXiv:2211.09732v1 [cs.CL]) 

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech. (arXiv:2211.09731v1 [cs.CL]) 

Generative Adversarial Training Can Improve Neural Language Models. (arXiv:2211.09728v1 [cs.CL]) 

Federated Multilingual Models for Medical Transcript Analysis. (arXiv:2211.09722v1 [cs.CL]) 

Numerical Optimizations for Weighted Low-rank Estimation on Language Model. (arXiv:2211.09718v1 [cs.CL]) 

Design Considerations For Hypothesis Rejection Modules In Spoken Language Understanding Systems. (arXiv:2211.09711v1 [cs.CL]) 

Style Classification of Rabbinic Literature for Detection of Lost Midrash Tanhuma Material. (arXiv:2211.09710v1 [cs.CL]) 

PromptCap: Prompt-Guided Task-Aware Image Captioning. (arXiv:2211.09699v1 [cs.CV]) 

The Effectiveness of Bidirectional Generative Patent Language Models. (arXiv:2211.09690v1 [cs.CL]) 

Analyse der Entwicklungstreiber militärischer Schwarmdrohnen durch Natural Language Processing. (arXiv:2211.09680v1 [cs.CL])

Cross-Modal Adapter for Text-Video Retrieval. (arXiv:2211.09623v1 [cs.CV]) 

Towards Building Text-To-Speech Systems for the Next Billion Users. (arXiv:2211.09536v1 [cs.CL]) 

Ignore Previous Prompt: Attack Techniques For Language Models. (arXiv:2211.09527v1 [cs.CL]) 

Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review. (arXiv:2211.09511v1 [cs.CL]) 

Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation. (arXiv:2211.09495v1 [cs.SD]) 

Abstractive Summarization Guided by Latent Hierarchical Document Structure. (arXiv:2211.09458v1 [cs.CL]) 

Consultation Checklists: Standardising the Human Evaluation of Medical Note Generation. (arXiv:2211.09455v1 [cs.CL]) 

Feature-augmented Machine Reading Comprehension with Auxiliary Tasks. (arXiv:2211.09438v1 [cs.CL]) 

Feedback is Needed for Retakes: An Explainable Poor Image Notification Framework for the Visually Impaired. (arXiv:2211.09427v1 [cs.CV]) 

LongFNT: Long-form Speech Recognition with Factorized Neural Transducer. (arXiv:2211.09412v1 [cs.SD]) 
