Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR. (arXiv:2311.14835v2 [cs.SD] UPDATED) 

A density estimation perspective on learning from pairwise human preferences. (arXiv:2311.14115v2 [cs.LG] UPDATED) 

Token-Level Adaptation of LoRA Adapters for Downstream Task Generalization. (arXiv:2311.10847v2 [cs.CL] UPDATED) 

Dissecting the Runtime Performance of the Training, Fine-tuning, and Inference of Large Language Models. (arXiv:2311.03687v2 [cs.PF] UPDATED) 

In-Context Pretraining: Language Modeling Beyond Document Boundaries. (arXiv:2310.10638v4 [cs.CL] UPDATED) 

Llemma: An Open Language Model For Mathematics. (arXiv:2310.10631v2 [cs.CL] UPDATED) 

JMedLoRA: Medical Domain Adaptation on Japanese Large Language Models using Instruction-tuning. (arXiv:2310.10083v2 [cs.CL] UPDATED) 

The Gift of Feedback: Improving ASR Model Quality by Learning from User Corrections through Federated Learning. (arXiv:2310.00141v2 [cs.CL] UPDATED) 

Persona-Coded Poly-Encoder: Persona-Guided Multi-Stream Conversational Sentence Scoring. (arXiv:2309.16770v2 [cs.CL] UPDATED) 

QuantEase: Optimization-based Quantization for Language Models. (arXiv:2309.01885v2 [stat.ML] UPDATED) 

RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback. (arXiv:2309.00267v2 [cs.CL] UPDATED) 

PointLLM: Empowering Large Language Models to Understand Point Clouds. (arXiv:2308.16911v2 [cs.CV] UPDATED) 

A Comparative Study of Text Embedding Models for Semantic Text Similarity in Bug Reports. (arXiv:2308.09193v2 [cs.SE] UPDATED) 

Large Language Models of Code Fail at Completing Code with Potential Bugs. (arXiv:2306.03438v2 [cs.LG] UPDATED) 

Does Conceptual Representation Require Embodiment? Insights From Large Language Models. (arXiv:2305.19103v3 [cs.CL] UPDATED) 

A Question Answering Framework for Decontextualizing User-facing Snippets from Scientific Documents. (arXiv:2305.14772v3 [cs.CL] UPDATED) 

Pointwise Mutual Information Based Metric and Decoding Strategy for Faithful Generation in Document Grounded Dialogs. (arXiv:2305.12191v2 [cs.CL] UPDATED) 

An Adversarial Non-Autoregressive Model for Text Generation with Incomplete Information. (arXiv:2305.03977v2 [cs.CL] UPDATED) 

ReCEval: Evaluating Reasoning Chains via Correctness and Informativeness. (arXiv:2304.10703v2 [cs.CL] UPDATED) 

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment. (arXiv:2304.06767v4 [cs.LG] UPDATED) 
