Show newer

SHARCS: Efficient Transformers through Routing with Dynamic Width Sub-networks. (arXiv:2310.12126v1 [cs.LG]) 

Harnessing Dataset Cartography for Improved Compositional Generalization in Transformers. (arXiv:2310.12118v1 [cs.CL]) 

Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling. (arXiv:2310.12100v1 [cs.CL]) 

Unveiling the Siren's Song: Towards Reliable Fact-Conflicting Hallucination Detection. (arXiv:2310.12086v1 [cs.CL]) 

On the Benefit of Generative Foundation Models for Human Activity Recognition. (arXiv:2310.12085v1 [cs.CV]) 

Towards Safer Operations: An Expert-involved Dataset of High-Pressure Gas Incidents for Preventing Future Failures. (arXiv:2310.12074v1 [cs.CL]) 

SPEED: Speculative Pipelined Execution for Efficient Decoding. (arXiv:2310.12072v1 [cs.CL]) 

Code Book for the Annotation of Diverse Cross-Document Coreference of Entities in News Articles. (arXiv:2310.12064v1 [cs.CL]) 

Evaluating the Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education. (arXiv:2310.12059v1 [cs.CL]) 

Concept-Guided Chain-of-Thought Prompting for Pairwise Comparison Scaling of Texts with Large Language Models. (arXiv:2310.12049v1 [cs.CL]) 

CORE: A Few-Shot Company Relation Classification Dataset for Robust Domain Adaptation. (arXiv:2310.12024v1 [cs.CL]) 

LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation. (arXiv:2310.12020v1 [cs.RO]) 

Gold: A Global and Local-aware Denoising Framework for Commonsense Knowledge Graph Noise Detection. (arXiv:2310.12011v1 [cs.CL]) 

Multi-view Contrastive Learning for Entity Typing over Knowledge Graphs. (arXiv:2310.12008v1 [cs.CL]) 

Sociotechnical Safety Evaluation of Generative AI Systems. (arXiv:2310.11986v1 [cs.AI]) 

From Interpolation to Extrapolation: Complete Length Generalization for Arithmetic Transformers. (arXiv:2310.11984v1 [cs.LG]) 

InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation. (arXiv:2310.11976v1 [cs.CL]) 

Filling in the Gaps: Efficient Event Coreference Resolution using Graph Autoencoder Networks. (arXiv:2310.11965v1 [cs.CL]) 

AMR Parsing with Causal Hierarchical Attention and Pointers. (arXiv:2310.11964v1 [cs.CL]) 

Fast Multipole Attention: A Divide-and-Conquer Attention Mechanism for Long Sequences. (arXiv:2310.11960v1 [cs.CL]) 

Show older
Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.