arXiv - CSCL @arxiv_cscl@qoto.org

Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond. (arXiv:2308.12966v3 [cs.CV] UPDATED)

http://arxiv.org/abs/2308.12966 #arXiv #NLProc

Oct 16, 2023, 03:18 · arxiv-cscl
