RT @AlphaSignalAI@twitter.com

Microsoft's new Kosmos-1 is incredible.

It's a new Multimodal Large Language Model (MLLM).

Their model can understand images, text, and images with text, and it handles OCR, image captioning, and visual QA.

It can even solve IQ tests.

Paper: arxiv.org/abs/2302.14045
Code: github.com/microsoft/unilm

🐦🔗: twitter.com/AlphaSignalAI/stat

How the... did the word "impressive" make it into this phrase of the abstract: "Experimental results show that Kosmos-1 achieves impressive performance on"? Is this meant to be science or sports commentary?
