RT @AlphaSignalAI@twitter.com
Microsoft's new Kosmos-1 is incredible.
It's a new Multimodal Large Language Model (MLLM).
The model can understand images, text, and images with text, and it handles OCR, image captioning, and visual QA.
It can even solve IQ tests.
Paper: https://arxiv.org/abs/2302.14045
Code: https://github.com/microsoft/unilm
🐦🔗: https://twitter.com/AlphaSignalAI/status/1630651280019292161
How the... did the word "impressive" make it into this phrase of the abstract: "Experimental results show that Kosmos-1 achieves impressive performance on"? Is this meant to be science or sports commentary?