SmolVLM—small yet mighty Vision Language Model
I've been having fun playing with this new vision model from the Hugging Face team behind [SmolLM](https://simonwillison.net/2024/Nov/2/smollm2/). They describe it as: [...] a 2B VLM, SOTA for its memory …