You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A Gradio-based demonstration for the AllenAI Molmo2-8B multimodal model, enabling image QA, multi-image pointing, video QA, and temporal tracking. Users upload images or videos, provide natural language prompts.
“A clean, from-scratch implementation of the OLMo architecture with KV caching, RoPE, and an efficient autoregressive inference pipeline. Designed as a minimal yet extensible foundation for post-training research, including RLHF, preference optimization, and reasoning-focused systems.”