This is an app that lets you do a blind comparison of Ollama models and vote for the one that answered the prompt better. It's inspired by the LMSYS Chatbot Arena, which lets you do the same thing across a whole variety of hosted models.
Make sure that Ollama is running and that it can load multiple models at the same time. You can do this by running the following command:

```bash
OLLAMA_MAX_LOADED_MODELS=4 ollama serve
```
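If you want to double-check that the server is reachable before launching the app, you can query Ollama's REST API on its default port (11434). This isn't part of the repository, just an optional sanity check using the requests package:

```python
import requests

OLLAMA_URL = "http://localhost:11434"  # Ollama's default port

# Models available locally (pulled with `ollama pull`)
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5).json()
print("Local models:", [m["name"] for m in tags.get("models", [])])

# Models currently loaded into memory
ps = requests.get(f"{OLLAMA_URL}/api/ps", timeout=5).json()
print("Loaded models:", [m["name"] for m in ps.get("models", [])])
```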
Clone the repository and install its dependencies:

```bash
git clone git@github.com:mneedham/chatbot-arena.git
cd chatbot-arena
poetry install
```

And then run it using Poetry:
```bash
poetry run streamlit run Ollama_Chatbot_Arena.py --server.headless True
```

Navigate to http://localhost:8501 and you should see the following:
*(screenshot of the Ollama Chatbot Arena home page)*
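Under the hood, the arena boils down to sending the same prompt to two models and keeping their identities hidden until you vote. The snippet below is not the app's actual code, just a minimal sketch of that idea using the ollama Python library; the model names are placeholders and need to have been pulled already:

```python
import random

import ollama

prompt = "Explain vector databases in one paragraph."

# Pick two models at random so the comparison stays blind
models = random.sample(["llama3", "mistral", "gemma", "phi3"], k=2)

# Ask each model the same question
answers = {
    model: ollama.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )["message"]["content"]
    for model in models
}

# Show both answers without revealing which model wrote which
for label, answer in zip("AB", answers.values()):
    print(f"--- Response {label} ---\n{answer}\n")

# Once a vote is cast, the models behind the labels can be revealed
print("Model A was", models[0], "and Model B was", models[1])
```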