Hand-picked awesome Python libraries and frameworks, organised by category 🐍
Interactive version: www.awesomepython.org
Updated 09 Oct 2025
- Newly Created Repositories - Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories drawn from all the other categories here (10 repos)
- Agentic AI - Agentic AI libraries, frameworks and tools: AI agents, workflows, autonomous decision-making, goal-oriented tasks, and API integrations (104 repos)
- Code Quality - Code quality tooling: linters, formatters, pre-commit hooks, unused code removal (17 repos)
- Crypto and Blockchain - Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity (14 repos)
- Data - General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks (121 repos)
- Debugging - Debugging and tracing tools (9 repos)
- Diffusion Text to Image - Text-to-image diffusion model libraries, tools and apps for generating images from natural language (43 repos)
- Finance - Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives (36 repos)
- Game Development - Game development tools, engines and libraries (8 repos)
- GIS - Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections (29 repos)
- Graph - Graphs and network libraries: network analysis, graph machine learning, visualisation (6 repos)
- GUI - Graphical user interface libraries and toolkits (8 repos)
- Jupyter - Jupyter, JupyterLab and Notebook tools, libraries and plugins (28 repos)
- LLMs and ChatGPT - Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover (354 repos)
- Math and Science - Mathematical, numerical and scientific libraries (28 repos)
- Machine Learning - General - General and classical machine learning libraries. See below for other sections covering specialised ML areas (172 repos)
- Machine Learning - Deep Learning - Machine learning libraries that cross over with deep learning in some way (79 repos)
- Machine Learning - Interpretability - Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training (27 repos)
- Machine Learning - Ops - MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models (50 repos)
- Machine Learning - Reinforcement - Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF (23 repos)
- Machine Learning - Time Series - Machine learning and classical time series libraries: forecasting, seasonality, anomaly detection, econometrics (22 repos)
- Natural Language Processing - Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover (89 repos)
- Packaging - Python packaging, dependency management and bundling (27 repos)
- Pandas - Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations (25 repos)
- Performance - Performance, parallelisation and low level libraries (28 repos)
- Profiling - Memory and CPU/GPU profiling tools and libraries (11 repos)
- Security - Security related libraries: vulnerability discovery, SQL injection, environment auditing (16 repos)
- Simulation - Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Math and Science category for crossover (43 repos)
- Study - Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials (69 repos)
- Template - Template tools and libraries: cookiecutter repos, generators, quick-starts (11 repos)
- Terminal - Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars (21 repos)
- Testing - Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins (24 repos)
- Typing - Typing libraries: static and run-time type checking, annotations (15 repos)
- Utility - General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools (218 repos)
- Visualisation - Visualisation tools and libraries: application frameworks, 2D/3D plotting, dashboards, WebGL (37 repos)
- Web - Web related frameworks and libraries: webapp servers, WSGI, ASGI, asyncio, HTTP, REST, user management (60 repos)
Awesome Python is regularly updated, and this category lists the most recently created GitHub repositories drawn from all the other categories here.
- 
github/spec-kit ⭐ 31,151 
 Toolkit to help you get started with Spec-Driven Development: specifications become executable, directly generating working implementations
- 
tencentcloudadp/youtu-agent ⭐ 3,228 
 A flexible, high-performance framework for building, running, and evaluating autonomous agents
 🔗 tencentcloudadp.github.io/youtu-agent
- 
vllm-project/semantic-router ⭐ 1,604 
 A Mixture-of-Models router that directs OpenAI API requests to the most suitable models from a defined pool based on semantic understanding
 🔗 vllm-semantic-router.com
- 
run-llama/semtools ⭐ 1,201 
 Semantic search and document parsing tools for the command line
- 
leochlon/hallbayes ⭐ 1,113 
 Hallucination Risk Calculator & Prompt Re-engineering Toolkit (OpenAI-only)
- 
thinking-machines-lab/batch_invariant_ops ⭐ 798 
 Defeating Nondeterminism in LLM Inference: fixing floating-point non-associativity
- 
facebookresearch/cwm ⭐ 627 
 Code World Model (CWM) is a 32-billion-parameter open-weights LLM, to advance research on code generation with world models.
- 
google-deepmind/limit ⭐ 566 
 On the Theoretical Limitations of Embedding-Based Retrieval
 🔗 arxiv.org/abs/2508.21038
- 
ivebotunac/PrimoAgent ⭐ 214 
 PrimoAgent is a multi-agent AI stock analysis system built on LangGraph architecture that orchestrates four specialized agents to provide comprehensive daily trading insights and next-day price predictions
 🔗 primoinvesting.com
- 
apple/ml-l3m ⭐ 174 
 A flexible library for training any type of large model, regardless of modality. Instead of more traditional approaches, we opt for a config-heavy approach
Agentic AI libraries, frameworks and tools: AI agents, workflows, autonomous decision-making, goal-oriented tasks, and API integrations.
- 
logspace-ai/langflow ⭐ 127,537 
 Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
 🔗 www.langflow.org
- 
langchain-ai/langchain ⭐ 116,644 
 🦜🔗 Build context-aware reasoning applications
 🔗 python.langchain.com
- 
langgenius/dify ⭐ 115,738 
 Production-ready platform for agentic workflow development.
 🔗 dify.ai
- 
browser-use/browser-use ⭐ 70,816 
 Browser use is the easiest way to connect your AI agents with the browser.
 🔗 browser-use.com
- 
geekan/MetaGPT ⭐ 58,790 
 🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
 🔗 mgx.dev
- 
microsoft/autogen ⭐ 50,447 
 AutoGen is a framework for creating multi-agent AI applications that can act autonomously or work alongside humans.
 🔗 microsoft.github.io/autogen
- 
run-llama/llama_index ⭐ 44,575 
 LlamaIndex is the leading framework for building LLM-powered agents over your data.
 🔗 docs.llamaindex.ai
- 
mem0ai/mem0 ⭐ 40,810 
 Enhances AI assistants and agents with an intelligent memory layer, enabling personalized AI interactions
 🔗 mem0.ai
- 
crewaiinc/crewAI ⭐ 38,819 
 Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
 🔗 crewai.com
- 
agno-agi/agno ⭐ 34,089 
 High-performance SDK and runtime for multi-agent systems. Build, run and manage secure multi-agent systems in your cloud.
 🔗 docs.agno.com
- 
github/spec-kit ⭐ 31,151 
 Toolkit to help you get started with Spec-Driven Development: specifications become executable, directly generating working implementations
- 
openbmb/ChatDev ⭐ 27,486 
 ChatDev stands as a virtual software company that operates through various intelligent agents holding different roles, including Chief Executive Officer, Chief Product Officer etc
 🔗 arxiv.org/abs/2307.07924
- 
stanford-oval/storm ⭐ 27,474 
 An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
 🔗 storm.genie.stanford.edu
- 
composiohq/composio ⭐ 25,740 
 Composio equips your AI agents & LLMs with 100+ high-quality integrations via function calling
 🔗 docs.composio.dev
- 
assafelovic/gpt-researcher ⭐ 23,699 
 LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
 🔗 gptr.dev
- 
microsoft/OmniParser ⭐ 23,622 
 OmniParser is a comprehensive method for parsing user interface screenshots into structured and easy-to-understand elements
- 
huggingface/smolagents ⭐ 23,201 
 🤗 smolagents: a barebones library for agents that think in code.
 🔗 huggingface.co/docs/smolagents
- 
fosowl/agenticSeek ⭐ 22,037 
 A 100% local alternative to Manus AI, this voice-enabled AI assistant autonomously browses the web, writes code, and plans tasks while keeping all data on your device.
 🔗 agenticseek.tech
- 
yoheinakajima/babyagi ⭐ 21,860 
 GPT-4 powered task-driven autonomous agent
 🔗 babyagi.org
- 
openai/swarm ⭐ 20,474 
 A framework exploring ergonomic, lightweight multi-agent orchestration.
- 
a2aproject/A2A ⭐ 20,073 
 An open protocol enabling communication and interoperability between opaque agentic applications.
 🔗 a2a-protocol.org
- 
langchain-ai/langgraph ⭐ 19,407 
 LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain.
 🔗 langchain-ai.github.io/langgraph
- 
unity-technologies/ml-agents ⭐ 18,709 
 The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
 🔗 unity.com/products/machine-learning-agents
- 
letta-ai/letta ⭐ 18,642 
 Letta (formerly MemGPT) is a framework for creating LLM services with memory.
 🔗 docs.letta.com
- 
camel-ai/owl ⭐ 18,161 
 🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
- 
dzhng/deep-research ⭐ 17,848 
 An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models.
- 
bytedance/deer-flow ⭐ 17,316 
 DeerFlow is a community-driven Deep Research framework, combining language models with tools like web search, crawling, and Python execution, while contributing back to the open-source community.
 🔗 deerflow.tech
- 
nirdiamant/GenAI_Agents ⭐ 17,135 
 Tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.
- 
google-gemini/gemini-fullstack-langgraph-quickstart ⭐ 16,977 
 Demonstrates a fullstack application using a React frontend and a LangGraph-powered backend agent. The agent is designed to perform comprehensive research on a user's query.
 🔗 ai.google.dev/gemini-api/docs/google-search
- 
openai/openai-agents-python ⭐ 15,224 
 A lightweight yet powerful framework for building multi-agent workflows. It is provider-agnostic, supporting the OpenAI Responses and Chat Completions APIs, as well as 100+ other LLMs.
 🔗 openai.github.io/openai-agents-python
- 
camel-ai/camel ⭐ 14,421 
 🐫 CAMEL: The first and the best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
 🔗 docs.camel-ai.org
- 
google/adk-python ⭐ 13,442 
 An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.
 🔗 google.github.io/adk-docs
- 
emcie-co/parlant ⭐ 13,418 
 LLM agents built for control. Designed for real-world use. Deployed in minutes.
 🔗 www.parlant.io
- 
agentscope-ai/agentscope ⭐ 12,899 
 AgentScope: Agent-Oriented Programming for Building LLM Applications
 🔗 doc.agentscope.io
- 
pydantic/pydantic-ai ⭐ 12,773 
 PydanticAI is a Python Agent Framework designed to make it less painful to build production grade applications with Generative AI.
 🔗 ai.pydantic.dev
- 
smol-ai/developer ⭐ 12,160 
 the first library to let you embed a developer agent in your own app!
 🔗 twitter.com/smolmodels
- 
sakanaai/AI-Scientist ⭐ 11,567 
 The AI Scientist, the first comprehensive system for fully automatic scientific discovery, enabling Foundation Models such as Large Language Models (LLMs) to perform research independently.
- 
asyncfuncai/deepwiki-open ⭐ 10,987 
 Custom implementation of DeepWiki, automatically creates beautiful, interactive wikis for any GitHub, GitLab, or BitBucket repository
 🔗 asyncfunc.mintlify.app
- 
langchain-ai/open_deep_research ⭐ 9,040 
 Open Deep Research is an open source assistant that automates research and produces customizable reports on any topic
- 
ag-ui-protocol/ag-ui ⭐ 8,290 
 AG-UI: the Agent-User Interaction Protocol. Bring Agents into Frontend Applications.
 🔗 ag-ui.com
- 
meta-llama/llama-stack ⭐ 8,097 
 Llama Stack standardizes the building blocks needed to bring genai applications to market. These blocks cover model training and fine-tuning, evaluation, and running AI agents in production
 🔗 llamastack.github.io
- 
microsoft/magentic-ui ⭐ 7,734 
 A prototype of a human-centered interface powered by a multi-agent system that can browse and perform actions on the web, generate and execute code
 🔗 www.microsoft.com/en-us/research/blog/magentic-ui-an-experimental-human-centered-web-agent
- 
upsonic/Upsonic ⭐ 7,654 
 Upsonic is a reliability-focused framework designed for real-world applications. It enables trusted agent workflows in your organization through advanced reliability features, including verification layers, triangular architecture, validator agents, and output evaluation systems.
 🔗 docs.upsonic.ai
- 
zilliztech/deep-searcher ⭐ 7,011 
 DeepSearcher combines reasoning LLMs and VectorDBs to perform search, evaluation, and reasoning based on private data, providing highly accurate answers and comprehensive reports
 🔗 zilliztech.github.io/deep-searcher
- 
awslabs/agent-squad ⭐ 6,942 
 Flexible, lightweight open-source framework for orchestrating multiple AI agents to handle complex conversations
 🔗 awslabs.github.io/agent-squad
- 
mnotgod96/AppAgent ⭐ 6,155 
 AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
 🔗 appagent-official.github.io
- 
prefecthq/marvin ⭐ 5,957 
 an ambient intelligence library
 🔗 askmarvin.ai
- 
x-plug/MobileAgent ⭐ 5,944 
 Mobile-Agent: The Powerful GUI Agent Family
- 
openai/openai-cs-agents-demo ⭐ 5,806 
 Demo of a Customer Service Agent interface built on top of the OpenAI Agents SDK
- 
humanlayer/humanlayer ⭐ 5,539 
 HumanLayer is an API and SDK that enables AI Agents to contact humans for help, feedback, and approvals.
 🔗 humanlayer.dev/code
- 
pyspur-dev/pyspur ⭐ 5,515 
 A visual playground for agentic workflows: Iterate over your agents 10x faster
 🔗 pyspur.dev
- 
kyegomez/swarms ⭐ 5,297 
 The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Website: https://swarms.ai
 🔗 docs.swarms.world
- 
brainblend-ai/atomic-agents ⭐ 5,072 
 Atomic Agents provides a set of tools and agents that can be combined to create powerful applications. It is built on top of Instructor and leverages the power of Pydantic for data and schema validation and serialization.
- 
landing-ai/vision-agent ⭐ 5,060 
 VisionAgent is a library that helps you utilize agent frameworks to generate code to solve your vision task
- 
crewaiinc/crewAI-examples ⭐ 5,033 
 A collection of examples that show how to use CrewAI framework to automate workflows.
- 
meta-llama/llama-stack-apps ⭐ 4,275 
 Agentic components of the Llama Stack APIs
- 
codelion/openevolve ⭐ 4,041 
 Evolutionary coding agent (like AlphaEvolve) enabling automated scientific and algorithmic discovery
- 
rowboatlabs/rowboat ⭐ 3,742 
 AI-powered multi-agent builder
 🔗 www.rowboatlabs.com
- 
langroid/langroid ⭐ 3,719 
 Harness LLMs with Multi-Agent Programming
 🔗 langroid.github.io/langroid
- 
getzep/zep ⭐ 3,679 
 Zep is a memory platform for AI agents that learns from user interactions and business data
 🔗 help.getzep.com
- 
joshuac215/agent-service-toolkit ⭐ 3,678 
 A full toolkit for running an AI agent service built with LangGraph, FastAPI and Streamlit.
 🔗 agent-service-toolkit.streamlit.app
- 
ag2ai/ag2 ⭐ 3,622 
 AG2 (formerly AutoGen) is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks.
 🔗 ag2.ai
- 
strands-agents/sdk-python ⭐ 3,614 
 A model-driven approach to building AI agents in just a few lines of code.
 🔗 strandsagents.com
- 
openmanus/OpenManus-RL ⭐ 3,492 
 OpenManus-RL is an open-source initiative collaboratively led by Ulab-UIUC and MetaGPT. This project is an extended version of the original OpenManus initiative.
- 
going-doer/Paper2Code ⭐ 3,416 
 A multi-agent LLM system that transforms a paper into a code repository. It follows a three-stage pipeline: planning, analysis, and code generation, each handled by specialized agents.
- 
tencentcloudadp/youtu-agent ⭐ 3,228 
 A flexible, high-performance framework for building, running, and evaluating autonomous agents
 🔗 tencentcloudadp.github.io/youtu-agent
- 
facebookresearch/Pearl ⭐ 2,939 
 A Production-ready Reinforcement Learning AI Agent Library brought by the Applied Reinforcement Learning team at Meta.
- 
cheshire-cat-ai/core ⭐ 2,859 
 AI agent microservice
 🔗 cheshirecat.ai
- 
i-am-bee/beeai-framework ⭐ 2,843 
 Build production-ready AI agents in both Python and TypeScript.
 🔗 framework.beeai.dev
- 
om-ai-lab/OmAgent ⭐ 2,556 
 OmAgent is a Python library for building multimodal language agents with ease. We try to keep the library simple without too much overhead like other agent frameworks.
 🔗 om-agent.com
- 
griptape-ai/griptape ⭐ 2,385 
 Modular Python framework for AI agents and workflows with chain-of-thought reasoning, tools, and memory.
 🔗 www.griptape.ai
- 
langchain-ai/executive-ai-assistant ⭐ 2,091 
 Executive AI Assistant (EAIA) is an AI agent that attempts to do the job of an Executive Assistant (EA).
- 
btahir/open-deep-research ⭐ 2,079 
 Open source alternative to Gemini Deep Research. Generate reports with AI based on search results.
 🔗 opendeepresearch.vercel.app
- 
run-llama/llama_deploy ⭐ 2,053 
 Async-first framework for deploying, scaling, and productionizing agentic multi-service systems based on workflows from llama_index.
 🔗 docs.llamaindex.ai/en/stable/module_guides/llama_deploy
- 
agentops-ai/AgentStack ⭐ 1,930 
 AgentStack scaffolds your agent stack - The tech stack that collectively is your agent
- 
openautocoder/Agentless ⭐ 1,924 
 Agentless🐱: an agentless approach to automatically solve software development problems
- 
swe-agent/mini-swe-agent ⭐ 1,809 
 The 100 line AI agent that solves GitHub issues or helps you in your command line
 🔗 mini-swe-agent.com
- 
weaviate/elysia ⭐ 1,714 
 Elysia is an agentic platform designed to use tools in a decision tree. A decision agent decides which tools to use dynamically based on its environment and context.
- 
msoedov/agentic_security ⭐ 1,638 
 An open-source vulnerability scanner for Agent Workflows and LLMs. Protecting AI systems from jailbreaks, fuzzing, and multimodal attacks.
 🔗 agentic-security.vercel.app
- 
sakanaai/AI-Scientist-v2 ⭐ 1,605 
 The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search
- 
vllm-project/semantic-router ⭐ 1,604 
 A Mixture-of-Models router that directs OpenAI API requests to the most suitable models from a defined pool based on semantic understanding
 🔗 vllm-semantic-router.com
- 
agentera/Agently ⭐ 1,434 
 Agently is a development framework that helps developers build AI-agent-native applications really fast.
 🔗 agently.tech
- 
shengranhu/ADAS ⭐ 1,428 
 Automated Design of Agentic Systems using Meta Agent Search to show agents can invent novel and powerful agent designs
 🔗 www.shengranhu.com/adas
- 
link-agi/AutoAgents ⭐ 1,419 
 [IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks.
 🔗 huggingface.co/spaces/linksoul/autoagents
- 
prefecthq/ControlFlow ⭐ 1,380 
 ControlFlow provides a structured, developer-focused framework for defining workflows and delegating work to LLMs, without sacrificing control or transparency
 🔗 controlflow.ai
- 
szczyglis-dev/py-gpt ⭐ 1,254 
 Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, speech synthesis and recognition, web search, memory, presets, assistants, and more. Linux, Windows, Mac
 🔗 pygpt.net
- 
langchain-ai/langgraph-swarm-py ⭐ 1,164 
 A library for creating swarm-style multi-agent systems using LangGraph. A swarm is a type of multi-agent architecture where agents dynamically hand off control to one another based on their specializations
 🔗 langchain-ai.github.io/langgraph/concepts/multi_agent
- 
jd-opensource/OxyGent ⭐ 1,146 
 OxyGent is a modular multi-agent framework that lets you build, deploy, and evolve AI teams
 🔗 oxygent.jd.com
- 
plurai-ai/intellagent ⭐ 1,136 
 Simulate interactions, analyze performance, and gain actionable insights for conversational agents. Test, evaluate, and optimize your agent to ensure reliable real-world deployment.
 🔗 intellagent-doc.plurai.ai
- 
thudm/CogAgent ⭐ 1,055 
 An open-sourced end-to-end VLM-based GUI Agent
- 
strnad/CrewAI-Studio ⭐ 1,035 
 agentic, gui, automation
- 
google-deepmind/concordia ⭐ 1,031 
 Concordia is a library to facilitate construction and use of generative agent-based models to simulate interactions of agents in grounded physical, social, or digital space.
- 
bytedance-seed/m3-agent ⭐ 985 
 Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
- 
victordibia/autogen-ui ⭐ 964 
 Web UI for AutoGen (A Framework for Multi-Agent LLM Applications)
- 
thytu/Agentarium ⭐ 933 
 Framework for managing and orchestrating AI agents with ease. Agentarium provides a flexible and intuitive way to create, manage, and coordinate interactions between multiple AI agents in various environments.
- 
deedy/mac_computer_use ⭐ 822 
 A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
 🔗 x.com/deedydas/status/1849481225041559910
- 
salesforceairesearch/AgentLite ⭐ 633 
 AgentLite is a research-oriented library designed for building and advancing LLM-based task-oriented agent systems. It simplifies the implementation of new agent/multi-agent architectures, enabling easy orchestration of multiple agents through a manager agent.
- 
codingmoh/open-codex ⭐ 628 
 Open Codex is a fully open-source command-line AI assistant inspired by OpenAI Codex, supporting local language models like phi-4-mini and full integration with Ollama.
- 
alpha-innovator/InternAgent ⭐ 526 
 When Agent Becomes the Scientist – Building Closed-Loop System from Hypothesis to Verification
 🔗 alpha-innovator.github.io/internagent-project-page
- 
quantalogic/quantalogic ⭐ 447 
 QuantaLogic is a ReAct (Reasoning & Action) framework for building advanced AI agents. The CLI version includes coding capabilities comparable to Aider.
- 
sakanaai/AI-Scientist-ICLR2025-Workshop-Experiment ⭐ 270 
 A paper produced by The AI Scientist passed a peer-review process at a workshop in a top machine learning conference
- 
agentscope-ai/agentscope-runtime ⭐ 150 
 AgentScope Runtime: secure sandboxed tool execution and scalable agent deployment
 🔗 runtime.agentscope.io
- 
mannaandpoem/OpenManus ⭐ 141 
 Open source version of Manus, the general AI agent
- 
prithivirajdamodaran/Route0x ⭐ 116 
 A production-grade query routing solution, leveraging LLMs while optimizing for cost per query
Code quality tooling: linters, formatters, pre-commit hooks, unused code removal.
- 
astral-sh/ruff ⭐ 42,824 
 An extremely fast Python linter and code formatter, written in Rust.
 🔗 docs.astral.sh/ruff
- 
psf/black ⭐ 40,993 
 The uncompromising Python code formatter
 🔗 black.readthedocs.io/en/stable
- 
pre-commit/pre-commit ⭐ 14,417 
 A framework for managing and maintaining multi-language pre-commit hooks.
 🔗 pre-commit.com
- 
sqlfluff/sqlfluff ⭐ 9,213 
 A modular SQL linter and auto-formatter with support for multiple dialects and templated code.
 🔗 www.sqlfluff.com
- 
pycqa/isort ⭐ 6,830 
 A Python utility / library to sort imports.
 🔗 pycqa.github.io/isort
- 
davidhalter/jedi ⭐ 6,024 
 Awesome autocompletion, static analysis and refactoring library for python
 🔗 jedi.readthedocs.io
- 
pycqa/pylint ⭐ 5,571 
 It's not just a linter that annoys you!
 🔗 pylint.readthedocs.io/en/latest
- 
jendrikseipp/vulture ⭐ 4,065 
 Find dead Python code
- 
asottile/pyupgrade ⭐ 3,901 
 A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language.
- 
pycqa/flake8 ⭐ 3,694 
 flake8 is a python tool that glues together pycodestyle, pyflakes, mccabe, and third-party plugins to check the style and quality of some python code.
 🔗 flake8.pycqa.org
- 
wemake-services/wemake-python-styleguide ⭐ 2,776 
 The strictest and most opinionated python linter ever!
 🔗 wemake-python-styleguide.rtfd.io
- 
python-lsp/python-lsp-server ⭐ 2,373 
 Fork of the python-language-server project, maintained by the Spyder IDE team and the community
- 
codespell-project/codespell ⭐ 2,226 
 check code for common misspellings
- 
sourcery-ai/sourcery ⭐ 1,719 
 Instant AI code reviews
 🔗 sourcery.ai
- 
callowayproject/bump-my-version ⭐ 531 
 A small command line tool to simplify releasing software by updating all version strings in your source code by the correct increment and optionally commit and tag the changes.
 🔗 callowayproject.github.io/bump-my-version
- 
tconbeer/sqlfmt ⭐ 486 
 sqlfmt formats your dbt SQL files so you don't have to
 🔗 sqlfmt.com
Cryptocurrency and blockchain libraries: trading bots, API integration, Ethereum virtual machine, solidity.
- 
freqtrade/freqtrade ⭐ 43,309 
 Free, open source crypto trading bot
 🔗 www.freqtrade.io
- 
ccxt/ccxt ⭐ 39,036 
 A cryptocurrency trading API with more than 100 exchanges in JavaScript / TypeScript / Python / C# / PHP / Go
 🔗 docs.ccxt.com
- 
crytic/slither ⭐ 5,924 
 Static Analyzer for Solidity and Vyper
 🔗 blog.trailofbits.com/2018/10/19/slither-a-solidity-static-analysis-framework
- 
ethereum/web3.py ⭐ 5,382 
 A python interface for interacting with the Ethereum blockchain and ecosystem.
 🔗 web3py.readthedocs.io
- 
ethereum/consensus-specs ⭐ 3,797 
 Ethereum Proof-of-Stake Consensus Specifications
 🔗 ethereum.github.io/consensus-specs
- 
cyberpunkmetalhead/Binance-volatility-trading-bot ⭐ 3,480 
 This is a fully functioning Binance trading bot that measures the volatility of every coin on Binance and places trades with the highest gaining coins. If you like this project, consider donating through the Brave browser to allow me to continuously improve the script.
- 
bmoscon/cryptofeed ⭐ 2,572 
 Cryptocurrency Exchange Websocket Data Feed Handler
- 
ethereum/py-evm ⭐ 2,360 
 A Python implementation of the Ethereum Virtual Machine
 🔗 py-evm.readthedocs.io/en/latest
- 
binance/binance-public-data ⭐ 2,025 
 Details on how to get Binance public data
- 
ofek/bit ⭐ 1,307 
 Bitcoin made easy.
 🔗 ofek.dev/bit
- 
man-c/pycoingecko ⭐ 1,091 
 Python wrapper for the CoinGecko API
- 
coinbase/agentkit ⭐ 867 
 AgentKit is Coinbase Developer Platform's framework for easily enabling AI agents to take actions onchain. It is designed to be framework-agnostic, so you can use it with any AI framework, and wallet-agnostic
 🔗 docs.cdp.coinbase.com/agentkit/docs/welcome
- 
dylanhogg/awesome-crypto ⭐ 78 
 A list of awesome crypto and blockchain projects
 🔗 www.awesomecrypto.xyz
General data libraries: data processing, serialisation, formats, databases, SQL, connectors, web crawlers, data generation/augmentation/checks.
- 
microsoft/markitdown ⭐ 80,505 
 A utility for converting files to Markdown, supports: PDF, PPT, Word, Excel, Images etc
- 
scrapy/scrapy ⭐ 58,428 
 Scrapy, a fast high-level web crawling & scraping framework for Python.
 🔗 scrapy.org
- 
pathwaycom/pathway ⭐ 44,754 
 Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
 🔗 pathway.com
- 
apache/spark ⭐ 42,016 
 Apache Spark - A unified analytics engine for large-scale data processing
 🔗 spark.apache.org
- 
ds4sd/docling ⭐ 40,597 
 Docling parses documents and exports them to the desired format with ease and speed.
 🔗 docling-project.github.io/docling
- 
mindsdb/mindsdb ⭐ 36,247 
 AI Analytics and Knowledge Engine for RAG over large-scale, heterogeneous data. - The only MCP Server you'll ever need
 🔗 mindsdb.com
- 
jaidedai/EasyOCR ⭐ 28,064 
 Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
 🔗 www.jaided.ai
- 
getredash/redash ⭐ 27,833 
 Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
 🔗 redash.io
- 
qdrant/qdrant ⭐ 26,426 
 Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
 🔗 qdrant.tech
- 
humansignal/label-studio ⭐ 24,943 
 Label Studio is an open source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats.
 🔗 labelstud.io
- 
chroma-core/chroma ⭐ 23,712 
 Open-source search and retrieval database for AI applications.
 🔗 www.trychroma.com
- 
airbytehq/airbyte ⭐ 19,693 
 The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
 🔗 airbyte.com
- 
avaiga/taipy ⭐ 18,751 
 Turns Data and AI algorithms into production-ready web applications in no time.
 🔗 www.taipy.io
- 
joke2k/faker ⭐ 18,736 
 Faker is a Python package that generates fake data for you.
 🔗 faker.readthedocs.io
- 
binux/pyspider ⭐ 16,900 
 A Powerful Spider(Web Crawler) System in Python.
 🔗 docs.pyspider.org
- 
tiangolo/sqlmodel ⭐ 16,874 
 SQL databases in Python, designed for simplicity, compatibility, and robustness.
 🔗 sqlmodel.tiangolo.com
- 
twintproject/twint ⭐ 16,227 
 An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
- 
apache/arrow ⭐ 16,022 
 Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
 🔗 arrow.apache.org
- 
weaviate/weaviate ⭐ 14,707 
 Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a cloud-native database.
 🔗 weaviate.io/developers/weaviate
- 
cyclotruc/gitingest ⭐ 12,676 
 Turn any Git repository into a prompt-friendly text ingest for LLMs.
 🔗 gitingest.com
- 
s0md3v/Photon ⭐ 12,247 
 Incredibly fast crawler designed for OSINT.
- 
coleifer/peewee ⭐ 11,709 
 a small, expressive orm -- supports postgresql, mysql, sqlite and cockroachdb
 🔗 docs.peewee-orm.com
- 
sqlalchemy/sqlalchemy ⭐ 10,960 
 The Database Toolkit for Python
 🔗 www.sqlalchemy.org
- 
googleapis/genai-toolbox ⭐ 10,806 
 MCP Toolbox for Databases is an open source MCP server for databases. Develop tools easier, faster, and more securely by handling connection pooling, authentication.
 🔗 googleapis.github.io/genai-toolbox/getting-started/introduction
- 
simonw/datasette ⭐ 10,362 
 An open source multi-tool for exploring and publishing data
 🔗 datasette.io
- 
voxel51/fiftyone ⭐ 9,919 
 Refine high-quality datasets and visual AI models
 🔗 fiftyone.ai
- 
gristlabs/grist-core ⭐ 9,806 
 Grist is the evolution of spreadsheets.
 🔗 www.getgrist.com
- 
bigscience-workshop/petals ⭐ 9,801 
 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
 🔗 petals.dev
- 
yzhao062/pyod ⭐ 9,482 
 A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
 🔗 pyod.readthedocs.io
- 
tobymao/sqlglot ⭐ 8,389 
 Python SQL Parser and Transpiler
 🔗 sqlglot.com
- 
lancedb/lancedb ⭐ 7,667 
 Developer-friendly, embedded retrieval engine for multimodal AI. Search More; Manage Less.
 🔗 lancedb.github.io/lancedb
- 
alirezamika/autoscraper ⭐ 6,984 
 A Smart, Automatic, Fast and Lightweight Web Scraper for Python
- 
kaggle/kaggle-api ⭐ 6,869 
 Official Kaggle API
- 
madmaze/pytesseract ⭐ 6,229 
 A Python wrapper for Google Tesseract
- 
ibis-project/ibis ⭐ 6,133 
 Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It helps Python users explore and transform data of any size, stored anywhere.
 🔗 ibis-project.org
- 
vi3k6i5/flashtext ⭐ 5,680 
 Extract Keywords from sentence or Replace keywords in sentences.
- 
airbnb/knowledge-repo ⭐ 5,527 
 A next-generation curated knowledge sharing platform for data scientists and other technical professions.
- 
superduperdb/superduper ⭐ 5,216 
 Superduper: End-to-end framework for building custom AI applications and agents.
 🔗 superduper.io
- 
rapidai/RapidOCR ⭐ 5,064 
 📄 Awesome OCR toolkits for multiple programming languages, based on ONNXRuntime, OpenVINO, PaddlePaddle and PyTorch.
 🔗 rapidai.github.io/rapidocrdocs
- 
facebookresearch/AugLy ⭐ 5,046 
 A data augmentations library for audio, image, text, and video.
 🔗 ai.facebook.com/blog/augly-a-new-data-augmentation-library-to-help-build-more-robust-ai-models
- 
giskard-ai/giskard-oss ⭐ 4,907 
 🐢 Open-Source Evaluation & Testing library for LLM Agents
 🔗 docs.giskard.ai
- 
adbar/trafilatura ⭐ 4,764 
 Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
 🔗 trafilatura.readthedocs.io
- 
jazzband/tablib ⭐ 4,718 
 Python Module for Tabular Datasets in XLS, CSV, JSON, YAML, &c.
 🔗 tablib.readthedocs.io
- 
amundsen-io/amundsen ⭐ 4,664 
 Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
 🔗 www.amundsen.io/amundsen
- 
lk-geimfari/mimesis ⭐ 4,624 
 Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
 🔗 mimesis.name
- 
mongodb/mongo-python-driver ⭐ 4,280 
 PyMongo - the Official MongoDB Python driver
 🔗 www.mongodb.com/docs/languages/python/pymongo-driver/current
- 
dlt-hub/dlt ⭐ 4,246 
 data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
 🔗 dlthub.com/docs
- 
rom1504/img2dataset ⭐ 4,182 
 Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
- 
andialbrecht/sqlparse ⭐ 3,939 
 A non-validating SQL parser module for Python
- 
deepchecks/deepchecks ⭐ 3,911 
 Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
 🔗 docs.deepchecks.com/stable
- 
praw-dev/praw ⭐ 3,857 
 PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.
 🔗 praw.readthedocs.io
- 
jmcnamara/XlsxWriter ⭐ 3,840 
 A Python module for creating Excel XLSX files.
 🔗 xlsxwriter.readthedocs.io
- 
mlabonne/llm-datasets ⭐ 3,762 
 Curated list of datasets and tools for post-training.
 🔗 mlabonne.github.io/blog
- 
sqlalchemy/alembic ⭐ 3,664 
 A database migrations tool for SQLAlchemy.
- 
run-llama/llama-hub ⭐ 3,477 
 A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
 🔗 llamahub.ai
- 
pyeve/cerberus ⭐ 3,246 
 Lightweight, extensible data validation library for Python
 🔗 python-cerberus.org
- 
sdv-dev/SDV ⭐ 3,197 
 Synthetic data generation for tabular data
 🔗 docs.sdv.dev/sdv
- 
docarray/docarray ⭐ 3,106 
 Represent, send, store and search multimodal data
 🔗 docs.docarray.org
- 
pallets/itsdangerous ⭐ 3,059 
 Safely pass trusted data to untrusted environments and back.
 🔗 itsdangerous.palletsprojects.com
- 
datafold/data-diff ⭐ 2,988 
 Compare tables within or across databases
 🔗 docs.datafold.com
- 
goldsmith/Wikipedia ⭐ 2,960 
 A Pythonic wrapper for the Wikipedia API
 🔗 wikipedia.readthedocs.org
- 
mangiucugna/json_repair ⭐ 2,837 
 A python module to repair invalid JSON from LLMs
 🔗 pypi.org/project/json-repair
- 
awslabs/amazon-redshift-utils ⭐ 2,806 
 Amazon Redshift Utils contains utilities, scripts and views that are useful in a Redshift environment
- 
kayak/pypika ⭐ 2,726 
 PyPika is a python SQL query builder that exposes the full richness of the SQL language using a syntax that reflects the resulting query. PyPika excels at all sorts of SQL queries but is especially useful for data analysis.
 🔗 pypika.readthedocs.io/en/latest
- 
samuelcolvin/arq ⭐ 2,683 
 Fast job queuing and RPC in python with asyncio and redis.
 🔗 arq-docs.helpmanual.io
- 
huggingface/datatrove ⭐ 2,659 
 DataTrove is a library to process, filter and deduplicate text data at a very large scale. It provides a set of prebuilt commonly used processing blocks with a framework to easily add custom functionality
- 
pynamodb/PynamoDB ⭐ 2,539 
 A pythonic interface to Amazon's DynamoDB
 🔗 pynamodb.readthedocs.io
- 
pikepdf/pikepdf ⭐ 2,482 
 A Python library for reading and writing PDF, powered by QPDF
 🔗 pikepdf.readthedocs.io
- 
sfu-db/connector-x ⭐ 2,419 
 Fastest library to load data from DB to DataFrames in Rust and Python
 🔗 sfu-db.github.io/connector-x
- 
uqfoundation/dill ⭐ 2,381 
 serialize all of Python
 🔗 dill.rtfd.io
- 
aminalaee/sqladmin ⭐ 2,369 
 SQLAlchemy Admin for FastAPI and Starlette
 🔗 aminalaee.github.io/sqladmin
- 
emirozer/fake2db ⭐ 2,339 
 Generate databases filled with fake but valid data for test purposes, using the most popular patterns (AFAIK). Currently supports sqlite, mysql, postgresql, mongodb, redis, couchdb.
- 
graphistry/pygraphistry ⭐ 2,338 
 PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
- 
milvus-io/bootcamp ⭐ 2,255 
 Dealing with all unstructured data, such as reverse image search, audio search, molecular search, video analysis, question and answer systems, NLP, etc.
 🔗 milvus.io
- 
accenture/AmpliGraph ⭐ 2,223 
 Python library for Representation Learning on Knowledge Graphs https://docs.ampligraph.org
- 
agronholm/sqlacodegen ⭐ 2,215 
 Automatic model code generator for SQLAlchemy
- 
simonw/sqlite-utils ⭐ 1,897 
 Python CLI utility and library for manipulating SQLite databases
 🔗 sqlite-utils.datasette.io
- 
uber/petastorm ⭐ 1,859 
 Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
- 
aio-libs/aiomysql ⭐ 1,840 
 aiomysql is a library for accessing a MySQL database from asyncio
 🔗 aiomysql.rtfd.io
- 
matthewwithanm/python-markdownify ⭐ 1,813 
 Convert HTML to Markdown
- 
simple-salesforce/simple-salesforce ⭐ 1,808 
 A very simple Salesforce.com REST API client for Python
- 
zarr-developers/zarr-python ⭐ 1,797 
 An implementation of chunked, compressed, N-dimensional arrays for Python.
 🔗 zarr.readthedocs.io
- 
collerek/ormar ⭐ 1,777 
 python async orm with fastapi in mind and pydantic validation
 🔗 collerek.github.io/ormar
- 
scholarly-python-package/scholarly ⭐ 1,710 
 Retrieve author and publication information from Google Scholar in a friendly, Pythonic way without having to worry about CAPTCHAs!
 🔗 scholarly.readthedocs.io
- 
eleutherai/the-pile ⭐ 1,605 
 The Pile is a large, diverse, open source language modelling data set that consists of many smaller datasets combined together.
- 
ydataai/ydata-synthetic ⭐ 1,579 
 Synthetic data generators for tabular and time-series data
 🔗 docs.sdk.ydata.ai
- 
d-star-ai/dsRAG ⭐ 1,501 
 A retrieval engine for unstructured data. It is especially good at handling challenging queries over dense text, like financial reports, legal documents, and academic papers.
- 
huggingface/aisheets ⭐ 1,499 
 Build, enrich, and transform datasets using AI models with no code. Deploy locally or on the Hub with access to thousands of open models.
 🔗 huggingface.co/spaces/aisheets/sheets
- 
quixio/quix-streams ⭐ 1,453 
 Python Streaming DataFrames for Kafka
 🔗 docs.quix.io
- 
google/tensorstore ⭐ 1,449 
 Library for reading and writing large multi-dimensional arrays.
 🔗 google.github.io/tensorstore
- 
mchong6/JoJoGAN ⭐ 1,436 
 Official PyTorch repo for JoJoGAN: One Shot Face Stylization
- 
sdispater/orator ⭐ 1,416 
 The Orator ORM provides a simple yet beautiful ActiveRecord implementation.
 🔗 orator-orm.com
- 
aio-libs/aiocache ⭐ 1,360 
 Asyncio cache manager for redis, memcached and memory
 🔗 aiocache.readthedocs.io
- 
igorbenav/fastcrud ⭐ 1,278 
 FastCRUD is a Python package for FastAPI, offering robust async CRUD operations and flexible endpoint creation utilities.
 🔗 benavlabs.github.io/fastcrud
- 
eliasdabbas/advertools ⭐ 1,274 
 advertools - online marketing productivity and analysis tools
 🔗 advertools.readthedocs.io
- 
meta-llama/synthetic-data-kit ⭐ 1,250 
 Tool for generating high-quality synthetic datasets to fine-tune LLMs. Generate Reasoning Traces, QA Pairs, save them to a fine-tuning format with a simple CLI.
 🔗 pypi.org/project/synthetic-data-kit
- 
pytorch/data ⭐ 1,227 
 A PyTorch repo for data loading and utilities to be shared by the PyTorch domain libraries.
- 
duckdb/dbt-duckdb ⭐ 1,153 
 dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
- 
brettkromkamp/contextualise ⭐ 1,079 
 Contextualise is an effective tool particularly suited for organising information-heavy projects and activities consisting of unstructured and widely diverse data and information resources
 🔗 contextualise.dev
- 
uber/fiber ⭐ 1,048 
 Distributed Computing for AI Made Simple
 🔗 uber.github.io/fiber
- 
goccy/bigquery-emulator ⭐ 987 
 BigQuery emulator provides a way to launch a BigQuery server on your local machine for testing and development.
- 
scikit-hep/awkward ⭐ 908 
 Manipulate JSON-like data with NumPy-like idioms.
 🔗 awkward-array.org
- 
weaviate/recipes ⭐ 891 
 This repository shares end-to-end notebooks on how to use various Weaviate features and integrations!
- 
apache/iceberg-python ⭐ 883 
 PyIceberg is a Python library for programmatic access to Iceberg table metadata as well as to table data in Iceberg format.
 🔗 py.iceberg.apache.org
- 
macbre/sql-metadata ⭐ 875 
 Uses tokenized query returned by python-sqlparse and generates query metadata
 🔗 pypi.python.org/pypi/sql-metadata
- 
koaning/human-learn ⭐ 821 
 Natural Intelligence is still a pretty good idea.
 🔗 koaning.github.io/human-learn
- 
unstructured-io/unstructured-api ⭐ 814 
 API for Open-Source Pre-Processing Tools for Unstructured Data
- 
ibm/data-prep-kit ⭐ 809 
 Data Prep Kit is a community project to democratize and accelerate unstructured data preparation for LLM app developers
 🔗 data-prep-kit.github.io/data-prep-kit
- 
googleapis/python-bigquery ⭐ 784 
 Python Client for Google BigQuery
- 
kagisearch/vectordb ⭐ 757 
 A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search.
 🔗 vectordb.com
- 
hyperqueryhq/whale ⭐ 727 
 🐳 The stupidly simple CLI workspace for your data warehouse.
 🔗 rsyi.gitbook.io/whale
- 
dgarnitz/vectorflow ⭐ 696 
 VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.
 🔗 www.getvectorflow.com
- 
jina-ai/vectordb ⭐ 633 
 A Python vector database you just need - no more, no less.
- 
koaning/bulk ⭐ 597 
 Bulk is a quick UI developer tool to apply some bulk labels.
- 
stackloklabs/deepfabric ⭐ 584 
 Promptwright is a Python library designed for generating large synthetic datasets using LLMs
 🔗 lukehinds.github.io/deepfabric
- 
koaning/embetter ⭐ 517 
 just a bunch of useful embeddings for scikit-learn pipelines
 🔗 koaning.github.io/embetter
- 
koaning/doubtlab ⭐ 514 
 Doubt your data, find bad labels.
 🔗 koaning.github.io/doubtlab
- 
apache/datafusion-python ⭐ 507 
 This is a Python library that binds to Apache Arrow in-memory query engine DataFusion
 🔗 datafusion.apache.org/python
- 
github/innovationgraph ⭐ 502 
 GitHub Innovation Graph
 🔗 innovationgraph.github.com
- 
titan-systems/titan ⭐ 477 
 Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API.
Debugging and tracing tools.
- 
cool-rr/PySnooper ⭐ 16,572 
 Never use print for debugging again
- 
shobrook/rebound ⭐ 4,131 
 Instant Stack Overflow results whenever an exception is thrown
- 
inducer/pudb ⭐ 3,156 
 Full-screen console debugger for Python
 🔗 documen.tician.de/pudb
- 
alexmojaki/heartrate ⭐ 1,828 
 Simple real time visualisation of the execution of a Python program.
- 
alexmojaki/birdseye ⭐ 1,695 
 Graphical Python debugger which lets you easily view the values of all evaluated expressions
 🔗 birdseye.readthedocs.io
- 
pdbpp/pdbpp ⭐ 1,429 
 pdb++, a drop-in replacement for pdb (the Python debugger)
- 
alexmojaki/snoop ⭐ 1,401 
 A powerful set of Python debugging tools, based on PySnooper
Text-to-image diffusion model libraries, tools and apps for generating images from natural language.
- 
automatic1111/stable-diffusion-webui ⭐ 157,053 
 Stable Diffusion web UI
- 
comfyanonymous/ComfyUI ⭐ 90,015 
 The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
 🔗 www.comfy.org
- 
compvis/stable-diffusion ⭐ 71,549 
 A latent text-to-image diffusion model
 🔗 ommer-lab.com/research/latent-diffusion-models
- 
stability-ai/stablediffusion ⭐ 41,809 
 High-Resolution Image Synthesis with Latent Diffusion Models
- 
lllyasviel/ControlNet ⭐ 33,123 
 Let us control diffusion models!
- 
huggingface/diffusers ⭐ 30,995 
 🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
 🔗 huggingface.co/docs/diffusers
- 
invoke-ai/InvokeAI ⭐ 25,953 
 Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry leading WebUI, and serves as the foundation for multiple commercial products.
 🔗 invoke-ai.github.io/invokeai
- 
openbmb/MiniCPM-V ⭐ 22,024 
 MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
- 
apple/ml-stable-diffusion ⭐ 17,625 
 Stable Diffusion with Core ML on Apple Silicon
- 
borisdayma/dalle-mini ⭐ 14,810 
 DALL·E Mini - Generate images from a text prompt
 🔗 www.craiyon.com
- 
divamgupta/diffusionbee-stable-diffusion-ui ⭐ 13,404 
 Diffusion Bee is the easiest way to run Stable Diffusion locally on your M1 Mac. Comes with a one-click installer. No dependencies or technical knowledge needed.
 🔗 diffusionbee.com
- 
compvis/latent-diffusion ⭐ 13,375 
 High-Resolution Image Synthesis with Latent Diffusion Models
- 
instantid/InstantID ⭐ 11,826 
 InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
 🔗 instantid.github.io
- 
facebookresearch/dinov2 ⭐ 11,655 
 PyTorch code and models for the DINOv2 self-supervised learning method.
- 
lucidrains/DALLE2-pytorch ⭐ 11,332 
 Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
- 
opengvlab/InternVL ⭐ 9,284 
 [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
 🔗 internvl.readthedocs.io/en/latest
- 
idea-research/GroundingDINO ⭐ 8,997 
 [ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
 🔗 arxiv.org/abs/2303.05499
- 
ashawkey/stable-dreamfusion ⭐ 8,728 
 Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
- 
carson-katri/dream-textures ⭐ 8,078 
 Stable Diffusion built-in to Blender
- 
xavierxiao/Dreambooth-Stable-Diffusion ⭐ 7,745 
 Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
- 
timothybrooks/instruct-pix2pix ⭐ 6,802 
 PyTorch implementation of InstructPix2Pix, an instruction-based image editing model, based on the original CompVis/stable_diffusion repo.
- 
openai/consistency_models ⭐ 6,412 
 Official repo for consistency models.
- 
salesforce/BLIP ⭐ 5,509 
 PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
- 
nateraw/stable-diffusion-videos ⭐ 4,620 
 Create 🔥 videos with Stable Diffusion by exploring the latent space and morphing between text prompts
- 
lkwq007/stablediffusion-infinity ⭐ 3,880 
 Outpainting with Stable Diffusion on an infinite canvas
- 
jina-ai/discoart ⭐ 3,832 
 🪩 Create Disco Diffusion artworks in one line
- 
mlc-ai/web-stable-diffusion ⭐ 3,681 
 Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.
 🔗 mlc.ai/web-stable-diffusion
- 
openai/improved-diffusion ⭐ 3,678 
 Release for Improved Denoising Diffusion Probabilistic Models
- 
openai/glide-text2im ⭐ 3,661 
 GLIDE: a diffusion-based text-conditional image synthesis model
- 
google-research/big_vision ⭐ 3,161 
 Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
- 
open-compass/VLMEvalKit ⭐ 3,123 
 Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
 🔗 huggingface.co/spaces/opencompass/open_vlm_leaderboard
- 
saharmor/dalle-playground ⭐ 2,753 
 A playground to generate images from any text prompt using Stable Diffusion (past: using DALL-E Mini)
- 
stability-ai/stability-sdk ⭐ 2,430 
 SDK for interacting with stability.ai APIs (e.g. stable diffusion inference)
 🔗 platform.stability.ai
- 
thudm/CogVLM2 ⭐ 2,416 
 GPT4V-level open-source multi-modal model based on Llama3-8B
- 
coyote-a/ultimate-upscale-for-automatic1111 ⭐ 1,755 
 Ultimate SD Upscale extension for AUTOMATIC1111 Stable Diffusion web UI
- 
divamgupta/stable-diffusion-tensorflow ⭐ 1,606 
 Stable Diffusion in TensorFlow / Keras
- 
nvlabs/prismer ⭐ 1,305 
 The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".
 🔗 shikun.io/projects/prismer
- 
chenyangqiqi/FateZero ⭐ 1,149 
 [ICCV 2023 Oral] "FateZero: Fusing Attentions for Zero-shot Text-based Video Editing"
 🔗 fate-zero-edit.github.io
- 
tanelp/tiny-diffusion ⭐ 936 
 A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.
- 
thereforegames/unprompted ⭐ 808 
 Templating language written for Stable Diffusion workflows. Available as an extension for the Automatic1111 WebUI.
- 
sharonzhou/long_stable_diffusion ⭐ 689 
 Long-form text-to-images generation, using a pipeline of deep generative models (GPT-3 and Stable Diffusion)
- 
gojasper/flash-diffusion ⭐ 626 
 ⚡ Flash Diffusion ⚡: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation (AAAI 2025 Oral)
 🔗 gojasper.github.io/flash-diffusion-project
- 
laion-ai/dalle2-laion ⭐ 502 
 Pretrained Dalle2 from laion
Financial and quantitative libraries: investment research tools, market data, algorithmic trading, backtesting, financial derivatives.
- 
openbb-finance/OpenBB ⭐ 52,910 
 Financial data platform for analysts, quants and AI agents.
 🔗 openbb.co
- 
virattt/ai-hedge-fund ⭐ 41,603 
 AI-powered hedge fund. The goal of this project is to explore the use of AI to make trading decisions.
- 
microsoft/qlib ⭐ 31,753 
 Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing productions. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD...
 🔗 qlib.readthedocs.io/en/latest
- 
ranaroussi/yfinance ⭐ 19,280 
 Download market data from Yahoo! Finance's API
 🔗 ranaroussi.github.io/yfinance
- 
quantopian/zipline ⭐ 19,006 
 Zipline, a Pythonic Algorithmic Trading Library
 🔗 www.zipline.io
- 
mementum/backtrader ⭐ 18,937 
 Python Backtesting library for trading strategies
 🔗 www.backtrader.com
- 
ai4finance-foundation/FinGPT ⭐ 17,657 
 FinGPT: Open-Source Financial Large Language Models! Revolutionize 🔥 We release the trained model on HuggingFace.
 🔗 ai4finance.org
- 
ai4finance-foundation/FinRL ⭐ 12,721 
 FinRL®: Financial Reinforcement Learning. 🔥
 🔗 ai4finance.org
- 
quantconnect/Lean ⭐ 12,449 
 Lean Algorithmic Trading Engine by QuantConnect (Python, C#)
 🔗 lean.io
- 
ta-lib/ta-lib-python ⭐ 11,240 
 Python wrapper for TA-Lib (http://ta-lib.org/).
 🔗 ta-lib.github.io/ta-lib-python
- 
goldmansachs/gs-quant ⭐ 9,433 
 Python toolkit for quantitative finance
 🔗 developer.gs.com/discover/products/gs-quant
- 
kernc/backtesting.py ⭐ 7,248 
 🔎 📈 🐍 💰 Backtest trading strategies in Python.
 🔗 kernc.github.io/backtesting.py
- 
shiyu-coder/Kronos ⭐ 6,901 
 Open-source foundation model for financial candlesticks, trained on data from over 45 global exchanges
- 
ranaroussi/quantstats ⭐ 6,269 
 Portfolio analytics for quants, written in Python
- 
quantopian/pyfolio ⭐ 6,092 
 Portfolio and risk analytics in Python
 🔗 quantopian.github.io/pyfolio
- 
polakowo/vectorbt ⭐ 5,865 
 Find your trading edge, using the fastest engine for backtesting, algorithmic trading, and research.
 🔗 vectorbt.dev
- 
borisbanushev/stockpredictionai ⭐ 5,059 
 In this notebook I will create a complete process for predicting stock price movements. Follow along and we will achieve some pretty good results. For that purpose we will use a Generative Adversarial Network (GAN) with LSTM, a type of Recurrent Neural Network, as generator, and a Convolutional Neural Networ...
- 
google/tf-quant-finance ⭐ 5,002 
 High-performance TensorFlow library for quantitative finance.
- 
gbeced/pyalgotrade ⭐ 4,608 
 Python Algorithmic Trading Library
 🔗 gbeced.github.io/pyalgotrade
- 
matplotlib/mplfinance ⭐ 4,190 
 Financial Markets Data Visualization using Matplotlib
 🔗 pypi.org/project/mplfinance
- 
quantopian/alphalens ⭐ 3,905 
 Performance analysis of predictive (alpha) stock factors
 🔗 quantopian.github.io/alphalens
- 
zvtvz/zvt ⭐ 3,779 
 modular quant framework.
 🔗 zvt.readthedocs.io/en/latest
- 
cuemacro/finmarketpy ⭐ 3,652 
 Python library for backtesting trading strategies & analyzing financial markets (formerly pythalesians)
 🔗 www.cuemacro.com
- 
robcarver17/pysystemtrade ⭐ 3,028 
 Systematic Trading in python
- 
pmorissette/bt ⭐ 2,693 
 bt - flexible backtesting for Python
 🔗 pmorissette.github.io/bt
- 
quantopian/research_public ⭐ 2,679 
 Quantitative research and educational materials
 🔗 www.quantopian.com/lectures
- 
domokane/FinancePy ⭐ 2,548 
 A Python Finance Library that focuses on the pricing and risk-management of Financial Derivatives, including fixed-income, equity, FX and credit derivatives.
- 
pmorissette/ffn ⭐ 2,373 
 ffn - a financial function library for Python
 🔗 pmorissette.github.io/ffn
- 
blankly-finance/blankly ⭐ 2,363 
 🚀 💸 Easily build, backtest and deploy your algo in just a few lines of code. Trade stocks, cryptos, and forex across exchanges w/ one package.
 🔗 package.blankly.finance
- 
cuemacro/findatapy ⭐ 1,915 
 Python library to download market data via Bloomberg, Eikon, Quandl, Yahoo etc.
- 
quantopian/empyrical ⭐ 1,410 
 Common financial risk and performance metrics. Used by zipline and pyfolio.
 🔗 quantopian.github.io/empyrical
- 
idanya/algo-trader ⭐ 845 
 Trading bot with support for realtime trading, backtesting, custom strategies and much more.
- 
chancefocus/PIXIU ⭐ 784 
 This repository introduces PIXIU, an open-source resource featuring the first financial large language models (LLMs), instruction tuning data, and evaluation benchmarks to holistically assess financial LLMs. Our goal is to continually push forward the open-source development of financial artificial intelligence (AI).
- 
gbeced/basana ⭐ 768 
 A Python async and event driven framework for algorithmic trading, with a focus on crypto currencies.
- 
nasdaq/data-link-python ⭐ 567 
 A Python library for Nasdaq Data Link's RESTful API
- 
ivebotunac/PrimoAgent ⭐ 214 
 PrimoAgent is a multi-agent AI stock analysis system built on the LangGraph architecture that orchestrates four specialized agents to provide comprehensive daily trading insights and next-day price predictions
 🔗 primoinvesting.com
Game development tools, engines and libraries.
- 
microsoft/TRELLIS ⭐ 10,695 
 A large 3D asset generation model. It takes in text or image prompts and generates high-quality 3D assets in various formats, such as Radiance Fields, 3D Gaussians, and meshes.
 🔗 trellis3d.github.io
- 
pygame/pygame ⭐ 8,321 
 🐍🎮 pygame (the library) is a Free and Open Source python programming language library for making multimedia applications like games built on top of the excellent SDL library. C, Python, Native, OpenGL.
 🔗 www.pygame.org
- 
panda3d/panda3d ⭐ 4,910 
 Powerful, mature open-source cross-platform game engine for Python and C++, developed by Disney and CMU
 🔗 www.panda3d.org
- 
niklasf/python-chess ⭐ 2,666 
 python-chess is a chess library for Python, with move generation, move validation, and support for common formats
 🔗 python-chess.readthedocs.io/en/latest
- 
pokepetter/ursina ⭐ 2,446 
 A game engine powered by python and panda3d.
 🔗 pokepetter.github.io/ursina
- 
pyglet/pyglet ⭐ 2,090 
 pyglet is a cross-platform windowing and multimedia library for Python, for developing games and other visually rich applications.
 🔗 pyglet.org
- 
pythonarcade/arcade ⭐ 1,882 
 Easy to use Python library for creating 2D arcade games.
 🔗 arcade.academy
Geospatial libraries: raster and vector data formats, interactive mapping and visualisation, computing frameworks for processing images, projections.
- 
domlysz/BlenderGIS ⭐ 8,516 
 Blender addons to make the bridge between Blender and geographic data
- 
python-visualization/folium ⭐ 7,237 
 Python Data. Leaflet.js Maps.
 🔗 python-visualization.github.io/folium
- 
osgeo/gdal ⭐ 5,535 
 GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
 🔗 gdal.org
- 
gboeing/osmnx ⭐ 5,273 
 Download, model, analyze, and visualize street networks and other geospatial features from OpenStreetMap.
 🔗 osmnx.readthedocs.io
- 
geopandas/geopandas ⭐ 4,909 
 Python tools for geographic data
 🔗 geopandas.org
- 
shapely/shapely ⭐ 4,265 
 Manipulation and analysis of geometric objects
 🔗 shapely.readthedocs.io/en/stable
- 
giswqs/geemap ⭐ 3,757 
 A Python package for interactive geospatial analysis and visualization with Google Earth Engine.
 🔗 geemap.org
- 
microsoft/torchgeo ⭐ 3,668 
 TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
 🔗 www.osgeo.org/projects/torchgeo
- 
opengeos/leafmap ⭐ 3,480 
 A Python package for interactive mapping and geospatial analysis with minimal coding in a Jupyter environment
 🔗 leafmap.org
- 
holoviz/datashader ⭐ 3,458 
 Quickly and accurately render even the largest data.
 🔗 datashader.org
- 
opengeos/segment-geospatial ⭐ 3,411 
 A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
 🔗 samgeo.gishub.org
- 
google/earthengine-api ⭐ 3,059 
 Python and JavaScript bindings for calling the Earth Engine API.
- 
rasterio/rasterio ⭐ 2,411 
 Rasterio reads and writes geospatial raster datasets
 🔗 rasterio.readthedocs.io
- 
mcordts/cityscapesScripts ⭐ 2,306 
 README and scripts for the Cityscapes Dataset
- 
apache/sedona ⭐ 2,202 
 A cluster computing framework for processing large-scale geospatial data
 🔗 sedona.apache.org
- 
azavea/raster-vision ⭐ 2,164 
 An open source library and framework for deep learning on satellite and aerial imagery.
 🔗 docs.rastervision.io
- 
gboeing/osmnx-examples ⭐ 1,727 
 Gallery of OSMnx tutorials, usage examples, and feature demonstrations.
 🔗 osmnx.readthedocs.io
- 
microsoft/GlobalMLBuildingFootprints ⭐ 1,664 
 Worldwide building footprints derived from satellite imagery
- 
jupyter-widgets/ipyleaflet ⭐ 1,531 
 A Jupyter - Leaflet.js bridge
 🔗 ipyleaflet.readthedocs.io
- 
pysal/pysal ⭐ 1,421 
 PySAL: Python Spatial Analysis Library Meta-Package
 🔗 pysal.org/pysal
- 
anitagraser/movingpandas ⭐ 1,348 
 Movement trajectory classes and functions built on top of GeoPandas
 🔗 movingpandas.org
- 
sentinel-hub/eo-learn ⭐ 1,194 
 Earth observation processing framework for machine learning in Python
 🔗 eo-learn.readthedocs.io/en/latest
- 
residentmario/geoplot ⭐ 1,181 
 High-level geospatial data visualization library for Python.
 🔗 residentmario.github.io/geoplot/index.html
- 
osgeo/grass ⭐ 1,020 
 GRASS - free and open-source geospatial processing engine
 🔗 grass.osgeo.org
- 
opengeos/streamlit-geospatial ⭐ 976 
 A multi-page streamlit app for geospatial
 🔗 huggingface.co/spaces/giswqs/streamlit
- 
developmentseed/titiler ⭐ 952 
 Build your own Raster dynamic map tile services
 🔗 developmentseed.org/titiler
- 
makepath/xarray-spatial ⭐ 893 
 Raster-based Spatial Analytics for Python
 🔗 xarray-spatial.readthedocs.io
- 
datasystemslab/GeoTorchAI ⭐ 510 
 GeoTorchAI: A Framework for Training and Using Spatiotemporal Deep Learning Models at Scale
 🔗 kanchanchy.github.io/geotorchai
Graphs and network libraries: network analysis, graph machine learning, visualisation.
- 
networkx/networkx ⭐ 16,218 
 Network Analysis in Python
 🔗 networkx.org
- 
stellargraph/stellargraph ⭐ 3,027 
 StellarGraph - Machine Learning on Graphs
 🔗 stellargraph.readthedocs.io
- 
westhealth/pyvis ⭐ 1,137 
 Python package for creating and visualizing interactive network graphs.
 🔗 pyvis.readthedocs.io/en/latest
- 
microsoft/graspologic ⭐ 948 
 graspologic is a package for graph statistical algorithms
 🔗 graspologic-org.github.io/graspologic
- 
rampasek/GraphGPS ⭐ 785 
 Recipe for a General, Powerful, Scalable Graph Transformer
- 
dylanhogg/llmgraph ⭐ 468 
 Create knowledge graphs with LLMs
Graphical user interface libraries and toolkits.
- 
hoffstadt/DearPyGui ⭐ 14,782 
 Dear PyGui: A fast and powerful Graphical User Interface Toolkit for Python with minimal dependencies
 🔗 dearpygui.readthedocs.io/en/latest
- 
pysimplegui/PySimpleGUI ⭐ 13,666 
 Python GUIs for Humans! PySimpleGUI is the top-rated Python application development environment. Launched in 2018 and actively developed, maintained, and supported in 2024. Transforms tkinter, Qt, WxPython, and Remi into a simple, intuitive, and fun experience for both hobbyists and expert users.
 🔗 www.pysimplegui.com
- 
parthjadhav/Tkinter-Designer ⭐ 10,041 
 An easy and fast way to create a Python GUI 🐍
- 
samuelcolvin/FastUI ⭐ 8,881 
 FastUI is a new way to build web application user interfaces defined by declarative Python code.
 🔗 fastui-demo.onrender.com
- 
r0x0r/pywebview ⭐ 5,464 
 Build GUI for your Python program with JavaScript, HTML, and CSS
 🔗 pywebview.flowrl.com
- 
beeware/toga ⭐ 5,193 
 A Python native, OS native GUI toolkit.
 🔗 toga.readthedocs.io/en/latest
- 
dddomodossola/remi ⭐ 3,633 
 Python REMote Interface library. Platform independent. In about 100 Kbytes, perfect for your diet.
- 
wxwidgets/Phoenix ⭐ 2,537 
 wxPython's Project Phoenix. A new implementation of wxPython, better, stronger, faster than he was before.
 🔗 wxpython.org
Jupyter and JupyterLab and Notebook tools, libraries and plugins.
- 
marimo-team/marimo ⭐ 16,256 
 A reactive Python notebook: run a cell or interact with a UI element, and marimo automatically runs dependent cells, keeping code and outputs consistent. marimo notebooks are stored as pure Python, executable as scripts, and deployable as apps.
 🔗 marimo.io
- 
jupyterlab/jupyterlab ⭐ 14,821 
 JupyterLab computational environment.
 🔗 jupyterlab.readthedocs.io
- 
jupyter/notebook ⭐ 12,634 
 Jupyter Interactive Notebook
 🔗 jupyter-notebook.readthedocs.io
- 
garrettj403/SciencePlots ⭐ 8,254 
 Matplotlib styles for scientific plotting
- 
mwouts/jupytext ⭐ 6,996 
 Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
 🔗 jupytext.readthedocs.io
- 
nteract/papermill ⭐ 6,278 
 📚 Parameterize, execute, and analyze notebooks
 🔗 papermill.readthedocs.io/en/latest
- 
voila-dashboards/voila ⭐ 5,822 
 Voilà turns Jupyter notebooks into standalone web applications
 🔗 voila.readthedocs.io
- 
connorferster/handcalcs ⭐ 5,786 
 Python library for converting Python calculations into rendered latex.
- 
jupyterlite/jupyterlite ⭐ 4,591 
 Wasm powered Jupyter running in the browser 💡
 🔗 jupyterlite.rtfd.io/en/stable/try/lab
- 
executablebooks/jupyter-book ⭐ 4,150 
 Create beautiful, publication-quality books and documents from computational content.
 🔗 next.jupyterbook.org
- 
jupyterlab/jupyterlab-desktop ⭐ 4,124 
 JupyterLab desktop application, based on Electron.
- 
jupyterlab/jupyter-ai ⭐ 3,829 
 A generative AI extension for JupyterLab
 🔗 jupyter-ai.readthedocs.io
- 
jupyter-widgets/ipywidgets ⭐ 3,267 
 Interactive Widgets for the Jupyter Notebook
 🔗 ipywidgets.readthedocs.io
- 
quantopian/qgrid ⭐ 3,080 
 An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks
- 
jupyter/nbdime ⭐ 2,778 
 Tools for diffing and merging of Jupyter notebooks.
 🔗 nbdime.readthedocs.io
- 
mito-ds/mito ⭐ 2,568 
 Jupyter extensions that help you write code faster: Context aware AI Chat, Autocomplete, and Spreadsheet
 🔗 trymito.io
- 
jupyter/nbviewer ⭐ 2,265 
 nbconvert as a web service: Render Jupyter Notebooks as static web pages
 🔗 nbviewer.jupyter.org
- 
maartenbreddels/ipyvolume ⭐ 1,959 
 3d plotting for Python in the Jupyter notebook based on IPython widgets using WebGL
- 
jupyter-lsp/jupyterlab-lsp ⭐ 1,944 
 Coding assistance for JupyterLab (code navigation + hover suggestions + linters + autocompletion + rename) using Language Server Protocol
 🔗 jupyterlab-lsp.readthedocs.io
- 
jupyter/nbconvert ⭐ 1,876 
 Jupyter Notebook Conversion
 🔗 nbconvert.readthedocs.io
- 
koaning/drawdata ⭐ 1,534 
 Draw datasets from within Python notebooks.
 🔗 koaning.github.io/drawdata
- 
nbqa-dev/nbQA ⭐ 1,158 
 Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks
 🔗 nbqa.readthedocs.io/en/latest/index.html
- 
8080labs/pyforest ⭐ 1,117 
 With pyforest you can use all your favorite Python libraries without importing them before. If you use a package that is not imported yet, pyforest imports the package for you and adds the code to the first Jupyter cell.
 🔗 8080labs.com
- 
vizzuhq/ipyvizzu ⭐ 970 
 Build animated charts in Jupyter Notebook and similar environments with a simple Python syntax.
 🔗 ipyvizzu.vizzuhq.com
- 
aws/graph-notebook ⭐ 799 
 Library extending Jupyter notebooks to integrate with Apache TinkerPop, openCypher, and RDF SPARQL.
 🔗 github.com/aws/graph-notebook
- 
linealabs/lineapy ⭐ 667 
 Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two lines of code.
 🔗 lineapy.org
- 
xiaohk/stickyland ⭐ 569 
 Break the linear presentation of Jupyter Notebooks with sticky cells!
 🔗 xiaohk.github.io/stickyland
- 
infuseai/colab-xterm ⭐ 469 
 Open a terminal in colab, including the free tier.
Large language model and GPT libraries and frameworks: auto-gpt, agents, QnA, chain-of-thought workflows, API integrations. Also see the Natural Language Processing category for crossover.
- 
significant-gravitas/AutoGPT ⭐ 178,815 
 AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
 🔗 agpt.co
- 
open-webui/open-webui ⭐ 111,608 
 Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with built-in inference engine for RAG
 🔗 openwebui.com
- 
deepseek-ai/DeepSeek-V3 ⭐ 99,541 
 A strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.
- 
ggerganov/llama.cpp ⭐ 87,208 
 LLM inference in C/C++
- 
nomic-ai/gpt4all ⭐ 76,745 
 GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
 🔗 nomic.ai/gpt4all
- 
modelcontextprotocol/servers ⭐ 69,508 
 A collection of reference implementations for the Model Context Protocol (MCP), as well as references to community built servers
 🔗 modelcontextprotocol.io
- 
infiniflow/ragflow ⭐ 65,505 
 RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
 🔗 ragflow.io
- 
xtekky/gpt4free ⭐ 65,176 
 The official gpt4free repository | various collection of powerful language models | o4, o3 and deepseek r1, gpt-4.1, gemini 2.5
 🔗 t.me/g4f_channel
- 
killianlucas/open-interpreter ⭐ 60,583 
 A natural language interface for computers
 🔗 openinterpreter.com
- 
hiyouga/LLaMA-Factory ⭐ 59,703 
 Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
 🔗 llamafactory.readthedocs.io
- 
vllm-project/vllm ⭐ 59,438 
 A high-throughput and memory-efficient inference and serving engine for LLMs
 🔗 docs.vllm.ai
- 
facebookresearch/llama ⭐ 58,781 
 Inference code for Llama models
- 
imartinez/private-gpt ⭐ 56,605 
 Interact with your documents using the power of GPT, 100% privately, no data leaks
 🔗 privategpt.dev
- 
gpt-engineer-org/gpt-engineer ⭐ 54,908 
 CLI platform to experiment with codegen. Precursor to: https://lovable.dev
- 
unclecode/crawl4ai ⭐ 54,189 
 AI-ready web crawling tailored for LLMs, AI agents, and data pipelines. Open source, flexible, and built for real-time performance, Crawl4AI empowers developers with unmatched speed, precision, and deployment ease.
 🔗 crawl4ai.com
- 
xai-org/grok-1 ⭐ 50,515 
 This repository contains JAX example code for loading and running the Grok-1 open-weights model.
- 
unslothai/unsloth ⭐ 46,583 
 Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
 🔗 docs.unsloth.ai
- 
oobabooga/text-generation-webui ⭐ 45,111 
 The definitive Web UI for local AI, with powerful features and easy setup.
 🔗 oobabooga.gumroad.com/l/deep_reason
- 
karpathy/nanoGPT ⭐ 44,820 
 The simplest, fastest repository for training/finetuning medium-sized GPTs.
- 
pathwaycom/llm-app ⭐ 41,441 
 Ready-to-run cloud templates for RAG, AI pipelines, and enterprise search with live data. 🐳Docker-friendly.⚡Always in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
 🔗 pathway.com/developers/templates
- 
hpcaitech/ColossalAI ⭐ 41,194 
 Making large AI models cheaper, faster and more accessible
 🔗 www.colossalai.org
- 
thudm/ChatGLM-6B ⭐ 41,120 
 ChatGLM-6B: An Open Bilingual Dialogue Language Model | 开源双语对话语言模型
- 
lm-sys/FastChat ⭐ 39,132 
 An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
- 
quivrhq/quivr ⭐ 38,466 
 Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: PGVector, Faiss. Any Files. Any way you want.
 🔗 core.quivr.com
- 
laion-ai/Open-Assistant ⭐ 37,470 
 OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
 🔗 open-assistant.io
- 
moymix/TaskMatrix ⭐ 34,374 
 Connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting.
- 
danielmiessler/Fabric ⭐ 33,705 
 Fabric is an open-source framework for augmenting humans using AI. It provides a modular system for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
 🔗 danielmiessler.com/p/fabric-origin-story
- 
pythagora-io/gpt-pilot ⭐ 33,411 
 The first real AI developer
- 
exo-explore/exo ⭐ 31,689 
 Run your own AI cluster at home. Unify your existing devices into one powerful GPU: iPhone, iPad, Android, Mac, NVIDIA, Raspberry Pi etc
- 
khoj-ai/khoj ⭐ 31,234 
 Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI
 🔗 khoj.dev
- 
tatsu-lab/stanford_alpaca ⭐ 30,165 
 Code and documentation to train Stanford's Alpaca models, and generate the data.
 🔗 crfm.stanford.edu/2023/03/13/alpaca.html
- 
berriai/litellm ⭐ 29,536 
 Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
 🔗 docs.litellm.ai/docs
- 
meta-llama/llama3 ⭐ 29,003 
 The official Meta Llama 3 GitHub site
- 
stanfordnlp/dspy ⭐ 28,858 
 DSPy: The framework for programming—not prompting—language models
 🔗 dspy.ai
- 
microsoft/graphrag ⭐ 28,482 
 A modular graph-based Retrieval-Augmented Generation (RAG) system
 🔗 microsoft.github.io/graphrag
- 
karpathy/llm.c ⭐ 27,749 
 LLM training in simple, pure C/CUDA. There is no need for 245MB of PyTorch or 107MB of cPython
- 
microsoft/semantic-kernel ⭐ 26,360 
 An SDK that integrates LLMs like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java
 🔗 aka.ms/semantic-kernel
- 
vision-cair/MiniGPT-4 ⭐ 25,739 
 Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
 🔗 minigpt-4.github.io
- 
huggingface/open-r1 ⭐ 25,496 
 The goal of this repo is to build the missing pieces of the R1 pipeline such that everybody can reproduce and build on top of it
- 
qwenlm/Qwen3 ⭐ 24,887 
 Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
- 
cinnamon/kotaemon ⭐ 24,460 
 An open-source RAG UI for chatting with your documents. Built with both end users and developers in mind
 🔗 cinnamon.github.io/kotaemon
- 
microsoft/JARVIS ⭐ 24,383 
 JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
- 
openai/gpt-2 ⭐ 24,234 
 Code for the paper "Language Models are Unsupervised Multitask Learners"
 🔗 openai.com/blog/better-language-models
- 
haotian-liu/LLaVA ⭐ 23,659 
 [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
 🔗 llava.hliu.cc
- 
deepset-ai/haystack ⭐ 22,851 
 AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversatio...
 🔗 haystack.deepset.ai
- 
karpathy/minGPT ⭐ 22,659 
 A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
- 
microsoft/BitNet ⭐ 22,607 
 Official inference framework for 1-bit LLMs (e.g., BitNet b1.58). It offers a suite of optimized kernels, that support fast and lossless inference of 1.58-bit models
- 
nirdiamant/RAG_Techniques ⭐ 22,147 
 The most comprehensive and dynamic collection of Retrieval-Augmented Generation (RAG) tutorials available today. This repository serves as a hub for cutting-edge techniques aimed at enhancing the accuracy, efficiency, and contextual richness of RAG systems.
- 
mlc-ai/mlc-llm ⭐ 21,432 
 Universal LLM Deployment Engine with ML Compilation
 🔗 llm.mlc.ai
- 
openai/chatgpt-retrieval-plugin ⭐ 21,226 
 The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
- 
guidance-ai/guidance ⭐ 20,805 
 A guidance language for controlling large language models.
- 
rasahq/rasa ⭐ 20,720 
 💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
 🔗 rasa.com/docs/rasa
- 
vanna-ai/vanna ⭐ 20,712 
 RAG (Retrieval-Augmented Generation) framework for SQL generation and related functionality.
 🔗 vanna.ai/docs
- 
anthropics/claude-cookbooks ⭐ 20,539 
 Provides code and guides designed to help developers build with Claude, offering copy-able code snippets that you can easily integrate into your own projects.
- 
dao-ailab/flash-attention ⭐ 19,784 
 Fast and memory-efficient exact attention
- 
huggingface/peft ⭐ 19,746 
 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
 🔗 huggingface.co/docs/peft
- 
stitionai/devika ⭐ 19,487 
 Devika is an advanced AI software engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective.
 🔗 opcode.sh
- 
qwenlm/Qwen ⭐ 19,430 
 The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
- 
modelcontextprotocol/python-sdk ⭐ 19,025 
 The Model Context Protocol allows applications to provide context for LLMs in a standardized way, separating the concerns of providing context from the actual LLM interaction.
 🔗 modelcontextprotocol.io
- 
tloen/alpaca-lora ⭐ 18,956 
 Instruct-tune LLaMA on consumer hardware
- 
karpathy/llama2.c ⭐ 18,810 
 Inference Llama 2 in one file of pure C
- 
sgl-project/sglang ⭐ 18,606 
 SGLang is a fast serving framework for large language models and vision language models.
 🔗 docs.sglang.ai
- 
facebookresearch/llama-cookbook ⭐ 17,924 
 Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama model family and using them on various provider services
 🔗 www.llama.com
- 
openai/evals ⭐ 17,037 
 Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
- 
idea-research/Grounded-Segment-Anything ⭐ 16,971 
 Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
 🔗 arxiv.org/abs/2401.14159
- 
transformeroptimus/SuperAGI ⭐ 16,759 
 <⚡️> SuperAGI - A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
 🔗 superagi.com
- 
mlc-ai/web-llm ⭐ 16,585 
 High-performance In-browser LLM Inference Engine
 🔗 webllm.mlc.ai
- 
facebookresearch/codellama ⭐ 16,367 
 Inference code for CodeLlama models
- 
google/langextract ⭐ 16,089 
 Library that uses LLMs to extract structured information from unstructured text documents based on user-defined instructions
 🔗 pypi.org/project/langextract
- 
mayooear/ai-pdf-chatbot-langchain ⭐ 16,025 
 AI PDF chatbot agent built with LangChain & LangGraph
 🔗 www.youtube.com/watch?v=of6soldiewu
- 
lvwerra/trl ⭐ 15,739 
 Train transformer language models with reinforcement learning.
 🔗 hf.co/docs/trl
- 
thudm/ChatGLM2-6B ⭐ 15,698 
 ChatGLM2-6B: An Open Bilingual Chat LLM | 开源双语对话语言模型
- 
kvcache-ai/ktransformers ⭐ 15,130 
 A Flexible Framework for LLM Inference Optimizations - allows researchers to replace original torch modules with optimized variants
 🔗 kvcache-ai.github.io/ktransformers
- 
fauxpilot/fauxpilot ⭐ 14,747 
 FauxPilot - an open-source alternative to GitHub Copilot server
- 
skyvern-ai/skyvern ⭐ 14,518 
 Skyvern automates browser-based workflows using LLMs and computer vision. It provides a simple API endpoint to fully automate manual workflows, replacing brittle or unreliable automation solutions.
 🔗 www.skyvern.com
- 
llmware-ai/llmware ⭐ 14,457 
 Unified framework for building enterprise RAG pipelines with small, specialized models
 🔗 llmware-ai.github.io/llmware
- 
volcengine/verl ⭐ 14,021 
 veRL is a flexible, efficient and production-ready RL training library for large language models (LLMs).
 🔗 verl.readthedocs.io/en/latest/index.html
- 
blinkdl/RWKV-LM ⭐ 13,997 
 RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RNN and transformer - great performance, linear time, constant space (no kv-cache), fast training, infinite ctx_len, and f...
- 
qwenlm/Qwen3-Coder ⭐ 13,837 
 Qwen3-Coder is the code version of Qwen3, the large language model series developed by Qwen team, Alibaba Cloud.
- 
nvidia/Megatron-LM ⭐ 13,748 
 Ongoing research training transformer models at scale
 🔗 docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start
- 
swivid/F5-TTS ⭐ 13,344 
 Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
 🔗 arxiv.org/abs/2410.06885
- 
lightning-ai/litgpt ⭐ 12,815 
 20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
 🔗 lightning.ai
- 
paddlepaddle/PaddleNLP ⭐ 12,787 
 Easy-to-use and powerful LLM and SLM library with awesome model zoo.
 🔗 paddlenlp.readthedocs.io
- 
microsoft/LoRA ⭐ 12,770 
 Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
 🔗 arxiv.org/abs/2106.09685
- 
dottxt-ai/outlines ⭐ 12,651 
 Structured Text Generation from LLMs
 🔗 dottxt-ai.github.io/outlines
- 
shishirpatil/gorilla ⭐ 12,452 
 Enables LLMs to use tools by invoking APIs. Given a query, Gorilla comes up with the semantically and syntactically correct API.
 🔗 gorilla.cs.berkeley.edu
- 
andrewyng/aisuite ⭐ 12,362 
 Simple, unified interface to multiple Generative AI providers. aisuite makes it easy for developers to use multiple LLMs through a standardized interface.
- 
jiayi-pan/TinyZero ⭐ 12,229 
 TinyZero is a reproduction of DeepSeek R1 Zero in countdown and multiplication tasks.
- 
canner/WrenAI ⭐ 12,167 
 Open-source GenBI AI Agent that empowers data-driven teams to chat with their data to generate Text-to-SQL, charts, spreadsheets, reports, and BI.
 🔗 getwren.ai/oss
- 
openlmlab/MOSS ⭐ 12,051 
 An open-source tool-augmented conversational language model from Fudan University
 🔗 txsun1997.github.io/blogs/moss.html
- 
h2oai/h2ogpt ⭐ 11,923 
 Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
 🔗 h2o.ai
- 
google-research/vision_transformer ⭐ 11,830 
 Vision Transformer and MLP-Mixer Architectures
- 
qwenlm/Qwen-Agent ⭐ 11,804 
 Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
 🔗 pypi.org/project/qwen-agent
- 
instructor-ai/instructor ⭐ 11,564 
 Instructor is a Python library that makes it a breeze to work with structured outputs from large language models (LLMs). Built on top of Pydantic, it provides a simple, transparent, and user-friendly API to manage validation, retries, and streaming responses.
 🔗 python.useinstructor.com
- 
explodinggradients/ragas ⭐ 10,989 
 Supercharge Your LLM Application Evaluations 🚀
 🔗 docs.ragas.io
- 
databrickslabs/dolly ⭐ 10,798 
 Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
 🔗 www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html
- 
chainlit/chainlit ⭐ 10,788 
 Build Conversational AI in minutes ⚡️
 🔗 docs.chainlit.io
- 
microsoft/promptflow ⭐ 10,787 
 Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
 🔗 microsoft.github.io/promptflow
- 
sapientinc/HRM ⭐ 10,763 
 Hierarchical Reasoning Model (HRM), a novel recurrent architecture that attains significant computational depth while maintaining both training stability and efficiency
 🔗 sapient.inc
- 
artidoro/qlora ⭐ 10,679 
 QLoRA: Efficient Finetuning of Quantized LLMs
 🔗 arxiv.org/abs/2305.14314
- 
axolotl-ai-cloud/axolotl ⭐ 10,554 
 Go ahead and axolotl questions
 🔗 docs.axolotl.ai
- 
mistralai/mistral-inference ⭐ 10,486 
 Official inference library for Mistral models
 🔗 mistral.ai
- 
eleutherai/lm-evaluation-harness ⭐ 10,272 
 A framework for few-shot evaluation of language models.
 🔗 www.eleuther.ai
- 
modelscope/ms-swift ⭐ 10,191 
 Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).
 🔗 swift.readthedocs.io/zh-cn/latest
- 
karpathy/minbpe ⭐ 9,982 
 Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
- 
anthropics/claude-quickstarts ⭐ 9,948 
 A collection of projects designed to help developers quickly get started with building applications using the Anthropic API. Each quickstart provides a foundation that you can easily build upon and customize for your specific needs.
- 
abetlen/llama-cpp-python ⭐ 9,633 
 Simple Python bindings for @ggerganov's llama.cpp library.
 🔗 llama-cpp-python.readthedocs.io
- 
e2b-dev/E2B ⭐ 9,610 
 E2B is an open-source infrastructure that allows you to run AI-generated code in secure isolated sandboxes in the cloud
 🔗 e2b.dev/docs
- 
mshumer/gpt-prompt-engineer ⭐ 9,597 
 Simply input a description of your task and some test cases, and the system will generate, test, and rank a multitude of prompts to find the ones that perform the best.
- 
blinkdl/ChatRWKV ⭐ 9,505 
 ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.
- 
skypilot-org/skypilot ⭐ 8,794 
 Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 17+ clouds, or on-prem).
 🔗 docs.skypilot.co
- 
jzhang38/TinyLlama ⭐ 8,758 
 The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
- 
vaibhavs10/insanely-fast-whisper ⭐ 8,682 
 An opinionated CLI to transcribe Audio files w/ Whisper on-device! Powered by 🤗 Transformers, Optimum & flash-attn
- 
thudm/CodeGeeX ⭐ 8,652 
 CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
 🔗 codegeex.cn
- 
apple/ml-ferret ⭐ 8,650 
 Ferret: Refer and Ground Anything Anywhere at Any Granularity
- 
vikhyat/moondream ⭐ 8,600 
 A tiny open-source computer-vision language model designed to run efficiently on edge devices
 🔗 moondream.ai
- 
promptfoo/promptfoo ⭐ 8,588 
 Test your prompts, agents, and RAGs. AI Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
 🔗 promptfoo.dev
- 
optimalscale/LMFlow ⭐ 8,465 
 An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
 🔗 optimalscale.github.io/lmflow
- 
sjtu-ipads/PowerInfer ⭐ 8,347 
 High-speed Large Language Model Serving for Local Deployment
- 
eleutherai/gpt-neo ⭐ 8,289 
 An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
 🔗 www.eleuther.ai
- 
pipecat-ai/pipecat ⭐ 8,259 
 Open Source framework for voice and multimodal conversational AI
- 
lianjiatech/BELLE ⭐ 8,232 
 BELLE: Be Everyone's Large Language model Engine(开源中文对话大模型)
- 
plachtaa/VALL-E-X ⭐ 7,924 
 An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
- 
01-ai/Yi ⭐ 7,841 
 The Yi series models are the next generation of open-source large language models trained from scratch by 01.AI.
 🔗 01.ai
- 
zilliztech/GPTCache ⭐ 7,779 
 Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
 🔗 gptcache.readthedocs.io
- 
future-house/paper-qa ⭐ 7,737 
 High-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature
 🔗 futurehouse.gitbook.io/futurehouse-cookbook
- 
thudm/GLM-130B ⭐ 7,680 
 GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
- 
sweepai/sweep ⭐ 7,592 
 Sweep: AI coding assistant for JetBrains
 🔗 sweep.dev
- 
topoteretes/cognee ⭐ 7,556 
 Memory for AI Agents in 6 lines of code
 🔗 docs.cognee.ai
- 
openlm-research/open_llama ⭐ 7,520 
 OpenLLaMA: An Open Reproduction of LLaMA
- 
bigcode-project/starcoder ⭐ 7,459 
 Home of StarCoder: fine-tuning & inference!
- 
weaviate/Verba ⭐ 7,363 
 Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
- 
eleutherai/gpt-neox ⭐ 7,307 
 An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
 🔗 www.eleuther.ai
- 
bytedance/Dolphin ⭐ 7,291 
 A novel multimodal document image parsing model following an analyze-then-parse paradigm
- 
bhaskatripathi/pdfGPT ⭐ 7,148 
 PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities. The most effective open source solution to turn your PDF files into a chatbot!
 🔗 huggingface.co/spaces/bhaskartripathi/pdfchatter
- 
internlm/InternLM ⭐ 7,063 
 Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
 🔗 internlm.readthedocs.io
- 
mit-han-lab/streaming-llm ⭐ 7,055 
 [ICLR 2024] Efficient Streaming Language Models with Attention Sinks
 🔗 arxiv.org/abs/2309.17453
- 
apple/corenet ⭐ 7,016 
 CoreNet is a deep neural network toolkit that allows researchers and engineers to train standard and novel small and large-scale models for a variety of tasks, including foundation models (e.g., CLIP and LLM), object classification, object detection, and semantic segmentation.
- 
geeeekexplorer/nano-vllm ⭐ 6,958 
 A lightweight vLLM implementation built from scratch.
- 
apple/ml-fastvlm ⭐ 6,716 
 FastVLM: Efficient Vision Encoding for Vision Language Models
- 
langchain-ai/opengpts ⭐ 6,714 
 An open source effort to create a similar experience to OpenAI's GPTs and Assistants API.
- 
nirdiamant/Prompt_Engineering ⭐ 6,600 
 A comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies.
- 
run-llama/rags ⭐ 6,511 
 RAGs is a Streamlit app that lets you create a RAG pipeline from a data source using natural language.
- 
minedojo/Voyager ⭐ 6,392 
 An Open-Ended Embodied Agent with Large Language Models
 🔗 voyager.minedojo.org
- 
nat/openplayground ⭐ 6,358 
 An LLM playground you can run on your laptop
- 
arcee-ai/mergekit ⭐ 6,338 
 Tools for merging pretrained large language models.
- 
qwenlm/Qwen-VL ⭐ 6,275 
 The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
- 
lyogavin/airllm ⭐ 6,257 
 AirLLM optimizes inference memory usage, allowing 70B large language models to run inference on a single 4GB GPU card without quantization, distillation and pruning. And you can run 405B Llama3.1 on 8GB vram now.
- 
boundaryml/baml ⭐ 6,217 
 The AI framework that adds the engineering to prompt engineering (Python/TS/Ruby/Java/C#/Rust/Go compatible)
 🔗 docs.boundaryml.com
- 
open-compass/opencompass ⭐ 6,125 
 OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2,GPT-4,LLaMa2, Qwen,GLM, Claude, etc) over 100+ datasets.
 🔗 opencompass.org.cn
- 
pytorch-labs/gpt-fast ⭐ 6,111 
 Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
- 
langchain-ai/chat-langchain ⭐ 6,103 
 Locally hosted chatbot specifically focused on question answering over the LangChain documentation
 🔗 chat.langchain.com
- 
lightning-ai/lit-llama ⭐ 6,071 
 Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
- 
yuliang-liu/MonkeyOCR ⭐ 6,016 
 A lightweight LMM-based Document Parsing Model with a Structure-Recognition-Relation Triplet Paradigm
- 
allenai/OLMo ⭐ 6,016 
 OLMo is a repository for training and using AI2's state-of-the-art open language models. It is designed by scientists, for scientists.
 🔗 allenai.org/olmo
- 
guardrails-ai/guardrails ⭐ 5,739 
 Open-source Python package for specifying structure and type, validating and correcting the outputs of large language models (LLMs)
 🔗 www.guardrailsai.com/docs
- 
linkedin/Liger-Kernel ⭐ 5,712 
 Efficient Triton Kernels for LLM Training
 🔗 openreview.net/pdf?id=36sjait42g
- 
microsoft/promptbase ⭐ 5,685 
 promptbase is an evolving collection of resources, best practices, and example scripts for eliciting the best performance from foundation models.
- 
microsoft/LLMLingua ⭐ 5,462 
 [EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.
 🔗 llmlingua.com
- 
openbmb/ToolBench ⭐ 5,264 
 [ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
 🔗 openbmb.github.io/toolbench
- 
dsdanielpark/Bard-API ⭐ 5,241 
 The unofficial Python package that returns responses from Google Bard using a cookie value.
 🔗 pypi.org/project/bardapi
- 
nvidia/Guardrails ⭐ 5,112 
 NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
- 
katanaml/sparrow ⭐ 5,004 
 Sparrow is a solution for efficient data extraction and processing from various documents and images like invoices and receipts
 🔗 sparrow.katanaml.io
- 
1rgs/jsonformer ⭐ 4,831 
 A Bulletproof Way to Generate Structured JSON from Language Models
- 
togethercomputer/RedPajama-Data ⭐ 4,825 
 The RedPajama-Data repository contains code for preparing large datasets for training large language models.
- 
agiresearch/AIOS ⭐ 4,687 
 AIOS, a Large Language Model (LLM) Agent operating system, embeds large language model into Operating Systems (OS) as the brain of the OS, enabling an operating system "with soul" -- an important step towards AGI.
 🔗 aios.foundation
- 
h2oai/h2o-llmstudio ⭐ 4,652 
 H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs. Documentation: https://docs.h2o.ai/h2o-llmstudio/
 🔗 h2o.ai
- 
kyegomez/tree-of-thoughts ⭐ 4,540 
 Plug in and Play Implementation of Tree of Thoughts: Deliberate Problem Solving with Large Language Models that Elevates Model Reasoning by at least 70%
 🔗 discord.gg/qutxnk2nmf
- 
yizhongw/self-instruct ⭐ 4,486 
 Aligning pretrained language models with instruction data generated by themselves.
- 
microsoft/BioGPT ⭐ 4,460 
 Implementation of BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
- 
turboderp/exllamav2 ⭐ 4,334 
 A fast inference library for running LLMs locally on modern consumer-class GPUs
- 
ragapp/ragapp ⭐ 4,334 
 The easiest way to use Agentic RAG in any enterprise
- 
marker-inc-korea/AutoRAG ⭐ 4,333 
 AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
 🔗 marker-inc-korea.github.io/autorag
- 
instruction-tuning-with-gpt-4/GPT-4-LLM ⭐ 4,331 
 Instruction Tuning with GPT-4
 🔗 instruction-tuning-with-gpt-4.github.io
- 
lm-sys/RouteLLM ⭐ 4,307 
 A framework for serving and evaluating LLM routers - save LLM costs without compromising quality
- 
vllm-project/aibrix ⭐ 4,277 
 AIBrix delivers a cloud-native solution optimized for deploying, managing, and scaling large language model (LLM) inference, tailored specifically to enterprise needs.
- 
truefoundry/cognita ⭐ 4,259 
 RAG (Retrieval Augmented Generation) Framework for building modular, open source applications for production by TrueFoundry
 🔗 cognita.truefoundry.com
- 
llm-attacks/llm-attacks ⭐ 4,238 
 This is the official repository for "Universal and Transferable Adversarial Attacks on Aligned Language Models"
 🔗 llm-attacks.org
- 
kiln-ai/Kiln ⭐ 4,218 
 The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
 🔗 kiln.tech
- 
mshumer/gpt-llm-trainer ⭐ 4,154 
 Input a description of your task, and the system will generate a dataset, parse it, and fine-tune a LLaMA 2 model for you
- 
microsoft/LMOps ⭐ 4,144 
 General technology for enabling AI capabilities w/ LLMs and MLLMs
 🔗 aka.ms/generalai
- 
openai/simple-evals ⭐ 4,090 
 Lightweight library for evaluating language models
- 
eth-sri/lmql ⭐ 4,060 
 A language for constraint-guided and efficient LLM programming.
 🔗 lmql.ai
- 
huggingface/text-embeddings-inference ⭐ 4,056 
 A blazing fast inference solution for text embeddings models
 🔗 huggingface.co/docs/text-embeddings-inference/quick_tour
- 
deep-agent/R1-V ⭐ 3,948 
 We are building a general framework for Reinforcement Learning with Verifiable Rewards (RLVR) in VLM. RLVR outperforms chain-of-thought supervised fine-tuning (CoT-SFT) in both effectiveness and out-of-distribution (OOD) robustness for vision language models.
- 
defog-ai/sqlcoder ⭐ 3,904 
 SoTA LLM for converting natural language questions to SQL queries
- 
flashinfer-ai/flashinfer ⭐ 3,843 
 FlashInfer is a library and kernel generator for Large Language Models that provides high-performance implementation of LLM GPU kernels such as FlashAttention, SparseAttention, PageAttention, Sampling
 🔗 flashinfer.ai
- 
ravenscroftj/turbopilot ⭐ 3,813 
 Turbopilot is an open source large-language-model based code completion engine that runs locally on CPU
- 
meta-llama/PurpleLlama ⭐ 3,807 
 Set of tools to assess and improve LLM security. An umbrella project to bring together tools and evals to help the community build responsibly with open genai models.
- 
sylphai-inc/AdalFlow ⭐ 3,787 
 Unified auto-differentiative framework for both zero-shot prompt optimization and few-shot optimization. It advances existing auto-optimization research, including Text-Grad and DsPy
 🔗 adalflow.sylph.ai
- 
mmabrouk/llm-workflow-engine ⭐ 3,709 
 Power CLI and Workflow manager for LLMs (core package)
- 
hiyouga/EasyR1 ⭐ 3,705 
 EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
 🔗 verl.readthedocs.io/en/latest/index.html
- 
bclavie/RAGatouille ⭐ 3,690 
 Bridging the gap between state-of-the-art research and alchemical RAG pipeline practices.
- 
lightning-ai/LitServe ⭐ 3,579 
 The easiest way to deploy agents, MCP servers, models, RAG, pipelines and more. No MLOps. No YAML.
 🔗 lightning.ai/litserve
- 
next-gpt/NExT-GPT ⭐ 3,566 
 Code and models for ICML 2024 paper, NExT-GPT: Any-to-Any Multimodal Large Language Model
 🔗 next-gpt.github.io
- 
minimaxir/simpleaichat ⭐ 3,521 
 Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
- 
iryna-kondr/scikit-llm ⭐ 3,481 
 Seamlessly integrate LLMs into scikit-learn.
 🔗 beastbyte.ai
- 
predibase/lorax ⭐ 3,445 
 Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
 🔗 loraexchange.ai
- 
jaymody/picoGPT ⭐ 3,412 
 An unnecessarily tiny implementation of GPT-2 in NumPy.
- 
minimaxir/gpt-2-simple ⭐ 3,406 
 Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
- 
verazuo/jailbreak_llms ⭐ 3,369 
 Official repo for the ACM CCS 2024 paper "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts
 🔗 jailbreak-llms.xinyueshen.me
- 
novasky-ai/SkyThought ⭐ 3,337 
 Sky-T1: Train your own O1 preview model within $450
 🔗 novasky-ai.github.io
- 
deep-diver/LLM-As-Chatbot ⭐ 3,337 
 LLM as a Chatbot Service
- 
huggingface/smollm ⭐ 3,294 
 Everything about the SmolLM and SmolVLM family of models
 🔗 huggingface.co/huggingfacetb
- 
mit-han-lab/llm-awq ⭐ 3,283 
 AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
- 
pytorch/executorch ⭐ 3,277 
 An end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.
 🔗 pytorch.org/executorch
- 
luodian/Otter ⭐ 3,270 
 🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
 🔗 otter-ntu.github.io
- 
agenta-ai/agenta ⭐ 3,217 
 The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
 🔗 www.agenta.ai
- 
evolvinglmms-lab/lmms-eval ⭐ 3,134 
 One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
 🔗 www.lmms-lab.com
- 
cohere-ai/cohere-toolkit ⭐ 3,117 
 Cohere Toolkit is a collection of prebuilt components enabling users to quickly build and deploy RAG applications.
- 
microsoft/torchscale ⭐ 3,117 
 Foundation Architecture for (M)LLMs
 🔗 aka.ms/generalai
- 
mistralai/mistral-finetune ⭐ 3,028 
 A light-weight codebase that enables memory-efficient and performant finetuning of Mistral's models. It is based on LoRA.
- 
ruc-nlpir/FlashRAG ⭐ 3,021 
 FlashRAG is a Python toolkit for the reproduction and development of RAG research. Our toolkit includes 36 pre-processed benchmark RAG datasets and 15 state-of-the-art RAG algorithms.
 🔗 arxiv.org/abs/2405.13576
- 
li-plus/chatglm.cpp ⭐ 2,975 
 C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
- 
baichuan-inc/Baichuan-13B ⭐ 2,962 
 A 13B large language model developed by Baichuan Intelligent Technology
 🔗 huggingface.co/baichuan-inc/baichuan-13b-chat
- 
freedomintelligence/LLMZoo ⭐ 2,940 
 ⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡
- 
hegelai/prompttools ⭐ 2,937 
 Open-source tools for prompt testing and experimentation, with support for both LLMs (e.g. OpenAI, LLaMA) and vector databases (e.g. Chroma, Weaviate, LanceDB).
 🔗 prompttools.readthedocs.io
- 
argilla-io/distilabel ⭐ 2,896 
 Distilabel is the framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
 🔗 distilabel.argilla.io
- 
noahshinn/reflexion ⭐ 2,880 
 [NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
- 
deepseek-ai/DualPipe ⭐ 2,865 
 DualPipe is an innovative bidirectional pipeline parallelism algorithm introduced in the DeepSeek-V3 Technical Report.
- 
truera/trulens ⭐ 2,815 
 Evaluation and Tracking for LLM Experiments and AI Agents
 🔗 www.trulens.org
- 
juncongmoo/pyllama ⭐ 2,800 
 LLaMA: Open and Efficient Foundation Language Models
- 
alpha-vllm/LLaMA2-Accessory ⭐ 2,787 
 An Open-source Toolkit for LLM Development
 🔗 llama2-accessory.readthedocs.io
- 
janhq/cortex.cpp ⭐ 2,760 
 Cortex is a Local AI API Platform that is used to run and customize LLMs.
 🔗 cortex.so
- 
paperswithcode/galai ⭐ 2,738 
 Model API for GALACTICA
- 
vectifyai/PageIndex ⭐ 2,651 
 A document indexing system that builds search tree structures from long documents, making them ready for reasoning-based RAG
 🔗 pageindex.ai
- 
roboflow/maestro ⭐ 2,632 
 Streamline the fine-tuning process for multimodal models: PaliGemma 2, Florence-2, and Qwen2.5-VL
 🔗 maestro.roboflow.com
- 
databricks/dbrx ⭐ 2,568 
 Code examples and resources for DBRX, a large language model developed by Databricks
 🔗 www.databricks.com
- 
googleapis/python-genai ⭐ 2,561 
 Google Gen AI Python SDK provides an interface for developers to integrate Google's generative models into their Python applications.
 🔗 googleapis.github.io/python-genai
- 
ofa-sys/OFA ⭐ 2,536 
 Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
- 
langwatch/langwatch ⭐ 2,531 
 LangWatch is an open platform for Observing, Evaluating and Optimizing your LLM and Agentic applications.
 🔗 langwatch.ai
- 
intel/neural-compressor ⭐ 2,503 
 SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
 🔗 intel.github.io/neural-compressor
- 
young-geng/EasyLM ⭐ 2,495 
 Large language models (LLMs) made easy. EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
- 
spcl/graph-of-thoughts ⭐ 2,487 
 Official Implementation of "Graph of Thoughts: Solving Elaborate Problems with Large Language Models"
 🔗 arxiv.org/pdf/2308.09687.pdf
- 
azure-samples/graphrag-accelerator ⭐ 2,385 
 One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
 🔗 github.com/microsoft/graphrag
- 
civitai/sd_civitai_extension ⭐ 2,382 
 All of the Civitai models inside Automatic 1111 Stable Diffusion Web UI
- 
uptrain-ai/uptrain ⭐ 2,323 
 An open-source unified platform to evaluate and improve Generative AI applications. Provide grades for 20+ preconfigured evaluations (covering language, code, embedding use cases)
 🔗 uptrain.ai
- 
facebookresearch/large_concept_model ⭐ 2,290 
 Large Concept Models: Language modeling in a sentence representation space
- 
casper-hansen/AutoAWQ ⭐ 2,251 
 AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
 🔗 casper-hansen.github.io/autoawq
- 
huggingface/nanotron ⭐ 2,244 
 Minimalistic large language model 3D-parallelism training
- 
openai/finetune-transformer-lm ⭐ 2,244 
 Code and model for the paper "Improving Language Understanding by Generative Pre-Training"
 🔗 s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf
- 
illuin-tech/colpali ⭐ 2,234 
 Code used for training the vision retrievers in the ColPali: Efficient Document Retrieval with Vision Language Models paper
 🔗 huggingface.co/vidore
- 
akariasai/self-rag ⭐ 2,204 
 This includes the original implementation of SELF-RAG: Learning to Retrieve, Generate and Critique through self-reflection by Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi.
 🔗 selfrag.github.io
- 
ist-daslab/gptq ⭐ 2,192 
 Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
 🔗 arxiv.org/abs/2210.17323
- 
microsoft/Megatron-DeepSpeed ⭐ 2,166 
 Ongoing research training transformer language models at scale, including: BERT & GPT-2
- 
protectai/llm-guard ⭐ 2,117 
 Sanitization, detection of harmful language, prevention of data leakage, and resistance against prompt injection attacks for LLMs
 🔗 protectai.github.io/llm-guard
- 
tairov/llama2.mojo ⭐ 2,117 
 Inference Llama 2 in one file of pure 🔥
 🔗 www.modular.com/blog/community-spotlight-how-i-built-llama2-by-aydyn-tairov
- 
epfllm/meditron ⭐ 2,092 
 Meditron is a suite of open-source medical Large Language Models (LLMs).
 🔗 huggingface.co/epfl-llm
- 
openai/image-gpt ⭐ 2,068 
 Archived. Code and models from the paper "Generative Pretraining from Pixels"
- 
facebookresearch/chameleon ⭐ 2,055 
 Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
 🔗 arxiv.org/abs/2405.09818
- 
lucidrains/toolformer-pytorch ⭐ 2,049 
 Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI
- 
vllm-project/llm-compressor ⭐ 2,040 
 Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
 🔗 docs.vllm.ai/projects/llm-compressor
- 
neulab/prompt2model ⭐ 2,006 
 A system that takes a natural language task description and trains a small special-purpose model that is well suited to deployment.
- 
openai/gpt-2-output-dataset ⭐ 1,995 
 Dataset of GPT-2 outputs for research in detection, biases, and more
- 
huggingface/lighteval ⭐ 1,982 
 LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally with the recently released LLM data processing library datatrove and LLM training library nanotron.
 🔗 huggingface.co/docs/lighteval/en/index
- 
google-gemini/genai-processors ⭐ 1,973 
 GenAI Processors is a lightweight Python library that enables efficient, parallel content processing.
- 
noamgat/lm-format-enforcer ⭐ 1,933 
 Enforce the output format (JSON Schema, Regex etc) of a language model
- 
ai-hypercomputer/maxtext ⭐ 1,919 
 MaxText is a high performance, highly scalable, open-source LLM written in pure Python/Jax and targeting Google Cloud TPUs and GPUs for training and inference.
 🔗 maxtext.readthedocs.io
- 
minishlab/model2vec ⭐ 1,852 
 Model2Vec is a technique to turn any sentence transformer into a really small static model, reducing model size by 15x and making the models up to 500x faster, with a small drop in performance
 🔗 minish.ai/packages/model2vec
- 
huggingface/picotron ⭐ 1,841 
 Minimalist & most-hackable repository for pre-training Llama-like models with 4D Parallelism (Data, Tensor, Pipeline, Context parallel)
- 
minimaxir/aitextgen ⭐ 1,841 
 A robust Python tool for text-based AI training and generation using GPT-2.
 🔗 docs.aitextgen.io
- 
openai/gpt-discord-bot ⭐ 1,834 
 Example Discord bot written in Python that uses the completions API to have conversations with the text-davinci-003 model, and the moderations API to filter the messages.
- 
ray-project/llm-applications ⭐ 1,832 
 A comprehensive guide to building RAG-based LLM applications for production.
- 
agentops-ai/tokencost ⭐ 1,815 
 Easy token price estimates for 400+ LLMs. TokenOps.
 🔗 agentops.ai
- 
qwenlm/Qwen-Audio ⭐ 1,802 
 The official repo of Qwen-Audio (通义千问-Audio) chat & pretrained large audio language model proposed by Alibaba Cloud.
- 
jennyzzt/dgm ⭐ 1,680 
 Self-improving system that iteratively modifies its own code and empirically validates each change
- 
jina-ai/thinkgpt ⭐ 1,583 
 Agent techniques to augment your LLM and push it beyond its limits
- 
meetkai/functionary ⭐ 1,582 
 Chat language model that can use tools and interpret the results
- 
answerdotai/rerankers ⭐ 1,546 
 Welcome to rerankers! Our goal is to provide users with a simple API to use any reranking models.
- 
run-llama/llama-lab ⭐ 1,506 
 Llama Lab is a repo dedicated to building cutting-edge projects using LlamaIndex
- 
chatarena/chatarena ⭐ 1,503 
 ChatArena (or Chat Arena) is a library of multi-agent language game environments for LLMs. The goal is to develop communication and collaboration capabilities of AIs.
- 
cstankonrad/long_llama ⭐ 1,460 
 LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
- 
farizrahman4u/loopgpt ⭐ 1,457 
 Re-implementation of Auto-GPT as a python package, written with modularity and extensibility in mind.
- 
nirdiamant/Controllable-RAG-Agent ⭐ 1,445 
 An advanced Retrieval-Augmented Generation (RAG) solution designed to tackle complex questions that simple semantic similarity-based retrieval cannot solve
- 
karpathy/nano-llama31 ⭐ 1,429 
 This repo is to Llama 3.1 what nanoGPT is to GPT-2, i.e. it is a minimal, dependency-free implementation of the Llama 3.1 architecture
- 
bigscience-workshop/Megatron-DeepSpeed ⭐ 1,419 
 Ongoing research training transformer language models at scale, including: BERT & GPT-2
- 
explosion/spacy-transformers ⭐ 1,395 
 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
 🔗 spacy.io/usage/embeddings-transformers
- 
facebookresearch/MobileLLM ⭐ 1,372 
 Training code of MobileLLM introduced in our work: "MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases"
- 
mlfoundations/dclm ⭐ 1,369 
 DataComp for Language Models
- 
protectai/rebuff ⭐ 1,356 
 Rebuff is designed to protect AI applications from prompt injection (PI) attacks through a multi-layered defense
 🔗 playground.rebuff.ai
- 
explosion/spacy-llm ⭐ 1,318 
 🦙 Integrating LLMs into structured NLP pipelines
 🔗 spacy.io/usage/large-language-models
- 
keirp/automatic_prompt_engineer ⭐ 1,316 
 Large Language Models Are Human-Level Prompt Engineers
- 
mlc-ai/xgrammar ⭐ 1,288 
 XGrammar is an open-source library for efficient, flexible, and portable structured generation. It supports general context-free grammar to enable a broad range of structures while bringing careful system optimizations to enable fast executions.
 🔗 xgrammar.mlc.ai/docs
- 
hao-ai-lab/LookaheadDecoding ⭐ 1,281 
 Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
 🔗 arxiv.org/abs/2402.02057
- 
deepseek-ai/EPLB ⭐ 1,273 
 Expert Parallelism Load Balancer across GPUs
- 
ray-project/ray-llm ⭐ 1,262 
 RayLLM - LLMs on Ray (Archived). Read README for more info.
 🔗 docs.ray.io/en/latest
- 
srush/MiniChain ⭐ 1,234 
 A tiny library for coding with large language models.
 🔗 srush-minichain.hf.space
- 
run-llama/semtools ⭐ 1,201 
 Semantic search and document parsing tools for the command line
- 
sumandora/remove-refusals-with-transformers ⭐ 1,165 
 A proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens
- 
ibm/Dromedary ⭐ 1,143 
 Dromedary: towards helpful, ethical and reliable LLMs.
- 
lupantech/chameleon-llm ⭐ 1,133 
 Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models".
 🔗 chameleon-llm.github.io
- 
centerforaisafety/hle ⭐ 1,114 
 Humanity's Last Exam (HLE) is a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage
 🔗 lastexam.ai
- 
leochlon/hallbayes ⭐ 1,113 
 Hallucination Risk Calculator & Prompt Re-engineering Toolkit (OpenAI-only)
- 
nousresearch/Hermes-Function-Calling ⭐ 1,089 
 Code for the Hermes Pro Large Language Model to perform function calling based on the provided schema. It allows users to query the model and retrieve information related to stock prices, company fundamentals, financial statements
- 
rlancemartin/auto-evaluator ⭐ 1,088 
 Evaluation tool for LLM QA chains
 🔗 autoevaluator.langchain.com
- 
cerebras/modelzoo ⭐ 1,077 
 Examples of common deep learning models that can be trained on Cerebras hardware
- 
datadreamer-dev/DataDreamer ⭐ 1,062 
 DataDreamer is a powerful open-source Python library for prompting, synthetic data generation, and training workflows. It is designed to be simple, extremely efficient, and research-grade.
 🔗 datadreamer.dev
- 
ctlllll/LLM-ToolMaker ⭐ 1,042 
 Large Language Models as Tool Makers
- 
microsoft/Llama-2-Onnx ⭐ 1,026 
 A Microsoft optimized version of the Llama 2 model, available from Meta
- 
pinecone-io/canopy ⭐ 1,022 
 Retrieval Augmented Generation (RAG) framework and context engine powered by Pinecone
 🔗 www.pinecone.io
- 
nomic-ai/pygpt4all ⭐ 1,016 
 Officially supported Python bindings for llama.cpp + gpt4all
 🔗 nomic-ai.github.io/pygpt4all
- 
huggingface/optimum-nvidia ⭐ 1,004 
 Optimum-NVIDIA delivers the best inference performance on the NVIDIA platform through Hugging Face. Run LLaMA 2 at 1,200 tokens/second (up to 28x faster than the framework)
- 
wandb/weave ⭐ 998 
 Weave is a toolkit for developing AI-powered applications, built by Weights & Biases.
 🔗 wandb.me/weave
- 
prometheus-eval/prometheus-eval ⭐ 996 
 Evaluate your LLM's response with Prometheus and GPT4 💯
- 
ajndkr/lanarky ⭐ 994 
 The web framework for building LLM microservices [deprecated]
 🔗 lanarky.ajndkr.com
- 
likejazz/llama3.np ⭐ 990 
 llama3.np is a pure NumPy implementation for Llama 3 model.
- 
utkusen/promptmap ⭐ 986 
 Vulnerability scanning tool that automatically tests prompt injection attacks on your LLM applications. It analyzes your LLM system prompts, runs them, and sends attack prompts to them.
- 
langchain-ai/langsmith-cookbook ⭐ 967 
 LangSmith is a platform for building production-grade LLM applications.
 🔗 langsmith-cookbook.vercel.app
- 
cagostino/npcpy ⭐ 965 
 This repo leverages the power of LLMs to understand your natural language commands and questions, executing tasks, answering queries, and providing relevant information from local files and the web.
- 
soulter/hugging-chat-api ⭐ 928 
 HuggingChat Python API🤗
- 
muennighoff/sgpt ⭐ 873 
 SGPT: GPT Sentence Embeddings for Semantic Search
 🔗 arxiv.org/abs/2202.08904
- 
opengvlab/OmniQuant ⭐ 853 
 [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
- 
oliveirabruno01/babyagi-asi ⭐ 800 
 BabyAGI: an Autonomous and Self-Improving agent, or BASI
- 
thinking-machines-lab/batch_invariant_ops ⭐ 798 
 Defeating Nondeterminism in LLM Inference: fixing floating-point non-associativity
- 
junruxiong/IncarnaMind ⭐ 795 
 Connect and chat with your multiple documents (pdf and txt) through GPT 3.5, GPT-4 Turbo, Claude and Local Open-Source LLMs
 🔗 www.incarnamind.com
- 
cyberark/FuzzyAI ⭐ 776 
 A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
- 
tag-research/TAG-Bench ⭐ 757 
 Table-Augmented Generation (TAG) is a unified and general-purpose paradigm for answering natural language questions over databases
 🔗 arxiv.org/pdf/2408.14717
- 
opengenerativeai/GenossGPT ⭐ 750 
 One API for all LLMs either Private or Public (Anthropic, Llama V2, GPT 3.5/4, Vertex, GPT4ALL, HuggingFace ...) 🌈🐂 Replace OpenAI GPT with any LLMs in your app with one line.
 🔗 genoss.ai
- 
developersdigest/llm-api-engine ⭐ 732 
 Build and deploy AI-powered APIs in seconds. This project allows you to create custom APIs that extract structured data from websites using natural language descriptions, powered by LLMs and web scraping technology.
 🔗 www.youtube.com/watch?v=8kuek1bo4mm
- 
salesforce/xgen ⭐ 721 
 Salesforce open-source LLMs with 8k sequence length.
- 
bytedtsinghua-sia/MemAgent ⭐ 703 
 A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.
- 
squeezeailab/SqueezeLLM ⭐ 702 
 [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization
 🔗 arxiv.org/abs/2306.07629
- 
lupantech/ScienceQA ⭐ 691 
 Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
- 
magnivorg/prompt-layer-library ⭐ 672 
 🍰 PromptLayer - Maintain a log of your prompts and OpenAI API requests. Track, debug, and replay old completions.
 🔗 www.promptlayer.com
- 
tsinghuadatabasegroup/DB-GPT ⭐ 666 
 LLM As Database Administrator
 🔗 dbgpt.dbmind.cn
- 
microsoft/VPTQ ⭐ 657 
 Extreme Low-bit Vector Post-Training Quantization for Large Language Models
- 
langchain-ai/langsmith-sdk ⭐ 651 
 LangSmith Client SDK Implementations
 🔗 docs.smith.langchain.com
- 
metauto-ai/agent-as-a-judge ⭐ 640 
 ⚖️ The First Coding Agent-as-a-Judge
 🔗 arxiv.org/pdf/2410.10934
- 
facebookresearch/cwm ⭐ 627 
 Code World Model (CWM) is a 32-billion-parameter open-weights LLM, to advance research on code generation with world models.
- 
modal-labs/llm-finetuning ⭐ 623 
 Guide for fine-tuning Llama/Mistral/CodeLlama models and more
- 
judahpaul16/gpt-home ⭐ 611 
 ChatGPT at home! Basically a better Google Nest Hub or Amazon Alexa home assistant. Built on the Raspberry Pi using the OpenAI API.
 🔗 hub.docker.com/r/judahpaul/gpt-home
- 
zhudotexe/kani ⭐ 590 
 kani (カニ) is a highly hackable microframework for tool-calling language models. (NLP-OSS @ EMNLP 2023)
 🔗 kani.readthedocs.io
- 
qixucen/atom ⭐ 588 
 Atom of Thoughts (AoT) is a new reasoning framework that represents the solution as a composition of atomic questions. This approach transforms the reasoning process into a Markov process with atomic states
- 
huggingface/text-clustering ⭐ 578 
 Easily embed, cluster and semantically label text datasets
- 
predibase/llm_distillation_playbook ⭐ 576 
 Best practices for distilling large language models.
- 
eugeneyan/obsidian-copilot ⭐ 554 
 🤖 A prototype assistant for writing and thinking
 🔗 eugeneyan.com/writing/obsidian-copilot
- 
declare-lab/instruct-eval ⭐ 548 
 This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
 🔗 declare-lab.github.io/instruct-eval
- 
likenneth/honest_llama ⭐ 548 
 Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
- 
vahe1994/SpQR ⭐ 546 
 Quantization algorithm and the model evaluation code for SpQR method for LLM compression
- 
hazyresearch/ama_prompting ⭐ 546 
 Ask Me Anything language model prompting
- 
deepseek-ai/DeepSeek-Prover-V1.5 ⭐ 539 
 DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
- 
kbressem/medAlpaca ⭐ 539 
 LLM finetuned for medical question answering
- 
continuum-llms/chatgpt-memory ⭐ 532 
 Scales the ChatGPT API to multiple simultaneous sessions with infinite contextual and adaptive memory powered by GPT and a Redis datastore.
- 
hazyresearch/H3 ⭐ 517 
 Language Modeling with the H3 State Space Model
- 
reasoning-machines/pal ⭐ 511 
 PaL: Program-Aided Language Models (ICML 2023)
 🔗 reasonwithpal.com
- 
alphasecio/langchain-examples ⭐ 504 
 A collection of apps powered by the LangChain LLM framework.
 🔗 go.alphasec.io/langchain-examples
- 
krlabsorg/LettuceDetect ⭐ 503 
 LettuceDetect is a lightweight and efficient tool for detecting hallucinations in Retrieval-Augmented Generation (RAG) systems. It identifies unsupported parts of an answer by comparing it to the provided context.
 🔗 krlabs.eu/lettucedetect
- 
codelion/adaptive-classifier ⭐ 465 
 A flexible, adaptive classification system that allows for dynamic addition of new classes and continuous learning from examples. Built on top of transformers from HuggingFace, this library provides an easy-to-use interface for creating and updating text classifiers.
- 
quotient-ai/judges ⭐ 288 
 judges is a small library to use and create LLM-as-a-Judge evaluators. The purpose of judges is to have a curated set of LLM evaluators in a low-friction format across a variety of use cases
- 
stanford-oval/suql ⭐ 286 
 SUQL: Conversational Search over Structured and Unstructured Data with LLMs
 🔗 arxiv.org/abs/2311.09818
- 
emissary-tech/legit-rag ⭐ 266 
 A modular Retrieval-Augmented Generation (RAG) system built with FastAPI, Qdrant, and OpenAI.
- 
dottxt-ai/outlines-core ⭐ 252 
 Core functionality for structured generation, formerly implemented in Outlines, with a focus on performance and portability.
 🔗 docs.rs/outlines-core
- 
jina-ai/llm-query-expansion ⭐ 57 
 Query Expansion for Better Query Embedding using LLMs
Mathematical, numerical and scientific libraries.
- 
numpy/numpy ⭐ 30,498 
 The fundamental package for scientific computing with Python.
 🔗 numpy.org
- 
camdavidsonpilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers ⭐ 28,094 
 aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
 🔗 camdavidsonpilon.github.io/probabilistic-programming-and-bayesian-methods-for-hackers
- 
taichi-dev/taichi ⭐ 27,578 
 Productive, portable, and performant GPU programming in Python: Taichi Lang is an open-source, imperative, parallel programming language for high-performance numerical computation.
 🔗 taichi-lang.org
- 
experience-monks/math-as-code ⭐ 15,414 
 This is a reference to ease developers into mathematical notation by showing comparisons with Python code
- 
scipy/scipy ⭐ 14,059 
 SciPy library main repository
 🔗 scipy.org
- 
sympy/sympy ⭐ 13,967 
 A computer algebra system written in pure Python
 🔗 sympy.org
- 
google/or-tools ⭐ 12,550 
 Google Optimization Tools (a.k.a., OR-Tools) is an open-source, fast and portable software suite for solving combinatorial optimization problems.
 🔗 developers.google.com/optimization
- 
z3prover/z3 ⭐ 11,409 
 Z3 is a theorem prover from Microsoft Research with a Python language binding.
- 
google-deepmind/alphageometry ⭐ 4,655 
 Solving Olympiad Geometry without Human Demonstrations
- 
pim-book/programmers-introduction-to-mathematics ⭐ 3,618 
 Code for A Programmer's Introduction to Mathematics
 🔗 pimbook.org
- 
mikedh/trimesh ⭐ 3,357 
 Python library for loading and using triangular meshes.
 🔗 trimesh.org
- 
talalalrawajfeh/mathematics-roadmap ⭐ 3,194 
 A Comprehensive Roadmap to Mathematics
- 
pyro-ppl/numpyro ⭐ 2,535 
 Probabilistic programming with NumPy powered by JAX for autograd and JIT compilation to GPU/TPU/CPU.
 🔗 num.pyro.ai
- 
mckinsey/causalnex ⭐ 2,377 
 A Python library that helps data scientists to infer causation rather than observing correlation.
 🔗 causalnex.readthedocs.io
- 
pyomo/pyomo ⭐ 2,291 
 An object-oriented algebraic modeling language in Python for structured optimization problems.
 🔗 www.pyomo.org
- 
facebookresearch/theseus ⭐ 1,946 
 A library for differentiable nonlinear optimization
- 
arviz-devs/arviz ⭐ 1,733 
 Exploratory analysis of Bayesian models with Python
 🔗 python.arviz.org
- 
google-research/torchsde ⭐ 1,673 
 Differentiable SDE solvers with GPU support and efficient sensitivity analysis.
- 
dynamicslab/pysindy ⭐ 1,669 
 A package for the sparse identification of nonlinear dynamical systems from data
 🔗 pysindy.readthedocs.io/en/latest
- 
geomstats/geomstats ⭐ 1,408 
 Computations and statistics on manifolds with geometric structures.
 🔗 geomstats.ai
- 
cma-es/pycma ⭐ 1,227 
 pycma is a Python implementation of CMA-ES and a few related numerical optimization tools.
- 
pymc-labs/CausalPy ⭐ 1,042 
 A Python package for causal inference in quasi-experimental settings
 🔗 causalpy.readthedocs.io
- 
lean-dojo/LeanDojo ⭐ 708 
 Tool for data extraction and interacting with Lean programmatically.
 🔗 leandojo.org/leandojo.html
- 
brandondube/prysm ⭐ 310 
 Prysm is an open-source library for physical and first-order modeling of optical systems and analysis of related data: numerical and physical optics, integrated modeling, phase retrieval, segmented systems, polynomials and fitting, sequential raytracing.
 🔗 prysm.readthedocs.io/en/stable
- 
lean-dojo/ReProver ⭐ 297 
 Retrieval-Augmented Theorem Provers for Lean
 🔗 leandojo.org/leandojo.html
- 
albahnsen/pycircular ⭐ 105 
 pycircular is a Python module for circular data analysis
- 
gbillotey/Fractalshades ⭐ 35 
 Arbitrary-precision fractal explorer - Python package
General and classical machine learning libraries. See below for other sections covering specialised ML areas.
- 
openai/openai-cookbook ⭐ 68,240 
 Examples and guides for using the OpenAI API
 🔗 cookbook.openai.com
- 
scikit-learn/scikit-learn ⭐ 63,559 
 scikit-learn: machine learning in Python
 🔗 scikit-learn.org
- 
suno-ai/bark ⭐ 38,553 
 🔊 Text-Prompted Generative Audio Model
- 
facebookresearch/faiss ⭐ 37,374 
 A library for efficient similarity search and clustering of dense vectors.
 🔗 faiss.ai
- 
tencentarc/GFPGAN ⭐ 37,110 
 GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
- 
google-research/google-research ⭐ 36,458 
 This repository contains code released by Google Research
 🔗 research.google
- 
roboflow/supervision ⭐ 35,474 
 We write your reusable computer vision tools. 💜
 🔗 supervision.roboflow.com
- 
google/jax ⭐ 33,610 
 Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
 🔗 docs.jax.dev
- 
open-mmlab/mmdetection ⭐ 31,754 
 OpenMMLab Detection Toolbox and Benchmark
 🔗 mmdetection.readthedocs.io
- 
google/mediapipe ⭐ 31,515 
 Cross-platform, customizable ML solutions for live and streaming media.
 🔗 ai.google.dev/edge/mediapipe
- 
lutzroeder/netron ⭐ 31,500 
 Visualizer for neural network, deep learning and machine learning models
 🔗 netron.app
- 
ageron/handson-ml2 ⭐ 29,433 
 A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.
- 
dmlc/xgboost ⭐ 27,448 
 Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
 🔗 xgboost.readthedocs.io
- 
facebookresearch/fastText ⭐ 26,366 
 A library for efficient learning of word representations and sentence classification.
 🔗 fasttext.cc
- 
modular/modular ⭐ 24,922 
 The Modular Accelerated Xecution (MAX) platform is an integrated suite of AI libraries, tools, and technologies that unifies commonly fragmented AI deployment workflows
 🔗 docs.modular.com
- 
harisiqbal88/PlotNeuralNet ⭐ 23,951 
 LaTeX code for drawing neural network diagrams
- 
ml-explore/mlx ⭐ 22,394 
 MLX is an array framework for machine learning on Apple silicon, brought to you by Apple machine learning research.
 🔗 ml-explore.github.io/mlx
- 
jina-ai/serve ⭐ 21,753 
 ☁️ Build multimodal AI applications with cloud-native stack
 🔗 jina.ai/serve
- 
onnx/onnx ⭐ 19,670 
 Open standard for machine learning interoperability
 🔗 onnx.ai
- 
huggingface/candle ⭐ 18,237 
 Candle is a minimalist ML framework for Rust with a focus on performance (including GPU support) and ease of use.
- 
microsoft/onnxruntime ⭐ 17,997 
 ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
 🔗 onnxruntime.ai
- 
microsoft/LightGBM ⭐ 17,707 
 A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
 🔗 lightgbm.readthedocs.io/en/latest
- 
tensorflow/tensor2tensor ⭐ 16,527 
 Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
- 
ddbourgin/numpy-ml ⭐ 16,154 
 Machine learning, in numpy
 🔗 numpy-ml.readthedocs.io
- 
google-gemini/cookbook ⭐ 15,056 
 A collection of guides and examples for the Gemini API, including quickstart tutorials for writing prompts.
 🔗 ai.google.dev/gemini-api/docs
- 
aleju/imgaug ⭐ 14,678 
 Image augmentation for machine learning experiments.
 🔗 imgaug.readthedocs.io
- 
neonbjb/tortoise-tts ⭐ 14,622 
 A multi-voice TTS system trained with an emphasis on quality
- 
deepmind/deepmind-research ⭐ 14,349 
 This repository contains implementations and illustrative code to accompany DeepMind publications
- 
microsoft/nni ⭐ 14,278 
 An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
 🔗 nni.readthedocs.io
- 
jindongwang/transferlearning ⭐ 14,096 
 Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, code, datasets, applications, tutorials.
 🔗 transferlearning.xyz
- 
spotify/annoy ⭐ 13,975 
 Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
- 
deepmind/alphafold ⭐ 13,860 
 Implementation of the inference pipeline of AlphaFold v2
- 
ggerganov/ggml ⭐ 13,228 
 Tensor library for machine learning
- 
optuna/optuna ⭐ 12,783 
 A hyperparameter optimization framework
 🔗 optuna.org
- 
facebookresearch/AnimatedDrawings ⭐ 12,692 
 Code to accompany "A Method for Animating Children's Drawings of the Human Figure"
- 
thudm/CogVideo ⭐ 11,973 
 text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
- 
statsmodels/statsmodels ⭐ 10,984 
 Statsmodels: statistical modeling and econometrics in Python
 🔗 www.statsmodels.org/devel
- 
cleanlab/cleanlab ⭐ 10,937 
 Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
 🔗 cleanlab.ai
- 
wandb/wandb ⭐ 10,376 
 The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
 🔗 wandb.ai
- 
twitter/the-algorithm-ml ⭐ 10,367 
 Source code for Twitter's Recommendation Algorithm
 🔗 blog.twitter.com/engineering/en_us/topics/open-source/2023/twitter-recommendation-algorithm
- 
megvii-basedetection/YOLOX ⭐ 10,107 
 YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
- 
epistasislab/tpot ⭐ 9,996 
 A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
 🔗 epistasislab.github.io/tpot
- 
facebookresearch/xformers ⭐ 9,975 
 Hackable and optimized Transformers building blocks, supporting a composable construction.
 🔗 facebookresearch.github.io/xformers
- 
pycaret/pycaret ⭐ 9,535 
 An open-source, low-code machine learning library in Python
 🔗 www.pycaret.org
- 
awslabs/autogluon ⭐ 9,460 
 Fast and Accurate ML in 3 Lines of Code
 🔗 auto.gluon.ai
- 
pymc-devs/pymc ⭐ 9,275 
 Bayesian Modeling and Probabilistic Programming in Python
 🔗 www.pymc.io
- 
open-mmlab/mmsegmentation ⭐ 9,260 
 OpenMMLab Semantic Segmentation Toolbox and Benchmark.
 🔗 mmsegmentation.readthedocs.io/en/main
- 
huggingface/accelerate ⭐ 9,180 
 🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
 🔗 huggingface.co/docs/accelerate
- 
uberi/speech_recognition ⭐ 8,871 
 Speech recognition module for Python, supporting several engines and APIs, online and offline.
 🔗 pypi.python.org/pypi/speechrecognition
- 
catboost/catboost ⭐ 8,602 
 A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
 🔗 catboost.ai
- 
automl/auto-sklearn ⭐ 7,963 
 Automated Machine Learning with scikit-learn
 🔗 automl.github.io/auto-sklearn
- 
lmcinnes/umap ⭐ 7,956 
 Uniform Manifold Approximation and Projection
- 
ml-explore/mlx-examples ⭐ 7,897 
 Examples in the MLX framework
- 
py-why/dowhy ⭐ 7,741 
 DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.
 🔗 www.pywhy.org/dowhy
- 
featurelabs/featuretools ⭐ 7,543 
 An open source python library for automated feature engineering
 🔗 www.featuretools.com
- 
hyperopt/hyperopt ⭐ 7,476 
 Distributed Asynchronous Hyperparameter Optimization in Python
 🔗 hyperopt.github.io/hyperopt
- 
hips/autograd ⭐ 7,365 
 Efficiently computes derivatives of NumPy code.
- 
open-mmlab/mmagic ⭐ 7,275 
 OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models for text-to-image generation, image/video restoration/enhancement, etc.
 🔗 mmagic.readthedocs.io/en/latest
- 
scikit-learn-contrib/imbalanced-learn ⭐ 7,040 
 A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
 🔗 imbalanced-learn.org
- 
yangchris11/samurai ⭐ 6,954 
 Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
 🔗 yangchris11.github.io/samurai
- 
probml/pyprobml ⭐ 6,912 
 Python code for "Probabilistic Machine learning" book by Kevin Murphy
- 
project-monai/MONAI ⭐ 6,880 
 AI Toolkit for Healthcare Imaging
 🔗 monai.io
- 
nicolashug/Surprise ⭐ 6,681 
 A Python scikit for building and analyzing recommender systems
 🔗 surpriselib.com
- 
cleverhans-lab/cleverhans ⭐ 6,366 
 An adversarial example library for constructing attacks, building defenses, and benchmarking both
- 
google-deepmind/graphcast ⭐ 6,333 
 GraphCast: Learning skillful medium-range global weather forecasting
- 
open-mmlab/mmcv ⭐ 6,269 
 OpenMMLab Computer Vision Foundation
 🔗 mmcv.readthedocs.io/en/latest
- 
kevinmusgrave/pytorch-metric-learning ⭐ 6,233 
 The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
 🔗 kevinmusgrave.github.io/pytorch-metric-learning
- 
uber/causalml ⭐ 5,600 
 Uplift modeling and causal inference with machine learning algorithms
- 
online-ml/river ⭐ 5,568 
 🌊 Online machine learning in Python
 🔗 riverml.xyz
- 
mdbloice/Augmentor ⭐ 5,132 
 Image augmentation library in Python for machine learning.
 🔗 augmentor.readthedocs.io/en/stable
- 
rasbt/mlxtend ⭐ 5,073 
 A library of extension and helper modules for Python's data analysis and machine learning libraries.
 🔗 rasbt.github.io/mlxtend
- 
skvark/opencv-python ⭐ 5,022 
 Automated CI toolchain to produce precompiled opencv-python, opencv-python-headless, opencv-contrib-python and opencv-contrib-python-headless packages.
 🔗 pypi.org/project/opencv-python
- 
marqo-ai/marqo ⭐ 4,972 
 Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
 🔗 www.marqo.ai
- 
apple/coremltools ⭐ 4,967 
 Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
 🔗 coremltools.readme.io
- 
nmslib/hnswlib ⭐ 4,924 
 Header-only C++/python library for fast approximate nearest neighbors
 🔗 github.com/nmslib/hnswlib
- 
sanchit-gandhi/whisper-jax ⭐ 4,633 
 JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
- 
priorlabs/TabPFN ⭐ 4,523 
 TabPFN is a neural network that learned to do tabular data prediction. This is the original CUDA-supporting PyTorch implementation.
 🔗 priorlabs.ai
- 
huggingface/autotrain-advanced ⭐ 4,498 
 AutoTrain Advanced: faster and easier training and deployments of state-of-the-art machine learning models
 🔗 huggingface.co/autotrain
- 
nv-tlabs/GET3D ⭐ 4,410 
 Generative Model of High Quality 3D Textured Shapes Learned from Images
- 
districtdatalabs/yellowbrick ⭐ 4,373 
 Visual analysis and diagnostic tools to facilitate machine learning model selection.
 🔗 www.scikit-yb.org
- 
lucidrains/deep-daze ⭐ 4,343 
 Simple command line tool for text to image generation using OpenAI's CLIP and Siren (Implicit neural representation network). Technique was originally created by https://twitter.com/advadnoun
- 
huggingface/notebooks ⭐ 4,329 
 Notebooks using the Hugging Face libraries 🤗
- 
py-why/EconML ⭐ 4,322 
 ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to brin...
 🔗 www.microsoft.com/en-us/research/project/alice
- 
microsoft/FLAML ⭐ 4,221 
 A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
 🔗 microsoft.github.io/flaml
- 
cmusphinx/pocketsphinx ⭐ 4,203 
 A small speech recognizer
- 
huggingface/speech-to-speech ⭐ 4,194 
 Speech To Speech: an effort for an open-sourced and modular GPT-4o
- 
ourownstory/neural_prophet ⭐ 4,179 
 NeuralProphet: A simple forecasting package
 🔗 neuralprophet.com
- 
zjunlp/DeepKE ⭐ 4,134 
 [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction
 🔗 deepke.zjukg.cn
- 
rucaibox/RecBole ⭐ 4,000 
 A unified, comprehensive and efficient recommendation library
 🔗 recbole.io
- 
cornellius-gp/gpytorch ⭐ 3,776 
 GPyTorch is a Gaussian process library implemented using PyTorch. GPyTorch is designed for creating scalable, flexible, and modular Gaussian process models with ease.
- 
lightly-ai/lightly ⭐ 3,549 
 A python library for self-supervised learning on images.
 🔗 docs.lightly.ai/self-supervised-learning
- 
yoheinakajima/instagraph ⭐ 3,535 
 Converts text input or a URL into a knowledge graph and displays it
- 
facebookresearch/flow_matching ⭐ 3,479 
 Flow Matching (FM) is a recent framework for generative modeling that has achieved state-of-the-art performance across various domains, including image, video, audio, speech, and biological structures
 🔗 facebookresearch.github.io/flow_matching
- 
huggingface/safetensors ⭐ 3,468 
 Implements a new simple format for storing tensors safely (as opposed to pickle) and that is still fast (zero-copy).
 🔗 huggingface.co/docs/safetensors
- 
pytorch/glow ⭐ 3,313 
 Compiler for Neural Network hardware accelerators
- 
facebookresearch/vissl ⭐ 3,289 
 VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.
 🔗 vissl.ai
- 
lucidrains/musiclm-pytorch ⭐ 3,281 
 Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch
- 
hrnet/HRNet-Semantic-Segmentation ⭐ 3,274 
 The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet. https://arxiv.org/abs/1908.07919
- 
shankarpandala/lazypredict ⭐ 3,236 
 Lazy Predict helps build a lot of basic models without much code and helps understand which models work better without any parameter tuning
- 
mljar/mljar-supervised ⭐ 3,210 
 Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
 🔗 mljar.com
- 
huggingface/optimum ⭐ 3,109 
 🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
 🔗 huggingface.co/docs/optimum/main
- 
scikit-learn-contrib/hdbscan ⭐ 3,000 
 A high performance implementation of HDBSCAN clustering.
 🔗 hdbscan.readthedocs.io/en/latest
- 
nvidia/cuda-python ⭐ 2,989 
 CUDA Python: Performance meets Productivity
 🔗 nvidia.github.io/cuda-python
- 
neuraloperator/neuraloperator ⭐ 2,956 
 Comprehensive library for learning neural operators in PyTorch. It is the official implementation for Fourier Neural Operators and Tensorized Neural Operators.
 🔗 neuraloperator.github.io/dev/index.html
- 
huggingface/huggingface_hub ⭐ 2,955 
 The official Python client for the Hugging Face Hub.
 🔗 huggingface.co/docs/huggingface_hub
- 
google-research/t5x ⭐ 2,889 
 T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of sequence models (starting with language) at many scales.
- 
scikit-optimize/scikit-optimize ⭐ 2,795 
 Sequential model-based optimization with a scipy.optimize interface
 🔗 scikit-optimize.github.io
- 
eric-mitchell/direct-preference-optimization ⭐ 2,744 
 Reference implementation for DPO (Direct Preference Optimization)
- 
freedmand/semantra ⭐ 2,660 
 Semantra is a multipurpose tool for semantically searching documents. Query by meaning rather than just by matching text.
- 
apple/ml-ane-transformers ⭐ 2,657 
 Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)
- 
rom1504/clip-retrieval ⭐ 2,653 
 Easily compute clip embeddings and build a clip retrieval system with them
 🔗 rom1504.github.io/clip-retrieval
- 
scikit-learn-contrib/category_encoders ⭐ 2,463 
 A library of sklearn compatible categorical variable encoders
 🔗 contrib.scikit-learn.org/category_encoders
- 
qdrant/fastembed ⭐ 2,418 
 Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
 🔗 qdrant.github.io/fastembed
- 
huggingface/evaluate ⭐ 2,335 
 🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
 🔗 huggingface.co/docs/evaluate
- 
aws/sagemaker-python-sdk ⭐ 2,192 
 A library for training and deploying machine learning models on Amazon SageMaker
 🔗 sagemaker.readthedocs.io
- 
feature-engine/feature_engine ⭐ 2,134 
 Feature engineering package with sklearn like functionality
 🔗 feature-engine.trainindata.com
- 
microsoft/Olive ⭐ 2,124 
 Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
 🔗 microsoft.github.io/olive
- 
castorini/pyserini ⭐ 1,946 
 Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
 🔗 pyserini.io
- 
contextlab/hypertools ⭐ 1,869 
 A Python toolbox for gaining geometric insights into high-dimensional data
 🔗 hypertools.readthedocs.io
- 
linkedin/greykite ⭐ 1,851 
 A flexible, intuitive and fast forecasting library
- 
bmabey/pyLDAvis ⭐ 1,840 
 Python library for interactive topic model visualization. Port of the R LDAvis package.
- 
rentruewang/koila ⭐ 1,829 
 Prevent PyTorch's "CUDA error: out of memory" in just 1 line of code.
 🔗 koila.rentruewang.com
- 
laekov/fastmoe ⭐ 1,799 
 A fast MoE impl for PyTorch
 🔗 fastmoe.ai
- 
stanfordmlgroup/ngboost ⭐ 1,782 
 Natural Gradient Boosting for Probabilistic Prediction
- 
visual-layer/fastdup ⭐ 1,737 
 fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.
- 
microsoft/i-Code ⭐ 1,709 
 The ambition of the i-Code project is to build integrative and composable multimodal AI. The "i" stands for integrative multimodal learning.
- 
tensorflow/addons ⭐ 1,707 
 Useful extra functionality for TensorFlow 2.x maintained by SIG-addons
- 
kubeflow/katib ⭐ 1,631 
 Automated Machine Learning on Kubernetes
 🔗 www.kubeflow.org/docs/components/katib
- 
google/vizier ⭐ 1,601 
 Python-based research interface for blackbox and hyperparameter optimization, based on the internal Google Vizier Service.
 🔗 oss-vizier.readthedocs.io
- 
microsoft/Semi-supervised-learning ⭐ 1,533 
 A Unified Semi-Supervised Learning Codebase (NeurIPS'22)
 🔗 usb.readthedocs.io
- 
spotify/voyager ⭐ 1,508 
 🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
 🔗 spotify.github.io/voyager
- 
jina-ai/finetuner ⭐ 1,503 
 🎯 Task-oriented embedding tuning for BERT, CLIP, etc.
 🔗 finetuner.jina.ai
- 
csinva/imodels ⭐ 1,501 
 Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
 🔗 csinva.io/imodels
- 
patchy631/machine-learning ⭐ 1,457 
 Machine Learning Tutorials Repository
- 
pytorch/FBGEMM ⭐ 1,443 
 FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
- 
lightning-ai/lightning-thunder ⭐ 1,413 
 Thunder is a source-to-source compiler for PyTorch. It makes PyTorch programs faster by combining and using different hardware executors at once
- 
koaning/scikit-lego ⭐ 1,362 
 Extra blocks for scikit-learn pipelines.
 🔗 koaning.github.io/scikit-lego
- 
borealisai/advertorch ⭐ 1,356 
 A Toolbox for Adversarial Robustness Research
- 
awslabs/dgl-ke ⭐ 1,323 
 High performance, easy-to-use, and scalable package for learning large-scale knowledge graph embeddings.
 🔗 dglke.dgl.ai/doc
- 
opentensor/bittensor ⭐ 1,245 
 Internet-scale Neural Networks
 🔗 www.bittensor.com
- 
davidmrau/mixture-of-experts ⭐ 1,181 
 PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538
- 
google-research/deeplab2 ⭐ 1,024 
 DeepLab2 is a TensorFlow library for deep labeling, aiming to provide a unified and state-of-the-art TensorFlow codebase for dense pixel labeling tasks.
- 
huggingface/optimum-quanto ⭐ 987 
 A pytorch quantization backend for optimum
- 
oml-team/open-metric-learning ⭐ 975 
 OML is a PyTorch-based framework to train and validate the models producing high-quality embeddings.
 🔗 open-metric-learning.readthedocs.io/en/latest/index.html
- 
pymc-labs/pymc-marketing ⭐ 952 
 Bayesian marketing toolbox in PyMC. Media Mix (MMM), customer lifetime value (CLV), buy-till-you-die (BTYD) models and more.
 🔗 www.pymc-marketing.io
- 
gradio-app/trackio ⭐ 930 
 A lightweight, local-first, and free experiment tracking library from Hugging Face 🤗
- 
hazyresearch/safari ⭐ 897 
 Convolutions for Sequence Modeling
- 
criteo/autofaiss ⭐ 871 
 Automatically create Faiss knn indices with the most optimal similarity search parameters.
 🔗 criteo.github.io/autofaiss
- 
replicate/replicate-python ⭐ 870 
 Python client for Replicate
 🔗 replicate.com
- 
minishlab/semhash ⭐ 810 
 SemHash is a lightweight and flexible tool for deduplicating datasets using semantic similarity. It combines fast embedding generation from Model2Vec with efficient ANN-based similarity search through Vicinity
 🔗 minish.ai/packages/semhash
- 
googleapis/python-aiplatform ⭐ 801 
 A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
- 
awslabs/python-deequ ⭐ 796 
 Python API for Deequ, a library built on Spark for defining "unit tests for data", which measure data quality in large datasets
- 
nomic-ai/contrastors ⭐ 750 
 Contrastive learning toolkit that enables researchers and engineers to train and evaluate contrastive models efficiently.
- 
facebookresearch/balance ⭐ 702 
 The balance python package offers a simple workflow and methods for dealing with biased data samples when looking to infer from them to some target population of interest.
 🔗 import-balance.org
- 
intel/intel-npu-acceleration-library ⭐ 692 
 The Intel NPU Acceleration Library is a Python library designed to boost the efficiency of your applications by leveraging the power of the Intel Neural Processing Unit (NPU) to perform high-speed computations on compatible hardware.
- 
nicolas-hbt/pygraft ⭐ 689 
 Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
 🔗 pygraft.readthedocs.io/en/latest
- 
huggingface/exporters ⭐ 675 
 Export Hugging Face models to Core ML and TensorFlow Lite
- 
qdrant/quaterion ⭐ 657 
 Blazing fast framework for fine-tuning similarity learning models
 🔗 quaterion.qdrant.tech
- 
hpcaitech/EnergonAI ⭐ 631 
 Large-scale model inference.
- 
eleutherai/sparsify ⭐ 630 
 This library trains k-sparse autoencoders (SAEs) on the residual stream activations of HuggingFace language models, roughly following the recipe detailed in Scaling and evaluating sparse autoencoders (Gao et al. 2024)
- 
deepgraphlearning/ULTRA ⭐ 567 
 A foundation model for knowledge graph reasoning
- 
google-deepmind/limit ⭐ 566 
 On the Theoretical Limitations of Embedding-Based Retrieval
 🔗 arxiv.org/abs/2508.21038
- 
raivnlab/MRL ⭐ 564 
 Code repository for the paper - "Matryoshka Representation Learning"
- 
microsoft/Focal-Transformer ⭐ 559 
 [NeurIPS 2021 Spotlight] Official code for "Focal Self-attention for Local-Global Interactions in Vision Transformers"
- 
hkust-knowcomp/AutoSchemaKG ⭐ 551 
 A Knowledge Graph Construction Framework with Schema Generation and Knowledge Graph Completion
- 
lightning-ai/litData ⭐ 543 
 Transform datasets at scale. Optimize datasets for fast AI model training.
- 
linkedin/FastTreeSHAP ⭐ 542 
 Fast SHAP value computation for interpreting tree-based models
- 
mrdbourke/m1-machine-learning-test ⭐ 534 
 Code for testing various M1 Chip benchmarks with TensorFlow.
- 
nevronai/MetisFL ⭐ 522 
 The first open Federated Learning framework implemented in C++ and Python.
 🔗 metisfl.org
- 
apple/ml-l3m ⭐ 174 
 A flexible library for training any type of large model, regardless of modality. Instead of more traditional approaches, we opt for a config-heavy approach
- 
dylanhogg/gptauthor ⭐ 88 
 GPTAuthor is an AI tool for writing long form, multi-chapter stories given a story prompt.
Machine learning libraries that cross over with deep learning in some way.
- 
tensorflow/tensorflow ⭐ 191,911 
 An Open Source Machine Learning Framework for Everyone
 🔗 tensorflow.org
- 
pytorch/pytorch ⭐ 93,691 
 Tensors and Dynamic neural networks in Python with strong GPU acceleration
 🔗 pytorch.org
- 
openai/whisper ⭐ 88,989 
 Robust Speech Recognition via Large-Scale Weak Supervision
- 
keras-team/keras ⭐ 63,451 
 Deep Learning for humans
 🔗 keras.io
- 
deepfakes/faceswap ⭐ 54,536 
 Deepfakes Software For All
 🔗 www.faceswap.dev
- 
facebookresearch/segment-anything ⭐ 52,001 
 The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
- 
microsoft/DeepSpeed ⭐ 40,304 
 DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
 🔗 www.deepspeed.ai
- 
rwightman/pytorch-image-models ⭐ 35,405 
 The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
 🔗 huggingface.co/docs/timm
- 
facebookresearch/detectron2 ⭐ 33,437 
 Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
 🔗 detectron2.readthedocs.io/en/latest
- 
xinntao/Real-ESRGAN ⭐ 32,720 
 Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
- 
openai/CLIP ⭐ 30,921 
 CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
- 
lightning-ai/pytorch-lightning ⭐ 30,205 
 The deep learning framework to pretrain, finetune and deploy AI models. PyTorch Lightning is just organized PyTorch - Lightning disentangles PyTorch code to decouple the science from the engineering.
 🔗 lightning.ai/pytorch-lightning
- 
google-research/tuning_playbook ⭐ 29,216 
 A playbook for systematically maximizing the performance of deep learning models.
- 
facebookresearch/Detectron ⭐ 26,372 
 FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
- 
matterport/Mask_RCNN ⭐ 25,366 
 Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
- 
lucidrains/vit-pytorch ⭐ 24,092 
 Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
- 
paddlepaddle/Paddle ⭐ 23,260 
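 PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)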
 🔗 www.paddlepaddle.org
- 
pyg-team/pytorch_geometric ⭐ 22,958 
 Graph Neural Network Library for PyTorch
 🔗 pyg.org
- 
sanster/IOPaint ⭐ 22,156 
 Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
 🔗 www.iopaint.com
- 
apache/mxnet ⭐ 20,825 
 Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
 🔗 mxnet.apache.org
- 
danielgatis/rembg ⭐ 20,622 
 Rembg is a tool to remove image backgrounds
- 
rasbt/deeplearning-models ⭐ 17,254 
 A collection of various deep learning architectures, models, and tips
- 
microsoft/Swin-Transformer ⭐ 15,249 
 This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
 🔗 arxiv.org/abs/2103.14030
- 
albumentations-team/albumentations ⭐ 15,159 
 Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
 🔗 albumentations.ai
- 
facebookresearch/detr ⭐ 14,745 
 End-to-End Object Detection with Transformers
- 
nvidia/DeepLearningExamples ⭐ 14,512 
 State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
- 
dmlc/dgl ⭐ 14,084 
 Python package built to ease deep learning on graphs, on top of existing DL frameworks.
 🔗 dgl.ai
- 
mlfoundations/open_clip ⭐ 12,700 
 Open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training).
- 
tencent-hunyuan/HunyuanVideo ⭐ 11,104 
 HunyuanVideo: A Systematic Framework For Large Video Generation Model
 🔗 aivideo.hunyuan.tencent.com
- 
kornia/kornia ⭐ 10,782 
 🐍 Geometric Computer Vision Library for Spatial AI
 🔗 kornia.readthedocs.io
- 
facebookresearch/pytorch3d ⭐ 9,535 
 PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
 🔗 pytorch3d.org
- 
modelscope/facechain ⭐ 9,487 
 FaceChain is a deep-learning toolchain for generating your Digital-Twin.
- 
keras-team/autokeras ⭐ 9,270 
 AutoML library for deep learning
 🔗 autokeras.com
- 
arogozhnikov/einops ⭐ 9,206 
 Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
 🔗 einops.rocks
- 
bytedance/monolith ⭐ 8,965 
 A deep learning framework for large scale recommendation modeling with collisionless embedding and real time training captures.
- 
pyro-ppl/pyro ⭐ 8,871 
 Deep universal probabilistic programming with Python and PyTorch
 🔗 pyro.ai
- 
nvidia/apex ⭐ 8,811 
 A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
- 
facebookresearch/ImageBind ⭐ 8,799 
 ImageBind: One Embedding Space to Bind Them All
- 
lucidrains/imagen-pytorch ⭐ 8,368 
 Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
- 
google/trax ⭐ 8,286 
 Trax — Deep Learning with Clear Code and Speed
- 
xpixelgroup/BasicSR ⭐ 7,787 
 Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also supports StyleGAN2 and DFDNet.
 🔗 basicsr.readthedocs.io/en/latest
- 
google/flax ⭐ 6,832 
 Flax is a neural network library for JAX that is designed for flexibility.
 🔗 flax.readthedocs.io
- 
skorch-dev/skorch ⭐ 6,116 
 A scikit-learn compatible neural network library that wraps PyTorch
- 
facebookresearch/mmf ⭐ 5,595 
 A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
 🔗 mmf.sh
- 
mosaicml/composer ⭐ 5,414 
 Supercharge Your Model Training
 🔗 docs.mosaicml.com
- 
deci-ai/super-gradients ⭐ 4,917 
 Easily train or fine-tune SOTA computer vision models with one open source training library. The home of Yolo-NAS.
 🔗 www.supergradients.com
- 
nvidiagameworks/kaolin ⭐ 4,909 
 A PyTorch Library for Accelerating 3D Deep Learning Research
- 
pytorch/ignite ⭐ 4,700 
 High-level library to help with training and evaluating neural networks in PyTorch flexibly and transparently.
 🔗 pytorch-ignite.ai
- 
facebookincubator/AITemplate ⭐ 4,679 
 AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
- 
cvg/LightGlue ⭐ 4,094 
 LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
- 
google-research/scenic ⭐ 3,683 
 Scenic: A Jax Library for Computer Vision Research and Beyond
- 
williamyang1991/VToonify ⭐ 3,590 
 [SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer
- 
modelscope/ClearerVoice-Studio ⭐ 3,452 
 An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.
- 
facebookresearch/PyTorch-BigGraph ⭐ 3,414 
 Generate embeddings from large-scale graph-structured data.
 🔗 torchbiggraph.readthedocs.io
- 
pytorch/botorch ⭐ 3,360 
 Bayesian optimization in PyTorch
 🔗 botorch.org
- 
alpa-projects/alpa ⭐ 3,155 
 Training and serving large-scale neural networks with auto parallelization.
 🔗 alpa.ai
- 
deepmind/dm-haiku ⭐ 3,100 
 JAX-based neural network library
 🔗 dm-haiku.readthedocs.io
- 
explosion/thinc ⭐ 2,876 
 🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
 🔗 thinc.ai
- 
nerdyrodent/VQGAN-CLIP ⭐ 2,658 
 Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.
- 
danielegrattarola/spektral ⭐ 2,384 
 Graph Neural Networks with Keras and Tensorflow 2.
 🔗 graphneural.network
- 
google-research/electra ⭐ 2,362 
 ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
- 
pytorch/torchrec ⭐ 2,355 
 PyTorch domain library for recommendation systems
 🔗 pytorch.org/torchrec
- 
fepegar/torchio ⭐ 2,277 
 Medical imaging processing for AI applications.
 🔗 docs.torchio.org
- 
neuralmagic/sparseml ⭐ 2,145 
 Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
- 
jeshraghian/snntorch ⭐ 1,726 
 Deep and online learning with spiking neural networks in Python
 🔗 snntorch.readthedocs.io/en/latest
- 
calculatedcontent/WeightWatcher ⭐ 1,670 
 The WeightWatcher tool for predicting the accuracy of Deep Neural Networks
- 
tensorly/tensorly ⭐ 1,635 
 TensorLy: Tensor Learning in Python.
 🔗 tensorly.org
- 
tensorflow/mesh ⭐ 1,617 
 Mesh TensorFlow: Model Parallelism Made Easier
- 
vt-vl-lab/FGVC ⭐ 1,556 
 [ECCV 2020] Flow-edge Guided Video Completion
- 
hysts/pytorch_image_classification ⭐ 1,421 
 PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet
- 
xl0/lovely-tensors ⭐ 1,292 
 Tensors, for human consumption
 🔗 xl0.github.io/lovely-tensors
- 
deepmind/android_env ⭐ 1,140 
 RL research on Android devices.
- 
keras-team/keras-cv ⭐ 1,048 
 Industry-strength Computer Vision workflows with Keras
- 
tensorflow/similarity ⭐ 1,023 
 TensorFlow Similarity is a python package focused on making similarity learning quick and easy.
- 
kakaobrain/rq-vae-transformer ⭐ 956 
 The official implementation of Autoregressive Image Generation using Residual Quantization (CVPR '22)
- 
deepmind/chex ⭐ 884 
 Chex is a library of utilities for helping to write reliable JAX code
 🔗 chex.readthedocs.io
- 
mlfoundations/datacomp ⭐ 742 
 DataComp: In search of the next generation of multimodal datasets
 🔗 datacomp.ai
- 
whitead/dmol-book ⭐ 675 
 Deep learning for molecules and materials book
 🔗 dmol.pub
- 
allenai/reward-bench ⭐ 639 
 RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models (including those trained with Direct Preference Optimization, DPO)
 🔗 huggingface.co/spaces/allenai/reward-bench
Machine learning interpretability libraries. Covers explainability, prediction explanations, dashboards, understanding knowledge development in training.
- 
slundberg/shap ⭐ 24,481 
 A game theoretic approach to explain the output of any machine learning model.
 🔗 shap.readthedocs.io
- 
marcotcr/lime ⭐ 12,003 
 Lime: Explaining the predictions of any machine learning classifier
- 
arize-ai/phoenix ⭐ 7,158 
 AI Observability & Evaluation
 🔗 arize.com/docs/phoenix
- 
interpretml/interpret ⭐ 6,686 
 Fit interpretable models. Explain blackbox machine learning.
 🔗 interpret.ml/docs
- 
pytorch/captum ⭐ 5,414 
 Model interpretability and understanding for PyTorch
 🔗 captum.ai
- 
tensorflow/lucid ⭐ 4,697 
 A collection of infrastructure and tools for research in neural network interpretability.
- 
pair-code/lit ⭐ 3,596 
 The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework agnostic interface.
 🔗 pair-code.github.io/lit
- 
maif/shapash ⭐ 2,957 
 🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
 🔗 maif.github.io/shapash
- 
teamhg-memex/eli5 ⭐ 2,772 
 A library for debugging/inspecting machine learning classifiers and explaining their predictions
 🔗 eli5.readthedocs.io
- 
transformerlensorg/TransformerLens ⭐ 2,635 
 A library for mechanistic interpretability of GPT-style language models
 🔗 transformerlensorg.github.io/transformerlens
- 
eleutherai/pythia ⭐ 2,622 
 Interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers
- 
seldonio/alibi ⭐ 2,561 
 Algorithms for explaining machine learning models
 🔗 docs.seldon.io/projects/alibi/en/stable
- 
oegedijk/explainerdashboard ⭐ 2,448 
 Quickly build Explainable AI dashboards that show the inner workings of so-called "blackbox" machine learning models.
 🔗 explainerdashboard.readthedocs.io
- 
jalammar/ecco ⭐ 2,059 
 Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, BERT, RoBERTa, T5, and T0).
 🔗 ecco.readthedocs.io
- 
google-deepmind/penzai ⭐ 1,822 
 A JAX library for writing models as legible, functional pytree data structures, along with tools for visualizing, modifying, and analyzing them. Penzai focuses on making it easy to do stuff with models after they have been trained
 🔗 penzai.readthedocs.io
- 
trusted-ai/AIX360 ⭐ 1,738 
 Interpretability and explainability of data and machine learning models
 🔗 aix360.res.ibm.com
- 
stanfordnlp/pyreft ⭐ 1,512 
 Stanford NLP Python library for Representation Finetuning (ReFT)
 🔗 arxiv.org/abs/2404.03592
- 
cdpierse/transformers-interpret ⭐ 1,381 
 Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
- 
selfexplainml/PiML-Toolbox ⭐ 1,266 
 PiML (Python Interpretable Machine Learning) toolbox for model development & diagnostics
 🔗 selfexplainml.github.io/piml-toolbox
- 
ethicalml/xai ⭐ 1,200 
 XAI is a Machine Learning library that is designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models
 🔗 ethical.institute/principles.html#commitment-3
- 
jbloomaus/SAELens ⭐ 981 
 Training Sparse Autoencoders on LLMs. Analyse sparse autoencoders and neural network internals.
 🔗 jbloomaus.github.io/saelens
- 
salesforce/OmniXAI ⭐ 950 
 OmniXAI: A Library for eXplainable AI
- 
andyzoujm/representation-engineering ⭐ 884 
 Representation Engineering: A Top-Down Approach to AI Transparency
 🔗 www.ai-transparency.org
- 
stanfordnlp/pyvene ⭐ 815 
 Library for intervening on the internal states of PyTorch models. Interventions are an important operation in many areas of AI, including model editing, steering, robustness, and interpretability.
 🔗 pyvene.ai
- 
labmlai/inspectus ⭐ 686 
 Inspectus provides visualization tools for attention mechanisms in deep learning models. It provides a set of comprehensive views, making it easier to understand how these models work.
- 
ndif-team/nnsight ⭐ 673 
 The nnsight package enables interpreting and manipulating the internals of deep learned models.
 🔗 nnsight.net
- 
alignmentresearch/tuned-lens ⭐ 529 
 Tools for understanding how transformer predictions are built layer-by-layer
 🔗 tuned-lens.readthedocs.io/en/latest
MLOps tools, frameworks and libraries: intersection of machine learning, data engineering and DevOps; deployment, health, diagnostics and governance of ML models.
- 
apache/airflow ⭐ 42,663 
 Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
 🔗 airflow.apache.org
- 
ray-project/ray ⭐ 39,190 
 Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
 🔗 ray.io
- 
mlflow/mlflow ⭐ 22,336 
 The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.
 🔗 mlflow.org
- 
kestra-io/kestra ⭐ 21,856 
 Orchestrate everything - from scripts to data, infra, AI, and business - as code, with UI and AI Copilot. Simple. Fast. Scalable.
 🔗 kestra.io
- 
prefecthq/prefect ⭐ 20,504 
 Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
 🔗 prefect.io
- 
jlowin/fastmcp ⭐ 18,592 
 FastMCP is the standard framework for building MCP servers and clients. FastMCP 1.0 was incorporated into the official MCP Python SDK.
 🔗 gofastmcp.com
- 
spotify/luigi ⭐ 18,513 
 Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.
- 
langfuse/langfuse ⭐ 16,772 
 🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
 🔗 langfuse.com/docs
- 
iterative/dvc ⭐ 14,931 
 🦉 Data Versioning and ML Experiments
 🔗 dvc.org
- 
horovod/horovod ⭐ 14,600 
 Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
 🔗 horovod.ai
- 
dagster-io/dagster ⭐ 14,145 
 An orchestration platform for the development, production, and observation of data assets.
 🔗 dagster.io
- 
bentoml/OpenLLM ⭐ 11,819 
 Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.
 🔗 bentoml.com
- 
dbt-labs/dbt-core ⭐ 11,600 
 dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
 🔗 getdbt.com
- 
ludwig-ai/ludwig ⭐ 11,596 
 Low-code framework for building custom LLMs, neural networks, and other AI models
 🔗 ludwig.ai
- 
great-expectations/great_expectations ⭐ 10,802 
 Always know what to expect from your data.
 🔗 docs.greatexpectations.io
- 
kedro-org/kedro ⭐ 10,564 
 Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.
 🔗 kedro.org
- 
huggingface/text-generation-inference ⭐ 10,544 
 A Rust, Python and gRPC server for text generation inference. Used in production at HuggingFace to power Hugging Chat, the Inference API and Inference Endpoint.
 🔗 hf.co/docs/text-generation-inference
- 
netflix/metaflow ⭐ 9,554 
 Build, Manage and Deploy AI/ML Systems
 🔗 metaflow.org
- 
activeloopai/deeplake ⭐ 8,852 
 Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow.
 🔗 activeloop.ai
- 
mage-ai/mage-ai ⭐ 8,485 
 🧙 Build, run, and manage data pipelines for integrating and transforming data.
 🔗 www.mage.ai
- 
bentoml/BentoML ⭐ 8,108 
 The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
 🔗 bentoml.com
- 
internlm/lmdeploy ⭐ 7,134 
 LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
 🔗 lmdeploy.readthedocs.io/en/latest
- 
evidentlyai/evidently ⭐ 6,669 
 Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics.
 🔗 discord.gg/xzjkranp8b
- 
flyteorg/flyte ⭐ 6,527 
 Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
 🔗 flyte.org
- 
feast-dev/feast ⭐ 6,367 
 The Open Source Feature Store for AI/ML
 🔗 feast.dev
- 
allegroai/clearml ⭐ 6,307 
 ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
 🔗 clear.ml/docs
- 
adap/flower ⭐ 6,292 
 Flower: A Friendly Federated AI Framework
 🔗 flower.ai
- 
aimhubio/aim ⭐ 5,816 
 Aim 💫 — An easy-to-use & supercharged open-source experiment tracker.
 🔗 aimstack.io
- 
zenml-io/zenml ⭐ 4,914 
 ZenML 🙏: MLOps for Reliable AI: from Classical ML to Agents.
 🔗 zenml.io
- 
internlm/xtuner ⭐ 4,912 
 A Next-Generation Training Engine Built for Ultra-Large MoE Models
 🔗 xtuner.readthedocs.io/zh-cn/latest
- 
orchest/orchest ⭐ 4,141 
 Build data pipelines, the easy way 🛠️
 🔗 orchest.readthedocs.io/en/stable
- 
kubeflow/pipelines ⭐ 3,955 
 Machine Learning Pipelines for Kubeflow
 🔗 www.kubeflow.org/docs/components/pipelines
- 
polyaxon/polyaxon ⭐ 3,682 
 MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
 🔗 polyaxon.com
- 
ploomber/ploomber ⭐ 3,604 
 The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
 🔗 docs.ploomber.io
- 
towhee-io/towhee ⭐ 3,426 
 Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
 🔗 towhee.io
- 
determined-ai/determined ⭐ 3,185 
 Determined is an open-source machine learning platform that simplifies distributed training, hyperparameter tuning, experiment tracking, and resource management. Works with PyTorch and TensorFlow.
 🔗 determined.ai
- 
azure/PyRIT ⭐ 2,945 
 The Python Risk Identification Tool for generative AI (PyRIT) is an open access automation framework to empower security professionals and ML engineers to red team foundation models and their applications.
 🔗 azure.github.io/pyrit
- 
leptonai/leptonai ⭐ 2,791 
 A Pythonic framework to simplify AI service building
 🔗 lepton.ai
- 
michaelfeil/infinity ⭐ 2,479 
 Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models, clip, clap and colpali
 🔗 michaelfeil.github.io/infinity
- 
apache/hamilton ⭐ 2,275 
 Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows, that encode lineage/tracing and metadata. Runs and scales everywhere python does.
 🔗 hamilton.apache.org
- 
labmlai/labml ⭐ 2,231 
 🔎 Monitor deep learning model training and hardware usage from your mobile phone 📱
 🔗 labml.ai
- 
meltano/meltano ⭐ 2,213 
 Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
 🔗 meltano.com
- 
dstackai/dstack ⭐ 1,916 
 dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or on-prem.
 🔗 dstack.ai
- 
vllm-project/production-stack ⭐ 1,823 
 vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization
 🔗 docs.vllm.ai/projects/production-stack
- 
dagworks-inc/burr ⭐ 1,804 
 Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastructure.
 🔗 burr.apache.org
- 
hi-primus/optimus ⭐ 1,519 
 🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
 🔗 hi-optimus.com
- 
kubeflow/examples ⭐ 1,447 
 A repository to host extended examples and tutorials
- 
substratusai/kubeai ⭐ 1,074 
 AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
 🔗 www.kubeai.org
- 
arize-ai/openinference ⭐ 632 
 OpenInference is a set of conventions and plugins that is complementary to OpenTelemetry to enable tracing of AI applications.
 🔗 arize-ai.github.io/openinference
- 
lightonai/pylate ⭐ 608 
 Built on Sentence Transformers, designed to simplify fine-tuning, inference, and retrieval with state-of-the-art ColBERT models
 🔗 lightonai.github.io/pylate
Machine learning libraries and toolkits that cross over with reinforcement learning in some way: agent reinforcement learning, agent environments, RLHF
- 
openai/gym ⭐ 36,617 
 A toolkit for developing and comparing reinforcement learning algorithms.
 🔗 www.gymlibrary.dev
- 
openai/baselines ⭐ 16,458 
 OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
- 
google/dopamine ⭐ 10,804 
 Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
 🔗 github.com/google/dopamine
- 
farama-foundation/Gymnasium ⭐ 10,303 
 An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
 🔗 gymnasium.farama.org
- 
thu-ml/tianshou ⭐ 8,825 
 An elegant PyTorch deep reinforcement learning library.
 🔗 tianshou.org
- 
lucidrains/PaLM-rlhf-pytorch ⭐ 7,869 
 Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
- 
tensorlayer/TensorLayer ⭐ 7,375 
 Deep Learning and Reinforcement Learning Library for Scientists and Engineers
 🔗 tensorlayerx.com
- 
keras-rl/keras-rl ⭐ 5,554 
 Deep Reinforcement Learning for Keras.
 🔗 keras-rl.readthedocs.io
- 
deepmind/dm_control ⭐ 4,251 
 Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
- 
ai4finance-foundation/ElegantRL ⭐ 4,186 
 Massively Parallel Deep Reinforcement Learning. 🔥
 🔗 ai4finance.org
- 
deepmind/acme ⭐ 3,797 
 A library of reinforcement learning components and agents
- 
facebookresearch/ReAgent ⭐ 3,662 
 A platform for Reasoning systems (Reinforcement Learning, Contextual Bandits, etc.)
 🔗 reagent.ai
- 
opendilab/DI-engine ⭐ 3,524 
 DI-engine is a generalized decision intelligence engine for PyTorch and JAX. It provides python-first and asynchronous-native task and middleware abstractions
 🔗 di-engine-docs.readthedocs.io
- 
pettingzoo-team/PettingZoo ⭐ 3,134 
 An API standard for multi-agent reinforcement learning environments, with popular reference environments and related utilities
 🔗 pettingzoo.farama.org
- 
pytorch/rl ⭐ 3,085 
 A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
 🔗 pytorch.org/rl
- 
eureka-research/Eureka ⭐ 3,048 
 Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
 🔗 eureka-research.github.io
- 
kzl/decision-transformer ⭐ 2,664 
 Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.
- 
arise-initiative/robosuite ⭐ 1,954 
 robosuite: A Modular Simulation Framework and Benchmark for Robot Learning
 🔗 robosuite.ai
- 
anthropics/hh-rlhf ⭐ 1,783 
 Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
 🔗 arxiv.org/abs/2204.05862
- 
humancompatibleai/imitation ⭐ 1,600 
 Clean PyTorch implementations of imitation and reward learning algorithms
 🔗 imitation.readthedocs.io
- 
denys88/rl_games ⭐ 1,208 
 RL Games: High performance RL library
- 
google-deepmind/meltingpot ⭐ 740 
 A suite of test scenarios for multi-agent reinforcement learning.
Natural language processing libraries and toolkits: text processing, topic modelling, tokenisers, chatbots. Also see the LLMs and ChatGPT category for crossover.
- 
huggingface/transformers ⭐ 150,627 
 🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
 🔗 huggingface.co/transformers
- 
myshell-ai/OpenVoice ⭐ 34,558 
 Instant voice cloning by MIT and MyShell. Audio foundation model.
 🔗 research.myshell.ai/open-voice
- 
explosion/spaCy ⭐ 32,586 
 💫 Industrial-strength Natural Language Processing (NLP) in Python
 🔗 spacy.io
- 
pytorch/fairseq ⭐ 31,847 
 Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
- 
vikparuchuri/marker ⭐ 29,043 
 Marker converts PDF, EPUB, and MOBI to markdown. It's 10x faster than nougat, more accurate on most documents, and has low hallucination risk.
 🔗 www.datalab.to
- 
microsoft/unilm ⭐ 21,756 
 Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
 🔗 aka.ms/generalai
- 
huggingface/datasets ⭐ 20,710 
 🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
 🔗 huggingface.co/docs/datasets
- 
vikparuchuri/surya ⭐ 18,649 
 OCR, layout analysis, reading order, table recognition in 90+ languages
 🔗 www.datalab.to
- 
m-bain/whisperX ⭐ 18,008 
 WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- 
ukplab/sentence-transformers ⭐ 17,627 
 State-of-the-Art Text Embeddings
 🔗 www.sbert.net
- 
rare-technologies/gensim ⭐ 16,203 
 Topic Modelling for Humans
 🔗 radimrehurek.com/gensim
- 
openai/tiktoken ⭐ 16,098 
 tiktoken is a fast BPE tokeniser for use with OpenAI's models.
- 
nvidia/NeMo ⭐ 15,809 
 A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
 🔗 docs.nvidia.com/nemo-framework/user-guide/latest/overview.html
- 
gunthercox/ChatterBot ⭐ 14,419 
 ChatterBot is a machine learning, conversational dialog engine for creating chat bots
 🔗 docs.chatterbot.us
- 
nltk/nltk ⭐ 14,314 
 NLTK Source
 🔗 www.nltk.org
- 
flairnlp/flair ⭐ 14,295 
 A very simple framework for state-of-the-art Natural Language Processing (NLP)
 🔗 flairnlp.github.io/flair
- 
jina-ai/clip-as-service ⭐ 12,750 
 🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
 🔗 clip-as-service.jina.ai
- 
allenai/allennlp ⭐ 11,878 
 An open-source NLP research library, built on PyTorch.
 🔗 www.allennlp.org
- 
facebookresearch/seamless_communication ⭐ 11,668 
 Foundational Models for State-of-the-Art Speech and Text Translation
- 
neuml/txtai ⭐ 11,660 
 💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
 🔗 neuml.github.io/txtai
- 
google/sentencepiece ⭐ 11,327 
 Unsupervised text tokenizer for Neural Network-based text generation.
- 
facebookresearch/ParlAI ⭐ 10,621 
 A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
 🔗 parl.ai
- 
speechbrain/speechbrain ⭐ 10,513 
 A PyTorch-based Speech Toolkit
 🔗 speechbrain.github.io
- 
doccano/doccano ⭐ 10,301 
 Open source annotation tool for machine learning practitioners.
- 
facebookresearch/nougat ⭐ 9,664 
 Implementation of Nougat Neural Optical Understanding for Academic Documents
 🔗 facebookresearch.github.io/nougat
- 
espnet/espnet ⭐ 9,494 
 End-to-End Speech Processing Toolkit
 🔗 espnet.github.io/espnet
- 
sloria/TextBlob ⭐ 9,436 
 Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
 🔗 textblob.readthedocs.io
- 
togethercomputer/OpenChatKit ⭐ 9,015 
 OpenChatKit provides a powerful, open-source base to create both specialized and general purpose chatbots
- 
clips/pattern ⭐ 8,838 
 Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
 🔗 github.com/clips/pattern/wiki
- 
quivrhq/MegaParse ⭐ 7,189 
 File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
 🔗 megaparse.com
- 
maartengr/BERTopic ⭐ 7,084 
 Leveraging BERT and c-TF-IDF to create easily interpretable topics.
 🔗 maartengr.github.io/bertopic
- 
deeppavlov/DeepPavlov ⭐ 6,933 
 An open source library for deep learning end-to-end dialog systems and chatbots.
 🔗 deeppavlov.ai
- 
facebookresearch/metaseq ⭐ 6,545 
 A codebase for working with Open Pre-trained Transformers, originally forked from fairseq.
- 
kingoflolz/mesh-transformer-jax ⭐ 6,346 
 Model parallel transformers in JAX and Haiku
- 
aiwaves-cn/agents ⭐ 5,730 
 An Open-source Framework for Data-centric, Self-evolving Autonomous Language Agents
- 
layout-parser/layout-parser ⭐ 5,516 
 A Unified Toolkit for Deep Learning Based Document Image Analysis
 🔗 layout-parser.github.io
- 
salesforce/CodeGen ⭐ 5,133 
 CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
- 
minimaxir/textgenrnn ⭐ 4,936 
 Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
- 
argilla-io/argilla ⭐ 4,711 
 Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
 🔗 argilla-io.github.io/argilla/latest
- 
makcedward/nlpaug ⭐ 4,624 
 Data augmentation for NLP
 🔗 makcedward.github.io
- 
facebookresearch/DrQA ⭐ 4,490 
 Reading Wikipedia to Answer Open-Domain Questions
- 
promptslab/Promptify ⭐ 4,257 
 Prompt Engineering | Prompt Versioning | Use GPT or other prompt based models to get structured output. Join our discord for Prompt-Engineering, LLMs and other latest research
 🔗 discord.gg/m88xfymbk6
- 
thilinarajapakse/simpletransformers ⭐ 4,215 
 Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
 🔗 simpletransformers.ai
- 
maartengr/KeyBERT ⭐ 4,015 
 A minimal and easy-to-use keyword extraction technique that leverages BERT embeddings to create keywords and keyphrases that are most similar to a document.
 🔗 maartengr.github.io/keybert
- 
life4/textdistance ⭐ 3,493 
 📐 Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
- 
jsvine/markovify ⭐ 3,369 
 A simple, extensible Markov chain generator.
- 
bytedance/lightseq ⭐ 3,289 
 LightSeq: A High Performance Library for Sequence Processing and Generation
- 
errbotio/errbot ⭐ 3,228 
 Errbot is a chatbot, a daemon that connects to your favorite chat service and brings your tools and some fun into the conversation.
 🔗 errbot.io
- 
neuralmagic/deepsparse ⭐ 3,155 
 Sparsity-aware deep learning inference runtime for CPUs
 🔗 neuralmagic.com/deepsparse
- 
huawei-noah/Pretrained-Language-Model ⭐ 3,139 
 Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
- 
ddangelov/Top2Vec ⭐ 3,084 
 Top2Vec learns jointly embedded topic, document and word vectors.
- 
salesforce/CodeT5 ⭐ 3,063 
 Home of CodeT5: Open Code LLMs for Code Understanding and Generation
 🔗 arxiv.org/abs/2305.07922
- 
bigscience-workshop/promptsource ⭐ 2,942 
 Toolkit for creating, sharing and using natural language prompts.
- 
jbesomi/texthero ⭐ 2,908 
 Text preprocessing, representation and visualization from zero to hero.
 🔗 texthero.org
- 
huggingface/neuralcoref ⭐ 2,885 
 ✨Fast Coreference Resolution in spaCy with Neural Networks
 🔗 huggingface.co/coref
- 
nvidia/nv-ingest ⭐ 2,745 
 NVIDIA-Ingest is a scalable, performance-oriented document content and metadata extraction microservice.
 🔗 docs.nvidia.com/nemo/retriever/latest/extraction/overview
- 
huggingface/setfit ⭐ 2,572 
 SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers.
 🔗 hf.co/docs/setfit
- 
chonkie-inc/chonkie ⭐ 2,489 
 🦛 CHONK docs with Chonkie ✨ — The no-nonsense RAG library
 🔗 docs.chonkie.ai
- 
urchade/GLiNER ⭐ 2,391 
 Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
 🔗 arxiv.org/abs/2311.08526
- 
alibaba/EasyNLP ⭐ 2,169 
 EasyNLP: A Comprehensive and Easy-to-use NLP Toolkit
- 
jamesturk/jellyfish ⭐ 2,154 
 🪼 a python library for doing approximate and phonetic matching of strings.
 🔗 jamesturk.github.io/jellyfish
- 
thudm/P-tuning-v2 ⭐ 2,059 
 An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
- 
featureform/featureform ⭐ 1,943 
 The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
 🔗 www.featureform.com
- 
marella/ctransformers ⭐ 1,877 
 Python bindings for the Transformer models implemented in C/C++ using GGML library.
- 
nomic-ai/nomic ⭐ 1,824 
 Interact, analyze and structure massive text, image, embedding, audio and video datasets
 🔗 atlas.nomic.ai
- 
explosion/spacy-models ⭐ 1,807 
 💫 Models for the spaCy Natural Language Processing (NLP) library
 🔗 spacy.io
- 
deepset-ai/FARM ⭐ 1,753 
 🏡 Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.
 🔗 farm.deepset.ai
- 
intellabs/fastRAG ⭐ 1,728 
 Efficient Retrieval Augmentation and Generation Framework
- 
franck-dernoncourt/NeuroNER ⭐ 1,717 
 Named-entity recognition using neural networks. Easy-to-use and state-of-the-art results.
 🔗 neuroner.com
- 
google-research/language ⭐ 1,709 
 Shared repository for open-sourced projects from the Google AI Language team.
 🔗 ai.google/research/teams/language
- 
plasticityai/magnitude ⭐ 1,654 
 A fast, efficient universal vector embedding utility package.
- 
arxiv-vanity/arxiv-vanity ⭐ 1,627 
 Renders papers from arXiv as responsive web pages so you don't have to squint at a PDF.
 🔗 www.arxiv-vanity.com
- 
chrismattmann/tika-python ⭐ 1,625 
 Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
- 
answerdotai/ModernBERT ⭐ 1,530 
 Bringing BERT into modernity via both architecture changes and scaling
 🔗 arxiv.org/abs/2412.13663
- 
pemistahl/lingua-py ⭐ 1,509 
 The most accurate natural language detection library for Python, suitable for short text and mixed-language text
- 
dmmiller612/bert-extractive-summarizer ⭐ 1,440 
 Easy to use extractive text summarization with BERT
- 
gunthercox/chatterbot-corpus ⭐ 1,407 
 A multilingual dialog corpus
 🔗 corpus.chatterbot.us
- 
jonasgeiping/cramming ⭐ 1,349 
 Cramming the training of a (BERT-type) language model into limited compute.
- 
xhluca/bm25s ⭐ 1,346 
 Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
 🔗 bm25s.github.io
- 
openai/grade-school-math ⭐ 1,335 
 GSM8K, a dataset of 8.5K high quality linguistically diverse grade school math word problems
- 
unitaryai/detoxify ⭐ 1,119 
 Toxic Comment Classification with PyTorch Lightning and Transformers
 🔗 www.unitary.ai
- 
abertsch72/unlimiformer ⭐ 1,062 
 Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
- 
keras-team/keras-hub ⭐ 937 
 Pretrained model hub for Keras 3.
 🔗 keras.io/keras_hub
- 
norskregnesentral/skweak ⭐ 926 
 skweak: A software toolkit for weak supervision applied to NLP tasks
- 
explosion/spacy-streamlit ⭐ 843 
 👑 spaCy building blocks and visualizers for Streamlit apps
 🔗 share.streamlit.io/ines/spacy-streamlit-demo/master/app.py
- 
maartengr/PolyFuzz ⭐ 784 
 Performs fuzzy string matching, string grouping, and contains extensive evaluation functions. PolyFuzz is meant to bring fuzzy string matching techniques together within a single framework.
 🔗 maartengr.github.io/polyfuzz
- 
paddlepaddle/RocketQA ⭐ 781 
 🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models.
- 
webis-de/small-text ⭐ 628 
 Small-Text provides state-of-the-art Active Learning for Text Classification. Several pre-implemented Query Strategies, Initialization Strategies, and Stopping Criteria are provided, which can be easily mixed and matched to build active learning experiments or applications.
 🔗 small-text.readthedocs.io
- 
babelscape/rebel ⭐ 545 
 REBEL is a seq2seq model that simplifies Relation Extraction (EMNLP 2021).
Python packaging, dependency management and bundling.
- 
astral-sh/uv ⭐ 69,104 
 An extremely fast Python package installer and resolver, written in Rust. Designed as a drop-in replacement for pip and pip-compile.
 🔗 docs.astral.sh/uv
- 
pyenv/pyenv ⭐ 43,294 
 pyenv lets you easily switch between multiple versions of Python.
- 
python-poetry/poetry ⭐ 33,927 
 Python packaging and dependency management made easy
 🔗 python-poetry.org
- 
pypa/pipenv ⭐ 25,098 
 A virtualenv management tool that supports a multitude of systems and nicely bridges the gaps between pip, python and virtualenv.
 🔗 pipenv.pypa.io
- 
mitsuhiko/rye ⭐ 14,303 
 A Hassle-Free Python Experience
 🔗 rye.astral.sh
- 
pyinstaller/pyinstaller ⭐ 12,640 
 Freeze (package) Python programs into stand-alone executables
 🔗 www.pyinstaller.org
- 
pypa/pipx ⭐ 12,105 
 Install and Run Python Applications in Isolated Environments
 🔗 pipx.pypa.io
- 
conda-forge/miniforge ⭐ 8,593 
 A conda-forge distribution.
 🔗 conda-forge.org/download
- 
pdm-project/pdm ⭐ 8,472 
 A modern Python package and dependency manager supporting the latest PEP standards
 🔗 pdm-project.org
- 
jazzband/pip-tools ⭐ 7,940 
 A set of tools to keep your pinned Python dependencies fresh (pip-compile + pip-sync)
 🔗 pip-tools.rtfd.io
- 
mamba-org/mamba ⭐ 7,671 
 The Fast Cross-Platform Package Manager: mamba is a reimplementation of the conda package manager in C++
 🔗 mamba.readthedocs.io
- 
conda/conda ⭐ 7,118 
 A system-level, binary package and environment manager running on all major operating systems and platforms.
 🔗 docs.conda.io/projects/conda
- 
pypa/hatch ⭐ 6,832 
 Modern, extensible Python project management
 🔗 hatch.pypa.io/latest
- 
indygreg/PyOxidizer ⭐ 5,962 
 A modern Python application packaging and distribution tool
- 
prefix-dev/pixi ⭐ 5,361 
 pixi is a cross-platform, multi-language package manager and workflow tool built on the foundation of the conda ecosystem.
 🔗 pixi.sh
- 
pypa/virtualenv ⭐ 4,968 
 A tool to create isolated Python environments. Since Python 3.3, a subset of it has been integrated into the standard lib venv module.
 🔗 virtualenv.pypa.io
- 
spack/spack ⭐ 4,804 
 A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
 🔗 spack.io
- 
pantsbuild/pex ⭐ 4,098 
 A tool for generating .pex (Python EXecutable) files, lock files and venvs.
 🔗 docs.pex-tool.org
- 
beeware/briefcase ⭐ 3,082 
 Tools to support converting a Python project into a standalone native application.
 🔗 briefcase.readthedocs.io
- 
pypa/flit ⭐ 2,227 
 Simplified packaging of Python modules
 🔗 flit.pypa.io
- 
linkedin/shiv ⭐ 1,878 
 shiv is a command line utility for building fully self contained Python zipapps as outlined in PEP 441, but with all their dependencies included.
- 
ofek/pyapp ⭐ 1,755 
 Runtime installer for Python applications
 🔗 ofek.dev/pyapp
- 
marcelotduarte/cx_Freeze ⭐ 1,491 
 Creates standalone executables from Python scripts with the same performance as the original script. It is cross-platform and should work on any platform that Python runs on.
 🔗 marcelotduarte.github.io/cx_freeze
- 
pypa/gh-action-pypi-publish ⭐ 1,090 
 The blessed GitHub Action, for publishing your 📦 distribution files to PyPI, the tokenless way: https://github.com/marketplace/actions/pypi-publish
 🔗 packaging.python.org/guides/publishing-package-distribution-releases-using-github-actions-ci-cd-workflows
- 
py2exe/py2exe ⭐ 964 
 Create standalone Windows programs from Python code
 🔗 www.py2exe.org
- 
prefix-dev/rip ⭐ 667 
 RIP is a library that allows the resolving and installing of Python PyPI packages from Rust into a virtual environment. It's based on our experience with building Rattler and aims to provide the same experience but for PyPI instead of Conda.
 🔗 prefix.dev
- 
python-poetry/install.python-poetry.org ⭐ 238 
 The official Poetry installation script
 🔗 install.python-poetry.org
Pandas and dataframe libraries: data analysis, statistical reporting, pandas GUIs, pandas performance optimisations.
- 
pandas-dev/pandas ⭐ 46,739 
 Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
 🔗 pandas.pydata.org
- 
pola-rs/polars ⭐ 35,555 
 Extremely fast Query Engine for DataFrames, written in Rust
 🔗 docs.pola.rs
- 
duckdb/duckdb ⭐ 33,192 
 DuckDB is an analytical in-process SQL database management system
 🔗 www.duckdb.org
- 
gventuri/pandas-ai ⭐ 22,197 
 Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
 🔗 pandas-ai.com
- 
kanaries/pygwalker ⭐ 15,227 
 PyGWalker: Turn your dataframe into an interactive UI for visual analysis
 🔗 kanaries.net/pygwalker
- 
ydataai/ydata-profiling ⭐ 13,166 
 1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
 🔗 docs.sdk.ydata.ai
- 
rapidsai/cudf ⭐ 9,231 
 cuDF is a GPU DataFrame library for loading, joining, aggregating, filtering, and otherwise manipulating data
 🔗 docs.rapids.ai/api/cudf/stable
- 
deepseek-ai/smallpond ⭐ 4,787 
 A lightweight data processing framework built on DuckDB and 3FS.
- 
eventual-inc/Daft ⭐ 4,548 
 Distributed query engine providing simple and reliable data processing for any modality and scale
 🔗 daft.ai
- 
aws/aws-sdk-pandas ⭐ 4,065 
 pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
 🔗 aws-sdk-pandas.readthedocs.io
- 
unionai-oss/pandera ⭐ 4,037 
 A light-weight, flexible, and expressive statistical data testing library
 🔗 www.union.ai/pandera
- 
nalepae/pandarallel ⭐ 3,792 
 A simple and efficient tool to parallelize Pandas operations on all available CPUs
 🔗 nalepae.github.io/pandarallel
- 
adamerose/PandasGUI ⭐ 3,248 
 A GUI for Pandas DataFrames
- 
blaze/blaze ⭐ 3,200 
 NumPy and Pandas interface to Big Data
 🔗 blaze.pydata.org
- 
pydata/pandas-datareader ⭐ 3,102 
 Extract data from a wide range of Internet sources into a pandas DataFrame.
 🔗 pydata.github.io/pandas-datareader/stable/index.html
- 
delta-io/delta-rs ⭐ 2,965 
 A native Rust library for Delta Lake, with bindings into Python
 🔗 delta-io.github.io/delta-rs
- 
scikit-learn-contrib/sklearn-pandas ⭐ 2,842 
 Pandas integration with sklearn
- 
jmcarpenter2/swifter ⭐ 2,630 
 A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
- 
fugue-project/fugue ⭐ 2,116 
 A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
 🔗 fugue-tutorials.readthedocs.io
- 
pyjanitor-devs/pyjanitor ⭐ 1,457 
 Clean APIs for data cleaning. Python implementation of R package Janitor
 🔗 pyjanitor-devs.github.io/pyjanitor
- 
holoviz/hvplot ⭐ 1,267 
 A high-level plotting API for pandas, dask, xarray, and networkx built on HoloViews
 🔗 hvplot.holoviz.org
- 
renumics/spotlight ⭐ 1,195 
 Interactively explore unstructured datasets from your dataframe.
 🔗 renumics.com
- 
machow/siuba ⭐ 1,175 
 Python library for using dplyr like syntax with pandas and SQL
 🔗 siuba.org
- 
tkrabel/bamboolib ⭐ 951 
 bamboolib - a GUI for pandas DataFrames
 🔗 bamboolib.com
- 
mwouts/itables ⭐ 914 
 This package changes how Pandas and Polars DataFrames are rendered in Jupyter Notebooks. With itables you can display your tables as interactive DataTables that you can sort, paginate, scroll or filter.
 🔗 mwouts.github.io/itables
Performance, parallelisation and low level libraries.
- 
celery/celery ⭐ 27,290 
 Distributed Task Queue (development branch)
 🔗 docs.celeryq.dev
- 
google/flatbuffers ⭐ 24,813 
 FlatBuffers: Memory Efficient Serialization Library
 🔗 flatbuffers.dev
- 
pybind/pybind11 ⭐ 17,325 
 Seamless operability between C++11 and Python
 🔗 pybind11.readthedocs.io
- 
exaloop/codon ⭐ 15,931 
 A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
 🔗 docs.exaloop.io
- 
dask/dask ⭐ 13,508 
 Parallel computing with task scheduling
 🔗 dask.org
- 
numba/numba ⭐ 10,646 
 NumPy aware dynamic Python compiler using LLVM
 🔗 numba.pydata.org
- 
modin-project/modin ⭐ 10,286 
 Modin: Scale your Pandas workflows by changing a single line of code
 🔗 modin.readthedocs.io
- 
vaexio/vaex ⭐ 8,432 
 Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
 🔗 vaex.io
- 
nebuly-ai/optimate ⭐ 8,365 
 A collection of libraries to optimise AI model performance
 🔗 www.nebuly.com
- 
mher/flower ⭐ 6,948 
 Real-time monitor and web admin for Celery distributed task queue
 🔗 flower.readthedocs.io
- 
python-trio/trio ⭐ 6,878 
 Trio – a friendly Python library for async concurrency and I/O
 🔗 trio.readthedocs.io
- 
airtai/faststream ⭐ 4,596 
 FastStream is a powerful and easy-to-use Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.
 🔗 faststream.ag2.ai/latest
- 
ultrajson/ultrajson ⭐ 4,446 
 Ultra fast JSON decoder and encoder written in C with Python bindings
 🔗 pypi.org/project/ujson
- 
tlkh/asitop ⭐ 4,240 
 Perf monitoring CLI tool for Apple Silicon
 🔗 tlkh.github.io/asitop
- 
facebookincubator/cinder ⭐ 3,711 
 Cinder is Meta's internal performance-oriented production version of CPython.
 🔗 trycinder.com
- 
ipython/ipyparallel ⭐ 2,624 
 IPython Parallel: Interactive Parallel Computing in Python
 🔗 ipyparallel.readthedocs.io
- 
agronholm/anyio ⭐ 2,230 
 High level asynchronous concurrency and networking framework that works on top of either Trio or asyncio
- 
h5py/h5py ⭐ 2,172 
 HDF5 for Python -- The h5py package is a Pythonic interface to the HDF5 binary data format.
 🔗 www.h5py.org
- 
intel/intel-extension-for-transformers ⭐ 2,165 
 ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
- 
tiangolo/asyncer ⭐ 2,143 
 Asyncer, async and await, focused on developer experience.
 🔗 asyncer.tiangolo.com
- 
intel/intel-extension-for-pytorch ⭐ 1,970 
 A Python package that extends the official PyTorch to easily gain performance on Intel platforms
- 
faster-cpython/ideas ⭐ 1,720 
 Discussion and work tracker for Faster CPython project.
- 
dask/distributed ⭐ 1,651 
 A distributed task scheduler for Dask
 🔗 distributed.dask.org
- 
nschloe/perfplot ⭐ 1,381 
 📈 Performance analysis for Python snippets
- 
intel/scikit-learn-intelex ⭐ 1,313 
 Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
 🔗 uxlfoundation.github.io/scikit-learn-intelex
- 
markshannon/faster-cpython ⭐ 957 
 How to make CPython faster.
- 
zerointensity/pointers.py ⭐ 937 
 Bringing the hell of pointers to Python.
 🔗 pointers.zintensity.dev
- 
brandtbucher/specialist ⭐ 661 
 Visualize CPython's specializing, adaptive interpreter. 🔥
Memory and CPU/GPU profiling tools and libraries.
- 
bloomberg/memray ⭐ 14,445 
 Memray is a memory profiler for Python
 🔗 bloomberg.github.io/memray
- 
benfred/py-spy ⭐ 14,378 
 Sampling profiler for Python programs
- 
plasma-umass/scalene ⭐ 12,990 
 Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
- 
joerick/pyinstrument ⭐ 7,379 
 🚴 Call stack profiler for Python. Shows you why your code is slow!
 🔗 pyinstrument.readthedocs.io
- 
gaogaotiantian/viztracer ⭐ 7,086 
 A debugging and profiling tool that can trace and visualize python code execution
 🔗 viztracer.readthedocs.io
- 
pythonprofilers/memory_profiler ⭐ 4,532 
 Monitor Memory usage of Python code
 🔗 pypi.python.org/pypi/memory_profiler
- 
pyutils/line_profiler ⭐ 3,117 
 Line-by-line profiling for Python
- 
reloadware/reloadium ⭐ 2,986 
 Hot Reloading and Profiling for Python
- 
jiffyclub/snakeviz ⭐ 2,499 
 An in-browser Python profile viewer
 🔗 jiffyclub.github.io/snakeviz
- 
p403n1x87/austin ⭐ 2,120 
 Python frame stack sampler for CPython
 🔗 pypi.org/project/austin-dist
- 
pythonspeed/filprofiler ⭐ 886 
 A Python memory profiler for data processing and scientific computing applications
 🔗 pythonspeed.com/products/filmemoryprofiler
Security related libraries: vulnerability discovery, SQL injection, environment auditing.
- 
swisskyrepo/PayloadsAllTheThings ⭐ 70,416 
 A list of useful payloads and bypasses for Web Application Security and Pentest/CTF
 🔗 swisskyrepo.github.io/payloadsallthethings
- 
sqlmapproject/sqlmap ⭐ 35,436 
 Automatic SQL injection and database takeover tool
 🔗 sqlmap.org
- 
certbot/certbot ⭐ 32,483 
 Certbot is EFF's tool to obtain certs from Let's Encrypt and (optionally) auto-enable HTTPS on your server. It can also act as a client for any other CA that uses the ACME protocol.
- 
aquasecurity/trivy ⭐ 29,299 
 Find vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
 🔗 trivy.dev
- 
bridgecrewio/checkov ⭐ 8,040 
 Checkov is a static code analysis tool for infrastructure as code (IaC) and also a software composition analysis (SCA) tool for images and open source packages.
 🔗 www.checkov.io
- 
nccgroup/ScoutSuite ⭐ 7,365 
 Multi-Cloud Security Auditing Tool
- 
pycqa/bandit ⭐ 7,356 
 Bandit is a tool designed to find common security issues in Python code.
 🔗 bandit.readthedocs.io
- 
stamparm/maltrail ⭐ 7,212 
 Malicious traffic detection system
- 
microsoft/presidio ⭐ 5,690 
 Context aware, pluggable and customizable PII de-identification service for text and images
 🔗 microsoft.github.io/presidio
- 
rhinosecuritylabs/pacu ⭐ 4,917 
 The AWS exploitation framework, designed for testing the security of Amazon Web Services environments.
 🔗 rhinosecuritylabs.com/aws/pacu-open-source-aws-exploitation-framework
- 
dashingsoft/pyarmor ⭐ 4,643 
 A tool used to obfuscate Python scripts, bind obfuscated scripts to a fixed machine, or expire obfuscated scripts.
 🔗 pyarmor.dashingsoft.com
- 
mozilla/bleach ⭐ 2,718 
 Bleach is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes
 🔗 bleach.readthedocs.io/en/latest
- 
pyupio/safety ⭐ 1,901 
 Safety checks Python dependencies for known security vulnerabilities and suggests the proper remediations for vulnerabilities detected.
 🔗 safetycli.com/product/safety-cli
- 
trailofbits/pip-audit ⭐ 1,113 
 Audits Python environments, requirements files and dependency trees for known security vulnerabilities, and can automatically fix them
 🔗 pypi.org/project/pip-audit
- 
thecyb3ralpha/BobTheSmuggler ⭐ 557 
 A tool that leverages HTML Smuggling Attack and allows you to create HTML files with embedded 7z/zip archives.
Simulation libraries: robotics, economic, agent-based, traffic, physics, astronomy, chemistry, quantum simulation. Also see the Math and Science category for crossover.
- 
genesis-embodied-ai/Genesis ⭐ 27,329 
 Genesis is a physics platform, and generative data engine, designed for general purpose Robotics/Embodied AI/Physical AI applications
 🔗 genesis-world.readthedocs.io
- 
atsushisakai/PythonRobotics ⭐ 25,969 
 Python sample code and textbook for robotics algorithms.
 🔗 atsushisakai.github.io/pythonrobotics
- 
bulletphysics/bullet3 ⭐ 13,803 
 Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.
 🔗 bulletphysics.org
- 
isl-org/Open3D ⭐ 12,827 
 Open3D: A Modern Library for 3D Data Processing
 🔗 www.open3d.org
- 
dlr-rm/stable-baselines3 ⭐ 11,686 
 Stable Baselines3 (SB3) is a set of reliable implementations of reinforcement learning algorithms in PyTorch
 🔗 stable-baselines3.readthedocs.io
- 
nvidia/Cosmos ⭐ 8,059 
 NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their Physical AI systems better and faster.
 🔗 github.com/nvidia-cosmos
- 
qiskit/qiskit ⭐ 6,533 
 Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
 🔗 www.ibm.com/quantum/qiskit
- 
nvidia/warp ⭐ 5,596 
 A Python framework for accelerated simulation, data generation and spatial computing.
 🔗 nvidia.github.io/warp
- 
nvidia-omniverse/IsaacLab ⭐ 5,050 
 Unified framework for robot learning built on NVIDIA Isaac Sim
 🔗 isaac-sim.github.io/isaaclab
- 
astropy/astropy ⭐ 4,862 
 Astronomy and astrophysics core library
 🔗 www.astropy.org
- 
quantumlib/Cirq ⭐ 4,715 
 Python framework for creating, editing, and invoking Noisy Intermediate-Scale Quantum (NISQ) circuits.
 🔗 quantumai.google/cirq
- 
chakazul/Lenia ⭐ 3,683 
 Lenia is a 2D cellular automaton with continuous space, time and states. It produces a huge variety of interesting mathematical life forms
 🔗 chakazul.github.io/lenia/javascript/lenia.html
- 
projectmesa/mesa ⭐ 3,168 
 Mesa is an open-source Python library for agent-based modeling, ideal for simulating complex systems and exploring emergent behaviors.
 🔗 mesa.readthedocs.io
- 
rdkit/rdkit ⭐ 3,096 
 The official sources for the RDKit library
- 
openai/mujoco-py ⭐ 3,073 
 MuJoCo is a physics engine for detailed, efficient rigid body simulations with contacts. mujoco-py allows using MuJoCo from Python 3.
- 
google/brax ⭐ 2,875 
 Massively parallel rigidbody physics simulation on accelerator hardware.
- 
pennylaneai/pennylane ⭐ 2,834 
 PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Built by researchers, for research.
 🔗 pennylane.ai
- 
nvidia-omniverse/IsaacGymEnvs ⭐ 2,673 
 Example RL environments for the NVIDIA Isaac Gym high performance environments
- 
taichi-dev/difftaichi ⭐ 2,668 
 10 differentiable physical simulators built with Taichi differentiable programming (DiffTaichi, ICLR 2020)
- 
facebookresearch/habitat-lab ⭐ 2,585 
 A modular high-level library to train embodied AI agents across a variety of tasks and environments.
 🔗 aihabitat.org
- 
dlr-rm/rl-baselines3-zoo ⭐ 2,568 
 A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.
 🔗 rl-baselines3-zoo.readthedocs.io
- 
quantecon/QuantEcon.py ⭐ 2,204 
 A community based Python library for quantitative economics
 🔗 quantecon.org/quantecon-py
- 
tencent-hunyuan/Hunyuan3D-2.1 ⭐ 2,201 
 Tencent Hunyuan3D-2.1 is a scalable 3D asset creation system that advances state-of-the-art 3D generation
 🔗 3d.hunyuan.tencent.com
- 
microsoft/PromptCraft-Robotics ⭐ 2,056 
 Community for applying LLMs to robotics and a robot simulator with ChatGPT integration
 🔗 aka.ms/chatgpt-robotics
- 
eloialonso/diamond ⭐ 1,871 
 DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model
 🔗 diamond-wm.github.io
- 
deepmodeling/deepmd-kit ⭐ 1,762 
 A deep learning package for many-body potential energy representation and molecular dynamics
 🔗 docs.deepmodeling.com/projects/deepmd
- 
isaac-sim/IsaacSim ⭐ 1,523 
 NVIDIA Isaac Sim is a simulation platform built on NVIDIA Omniverse, designed to develop, test, train, and deploy AI-powered robots in realistic virtual environments.
 🔗 developer.nvidia.com/isaac/sim
- 
bowang-lab/scGPT ⭐ 1,329 
 scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI
 🔗 scgpt.readthedocs.io/en/latest
- 
sail-sg/envpool ⭐ 1,196 
 C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
 🔗 envpool.readthedocs.io
- 
altera-al/project-sid ⭐ 1,138 
 Project Sid: Many-agent simulations toward AI civilization technical report
- 
a-r-j/graphein ⭐ 1,133 
 Protein Graph Library
 🔗 graphein.ai
- 
polymathicai/the_well ⭐ 1,124 
 15TB of Physics Simulations: collection of machine learning datasets containing numerical simulations of a wide variety of spatiotemporal physical systems.
 🔗 polymathic-ai.org/the_well
- 
google-deepmind/materials_discovery ⭐ 1,047 
 Graph Networks for Materials Science (GNoME) is a project centered around scaling machine learning methods to tackle materials science.
- 
viblo/pymunk ⭐ 1,020 
 Pymunk is an easy-to-use pythonic 2D physics library that can be used whenever you need 2D rigid body physics from Python
 🔗 www.pymunk.org
- 
nvidia-omniverse/OmniIsaacGymEnvs ⭐ 1,006 
 Reinforcement Learning Environments for Omniverse Isaac Gym
- 
google/evojax ⭐ 917 
 EvoJAX is a scalable, general purpose, hardware-accelerated neuroevolution toolkit built on the JAX library
- 
eureka-research/DrEureka ⭐ 902 
 Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)
 🔗 eureka-research.github.io/dr-eureka
- 
facebookresearch/fairo ⭐ 894 
 A modular embodied agent architecture and platform for building embodied agents
- 
ur-whitelab/chemcrow-public ⭐ 815 
 Chemcrow
- 
araffin/sbx ⭐ 507 
 SBX: Stable Baselines Jax (SB3 + Jax) RL algorithms
- 
sakanaai/asal ⭐ 429 
 Automating the Search for Artificial Life with Foundation Models!
- 
arshka/PhysiX ⭐ 103 
 A Foundation Model for physics simulations
- 
ur-whitelab/chemcrow-runs ⭐ 92 
 ur-whitelab/chemcrow-runs
Miscellaneous study resources: algorithms, general resources, system design, code repos for textbooks, best practices, tutorials.
- 
thealgorithms/Python ⭐ 209,909 
 All Algorithms implemented in Python
 🔗 thealgorithms.github.io/python
- 
microsoft/generative-ai-for-beginners ⭐ 99,714 
 Learn the fundamentals of building Generative AI applications with our 21-lesson comprehensive course by Microsoft Cloud Advocates.
- 
rasbt/LLMs-from-scratch ⭐ 74,440 
 Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
 🔗 amzn.to/4fqvn0d
- 
mlabonne/llm-course ⭐ 64,359 
 Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
 🔗 mlabonne.github.io/blog
- 
labmlai/annotated_deep_learning_paper_implementations ⭐ 63,418 
 🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠
 🔗 nn.labml.ai
- 
jakevdp/PythonDataScienceHandbook ⭐ 45,657 
 Python Data Science Handbook: full text in Jupyter Notebooks
 🔗 jakevdp.github.io/pythondatasciencehandbook
- 
realpython/python-guide ⭐ 29,264 
 Python best practices guidebook, written for humans.
 🔗 docs.python-guide.org
- 
d2l-ai/d2l-en ⭐ 27,033 
 Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.
 🔗 d2l.ai
- 
christoschristofidis/awesome-deep-learning ⭐ 26,274 
 A curated list of awesome Deep Learning tutorials, projects and communities.
- 
hannibal046/Awesome-LLM ⭐ 25,191 
 Awesome-LLM: a curated list of Large Language Models
- 
wesm/pydata-book ⭐ 23,761 
 Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
- 
huggingface/agents-course ⭐ 22,815 
 This repository contains the Hugging Face Agents Course.
- 
microsoft/recommenders ⭐ 20,965 
 Best Practices on Recommendation Systems
 🔗 recommenders-team.github.io/recommenders/intro.html
- 
fchollet/deep-learning-with-python-notebooks ⭐ 19,580 
 Jupyter notebooks for the code samples of the book "Deep Learning with Python"
- 
karpathy/nn-zero-to-hero ⭐ 17,816 
 Neural Networks: Zero to Hero
- 
handsonllm/Hands-On-Large-Language-Models ⭐ 16,168 
 Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
 🔗 www.llm-book.com
- 
mrdbourke/pytorch-deep-learning ⭐ 15,828 
 Materials for the Learn PyTorch for Deep Learning: Zero to Mastery course.
 🔗 learnpytorch.io
- 
naklecha/llama3-from-scratch ⭐ 15,162 
 llama3 implementation one matrix multiplication at a time
- 
graykode/nlp-tutorial ⭐ 14,754 
 Natural Language Processing Tutorial for Deep Learning Researchers
 🔗 www.reddit.com/r/machinelearning/comments/amfinl/project_nlptutoral_repository_who_is_studying
- 
shangtongzhang/reinforcement-learning-an-introduction ⭐ 14,325 
 Python Implementation of Reinforcement Learning: An Introduction
- 
zhanymkanov/fastapi-best-practices ⭐ 13,670 
 FastAPI Best Practices and Conventions we used at our startup
- 
nirdiamant/agents-towards-production ⭐ 13,466 
 The open-source playbook for turning AI agents into real-world products.
- 
karpathy/micrograd ⭐ 12,859 
 A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
- 
eugeneyan/open-llms ⭐ 12,425 
 📋 A list of open LLMs available for commercial use.
- 
rucaibox/LLMSurvey ⭐ 11,853 
 The official GitHub page for the survey paper "A Survey of Large Language Models".
 🔗 arxiv.org/abs/2303.18223
- 
srush/GPU-Puzzles ⭐ 11,516 
 Teaching beginner GPU programming in a completely interactive fashion
- 
nielsrogge/Transformers-Tutorials ⭐ 11,262 
 This repository contains demos I made with the Transformers library by HuggingFace.
- 
openai/spinningup ⭐ 11,261 
 An educational resource to help anyone learn deep reinforcement learning.
 🔗 spinningup.openai.com
- 
mooler0410/LLMsPracticalGuide ⭐ 10,054 
 A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)
 🔗 arxiv.org/abs/2304.13712v2
- 
chiphuyen/aie-book ⭐ 9,911 
 Code for AI Engineering: Building Applications with Foundation Models (Chip Huyen 2025)
- 
roboflow/notebooks ⭐ 8,532 
 A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM 2, Florence-2, PaliGemma 2, and Qwen2.5VL.
 🔗 roboflow.com/models
- 
udlbook/udlbook ⭐ 8,247 
 Understanding Deep Learning - Simon J.D. Prince
- 
firmai/industry-machine-learning ⭐ 7,395 
 A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
 🔗 www.sov.ai
- 
gkamradt/langchain-tutorials ⭐ 7,257 
 Overview and tutorial of the LangChain Library
- 
alirezadir/Machine-Learning-Interviews ⭐ 6,960 
 This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
- 
huggingface/smol-course ⭐ 6,421 
 A practical course on aligning language models for your specific use case. It's a handy way to get started with aligning language models, because everything runs on most local machines.
- 
neetcode-gh/leetcode ⭐ 6,156 
 Leetcode solutions for NeetCode.io
- 
mrdbourke/tensorflow-deep-learning ⭐ 5,748 
 All course materials for the Zero to Mastery Deep Learning with TensorFlow course.
 🔗 dbourke.link/ztmtfcourse
- 
udacity/deep-learning-v2-pytorch ⭐ 5,431 
 Projects and exercises for the latest Deep Learning ND program https://www.udacity.com/course/deep-learning-nanodegree--nd101
- 
promptslab/Awesome-Prompt-Engineering ⭐ 4,972 
 This repository contains hand-curated resources for Prompt Engineering with a focus on Generative Pre-trained Transformer (GPT), ChatGPT, PaLM etc
 🔗 discord.gg/m88xfymbk6
- 
timofurrer/awesome-asyncio ⭐ 4,897 
 A curated list of awesome Python asyncio frameworks, libraries, software and resources
- 
rasbt/machine-learning-book ⭐ 4,677 
 Code Repository for Machine Learning with PyTorch and Scikit-Learn
 🔗 sebastianraschka.com/books/#machine-learning-with-pytorch-and-scikit-learn
- 
huggingface/deep-rl-class ⭐ 4,544 
 This repo contains the Hugging Face Deep Reinforcement Learning Course.
- 
zotroneneis/machine_learning_basics ⭐ 4,402 
 Plain python implementations of basic machine learning algorithms
- 
huggingface/diffusion-models-class ⭐ 4,147 
 Materials for the Hugging Face Diffusion Models Course
- 
amanchadha/coursera-deep-learning-specialization ⭐ 3,990 
 Notes, programming assignments and quizzes from all courses within the Coursera Deep Learning specialization offered by deeplearning.ai: (i) Neural Networks and Deep Learning; (ii) Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization; (iii) Structuring Machine Learning Projects; (iv...
- 
engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies ⭐ 3,812 
 Curated collection of 300+ case studies from over 80 companies, detailing practical applications and insights into machine learning (ML) system design
- 
fluentpython/example-code-2e ⭐ 3,768 
 Example code for Fluent Python, 2nd edition (O'Reilly 2022)
 🔗 amzn.to/3j48u2j
- 
cosmicpython/book ⭐ 3,622 
 A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.
 🔗 www.cosmicpython.com
- 
mrdbourke/zero-to-mastery-ml ⭐ 3,472 
 All course materials for the Zero to Mastery Machine Learning and Data Science course.
 🔗 dbourke.link/ztmmlcourse
- 
krzjoa/awesome-python-data-science ⭐ 3,126 
 Probably the best curated list of data science software in Python.
 🔗 krzjoa.github.io/awesome-python-data-science
- 
gerdm/prml ⭐ 2,444 
 Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop
- 
huggingface/cookbook ⭐ 2,274 
 Community-driven practical examples of building AI applications and solving various tasks with AI using open-source tools and models.
 🔗 huggingface.co/learn/cookbook
- 
cgpotts/cs224u ⭐ 2,158 
 Code for CS224u: Natural Language Understanding
- 
cerlymarco/MEDIUM_NoteBook ⭐ 2,125 
 Repository containing notebooks of my posts on Medium
- 
trananhkma/fucking-awesome-python ⭐ 2,038 
 awesome-python with ⭐ and 🍴
- 
aburkov/theLMbook ⭐ 1,929 
 Code for Hundred-Page Language Models Book by Andriy Burkov
 🔗 www.thelmbook.com
- 
chandlerbang/awesome-self-supervised-gnn ⭐ 1,687 
 Papers about pretraining and self-supervised learning on Graph Neural Networks (GNN).
- 
huggingface/evaluation-guidebook ⭐ 1,652 
 Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
- 
atcold/NYU-DLSP21 ⭐ 1,643 
 NYU Deep Learning Spring 2021
 🔗 atcold.github.io/nyu-dlsp21
- 
patrickloeber/MLfromscratch ⭐ 1,523 
 Machine Learning algorithm implementations from scratch.
- 
davidadsp/Generative_Deep_Learning_2nd_Edition ⭐ 1,379 
 The official code repository for the second edition of the O'Reilly book Generative Deep Learning: Teaching Machines to Paint, Write, Compose and Play.
 🔗 www.oreilly.com/library/view/generative-deep-learning/9781098134174
- 
rasbt/LLM-workshop-2024 ⭐ 1,034 
 A 4-hour coding workshop to understand how LLMs are implemented and used
- 
jackhidary/quantumcomputingbook ⭐ 883 
 Companion site for the textbook Quantum Computing: An Applied Approach
- 
rasbt/MachineLearning-QandAI-book ⭐ 659 
 Machine Learning Q and AI book
 🔗 www.amazon.com/machine-learning-ai-essential-questions/dp/1718503768
- 
bayesianmodelingandcomputationinpython/BookCode_Edition1 ⭐ 548 
 Bayesian Modeling and Computation in Python: open-access version of the text and the code examples in the book
 🔗 www.bayesiancomputationbook.com
- 
rwitten/HighPerfLLMs2024 ⭐ 538 
 Build a full scale, high-performance LLM from scratch in Jax! We cover training and inference, roofline analysis, compilation, sharding, profiling and more.
- 
towardsai/ragbook-notebooks ⭐ 507 
 Repository for the "Building LLMs for Production" book by Towards AI.
 🔗 academy.towardsai.net/courses/buildingllmsforproduction
- 
dylanhogg/awesome-python ⭐ 407 
 🐍 Hand-picked awesome Python libraries and frameworks, organised by category
 🔗 www.awesomepython.org
Template tools and libraries: cookiecutter repos, generators, quick-starts.
- 
tiangolo/full-stack-fastapi-template ⭐ 38,133 
 Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more.
- 
cookiecutter/cookiecutter ⭐ 24,140 
 A cross-platform command-line utility that creates projects from cookiecutters (project templates), e.g. Python package projects, C projects.
 🔗 pypi.org/project/cookiecutter
- 
drivendata/cookiecutter-data-science ⭐ 9,325 
 A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.
 🔗 cookiecutter-data-science.drivendata.org
- 
buuntu/fastapi-react ⭐ 2,450 
 🚀 Cookiecutter Template for FastAPI + React Projects. Using PostgreSQL, SQLAlchemy, and Docker
- 
pyscaffold/pyscaffold ⭐ 2,231 
 🛠 Python project template generator with batteries included
 🔗 pyscaffold.org
- 
cjolowicz/cookiecutter-hypermodern-python ⭐ 1,885 
 Cookiecutter template for a Python package based on the Hypermodern Python article series.
 🔗 cookiecutter-hypermodern-python.readthedocs.io
- 
fmind/mlops-python-package ⭐ 1,353 
 Best practices designed to support your MLOps initiatives. You can use this package as part of your MLOps toolkit or platform, e.g. Model Registry, Experiment Tracking, Realtime Inference
 🔗 fmind.github.io/mlops-python-package
- 
tezromach/python-package-template ⭐ 1,093 
 🚀 Your next Python package needs a bleeding-edge project structure.
- 
fpgmaas/cookiecutter-uv ⭐ 1,069 
 A modern cookiecutter template for Python projects that use uv for dependency management
 🔗 fpgmaas.github.io/cookiecutter-uv
- 
martinheinz/python-project-blueprint ⭐ 969 
 Blueprint/Boilerplate For Python Projects
- 
callmesora/llmops-python-package ⭐ 881 
 Best practices designed to support your LLMOps initiatives. You can use this package as part of your LLMOps toolkit or platform e.g. Model Registry, Experiment Tracking, Realtime Inference
Terminal and console tools and libraries: CLI tools, terminal based formatters, progress bars.
- 
willmcgugan/rich ⭐ 53,945 
 Rich is a Python library for rich text and beautiful formatting in the terminal.
 🔗 rich.readthedocs.io/en/latest
- 
aider-ai/aider ⭐ 37,773 
 Aider lets you pair program with LLMs, to edit code in your local git repository
 🔗 aider.chat
- 
anthropics/claude-code ⭐ 35,488 
 Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows
 🔗 docs.anthropic.com/s/claude-code
- 
willmcgugan/textual ⭐ 31,279 
 The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
 🔗 textual.textualize.io
- 
tqdm/tqdm ⭐ 30,492 
 ⚡ A Fast, Extensible Progress Bar for Python and CLI
 🔗 tqdm.github.io
- 
google/python-fire ⭐ 27,911 
 Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
- 
tiangolo/typer ⭐ 18,003 
 Typer, build great CLIs. Easy to code. Based on Python type hints.
 🔗 typer.tiangolo.com
- 
pallets/click ⭐ 16,878 
 Python composable command line interface toolkit
 🔗 click.palletsprojects.com
- 
prompt-toolkit/python-prompt-toolkit ⭐ 9,981 
 Library for building powerful interactive command line applications in Python
 🔗 python-prompt-toolkit.readthedocs.io
- 
simonw/llm ⭐ 9,842 
 A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.
 🔗 llm.datasette.io
- 
saulpw/visidata ⭐ 8,479 
 A terminal spreadsheet multitool for discovering and arranging data
 🔗 visidata.org
- 
xxh/xxh ⭐ 5,767 
 🚀 Bring your favorite shell wherever you go through ssh. Xonsh shell, fish, zsh, osquery and so on.
- 
tconbeer/harlequin ⭐ 5,089 
 The SQL IDE for Your Terminal.
 🔗 harlequin.sh
- 
manrajgrover/halo ⭐ 2,969 
 💫 Beautiful spinners for terminal, IPython and Jupyter
- 
urwid/urwid ⭐ 2,951 
 Console user interface library for Python (official repo)
 🔗 urwid.org
- 
textualize/trogon ⭐ 2,712 
 Easily turn your Click CLI into a powerful terminal application
- 
darrenburns/elia ⭐ 2,306 
 A snappy, keyboard-centric terminal user interface for interacting with large language models. Chat with ChatGPT, Claude, Llama 3, Phi 3, Mistral, Gemma and more.
- 
tmbo/questionary ⭐ 1,888 
 Python library to build pretty command line user prompts ✨ Easy to use multi-select lists, confirmations, free text prompts ...
- 
jazzband/prettytable ⭐ 1,549 
 Display tabular data in a visually appealing ASCII table format
 🔗 pypi.org/project/prettytable
- 
shobrook/wut ⭐ 1,386 
 Just type wut and an LLM will help you understand whatever's in your terminal. You'll be surprised how useful this can be.
- 
1j01/textual-paint ⭐ 1,060 
 🎨 MS Paint in your terminal.
 🔗 pypi.org/project/textual-paint
Testing libraries: unit testing, load testing, acceptance testing, code coverage, browser automation, plugins.
- 
mitmproxy/mitmproxy ⭐ 40,721 
 An interactive TLS-capable intercepting HTTP proxy for penetration testers and software developers.
 🔗 mitmproxy.org
- 
locustio/locust ⭐ 26,877 
 Write scalable load tests in plain Python 🚗💨
 🔗 locust.cloud
- 
microsoft/playwright-python ⭐ 13,756 
 Playwright is a Python library to automate Chromium, Firefox and WebKit browsers with a single API.
 🔗 playwright.dev/python
- 
pytest-dev/pytest ⭐ 13,131 
 The pytest framework makes it easy to write small tests, yet scales to support complex functional testing
 🔗 pytest.org
- 
seleniumbase/SeleniumBase ⭐ 11,685 
 Python APIs for web automation, testing, and bypassing bot-detection.
 🔗 seleniumbase.io
- 
confident-ai/deepeval ⭐ 11,440 
 LLM evaluation framework similar to Pytest but specialized for unit testing LLM outputs. DeepEval incorporates the latest research to evaluate LLM outputs based on metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc
 🔗 deepeval.com
- 
robotframework/robotframework ⭐ 11,119 
 Generic automation framework for acceptance testing and RPA
 🔗 robotframework.org
- 
hypothesisworks/hypothesis ⭐ 8,096 
 The property-based testing library for Python
 🔗 hypothesis.works
- 
getmoto/moto ⭐ 8,046 
 A library that allows you to easily mock out tests based on AWS infrastructure.
 🔗 docs.getmoto.org/en/latest
- 
newsapps/beeswithmachineguns ⭐ 6,603 
 A utility for arming (creating) many bees (micro EC2 instances) to attack (load test) targets (web applications).
 🔗 apps.chicagotribune.com
- 
codium-ai/qodo-cover ⭐ 5,180 
 Qodo-Cover: An AI-Powered Tool for Automated Test Generation and Code Coverage Enhancement! 💻🤖🧪🐞
 🔗 qodo.ai
- 
spulec/freezegun ⭐ 4,431 
 Let your Python tests travel through time
- 
getsentry/responses ⭐ 4,292 
 A utility for mocking out the Python Requests library.
- 
tox-dev/tox ⭐ 3,841 
 Command line driven CI frontend and development task automation tool.
 🔗 tox.wiki
- 
behave/behave ⭐ 3,373 
 BDD, Python style.
 🔗 behave.readthedocs.io/en/latest
- 
nedbat/coveragepy ⭐ 3,252 
 The code coverage tool for Python
 🔗 coverage.readthedocs.io
- 
kevin1024/vcrpy ⭐ 2,861 
 Automatically mock your HTTP interactions to simplify and speed up testing
- 
cobrateam/splinter ⭐ 2,753 
 splinter - python test framework for web applications
 🔗 splinter.readthedocs.org/en/stable/index.html
- 
pytest-dev/pytest-testinfra ⭐ 2,440 
 With Testinfra you can write unit tests in Python to test actual state of your servers configured by management tools like Salt, Ansible, Puppet, Chef and so on.
 🔗 testinfra.readthedocs.io
- 
pytest-dev/pytest-mock ⭐ 1,982 
 Thin-wrapper around the mock package for easier use with pytest
 🔗 pytest-mock.readthedocs.io/en/latest
- 
pytest-dev/pytest-cov ⭐ 1,951 
 Coverage plugin for pytest.
- 
pytest-dev/pytest-xdist ⭐ 1,710 
 pytest plugin for distributed testing and loop-on-failures testing modes.
 🔗 pytest-xdist.readthedocs.io
- 
pytest-dev/pytest-asyncio ⭐ 1,568 
 Asyncio support for pytest
 🔗 pytest-asyncio.readthedocs.io
- 
taverntesting/tavern ⭐ 1,114 
 A command-line tool, Python library and Pytest plugin for automated testing of RESTful APIs, with a simple, concise and flexible YAML-based syntax
 🔗 taverntesting.github.io
Machine learning and classical timeseries libraries: forecasting, seasonality, anomaly detection, econometrics.
- 
facebook/prophet ⭐ 19,654 
 Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
 🔗 facebook.github.io/prophet
- 
sktime/sktime ⭐ 9,281 
 A unified framework for machine learning with time series
 🔗 www.sktime.net
- 
blue-yonder/tsfresh ⭐ 8,973 
 Automatic extraction of relevant features from time series
 🔗 tsfresh.readthedocs.io
- 
unit8co/darts ⭐ 8,932 
 A python library for user-friendly forecasting and anomaly detection on time series.
 🔗 unit8co.github.io/darts
- 
google-research/timesfm ⭐ 6,559 
 TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
 🔗 research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting
- 
facebookresearch/Kats ⭐ 6,204 
 Kats is a kit to analyze time series data: a lightweight, easy-to-use, generalizable, and extendable framework for time series analysis, from understanding the key statistics and characteristics, to detecting change points and anomalies, to forecasting future trends.
- 
awslabs/gluonts ⭐ 5,020 
 Probabilistic time series modeling in Python
 🔗 ts.gluon.ai
- 
nixtla/statsforecast ⭐ 4,528 
 Lightning ⚡️ fast forecasting with statistical and econometric models.
 🔗 nixtlaverse.nixtla.io/statsforecast
- 
salesforce/Merlion ⭐ 4,365 
 Merlion: A Machine Learning Framework for Time Series Intelligence
- 
tdameritrade/stumpy ⭐ 3,998 
 STUMPY is a powerful and scalable Python library for modern time series analysis
 🔗 stumpy.readthedocs.io/en/latest
- 
amazon-science/chronos-forecasting ⭐ 3,697 
 Chronos: Pretrained Models for Probabilistic Time Series Forecasting
 🔗 arxiv.org/abs/2403.07815
- 
aistream-peelout/flow-forecast ⭐ 2,235 
 Deep learning PyTorch library for time series forecasting, classification, and anomaly detection (originally for flood forecasting).
 🔗 flow-forecast.atlassian.net/wiki/spaces/ff/overview
- 
yuqinie98/PatchTST ⭐ 2,193 
 An official implementation of PatchTST: A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
- 
rjt1990/pyflux ⭐ 2,135 
 Open source time series library for Python
- 
uber/orbit ⭐ 2,000 
 A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
 🔗 orbit-ml.readthedocs.io/en/stable
- 
alkaline-ml/pmdarima ⭐ 1,677 
 A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
 🔗 www.alkaline-ml.com/pmdarima
- 
time-series-foundation-models/lag-llama ⭐ 1,493 
 Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting
- 
winedarksea/AutoTS ⭐ 1,323 
 Automated Time Series Forecasting
- 
ngruver/llmtime ⭐ 801 
 LLMTime, a method for zero-shot time series forecasting with large language models (LLMs) by encoding numbers as text and sampling possible extrapolations as text completions
 🔗 arxiv.org/abs/2310.07820
- 
autoviml/Auto_TS ⭐ 760 
 Automatically build ARIMA, SARIMAX, VAR, FB Prophet and XGBoost Models on Time Series data sets with a Single Line of Code. Created by Ram Seshadri. Collaborators welcome.
- 
google/temporian ⭐ 704 
 Temporian is an open-source Python library for preprocessing ⚡ and feature engineering 🛠 temporal data 📈 for machine learning applications 🤖
 🔗 temporian.readthedocs.io
- 
microsoft/robustlearn ⭐ 501 
 Robust machine learning for responsible AI
 🔗 aka.ms/roblearn
Typing libraries: static and run-time type checking, annotations.
- 
python/mypy ⭐ 19,849 
 Optional static typing for Python
 🔗 www.mypy-lang.org
- 
microsoft/pyright ⭐ 14,850 
 Static Type Checker for Python
- 
facebook/pyre-check ⭐ 7,093 
 Performant type-checking for python.
 🔗 pyre-check.org
- 
python-attrs/attrs ⭐ 5,623 
 Python Classes Without Boilerplate
 🔗 www.attrs.org
- 
google/pytype ⭐ 5,003 
 A static type analyzer for Python code
 🔗 google.github.io/pytype
- 
instagram/MonkeyType ⭐ 4,942 
 A Python library that generates static type annotations by collecting runtime types
- 
python/typeshed ⭐ 4,865 
 Collection of library stubs for Python, with static types
- 
facebook/pyrefly ⭐ 3,784 
 A fast type checker and IDE for Python. (A new version of Pyre)
 🔗 pyrefly.org
- 
koxudaxi/datamodel-code-generator ⭐ 3,502 
 Pydantic model and dataclasses.dataclass generator for easy conversion of JSON, OpenAPI, JSON Schema, and YAML data sources.
 🔗 koxudaxi.github.io/datamodel-code-generator
- 
mtshiba/pylyzer ⭐ 2,866 
 A fast, feature-rich static code analyzer & language server for Python
 🔗 mtshiba.github.io/pylyzer
- 
microsoft/pylance-release ⭐ 1,912 
 Fast, feature-rich language support for Python. Documentation and issues for Pylance.
- 
agronholm/typeguard ⭐ 1,705 
 Run-time type checker for Python
- 
patrick-kidger/torchtyping ⭐ 1,447 
 Type annotations and dynamic checking for a tensor's shape, dtype, names, etc.
- 
python/typing_extensions ⭐ 530 
 Backported and experimental type hints for Python
- 
robertcraigie/pyright-python ⭐ 246 
 Python command line wrapper for pyright, a static type checker
 🔗 pypi.org/project/pyright
General utility libraries: miscellaneous tools, linters, code formatters, version management, package tools, documentation tools.
- 
yt-dlp/yt-dlp ⭐ 129,580 
 A feature-rich command-line audio/video downloader
 🔗 discord.gg/h5mncfw63r
- 
home-assistant/core ⭐ 81,762 
 🏡 Open source home automation that puts local control and privacy first.
 🔗 www.home-assistant.io
- 
abi/screenshot-to-code ⭐ 70,940 
 Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
 🔗 screenshottocode.com
- 
python/cpython ⭐ 69,134 
 The Python programming language
 🔗 www.python.org
- 
localstack/localstack ⭐ 60,716 
 💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline
 🔗 localstack.cloud
- 
ggerganov/whisper.cpp ⭐ 43,635 
 Port of OpenAI's Whisper model in C/C++
- 
faif/python-patterns ⭐ 42,215 
 A collection of design patterns/idioms in Python
- 
mingrammer/diagrams ⭐ 41,562 
 🎨 Diagram as Code for prototyping cloud system architectures
 🔗 diagrams.mingrammer.com
- 
openai/openai-python ⭐ 28,817 
 The official Python library for the OpenAI API
 🔗 pypi.org/project/openai
- 
blakeblackshear/frigate ⭐ 26,248 
 NVR with realtime local object detection for IP cameras
 🔗 frigate.video
- 
pydantic/pydantic ⭐ 25,366 
 Data validation using Python type hints
 🔗 docs.pydantic.dev
- 
keon/algorithms ⭐ 24,769 
 Minimal examples of data structures and algorithms in Python
- 
squidfunk/mkdocs-material ⭐ 24,699 
 Documentation that simply works
 🔗 squidfunk.github.io/mkdocs-material
- 
norvig/pytudes ⭐ 24,032 
 Python programs, usually short, of considerable difficulty, to perfect particular skills.
- 
delgan/loguru ⭐ 22,823 
 Python logging made (stupidly) simple
 🔗 loguru.readthedocs.io
- 
facebookresearch/audiocraft ⭐ 22,512 
 Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
- 
chriskiehl/Gooey ⭐ 21,591 
 Turn (almost) any Python command line program into a full GUI application with one line
- 
mkdocs/mkdocs ⭐ 21,127 
 Project documentation with Markdown.
 🔗 www.mkdocs.org
- 
micropython/micropython ⭐ 20,900 
 MicroPython - a lean and efficient Python implementation for microcontrollers and constrained systems
 🔗 micropython.org
- 
rustpython/RustPython ⭐ 20,571 
 A Python Interpreter written in Rust
 🔗 rustpython.github.io
- 
higherorderco/Bend ⭐ 19,024 
 A massively parallel, high-level programming language
 🔗 higherorderco.com
- 
kivy/kivy ⭐ 18,653 
 Open source UI framework written in Python, running on Windows, Linux, macOS, Android and iOS
 🔗 kivy.org
- 
openai/triton ⭐ 17,126 
 Development repository for the Triton language and compiler
 🔗 triton-lang.org
- 
ipython/ipython ⭐ 16,580 
 Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
 🔗 ipython.readthedocs.org
- 
alievk/avatarify-python ⭐ 16,521 
 Avatars for Zoom, Skype and other video-conferencing apps.
- 
caronc/apprise ⭐ 14,578 
 Apprise - Push Notifications that work with just about every platform!
 🔗 hub.docker.com/r/caronc/apprise
- 
comet-ml/opik ⭐ 14,516 
 Opik is an open-source platform for evaluating, testing and monitoring LLM applications.
 🔗 www.comet.com/docs/opik
- 
pyo3/pyo3 ⭐ 14,501 
 Rust bindings for the Python interpreter
 🔗 pyo3.rs
- 
google/brotli ⭐ 14,325 
 Brotli is a generic-purpose lossless compression algorithm that compresses data using a combination of a modern variant of the LZ77 algorithm, Huffman coding and 2nd order context modeling
- 
zulko/moviepy ⭐ 13,963 
 Video editing with Python
 🔗 zulko.github.io/moviepy
- 
nuitka/Nuitka ⭐ 13,912 
 Nuitka is a Python compiler written in Python. It's fully compatible with Python 2.6, 2.7, 3.4-3.13. You feed it your Python app, it does a lot of clever things, and spits out an executable or extension module.
 🔗 nuitka.net
- 
pyodide/pyodide ⭐ 13,750 
 Pyodide is a Python distribution for the browser and Node.js based on WebAssembly
 🔗 pyodide.org/en/stable
- 
python-pillow/Pillow ⭐ 13,106 
 The Python Imaging Library adds image processing capabilities to Python (Pillow is the friendly PIL fork)
 🔗 python-pillow.github.io
- 
pytube/pytube ⭐ 12,956 
 A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
 🔗 pytube.io
- 
ninja-build/ninja ⭐ 12,302 
 Ninja is a small build system with a focus on speed.
 🔗 ninja-build.org
- 
dbader/schedule ⭐ 12,154 
 Python job scheduling for humans.
 🔗 schedule.readthedocs.io
- 
asweigart/pyautogui ⭐ 11,929 
 A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
- 
secdev/scapy ⭐ 11,738 
 Scapy: the Python-based interactive packet manipulation program & library.
 🔗 scapy.net
- 
magicstack/uvloop ⭐ 11,304 
 Ultra fast asyncio event loop.
- 
pallets/jinja ⭐ 11,178 
 A very fast and expressive template engine.
 🔗 jinja.palletsprojects.com
- 
aristocratos/bpytop ⭐ 10,769 
 Linux/OSX/FreeBSD resource monitor
- 
cython/cython ⭐ 10,320 
 The most widely used Python to C compiler
 🔗 cython.org
- 
facebookresearch/hydra ⭐ 9,794 
 Hydra is a framework for elegantly configuring complex applications
 🔗 hydra.cc
- 
boto/boto3 ⭐ 9,545 
 Boto3, an AWS SDK for Python
 🔗 aws.amazon.com/sdk-for-python
- 
paramiko/paramiko ⭐ 9,512 
 The leading native Python SSHv2 protocol library.
 🔗 paramiko.org
- 
aws/serverless-application-model ⭐ 9,499 
 The AWS Serverless Application Model (AWS SAM) transform is an AWS CloudFormation macro that transforms SAM templates into CloudFormation templates.
 🔗 aws.amazon.com/serverless/sam
- 
py-pdf/pypdf ⭐ 9,437 
 A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
 🔗 pypdf.readthedocs.io/en/latest
- 
xonsh/xonsh ⭐ 8,991 
 🐚 Python-powered shell. Full-featured and cross-platform.
 🔗 xon.sh
- 
icloud-photos-downloader/icloud_photos_downloader ⭐ 8,928 
 A command-line tool to download photos from iCloud
- 
arrow-py/arrow ⭐ 8,928 
 🏹 Better dates & times for Python
 🔗 arrow.readthedocs.io
- 
eternnoir/pyTelegramBotAPI ⭐ 8,565 
 Python Telegram Bot API.
- 
googleapis/google-api-python-client ⭐ 8,498 
 🐍 The official Python client library for Google's discovery based APIs.
 🔗 googleapis.github.io/google-api-python-client/docs
- 
theskumar/python-dotenv ⭐ 8,401 
 Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
 🔗 saurabh-kumar.com/python-dotenv
- 
jasonppy/VoiceCraft ⭐ 8,396 
 Zero-Shot Speech Editing and Text-to-Speech in the Wild
- 
kellyjonbrazil/jc ⭐ 8,387 
 CLI tool and python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.
- 
jd/tenacity ⭐ 7,910 
 Retrying library for Python
 🔗 tenacity.readthedocs.io
- 
googlecloudplatform/python-docs-samples ⭐ 7,846 
 Code samples used on cloud.google.com
- 
timdettmers/bitsandbytes ⭐ 7,627 
 Accessible large language models via k-bit quantization for PyTorch.
 🔗 huggingface.co/docs/bitsandbytes/main/en/index
- 
google/latexify_py ⭐ 7,568 
 A library to generate LaTeX expression from Python code.
- 
pygithub/PyGithub ⭐ 7,548 
 Typed interactions with the GitHub API v3
 🔗 pygithub.readthedocs.io
- 
ijl/orjson ⭐ 7,445 
 Fast, correct Python JSON library supporting dataclasses, datetimes, and numpy
- 
sphinx-doc/sphinx ⭐ 7,408 
 The Sphinx documentation generator
 🔗 www.sphinx-doc.org
- 
bndr/pipreqs ⭐ 7,343 
 pipreqs - Generate pip requirements.txt file based on imports of any project. Looking for maintainers to move this project forward.
- 
pyca/cryptography ⭐ 7,246 
 cryptography is a package designed to expose cryptographic primitives and recipes to Python developers.
 🔗 cryptography.io
- 
marshmallow-code/marshmallow ⭐ 7,190 
 A lightweight library for converting complex objects to and from simple Python datatypes.
 🔗 marshmallow.readthedocs.io
- 
gorakhargosh/watchdog ⭐ 7,106 
 Python library and shell utilities to monitor filesystem events.
 🔗 packages.python.org/watchdog
- 
agronholm/apscheduler ⭐ 7,017 
 Task scheduling library for Python
- 
hugapi/hug ⭐ 6,896 
 Embrace the APIs of the future. Hug aims to make developing APIs as simple as possible, but no simpler.
- 
openai/point-e ⭐ 6,792 
 Point cloud diffusion for 3D model synthesis
- 
pdfminer/pdfminer.six ⭐ 6,741 
 Community maintained fork of pdfminer - we fathom PDF
 🔗 pdfminersix.readthedocs.io
- 
sdispater/pendulum ⭐ 6,558 
 Python datetimes made easy
 🔗 pendulum.eustace.io
- 
traceloop/openllmetry ⭐ 6,451 
 Open-source observability for your GenAI or LLM application, based on OpenTelemetry
 🔗 www.traceloop.com/openllmetry
- 
scikit-image/scikit-image ⭐ 6,356 
 Image processing in Python
 🔗 scikit-image.org
- 
wireservice/csvkit ⭐ 6,255 
 A suite of utilities for converting to and working with CSV, the king of tabular file formats.
 🔗 csvkit.readthedocs.io
- 
pytransitions/transitions ⭐ 6,245 
 A lightweight, object-oriented finite state machine implementation in Python with many extensions
- 
rsalmei/alive-progress ⭐ 6,129 
 A new kind of Progress Bar, with real-time throughput, ETA, and very cool animations!
- 
spotify/pedalboard ⭐ 5,785 
 🎛 🔊 A Python library for audio.
 🔗 spotify.github.io/pedalboard
- 
pywinauto/pywinauto ⭐ 5,666 
 Windows GUI Automation with Python (based on text properties)
 🔗 pywinauto.github.io
- 
buildbot/buildbot ⭐ 5,389 
 Python-based continuous integration testing framework; your pull requests are more than welcome!
 🔗 www.buildbot.net
- 
prompt-toolkit/ptpython ⭐ 5,357 
 A better Python REPL
- 
tebelorg/RPA-Python ⭐ 5,346 
 Python package for doing RPA
- 
pythonnet/pythonnet ⭐ 5,266 
 Python for .NET is a package that gives Python programmers nearly seamless integration with the .NET Common Language Runtime (CLR) and provides a powerful application scripting tool for .NET developers.
 🔗 pythonnet.github.io
- 
pycqa/pycodestyle ⭐ 5,118 
 Simple Python style checker in one Python file
 🔗 pycodestyle.pycqa.org
- 
pytoolz/toolz ⭐ 5,040 
 A functional standard library for Python.
 🔗 toolz.readthedocs.org
- 
jorgebastida/awslogs ⭐ 4,960 
 AWS CloudWatch logs for Humans™
- 
pyo3/maturin ⭐ 4,936 
 Build and publish crates with pyo3, cffi and uniffi bindings as well as rust binaries as python packages
 🔗 maturin.rs
- 
ashleve/lightning-hydra-template ⭐ 4,903 
 PyTorch Lightning + Hydra. A very user-friendly template for ML experimentation. ⚡🔥⚡
- 
bogdanp/dramatiq ⭐ 4,901 
 A fast and reliable background task processing library for Python 3.
 🔗 dramatiq.io
- 
hhatto/autopep8 ⭐ 4,638 
 A tool that automatically formats Python code to conform to the PEP 8 style guide.
 🔗 pypi.org/project/autopep8
- 
pyinvoke/invoke ⭐ 4,624 
 Pythonic task management & command execution.
 🔗 pyinvoke.org
- 
ets-labs/python-dependency-injector ⭐ 4,606 
 Dependency injection framework for Python
 🔗 python-dependency-injector.ets-labs.org
- 
blealtan/efficient-kan ⭐ 4,480 
 An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
- 
pyinfra-dev/pyinfra ⭐ 4,438 
 🔧 pyinfra turns Python code into shell commands and runs them on your servers. Execute ad-hoc commands and write declarative operations. Target SSH servers, local machine and Docker containers. Fast and scales from one server to thousands.
 🔗 pyinfra.com
- 
adafruit/circuitpython ⭐ 4,347 
 CircuitPython - a Python implementation for teaching coding with microcontrollers
 🔗 circuitpython.org
- 
spotify/basic-pitch ⭐ 4,285 
 A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
 🔗 basicpitch.io
- 
hynek/structlog ⭐ 4,275 
 Simple, powerful, and fast logging for Python.
 🔗 www.structlog.org
- 
evhub/coconut ⭐ 4,264 
 Coconut (coconut-lang.org) is a variant of Python that adds on top of Python syntax new features for simple, elegant, Pythonic functional programming.
 🔗 coconut-lang.org
- 
miguelgrinberg/python-socketio ⭐ 4,257 
 Python Socket.IO server and client
- 
joblib/joblib ⭐ 4,235 
 Computing with Python functions.
 🔗 joblib.readthedocs.org
- 
python-markdown/markdown ⭐ 4,095 
 A Python implementation of John Gruber’s Markdown with Extension support.
 🔗 python-markdown.github.io
- 
pydata/xarray ⭐ 3,988 
 N-D labeled arrays and datasets in Python
 🔗 xarray.dev
- 
zeromq/pyzmq ⭐ 3,984 
 PyZMQ: Python bindings for zeromq
 🔗 zguide.zeromq.org/py:all
- 
more-itertools/more-itertools ⭐ 3,983 
 More routines for operating on iterables, beyond itertools
 🔗 more-itertools.rtfd.io
- 
rspeer/python-ftfy ⭐ 3,969 
 Fixes mojibake and other glitches in Unicode text, after the fact.
 🔗 ftfy.readthedocs.org
- 
tartley/colorama ⭐ 3,725 
 Simple cross-platform colored terminal text in Python
- 
pydantic/logfire ⭐ 3,626 
 Uncomplicated Observability for Python and beyond! 🪵🔥
 🔗 logfire.pydantic.dev/docs
- 
jorisschellekens/borb ⭐ 3,528 
 borb is a library for reading, creating and manipulating PDF files in python.
 🔗 borbpdf.com
- 
camelot-dev/camelot ⭐ 3,469 
 A Python library to extract tabular data from PDFs
 🔗 camelot-py.readthedocs.io
- 
libaudioflux/audioFlux ⭐ 3,172 
 A library for audio and music analysis, feature extraction.
 🔗 audioflux.top
- 
jcrist/<a href="https://git 
