📈 Open-Source AI Trends Newsletter 2026-03-22

# Open-Source AI Trends 2026-03-22

> Source: GitHub Trending + GitHub Search API | Generated: 2026-03-22 02:06 UTC

---

Open-Source AI Trends Report, 2026-03-22

Step 1: Filtering
From the supplied data, I selected repos that are clearly related to AI/ML and excluded general infrastructure tools, pure frontend projects, games that do not use ML, and similar (for example, systemd, trivy, and protobuf were removed). The filtered set consists of the AI-related trending projects plus repos from the search results with topic=ai/llm/rag/vector-db/agent/ml.
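
The filtering step above can be approximated with a simple predicate over repo records. This is a minimal sketch, not the agents-radar implementation: the `Repo` record shape, the `AI_TOPICS` and `EXCLUDED` sets, and the function names are assumptions introduced for illustration, and topic matching is only a rough stand-in for the "clearly related to AI/ML" judgment described in the report.

```python
from dataclasses import dataclass, field

# Topics named in the report's search step (treating them as a set is an assumption).
AI_TOPICS = {"ai", "llm", "rag", "vector-db", "agent", "ml"}
# Example exclusions mentioned in the report: general infrastructure, not AI/ML.
EXCLUDED = {"systemd", "trivy", "protobuf"}


@dataclass
class Repo:
    """Hypothetical record for one repo returned by Trending or Search."""
    full_name: str                              # e.g. "vllm-project/vllm"
    topics: set[str] = field(default_factory=set)


def is_ai_repo(repo: Repo) -> bool:
    """Keep a repo only if it is not excluded and carries at least one AI topic."""
    name = repo.full_name.split("/")[-1].lower()
    if name in EXCLUDED:
        return False
    return bool(AI_TOPICS & {t.lower() for t in repo.topics})


def filter_repos(repos: list[Repo]) -> list[Repo]:
    """Return the repos that pass the AI/ML filter, preserving input order."""
    return [r for r in repos if is_ai_repo(r)]
```

An explicit exclusion list takes priority over topics, so a general-purpose tool that happens to carry an `ai` topic is still dropped.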

Step 2: Classification (each project is placed in its best-fitting primary group)

🔧 AI Infrastructure
- vllm-project/vllm: https://github.com/vllm-project/vllm, ⭐73,884  
  High-performance LLM inference framework optimized for throughput and memory; serving infrastructure for inference at scale.
- vllm-project/vllm-omni: https://github.com/vllm-project/vllm-omni (trending, +71 today)  
  Extension for multimodal inference; on Trending today for its support of omni-modality models.
- huggingface/transformers: https://github.com/huggingface/transformers, ⭐158,210  
  Leading model-definition library for NLP/multimodal; a foundation for fine-tuning and deployment.
- ollama/ollama: https://github.com/ollama/ollama, ⭐165,808  
  Self-hosted runtime and model manager for running models offline/locally; reflects growing demand for privacy and self-hosting.
- tensorflow/tensorflow: https://github.com/tensorflow/tensorflow, ⭐194,301  
  General-purpose ML framework, still core infrastructure for many training/serving pipelines.
- Picovoice/picollm: https://github.com/Picovoice/picollm, ⭐305  
  On-device LLM inference with x-bit quantization; part of the edge and privacy trend.

🤖 AI Agents / Workflow
- langchain-ai/langchain: https://github.com/langchain-ai/langchain, ⭐130,504  
  Agent engineering platform, widely used for orchestrating tool calling and agent workflows.
- langgenius/dify: https://github.com/langgenius/dify, ⭐133,880  
  Production-ready platform for building agent workflows; focused on enterprise deployment.
- FlowiseAI/Flowise: https://github.com/FlowiseAI/Flowise, ⭐50,954  
  Build agents through a graphical interface; appeals to non-developer users.
- shareAI-lab/learn-claude-code: https://github.com/shareAI-lab/learn-claude-code, ⭐35,306  
  Lightweight agent harness (Claude Code-like); reflects interest in agent harnesses.
- CopilotKit/CopilotKit: https://github.com/CopilotKit/CopilotKit, ⭐29,637  
  Frontend stack for agents and generative UI; focused on developer experience.

📦 AI Applications (industry solutions / specific applications)
- Crosstalk-Solutions/project-nomad: https://github.com/Crosstalk-Solutions/project-nomad (trending, +2,032 today)  
  “Offline survival computer” that bundles tools and AI; surged today with a large star gain and targets offline AI use.
- opendataloader-project/opendataloader-pdf: https://github.com/opendataloader-project/opendataloader-pdf (trending, +950 today)  
  “AI-ready” PDF parser; reflects demand for RAG and unstructured-data ingestion.
- jarrodwatts/claude-hud: https://github.com/jarrodwatts/claude-hud (trending, +970 today)  
  Claude Code plugin for monitoring context, tools, and agents; reflects interest in observability/UX for agents.
- PaddlePaddle/PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR, ⭐72,771  
  Multilingual OCR that turns PDFs/images into data for LLMs/RAG.
- OpenBB-finance/OpenBB: https://github.com/OpenBB-finance/OpenBB, ⭐63,393  
  AI application for financial analysis; an example of a vertical applying ML/LLMs.

🧠 LLM / Training
- hiyouga/LlamaFactory: https://github.com/hiyouga/LlamaFactory, ⭐68,848  
  Unified, efficient fine-tuning for hundreds of LLMs/VLMs; aimed at scaling fine-tuning.
- rasbt/LLMs-from-scratch: https://github.com/rasbt/LLMs-from-scratch, ⭐88,928  
  Explains and implements an LLM from scratch; useful for teaching and research.
- open-compass/opencompass: https://github.com/open-compass/opencompass, ⭐6,781  
  LLM evaluation platform supporting many models/datasets; needed as many new LLMs appear.
- galilai-group/stable-pretraining: https://github.com/galilai-group/stable-pretraining, ⭐135  
  Optimized pretraining tooling, aimed at reliable training.

🔍 RAG / Knowledge (vector DB, memory, retrieval)
- milvus-io/milvus: https://github.com/milvus-io/milvus, ⭐43,442  
  Cloud-native vector DB, a standard choice for ANN search at scale.
- qdrant/qdrant: https://github.com/qdrant/qdrant, ⭐29,763  
  High-performance vector search engine.
- weaviate/weaviate: https://github.com/weaviate/weaviate, ⭐15,850  
  Vector DB combining schema/object storage with semantic search.
- run-llama/llama_index: https://github.com/run-llama/llama_index, ⭐47,844  
  Document agent / indexing layer widely used in RAG pipelines.
- yichuan-w/LEANN: https://github.com/yichuan-w/LEANN, ⭐10,348  
  RAG with extreme storage savings (97%); notable for edge and privacy use cases.
- VectifyAI/PageIndex: https://github.com/VectifyAI/PageIndex, ⭐22,511  
  Document index for “vectorless” RAG; a new direction that reduces reliance on vectors.

Part 3: Today's highlights (3–5 sentences)
- The community is showing a burst of interest in tooling around agents and RAG: project-nomad (+2,032), claude-hud (+970), opendataloader-pdf (+950), and vllm-omni (+71) are all on Trending today.  
- The clear trend: investment in high-performance inference (vllm/vllm-omni), unstructured-data ingestion pipelines (opendataloader-pdf, PaddleOCR), and the knowledge/memory layer (Milvus, Qdrant, LlamaIndex, LEANN).  
- In addition, demand is growing for agent observability/UX (claude-hud, CopilotKit) and for self-hosted, privacy-oriented runtimes (ollama, Picovoice/picollm).

Part 4: Trend signal analysis (200–300 words)
The OSS community is concentrating on three main axes: (1) high-performance and multimodal inference, (2) the ingestion/knowledge layer for RAG, and (3) orchestration/observability for agents. The prominence of vllm and vllm-omni points to pressure to reduce GPU cost and latency when deploying large LLMs: development teams need high throughput and multimodal support. At the same time, the spike in opendataloader-pdf and PaddleOCR reflects the need to convert large volumes of PDF/image documents into LLM context: RAG requires not only a vector DB but also a data normalization pipeline. Vector DBs and memory layers (Milvus, Qdrant, Weaviate, LEANN, LlamaIndex) remain the foundation of many RAG systems, while “vectorless” or storage-saving approaches (PageIndex, LEANN) show that optimizing storage and retrieval cost matters more and more. At the agent layer, LangChain/Dify/Flowise and observability tools such as claude-hud illustrate a maturing phase: from PoC agents to production orchestration, monitoring, and developer UX. Finally, the privacy/self-hosting trend (Ollama, Picovoice/picollm, project-nomad) ties directly to cost and security concerns following the wave of new LLM releases (Llama 3, Mistral, Qwen, Gemini, and open variants): the community wants to run models either on-premises or at the edge with good performance.

Part 5: Community hotspots (3–5 recommendations)
- Integrated RAG and vector DBs (Milvus, Qdrant, Weaviate, LEANN). Why: storage/retrieval infrastructure is a bottleneck for production LLM applications.  
  - Milvus: https://github.com/milvus-io/milvus  
  - Qdrant: https://github.com/qdrant/qdrant  
  - LEANN: https://github.com/yichuan-w/LEANN
- Inference and quantization for production (vllm, vllm-omni, picollm). Why: reducing GPU cost and extending to multimodal workloads.  
  - vllm: https://github.com/vllm-project/vllm  
  - vllm-omni: https://github.com/vllm-project/vllm-omni  
  - picollm: https://github.com/Picovoice/picollm
- Data ingestion and preprocessing (opendataloader-pdf, PaddleOCR, PageIndex). Why: context quality determines RAG effectiveness.  
  - opendataloader-pdf: https://github.com/opendataloader-project/opendataloader-pdf  
  - PaddleOCR: https://github.com/PaddlePaddle/PaddleOCR  
  - PageIndex: https://github.com/VectifyAI/PageIndex
- Agent orchestration and observability (LangChain, Dify, claude-hud, CopilotKit). Why: moving from prototype to production requires harnesses, monitoring, and UX.  
  - langchain: https://github.com/langchain-ai/langchain  
  - dify: https://github.com/langgenius/dify  
  - claude-hud: https://github.com/jarrodwatts/claude-hud

Brief conclusion: today the community is prioritizing the operational foundation (inference, ingestion, vector stores) and tooling for operating and observing agents. If you are a developer, focusing on performance/scale and data pipelines will have the greatest impact over the next 6–12 months.

---
*This newsletter was automatically generated by [agents-radar](https://github.com/compasify/agents-radar).*
