505 sources. 30 languages. 30 countries. 17,800+ full-text volumes. One search.
FoJin aggregates the world's Buddhist digital heritage: 9,200+ texts from 505 data sources in Pali, Classical Chinese, Tibetan, and Sanskrit, 7,600+ of them with full content (17,800+ volumes). It offers CBETA-style reading, AI-powered Q&A (RAG + reranking + citations), a knowledge graph, timeline visualization, collections, citation export, annotations, bookmarks, and multi-language parallel reading.
Live Demo · API Docs · 中文文档 · Discussions · Discord · Report Bug
Buddhist texts are scattered across hundreds of databases worldwide — CBETA, SuttaCentral, BDRC, SAT, 84000, GRETIL, and many more. Each has different interfaces, languages, and data formats. Researchers spend more time finding texts than reading them.
FoJin solves this. It aggregates 505 sources into a single, searchable platform with features no other tool provides:
| What you need | How FoJin helps |
|---|---|
| Find a sutra across databases | Multi-dimensional search across 9,200+ texts from 505 sources |
| Read the full text online | 7,600+ texts with 17,800+ volumes of full content, CBETA-style layout |
| Compare translations | Parallel reading in 30 languages side by side |
| Look up Buddhist terms | 6 dictionaries, 285K entries (Chinese/Sanskrit/Pali/English) |
| Explore relationships | Knowledge graph with 9,700+ entities and 4,100+ relations |
| Discover similar texts | Semantic similarity powered by 420K+ embedding vectors (pgvector + HNSW) |
| View original manuscripts | IIIF manuscript viewer connected to BDRC and more |
| Ask questions about texts | AI Q&A ("XiaoJin") with RAG, reranking, clickable citations, and follow-up suggestions |
| Explore history visually | Timeline & Dashboard — dynasty charts, translation trends, category analytics |
| Save and organize | Collections, bookmarks, annotations for personal study |
| Cite in research | Citation export (BibTeX, RIS, APA) for academic use |
git clone https://github.com/xr843/fojin.git
cd fojin
cp .env.example .env # edit POSTGRES_PASSWORD before starting
docker compose up -d # database migrations run automatically

Then visit: http://localhost:3000
API docs at http://localhost:8000/docs
After first startup, the platform has the database schema and source metadata but no text content. To import texts from public data sources:
# Import CBETA Chinese Buddhist Canon
docker exec fojin-backend python scripts/import_cbeta.py
# Import SuttaCentral Early Buddhist Texts
docker exec fojin-backend python scripts/import_suttacentral.py
# See all available importers
ls backend/scripts/import_*.py

Each importer downloads data directly from the original source (CBETA, SuttaCentral, etc.); no data is bundled in this repository.
Search across Buddhist canons by title, translator, catalog number, or full-text keyword. Powered by Elasticsearch with ICU tokenizer for multi-language support.
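As a rough illustration, an ICU-based index definition could look like the Python dict below. The `icu_tokenizer` and `icu_folding` names come from Elasticsearch's analysis-icu plugin; the analyzer name (`icu_text`), the field names, and the overall layout are assumptions made for this sketch, not FoJin's actual mapping.

```python
# Hypothetical index body: analyzer and field names are illustrative only.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "icu_text": {
                    "type": "custom",
                    "tokenizer": "icu_tokenizer",  # provided by the analysis-icu plugin
                    "filter": ["icu_folding"],     # folds case and diacritics
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "icu_text"},
            "translator": {"type": "text", "analyzer": "icu_text"},
            "catalog_number": {"type": "keyword"},  # exact match, not tokenized
            "content": {"type": "text", "analyzer": "icu_text"},
        }
    },
}


def analyzed_fields(body):
    """Return the names of fields that use the ICU analyzer, sorted."""
    props = body["mappings"]["properties"]
    return sorted(
        name for name, spec in props.items() if spec.get("analyzer") == "icu_text"
    )
```

Keeping the catalog number as a `keyword` field is one plausible choice here, since identifiers are usually matched exactly rather than tokenized.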
Read 7,600+ Buddhist texts with 17,800+ volumes of full content online. CBETA-style typography with intelligent verse/prose detection, paragraph reflow, and adjustable font size. Navigate by volume, scroll through content, and jump between related texts.
Compare translations side by side across 30 languages, including Classical Chinese, Sanskrit, Pali, Tibetan, English, Japanese, Korean, and Gandhari.
6 authoritative dictionaries with 285,000+ entries:
- DDB (Digital Dictionary of Buddhism)
- SuttaCentral Glossary (Pali)
- NCPED (New Concise Pali-English Dictionary)
- NTI (Nan Tien Institute Buddhist Dictionary)
- Edgerton BHS (Buddhist Hybrid Sanskrit Dictionary)
- Monier-Williams (Sanskrit-English Dictionary)
9,700+ entities (persons, monasteries, texts, schools) and 4,100+ relationships, visualized as an interactive force-directed graph. Click any node to explore connections.
Ask questions in natural language. XiaoJin answers based on canonical Buddhist texts using RAG (Retrieval-Augmented Generation) with 420K+ embedding vectors and HNSW index for fast semantic search. Features include:
- Multi-turn conversation with context awareness
- Keyword + optional API cross-encoder reranking for higher answer quality
- Clickable citations in 【《经名》第N卷】 format — click to jump to the text reader
- Progressive follow-up suggestions (concept → related texts → practice)
- "Ask XiaoJin" button on the reader page — select text to ask about it
- Tab key cycles through suggested questions in the input box
- BYOK (Bring Your Own Key) support for multiple LLM providers
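The citation format above lends itself to simple pattern matching. The sketch below shows one way a client might pull citations out of an answer so they can be turned into reader links. It assumes the volume number N is written in digits; the `Citation` type and `extract_citations` function are hypothetical names, not part of FoJin's API.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    title: str   # text title found between 《 and 》
    volume: int  # volume number found between 第 and 卷


# Matches citations of the form 【《title》第N卷】 (assumes N is written in digits).
CITATION_RE = re.compile(r"【《([^》]+)》第(\d+)卷】")


def extract_citations(answer: str) -> list[Citation]:
    """Return every citation in the answer, in order of appearance."""
    return [
        Citation(match.group(1), int(match.group(2)))
        for match in CITATION_RE.finditer(answer)
    ]


sample = "空性见于【《大般若波罗蜜多经》第1卷】，亦见【《中论》第4卷】。"
citations = extract_citations(sample)
```

Each `(title, volume)` pair could then be mapped to a reader URL; how FoJin resolves titles to texts internally is not shown here.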
When reading any text, the sidebar automatically finds semantically similar passages from other texts using pgvector cosine similarity. Discover cross-textual parallels, related commentaries, and thematic connections across the entire canon.
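To make the ranking idea concrete, here is a minimal pure-Python sketch of cosine similarity over toy vectors. In pgvector the `<=>` operator returns cosine distance, which is 1 minus this value; the function names and the tiny two-dimensional vectors are assumptions for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, passages, k=3):
    """Return the k (passage_id, score) pairs with the highest similarity."""
    scored = sorted(
        ((cosine_similarity(query, vec), pid) for pid, vec in passages.items()),
        reverse=True,
    )
    return [(pid, score) for score, pid in scored[:k]]


# Toy embeddings; real embeddings have far more dimensions.
passages = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
top = most_similar([1.0, 0.0], passages, k=2)
```

This brute-force scan compares the query against every passage. An HNSW index avoids that by searching an approximate nearest-neighbor graph, trading a little recall for speed.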
Visualize Buddhist textual history with interactive D3 charts — dynasty distribution, translation trends, language breakdown, category treemap, and top translators. Toggle between scholarly and popular presentation modes.
Save texts to personal collections, bookmark specific passages, and add annotations for study and research.
Export citations in BibTeX, RIS, and APA formats for academic papers and reference managers.
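As an illustration of what a BibTeX export might produce, the sketch below formats a single entry. The choice of `@misc`, the field set, and the `to_bibtex` helper are assumptions for this example, and FoJin's actual output may differ. It does not escape special characters in field values.

```python
def to_bibtex(key, title, translator=None, year=None, url=None):
    """Format one text as a BibTeX @misc entry, skipping empty fields."""
    fields = {
        "title": title,
        "note": f"Translated by {translator}" if translator else None,
        "year": str(year) if year is not None else None,
        "url": url,
    }
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in fields.items() if value
    )
    return f"@misc{{{key},\n{body}\n}}"


# Hypothetical record used only to show the output shape.
entry = to_bibtex(
    key="T0251",
    title="Heart Sutra",
    translator="Xuanzang",
    url="http://localhost:3000",
)
```

Reference managers generally accept `@misc` for online sources, which makes it a safe default when a text does not fit a book or article type.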
Browse digitized manuscripts and rare editions from BDRC and other institutions via IIIF protocol.
Available in 9 languages: Simplified Chinese, Traditional Chinese, English, Japanese, Korean, Thai, Vietnamese, Sinhala, and Burmese.
FoJin aggregates data from major Buddhist digital projects worldwide:
| Source | Content | Languages |
|---|---|---|
| CBETA | Chinese Buddhist Canon | Classical Chinese |
| SuttaCentral | Early Buddhist Texts | Pali, Chinese, English |
| 84000 | Tibetan Buddhist Canon | Tibetan, English, Sanskrit |
| BDRC | Tibetan manuscripts (IIIF) | Tibetan |
| SAT | Taisho Tripitaka | Chinese, Japanese |
| GRETIL | Sanskrit e-texts | Sanskrit |
| DSBC | Digital Sanskrit Buddhist Canon | Sanskrit |
| Gandhari.org | Gandhari manuscripts | Gandhari |
| VRI Tipitaka | Pali Canon (Chattha Sangayana) | Pali |
| Korean Tripitaka | Goryeo Tripitaka | Chinese, Korean |
| + 495 more... | | |
| Layer | Technology |
|---|---|
| Frontend | React 18, TypeScript, Vite, Ant Design 5, Zustand, TanStack Query, D3.js |
| Backend | FastAPI, SQLAlchemy (async), Pydantic v2, SSE streaming |
| Database | PostgreSQL 15 + pgvector (HNSW index) + pg_trgm |
| Search | Elasticsearch 8 (ICU tokenizer) |
| Cache | Redis 7 |
| AI | RAG (420K+ vectors, BGE-M3 embeddings) + multi-provider LLM (OpenAI/DashScope/DeepSeek/SiliconFlow) |
| Deploy | Docker Compose, Nginx (gzip, security headers), Cloudflare CDN |
| CI | GitHub Actions (lint, test, security scan) |
                +-------------+
                | Cloudflare  |  (CDN, SSL, DDoS protection)
                +------+------+
                       |
                +------+------+
                |    Nginx    |  (gzip, security headers, static cache)
                +------+------+
                       |
             +---------+---------+
             |                   |
      +------+------+     +------+------+
      |  React 18   |     |   FastAPI   |
      |  Vite + D3  |     |  async SSE  |
      +-------------+     +------+------+
                                 |
          +-----------+----------+------------+
          |           |          |            |
    +-----+----+  +---+---+  +---+---+  +-----+------+
    |  PG 15   |  | ES 8  |  | Redis |  |  LLM APIs  |
    | pgvector |  |  ICU  |  | cache |  |  (multi-   |
    | HNSW idx |  |       |  |       |  |  provider) |
    +----------+  +-------+  +-------+  +------------+
# Backend
cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements-dev.txt
alembic upgrade head
uvicorn app.main:app --reload
# Frontend
cd frontend
npm install
npm run dev
# Tests
cd backend && pytest tests/ -q

- Non-root containers (backend: `app`, frontend: `nginx`)
- Multi-stage Docker builds (no build tools in production)
- Internal services bound to `127.0.0.1` only
- Memory/CPU limits per container
- CSP, X-Frame-Options, X-Content-Type-Options headers
- Query length limits on all search parameters
- JWT with 8h expiry, production requires strong secret
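To show what an 8-hour token lifetime means in practice, here is a standard-library sketch of issuing and checking an HS256-signed JWT. It is not FoJin's implementation; the function names and claim set are assumptions, and a production service would normally rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time

EIGHT_HOURS = 8 * 60 * 60  # token lifetime in seconds


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(subject: str, secret: str, now=None) -> str:
    """Create an HS256 token that expires 8 hours after issuance."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": subject, "iat": issued, "exp": issued + EIGHT_HOURS}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now=None):
    """Return the payload if the signature is valid and unexpired, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    current = time.time() if now is None else now
    return payload if current < payload["exp"] else None
```

The `now` parameter exists only so expiry can be checked deterministically. The strong-secret requirement matters because anyone who can guess the secret can forge a valid signature.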
Contributions are welcome! Whether it's adding a new data source, improving search, fixing bugs, or translating the UI — we'd love your help.
- Fork the repository
- Create your feature branch (
git checkout -b feat/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feat/amazing-feature) - Open a Pull Request
See CONTRIBUTING.md for detailed guidelines.
- Citation export (BibTeX, RIS, APA)
- Mobile-responsive reader
- Public REST API with rate limiting
- User annotations
- Community-contributed data sources
- Internationalization (i18n) — 9 UI languages
- Embedding-based semantic search (420K+ vectors, HNSW index)
- AI Q&A with RAG, multi-turn context, and streaming
- Similar passages discovery (cross-text semantic matching)
- Timeline visualization and statistics dashboard
- User feedback system and notification center
- Admin dashboard (user management, platform analytics)
- API documentation (OpenAPI/Swagger at `/docs`, ReDoc at `/redoc`)
- AI answer reranking (keyword + optional API cross-encoder)
- Clickable citation links in AI answers
- Progressive follow-up suggestions after AI answers
- "Ask XiaoJin" floating button on reader page
- Tab key to cycle through suggested questions
- CBETA-style text layout with verse/prose detection
- Auto database migration on Docker startup
- AI answer rating (thumbs up/down) for quality tracking
- Topic ontology browsing page
- Cross-lingual search (query in Chinese, find Sanskrit/Pali/Tibetan results)
- Open data export (JSON/CSV for researchers)
- MCP Server for AI assistant integration
- OCR pipeline for scanned texts
- Collaborative annotation sharing
- Integration with Zotero and reference managers
Apache License 2.0 — applies to FoJin source code only. Third-party data sources retain their own licenses (CC BY-NC-SA, CC0, CC BY-NC-ND, etc.). See NOTICE for details.
FoJin is built on the generous work of the global Buddhist digital humanities community. Special thanks to:
- CBETA — Chinese Buddhist Electronic Text Association
- SuttaCentral — Early Buddhist Texts
- BDRC — Buddhist Digital Resource Center
- 84000 — Translating the Words of the Buddha
- SAT — SAT Daizokyo Text Database
- All other data source providers listed in the Sources page
- The Open Buddhist University — Free courses, books, and encyclopaedia for Buddhist studies
If FoJin is useful for your research, please consider giving it a star!
Discussions · Issues · Contributing · contact@fojin.app
Made with care for the Buddhist studies community.



