FoJin 佛津

The World's Encyclopedic Buddhist Digital Text Platform

505 sources. 30 languages. 30 countries. 17,800+ full-text volumes. One search.

FoJin aggregates the world's Buddhist digital heritage from 505 data sources: 9,200+ cataloged texts and 17,800+ volumes of full content in Pali, Classical Chinese, Tibetan, and Sanskrit. It offers CBETA-style reading, AI-powered Q&A (RAG + reranking + citations), a knowledge graph, timeline visualization, collections, citation export, annotations, bookmarks, and multi-language parallel reading.

Live Demo · API Docs · Chinese Docs (中文文档) · Discussions · Discord · Report Bug

Why FoJin?

Buddhist texts are scattered across hundreds of databases worldwide — CBETA, SuttaCentral, BDRC, SAT, 84000, GRETIL, and many more. Each has different interfaces, languages, and data formats. Researchers spend more time finding texts than reading them.

FoJin solves this. It aggregates 505 sources into a single, searchable platform with features no other tool provides:

What you need	How FoJin helps
Find a sutra across databases	Multi-dimensional search across 9,200+ texts from 505 sources
Read the full text online	7,600+ texts with 17,800+ volumes of full content, CBETA-style layout
Compare translations	Parallel reading in 30 languages side by side
Look up Buddhist terms	6 dictionaries, 285K entries (Chinese/Sanskrit/Pali/English)
Explore relationships	Knowledge graph with 9,700+ entities and 4,100+ relations
Discover similar texts	Semantic similarity powered by 420K+ embedding vectors (pgvector + HNSW)
View original manuscripts	IIIF manuscript viewer connected to BDRC and more
Ask questions about texts	AI Q&A ("XiaoJin") with RAG, reranking, clickable citations, and follow-up suggestions
Explore history visually	Timeline & Dashboard — dynasty charts, translation trends, category analytics
Save and organize	Collections, bookmarks, annotations for personal study
Cite in research	Citation export (BibTeX, RIS, APA) for academic use

Quick Start

git clone https://github.com/xr843/fojin.git
cd fojin
cp .env.example .env        # edit POSTGRES_PASSWORD before starting
docker compose up -d         # database migrations run automatically

Then visit: http://localhost:3000

API docs at http://localhost:8000/docs
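
As a quick way to see what the API offers once the stack is up, the sketch below lists the operations in an OpenAPI schema. It assumes the backend keeps FastAPI's default schema route (`/openapi.json` next to `/docs`); that route can be changed or disabled in the application, so treat the URL as an assumption. The helper names `list_operations` and `fetch_schema` are illustrative only.

```python
import json
from urllib.request import urlopen

# Assumption: FastAPI's default schema route is enabled on the backend.
OPENAPI_URL = "http://localhost:8000/openapi.json"

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_operations(schema: dict) -> list:
    """Return (METHOD, path) pairs from an OpenAPI schema, sorted by path."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-operation keys such as "parameters".
            if method in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def fetch_schema(url: str = OPENAPI_URL) -> dict:
    """Download and parse the schema; needs the backend to be running."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)
```

With the containers running, `list_operations(fetch_schema())` should give a compact overview of the available routes.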

After first startup, the platform has the database schema and source metadata but no text content. To import texts from public data sources:

# Import CBETA Chinese Buddhist Canon
docker exec fojin-backend python scripts/import_cbeta.py

# Import SuttaCentral Early Buddhist Texts
docker exec fojin-backend python scripts/import_suttacentral.py

# See all available importers
ls backend/scripts/import_*.py

Each importer downloads data directly from the original source (CBETA, SuttaCentral, etc.) — no data is bundled in this repository.

Features

Multi-Dimensional Search

Search across Buddhist canons by title, translator, catalog number, or full-text keyword. Search is powered by Elasticsearch with the ICU tokenizer for multi-language support.
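
To make the ICU setup concrete, here is a minimal sketch of an index definition and a search body. The `icu_tokenizer` and `icu_folding` names come from Elasticsearch's analysis-icu plugin; the analyzer name `icu_text`, the field names, and the boosts are assumptions for illustration and are not taken from FoJin's actual mapping.

```python
from typing import Optional


def build_index_settings() -> dict:
    """Index body with a custom analyzer built on the ICU tokenizer."""
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "icu_text": {  # hypothetical analyzer name
                        "type": "custom",
                        "tokenizer": "icu_tokenizer",
                        "filter": ["icu_folding"],
                    }
                }
            }
        },
        "mappings": {
            "properties": {  # hypothetical field names
                "title": {"type": "text", "analyzer": "icu_text"},
                "translator": {"type": "text", "analyzer": "icu_text"},
                "catalog_number": {"type": "keyword"},
                "content": {"type": "text", "analyzer": "icu_text"},
            }
        },
    }


def build_search_query(keyword: str, catalog_number: Optional[str] = None,
                       size: int = 10) -> dict:
    """Search body: keyword over several fields, optional exact catalog filter."""
    bool_query = {
        "must": [{
            "multi_match": {
                "query": keyword,
                "fields": ["title^3", "translator^2", "content"],
            }
        }]
    }
    if catalog_number:
        bool_query["filter"] = [{"term": {"catalog_number": catalog_number}}]
    return {"query": {"bool": bool_query}, "size": size}
```

Both functions only build dictionaries, so they can be passed to any Elasticsearch client or inspected without a running cluster.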

Full-Text Reading

Read 7,600+ Buddhist texts with 17,800+ volumes of full content online. CBETA-style typography with intelligent verse/prose detection, paragraph reflow, and adjustable font size. Navigate by volume, scroll through content, and jump between related texts.
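
One plausible way to approach verse/prose detection is sketched below. It relies on the observation that Chinese verse (gāthā) is usually set in segments of equal length, commonly 4, 5, or 7 characters. This heuristic and the function name `looks_like_verse` are assumptions for illustration; the reader's actual detection logic is not described here.

```python
import re

# Split on common CJK punctuation and whitespace.
_SEGMENT_SPLIT = re.compile(r"[，。；、！？\s]+")


def looks_like_verse(paragraph: str, min_segments: int = 4) -> bool:
    """Heuristic: verse if every segment has the same typical verse length."""
    segments = [s for s in _SEGMENT_SPLIT.split(paragraph) if s]
    if len(segments) < min_segments:
        return False
    lengths = {len(s) for s in segments}
    # All segments must share one length, and it must be a common verse meter.
    return len(lengths) == 1 and next(iter(lengths)) in (4, 5, 7)


verse = "諸行無常，是生滅法，生滅滅已，寂滅為樂。"
prose = "如是我聞，一時佛在舍衛國祇樹給孤獨園。"
print(looks_like_verse(verse), looks_like_verse(prose))
```

A production detector would likely need more signals (source markup, line breaks, surrounding context), since rhythmic prose can also fall into four-character segments.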

Parallel Reading (30 Languages)

Compare translations side by side: Classical Chinese, Sanskrit, Pali, Tibetan, English, Japanese, Korean, Gandhari, and 22 more languages.

Dictionary Lookup

6 authoritative dictionaries with 285,000+ entries:

DDB (Digital Dictionary of Buddhism)
SuttaCentral Glossary (Pali)
NCPED (New Concise Pali-English Dictionary)
NTI (Nan Tien Institute Buddhist Dictionary)
Edgerton BHS (Buddhist Hybrid Sanskrit Dictionary)
Monier-Williams (Sanskrit-English Dictionary)

Knowledge Graph

9,700+ entities (persons, monasteries, texts, schools) and 4,100+ relationships, visualized as an interactive force-directed graph. Click any node to explore connections.
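
The "click a node to explore connections" interaction boils down to a neighbor lookup over entity relations. The sketch below models relations as (subject, predicate, object) triples in an undirected adjacency map; the triple layout, the function names, and the three sample relations are illustrative assumptions, not FoJin's schema or data.

```python
from collections import defaultdict


def build_adjacency(relations):
    """Index (subject, predicate, object) triples by both endpoints."""
    adjacency = defaultdict(list)
    for subject, predicate, obj in relations:
        adjacency[subject].append((predicate, obj))
        adjacency[obj].append((predicate, subject))
    return adjacency


def neighbors(adjacency, node):
    """Entities directly connected to `node`, deduplicated and sorted."""
    return sorted({other for _, other in adjacency.get(node, [])})


relations = [
    ("Kumarajiva", "translated", "Diamond Sutra"),
    ("Xuanzang", "translated", "Heart Sutra"),
    ("Xuanzang", "translated", "Diamond Sutra"),
]
graph = build_adjacency(relations)
print(neighbors(graph, "Diamond Sutra"))
```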

AI Q&A — "XiaoJin"

Ask questions in natural language. XiaoJin answers based on canonical Buddhist texts using RAG (Retrieval-Augmented Generation) with 420K+ embedding vectors and an HNSW index for fast semantic search. Features include:

Multi-turn conversation with context awareness
Keyword-based reranking, with optional cross-encoder reranking through an external API, for higher answer quality
Clickable citations in 【《经名》第N卷】 format — click to jump to the text reader
Progressive follow-up suggestions (concept → related texts → practice)
"Ask XiaoJin" button on the reader page — select text to ask about it
Tab key cycles through suggested questions in the input box
BYOK (Bring Your Own Key) support for multiple LLM providers
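
The citation format listed above is regular enough to parse with a single pattern, which is what makes citations clickable. The sketch below extracts title and volume from an answer string. It assumes the volume number N is written in Arabic digits; the `Citation` type and `extract_citations` name are hypothetical.

```python
import re
from typing import NamedTuple

# Matches 【《title》第N卷】 with N as Arabic digits (an assumption).
CITATION_RE = re.compile(r"【《([^》]+)》第(\d+)卷】")


class Citation(NamedTuple):
    title: str
    volume: int


def extract_citations(answer: str) -> list:
    """Return every citation found in the answer, in order of appearance."""
    return [
        Citation(match.group(1), int(match.group(2)))
        for match in CITATION_RE.finditer(answer)
    ]


answer = "色即是空【《般若波羅蜜多心經》第1卷】，亦見【《金剛般若波羅蜜經》第1卷】。"
for citation in extract_citations(answer):
    print(citation.title, citation.volume)
```

A frontend could map each `(title, volume)` pair to a reader route; how FoJin resolves titles to text identifiers is not specified here.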

Similar Passages Discovery

When reading any text, the sidebar automatically finds semantically similar passages from other texts using pgvector cosine similarity. Discover cross-textual parallels, related commentaries, and thematic connections across the entire canon.
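
For readers unfamiliar with cosine similarity, the sketch below ranks passages against a query vector in plain Python. In pgvector the equivalent ordering uses the `<=>` cosine-distance operator (distance = 1 - similarity), typically backed by an HNSW index; the table and column names in the comment, and the tiny two-dimensional vectors, are assumptions for illustration.

```python
import math

# Rough SQL equivalent with pgvector (hypothetical table and column names):
#   SELECT id FROM passages ORDER BY embedding <=> :query_vector LIMIT 3;


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_similar(query, passages, k=3):
    """Return the k (passage_id, score) pairs with the highest similarity."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in passages.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


passages = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(top_k_similar([1.0, 0.0], passages, k=2))
```

The brute-force loop is exact but linear in the number of vectors; an HNSW index trades a little recall for much faster approximate lookups at this scale.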

Timeline & Statistics Dashboard

Visualize Buddhist textual history with interactive D3 charts — dynasty distribution, translation trends, language breakdown, category treemap, and top translators. Toggle between scholarly and popular presentation modes.

Collections, Bookmarks & Annotations

Save texts to personal collections, bookmark specific passages, and add annotations for study and research.

Citation Export

Export citations in BibTeX, RIS, and APA formats for academic papers and reference managers.
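
As an illustration of what a BibTeX export might contain, the sketch below renders one `@misc` entry. The choice of entry type and fields is an assumption (`translator` is a BibLaTeX field rather than classic BibTeX), the URL is a placeholder, and the function does not escape special characters, so it is not the platform's actual exporter.

```python
def to_bibtex(key, title, translator, source, url, year=None):
    """Render a single @misc entry; empty fields are skipped."""
    fields = [
        ("title", title),
        ("translator", translator),
        ("howpublished", source),
        ("url", url),
    ]
    if year is not None:
        fields.append(("year", str(year)))
    body = ",\n".join(
        f"  {name} = {{{value}}}" for name, value in fields if value
    )
    return f"@misc{{{key},\n{body}\n}}"


print(to_bibtex(
    key="T0235",
    title="金剛般若波羅蜜經",
    translator="Kumārajīva",
    source="CBETA",
    url="https://example.org/texts/T0235",  # placeholder URL
))
```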

Manuscript Viewer

Browse digitized manuscripts and rare editions from BDRC and other institutions via IIIF protocol.
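
IIIF viewers fetch images through URLs with a fixed shape defined by the IIIF Image API: `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`, with image metadata at `{base}/{identifier}/info.json`. The sketch below builds both. The base URL and identifier are placeholders, not real BDRC endpoints, and some older servers expect `full` rather than `max` as the size value.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL for one image."""
    # Identifiers must be percent-encoded, including any ':' or '/'.
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


def iiif_info_url(base, identifier):
    """URL of the image's info.json (dimensions, tiles, supported features)."""
    return f"{base.rstrip('/')}/{quote(identifier, safe='')}/info.json"


# Placeholder server and identifier, for illustration only.
print(iiif_image_url("https://iiif.example.org/image/", "bdr:I1234::page0001.jpg"))
```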

Multi-Language UI

Available in 9 languages: Simplified Chinese, Traditional Chinese, English, Japanese, Korean, Thai, Vietnamese, Sinhala, and Burmese.

Data Sources

FoJin aggregates data from major Buddhist digital projects worldwide:

Source	Content	Languages
CBETA	Chinese Buddhist Canon	Classical Chinese
SuttaCentral	Early Buddhist Texts	Pali, Chinese, English
84000	Tibetan Buddhist Canon	Tibetan, English, Sanskrit
BDRC	Tibetan manuscripts (IIIF)	Tibetan
SAT	Taisho Tripitaka	Chinese, Japanese
GRETIL	Sanskrit e-texts	Sanskrit
DSBC	Digital Sanskrit Buddhist Canon	Sanskrit
Gandhari.org	Gandhari manuscripts	Gandhari
VRI Tipitaka	Pali Canon (Chattha Sangayana)	Pali
Korean Tripitaka	Goryeo Tripitaka	Chinese, Korean
+ 495 more...

Tech Stack

Layer	Technology
Frontend	React 18, TypeScript, Vite, Ant Design 5, Zustand, TanStack Query, D3.js
Backend	FastAPI, SQLAlchemy (async), Pydantic v2, SSE streaming
Database	PostgreSQL 15 + pgvector (HNSW index) + pg_trgm
Search	Elasticsearch 8 (ICU tokenizer)
Cache	Redis 7
AI	RAG (420K+ vectors, BGE-M3 embeddings) + multi-provider LLM (OpenAI/DashScope/DeepSeek/SiliconFlow)
Deploy	Docker Compose, Nginx (gzip, security headers), Cloudflare CDN
CI	GitHub Actions (lint, test, security scan)

Architecture

                  +-------------+
                  | Cloudflare  |  (CDN, SSL, DDoS protection)
                  +------+------+
                         |
                  +------+------+
                  |   Nginx     |  (gzip, security headers, static cache)
                  +------+------+
                         |
             +-----------+-----------+
             |                       |
       +-----+------+         +-----+------+
       |  React 18   |         |  FastAPI    |
       |  Vite + D3  |         |  async SSE  |
       +-------------+         +------+------+
                                      |
                   +--------+---------+---------+
                   |        |         |         |
             +-----+--+ +--+----+ +--+---+ +---+--------+
             | PG 15   | | ES 8  | |Redis | | LLM APIs   |
             | pgvector | | ICU   | |cache | | (multi-    |
             | HNSW idx | |       | |      | |  provider) |
             +---------+ +-------+ +------+ +------------+

Development

# Backend
cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements-dev.txt
alembic upgrade head
uvicorn app.main:app --reload

# Frontend
cd frontend
npm install
npm run dev

# Tests
cd backend && pytest tests/ -q

Security

Non-root containers (backend: app, frontend: nginx)
Multi-stage Docker builds (no build tools in production)
Internal services bound to 127.0.0.1 only
Memory/CPU limits per container
CSP, X-Frame-Options, X-Content-Type-Options headers
Query length limits on all search parameters
JWT with 8h expiry; production requires a strong secret
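
To show what the JWT item means in practice, here is a standard-library sketch of an HS256 token with an 8-hour expiry and a minimum secret length. The 32-character threshold and the function names are assumptions; the sketch skips header validation, and a real backend would normally rely on an established JWT library rather than hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json
import time

EXPIRY_SECONDS = 8 * 60 * 60   # 8h, as stated in the list above
MIN_SECRET_LENGTH = 32         # assumption: what counts as a "strong" secret


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256)
    return _b64url(digest.digest())


def issue_token(subject: str, secret: str, now=None) -> str:
    """Create a signed token that expires EXPIRY_SECONDS after issuance."""
    if len(secret) < MIN_SECRET_LENGTH:
        raise ValueError("secret too short")
    issued = int(time.time() if now is None else now)
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": subject, "iat": issued, "exp": issued + EXPIRY_SECONDS}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}', secret)}"


def verify_token(token: str, secret: str, now=None) -> dict:
    """Check signature and expiry; return the claims or raise ValueError."""
    header, payload, signature = token.split(".")
    if not hmac.compare_digest(signature, _sign(f"{header}.{payload}", secret)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Passing `now` explicitly makes the expiry behavior easy to test without waiting eight hours.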

Contributing

Contributions are welcome! Whether it's adding a new data source, improving search, fixing bugs, or translating the UI — we'd love your help.

Fork the repository
Create your feature branch (git checkout -b feat/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feat/amazing-feature)
Open a Pull Request

See CONTRIBUTING.md for detailed guidelines.

Roadmap

License

Apache License 2.0 — applies to FoJin source code only. Third-party data sources retain their own licenses (CC BY-NC-SA, CC0, CC BY-NC-ND, etc.). See NOTICE for details.

Acknowledgments

FoJin is built on the generous work of the global Buddhist digital humanities community. Special thanks to:

CBETA — Chinese Buddhist Electronic Text Association
SuttaCentral — Early Buddhist Texts
BDRC — Buddhist Digital Resource Center
84000 — Translating the Words of the Buddha
SAT — SAT Daizokyo Text Database
All other data source providers listed on the Sources page

Related Projects

The Open Buddhist University — Free courses, books, and encyclopaedia for Buddhist studies

If FoJin is useful for your research, please consider giving it a star!

Discussions · Issues · Contributing · contact@fojin.app

Made with care for the Buddhist studies community.
