From 9d20ef8118f5475568af4e89e44c728891e2e812 Mon Sep 17 00:00:00 2001 From: imbajin Date: Sat, 31 Jan 2026 21:19:28 +0800 Subject: [PATCH 01/10] docs: AGENTS.md, README.md, contribution.md Restructure and clarify repository documentation. AGENTS.md was rewritten and condensed into a practical agent/dev guide with development commands, prerequisites (Hugo Extended, Node.js v16+), content/structure overview, CI/CD notes, and troubleshooting tips. README.md was replaced with a bilingual (ZH/EN) homepage including a 3-step quickstart, repo layout, common commands, contributing requirements, contact info, and license. contribution.md was expanded with a PR checklist and clearer contribution steps (fork/branch/PR with screenshots). These changes improve onboarding and contribution workflows for the docs site. --- AGENTS.md | 203 +++++++++++++++++------------------------------- README.md | 192 +++++++++++++++++++++++++++++++-------------- contribution.md | 17 +++- 3 files changed, 221 insertions(+), 191 deletions(-) diff --git a/AGENTS.md b/AGENTS.md index 9108afe52..c4a054515 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,171 +1,110 @@ -# AI Development Agent Instructions +# AGENTS.md This file provides guidance to AI coding assistants (Claude Code, Cursor, GitHub Copilot, etc.) when working with code in this repository. ## Project Overview -This is the **Apache HugeGraph documentation website** repository (`hugegraph-doc`), built with Hugo static site generator using the Docsy theme. The site provides comprehensive documentation for the HugeGraph graph database system, including quickstart guides, API references, configuration guides, and contribution guidelines. +Apache HugeGraph documentation website built with Hugo static site generator and the Docsy theme. The site is bilingual (Chinese/English) and covers the complete HugeGraph graph database ecosystem. -The documentation is multilingual, supporting both **Chinese (cn)** and **English (en)** content. - -## Development Setup - -### Prerequisites - -1. **Hugo Extended** (v0.95.0 recommended, v0.102.3 used in CI) - - Must be the "extended" version (includes SASS/SCSS support) - - Download from: https://github.com/gohugoio/hugo/releases - - Install location: `/usr/bin` or `/usr/local/bin` - -2. 
**Node.js and npm** (v16+ as specified in CI) - -### Quick Start +## Development Commands ```bash -# Install npm dependencies (autoprefixer, postcss, postcss-cli) +# Install dependencies npm install -# Start local development server (with auto-reload) +# Start development server (auto-reload enabled) hugo server -# Custom server with different ip/port -hugo server -b http://127.0.0.1 -p 80 --bind=0.0.0.0 - # Build production site (output to ./public) hugo --minify -``` - -## Project Structure - -### Key Directories - -- **`content/`** - All documentation content in Markdown - - `content/cn/` - Chinese (simplified) documentation - - `content/en/` - English documentation - - Each language has parallel structure: `docs/`, `blog/`, `community/`, `about/` - -- **`themes/docsy/`** - The Docsy Hugo theme (submodule or vendored) - -- **`static/`** - Static assets (images, files) served directly - -- **`assets/`** - Assets processed by Hugo pipelines (SCSS, images for processing) - -- **`layouts/`** - Custom Hugo template overrides for the Docsy theme - -- **`public/`** - Generated site output (gitignored, created by `hugo` build) - -- **`dist/`** - Additional distribution files - -### Important Files - -- **`config.toml`** - Main site configuration - - Defines language settings (cn as default, en available) - - Menu structure and navigation - - Theme parameters and UI settings - - Currently shows version `0.13` - -- **`package.json`** - Node.js dependencies for CSS processing (postcss, autoprefixer) -- **`.editorconfig`** - Code style rules (UTF-8, LF line endings, spaces for indentation) +# Clean build +rm -rf public/ -- **`contribution.md`** - Contributing guide (Chinese/English mixed) +# Production build with garbage collection +HUGO_ENV="production" hugo --gc -- **`maturity.md`** - Project maturity assessment documentation +# Custom server configuration +hugo server -b http://127.0.0.1 -p 80 --bind=0.0.0.0 +``` -## Content Organization +## Prerequisites -Documentation is organized into major sections: +- **Hugo Extended** v0.95.0 recommended (v0.102.3 in CI) - must be the "extended" version for SASS/SCSS support +- **Node.js** v16+ and npm +- Download Hugo from: https://github.com/gohugoio/hugo/releases -- **`quickstart/`** - Getting started guides for HugeGraph components (Server, Loader, Hubble, Tools, Computer, AI) -- **`config/`** - Configuration documentation -- **`clients/`** - Client API documentation (Gremlin Console, RESTful API) -- **`guides/`** - User guides and tutorials -- **`performance/`** - Performance benchmarks and optimization -- **`language/`** - Query language documentation -- **`contribution-guidelines/`** - How to contribute to HugeGraph -- **`changelog/`** - Release notes and version history -- **`download/`** - Download links and instructions +## Architecture -## Common Tasks +``` +content/ +├── cn/ # Chinese documentation (default language) +│ ├── docs/ # Main documentation +│ ├── blog/ # Blog posts +│ ├── community/ +│ └── about/ +└── en/ # English documentation (parallel structure) + +themes/docsy/ # Docsy theme (submodule) +layouts/ # Custom template overrides +assets/ # Processed assets (SCSS, images) +static/ # Static files served directly +config.toml # Main site configuration +``` -### Building and Testing +### Content Structure -```bash -# Build for production (with minification) -hugo --minify +Documentation sections in `content/{cn,en}/docs/`: +- `quickstart/` - Getting started guides for HugeGraph components +- `config/` - Configuration documentation +- 
`clients/` - Client API documentation (Gremlin, RESTful) +- `guides/` - User guides and tutorials +- `performance/` - Benchmarks and optimization +- `language/` - Query language docs +- `contribution-guidelines/` - Contributing guides +- `changelog/` - Release notes +- `download/` - Download instructions -# Clean previous build -rm -rf public/ +## Key Configuration Files -# Build with specific environment -HUGO_ENV="production" hugo --gc -``` +- `config.toml` - Site-wide settings, language config, menu structure, version (currently 0.13) +- `package.json` - Node dependencies for CSS processing (postcss, autoprefixer, mermaid) +- `.editorconfig` - UTF-8, LF line endings, spaces for indentation -### Working with Content +## Working with Content When editing documentation: - 1. Maintain parallel structure between `content/cn/` and `content/en/` -2. Use Markdown format for all documentation files -3. Include front matter in each file (title, weight, description) -4. For translated content, ensure both Chinese and English versions are updated - -### Theme Customization - -- Global site config: `config.toml` (root directory) -- Theme-specific config: `themes/docsy/config.toml` -- Custom layouts: Place in `layouts/` to override theme defaults -- Custom styles: Modify files in `assets/` directory - -Refer to [Docsy documentation](https://www.docsy.dev/docs/) for theme customization details. +2. Use Markdown with Hugo front matter (title, weight, description) +3. For bilingual changes, update both Chinese and English versions +4. Include mermaid diagrams where appropriate (mermaid.js is available) ## Deployment -The site uses GitHub Actions for CI/CD (`.github/workflows/hugo.yml`): - -1. **Triggers**: On push to `master` branch or pull requests -2. **Build process**: - - Checkout with submodules (for themes) - - Setup Node v16 and Hugo v0.102.3 extended - - Run `npm i && hugo --minify` -3. **Deployment**: Publishes to `asf-site` branch (GitHub Pages) - -The deployed site is hosted as part of Apache HugeGraph's documentation infrastructure. 
- -## HugeGraph Architecture Context - -This documentation covers the complete HugeGraph ecosystem: - -- **HugeGraph-Server** - Core graph database engine with REST API -- **HugeGraph-Store** - Distributed storage engine with integrated computation -- **HugeGraph-PD** - Placement Driver for metadata management -- **HugeGraph-Toolchain**: - - Client (Java RESTful API client) - - Loader (data import tool) - - Hubble (web visualization platform) - - Tools (deployment and management utilities) -- **HugeGraph-Computer** - Distributed graph processing system (OLAP) -- **HugeGraph-AI** - Graph neural networks and LLM/RAG components +- **CI/CD**: GitHub Actions (`.github/workflows/hugo.yml`) +- **Trigger**: Push to `master` branch or pull requests +- **Build**: `npm i && hugo --minify` with Node v16 and Hugo v0.102.3 extended +- **Deploy**: Publishes to `asf-site` branch (GitHub Pages) +- **PR Requirements**: Include screenshots showing before/after changes -## Git Workflow +## HugeGraph Ecosystem Context -- **Main branch**: `master` (protected, triggers deployment) -- **PR requirements**: Include screenshots showing before/after changes in documentation -- **Commit messages**: Follow Apache commit conventions -- Always create a new branch from `master` for changes -- Deployment to `asf-site` branch is automated via GitHub Actions +This documentation covers: +- **HugeGraph-Server** - Core graph database with REST API +- **HugeGraph-Store** - Distributed storage engine +- **HugeGraph-PD** - Placement Driver for metadata +- **Toolchain** - Client, Loader, Hubble (web UI), Tools +- **HugeGraph-Computer** - Distributed OLAP graph processing +- **HugeGraph-AI** - GNN, LLM/RAG components ## Troubleshooting -**Error: "TOCSS: failed to transform scss/main.scss"** -- Cause: Using standard Hugo instead of Hugo Extended -- Solution: Install Hugo Extended version +**"TOCSS: failed to transform scss/main.scss"** +- Install Hugo Extended (not standard Hugo) -**Error: Module/theme not found** -- Cause: Git submodules not initialized -- Solution: `git submodule update --init --recursive` +**Theme/module not found** +- Run: `git submodule update --init --recursive` -**Build fails in CI but works locally** -- Check Hugo version match (CI uses v0.102.3) -- Ensure npm dependencies are installed -- Verify Node.js version (CI uses v16) +**CI build fails but works locally** +- Match Hugo version (v0.102.3) and Node.js (v16) +- Verify npm dependencies are installed diff --git a/README.md b/README.md index 18656cd5e..45ec41a7e 100644 --- a/README.md +++ b/README.md @@ -1,80 +1,156 @@ +# Apache HugeGraph Documentation Website + [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/apache/hugegraph-doc) +[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) +[![Hugo](https://img.shields.io/badge/Hugo-Extended-ff4088?logo=hugo)](https://gohugo.io/) + +--- + +**中文** | [English](#english-version) + +这是 [HugeGraph 官方文档网站](https://hugegraph.apache.org/docs/) 的**源代码仓库**。 + +如果你想查找 HugeGraph 数据库本身,请访问 [apache/hugegraph](https://github.com/apache/hugegraph)。 + +## 快速开始 + +只需 **3 步**即可在本地启动文档网站: + +**前置条件:** [Hugo Extended](https://github.com/gohugoio/hugo/releases) v0.95+ 和 Node.js v16+ + +```bash +# 1. 克隆仓库 +git clone https://github.com/apache/hugegraph-doc.git +cd hugegraph-doc + +# 2. 安装依赖 +npm install + +# 3. 
启动开发服务器(支持热重载) +hugo server +``` + +打开 http://localhost:1313 预览网站。 + +> **常见问题:** 如果遇到 `TOCSS: failed to transform "scss/main.scss"` 错误, +> 说明你需要安装 Hugo **Extended** 版本,而不是标准版本。 + +## 仓库结构 -## Build/Test/Contribute to website +``` +hugegraph-doc/ +├── content/ # 📄 文档内容 (Markdown) +│ ├── cn/ # 🇨🇳 中文文档 +│ │ ├── docs/ # 主要文档目录 +│ │ │ ├── quickstart/ # 快速开始指南 +│ │ │ ├── config/ # 配置文档 +│ │ │ ├── clients/ # 客户端文档 +│ │ │ ├── guides/ # 使用指南 +│ │ │ └── ... +│ │ ├── blog/ # 博客文章 +│ │ └── community/ # 社区页面 +│ └── en/ # 🇺🇸 英文文档(与 cn/ 结构一致) +│ +├── themes/docsy/ # 🎨 Docsy 主题 (git submodule) +├── assets/ # 🖼️ 自定义资源 (fonts, images, scss) +├── layouts/ # 📐 Hugo 模板覆盖 +├── static/ # 📁 静态文件 +├── config.toml # ⚙️ 站点配置 +└── package.json # 📦 Node.js 依赖 +``` -Please visit the [contribution doc](./contribution.md) to get start, include theme/website description & settings~ +## 如何贡献 -### Summary +### 贡献流程 + +1. **Fork** 本仓库 +2. 基于 `master` 创建**新分支** +3. 修改文档内容 +4. 提交 **Pull Request**(附截图) + +### 重要说明 + +| 要求 | 说明 | +|------|------| +| **双语更新** | 修改内容时需**同时更新** `content/cn/` 和 `content/en/` | +| **PR 截图** | 提交 PR 时需附上修改**前后对比截图** | +| **Markdown** | 文档使用 Markdown 格式,带 Hugo front matter | + +### 详细指南 + +查看 [contribution.md](./contribution.md) 了解: +- 各平台 Hugo 安装方法 +- Docsy 主题定制 +- 翻译技巧 + +## 常用命令 + +| 命令 | 说明 | +|------|------| +| `hugo server` | 启动开发服务器(热重载) | +| `hugo --minify` | 构建生产版本到 `./public/` | +| `hugo server -p 8080` | 指定端口 | + +## 联系我们 + +- **问题反馈:** [GitHub Issues](https://github.com/apache/hugegraph-doc/issues) +- **邮件列表:** [dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org)([需先订阅](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/)) +- **Slack:** [ASF Slack](https://the-asf.slack.com/archives/C059UU2FJ23) +- **微信公众号:** Apache HugeGraph + +WeChat QR Code + +### 贡献者 + +感谢所有为 HugeGraph 文档做出贡献的人! + +[![contributors](https://contrib.rocks/image?repo=apache/hugegraph-doc)](https://github.com/apache/hugegraph-doc/graphs/contributors) + +--- -Apache HugeGraph is an easy-to-use, efficient, general-purpose open-source graph database system -(Graph Database, [GitHub project address](https://github.com/hugegraph/hugegraph)), implementing the [Apache TinkerPop3](https://tinkerpop.apache.org) framework and fully compatible with the [Gremlin](https://tinkerpop.apache.org/gremlin.html) query language, -With complete toolchain components, it helps users easily build applications and products based on graph databases. HugeGraph supports fast import of more than 10 billion vertices and edges, and provides millisecond-level relational query capability (OLTP). -It also supports large-scale distributed graph computing (OLAP). +## English Version -Typical application scenarios of HugeGraph include deep relationship exploration, association analysis, path search, feature extraction, data clustering, community detection, knowledge graph, etc., and are applicable to business fields such as network security, telecommunication fraud, financial risk control, advertising recommendation, social network and intelligence Robots etc. +This is the **source code repository** for the [HugeGraph documentation website](https://hugegraph.apache.org/docs/). -### Features +For the HugeGraph database project, visit [apache/hugegraph](https://github.com/apache/hugegraph). -HugeGraph supports graph operations in online and offline environments, batch importing of data and efficient complex relationship analysis. It can seamlessly be integrated with big data platforms. -HugeGraph supports multi-user parallel operations. 
Users can enter Gremlin query statements and get graph query results in time. They can also call the HugeGraph API in user programs for graph analysis or queries. +### Quick Start -This system has the following features: +**Prerequisites:** [Hugo Extended](https://github.com/gohugoio/hugo/releases) v0.95+ and Node.js v16+ -- Ease of use: HugeGraph supports the Gremlin graph query language and a RESTful API, providing common interfaces for graph retrieval, and peripheral tools with complete functions to easily implement various graph-based query and analysis operations. -- Efficiency: HugeGraph has been deeply optimized in graph storage and graph computing, and provides a variety of batch import tools, which can easily complete the rapid import of tens of billions of data, and achieve millisecond-level response for graph retrieval through optimized queries. Supports simultaneous online real-time operations of thousands of users. -- Universal: HugeGraph supports the Apache Gremlin standard graph query language and the Property Graph standard graph modeling method, and supports graph-based OLTP and OLAP schemes. Integrate Apache Hadoop and Apache Spark big data platform. -- Scalable: supports distributed storage, multiple copies of data and horizontal expansion, built-in multiple back-end storage engines, and can easily expand the back-end storage engine through plug-ins. -- Open: HugeGraph code is open source (Apache 2 License), customers can modify and customize independently, and selectively give back to the open source community. +```bash +# 1. Clone repository +git clone https://github.com/apache/hugegraph-doc.git +cd hugegraph-doc -The functions of this system include but are not limited to: +# 2. Install dependencies +npm install -- Supports batch import of data from multiple data sources (including local files, HDFS files, MySQL databases and other data sources), and supports import of multiple file formats (including TXT, CSV, JSON and other formats) -- With a visual operation interface, it can be used for operation, analysis and display diagrams, reducing the threshold for users to use -- Optimized graph interface: shortest path (Shortest Path), K-step connected subgraph (K-neighbor), K-step to reach the adjacent point (K-out), personalized recommendation algorithm PersonalRank, etc. -- Implemented based on the Apache-TinkerPop3 framework, supports Gremlin graph query language -- Support attribute graph, attributes can be added to vertices and edges, and support rich attribute types -- Has independent schema metadata information, has powerful graph modeling capabilities, and facilitates third-party system integration -- Support multi-vertex ID strategy: support primary key ID, support automatic ID generation, support user-defined string ID, support user-defined digital ID -- The attributes of edges and vertices can be indexed to support precise query, range query, and full-text search -- The storage system adopts plug-in mode, supporting RocksDB, Cassandra, ScyllaDB, HBase, MySQL, PostgreSQL, Palo, and InMemory, etc. -- Integrate with big data systems such as Hadoop and Spark GraphX, and support Bulk Load operations -- Support high availability (HA), multiple copies of data, backup recovery, monitoring, etc. +# 3. Start development server (auto-reload) +hugo server +``` -### Modules +Open http://localhost:1313 to preview. 
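+
+If you also want to verify the full production build locally, a minimal check (this is the same build command the CI uses) is:
+
+```bash
+# Build the minified production site; output goes to ./public/
+hugo --minify
+```
+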
-- [HugeGraph-Store]: HugeGraph-Store is a distributed storage engine to manage large-scale graph data by integrating storage and computation within a unified system. -- [HugeGraph-PD]: HugeGraph-PD (Placement Driver) manages metadata and coordinates storage nodes. -- [HugeGraph-Server](/docs/quickstart/hugegraph-server): HugeGraph-Server is the core part of the HugeGraph project, containing Core, Backend, API and other submodules; - - Core: Implements the graph engine, connects to the Backend module downwards, and supports the API module upwards; - - Backend: Implements the storage of graph data to the backend, supports backends including Memory, Cassandra, ScyllaDB, RocksDB, HBase, MySQL and PostgreSQL, users can choose one according to the actual situation; - - API: Built-in REST Server, provides RESTful API to users, and is fully compatible with Gremlin queries. (Supports distributed storage and computation pushdown) -- [HugeGraph-Toolchain](https://github.com/apache/hugegraph-toolchain): (Toolchain) - - [HugeGraph-Client](/docs/quickstart/hugegraph-client): HugeGraph-Client provides a RESTful API client for connecting to HugeGraph-Server, currently only the Java version is implemented, users of other languages can implement it themselves; - - [HugeGraph-Loader](/docs/quickstart/hugegraph-loader): HugeGraph-Loader is a data import tool based on HugeGraph-Client, which transforms ordinary text data into vertices and edges of the graph and inserts them into the graph database; - - [HugeGraph-Hubble](/docs/quickstart/hugegraph-hubble): HugeGraph-Hubble is HugeGraph's Web - visualization management platform, a one-stop visualization analysis platform, the platform covers the whole process from data modeling, to fast data import, to online and offline analysis of data, and unified management of the graph; - - [HugeGraph-Tools](/docs/quickstart/hugegraph-tools): HugeGraph-Tools is HugeGraph's deployment and management tool, including graph management, backup/recovery, Gremlin execution and other functions. -- [HugeGraph-Computer](/docs/quickstart/hugegraph-computer): HugeGraph-Computer is a distributed graph processing system (OLAP). - It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It can run on clusters such as Kubernetes/Yarn, and supports large-scale graph computing. -- [HugeGraph-AI](/docs/quickstart/hugegraph-ai): HugeGraph-AI is HugeGraph's independent AI - component, providing training and inference functions of graph neural networks, LLM/Graph RAG combination/Python-Client and other related components, continuously updating. +> **Troubleshooting:** If you see `TOCSS: failed to transform "scss/main.scss"`, +> install Hugo **Extended** version, not the standard version. -## Contributing +### Contributing -- Welcome to contribute to HugeGraph, please see [How to Contribute](https://hugegraph.apache.org/docs/contribution-guidelines/contribute/) for more information. -- Note: It's recommended to use [GitHub Desktop](https://desktop.github.com/) to greatly simplify the PR and commit process. -- Thank you to all the people who already contributed to HugeGraph! +1. **Fork** this repository +2. Create a **new branch** from `master` +3. Make your changes +4. 
Submit a **Pull Request** with screenshots -[![contributors graph](https://contrib.rocks/image?repo=apache/hugegraph-doc)](https://github.com/apache/incubator-hugegraph-doc/graphs/contributors) +**Requirements:** +- Update **BOTH** `content/cn/` and `content/en/` +- Include **before/after screenshots** in PR +- Use Markdown with Hugo front matter -### Contact Us +See [contribution.md](./contribution.md) for detailed instructions. --- -- [GitHub Issues](https://github.com/apache/incubator-hugegraph-doc/issues): Feedback on usage issues and functional requirements (quick response) -- Feedback Email: [dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org) ([subscriber](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/) only) -- Security Email: [security@hugegraph.apache.org](mailto:security@hugegraph.apache.org) (Report SEC problems) -- Slack: [ASF Online Channel](https://the-asf.slack.com/archives/C059UU2FJ23) -- WeChat public account: Apache HugeGraph, welcome to scan this QR code to follow us. +## License - QR png +[Apache License 2.0](LICENSE) diff --git a/contribution.md b/contribution.md index 9b59ab136..d736b79b9 100644 --- a/contribution.md +++ b/contribution.md @@ -1,4 +1,19 @@ -# How to help us (如何参与) +# Contribution Guide - Detailed Reference + +> **快速开始请看 [README.md](./README.md)**,这里是详细的参考文档。 + +## PR 检查清单 + +提交 Pull Request 前请确认: + +- [ ] 本地构建并验证了修改效果 +- [ ] 同时更新了中文 (`content/cn/`) 和英文 (`content/en/`) 版本 +- [ ] PR 描述中包含修改前后的截图对比 +- [ ] 如有相关 Issue,已在 PR 中关联 + +--- + +## How to help us (如何参与) 1. 在本地 3 步快速构建官网环境,启动起来看下目前效果 (Auto reload) 2. 先 fork 仓库,然后基于 `master` 创建一个**新的**分支,修改完成后提交 PR ✅ (请在 PR 内**截图**对比一下修改**前后**的效果 & 简要说明,感谢) From b1a48cc53ae7d742f00867c3bf41aacdfdfcdb86 Mon Sep 17 00:00:00 2001 From: imbajin Date: Sat, 31 Jan 2026 21:51:16 +0800 Subject: [PATCH 02/10] Update README.md --- README.md | 161 ++++++++++++++++++++++++++++++++++-------------------- 1 file changed, 103 insertions(+), 58 deletions(-) diff --git a/README.md b/README.md index 45ec41a7e..8f19005ba 100644 --- a/README.md +++ b/README.md @@ -6,13 +6,100 @@ --- -**中文** | [English](#english-version) +[中文](#中文版) | **English** + +This is the **source code repository** for the [HugeGraph documentation website](https://hugegraph.apache.org/docs/). + +For the HugeGraph database project, visit [apache/hugegraph](https://github.com/apache/hugegraph). + +## Quick Start + +Only **3 steps** to run the documentation website locally: + +**Prerequisites:** [Hugo Extended](https://github.com/gohugoio/hugo/releases) v0.95+ and Node.js v16+ + +```bash +# 1. Clone repository +git clone https://github.com/apache/hugegraph-doc.git +cd hugegraph-doc + +# 2. Install dependencies +npm install + +# 3. Start development server (auto-reload) +hugo server +``` + +Open http://localhost:1313 to preview. + +> **Troubleshooting:** If you see `TOCSS: failed to transform "scss/main.scss"`, +> install Hugo **Extended** version, not the standard version. + +## Repository Structure + +``` +hugegraph-doc/ +├── content/ # 📄 Documentation content (Markdown) +│ ├── cn/ # 🇨🇳 Chinese documentation +│ │ ├── docs/ # Main documentation +│ │ │ ├── quickstart/ # Quick start guides +│ │ │ ├── config/ # Configuration docs +│ │ │ ├── clients/ # Client docs +│ │ │ ├── guides/ # User guides +│ │ │ └── ... 
+│ │ ├── blog/ # Blog posts +│ │ └── community/ # Community pages +│ └── en/ # 🇺🇸 English documentation (mirrors cn/ structure) +│ +├── themes/docsy/ # 🎨 Docsy theme (git submodule) +├── assets/ # 🖼️ Custom assets (fonts, images, scss) +├── layouts/ # 📐 Hugo template overrides +├── static/ # 📁 Static files +├── config.toml # ⚙️ Site configuration +└── package.json # 📦 Node.js dependencies +``` + +## Contributing + +### Contribution Workflow + +1. **Fork** this repository +2. Create a **new branch** from `master` +3. Make your changes +4. Submit a **Pull Request** with screenshots + +### Requirements + +| Requirement | Description | +|-------------|-------------| +| **Bilingual Updates** | Update **BOTH** `content/cn/` and `content/en/` | +| **PR Screenshots** | Include **before/after screenshots** in PR | +| **Markdown** | Use Markdown with Hugo front matter | + +### Detailed Guide + +See [contribution.md](./contribution.md) for: +- Platform-specific Hugo installation +- Docsy theme customization +- Translation tips + +## Commands + +| Command | Description | +|---------|-------------| +| `hugo server` | Start dev server (hot reload) | +| `hugo --minify` | Build production to `./public/` | +| `hugo server -p 8080` | Custom port | + +--- + +## 中文版 这是 [HugeGraph 官方文档网站](https://hugegraph.apache.org/docs/) 的**源代码仓库**。 如果你想查找 HugeGraph 数据库本身,请访问 [apache/hugegraph](https://github.com/apache/hugegraph)。 -## 快速开始 +### 快速开始 只需 **3 步**即可在本地启动文档网站: @@ -35,7 +122,7 @@ hugo server > **常见问题:** 如果遇到 `TOCSS: failed to transform "scss/main.scss"` 错误, > 说明你需要安装 Hugo **Extended** 版本,而不是标准版本。 -## 仓库结构 +### 仓库结构 ``` hugegraph-doc/ @@ -59,16 +146,16 @@ hugegraph-doc/ └── package.json # 📦 Node.js 依赖 ``` -## 如何贡献 +### 如何贡献 -### 贡献流程 +#### 贡献流程 1. **Fork** 本仓库 2. 基于 `master` 创建**新分支** 3. 修改文档内容 4. 提交 **Pull Request**(附截图) -### 重要说明 +#### 重要说明 | 要求 | 说明 | |------|------| @@ -76,14 +163,14 @@ hugegraph-doc/ | **PR 截图** | 提交 PR 时需附上修改**前后对比截图** | | **Markdown** | 文档使用 Markdown 格式,带 Hugo front matter | -### 详细指南 +#### 详细指南 查看 [contribution.md](./contribution.md) 了解: - 各平台 Hugo 安装方法 - Docsy 主题定制 - 翻译技巧 -## 常用命令 +### 常用命令 | 命令 | 说明 | |------|------| @@ -91,63 +178,21 @@ hugegraph-doc/ | `hugo --minify` | 构建生产版本到 `./public/` | | `hugo server -p 8080` | 指定端口 | -## 联系我们 - -- **问题反馈:** [GitHub Issues](https://github.com/apache/hugegraph-doc/issues) -- **邮件列表:** [dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org)([需先订阅](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/)) -- **Slack:** [ASF Slack](https://the-asf.slack.com/archives/C059UU2FJ23) -- **微信公众号:** Apache HugeGraph - -WeChat QR Code - -### 贡献者 - -感谢所有为 HugeGraph 文档做出贡献的人! - -[![contributors](https://contrib.rocks/image?repo=apache/hugegraph-doc)](https://github.com/apache/hugegraph-doc/graphs/contributors) - --- -## English Version - -This is the **source code repository** for the [HugeGraph documentation website](https://hugegraph.apache.org/docs/). - -For the HugeGraph database project, visit [apache/hugegraph](https://github.com/apache/hugegraph). - -### Quick Start - -**Prerequisites:** [Hugo Extended](https://github.com/gohugoio/hugo/releases) v0.95+ and Node.js v16+ - -```bash -# 1. Clone repository -git clone https://github.com/apache/hugegraph-doc.git -cd hugegraph-doc +## Contact & Community -# 2. 
Install dependencies -npm install +- **Issues:** [GitHub Issues](https://github.com/apache/hugegraph-doc/issues) +- **Mailing List:** [dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org) ([subscribe first](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/)) +- **Slack:** [ASF Slack](https://the-asf.slack.com/archives/C059UU2FJ23) -# 3. Start development server (auto-reload) -hugo server -``` - -Open http://localhost:1313 to preview. - -> **Troubleshooting:** If you see `TOCSS: failed to transform "scss/main.scss"`, -> install Hugo **Extended** version, not the standard version. - -### Contributing +WeChat QR Code -1. **Fork** this repository -2. Create a **new branch** from `master` -3. Make your changes -4. Submit a **Pull Request** with screenshots +## Contributors -**Requirements:** -- Update **BOTH** `content/cn/` and `content/en/` -- Include **before/after screenshots** in PR -- Use Markdown with Hugo front matter +Thanks to all contributors to the HugeGraph documentation! -See [contribution.md](./contribution.md) for detailed instructions. +[![contributors](https://contrib.rocks/image?repo=apache/hugegraph-doc)](https://github.com/apache/hugegraph-doc/graphs/contributors) --- From b12fe33a43cc6b492e60c26eaa401bde2d84ebf3 Mon Sep 17 00:00:00 2001 From: imbajin Date: Sat, 31 Jan 2026 23:35:44 +0800 Subject: [PATCH 03/10] server: update docs and configs for HugeGraph 1.7.0 Bump documentation site version and update docs/configs to reflect HugeGraph 1.7.0 changes: update config.toml version to 1.7; add Version Change notices for Auth REST API in CN/EN; revise HugeGraph-Server quickstart (docker images, download/toolchain URLs) and add deprecation warnings for removed legacy backends (MySQL, PostgreSQL, Cassandra, ScyllaDB) in favor of 1.7 backends (RocksDB, HStore, HBase, Memory). Update default rest-server config docs: increase batch.max_* defaults, raise batch.max_write_ratio, set exception.allow_trace true, add log.slow_query_threshold, and add K8s / PD/Meta / Arthas configuration option sections. --- config.toml | 2 +- content/cn/docs/clients/restful-api/auth.md | 4 ++ content/cn/docs/config/config-option.md | 42 ++++++++++++++++--- .../quickstart/hugegraph/hugegraph-server.md | 32 +++++++++----- content/en/docs/clients/restful-api/auth.md | 4 ++ content/en/docs/config/config-option.md | 42 ++++++++++++++++--- .../quickstart/hugegraph/hugegraph-server.md | 32 +++++++++----- 7 files changed, 123 insertions(+), 35 deletions(-) diff --git a/config.toml b/config.toml index 493c1462b..f37873d09 100644 --- a/config.toml +++ b/config.toml @@ -152,7 +152,7 @@ archived_version = false # The version number for the version of the docs represented in this doc set. # Used in the "version-banner" partial to display a version number for the # current doc set. -version = "0.13" +version = "1.7" # A link to latest version of the docs. Used in the "version-banner" partial to # point people to the main doc site. 
diff --git a/content/cn/docs/clients/restful-api/auth.md b/content/cn/docs/clients/restful-api/auth.md index 606b4e5c0..6c3c086aa 100644 --- a/content/cn/docs/clients/restful-api/auth.md +++ b/content/cn/docs/clients/restful-api/auth.md @@ -4,6 +4,10 @@ linkTitle: "Authentication" weight: 16 --- +> **版本变更说明**: +> - 1.7.0+: Auth API 路径使用 GraphSpace 格式,如 `/graphspaces/DEFAULT/auth/users`,且 group/target 等 id 格式与 name 一致(如 `admin`) +> - 1.5.x 及更早: Auth API 路径包含 graph 名称,group/target 等 id 格式类似 `-69:grant`。参考 [HugeGraph 1.5.x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) + ### 10.1 用户认证与权限控制 > 开启权限及相关配置请先参考 [权限配置](/cn/docs/config/config-authentication/) 文档 diff --git a/content/cn/docs/config/config-option.md b/content/cn/docs/config/config-option.md index 0bf56af8b..238ac8f50 100644 --- a/content/cn/docs/config/config-option.md +++ b/content/cn/docs/config/config-option.md @@ -37,9 +37,9 @@ weight: 2 | gremlinserver.url | http://127.0.0.1:8182 | The url of gremlin server. | | gremlinserver.max_route | 8 | The max route number for gremlin server. | | gremlinserver.timeout | 30 | The timeout in seconds of waiting for gremlin server. | -| batch.max_edges_per_batch | 500 | The maximum number of edges submitted per batch. | -| batch.max_vertices_per_batch | 500 | The maximum number of vertices submitted per batch. | -| batch.max_write_ratio | 50 | The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0. | +| batch.max_edges_per_batch | 2500 | The maximum number of edges submitted per batch. | +| batch.max_vertices_per_batch | 2500 | The maximum number of vertices submitted per batch. | +| batch.max_write_ratio | 70 | The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0. | | batch.max_write_threads | 0 | The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads. | | auth.authenticator | | The class path of authenticator implementation. e.g., org.apache.hugegraph.auth.StandardAuthenticator, or a custom implementation. | | auth.graph_store | hugegraph | The name of graph used to store authentication information, like users, only for org.apache.hugegraph.auth.StandardAuthenticator. | @@ -49,9 +49,39 @@ weight: 2 | auth.remote_url | | If the address is empty, it provide auth service, otherwise it is auth client and also provide auth service through rpc forwarding. The remote url can be set to multiple addresses, which are concat by ','. | | auth.token_expire | 86400 | The expiration time in seconds after token created | | auth.token_secret | FXQXbJtbCLxODc6tGci732pkH1cyf8Qg | Secret key of HS256 algorithm. | -| exception.allow_trace | false | Whether to allow exception trace stack. | -| memory_monitor.threshold | 0.85 | The threshold of JVM(in-heap) memory usage monitoring , 1 means disabling this function. | +| exception.allow_trace | true | Whether to allow exception trace stack. | +| memory_monitor.threshold | 0.85 | The threshold of JVM(in-heap) memory usage monitoring , 1 means disabling this function. | | memory_monitor.period | 2000 | The period in ms of JVM(in-heap) memory usage monitoring. | +| log.slow_query_threshold | 1000 | Slow query log threshold in milliseconds, 0 means disabled. 
| + +### K8s 配置项 (可选) + +对应配置文件`rest-server.properties` + +| config option | default value | description | +|------------------|-------------------------------|------------------------------------------| +| server.use_k8s | false | Whether to enable K8s multi-tenancy mode. | +| k8s.namespace | hugegraph-computer-system | K8s namespace for compute jobs. | +| k8s.kubeconfig | | Path to kubeconfig file. | + +### PD/Meta 配置项 (分布式模式) + +对应配置文件`rest-server.properties` + +| config option | default value | description | +|------------------|------------------------|--------------------------------------------| +| pd.peers | 127.0.0.1:8686 | PD server addresses (comma separated). | +| meta.endpoints | http://127.0.0.1:2379 | Meta service endpoints. | + +### Arthas 诊断配置项 (可选) + +对应配置文件`rest-server.properties` + +| config option | default value | description | +|--------------------|---------------|-----------------------| +| arthas.telnetPort | 8562 | Arthas telnet port. | +| arthas.httpPort | 8561 | Arthas HTTP port. | +| arthas.ip | 0.0.0.0 | Arthas bind IP. | ### 基本配置项 @@ -60,7 +90,7 @@ weight: 2 | config option | default value | description | |---------------------------------------|----------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | gremlin.graph | org.apache.hugegraph.HugeFactory | Gremlin entrance to create graph. | -| backend | rocksdb | The data store type, available values are [memory, rocksdb, cassandra, scylladb, hbase, mysql]. | +| backend | rocksdb | The data store type. For version 1.7.0+: [memory, rocksdb, hstore, hbase]. Note: cassandra, scylladb, mysql, postgresql were removed in 1.7.0 (use <= 1.5.x for legacy backends). | | serializer | binary | The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql]. | | store | hugegraph | The database name like Cassandra Keyspace. | | store.connection_detect_interval | 600 | The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time. 
| diff --git a/content/cn/docs/quickstart/hugegraph/hugegraph-server.md b/content/cn/docs/quickstart/hugegraph/hugegraph-server.md index d1deef244..b9daaa5f2 100644 --- a/content/cn/docs/quickstart/hugegraph/hugegraph-server.md +++ b/content/cn/docs/quickstart/hugegraph/hugegraph-server.md @@ -8,7 +8,9 @@ weight: 1 HugeGraph-Server 是 HugeGraph 项目的核心部分,包含 graph-core、backend、API 等子模块。 -Core 模块是 Tinkerpop 接口的实现,Backend 模块用于管理数据存储,目前支持的后端包括:Memory、Cassandra、ScyllaDB 以及 RocksDB,API 模块提供 HTTP Server,将 Client 的 HTTP 请求转化为对 Core 的调用。 +Core 模块是 Tinkerpop 接口的实现,Backend 模块用于管理数据存储,1.7.0+ 版本支持的后端包括:RocksDB(单机默认)、HStore(分布式)、HBase 和 Memory。API 模块提供 HTTP Server,将 Client 的 HTTP 请求转化为对 Core 的调用。 + +> ⚠️ **重要变更**: 从 1.7.0 版本开始,MySQL、PostgreSQL、Cassandra、ScyllaDB 等遗留后端已被移除。如需使用这些后端,请使用 1.5.x 或更早版本。 > 文档中会出现 `HugeGraph-Server` 及 `HugeGraphServer` 这两种写法,其他组件也类似。 > 这两种写法含义上并明显差异,可以这么区分:`HugeGraph-Server` 表示服务端相关组件代码,`HugeGraphServer` 表示服务进程。 @@ -39,12 +41,12 @@ Core 模块是 Tinkerpop 接口的实现,Backend 模块用于管理数据存 可参考 [Docker 部署方式](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/README.md)。 -我们可以使用 `docker run -itd --name=server -p 8080:8080 -e PASSWORD=xxx hugegraph/hugegraph:1.5.0` 去快速启动一个内置了 `RocksDB` 的 `Hugegraph server`. +我们可以使用 `docker run -itd --name=server -p 8080:8080 -e PASSWORD=xxx hugegraph/hugegraph:1.7.0` 去快速启动一个内置了 `RocksDB` 的 `Hugegraph server`. 可选项: 1. 可以使用 `docker exec -it server bash` 进入容器完成一些操作 -2. 可以使用 `docker run -itd --name=server -p 8080:8080 -e PRELOAD="true" hugegraph/hugegraph:1.5.0` 在启动的时候预加载一个**内置的**样例图。可以通过 `RESTful API` 进行验证。具体步骤可以参考 [5.1.9](#519-%E5%90%AF%E5%8A%A8-server-%E7%9A%84%E6%97%B6%E5%80%99%E5%88%9B%E5%BB%BA%E7%A4%BA%E4%BE%8B%E5%9B%BE) +2. 可以使用 `docker run -itd --name=server -p 8080:8080 -e PRELOAD="true" hugegraph/hugegraph:1.7.0` 在启动的时候预加载一个**内置的**样例图。可以通过 `RESTful API` 进行验证。具体步骤可以参考 [5.1.9](#519-%E5%90%AF%E5%8A%A8-server-%E7%9A%84%E6%97%B6%E5%80%99%E5%88%9B%E5%BB%BA%E7%A4%BA%E4%BE%8B%E5%9B%BE) 3. 可以使用 `-e PASSWORD=xxx` 设置是否开启鉴权模式以及 admin 的密码,具体步骤可以参考 [Config Authentication](/cn/docs/config/config-authentication#使用-docker-时开启鉴权模式) 如果使用 docker desktop,则可以按照如下的方式设置可选项: @@ -59,7 +61,7 @@ Core 模块是 Tinkerpop 接口的实现,Backend 模块用于管理数据存 version: '3' services: server: - image: hugegraph/hugegraph:1.5.0 + image: hugegraph/hugegraph:1.7.0 container_name: server environment: - PASSWORD=xxx @@ -74,12 +76,12 @@ services: > > 1. hugegraph 的 docker 镜像是一个便捷版本,用于快速启动 hugegraph,并不是**官方发布物料包方式**。你可以从 [ASF Release Distribution Policy](https://infra.apache.org/release-distribution.html#dockerhub) 中得到更多细节。 > -> 2. 推荐使用 `release tag` (如 `1.5.0/1.x.0`) 以获取稳定版。使用 `latest` tag 可以使用开发中的最新功能。 +> 2. 
推荐使用 `release tag` (如 `1.7.0/1.x.0`) 以获取稳定版。使用 `latest` tag 可以使用开发中的最新功能。 #### 3.2 下载 tar 包 ```bash -# use the latest version, here is 1.5.0 for example +# use the latest version, here is 1.7.0 for example wget https://downloads.apache.org/incubator/hugegraph/{version}/apache-hugegraph-incubating-{version}.tar.gz tar zxf *hugegraph*.tar.gz ``` @@ -138,11 +140,11 @@ mvn package -DskipTests HugeGraph-Tools 提供了一键部署的命令行工具,用户可以使用该工具快速地一键下载、解压、配置并启动 HugeGraph-Server 和 HugeGraph-Hubble,最新的 HugeGraph-Toolchain 中已经包含所有的这些工具,直接下载它解压就有工具包集合了 ```bash -# download toolchain package, it includes loader + tool + hubble, please check the latest version (here is 1.5.0) -wget https://downloads.apache.org/incubator/hugegraph/1.5.0/apache-hugegraph-toolchain-incubating-1.5.0.tar.gz +# download toolchain package, it includes loader + tool + hubble, please check the latest version (here is 1.7.0) +wget https://downloads.apache.org/incubator/hugegraph/1.7.0/apache-hugegraph-toolchain-incubating-1.7.0.tar.gz tar zxf *hugegraph-*.tar.gz # enter the tool's package -cd *hugegraph*/*tool* +cd *hugegraph*/*tool* ``` > 注:`${version}` 为版本号,最新版本号可参考 [Download 页面](/docs/download/download),或直接从 Download 页面点击链接下载 @@ -387,6 +389,8 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK ##### 5.1.4 MySQL +> ⚠️ **已废弃**: 此后端从 HugeGraph 1.7.0 版本开始已移除。如需使用,请参考 1.5.x 版本文档。 +
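+如需迁移到 1.7.0+ 仍支持的后端,可参考如下最小配置示意(以单机默认的 RocksDB 为例;配置文件位置如 `conf/graphs/hugegraph.properties` 与数据路径仅作示例,请按实际部署调整):
+
+```properties
+# 使用 RocksDB 作为存储后端(1.7.0+ 单机默认)
+backend=rocksdb
+serializer=binary
+# 数据与 WAL 存放路径(示例路径,请按需修改)
+rocksdb.data_path=./rocksdb-data
+rocksdb.wal_path=./rocksdb-data
+```
+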
点击展开/折叠 MySQL 配置及启动方法 @@ -431,6 +435,8 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK ##### 5.1.5 Cassandra +> ⚠️ **已废弃**: 此后端从 HugeGraph 1.7.0 版本开始已移除。如需使用,请参考 1.5.x 版本文档。 +
点击展开/折叠 Cassandra 配置及启动方法 @@ -516,6 +522,8 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK ##### 5.1.7 ScyllaDB +> ⚠️ **已废弃**: 此后端从 HugeGraph 1.7.0 版本开始已移除。如需使用,请参考 1.5.x 版本文档。 +
点击展开/折叠 ScyllaDB 配置及启动方法 @@ -584,6 +592,8 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)......OK ##### 5.2.1 使用 Cassandra 作为后端 +> ⚠️ **已废弃**: Cassandra 后端从 HugeGraph 1.7.0 版本开始已移除。如需使用,请参考 1.5.x 版本文档。 +
点击展开/折叠 Cassandra 配置及启动方法 @@ -652,7 +662,7 @@ volumes: 1. 使用`docker run` - 使用 `docker run -itd --name=server -p 8080:8080 -e PRELOAD=true hugegraph/hugegraph:1.5.0` + 使用 `docker run -itd --name=server -p 8080:8080 -e PRELOAD=true hugegraph/hugegraph:1.7.0` 2. 使用`docker-compose` @@ -662,7 +672,7 @@ volumes: version: '3' services: server: - image: hugegraph/hugegraph:1.5.0 + image: hugegraph/hugegraph:1.7.0 container_name: server environment: - PRELOAD=true diff --git a/content/en/docs/clients/restful-api/auth.md b/content/en/docs/clients/restful-api/auth.md index e90b84089..4d87c90f1 100644 --- a/content/en/docs/clients/restful-api/auth.md +++ b/content/en/docs/clients/restful-api/auth.md @@ -4,6 +4,10 @@ linkTitle: "Authentication" weight: 16 --- +> **Version Change Notice**: +> - 1.7.0+: Auth API paths use GraphSpace format, such as `/graphspaces/DEFAULT/auth/users`, and group/target IDs match their names (e.g., `admin`) +> - 1.5.x and earlier: Auth API paths include graph name, and group/target IDs use format like `-69:grant`. See [HugeGraph 1.5.x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) + ### 10.1 User Authentication and Access Control > To enable authentication and related configurations, please refer to the [Authentication Configuration](/docs/config/config-authentication/) documentation. diff --git a/content/en/docs/config/config-option.md b/content/en/docs/config/config-option.md index b0d937b66..bfba7d97e 100644 --- a/content/en/docs/config/config-option.md +++ b/content/en/docs/config/config-option.md @@ -37,9 +37,9 @@ Corresponding configuration file `rest-server.properties` | gremlinserver.url | http://127.0.0.1:8182 | The url of gremlin server. | | gremlinserver.max_route | 8 | The max route number for gremlin server. | | gremlinserver.timeout | 30 | The timeout in seconds of waiting for gremlin server. | -| batch.max_edges_per_batch | 500 | The maximum number of edges submitted per batch. | -| batch.max_vertices_per_batch | 500 | The maximum number of vertices submitted per batch. | -| batch.max_write_ratio | 50 | The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0. | +| batch.max_edges_per_batch | 2500 | The maximum number of edges submitted per batch. | +| batch.max_vertices_per_batch | 2500 | The maximum number of vertices submitted per batch. | +| batch.max_write_ratio | 70 | The maximum thread ratio for batch writing, only take effect if the batch.max_write_threads is 0. | | batch.max_write_threads | 0 | The maximum threads for batch writing, if the value is 0, the actual value will be set to batch.max_write_ratio * restserver.max_worker_threads. | | auth.authenticator | | The class path of authenticator implementation. e.g., org.apache.hugegraph.auth.StandardAuthenticator, or a custom implementation. | | auth.graph_store | hugegraph | The name of graph used to store authentication information, like users, only for org.apache.hugegraph.auth.StandardAuthenticator. | @@ -49,9 +49,39 @@ Corresponding configuration file `rest-server.properties` | auth.remote_url | | If the address is empty, it provide auth service, otherwise it is auth client and also provide auth service through rpc forwarding. The remote url can be set to multiple addresses, which are concat by ','. | | auth.token_expire | 86400 | The expiration time in seconds after token created | | auth.token_secret | FXQXbJtbCLxODc6tGci732pkH1cyf8Qg | Secret key of HS256 algorithm. 
| -| exception.allow_trace | false | Whether to allow exception trace stack. | -| memory_monitor.threshold | 0.85 | The threshold of JVM(in-heap) memory usage monitoring , 1 means disabling this function. | +| exception.allow_trace | true | Whether to allow exception trace stack. | +| memory_monitor.threshold | 0.85 | The threshold of JVM(in-heap) memory usage monitoring , 1 means disabling this function. | | memory_monitor.period | 2000 | The period in ms of JVM(in-heap) memory usage monitoring. | +| log.slow_query_threshold | 1000 | Slow query log threshold in milliseconds, 0 means disabled. | + +### K8s Config Options (Optional) + +Corresponding configuration file `rest-server.properties` + +| config option | default value | description | +|------------------|-------------------------------|------------------------------------------| +| server.use_k8s | false | Whether to enable K8s multi-tenancy mode. | +| k8s.namespace | hugegraph-computer-system | K8s namespace for compute jobs. | +| k8s.kubeconfig | | Path to kubeconfig file. | + +### PD/Meta Config Options (Distributed Mode) + +Corresponding configuration file `rest-server.properties` + +| config option | default value | description | +|------------------|------------------------|--------------------------------------------| +| pd.peers | 127.0.0.1:8686 | PD server addresses (comma separated). | +| meta.endpoints | http://127.0.0.1:2379 | Meta service endpoints. | + +### Arthas Diagnostic Config Options (Optional) + +Corresponding configuration file `rest-server.properties` + +| config option | default value | description | +|--------------------|---------------|-----------------------| +| arthas.telnetPort | 8562 | Arthas telnet port. | +| arthas.httpPort | 8561 | Arthas HTTP port. | +| arthas.ip | 0.0.0.0 | Arthas bind IP. | ### Basic Config Options @@ -60,7 +90,7 @@ Basic Config Options and Backend Config Options correspond to configuration file | config option | default value | description | |---------------------------------------|----------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | gremlin.graph | org.apache.hugegraph.HugeFactory | Gremlin entrance to create graph. | -| backend | rocksdb | The data store type, available values are [memory, rocksdb, cassandra, scylladb, hbase, mysql]. | +| backend | rocksdb | The data store type. For version 1.7.0+: [memory, rocksdb, hstore, hbase]. Note: cassandra, scylladb, mysql, postgresql were removed in 1.7.0 (use <= 1.5.x for legacy backends). | | serializer | binary | The serializer for backend store, available values are [text, binary, cassandra, hbase, mysql]. | | store | hugegraph | The database name like Cassandra Keyspace. | | store.connection_detect_interval | 600 | The interval in seconds for detecting connections, if the idle time of a connection exceeds this value, detect it and reconnect if needed before using, value 0 means detecting every time. 
|
diff --git a/content/en/docs/quickstart/hugegraph/hugegraph-server.md b/content/en/docs/quickstart/hugegraph/hugegraph-server.md
index 2db777070..06ebc9e86 100644
--- a/content/en/docs/quickstart/hugegraph/hugegraph-server.md
+++ b/content/en/docs/quickstart/hugegraph/hugegraph-server.md
@@ -8,7 +8,9 @@ weight: 1
 
 `HugeGraph-Server` is the core part of the HugeGraph Project, contains submodules such as graph-core, backend, API.
 
-The Core Module is an implementation of the Tinkerpop interface; The Backend module is used to save the graph data to the data store, currently supported backends include: Memory, Cassandra, ScyllaDB, RocksDB; The API Module provides HTTP Server, which converts Client's HTTP request into a call to Core Module.
+The Core Module is an implementation of the Tinkerpop interface; the Backend Module is used to save the graph data to the data store. For version 1.7.0+, supported backends include: RocksDB (standalone default), HStore (distributed), HBase, and Memory. The API Module provides an HTTP Server that converts the client's HTTP requests into calls to the Core Module.
+
+> ⚠️ **Important Change**: Starting from version 1.7.0, legacy backends such as MySQL, PostgreSQL, Cassandra, and ScyllaDB have been removed. If you need to use these backends, please use version 1.5.x or earlier.
 
 > There will be two spellings HugeGraph-Server and HugeGraphServer in the document, and other
 > modules are similar. There is no big difference in the meaning of these two ways,
@@ -42,11 +44,11 @@ There are four ways to deploy HugeGraph-Server components:
 
 You can refer to the [Docker deployment guide](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/README.md).
 
-We can use `docker run -itd --name=server -p 8080:8080 -e PASSWORD=xxx hugegraph/hugegraph:1.5.0` to quickly start a `HugeGraph Server` with a built-in `RocksDB` backend.
+We can use `docker run -itd --name=server -p 8080:8080 -e PASSWORD=xxx hugegraph/hugegraph:1.7.0` to quickly start a `HugeGraph Server` with a built-in `RocksDB` backend.
 
 Optional:
-1. use `docker exec -it graph bash` to enter the container to do some operations.
-2. use `docker run -itd --name=graph -p 8080:8080 -e PRELOAD="true" hugegraph/hugegraph:1.5.0` to start with a **built-in** example graph. We can use `RESTful API` to verify the result. The detailed step can refer to [5.1.8](#518-create-an-example-graph-when-startup)
+1. use `docker exec -it server bash` to enter the container to do some operations.
+2. use `docker run -itd --name=server -p 8080:8080 -e PRELOAD="true" hugegraph/hugegraph:1.7.0` to start with a **built-in** example graph. We can use `RESTful API` to verify the result. The detailed step can refer to [5.1.8](#518-create-an-example-graph-when-startup)
 3. use `-e PASSWORD=xxx` to enable auth mode and set the password for admin. You can find more details from [Config Authentication](/docs/config/config-authentication#use-docker-to-enable-authentication-mode)
 
 If you use docker desktop, you can set the option like:
@@ -60,7 +62,7 @@ Also, if we want to manage the other Hugegraph related instances in one file, we
 version: '3'
 services:
   server:
-    image: hugegraph/hugegraph:1.5.0
+    image: hugegraph/hugegraph:1.7.0
     container_name: server
     environment:
       - PASSWORD=xxx
@@ -75,13 +77,13 @@ services:
 >
 > 1. The docker image of the hugegraph is a convenient release to start it quickly, but not **official distribution** artifacts. You can find more details from [ASF Release Distribution Policy](https://infra.apache.org/release-distribution.html#dockerhub).
 >
-> 2. Recommend to use `release tag` (like `1.5.0`/`1.x.0`) for the stable version. Use `latest` tag to experience the newest functions in development.
+> 2. It's recommended to use a `release tag` (like `1.7.0`/`1.x.0`) for the stable version. Use the `latest` tag to experience the newest features in development.
 
 #### 3.2 Download the binary tar tarball
 
 You could download the binary tarball from the download page of the ASF site like this:
 ```bash
-# use the latest version, here is 1.5.0 for example
+# use the latest version, here is 1.7.0 for example
 wget https://downloads.apache.org/incubator/hugegraph/{version}/apache-hugegraph-incubating-{version}.tar.gz
 
 tar zxf *hugegraph*.tar.gz
@@ -156,8 +158,8 @@ Of course, you should download the tarball of `HugeGraph-Toolchain` first.
 
 ```bash
 # download toolchain binary package, it includes loader + tool + hubble
-# please check the latest version (e.g. here is 1.5.0)
-wget https://downloads.apache.org/incubator/hugegraph/1.5.0/apache-hugegraph-toolchain-incubating-1.5.0.tar.gz
+# please check the latest version (e.g. here is 1.7.0)
+wget https://downloads.apache.org/incubator/hugegraph/1.7.0/apache-hugegraph-toolchain-incubating-1.7.0.tar.gz
 tar zxf *hugegraph-*.tar.gz
 
 # enter the tool's package
@@ -384,6 +386,8 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK
 
 ##### 5.1.4 Cassandra
 
+> ⚠️ **Deprecated**: This backend has been removed starting from HugeGraph 1.7.0. If you need to use it, please refer to version 1.5.x documentation.
+
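+To migrate to a backend that is still supported in 1.7.0+, a minimal config sketch (using RocksDB, the standalone default; the file location, e.g. `conf/graphs/hugegraph.properties`, and the data paths below are illustrative) is:
+
+```properties
+# Use RocksDB as the storage backend (standalone default in 1.7.0+)
+backend=rocksdb
+serializer=binary
+# Data and WAL paths (sample paths, adjust for your deployment)
+rocksdb.data_path=./rocksdb-data
+rocksdb.wal_path=./rocksdb-data
+```
+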
Click to expand/collapse Cassandra configuration and startup methods @@ -444,6 +448,8 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK ##### 5.1.5 ScyllaDB +> ⚠️ **Deprecated**: This backend has been removed starting from HugeGraph 1.7.0. If you need to use it, please refer to version 1.5.x documentation. +
Click to expand/collapse ScyllaDB configuration and startup methods @@ -530,6 +536,8 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)....OK ##### 5.1.7 MySQL +> ⚠️ **Deprecated**: This backend has been removed starting from HugeGraph 1.7.0. If you need to use it, please refer to version 1.5.x documentation. +
Click to expand/collapse MySQL configuration and startup methods @@ -600,6 +608,8 @@ In [3.1 Use Docker container](#31-use-docker-container-convenient-for-testdev), ##### 5.2.1 Uses Cassandra as storage +> ⚠️ **Deprecated**: Cassandra backend has been removed starting from HugeGraph 1.7.0. If you need to use it, please refer to version 1.5.x documentation. +
Click to expand/collapse Cassandra configuration and startup methods @@ -668,7 +678,7 @@ Set the environment variable `PRELOAD=true` when starting Docker to load data du 1. Use `docker run` - Use `docker run -itd --name=server -p 8080:8080 -e PRELOAD=true hugegraph/hugegraph:1.5.0` + Use `docker run -itd --name=server -p 8080:8080 -e PRELOAD=true hugegraph/hugegraph:1.7.0` 2. Use `docker-compose` @@ -678,7 +688,7 @@ Set the environment variable `PRELOAD=true` when starting Docker to load data du version: '3' services: server: - image: hugegraph/hugegraph:1.5.0 + image: hugegraph/hugegraph:1.7.0 container_name: server environment: - PRELOAD=true From 0352d0a5fd9cecbf66e1de13934ae57f7b32ef97 Mon Sep 17 00:00:00 2001 From: imbajin Date: Sat, 31 Jan 2026 23:44:16 +0800 Subject: [PATCH 04/10] toolchain: add Spark connector and update Hubble/Tools Add HugeGraph-Spark-Connector quick start docs (English and Chinese). Add a Configuration section to HugeGraph-Hubble docs (server settings and Gremlin query limits) in both languages. Update hugegraph-tools docs (EN/CN) to document new graph commands (graph-create, graph-clone, graph-drop), authentication backup/restore (auth-backup, auth-restore), --thread-num option for relevant commands, and minor heading/usage adjustments. --- .../quickstart/toolchain/hugegraph-hubble.md | 23 +++ .../toolchain/hugegraph-spark-connector.md | 182 ++++++++++++++++++ .../quickstart/toolchain/hugegraph-tools.md | 42 +++- .../quickstart/toolchain/hugegraph-hubble.md | 23 +++ .../toolchain/hugegraph-spark-connector.md | 182 ++++++++++++++++++ .../quickstart/toolchain/hugegraph-tools.md | 60 ++++-- 6 files changed, 491 insertions(+), 21 deletions(-) create mode 100644 content/cn/docs/quickstart/toolchain/hugegraph-spark-connector.md create mode 100644 content/en/docs/quickstart/toolchain/hugegraph-spark-connector.md diff --git a/content/cn/docs/quickstart/toolchain/hugegraph-hubble.md b/content/cn/docs/quickstart/toolchain/hugegraph-hubble.md index 2167c0a72..f99847866 100644 --- a/content/cn/docs/quickstart/toolchain/hugegraph-hubble.md +++ b/content/cn/docs/quickstart/toolchain/hugegraph-hubble.md @@ -551,3 +551,26 @@ Hubble 上暂未提供可视化的 OLAP 算法执行,可调用 RESTful API 进
image
+ + +### 5 配置说明 + +HugeGraph-Hubble 可以通过 `conf/hugegraph-hubble.properties` 文件进行配置。 + +#### 5.1 服务器配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| `hubble.host` | `0.0.0.0` | Hubble 服务绑定的地址 | +| `hubble.port` | `8088` | Hubble 服务监听的端口 | + +#### 5.2 Gremlin 查询限制 + +这些设置控制查询结果限制,防止内存问题: + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| `gremlin.suffix_limit` | `250` | 查询后缀最大长度 | +| `gremlin.vertex_degree_limit` | `100` | 显示的最大顶点度数 | +| `gremlin.edges_total_limit` | `500` | 返回的最大边数 | +| `gremlin.batch_query_ids` | `100` | ID 批量查询大小 | diff --git a/content/cn/docs/quickstart/toolchain/hugegraph-spark-connector.md b/content/cn/docs/quickstart/toolchain/hugegraph-spark-connector.md new file mode 100644 index 000000000..13dec291b --- /dev/null +++ b/content/cn/docs/quickstart/toolchain/hugegraph-spark-connector.md @@ -0,0 +1,182 @@ +--- +title: "HugeGraph-Spark-Connector Quick Start" +linkTitle: "使用 Spark Connector 读写图数据" +weight: 4 +--- + +### 1 HugeGraph-Spark-Connector 概述 + +HugeGraph-Spark-Connector 是一个用于在 Spark 中以标准格式读写 HugeGraph 数据的连接器应用程序。 + +### 2 环境要求 + +- Java 8+ +- Maven 3.6+ +- Spark 3.x +- Scala 2.12 + +### 3 编译 + +#### 3.1 不执行测试的编译 + +```bash +mvn clean package -DskipTests +``` + +#### 3.2 执行默认测试的编译 + +```bash +mvn clean package +``` + +### 4 使用方法 + +首先在你的 pom.xml 中添加依赖: + +```xml + + org.apache.hugegraph + hugegraph-spark-connector + ${revision} + +``` + +#### 4.1 Schema 定义示例 + +假设我们有一个图,其 schema 定义如下: + +```groovy +schema.propertyKey("name").asText().ifNotExist().create() +schema.propertyKey("age").asInt().ifNotExist().create() +schema.propertyKey("city").asText().ifNotExist().create() +schema.propertyKey("weight").asDouble().ifNotExist().create() +schema.propertyKey("lang").asText().ifNotExist().create() +schema.propertyKey("date").asText().ifNotExist().create() +schema.propertyKey("price").asDouble().ifNotExist().create() + +schema.vertexLabel("person") + .properties("name", "age", "city") + .useCustomizeStringId() + .nullableKeys("age", "city") + .ifNotExist() + .create() + +schema.vertexLabel("software") + .properties("name", "lang", "price") + .primaryKeys("name") + .ifNotExist() + .create() + +schema.edgeLabel("knows") + .sourceLabel("person") + .targetLabel("person") + .properties("date", "weight") + .ifNotExist() + .create() + +schema.edgeLabel("created") + .sourceLabel("person") + .targetLabel("software") + .properties("date", "weight") + .ifNotExist() + .create() +``` + +#### 4.2 写入顶点数据(Scala) + +```scala +val df = sparkSession.createDataFrame(Seq( + Tuple3("marko", 29, "Beijing"), + Tuple3("vadas", 27, "HongKong"), + Tuple3("Josh", 32, "Beijing"), + Tuple3("peter", 35, "ShangHai"), + Tuple3("li,nary", 26, "Wu,han"), + Tuple3("Bob", 18, "HangZhou"), +)) toDF("name", "age", "city") + +df.show() + +df.write + .format("org.apache.hugegraph.spark.connector.DataSource") + .option("host", "127.0.0.1") + .option("port", "8080") + .option("graph", "hugegraph") + .option("data-type", "vertex") + .option("label", "person") + .option("id", "name") + .option("batch-size", 2) + .mode(SaveMode.Overwrite) + .save() +``` + +#### 4.3 写入边数据(Scala) + +```scala +val df = sparkSession.createDataFrame(Seq( + Tuple4("marko", "vadas", "20160110", 0.5), + Tuple4("peter", "Josh", "20230801", 1.0), + Tuple4("peter", "li,nary", "20130220", 2.0) +)).toDF("source", "target", "date", "weight") + +df.show() + +df.write + .format("org.apache.hugegraph.spark.connector.DataSource") + .option("host", "127.0.0.1") + .option("port", "8080") + .option("graph", "hugegraph") + .option("data-type", 
"edge") + .option("label", "knows") + .option("source-name", "source") + .option("target-name", "target") + .option("batch-size", 2) + .mode(SaveMode.Overwrite) + .save() +``` + +### 5 配置参数 + +#### 5.1 客户端配置 + +客户端配置用于配置 hugegraph-client。 + +| 参数 | 默认值 | 说明 | +|----------------------|------------|-------------------------------------------------------| +| `host` | `localhost` | HugeGraphServer 的地址 | +| `port` | `8080` | HugeGraphServer 的端口 | +| `graph` | `hugegraph` | 图空间名称 | +| `protocol` | `http` | 向服务器发送请求的协议,可选 `http` 或 `https` | +| `username` | `null` | 当 HugeGraphServer 开启权限认证时,当前图的用户名 | +| `token` | `null` | 当 HugeGraphServer 开启权限认证时,当前图的 token | +| `timeout` | `60` | 插入结果返回的超时时间(秒) | +| `max-conn` | `CPUS * 4` | HugeClient 与 HugeGraphServer 之间的最大 HTTP 连接数 | +| `max-conn-per-route` | `CPUS * 2` | HugeClient 与 HugeGraphServer 之间每个路由的最大 HTTP 连接数 | +| `trust-store-file` | `null` | 当请求协议为 https 时,客户端的证书文件路径 | +| `trust-store-token` | `null` | 当请求协议为 https 时,客户端的证书密码 | + +#### 5.2 图数据配置 + +图数据配置用于设置图空间的配置。 + +| 参数 | 默认值 | 说明 | +|-------------------|-------|----------------------------------------------------------------------------------------------------------------------------------------------------| +| `data-type` | | 图数据类型,必须是 `vertex` 或 `edge` | +| `label` | | 要导入的顶点/边数据所属的标签 | +| `id` | | 指定某一列作为顶点的 id 列。当顶点 id 策略为 CUSTOMIZE 时,必填;当 id 策略为 PRIMARY_KEY 时,必须为空 | +| `source-name` | | 选择输入源的某些列作为源顶点的 id 列。当源顶点的 id 策略为 CUSTOMIZE 时,必须指定某一列作为顶点的 id 列;当源顶点的 id 策略为 PRIMARY_KEY 时,必须指定一列或多列用于拼接生成顶点的 id,即无论使用哪种 id 策略,此项都是必填的 | +| `target-name` | | 指定某些列作为目标顶点的 id 列,与 source-name 类似 | +| `selected-fields` | | 选择某些列进行插入,其他未选择的列不插入,不能与 ignored-fields 同时存在 | +| `ignored-fields` | | 忽略某些列使其不参与插入,不能与 selected-fields 同时存在 | +| `batch-size` | `500` | 导入数据时每批数据的条目数 | + +#### 5.3 通用配置 + +通用配置包含一些常用的配置项。 + +| 参数 | 默认值 | 说明 | +|-------------|-----|-------------------------------------------------------------------| +| `delimiter` | `,` | `source-name`、`target-name`、`selected-fields` 或 `ignored-fields` 的分隔符 | + +### 6 许可证 + +与 HugeGraph 一样,hugegraph-spark-connector 也采用 Apache 2.0 许可证。 diff --git a/content/cn/docs/quickstart/toolchain/hugegraph-tools.md b/content/cn/docs/quickstart/toolchain/hugegraph-tools.md index cd2414ed9..73239129e 100644 --- a/content/cn/docs/quickstart/toolchain/hugegraph-tools.md +++ b/content/cn/docs/quickstart/toolchain/hugegraph-tools.md @@ -55,10 +55,11 @@ mvn package -DskipTests 解压后,进入 hugegraph-tools 目录,可以使用`bin/hugegraph`或者`bin/hugegraph help`来查看 usage 信息。主要分为: -- 图管理类,graph-mode-set、graph-mode-get、graph-list、graph-get 和 graph-clear +- 图管理类,graph-mode-set、graph-mode-get、graph-list、graph-get、graph-clear、graph-create、graph-clone 和 graph-drop - 异步任务管理类,task-list、task-get、task-delete、task-cancel 和 task-clear - Gremlin类,gremlin-execute 和 gremlin-schedule - 备份/恢复类,backup、restore、migrate、schedule-backup 和 dump +- 认证数据备份/恢复类,auth-backup 和 auth-restore - 安装部署类,deploy、clear、start-all 和 stop-all ```bash @@ -105,7 +106,7 @@ Usage: hugegraph [options] [command] [command options] #export HUGEGRAPH_TRUST_STORE_PASSWORD= ``` -##### 3.3 图管理类,graph-mode-set、graph-mode-get、graph-list、graph-get和graph-clear +##### 3.3 图管理类,graph-mode-set、graph-mode-get、graph-list、graph-get、graph-clear、graph-create、graph-clone和graph-drop - graph-mode-set,设置图的 restore mode - --graph-mode 或者 -m,必填项,指定将要设置的模式,合法值包括 [NONE, RESTORING, MERGING, LOADING] @@ -114,6 +115,14 @@ Usage: hugegraph [options] [command] [command options] - graph-get,获取某个图及其存储后端类型 - graph-clear,清除某个图的全部 schema 和 data - --confirm-message 或者 
-c,必填项,删除确认信息,需要手动输入,二次确认防止误删,"I'm sure to delete all data",包括双引号 +- graph-create,使用配置文件创建新图 + - --name 或者 -n,选填项,新图的名称,默认为 hugegraph + - --file 或者 -f,必填项,图配置文件的路径 +- graph-clone,克隆已存在的图 + - --name 或者 -n,选填项,新克隆图的名称,默认为 hugegraph + - --clone-graph-name,选填项,要克隆的源图名称,默认为 hugegraph +- graph-drop,删除图(不同于 graph-clear,这会完全删除图) + - --confirm-message 或者 -c,必填项,确认消息 "I'm sure to drop the graph",包括双引号 > 当需要把备份的图原样恢复到一个新的图中的时候,需要先将图模式设置为 RESTORING 模式;当需要将备份的图合并到已存在的图中时,需要先将图模式设置为 MERGING 模式。 @@ -159,6 +168,7 @@ Usage: hugegraph [options] [command] [command options] - --huge-types 或者 -t,要备份的数据类型,逗号分隔,可选值为 'all' 或者 一个或多个 [vertex,edge,vertex_label,edge_label,property_key,index_label] 的组合,'all' 代表全部6种类型,即顶点、边和所有schema - --log 或者 -l,指定日志目录,默认为当前目录 - --retry,指定失败重试次数,默认为 3 + - --thread-num 或者 -T,使用的线程数,默认为 Math.min(10, Math.max(4, CPUs / 2)) - --split-size 或者 -s,指定在备份时对顶点或者边分块的大小,默认为 1048576 - -D,用 -Dkey=value 的模式指定动态参数,用来备份数据到 HDFS 时,指定 HDFS 的配置项,例如:-Dfs.default.name=hdfs://localhost:9000 - restore,将 JSON 格式存储的 schema 或者 data 恢复到一个新图中(RESTORING 模式)或者合并到已存在的图中(MERGING 模式) @@ -167,6 +177,7 @@ Usage: hugegraph [options] [command] [command options] - --huge-types 或者 -t,要恢复的数据类型,逗号分隔,可选值为 'all' 或者 一个或多个 [vertex,edge,vertex_label,edge_label,property_key,index_label] 的组合,'all' 代表全部6种类型,即顶点、边和所有schema - --log 或者 -l,指定日志目录,默认为当前目录 - --retry,指定失败重试次数,默认为 3 + - --thread-num 或者 -T,使用的线程数,默认为 Math.min(10, Math.max(4, CPUs / 2)) - -D,用 -Dkey=value 的模式指定动态参数,用来从 HDFS 恢复图时,指定 HDFS 的配置项,例如:-Dfs.default.name=hdfs://localhost:9000 > 只有当 --format 为 json 执行 backup 时,才可以使用 restore 命令恢复 - migrate, 将当前连接的图迁移至另一个 HugeGraphServer 中 @@ -198,9 +209,28 @@ Usage: hugegraph [options] [command] [command options] - --log 或者 -l,指定日志目录,默认为当前目录 - --retry,指定失败重试次数,默认为 3 - --split-size 或者 -s,指定在备份时对顶点或者边分块的大小,默认为 1048576 - - -D,用 -Dkey=value 的模式指定动态参数,用来备份数据到 HDFS 时,指定 HDFS 的配置项,例如:-Dfs.default.name=hdfs://localhost:9000 + - -D,用 -Dkey=value 的模式指定动态参数,用来备份数据到 HDFS 时,指定 HDFS 的配置项,例如:-Dfs.default.name=hdfs://localhost:9000 + +##### 3.7 认证数据备份/恢复类 + +- auth-backup,备份认证数据到指定目录 + - --types 或者 -t,要备份的认证数据类型,逗号分隔,可选值为 'all' 或者一个或多个 [user, group, target, belong, access] 的组合,'all' 代表全部5种类型 + - --directory 或者 -d,备份数据存储目录,默认为当前目录 + - --log 或者 -l,指定日志目录,默认为当前目录 + - --retry,指定失败重试次数,默认为 3 + - --thread-num 或者 -T,使用的线程数,默认为 Math.min(10, Math.max(4, CPUs / 2)) + - -D,用 -Dkey=value 的模式指定动态参数,用来备份数据到 HDFS 时,指定 HDFS 的配置项,例如:-Dfs.default.name=hdfs://localhost:9000 +- auth-restore,从指定目录恢复认证数据 + - --types 或者 -t,要恢复的认证数据类型,逗号分隔,可选值为 'all' 或者一个或多个 [user, group, target, belong, access] 的组合,'all' 代表全部5种类型 + - --directory 或者 -d,备份数据存储目录,默认为当前目录 + - --log 或者 -l,指定日志目录,默认为当前目录 + - --retry,指定失败重试次数,默认为 3 + - --thread-num 或者 -T,使用的线程数,默认为 Math.min(10, Math.max(4, CPUs / 2)) + - --strategy,冲突处理策略,可选值为 [stop, ignore],默认为 stop。stop 表示遇到冲突时停止恢复,ignore 表示忽略冲突继续恢复 + - --init-password,恢复用户时设置的初始密码,恢复用户数据时必填 + - -D,用 -Dkey=value 的模式指定动态参数,用来从 HDFS 恢复数据时,指定 HDFS 的配置项,例如:-Dfs.default.name=hdfs://localhost:9000 -##### 3.7 安装部署类 +##### 3.8 安装部署类 - deploy,一键下载、安装和启动 HugeGraph-Server 和 HugeGraph-Studio - -v,必填项,指明安装的 HugeGraph-Server 和 HugeGraph-Studio 的版本号,最新的是 0.9 @@ -215,7 +245,7 @@ Usage: hugegraph [options] [command] [command options] > deploy命令中有可选参数 -u,提供时会使用指定的下载地址替代默认下载地址下载 tar 包,并且将地址写入`~/hugegraph-download-url-prefix`文件中;之后如果不指定地址时,会优先从`~/hugegraph-download-url-prefix`指定的地址下载 tar 包;如果 -u 和`~/hugegraph-download-url-prefix`都没有时,会从默认下载地址进行下载 -##### 3.8 具体命令参数 +##### 3.9 具体命令参数 各子命令的具体参数如下: @@ -524,7 +554,7 @@ Usage: hugegraph [options] [command] [command options] ``` -##### 3.9 
具体命令示例
+##### 3.10 具体命令示例

###### 1. gremlin语句

diff --git a/content/en/docs/quickstart/toolchain/hugegraph-hubble.md b/content/en/docs/quickstart/toolchain/hugegraph-hubble.md
index d73403f0c..fb642e876 100644
--- a/content/en/docs/quickstart/toolchain/hugegraph-hubble.md
+++ b/content/en/docs/quickstart/toolchain/hugegraph-hubble.md
@@ -557,3 +557,26 @@ There is no visual OLAP algorithm execution on Hubble. You can call the RESTful
image

+
+### 5 Configuration
+
+HugeGraph-Hubble can be configured through the `conf/hugegraph-hubble.properties` file.
+
+#### 5.1 Server Configuration
+
+| Configuration Item | Default Value | Description |
+|-------------------|---------------|-------------|
+| `hubble.host` | `0.0.0.0` | The address that Hubble service binds to |
+| `hubble.port` | `8088` | The port that Hubble service listens on |
+
+#### 5.2 Gremlin Query Limits
+
+These settings control query result limits to prevent memory issues:
+
+| Configuration Item | Default Value | Description |
+|-------------------|---------------|-------------|
+| `gremlin.suffix_limit` | `250` | Maximum query suffix length |
+| `gremlin.vertex_degree_limit` | `100` | Maximum vertex degree to display |
+| `gremlin.edges_total_limit` | `500` | Maximum number of edges returned |
+| `gremlin.batch_query_ids` | `100` | ID batch query size |
+
diff --git a/content/en/docs/quickstart/toolchain/hugegraph-spark-connector.md b/content/en/docs/quickstart/toolchain/hugegraph-spark-connector.md
new file mode 100644
index 000000000..fb7494efa
--- /dev/null
+++ b/content/en/docs/quickstart/toolchain/hugegraph-spark-connector.md
@@ -0,0 +1,182 @@
+---
+title: "HugeGraph-Spark-Connector Quick Start"
+linkTitle: "Read/Write Graph Data with Spark Connector"
+weight: 4
+---
+
+### 1 HugeGraph-Spark-Connector Overview
+
+HugeGraph-Spark-Connector is a connector for reading and writing HugeGraph data from Apache Spark through the standard DataFrame API. 
+ +### 2 Environment Requirements + +- Java 8+ +- Maven 3.6+ +- Spark 3.x +- Scala 2.12 + +### 3 Building + +#### 3.1 Build without executing tests + +```bash +mvn clean package -DskipTests +``` + +#### 3.2 Build with default tests + +```bash +mvn clean package +``` + +### 4 Usage + +First add the dependency in your pom.xml: + +```xml + + org.apache.hugegraph + hugegraph-spark-connector + ${revision} + +``` + +#### 4.1 Schema Definition Example + +If we have a graph, the schema is defined as follows: + +```groovy +schema.propertyKey("name").asText().ifNotExist().create() +schema.propertyKey("age").asInt().ifNotExist().create() +schema.propertyKey("city").asText().ifNotExist().create() +schema.propertyKey("weight").asDouble().ifNotExist().create() +schema.propertyKey("lang").asText().ifNotExist().create() +schema.propertyKey("date").asText().ifNotExist().create() +schema.propertyKey("price").asDouble().ifNotExist().create() + +schema.vertexLabel("person") + .properties("name", "age", "city") + .useCustomizeStringId() + .nullableKeys("age", "city") + .ifNotExist() + .create() + +schema.vertexLabel("software") + .properties("name", "lang", "price") + .primaryKeys("name") + .ifNotExist() + .create() + +schema.edgeLabel("knows") + .sourceLabel("person") + .targetLabel("person") + .properties("date", "weight") + .ifNotExist() + .create() + +schema.edgeLabel("created") + .sourceLabel("person") + .targetLabel("software") + .properties("date", "weight") + .ifNotExist() + .create() +``` + +#### 4.2 Vertex Sink (Scala) + +```scala +val df = sparkSession.createDataFrame(Seq( + Tuple3("marko", 29, "Beijing"), + Tuple3("vadas", 27, "HongKong"), + Tuple3("Josh", 32, "Beijing"), + Tuple3("peter", 35, "ShangHai"), + Tuple3("li,nary", 26, "Wu,han"), + Tuple3("Bob", 18, "HangZhou"), +)) toDF("name", "age", "city") + +df.show() + +df.write + .format("org.apache.hugegraph.spark.connector.DataSource") + .option("host", "127.0.0.1") + .option("port", "8080") + .option("graph", "hugegraph") + .option("data-type", "vertex") + .option("label", "person") + .option("id", "name") + .option("batch-size", 2) + .mode(SaveMode.Overwrite) + .save() +``` + +#### 4.3 Edge Sink (Scala) + +```scala +val df = sparkSession.createDataFrame(Seq( + Tuple4("marko", "vadas", "20160110", 0.5), + Tuple4("peter", "Josh", "20230801", 1.0), + Tuple4("peter", "li,nary", "20130220", 2.0) +)).toDF("source", "target", "date", "weight") + +df.show() + +df.write + .format("org.apache.hugegraph.spark.connector.DataSource") + .option("host", "127.0.0.1") + .option("port", "8080") + .option("graph", "hugegraph") + .option("data-type", "edge") + .option("label", "knows") + .option("source-name", "source") + .option("target-name", "target") + .option("batch-size", 2) + .mode(SaveMode.Overwrite) + .save() +``` + +### 5 Configuration Parameters + +#### 5.1 Client Configs + +Client Configs are used to configure hugegraph-client. 
+ +| Parameter | Default Value | Description | +|----------------------|---------------|----------------------------------------------------------------------------------------------| +| `host` | `localhost` | Address of HugeGraphServer | +| `port` | `8080` | Port of HugeGraphServer | +| `graph` | `hugegraph` | Graph space name | +| `protocol` | `http` | Protocol for sending requests to the server, optional `http` or `https` | +| `username` | `null` | Username of the current graph when HugeGraphServer enables permission authentication | +| `token` | `null` | Token of the current graph when HugeGraphServer has enabled authorization authentication | +| `timeout` | `60` | Timeout (seconds) for inserting results to return | +| `max-conn` | `CPUS * 4` | The maximum number of HTTP connections between HugeClient and HugeGraphServer | +| `max-conn-per-route` | `CPUS * 2` | The maximum number of HTTP connections for each route between HugeClient and HugeGraphServer | +| `trust-store-file` | `null` | The client's certificate file path when the request protocol is https | +| `trust-store-token` | `null` | The client's certificate password when the request protocol is https | + +#### 5.2 Graph Data Configs + +Graph Data Configs are used to set graph space configuration. + +| Parameter | Default Value | Description | +|-------------------|---------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `data-type` | | Graph data type, must be `vertex` or `edge` | +| `label` | | Label to which the vertex/edge data to be imported belongs | +| `id` | | Specify a column as the id column of the vertex. When the vertex id policy is CUSTOMIZE, it is required; when the id policy is PRIMARY_KEY, it must be empty | +| `source-name` | | Select certain columns of the input source as the id column of source vertex. When the id policy of the source vertex is CUSTOMIZE, a certain column must be specified as the id column of the vertex; when the id policy of the source vertex is PRIMARY_KEY, one or more columns must be specified for splicing the id of the generated vertex, that is, no matter which id strategy is used, this item is required | +| `target-name` | | Specify certain columns as the id columns of target vertex, similar to source-name | +| `selected-fields` | | Select some columns to insert, other unselected ones are not inserted, cannot exist at the same time as ignored-fields | +| `ignored-fields` | | Ignore some columns so that they do not participate in insertion, cannot exist at the same time as selected-fields | +| `batch-size` | `500` | The number of data items in each batch when importing data | + +#### 5.3 Common Configs + +Common Configs contains some common configurations. + +| Parameter | Default Value | Description | +|-------------|---------------|---------------------------------------------------------------------------------| +| `delimiter` | `,` | Separator of `source-name`, `target-name`, `selected-fields` or `ignored-fields` | + +### 6 License + +The same as HugeGraph, hugegraph-spark-connector is also licensed under Apache 2.0 License. 
diff --git a/content/en/docs/quickstart/toolchain/hugegraph-tools.md b/content/en/docs/quickstart/toolchain/hugegraph-tools.md index c55199f8e..39938ee3d 100644 --- a/content/en/docs/quickstart/toolchain/hugegraph-tools.md +++ b/content/en/docs/quickstart/toolchain/hugegraph-tools.md @@ -55,11 +55,12 @@ Generate tar package hugegraph-tools-${version}.tar.gz After decompression, enter the hugegraph-tools directory, you can use `bin/hugegraph` or `bin/hugegraph help` to view the usage information. mainly divided: -- Graph management Type,graph-mode-set、graph-mode-get、graph-list、graph-get and graph-clear -- Asynchronous task management Type,task-list、task-get、task-delete、task-cancel and task-clear -- Gremlin Type,gremlin-execute and gremlin-schedule -- Backup/Restore Type,backup、restore、migrate、schedule-backup and dump -- Install the deployment Type,deploy、clear、start-all and stop-all +- Graph management type, graph-mode-set, graph-mode-get, graph-list, graph-get, graph-clear, graph-create, graph-clone and graph-drop +- Asynchronous task management type, task-list, task-get, task-delete, task-cancel and task-clear +- Gremlin type, gremlin-execute and gremlin-schedule +- Backup/Restore type, backup, restore, migrate, schedule-backup and dump +- Authentication data backup/restore type, auth-backup and auth-restore +- Install deployment type, deploy, clear, start-all and stop-all ```bash Usage: hugegraph [options] [command] [command options] @@ -105,15 +106,23 @@ Another way is to set the environment variable in the bin/hugegraph script: #export HUGEGRAPH_TRUST_STORE_PASSWORD= ``` -##### 3.3 Graph Management Type,graph-mode-set、graph-mode-get、graph-list、graph-get and graph-clear +##### 3.3 Graph Management Type, graph-mode-set, graph-mode-get, graph-list, graph-get, graph-clear, graph-create, graph-clone and graph-drop -- graph-mode-set,set graph restore mode +- graph-mode-set, set graph restore mode - --graph-mode or -m, required, specifies the mode to be set, legal values include [NONE, RESTORING, MERGING, LOADING] -- graph-mode-get,get graph restore mode -- graph-list,list all graphs in a HugeGraph-Server -- graph-get,get a graph and its storage backend type -- graph-clear,clear all schema and data of a graph - - --confirm-message Or -c, required, delete confirmation information, manual input is required, double confirmation to prevent accidental deletion, "I'm sure to delete all data", including double quotes +- graph-mode-get, get graph restore mode +- graph-list, list all graphs in a HugeGraph-Server +- graph-get, get a graph and its storage backend type +- graph-clear, clear all schema and data of a graph + - --confirm-message or -c, required, delete confirmation information, manual input is required, double confirmation to prevent accidental deletion, "I'm sure to delete all data", including double quotes +- graph-create, create a new graph with configuration file + - --name or -n, optional, the name of the new graph, default is hugegraph + - --file or -f, required, the path to the graph configuration file +- graph-clone, clone an existing graph + - --name or -n, optional, the name of the cloned graph, default is hugegraph + - --clone-graph-name, optional, the name of the source graph to clone from, default is hugegraph +- graph-drop, drop a graph (different from graph-clear, this completely removes the graph) + - --confirm-message or -c, required, confirmation message "I'm sure to drop the graph", including double quotes > When you need to restore the backup graph to a new graph, you 
need to set the graph mode to RESTORING mode; when you need to merge the backup graph into an existing graph, you need to first set the graph mode to MERGING mode.

@@ -159,6 +168,7 @@ Another way is to set the environment variable in the bin/hugegraph script:
   - --huge-types or -t, the data types to be backed up, separated by commas, the optional value is 'all' or a combination of one or more [vertex, edge, vertex_label, edge_label, property_key, index_label], 'all' represents all 6 types, namely vertices, edges and all schemas
   - --log or -l, specify the log directory, the default is the current directory
   - --retry, specify the number of failed retries, the default is 3
+  - --thread-num or -T, the number of threads to use, default is Math.min(10, Math.max(4, CPUs / 2))
   - --split-size or -s, specifies the size of splitting vertices or edges when backing up, the default is 1048576
   - -D, use the mode of -Dkey=value to specify dynamic parameters, and specify HDFS configuration items when backing up data to HDFS, for example: -Dfs.default.name=hdfs://localhost:9000
- restore, restore schema or data stored in JSON format to a new graph (RESTORING mode) or merge into an existing graph (MERGING mode)
@@ -167,6 +177,7 @@ Another way is to set the environment variable in the bin/hugegraph script:
   - --huge-types or -t, data types to restore, separated by commas, optional value is 'all' or a combination of one or more [vertex, edge, vertex_label, edge_label, property_key, index_label], 'all' represents all 6 types, namely vertices, edges and all schemas
   - --log or -l, specify the log directory, the default is the current directory
   - --retry, specify the number of failed retries, the default is 3
+  - --thread-num or -T, the number of threads to use, default is Math.min(10, Math.max(4, CPUs / 2))
   - -D, use the mode of -Dkey=value to specify dynamic parameters, which are used to specify HDFS configuration items when restoring graphs from HDFS, for example: -Dfs.default.name=hdfs://localhost:9000
> The restore command can only be used if the backup was executed with --format json
- migrate, migrate the currently connected graph to another HugeGraphServer
@@ -200,7 +211,26 @@ Another way is to set the environment variable in the bin/hugegraph script:
   - --split-size or -s, specifies the size of splitting vertices or edges when backing up, the default is 1048576
   - -D, use the mode of -Dkey=value to specify dynamic parameters, and specify HDFS configuration items when backing up data to HDFS, for example: -Dfs.default.name=hdfs://localhost:9000

-##### 3.7 Install the deployment type
+##### 3.7 Authentication data backup/restore type
+
+- auth-backup, backup authentication data to a specified directory
+  - --types or -t, types of authentication data to back up, separated by commas, optional value is 'all' or a combination of one or more [user, group, target, belong, access], 'all' represents all 5 types
+  - --directory or -d, directory to store backup data, defaults to current directory
+  - --log or -l, specify the log directory, the default is the current directory
+  - --retry, specify the number of failed retries, the default is 3
+  - --thread-num or -T, the number of threads to use, default is Math.min(10, Math.max(4, CPUs / 2))
+  - -D, use the mode of -Dkey=value to specify dynamic parameters, and specify HDFS configuration items when backing up data to HDFS, for example: -Dfs.default.name=hdfs://localhost:9000
+- auth-restore, restore authentication data from a specified directory
+  - --types or -t, types of 
authentication data to restore, separated by commas, optional value is 'all' or a combination of one or more [user, group, target, belong, access], 'all' represents all 5 types + - --directory or -d, directory where backup data is stored, defaults to current directory + - --log or -l, specify the log directory, the default is the current directory + - --retry, specify the number of failed retries, the default is 3 + - --thread-num or -T, the number of threads to use, default is Math.min(10, Math.max(4, CPUs / 2)) + - --strategy, conflict handling strategy, optional values are [stop, ignore], default is stop. stop means stop restoring when encountering conflicts, ignore means ignore conflicts and continue restoring + - --init-password, initial password to set when restoring users, required when restoring user data + - -D, use the mode of -Dkey=value to specify dynamic parameters, which are used to specify HDFS configuration items when restoring data from HDFS, for example: -Dfs.default.name=hdfs://localhost:9000 + +##### 3.8 Install the deployment type - deploy, one-click download, install and start HugeGraph-Server and HugeGraph-Studio - -v, required, specifies the version number of HugeGraph-Server and HugeGraph-Studio installed, the latest is 0.9 @@ -215,7 +245,7 @@ Another way is to set the environment variable in the bin/hugegraph script: > There is an optional parameter -u in the deploy command. When provided, the specified download address will be used instead of the default download address to download the tar package, and the address will be written into the `~/hugegraph-download-url-prefix` file; if no address is specified later When -u and `~/hugegraph-download-url-prefix` are not specified, the tar package will be downloaded from the address specified by `~/hugegraph-download-url-prefix`; if there is neither -u nor `~/hugegraph-download-url-prefix`, it will be downloaded from the default download address -##### 3.8 Specific command parameters +##### 3.9 Specific command parameters The specific parameters of each subcommand are as follows: @@ -524,7 +554,7 @@ Usage: hugegraph [options] [command] [command options] ``` -##### 3.9 Specific command example +##### 3.10 Specific command example ###### 1. gremlin statement From 708a6df1955e229a75cbfabe0a1dc99764dcd508 Mon Sep 17 00:00:00 2001 From: imbajin Date: Sun, 1 Feb 2026 00:03:48 +0800 Subject: [PATCH 05/10] AI: Add HugeGraph-AI docs and quickstart updates Add comprehensive HugeGraph-AI documentation and update quickstart content. New files added (Chinese & English): config-reference.md (full configuration reference), hugegraph-ml.md (HugeGraph-ML overview, algorithms and examples), and rest-api.md (REST API reference including RAG and Text2Gremlin endpoints). Updated pages: _index.md (feature list and v1.5.0 highlights such as Text2Gremlin, multi-model vectors, bilingual prompts, LiteLLM support, enhanced rerankers), hugegraph-llm.md (LLM provider/LiteLLM, reranker and Text2Gremlin usage), and quick_start.md (language switching / bilingual prompt guide). Also tightened environment requirements (Python 3.10+, uv 0.7+, HugeGraph 1.5+) and updated ML algorithm count/details to 21. These changes expand docs for deploying and integrating LLM, Text2Gremlin, ML workflows and REST APIs. 
--- .../cn/docs/quickstart/hugegraph-ai/_index.md | 31 +- .../hugegraph-ai/config-reference.md | 396 ++++++++++++++++ .../quickstart/hugegraph-ai/hugegraph-llm.md | 77 +++- .../quickstart/hugegraph-ai/hugegraph-ml.md | 289 ++++++++++++ .../quickstart/hugegraph-ai/quick_start.md | 60 +++ .../docs/quickstart/hugegraph-ai/rest-api.md | 428 ++++++++++++++++++ .../en/docs/quickstart/hugegraph-ai/_index.md | 31 +- .../hugegraph-ai/config-reference.md | 396 ++++++++++++++++ .../quickstart/hugegraph-ai/hugegraph-llm.md | 75 ++- .../quickstart/hugegraph-ai/hugegraph-ml.md | 289 ++++++++++++ .../quickstart/hugegraph-ai/quick_start.md | 60 +++ .../docs/quickstart/hugegraph-ai/rest-api.md | 428 ++++++++++++++++++ 12 files changed, 2539 insertions(+), 21 deletions(-) create mode 100644 content/cn/docs/quickstart/hugegraph-ai/config-reference.md create mode 100644 content/cn/docs/quickstart/hugegraph-ai/hugegraph-ml.md create mode 100644 content/cn/docs/quickstart/hugegraph-ai/rest-api.md create mode 100644 content/en/docs/quickstart/hugegraph-ai/config-reference.md create mode 100644 content/en/docs/quickstart/hugegraph-ai/hugegraph-ml.md create mode 100644 content/en/docs/quickstart/hugegraph-ai/rest-api.md diff --git a/content/cn/docs/quickstart/hugegraph-ai/_index.md b/content/cn/docs/quickstart/hugegraph-ai/_index.md index 330c93148..01906f0d7 100644 --- a/content/cn/docs/quickstart/hugegraph-ai/_index.md +++ b/content/cn/docs/quickstart/hugegraph-ai/_index.md @@ -18,20 +18,31 @@ weight: 3 ## ✨ 核心功能 - **GraphRAG**:利用图增强检索构建智能问答系统 +- **Text2Gremlin**:自然语言到图查询的转换,支持 REST API - **知识图谱构建**:使用大语言模型从文本自动构建图谱 -- **图机器学习**:集成 20 多种图学习算法(GCN、GAT、GraphSAGE 等) +- **图机器学习**:集成 21 种图学习算法(GCN、GAT、GraphSAGE 等) - **Python 客户端**:易于使用的 HugeGraph Python 操作接口 - **AI 智能体**:提供智能图分析与推理能力 +### 🎉 v1.5.0 新特性 + +- **Text2Gremlin REST API**:通过 REST 端点将自然语言查询转换为 Gremlin 命令 +- **多模型向量支持**:每个图实例可以使用独立的嵌入模型 +- **双语提示支持**:支持英文和中文提示词切换(EN/CN) +- **半自动 Schema 生成**:从文本数据智能推断 Schema +- **半自动 Prompt 生成**:上下文感知的提示词模板 +- **增强的 Reranker 支持**:集成 Cohere 和 SiliconFlow 重排序器 +- **LiteLLM 多供应商支持**:统一接口支持 OpenAI、Anthropic、Gemini 等 + ## 🚀 快速开始 > [!NOTE] > 如需完整的部署指南和详细示例,请参阅 [hugegraph-llm/README.md](https://github.com/apache/incubator-hugegraph-ai/blob/main/hugegraph-llm/README.md)。 ### 环境要求 -- Python 3.9+(建议 hugegraph-llm 使用 3.10+) -- [uv](https://docs.astral.sh/uv/)(推荐的包管理器) -- HugeGraph Server 1.3+(建议 1.5+) +- Python 3.10+(hugegraph-llm 必需) +- [uv](https://docs.astral.sh/uv/) 0.7+(推荐的包管理器) +- HugeGraph Server 1.5+(必需) - Docker(可选,用于容器化部署) ### 方案一:Docker 部署(推荐) @@ -123,11 +134,13 @@ from pyhugegraph.client import PyHugeClient - **AI 智能体**:智能图分析与推理 ### [hugegraph-ml](https://github.com/apache/incubator-hugegraph-ai/tree/main/hugegraph-ml) -包含 20+ 算法的图机器学习: -- **节点分类**:GCN、GAT、GraphSAGE、APPNP 等 -- **图分类**:DiffPool、P-GNN 等 -- **图嵌入**:DeepWalk、Node2Vec、GRACE 等 -- **链接预测**:SEAL、GATNE 等 +包含 21 种算法的图机器学习: +- **节点分类**:GCN、GAT、GraphSAGE、APPNP、AGNN、ARMA、DAGNN、DeeperGCN、GRAND、JKNet、Cluster-GCN +- **图分类**:DiffPool、GIN +- **图嵌入**:DGI、BGRL、GRACE +- **链接预测**:SEAL、P-GNN、GATNE +- **欺诈检测**:CARE-GNN、BGNN +- **后处理**:C&S(Correct & Smooth) ### [hugegraph-python-client](https://github.com/apache/incubator-hugegraph-ai/tree/main/hugegraph-python-client) 用于 HugeGraph 操作的 Python 客户端: diff --git a/content/cn/docs/quickstart/hugegraph-ai/config-reference.md b/content/cn/docs/quickstart/hugegraph-ai/config-reference.md new file mode 100644 index 000000000..4172ae12e --- /dev/null +++ b/content/cn/docs/quickstart/hugegraph-ai/config-reference.md @@ -0,0 +1,396 @@ +--- 
+title: "配置参考" +linkTitle: "配置参考" +weight: 4 +--- + +本文档提供 HugeGraph-LLM 所有配置选项的完整参考。 + +## 配置文件 + +- **环境文件**:`.env`(从模板创建或自动生成) +- **提示词配置**:`src/hugegraph_llm/resources/demo/config_prompt.yaml` + +> [!TIP] +> 运行 `python -m hugegraph_llm.config.generate --update` 可自动生成或更新带有默认值的配置文件。 + +## 环境变量概览 + +### 1. 语言和模型类型选择 + +```bash +# 提示词语言(影响系统提示词和生成文本) +LANGUAGE=EN # 选项: EN | CN + +# 不同任务的 LLM 类型 +CHAT_LLM_TYPE=openai # 对话/RAG: openai | litellm | ollama/local +EXTRACT_LLM_TYPE=openai # 实体抽取: openai | litellm | ollama/local +TEXT2GQL_LLM_TYPE=openai # 文本转 Gremlin: openai | litellm | ollama/local + +# 嵌入模型类型 +EMBEDDING_TYPE=openai # 选项: openai | litellm | ollama/local + +# Reranker 类型(可选) +RERANKER_TYPE= # 选项: cohere | siliconflow | (留空表示无) +``` + +### 2. OpenAI 配置 + +每个 LLM 任务(chat、extract、text2gql)都有独立配置: + +#### 2.1 Chat LLM(RAG 答案生成) + +```bash +OPENAI_CHAT_API_BASE=https://api.openai.com/v1 +OPENAI_CHAT_API_KEY=sk-your-api-key-here +OPENAI_CHAT_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_CHAT_TOKENS=8192 # 对话响应的最大 tokens +``` + +#### 2.2 Extract LLM(实体和关系抽取) + +```bash +OPENAI_EXTRACT_API_BASE=https://api.openai.com/v1 +OPENAI_EXTRACT_API_KEY=sk-your-api-key-here +OPENAI_EXTRACT_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_EXTRACT_TOKENS=1024 # 抽取任务的最大 tokens +``` + +#### 2.3 Text2GQL LLM(自然语言转 Gremlin) + +```bash +OPENAI_TEXT2GQL_API_BASE=https://api.openai.com/v1 +OPENAI_TEXT2GQL_API_KEY=sk-your-api-key-here +OPENAI_TEXT2GQL_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_TEXT2GQL_TOKENS=4096 # 查询生成的最大 tokens +``` + +#### 2.4 嵌入模型 + +```bash +OPENAI_EMBEDDING_API_BASE=https://api.openai.com/v1 +OPENAI_EMBEDDING_API_KEY=sk-your-api-key-here +OPENAI_EMBEDDING_MODEL=text-embedding-3-small +``` + +> [!NOTE] +> 您可以为每个任务使用不同的 API 密钥/端点,以优化成本或使用专用模型。 + +### 3. LiteLLM 配置(多供应商支持) + +LiteLLM 支持统一访问 100 多个 LLM 供应商(OpenAI、Anthropic、Google、Azure 等)。 + +#### 3.1 Chat LLM + +```bash +LITELLM_CHAT_API_BASE=http://localhost:4000 # LiteLLM 代理 URL +LITELLM_CHAT_API_KEY=sk-litellm-key # LiteLLM API 密钥 +LITELLM_CHAT_LANGUAGE_MODEL=anthropic/claude-3-5-sonnet-20241022 +LITELLM_CHAT_TOKENS=8192 +``` + +#### 3.2 Extract LLM + +```bash +LITELLM_EXTRACT_API_BASE=http://localhost:4000 +LITELLM_EXTRACT_API_KEY=sk-litellm-key +LITELLM_EXTRACT_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_EXTRACT_TOKENS=256 +``` + +#### 3.3 Text2GQL LLM + +```bash +LITELLM_TEXT2GQL_API_BASE=http://localhost:4000 +LITELLM_TEXT2GQL_API_KEY=sk-litellm-key +LITELLM_TEXT2GQL_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_TEXT2GQL_TOKENS=4096 +``` + +#### 3.4 嵌入模型 + +```bash +LITELLM_EMBEDDING_API_BASE=http://localhost:4000 +LITELLM_EMBEDDING_API_KEY=sk-litellm-key +LITELLM_EMBEDDING_MODEL=openai/text-embedding-3-small +``` + +**模型格式**: `供应商/模型名称` + +示例: +- `openai/gpt-4o-mini` +- `anthropic/claude-3-5-sonnet-20241022` +- `google/gemini-2.0-flash-exp` +- `azure/gpt-4` + +完整列表请参阅 [LiteLLM Providers](https://docs.litellm.ai/docs/providers)。 + +### 4. 
Ollama 配置(本地部署) + +使用 Ollama 运行本地 LLM,确保隐私和成本控制。 + +#### 4.1 Chat LLM + +```bash +OLLAMA_CHAT_HOST=127.0.0.1 +OLLAMA_CHAT_PORT=11434 +OLLAMA_CHAT_LANGUAGE_MODEL=llama3.1:8b +``` + +#### 4.2 Extract LLM + +```bash +OLLAMA_EXTRACT_HOST=127.0.0.1 +OLLAMA_EXTRACT_PORT=11434 +OLLAMA_EXTRACT_LANGUAGE_MODEL=llama3.1:8b +``` + +#### 4.3 Text2GQL LLM + +```bash +OLLAMA_TEXT2GQL_HOST=127.0.0.1 +OLLAMA_TEXT2GQL_PORT=11434 +OLLAMA_TEXT2GQL_LANGUAGE_MODEL=qwen2.5-coder:7b +``` + +#### 4.4 嵌入模型 + +```bash +OLLAMA_EMBEDDING_HOST=127.0.0.1 +OLLAMA_EMBEDDING_PORT=11434 +OLLAMA_EMBEDDING_MODEL=nomic-embed-text +``` + +> [!TIP] +> 下载模型:`ollama pull llama3.1:8b` 或 `ollama pull qwen2.5-coder:7b` + +### 5. Reranker 配置 + +Reranker 通过根据相关性重新排序检索结果来提高 RAG 准确性。 + +#### 5.1 Cohere Reranker + +```bash +RERANKER_TYPE=cohere +COHERE_BASE_URL=https://api.cohere.com/v1/rerank +RERANKER_API_KEY=your-cohere-api-key +RERANKER_MODEL=rerank-english-v3.0 +``` + +可用模型: +- `rerank-english-v3.0`(英文) +- `rerank-multilingual-v3.0`(100+ 种语言) + +#### 5.2 SiliconFlow Reranker + +```bash +RERANKER_TYPE=siliconflow +RERANKER_API_KEY=your-siliconflow-api-key +RERANKER_MODEL=BAAI/bge-reranker-v2-m3 +``` + +### 6. HugeGraph 连接 + +配置与 HugeGraph 服务器实例的连接。 + +```bash +# 服务器连接 +GRAPH_IP=127.0.0.1 +GRAPH_PORT=8080 +GRAPH_NAME=hugegraph # 图实例名称 +GRAPH_USER=admin # 用户名 +GRAPH_PWD=admin-password # 密码 +GRAPH_SPACE= # 图空间(可选,用于多租户) +``` + +### 7. 查询参数 + +控制图遍历行为和结果限制。 + +```bash +# 图遍历限制 +MAX_GRAPH_PATH=10 # 图查询的最大路径深度 +MAX_GRAPH_ITEMS=30 # 从图中检索的最大项数 +EDGE_LIMIT_PRE_LABEL=8 # 每个标签类型的最大边数 + +# 属性过滤 +LIMIT_PROPERTY=False # 限制结果中的属性(True/False) +``` + +### 8. 向量搜索配置 + +配置向量相似性搜索参数。 + +```bash +# 向量搜索阈值 +VECTOR_DIS_THRESHOLD=0.9 # 最小余弦相似度(0-1,越高越严格) +TOPK_PER_KEYWORD=1 # 每个提取关键词的 Top-K 结果 +``` + +### 9. Rerank 配置 + +```bash +# Rerank 结果限制 +TOPK_RETURN_RESULTS=20 # 重排序后的 top 结果数 +``` + +## 配置优先级 + +系统按以下顺序加载配置(后面的来源覆盖前面的): + +1. **默认值**(在 `*_config.py` 文件中) +2. **环境变量**(来自 `.env` 文件) +3. 
**运行时更新**(通过 Web UI 或 API 调用) + +## 配置示例 + +### 最小配置(OpenAI) + +```bash +# 语言 +LANGUAGE=EN + +# LLM 类型 +CHAT_LLM_TYPE=openai +EXTRACT_LLM_TYPE=openai +TEXT2GQL_LLM_TYPE=openai +EMBEDDING_TYPE=openai + +# OpenAI 凭据(所有任务共用一个密钥) +OPENAI_API_BASE=https://api.openai.com/v1 +OPENAI_API_KEY=sk-your-api-key-here +OPENAI_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_EMBEDDING_MODEL=text-embedding-3-small + +# HugeGraph 连接 +GRAPH_IP=127.0.0.1 +GRAPH_PORT=8080 +GRAPH_NAME=hugegraph +GRAPH_USER=admin +GRAPH_PWD=admin +``` + +### 生产环境配置(LiteLLM + Reranker) + +```bash +# 双语支持 +LANGUAGE=EN + +# 灵活使用 LiteLLM +CHAT_LLM_TYPE=litellm +EXTRACT_LLM_TYPE=litellm +TEXT2GQL_LLM_TYPE=litellm +EMBEDDING_TYPE=litellm + +# LiteLLM 代理 +LITELLM_CHAT_API_BASE=http://localhost:4000 +LITELLM_CHAT_API_KEY=sk-litellm-master-key +LITELLM_CHAT_LANGUAGE_MODEL=anthropic/claude-3-5-sonnet-20241022 +LITELLM_CHAT_TOKENS=8192 + +LITELLM_EXTRACT_API_BASE=http://localhost:4000 +LITELLM_EXTRACT_API_KEY=sk-litellm-master-key +LITELLM_EXTRACT_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_EXTRACT_TOKENS=256 + +LITELLM_TEXT2GQL_API_BASE=http://localhost:4000 +LITELLM_TEXT2GQL_API_KEY=sk-litellm-master-key +LITELLM_TEXT2GQL_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_TEXT2GQL_TOKENS=4096 + +LITELLM_EMBEDDING_API_BASE=http://localhost:4000 +LITELLM_EMBEDDING_API_KEY=sk-litellm-master-key +LITELLM_EMBEDDING_MODEL=openai/text-embedding-3-small + +# Cohere Reranker 提高准确性 +RERANKER_TYPE=cohere +COHERE_BASE_URL=https://api.cohere.com/v1/rerank +RERANKER_API_KEY=your-cohere-key +RERANKER_MODEL=rerank-multilingual-v3.0 + +# 带认证的 HugeGraph +GRAPH_IP=prod-hugegraph.example.com +GRAPH_PORT=8080 +GRAPH_NAME=production_graph +GRAPH_USER=rag_user +GRAPH_PWD=secure-password +GRAPH_SPACE=prod_space + +# 优化的查询参数 +MAX_GRAPH_PATH=15 +MAX_GRAPH_ITEMS=50 +VECTOR_DIS_THRESHOLD=0.85 +TOPK_RETURN_RESULTS=30 +``` + +### 本地/离线配置(Ollama) + +```bash +# 语言 +LANGUAGE=EN + +# 全部通过 Ollama 使用本地模型 +CHAT_LLM_TYPE=ollama/local +EXTRACT_LLM_TYPE=ollama/local +TEXT2GQL_LLM_TYPE=ollama/local +EMBEDDING_TYPE=ollama/local + +# Ollama 端点 +OLLAMA_CHAT_HOST=127.0.0.1 +OLLAMA_CHAT_PORT=11434 +OLLAMA_CHAT_LANGUAGE_MODEL=llama3.1:8b + +OLLAMA_EXTRACT_HOST=127.0.0.1 +OLLAMA_EXTRACT_PORT=11434 +OLLAMA_EXTRACT_LANGUAGE_MODEL=llama3.1:8b + +OLLAMA_TEXT2GQL_HOST=127.0.0.1 +OLLAMA_TEXT2GQL_PORT=11434 +OLLAMA_TEXT2GQL_LANGUAGE_MODEL=qwen2.5-coder:7b + +OLLAMA_EMBEDDING_HOST=127.0.0.1 +OLLAMA_EMBEDDING_PORT=11434 +OLLAMA_EMBEDDING_MODEL=nomic-embed-text + +# 离线环境不使用 reranker +RERANKER_TYPE= + +# 本地 HugeGraph +GRAPH_IP=127.0.0.1 +GRAPH_PORT=8080 +GRAPH_NAME=hugegraph +GRAPH_USER=admin +GRAPH_PWD=admin +``` + +## 配置验证 + +修改 `.env` 后,验证配置: + +1. **通过 Web UI**:访问 `http://localhost:8001` 并检查设置面板 +2. **通过 Python**: +```python +from hugegraph_llm.config import settings +print(settings.llm_config) +print(settings.hugegraph_config) +``` +3. 
**通过 REST API**: +```bash +curl http://localhost:8001/config +``` + +## 故障排除 + +| 问题 | 解决方案 | +|------|---------| +| "API key not found" | 检查 `.env` 中的 `*_API_KEY` 是否正确设置 | +| "Connection refused" | 验证 `GRAPH_IP` 和 `GRAPH_PORT` 是否正确 | +| "Model not found" | 对于 Ollama:运行 `ollama pull <模型名称>` | +| "Rate limit exceeded" | 减少 `MAX_GRAPH_ITEMS` 或使用不同的 API 密钥 | +| "Embedding dimension mismatch" | 删除现有向量并使用正确模型重建 | + +## 另见 + +- [HugeGraph-LLM 概述](./hugegraph-llm.md) +- [REST API 参考](./rest-api.md) +- [快速入门指南](./quick_start.md) diff --git a/content/cn/docs/quickstart/hugegraph-ai/hugegraph-llm.md b/content/cn/docs/quickstart/hugegraph-ai/hugegraph-llm.md index b353a8fba..116f473b0 100644 --- a/content/cn/docs/quickstart/hugegraph-ai/hugegraph-llm.md +++ b/content/cn/docs/quickstart/hugegraph-ai/hugegraph-llm.md @@ -214,7 +214,7 @@ graph TD ## 🔧 配置 -运行演示后,将自动生成配置文件: +运行演示后,将自动生成配置文件: - **环境**:`hugegraph-llm/.env` - **提示**:`hugegraph-llm/src/hugegraph_llm/resources/demo/config_prompt.yaml` @@ -222,7 +222,80 @@ graph TD > [!NOTE] > 使用 Web 界面时,配置更改会自动保存。对于手动更改,刷新页面即可加载更新。 -**LLM 提供商支持**:本项目使用 [LiteLLM](https://docs.litellm.ai/docs/providers) 实现多提供商 LLM 支持。 +### LLM 提供商配置 + +本项目使用 [LiteLLM](https://docs.litellm.ai/docs/providers) 实现多提供商 LLM 支持,可统一访问 OpenAI、Anthropic、Google、Cohere 以及 100 多个其他提供商。 + +#### 方案一:直接 LLM 连接(OpenAI、Ollama) + +```bash +# .env 配置 +chat_llm_type=openai # 或 ollama/local +openai_api_key=sk-xxx +openai_api_base=https://api.openai.com/v1 +openai_language_model=gpt-4o-mini +openai_max_tokens=4096 +``` + +#### 方案二:LiteLLM 多提供商支持 + +LiteLLM 作为多个 LLM 提供商的统一代理: + +```bash +# .env 配置 +chat_llm_type=litellm +extract_llm_type=litellm +text2gql_llm_type=litellm + +# LiteLLM 设置 +litellm_api_base=http://localhost:4000 # LiteLLM 代理服务器 +litellm_api_key=sk-1234 # LiteLLM API 密钥 + +# 模型选择(提供商/模型格式) +litellm_language_model=anthropic/claude-3-5-sonnet-20241022 +litellm_max_tokens=4096 +``` + +**支持的提供商**:OpenAI、Anthropic、Google(Gemini)、Azure、Cohere、Bedrock、Vertex AI、Hugging Face 等。 + +完整提供商列表和配置详情,请访问 [LiteLLM Providers](https://docs.litellm.ai/docs/providers)。 + +### Reranker 配置 + +Reranker 通过重新排序检索结果来提高 RAG 准确性。支持的提供商: + +```bash +# Cohere Reranker +reranker_type=cohere +cohere_api_key=your-cohere-key +cohere_rerank_model=rerank-english-v3.0 + +# SiliconFlow Reranker +reranker_type=siliconflow +siliconflow_api_key=your-siliconflow-key +siliconflow_rerank_model=BAAI/bge-reranker-v2-m3 +``` + +### Text2Gremlin 配置 + +将自然语言转换为 Gremlin 查询: + +```python +from hugegraph_llm.operators.graph_rag_task import Text2GremlinPipeline + +# 初始化工作流 +text2gremlin = Text2GremlinPipeline() + +# 生成 Gremlin 查询 +result = ( + text2gremlin + .query_to_gremlin(query="查找所有由 Francis Ford Coppola 执导的电影") + .execute_gremlin_query() + .run() +) +``` + +**REST API 端点**:有关 HTTP 端点详情,请参阅 [REST API 文档](./rest-api.md)。 ## 📚 其他资源 diff --git a/content/cn/docs/quickstart/hugegraph-ai/hugegraph-ml.md b/content/cn/docs/quickstart/hugegraph-ai/hugegraph-ml.md new file mode 100644 index 000000000..a75ba6c1b --- /dev/null +++ b/content/cn/docs/quickstart/hugegraph-ai/hugegraph-ml.md @@ -0,0 +1,289 @@ +--- +title: "HugeGraph-ML" +linkTitle: "HugeGraph-ML" +weight: 2 +--- + +HugeGraph-ML 将 HugeGraph 与流行的图学习库集成,支持直接在图数据上进行端到端的机器学习工作流。 + +## 概述 + +`hugegraph-ml` 提供了统一接口,用于将图神经网络和机器学习算法应用于存储在 HugeGraph 中的数据。它通过无缝转换 HugeGraph 数据到主流 ML 框架兼容格式,消除了复杂的数据导出/导入流程。 + +### 核心功能 + +- **直接 HugeGraph 集成**:无需手动导出即可直接从 HugeGraph 查询图数据 +- **21 种算法实现**:全面覆盖节点分类、图分类、嵌入和链接预测 +- **DGL 后端**:利用深度图库(DGL)进行高效训练 +- **端到端工作流**:从数据加载到模型训练和评估 +- **模块化任务**:可复用的常见 ML 场景任务抽象 + 
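+下面用一段最小的端到端示意把这些能力串起来(数据转换 → 建模 → 训练评估)。注意:其中 GCN 的导入路径 `hugegraph_ml.models.gcn` 及其构造参数是按本仓库的命名惯例推测的示例,请以实际代码为准;完整可运行示例见下文"使用示例"一节:
+
+```python
+from hugegraph_ml.data.hugegraph2dgl import HugeGraph2DGL
+from hugegraph_ml.models.gcn import GCN  # 假设:模块路径按命名惯例推测
+from hugegraph_ml.tasks.node_classify import NodeClassify
+
+# 1. 将 HugeGraph 中的图直接转换为 DGL 格式(无需手动导出)
+graph = HugeGraph2DGL().convert_graph(vertex_label="CORA_vertex", edge_label="CORA_edge")
+
+# 2. 按特征维度与标签类别数初始化模型
+model = GCN(
+    n_in_feats=graph.ndata["feat"].shape[1],
+    n_out_feats=graph.ndata["label"].unique().shape[0],
+)
+
+# 3. 复用 NodeClassify 任务完成训练与评估
+task = NodeClassify(graph=graph, model=model)
+task.train(lr=1e-2, n_epochs=200, patience=30)
+print(task.evaluate())  # 形如 {'accuracy': ..., 'loss': ...}
+```
+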
+## 环境要求 + +- **Python**:3.9+(独立模块) +- **HugeGraph Server**:1.0+(推荐:1.5+) +- **UV 包管理器**:0.7+(用于依赖管理) + +## 安装 + +### 1. 启动 HugeGraph Server + +```bash +# 方案一:Docker(推荐) +docker run -itd --name=hugegraph -p 8080:8080 hugegraph/hugegraph + +# 方案二:二进制包 +# 参见 https://hugegraph.apache.org/docs/download/download/ +``` + +### 2. 克隆并设置 + +```bash +git clone https://github.com/apache/incubator-hugegraph-ai.git +cd incubator-hugegraph-ai/hugegraph-ml +``` + +### 3. 安装依赖 + +```bash +# uv sync 自动创建 .venv 并安装所有依赖 +uv sync + +# 激活虚拟环境 +source .venv/bin/activate +``` + +### 4. 导航到源代码目录 + +```bash +cd ./src +``` + +> [!NOTE] +> 所有示例均假定您在已激活的虚拟环境中。 + +## 已实现算法 + +HugeGraph-ML 目前实现了跨多个类别的 **21 种图机器学习算法**: + +### 节点分类(11 种算法) + +基于网络结构和特征预测图节点的标签。 + +| 算法 | 论文 | 描述 | +|-----|------|------| +| **GCN** | [Kipf & Welling, 2017](https://arxiv.org/abs/1609.02907) | 图卷积网络 | +| **GAT** | [Veličković et al., 2018](https://arxiv.org/abs/1710.10903) | 图注意力网络 | +| **GraphSAGE** | [Hamilton et al., 2017](https://arxiv.org/abs/1706.02216) | 归纳式表示学习 | +| **APPNP** | [Klicpera et al., 2019](https://arxiv.org/abs/1810.05997) | 个性化 PageRank 传播 | +| **AGNN** | [Thekumparampil et al., 2018](https://arxiv.org/abs/1803.03735) | 基于注意力的 GNN | +| **ARMA** | [Bianchi et al., 2019](https://arxiv.org/abs/1901.01343) | 自回归移动平均滤波器 | +| **DAGNN** | [Liu et al., 2020](https://arxiv.org/abs/2007.09296) | 深度自适应图神经网络 | +| **DeeperGCN** | [Li et al., 2020](https://arxiv.org/abs/2006.07739) | 非常深的 GCN 架构 | +| **GRAND** | [Feng et al., 2020](https://arxiv.org/abs/2005.11079) | 图随机神经网络 | +| **JKNet** | [Xu et al., 2018](https://arxiv.org/abs/1806.03536) | 跳跃知识网络 | +| **Cluster-GCN** | [Chiang et al., 2019](https://arxiv.org/abs/1905.07953) | 通过聚类实现可扩展 GCN 训练 | + +### 图分类(2 种算法) + +基于结构和节点特征对整个图进行分类。 + +| 算法 | 论文 | 描述 | +|-----|------|------| +| **DiffPool** | [Ying et al., 2018](https://arxiv.org/abs/1806.08804) | 可微分图池化 | +| **GIN** | [Xu et al., 2019](https://arxiv.org/abs/1810.00826) | 图同构网络 | + +### 图嵌入(3 种算法) + +学习用于下游任务的无监督节点表示。 + +| 算法 | 论文 | 描述 | +|-----|------|------| +| **DGI** | [Veličković et al., 2019](https://arxiv.org/abs/1809.10341) | 深度图信息最大化(对比学习) | +| **BGRL** | [Thakoor et al., 2021](https://arxiv.org/abs/2102.06514) | 自举图表示学习 | +| **GRACE** | [Zhu et al., 2020](https://arxiv.org/abs/2006.04131) | 图对比学习 | + +### 链接预测(3 种算法) + +预测图中缺失或未来的连接。 + +| 算法 | 论文 | 描述 | +|-----|------|------| +| **SEAL** | [Zhang & Chen, 2018](https://arxiv.org/abs/1802.09691) | 子图提取和标注 | +| **P-GNN** | [You et al., 2019](http://proceedings.mlr.press/v97/you19b/you19b.pdf) | 位置感知 GNN | +| **GATNE** | [Cen et al., 2019](https://arxiv.org/abs/1905.01669) | 属性多元异构网络嵌入 | + +### 欺诈检测(2 种算法) + +检测图中的异常节点(例如欺诈账户)。 + +| 算法 | 论文 | 描述 | +|-----|------|------| +| **CARE-GNN** | [Dou et al., 2020](https://arxiv.org/abs/2008.08692) | 抗伪装 GNN | +| **BGNN** | [Zheng et al., 2021](https://arxiv.org/abs/2101.08543) | 二部图神经网络 | + +### 后处理(1 种算法) + +通过标签传播改进预测。 + +| 算法 | 论文 | 描述 | +|-----|------|------| +| **C&S** | [Huang et al., 2020](https://arxiv.org/abs/2010.13993) | 校正与平滑(预测优化) | + +## 使用示例 + +### 示例 1:使用 DGI 进行节点嵌入 + +使用深度图信息最大化(DGI)在 Cora 数据集上进行无监督节点嵌入。 + +#### 步骤 1:导入数据集(如需) + +```python +from hugegraph_ml.utils.dgl2hugegraph_utils import import_graph_from_dgl + +# 从 DGL 导入 Cora 数据集到 HugeGraph +import_graph_from_dgl("cora") +``` + +#### 步骤 2:转换图数据 + +```python +from hugegraph_ml.data.hugegraph2dgl import HugeGraph2DGL + +# 将 HugeGraph 数据转换为 DGL 格式 +hg2d = HugeGraph2DGL() +graph = hg2d.convert_graph(vertex_label="CORA_vertex", edge_label="CORA_edge") +``` + 
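+> 在进入步骤 3 之前,可以先快速检查转换结果,确认特征和标签已正确加载(示意代码,字段名与下文步骤一致):
+
+```python
+# 快速检查转换后的 DGL 图(示意)
+print(graph)                          # 节点数、边数等基本信息
+print(graph.ndata["feat"].shape)      # 节点特征维度,对应模型参数 n_in_feats
+print(graph.ndata["label"].unique())  # 标签类别,供下游分类任务使用
+```
+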
+#### 步骤 3:初始化模型 + +```python +from hugegraph_ml.models.dgi import DGI + +# 创建 DGI 模型 +model = DGI(n_in_feats=graph.ndata["feat"].shape[1]) +``` + +#### 步骤 4:训练并生成嵌入 + +```python +from hugegraph_ml.tasks.node_embed import NodeEmbed + +# 训练模型并生成节点嵌入 +node_embed_task = NodeEmbed(graph=graph, model=model) +embedded_graph = node_embed_task.train_and_embed( + add_self_loop=True, + n_epochs=300, + patience=30 +) +``` + +#### 步骤 5:下游任务(节点分类) + +```python +from hugegraph_ml.models.mlp import MLPClassifier +from hugegraph_ml.tasks.node_classify import NodeClassify + +# 使用嵌入进行节点分类 +model = MLPClassifier( + n_in_feat=embedded_graph.ndata["feat"].shape[1], + n_out_feat=embedded_graph.ndata["label"].unique().shape[0] +) +node_clf_task = NodeClassify(graph=embedded_graph, model=model) +node_clf_task.train(lr=1e-3, n_epochs=400, patience=40) +print(node_clf_task.evaluate()) +``` + +**预期输出:** +```python +{'accuracy': 0.82, 'loss': 0.5714246034622192} +``` + +**完整示例**:参见 [dgi_example.py](https://github.com/apache/incubator-hugegraph-ai/blob/main/hugegraph-ml/src/hugegraph_ml/examples/dgi_example.py) + +### 示例 2:使用 GRAND 进行节点分类 + +使用 GRAND 模型直接对节点进行分类(无需单独的嵌入步骤)。 + +```python +from hugegraph_ml.data.hugegraph2dgl import HugeGraph2DGL +from hugegraph_ml.models.grand import GRAND +from hugegraph_ml.tasks.node_classify import NodeClassify + +# 加载图 +hg2d = HugeGraph2DGL() +graph = hg2d.convert_graph(vertex_label="CORA_vertex", edge_label="CORA_edge") + +# 初始化 GRAND 模型 +model = GRAND( + n_in_feats=graph.ndata["feat"].shape[1], + n_out_feats=graph.ndata["label"].unique().shape[0] +) + +# 训练和评估 +node_clf_task = NodeClassify(graph=graph, model=model) +node_clf_task.train(lr=1e-2, n_epochs=1500, patience=100) +print(node_clf_task.evaluate()) +``` + +**完整示例**:参见 [grand_example.py](https://github.com/apache/incubator-hugegraph-ai/blob/main/hugegraph-ml/src/hugegraph_ml/examples/grand_example.py) + +## 核心组件 + +### HugeGraph2DGL 转换器 + +无缝将 HugeGraph 数据转换为 DGL 图格式: + +```python +from hugegraph_ml.data.hugegraph2dgl import HugeGraph2DGL + +hg2d = HugeGraph2DGL() +graph = hg2d.convert_graph( + vertex_label="person", # 要提取的顶点标签 + edge_label="knows", # 要提取的边标签 + directed=False # 图的方向性 +) +``` + +### 任务抽象 + +用于常见 ML 工作流的可复用任务对象: + +| 任务 | 类 | 用途 | +|-----|-----|------| +| 节点嵌入 | `NodeEmbed` | 生成无监督节点嵌入 | +| 节点分类 | `NodeClassify` | 预测节点标签 | +| 图分类 | `GraphClassify` | 预测图级标签 | +| 链接预测 | `LinkPredict` | 预测缺失边 | + +## 最佳实践 + +1. **从小数据集开始**:在扩展之前先在小图(例如 Cora、Citeseer)上测试您的流程 +2. **使用早停**:设置 `patience` 参数以避免过拟合 +3. **调整超参数**:根据数据集大小调整学习率、隐藏维度和周期数 +4. **监控 GPU 内存**:大图可能需要批量训练(例如 Cluster-GCN) +5. **验证 Schema**:确保顶点/边标签与您的 HugeGraph schema 匹配 + +## 故障排除 + +| 问题 | 解决方案 | +|-----|---------| +| 连接 HugeGraph "Connection refused" | 验证服务器是否在 8080 端口运行 | +| CUDA 内存不足 | 减少批大小或使用仅 CPU 模式 | +| 模型收敛问题 | 尝试不同的学习率(1e-2、1e-3、1e-4) | +| DGL 的 ImportError | 运行 `uv sync` 重新安装依赖 | + +## 贡献 + +添加新算法: + +1. 在 `src/hugegraph_ml/models/your_model.py` 创建模型文件 +2. 继承基础模型类并实现 `forward()` 方法 +3. 在 `src/hugegraph_ml/examples/` 添加示例脚本 +4. 
更新此文档并添加算法详情 + +## 另见 + +- [HugeGraph-AI 概述](../_index.md) - 完整 AI 生态系统 +- [HugeGraph-LLM](./hugegraph-llm.md) - RAG 和知识图谱构建 +- [GitHub 仓库](https://github.com/apache/incubator-hugegraph-ai/tree/main/hugegraph-ml) - 源代码和示例 diff --git a/content/cn/docs/quickstart/hugegraph-ai/quick_start.md b/content/cn/docs/quickstart/hugegraph-ai/quick_start.md index 6d8d22f90..da148f7e7 100644 --- a/content/cn/docs/quickstart/hugegraph-ai/quick_start.md +++ b/content/cn/docs/quickstart/hugegraph-ai/quick_start.md @@ -190,3 +190,63 @@ graph TD; # 5. 图工具 输入 Gremlin 查询以执行相应操作。 + +# 6. 语言切换 (v1.5.0+) + +HugeGraph-LLM 支持双语提示词,以提高跨语言的准确性。 + +### 在英文和中文之间切换 + +系统语言影响: +- **系统提示词**:LLM 使用的内部提示词 +- **关键词提取**:特定语言的提取逻辑 +- **答案生成**:响应格式和风格 + +#### 配置方法一:环境变量 + +编辑您的 `.env` 文件: + +```bash +# 英文提示词(默认) +LANGUAGE=EN + +# 中文提示词 +LANGUAGE=CN +``` + +更改语言设置后重启服务。 + +#### 配置方法二:Web UI(动态) + +如果您的部署中可用,使用 Web UI 中的设置面板切换语言,无需重启: + +1. 导航到**设置**或**配置**选项卡 +2. 选择**语言**:`EN` 或 `CN` +3. 点击**保存** - 更改立即生效 + +#### 特定语言的行为 + +| 语言 | 关键词提取 | 答案风格 | 使用场景 | +|-----|-----------|---------|---------| +| `EN` | 英文 NLP 模型 | 专业、简洁 | 国际用户、英文文档 | +| `CN` | 中文 NLP 模型 | 自然的中文表达 | 中文用户、中文文档 | + +> [!TIP] +> 将 `LANGUAGE` 设置与您的主要文档语言匹配,以获得最佳 RAG 准确性。 + +### REST API 语言覆盖 + +使用 REST API 时,您可以为每个请求指定自定义提示词,以覆盖默认语言设置: + +```bash +curl -X POST http://localhost:8001/rag \ + -H "Content-Type: application/json" \ + -d '{ + "query": "告诉我关于阿尔·帕西诺的信息", + "graph_only": true, + "keywords_extract_prompt": "请从以下文本中提取关键实体...", + "answer_prompt": "请根据以下上下文回答问题..." + }' +``` + +完整参数详情请参阅 [REST API 参考](./rest-api.md)。 diff --git a/content/cn/docs/quickstart/hugegraph-ai/rest-api.md b/content/cn/docs/quickstart/hugegraph-ai/rest-api.md new file mode 100644 index 000000000..349ff4c06 --- /dev/null +++ b/content/cn/docs/quickstart/hugegraph-ai/rest-api.md @@ -0,0 +1,428 @@ +--- +title: "REST API 参考" +linkTitle: "REST API" +weight: 5 +--- + +HugeGraph-LLM 提供 REST API 端点,用于将 RAG 和 Text2Gremlin 功能集成到您的应用程序中。 + +## 基础 URL + +``` +http://localhost:8001 +``` + +启动服务时更改主机/端口: +```bash +python -m hugegraph_llm.demo.rag_demo.app --host 127.0.0.1 --port 8001 +``` + +## 认证 + +目前 API 支持可选的基于令牌的认证: + +```bash +# 在 .env 中启用认证 +ENABLE_LOGIN=true +USER_TOKEN=your-user-token +ADMIN_TOKEN=your-admin-token +``` + +在请求头中传递令牌: +```bash +Authorization: Bearer +``` + +--- + +## RAG 端点 + +### 1. 
完整 RAG 查询 + +**POST** `/rag` + +执行完整的 RAG 工作流,包括关键词提取、图检索、向量搜索、重排序和答案生成。 + +#### 请求体 + +```json +{ + "query": "给我讲讲阿尔·帕西诺的电影", + "raw_answer": false, + "vector_only": false, + "graph_only": true, + "graph_vector_answer": false, + "graph_ratio": 0.5, + "rerank_method": "cohere", + "near_neighbor_first": false, + "gremlin_tmpl_num": 5, + "max_graph_items": 30, + "topk_return_results": 20, + "vector_dis_threshold": 0.9, + "topk_per_keyword": 1, + "custom_priority_info": "", + "answer_prompt": "", + "keywords_extract_prompt": "", + "gremlin_prompt": "", + "client_config": { + "url": "127.0.0.1:8080", + "graph": "hugegraph", + "user": "admin", + "pwd": "admin", + "gs": "" + } +} +``` + +**参数说明:** + +| 字段 | 类型 | 必需 | 默认值 | 描述 | +|-----|------|------|-------|------| +| `query` | string | 是 | - | 用户的自然语言问题 | +| `raw_answer` | boolean | 否 | false | 返回 LLM 答案而不检索 | +| `vector_only` | boolean | 否 | false | 仅使用向量搜索(无图) | +| `graph_only` | boolean | 否 | false | 仅使用图检索(无向量) | +| `graph_vector_answer` | boolean | 否 | false | 结合图和向量结果 | +| `graph_ratio` | float | 否 | 0.5 | 图与向量结果的比例(0-1) | +| `rerank_method` | string | 否 | "" | 重排序器:"cohere"、"siliconflow"、"" | +| `near_neighbor_first` | boolean | 否 | false | 优先选择直接邻居 | +| `gremlin_tmpl_num` | integer | 否 | 5 | 尝试的 Gremlin 模板数量 | +| `max_graph_items` | integer | 否 | 30 | 图检索的最大项数 | +| `topk_return_results` | integer | 否 | 20 | 重排序后的 Top-K | +| `vector_dis_threshold` | float | 否 | 0.9 | 向量相似度阈值(0-1) | +| `topk_per_keyword` | integer | 否 | 1 | 每个关键词的 Top-K 向量 | +| `custom_priority_info` | string | 否 | "" | 要优先考虑的自定义上下文 | +| `answer_prompt` | string | 否 | "" | 自定义答案生成提示词 | +| `keywords_extract_prompt` | string | 否 | "" | 自定义关键词提取提示词 | +| `gremlin_prompt` | string | 否 | "" | 自定义 Gremlin 生成提示词 | +| `client_config` | object | 否 | null | 覆盖图连接设置 | + +#### 响应 + +```json +{ + "query": "给我讲讲阿尔·帕西诺的电影", + "graph_only": { + "answer": "阿尔·帕西诺主演了《教父》(1972 年),由弗朗西斯·福特·科波拉执导...", + "context": ["《教父》是 1972 年的犯罪电影...", "..."], + "graph_paths": ["..."], + "keywords": ["阿尔·帕西诺", "电影"] + } +} +``` + +#### 示例(curl) + +```bash +curl -X POST http://localhost:8001/rag \ + -H "Content-Type: application/json" \ + -d '{ + "query": "给我讲讲阿尔·帕西诺", + "graph_only": true, + "max_graph_items": 30 + }' +``` + +### 2. 
仅图检索 + +**POST** `/rag/graph` + +检索图上下文而不生成答案。用于调试或自定义处理。 + +#### 请求体 + +```json +{ + "query": "阿尔·帕西诺的电影", + "max_graph_items": 30, + "topk_return_results": 20, + "vector_dis_threshold": 0.9, + "topk_per_keyword": 1, + "gremlin_tmpl_num": 5, + "rerank_method": "cohere", + "near_neighbor_first": false, + "custom_priority_info": "", + "gremlin_prompt": "", + "get_vertex_only": false, + "client_config": { + "url": "127.0.0.1:8080", + "graph": "hugegraph", + "user": "admin", + "pwd": "admin", + "gs": "" + } +} +``` + +**额外参数:** + +| 字段 | 类型 | 默认值 | 描述 | +|-----|------|-------|------| +| `get_vertex_only` | boolean | false | 仅返回顶点 ID,不返回完整详情 | + +#### 响应 + +```json +{ + "graph_recall": { + "query": "阿尔·帕西诺的电影", + "keywords": ["阿尔·帕西诺", "电影"], + "match_vids": ["1:阿尔·帕西诺", "2:教父"], + "graph_result_flag": true, + "gremlin": "g.V('1:阿尔·帕西诺').outE().inV().limit(30)", + "graph_result": [ + {"id": "1:阿尔·帕西诺", "label": "person", "properties": {"name": "阿尔·帕西诺"}}, + {"id": "2:教父", "label": "movie", "properties": {"title": "教父"}} + ], + "vertex_degree_list": [5, 12] + } +} +``` + +#### 示例(curl) + +```bash +curl -X POST http://localhost:8001/rag/graph \ + -H "Content-Type: application/json" \ + -d '{ + "query": "阿尔·帕西诺", + "max_graph_items": 30, + "get_vertex_only": false + }' +``` + +--- + +## Text2Gremlin 端点 + +### 3. 自然语言转 Gremlin + +**POST** `/text2gremlin` + +将自然语言查询转换为可执行的 Gremlin 命令。 + +#### 请求体 + +```json +{ + "query": "查找所有由弗朗西斯·福特·科波拉执导的电影", + "example_num": 5, + "gremlin_prompt": "", + "output_types": ["GREMLIN", "RESULT"], + "client_config": { + "url": "127.0.0.1:8080", + "graph": "hugegraph", + "user": "admin", + "pwd": "admin", + "gs": "" + } +} +``` + +**参数说明:** + +| 字段 | 类型 | 必需 | 默认值 | 描述 | +|-----|------|------|-------|------| +| `query` | string | 是 | - | 自然语言查询 | +| `example_num` | integer | 否 | 5 | 使用的示例模板数量 | +| `gremlin_prompt` | string | 否 | "" | Gremlin 生成的自定义提示词 | +| `output_types` | array | 否 | null | 输出类型:["GREMLIN", "RESULT", "CYPHER"] | +| `client_config` | object | 否 | null | 图连接覆盖 | + +**输出类型:** +- `GREMLIN`:生成的 Gremlin 查询 +- `RESULT`:图的执行结果 +- `CYPHER`:Cypher 查询(如果请求) + +#### 响应 + +```json +{ + "gremlin": "g.V().has('person','name','弗朗西斯·福特·科波拉').out('directed').hasLabel('movie').values('title')", + "result": [ + "教父", + "教父 2", + "现代启示录" + ] +} +``` + +#### 示例(curl) + +```bash +curl -X POST http://localhost:8001/text2gremlin \ + -H "Content-Type: application/json" \ + -d '{ + "query": "查找所有由弗朗西斯·福特·科波拉执导的电影", + "output_types": ["GREMLIN", "RESULT"] + }' +``` + +--- + +## 配置端点 + +### 4. 更新图连接 + +**POST** `/config/graph` + +动态更新 HugeGraph 连接设置。 + +#### 请求体 + +```json +{ + "url": "127.0.0.1:8080", + "name": "hugegraph", + "user": "admin", + "pwd": "admin", + "gs": "" +} +``` + +#### 响应 + +```json +{ + "status_code": 201, + "message": "图配置更新成功" +} +``` + +### 5. 更新 LLM 配置 + +**POST** `/config/llm` + +运行时更新聊天/提取 LLM 设置。 + +#### 请求体(OpenAI) + +```json +{ + "llm_type": "openai", + "api_key": "sk-your-api-key", + "api_base": "https://api.openai.com/v1", + "language_model": "gpt-4o-mini", + "max_tokens": 4096 +} +``` + +#### 请求体(Ollama) + +```json +{ + "llm_type": "ollama/local", + "host": "127.0.0.1", + "port": 11434, + "language_model": "llama3.1:8b" +} +``` + +### 6. 更新嵌入配置 + +**POST** `/config/embedding` + +更新嵌入模型设置。 + +#### 请求体 + +```json +{ + "llm_type": "openai", + "api_key": "sk-your-api-key", + "api_base": "https://api.openai.com/v1", + "language_model": "text-embedding-3-small" +} +``` + +### 7. 
更新 Reranker 配置 + +**POST** `/config/rerank` + +配置重排序器设置。 + +#### 请求体(Cohere) + +```json +{ + "reranker_type": "cohere", + "api_key": "your-cohere-key", + "reranker_model": "rerank-multilingual-v3.0", + "cohere_base_url": "https://api.cohere.com/v1/rerank" +} +``` + +#### 请求体(SiliconFlow) + +```json +{ + "reranker_type": "siliconflow", + "api_key": "your-siliconflow-key", + "reranker_model": "BAAI/bge-reranker-v2-m3" +} +``` + +--- + +## 错误响应 + +所有端点返回标准 HTTP 状态码: + +| 代码 | 含义 | +|-----|------| +| 200 | 成功 | +| 201 | 已创建(配置已更新) | +| 400 | 错误请求(无效参数) | +| 500 | 内部服务器错误 | +| 501 | 未实现 | + +错误响应格式: +```json +{ + "detail": "描述错误的消息" +} +``` + +--- + +## Python 客户端示例 + +```python +import requests + +BASE_URL = "http://localhost:8001" + +# 1. 配置图连接 +graph_config = { + "url": "127.0.0.1:8080", + "name": "hugegraph", + "user": "admin", + "pwd": "admin" +} +requests.post(f"{BASE_URL}/config/graph", json=graph_config) + +# 2. 执行 RAG 查询 +rag_request = { + "query": "给我讲讲阿尔·帕西诺", + "graph_only": True, + "max_graph_items": 30 +} +response = requests.post(f"{BASE_URL}/rag", json=rag_request) +print(response.json()) + +# 3. 从自然语言生成 Gremlin +text2gql_request = { + "query": "查找所有与阿尔·帕西诺合作的导演", + "output_types": ["GREMLIN", "RESULT"] +} +response = requests.post(f"{BASE_URL}/text2gremlin", json=text2gql_request) +print(response.json()) +``` + +--- + +## 另见 + +- [配置参考](./config-reference.md) - 完整的 .env 配置指南 +- [HugeGraph-LLM 概述](./hugegraph-llm.md) - 架构和功能 +- [快速入门指南](./quick_start.md) - Web UI 入门 diff --git a/content/en/docs/quickstart/hugegraph-ai/_index.md b/content/en/docs/quickstart/hugegraph-ai/_index.md index 2875a1cef..196e66818 100644 --- a/content/en/docs/quickstart/hugegraph-ai/_index.md +++ b/content/en/docs/quickstart/hugegraph-ai/_index.md @@ -18,20 +18,31 @@ weight: 3 ## ✨ Key Features - **GraphRAG**: Build intelligent question-answering systems with graph-enhanced retrieval +- **Text2Gremlin**: Natural language to graph query conversion with REST API - **Knowledge Graph Construction**: Automated graph building from text using LLMs -- **Graph ML**: Integration with 20+ graph learning algorithms (GCN, GAT, GraphSAGE, etc.) +- **Graph ML**: Integration with 21 graph learning algorithms (GCN, GAT, GraphSAGE, etc.) 
- **Python Client**: Easy-to-use Python interface for HugeGraph operations - **AI Agents**: Intelligent graph analysis and reasoning capabilities +### 🎉 What's New in v1.5.0 + +- **Text2Gremlin REST API**: Convert natural language queries to Gremlin commands via REST endpoints +- **Multi-Model Vector Support**: Each graph instance can use independent embedding models +- **Bilingual Prompt Support**: Switch between English and Chinese prompts (EN/CN) +- **Semi-Automatic Schema Generation**: Intelligent schema inference from text data +- **Semi-Automatic Prompt Generation**: Context-aware prompt templates +- **Enhanced Reranker Support**: Integration with Cohere and SiliconFlow rerankers +- **LiteLLM Multi-Provider Support**: Unified interface for OpenAI, Anthropic, Gemini, and more + ## 🚀 Quick Start > [!NOTE] > For a complete deployment guide and detailed examples, please refer to [hugegraph-llm/README.md](https://github.com/apache/incubator-hugegraph-ai/blob/main/hugegraph-llm/README.md) ### Prerequisites -- Python 3.9+ (3.10+ recommended for hugegraph-llm) -- [uv](https://docs.astral.sh/uv/) (recommended package manager) -- HugeGraph Server 1.3+ (1.5+ recommended) +- Python 3.10+ (required for hugegraph-llm) +- [uv](https://docs.astral.sh/uv/) 0.7+ (recommended package manager) +- HugeGraph Server 1.5+ (required) - Docker (optional, for containerized deployment) ### Option 1: Docker Deployment (Recommended) @@ -123,11 +134,13 @@ Large language model integration for graph applications: - **AI Agents**: Intelligent graph analysis and reasoning ### [hugegraph-ml](https://github.com/apache/incubator-hugegraph-ai/tree/main/hugegraph-ml) -Graph machine learning with 20+ implemented algorithms: -- **Node Classification**: GCN, GAT, GraphSAGE, APPNP, etc. -- **Graph Classification**: DiffPool, P-GNN, etc. -- **Graph Embedding**: DeepWalk, Node2Vec, GRACE, etc. -- **Link Prediction**: SEAL, GATNE, etc. +Graph machine learning with 21 implemented algorithms: +- **Node Classification**: GCN, GAT, GraphSAGE, APPNP, AGNN, ARMA, DAGNN, DeeperGCN, GRAND, JKNet, Cluster-GCN +- **Graph Classification**: DiffPool, GIN +- **Graph Embedding**: DGI, BGRL, GRACE +- **Link Prediction**: SEAL, P-GNN, GATNE +- **Fraud Detection**: CARE-GNN, BGNN +- **Post-Processing**: C&S (Correct & Smooth) ### [hugegraph-python-client](https://github.com/apache/incubator-hugegraph-ai/tree/main/hugegraph-python-client) Python client for HugeGraph operations: diff --git a/content/en/docs/quickstart/hugegraph-ai/config-reference.md b/content/en/docs/quickstart/hugegraph-ai/config-reference.md new file mode 100644 index 000000000..502a1d568 --- /dev/null +++ b/content/en/docs/quickstart/hugegraph-ai/config-reference.md @@ -0,0 +1,396 @@ +--- +title: "Configuration Reference" +linkTitle: "Configuration Reference" +weight: 4 +--- + +This document provides a comprehensive reference for all configuration options in HugeGraph-LLM. + +## Configuration Files + +- **Environment File**: `.env` (created from template or auto-generated) +- **Prompt Configuration**: `src/hugegraph_llm/resources/demo/config_prompt.yaml` + +> [!TIP] +> Run `python -m hugegraph_llm.config.generate --update` to auto-generate or update configuration files with defaults. + +## Environment Variables Overview + +### 1. 
Language and Model Type Selection + +```bash +# Prompt language (affects system prompts and generated text) +LANGUAGE=EN # Options: EN | CN + +# LLM Type for different tasks +CHAT_LLM_TYPE=openai # Chat/RAG: openai | litellm | ollama/local +EXTRACT_LLM_TYPE=openai # Entity extraction: openai | litellm | ollama/local +TEXT2GQL_LLM_TYPE=openai # Text2Gremlin: openai | litellm | ollama/local + +# Embedding type +EMBEDDING_TYPE=openai # Options: openai | litellm | ollama/local + +# Reranker type (optional) +RERANKER_TYPE= # Options: cohere | siliconflow | (empty for none) +``` + +### 2. OpenAI Configuration + +Each LLM task (chat, extract, text2gql) has independent configuration: + +#### 2.1 Chat LLM (RAG Answer Generation) + +```bash +OPENAI_CHAT_API_BASE=https://api.openai.com/v1 +OPENAI_CHAT_API_KEY=sk-your-api-key-here +OPENAI_CHAT_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_CHAT_TOKENS=8192 # Max tokens for chat responses +``` + +#### 2.2 Extract LLM (Entity & Relation Extraction) + +```bash +OPENAI_EXTRACT_API_BASE=https://api.openai.com/v1 +OPENAI_EXTRACT_API_KEY=sk-your-api-key-here +OPENAI_EXTRACT_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_EXTRACT_TOKENS=1024 # Max tokens for extraction +``` + +#### 2.3 Text2GQL LLM (Natural Language to Gremlin) + +```bash +OPENAI_TEXT2GQL_API_BASE=https://api.openai.com/v1 +OPENAI_TEXT2GQL_API_KEY=sk-your-api-key-here +OPENAI_TEXT2GQL_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_TEXT2GQL_TOKENS=4096 # Max tokens for query generation +``` + +#### 2.4 Embedding Model + +```bash +OPENAI_EMBEDDING_API_BASE=https://api.openai.com/v1 +OPENAI_EMBEDDING_API_KEY=sk-your-api-key-here +OPENAI_EMBEDDING_MODEL=text-embedding-3-small +``` + +> [!NOTE] +> You can use different API keys/endpoints for each task to optimize costs or use specialized models. + +### 3. LiteLLM Configuration (Multi-Provider Support) + +LiteLLM enables unified access to 100+ LLM providers (OpenAI, Anthropic, Google, Azure, etc.). + +#### 3.1 Chat LLM + +```bash +LITELLM_CHAT_API_BASE=http://localhost:4000 # LiteLLM proxy URL +LITELLM_CHAT_API_KEY=sk-litellm-key # LiteLLM API key +LITELLM_CHAT_LANGUAGE_MODEL=anthropic/claude-3-5-sonnet-20241022 +LITELLM_CHAT_TOKENS=8192 +``` + +#### 3.2 Extract LLM + +```bash +LITELLM_EXTRACT_API_BASE=http://localhost:4000 +LITELLM_EXTRACT_API_KEY=sk-litellm-key +LITELLM_EXTRACT_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_EXTRACT_TOKENS=256 +``` + +#### 3.3 Text2GQL LLM + +```bash +LITELLM_TEXT2GQL_API_BASE=http://localhost:4000 +LITELLM_TEXT2GQL_API_KEY=sk-litellm-key +LITELLM_TEXT2GQL_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_TEXT2GQL_TOKENS=4096 +``` + +#### 3.4 Embedding + +```bash +LITELLM_EMBEDDING_API_BASE=http://localhost:4000 +LITELLM_EMBEDDING_API_KEY=sk-litellm-key +LITELLM_EMBEDDING_MODEL=openai/text-embedding-3-small +``` + +**Model Format**: `provider/model-name` + +Examples: +- `openai/gpt-4o-mini` +- `anthropic/claude-3-5-sonnet-20241022` +- `google/gemini-2.0-flash-exp` +- `azure/gpt-4` + +See [LiteLLM Providers](https://docs.litellm.ai/docs/providers) for the complete list. + +### 4. Ollama Configuration (Local Deployment) + +Run local LLMs with Ollama for privacy and cost control. 
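Before wiring these endpoints into `.env`, it can help to confirm that the Ollama daemon is reachable and the required models are already pulled. The sketch below is an illustration, not part of the HugeGraph tooling; it assumes the standard Ollama REST API on the default port 11434 (its `/api/tags` endpoint lists locally pulled models) and uses the `requests` package:

```python
import requests

OLLAMA_URL = "http://127.0.0.1:11434"  # matches the OLLAMA_*_HOST / OLLAMA_*_PORT values below

# GET /api/tags returns the models already pulled into the local Ollama instance
resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=5)
resp.raise_for_status()
pulled = {m["name"] for m in resp.json().get("models", [])}

for model in ("llama3.1:8b", "qwen2.5-coder:7b", "nomic-embed-text"):
    # pulled names may carry a tag suffix such as ":latest", so match by prefix
    if any(name.startswith(model) for name in pulled):
        print(f"{model}: OK")
    else:
        print(f"{model}: missing (run: ollama pull {model})")
```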
+ +#### 4.1 Chat LLM + +```bash +OLLAMA_CHAT_HOST=127.0.0.1 +OLLAMA_CHAT_PORT=11434 +OLLAMA_CHAT_LANGUAGE_MODEL=llama3.1:8b +``` + +#### 4.2 Extract LLM + +```bash +OLLAMA_EXTRACT_HOST=127.0.0.1 +OLLAMA_EXTRACT_PORT=11434 +OLLAMA_EXTRACT_LANGUAGE_MODEL=llama3.1:8b +``` + +#### 4.3 Text2GQL LLM + +```bash +OLLAMA_TEXT2GQL_HOST=127.0.0.1 +OLLAMA_TEXT2GQL_PORT=11434 +OLLAMA_TEXT2GQL_LANGUAGE_MODEL=qwen2.5-coder:7b +``` + +#### 4.4 Embedding + +```bash +OLLAMA_EMBEDDING_HOST=127.0.0.1 +OLLAMA_EMBEDDING_PORT=11434 +OLLAMA_EMBEDDING_MODEL=nomic-embed-text +``` + +> [!TIP] +> Download models: `ollama pull llama3.1:8b` or `ollama pull qwen2.5-coder:7b` + +### 5. Reranker Configuration + +Rerankers improve RAG accuracy by reordering retrieved results based on relevance. + +#### 5.1 Cohere Reranker + +```bash +RERANKER_TYPE=cohere +COHERE_BASE_URL=https://api.cohere.com/v1/rerank +RERANKER_API_KEY=your-cohere-api-key +RERANKER_MODEL=rerank-english-v3.0 +``` + +Available models: +- `rerank-english-v3.0` (English) +- `rerank-multilingual-v3.0` (100+ languages) + +#### 5.2 SiliconFlow Reranker + +```bash +RERANKER_TYPE=siliconflow +RERANKER_API_KEY=your-siliconflow-api-key +RERANKER_MODEL=BAAI/bge-reranker-v2-m3 +``` + +### 6. HugeGraph Connection + +Configure connection to your HugeGraph server instance. + +```bash +# Server connection +GRAPH_IP=127.0.0.1 +GRAPH_PORT=8080 +GRAPH_NAME=hugegraph # Graph instance name +GRAPH_USER=admin # Username +GRAPH_PWD=admin-password # Password +GRAPH_SPACE= # Graph space (optional, for multi-tenancy) +``` + +### 7. Query Parameters + +Control graph traversal behavior and result limits. + +```bash +# Graph traversal limits +MAX_GRAPH_PATH=10 # Max path depth for graph queries +MAX_GRAPH_ITEMS=30 # Max items to retrieve from graph +EDGE_LIMIT_PRE_LABEL=8 # Max edges per label type + +# Property filtering +LIMIT_PROPERTY=False # Limit properties in results (True/False) +``` + +### 8. Vector Search Configuration + +Configure vector similarity search parameters. + +```bash +# Vector search thresholds +VECTOR_DIS_THRESHOLD=0.9 # Min cosine similarity (0-1, higher = stricter) +TOPK_PER_KEYWORD=1 # Top-K results per extracted keyword +``` + +### 9. Rerank Configuration + +```bash +# Rerank result limits +TOPK_RETURN_RESULTS=20 # Number of top results after reranking +``` + +## Configuration Priority + +The system loads configuration in the following order (later sources override earlier ones): + +1. **Default Values** (in `*_config.py` files) +2. **Environment Variables** (from `.env` file) +3. 
**Runtime Updates** (via Web UI or API calls) + +## Example Configurations + +### Minimal Setup (OpenAI) + +```bash +# Language +LANGUAGE=EN + +# LLM Types +CHAT_LLM_TYPE=openai +EXTRACT_LLM_TYPE=openai +TEXT2GQL_LLM_TYPE=openai +EMBEDDING_TYPE=openai + +# OpenAI Credentials (single key for all tasks) +OPENAI_API_BASE=https://api.openai.com/v1 +OPENAI_API_KEY=sk-your-api-key-here +OPENAI_LANGUAGE_MODEL=gpt-4o-mini +OPENAI_EMBEDDING_MODEL=text-embedding-3-small + +# HugeGraph Connection +GRAPH_IP=127.0.0.1 +GRAPH_PORT=8080 +GRAPH_NAME=hugegraph +GRAPH_USER=admin +GRAPH_PWD=admin +``` + +### Production Setup (LiteLLM + Reranker) + +```bash +# Bilingual support +LANGUAGE=EN + +# LiteLLM for flexibility +CHAT_LLM_TYPE=litellm +EXTRACT_LLM_TYPE=litellm +TEXT2GQL_LLM_TYPE=litellm +EMBEDDING_TYPE=litellm + +# LiteLLM Proxy +LITELLM_CHAT_API_BASE=http://localhost:4000 +LITELLM_CHAT_API_KEY=sk-litellm-master-key +LITELLM_CHAT_LANGUAGE_MODEL=anthropic/claude-3-5-sonnet-20241022 +LITELLM_CHAT_TOKENS=8192 + +LITELLM_EXTRACT_API_BASE=http://localhost:4000 +LITELLM_EXTRACT_API_KEY=sk-litellm-master-key +LITELLM_EXTRACT_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_EXTRACT_TOKENS=256 + +LITELLM_TEXT2GQL_API_BASE=http://localhost:4000 +LITELLM_TEXT2GQL_API_KEY=sk-litellm-master-key +LITELLM_TEXT2GQL_LANGUAGE_MODEL=openai/gpt-4o-mini +LITELLM_TEXT2GQL_TOKENS=4096 + +LITELLM_EMBEDDING_API_BASE=http://localhost:4000 +LITELLM_EMBEDDING_API_KEY=sk-litellm-master-key +LITELLM_EMBEDDING_MODEL=openai/text-embedding-3-small + +# Cohere Reranker for better accuracy +RERANKER_TYPE=cohere +COHERE_BASE_URL=https://api.cohere.com/v1/rerank +RERANKER_API_KEY=your-cohere-key +RERANKER_MODEL=rerank-multilingual-v3.0 + +# HugeGraph with authentication +GRAPH_IP=prod-hugegraph.example.com +GRAPH_PORT=8080 +GRAPH_NAME=production_graph +GRAPH_USER=rag_user +GRAPH_PWD=secure-password +GRAPH_SPACE=prod_space + +# Optimized query parameters +MAX_GRAPH_PATH=15 +MAX_GRAPH_ITEMS=50 +VECTOR_DIS_THRESHOLD=0.85 +TOPK_RETURN_RESULTS=30 +``` + +### Local/Offline Setup (Ollama) + +```bash +# Language +LANGUAGE=EN + +# All local models via Ollama +CHAT_LLM_TYPE=ollama/local +EXTRACT_LLM_TYPE=ollama/local +TEXT2GQL_LLM_TYPE=ollama/local +EMBEDDING_TYPE=ollama/local + +# Ollama endpoints +OLLAMA_CHAT_HOST=127.0.0.1 +OLLAMA_CHAT_PORT=11434 +OLLAMA_CHAT_LANGUAGE_MODEL=llama3.1:8b + +OLLAMA_EXTRACT_HOST=127.0.0.1 +OLLAMA_EXTRACT_PORT=11434 +OLLAMA_EXTRACT_LANGUAGE_MODEL=llama3.1:8b + +OLLAMA_TEXT2GQL_HOST=127.0.0.1 +OLLAMA_TEXT2GQL_PORT=11434 +OLLAMA_TEXT2GQL_LANGUAGE_MODEL=qwen2.5-coder:7b + +OLLAMA_EMBEDDING_HOST=127.0.0.1 +OLLAMA_EMBEDDING_PORT=11434 +OLLAMA_EMBEDDING_MODEL=nomic-embed-text + +# No reranker for offline setup +RERANKER_TYPE= + +# Local HugeGraph +GRAPH_IP=127.0.0.1 +GRAPH_PORT=8080 +GRAPH_NAME=hugegraph +GRAPH_USER=admin +GRAPH_PWD=admin +``` + +## Configuration Validation + +After modifying `.env`, verify your configuration: + +1. **Via Web UI**: Visit `http://localhost:8001` and check the settings panel +2. **Via Python**: +```python +from hugegraph_llm.config import settings +print(settings.llm_config) +print(settings.hugegraph_config) +``` +3. 
**Via REST API**:
```bash
curl http://localhost:8001/config
```

## Troubleshooting

| Issue | Solution |
|-------|----------|
| "API key not found" | Check `*_API_KEY` is set correctly in `.env` |
| "Connection refused" | Verify `GRAPH_IP` and `GRAPH_PORT` are correct |
| "Model not found" | For Ollama: run `ollama pull <model-name>` |
| "Rate limit exceeded" | Reduce `MAX_GRAPH_ITEMS` or use different API keys |
| "Embedding dimension mismatch" | Delete existing vectors and rebuild with correct model |

## See Also

- [HugeGraph-LLM Overview](./hugegraph-llm.md)
- [REST API Reference](./rest-api.md)
- [Quick Start Guide](./quick_start.md)
diff --git a/content/en/docs/quickstart/hugegraph-ai/hugegraph-llm.md b/content/en/docs/quickstart/hugegraph-ai/hugegraph-llm.md
index b64b1fa7d..171c3cf4d 100644
--- a/content/en/docs/quickstart/hugegraph-ai/hugegraph-llm.md
+++ b/content/en/docs/quickstart/hugegraph-ai/hugegraph-llm.md
@@ -224,7 +224,80 @@ After running the demo, configuration files are automatically generated:
 > [!NOTE]
 > Configuration changes are automatically saved when using the web interface. For manual changes, simply refresh the page to load updates.
 
-**LLM Provider Support**: This project uses [LiteLLM](https://docs.litellm.ai/docs/providers) for multi-provider LLM support.
+### LLM Provider Configuration
+
+This project uses [LiteLLM](https://docs.litellm.ai/docs/providers) for multi-provider LLM support, enabling unified access to OpenAI, Anthropic, Google, Cohere, and 100+ other providers.
+
+#### Option 1: Direct LLM Connection (OpenAI, Ollama)
+
+```bash
+# .env configuration
+chat_llm_type=openai          # or ollama/local
+openai_api_key=sk-xxx
+openai_api_base=https://api.openai.com/v1
+openai_language_model=gpt-4o-mini
+openai_max_tokens=4096
+```
+
+#### Option 2: LiteLLM Multi-Provider Support
+
+LiteLLM acts as a unified proxy for multiple LLM providers:
+
+```bash
+# .env configuration
+chat_llm_type=litellm
+extract_llm_type=litellm
+text2gql_llm_type=litellm
+
+# LiteLLM settings
+litellm_api_base=http://localhost:4000    # LiteLLM proxy server
+litellm_api_key=sk-1234                   # LiteLLM API key
+
+# Model selection (provider/model format)
+litellm_language_model=anthropic/claude-3-5-sonnet-20241022
+litellm_max_tokens=4096
+```
+
+**Supported Providers**: OpenAI, Anthropic, Google (Gemini), Azure, Cohere, Bedrock, Vertex AI, Hugging Face, and more.
+
+For full provider list and configuration details, visit [LiteLLM Providers](https://docs.litellm.ai/docs/providers).
+
+### Reranker Configuration
+
+Rerankers improve RAG accuracy by reordering retrieved results. Supported providers:
+
+```bash
+# Cohere Reranker
+reranker_type=cohere
+cohere_api_key=your-cohere-key
+cohere_rerank_model=rerank-english-v3.0
+
+# SiliconFlow Reranker
+reranker_type=siliconflow
+siliconflow_api_key=your-siliconflow-key
+siliconflow_rerank_model=BAAI/bge-reranker-v2-m3
+```
+
+### Text2Gremlin Configuration
+
+Convert natural language to Gremlin queries:
+
+```python
+from hugegraph_llm.operators.graph_rag_task import Text2GremlinPipeline
+
+# Initialize pipeline
+text2gremlin = Text2GremlinPipeline()
+
+# Generate Gremlin query
+result = (
+    text2gremlin
+    .query_to_gremlin(query="Find all movies directed by Francis Ford Coppola")
+    .execute_gremlin_query()
+    .run()
+)
+```
+
+**REST API Endpoint**: See the [REST API documentation](./rest-api.md) for HTTP endpoint details. 
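+
+The same conversion is also exposed over HTTP once the RAG demo service is running. A minimal sketch with `requests` (the endpoint, payload fields, and `gremlin` response key are as documented in the [REST API reference](./rest-api.md); the default port `8001` is assumed):
+
+```python
+import requests
+
+payload = {
+    "query": "Find all movies directed by Francis Ford Coppola",
+    "output_types": ["GREMLIN", "RESULT"],  # ask for both the query and its execution result
+}
+resp = requests.post("http://localhost:8001/text2gremlin", json=payload, timeout=30)
+resp.raise_for_status()
+print(resp.json()["gremlin"])
+```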
## 📚 Additional Resources diff --git a/content/en/docs/quickstart/hugegraph-ai/hugegraph-ml.md b/content/en/docs/quickstart/hugegraph-ai/hugegraph-ml.md new file mode 100644 index 000000000..18ff15297 --- /dev/null +++ b/content/en/docs/quickstart/hugegraph-ai/hugegraph-ml.md @@ -0,0 +1,289 @@ +--- +title: "HugeGraph-ML" +linkTitle: "HugeGraph-ML" +weight: 2 +--- + +HugeGraph-ML integrates HugeGraph with popular graph learning libraries, enabling end-to-end machine learning workflows directly on graph data. + +## Overview + +`hugegraph-ml` provides a unified interface for applying graph neural networks and machine learning algorithms to data stored in HugeGraph. It eliminates the need for complex data export/import pipelines by seamlessly converting HugeGraph data to formats compatible with leading ML frameworks. + +### Key Features + +- **Direct HugeGraph Integration**: Query graph data directly from HugeGraph without manual exports +- **21 Implemented Algorithms**: Comprehensive coverage of node classification, graph classification, embedding, and link prediction +- **DGL Backend**: Leverages Deep Graph Library (DGL) for efficient training +- **End-to-End Workflows**: From data loading to model training and evaluation +- **Modular Tasks**: Reusable task abstractions for common ML scenarios + +## Prerequisites + +- **Python**: 3.9+ (standalone module) +- **HugeGraph Server**: 1.0+ (recommended: 1.5+) +- **UV Package Manager**: 0.7+ (for dependency management) + +## Installation + +### 1. Start HugeGraph Server + +```bash +# Option 1: Docker (recommended) +docker run -itd --name=hugegraph -p 8080:8080 hugegraph/hugegraph + +# Option 2: Binary packages +# See https://hugegraph.apache.org/docs/download/download/ +``` + +### 2. Clone and Setup + +```bash +git clone https://github.com/apache/incubator-hugegraph-ai.git +cd incubator-hugegraph-ai/hugegraph-ml +``` + +### 3. Install Dependencies + +```bash +# uv sync automatically creates .venv and installs all dependencies +uv sync + +# Activate virtual environment +source .venv/bin/activate +``` + +### 4. Navigate to Source Directory + +```bash +cd ./src +``` + +> [!NOTE] +> All examples assume you're in the activated virtual environment. + +## Implemented Algorithms + +HugeGraph-ML currently implements **21 graph machine learning algorithms** across multiple categories: + +### Node Classification (11 algorithms) + +Predict labels for graph nodes based on network structure and features. 
+ +| Algorithm | Paper | Description | +|-----------|-------|-------------| +| **GCN** | [Kipf & Welling, 2017](https://arxiv.org/abs/1609.02907) | Graph Convolutional Networks | +| **GAT** | [Veličković et al., 2018](https://arxiv.org/abs/1710.10903) | Graph Attention Networks | +| **GraphSAGE** | [Hamilton et al., 2017](https://arxiv.org/abs/1706.02216) | Inductive representation learning | +| **APPNP** | [Klicpera et al., 2019](https://arxiv.org/abs/1810.05997) | Personalized PageRank propagation | +| **AGNN** | [Thekumparampil et al., 2018](https://arxiv.org/abs/1803.03735) | Attention-based GNN | +| **ARMA** | [Bianchi et al., 2019](https://arxiv.org/abs/1901.01343) | Autoregressive moving average filters | +| **DAGNN** | [Liu et al., 2020](https://arxiv.org/abs/2007.09296) | Deep adaptive graph neural networks | +| **DeeperGCN** | [Li et al., 2020](https://arxiv.org/abs/2006.07739) | Very deep GCN architectures | +| **GRAND** | [Feng et al., 2020](https://arxiv.org/abs/2005.11079) | Graph random neural networks | +| **JKNet** | [Xu et al., 2018](https://arxiv.org/abs/1806.03536) | Jumping knowledge networks | +| **Cluster-GCN** | [Chiang et al., 2019](https://arxiv.org/abs/1905.07953) | Scalable GCN training via clustering | + +### Graph Classification (2 algorithms) + +Classify entire graphs based on their structure and node features. + +| Algorithm | Paper | Description | +|-----------|-------|-------------| +| **DiffPool** | [Ying et al., 2018](https://arxiv.org/abs/1806.08804) | Differentiable graph pooling | +| **GIN** | [Xu et al., 2019](https://arxiv.org/abs/1810.00826) | Graph isomorphism networks | + +### Graph Embedding (3 algorithms) + +Learn unsupervised node representations for downstream tasks. + +| Algorithm | Paper | Description | +|-----------|-------|-------------| +| **DGI** | [Veličković et al., 2019](https://arxiv.org/abs/1809.10341) | Deep graph infomax (contrastive learning) | +| **BGRL** | [Thakoor et al., 2021](https://arxiv.org/abs/2102.06514) | Bootstrapped graph representation learning | +| **GRACE** | [Zhu et al., 2020](https://arxiv.org/abs/2006.04131) | Graph contrastive learning | + +### Link Prediction (3 algorithms) + +Predict missing or future connections in graphs. + +| Algorithm | Paper | Description | +|-----------|-------|-------------| +| **SEAL** | [Zhang & Chen, 2018](https://arxiv.org/abs/1802.09691) | Subgraph extraction and labeling | +| **P-GNN** | [You et al., 2019](http://proceedings.mlr.press/v97/you19b/you19b.pdf) | Position-aware GNN | +| **GATNE** | [Cen et al., 2019](https://arxiv.org/abs/1905.01669) | Attributed multiplex heterogeneous network embedding | + +### Fraud Detection (2 algorithms) + +Detect anomalous nodes in graphs (e.g., fraudulent accounts). + +| Algorithm | Paper | Description | +|-----------|-------|-------------| +| **CARE-GNN** | [Dou et al., 2020](https://arxiv.org/abs/2008.08692) | Camouflage-resistant GNN | +| **BGNN** | [Zheng et al., 2021](https://arxiv.org/abs/2101.08543) | Bipartite graph neural network | + +### Post-Processing (1 algorithm) + +Improve predictions via label propagation. + +| Algorithm | Paper | Description | +|-----------|-------|-------------| +| **C&S** | [Huang et al., 2020](https://arxiv.org/abs/2010.13993) | Correct & Smooth (prediction refinement) | + +## Usage Examples + +### Example 1: Node Embedding with DGI + +Perform unsupervised node embedding on the Cora dataset using Deep Graph Infomax (DGI). 
+ +#### Step 1: Import Dataset (if needed) + +```python +from hugegraph_ml.utils.dgl2hugegraph_utils import import_graph_from_dgl + +# Import Cora dataset from DGL to HugeGraph +import_graph_from_dgl("cora") +``` + +#### Step 2: Convert Graph Data + +```python +from hugegraph_ml.data.hugegraph2dgl import HugeGraph2DGL + +# Convert HugeGraph data to DGL format +hg2d = HugeGraph2DGL() +graph = hg2d.convert_graph(vertex_label="CORA_vertex", edge_label="CORA_edge") +``` + +#### Step 3: Initialize Model + +```python +from hugegraph_ml.models.dgi import DGI + +# Create DGI model +model = DGI(n_in_feats=graph.ndata["feat"].shape[1]) +``` + +#### Step 4: Train and Generate Embeddings + +```python +from hugegraph_ml.tasks.node_embed import NodeEmbed + +# Train model and generate node embeddings +node_embed_task = NodeEmbed(graph=graph, model=model) +embedded_graph = node_embed_task.train_and_embed( + add_self_loop=True, + n_epochs=300, + patience=30 +) +``` + +#### Step 5: Downstream Task (Node Classification) + +```python +from hugegraph_ml.models.mlp import MLPClassifier +from hugegraph_ml.tasks.node_classify import NodeClassify + +# Use embeddings for node classification +model = MLPClassifier( + n_in_feat=embedded_graph.ndata["feat"].shape[1], + n_out_feat=embedded_graph.ndata["label"].unique().shape[0] +) +node_clf_task = NodeClassify(graph=embedded_graph, model=model) +node_clf_task.train(lr=1e-3, n_epochs=400, patience=40) +print(node_clf_task.evaluate()) +``` + +**Expected Output:** +```python +{'accuracy': 0.82, 'loss': 0.5714246034622192} +``` + +**Full Example**: See [dgi_example.py](https://github.com/apache/incubator-hugegraph-ai/blob/main/hugegraph-ml/src/hugegraph_ml/examples/dgi_example.py) + +### Example 2: Node Classification with GRAND + +Directly classify nodes using the GRAND model (no separate embedding step needed). + +```python +from hugegraph_ml.data.hugegraph2dgl import HugeGraph2DGL +from hugegraph_ml.models.grand import GRAND +from hugegraph_ml.tasks.node_classify import NodeClassify + +# Load graph +hg2d = HugeGraph2DGL() +graph = hg2d.convert_graph(vertex_label="CORA_vertex", edge_label="CORA_edge") + +# Initialize GRAND model +model = GRAND( + n_in_feats=graph.ndata["feat"].shape[1], + n_out_feats=graph.ndata["label"].unique().shape[0] +) + +# Train and evaluate +node_clf_task = NodeClassify(graph=graph, model=model) +node_clf_task.train(lr=1e-2, n_epochs=1500, patience=100) +print(node_clf_task.evaluate()) +``` + +**Full Example**: See [grand_example.py](https://github.com/apache/incubator-hugegraph-ai/blob/main/hugegraph-ml/src/hugegraph_ml/examples/grand_example.py) + +## Core Components + +### HugeGraph2DGL Converter + +Seamlessly converts HugeGraph data to DGL graph format: + +```python +from hugegraph_ml.data.hugegraph2dgl import HugeGraph2DGL + +hg2d = HugeGraph2DGL() +graph = hg2d.convert_graph( + vertex_label="person", # Vertex label to extract + edge_label="knows", # Edge label to extract + directed=False # Graph directionality +) +``` + +### Task Abstractions + +Reusable task objects for common ML workflows: + +| Task | Class | Purpose | +|------|-------|---------| +| Node Embedding | `NodeEmbed` | Generate unsupervised node embeddings | +| Node Classification | `NodeClassify` | Predict node labels | +| Graph Classification | `GraphClassify` | Predict graph-level labels | +| Link Prediction | `LinkPredict` | Predict missing edges | + +## Best Practices + +1. 
**Start with Small Datasets**: Test your pipeline on small graphs (e.g., Cora, Citeseer) before scaling +2. **Use Early Stopping**: Set `patience` parameter to avoid overfitting +3. **Tune Hyperparameters**: Adjust learning rate, hidden dimensions, and epochs based on dataset size +4. **Monitor GPU Memory**: Large graphs may require batch training (e.g., Cluster-GCN) +5. **Validate Schema**: Ensure vertex/edge labels match your HugeGraph schema + +## Troubleshooting + +| Issue | Solution | +|-------|----------| +| "Connection refused" to HugeGraph | Verify server is running on port 8080 | +| CUDA out of memory | Reduce batch size or use CPU-only mode | +| Model convergence issues | Try different learning rates (1e-2, 1e-3, 1e-4) | +| ImportError for DGL | Run `uv sync` to reinstall dependencies | + +## Contributing + +To add a new algorithm: + +1. Create model file in `src/hugegraph_ml/models/your_model.py` +2. Inherit from base model class and implement `forward()` method +3. Add example script in `src/hugegraph_ml/examples/` +4. Update this documentation with algorithm details + +## See Also + +- [HugeGraph-AI Overview](../_index.md) - Full AI ecosystem +- [HugeGraph-LLM](./hugegraph-llm.md) - RAG and knowledge graph construction +- [GitHub Repository](https://github.com/apache/incubator-hugegraph-ai/tree/main/hugegraph-ml) - Source code and examples diff --git a/content/en/docs/quickstart/hugegraph-ai/quick_start.md b/content/en/docs/quickstart/hugegraph-ai/quick_start.md index 58e367787..04852db52 100644 --- a/content/en/docs/quickstart/hugegraph-ai/quick_start.md +++ b/content/en/docs/quickstart/hugegraph-ai/quick_start.md @@ -207,3 +207,63 @@ graph TD; # 5. Graph Tools Input Gremlin queries to execute corresponding operations. + +# 6. Language Switching (v1.5.0+) + +HugeGraph-LLM supports bilingual prompts for improved accuracy across languages. + +### Switching Between English and Chinese + +The system language affects: +- **System prompts**: Internal prompts used by the LLM +- **Keyword extraction**: Language-specific extraction logic +- **Answer generation**: Response formatting and style + +#### Configuration Method 1: Environment Variable + +Edit your `.env` file: + +```bash +# English prompts (default) +LANGUAGE=EN + +# Chinese prompts +LANGUAGE=CN +``` + +Restart the service after changing the language setting. + +#### Configuration Method 2: Web UI (Dynamic) + +If available in your deployment, use the settings panel in the Web UI to switch languages without restarting: + +1. Navigate to the **Settings** or **Configuration** tab +2. Select **Language**: `EN` or `CN` +3. Click **Save** - changes apply immediately + +#### Language-Specific Behavior + +| Language | Keyword Extraction | Answer Style | Use Case | +|----------|-------------------|--------------|----------| +| `EN` | English NLP models | Professional, concise | International users, English documents | +| `CN` | Chinese NLP models | Natural Chinese phrasing | Chinese users, Chinese documents | + +> [!TIP] +> Match the `LANGUAGE` setting to your primary document language for best RAG accuracy. + +### REST API Language Override + +When using the REST API, you can specify custom prompts per request to override the default language setting: + +```bash +curl -X POST http://localhost:8001/rag \ + -H "Content-Type: application/json" \ + -d '{ + "query": "告诉我关于阿尔·帕西诺的信息", + "graph_only": true, + "keywords_extract_prompt": "请从以下文本中提取关键实体...", + "answer_prompt": "请根据以下上下文回答问题..." 
+  }'
+```
+
+See the [REST API Reference](./rest-api.md) for complete parameter details.
diff --git a/content/en/docs/quickstart/hugegraph-ai/rest-api.md b/content/en/docs/quickstart/hugegraph-ai/rest-api.md
new file mode 100644
index 000000000..484afdac8
--- /dev/null
+++ b/content/en/docs/quickstart/hugegraph-ai/rest-api.md
@@ -0,0 +1,428 @@
---
title: "REST API Reference"
linkTitle: "REST API"
weight: 5
---

HugeGraph-LLM provides REST API endpoints for integrating RAG and Text2Gremlin capabilities into your applications.

## Base URL

```
http://localhost:8001
```

Change host/port as configured when starting the service:
```bash
python -m hugegraph_llm.demo.rag_demo.app --host 127.0.0.1 --port 8001
```

## Authentication

Currently, the API supports optional token-based authentication:

```bash
# Enable authentication in .env
ENABLE_LOGIN=true
USER_TOKEN=your-user-token
ADMIN_TOKEN=your-admin-token
```

Pass tokens in request headers:
```bash
Authorization: Bearer <token>
```

---

## RAG Endpoints

### 1. Complete RAG Query

**POST** `/rag`

Execute a full RAG pipeline including keyword extraction, graph retrieval, vector search, reranking, and answer generation.

#### Request Body

```json
{
  "query": "Tell me about Al Pacino's movies",
  "raw_answer": false,
  "vector_only": false,
  "graph_only": true,
  "graph_vector_answer": false,
  "graph_ratio": 0.5,
  "rerank_method": "cohere",
  "near_neighbor_first": false,
  "gremlin_tmpl_num": 5,
  "max_graph_items": 30,
  "topk_return_results": 20,
  "vector_dis_threshold": 0.9,
  "topk_per_keyword": 1,
  "custom_priority_info": "",
  "answer_prompt": "",
  "keywords_extract_prompt": "",
  "gremlin_prompt": "",
  "client_config": {
    "url": "127.0.0.1:8080",
    "graph": "hugegraph",
    "user": "admin",
    "pwd": "admin",
    "gs": ""
  }
}
```

**Parameters:**

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `query` | string | Yes | - | User's natural language question |
| `raw_answer` | boolean | No | false | Return LLM answer without retrieval |
| `vector_only` | boolean | No | false | Use only vector search (no graph) |
| `graph_only` | boolean | No | false | Use only graph retrieval (no vector) |
| `graph_vector_answer` | boolean | No | false | Combine graph and vector results |
| `graph_ratio` | float | No | 0.5 | Ratio of graph vs vector results (0-1) |
| `rerank_method` | string | No | "" | Reranker: "cohere", "siliconflow", "" |
| `near_neighbor_first` | boolean | No | false | Prioritize direct neighbors |
| `gremlin_tmpl_num` | integer | No | 5 | Number of Gremlin templates to try |
| `max_graph_items` | integer | No | 30 | Max items from graph retrieval |
| `topk_return_results` | integer | No | 20 | Top-K after reranking |
| `vector_dis_threshold` | float | No | 0.9 | Vector similarity threshold (0-1) |
| `topk_per_keyword` | integer | No | 1 | Top-K vectors per keyword |
| `custom_priority_info` | string | No | "" | Custom context to prioritize |
| `answer_prompt` | string | No | "" | Custom answer generation prompt |
| `keywords_extract_prompt` | string | No | "" | Custom keyword extraction prompt |
| `gremlin_prompt` | string | No | "" | Custom Gremlin generation prompt |
| `client_config` | object | No | null | Override graph connection settings |

#### Response

```json
{
  "query": "Tell me about Al Pacino's movies",
  "graph_only": {
    "answer": "Al Pacino starred in The Godfather (1972), 
directed by Francis Ford Coppola...", + "context": ["The Godfather is a 1972 crime film...", "..."], + "graph_paths": ["..."], + "keywords": ["Al Pacino", "movies"] + } +} +``` + +#### Example (curl) + +```bash +curl -X POST http://localhost:8001/rag \ + -H "Content-Type: application/json" \ + -d '{ + "query": "Tell me about Al Pacino", + "graph_only": true, + "max_graph_items": 30 + }' +``` + +### 2. Graph Retrieval Only + +**POST** `/rag/graph` + +Retrieve graph context without generating an answer. Useful for debugging or custom processing. + +#### Request Body + +```json +{ + "query": "Al Pacino movies", + "max_graph_items": 30, + "topk_return_results": 20, + "vector_dis_threshold": 0.9, + "topk_per_keyword": 1, + "gremlin_tmpl_num": 5, + "rerank_method": "cohere", + "near_neighbor_first": false, + "custom_priority_info": "", + "gremlin_prompt": "", + "get_vertex_only": false, + "client_config": { + "url": "127.0.0.1:8080", + "graph": "hugegraph", + "user": "admin", + "pwd": "admin", + "gs": "" + } +} +``` + +**Additional Parameter:** + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `get_vertex_only` | boolean | false | Return only vertex IDs without full details | + +#### Response + +```json +{ + "graph_recall": { + "query": "Al Pacino movies", + "keywords": ["Al Pacino", "movies"], + "match_vids": ["1:Al Pacino", "2:The Godfather"], + "graph_result_flag": true, + "gremlin": "g.V('1:Al Pacino').outE().inV().limit(30)", + "graph_result": [ + {"id": "1:Al Pacino", "label": "person", "properties": {"name": "Al Pacino"}}, + {"id": "2:The Godfather", "label": "movie", "properties": {"title": "The Godfather"}} + ], + "vertex_degree_list": [5, 12] + } +} +``` + +#### Example (curl) + +```bash +curl -X POST http://localhost:8001/rag/graph \ + -H "Content-Type: application/json" \ + -d '{ + "query": "Al Pacino", + "max_graph_items": 30, + "get_vertex_only": false + }' +``` + +--- + +## Text2Gremlin Endpoint + +### 3. Natural Language to Gremlin + +**POST** `/text2gremlin` + +Convert natural language queries to executable Gremlin commands. 
+ +#### Request Body + +```json +{ + "query": "Find all movies directed by Francis Ford Coppola", + "example_num": 5, + "gremlin_prompt": "", + "output_types": ["GREMLIN", "RESULT"], + "client_config": { + "url": "127.0.0.1:8080", + "graph": "hugegraph", + "user": "admin", + "pwd": "admin", + "gs": "" + } +} +``` + +**Parameters:** + +| Field | Type | Required | Default | Description | +|-------|------|----------|---------|-------------| +| `query` | string | Yes | - | Natural language query | +| `example_num` | integer | No | 5 | Number of example templates to use | +| `gremlin_prompt` | string | No | "" | Custom prompt for Gremlin generation | +| `output_types` | array | No | null | Output types: ["GREMLIN", "RESULT", "CYPHER"] | +| `client_config` | object | No | null | Graph connection override | + +**Output Types:** +- `GREMLIN`: Generated Gremlin query +- `RESULT`: Execution result from graph +- `CYPHER`: Cypher query (if requested) + +#### Response + +```json +{ + "gremlin": "g.V().has('person','name','Francis Ford Coppola').out('directed').hasLabel('movie').values('title')", + "result": [ + "The Godfather", + "The Godfather Part II", + "Apocalypse Now" + ] +} +``` + +#### Example (curl) + +```bash +curl -X POST http://localhost:8001/text2gremlin \ + -H "Content-Type: application/json" \ + -d '{ + "query": "Find all movies directed by Francis Ford Coppola", + "output_types": ["GREMLIN", "RESULT"] + }' +``` + +--- + +## Configuration Endpoints + +### 4. Update Graph Connection + +**POST** `/config/graph` + +Dynamically update HugeGraph connection settings. + +#### Request Body + +```json +{ + "url": "127.0.0.1:8080", + "name": "hugegraph", + "user": "admin", + "pwd": "admin", + "gs": "" +} +``` + +#### Response + +```json +{ + "status_code": 201, + "message": "Graph configuration updated successfully" +} +``` + +### 5. Update LLM Configuration + +**POST** `/config/llm` + +Update chat/extract LLM settings at runtime. + +#### Request Body (OpenAI) + +```json +{ + "llm_type": "openai", + "api_key": "sk-your-api-key", + "api_base": "https://api.openai.com/v1", + "language_model": "gpt-4o-mini", + "max_tokens": 4096 +} +``` + +#### Request Body (Ollama) + +```json +{ + "llm_type": "ollama/local", + "host": "127.0.0.1", + "port": 11434, + "language_model": "llama3.1:8b" +} +``` + +### 6. Update Embedding Configuration + +**POST** `/config/embedding` + +Update embedding model settings. + +#### Request Body + +```json +{ + "llm_type": "openai", + "api_key": "sk-your-api-key", + "api_base": "https://api.openai.com/v1", + "language_model": "text-embedding-3-small" +} +``` + +### 7. Update Reranker Configuration + +**POST** `/config/rerank` + +Configure reranker settings. 
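All of the `/config/*` endpoints follow the same call pattern and apply their changes at runtime. As a quick sketch of that pattern (the JSON body mirrors the Cohere request body shown below; the default port `8001` is assumed):

```python
import requests

# Switch the reranker at runtime; fields as in the Cohere request body below
cfg = {
    "reranker_type": "cohere",
    "api_key": "your-cohere-key",
    "reranker_model": "rerank-multilingual-v3.0",
    "cohere_base_url": "https://api.cohere.com/v1/rerank",
}
resp = requests.post("http://localhost:8001/config/rerank", json=cfg, timeout=10)
print(resp.status_code)  # 201 on success, per the status-code table below
```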
+ +#### Request Body (Cohere) + +```json +{ + "reranker_type": "cohere", + "api_key": "your-cohere-key", + "reranker_model": "rerank-multilingual-v3.0", + "cohere_base_url": "https://api.cohere.com/v1/rerank" +} +``` + +#### Request Body (SiliconFlow) + +```json +{ + "reranker_type": "siliconflow", + "api_key": "your-siliconflow-key", + "reranker_model": "BAAI/bge-reranker-v2-m3" +} +``` + +--- + +## Error Responses + +All endpoints return standard HTTP status codes: + +| Code | Meaning | +|------|---------| +| 200 | Success | +| 201 | Created (config updated) | +| 400 | Bad Request (invalid parameters) | +| 500 | Internal Server Error | +| 501 | Not Implemented | + +Error response format: +```json +{ + "detail": "Error message describing what went wrong" +} +``` + +--- + +## Python Client Example + +```python +import requests + +BASE_URL = "http://localhost:8001" + +# 1. Configure graph connection +graph_config = { + "url": "127.0.0.1:8080", + "name": "hugegraph", + "user": "admin", + "pwd": "admin" +} +requests.post(f"{BASE_URL}/config/graph", json=graph_config) + +# 2. Execute RAG query +rag_request = { + "query": "Tell me about Al Pacino", + "graph_only": True, + "max_graph_items": 30 +} +response = requests.post(f"{BASE_URL}/rag", json=rag_request) +print(response.json()) + +# 3. Generate Gremlin from natural language +text2gql_request = { + "query": "Find all directors who worked with Al Pacino", + "output_types": ["GREMLIN", "RESULT"] +} +response = requests.post(f"{BASE_URL}/text2gremlin", json=text2gql_request) +print(response.json()) +``` + +--- + +## See Also + +- [Configuration Reference](./config-reference.md) - Complete .env configuration guide +- [HugeGraph-LLM Overview](./hugegraph-llm.md) - Architecture and features +- [Quick Start Guide](./quick_start.md) - Getting started with the Web UI From 06b7a951759421f0393eef401463486528ee2480 Mon Sep 17 00:00:00 2001 From: imbajin Date: Sun, 1 Feb 2026 20:17:56 +0800 Subject: [PATCH 06/10] Revise and expand Computer config docs (CN/EN) Rewrite and restructure the Computer configuration documentation in both Chinese and English. The changes replace the old flat option tables with organized sections (Basics, Algorithm, Input, Snapshot/Storage, Worker/Master, I/O/Output, Network/Transport, Storage, BSP, Performance, System-managed, K8s Operator, KubeDriver and CRD). Added clear default-value semantics, examples for local and MinIO snapshots, notes about system-managed options (do not modify), and more explanatory text for many options. Also updated related CRD/KubeDriver fields formatting and clarified operator environment variable mapping. Minor related updates applied to the quickstart computing hugegraph-computer page. 
--- content/cn/docs/config/config-computer.md | 603 +++++++++++++----- content/en/docs/config/config-computer.md | 579 ++++++++++++----- .../computing/hugegraph-computer.md | 315 ++++++++- 3 files changed, 1167 insertions(+), 330 deletions(-) diff --git a/content/cn/docs/config/config-computer.md b/content/cn/docs/config/config-computer.md index 0b270c9e8..08c0439e0 100644 --- a/content/cn/docs/config/config-computer.md +++ b/content/cn/docs/config/config-computer.md @@ -4,174 +4,445 @@ linkTitle: "图计算 Computer 配置" weight: 5 --- -### Computer Config Options - -| config option | default value | description | -|-----------------------------------------|-------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| algorithm.message_class | org.apache.hugegraph.computer.core.config.Null | The class of message passed when compute vertex. | -| algorithm.params_class | org.apache.hugegraph.computer.core.config.Null | The class used to transfer algorithms' parameters before algorithm been run. | -| algorithm.result_class | org.apache.hugegraph.computer.core.config.Null | The class of vertex's value, the instance is used to store computation result for the vertex. | -| allocator.max_vertices_per_thread | 10000 | Maximum number of vertices per thread processed in each memory allocator | -| bsp.etcd_endpoints | http://localhost:2379 | The end points to access etcd. | -| bsp.log_interval | 30000 | The log interval(in ms) to print the log while waiting bsp event. | -| bsp.max_super_step | 10 | The max super step of the algorithm. | -| bsp.register_timeout | 300000 | The max timeout to wait for master and works to register. | -| bsp.wait_master_timeout | 86400000 | The max timeout(in ms) to wait for master bsp event. | -| bsp.wait_workers_timeout | 86400000 | The max timeout to wait for workers bsp event. | -| hgkv.max_data_block_size | 65536 | The max byte size of hgkv-file data block. | -| hgkv.max_file_size | 2147483648 | The max number of bytes in each hgkv-file. | -| hgkv.max_merge_files | 10 | The max number of files to merge at one time. | -| hgkv.temp_file_dir | /tmp/hgkv | This folder is used to store temporary files, temporary files will be generated during the file merging process. | -| hugegraph.name | hugegraph | The graph name to load data and write results back. | -| hugegraph.url | http://127.0.0.1:8080 | The hugegraph url to load data and write results back. | -| input.edge_direction | OUT | The data of the edge in which direction is loaded, when the value is BOTH, the edges in both OUT and IN direction will be loaded. | -| input.edge_freq | MULTIPLE | The frequency of edges can exist between a pair of vertices, allowed values: [SINGLE, SINGLE_PER_LABEL, MULTIPLE]. 
SINGLE means that only one edge can exist between a pair of vertices, use sourceId + targetId to identify it; SINGLE_PER_LABEL means that each edge label can exist one edge between a pair of vertices, use sourceId + edgelabel + targetId to identify it; MULTIPLE means that many edge can exist between a pair of vertices, use sourceId + edgelabel + sortValues + targetId to identify it. | -| input.filter_class | org.apache.hugegraph.computer.core.input.filter.DefaultInputFilter | The class to create input-filter object, input-filter is used to Filter vertex edges according to user needs. | -| input.loader_schema_path || The schema path of loader input, only takes effect when the input.source_type=loader is enabled | -| input.loader_struct_path || The struct path of loader input, only takes effect when the input.source_type=loader is enabled | -| input.max_edges_in_one_vertex | 200 | The maximum number of adjacent edges allowed to be attached to a vertex, the adjacent edges will be stored and transferred together as a batch unit. | -| input.source_type | hugegraph-server | The source type to load input data, allowed values: ['hugegraph-server', 'hugegraph-loader'], the 'hugegraph-loader' means use hugegraph-loader load data from HDFS or file, if use 'hugegraph-loader' load data then please config 'input.loader_struct_path' and 'input.loader_schema_path'. | -| input.split_fetch_timeout | 300 | The timeout in seconds to fetch input splits | -| input.split_max_splits | 10000000 | The maximum number of input splits | -| input.split_page_size | 500 | The page size for streamed load input split data | -| input.split_size | 1048576 | The input split size in bytes | -| job.id | local_0001 | The job id on Yarn cluster or K8s cluster. | -| job.partitions_count | 1 | The partitions count for computing one graph algorithm job. | -| job.partitions_thread_nums | 4 | The number of threads for partition parallel compute. | -| job.workers_count | 1 | The workers count for computing one graph algorithm job. | -| master.computation_class | org.apache.hugegraph.computer.core.master.DefaultMasterComputation | Master-computation is computation that can determine whether to continue next superstep. It runs at the end of each superstep on master. | -| output.batch_size | 500 | The batch size of output | -| output.batch_threads | 1 | The threads number used to batch output | -| output.hdfs_core_site_path || The hdfs core site path. | -| output.hdfs_delimiter | , | The delimiter of hdfs output. | -| output.hdfs_kerberos_enable | false | Is Kerberos authentication enabled for Hdfs. | -| output.hdfs_kerberos_keytab || The Hdfs's key tab file for kerberos authentication. | -| output.hdfs_kerberos_principal || The Hdfs's principal for kerberos authentication. | -| output.hdfs_krb5_conf | /etc/krb5.conf | Kerberos configuration file. | -| output.hdfs_merge_partitions | true | Whether merge output files of multiple partitions. | -| output.hdfs_path_prefix | /hugegraph-computer/results | The directory of hdfs output result. | -| output.hdfs_replication | 3 | The replication number of hdfs. | -| output.hdfs_site_path || The hdfs site path. | -| output.hdfs_url | hdfs://127.0.0.1:9000 | The hdfs url of output. | -| output.hdfs_user | hadoop | The hdfs user of output. | -| output.output_class | org.apache.hugegraph.computer.core.output.LogOutput | The class to output the computation result of each vertex. Be called after iteration computation. 
| -| output.result_name | value | The value is assigned dynamically by #name() of instance created by WORKER_COMPUTATION_CLASS. | -| output.result_write_type | OLAP_COMMON | The result write-type to output to hugegraph, allowed values are: [OLAP_COMMON, OLAP_SECONDARY, OLAP_RANGE]. | -| output.retry_interval | 10 | The retry interval when output failed | -| output.retry_times | 3 | The retry times when output failed | -| output.single_threads | 1 | The threads number used to single output | -| output.thread_pool_shutdown_timeout | 60 | The timeout seconds of output threads pool shutdown | -| output.with_adjacent_edges | false | Output the adjacent edges of the vertex or not | -| output.with_edge_properties | false | Output the properties of the edge or not | -| output.with_vertex_properties | false | Output the properties of the vertex or not | -| sort.thread_nums | 4 | The number of threads performing internal sorting. | -| transport.client_connect_timeout | 3000 | The timeout(in ms) of client connect to server. | -| transport.client_threads | 4 | The number of transport threads for client. | -| transport.close_timeout | 10000 | The timeout(in ms) of close server or close client. | -| transport.finish_session_timeout | 0 | The timeout(in ms) to finish session, 0 means using (transport.sync_request_timeout * transport.max_pending_requests). | -| transport.heartbeat_interval | 20000 | The minimum interval(in ms) between heartbeats on client side. | -| transport.io_mode | AUTO | The network IO Mode, either 'NIO', 'EPOLL', 'AUTO', the 'AUTO' means selecting the property mode automatically. | -| transport.max_pending_requests | 8 | The max number of client unreceived ack, it will trigger the sending unavailable if the number of unreceived ack >= max_pending_requests. | -| transport.max_syn_backlog | 511 | The capacity of SYN queue on server side, 0 means using system default value. | -| transport.max_timeout_heartbeat_count | 120 | The maximum times of timeout heartbeat on client side, if the number of timeouts waiting for heartbeat response continuously > max_heartbeat_timeouts the channel will be closed from client side. | -| transport.min_ack_interval | 200 | The minimum interval(in ms) of server reply ack. | -| transport.min_pending_requests | 6 | The minimum number of client unreceived ack, it will trigger the sending available if the number of unreceived ack < min_pending_requests. | -| transport.network_retries | 3 | The number of retry attempts for network communication,if network unstable. | -| transport.provider_class | org.apache.hugegraph.computer.core.network.netty.NettyTransportProvider | The transport provider, currently only supports Netty. | -| transport.receive_buffer_size | 0 | The size of socket receive-buffer in bytes, 0 means using system default value. | -| transport.recv_file_mode | true | Whether enable receive buffer-file mode, it will receive buffer write file from socket by zero-copy if enable. | -| transport.send_buffer_size | 0 | The size of socket send-buffer in bytes, 0 means using system default value. | -| transport.server_host | 127.0.0.1 | The server hostname or ip to listen on to transfer data. | -| transport.server_idle_timeout | 360000 | The max timeout(in ms) of server idle. | -| transport.server_port | 0 | The server port to listen on to transfer data. The system will assign a random port if it's set to 0. | -| transport.server_threads | 4 | The number of transport threads for server. 
| -| transport.sync_request_timeout | 10000 | The timeout(in ms) to wait response after sending sync-request. | -| transport.tcp_keep_alive | true | Whether enable TCP keep-alive. | -| transport.transport_epoll_lt | false | Whether enable EPOLL level-trigger. | -| transport.write_buffer_high_mark | 67108864 | The high water mark for write buffer in bytes, it will trigger the sending unavailable if the number of queued bytes > write_buffer_high_mark. | -| transport.write_buffer_low_mark | 33554432 | The low water mark for write buffer in bytes, it will trigger the sending available if the number of queued bytes < write_buffer_low_mark.org.apache.hugegraph.config.OptionChecker$$Lambda$97/0x00000008001c8440@776a6d9b | -| transport.write_socket_timeout | 3000 | The timeout(in ms) to write data to socket buffer. | -| valuefile.max_segment_size | 1073741824 | The max number of bytes in each segment of value-file. | -| worker.combiner_class | org.apache.hugegraph.computer.core.config.Null | Combiner can combine messages into one value for a vertex, for example page-rank algorithm can combine messages of a vertex to a sum value. | -| worker.computation_class | org.apache.hugegraph.computer.core.config.Null | The class to create worker-computation object, worker-computation is used to compute each vertex in each superstep. | -| worker.data_dirs | [jobs] | The directories separated by ',' that received vertices and messages can persist into. | -| worker.edge_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | The combiner can combine several properties of the same edge into one properties at inputstep. | -| worker.partitioner | org.apache.hugegraph.computer.core.graph.partition.HashPartitioner | The partitioner that decides which partition a vertex should be in, and which worker a partition should be in. | -| worker.received_buffers_bytes_limit | 104857600 | The limit bytes of buffers of received data, the total size of all buffers can't excess this limit. If received buffers reach this limit, they will be merged into a file. | -| worker.vertex_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | The combiner can combine several properties of the same vertex into one properties at inputstep. | -| worker.wait_finish_messages_timeout | 86400000 | The max timeout(in ms) message-handler wait for finish-message of all workers. | -| worker.wait_sort_timeout | 600000 | The max timeout(in ms) message-handler wait for sort-thread to sort one batch of buffers. | -| worker.write_buffer_capacity | 52428800 | The initial size of write buffer that used to store vertex or message. | -| worker.write_buffer_threshold | 52428800 | The threshold of write buffer, exceeding it will trigger sorting, the write buffer is used to store vertex or message. | - -### K8s Operator Config Options - -> NOTE: Option needs to be converted through environment variable settings, e.g. k8s.internal_etcd_url => INTERNAL_ETCD_URL - -| config option | default value | description | -|------------------------------|---------------------------|---------------------------------------------------------------------------------------------------------------------------------| -| k8s.auto_destroy_pod | true | Whether to automatically destroy all pods when the job is completed or failed. | -| k8s.close_reconciler_timeout | 120 | The max timeout(in ms) to close reconciler. 
| -| k8s.internal_etcd_url | http://127.0.0.1:2379 | The internal etcd url for operator system. | -| k8s.max_reconcile_retry | 3 | The max retry times of reconcile. | -| k8s.probe_backlog | 50 | The maximum backlog for serving health probes. | -| k8s.probe_port | 9892 | The value is the port that the controller bind to for serving health probes. | -| k8s.ready_check_internal | 1000 | The time interval(ms) of check ready. | -| k8s.ready_timeout | 30000 | The max timeout(in ms) of check ready. | -| k8s.reconciler_count | 10 | The max number of reconciler thread. | -| k8s.resync_period | 600000 | The minimum frequency at which watched resources are reconciled. | -| k8s.timezone | Asia/Shanghai | The timezone of computer job and operator. | -| k8s.watch_namespace | hugegraph-computer-system | The value is watch custom resources in the namespace, ignore other namespaces, the '*' means is all namespaces will be watched. | +### Computer 配置选项 + +> **默认值说明:** +> - 以下配置项显示的是**代码默认值**(定义在 `ComputerOptions.java` 中) +> - 当**打包配置文件**(`conf/computer.properties` 分发包中)指定了不同的值时,会以 `值 (打包: 值)` 的形式标注 +> - 示例:`300000 (打包: 100000)` 表示代码默认值为 300000,但分发包默认值为 100000 +> - 对于生产环境部署,除非明确覆盖,否则打包默认值优先生效 + +--- + +### 1. 基础配置 + +HugeGraph-Computer 核心作业设置。 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| hugegraph.url | http://127.0.0.1:8080 | HugeGraph 服务器 URL,用于加载数据和写回结果。 | +| hugegraph.name | hugegraph | 图名称,用于加载数据和写回结果。 | +| hugegraph.username | "" (空) | HugeGraph 认证用户名(如果未启用认证则留空)。 | +| hugegraph.password | "" (空) | HugeGraph 认证密码(如果未启用认证则留空)。 | +| job.id | local_0001 (打包: local_001) | YARN 集群或 K8s 集群上的作业标识符。 | +| job.namespace | "" (空) | 作业命名空间,可以分隔不同的数据源。🔒 **由系统管理 - 不要手动修改**。 | +| job.workers_count | 1 | 执行一个图算法作业的 Worker 数量。🔒 **在 K8s 中由系统管理 - 不要手动修改**。 | +| job.partitions_count | 1 | 执行一个图算法作业的分区数量。 | +| job.partitions_thread_nums | 4 | 分区并行计算的线程数量。 | + +--- + +### 2. 算法配置 + +计算逻辑的算法特定配置。 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| algorithm.params_class | org.apache.hugegraph.computer.core.config.Null | ⚠️ **必填** 在算法运行前用于传递算法参数的类。 | +| algorithm.result_class | org.apache.hugegraph.computer.core.config.Null | 顶点值的类,用于存储顶点的计算结果。 | +| algorithm.message_class | org.apache.hugegraph.computer.core.config.Null | 计算顶点时传递的消息类。 | + +--- + +### 3. 
输入配置
+
+从 HugeGraph 或其他数据源加载输入数据的配置。
+
+#### 3.1 输入源
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| input.source_type | hugegraph-server | 加载输入数据的源类型,允许值:['hugegraph-server', 'hugegraph-loader']。'hugegraph-loader' 表示使用 hugegraph-loader 从 HDFS 或文件加载数据。如果使用 'hugegraph-loader',请配置 'input.loader_struct_path' 和 'input.loader_schema_path'。 |
+| input.loader_struct_path | "" (空) | Loader 输入的结构路径,仅在 input.source_type=hugegraph-loader 时生效。 |
+| input.loader_schema_path | "" (空) | Loader 输入的 schema 路径,仅在 input.source_type=hugegraph-loader 时生效。 |
+
+#### 3.2 输入分片
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| input.split_size | 1048576 (1 MB) | 输入分片大小(字节)。 |
+| input.split_max_splits | 10000000 | 最大输入分片数量。 |
+| input.split_page_size | 500 | 流式加载输入分片数据的页面大小。 |
+| input.split_fetch_timeout | 300 | 获取输入分片的超时时间(秒)。 |
+
+#### 3.3 输入处理
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| input.filter_class | org.apache.hugegraph.computer.core.input.filter.DefaultInputFilter | 创建输入过滤器对象的类。输入过滤器用于根据用户需求过滤顶点的边。 |
+| input.edge_direction | OUT | 要加载的边的方向,允许值:[OUT, IN, BOTH]。当值为 BOTH 时,将加载 OUT 和 IN 两个方向的边。 |
+| input.edge_freq | MULTIPLE | 一对顶点之间可以存在的边的频率,允许值:[SINGLE, SINGLE_PER_LABEL, MULTIPLE]。SINGLE 表示一对顶点之间只能存在一条边(通过 sourceId + targetId 标识);SINGLE_PER_LABEL 表示每个边标签在一对顶点之间可以有一条边(通过 sourceId + edgeLabel + targetId 标识);MULTIPLE 表示一对顶点之间可以存在多条边(通过 sourceId + edgeLabel + sortValues + targetId 标识)。 |
+| input.max_edges_in_one_vertex | 200 | 允许附加到一个顶点的最大邻接边数量。邻接边将作为一个批处理单元一起存储和传输。 |
+
+#### 3.4 输入性能
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| input.send_thread_nums | 4 | 并行发送顶点或边的线程数量。 |
+
+---
+
+### 4. 快照与存储配置
+
+HugeGraph-Computer 支持快照功能,可将顶点/边分区保存到本地存储或 MinIO 对象存储,用于断点恢复或加速重复计算。
+
+#### 4.1 基础快照配置
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| snapshot.write | false | 是否写入输入顶点/边分区的快照。 |
+| snapshot.load | false | 是否从顶点/边分区的快照加载。 |
+| snapshot.name | "" (空) | 用户自定义的快照名称,用于区分不同的快照。 |
+
+#### 4.2 MinIO 集成(可选)
+
+MinIO 可用作 K8s 部署中快照的分布式对象存储后端。
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| snapshot.minio_endpoint | "" (空) | MinIO 服务端点(例如 `http://minio:9000`)。使用 MinIO 时必填。 |
+| snapshot.minio_access_key | minioadmin | MinIO 认证访问密钥。 |
+| snapshot.minio_secret_key | minioadmin | MinIO 认证密钥。 |
+| snapshot.minio_bucket_name | "" (空) | 用于存储快照数据的 MinIO 存储桶名称。 |
+
+**使用场景:**
+- **断点恢复**:作业失败后从快照恢复,避免重新加载数据
+- **重复计算**:多次运行同一算法时从快照加载数据以加速启动
+- **A/B 测试**:保存同一数据集的多个快照版本,测试不同的算法参数
+
+**示例:本地快照**(在 `computer.properties` 中):
+```properties
+snapshot.write=true
+snapshot.name=pagerank-snapshot-20260201
+```
+
+**示例:MinIO 快照**(在 K8s CRD `computerConf` 中):
+```yaml
+computerConf:
+  snapshot.write: "true"
+  snapshot.name: "pagerank-snapshot-v1"
+  snapshot.minio_endpoint: "http://minio:9000"
+  snapshot.minio_access_key: "my-access-key"
+  snapshot.minio_secret_key: "my-secret-key"
+  snapshot.minio_bucket_name: "hugegraph-snapshots"
+```
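+
+在上面两个快照示例的基础上,下面再给出一个把第 1~4 节选项串起来的最小 `computer.properties` 草稿,仅作示意:其中服务器地址、图名与算法参数类均为假设值(以 PageRank 类算法为例),请按实际部署替换:
+
+```properties
+# 基础配置(第 1 节):服务器地址与图名,按实际环境修改
+hugegraph.url=http://127.0.0.1:8080
+hugegraph.name=hugegraph
+job.partitions_count=4
+
+# 算法配置(第 2 节):params_class 为必填,此处类名仅作示意
+algorithm.params_class=org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
+
+# 输入配置(第 3 节):从 hugegraph-server 加载 OUT 方向的边
+input.source_type=hugegraph-server
+input.edge_direction=OUT
+
+# 快照配置(第 4.1 节):写入命名快照以便复用
+snapshot.write=true
+snapshot.name=pagerank-snapshot-v1
+```
+
+---
+
+### 5. 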
Worker 与 Master 配置 + +Worker 和 Master 计算逻辑的配置。 + +#### 5.1 Master 配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| master.computation_class | org.apache.hugegraph.computer.core.master.DefaultMasterComputation | Master 计算是可以决定是否继续下一个超步的计算。它在每个超步结束时在 master 上运行。 | + +#### 5.2 Worker 计算 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| worker.computation_class | org.apache.hugegraph.computer.core.config.Null | 创建 worker 计算对象的类。Worker 计算用于在每个超步中计算每个顶点。 | +| worker.combiner_class | org.apache.hugegraph.computer.core.config.Null | Combiner 可以将消息组合为一个顶点的一个值。例如,PageRank 算法可以将一个顶点的消息组合为一个求和值。 | +| worker.partitioner | org.apache.hugegraph.computer.core.graph.partition.HashPartitioner | 分区器,决定顶点应该在哪个分区中,以及分区应该在哪个 worker 中。 | + +#### 5.3 Worker 组合器 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| worker.vertex_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | 组合器可以在输入步骤将同一顶点的多个属性组合为一个属性。 | +| worker.edge_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | 组合器可以在输入步骤将同一边的多个属性组合为一个属性。 | + +#### 5.4 Worker 缓冲区 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| worker.received_buffers_bytes_limit | 104857600 (100 MB) | 接收数据缓冲区的限制字节数。所有缓冲区的总大小不能超过此限制。如果接收缓冲区达到此限制,它们将被合并到文件中(溢出到磁盘)。 | +| worker.write_buffer_capacity | 52428800 (50 MB) | 用于存储顶点或消息的写缓冲区的初始大小。 | +| worker.write_buffer_threshold | 52428800 (50 MB) | 写缓冲区的阈值。超过它将触发排序。写缓冲区用于存储顶点或消息。 | + +#### 5.5 Worker 数据与超时 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| worker.data_dirs | [jobs] | 用逗号分隔的目录,接收的顶点和消息可以持久化到其中。 | +| worker.wait_sort_timeout | 600000 (10 分钟) | 消息处理程序等待排序线程对一批缓冲区进行排序的最大超时时间(毫秒)。 | +| worker.wait_finish_messages_timeout | 86400000 (24 小时) | 消息处理程序等待所有 worker 完成消息的最大超时时间(毫秒)。 | + +--- + +### 6. 
I/O 与输出配置
+
+输出计算结果的配置。
+
+#### 6.1 输出类与结果
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| output.output_class | org.apache.hugegraph.computer.core.output.LogOutput | 输出每个顶点计算结果的类。在迭代计算后调用。 |
+| output.result_name | value | 该值由 WORKER_COMPUTATION_CLASS 创建的实例的 #name() 动态分配。 |
+| output.result_write_type | OLAP_COMMON | 输出到 HugeGraph 的结果写入类型,允许值:[OLAP_COMMON, OLAP_SECONDARY, OLAP_RANGE]。 |
+
+#### 6.2 输出行为
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| output.with_adjacent_edges | false | 是否输出顶点的邻接边。 |
+| output.with_vertex_properties | false | 是否输出顶点的属性。 |
+| output.with_edge_properties | false | 是否输出边的属性。 |
+
+#### 6.3 批量输出
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| output.batch_size | 500 | 输出的批处理大小。 |
+| output.batch_threads | 1 | 用于批量输出的线程数量。 |
+| output.single_threads | 1 | 用于单个输出的线程数量。 |
+
+#### 6.4 HDFS 输出
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| output.hdfs_url | hdfs://127.0.0.1:9000 | 输出的 HDFS URL。 |
+| output.hdfs_user | hadoop | 输出的 HDFS 用户。 |
+| output.hdfs_path_prefix | /hugegraph-computer/results | HDFS 输出结果的目录。 |
+| output.hdfs_delimiter | , (逗号) | HDFS 输出的分隔符。 |
+| output.hdfs_merge_partitions | true | 是否合并多个分区的输出文件。 |
+| output.hdfs_replication | 3 | HDFS 的副本数。 |
+| output.hdfs_core_site_path | "" (空) | HDFS core site 路径。 |
+| output.hdfs_site_path | "" (空) | HDFS site 路径。 |
+| output.hdfs_kerberos_enable | false | 是否为 HDFS 启用 Kerberos 认证。 |
+| output.hdfs_kerberos_principal | "" (空) | HDFS 的 Kerberos 认证 principal。 |
+| output.hdfs_kerberos_keytab | "" (空) | HDFS 的 Kerberos 认证 keytab 文件。 |
+| output.hdfs_krb5_conf | /etc/krb5.conf | Kerberos 配置文件路径。 |
+
+#### 6.5 重试与超时
+
+| 配置项 | 默认值 | 说明 |
+|--------|--------|------|
+| output.retry_times | 3 | 输出失败时的重试次数。 |
+| output.retry_interval | 10 | 输出失败时的重试间隔(秒)。 |
+| output.thread_pool_shutdown_timeout | 60 | 输出线程池关闭的超时时间(秒)。 |
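+
+结合 6.4 节的选项,下面是一个把结果写入 HDFS(而非默认的 LogOutput)的 `computer.properties` 片段草稿,仅作示意:其中 `HdfsOutput` 类名为假设值,请以实际发行版自带的输出类为准:
+
+```properties
+# 输出配置(第 6.4 节):将结果写入 HDFS,地址与用户按实际环境修改
+output.output_class=org.apache.hugegraph.computer.core.output.hdfs.HdfsOutput
+output.hdfs_url=hdfs://127.0.0.1:9000
+output.hdfs_user=hadoop
+output.hdfs_path_prefix=/hugegraph-computer/results
+output.hdfs_delimiter=,
+# 如启用 Kerberos(默认关闭),还需配置 principal 与 keytab
+output.hdfs_kerberos_enable=false
+```
+
+---
+
+### 7. 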
网络与传输配置 + +Worker 和 Master 之间网络通信的配置。 + +#### 7.1 服务器配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.server_host | 127.0.0.1 | 🔒 **由系统管理** 监听传输数据的服务器主机名或 IP。不要手动修改。 | +| transport.server_port | 0 | 🔒 **由系统管理** 监听传输数据的服务器端口。如果设置为 0,系统将分配一个随机端口。不要手动修改。 | +| transport.server_threads | 4 | 服务器传输线程的数量。 | + +#### 7.2 客户端配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.client_threads | 4 | 客户端传输线程的数量。 | +| transport.client_connect_timeout | 3000 | 客户端连接到服务器的超时时间(毫秒)。 | + +#### 7.3 协议配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.provider_class | org.apache.hugegraph.computer.core.network.netty.NettyTransportProvider | 传输提供程序,目前仅支持 Netty。 | +| transport.io_mode | AUTO | 网络 IO 模式,允许值:[NIO, EPOLL, AUTO]。AUTO 表示自动选择适当的模式。 | +| transport.tcp_keep_alive | true | 是否启用 TCP keep-alive。 | +| transport.transport_epoll_lt | false | 是否启用 EPOLL 水平触发(仅在 io_mode=EPOLL 时有效)。 | + +#### 7.4 缓冲区配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.send_buffer_size | 0 | Socket 发送缓冲区大小(字节)。0 表示使用系统默认值。 | +| transport.receive_buffer_size | 0 | Socket 接收缓冲区大小(字节)。0 表示使用系统默认值。 | +| transport.write_buffer_high_mark | 67108864 (64 MB) | 写缓冲区的高水位标记(字节)。如果排队字节数 > write_buffer_high_mark,将触发发送不可用。 | +| transport.write_buffer_low_mark | 33554432 (32 MB) | 写缓冲区的低水位标记(字节)。如果排队字节数 < write_buffer_low_mark,将触发发送可用。 | + +#### 7.5 流量控制 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.max_pending_requests | 8 | 客户端未接收 ACK 的最大数量。如果未接收 ACK 的数量 >= max_pending_requests,将触发发送不可用。 | +| transport.min_pending_requests | 6 | 客户端未接收 ACK 的最小数量。如果未接收 ACK 的数量 < min_pending_requests,将触发发送可用。 | +| transport.min_ack_interval | 200 | 服务器回复 ACK 的最小间隔(毫秒)。 | + +#### 7.6 超时配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.close_timeout | 10000 | 关闭服务器或关闭客户端的超时时间(毫秒)。 | +| transport.sync_request_timeout | 10000 | 发送同步请求后等待响应的超时时间(毫秒)。 | +| transport.finish_session_timeout | 0 | 完成会话的超时时间(毫秒)。0 表示使用 (transport.sync_request_timeout × transport.max_pending_requests)。 | +| transport.write_socket_timeout | 3000 | 将数据写入 socket 缓冲区的超时时间(毫秒)。 | +| transport.server_idle_timeout | 360000 (6 分钟) | 服务器空闲的最大超时时间(毫秒)。 | + +#### 7.7 心跳配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.heartbeat_interval | 20000 (20 秒) | 客户端心跳之间的最小间隔(毫秒)。 | +| transport.max_timeout_heartbeat_count | 120 | 客户端超时心跳的最大次数。如果连续等待心跳响应超时的次数 > max_timeout_heartbeat_count,通道将从客户端关闭。 | + +#### 7.8 高级网络设置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| transport.max_syn_backlog | 511 | 服务器端 SYN 队列的容量。0 表示使用系统默认值。 | +| transport.recv_file_mode | true | 是否启用接收缓冲文件模式。如果启用,将使用零拷贝从 socket 接收缓冲区并写入文件。**注意**:需要操作系统支持零拷贝(例如 Linux sendfile/splice)。 | +| transport.network_retries | 3 | 网络通信不稳定时的重试次数。 | + +--- + +### 8. 存储与持久化配置 + +HGKV(HugeGraph Key-Value)存储引擎和值文件的配置。 + +#### 8.1 HGKV 配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| hgkv.max_file_size | 2147483648 (2 GB) | 每个 HGKV 文件的最大字节数。 | +| hgkv.max_data_block_size | 65536 (64 KB) | HGKV 文件数据块的最大字节大小。 | +| hgkv.max_merge_files | 10 | 一次合并的最大文件数。 | +| hgkv.temp_file_dir | /tmp/hgkv | 此文件夹用于在文件合并过程中存储临时文件。 | + +#### 8.2 值文件配置 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| valuefile.max_segment_size | 1073741824 (1 GB) | 值文件每个段的最大字节数。 | + +--- + +### 9. 
BSP 与协调配置 + +批量同步并行(BSP)协议和 etcd 协调的配置。 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| bsp.etcd_endpoints | http://localhost:2379 | 🔒 **在 K8s 中由系统管理** 访问 etcd 的端点。对于多个端点,使用逗号分隔列表:`http://host1:port1,http://host2:port2`。不要在 K8s 部署中手动修改。 | +| bsp.max_super_step | 10 (打包: 2) | 算法的最大超步数。 | +| bsp.register_timeout | 300000 (打包: 100000) | 等待 master 和 worker 注册的最大超时时间(毫秒)。 | +| bsp.wait_workers_timeout | 86400000 (24 小时) | 等待 worker BSP 事件的最大超时时间(毫秒)。 | +| bsp.wait_master_timeout | 86400000 (24 小时) | 等待 master BSP 事件的最大超时时间(毫秒)。 | +| bsp.log_interval | 30000 (30 秒) | 等待 BSP 事件时打印日志的日志间隔(毫秒)。 | + +--- + +### 10. 性能调优配置 + +性能优化的配置。 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| allocator.max_vertices_per_thread | 10000 | 每个内存分配器中每个线程处理的最大顶点数。 | +| sort.thread_nums | 4 | 执行内部排序的线程数量。 | + +--- + +### 11. 系统管理配置 + +⚠️ **由系统管理的配置项 - 禁止用户手动修改。** + +以下配置项由 K8s Operator、Driver 或运行时系统自动管理。手动修改将导致集群通信失败或作业调度错误。 + +| 配置项 | 管理者 | 说明 | +|--------|--------|------| +| bsp.etcd_endpoints | K8s Operator | 自动设置为 operator 的 etcd 服务地址 | +| transport.server_host | 运行时 | 自动设置为 pod/容器主机名 | +| transport.server_port | 运行时 | 自动分配随机端口 | +| job.namespace | K8s Operator | 自动设置为作业命名空间 | +| job.id | K8s Operator | 自动从 CRD 设置为作业 ID | +| job.workers_count | K8s Operator | 自动从 CRD `workerInstances` 设置 | +| rpc.server_host | 运行时 | RPC 服务器主机名(系统管理) | +| rpc.server_port | 运行时 | RPC 服务器端口(系统管理) | +| rpc.remote_url | 运行时 | RPC 远程 URL(系统管理) | + +**为什么禁止修改:** +- **BSP/RPC 配置**:必须与实际部署的 etcd/RPC 服务匹配。手动覆盖会破坏协调。 +- **作业配置**:必须与 K8s CRD 规范匹配。不匹配会导致 worker 数量错误。 +- **传输配置**:必须使用实际的 pod 主机名/端口。手动值会阻止 worker 间通信。 + +--- + +### K8s Operator 配置选项 + +> 注意:选项需要通过环境变量设置进行转换,例如 k8s.internal_etcd_url => INTERNAL_ETCD_URL + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| k8s.auto_destroy_pod | true | 作业完成或失败时是否自动销毁所有 pod。 | +| k8s.close_reconciler_timeout | 120 | 关闭 reconciler 的最大超时时间(毫秒)。 | +| k8s.internal_etcd_url | http://127.0.0.1:2379 | operator 系统的内部 etcd URL。 | +| k8s.max_reconcile_retry | 3 | reconcile 的最大重试次数。 | +| k8s.probe_backlog | 50 | 服务健康探针的最大积压。 | +| k8s.probe_port | 9892 | controller 绑定的用于服务健康探针的端口。 | +| k8s.ready_check_internal | 1000 | 检查就绪的时间间隔(毫秒)。 | +| k8s.ready_timeout | 30000 | 检查就绪的最大超时时间(毫秒)。 | +| k8s.reconciler_count | 10 | reconciler 线程的最大数量。 | +| k8s.resync_period | 600000 | 被监视资源进行 reconcile 的最小频率。 | +| k8s.timezone | Asia/Shanghai | computer 作业和 operator 的时区。 | +| k8s.watch_namespace | hugegraph-computer-system | 监视自定义资源的命名空间。使用 '*' 监视所有命名空间。 | + +--- ### HugeGraph-Computer CRD > CRD: https://github.com/apache/hugegraph-computer/blob/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1.yaml -| spec | default value | description | required | -|-----------------|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------| -| algorithmName | | The name of algorithm. | true | -| jobId | | The job id. | true | -| image | | The image of algorithm. | true | -| computerConf | | The map of computer config options. | true | -| workerInstances | | The number of worker instances, it will instead the 'job.workers_count' option. 
| true | -| pullPolicy | Always | The pull-policy of image, detail please refer to: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy | false | -| pullSecrets | | The pull-secrets of Image, detail please refer to: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod | false | -| masterCpu | | The cpu limit of master, the unit can be 'm' or without unit detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false | -| workerCpu | | The cpu limit of worker, the unit can be 'm' or without unit detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false | -| masterMemory | | The memory limit of master, the unit can be one of Ei、Pi、Ti、Gi、Mi、Ki detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false | -| workerMemory | | The memory limit of worker, the unit can be one of Ei、Pi、Ti、Gi、Mi、Ki detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false | -| log4jXml | | The content of log4j.xml for computer job. | false | -| jarFile | | The jar path of computer algorithm. | false | -| remoteJarUri | | The remote jar uri of computer algorithm, it will overlay algorithm image. | false | -| jvmOptions | | The java startup parameters of computer job. | false | -| envVars | | please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-interdependent-environment-variables/ | false | -| envFrom | | please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/ | false | -| masterCommand | bin/start-computer.sh | The run command of master, equivalent to 'Entrypoint' field of Docker. | false | -| masterArgs | ["-r master", "-d k8s"] | The run args of master, equivalent to 'Cmd' field of Docker. | false | -| workerCommand | bin/start-computer.sh | The run command of worker, equivalent to 'Entrypoint' field of Docker. | false | -| workerArgs | ["-r worker", "-d k8s"] | The run args of worker, equivalent to 'Cmd' field of Docker. | false | -| volumes | | Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/ | false | -| volumeMounts | | Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/ | false | -| secretPaths | | The map of k8s-secret name and mount path. | false | -| configMapPaths | | The map of k8s-configmap name and mount path. 
| false | -| podTemplateSpec | | Please refer to: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-template-v1/#PodTemplateSpec | false | -| securityContext | | Please refer to: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ | false | - -### KubeDriver Config Options - -| config option | default value | description | -|----------------------------------|------------------------------------------|-----------------------------------------------------------| -| k8s.build_image_bash_path || The path of command used to build image. | -| k8s.enable_internal_algorithm | true | Whether enable internal algorithm. | -| k8s.framework_image_url | hugegraph/hugegraph-computer:latest | The image url of computer framework. | -| k8s.image_repository_password || The password for login image repository. | -| k8s.image_repository_registry || The address for login image repository. | -| k8s.image_repository_url | hugegraph/hugegraph-computer | The url of image repository. | -| k8s.image_repository_username || The username for login image repository. | -| k8s.internal_algorithm | [pageRank] | The name list of all internal algorithm. | -| k8s.internal_algorithm_image_url | hugegraph/hugegraph-computer:latest | The image url of internal algorithm. | -| k8s.jar_file_dir | /cache/jars/ | The directory where the algorithm jar to upload location. | -| k8s.kube_config | ~/.kube/config | The path of k8s config file. | -| k8s.log4j_xml_path || The log4j.xml path for computer job. | -| k8s.namespace | hugegraph-computer-system | The namespace of hugegraph-computer system. | -| k8s.pull_secret_names | [] | The names of pull-secret for pulling image. | +| 字段 | 默认值 | 说明 | 必填 | +|------|--------|------|------| +| algorithmName | | 算法名称。 | true | +| jobId | | 作业 ID。 | true | +| image | | 算法镜像。 | true | +| computerConf | | computer 配置选项的映射。 | true | +| workerInstances | | worker 实例数量,将覆盖 'job.workers_count' 选项。 | true | +| pullPolicy | Always | 镜像拉取策略,详情请参考:https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy | false | +| pullSecrets | | 镜像拉取密钥,详情请参考:https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod | false | +| masterCpu | | master 的 CPU 限制,单位可以是 'm' 或无单位,详情请参考:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false | +| workerCpu | | worker 的 CPU 限制,单位可以是 'm' 或无单位,详情请参考:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false | +| masterMemory | | master 的内存限制,单位可以是 Ei、Pi、Ti、Gi、Mi、Ki 之一,详情请参考:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false | +| workerMemory | | worker 的内存限制,单位可以是 Ei、Pi、Ti、Gi、Mi、Ki 之一,详情请参考:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false | +| log4jXml | | computer 作业的 log4j.xml 内容。 | false | +| jarFile | | computer 算法的 jar 路径。 | false | +| remoteJarUri | | computer 算法的远程 jar URI,将覆盖算法镜像。 | false | +| jvmOptions | | computer 作业的 Java 启动参数。 | false | +| envVars | | 
请参考:https://kubernetes.io/docs/tasks/inject-data-application/define-interdependent-environment-variables/ | false | +| envFrom | | 请参考:https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/ | false | +| masterCommand | bin/start-computer.sh | master 的运行命令,等同于 Docker 的 'Entrypoint' 字段。 | false | +| masterArgs | ["-r master", "-d k8s"] | master 的运行参数,等同于 Docker 的 'Cmd' 字段。 | false | +| workerCommand | bin/start-computer.sh | worker 的运行命令,等同于 Docker 的 'Entrypoint' 字段。 | false | +| workerArgs | ["-r worker", "-d k8s"] | worker 的运行参数,等同于 Docker 的 'Cmd' 字段。 | false | +| volumes | | 请参考:https://kubernetes.io/docs/concepts/storage/volumes/ | false | +| volumeMounts | | 请参考:https://kubernetes.io/docs/concepts/storage/volumes/ | false | +| secretPaths | | k8s-secret 名称和挂载路径的映射。 | false | +| configMapPaths | | k8s-configmap 名称和挂载路径的映射。 | false | +| podTemplateSpec | | 请参考:https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-template-v1/#PodTemplateSpec | false | +| securityContext | | 请参考:https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ | false | + +--- + +### KubeDriver 配置选项 + +| 配置项 | 默认值 | 说明 | +|--------|--------|------| +| k8s.build_image_bash_path | | 用于构建镜像的命令路径。 | +| k8s.enable_internal_algorithm | true | 是否启用内部算法。 | +| k8s.framework_image_url | hugegraph/hugegraph-computer:latest | computer 框架的镜像 URL。 | +| k8s.image_repository_password | | 登录镜像仓库的密码。 | +| k8s.image_repository_registry | | 登录镜像仓库的地址。 | +| k8s.image_repository_url | hugegraph/hugegraph-computer | 镜像仓库的 URL。 | +| k8s.image_repository_username | | 登录镜像仓库的用户名。 | +| k8s.internal_algorithm | [pageRank] | 所有内部算法的名称列表。**注意**:算法名称在这里使用驼峰命名法(例如 `pageRank`),但算法实现返回下划线命名法(例如 `page_rank`)。 | +| k8s.internal_algorithm_image_url | hugegraph/hugegraph-computer:latest | 内部算法的镜像 URL。 | +| k8s.jar_file_dir | /cache/jars/ | 算法 jar 将上传到的目录。 | +| k8s.kube_config | ~/.kube/config | k8s 配置文件的路径。 | +| k8s.log4j_xml_path | | computer 作业的 log4j.xml 路径。 | +| k8s.namespace | hugegraph-computer-system | hugegraph-computer 系统的命名空间。 | +| k8s.pull_secret_names | [] | 拉取镜像的 pull-secret 名称。 | diff --git a/content/en/docs/config/config-computer.md b/content/en/docs/config/config-computer.md index 08f804c10..3c97d796a 100644 --- a/content/en/docs/config/config-computer.md +++ b/content/en/docs/config/config-computer.md @@ -6,172 +6,443 @@ weight: 5 ### Computer Config Options -| config option | default value | description | -|-----------------------------------------|-------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| algorithm.message_class | org.apache.hugegraph.computer.core.config.Null | The class of message passed when compute vertex. | -| algorithm.params_class | org.apache.hugegraph.computer.core.config.Null | The class used to transfer algorithms' parameters before algorithm been run. 
| -| algorithm.result_class | org.apache.hugegraph.computer.core.config.Null | The class of vertex's value, the instance is used to store computation result for the vertex. | -| allocator.max_vertices_per_thread | 10000 | Maximum number of vertices per thread processed in each memory allocator | -| bsp.etcd_endpoints | http://localhost:2379 | The end points to access etcd. | -| bsp.log_interval | 30000 | The log interval(in ms) to print the log while waiting bsp event. | -| bsp.max_super_step | 10 | The max super step of the algorithm. | -| bsp.register_timeout | 300000 | The max timeout to wait for master and works to register. | -| bsp.wait_master_timeout | 86400000 | The max timeout(in ms) to wait for master bsp event. | -| bsp.wait_workers_timeout | 86400000 | The max timeout to wait for workers bsp event. | -| hgkv.max_data_block_size | 65536 | The max byte size of hgkv-file data block. | -| hgkv.max_file_size | 2147483648 | The max number of bytes in each hgkv-file. | -| hgkv.max_merge_files | 10 | The max number of files to merge at one time. | -| hgkv.temp_file_dir | /tmp/hgkv | This folder is used to store temporary files, temporary files will be generated during the file merging process. | -| hugegraph.name | hugegraph | The graph name to load data and write results back. | -| hugegraph.url | http://127.0.0.1:8080 | The hugegraph url to load data and write results back. | -| input.edge_direction | OUT | The data of the edge in which direction is loaded, when the value is BOTH, the edges in both OUT and IN direction will be loaded. | -| input.edge_freq | MULTIPLE | The frequency of edges can exist between a pair of vertices, allowed values: [SINGLE, SINGLE_PER_LABEL, MULTIPLE]. SINGLE means that only one edge can exist between a pair of vertices, use sourceId + targetId to identify it; SINGLE_PER_LABEL means that each edge label can exist one edge between a pair of vertices, use sourceId + edgelabel + targetId to identify it; MULTIPLE means that many edge can exist between a pair of vertices, use sourceId + edgelabel + sortValues + targetId to identify it. | -| input.filter_class | org.apache.hugegraph.computer.core.input.filter.DefaultInputFilter | The class to create input-filter object, input-filter is used to Filter vertex edges according to user needs. | -| input.loader_schema_path || The schema path of loader input, only takes effect when the input.source_type=loader is enabled | -| input.loader_struct_path || The struct path of loader input, only takes effect when the input.source_type=loader is enabled | -| input.max_edges_in_one_vertex | 200 | The maximum number of adjacent edges allowed to be attached to a vertex, the adjacent edges will be stored and transferred together as a batch unit. | -| input.source_type | hugegraph-server | The source type to load input data, allowed values: ['hugegraph-server', 'hugegraph-loader'], the 'hugegraph-loader' means use hugegraph-loader load data from HDFS or file, if use 'hugegraph-loader' load data then please config 'input.loader_struct_path' and 'input.loader_schema_path'. | -| input.split_fetch_timeout | 300 | The timeout in seconds to fetch input splits | -| input.split_max_splits | 10000000 | The maximum number of input splits | -| input.split_page_size | 500 | The page size for streamed load input split data | -| input.split_size | 1048576 | The input split size in bytes | -| job.id | local_0001 | The job id on Yarn cluster or K8s cluster. 
| -| job.partitions_count | 1 | The partitions count for computing one graph algorithm job. | -| job.partitions_thread_nums | 4 | The number of threads for partition parallel compute. | -| job.workers_count | 1 | The workers count for computing one graph algorithm job. | -| master.computation_class | org.apache.hugegraph.computer.core.master.DefaultMasterComputation | Master-computation is computation that can determine whether to continue next superstep. It runs at the end of each superstep on master. | -| output.batch_size | 500 | The batch size of output | -| output.batch_threads | 1 | The threads number used to batch output | -| output.hdfs_core_site_path || The hdfs core site path. | -| output.hdfs_delimiter | , | The delimiter of hdfs output. | -| output.hdfs_kerberos_enable | false | Is Kerberos authentication enabled for Hdfs. | -| output.hdfs_kerberos_keytab || The Hdfs's key tab file for kerberos authentication. | -| output.hdfs_kerberos_principal || The Hdfs's principal for kerberos authentication. | -| output.hdfs_krb5_conf | /etc/krb5.conf | Kerberos configuration file. | -| output.hdfs_merge_partitions | true | Whether merge output files of multiple partitions. | -| output.hdfs_path_prefix | /hugegraph-computer/results | The directory of hdfs output result. | -| output.hdfs_replication | 3 | The replication number of hdfs. | -| output.hdfs_site_path || The hdfs site path. | -| output.hdfs_url | hdfs://127.0.0.1:9000 | The hdfs url of output. | -| output.hdfs_user | hadoop | The hdfs user of output. | -| output.output_class | org.apache.hugegraph.computer.core.output.LogOutput | The class to output the computation result of each vertex. Be called after iteration computation. | -| output.result_name | value | The value is assigned dynamically by #name() of instance created by WORKER_COMPUTATION_CLASS. | -| output.result_write_type | OLAP_COMMON | The result write-type to output to hugegraph, allowed values are: [OLAP_COMMON, OLAP_SECONDARY, OLAP_RANGE]. | -| output.retry_interval | 10 | The retry interval when output failed | -| output.retry_times | 3 | The retry times when output failed | -| output.single_threads | 1 | The threads number used to single output | -| output.thread_pool_shutdown_timeout | 60 | The timeout seconds of output threads pool shutdown | -| output.with_adjacent_edges | false | Output the adjacent edges of the vertex or not | -| output.with_edge_properties | false | Output the properties of the edge or not | -| output.with_vertex_properties | false | Output the properties of the vertex or not | -| sort.thread_nums | 4 | The number of threads performing internal sorting. | -| transport.client_connect_timeout | 3000 | The timeout(in ms) of client connect to server. | -| transport.client_threads | 4 | The number of transport threads for client. | -| transport.close_timeout | 10000 | The timeout(in ms) of close server or close client. | -| transport.finish_session_timeout | 0 | The timeout(in ms) to finish session, 0 means using (transport.sync_request_timeout * transport.max_pending_requests). | -| transport.heartbeat_interval | 20000 | The minimum interval(in ms) between heartbeats on client side. | -| transport.io_mode | AUTO | The network IO Mode, either 'NIO', 'EPOLL', 'AUTO', the 'AUTO' means selecting the property mode automatically. | -| transport.max_pending_requests | 8 | The max number of client unreceived ack, it will trigger the sending unavailable if the number of unreceived ack >= max_pending_requests. 
| -| transport.max_syn_backlog | 511 | The capacity of SYN queue on server side, 0 means using system default value. | -| transport.max_timeout_heartbeat_count | 120 | The maximum times of timeout heartbeat on client side, if the number of timeouts waiting for heartbeat response continuously > max_heartbeat_timeouts the channel will be closed from client side. | -| transport.min_ack_interval | 200 | The minimum interval(in ms) of server reply ack. | -| transport.min_pending_requests | 6 | The minimum number of client unreceived ack, it will trigger the sending available if the number of unreceived ack < min_pending_requests. | -| transport.network_retries | 3 | The number of retry attempts for network communication,if network unstable. | -| transport.provider_class | org.apache.hugegraph.computer.core.network.netty.NettyTransportProvider | The transport provider, currently only supports Netty. | -| transport.receive_buffer_size | 0 | The size of socket receive-buffer in bytes, 0 means using system default value. | -| transport.recv_file_mode | true | Whether enable receive buffer-file mode, it will receive buffer write file from socket by zero-copy if enable. | -| transport.send_buffer_size | 0 | The size of socket send-buffer in bytes, 0 means using system default value. | -| transport.server_host | 127.0.0.1 | The server hostname or ip to listen on to transfer data. | -| transport.server_idle_timeout | 360000 | The max timeout(in ms) of server idle. | -| transport.server_port | 0 | The server port to listen on to transfer data. The system will assign a random port if it's set to 0. | -| transport.server_threads | 4 | The number of transport threads for server. | -| transport.sync_request_timeout | 10000 | The timeout(in ms) to wait response after sending sync-request. | -| transport.tcp_keep_alive | true | Whether enable TCP keep-alive. | -| transport.transport_epoll_lt | false | Whether enable EPOLL level-trigger. | -| transport.write_buffer_high_mark | 67108864 | The high water mark for write buffer in bytes, it will trigger the sending unavailable if the number of queued bytes > write_buffer_high_mark. | -| transport.write_buffer_low_mark | 33554432 | The low water mark for write buffer in bytes, it will trigger the sending available if the number of queued bytes < write_buffer_low_mark.org.apache.hugegraph.config.OptionChecker$$Lambda$97/0x00000008001c8440@776a6d9b | -| transport.write_socket_timeout | 3000 | The timeout(in ms) to write data to socket buffer. | -| valuefile.max_segment_size | 1073741824 | The max number of bytes in each segment of value-file. | -| worker.combiner_class | org.apache.hugegraph.computer.core.config.Null | Combiner can combine messages into one value for a vertex, for example page-rank algorithm can combine messages of a vertex to a sum value. | -| worker.computation_class | org.apache.hugegraph.computer.core.config.Null | The class to create worker-computation object, worker-computation is used to compute each vertex in each superstep. | -| worker.data_dirs | [jobs] | The directories separated by ',' that received vertices and messages can persist into. | -| worker.edge_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | The combiner can combine several properties of the same edge into one properties at inputstep. 
| -| worker.partitioner | org.apache.hugegraph.computer.core.graph.partition.HashPartitioner | The partitioner that decides which partition a vertex should be in, and which worker a partition should be in. | -| worker.received_buffers_bytes_limit | 104857600 | The limit bytes of buffers of received data, the total size of all buffers can't excess this limit. If received buffers reach this limit, they will be merged into a file. | -| worker.vertex_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | The combiner can combine several properties of the same vertex into one properties at inputstep. | -| worker.wait_finish_messages_timeout | 86400000 | The max timeout(in ms) message-handler wait for finish-message of all workers. | -| worker.wait_sort_timeout | 600000 | The max timeout(in ms) message-handler wait for sort-thread to sort one batch of buffers. | -| worker.write_buffer_capacity | 52428800 | The initial size of write buffer that used to store vertex or message. | -| worker.write_buffer_threshold | 52428800 | The threshold of write buffer, exceeding it will trigger sorting, the write buffer is used to store vertex or message. | +> **Default Value Notes:** +> - Configuration items listed below show the **code default values** (defined in `ComputerOptions.java`) +> - When the **packaged configuration file** (`conf/computer.properties` in the distribution) specifies a different value, it's noted as: `value (packaged: value)` +> - Example: `300000 (packaged: 100000)` means the code default is 300000, but the distributed package defaults to 100000 +> - For production deployments, the packaged defaults take precedence unless you explicitly override them + +--- + +### 1. Basic Configuration + +Core job settings for HugeGraph-Computer. + +| config option | default value | description | +|---------------|---------------|-------------| +| hugegraph.url | http://127.0.0.1:8080 | The HugeGraph server URL to load data and write results back. | +| hugegraph.name | hugegraph | The graph name to load data and write results back. | +| hugegraph.username | "" (empty) | The username for HugeGraph authentication (leave empty if authentication is disabled). | +| hugegraph.password | "" (empty) | The password for HugeGraph authentication (leave empty if authentication is disabled). | +| job.id | local_0001 (packaged: local_001) | The job identifier on YARN cluster or K8s cluster. | +| job.namespace | "" (empty) | The job namespace that can separate different data sources. 🔒 **Managed by system - do not modify manually**. | +| job.workers_count | 1 | The number of workers for computing one graph algorithm job. 🔒 **Managed by system - do not modify manually in K8s**. | +| job.partitions_count | 1 | The number of partitions for computing one graph algorithm job. | +| job.partitions_thread_nums | 4 | The number of threads for partition parallel compute. | + +--- + +### 2. Algorithm Configuration + +Algorithm-specific configuration for computation logic. + +| config option | default value | description | +|---------------|---------------|-------------| +| algorithm.params_class | org.apache.hugegraph.computer.core.config.Null | ⚠️ **REQUIRED** The class used to transfer algorithm parameters before the algorithm is run. | +| algorithm.result_class | org.apache.hugegraph.computer.core.config.Null | The class of vertex's value, used to store the computation result for the vertex. 
|
+| algorithm.message_class | org.apache.hugegraph.computer.core.config.Null | The class of message passed when computing a vertex. |
+
+---
+
+### 3. Input Configuration
+
+Configuration for loading input data from HugeGraph or other sources.
+
+#### 3.1 Input Source
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| input.source_type | hugegraph-server | The source type to load input data, allowed values: ['hugegraph-server', 'hugegraph-loader']. The 'hugegraph-loader' means using hugegraph-loader to load data from HDFS or file. If using 'hugegraph-loader', please configure 'input.loader_struct_path' and 'input.loader_schema_path'. |
+| input.loader_struct_path | "" (empty) | The struct path of loader input, only takes effect when input.source_type=hugegraph-loader is set. |
+| input.loader_schema_path | "" (empty) | The schema path of loader input, only takes effect when input.source_type=hugegraph-loader is set. |
+
+#### 3.2 Input Splits
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| input.split_size | 1048576 (1 MB) | The input split size in bytes. |
+| input.split_max_splits | 10000000 | The maximum number of input splits. |
+| input.split_page_size | 500 | The page size for streamed load input split data. |
+| input.split_fetch_timeout | 300 | The timeout in seconds to fetch input splits. |
+
+#### 3.3 Input Processing
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| input.filter_class | org.apache.hugegraph.computer.core.input.filter.DefaultInputFilter | The class to create input-filter object. Input-filter is used to filter vertex edges according to user needs. |
+| input.edge_direction | OUT | The direction of edges to load, allowed values: [OUT, IN, BOTH]. When the value is BOTH, edges in both OUT and IN directions will be loaded. |
+| input.edge_freq | MULTIPLE | The frequency of edges that can exist between a pair of vertices, allowed values: [SINGLE, SINGLE_PER_LABEL, MULTIPLE]. SINGLE means only one edge can exist between a pair of vertices (identified by sourceId + targetId); SINGLE_PER_LABEL means each edge label can have one edge between a pair of vertices (identified by sourceId + edgeLabel + targetId); MULTIPLE means many edges can exist between a pair of vertices (identified by sourceId + edgeLabel + sortValues + targetId). |
+| input.max_edges_in_one_vertex | 200 | The maximum number of adjacent edges allowed to be attached to a vertex. The adjacent edges will be stored and transferred together as a batch unit. |
+
+#### 3.4 Input Performance
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| input.send_thread_nums | 4 | The number of threads for parallel sending of vertices or edges. |
+
+---
+
+### 4. Snapshot & Storage Configuration
+
+HugeGraph-Computer supports snapshot functionality to save vertex/edge partitions to local storage or MinIO object storage, enabling checkpoint recovery or accelerating repeated computations.
+
+#### 4.1 Basic Snapshot Configuration
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| snapshot.write | false | Whether to write snapshots of input vertex/edge partitions. |
+| snapshot.load | false | Whether to load from snapshots of vertex/edge partitions. |
+| snapshot.name | "" (empty) | User-defined snapshot name to distinguish different snapshots. 
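+
+Before the MinIO-specific options below, here is a minimal `computer.properties` sketch that ties Sections 1 to 4.1 together. It is illustrative only: the server URL, graph name, and params class are assumptions (a PageRank-style job); replace them to match your deployment:
+
+```properties
+# Basic configuration (Section 1): server address and graph name
+hugegraph.url=http://127.0.0.1:8080
+hugegraph.name=hugegraph
+job.partitions_count=4
+
+# Algorithm configuration (Section 2): params_class is required;
+# this class name is an illustrative assumption
+algorithm.params_class=org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
+
+# Input configuration (Section 3): load OUT edges from hugegraph-server
+input.source_type=hugegraph-server
+input.edge_direction=OUT
+
+# Snapshot configuration (Section 4.1): write a named snapshot for reuse
+snapshot.write=true
+snapshot.name=pagerank-snapshot-v1
+```
+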
| + +#### 4.2 MinIO Integration (Optional) + +MinIO can be used as a distributed object storage backend for snapshots in K8s deployments. + +| config option | default value | description | +|---------------|---------------|-------------| +| snapshot.minio_endpoint | "" (empty) | MinIO service endpoint (e.g., `http://minio:9000`). Required when using MinIO. | +| snapshot.minio_access_key | minioadmin | MinIO access key for authentication. | +| snapshot.minio_secret_key | minioadmin | MinIO secret key for authentication. | +| snapshot.minio_bucket_name | "" (empty) | MinIO bucket name for storing snapshot data. | + +**Usage Scenarios:** +- **Checkpoint Recovery**: Resume from snapshots after job failures, avoiding data reloading +- **Repeated Computations**: Load data from snapshots when running the same algorithm multiple times +- **A/B Testing**: Save multiple snapshot versions of the same dataset to test different algorithm parameters + +**Example: Local Snapshot** (in `computer.properties`): +```properties +snapshot.write=true +snapshot.name=pagerank-snapshot-20260201 +``` + +**Example: MinIO Snapshot** (in K8s CRD `computerConf`): +```yaml +computerConf: + snapshot.write: "true" + snapshot.name: "pagerank-snapshot-v1" + snapshot.minio_endpoint: "http://minio:9000" + snapshot.minio_access_key: "my-access-key" + snapshot.minio_secret_key: "my-secret-key" + snapshot.minio_bucket_name: "hugegraph-snapshots" +``` + +--- + +### 5. Worker & Master Configuration + +Configuration for worker and master computation logic. + +#### 5.1 Master Configuration + +| config option | default value | description | +|---------------|---------------|-------------| +| master.computation_class | org.apache.hugegraph.computer.core.master.DefaultMasterComputation | Master-computation is computation that can determine whether to continue to the next superstep. It runs at the end of each superstep on the master. | + +#### 5.2 Worker Computation + +| config option | default value | description | +|---------------|---------------|-------------| +| worker.computation_class | org.apache.hugegraph.computer.core.config.Null | The class to create worker-computation object. Worker-computation is used to compute each vertex in each superstep. | +| worker.combiner_class | org.apache.hugegraph.computer.core.config.Null | Combiner can combine messages into one value for a vertex. For example, PageRank algorithm can combine messages of a vertex to a sum value. | +| worker.partitioner | org.apache.hugegraph.computer.core.graph.partition.HashPartitioner | The partitioner that decides which partition a vertex should be in, and which worker a partition should be in. | + +#### 5.3 Worker Combiners + +| config option | default value | description | +|---------------|---------------|-------------| +| worker.vertex_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | The combiner can combine several properties of the same vertex into one properties at input step. | +| worker.edge_properties_combiner_class | org.apache.hugegraph.computer.core.combiner.OverwritePropertiesCombiner | The combiner can combine several properties of the same edge into one properties at input step. | + +#### 5.4 Worker Buffers + +| config option | default value | description | +|---------------|---------------|-------------| +| worker.received_buffers_bytes_limit | 104857600 (100 MB) | The limit bytes of buffers of received data. The total size of all buffers can't exceed this limit. 
If received buffers reach this limit, they will be merged into a file (spill to disk). |
+| worker.write_buffer_capacity | 52428800 (50 MB) | The initial size of the write buffer used to store vertices or messages. |
+| worker.write_buffer_threshold | 52428800 (50 MB) | The threshold of write buffer. Exceeding it will trigger sorting. The write buffer is used to store vertices or messages. |
+
+#### 5.5 Worker Data & Timeouts
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| worker.data_dirs | [jobs] | The directories separated by ',' that received vertices and messages can persist into. |
+| worker.wait_sort_timeout | 600000 (10 minutes) | The max timeout (in ms) for message-handler to wait for sort-thread to sort one batch of buffers. |
+| worker.wait_finish_messages_timeout | 86400000 (24 hours) | The max timeout (in ms) for message-handler to wait for finish-message of all workers. |
+
+---
+
+### 6. I/O & Output Configuration
+
+Configuration for outputting computation results.
+
+#### 6.1 Output Class & Result
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| output.output_class | org.apache.hugegraph.computer.core.output.LogOutput | The class to output the computation result of each vertex. Called after iteration computation. |
+| output.result_name | value | The value is assigned dynamically by #name() of the instance created by WORKER_COMPUTATION_CLASS. |
+| output.result_write_type | OLAP_COMMON | The result write-type to output to HugeGraph, allowed values: [OLAP_COMMON, OLAP_SECONDARY, OLAP_RANGE]. |
+
+#### 6.2 Output Behavior
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| output.with_adjacent_edges | false | Whether to output the adjacent edges of the vertex. |
+| output.with_vertex_properties | false | Whether to output the properties of the vertex. |
+| output.with_edge_properties | false | Whether to output the properties of the edge. |
+
+#### 6.3 Batch Output
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| output.batch_size | 500 | The batch size of output. |
+| output.batch_threads | 1 | The number of threads used for batch output. |
+| output.single_threads | 1 | The number of threads used for single output. |
+
+#### 6.4 HDFS Output
+
+| config option | default value | description |
+|---------------|---------------|-------------|
+| output.hdfs_url | hdfs://127.0.0.1:9000 | The HDFS URL for output. |
+| output.hdfs_user | hadoop | The HDFS user for output. |
+| output.hdfs_path_prefix | /hugegraph-computer/results | The directory of HDFS output results. |
+| output.hdfs_delimiter | , (comma) | The delimiter of HDFS output. |
+| output.hdfs_merge_partitions | true | Whether to merge output files of multiple partitions. |
+| output.hdfs_replication | 3 | The replication number of HDFS. |
+| output.hdfs_core_site_path | "" (empty) | The HDFS core site path. |
+| output.hdfs_site_path | "" (empty) | The HDFS site path. |
+| output.hdfs_kerberos_enable | false | Whether Kerberos authentication is enabled for HDFS. |
+| output.hdfs_kerberos_principal | "" (empty) | The HDFS principal for Kerberos authentication. |
+| output.hdfs_kerberos_keytab | "" (empty) | The HDFS keytab file for Kerberos authentication. |
+| output.hdfs_krb5_conf | /etc/krb5.conf | Kerberos configuration file path. 
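+
+As a hedged sketch of Section 6.4 in practice, the snippet below writes results to HDFS instead of the default LogOutput. The `HdfsOutput` class name is an assumption here; check the output classes shipped with your distribution:
+
+```properties
+# Output configuration (Section 6.4): write results to HDFS
+output.output_class=org.apache.hugegraph.computer.core.output.hdfs.HdfsOutput
+output.hdfs_url=hdfs://127.0.0.1:9000
+output.hdfs_user=hadoop
+output.hdfs_path_prefix=/hugegraph-computer/results
+output.hdfs_delimiter=,
+# If Kerberos is enabled (off by default), also set principal and keytab
+output.hdfs_kerberos_enable=false
+```
+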
| + +#### 6.5 Retry & Timeout + +| config option | default value | description | +|---------------|---------------|-------------| +| output.retry_times | 3 | The retry times when output fails. | +| output.retry_interval | 10 | The retry interval (in seconds) when output fails. | +| output.thread_pool_shutdown_timeout | 60 | The timeout (in seconds) of output thread pool shutdown. | + +--- + +### 7. Network & Transport Configuration + +Configuration for network communication between workers and master. + +#### 7.1 Server Configuration + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.server_host | 127.0.0.1 | 🔒 **Managed by system** The server hostname or IP to listen on to transfer data. Do not modify manually. | +| transport.server_port | 0 | 🔒 **Managed by system** The server port to listen on to transfer data. The system will assign a random port if set to 0. Do not modify manually. | +| transport.server_threads | 4 | The number of transport threads for server. | + +#### 7.2 Client Configuration + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.client_threads | 4 | The number of transport threads for client. | +| transport.client_connect_timeout | 3000 | The timeout (in ms) of client connect to server. | + +#### 7.3 Protocol Configuration + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.provider_class | org.apache.hugegraph.computer.core.network.netty.NettyTransportProvider | The transport provider, currently only supports Netty. | +| transport.io_mode | AUTO | The network IO mode, allowed values: [NIO, EPOLL, AUTO]. AUTO means selecting the appropriate mode automatically. | +| transport.tcp_keep_alive | true | Whether to enable TCP keep-alive. | +| transport.transport_epoll_lt | false | Whether to enable EPOLL level-trigger (only effective when io_mode=EPOLL). | + +#### 7.4 Buffer Configuration + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.send_buffer_size | 0 | The size of socket send-buffer in bytes. 0 means using system default value. | +| transport.receive_buffer_size | 0 | The size of socket receive-buffer in bytes. 0 means using system default value. | +| transport.write_buffer_high_mark | 67108864 (64 MB) | The high water mark for write buffer in bytes. It will trigger sending unavailable if the number of queued bytes > write_buffer_high_mark. | +| transport.write_buffer_low_mark | 33554432 (32 MB) | The low water mark for write buffer in bytes. It will trigger sending available if the number of queued bytes < write_buffer_low_mark. | + +#### 7.5 Flow Control + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.max_pending_requests | 8 | The max number of client unreceived ACKs. It will trigger sending unavailable if the number of unreceived ACKs >= max_pending_requests. | +| transport.min_pending_requests | 6 | The minimum number of client unreceived ACKs. It will trigger sending available if the number of unreceived ACKs < min_pending_requests. | +| transport.min_ack_interval | 200 | The minimum interval (in ms) of server reply ACK. | + +#### 7.6 Timeouts + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.close_timeout | 10000 | The timeout (in ms) of close server or close client. 
| +| transport.sync_request_timeout | 10000 | The timeout (in ms) to wait for response after sending sync-request. | +| transport.finish_session_timeout | 0 | The timeout (in ms) to finish session. 0 means using (transport.sync_request_timeout × transport.max_pending_requests). | +| transport.write_socket_timeout | 3000 | The timeout (in ms) to write data to socket buffer. | +| transport.server_idle_timeout | 360000 (6 minutes) | The max timeout (in ms) of server idle. | + +#### 7.7 Heartbeat + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.heartbeat_interval | 20000 (20 seconds) | The minimum interval (in ms) between heartbeats on client side. | +| transport.max_timeout_heartbeat_count | 120 | The maximum times of timeout heartbeat on client side. If the number of timeouts waiting for heartbeat response continuously > max_timeout_heartbeat_count, the channel will be closed from client side. | + +#### 7.8 Advanced Network Settings + +| config option | default value | description | +|---------------|---------------|-------------| +| transport.max_syn_backlog | 511 | The capacity of SYN queue on server side. 0 means using system default value. | +| transport.recv_file_mode | true | Whether to enable receive buffer-file mode. It will receive buffer and write to file from socket using zero-copy if enabled. **Note**: Requires OS support for zero-copy (e.g., Linux sendfile/splice). | +| transport.network_retries | 3 | The number of retry attempts for network communication if network is unstable. | + +--- + +### 8. Storage & Persistence Configuration + +Configuration for HGKV (HugeGraph Key-Value) storage engine and value files. + +#### 8.1 HGKV Configuration + +| config option | default value | description | +|---------------|---------------|-------------| +| hgkv.max_file_size | 2147483648 (2 GB) | The max number of bytes in each HGKV file. | +| hgkv.max_data_block_size | 65536 (64 KB) | The max byte size of HGKV file data block. | +| hgkv.max_merge_files | 10 | The max number of files to merge at one time. | +| hgkv.temp_file_dir | /tmp/hgkv | This folder is used to store temporary files during the file merging process. | + +#### 8.2 Value File Configuration + +| config option | default value | description | +|---------------|---------------|-------------| +| valuefile.max_segment_size | 1073741824 (1 GB) | The max number of bytes in each segment of value-file. | + +--- + +### 9. BSP & Coordination Configuration + +Configuration for Bulk Synchronous Parallel (BSP) protocol and etcd coordination. + +| config option | default value | description | +|---------------|---------------|-------------| +| bsp.etcd_endpoints | http://localhost:2379 | 🔒 **Managed by system in K8s** The endpoints to access etcd. For multiple endpoints, use comma-separated list: `http://host1:port1,http://host2:port2`. Do not modify manually in K8s deployments. | +| bsp.max_super_step | 10 (packaged: 2) | The max super step of the algorithm. | +| bsp.register_timeout | 300000 (packaged: 100000) | The max timeout (in ms) to wait for master and workers to register. | +| bsp.wait_workers_timeout | 86400000 (24 hours) | The max timeout (in ms) to wait for workers BSP event. | +| bsp.wait_master_timeout | 86400000 (24 hours) | The max timeout (in ms) to wait for master BSP event. | +| bsp.log_interval | 30000 (30 seconds) | The log interval (in ms) to print the log while waiting for BSP event. | + +--- + +### 10. 
Performance Tuning Configuration + +Configuration for performance optimization. + +| config option | default value | description | +|---------------|---------------|-------------| +| allocator.max_vertices_per_thread | 10000 | Maximum number of vertices per thread processed in each memory allocator. | +| sort.thread_nums | 4 | The number of threads performing internal sorting. | + +--- + +### 11. System Administration Configuration + +⚠️ **Configuration items managed by the system - users are prohibited from modifying these manually.** + +The following configuration items are automatically managed by the K8s Operator, Driver, or runtime system. Manual modification will cause cluster communication failures or job scheduling errors. + +| config option | managed by | description | +|---------------|------------|-------------| +| bsp.etcd_endpoints | K8s Operator | Automatically set to operator's etcd service address | +| transport.server_host | Runtime | Automatically set to pod/container hostname | +| transport.server_port | Runtime | Automatically assigned random port | +| job.namespace | K8s Operator | Automatically set to job namespace | +| job.id | K8s Operator | Automatically set to job ID from CRD | +| job.workers_count | K8s Operator | Automatically set from CRD `workerInstances` | +| rpc.server_host | Runtime | RPC server hostname (system-managed) | +| rpc.server_port | Runtime | RPC server port (system-managed) | +| rpc.remote_url | Runtime | RPC remote URL (system-managed) | + +**Why These Are Forbidden:** +- **BSP/RPC Configuration**: Must match the actual deployed etcd/RPC services. Manual overrides break coordination. +- **Job Configuration**: Must match K8s CRD specifications. Mismatches cause worker count errors. +- **Transport Configuration**: Must use actual pod hostnames/ports. Manual values prevent inter-worker communication. + +--- ### K8s Operator Config Options > NOTE: Option needs to be converted through environment variable settings, e.g. k8s.internal_etcd_url => INTERNAL_ETCD_URL -| config option | default value | description | -|------------------------------|---------------------------|---------------------------------------------------------------------------------------------------------------------------------| -| k8s.auto_destroy_pod | true | Whether to automatically destroy all pods when the job is completed or failed. | -| k8s.close_reconciler_timeout | 120 | The max timeout(in ms) to close reconciler. | -| k8s.internal_etcd_url | http://127.0.0.1:2379 | The internal etcd url for operator system. | -| k8s.max_reconcile_retry | 3 | The max retry times of reconcile. | -| k8s.probe_backlog | 50 | The maximum backlog for serving health probes. | -| k8s.probe_port | 9892 | The value is the port that the controller bind to for serving health probes. | -| k8s.ready_check_internal | 1000 | The time interval(ms) of check ready. | -| k8s.ready_timeout | 30000 | The max timeout(in ms) of check ready. | -| k8s.reconciler_count | 10 | The max number of reconciler thread. | -| k8s.resync_period | 600000 | The minimum frequency at which watched resources are reconciled. | -| k8s.timezone | Asia/Shanghai | The timezone of computer job and operator. | -| k8s.watch_namespace | hugegraph-computer-system | The value is watch custom resources in the namespace, ignore other namespaces, the '*' means is all namespaces will be watched. 
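+
+Following the conversion rule in the note above (`k8s.internal_etcd_url` => `INTERNAL_ETCD_URL`), the sketch below shows how a few options from the table that follows might be passed to the operator as environment variables. The container name and the surrounding Deployment layout are assumptions:
+
+```yaml
+# Sketch: setting operator options as env vars (names follow the conversion rule)
+spec:
+  template:
+    spec:
+      containers:
+        - name: hugegraph-computer-operator  # assumed container name
+          env:
+            - name: WATCH_NAMESPACE          # k8s.watch_namespace
+              value: "hugegraph-computer-system"
+            - name: AUTO_DESTROY_POD         # k8s.auto_destroy_pod
+              value: "true"
+            - name: TIMEZONE                 # k8s.timezone
+              value: "Asia/Shanghai"
+```
+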
### K8s Operator Config Options

> NOTE: Option needs to be converted through environment variable settings, e.g. k8s.internal_etcd_url => INTERNAL_ETCD_URL

-| config option | default value | description |
-|------------------------------|---------------------------|-------------|
-| k8s.auto_destroy_pod | true | Whether to automatically destroy all pods when the job is completed or failed. |
-| k8s.close_reconciler_timeout | 120 | The max timeout(in ms) to close reconciler. |
-| k8s.internal_etcd_url | http://127.0.0.1:2379 | The internal etcd url for operator system. |
-| k8s.max_reconcile_retry | 3 | The max retry times of reconcile. |
-| k8s.probe_backlog | 50 | The maximum backlog for serving health probes. |
-| k8s.probe_port | 9892 | The value is the port that the controller bind to for serving health probes. |
-| k8s.ready_check_internal | 1000 | The time interval(ms) of check ready. |
-| k8s.ready_timeout | 30000 | The max timeout(in ms) of check ready. |
-| k8s.reconciler_count | 10 | The max number of reconciler thread. |
-| k8s.resync_period | 600000 | The minimum frequency at which watched resources are reconciled. |
-| k8s.timezone | Asia/Shanghai | The timezone of computer job and operator. |
-| k8s.watch_namespace | hugegraph-computer-system | The value is watch custom resources in the namespace, ignore other namespaces, the '*' means is all namespaces will be watched. |
+| config option | default value | description |
+|------------------------------|---------------------------|-------------|
+| k8s.auto_destroy_pod | true | Whether to automatically destroy all pods when the job is completed or failed. |
+| k8s.close_reconciler_timeout | 120 | The max timeout (in ms) to close the reconciler. |
+| k8s.internal_etcd_url | http://127.0.0.1:2379 | The internal etcd URL for the operator system. |
+| k8s.max_reconcile_retry | 3 | The max retry times of reconcile. |
+| k8s.probe_backlog | 50 | The maximum backlog for serving health probes. |
+| k8s.probe_port | 9892 | The port that the controller binds to for serving health probes. |
+| k8s.ready_check_internal | 1000 | The time interval (in ms) of the readiness check. |
+| k8s.ready_timeout | 30000 | The max timeout (in ms) of the readiness check. |
+| k8s.reconciler_count | 10 | The max number of reconciler threads. |
+| k8s.resync_period | 600000 | The minimum frequency at which watched resources are reconciled. |
+| k8s.timezone | Asia/Shanghai | The timezone of the computer job and operator. |
+| k8s.watch_namespace | hugegraph-computer-system | The namespace to watch custom resources in. Use '*' to watch all namespaces. |
+
+---

### HugeGraph-Computer CRD

> CRD: https://github.com/apache/hugegraph-computer/blob/master/computer-k8s-operator/manifest/hugegraph-computer-crd.v1.yaml

| spec | default value | description | required |
|-----------------|-------------------------|-------------|----------|
-| algorithmName | | The name of algorithm. | true |
-| jobId | | The job id. | true |
-| image | | The image of algorithm. | true |
-| computerConf | | The map of computer config options. | true |
-| workerInstances | | The number of worker instances, it will instead the 'job.workers_count' option. | true |
-| pullPolicy | Always | The pull-policy of image, detail please refer to: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy | false |
-| pullSecrets | | The pull-secrets of Image, detail please refer to: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod | false |
-| masterCpu | | The cpu limit of master, the unit can be 'm' or without unit detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false |
-| workerCpu | | The cpu limit of worker, the unit can be 'm' or without unit detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false |
-| masterMemory | | The memory limit of master, the unit can be one of Ei、Pi、Ti、Gi、Mi、Ki detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false |
-| workerMemory | | The memory limit of worker, the unit can be one of Ei、Pi、Ti、Gi、Mi、Ki detail please refer to:[https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false |
-| log4jXml | | The content of log4j.xml for computer job. | false |
-| jarFile | | The jar path of computer algorithm. | false |
-| remoteJarUri | | The remote jar uri of computer algorithm, it will overlay algorithm image. | false |
-| jvmOptions | | The java startup parameters of computer job. | false |
-| envVars | | please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-interdependent-environment-variables/ | false |
-| envFrom | | please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/ | false |
-| masterCommand | bin/start-computer.sh | The run command of master, equivalent to 'Entrypoint' field of Docker. | false |
-| masterArgs | ["-r master", "-d k8s"] | The run args of master, equivalent to 'Cmd' field of Docker. | false |
-| workerCommand | bin/start-computer.sh | The run command of worker, equivalent to 'Entrypoint' field of Docker. | false |
-| workerArgs | ["-r worker", "-d k8s"] | The run args of worker, equivalent to 'Cmd' field of Docker. | false |
-| volumes | | Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/ | false |
-| volumeMounts | | Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/ | false |
-| secretPaths | | The map of k8s-secret name and mount path. | false |
-| configMapPaths | | The map of k8s-configmap name and mount path. | false |
-| podTemplateSpec | | Please refer to: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-template-v1/#PodTemplateSpec | false |
-| securityContext | | Please refer to: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ | false |
+| algorithmName | | The name of algorithm. | true |
+| jobId | | The job id. | true |
+| image | | The image of algorithm. | true |
+| computerConf | | The map of computer config options. | true |
+| workerInstances | | The number of worker instances, it will override the 'job.workers_count' option. | true |
+| pullPolicy | Always | The pull-policy of the image, for details please refer to: https://kubernetes.io/docs/concepts/containers/images/#image-pull-policy | false |
+| pullSecrets | | The pull-secrets of the image, for details please refer to: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod | false |
+| masterCpu | | The CPU limit of the master; the unit can be 'm' or unitless. For details, please refer to: [https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false |
+| workerCpu | | The CPU limit of the worker; the unit can be 'm' or unitless. For details, please refer to: [https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) | false |
+| masterMemory | | The memory limit of the master; the unit can be one of Ei, Pi, Ti, Gi, Mi, Ki. For details, please refer to: [https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false |
+| workerMemory | | The memory limit of the worker; the unit can be one of Ei, Pi, Ti, Gi, Mi, Ki. For details, please refer to: [https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory) | false |
+| log4jXml | | The content of log4j.xml for the computer job. | false |
+| jarFile | | The jar path of the computer algorithm. | false |
+| remoteJarUri | | The remote jar URI of the computer algorithm; it overrides the jar in the algorithm image. | false |
+| jvmOptions | | The JVM startup parameters of the computer job. | false |
+| envVars | | Please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-interdependent-environment-variables/ | false |
+| envFrom | | Please refer to: https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/ | false |
+| masterCommand | bin/start-computer.sh | The run command of the master, equivalent to the 'Entrypoint' field of Docker. | false |
+| masterArgs | ["-r master", "-d k8s"] | The run args of the master, equivalent to the 'Cmd' field of Docker. | false |
+| workerCommand | bin/start-computer.sh | The run command of the worker, equivalent to the 'Entrypoint' field of Docker. | false |
+| workerArgs | ["-r worker", "-d k8s"] | The run args of the worker, equivalent to the 'Cmd' field of Docker. | false |
+| volumes | | Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/ | false |
+| volumeMounts | | Please refer to: https://kubernetes.io/docs/concepts/storage/volumes/ | false |
+| secretPaths | | The map of k8s-secret name and mount path. | false |
+| configMapPaths | | The map of k8s-configmap name and mount path. | false |
+| podTemplateSpec | | Please refer to: https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/pod-template-v1/#PodTemplateSpec | false |
+| securityContext | | Please refer to: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/ | false |
+
+---

### KubeDriver Config Options

-| config option | default value | description |
-|----------------------------------|------------------------------------------|-------------|
-| k8s.build_image_bash_path || The path of command used to build image. |
-| k8s.enable_internal_algorithm | true | Whether enable internal algorithm. |
-| k8s.framework_image_url | hugegraph/hugegraph-computer:latest | The image url of computer framework. |
-| k8s.image_repository_password || The password for login image repository. |
-| k8s.image_repository_registry || The address for login image repository. |
-| k8s.image_repository_url | hugegraph/hugegraph-computer | The url of image repository. |
-| k8s.image_repository_username || The username for login image repository. |
-| k8s.internal_algorithm | [pageRank] | The name list of all internal algorithm. |
-| k8s.internal_algorithm_image_url | hugegraph/hugegraph-computer:latest | The image url of internal algorithm. |
-| k8s.jar_file_dir | /cache/jars/ | The directory where the algorithm jar to upload location. |
-| k8s.kube_config | ~/.kube/config | The path of k8s config file. |
-| k8s.log4j_xml_path || The log4j.xml path for computer job. |
-| k8s.namespace | hugegraph-computer-system | The namespace of hugegraph-computer system. |
-| k8s.pull_secret_names | [] | The names of pull-secret for pulling image. |
+| config option | default value | description |
+|----------------------------------|------------------------------------------|-------------|
+| k8s.build_image_bash_path | | The path of the command used to build images. |
+| k8s.enable_internal_algorithm | true | Whether to enable the internal algorithms. |
+| k8s.framework_image_url | hugegraph/hugegraph-computer:latest | The image URL of the computer framework. |
+| k8s.image_repository_password | | The password for logging in to the image repository. |
+| k8s.image_repository_registry | | The address for logging in to the image repository. |
+| k8s.image_repository_url | hugegraph/hugegraph-computer | The URL of the image repository. |
+| k8s.image_repository_username | | The username for logging in to the image repository. |
+| k8s.internal_algorithm | [pageRank] | The name list of all internal algorithms. **Note**: Algorithm names use camelCase here (e.g., `pageRank`), but algorithm implementations return underscore_case (e.g., `page_rank`). |
+| k8s.internal_algorithm_image_url | hugegraph/hugegraph-computer:latest | The image URL of the internal algorithms. |
+| k8s.jar_file_dir | /cache/jars/ | The directory where the algorithm jar will be uploaded. |
+| k8s.kube_config | ~/.kube/config | The path of the k8s config file. |
+| k8s.log4j_xml_path | | The log4j.xml path for the computer job. |
+| k8s.namespace | hugegraph-computer-system | The namespace of the hugegraph-computer system. |
+| k8s.pull_secret_names | [] | The names of the pull-secrets for pulling images. |
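+
+Putting the CRD spec together, a minimal custom resource for a built-in algorithm could look like the sketch below. The `apiVersion`/`kind` values are assumptions taken from the CRD manifest linked above; the field names follow the spec table:
+
+```yaml
+apiVersion: hugegraph.apache.org/v1           # assumed from the CRD manifest above
+kind: HugeGraphComputerJob
+metadata:
+  namespace: hugegraph-computer-system        # default watch namespace (k8s.watch_namespace)
+  name: pagerank-sample
+spec:
+  jobId: pagerank-sample-001                  # required
+  algorithmName: pageRank                     # required; internal algorithm name (camelCase)
+  image: hugegraph/hugegraph-computer:latest  # required; algorithm image
+  workerInstances: 3                          # required; overrides job.workers_count
+  pullPolicy: Always
+  computerConf:                               # required; map of computer config options
+    job.partitions_count: "4"
+    algorithm.params_class: org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
+```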

diff --git a/content/en/docs/quickstart/computing/hugegraph-computer.md b/content/en/docs/quickstart/computing/hugegraph-computer.md
index 26924c610..dce2b07d2 100644
--- a/content/en/docs/quickstart/computing/hugegraph-computer.md
+++ b/content/en/docs/quickstart/computing/hugegraph-computer.md
@@ -63,7 +63,36 @@ cd hugegraph-computer
mvn clean package -DskipTests
```

-#### 3.1.3 Start master node
+#### 3.1.3 Configure computer.properties
+
+Edit `conf/computer.properties` to configure the connection to HugeGraph-Server and etcd:
+
+```properties
+# Job configuration
+job.id=local_pagerank_001
+job.partitions_count=4
+
+# HugeGraph connection (✅ Correct configuration keys)
+hugegraph.url=http://localhost:8080
+hugegraph.name=hugegraph
+# If authentication is enabled on HugeGraph-Server
+hugegraph.username=
+hugegraph.password=
+
+# BSP coordination (✅ Correct key: bsp.etcd_endpoints)
+bsp.etcd_endpoints=http://localhost:2379
+bsp.max_super_step=10
+
+# Algorithm parameters (⚠️ Required)
+algorithm.params_class=org.apache.hugegraph.computer.algorithm.centrality.pagerank.PageRankParams
+```
+
+> **Important Configuration Notes:**
+> - Use `bsp.etcd_endpoints` (NOT `bsp.etcd.url`) for the etcd connection
+> - `algorithm.params_class` is required for all algorithms
+> - For multiple etcd endpoints, use a comma-separated list: `http://host1:2379,http://host2:2379`
+
+#### 3.1.4 Start master node

> You can use `-c` parameter specify the configuration file, more computer config please see:[Computer Config Options](/docs/config/config-computer#computer-config-options)

@@ -72,15 +101,15 @@ cd hugegraph-computer
bin/start-computer.sh -d local -r master
```

-#### 3.1.4 Start worker node
+#### 3.1.5 Start worker node

```bash
bin/start-computer.sh -d local -r worker
```

-#### 3.1.5 Query algorithm results
+#### 3.1.6 Query algorithm results

-3.1.5.1 Enable `OLAP` index query for server
+3.1.6.1 Enable `OLAP` index query for server

If the OLAP index is not enabled, it needs to be enabled. More reference: [modify-graphs-read-mode](/docs/clients/restful-api/graphs/#634-modify-graphs-read-mode-this-operation-requires-administrator-privileges)

@@ -90,12 +119,14 @@ PUT http://localhost:8080/graphs/hugegraph/graph_read_mode
"ALL"
```

-3.1.5.2 Query `page_rank` property value:
+3.1.6.2 Query `page_rank` property value:

```bash
curl "http://localhost:8080/graphs/hugegraph/graph/vertices?page&limit=3" | gunzip
```

+
+---
+
### 3.2 Run PageRank algorithm in Kubernetes

> To run an algorithm with HugeGraph-Computer, you need to deploy HugeGraph-Server first

@@ -141,6 +172,8 @@ hugegraph-computer-operator-etcd-28lm67jxk5             1/1     Runnin
>
> More computer config please see: [Computer Config Options](/docs/config/config-computer#computer-config-options)

+**Basic Example:**
+
```yaml
cat <

From 18de622861d19461b2686a88a3dc1b08988c4a4c Mon Sep 17 00:00:00 2001
From: imbajin
Date: Mon, 2 Feb 2026 16:06:42 +0800
Subject: [PATCH 07/10] BREAKING CHANGE: update docs links and Chinese content

Change repository links from incubator to apache/hugegraph and
apache/hugegraph-doc in config and many docs; update Chinese homepage and
docs with refreshed wording (add AI/LLM, features blocks, revised
introductions and CTAs); adjust multiple changelog weights and fix numerous
internal/repo links and examples to reflect project rename and content
improvements.
--- config.toml | 6 +-- content/cn/_index.html | 31 +++++++---- content/cn/docs/_index.md | 50 ++++++++++++++++- .../hugegraph-0.10.4-release-notes.md | 2 +- .../hugegraph-0.12.0-release-notes.md | 2 +- .../changelog/hugegraph-0.2-release-notes.md | 2 +- .../hugegraph-0.2.4-release-notes.md | 2 +- .../hugegraph-0.3.3-release-notes.md | 2 +- .../hugegraph-0.4.4-release-notes.md | 2 +- .../hugegraph-0.5.6-release-notes.md | 2 +- .../hugegraph-0.6.1-release-notes.md | 2 +- .../hugegraph-0.7.4-release-notes.md | 2 +- .../hugegraph-0.8.0-release-notes.md | 2 +- .../hugegraph-0.9.2-release-notes.md | 2 +- .../hugegraph-1.0.0-release-notes.md | 2 +- .../hugegraph-1.2.0-release-notes.md | 2 +- .../hugegraph-1.3.0-release-notes.md | 2 +- .../hugegraph-1.5.0-release-notes.md | 2 +- .../hugegraph-1.7.0-release-notes.md | 2 +- content/cn/docs/clients/gremlin-console.md | 2 +- content/cn/docs/clients/restful-api/_index.md | 2 +- content/cn/docs/clients/restful-api/auth.md | 4 +- content/cn/docs/clients/restful-api/graphs.md | 2 +- .../cn/docs/config/config-authentication.md | 4 +- .../committer-guidelines.md | 28 +++++----- .../contribution-guidelines/contribute.md | 12 ++--- .../hugegraph-server-idea-setup.md | 10 ++-- content/cn/docs/download/download.md | 4 +- content/cn/docs/introduction/README.md | 51 +++++++++++++----- .../quickstart/client/hugegraph-client-go.md | 10 ++-- .../quickstart/client/hugegraph-client.md | 2 +- .../cn/docs/quickstart/computing/_index.md | 6 +-- .../computing/hugegraph-computer.md | 2 +- .../quickstart/computing/hugegraph-vermeer.md | 2 +- .../cn/docs/quickstart/hugegraph-ai/_index.md | 6 +-- .../cn/docs/quickstart/hugegraph/_index.md | 8 +-- .../quickstart/hugegraph/hugegraph-server.md | 6 +-- .../cn/docs/quickstart/toolchain/_index.md | 6 +-- .../quickstart/toolchain/hugegraph-loader.md | 2 +- content/en/_index.html | 37 ++++++++----- content/en/community/maturity.md | 2 +- content/en/docs/_index.md | 50 ++++++++++++++++- .../hugegraph-0.12.0-release-notes.md | 2 +- .../hugegraph-1.0.0-release-notes.md | 2 +- .../hugegraph-1.2.0-release-notes.md | 2 +- .../hugegraph-1.3.0-release-notes.md | 2 +- .../hugegraph-1.5.0-release-notes.md | 2 +- .../hugegraph-1.7.0-release-notes.md | 2 +- content/en/docs/clients/gremlin-console.md | 2 +- content/en/docs/clients/restful-api/_index.md | 2 +- content/en/docs/clients/restful-api/auth.md | 4 +- content/en/docs/clients/restful-api/graphs.md | 2 +- .../en/docs/config/config-authentication.md | 4 +- .../committer-guidelines.md | 28 +++++----- .../contribution-guidelines/contribute.md | 12 ++--- .../hugegraph-server-idea-setup.md | 10 ++-- content/en/docs/download/download.md | 4 +- content/en/docs/introduction/README.md | 53 ++++++++++++++----- .../quickstart/client/hugegraph-client-go.md | 10 ++-- .../quickstart/client/hugegraph-client.md | 2 +- .../en/docs/quickstart/computing/_index.md | 6 +-- .../computing/hugegraph-computer.md | 2 +- .../quickstart/computing/hugegraph-vermeer.md | 2 +- .../en/docs/quickstart/hugegraph-ai/_index.md | 6 +-- .../en/docs/quickstart/hugegraph/_index.md | 8 +-- .../quickstart/hugegraph/hugegraph-server.md | 6 +-- .../en/docs/quickstart/toolchain/_index.md | 6 +-- themes/docsy/layouts/partials/footer.html | 10 ++-- 68 files changed, 370 insertions(+), 198 deletions(-) diff --git a/config.toml b/config.toml index f37873d09..5076af003 100644 --- a/config.toml +++ b/config.toml @@ -45,7 +45,7 @@ theme = "default" name = "GitHub" weight = -99 pre = "" - url = "https://github.com/apache/incubator-hugegraph" 
+    url = "https://github.com/apache/hugegraph"
[[menu.main]]
    name ="Download"
    weight = -98
@@ -159,9 +159,9 @@ version = "1.7"
url_latest_version = "https://example.com"

# Repository configuration (URLs for in-page links to opening issues and suggesting changes)
-github_repo = "https://github.com/apache/incubator-hugegraph-doc"
+github_repo = "https://github.com/apache/hugegraph-doc"
# An optional link to a related project repo. For example, the sibling repository where your product code lives.
-github_project_repo = "https://github.com/apache/incubator-hugegraph"
+github_project_repo = "https://github.com/apache/hugegraph"

# Specify a value here if your content directory is not in your repo's root directory
# github_subdir = ""

diff --git a/content/cn/_index.html b/content/cn/_index.html
index 8581a741f..a5d0cb46b 100644
--- a/content/cn/_index.html
+++ b/content/cn/_index.html
@@ -11,8 +11,6 @@

Apache HugeGraph

-

-          Incubating

}}"> Learn More @@ -20,16 +18,16 @@

Apache Download -

-HugeGraph是一款易用、高效、通用的图数据库
-实现了Apache TinkerPop3框架、兼容Gremlin查询语言。
+HugeGraph 是一款全栈式图数据库系统
+支持从数据存储、实时查询到离线分析的完整图数据处理能力,同时支持 Gremlin 和 Cypher 查询语言。

{{< blocks/link-down color="info" >}}
{{< /blocks/cover >}}


{{% blocks/lead color="primary" %}}
-HugeGraph支持百亿以上的顶点(Vertex)和边(Edge)快速导入,毫秒级的关联查询能力,并可与Hadoop、Spark等
-大数据平台集成以进行离线分析,主要应用场景包括关联分析、欺诈检测和知识图谱等。
+HugeGraph 支持百亿级图数据高速导入与毫秒级实时查询,可与 Spark、Flink 等大数据平台深度集成。
+在 AI 时代,通过与大语言模型 (LLM) 结合,为智能问答、推荐系统、风控反欺诈、知识图谱等场景提供强大的图计算能力。

{{% /blocks/lead %}}

{{< blocks/section color="dark" >}}
@@ -49,12 +47,27 @@

Apache
{{% /blocks/feature %}}

+{{% blocks/feature icon="fa-brain" title="智能化" %}}
+集成LLM实现GraphRAG智能问答、自动化知识图谱构建,内置20+图机器学习算法,轻松构建AI驱动的图应用。
+{{% /blocks/feature %}}
+
+
+{{% blocks/feature icon="fa-expand-arrows-alt" title="可扩展" %}}
+支持水平扩容和分布式部署,从单机到PB级集群无缝迁移,提供多种存储引擎适配,满足不同规模和性能需求。
+{{% /blocks/feature %}}
+
+
+{{% blocks/feature icon="fa-puzzle-piece" title="开放生态" %}}
+遵循Apache TinkerPop标准,提供Java、Python、Go等多语言客户端,兼容主流大数据平台,社区活跃持续演进。
+{{% /blocks/feature %}}
+
+
{{< /blocks/section >}}

{{< blocks/section color="blue-deep">}}
-第一个 Apache 图数据库项目
+首个 Apache 基金会的顶级图项目

{{< /blocks/section >}}

@@ -63,7 +76,7 @@ 第一个 Apache 图数据库项目

{{< blocks/section >}}
{{% blocks/feature icon="far fa-tools" title="使用易用的**工具链**" %}}
-可从[此](https://github.com/apache/incubator-hugegraph-toolchain)获取图数据导入工具, 可视化界面以及备份还原迁移工具, 欢迎使用
+可从[此](https://github.com/apache/hugegraph-toolchain)获取图数据导入工具, 可视化界面以及备份还原迁移工具, 欢迎使用
{{% /blocks/feature %}}

@@ -84,7 +97,7 @@ 第一个 Apache 图数据库项目

{{< blocks/section color="blue-light">}}
-欢迎大家参与 HugeGraph 的任何贡献
+欢迎大家给 HugeGraph 添砖加瓦

{{< /blocks/section >}} diff --git a/content/cn/docs/_index.md b/content/cn/docs/_index.md index e7db53bff..6f1fe7198 100755 --- a/content/cn/docs/_index.md +++ b/content/cn/docs/_index.md @@ -7,4 +7,52 @@ menu: weight: 20 --- -欢迎阅读HugeGraph文档 +## Apache HugeGraph 文档 + +Apache HugeGraph 是一套完整的图数据库生态系统,支持 OLTP 实时查询、OLAP 离线分析和 AI 智能应用。 + +### 按场景快速导航 + +| 我想要... | 从这里开始 | +|----------|-----------| +| **运行图查询** (OLTP) | [HugeGraph Server 快速开始](quickstart/hugegraph-server/hugegraph-server) | +| **大规模图计算** (OLAP) | [图计算引擎](quickstart/hugegraph-computer/hugegraph-computer) | +| **构建 AI/RAG 应用** | [HugeGraph-AI](quickstart/hugegraph-ai) | +| **批量导入数据** | [HugeGraph Loader](quickstart/hugegraph-loader) | +| **可视化管理图** | [Hubble Web UI](quickstart/hugegraph-hubble) | + +### 生态系统一览 + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Apache HugeGraph 生态 │ +├─────────────────────────────────────────────────────────────────┤ +│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │ +│ │ HugeGraph │ │ HugeGraph │ │ HugeGraph-AI │ │ +│ │ Server │ │ Computer │ │ (GraphRAG/ML/Python) │ │ +│ │ (OLTP) │ │ (OLAP) │ │ │ │ +│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │ +│ │ │ │ │ +│ ┌──────┴───────────────┴────────────────────┴──────────────┐ │ +│ │ HugeGraph Toolchain │ │ +│ │ Hubble (UI) | Loader | Client (Java/Go/Python) | Tools │ │ +│ └───────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +### 核心组件 + +- **HugeGraph Server** - 图数据库核心,REST API + Gremlin + Cypher 支持 +- **HugeGraph Toolchain** - 客户端 SDK、数据导入、可视化、运维工具 +- **HugeGraph Computer** - 分布式图计算 (Vermeer 高性能内存版 / Computer 海量存储外存版) +- **HugeGraph-AI** - GraphRAG、知识图谱构建、20+ 图机器学习算法 + +### 部署模式 + +| 模式 | 适用场景 | 数据规模 | +|-----|---------|---------| +| **单机版** | 极速稳定、存算一体 | < 1000TB | +| **分布式** | 海量存储、存算分离 | >= 1000TB | +| **Docker** | 快速体验 | 任意 | + +[📖 详细介绍](introduction/) diff --git a/content/cn/docs/changelog/hugegraph-0.10.4-release-notes.md b/content/cn/docs/changelog/hugegraph-0.10.4-release-notes.md index c8b480383..5ecab368b 100644 --- a/content/cn/docs/changelog/hugegraph-0.10.4-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.10.4-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.10 Release Notes" linkTitle: "Release-0.10.4" draft: true -weight: 14 +weight: 15 --- ### API & Client diff --git a/content/cn/docs/changelog/hugegraph-0.12.0-release-notes.md b/content/cn/docs/changelog/hugegraph-0.12.0-release-notes.md index 3735330d5..6f893d24e 100644 --- a/content/cn/docs/changelog/hugegraph-0.12.0-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.12.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 0.12 Release Notes" linkTitle: "Release-0.12.0" -weight: 1 +weight: 11 --- ### API & Client diff --git a/content/cn/docs/changelog/hugegraph-0.2-release-notes.md b/content/cn/docs/changelog/hugegraph-0.2-release-notes.md index c4b549045..be7b7ac29 100644 --- a/content/cn/docs/changelog/hugegraph-0.2-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.2-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.2 Release Notes" linkTitle: "Release-0.2.4" draft: true -weight: 23 +weight: 33 --- ### API & Java Client diff --git a/content/cn/docs/changelog/hugegraph-0.2.4-release-notes.md b/content/cn/docs/changelog/hugegraph-0.2.4-release-notes.md index e826bc014..c80ef0b6e 100644 --- a/content/cn/docs/changelog/hugegraph-0.2.4-release-notes.md +++ 
b/content/cn/docs/changelog/hugegraph-0.2.4-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.2.4 Release Notes" linkTitle: "Release-0.2.4" draft: true -weight: 22 +weight: 31 --- ### API & Java Client diff --git a/content/cn/docs/changelog/hugegraph-0.3.3-release-notes.md b/content/cn/docs/changelog/hugegraph-0.3.3-release-notes.md index b114d1e2a..3bc00ee70 100644 --- a/content/cn/docs/changelog/hugegraph-0.3.3-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.3.3-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.3.3 Release Notes" linkTitle: "Release-0.3.3" draft: true -weight: 21 +weight: 29 --- ### API & Java Client diff --git a/content/cn/docs/changelog/hugegraph-0.4.4-release-notes.md b/content/cn/docs/changelog/hugegraph-0.4.4-release-notes.md index 93c12089a..4f3829be0 100644 --- a/content/cn/docs/changelog/hugegraph-0.4.4-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.4.4-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.4.4 Release Notes" linkTitle: "Release-0.4.4" draft: true -weight: 20 +weight: 27 --- ### API & Java Client diff --git a/content/cn/docs/changelog/hugegraph-0.5.6-release-notes.md b/content/cn/docs/changelog/hugegraph-0.5.6-release-notes.md index 0353b6c94..0fb199e4d 100644 --- a/content/cn/docs/changelog/hugegraph-0.5.6-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.5.6-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.5 Release Notes" linkTitle: "Release-0.5.6" draft: true -weight: 19 +weight: 25 --- ### API & Java Client diff --git a/content/cn/docs/changelog/hugegraph-0.6.1-release-notes.md b/content/cn/docs/changelog/hugegraph-0.6.1-release-notes.md index 57dae14f3..492ef4053 100644 --- a/content/cn/docs/changelog/hugegraph-0.6.1-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.6.1-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.6 Release Notes" linkTitle: "Release-0.6.1" draft: true -weight: 18 +weight: 23 --- ### API & Java Client diff --git a/content/cn/docs/changelog/hugegraph-0.7.4-release-notes.md b/content/cn/docs/changelog/hugegraph-0.7.4-release-notes.md index 115864755..4631aa004 100644 --- a/content/cn/docs/changelog/hugegraph-0.7.4-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.7.4-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.7 Release Notes" linkTitle: "Release-0.7.4" draft: true -weight: 17 +weight: 21 --- ### API & Java Client diff --git a/content/cn/docs/changelog/hugegraph-0.8.0-release-notes.md b/content/cn/docs/changelog/hugegraph-0.8.0-release-notes.md index ad50701f4..f72569a00 100644 --- a/content/cn/docs/changelog/hugegraph-0.8.0-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.8.0-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.8 Release Notes" linkTitle: "Release-0.8.0" draft: true -weight: 16 +weight: 19 --- ### API & Client diff --git a/content/cn/docs/changelog/hugegraph-0.9.2-release-notes.md b/content/cn/docs/changelog/hugegraph-0.9.2-release-notes.md index d6cdfa4d4..af3aaafe6 100644 --- a/content/cn/docs/changelog/hugegraph-0.9.2-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-0.9.2-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.9 Release Notes" linkTitle: "Release-0.9.2" draft: true -weight: 15 +weight: 17 --- ### API & Client diff --git a/content/cn/docs/changelog/hugegraph-1.0.0-release-notes.md b/content/cn/docs/changelog/hugegraph-1.0.0-release-notes.md index b2a8c84d0..1bb92988f 100644 --- a/content/cn/docs/changelog/hugegraph-1.0.0-release-notes.md +++ 
b/content/cn/docs/changelog/hugegraph-1.0.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.0.0 Release Notes" linkTitle: "Release-1.0.0" -weight: 2 +weight: 9 --- ### OLTP API & Client 更新 diff --git a/content/cn/docs/changelog/hugegraph-1.2.0-release-notes.md b/content/cn/docs/changelog/hugegraph-1.2.0-release-notes.md index 113842c36..739371a8b 100644 --- a/content/cn/docs/changelog/hugegraph-1.2.0-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-1.2.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.2.0 Release Notes" linkTitle: "Release-1.2.0" -weight: 3 +weight: 7 --- ### Java version statement diff --git a/content/cn/docs/changelog/hugegraph-1.3.0-release-notes.md b/content/cn/docs/changelog/hugegraph-1.3.0-release-notes.md index 3ea786e6c..a0cf3e79d 100644 --- a/content/cn/docs/changelog/hugegraph-1.3.0-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-1.3.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.3.0 Release Notes" linkTitle: "Release-1.3.0" -weight: 4 +weight: 5 --- ### 运行环境/版本说明 diff --git a/content/cn/docs/changelog/hugegraph-1.5.0-release-notes.md b/content/cn/docs/changelog/hugegraph-1.5.0-release-notes.md index 016d091fa..46ff320af 100644 --- a/content/cn/docs/changelog/hugegraph-1.5.0-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-1.5.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.5.0 Release Notes" linkTitle: "Release-1.5.0" -weight: 5 +weight: 3 --- > WIP: This doc is under construction, please wait for the final version (BETA) diff --git a/content/cn/docs/changelog/hugegraph-1.7.0-release-notes.md b/content/cn/docs/changelog/hugegraph-1.7.0-release-notes.md index 1a385c335..e9c08b686 100644 --- a/content/cn/docs/changelog/hugegraph-1.7.0-release-notes.md +++ b/content/cn/docs/changelog/hugegraph-1.7.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.7.0 Release Notes" linkTitle: "Release-1.7.0" -weight: 7 +weight: 1 --- > WIP: This doc is under construction, please wait for the final version (BETA) diff --git a/content/cn/docs/clients/gremlin-console.md b/content/cn/docs/clients/gremlin-console.md index 1d1103550..468bea58e 100644 --- a/content/cn/docs/clients/gremlin-console.md +++ b/content/cn/docs/clients/gremlin-console.md @@ -43,7 +43,7 @@ gremlin> > 这里的 `--` 会被 getopts 解析为最后一个 option,这样后面的 options 就可以传入 Gremlin-Console 进行处理了。`-i` 代表 `Execute the specified script and leave the console open on completion`,更多的选项可以参考 Gremlin-Console 的[源代码](https://github.com/apache/tinkerpop/blob/3.5.1/gremlin-console/src/main/groovy/org/apache/tinkerpop/gremlin/console/Console.groovy#L483)。 -其中 [`example.groovy`](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) 是 scripts 目录下的一个示例脚本,该脚本插入了一些数据,并在最后查询图中顶点和边的数量。 +其中 [`example.groovy`](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) 是 scripts 目录下的一个示例脚本,该脚本插入了一些数据,并在最后查询图中顶点和边的数量。 此时还可以继续输入 Gremlin 语句对图进行操作: diff --git a/content/cn/docs/clients/restful-api/_index.md b/content/cn/docs/clients/restful-api/_index.md index afdd9d830..67097fa59 100644 --- a/content/cn/docs/clients/restful-api/_index.md +++ b/content/cn/docs/clients/restful-api/_index.md @@ -9,7 +9,7 @@ weight: 1 > - HugeGraph 1.7.0+ 引入了图空间功能,API 路径格式为:`/graphspaces/{graphspace}/graphs/{graph}` > - HugeGraph 1.5.x 及之前版本使用旧路径:`/graphs/{graph}`, 以及创建/克隆图的 api 使用 text/plain 作为 Content-Type, 1.7.0 及之后使用 json > - 默认图空间名称为 `DEFAULT`,可直接使用 -> - 
旧版本 doc 参考:[HugeGraph 1.5.x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) +> - 旧版本 doc 参考:[HugeGraph 1.5.x RESTful API](https://github.com/apache/hugegraph-doc/tree/release-1.5.0) 除了下方的文档,你还可以通过 `localhost:8080/swagger-ui/index.html` 访问 `swagger-ui` 以查看 `RESTful API`。[示例可以参考此处](/cn/docs/quickstart/hugegraph/hugegraph-server#swaggerui-example) diff --git a/content/cn/docs/clients/restful-api/auth.md b/content/cn/docs/clients/restful-api/auth.md index 6c3c086aa..608756fa9 100644 --- a/content/cn/docs/clients/restful-api/auth.md +++ b/content/cn/docs/clients/restful-api/auth.md @@ -6,7 +6,7 @@ weight: 16 > **版本变更说明**: > - 1.7.0+: Auth API 路径使用 GraphSpace 格式,如 `/graphspaces/DEFAULT/auth/users`,且 group/target 等 id 格式与 name 一致(如 `admin`) -> - 1.5.x 及更早: Auth API 路径包含 graph 名称,group/target 等 id 格式类似 `-69:grant`。参考 [HugeGraph 1.5.x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) +> - 1.5.x 及更早: Auth API 路径包含 graph 名称,group/target 等 id 格式类似 `-69:grant`。参考 [HugeGraph 1.5.x RESTful API](https://github.com/apache/hugegraph-doc/tree/release-1.5.0) ### 10.1 用户认证与权限控制 @@ -26,7 +26,7 @@ city: Beijing}) ##### 接口说明: 用户认证与权限控制接口包括 5 类:UserAPI、GroupAPI、TargetAPI、BelongAPI、AccessAPI。 -**注意**: 1.5.0 及之前,group/target 等 id 的格式类似 -69:grant,1.7.0 及之后,id 和 name 一致,如 admin [HugeGraph 1.5.x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) +**注意**: 1.5.0 及之前,group/target 等 id 的格式类似 -69:grant,1.7.0 及之后,id 和 name 一致,如 admin [HugeGraph 1.5.x RESTful API](https://github.com/apache/hugegraph-doc/tree/release-1.5.0) ### 10.2 用户(User)API 用户接口包括:创建用户,删除用户,修改用户,和查询用户相关信息接口。 diff --git a/content/cn/docs/clients/restful-api/graphs.md b/content/cn/docs/clients/restful-api/graphs.md index 2dfec1c4b..4e92033f4 100644 --- a/content/cn/docs/clients/restful-api/graphs.md +++ b/content/cn/docs/clients/restful-api/graphs.md @@ -173,7 +173,7 @@ POST http://localhost:8080/graphspaces/DEFAULT/graphs/hugegraph-xx - 非鉴权模式:`"gremlin.graph": "org.apache.hugegraph.HugeFactory"` **注意**!! -1. 在 1.7.0 版本中,动态创建图会导致 NPE 错误。该问题已在 [PR#2912](https://github.com/apache/incubator-hugegraph/pull/2912) 中修复。当前 master 版本和 1.7.0 之前的版本不受此问题影响。 +1. 在 1.7.0 版本中,动态创建图会导致 NPE 错误。该问题已在 [PR#2912](https://github.com/apache/hugegraph/pull/2912) 中修复。当前 master 版本和 1.7.0 之前的版本不受此问题影响。 2. 
1.7.0 及之前版本,如果 backend 是 hstore,必须在请求体加上 "task.scheduler_type": "distributed"。同时请确保 HugeGraph-Server 已正确配置 PD,参见 [HStore 配置](/cn/docs/quickstart/hugegraph/hugegraph-server/#511-分布式存储hstore)。 **RocksDB 示例:** diff --git a/content/cn/docs/config/config-authentication.md b/content/cn/docs/config/config-authentication.md index d11f7db7d..fef920328 100644 --- a/content/cn/docs/config/config-authentication.md +++ b/content/cn/docs/config/config-authentication.md @@ -94,14 +94,14 @@ gremlin.graph=org.apache.hugegraph.auth.HugeFactoryAuthProxy 在鉴权配置完成后,需在首次执行 `init-store.sh` 时命令行中输入 `admin` 密码 (非 docker 部署模式下) 如果基于 docker 镜像部署或者已经初始化 HugeGraph 并需要转换为鉴权模式,需要删除相关图数据并重新启动 HugeGraph, 若图已有业务数据,暂时**无法直接转换**鉴权模式 (hugegraph 版本 <= 1.2.0) -> 对于该功能的改进已经在最新版本发布 (Docker latest 可用),可参考 [PR 2411](https://github.com/apache/incubator-hugegraph/pull/2411), 此时可无缝切换。 +> 对于该功能的改进已经在最新版本发布 (Docker latest 可用),可参考 [PR 2411](https://github.com/apache/hugegraph/pull/2411), 此时可无缝切换。 ```bash # stop the hugeGraph firstly bin/stop-hugegraph.sh # delete the store data (here we use the default path for rocksdb) -# Note: no need to delete data in the latest code (fixed in https://github.com/apache/incubator-hugegraph/pull/2411) +# Note: no need to delete data in the latest code (fixed in https://github.com/apache/hugegraph/pull/2411) rm -rf rocksdb-data/ # init store again diff --git a/content/cn/docs/contribution-guidelines/committer-guidelines.md b/content/cn/docs/contribution-guidelines/committer-guidelines.md index d32d93850..c756e36b1 100644 --- a/content/cn/docs/contribution-guidelines/committer-guidelines.md +++ b/content/cn/docs/contribution-guidelines/committer-guidelines.md @@ -9,7 +9,7 @@ weight: 5 # 候选人要求 1. 候选人应遵守 [Apache Code of Conduct](https://www.apache.org/foundation/policies/conduct.html) -2. PMC 成员将通过搜索[邮件列表](https://lists.apache.org/list?dev@hugegraph.apache.org)、[issues](https://github.com/apache/hugegraph/issues)、[PRs](https://github.com/apache/incubator-hugegraph/pulls)、[官网文档](https://hugegraph.apache.org/docs)等方式,了解候选人如何与他人互动,以及他们所做的贡献 +2. PMC 成员将通过搜索[邮件列表](https://lists.apache.org/list?dev@hugegraph.apache.org)、[issues](https://github.com/apache/hugegraph/issues)、[PRs](https://github.com/apache/hugegraph/pulls)、[官网文档](https://hugegraph.apache.org/docs)等方式,了解候选人如何与他人互动,以及他们所做的贡献 3. 以下是在评估候选人是否适合成为 Committer 时需要考虑的一些要点: 1. 与社区成员合作的能力 2. 担任导师的能力 @@ -73,33 +73,33 @@ Welcome everyone to share opinions~ Thanks! 
``` -对于讨论邮件中贡献链接,可以使用 [GitHub Search](https://github.com/search) 的统计功能,按需输入如下对应关键词查询即可,可以在此基础上添加新的 repo 如 `repo:apache/incubator-hugegraph-computer`,特别注意调整**时间范围** (下面是一个模板参考,请自行调整参数): +对于讨论邮件中贡献链接,可以使用 [GitHub Search](https://github.com/search) 的统计功能,按需输入如下对应关键词查询即可,可以在此基础上添加新的 repo 如 `repo:apache/hugegraph-computer`,特别注意调整**时间范围** (下面是一个模板参考,请自行调整参数): - PR 提交次数 - - `is:pr author:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `is:pr author:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - 代码提交/修改行数 - - https://github.com/apache/incubator-hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - - https://github.com/apache/incubator-hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - PR 提交关联 Issue 次数 - - `linked:issue involves:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `linked:issue involves:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - PR Review 个数 - - `type:pr reviewed-by:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:pr reviewed-by:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - PR Review 行数 - 合并次数 - - `type:pr author:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:pr author:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - 有效合并行数 - - https://github.com/apache/incubator-hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - - https://github.com/apache/incubator-hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - Issue 提交数 - - `type:issue author:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:issue author:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Issue 修复数 - 在 Issue 提交数的基础上选取状态为 closed 的 Issues - Issue 参与数 - - `type:issue involves:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:issue involves:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - 评论 Issue 数 - - `type:issue commenter:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:issue commenter:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - 评论 PR 数 - - `type:pr commenter:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:pr commenter:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` Mailing Lists 的参与则可使用 https://lists.apache.org/list?dev@hugegraph.apache.org:lte=10M:xxx 查询。 diff --git a/content/cn/docs/contribution-guidelines/contribute.md 
b/content/cn/docs/contribution-guidelines/contribute.md index 05b6a598a..83884b8c6 100644 --- a/content/cn/docs/contribution-guidelines/contribute.md +++ b/content/cn/docs/contribution-guidelines/contribute.md @@ -22,7 +22,7 @@ Before submitting the code, we need to do some preparation: 1. Sign up or login to GitHub: [https://github.com](https://github.com) -2. Fork HugeGraph repo from GitHub: [https://github.com/apache/incubator-hugegraph/fork](https://github.com/apache/hugegraph/fork) +2. Fork HugeGraph repo from GitHub: [https://github.com/apache/hugegraph/fork](https://github.com/apache/hugegraph/fork) 3. Clone code from fork repo to local: [https://github.com/${GITHUB_USER_NAME}/hugegraph](https://github.com/${GITHUB_USER_NAME}/hugegraph) @@ -46,7 +46,7 @@ Before submitting the code, we need to do some preparation: ## 2. Create an Issue on GitHub -If you encounter bugs or have any questions, please go to [GitHub Issues](https://github.com/apache/incubator-hugegraph/issues) to report them and feel free to [create an issue](https://github.com/apache/hugegraph/issues/new). +If you encounter bugs or have any questions, please go to [GitHub Issues](https://github.com/apache/hugegraph/issues) to report them and feel free to [create an issue](https://github.com/apache/hugegraph/issues/new). ## 3. Make changes of code locally @@ -79,10 +79,10 @@ Note: In order to be consistent with the code style easily, if you use [IDEA](ht ##### 3.2.1 添加第三方依赖 如果我们要在 `HugeGraph` 项目中添加新的第三方依赖, 我们需要做下面的几件事情: -1. 找到第三方依赖的仓库,将依赖的 `license` 文件放到 [./hugegraph-dist/release-docs/licenses/](https://github.com/apache/incubator-hugegraph/tree/master/hugegraph-server/hugegraph-dist/release-docs/licenses) 路径下。 -2. 在[./hugegraph-dist/release-docs/LICENSE](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/LICENSE) 中声明该依赖的 `LICENSE` 信息。 -3. 找到仓库里的 NOTICE 文件,将其追加到 [./hugegraph-dist/release-docs/NOTICE](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/NOTICE) 文件后面(如果没有NOTICE文件则跳过这一步)。 -4. 本地执行[./hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh) 脚本来更新依赖列表[known-dependencies.txt](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/known-dependencies.txt) (或者手动更新)。 +1. 找到第三方依赖的仓库,将依赖的 `license` 文件放到 [./hugegraph-dist/release-docs/licenses/](https://github.com/apache/hugegraph/tree/master/hugegraph-server/hugegraph-dist/release-docs/licenses) 路径下。 +2. 在[./hugegraph-dist/release-docs/LICENSE](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/LICENSE) 中声明该依赖的 `LICENSE` 信息。 +3. 找到仓库里的 NOTICE 文件,将其追加到 [./hugegraph-dist/release-docs/NOTICE](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/NOTICE) 文件后面(如果没有NOTICE文件则跳过这一步)。 +4. 
本地执行[./hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh) 脚本来更新依赖列表[known-dependencies.txt](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/known-dependencies.txt) (或者手动更新)。 **例如**:在项目中引入了第三方新依赖 -> `ant-1.9.1.jar` - 项目源码位于:https://github.com/apache/ant/tree/rel/1.9.1 diff --git a/content/cn/docs/contribution-guidelines/hugegraph-server-idea-setup.md b/content/cn/docs/contribution-guidelines/hugegraph-server-idea-setup.md index 502350ca8..a619f2e18 100644 --- a/content/cn/docs/contribution-guidelines/hugegraph-server-idea-setup.md +++ b/content/cn/docs/contribution-guidelines/hugegraph-server-idea-setup.md @@ -4,7 +4,7 @@ linkTitle: "在 IDEA 中配置 Server 开发环境" weight: 4 --- -> 注意:下述配置仅供参考,基于[这个版本](https://github.com/apache/incubator-hugegraph/commit/a946ad1de4e8f922251a5241ffc957c33379677f),在 Linux 和 macOS 平台下进行了测试。 +> 注意:下述配置仅供参考,基于[这个版本](https://github.com/apache/hugegraph/commit/a946ad1de4e8f922251a5241ffc957c33379677f),在 Linux 和 macOS 平台下进行了测试。 ### 背景 @@ -17,7 +17,7 @@ weight: 4 2. 启动 HugeGraph-Server,执行 `HugeGraphServer` 类加载初始化的图信息启动 在执行下述流程之前,请确保已经克隆了 HugeGraph 的源代码,并且已经配置了 Java 11 环境 & 可以参考这个 -[配置文档](https://github.com/apache/incubator-hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) +[配置文档](https://github.com/apache/hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) ```bash git clone https://github.com/apache/hugegraph.git @@ -57,7 +57,7 @@ rocksdb.wal_path=. - LD_LIBRARY_PATH=/path/to/your/library:$LD_LIBRARY_PATH - LD_PRELOAD=libjemalloc.so:librocksdbjni-linux64.so -> 若在 **Java 11** 环境下为 HugeGraph-Server 配置了**用户认证** (authenticator),需要参考二进制包的脚本[配置](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/init-store.sh#L52),添加下述 **VM options**: +> 若在 **Java 11** 环境下为 HugeGraph-Server 配置了**用户认证** (authenticator),需要参考二进制包的脚本[配置](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/init-store.sh#L52),添加下述 **VM options**: > > ```bash > --add-exports=java.base/jdk.internal.reflect=ALL-UNNAMED @@ -93,7 +93,7 @@ rocksdb.wal_path=. - 将 `Main class` 设置为 `org.apache.hugegraph.dist.HugeGraphServer` - 设置运行参数为 `conf/gremlin-server.yaml conf/rest-server.properties`,同样地,这里的路径是相对于工作路径的,需要将工作路径设置为 `path-to-your-directory` -> 类似的,若在 **Java 11** 环境下为 HugeGraph-Server 配置了**用户认证** (authenticator),同样需要参考二进制包的脚本[配置](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/hugegraph-server.sh#L124),添加下述 **VM options**: +> 类似的,若在 **Java 11** 环境下为 HugeGraph-Server 配置了**用户认证** (authenticator),同样需要参考二进制包的脚本[配置](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/hugegraph-server.sh#L124),添加下述 **VM options**: > > ```bash > --add-exports=java.base/jdk.internal.reflect=ALL-UNNAMED --add-modules=jdk.unsupported --add-exports=java.base/sun.nio.ch=ALL-UNNAMED @@ -171,4 +171,4 @@ curl "http://localhost:8080/graphs/hugegraph/graph/vertices" | gunzip 2. [hugegraph-server 本地调试文档 (Win/Unix)](https://gist.github.com/imbajin/1661450f000cd62a67e46d4f1abfe82c) 3. ["package sun.misc does not exist" compilation error](https://youtrack.jetbrains.com/issue/IDEA-180033) 4. [Cannot compile: java: package sun.misc does not exist](https://youtrack.jetbrains.com/issue/IDEA-201168) -5. 
[The code-style config for HugeGraph in IDEA](https://github.com/apache/incubator-hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) +5. [The code-style config for HugeGraph in IDEA](https://github.com/apache/hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) diff --git a/content/cn/docs/download/download.md b/content/cn/docs/download/download.md index d936e2751..834f5d070 100644 --- a/content/cn/docs/download/download.md +++ b/content/cn/docs/download/download.md @@ -1,5 +1,5 @@ --- -title: "下载 Apache HugeGraph (Incubating)" +title: "下载 Apache HugeGraph" linkTitle: "Download" weight: 2 --- @@ -9,7 +9,7 @@ weight: 2 > - 推荐使用最新版本的 HugeGraph 软件包, 运行时环境请选择 Java11 > - 验证下载版本, 请使用相应的哈希 (SHA512)、签名和 [项目签名验证 KEYS](https://downloads.apache.org/incubator/hugegraph/KEYS) > - 检查哈希 (SHA512)、签名的说明在 [版本验证](/docs/contribution-guidelines/validate-release/) 页面, 也可参考 [ASF 验证说明](https://www.apache.org/dyn/closer.cgi#verify) -> - 注: HugeGraph 所有组件版本号已保持一致, `client/loader/hubble/common` 等 maven 仓库版本号同理, 依赖引用可参考 [maven 示例](https://github.com/apache/incubator-hugegraph-toolchain#maven-dependencies) +> - 注: HugeGraph 所有组件版本号已保持一致, `client/loader/hubble/common` 等 maven 仓库版本号同理, 依赖引用可参考 [maven 示例](https://github.com/apache/hugegraph-toolchain#maven-dependencies) ### 最新版本 1.7.0 diff --git a/content/cn/docs/introduction/README.md b/content/cn/docs/introduction/README.md index c228893dc..301712cde 100644 --- a/content/cn/docs/introduction/README.md +++ b/content/cn/docs/introduction/README.md @@ -8,6 +8,7 @@ weight: 1 Apache HugeGraph 是一款易用、高效、通用的开源图数据库系统(Graph Database,[GitHub 项目地址](https://github.com/apache/hugegraph)), 实现了[Apache TinkerPop3](https://tinkerpop.apache.org)框架及完全兼容[Gremlin](https://tinkerpop.apache.org/gremlin.html)查询语言, +同时支持 [Cypher](https://opencypher.org/) 查询语言(OpenCypher 标准), 具备完善的工具链组件,助力用户轻松构建基于图数据库之上的应用和产品。HugeGraph 支持百亿以上的顶点和边快速导入,并提供毫秒级的关联关系查询能力(OLTP), 并支持大规模分布式图分析(OLAP)。 @@ -19,17 +20,43 @@ HugeGraph 典型应用场景包括深度关系探索、关联分析、路径搜 ### Features HugeGraph 支持在线及离线环境下的图操作,支持批量导入数据,支持高效的复杂关联关系分析,并且能够与大数据平台无缝集成。 -HugeGraph 支持多用户并行操作,用户可输入 Gremlin 查询语句,并及时得到图查询结果,也可在用户程序中调用 HugeGraph API 进行图分析或查询。 +HugeGraph 支持多用户并行操作,用户可输入 Gremlin/Cypher 查询语句,并及时得到图查询结果,也可在用户程序中调用 HugeGraph API 进行图分析或查询。 -本系统具备如下特点: +本系统具备如下特点: -- 易用:HugeGraph 支持 Gremlin 图查询语言与 RESTful API,同时提供图检索常用接口,具备功能齐全的周边工具,轻松实现基于图的各种查询分析运算。 +- 易用:HugeGraph 支持 Gremlin/Cypher 图查询语言与 RESTful API,同时提供图检索常用接口,具备功能齐全的周边工具,轻松实现基于图的各种查询分析运算。 - 高效:HugeGraph 在图存储和图计算方面做了深度优化,提供多种批量导入工具,轻松完成百亿级数据快速导入,通过优化过的查询达到图检索的毫秒级响应。支持数千用户并发的在线实时操作。 - 通用:HugeGraph 支持 Apache Gremlin 标准图查询语言和 Property Graph 标准图建模方法,支持基于图的 OLTP 和 OLAP 方案。集成 Apache Hadoop 及 Apache Spark 大数据平台。 - 可扩展:支持分布式存储、数据多副本及横向扩容,内置多种后端存储引擎,也可插件式轻松扩展后端存储引擎。 - 开放:HugeGraph 代码开源(Apache 2 License),客户可自主修改定制,选择性回馈开源社区。 -本系统的功能包括但不限于: +### 部署模式 + +HugeGraph 支持多种部署模式,满足不同规模和场景的需求: + +**单机模式 (Standalone)** +- Server + RocksDB 后端存储 +- 适合开发测试和中小规模数据(< 1TB) +- Docker 快速启动: `docker run hugegraph/hugegraph` +- 详见 [Server 快速开始](/cn/docs/quickstart/hugegraph-server/hugegraph-server) + +**分布式模式 (Distributed)** +- HugeGraph-PD: 元数据管理和集群调度 +- HugeGraph-Store (HStore): 分布式存储引擎 +- 支持水平扩展和高可用(100GB+ 数据规模) +- 适合生产环境和大规模图数据应用 + +### 快速入门指南 + +| 使用场景 | 推荐路径 | +|---------|---------| +| 快速体验 | [Docker 部署](/cn/docs/quickstart/hugegraph-server/hugegraph-server#docker) | +| 构建 OLTP 应用 | Server → REST API / Gremlin / Cypher | +| 图分析 (OLAP) | [Vermeer](/cn/docs/quickstart/computing/hugegraph-computer) (推荐) 或 Computer | +| 构建 AI 应用 | [HugeGraph-AI](/cn/docs/quickstart/hugegraph-ai) (GraphRAG/知识图谱) | +| 批量导入数据 | 
[Loader](/cn/docs/quickstart/toolchain/hugegraph-loader) + [Hubble](/cn/docs/quickstart/toolchain/hugegraph-hubble) | + +### 功能特性 - 支持从多数据源批量导入数据 (包括本地文件、HDFS 文件、MySQL 数据库等数据源),支持多种文件格式导入 (包括 TXT、CSV、JSON 等格式) - 具备可视化操作界面,可用于操作、分析及展示图,降低用户使用门槛 @@ -50,20 +77,20 @@ HugeGraph 支持多用户并行操作,用户可输入 Gremlin 查询语句, - Backend:实现将图数据存储到后端,支持的后端包括:Memory、Cassandra、ScyllaDB、RocksDB、HBase、MySQL 及 PostgreSQL,用户根据实际情况选择一种即可; - API:内置 REST Server,向用户提供 RESTful API,同时完全兼容 Gremlin 查询。(支持分布式存储和计算下推) - [HugeGraph-Toolchain](https://github.com/apache/hugegraph-toolchain): (工具链) - - [HugeGraph-Client](/cn/docs/quickstart/client/hugegraph-client):HugeGraph-Client 提供了 RESTful API 的客户端,用于连接 HugeGraph-Server,目前仅实现 Java 版,其他语言用户可自行实现; + - [HugeGraph-Client](/cn/docs/quickstart/client/hugegraph-client):HugeGraph-Client 提供了 RESTful API 的客户端,用于连接 HugeGraph-Server,支持 Java/Python/Go 多语言版本; - [HugeGraph-Loader](/cn/docs/quickstart/toolchain/hugegraph-loader):HugeGraph-Loader 是基于 HugeGraph-Client 的数据导入工具,将普通文本数据转化为图形的顶点和边并插入图形数据库中; - - [HugeGraph-Hubble](/cn/docs/quickstart/toolchain/hugegraph-hubble):HugeGraph-Hubble 是 HugeGraph 的 Web + - [HugeGraph-Hubble](/cn/docs/quickstart/toolchain/hugegraph-hubble):HugeGraph-Hubble 是 HugeGraph 的 Web 可视化管理平台,一站式可视化分析平台,平台涵盖了从数据建模,到数据快速导入,再到数据的在线、离线分析、以及图的统一管理的全过程; - [HugeGraph-Tools](/cn/docs/quickstart/toolchain/hugegraph-tools):HugeGraph-Tools 是 HugeGraph 的部署和管理工具,包括管理图、备份/恢复、Gremlin 执行等功能。 -- [HugeGraph-Computer](/cn/docs/quickstart/computing/hugegraph-computer):HugeGraph-Computer 是分布式图处理系统 (OLAP). - 它是 [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf) 的一个实现。它可以运行在 Kubernetes/Yarn - 等集群上,支持超大规模图计算。 -- [HugeGraph-AI](/cn/docs/quickstart/hugegraph-ai):HugeGraph-AI 是 HugeGraph 独立的 AI - 组件,提供了图神经网络的训练和推理功能,LLM/Graph RAG 结合/Python-Client 等相关组件,持续更新 ing。 +- [HugeGraph-Computer](/cn/docs/quickstart/computing/hugegraph-computer):HugeGraph-Computer 是分布式图处理系统 (OLAP)。 + 它是 [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf) 的一个实现。它可以运行在 Kubernetes/Yarn + 等集群上,支持超大规模图计算。同时提供 Vermeer 轻量级图计算引擎,适合快速开始和中小规模图分析。 +- [HugeGraph-AI](/cn/docs/quickstart/hugegraph-ai):HugeGraph-AI 是 HugeGraph 独立的 AI + 组件,提供 LLM/GraphRAG 智能问答、自动化知识图谱构建、图神经网络训练/推理、Python-Client 等功能,内置 20+ 图机器学习算法,持续更新中。 ### Contact Us -- [GitHub Issues](https://github.com/apache/incubator-hugegraph/issues): 使用途中出现问题或提供功能性建议,可通过此反馈 (推荐) +- [GitHub Issues](https://github.com/apache/hugegraph/issues): 使用途中出现问题或提供功能性建议,可通过此反馈 (推荐) - 邮件反馈:[dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org) ([邮箱订阅方式](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/)) - SEC 反馈: [security@hugegraph.apache.org](mailto:security@hugegraph.apache.org) (报告安全相关问题) - 微信公众号:Apache HugeGraph, 欢迎扫描下方二维码加入我们! 
diff --git a/content/cn/docs/quickstart/client/hugegraph-client-go.md b/content/cn/docs/quickstart/client/hugegraph-client-go.md index 9778cf52e..11c7eb5bb 100644 --- a/content/cn/docs/quickstart/client/hugegraph-client-go.md +++ b/content/cn/docs/quickstart/client/hugegraph-client-go.md @@ -13,7 +13,7 @@ weight: 3 ## 安装教程 ```shell -go get github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go +go get github.com/apache/hugegraph-toolchain/hugegraph-client-go ``` ## 已实现 API @@ -34,8 +34,8 @@ import ( "log" "os" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go/hgtransport" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go/hgtransport" ) func main() { @@ -73,8 +73,8 @@ import ( "log" "os" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go/hgtransport" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go/hgtransport" ) // initClient 初始化并返回一个 HugeGraph 客户端实例 diff --git a/content/cn/docs/quickstart/client/hugegraph-client.md b/content/cn/docs/quickstart/client/hugegraph-client.md index 9322aabfa..9480f89fe 100644 --- a/content/cn/docs/quickstart/client/hugegraph-client.md +++ b/content/cn/docs/quickstart/client/hugegraph-client.md @@ -11,7 +11,7 @@ weight: 1 用户可以使用 [Client-API](/cn/docs/clients/hugegraph-client) 编写代码操作 HugeGraph,比如元数据和图数据的增删改查,或者执行 gremlin 语句等。 后文主要是 Java 使用示例 (其他语言 SDK 可参考对应 `READEME` 页面) -> 现在已经支持基于 Go/Python 语言的 HugeGraph [Client SDK](https://github.com/apache/incubator-hugegraph-toolchain/blob/master/hugegraph-client-go/README.md) (version >=1.2.0) +> 现在已经支持基于 Go/Python 语言的 HugeGraph [Client SDK](https://github.com/apache/hugegraph-toolchain/blob/master/hugegraph-client-go/README.md) (version >=1.2.0) ### 2 环境要求 diff --git a/content/cn/docs/quickstart/computing/_index.md b/content/cn/docs/quickstart/computing/_index.md index 8777af8c9..2e9ff0c89 100644 --- a/content/cn/docs/quickstart/computing/_index.md +++ b/content/cn/docs/quickstart/computing/_index.md @@ -4,8 +4,8 @@ linkTitle: "HugeGraph Computing (OLAP)" weight: 4 --- -## 🚀 最佳实践:优先使用 DeepWiki 智能文档 +## 推荐:使用 DeepWiki 文档 -> 为解决静态文档可能过时的问题,我们提供了 **实时更新、内容更全面** 的 DeepWiki。它相当于一个拥有项目最新知识的专家,非常适合**所有开发者**在开始项目前阅读和咨询。 +> DeepWiki 提供实时更新的项目文档,内容更全面准确,适合快速了解项目最新情况。 -**👉 强烈推荐访问并对话:**[**incubator-hugegraph-computer**](https://deepwiki.com/apache/incubator-hugegraph-computer) \ No newline at end of file +**访问链接:**[**hugegraph-computer**](https://deepwiki.com/apache/hugegraph-computer) \ No newline at end of file diff --git a/content/cn/docs/quickstart/computing/hugegraph-computer.md b/content/cn/docs/quickstart/computing/hugegraph-computer.md index 4c9dd7237..fa30a4ce8 100644 --- a/content/cn/docs/quickstart/computing/hugegraph-computer.md +++ b/content/cn/docs/quickstart/computing/hugegraph-computer.md @@ -6,7 +6,7 @@ weight: 2 ## 1 HugeGraph-Computer 概述 -[`HugeGraph-Computer`](https://github.com/apache/incubator-hugegraph-computer) 是分布式图处理系统 (OLAP). 它是 [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf)的一个实现。它可以运行在 Kubernetes(K8s)/Yarn 上。(它侧重可支持百亿~千亿的图数据量下进行图计算, 会使用磁盘进行排序和加速, 这是它和 Vermeer 相对最大的区别之一) +[`HugeGraph-Computer`](https://github.com/apache/hugegraph-computer) 是分布式图处理系统 (OLAP). 
它是 [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf)的一个实现。它可以运行在 Kubernetes(K8s)/Yarn 上。(它侧重可支持百亿~千亿的图数据量下进行图计算, 会使用磁盘进行排序和加速, 这是它和 Vermeer 相对最大的区别之一) ### 特性 diff --git a/content/cn/docs/quickstart/computing/hugegraph-vermeer.md b/content/cn/docs/quickstart/computing/hugegraph-vermeer.md index bae1aa93a..e725fe927 100644 --- a/content/cn/docs/quickstart/computing/hugegraph-vermeer.md +++ b/content/cn/docs/quickstart/computing/hugegraph-vermeer.md @@ -130,7 +130,7 @@ docker network rm vermeer_network 3. **方案三:从源码构建** -构建。具体请参照 [Vermeer Readme](https://github.com/apache/incubator-hugegraph-computer/tree/master/vermeer)。 +构建。具体请参照 [Vermeer Readme](https://github.com/apache/hugegraph-computer/tree/master/vermeer)。 ```shell go build diff --git a/content/cn/docs/quickstart/hugegraph-ai/_index.md b/content/cn/docs/quickstart/hugegraph-ai/_index.md index 01906f0d7..5777bde6d 100644 --- a/content/cn/docs/quickstart/hugegraph-ai/_index.md +++ b/content/cn/docs/quickstart/hugegraph-ai/_index.md @@ -7,11 +7,11 @@ weight: 3 [![License](https://img.shields.io/badge/license-Apache%202-0E78BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/apache/incubator-hugegraph-ai) -## 🚀 最佳实践:优先使用 DeepWiki 智能文档 +## 推荐:使用 DeepWiki 文档 -> 为解决静态文档可能过时的问题,我们提供了 **实时更新、内容更全面** 的 DeepWiki。它相当于一个拥有项目最新知识的专家,非常适合**所有开发者**在开始项目前阅读和咨询。 +> DeepWiki 提供实时更新的项目文档,内容更全面准确,适合快速了解项目最新情况。 -**👉 强烈推荐访问并对话:**[**incubator-hugegraph-ai**](https://deepwiki.com/apache/incubator-hugegraph-ai) +**访问链接:**[**incubator-hugegraph-ai**](https://deepwiki.com/apache/incubator-hugegraph-ai) `hugegraph-ai` 整合了 [HugeGraph](https://github.com/apache/hugegraph) 与人工智能功能,为开发者构建 AI 驱动的图应用提供全面支持。 diff --git a/content/cn/docs/quickstart/hugegraph/_index.md b/content/cn/docs/quickstart/hugegraph/_index.md index f64d0adcf..a7d5fa164 100644 --- a/content/cn/docs/quickstart/hugegraph/_index.md +++ b/content/cn/docs/quickstart/hugegraph/_index.md @@ -4,8 +4,8 @@ linkTitle: "HugeGraph (OLTP)" weight: 1 --- -## 🚀 最佳实践:优先使用 DeepWiki 智能文档 +> DeepWiki 提供实时更新的项目文档,内容更全面准确,适合快速了解项目最新情况。 +> +> 📖 [https://deepwiki.com/apache/hugegraph](https://deepwiki.com/apache/hugegraph) -> 为解决静态文档可能过时的问题,我们提供了 **实时更新、内容更全面** 的 DeepWiki。它相当于一个拥有项目最新知识的专家,非常适合**所有开发者**在开始项目前阅读和咨询。 - -**👉 强烈推荐访问并对话:**[**incubator-hugegraph**](https://deepwiki.com/apache/incubator-hugegraph) \ No newline at end of file +**GitHub 访问:** [https://github.com/apache/hugegraph](https://github.com/apache/hugegraph) \ No newline at end of file diff --git a/content/cn/docs/quickstart/hugegraph/hugegraph-server.md b/content/cn/docs/quickstart/hugegraph/hugegraph-server.md index b9daaa5f2..9aad3501c 100644 --- a/content/cn/docs/quickstart/hugegraph/hugegraph-server.md +++ b/content/cn/docs/quickstart/hugegraph/hugegraph-server.md @@ -39,7 +39,7 @@ Core 模块是 Tinkerpop 接口的实现,Backend 模块用于管理数据存 #### 3.1 使用 Docker 容器 (便于**测试**) -可参考 [Docker 部署方式](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/README.md)。 +可参考 [Docker 部署方式](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/README.md)。 我们可以使用 `docker run -itd --name=server -p 8080:8080 -e PASSWORD=xxx hugegraph/hugegraph:1.7.0` 去快速启动一个内置了 `RocksDB` 的 `Hugegraph server`. 
@@ -599,7 +599,7 @@ Connecting to HugeGraphServer (http://127.0.0.1:8080/graphs)......OK 在使用 Docker 的时候,我们可以使用 Cassandra 作为后端存储。我们更加推荐直接使用 docker-compose 来对于 server 以及 Cassandra 进行统一管理 -样例的 `docker-compose.yml` 可以在 [github](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/example/docker-compose-cassandra.yml) 中获取,使用 `docker-compose up -d` 启动。(如果使用 cassandra 4.0 版本作为后端存储,则需要大约两个分钟初始化,请耐心等待) +样例的 `docker-compose.yml` 可以在 [github](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/example/docker-compose-cassandra.yml) 中获取,使用 `docker-compose up -d` 启动。(如果使用 cassandra 4.0 版本作为后端存储,则需要大约两个分钟初始化,请耐心等待) ```yaml version: "3" @@ -666,7 +666,7 @@ volumes: 2. 使用`docker-compose` - 创建`docker-compose.yml`,具体文件如下,在环境变量中设置 PRELOAD=true。其中,[`example.groovy`](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) 是一个预定义的脚本,用于预加载样例数据。如果有需要,可以通过挂载新的 `example.groovy` 脚本改变预加载的数据。 + 创建`docker-compose.yml`,具体文件如下,在环境变量中设置 PRELOAD=true。其中,[`example.groovy`](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) 是一个预定义的脚本,用于预加载样例数据。如果有需要,可以通过挂载新的 `example.groovy` 脚本改变预加载的数据。 ```yaml version: '3' diff --git a/content/cn/docs/quickstart/toolchain/_index.md b/content/cn/docs/quickstart/toolchain/_index.md index 776d935b2..9ab490e6e 100644 --- a/content/cn/docs/quickstart/toolchain/_index.md +++ b/content/cn/docs/quickstart/toolchain/_index.md @@ -6,8 +6,8 @@ weight: 2 > **测试指南**:如需在本地运行工具链测试,请参考 [HugeGraph 工具链本地测试指南](/cn/docs/guides/toolchain-local-test) -## 🚀 最佳实践:优先使用 DeepWiki 智能文档 +## 推荐:使用 DeepWiki 文档 -> 为解决静态文档可能过时的问题,我们提供了 **实时更新、内容更全面** 的 DeepWiki。它相当于一个拥有项目最新知识的专家,非常适合**所有开发者**在开始项目前阅读和咨询。 +> DeepWiki 提供实时更新的项目文档,内容更全面准确,适合快速了解项目最新情况。 -**👉 强烈推荐访问并对话:**[**incubator-hugegraph-toolchain**](https://deepwiki.com/apache/incubator-hugegraph-toolchain) +**访问链接:**[**hugegraph-toolchain**](https://deepwiki.com/apache/hugegraph-toolchain) diff --git a/content/cn/docs/quickstart/toolchain/hugegraph-loader.md b/content/cn/docs/quickstart/toolchain/hugegraph-loader.md index d10419ed5..5c1040331 100644 --- a/content/cn/docs/quickstart/toolchain/hugegraph-loader.md +++ b/content/cn/docs/quickstart/toolchain/hugegraph-loader.md @@ -915,7 +915,7 @@ bin/hugegraph-loader -g {GRAPH_NAME} -f ${INPUT_DESC_FILE} -s ${SCHEMA_FILE} -h ### 4 完整示例 -下面给出的是 hugegraph-loader 包中 example 目录下的例子。([GitHub 地址](https://github.com/apache/incubator-hugegraph-toolchain/tree/master/hugegraph-loader/assembly/static/example/file)) +下面给出的是 hugegraph-loader 包中 example 目录下的例子。([GitHub 地址](https://github.com/apache/hugegraph-toolchain/tree/master/hugegraph-loader/assembly/static/example/file)) #### 4.1 准备数据 diff --git a/content/en/_index.html b/content/en/_index.html index 564b91641..eaee11d99 100644 --- a/content/en/_index.html +++ b/content/en/_index.html @@ -10,8 +10,6 @@
 Apache HugeGraph
-
-          Incubating
 }}"> Learn More
@@ -19,19 +17,17 @@
 Apache Download
-HugeGraph is a convenient, efficient, and adaptable graph database
-compatible with the Apache TinkerPop3 framework and the Gremlin query language.
+HugeGraph is a full-stack graph database system
+providing complete graph data processing capabilities from storage, real-time querying to offline analysis, supporting both Gremlin and Cypher query languages.
 {{< blocks/link-down color="info" >}}
 {{< /blocks/cover >}}
 {{% blocks/lead color="primary" %}}
-HugeGraph supports fast import performance in the case of more than 10 billion Vertices and Edges
-Graph, millisecond-level OLTP query capability, and large-scale distributed
-graph processing (OLAP). The main scenarios of HugeGraph include
-correlation search, fraud detection, and knowledge graph.
+HugeGraph supports high-speed import of billions of graph data and millisecond-level real-time queries,
+with deep integration with big data platforms like Spark and Flink. In the AI era, combined with Large Language Models (LLMs),
+it provides powerful graph computing capabilities for intelligent Q&A, recommendation systems, fraud detection, knowledge graphs and more.
 {{% /blocks/lead %}}

 {{< blocks/section color="dark" >}}
@@ -52,12 +48,27 @@
 Apache
 {{% /blocks/feature %}}

+{{% blocks/feature icon="fa-brain" title="AI-Ready" %}}
+Integrates LLM for GraphRAG intelligent Q&A, automated knowledge graph construction, with 20+ built-in graph machine learning algorithms to easily build AI-driven graph applications.
+{{% /blocks/feature %}}
+
+{{% blocks/feature icon="fa-expand-arrows-alt" title="Scalable" %}}
+Supports horizontal scaling and distributed deployment, seamlessly migrating from standalone to PB-level clusters, with multiple storage engine options for different scale and performance requirements.
+{{% /blocks/feature %}}
+
+{{% blocks/feature icon="fa-puzzle-piece" title="Open Ecosystem" %}}
+Adheres to Apache TinkerPop standards, provides multi-language clients (Java, Python, Go), compatible with mainstream big data platforms, with an active and continuously evolving community.
+{{% /blocks/feature %}}
+
 {{< /blocks/section >}}

 {{< blocks/section color="blue-deep">}}
-The first graph database project in Apache
+The First Apache Foundation Top-Level Graph Project
 {{< /blocks/section >}}

@@ -66,12 +77,12 @@ The first graph database project in Apache
{{< blocks/section >}} {{% blocks/feature icon="far fa-tools" title="Get The **Toolchain**" %}} -[It](https://github.com/apache/incubator-hugegraph-toolchain) includes graph loader & dashboard & backup tools +[It](https://github.com/apache/hugegraph-toolchain) includes graph loader & dashboard & backup tools {{% /blocks/feature %}} -{{% blocks/feature icon="fab fa-github" title="Efficient" url="https://github.com/apache/incubator-hugegraph" %}} -We do a [Pull Request](https://github.com/apache/incubator-hugegraph/pulls) contributions workflow on **GitHub**. New users are always welcome! +{{% blocks/feature icon="fab fa-github" title="Efficient" url="https://github.com/apache/hugegraph" %}} +We do a [Pull Request](https://github.com/apache/hugegraph/pulls) contributions workflow on **GitHub**. New users are always welcome! {{% /blocks/feature %}} diff --git a/content/en/community/maturity.md b/content/en/community/maturity.md index 5eb81b7ed..87c1e87e6 100644 --- a/content/en/community/maturity.md +++ b/content/en/community/maturity.md @@ -46,7 +46,7 @@ The following table is filled according to the [Apache Maturity Model](https://c | **RE20** | The project's PMC (Project Management Committee, see CS10) approves each software release in order to make the release an act of the Foundation.                                                                                                                                                                          | **YES** All releases are voted on by the PMC on the dev@hugegraph.apache.org mailing list. | | **RE30** | Releases are signed and/or distributed along with digests that anyone can reliably use to validate the downloaded archives.                                                                                                                                                       | **YES** All releases are signed by the release manager and distributed with checksums. The [KEYS](https://downloads.apache.org/incubator/hugegraph/KEYS) file is available for verification. | | **RE40** | The project can distribute convenience binaries alongside source code, but they are not Apache Releases, they are provided with no guarantee. | **YES** The project provides convenience binaries, but only the source code archive is an official Apache release. | -| **RE50** | The project documents a repeatable release process so that someone new to the project can independently generate the complete set of artifacts required for a release. | **YES** The project documents its release process in the [How to Release](https://github.com/apache/incubator-hugegraph/wiki/ASF-Release-Guidance-V2.0) guide. | +| **RE50** | The project documents a repeatable release process so that someone new to the project can independently generate the complete set of artifacts required for a release. | **YES** The project documents its release process in the [How to Release](https://github.com/apache/hugegraph/wiki/ASF-Release-Guidance-V2.0) guide. | ### Quality diff --git a/content/en/docs/_index.md b/content/en/docs/_index.md index bdfa1b9ac..e63e3e592 100755 --- a/content/en/docs/_index.md +++ b/content/en/docs/_index.md @@ -7,4 +7,52 @@ menu: weight: 20 --- -Welcome to HugeGraph docs +## Apache HugeGraph Documentation + +Apache HugeGraph is a complete graph database ecosystem, supporting OLTP real-time queries, OLAP offline analysis, and AI intelligent applications. + +### Quick Navigation by Scenario + +| I want to... 
| Start here | +|----------|-----------| +| **Run graph queries** (OLTP) | [HugeGraph Server Quickstart](quickstart/hugegraph-server/hugegraph-server) | +| **Large-scale graph computing** (OLAP) | [Graph Computing Engine](quickstart/hugegraph-computer/hugegraph-computer) | +| **Build AI/RAG applications** | [HugeGraph-AI](quickstart/hugegraph-ai) | +| **Batch import data** | [HugeGraph Loader](quickstart/hugegraph-loader) | +| **Visualize and manage graphs** | [Hubble Web UI](quickstart/hugegraph-hubble) | + +### Ecosystem Overview + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Apache HugeGraph Ecosystem │ +├─────────────────────────────────────────────────────────────────┤ +│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │ +│ │ HugeGraph │ │ HugeGraph │ │ HugeGraph-AI │ │ +│ │ Server │ │ Computer │ │ (GraphRAG/ML/Python) │ │ +│ │ (OLTP) │ │ (OLAP) │ │ │ │ +│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │ +│ │ │ │ │ +│ ┌──────┴───────────────┴────────────────────┴──────────────┐ │ +│ │ HugeGraph Toolchain │ │ +│ │ Hubble (UI) | Loader | Client (Java/Go/Python) | Tools │ │ +│ └───────────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +### Core Components + +- **HugeGraph Server** - Core graph database with REST API + Gremlin + Cypher support +- **HugeGraph Toolchain** - Client SDKs, data import, visualization, and operational tools +- **HugeGraph Computer** - Distributed graph computing (Vermeer high-performance in-memory / Computer massive external storage) +- **HugeGraph-AI** - GraphRAG, knowledge graph construction, 20+ graph ML algorithms + +### Deployment Modes + +| Mode | Use Case | Data Scale | +|-----|---------|---------| +| **Standalone** | High-speed stable, compute-storage integrated | < 1000TB | +| **Distributed** | Massive storage, compute-storage separated | >= 1000TB | +| **Docker** | Quick start | Any | + +[📖 Detailed Introduction](introduction/) diff --git a/content/en/docs/changelog/hugegraph-0.12.0-release-notes.md b/content/en/docs/changelog/hugegraph-0.12.0-release-notes.md index c093dc916..92e7928c1 100644 --- a/content/en/docs/changelog/hugegraph-0.12.0-release-notes.md +++ b/content/en/docs/changelog/hugegraph-0.12.0-release-notes.md @@ -2,7 +2,7 @@ title: "HugeGraph 0.12 Release Notes" linkTitle: "Release-0.12.0" draft: true -weight: 1 +weight: 11 --- ### API & Client diff --git a/content/en/docs/changelog/hugegraph-1.0.0-release-notes.md b/content/en/docs/changelog/hugegraph-1.0.0-release-notes.md index ece52feb1..5fb674c64 100644 --- a/content/en/docs/changelog/hugegraph-1.0.0-release-notes.md +++ b/content/en/docs/changelog/hugegraph-1.0.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.0.0 Release Notes" linkTitle: "Release-1.0.0" -weight: 2 +weight: 9 --- ### OLTP API & Client Changes diff --git a/content/en/docs/changelog/hugegraph-1.2.0-release-notes.md b/content/en/docs/changelog/hugegraph-1.2.0-release-notes.md index 813674f4b..2af7d186b 100644 --- a/content/en/docs/changelog/hugegraph-1.2.0-release-notes.md +++ b/content/en/docs/changelog/hugegraph-1.2.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.2.0 Release Notes" linkTitle: "Release-1.2.0" -weight: 3 +weight: 7 --- ### Java version statement diff --git a/content/en/docs/changelog/hugegraph-1.3.0-release-notes.md b/content/en/docs/changelog/hugegraph-1.3.0-release-notes.md index 869273b19..a3b395933 100644 --- 
a/content/en/docs/changelog/hugegraph-1.3.0-release-notes.md +++ b/content/en/docs/changelog/hugegraph-1.3.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.3.0 Release Notes" linkTitle: "Release-1.3.0" -weight: 4 +weight: 5 --- ### Operating Environment / Version Description diff --git a/content/en/docs/changelog/hugegraph-1.5.0-release-notes.md b/content/en/docs/changelog/hugegraph-1.5.0-release-notes.md index 7ed0dd595..89d59035b 100644 --- a/content/en/docs/changelog/hugegraph-1.5.0-release-notes.md +++ b/content/en/docs/changelog/hugegraph-1.5.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.5.0 Release Notes" linkTitle: "Release-1.5.0" -weight: 5 +weight: 3 --- > WIP: This doc is under construction, please wait for the final version (BETA) diff --git a/content/en/docs/changelog/hugegraph-1.7.0-release-notes.md b/content/en/docs/changelog/hugegraph-1.7.0-release-notes.md index 874730079..8f9a8935d 100644 --- a/content/en/docs/changelog/hugegraph-1.7.0-release-notes.md +++ b/content/en/docs/changelog/hugegraph-1.7.0-release-notes.md @@ -1,7 +1,7 @@ --- title: "HugeGraph 1.7.0 Release Notes" linkTitle: "Release-1.7.0" -weight: 7 +weight: 1 --- > WIP: This doc is under construction, please wait for the final version (BETA) diff --git a/content/en/docs/clients/gremlin-console.md b/content/en/docs/clients/gremlin-console.md index a1de36d40..d13a09229 100644 --- a/content/en/docs/clients/gremlin-console.md +++ b/content/en/docs/clients/gremlin-console.md @@ -43,7 +43,7 @@ gremlin> > The `--` here will be parsed by getopts as the last option, allowing the subsequent options to be passed to Gremlin-Console for processing. `-i` represents `Execute the specified script and leave the console open on completion`. For more options, you can refer to the [source code](https://github.com/apache/tinkerpop/blob/3.5.1/gremlin-console/src/main/groovy/org/apache/tinkerpop/gremlin/console/Console.groovy#L483) of Gremlin-Console. -[`example.groovy`](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) is an example script under the `scripts` directory. This script inserts some data and queries the number of vertices and edges in the graph at the end. +[`example.groovy`](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) is an example script under the `scripts` directory. This script inserts some data and queries the number of vertices and edges in the graph at the end. You can continue to enter Gremlin statements to operate on the graph: diff --git a/content/en/docs/clients/restful-api/_index.md b/content/en/docs/clients/restful-api/_index.md index 355b4cbc3..c5158247e 100644 --- a/content/en/docs/clients/restful-api/_index.md +++ b/content/en/docs/clients/restful-api/_index.md @@ -9,7 +9,7 @@ weight: 1 > - HugeGraph 1.7.0+ introduces graphspaces, and REST paths follow `/graphspaces/{graphspace}/graphs/{graph}`. > - HugeGraph 1.5.x and earlier still rely on the legacy `/graphs/{graph}` path, and the create/clone graph APIs require `Content-Type: text/plain`; 1.7.0+ expects JSON bodies. > - The default graphspace name is `DEFAULT`, which you can use directly if you do not need multi-tenant isolation. -> - **Note**: Before version 1.5.0, the format of ids such as group/target was similar to -69:grant. 
After version 1.7.0, the id and name were consistent, such as admin [HugeGraph 1.5.x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) +> - **Note**: Before version 1.5.0, the format of ids such as group/target was similar to -69:grant. After version 1.7.0, the id and name were consistent, such as `admin`. See [HugeGraph 1.5.x RESTful API](https://github.com/apache/hugegraph-doc/tree/release-1.5.0) Besides the documentation below, you can also open `swagger-ui` at `localhost:8080/swagger-ui/index.html` to explore the RESTful API. [Here is an example](/docs/quickstart/hugegraph/hugegraph-server#swaggerui-example) diff --git a/content/en/docs/clients/restful-api/auth.md b/content/en/docs/clients/restful-api/auth.md index 4d87c90f1..4eb81b91a 100644 --- a/content/en/docs/clients/restful-api/auth.md +++ b/content/en/docs/clients/restful-api/auth.md @@ -6,7 +6,7 @@ weight: 16 > **Version Change Notice**: > - 1.7.0+: Auth API paths use GraphSpace format, such as `/graphspaces/DEFAULT/auth/users`, and group/target IDs match their names (e.g., `admin`) -> - 1.5.x and earlier: Auth API paths include graph name, and group/target IDs use format like `-69:grant`. See [HugeGraph 1.5.x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) +> - 1.5.x and earlier: Auth API paths include graph name, and group/target IDs use format like `-69:grant`. See [HugeGraph 1.5.x RESTful API](https://github.com/apache/hugegraph-doc/tree/release-1.5.0) ### 10.1 User Authentication and Access Control @@ -21,7 +21,7 @@ Description: User 'boss' has read permission for people in the 'graph1' graph fr ##### Interface Description: The user authentication and access control interface includes 5 categories: UserAPI, GroupAPI, TargetAPI, BelongAPI, AccessAPI. -**Note** Before 1.5.0, the format of ids such as group/target was similar to -69:grant. After 1.7.0, the id and name were consistent. Such as admin [HugeGraph 1.5 x RESTful API](https://github.com/apache/incubator-hugegraph-doc/tree/release-1.5.0) +**Note** Before 1.5.0, the format of ids such as group/target was similar to -69:grant. After 1.7.0, the id and name were consistent, such as `admin`. See [HugeGraph 1.5.x RESTful API](https://github.com/apache/hugegraph-doc/tree/release-1.5.0) ### 10.2 User (User) API The user interface includes APIs for creating users, deleting users, modifying users, and querying user-related information. diff --git a/content/en/docs/clients/restful-api/graphs.md b/content/en/docs/clients/restful-api/graphs.md index 899f1a67c..9283708c3 100644 --- a/content/en/docs/clients/restful-api/graphs.md +++ b/content/en/docs/clients/restful-api/graphs.md @@ -173,7 +173,7 @@ Create a graph (set `Content-Type: application/json`) - Non-auth mode: `"gremlin.graph": "org.apache.hugegraph.HugeFactory"` **Note**!! -1. In version 1.7.0, dynamic graph creation would cause a NPE. This issue has been fixed in [PR#2912](https://github.com/apache/incubator-hugegraph/pull/2912). The current master version and versions after 1.7.0 do not have this problem. +1. In version 1.7.0, dynamic graph creation would cause an NPE. This issue has been fixed in [PR#2912](https://github.com/apache/hugegraph/pull/2912). The current master version and versions after 1.7.0 do not have this problem. 2. For version 1.7.0 and earlier, if the backend is hstore, you must add "task.scheduler_type": "distributed" in the request body.
Also ensure HugeGraph-Server is properly configured with PD, see [HStore Configuration](/docs/quickstart/hugegraph/hugegraph-server/#511-distributed-storage-hstore). **RocksDB Example:** diff --git a/content/en/docs/config/config-authentication.md b/content/en/docs/config/config-authentication.md index 4ebde6303..e00c8a417 100644 --- a/content/en/docs/config/config-authentication.md +++ b/content/en/docs/config/config-authentication.md @@ -101,14 +101,14 @@ If deployed based on Docker image or if HugeGraph has already been initialized a relevant graph data needs to be deleted and HugeGraph needs to be restarted. If there is already business data in the diagram, it is temporarily **not possible** to directly convert the authentication mode (version<=1.2.0) -> Improvements for this feature have been included in the latest release (available in the latest docker image), please refer to [PR 2411](https://github.com/apache/incubator-hugegraph/pull/2411). Seamless switching is now available. +> Improvements for this feature have been included in the latest release (available in the latest docker image), please refer to [PR 2411](https://github.com/apache/hugegraph/pull/2411). Seamless switching is now available. ```bash # stop the hugeGraph firstly bin/stop-hugegraph.sh # delete the store data (here we use the default path for rocksdb) -# there is no need to delete in the latest version (fixed in https://github.com/apache/incubator-hugegraph/pull/2411) +# there is no need to delete in the latest version (fixed in https://github.com/apache/hugegraph/pull/2411) rm -rf rocksdb-data/ # init store again diff --git a/content/en/docs/contribution-guidelines/committer-guidelines.md b/content/en/docs/contribution-guidelines/committer-guidelines.md index abb653528..81af8f450 100644 --- a/content/en/docs/contribution-guidelines/committer-guidelines.md +++ b/content/en/docs/contribution-guidelines/committer-guidelines.md @@ -9,7 +9,7 @@ weight: 5 # Candidate Requirements 1. Candidates must adhere to the [Apache Code of Conduct](https://www.apache.org/foundation/policies/conduct.html). -2. PMC members will assess candidates' interactions with others and contributions through [mailing lists](https://lists.apache.org/list?dev@hugegraph.apache.org), [issues](https://github.com/apache/hugegraph/issues), [pull requests](https://github.com/apache/incubator-hugegraph/pulls), and [official documentation](https://hugegraph.apache.org/docs). +2. PMC members will assess candidates' interactions with others and contributions through [mailing lists](https://lists.apache.org/list?dev@hugegraph.apache.org), [issues](https://github.com/apache/hugegraph/issues), [pull requests](https://github.com/apache/hugegraph/pulls), and [official documentation](https://hugegraph.apache.org/docs). 3. Considerations for evaluating candidates as potential Committers include: 1. Ability to collaborate with community members 2. Mentorship capabilities @@ -72,32 +72,32 @@ Welcome everyone to share opinions~ Thanks! ``` -For contribution links in discussion emails, you can use the statistical feature of [GitHub Search](https://github.com/search) by entering corresponding keywords as needed. You can also adjust parameters and add new repositories such as `repo:apache/incubator-hugegraph-computer`. 
Pay special attention to adjusting the **time range** (below is a template reference, please adjust the parameters accordingly): +For contribution links in discussion emails, you can use the statistical feature of [GitHub Search](https://github.com/search) by entering corresponding keywords as needed. You can also adjust parameters and add new repositories such as `repo:apache/hugegraph-computer`. Pay special attention to adjusting the **time range** (below is a template reference, please adjust the parameters accordingly): - Number of PR submissions - - `is:pr author:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `is:pr author:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Lines of code submissions/changes - - https://github.com/apache/incubator-hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - - https://github.com/apache/incubator-hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - Number of PR submissions associated with issues - - `linked:issue involves:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `linked:issue involves:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Number of PR reviews - - `type:pr reviewed-by:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:pr reviewed-by:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Number of merge commits - - `type:pr author:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:pr author:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Effective lines merged - - https://github.com/apache/incubator-hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - - https://github.com/apache/incubator-hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c + - https://github.com/apache/hugegraph-doc/graphs/contributors?from=2023-06-01&to=2023-12-25&type=c - Number of issue submissions - - `type:issue author:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:issue author:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Number of issue fixes - Based on the number of issue submissions, select those with a closed status. 
- Number of issue participations - - `type:issue involves:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:issue involves:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Number of issue comments - - `type:issue commenter:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:issue commenter:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` - Number of PR comments - - `type:pr commenter:xxx repo:apache/incubator-hugegraph repo:apache/incubator-hugegraph-doc created:>2023-06-01 updated:<2023-12-25` + - `type:pr commenter:xxx repo:apache/hugegraph repo:apache/hugegraph-doc created:>2023-06-01 updated:<2023-12-25` For participation in mailing lists, you can use https://lists.apache.org/list?dev@hugegraph.apache.org:lte=10M:xxx. diff --git a/content/en/docs/contribution-guidelines/contribute.md b/content/en/docs/contribution-guidelines/contribute.md index e8741b1f4..9a9c6edf7 100644 --- a/content/en/docs/contribution-guidelines/contribute.md +++ b/content/en/docs/contribution-guidelines/contribute.md @@ -20,7 +20,7 @@ Before submitting the code, we need to do some preparation: 1. Sign up or login to GitHub: [https://github.com](https://github.com) -2. Fork HugeGraph repo from GitHub: [https://github.com/apache/incubator-hugegraph/fork](https://github.com/apache/hugegraph/fork) +2. Fork HugeGraph repo from GitHub: [https://github.com/apache/hugegraph/fork](https://github.com/apache/hugegraph/fork) 3. Clone code from fork repo to local: [https://github.com/${GITHUB_USER_NAME}/hugegraph](https://github.com/${GITHUB_USER_NAME}/hugegraph) @@ -44,7 +44,7 @@ Before submitting the code, we need to do some preparation: ## 2. Create an Issue on GitHub -If you encounter bugs or have any questions, please go to [GitHub Issues](https://github.com/apache/incubator-hugegraph/issues) to report them and feel free to [create an issue](https://github.com/apache/hugegraph/issues/new). +If you encounter bugs or have any questions, please go to [GitHub Issues](https://github.com/apache/hugegraph/issues) to report them and feel free to [create an issue](https://github.com/apache/hugegraph/issues/new). ## 3. Make changes of code locally @@ -76,10 +76,10 @@ Note: In order to be consistent with the code style easily, if you use [IDEA](ht ##### 3.2.1 Check licenses If we want to add new third-party dependencies to the `HugeGraph` project, we need to do the following things: -1. Find the third-party dependent repository, put the dependent `license` file into [./hugegraph-dist/release-docs/licenses/](https://github.com/apache/incubator-hugegraph/tree/master/hugegraph-server/hugegraph-dist/release-docs/licenses) path. -2. Declare the dependency in [./hugegraph-dist/release-docs/LICENSE](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/LICENSE) `LICENSE` information. -3. Find the NOTICE file in the repository and append it to [./hugegraph-dist/release-docs/NOTICE](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/NOTICE) file (skip this step if there is no NOTICE file). -4. 
Execute locally [./hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh) to update the dependency list [known-dependencies.txt](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/known-dependencies.txt) (or manually update) . +1. Find the third-party dependent repository, put the dependent `license` file into the [./hugegraph-dist/release-docs/licenses/](https://github.com/apache/hugegraph/tree/master/hugegraph-server/hugegraph-dist/release-docs/licenses) path. +2. Declare the dependency in [./hugegraph-dist/release-docs/LICENSE](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/LICENSE) `LICENSE` information. +3. Find the NOTICE file in the repository and append it to [./hugegraph-dist/release-docs/NOTICE](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/release-docs/NOTICE) file (skip this step if there is no NOTICE file). +4. Execute locally [./hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/regenerate_known_dependencies.sh) to update the dependency list [known-dependencies.txt](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/scripts/dependency/known-dependencies.txt) (or manually update). **Example**: A new third-party dependency is introduced into the project -> `ant-1.9.1.jar` - The project source code is located at: https://github.com/apache/ant/tree/rel/1.9.1 diff --git a/content/en/docs/contribution-guidelines/hugegraph-server-idea-setup.md b/content/en/docs/contribution-guidelines/hugegraph-server-idea-setup.md index 698af8519..04f4426a1 100644 --- a/content/en/docs/contribution-guidelines/hugegraph-server-idea-setup.md +++ b/content/en/docs/contribution-guidelines/hugegraph-server-idea-setup.md @@ -4,7 +4,7 @@ linkTitle: "Setup Server in IDEA" weight: 4 --- -> NOTE: The following configuration is for reference purposes only, and has been tested on Linux and macOS platforms based on [this version](https://github.com/apache/incubator-hugegraph/commit/a946ad1de4e8f922251a5241ffc957c33379677f). +> NOTE: The following configuration is for reference purposes only, and has been tested on Linux and macOS platforms based on [this version](https://github.com/apache/hugegraph/commit/a946ad1de4e8f922251a5241ffc957c33379677f).
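For orientation, the IDEA run configuration described in this page reproduces what the distribution's startup scripts do; a minimal sketch of that script flow follows, assuming the unpacked binary package layout (the directory name varies by version):

```bash
# Script-based startup that the IDEA setup mirrors (paths are assumptions
# based on the binary package layout; adjust the unpacked directory name).
cd apache-hugegraph-*/
bin/init-store.sh        # first run only: initializes the backend store
bin/start-hugegraph.sh   # starts HugeGraphServer in the background
```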
### Background @@ -17,7 +17,7 @@ The core steps for local startup are the same as starting with **scripts**: Before proceeding with the following process, make sure that you have cloned the source code of HugeGraph and have configured the development environment, such as `Java 11` & you could config your local environment -with this [config-doc](https://github.com/apache/incubator-hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) +with this [config-doc](https://github.com/apache/hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) ```bash git clone https://github.com/apache/hugegraph.git @@ -57,7 +57,7 @@ Next, open the `Run/Debug Configurations` panel in IntelliJ IDEA and create a ne - LD_LIBRARY_PATH=/path/to/your/library:$LD_LIBRARY_PATH - LD_PRELOAD=libjemalloc.so:librocksdbjni-linux64.so -> If **user authentication** (authenticator) is configured for HugeGraph-Server in the **Java 11** environment, you need to refer to the script [configuration](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/init-store.sh#L52) in the binary package and add the following **VM options**: +> If **user authentication** (authenticator) is configured for HugeGraph-Server in the **Java 11** environment, you need to refer to the script [configuration](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/init-store.sh#L52) in the binary package and add the following **VM options**: > > ```bash > --add-exports=java.base/jdk.internal.reflect=ALL-UNNAMED @@ -93,7 +93,7 @@ Similarly, open the `Run/Debug Configurations` panel in IntelliJ IDEA and create - Set the `Main class` to `org.apache.hugegraph.dist.HugeGraphServer`. - Set the program arguments to `conf/gremlin-server.yaml conf/rest-server.properties`. Similarly, note that the path here is relative to the working directory, so make sure to set the working directory to `path-to-your-directory`. -> Similarly, if **user authentication** (authenticator) is configured for HugeGraph-Server in the **Java 11** environment, you need to refer to the script [configuration](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/hugegraph-server.sh#L124) in the binary package and add the following **VM options**: +> Similarly, if **user authentication** (authenticator) is configured for HugeGraph-Server in the **Java 11** environment, you need to refer to the script [configuration](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/bin/hugegraph-server.sh#L124) in the binary package and add the following **VM options**: > > ```bash > --add-exports=java.base/jdk.internal.reflect=ALL-UNNAMED --add-modules=jdk.unsupported --add-exports=java.base/sun.nio.ch=ALL-UNNAMED @@ -169,4 +169,4 @@ This is because Log4j2 uses asynchronous loggers. You can refer to the [official 2. [Local Debugging Guide for HugeGraph Server (Win/Unix)](https://gist.github.com/imbajin/1661450f000cd62a67e46d4f1abfe82c) 3. ["package sun.misc does not exist" compilation error](https://youtrack.jetbrains.com/issue/IDEA-180033) 4. [Cannot compile: java: package sun.misc does not exist](https://youtrack.jetbrains.com/issue/IDEA-201168) -5. [The code-style config for HugeGraph in IDEA](https://github.com/apache/incubator-hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) +5. 
[The code-style config for HugeGraph in IDEA](https://github.com/apache/hugegraph/wiki/The-style-config-for-HugeGraph-in-IDEA) diff --git a/content/en/docs/download/download.md b/content/en/docs/download/download.md index f89bfa17e..2df19aa30 100644 --- a/content/en/docs/download/download.md +++ b/content/en/docs/download/download.md @@ -1,5 +1,5 @@ --- -title: "Download Apache HugeGraph (Incubating)" +title: "Download Apache HugeGraph" linkTitle: "Download" weight: 2 --- @@ -10,7 +10,7 @@ weight: 2 > - It is recommended to use the latest version of the HugeGraph software package. Please select Java11 for the runtime environment. > - To verify downloads, use the corresponding hash (SHA512), signature, and [Project Signature Verification KEYS](https://downloads.apache.org/incubator/hugegraph/KEYS). > - Instructions for checking hash (SHA512) and signatures are on the [Validate Release](/docs/contribution-guidelines/validate-release/) page, and you can also refer to [ASF official instructions](https://www.apache.org/dyn/closer.cgi#verify). -> - Note: The version numbers of all components of HugeGraph have been kept consistent, and the version numbers of Maven repositories such as `client/loader/hubble/common` are the same. You can refer to these for dependency references [maven example](https://github.com/apache/incubator-hugegraph-toolchain#maven-dependencies). +> - Note: The version numbers of all components of HugeGraph have been kept consistent, and the version numbers of Maven repositories such as `client/loader/hubble/common` are the same. You can refer to these for dependency references [maven example](https://github.com/apache/hugegraph-toolchain#maven-dependencies). ### Latest Version 1.7.0 diff --git a/content/en/docs/introduction/README.md b/content/en/docs/introduction/README.md index 541fde789..9186e2a60 100644 --- a/content/en/docs/introduction/README.md +++ b/content/en/docs/introduction/README.md @@ -7,8 +7,9 @@ weight: 1 ### Summary Apache HugeGraph is an easy-to-use, efficient, general-purpose open-source graph database system -(Graph Database, [GitHub project address](https://github.com/hugegraph/hugegraph)), implementing the [Apache TinkerPop3](https://tinkerpop.apache.org) framework and fully compatible with the [Gremlin](https://tinkerpop.apache.org/gremlin.html) query language, -With complete toolchain components, it helps users easily build applications and products based on graph databases. HugeGraph supports fast import of more than 10 billion vertices and edges, and provides millisecond-level relational query capability (OLTP). +(Graph Database, [GitHub project address](https://github.com/apache/hugegraph)), implementing the [Apache TinkerPop3](https://tinkerpop.apache.org) framework and fully compatible with the [Gremlin](https://tinkerpop.apache.org/gremlin.html) query language, +while also supporting the [Cypher](https://opencypher.org/) query language (OpenCypher standard). +With complete toolchain components, it helps users easily build applications and products based on graph databases. HugeGraph supports fast import of more than 10 billion vertices and edges, and provides millisecond-level relational query capability (OLTP). It also supports large-scale distributed graph computing (OLAP). 
Typical application scenarios of HugeGraph include deep relationship exploration, association analysis, path search, feature extraction, data clustering, community detection, knowledge graph, etc., and are applicable to business fields such as network security, telecommunication fraud, financial risk control, advertising recommendation, social network, and intelligent robots, etc. @@ -16,17 +17,43 @@ Typical application scenarios of HugeGraph include deep relationship exploration ### Features HugeGraph supports graph operations in online and offline environments, batch importing of data and efficient complex relationship analysis. It can seamlessly be integrated with big data platforms. -HugeGraph supports multi-user parallel operations. Users can enter Gremlin query statements and get graph query results in time. They can also call the HugeGraph API in user programs for graph analysis or queries. +HugeGraph supports multi-user parallel operations. Users can enter Gremlin/Cypher query statements and get graph query results in time. They can also call the HugeGraph API in user programs for graph analysis or queries. -This system has the following features: +This system has the following features: -- Ease of use: HugeGraph supports the Gremlin graph query language and a RESTful API, providing common interfaces for graph retrieval, and peripheral tools with complete functions to easily implement various graph-based query and analysis operations. +- Ease of use: HugeGraph supports the Gremlin/Cypher graph query languages and a RESTful API, providing common interfaces for graph retrieval, and peripheral tools with complete functions to easily implement various graph-based query and analysis operations. - Efficiency: HugeGraph has been deeply optimized in graph storage and graph computing, and provides a variety of batch import tools, which can easily complete the rapid import of tens of billions of data, and achieve millisecond-level response for graph retrieval through optimized queries. Supports simultaneous online real-time operations of thousands of users. - Universal: HugeGraph supports the Apache Gremlin standard graph query language and the Property Graph standard graph modeling method, and supports graph-based OLTP and OLAP schemes. Integrate Apache Hadoop and Apache Spark big data platforms. - Scalable: supports distributed storage, multiple copies of data, and horizontal expansion, built-in multiple back-end storage engines, and can easily expand the back-end storage engine through plug-ins. - Open: HugeGraph code is open source (Apache 2 License), customers can modify and customize independently, and selectively give back to the open-source community.
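To make the Gremlin/Cypher support mentioned above concrete, a hedged sketch of issuing a query against a locally running server (the endpoint and payload assume the default REST setup; adjust host, port, and auth to your deployment):

```bash
# Minimal Gremlin call over the REST API; 127.0.0.1:8080 and a no-auth
# configuration are assumptions for a default local server.
curl -X POST http://127.0.0.1:8080/apis/gremlin \
  -H 'Content-Type: application/json' \
  -d '{"gremlin": "g.V().limit(3)"}'
```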
-The functions of this system include but are not limited to: +### Deployment Modes + +HugeGraph supports multiple deployment modes to meet different scales and scenarios: + +**Standalone Mode** +- Server + RocksDB backend storage +- Suitable for development, testing, and small-to-medium scale data (< 1TB) +- Docker quick start: `docker run hugegraph/hugegraph` +- See [Server Quickstart](/docs/quickstart/hugegraph-server/hugegraph-server) + +**Distributed Mode** +- HugeGraph-PD: Metadata management and cluster scheduling +- HugeGraph-Store (HStore): Distributed storage engine +- Supports horizontal scaling and high availability (100GB+ data scale) +- Suitable for production environments and large-scale graph data applications + +### Quick Start Guide + +| Use Case | Recommended Path | +|---------|---------| +| Quick experience | [Docker deployment](/docs/quickstart/hugegraph-server/hugegraph-server#docker) | +| Build OLTP applications | Server → REST API / Gremlin / Cypher | +| Graph analysis (OLAP) | [Vermeer](/docs/quickstart/computing/hugegraph-computer) (recommended) or Computer | +| Build AI applications | [HugeGraph-AI](/docs/quickstart/hugegraph-ai) (GraphRAG/Knowledge Graph) | +| Batch data import | [Loader](/docs/quickstart/toolchain/hugegraph-loader) + [Hubble](/docs/quickstart/toolchain/hugegraph-hubble) | + +### System Functions - Supports batch import of data from multiple data sources (including local files, HDFS files, MySQL databases, and other data sources), and supports import of multiple file formats (including TXT, CSV, JSON, and other formats) - With a visual operation interface, it can be used for operation, analysis, and display diagrams, reducing the threshold for users to use @@ -49,19 +76,19 @@ The functions of this system include but are not limited to: - Backend: Implements the storage of graph data to the backend, supports backends including Memory, Cassandra, ScyllaDB, RocksDB, HBase, MySQL and PostgreSQL, users can choose one according to the actual situation; - API: Built-in REST Server provides RESTful API to users and is fully compatible with Gremlin queries. 
(Supports distributed storage and computation pushdown) - [HugeGraph-Toolchain](https://github.com/apache/hugegraph-toolchain): (Toolchain) - - [HugeGraph-Client](/docs/quickstart/client/hugegraph-client): HugeGraph-Client provides a RESTful API client for connecting to HugeGraph-Server, currently only the Java version is implemented, users of other languages can implement it themselves; + - [HugeGraph-Client](/docs/quickstart/client/hugegraph-client): HugeGraph-Client provides a RESTful API client for connecting to HugeGraph-Server, supporting Java/Python/Go multi-language versions; - [HugeGraph-Loader](/docs/quickstart/toolchain/hugegraph-loader): HugeGraph-Loader is a data import tool based on HugeGraph-Client, which transforms ordinary text data into vertices and edges of the graph and inserts them into the graph database; - - [HugeGraph-Hubble](/docs/quickstart/toolchain/hugegraph-hubble): HugeGraph-Hubble is HugeGraph's Web + - [HugeGraph-Hubble](/docs/quickstart/toolchain/hugegraph-hubble): HugeGraph-Hubble is HugeGraph's Web visualization management platform, a one-stop visualization analysis platform, the platform covers the whole process from data modeling, to fast data import, to online and offline analysis of data, and unified management of the graph; - [HugeGraph-Tools](/docs/quickstart/toolchain/hugegraph-tools): HugeGraph-Tools is HugeGraph's deployment and management tool, including graph management, backup/recovery, Gremlin execution and other functions. -- [HugeGraph-Computer](/docs/quickstart/computing/hugegraph-computer): HugeGraph-Computer is a distributed graph processing system (OLAP). - It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It can run on clusters such as Kubernetes/Yarn, and supports large-scale graph computing. -- [HugeGraph-AI](/docs/quickstart/hugegraph-ai): HugeGraph-AI is HugeGraph's independent AI - component, providing training and inference functions of graph neural networks, LLM/Graph RAG combination/Python-Client and other related components, continuously updating. +- [HugeGraph-Computer](/docs/quickstart/computing/hugegraph-computer): HugeGraph-Computer is a distributed graph processing system (OLAP). + It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It can run on clusters such as Kubernetes/Yarn, and supports large-scale graph computing. Also provides Vermeer lightweight graph computing engine, suitable for quick start and small-to-medium scale graph analysis. +- [HugeGraph-AI](/docs/quickstart/hugegraph-ai): HugeGraph-AI is HugeGraph's independent AI + component, providing LLM/GraphRAG intelligent Q&A, automated knowledge graph construction, graph neural network training/inference, Python-Client and other features, with 20+ built-in graph machine learning algorithms, continuously updating. ### Contact Us -- [GitHub Issues](https://github.com/apache/incubator-hugegraph/issues): Feedback on usage issues and functional requirements (quick response) +- [GitHub Issues](https://github.com/apache/hugegraph/issues): Feedback on usage issues and functional requirements (quick response) - Feedback Email: [dev@hugegraph.apache.org](mailto:dev@hugegraph.apache.org) ([subscriber](https://hugegraph.apache.org/docs/contribution-guidelines/subscribe/) only) - Security Email: [security@hugegraph.apache.org](mailto:security@hugegraph.apache.org) (Report SEC problems) - WeChat public account: Apache HugeGraph, welcome to scan this QR code to follow us. 
diff --git a/content/en/docs/quickstart/client/hugegraph-client-go.md b/content/en/docs/quickstart/client/hugegraph-client-go.md index b412ba27d..c26f04c24 100644 --- a/content/en/docs/quickstart/client/hugegraph-client-go.md +++ b/content/en/docs/quickstart/client/hugegraph-client-go.md @@ -13,7 +13,7 @@ A HugeGraph Client SDK tool based on the Go language. ## Installation Tutorial ```shell -go get github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go +go get github.com/apache/hugegraph-toolchain/hugegraph-client-go ``` ## Implemented APIs @@ -34,8 +34,8 @@ import ( "log" "os" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go/hgtransport" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go/hgtransport" ) func main() { @@ -73,8 +73,8 @@ import ( "log" "os" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go" - "github.com/apache/incubator-hugegraph-toolchain/hugegraph-client-go/hgtransport" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go" + "github.com/apache/hugegraph-toolchain/hugegraph-client-go/hgtransport" ) // initClient initializes and returns a HugeGraph client instance diff --git a/content/en/docs/quickstart/client/hugegraph-client.md b/content/en/docs/quickstart/client/hugegraph-client.md index 91ac7865e..088cd1a6c 100644 --- a/content/en/docs/quickstart/client/hugegraph-client.md +++ b/content/en/docs/quickstart/client/hugegraph-client.md @@ -10,7 +10,7 @@ weight: 1 We support HugeGraph-Client for Java/Go/[Python](https://github.com/apache/incubator-hugegraph-ai/tree/main/hugegraph-python-client) language. You can use [Client-API](/docs/clients/hugegraph-client) to write code to operate HugeGraph, such as adding, deleting, modifying, and querying schema and graph data, or executing gremlin statements. -> [HugeGraph client SDK tool based on Go language](https://github.com/apache/incubator-hugegraph-toolchain/blob/master/hugegraph-client-go/README.en.md) (version >=1.2.0) +> [HugeGraph client SDK tool based on Go language](https://github.com/apache/hugegraph-toolchain/blob/master/hugegraph-client-go/README.en.md) (version >=1.2.0) ### 2 What You Need diff --git a/content/en/docs/quickstart/computing/_index.md b/content/en/docs/quickstart/computing/_index.md index 5ec200bb5..2bac28f7d 100644 --- a/content/en/docs/quickstart/computing/_index.md +++ b/content/en/docs/quickstart/computing/_index.md @@ -4,8 +4,8 @@ linkTitle: "HugeGraph Computing (OLAP)" weight: 4 --- -## 🚀 Best practice: Prioritize using DeepWiki intelligent documents +## Recommended: Use DeepWiki Documentation -> To address the issue of outdated static documents, we provide DeepWiki with **real-time updates and more comprehensive content**. It is equivalent to an expert with the latest knowledge of the project, which is very suitable for **all developers** to read and consult before starting the project. +> DeepWiki provides real-time updated project documentation with more comprehensive and accurate content, suitable for quickly understanding the latest project information. 
-**👉 Strongly recommend visiting and having a conversation with:** [**incubator-hugegraph-computer**](https://deepwiki.com/apache/incubator-hugegraph-computer) \ No newline at end of file +**Visit:** [**hugegraph-computer**](https://deepwiki.com/apache/hugegraph-computer) \ No newline at end of file diff --git a/content/en/docs/quickstart/computing/hugegraph-computer.md b/content/en/docs/quickstart/computing/hugegraph-computer.md index dce2b07d2..9c1e8e903 100644 --- a/content/en/docs/quickstart/computing/hugegraph-computer.md +++ b/content/en/docs/quickstart/computing/hugegraph-computer.md @@ -6,7 +6,7 @@ weight: 2 ## 1 HugeGraph-Computer Overview -The [`HugeGraph-Computer`](https://github.com/apache/incubator-hugegraph-computer) is a distributed graph processing system for HugeGraph (OLAP). It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It runs on a Kubernetes(K8s) framework.(It focuses on supporting graph data volumes of hundreds of billions to trillions, using disk for sorting and acceleration, which is one of the biggest differences from Vermeer) +The [`HugeGraph-Computer`](https://github.com/apache/hugegraph-computer) is a distributed graph processing system for HugeGraph (OLAP). It is an implementation of [Pregel](https://kowshik.github.io/JPregel/pregel_paper.pdf). It runs on a Kubernetes(K8s) framework. (It focuses on supporting graph data volumes of hundreds of billions to trillions, using disk for sorting and acceleration, which is one of the biggest differences from Vermeer) ### Features diff --git a/content/en/docs/quickstart/computing/hugegraph-vermeer.md b/content/en/docs/quickstart/computing/hugegraph-vermeer.md index 3d2aad604..aa802c064 100644 --- a/content/en/docs/quickstart/computing/hugegraph-vermeer.md +++ b/content/en/docs/quickstart/computing/hugegraph-vermeer.md @@ -131,7 +131,7 @@ docker network rm vermeer_network 3. **Option 3: Build from Source** -Build. You can refer [Vermeer Readme](https://github.com/apache/incubator-hugegraph-computer/tree/master/vermeer). +Build. You can refer to the [Vermeer Readme](https://github.com/apache/hugegraph-computer/tree/master/vermeer). ```shell go build diff --git a/content/en/docs/quickstart/hugegraph-ai/_index.md b/content/en/docs/quickstart/hugegraph-ai/_index.md index 196e66818..4c082f04e 100644 --- a/content/en/docs/quickstart/hugegraph-ai/_index.md +++ b/content/en/docs/quickstart/hugegraph-ai/_index.md @@ -7,11 +7,11 @@ weight: 3 [![License](https://img.shields.io/badge/license-Apache%202-0E78BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/apache/incubator-hugegraph-ai) -## 🚀 Best practice: Prioritize using DeepWiki intelligent documents +## Recommended: Use DeepWiki Documentation -> To address the issue of outdated static documents, we provide DeepWiki with **real-time updates and more comprehensive content**. It is equivalent to an expert with the latest knowledge of the project, which is very suitable for **all developers** to read and consult before starting the project. +> DeepWiki provides real-time updated project documentation with more comprehensive and accurate content, suitable for quickly understanding the latest project information.
-**👉 Strongly recommend visiting and having a conversation with:** [**incubator-hugegraph-ai**](https://deepwiki.com/apache/incubator-hugegraph-ai) +**Visit:** [**incubator-hugegraph-ai**](https://deepwiki.com/apache/incubator-hugegraph-ai) `hugegraph-ai` integrates [HugeGraph](https://github.com/apache/hugegraph) with artificial intelligence capabilities, providing comprehensive support for developers to build AI-powered graph applications. diff --git a/content/en/docs/quickstart/hugegraph/_index.md b/content/en/docs/quickstart/hugegraph/_index.md index e35040e3f..26bffacbc 100644 --- a/content/en/docs/quickstart/hugegraph/_index.md +++ b/content/en/docs/quickstart/hugegraph/_index.md @@ -4,8 +4,8 @@ linkTitle: "HugeGraph (OLTP)" weight: 1 --- -## 🚀 Best practice: Prioritize using DeepWiki intelligent documents +> DeepWiki provides real-time updated project documentation with more comprehensive and accurate content, suitable for quickly understanding the latest project information. +> +> 📖 [https://deepwiki.com/apache/hugegraph](https://deepwiki.com/apache/hugegraph) -> To address the issue of outdated static documents, we provide DeepWiki with **real-time updates and more comprehensive content**. It is equivalent to an expert with the latest knowledge of the project, which is very suitable for **all developers** to read and consult before starting the project. - -**👉 Strongly recommend visiting and having a conversation with:** [**incubator-hugegraph**](https://deepwiki.com/apache/incubator-hugegraph) +**GitHub Access:** [https://github.com/apache/hugegraph](https://github.com/apache/hugegraph) diff --git a/content/en/docs/quickstart/hugegraph/hugegraph-server.md b/content/en/docs/quickstart/hugegraph/hugegraph-server.md index 06ebc9e86..921577d1d 100644 --- a/content/en/docs/quickstart/hugegraph/hugegraph-server.md +++ b/content/en/docs/quickstart/hugegraph/hugegraph-server.md @@ -42,7 +42,7 @@ There are four ways to deploy HugeGraph-Server components: #### 3.1 Use Docker container (Convenient for Test/Dev) -You can refer to the [Docker deployment guide](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/README.md). +You can refer to the [Docker deployment guide](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/README.md). We can use `docker run -itd --name=server -p 8080:8080 -e PASSWORD=xxx hugegraph/hugegraph:1.7.0` to quickly start a `HugeGraph Server` with a built-in `RocksDB` backend. @@ -615,7 +615,7 @@ In [3.1 Use Docker container](#31-use-docker-container-convenient-for-testdev), When using Docker, we can use Cassandra as the backend storage. We highly recommend using docker-compose directly to manage both the server and Cassandra. -The sample `docker-compose.yml` can be obtained on [GitHub](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/example/docker-compose-cassandra.yml), and you can start it with `docker-compose up -d`. (If using Cassandra 4.0 as the backend storage, it takes approximately two minutes to initialize. Please be patient.) +The sample `docker-compose.yml` can be obtained on [GitHub](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/docker/example/docker-compose-cassandra.yml), and you can start it with `docker-compose up -d`. (If using Cassandra 4.0 as the backend storage, it takes approximately two minutes to initialize. Please be patient.) 
```yaml version: "3" @@ -682,7 +682,7 @@ Set the environment variable `PRELOAD=true` when starting Docker to load data du 2. Use `docker-compose` - Create `docker-compose.yml` as following. We should set the environment variable `PRELOAD=true`. [`example.groovy`](https://github.com/apache/incubator-hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) is a predefined script to preload the sample data. If needed, we can mount a new `example.groovy` to change the preload data. + Create `docker-compose.yml` as follows. We should set the environment variable `PRELOAD=true`. [`example.groovy`](https://github.com/apache/hugegraph/blob/master/hugegraph-server/hugegraph-dist/src/assembly/static/scripts/example.groovy) is a predefined script to preload the sample data. If needed, we can mount a new `example.groovy` to change the preload data. ```yaml version: '3' diff --git a/content/en/docs/quickstart/toolchain/_index.md b/content/en/docs/quickstart/toolchain/_index.md index 5c7508230..8076137fc 100644 --- a/content/en/docs/quickstart/toolchain/_index.md +++ b/content/en/docs/quickstart/toolchain/_index.md @@ -6,8 +6,8 @@ weight: 2 > **Testing Guide**: For running toolchain tests locally, please refer to [HugeGraph Toolchain Local Testing Guide](/docs/guides/toolchain-local-test) -## 🚀 Best practice: Prioritize using DeepWiki intelligent documents +## Recommended: Use DeepWiki Documentation -> To address the issue of outdated static documents, we provide DeepWiki with **real-time updates and more comprehensive content**. It is equivalent to an expert with the latest knowledge of the project, which is very suitable for **all developers** to read and consult before starting the project. +> DeepWiki provides real-time updated project documentation with more comprehensive and accurate content, suitable for quickly understanding the latest project information. -**👉 Strongly recommend visiting and having a conversation with:** [**incubator-hugegraph-toolchain**](https://deepwiki.com/apache/incubator-hugegraph-toolchain) \ No newline at end of file +**Visit:** [**hugegraph-toolchain**](https://deepwiki.com/apache/hugegraph-toolchain) \ No newline at end of file diff --git a/themes/docsy/layouts/partials/footer.html b/themes/docsy/layouts/partials/footer.html index 82a8461a0..5c9a44641 100644 --- a/themes/docsy/layouts/partials/footer.html +++ b/themes/docsy/layouts/partials/footer.html @@ -6,25 +6,23 @@
-

Apache HugeGraph is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.

-

Copyright © {{ now.Year}} The Apache Software Foundation, Licensed under the Apache License Version 2.0
Apache, the names of Apache projects, and the feather logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

+

Copyright © {{ now.Year}} The Apache Software Foundation, Licensed under the Apache License Version 2.0

+

Apache, the names of Apache projects, and the feather logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

From 579df7ab1056626451dbdea05449fd04491b58c3 Mon Sep 17 00:00:00 2001 From: imbajin Date: Mon, 2 Feb 2026 16:25:02 +0800 Subject: [PATCH 08/10] Add API descriptions and update quickstart links MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add description frontmatter to many REST API docs (Chinese and English) to provide concise summaries for each API page. Update quickstart index pages (computing, hugegraph‑ai, toolchain) to recommend DeepWiki with direct links and add GitHub access links. Revise CN FAQ backend storage guidance to describe single-node (RocksDB) vs distributed (HStore) deployment, recommend selection by scale, and note deprecation of Cassandra/HBase/MySQL in later versions. --- content/cn/docs/clients/restful-api/auth.md | 1 + content/cn/docs/clients/restful-api/cypher.md | 1 + content/cn/docs/clients/restful-api/edge.md | 1 + content/cn/docs/clients/restful-api/edgelabel.md | 1 + content/cn/docs/clients/restful-api/graphs.md | 1 + content/cn/docs/clients/restful-api/gremlin.md | 1 + content/cn/docs/clients/restful-api/indexlabel.md | 1 + content/cn/docs/clients/restful-api/metrics.md | 2 +- content/cn/docs/clients/restful-api/other.md | 1 + content/cn/docs/clients/restful-api/propertykey.md | 1 + content/cn/docs/clients/restful-api/rank.md | 1 + content/cn/docs/clients/restful-api/rebuild.md | 1 + content/cn/docs/clients/restful-api/schema.md | 1 + content/cn/docs/clients/restful-api/task.md | 1 + content/cn/docs/clients/restful-api/traverser.md | 1 + content/cn/docs/clients/restful-api/variable.md | 1 + content/cn/docs/clients/restful-api/vertex.md | 1 + content/cn/docs/clients/restful-api/vertexlabel.md | 1 + content/cn/docs/guides/faq.md | 8 ++++++-- content/cn/docs/quickstart/computing/_index.md | 6 +++--- content/cn/docs/quickstart/hugegraph-ai/_index.md | 6 ++---- content/cn/docs/quickstart/toolchain/_index.md | 6 +++--- content/en/docs/clients/restful-api/auth.md | 1 + content/en/docs/clients/restful-api/cypher.md | 1 + content/en/docs/clients/restful-api/edge.md | 1 + content/en/docs/clients/restful-api/edgelabel.md | 1 + content/en/docs/clients/restful-api/graphs.md | 1 + content/en/docs/clients/restful-api/graphspace.md | 1 + content/en/docs/clients/restful-api/gremlin.md | 1 + content/en/docs/clients/restful-api/indexlabel.md | 1 + content/en/docs/clients/restful-api/metrics.md | 2 +- content/en/docs/clients/restful-api/other.md | 1 + content/en/docs/clients/restful-api/propertykey.md | 1 + content/en/docs/clients/restful-api/rank.md | 1 + content/en/docs/clients/restful-api/rebuild.md | 1 + content/en/docs/clients/restful-api/schema.md | 1 + content/en/docs/clients/restful-api/task.md | 1 + content/en/docs/clients/restful-api/traverser.md | 1 + content/en/docs/clients/restful-api/variable.md | 1 + content/en/docs/clients/restful-api/vertex.md | 1 + content/en/docs/clients/restful-api/vertexlabel.md | 1 + content/en/docs/guides/faq.md | 8 ++++++-- content/en/docs/quickstart/computing/_index.md | 6 +++--- content/en/docs/quickstart/hugegraph-ai/_index.md | 6 ++---- content/en/docs/quickstart/toolchain/_index.md | 6 +++--- 45 files changed, 65 insertions(+), 26 deletions(-) diff --git a/content/cn/docs/clients/restful-api/auth.md b/content/cn/docs/clients/restful-api/auth.md index 608756fa9..84db93aaa 100644 --- a/content/cn/docs/clients/restful-api/auth.md +++ b/content/cn/docs/clients/restful-api/auth.md @@ -2,6 +2,7 @@ title: "Authentication API" linkTitle: "Authentication" weight: 16 +description: 
"Authentication(认证鉴权)REST 接口:管理用户、角色、权限和访问控制,实现细粒度的图数据安全机制。" --- > **版本变更说明**: diff --git a/content/cn/docs/clients/restful-api/cypher.md b/content/cn/docs/clients/restful-api/cypher.md index 7eddf199c..0d7154724 100644 --- a/content/cn/docs/clients/restful-api/cypher.md +++ b/content/cn/docs/clients/restful-api/cypher.md @@ -2,6 +2,7 @@ title: "Cypher API" linkTitle: "Cypher" weight: 15 +description: "Cypher(图查询语言)REST 接口:通过 HTTP 接口执行 OpenCypher 声明式图查询语言。" --- ### 9.1 Cypher diff --git a/content/cn/docs/clients/restful-api/edge.md b/content/cn/docs/clients/restful-api/edge.md index d17242a2b..c12ebfaf5 100644 --- a/content/cn/docs/clients/restful-api/edge.md +++ b/content/cn/docs/clients/restful-api/edge.md @@ -2,6 +2,7 @@ title: "Edge API" linkTitle: "Edge" weight: 8 +description: "Edge(边)REST 接口:创建、查询、更新和删除顶点之间的关系数据,支持批量操作和方向查询。" --- ### 2.2 Edge diff --git a/content/cn/docs/clients/restful-api/edgelabel.md b/content/cn/docs/clients/restful-api/edgelabel.md index 145992c22..0d1e71bc8 100644 --- a/content/cn/docs/clients/restful-api/edgelabel.md +++ b/content/cn/docs/clients/restful-api/edgelabel.md @@ -2,6 +2,7 @@ title: "EdgeLabel API" linkTitle: "EdgeLabel" weight: 4 +description: "EdgeLabel(边标签)REST 接口:定义边类型、源顶点和目标顶点的关系约束,构建图的连接规则。" --- ### 1.4 EdgeLabel diff --git a/content/cn/docs/clients/restful-api/graphs.md b/content/cn/docs/clients/restful-api/graphs.md index 4e92033f4..560efc610 100644 --- a/content/cn/docs/clients/restful-api/graphs.md +++ b/content/cn/docs/clients/restful-api/graphs.md @@ -2,6 +2,7 @@ title: "Graphs API" linkTitle: "Graphs" weight: 12 +description: "Graphs(图管理)REST 接口:管理图实例的生命周期,包括创建、查询、克隆、清空和删除图数据库。" --- ### 6.1 Graphs diff --git a/content/cn/docs/clients/restful-api/gremlin.md b/content/cn/docs/clients/restful-api/gremlin.md index d2affc3ae..144f485c8 100644 --- a/content/cn/docs/clients/restful-api/gremlin.md +++ b/content/cn/docs/clients/restful-api/gremlin.md @@ -2,6 +2,7 @@ title: "Gremlin API" linkTitle: "Gremlin" weight: 14 +description: "Gremlin(图查询语言)REST 接口:通过 HTTP 接口执行 Gremlin 图遍历查询语言脚本。" --- ### 8.1 Gremlin diff --git a/content/cn/docs/clients/restful-api/indexlabel.md b/content/cn/docs/clients/restful-api/indexlabel.md index efddfbbfb..227567998 100644 --- a/content/cn/docs/clients/restful-api/indexlabel.md +++ b/content/cn/docs/clients/restful-api/indexlabel.md @@ -2,6 +2,7 @@ title: "IndexLabel API" linkTitle: "IndexLabel" weight: 5 +description: "IndexLabel(索引标签)REST 接口:为顶点和边的属性创建索引,加速基于属性的查询和过滤操作。" --- ### 1.5 IndexLabel diff --git a/content/cn/docs/clients/restful-api/metrics.md b/content/cn/docs/clients/restful-api/metrics.md index e984d2039..f89698ada 100644 --- a/content/cn/docs/clients/restful-api/metrics.md +++ b/content/cn/docs/clients/restful-api/metrics.md @@ -2,7 +2,7 @@ title: "Metrics API" linkTitle: "Metrics" weight: 17 - +description: "Metrics(监控指标)REST 接口:获取系统运行时的性能指标、统计信息和健康状态数据。" --- HugeGraph 提供了获取监控信息的 Metrics 接口,比如各个 Gremlin 执行时间的统计、缓存的占用大小等。Metrics diff --git a/content/cn/docs/clients/restful-api/other.md b/content/cn/docs/clients/restful-api/other.md index 8f394e439..0e6fd0458 100644 --- a/content/cn/docs/clients/restful-api/other.md +++ b/content/cn/docs/clients/restful-api/other.md @@ -2,6 +2,7 @@ title: "Other API" linkTitle: "Other" weight: 18 +description: "Other(其他接口)REST 接口:提供系统版本查询和 API 版本信息等辅助功能。" --- ### 11.1 Other diff --git a/content/cn/docs/clients/restful-api/propertykey.md b/content/cn/docs/clients/restful-api/propertykey.md index 0f008f8b8..fc04e8456 100644 --- 
a/content/cn/docs/clients/restful-api/propertykey.md +++ b/content/cn/docs/clients/restful-api/propertykey.md @@ -2,6 +2,7 @@ title: "PropertyKey API" linkTitle: "PropertyKey" weight: 2 +description: "PropertyKey(属性键)REST 接口:定义图中所有属性的数据类型和基数约束,是构建图模式的基础元素。" --- ### 1.2 PropertyKey diff --git a/content/cn/docs/clients/restful-api/rank.md b/content/cn/docs/clients/restful-api/rank.md index 780ed8298..f18fc4730 100644 --- a/content/cn/docs/clients/restful-api/rank.md +++ b/content/cn/docs/clients/restful-api/rank.md @@ -2,6 +2,7 @@ title: "Rank API" linkTitle: "Rank" weight: 10 +description: "Rank(图排序)REST 接口:执行图节点排序算法,如 PageRank、个性化 PageRank 等中心性分析。" --- ### 4.1 rank API 概述 diff --git a/content/cn/docs/clients/restful-api/rebuild.md b/content/cn/docs/clients/restful-api/rebuild.md index 18e037448..b9c043d37 100644 --- a/content/cn/docs/clients/restful-api/rebuild.md +++ b/content/cn/docs/clients/restful-api/rebuild.md @@ -2,6 +2,7 @@ title: "Rebuild API" linkTitle: "Rebuild" weight: 6 +description: "Rebuild(重建索引)REST 接口:重建图模式的索引,确保索引数据与图数据保持一致性。" --- ### 1.6 Rebuild diff --git a/content/cn/docs/clients/restful-api/schema.md b/content/cn/docs/clients/restful-api/schema.md index f0e525b05..0e80bce4a 100644 --- a/content/cn/docs/clients/restful-api/schema.md +++ b/content/cn/docs/clients/restful-api/schema.md @@ -2,6 +2,7 @@ title: "Schema API" linkTitle: "Schema" weight: 1 +description: "Schema(图模式)REST 接口:查询图的完整模式定义,包括属性键、顶点标签、边标签和索引标签的统一视图。" --- ### 1.1 Schema diff --git a/content/cn/docs/clients/restful-api/task.md b/content/cn/docs/clients/restful-api/task.md index b91a5c5a3..92c89aebb 100644 --- a/content/cn/docs/clients/restful-api/task.md +++ b/content/cn/docs/clients/restful-api/task.md @@ -2,6 +2,7 @@ title: "Task API" linkTitle: "Task" weight: 13 +description: "Task(任务管理)REST 接口:查询和管理异步任务的执行状态,如索引重建、图遍历等长时任务。" --- ### 7.1 Task diff --git a/content/cn/docs/clients/restful-api/traverser.md b/content/cn/docs/clients/restful-api/traverser.md index e246ede58..3d4f210b4 100644 --- a/content/cn/docs/clients/restful-api/traverser.md +++ b/content/cn/docs/clients/restful-api/traverser.md @@ -2,6 +2,7 @@ title: "Traverser API" linkTitle: "Traverser" weight: 9 +description: "Traverser(图遍历)REST 接口:执行复杂的图算法和路径查询,包括最短路径、K近邻、相似度计算等高级分析功能。" --- ### 3.1 traverser API 概述 diff --git a/content/cn/docs/clients/restful-api/variable.md b/content/cn/docs/clients/restful-api/variable.md index 5ca2ec0b9..b25b6e44b 100644 --- a/content/cn/docs/clients/restful-api/variable.md +++ b/content/cn/docs/clients/restful-api/variable.md @@ -2,6 +2,7 @@ title: "Variable API" linkTitle: "Variable" weight: 11 +description: "Variable(变量)REST 接口:存储和管理键值对形式的全局变量,支持图级别的配置和状态管理。" --- ### 5.1 Variables diff --git a/content/cn/docs/clients/restful-api/vertex.md b/content/cn/docs/clients/restful-api/vertex.md index 0df58ecce..7f1c8a254 100644 --- a/content/cn/docs/clients/restful-api/vertex.md +++ b/content/cn/docs/clients/restful-api/vertex.md @@ -2,6 +2,7 @@ title: "Vertex API" linkTitle: "Vertex" weight: 7 +description: "Vertex(顶点)REST 接口:创建、查询、更新和删除图中的顶点数据,支持批量操作和条件过滤。" --- ### 2.1 Vertex diff --git a/content/cn/docs/clients/restful-api/vertexlabel.md b/content/cn/docs/clients/restful-api/vertexlabel.md index 31ff5a7eb..9c2bfd2c6 100644 --- a/content/cn/docs/clients/restful-api/vertexlabel.md +++ b/content/cn/docs/clients/restful-api/vertexlabel.md @@ -2,6 +2,7 @@ title: "VertexLabel API" linkTitle: "VertexLabel" weight: 3 +description: "VertexLabel(顶点标签)REST 接口:定义顶点类型、ID策略及关联的属性,决定顶点的结构和约束规则。" --- ### 1.3 VertexLabel diff 
--git a/content/cn/docs/guides/faq.md b/content/cn/docs/guides/faq.md index d658fdddc..d28c980a6 100644 --- a/content/cn/docs/guides/faq.md +++ b/content/cn/docs/guides/faq.md @@ -4,9 +4,13 @@ linkTitle: "FAQ" weight: 6 --- -- 如何选择后端存储? 选 RocksDB 还是 Cassandra 还是 Hbase 还是 Mysql? +- 如何选择后端存储? 选 RocksDB 还是分布式存储? - 根据你的具体需要来判断, 一般单机或数据量 < 100 亿推荐 RocksDB, 其他推荐使用分布式存储的后端集群 + HugeGraph 支持多种部署模式,根据数据规模和场景选择: + - **单机模式**:Server + RocksDB,适合开发测试和中小规模数据(< 1TB) + - **分布式模式**:HugeGraph-PD + HugeGraph-Store (HStore),支持水平扩展和高可用(100GB+ 数据规模),适合生产环境和大规模图数据应用 + + 注:Cassandra、HBase、MySQL 等后端仅在 HugeGraph <= 1.5 版本中可用,官方后续不再单独维护 - 启动服务时提示:`xxx (core dumped) xxx` diff --git a/content/cn/docs/quickstart/computing/_index.md b/content/cn/docs/quickstart/computing/_index.md index 2e9ff0c89..bbce7c1c1 100644 --- a/content/cn/docs/quickstart/computing/_index.md +++ b/content/cn/docs/quickstart/computing/_index.md @@ -4,8 +4,8 @@ linkTitle: "HugeGraph Computing (OLAP)" weight: 4 --- -## 推荐:使用 DeepWiki 文档 - > DeepWiki 提供实时更新的项目文档,内容更全面准确,适合快速了解项目最新情况。 +> +> 📖 [https://deepwiki.com/apache/hugegraph-computer](https://deepwiki.com/apache/hugegraph-computer) -**访问链接:**[**hugegraph-computer**](https://deepwiki.com/apache/hugegraph-computer) \ No newline at end of file +**GitHub 访问:** [https://github.com/apache/hugegraph-computer](https://github.com/apache/hugegraph-computer) \ No newline at end of file diff --git a/content/cn/docs/quickstart/hugegraph-ai/_index.md b/content/cn/docs/quickstart/hugegraph-ai/_index.md index 5777bde6d..87e38aa9b 100644 --- a/content/cn/docs/quickstart/hugegraph-ai/_index.md +++ b/content/cn/docs/quickstart/hugegraph-ai/_index.md @@ -7,11 +7,9 @@ weight: 3 [![License](https://img.shields.io/badge/license-Apache%202-0E78BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/apache/incubator-hugegraph-ai) -## 推荐:使用 DeepWiki 文档 - > DeepWiki 提供实时更新的项目文档,内容更全面准确,适合快速了解项目最新情况。 - -**访问链接:**[**incubator-hugegraph-ai**](https://deepwiki.com/apache/incubator-hugegraph-ai) +> +> 📖 [https://deepwiki.com/apache/incubator-hugegraph-ai](https://deepwiki.com/apache/incubator-hugegraph-ai) `hugegraph-ai` 整合了 [HugeGraph](https://github.com/apache/hugegraph) 与人工智能功能,为开发者构建 AI 驱动的图应用提供全面支持。 diff --git a/content/cn/docs/quickstart/toolchain/_index.md b/content/cn/docs/quickstart/toolchain/_index.md index 9ab490e6e..6b318fa74 100644 --- a/content/cn/docs/quickstart/toolchain/_index.md +++ b/content/cn/docs/quickstart/toolchain/_index.md @@ -6,8 +6,8 @@ weight: 2 > **测试指南**:如需在本地运行工具链测试,请参考 [HugeGraph 工具链本地测试指南](/cn/docs/guides/toolchain-local-test) -## 推荐:使用 DeepWiki 文档 - > DeepWiki 提供实时更新的项目文档,内容更全面准确,适合快速了解项目最新情况。 +> +> 📖 [https://deepwiki.com/apache/hugegraph-toolchain](https://deepwiki.com/apache/hugegraph-toolchain) -**访问链接:**[**hugegraph-toolchain**](https://deepwiki.com/apache/hugegraph-toolchain) +**GitHub 访问:** [https://github.com/apache/hugegraph-toolchain](https://github.com/apache/hugegraph-toolchain) diff --git a/content/en/docs/clients/restful-api/auth.md b/content/en/docs/clients/restful-api/auth.md index 4eb81b91a..f23e48f5a 100644 --- a/content/en/docs/clients/restful-api/auth.md +++ b/content/en/docs/clients/restful-api/auth.md @@ -2,6 +2,7 @@ title: "Authentication API" linkTitle: "Authentication" weight: 16 +description: "Authentication REST API: Manage users, roles, permissions, and access control to implement fine-grained graph data security." 
--- > **Version Change Notice**: diff --git a/content/en/docs/clients/restful-api/cypher.md b/content/en/docs/clients/restful-api/cypher.md index ba120e2c7..4d4a5b940 100644 --- a/content/en/docs/clients/restful-api/cypher.md +++ b/content/en/docs/clients/restful-api/cypher.md @@ -2,6 +2,7 @@ title: "Cypher API" linkTitle: "Cypher" weight: 15 +description: "Cypher REST API: Execute OpenCypher declarative graph query language via HTTP interface." --- ### 9.1 Cypher diff --git a/content/en/docs/clients/restful-api/edge.md b/content/en/docs/clients/restful-api/edge.md index aaff63967..0e28fbe70 100644 --- a/content/en/docs/clients/restful-api/edge.md +++ b/content/en/docs/clients/restful-api/edge.md @@ -2,6 +2,7 @@ title: "Edge API" linkTitle: "Edge" weight: 8 +description: "Edge REST API: Create, query, update, and delete relationship data between vertices with support for batch operations and directional queries." --- ### 2.2 Edge diff --git a/content/en/docs/clients/restful-api/edgelabel.md b/content/en/docs/clients/restful-api/edgelabel.md index 4906c687f..34e7bba0a 100644 --- a/content/en/docs/clients/restful-api/edgelabel.md +++ b/content/en/docs/clients/restful-api/edgelabel.md @@ -2,6 +2,7 @@ title: "EdgeLabel API" linkTitle: "EdgeLabel" weight: 4 +description: "EdgeLabel REST API: Define edge types and relationship constraints between source and target vertices to construct graph connection rules." --- ### 1.4 EdgeLabel diff --git a/content/en/docs/clients/restful-api/graphs.md b/content/en/docs/clients/restful-api/graphs.md index 9283708c3..269c48843 100644 --- a/content/en/docs/clients/restful-api/graphs.md +++ b/content/en/docs/clients/restful-api/graphs.md @@ -2,6 +2,7 @@ title: "Graphs API" linkTitle: "Graphs" weight: 12 +description: "Graphs REST API: Manage graph instance lifecycle including creating, querying, cloning, clearing, and deleting graph databases." --- ### 6.1 Graphs diff --git a/content/en/docs/clients/restful-api/graphspace.md b/content/en/docs/clients/restful-api/graphspace.md index 15eb1a91b..d38d056d4 100644 --- a/content/en/docs/clients/restful-api/graphspace.md +++ b/content/en/docs/clients/restful-api/graphspace.md @@ -2,6 +2,7 @@ title: "Graphspace API" linkTitle: "Graphspace" weight: 1 +description: "Graphspace REST API: Multi-tenancy and resource isolation for creating, viewing, updating, and deleting graph spaces with prerequisites and constraints." --- ### 2.0 Graphspace diff --git a/content/en/docs/clients/restful-api/gremlin.md b/content/en/docs/clients/restful-api/gremlin.md index f78ad082c..f308e58b1 100644 --- a/content/en/docs/clients/restful-api/gremlin.md +++ b/content/en/docs/clients/restful-api/gremlin.md @@ -2,6 +2,7 @@ title: "Gremlin API" linkTitle: "Gremlin" weight: 14 +description: "Gremlin REST API: Execute Gremlin graph traversal language scripts via HTTP interface." --- ### 8.1 Gremlin diff --git a/content/en/docs/clients/restful-api/indexlabel.md b/content/en/docs/clients/restful-api/indexlabel.md index 74320d37d..fbfb68dda 100644 --- a/content/en/docs/clients/restful-api/indexlabel.md +++ b/content/en/docs/clients/restful-api/indexlabel.md @@ -2,6 +2,7 @@ title: "IndexLabel API" linkTitle: "IndexLabel" weight: 5 +description: "IndexLabel REST API: Create indexes on vertex and edge properties to accelerate property-based queries and filtering operations." 
--- ### 1.5 IndexLabel diff --git a/content/en/docs/clients/restful-api/metrics.md b/content/en/docs/clients/restful-api/metrics.md index 16255b248..c0e74058e 100644 --- a/content/en/docs/clients/restful-api/metrics.md +++ b/content/en/docs/clients/restful-api/metrics.md @@ -2,7 +2,7 @@ title: "Metrics API" linkTitle: "Metrics" weight: 17 - +description: "Metrics REST API: Retrieve runtime performance metrics, statistics, and health status data of the system." --- diff --git a/content/en/docs/clients/restful-api/other.md b/content/en/docs/clients/restful-api/other.md index ed5135388..23b27d4b5 100644 --- a/content/en/docs/clients/restful-api/other.md +++ b/content/en/docs/clients/restful-api/other.md @@ -2,6 +2,7 @@ title: "Other API" linkTitle: "Other" weight: 18 +description: "Other REST API: Provide auxiliary functions such as system version query and API version information." --- ### 11.1 Other diff --git a/content/en/docs/clients/restful-api/propertykey.md b/content/en/docs/clients/restful-api/propertykey.md index 90c76414c..ec7888bff 100644 --- a/content/en/docs/clients/restful-api/propertykey.md +++ b/content/en/docs/clients/restful-api/propertykey.md @@ -2,6 +2,7 @@ title: "PropertyKey API" linkTitle: "PropertyKey" weight: 2 +description: "PropertyKey REST API: Define data types and cardinality constraints for all properties in the graph, serving as fundamental schema elements." --- ### 1.2 PropertyKey diff --git a/content/en/docs/clients/restful-api/rank.md b/content/en/docs/clients/restful-api/rank.md index e1dd71a4c..9e335292c 100644 --- a/content/en/docs/clients/restful-api/rank.md +++ b/content/en/docs/clients/restful-api/rank.md @@ -2,6 +2,7 @@ title: "Rank API" linkTitle: "Rank" weight: 10 +description: "Rank REST API: Execute graph node ranking algorithms such as PageRank and Personalized PageRank for centrality analysis." --- ### 4.1 Rank API overview diff --git a/content/en/docs/clients/restful-api/rebuild.md b/content/en/docs/clients/restful-api/rebuild.md index b2dbaf6f3..37b6ae120 100644 --- a/content/en/docs/clients/restful-api/rebuild.md +++ b/content/en/docs/clients/restful-api/rebuild.md @@ -2,6 +2,7 @@ title: "Rebuild API" linkTitle: "Rebuild" weight: 6 +description: "Rebuild REST API: Rebuild graph schema indexes to ensure consistency between index data and graph data." --- ### 1.6 Rebuild diff --git a/content/en/docs/clients/restful-api/schema.md b/content/en/docs/clients/restful-api/schema.md index 6364cd3e4..82a9fdc6f 100644 --- a/content/en/docs/clients/restful-api/schema.md +++ b/content/en/docs/clients/restful-api/schema.md @@ -2,6 +2,7 @@ title: "Schema API" linkTitle: "Schema" weight: 1 +description: "Schema REST API: Query the complete schema definition of a graph, including property keys, vertex labels, edge labels, and index labels." --- ### 1.1 Schema diff --git a/content/en/docs/clients/restful-api/task.md b/content/en/docs/clients/restful-api/task.md index 18f87d560..ef5097014 100644 --- a/content/en/docs/clients/restful-api/task.md +++ b/content/en/docs/clients/restful-api/task.md @@ -2,6 +2,7 @@ title: "Task API" linkTitle: "Task" weight: 13 +description: "Task REST API: Query and manage asynchronous task execution status for long-running operations like index rebuilding and graph traversals." 
--- ### 7.1 Task diff --git a/content/en/docs/clients/restful-api/traverser.md b/content/en/docs/clients/restful-api/traverser.md index 681166132..1f7a1e49f 100644 --- a/content/en/docs/clients/restful-api/traverser.md +++ b/content/en/docs/clients/restful-api/traverser.md @@ -2,6 +2,7 @@ title: "Traverser API" linkTitle: "Traverser" weight: 9 +description: "Traverser REST API: Execute complex graph algorithms and path queries including shortest path, k-neighbors, similarity computation, and advanced analytics." --- ### 3.1 Overview of Traverser API diff --git a/content/en/docs/clients/restful-api/variable.md b/content/en/docs/clients/restful-api/variable.md index 151498771..ad1141670 100644 --- a/content/en/docs/clients/restful-api/variable.md +++ b/content/en/docs/clients/restful-api/variable.md @@ -2,6 +2,7 @@ title: "Variable API" linkTitle: "Variable" weight: 11 +description: "Variable REST API: Store and manage key-value pairs as global variables for graph-level configuration and state management." --- ### 5.1 Variables diff --git a/content/en/docs/clients/restful-api/vertex.md b/content/en/docs/clients/restful-api/vertex.md index d016cb146..ab401f5da 100644 --- a/content/en/docs/clients/restful-api/vertex.md +++ b/content/en/docs/clients/restful-api/vertex.md @@ -2,6 +2,7 @@ title: "Vertex API" linkTitle: "Vertex" weight: 7 +description: "Vertex REST API: Create, query, update, and delete vertex data in the graph with support for batch operations and conditional filtering." --- ### 2.1 Vertex diff --git a/content/en/docs/clients/restful-api/vertexlabel.md b/content/en/docs/clients/restful-api/vertexlabel.md index 241497098..1369a5b73 100644 --- a/content/en/docs/clients/restful-api/vertexlabel.md +++ b/content/en/docs/clients/restful-api/vertexlabel.md @@ -2,6 +2,7 @@ title: "VertexLabel API" linkTitle: "VertexLabel" weight: 3 +description: "VertexLabel REST API: Define vertex types, ID strategies, and associated properties that determine vertex structure and constraints." --- ### 1.3 VertexLabel diff --git a/content/en/docs/guides/faq.md b/content/en/docs/guides/faq.md index 5dd41caf8..15f20e3f9 100644 --- a/content/en/docs/guides/faq.md +++ b/content/en/docs/guides/faq.md @@ -4,9 +4,13 @@ linkTitle: "FAQ" weight: 6 --- -- How to choose the back-end storage? Choose RocksDB, Cassandra, ScyllaDB, Hbase or Mysql? +- How to choose the back-end storage? RocksDB or distributed storage? - The choice of backend storage depends on specific needs. For installations on a single machine (node) with data volumes under 10 billion records, RocksDB is generally recommended. However, if a distributed backend is needed for scaling across multiple nodes, other options should be considered. ScyllaDB, designed as a drop-in replacement for Cassandra, offers protocol compatibility and better hardware utilization, often requiring less infrastructure. HBase, on the other hand, requires a Hadoop ecosystem to function effectively. Finally, while MySQL supports horizontal scaling, managing it in a distributed setup can be challenging. + HugeGraph supports multiple deployment modes. 
Choose based on your data scale and scenario: + - **Standalone Mode**: Server + RocksDB, suitable for development/testing and small to medium-scale data (< 1TB) + - **Distributed Mode**: HugeGraph-PD + HugeGraph-Store (HStore), supports horizontal scaling and high availability (100GB+ data scale), suitable for production environments and large-scale graph data applications + + Note: Cassandra, HBase, MySQL and other backends are only available in HugeGraph <= 1.5 versions and are no longer maintained by the official team - Prompt when starting the service: `xxx (core dumped) xxx` diff --git a/content/en/docs/quickstart/computing/_index.md b/content/en/docs/quickstart/computing/_index.md index 2bac28f7d..80bbd5a7b 100644 --- a/content/en/docs/quickstart/computing/_index.md +++ b/content/en/docs/quickstart/computing/_index.md @@ -4,8 +4,8 @@ linkTitle: "HugeGraph Computing (OLAP)" weight: 4 --- -## Recommended: Use DeepWiki Documentation - > DeepWiki provides real-time updated project documentation with more comprehensive and accurate content, suitable for quickly understanding the latest project information. +> +> 📖 [https://deepwiki.com/apache/hugegraph-computer](https://deepwiki.com/apache/hugegraph-computer) -**Visit:** [**hugegraph-computer**](https://deepwiki.com/apache/hugegraph-computer) \ No newline at end of file +**GitHub Access:** [https://github.com/apache/hugegraph-computer](https://github.com/apache/hugegraph-computer) \ No newline at end of file diff --git a/content/en/docs/quickstart/hugegraph-ai/_index.md b/content/en/docs/quickstart/hugegraph-ai/_index.md index 4c082f04e..47d267e16 100644 --- a/content/en/docs/quickstart/hugegraph-ai/_index.md +++ b/content/en/docs/quickstart/hugegraph-ai/_index.md @@ -7,11 +7,9 @@ weight: 3 [![License](https://img.shields.io/badge/license-Apache%202-0E78BA.svg)](https://www.apache.org/licenses/LICENSE-2.0.html) [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/apache/incubator-hugegraph-ai) -## Recommended: Use DeepWiki Documentation - > DeepWiki provides real-time updated project documentation with more comprehensive and accurate content, suitable for quickly understanding the latest project information. - -**Visit:** [**incubator-hugegraph-ai**](https://deepwiki.com/apache/incubator-hugegraph-ai) +> +> 📖 [https://deepwiki.com/apache/incubator-hugegraph-ai](https://deepwiki.com/apache/incubator-hugegraph-ai) `hugegraph-ai` integrates [HugeGraph](https://github.com/apache/hugegraph) with artificial intelligence capabilities, providing comprehensive support for developers to build AI-powered graph applications. diff --git a/content/en/docs/quickstart/toolchain/_index.md b/content/en/docs/quickstart/toolchain/_index.md index 8076137fc..c2030ea39 100644 --- a/content/en/docs/quickstart/toolchain/_index.md +++ b/content/en/docs/quickstart/toolchain/_index.md @@ -6,8 +6,8 @@ weight: 2 > **Testing Guide**: For running toolchain tests locally, please refer to [HugeGraph Toolchain Local Testing Guide](/docs/guides/toolchain-local-test) -## Recommended: Use DeepWiki Documentation - > DeepWiki provides real-time updated project documentation with more comprehensive and accurate content, suitable for quickly understanding the latest project information. 
+> +> 📖 [https://deepwiki.com/apache/hugegraph-toolchain](https://deepwiki.com/apache/hugegraph-toolchain) -**Visit:** [**hugegraph-toolchain**](https://deepwiki.com/apache/hugegraph-toolchain) \ No newline at end of file +**GitHub Access:** [https://github.com/apache/hugegraph-toolchain](https://github.com/apache/hugegraph-toolchain) \ No newline at end of file From c93a6e1a372a9b0b673091aa0970a0103208d0e7 Mon Sep 17 00:00:00 2001 From: imbajin Date: Mon, 2 Feb 2026 16:38:02 +0800 Subject: [PATCH 09/10] Update config docs titles and layout Revise configuration documentation for both CN and EN sites: update page titles and linkTitle fields for config guide and options, enrich the config index pages with Server configuration overview and links (including auth/HTTPS), and move/rename the Computer config page from config/ to quickstart/computing/ with updated title, linkTitle and weight. Also adjust wording about distributed deployment scale (changed from "100GB+" to "< 1000TB") in FAQ and introduction pages for both languages. --- content/cn/docs/config/_index.md | 15 ++++++++++++--- content/cn/docs/config/config-guide.md | 4 ++-- content/cn/docs/config/config-option.md | 4 ++-- content/cn/docs/guides/faq.md | 2 +- content/cn/docs/introduction/README.md | 2 +- .../computing/hugegraph-computer-config.md} | 6 +++--- content/en/docs/config/_index.md | 15 ++++++++++++--- content/en/docs/config/config-guide.md | 4 ++-- content/en/docs/config/config-option.md | 4 ++-- content/en/docs/guides/faq.md | 2 +- content/en/docs/introduction/README.md | 2 +- .../computing/hugegraph-computer-config.md} | 6 +++--- 12 files changed, 42 insertions(+), 24 deletions(-) rename content/cn/docs/{config/config-computer.md => quickstart/computing/hugegraph-computer-config.md} (99%) rename content/en/docs/{config/config-computer.md => quickstart/computing/hugegraph-computer-config.md} (99%) diff --git a/content/cn/docs/config/_index.md b/content/cn/docs/config/_index.md index 04db80c57..69c093639 100644 --- a/content/cn/docs/config/_index.md +++ b/content/cn/docs/config/_index.md @@ -1,5 +1,14 @@ --- -title: "Config" -linkTitle: "Config" +title: "HugeGraph-Server 配置" +linkTitle: "Server 配置" weight: 4 ---- \ No newline at end of file +--- + +本节介绍 HugeGraph-Server 的配置方法,包括: + +- **[配置入门指南](config-guide)** - 了解配置文件结构和基本配置方法 +- **[配置参考手册](config-option)** - 完整的配置选项列表和说明 +- **[权限配置](config-authentication)** - 用户认证和授权配置 +- **[HTTPS 配置](config-https)** - 启用 HTTPS 安全协议 + +> 如需了解 HugeGraph-Computer (OLAP) 的配置,请参阅 [Computer 配置参考](/cn/docs/quickstart/computing/hugegraph-computer-config)。 \ No newline at end of file diff --git a/content/cn/docs/config/config-guide.md b/content/cn/docs/config/config-guide.md index f791fee37..154c15993 100644 --- a/content/cn/docs/config/config-guide.md +++ b/content/cn/docs/config/config-guide.md @@ -1,6 +1,6 @@ --- -title: "HugeGraph 配置" -linkTitle: "参数配置" +title: "HugeGraph 配置快速入门" +linkTitle: "配置入门指南" weight: 1 --- diff --git a/content/cn/docs/config/config-option.md b/content/cn/docs/config/config-option.md index 238ac8f50..4159b7443 100644 --- a/content/cn/docs/config/config-option.md +++ b/content/cn/docs/config/config-option.md @@ -1,6 +1,6 @@ --- -title: "HugeGraph 配置项" -linkTitle: "配置项列表" +title: "HugeGraph 配置参考手册" +linkTitle: "配置参考手册" weight: 2 --- diff --git a/content/cn/docs/guides/faq.md b/content/cn/docs/guides/faq.md index d28c980a6..cfd26d43c 100644 --- a/content/cn/docs/guides/faq.md +++ b/content/cn/docs/guides/faq.md @@ -8,7 +8,7 @@ weight: 6 HugeGraph 支持多种部署模式,根据数据规模和场景选择: - 
**单机模式**:Server + RocksDB,适合开发测试和中小规模数据(< 1TB) - - **分布式模式**:HugeGraph-PD + HugeGraph-Store (HStore),支持水平扩展和高可用(100GB+ 数据规模),适合生产环境和大规模图数据应用 + - **分布式模式**:HugeGraph-PD + HugeGraph-Store (HStore),支持水平扩展和高可用(< 1000TB 数据规模),适合生产环境和大规模图数据应用 注:Cassandra、HBase、MySQL 等后端仅在 HugeGraph <= 1.5 版本中可用,官方后续不再单独维护 diff --git a/content/cn/docs/introduction/README.md b/content/cn/docs/introduction/README.md index 301712cde..ffb705964 100644 --- a/content/cn/docs/introduction/README.md +++ b/content/cn/docs/introduction/README.md @@ -43,7 +43,7 @@ HugeGraph 支持多种部署模式,满足不同规模和场景的需求: **分布式模式 (Distributed)** - HugeGraph-PD: 元数据管理和集群调度 - HugeGraph-Store (HStore): 分布式存储引擎 -- 支持水平扩展和高可用(100GB+ 数据规模) +- 支持水平扩展和高可用(< 1000TB 数据规模) - 适合生产环境和大规模图数据应用 ### 快速入门指南 diff --git a/content/cn/docs/config/config-computer.md b/content/cn/docs/quickstart/computing/hugegraph-computer-config.md similarity index 99% rename from content/cn/docs/config/config-computer.md rename to content/cn/docs/quickstart/computing/hugegraph-computer-config.md index 08c0439e0..783446d73 100644 --- a/content/cn/docs/config/config-computer.md +++ b/content/cn/docs/quickstart/computing/hugegraph-computer-config.md @@ -1,7 +1,7 @@ --- -title: "HugeGraph-Computer 配置" -linkTitle: "图计算 Computer 配置" -weight: 5 +title: "HugeGraph-Computer 配置参考" +linkTitle: "Computer 配置参考" +weight: 3 --- ### Computer 配置选项 diff --git a/content/en/docs/config/_index.md b/content/en/docs/config/_index.md index 04db80c57..b54e73f56 100644 --- a/content/en/docs/config/_index.md +++ b/content/en/docs/config/_index.md @@ -1,5 +1,14 @@ --- -title: "Config" -linkTitle: "Config" +title: "HugeGraph-Server Configuration" +linkTitle: "Server Config" weight: 4 ---- \ No newline at end of file +--- + +This section covers HugeGraph-Server configuration, including: + +- **[Configuration Guide](config-guide)** - Understand config file structure and basic setup +- **[Configuration Reference](config-option)** - Complete list of configuration options +- **[Authentication Config](config-authentication)** - User authentication and authorization +- **[HTTPS Config](config-https)** - Enable HTTPS secure protocol + +> For HugeGraph-Computer (OLAP) configuration, see [Computer Config Reference](/docs/quickstart/computing/hugegraph-computer-config). \ No newline at end of file diff --git a/content/en/docs/config/config-guide.md b/content/en/docs/config/config-guide.md index 48e5e08ca..cd4cc8e96 100644 --- a/content/en/docs/config/config-guide.md +++ b/content/en/docs/config/config-guide.md @@ -1,6 +1,6 @@ --- -title: "HugeGraph configuration" -linkTitle: "Config Guide" +title: "HugeGraph Configuration Quick Start" +linkTitle: "Configuration Guide" weight: 1 --- diff --git a/content/en/docs/config/config-option.md b/content/en/docs/config/config-option.md index bfba7d97e..2c25337c1 100644 --- a/content/en/docs/config/config-option.md +++ b/content/en/docs/config/config-option.md @@ -1,6 +1,6 @@ --- -title: "HugeGraph Config Options" -linkTitle: "Config Options" +title: "HugeGraph Configuration Reference" +linkTitle: "Configuration Reference" weight: 2 --- diff --git a/content/en/docs/guides/faq.md b/content/en/docs/guides/faq.md index 15f20e3f9..b4b6cc502 100644 --- a/content/en/docs/guides/faq.md +++ b/content/en/docs/guides/faq.md @@ -8,7 +8,7 @@ weight: 6 HugeGraph supports multiple deployment modes. 
Choose based on your data scale and scenario: - **Standalone Mode**: Server + RocksDB, suitable for development/testing and small to medium-scale data (< 1TB) - - **Distributed Mode**: HugeGraph-PD + HugeGraph-Store (HStore), supports horizontal scaling and high availability (100GB+ data scale), suitable for production environments and large-scale graph data applications + - **Distributed Mode**: HugeGraph-PD + HugeGraph-Store (HStore), supports horizontal scaling and high availability (< 1000TB data scale), suitable for production environments and large-scale graph data applications Note: Cassandra, HBase, MySQL and other backends are only available in HugeGraph <= 1.5 versions and are no longer maintained by the official team diff --git a/content/en/docs/introduction/README.md b/content/en/docs/introduction/README.md index 9186e2a60..4f3cc5514 100644 --- a/content/en/docs/introduction/README.md +++ b/content/en/docs/introduction/README.md @@ -40,7 +40,7 @@ HugeGraph supports multiple deployment modes to meet different scales and scenar **Distributed Mode** - HugeGraph-PD: Metadata management and cluster scheduling - HugeGraph-Store (HStore): Distributed storage engine -- Supports horizontal scaling and high availability (100GB+ data scale) +- Supports horizontal scaling and high availability (< 1000TB data scale) - Suitable for production environments and large-scale graph data applications ### Quick Start Guide diff --git a/content/en/docs/config/config-computer.md b/content/en/docs/quickstart/computing/hugegraph-computer-config.md similarity index 99% rename from content/en/docs/config/config-computer.md rename to content/en/docs/quickstart/computing/hugegraph-computer-config.md index 3c97d796a..42afdb2f1 100644 --- a/content/en/docs/config/config-computer.md +++ b/content/en/docs/quickstart/computing/hugegraph-computer-config.md @@ -1,7 +1,7 @@ --- -title: "HugeGraph-Computer Config" -linkTitle: "Config Computer" -weight: 5 +title: "HugeGraph-Computer Configuration Reference" +linkTitle: "Computer Config Reference" +weight: 3 --- ### Computer Config Options From c80b604eb7c1cb76befdba4f745bded20978827c Mon Sep 17 00:00:00 2001 From: imbajin Date: Mon, 2 Feb 2026 16:51:06 +0800 Subject: [PATCH 10/10] Update config docs: titles and reformat sections MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Rename config guide and reference titles (English & Chinese) to "Server Startup Guide" and "Server Complete Configuration Manual"; reorganize config-option pages by converting optional/config blocks (K8s, Arthas, RPC server, HBase, Cassandra/ScyllaDB, MySQL/PostgreSQL) into collapsible
sections and group deprecated backends under a "≤ 1.5 Version Config (Legacy)" section. Also adjust index bullets ordering/labels. These changes improve readability and clearly mark legacy backend configs. --- content/cn/docs/config/_index.md | 8 +- content/cn/docs/config/config-guide.md | 4 +- content/cn/docs/config/config-option.md | 172 ++++++++++++++---------- content/en/docs/config/_index.md | 8 +- content/en/docs/config/config-guide.md | 4 +- content/en/docs/config/config-option.md | 151 +++++++++++---------- 6 files changed, 193 insertions(+), 154 deletions(-) diff --git a/content/cn/docs/config/_index.md b/content/cn/docs/config/_index.md index 69c093639..ce83888d7 100644 --- a/content/cn/docs/config/_index.md +++ b/content/cn/docs/config/_index.md @@ -6,9 +6,7 @@ weight: 4 本节介绍 HugeGraph-Server 的配置方法,包括: -- **[配置入门指南](config-guide)** - 了解配置文件结构和基本配置方法 -- **[配置参考手册](config-option)** - 完整的配置选项列表和说明 +- **[Server 启动指南](config-guide)** - 了解配置文件结构和基本配置方法 +- **[Server 完整配置手册](config-option)** - 完整的配置选项列表和说明 - **[权限配置](config-authentication)** - 用户认证和授权配置 -- **[HTTPS 配置](config-https)** - 启用 HTTPS 安全协议 - -> 如需了解 HugeGraph-Computer (OLAP) 的配置,请参阅 [Computer 配置参考](/cn/docs/quickstart/computing/hugegraph-computer-config)。 \ No newline at end of file +- **[HTTPS 配置](config-https)** - 启用 HTTPS 安全协议 \ No newline at end of file diff --git a/content/cn/docs/config/config-guide.md b/content/cn/docs/config/config-guide.md index 154c15993..43517a45c 100644 --- a/content/cn/docs/config/config-guide.md +++ b/content/cn/docs/config/config-guide.md @@ -1,6 +1,6 @@ --- -title: "HugeGraph 配置快速入门" -linkTitle: "配置入门指南" +title: "Server 启动指南" +linkTitle: "Server 启动指南" weight: 1 --- diff --git a/content/cn/docs/config/config-option.md b/content/cn/docs/config/config-option.md index 4159b7443..7719ac4b7 100644 --- a/content/cn/docs/config/config-option.md +++ b/content/cn/docs/config/config-option.md @@ -1,6 +1,6 @@ --- -title: "HugeGraph 配置参考手册" -linkTitle: "配置参考手册" +title: "Server 完整配置手册" +linkTitle: "Server 完整配置手册" weight: 2 --- @@ -54,16 +54,6 @@ weight: 2 | memory_monitor.period | 2000 | The period in ms of JVM(in-heap) memory usage monitoring. | | log.slow_query_threshold | 1000 | Slow query log threshold in milliseconds, 0 means disabled. | -### K8s 配置项 (可选) - -对应配置文件`rest-server.properties` - -| config option | default value | description | -|------------------|-------------------------------|------------------------------------------| -| server.use_k8s | false | Whether to enable K8s multi-tenancy mode. | -| k8s.namespace | hugegraph-computer-system | K8s namespace for compute jobs. | -| k8s.kubeconfig | | Path to kubeconfig file. | - ### PD/Meta 配置项 (分布式模式) 对应配置文件`rest-server.properties` @@ -73,16 +63,6 @@ weight: 2 | pd.peers | 127.0.0.1:8686 | PD server addresses (comma separated). | | meta.endpoints | http://127.0.0.1:2379 | Meta service endpoints. | -### Arthas 诊断配置项 (可选) - -对应配置文件`rest-server.properties` - -| config option | default value | description | -|--------------------|---------------|-----------------------| -| arthas.telnetPort | 8562 | Arthas telnet port. | -| arthas.httpPort | 8561 | Arthas HTTP port. | -| arthas.ip | 0.0.0.0 | Arthas bind IP. 
| - ### 基本配置项 基本配置项及后端配置项对应配置文件:{graph-name}.properties,如`hugegraph.properties` @@ -159,51 +139,6 @@ weight: 2 | raft.rpc_buf_high_water_mark | 20971520 | The ChannelOutboundBuffer's high water mark of netty, only when buffer size exceed this size, the method ChannelOutboundBuffer.isWritable() will return false, it means that the downstream pressure is too great to process the request or network is very congestion, upstream needs to limit rate at this time. | | raft.read_strategy | ReadOnlyLeaseBased | The linearizability of read strategy. | -### RPC server 配置 - -| config option | default value | description | -|-----------------------------|-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| rpc.client_connect_timeout | 20 | The timeout(in seconds) of rpc client connect to rpc server. | -| rpc.client_load_balancer | consistentHash | The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is 'consistentHash', means forwarding by request parameters. | -| rpc.client_read_timeout | 40 | The timeout(in seconds) of rpc client read from rpc server. | -| rpc.client_reconnect_period | 10 | The period(in seconds) of rpc client reconnect to rpc server. | -| rpc.client_retries | 3 | Failed retry number of rpc client calls to rpc server. | -| rpc.config_order | 999 | Sofa rpc configuration file loading order, the larger the more later loading. | -| rpc.logger_impl | com.alipay.sofa.rpc.log.SLF4JLoggerImpl | Sofa rpc log implementation class. | -| rpc.protocol | bolt | Rpc communication protocol, client and server need to be specified the same value. | -| rpc.remote_url | | The remote urls of rpc peers, it can be set to multiple addresses, which are concat by ',', empty value means not enabled. | -| rpc.server_adaptive_port | false | Whether the bound port is adaptive, if it's enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts. | -| rpc.server_host | | The hosts/ips bound by rpc server to provide services, empty value means not enabled. | -| rpc.server_port | 8090 | The port bound by rpc server to provide services. | -| rpc.server_timeout | 30 | The timeout(in seconds) of rpc server execution. | - -### Cassandra 后端配置项 - -| config option | default value | description | -|--------------------------------|----------------|------------------------------------------------------------------------------------------------------------------------------------------------| -| backend | | Must be set to `cassandra`. | -| serializer | | Must be set to `cassandra`. | -| cassandra.host | localhost | The seeds hostname or ip address of cassandra cluster. | -| cassandra.port | 9042 | The seeds port address of cassandra cluster. | -| cassandra.connect_timeout | 5 | The cassandra driver connect server timeout(seconds). | -| cassandra.read_timeout | 20 | The cassandra driver read from server timeout(seconds). | -| cassandra.keyspace.strategy | SimpleStrategy | The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy. | -| cassandra.keyspace.replication | [3] | The keyspace replication factor of SimpleStrategy, like '[3]'.Or replicas in each datacenter of NetworkTopologyStrategy, like '[dc1:2,dc2:1]'. 
| -| cassandra.username | | The username to use to login to cassandra cluster. | -| cassandra.password | | The password corresponding to cassandra.username. | -| cassandra.compression_type | none | The compression algorithm of cassandra transport: none/snappy/lz4. | -| cassandra.jmx_port=7199 | 7199 | The port of JMX API service for cassandra. | -| cassandra.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. | - -### ScyllaDB 后端配置项 - -| config option | default value | description | -|---------------|---------------|----------------------------| -| backend | | Must be set to `scylladb`. | -| serializer | | Must be set to `scylladb`. | - -其它与 Cassandra 后端一致。 - ### RocksDB 后端配置项 | config option | default value | description | @@ -260,7 +195,55 @@ weight: 2 | rocksdb.level0_stop_writes_trigger | 36 | Hard limit on number of level-0 files for stopping writes. | | rocksdb.soft_pending_compaction_bytes_limit | 68719476736 | The soft limit to impose on pending compaction in bytes. | -### HBase 后端配置项 +
+<details>
+<summary>K8s 配置项 (可选)</summary>
+
+对应配置文件`rest-server.properties`
+
+| config option    | default value                 | description                               |
+|------------------|-------------------------------|-------------------------------------------|
+| server.use_k8s   | false                         | Whether to enable K8s multi-tenancy mode. |
+| k8s.namespace    | hugegraph-computer-system     | K8s namespace for compute jobs.           |
+| k8s.kubeconfig   |                               | Path to kubeconfig file.                  |
+
+</details>
+ +
+<details>
+<summary>Arthas 诊断配置项 (可选)</summary>
+
+对应配置文件`rest-server.properties`
+
+| config option      | default value | description           |
+|--------------------|---------------|-----------------------|
+| arthas.telnetPort  | 8562          | Arthas telnet port.   |
+| arthas.httpPort    | 8561          | Arthas HTTP port.     |
+| arthas.ip          | 0.0.0.0       | Arthas bind IP.       |
+
+</details>
+ +
+<details>
+<summary>RPC Server 配置</summary>
+
+| config option               | default value                           | description |
+|-----------------------------|-----------------------------------------|-------------|
+| rpc.client_connect_timeout  | 20                                      | The timeout(in seconds) of rpc client connect to rpc server. |
+| rpc.client_load_balancer    | consistentHash                          | The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is 'consistentHash', means forwarding by request parameters. |
+| rpc.client_read_timeout     | 40                                      | The timeout(in seconds) of rpc client read from rpc server. |
+| rpc.client_reconnect_period | 10                                      | The period(in seconds) of rpc client reconnect to rpc server. |
+| rpc.client_retries          | 3                                       | Failed retry number of rpc client calls to rpc server. |
+| rpc.config_order            | 999                                     | Sofa rpc configuration file loading order, the larger the more later loading. |
+| rpc.logger_impl             | com.alipay.sofa.rpc.log.SLF4JLoggerImpl | Sofa rpc log implementation class. |
+| rpc.protocol                | bolt                                    | Rpc communication protocol, client and server need to be specified the same value. |
+| rpc.remote_url              |                                         | The remote urls of rpc peers, it can be set to multiple addresses, which are concat by ',', empty value means not enabled. |
+| rpc.server_adaptive_port    | false                                   | Whether the bound port is adaptive, if it's enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts. |
+| rpc.server_host             |                                         | The hosts/ips bound by rpc server to provide services, empty value means not enabled. |
+| rpc.server_port             | 8090                                    | The port bound by rpc server to provide services. |
+| rpc.server_timeout          | 30                                      | The timeout(in seconds) of rpc server execution. |
+
+</details>
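To make the interplay of these options concrete, here is a minimal sketch of an RPC setup in `rest-server.properties`; the IP addresses and ports below are illustrative placeholders, not values taken from this patch:

```properties
# Expose this server's rpc endpoint (an empty rpc.server_host leaves it disabled)
rpc.server_host=192.168.1.10
rpc.server_port=8090
# Peer rpc servers, comma-separated; an empty value means remote rpc is not enabled
rpc.remote_url=192.168.1.11:8090,192.168.1.12:8090
# Client-side timeouts in seconds, matching the defaults documented above
rpc.client_connect_timeout=20
rpc.client_read_timeout=40
```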
+ +
+<details>
+<summary>HBase 后端配置项</summary>

| config option             | default value                  | description                                                               |
|---------------------------|--------------------------------|---------------------------------------------------------------------------|
| hbase.hosts               | localhost                      | The hostnames or ips of HBase zookeeper, separated with commas.           |
| hbase.port                | 2181                           | The port address of HBase zookeeper.                                      |
| hbase.znode_parent        | /hbase                         | The znode parent path of HBase zookeeper.                                 |
| hbase.threads_max         | 64                             | The max threads num of hbase connections.                                 |
| hbase.namespace           | hugegraph                      | The namespace of hbase.                                                   |
| hbase.secure              | false                          | Is hbase cluster in secure mode.                                          |
| hbase.kerberos_enable     | false                          | Is kerberos authentication enabled for hbase.                             |
| hbase.kerberos_keytab     |                                | The kerberos keytab file path.                                            |
| hbase.kerberos_principal  |                                | The kerberos principal.                                                   |
| hbase.krb5_conf           | etc/krb5.conf                  | Kerberos configuration file path.                                         |
| hbase.hbase_site          | /etc/hbase/conf/hbase-site.xml | The HBase's configuration file path.                                      |
| hbase.enable_partition    | true                           | Is pre-split partitioning enabled for hbase.                              |
| hbase.vertex_partitions   | 10                             | The number of partitions of the HBase vertex table.                       |
| hbase.edge_partitions     | 30                             | The number of partitions of the HBase edge table.                         |

-### MySQL & PostgreSQL 后端配置项
+</details>
+ + +--- + +## ≤ 1.5 版本配置 (Legacy) + +以下后端存储在 1.7.0+ 版本中不再支持,仅在 1.5.x 及更早版本中可用: + +
+<details>
+<summary>Cassandra 后端配置项</summary>
+
+| config option                  | default value  | description |
+|--------------------------------|----------------|-------------|
+| backend                        |                | Must be set to `cassandra`. |
+| serializer                     |                | Must be set to `cassandra`. |
+| cassandra.host                 | localhost      | The seeds hostname or ip address of cassandra cluster. |
+| cassandra.port                 | 9042           | The seeds port address of cassandra cluster. |
+| cassandra.connect_timeout      | 5              | The cassandra driver connect server timeout(seconds). |
+| cassandra.read_timeout         | 20             | The cassandra driver read from server timeout(seconds). |
+| cassandra.keyspace.strategy    | SimpleStrategy | The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy. |
+| cassandra.keyspace.replication | [3]            | The keyspace replication factor of SimpleStrategy, like '[3]'. Or replicas in each datacenter of NetworkTopologyStrategy, like '[dc1:2,dc2:1]'. |
+| cassandra.username             |                | The username to use to login to cassandra cluster. |
+| cassandra.password             |                | The password corresponding to cassandra.username. |
+| cassandra.compression_type     | none           | The compression algorithm of cassandra transport: none/snappy/lz4. |
+| cassandra.jmx_port=7199        | 7199           | The port of JMX API service for cassandra. |
+| cassandra.aggregation_timeout  | 43200          | The timeout in seconds of waiting for aggregation. |
+
+</details>
+ +
+<details>
+<summary>ScyllaDB 后端配置项</summary>
+
+| config option | default value | description                |
+|---------------|---------------|----------------------------|
+| backend       |               | Must be set to `scylladb`. |
+| serializer    |               | Must be set to `scylladb`. |
+
+其它与 Cassandra 后端一致。
+
+</details>
+ +
+<details>
+<summary>MySQL & PostgreSQL 后端配置项</summary>

| config option                    | default value               | description                                                                           |
|----------------------------------|-----------------------------|---------------------------------------------------------------------------------------|
| jdbc.driver                      | com.mysql.jdbc.Driver       | The JDBC driver class to connect database.                                            |
| jdbc.url                         | jdbc:mysql://127.0.0.1:3306 | The url of database in JDBC format.                                                   |
| jdbc.username                    | root                        | The username to login database.                                                       |
| jdbc.password                    | ******                      | The password corresponding to jdbc.username.                                          |
| jdbc.ssl_mode                    | false                       | The SSL mode of connections with database.                                            |
| jdbc.reconnect_interval          | 3                           | The interval(seconds) between reconnections when the database connection fails.       |
| jdbc.reconnect_max_times         | 3                           | The reconnect times when the database connection fails.                               |
| jdbc.storage_engine              | InnoDB                      | The storage engine of backend store database, like InnoDB/MyISAM/RocksDB for MySQL.   |
| jdbc.postgresql.connect_database | template1                   | The database used to connect when init store, drop store or check store exist.        |

-### PostgreSQL 后端配置项
+</details>
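For readers still on the legacy backends, a minimal sketch of a MySQL `hugegraph.properties` built from the jdbc options above; the credentials are placeholders, and `backend=mysql`/`serializer=mysql` is assumed by analogy with the other backends rather than stated in this table:

```properties
# Legacy (HugeGraph <= 1.5) MySQL backend -- illustrative values only
backend=mysql
serializer=mysql
jdbc.driver=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://127.0.0.1:3306
jdbc.username=root
jdbc.password=******
jdbc.storage_engine=InnoDB
```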
+ +
+<details>
+<summary>PostgreSQL 后端配置项</summary>

| config option | default value | description                  |
|---------------|---------------|------------------------------|
| backend       |               | Must be set to `postgresql`. |
| serializer    |               | Must be set to `postgresql`. |

其它与 MySQL 后端一致。

> PostgreSQL 后端的 driver 和 url 应该设置为:
> - `jdbc.driver=org.postgresql.Driver`
> - `jdbc.url=jdbc:postgresql://localhost:5432/`
+
+</details>
+
diff --git a/content/en/docs/config/_index.md b/content/en/docs/config/_index.md
index b54e73f56..b79b5af96 100644
--- a/content/en/docs/config/_index.md
+++ b/content/en/docs/config/_index.md
@@ -6,9 +6,7 @@ weight: 4
 
 This section covers HugeGraph-Server configuration, including:
 
-- **[Configuration Guide](config-guide)** - Understand config file structure and basic setup
-- **[Configuration Reference](config-option)** - Complete list of configuration options
+- **[Server Startup Guide](config-guide)** - Understand config file structure and basic setup
+- **[Complete Server Configuration Manual](config-option)** - Complete list of configuration options
 - **[Authentication Config](config-authentication)** - User authentication and authorization
-- **[HTTPS Config](config-https)** - Enable HTTPS secure protocol
-
-> For HugeGraph-Computer (OLAP) configuration, see [Computer Config Reference](/docs/quickstart/computing/hugegraph-computer-config).
\ No newline at end of file
+- **[HTTPS Config](config-https)** - Enable HTTPS secure protocol
\ No newline at end of file
diff --git a/content/en/docs/config/config-guide.md b/content/en/docs/config/config-guide.md
index cd4cc8e96..8857010dc 100644
--- a/content/en/docs/config/config-guide.md
+++ b/content/en/docs/config/config-guide.md
@@ -1,6 +1,6 @@
 ---
-title: "HugeGraph Configuration Quick Start"
-linkTitle: "Configuration Guide"
+title: "Server Startup Guide"
+linkTitle: "Server Startup Guide"
 weight: 1
 ---
 
diff --git a/content/en/docs/config/config-option.md b/content/en/docs/config/config-option.md
index 2c25337c1..e6a074e28 100644
--- a/content/en/docs/config/config-option.md
+++ b/content/en/docs/config/config-option.md
@@ -1,6 +1,6 @@
 ---
-title: "HugeGraph Configuration Reference"
-linkTitle: "Configuration Reference"
+title: "Complete Server Configuration Manual"
+linkTitle: "Complete Server Configuration Manual"
 weight: 2
 ---
 
@@ -54,16 +54,6 @@ Corresponding configuration file `rest-server.properties`
 | memory_monitor.period | 2000 | The period in ms of JVM(in-heap) memory usage monitoring. |
 | log.slow_query_threshold | 1000 | Slow query log threshold in milliseconds, 0 means disabled. |
 
-### K8s Config Options (Optional)
-
-Corresponding configuration file `rest-server.properties`
-
-| config option | default value | description |
-|------------------|-------------------------------|------------------------------------------|
-| server.use_k8s | false | Whether to enable K8s multi-tenancy mode. |
-| k8s.namespace | hugegraph-computer-system | K8s namespace for compute jobs. |
-| k8s.kubeconfig | | Path to kubeconfig file. |
-
 ### PD/Meta Config Options (Distributed Mode)
 
 Corresponding configuration file `rest-server.properties`
@@ -73,16 +63,6 @@ Corresponding configuration file `rest-server.properties`
 | pd.peers | 127.0.0.1:8686 | PD server addresses (comma separated). |
 | meta.endpoints | http://127.0.0.1:2379 | Meta service endpoints. |
 
-### Arthas Diagnostic Config Options (Optional)
-
-Corresponding configuration file `rest-server.properties`
-
-| config option | default value | description |
-|--------------------|---------------|-----------------------|
-| arthas.telnetPort | 8562 | Arthas telnet port. |
-| arthas.httpPort | 8561 | Arthas HTTP port. |
-| arthas.ip | 0.0.0.0 | Arthas bind IP. |
-
 ### Basic Config Options
 
 Basic Config Options and Backend Config Options correspond to configuration files:{graph-name}.properties, such as `hugegraph.properties`
 
@@ -159,51 +139,6 @@ Basic Config Options and Backend Config Options correspond to configuration file
 | raft.rpc_buf_high_water_mark | 20971520 | The ChannelOutboundBuffer's high water mark of netty, only when buffer size exceed this size, the method ChannelOutboundBuffer.isWritable() will return false, it means that the downstream pressure is too great to process the request or network is very congestion, upstream needs to limit rate at this time. |
 | raft.read_strategy | ReadOnlyLeaseBased | The linearizability of read strategy. |
 
-### RPC server Config Options
-
-| config option | default value | description |
-|-----------------------------|-----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| rpc.client_connect_timeout | 20 | The timeout(in seconds) of rpc client connect to rpc server. |
-| rpc.client_load_balancer | consistentHash | The rpc client uses a load-balancing algorithm to access multiple rpc servers in one cluster. Default value is 'consistentHash', means forwarding by request parameters. |
-| rpc.client_read_timeout | 40 | The timeout(in seconds) of rpc client read from rpc server. |
-| rpc.client_reconnect_period | 10 | The period(in seconds) of rpc client reconnect to rpc server. |
-| rpc.client_retries | 3 | Failed retry number of rpc client calls to rpc server. |
-| rpc.config_order | 999 | Sofa rpc configuration file loading order, the larger the more later loading. |
-| rpc.logger_impl | com.alipay.sofa.rpc.log.SLF4JLoggerImpl | Sofa rpc log implementation class. |
-| rpc.protocol | bolt | Rpc communication protocol, client and server need to be specified the same value. |
-| rpc.remote_url | | The remote urls of rpc peers, it can be set to multiple addresses, which are concat by ',', empty value means not enabled. |
-| rpc.server_adaptive_port | false | Whether the bound port is adaptive, if it's enabled, when the port is in use, automatically +1 to detect the next available port. Note that this process is not atomic, so there may still be port conflicts. |
-| rpc.server_host | | The hosts/ips bound by rpc server to provide services, empty value means not enabled. |
-| rpc.server_port | 8090 | The port bound by rpc server to provide services. |
-| rpc.server_timeout | 30 | The timeout(in seconds) of rpc server execution. |
-
-### Cassandra Backend Config Options
-
-| config option | default value | description |
-|--------------------------------|----------------|------------------------------------------------------------------------------------------------------------------------------------------------|
-| backend | | Must be set to `cassandra`. |
-| serializer | | Must be set to `cassandra`. |
-| cassandra.host | localhost | The seeds hostname or ip address of cassandra cluster. |
-| cassandra.port | 9042 | The seeds port address of cassandra cluster. |
-| cassandra.connect_timeout | 5 | The cassandra driver connect server timeout(seconds). |
-| cassandra.read_timeout | 20 | The cassandra driver read from server timeout(seconds). |
-| cassandra.keyspace.strategy | SimpleStrategy | The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy. |
-| cassandra.keyspace.replication | [3] | The keyspace replication factor of SimpleStrategy, like '[3]'.Or replicas in each datacenter of NetworkTopologyStrategy, like '[dc1:2,dc2:1]'. |
-| cassandra.username | | The username to use to login to cassandra cluster. |
-| cassandra.password | | The password corresponding to cassandra.username. |
-| cassandra.compression_type | none | The compression algorithm of cassandra transport: none/snappy/lz4. |
-| cassandra.jmx_port=7199 | 7199 | The port of JMX API service for cassandra. |
-| cassandra.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
-
-### ScyllaDB Backend Config Options
-
-| config option | default value | description |
-|---------------|---------------|----------------------------|
-| backend | | Must be set to `scylladb`. |
-| serializer | | Must be set to `scylladb`. |
-
-Other options are consistent with the Cassandra backend.
-
 ### RocksDB Backend Config Options
 
 | config option | default value | description |
@@ -260,7 +195,34 @@ Other options are consistent with the Cassandra backend.
 | rocksdb.level0_stop_writes_trigger | 36 | Hard limit on number of level-0 files for stopping writes. |
 | rocksdb.soft_pending_compaction_bytes_limit | 68719476736 | The soft limit to impose on pending compaction in bytes. |
 
-### HBase Backend Config Options
+<details>
+<summary>K8s Config Options (Optional)</summary>
+
+Corresponding configuration file `rest-server.properties`
+
+| config option | default value | description |
+|------------------|-------------------------------|------------------------------------------|
+| server.use_k8s | false | Whether to enable K8s multi-tenancy mode. |
+| k8s.namespace | hugegraph-computer-system | K8s namespace for compute jobs. |
+| k8s.kubeconfig | | Path to kubeconfig file. |
+
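+For example, a minimal illustrative snippet to enable this mode (the kubeconfig path below is only a placeholder):
+
+```bash
+# Append K8s multi-tenancy settings to rest-server.properties (example values only)
+cat >> conf/rest-server.properties <<'EOF'
+server.use_k8s=true
+k8s.namespace=hugegraph-computer-system
+k8s.kubeconfig=/path/to/kubeconfig
+EOF
+```
+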
+</details>
+
+<details>
+<summary>Arthas Diagnostic Config Options (Optional)</summary>
+
+Corresponding configuration file `rest-server.properties`
+
+| config option | default value | description |
+|--------------------|---------------|-----------------------|
+| arthas.telnetPort | 8562 | Arthas telnet port. |
+| arthas.httpPort | 8561 | Arthas HTTP port. |
+| arthas.ip | 0.0.0.0 | Arthas bind IP. |
+
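+With the defaults above, you could attach to the embedded Arthas console roughly like this (ports are the assumed defaults from the table):
+
+```bash
+# Connect to Arthas via its telnet port
+telnet 127.0.0.1 8562
+
+# Or check that the web console responds on the HTTP port
+curl -I http://127.0.0.1:8561
+```
+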
+</details>
+
+<details>
+<summary>HBase Backend Config Options</summary>
 
 | config option | default value | description |
 |---------------------------|--------------------------------|--------------------------------------------------------------------------|
@@ -281,7 +243,50 @@ Other options are consistent with the Cassandra backend.
 | hbase.vertex_partitions | 10 | The number of partitions of the HBase vertex table. |
 | hbase.edge_partitions | 30 | The number of partitions of the HBase edge table. |
 
-### MySQL & PostgreSQL Backend Config Options
+</details>
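+
+As with the other backends, a graph is switched to HBase in its `{graph-name}.properties` file; a rough sketch (the host/port values and option names shown are illustrative, check the full table above):
+
+```bash
+# Point hugegraph.properties at an HBase cluster (example values only)
+cat >> conf/hugegraph.properties <<'EOF'
+backend=hbase
+serializer=hbase
+hbase.hosts=localhost
+hbase.port=2181
+EOF
+```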
+
+---
+
+## Legacy Config (Version ≤ 1.5)
+
+The following backend stores are no longer supported in version 1.7.0+ and are only available in version 1.5.x and earlier:
+
+<details>
+<summary>Cassandra Backend Config Options</summary>
+
+| config option | default value | description |
+|--------------------------------|----------------|------------------------------------------------------------------------------------------------------------------------------------------------|
+| backend | | Must be set to `cassandra`. |
+| serializer | | Must be set to `cassandra`. |
+| cassandra.host | localhost | The seeds hostname or ip address of cassandra cluster. |
+| cassandra.port | 9042 | The seeds port address of cassandra cluster. |
+| cassandra.connect_timeout | 5 | The cassandra driver connect server timeout(seconds). |
+| cassandra.read_timeout | 20 | The cassandra driver read from server timeout(seconds). |
+| cassandra.keyspace.strategy | SimpleStrategy | The replication strategy of keyspace, valid value is SimpleStrategy or NetworkTopologyStrategy. |
+| cassandra.keyspace.replication | [3] | The keyspace replication factor of SimpleStrategy, like '[3]'. Or replicas in each datacenter of NetworkTopologyStrategy, like '[dc1:2,dc2:1]'. |
+| cassandra.username | | The username to use to login to cassandra cluster. |
+| cassandra.password | | The password corresponding to cassandra.username. |
+| cassandra.compression_type | none | The compression algorithm of cassandra transport: none/snappy/lz4. |
+| cassandra.jmx_port | 7199 | The port of JMX API service for cassandra. |
+| cassandra.aggregation_timeout | 43200 | The timeout in seconds of waiting for aggregation. |
+
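+On 1.5.x and earlier, a rough `{graph-name}.properties` example for this backend (addresses are illustrative placeholders):
+
+```bash
+# Point a graph at a Cassandra cluster (example values only)
+cat >> conf/hugegraph.properties <<'EOF'
+backend=cassandra
+serializer=cassandra
+cassandra.host=localhost
+cassandra.port=9042
+EOF
+```
+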
+</details>
+
+<details>
+<summary>ScyllaDB Backend Config Options</summary>
+
+| config option | default value | description |
+|---------------|---------------|----------------------------|
+| backend | | Must be set to `scylladb`. |
+| serializer | | Must be set to `scylladb`. |
+
+Other options are consistent with the Cassandra backend.
+
+</details>
+
+<details>
+<summary>MySQL & PostgreSQL Backend Config Options</summary>
 
 | config option | default value | description |
 |----------------------------------|-----------------------------|-------------------------------------------------------------------------------------|
@@ -297,7 +302,10 @@ Other options are consistent with the Cassandra backend.
 | jdbc.storage_engine | InnoDB | The storage engine of backend store database, like InnoDB/MyISAM/RocksDB for MySQL. |
 | jdbc.postgresql.connect_database | template1 | The database used to connect when init store, drop store or check store exist. |
 
-### PostgreSQL Backend Config Options
+</details>
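+
+For MySQL on 1.5.x and earlier, a minimal illustrative setup (the JDBC URL and credentials below are placeholders, not defaults):
+
+```bash
+# Use MySQL as the storage backend (example values only)
+cat >> conf/hugegraph.properties <<'EOF'
+backend=mysql
+serializer=mysql
+jdbc.url=jdbc:mysql://127.0.0.1:3306
+jdbc.username=root
+jdbc.password=******
+EOF
+```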
+
+<details>
+<summary>PostgreSQL Backend Config Options</summary>
 
 | config option | default value | description |
 |---------------|---------------|------------------------------|
@@ -309,3 +317,6 @@ Other options are consistent with the MySQL backend.
 > The driver and url of the PostgreSQL backend should be set to:
 > - `jdbc.driver=org.postgresql.Driver`
 > - `jdbc.url=jdbc:postgresql://localhost:5432/`
+
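+Putting the note above together, a minimal illustrative setup for this backend on 1.5.x and earlier (backend/serializer values follow the pattern of the other stores):
+
+```bash
+# Use PostgreSQL as the storage backend (example values only)
+cat >> conf/hugegraph.properties <<'EOF'
+backend=postgresql
+serializer=postgresql
+jdbc.driver=org.postgresql.Driver
+jdbc.url=jdbc:postgresql://localhost:5432/
+EOF
+```
+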
+</details>