feat: add 1P BigQuery skill for guided data analysis #4678
base: main
Changes from all commits
e43e2aa
82c35ba
73c918b
3068d5e
f09a118
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,177 @@ | ||
| # First-Party (1P) Skills for ADK Toolsets | ||
|
|
||
| Please refer to [go/orcas-hermes-guide](http://goto.google.com/orcas-hermes-guide). Please submit your RFC at [go/hermes-orcas](http://goto.google.com/hermes-orcas) | ||
| --- | ||
|
|
||
| # Summary | ||
|
|
||
| This RFC proposes a standardized method for bundling and consuming "First-Party (1P) Skills" alongside existing ADK toolsets (e.g., `BigQueryToolset`, `SpannerToolset`). These 1P skills, compliant with the [agentskills.io specification](https://agentskills.io/specification), encapsulate best practices and guided workflows for using the toolset's raw tools. This approach improves the developer experience by providing discoverable, versioned guidance without requiring any changes to core ADK APIs or classes. Developers opt in by adding both the base toolset and the associated `SkillToolset` to their agent. For an example implementation based on this RFC, see [this PR](https://github.com/google/adk-python/pull/4678). | ||
|
|
||
| # Motivation | ||
|
|
||
| Currently, ADK toolsets provide powerful but low-level tools (e.g., `execute_sql`). Developers are responsible for engineering the prompts and logic to use these tools effectively, often embedding complex workflow guidance directly into agent instructions. This leads to: | ||
|
|
||
| * **Duplicated Effort:** Each developer reinvents common usage patterns. | ||
| * **Inconsistent Quality:** Lack of standardized workflows results in varying reliability. | ||
| * **Poor Discoverability:** Expertise about toolset usage is not easily shared or found. | ||
| * **Bloated Instructions:** Agent prompts become long and hard to maintain. | ||
|
|
||
| By shipping 1P Skills with toolsets, we can provide reusable, curated knowledge on how to best utilize ADK components. | ||
|
|
||
| # Proposal | ||
|
|
||
| We propose to package spec-compliant skill directories within the ADK library, alongside the toolsets they guide. These skills will be loaded using the existing `SkillToolset` and `load_skill_from_dir` mechanisms. | ||
|
|
||
| ## Key Concepts: | ||
|
|
||
| 1. **Co-location:** 1P skill directories will reside within the corresponding toolset's canonical module path (e.g., `google/adk/integration/bigquery/skills/bigquery-data-analysis/`). | ||
| 2. **Standard Specification:** Skills will adhere to the [agentskills.io specification](https://agentskills.io/specification). | ||
| 3. **Existing Mechanisms:** Consumption is via the standard `SkillToolset`. No new ADK classes or APIs are introduced. | ||
| 4. **Opt-In Usage:** Developers explicitly add the `SkillToolset` with the desired 1P skill(s) to their agent. There is no automatic inclusion. | ||
| 5. **Convenience Loaders:** A simple function like `get_bigquery_skill()` will be provided for easy loading. | ||
|
|
||
| ## Directory Structure Example: | ||
|
|
||
| ``` | ||
| src/google/adk/integration/bigquery/ # Canonical location | ||
| ├── __init__.py # Exports BigQueryToolset, etc. | ||
| ├── bigquery_toolset.py # Raw tools | ||
| ├── bigquery_credentials.py # Credentials config | ||
| ├── bigquery_skill.py # Skill loader | ||
| ├── client.py # BQ client helper | ||
| ├── config.py # Tool configuration | ||
| ├── data_insights_tool.py # Data insights tool | ||
| ├── metadata_tool.py # Metadata tools | ||
| ├── query_tool.py # Query tools | ||
| └── skills/ | ||
|     └── bigquery-data-analysis/ # Spec-compliant skill directory | ||
|         ├── SKILL.md # Frontmatter + workflow instructions | ||
|         └── references/ | ||
|             ├── sql_patterns.md | ||
|             └── error_handling.md | ||
|
|
||
| src/google/adk/tools/bigquery/ | ||
| └── __init__.py # Alias → integration.bigquery | ||
| # (registers canonical modules | ||
| # in sys.modules for compat) | ||
| ``` | ||
|
|
||
| ## Runtime Flow: | ||
|
|
||
| 1P Skills leverage `SkillToolset`'s progressive disclosure: | ||
|
|
||
| * **L1 Metadata:** Skill name/description visible via `list_skills`. | ||
| * **L2 Instructions:** Main `SKILL.md` content loaded via `load_skill(name=...)`. | ||
| * **L3 References:** Detailed guides in `references/` loaded via `load_skill_resource(skill_name=..., resource_name=...)`. | ||
|
|
||
| This allows the agent to access guidance on demand without overloading the context window. | ||
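The three levels above can be modelled in a few lines. The following is a minimal sketch, not the `SkillToolset` implementation: `DemoSkill`, `DemoSkillStore`, the `chars_disclosed` counter, and the placeholder skill text are all hypothetical names and data introduced only to show that each level adds text to the context only when it is requested.

```python
from dataclasses import dataclass, field


@dataclass
class DemoSkill:
    """Hypothetical stand-in for a skill; not the ADK `Skill` model."""

    name: str
    description: str
    instructions: str
    references: dict[str, str] = field(default_factory=dict)


class DemoSkillStore:
    """Hypothetical store that counts how much text each level discloses."""

    def __init__(self, skills: list[DemoSkill]):
        self._skills = {skill.name: skill for skill in skills}
        self.chars_disclosed = 0

    def _disclose(self, text: str) -> str:
        # Every string handed to the agent is counted against the context.
        self.chars_disclosed += len(text)
        return text

    def list_skills(self) -> list[str]:
        # L1: only the name and description are disclosed.
        return [
            self._disclose(f"{s.name}: {s.description}")
            for s in self._skills.values()
        ]

    def load_skill(self, name: str) -> str:
        # L2: the full instructions, only once the skill is activated.
        return self._disclose(self._skills[name].instructions)

    def load_skill_resource(self, skill_name: str, resource_name: str) -> str:
        # L3: a single reference file, only when it is asked for.
        return self._disclose(self._skills[skill_name].references[resource_name])


# Placeholder content; the real skill text lives in SKILL.md and references/.
store = DemoSkillStore([
    DemoSkill(
        name="bigquery-data-analysis",
        description="Placeholder description.",
        instructions="1. Explore schemas.\n2. Inspect tables.\n3. Run queries.",
        references={"sql_patterns.md": "placeholder reference text " * 20},
    )
])

store.list_skills()
after_l1 = store.chars_disclosed
store.load_skill("bigquery-data-analysis")
after_l2 = store.chars_disclosed
store.load_skill_resource("bigquery-data-analysis", "sql_patterns.md")
after_l3 = store.chars_disclosed
print(after_l1, after_l2, after_l3)
```

The counter grows only as the agent moves from L1 to L3, which is the property the RFC relies on to keep the context window small.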
|
|
||
| # API Usage | ||
|
|
||
| ## Before: | ||
|
|
||
| ```py | ||
| from google.adk.agents.llm_agent import LlmAgent | ||
| from google.adk.integration.bigquery import BigQueryToolset | ||
|
|
||
| bigquery_toolset = BigQueryToolset(credentials_config=creds) | ||
|
|
||
| root_agent = LlmAgent( | ||
|     model="gemini-2.5-flash", | ||
|     name="analyst", | ||
|     instruction="""You are a data analyst. When analyzing data: | ||
| 1. First explore schemas... | ||
| 2. Use get_table_info... | ||
| ... (many lines of hand-written guidance)""", | ||
|     tools=[bigquery_toolset], | ||
| ) | ||
| ``` | ||
|
|
||
| ## After (single flag): | ||
|
|
||
| ```py | ||
| from google.adk.agents.llm_agent import LlmAgent | ||
| from google.adk.integration.bigquery import BigQueryToolset | ||
|
|
||
| bigquery_toolset = BigQueryToolset(credentials_config=creds, load_skills=True) | ||
|
|
||
| root_agent = LlmAgent( | ||
|     model="gemini-2.5-flash", | ||
|     name="analyst", | ||
|     instruction="You are a data analyst. Use your tools and skills.", | ||
|     tools=[bigquery_toolset], | ||
| ) | ||
| ``` | ||
|
|
||
| ## After (explicit, composable): | ||
|
|
||
| ```py | ||
| from google.adk.agents.llm_agent import LlmAgent | ||
| from google.adk.integration.bigquery import BigQueryToolset | ||
| from google.adk.integration.bigquery import get_bigquery_skill | ||
| from google.adk.tools.skill_toolset import SkillToolset | ||
|
|
||
| bigquery_toolset = BigQueryToolset(credentials_config=creds) | ||
| bq_skill_toolset = SkillToolset(skills=[get_bigquery_skill(), my_custom_skill]) | ||
|
|
||
| root_agent = LlmAgent( | ||
|     model="gemini-2.5-flash", | ||
|     name="analyst", | ||
|     instruction="You are a data analyst. Use your tools and skills.", | ||
|     tools=[bigquery_toolset, bq_skill_toolset], | ||
| ) | ||
| ``` | ||
|
|
||
| The `load_skills=True` flag is the simplest path for the common case. The explicit `SkillToolset` pattern is available when you need to combine the 1P skill with custom skills. | ||
|
|
||
| ## Composability: | ||
|
|
||
| Both approaches are available. Developers can mix and match: | ||
|
|
||
| * `BigQueryToolset(load_skills=True)` — single-line, includes 1P skill. | ||
| * `BigQueryToolset()` + `SkillToolset(skills=[...])` — full control over which skills are loaded. | ||
| * `BigQueryToolset()` alone — no skills, tools only. | ||
|
|
||
| # Implementation Pattern for Toolsets | ||
|
|
||
| 1. **Create Skill Directory:** Add `src/google/adk/integration/<toolset>/skills/<skill-name>/` with `SKILL.md` and optional `references/`. | ||
| 2. **Add Loader:** Create `src/google/adk/integration/<toolset>/<toolset>_skill.py`: | ||
|
|
||
| ```py | ||
| import pathlib | ||
| from google.adk.skills import Skill, load_skill_from_dir | ||
|
|
||
| _SKILL_DIR = pathlib.Path(__file__).parent / "skills" / "<skill-name>" | ||
|
|
||
| def get_<toolset>_skill() -> Skill: | ||
|     return load_skill_from_dir(_SKILL_DIR) | ||
| ``` | ||
|
|
||
| 3. **Add Alias (Optional):** Re-export from `src/google/adk/tools/<toolset>/` for backward compatibility. | ||
| 4. **Add Tests & Sample:** Validate skill structure and demonstrate usage. | ||
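For step 4, a structural check on the skill directory might look like the sketch below. It is an illustration under stated assumptions, not the test shipped with the PR: `read_frontmatter` and `validate_skill_dir` are hypothetical helpers, the frontmatter parser handles only flat `key: value` lines, and the rules it checks (a lowercase, hyphenated `name` that matches the directory name, plus a non-empty `description`) should be confirmed against the agentskills.io specification.

```python
import pathlib
import re
import tempfile

# Assumed naming rule: lowercase letters and digits, separated by single hyphens.
_NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def read_frontmatter(skill_md: pathlib.Path) -> dict[str, str]:
    """Parses flat `key: value` frontmatter between two `---` lines."""
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("frontmatter is missing its closing '---' line")


def validate_skill_dir(skill_dir: pathlib.Path) -> list[str]:
    """Returns a list of problems; an empty list means the checks passed."""
    skill_md = skill_dir / "SKILL.md"
    if not skill_md.is_file():
        return ["SKILL.md is missing"]
    try:
        fields = read_frontmatter(skill_md)
    except ValueError as exc:
        return [str(exc)]
    problems = []
    name = fields.get("name", "")
    if not _NAME_RE.match(name):
        problems.append(f"invalid or missing name: {name!r}")
    elif name != skill_dir.name:
        problems.append(f"name {name!r} does not match directory {skill_dir.name!r}")
    if not fields.get("description"):
        problems.append("description is missing or empty")
    return problems


# Usage: build a throwaway skill directory and validate it.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = pathlib.Path(tmp) / "bigquery-data-analysis"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: bigquery-data-analysis\ndescription: Placeholder.\n---\n# Workflow\n",
        encoding="utf-8",
    )
    problems = validate_skill_dir(skill_dir)
print(problems)
```

Returning a list of problems rather than raising on the first one lets a test report every structural issue in a single run.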
|
|
||
| **Candidate Toolsets for 1P Skills:** Spanner, Bigtable, PubSub. | ||
|
|
||
| # Backward Compatibility | ||
|
|
||
| All toolset code has moved to `google.adk.integration.bigquery` as the canonical location. The old `google.adk.tools.bigquery` path remains as a fully transparent alias: its `__init__.py` registers the canonical modules in `sys.modules` so that all existing imports (including `from google.adk.tools.bigquery.config import BigQueryToolConfig`) resolve to the same module objects where the real code lives. This ensures `mock.patch.object` and all other patterns continue to work without changes to existing tests or user code. | ||
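The aliasing mechanism described above can be demonstrated without ADK installed. This is a minimal sketch with hypothetical module names (`demo_integration_bigquery` for the canonical package, `demo_tools_bigquery` for the legacy path) and a placeholder `BigQueryToolConfig` class; it aliases both the package and its submodule, and it is not a copy of the actual `__init__.py` in the PR.

```python
import importlib
import sys
import types
from unittest import mock

# Hypothetical names standing in for the canonical and legacy packages.
CANONICAL = "demo_integration_bigquery"
LEGACY = "demo_tools_bigquery"


class BigQueryToolConfig:
    """Placeholder for the real config class."""


# Build a fake canonical package with one submodule, as if already imported.
canonical_pkg = types.ModuleType(CANONICAL)
canonical_pkg.__path__ = []  # a __path__ attribute marks the module as a package
canonical_config = types.ModuleType(f"{CANONICAL}.config")
canonical_config.BigQueryToolConfig = BigQueryToolConfig
canonical_pkg.config = canonical_config
sys.modules[CANONICAL] = canonical_pkg
sys.modules[f"{CANONICAL}.config"] = canonical_config

# The aliasing step: register the same module objects under the old names.
sys.modules[LEGACY] = canonical_pkg
sys.modules[f"{LEGACY}.config"] = canonical_config

# An import through the legacy path now returns the canonical module object.
legacy_config = importlib.import_module(f"{LEGACY}.config")
same_object = legacy_config is canonical_config

# A patch applied through the legacy name is visible through the canonical one.
with mock.patch.object(legacy_config, "BigQueryToolConfig", "patched"):
    seen_during_patch = canonical_config.BigQueryToolConfig

print(same_object, seen_during_patch)
```

Because both names map to one module object, there is only one copy of each attribute to patch, which is why existing tests keep working regardless of which import path they use.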
|
|
||
| # Alternatives Considered | ||
|
|
||
| * **Embedding guidance in Toolset:** Would tightly couple tools to specific workflows, reducing flexibility. | ||
| * **New API/Class for 1P Skills:** Would increase API surface area unnecessarily, as the existing `SkillToolset` already meets the need. | ||
|
|
||
| The proposed approach is a minimalist design, maximizing reuse of existing components. | ||
|
|
||
| # Timeline | ||
|
|
||
| * **Phase 1:** Implement 1P Skill for `BigQueryToolset` as a proof-of-concept. | ||
| * **Phase 2:** Develop 1P Skills for other key toolsets (Spanner, PubSub, etc.). | ||
| * **Phase 3:** Document the pattern for community contributions. | ||
|
|
||
| # Outcome | ||
|
|
||
| * Improved developer experience for using complex toolsets. | ||
| * Standardized, versioned, and discoverable best practices. | ||
| * Reduced boilerplate in agent instructions. | ||
| * A clear pattern for extending other ADK toolsets with 1P skills. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,80 @@ | ||
| # 1P BigQuery Skill Sample | ||
|
|
||
| This sample demonstrates the **1P (first-party) Skills** pattern: combining | ||
| a raw toolset (`BigQueryToolset`) with a curated skill (`SkillToolset`) to | ||
| give the agent both tools and guided workflows. | ||
|
|
||
| ## What This Shows | ||
|
|
||
| - **BigQueryToolset** provides raw tools: `execute_sql`, `list_dataset_ids`, | ||
| `get_table_info`, etc. | ||
| - **SkillToolset** provides skill discovery and loading: `list_skills`, | ||
| `load_skill`, `load_skill_resource`, `run_skill_script`. | ||
| - The **bigquery-data-analysis** skill ships pre-packaged with ADK and | ||
| follows the [agentskills.io specification](https://agentskills.io/specification). | ||
|
|
||
| The agent can discover the skill at runtime, load its instructions for | ||
| guided workflows, and access reference materials on demand. | ||
|
|
||
| ## Setup | ||
|
|
||
| 1. Install ADK with BigQuery extras: | ||
|
|
||
| ```bash | ||
| pip install google-adk[bigquery] | ||
| ``` | ||
|
|
||
| 2. Set up OAuth credentials: | ||
|
|
||
| - Create OAuth 2.0 credentials in the Google Cloud Console. | ||
| - Update `agent.py` with your `client_id` and `client_secret`. | ||
|
|
||
| 3. Run the sample: | ||
|
|
||
| ```bash | ||
| adk web contributing/samples | ||
| ``` | ||
|
|
||
| 4. Select `1p_bigquery_skill` from the agent list. | ||
|
|
||
| ## How It Works | ||
|
|
||
| ``` | ||
| User Query | ||
| | | ||
| v | ||
| LlmAgent (gemini-2.5-flash) | ||
| | | ||
| +-- BigQueryToolset tools (direct data access) | ||
| | list_dataset_ids, list_table_ids, get_table_info, | ||
| | execute_sql, forecast, detect_anomalies, ... | ||
| | | ||
| +-- SkillToolset tools (guided workflows) | ||
| list_skills -> discovers "bigquery-data-analysis" | ||
| load_skill -> loads step-by-step instructions | ||
| load_skill_resource -> loads sql_patterns.md, etc. | ||
| run_skill_script -> executes skill scripts | ||
| ``` | ||
|
|
||
| ## Progressive Disclosure | ||
|
|
||
| The skill uses three levels of content: | ||
|
|
||
| 1. **L1 - Metadata** (always available): skill name and description shown | ||
| via `list_skills`. | ||
| 2. **L2 - Instructions** (on activation): full workflow steps loaded via | ||
| `load_skill`. | ||
| 3. **L3 - References** (on demand): detailed SQL patterns, schema | ||
| exploration guides, and error handling loaded via `load_skill_resource`. | ||
|
|
||
| This keeps the agent's context efficient while making deep knowledge | ||
| available when needed. | ||
|
|
||
| ## Extending This Pattern | ||
|
|
||
| Other toolsets can follow the same pattern: | ||
|
|
||
| 1. Create a spec-compliant skill directory under `integration/<toolset>/skills/`. | ||
| 2. Add a `get_*_skill()` convenience loader. | ||
| 3. (Optional) Add an alias in `tools/<toolset>/` for backward compatibility. | ||
| 4. Users add both the toolset and `SkillToolset` to their agent's tools. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # Copyright 2026 Google LLC | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| from . import agent |
| Original file line number | Diff line number | Diff line change | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,61 @@ | ||||||||||||||||
| # Copyright 2026 Google LLC | ||||||||||||||||
| # | ||||||||||||||||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||||||||||||||||
| # you may not use this file except in compliance with the License. | ||||||||||||||||
| # You may obtain a copy of the License at | ||||||||||||||||
| # | ||||||||||||||||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||||||||||||||||
| # | ||||||||||||||||
| # Unless required by applicable law or agreed to in writing, software | ||||||||||||||||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||||||||||||||||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||||||||||||||||
| # See the License for the specific language governing permissions and | ||||||||||||||||
| # limitations under the License. | ||||||||||||||||
|
|
||||||||||||||||
| """Example agent using BigQuery toolset with a 1P skill for guided workflows. | ||||||||||||||||
|
|
||||||||||||||||
| This sample demonstrates the pattern of combining a raw toolset (BigQueryToolset) | ||||||||||||||||
| with a curated skill (SkillToolset) to provide both tools and guided workflows | ||||||||||||||||
| to the agent. | ||||||||||||||||
|
|
||||||||||||||||
| Setup: | ||||||||||||||||
| 1. Install ADK with BigQuery extras: pip install google-adk[bigquery] | ||||||||||||||||
| 2. Configure OAuth credentials (see README.md) | ||||||||||||||||
| 3. Run: adk web contributing/samples | ||||||||||||||||
| """ | ||||||||||||||||
|
|
||||||||||||||||
| from google.adk.agents.llm_agent import LlmAgent | ||||||||||||||||
| from google.adk.integration.bigquery import BigQueryToolset | ||||||||||||||||
| from google.adk.integration.bigquery import get_bigquery_skill | ||||||||||||||||
| from google.adk.integration.bigquery.bigquery_credentials import BigQueryCredentialsConfig | ||||||||||||||||
| from google.adk.tools.skill_toolset import SkillToolset | ||||||||||||||||
|
|
||||||||||||||||
| # Configure BigQuery credentials. | ||||||||||||||||
| # Replace with your OAuth client credentials. | ||||||||||||||||
| credentials_config = BigQueryCredentialsConfig( | ||||||||||||||||
|     client_id="YOUR_CLIENT_ID", | ||||||||||||||||
|     client_secret="YOUR_CLIENT_SECRET", | ||||||||||||||||
|
Contributor review comment on lines +35 to +37 (full text follows the diff).
|
||||||||||||||||
| ) | ||||||||||||||||
|
|
||||||||||||||||
| # BigQueryToolset provides the raw tools: execute_sql, list_dataset_ids, etc. | ||||||||||||||||
| bigquery_toolset = BigQueryToolset( | ||||||||||||||||
|     credentials_config=credentials_config, | ||||||||||||||||
| ) | ||||||||||||||||
|
|
||||||||||||||||
| # SkillToolset provides guided workflows via the 1P BigQuery skill. | ||||||||||||||||
| # The agent can discover and load the skill's instructions and references. | ||||||||||||||||
| bq_skill_toolset = SkillToolset( | ||||||||||||||||
|     skills=[get_bigquery_skill()], | ||||||||||||||||
| ) | ||||||||||||||||
|
|
||||||||||||||||
| root_agent = LlmAgent( | ||||||||||||||||
|     model="gemini-2.5-flash", | ||||||||||||||||
|     name="bigquery_skill_agent", | ||||||||||||||||
|     instruction=( | ||||||||||||||||
|         "You are a data analyst. Use your tools and skills to help" | ||||||||||||||||
|         " users explore and analyze BigQuery data. When starting a new" | ||||||||||||||||
|         " analysis task, use `list_skills` to discover available skills" | ||||||||||||||||
|         " and `load_skill` to get step-by-step guidance." | ||||||||||||||||
|     ), | ||||||||||||||||
|     tools=[bigquery_toolset, bq_skill_toolset], | ||||||||||||||||
| ) | ||||||||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| # Copyright 2026 Google LLC | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """ADK integrations — toolset skills and guided workflows.""" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| # Copyright 2026 Google LLC | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """BigQuery integration — tools, credentials, and skills.""" | ||
|
|
||
| from .bigquery_credentials import BigQueryCredentialsConfig | ||
| from .bigquery_skill import get_bigquery_skill | ||
| from .bigquery_toolset import BigQueryToolset | ||
|
|
||
| __all__ = [ | ||
|     "BigQueryCredentialsConfig", | ||
|     "BigQueryToolset", | ||
|     "get_bigquery_skill", | ||
| ] |
Review comment:
It's generally recommended to avoid hardcoding sensitive information like `client_id` and `client_secret` directly in code files, even if they are placeholders in a sample. This practice can lead to accidental exposure if not handled carefully. Consider suggesting the use of environment variables or a secure configuration management system for these credentials.
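One way to act on this suggestion is to read the OAuth client values from the environment. The sketch below is illustrative only: the variable names `BIGQUERY_OAUTH_CLIENT_ID` and `BIGQUERY_OAUTH_CLIENT_SECRET` and the helper `load_oauth_client` are assumptions, not names defined by ADK or by this PR.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable names, chosen for this sketch.
ID_VAR = "BIGQUERY_OAUTH_CLIENT_ID"
SECRET_VAR = "BIGQUERY_OAUTH_CLIENT_SECRET"


def load_oauth_client(env: Optional[Mapping[str, str]] = None) -> tuple[str, str]:
    """Returns (client_id, client_secret), failing loudly if either is unset."""
    env = os.environ if env is None else env
    missing = [name for name in (ID_VAR, SECRET_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[ID_VAR], env[SECRET_VAR]


# Usage with an explicit mapping so the example runs without real credentials.
client_id, client_secret = load_oauth_client(
    {ID_VAR: "demo-id", SECRET_VAR: "demo-secret"}
)
# In the sample, these values would replace the hardcoded placeholders passed to
# BigQueryCredentialsConfig(client_id=..., client_secret=...).
```

Raising when a variable is missing avoids silently starting the agent with empty credentials.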