Skip to content

support Codex as a datasource type#2

Open
ianhe8x wants to merge 5 commits intomainfrom
codex
Open

support Codex as a datasource type#2
ianhe8x wants to merge 5 commits intomainfrom
codex

Conversation

@ianhe8x
Copy link
Contributor

@ianhe8x ianhe8x commented Mar 18, 2026

Support Codex as a datasource type

  • Replace GPL-3.0 with PolyForm-Shield-1.0.0 license
    • Add LICENSE file with PolyForm Shield terms
    • Update package.json license field
    • Update eslint header rule and all source file headers
    • Update copyright year to 2020-2026

Summary by CodeRabbit

  • New Features

    • Built-in Codex GraphQL support with bundled config, expansive Codex query surface, and a Codex-focused type-detail tool
    • LLM analysis now returns suggested questions
    • New e2e helper script to demo Codex-powered agent flows
  • Improvements

    • Schema handling supports SDL (full schema) and introspection; in-process schema caching
    • Agent initialization streamlined for Codex endpoints
  • Tests

    • Added extensive E2E and tooling tests for Codex and type-detail behavior
  • Chores

    • Project license, package scripts, ignore list, and headers updated

@coderabbitai
Copy link

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Walkthrough

Adds Codex integration: bundled Codex project config and SDLs, SDL-based schema loading, a new graphql_type_detail tool with depth-aware extraction, Codex-aware prompts/enrichment and fast-path initialization, tests and e2e script for Codex, plus license, packaging, and minor ignore/header updates.

Changes

Cohort / File(s) Summary
Codex configuration & schemas
src/codex/config.ts, src/codex/schema.ts, src/codex/query.graphql
New getCodexConfig generator; in-process caching for query.graphql / full_schema.graphql / introspection via schema.ts; large Codex Query SDL added.
Initialization & runtime schema loading
src/index.ts, src/graphql.service.ts, src/utils.ts
Fast-path Codex detection via isCodexEndpoint; getCodexConfig used to return/stash config for Codex endpoints; GraphQLService now supports SDL (buildSchema) when config.fullSchema present and added fetchIntrospectionSchema.
Project manager & enrichment
src/project-manager.ts
Enrichment short-circuits for configs with nodeType === CODEX (returns input config unchanged); suggestedQuestions propagated into project config when analysis runs.
Types, prompts & LLM
src/types.ts, src/prompts.ts, src/llm.ts
Added GraphqlProvider.CODEX, fullSchema?, and suggestedQuestions to types; prompts include Codex-specific guidance (graphql_type_detail requirement); LLM analysis now returns suggestedQuestions.
Tools: type-detail & schema-info & registry
src/tools/graphql-type-detail.tool.ts, src/tools/graphql-schema-info.tool.ts, src/tools/index.ts
New createGraphQLTypeDetailTool (depth-limited recursive extraction, cycle-safe) and registry now conditionally includes it; graphql-schema-info gained a Codex branch. Note: duplicate generateCodexSchemaInfo declarations appear in the diff.
Tools: validator/execute adjustments
src/tools/graphql-validate-excute.tool.ts, src/tools/graphql-execute.tool.ts, src/tools/graphql-validator.tool.ts
Removed post-validation schema fetch/caching side-effect; header/license year updates across tools.
Tests & e2e script
tests/graphql-agent.test.ts, scripts/e2e-codex.ts
Extensive Codex/SubQL/The Graph tests added verifying tool presence and type-detail behavior; new e2e script demonstrating Codex agent flow with in-memory persistent store.
Packaging, license & headers
package.json, LICENSE, eslint.config.cjs, other headers
License changed to PolyForm-Shield-1.0.0, LICENSE added, codex npm script and tsx devDependency added, many file header year updates.
Misc / repo config
.gitignore
Added .task entry; .eslintcache entry formatting/newline adjusted.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Init as initializeProjectConfig
    participant CodexCfg as getCodexConfig
    participant Cache as PersistentCache
    participant GraphQL as GraphQLService
    participant ProjMgr as enrichGraphQLConfig

    Client->>Init: initializeProjectConfig(endpoint, auth)
    alt Codex endpoint detected
        Init->>CodexCfg: getCodexConfig(endpoint, auth)
        CodexCfg-->>Init: GraphQLProjectConfig (includes fullSchema)
        Init->>Cache: save config
        Init-->>Client: return config
    else Non-Codex endpoint
        Init->>GraphQL: create service
        GraphQL->>GraphQL: fetch schema (introspection or SDL)
        GraphQL-->>Init: loaded schema/config
        Init->>Cache: save config
        Init-->>Client: return config
    end

    Client->>ProjMgr: enrichGraphQLConfig(config)
    alt Codex endpoint
        ProjMgr-->>Client: return config (enrichment skipped for CODEX)
    else Other endpoint
        ProjMgr->>ProjMgr: IPFS/LLM analysis -> enrich config (suggestedQuestions)
        ProjMgr-->>Client: enriched config
    end
Loading
sequenceDiagram
    participant LLM
    participant TypeTool as graphql_type_detail
    participant Parser as SchemaParser
    participant TypeCache

    LLM->>TypeTool: request typeNames=[A,B], depth=2
    TypeTool->>TypeCache: lookup A
    alt cache miss
        TypeTool->>Parser: parse config.fullSchema
        Parser->>Parser: extractTypeWithDepth(A,2)
        Parser-->>TypeTool: detailed definition + nested refs
        TypeTool->>TypeCache: cache A
    else cached
        TypeCache-->>TypeTool: return cached A
    end
    TypeTool-->>LLM: combined formatted type details
Loading

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I found a Codex, snug and bright,
Full schemas tucked in soft midnight,
Type details hop, depth by depth,
Suggested questions warm the nest,
I nibble lines and bound with zest!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and clearly describes the main change: adding support for Codex as a datasource type, which is the central objective of this comprehensive pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b06c7e1e6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

src/index.ts Outdated
Comment on lines +62 to +67
const isCodex = endpoint.includes('codex.io') || endpoint.includes('codex');

if (isCodex) {
// For Codex, use the bundled config directly (skip LLM analysis)
logger?.info({endpoint}, 'Detected Codex endpoint, using bundled config');
const config = getCodexConfig(endpoint, authorization);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Match Codex endpoints by hostname

initializeProjectConfig currently treats any endpoint containing the substring "codex" as a Codex datasource, even when that appears only in the path/query (for example a non-Codex project slug). In those cases the normal CID/manifest flow is skipped and a fixed Codex schema is returned, so validation/execution is performed against the wrong API shape and user queries fail. Parse the URL and check the hostname (as done in ProjectManager.isCodexEndpoint) to avoid misclassifying non-Codex endpoints.

Useful? React with 👍 / 👎.

src/prompts.ts Outdated
- ALWAYS call graphql_type_detail BEFORE constructing ANY query to get exact type definitions
- The query-only schema in graphql_schema_info lacks field details - using it directly leads to INVALID queries
- For EACH query you plan to make, first call graphql_type_detail with the return type name
- Example: If you want to call getNftPool, first call graphql_type_detail with typeName: "NftPoolResponse"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Use typeNames in Codex tool instructions

The Codex prompt example instructs the agent to call graphql_type_detail with typeName, but the tool schema only accepts typeNames (array). This mismatch can cause invalid tool-call arguments in Codex flows, especially because the same prompt marks this call as mandatory before query construction. Aligning the instruction with the tool contract avoids avoidable tool validation failures.

Useful? React with 👍 / 👎.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/project-manager.ts (1)

1-2: ⚠️ Potential issue | 🟡 Minor

License header appears outdated.

Same as other files—should be updated to PolyForm-Shield-1.0.0 and copyright year 2020–2026 per PR objectives.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/project-manager.ts` around lines 1 - 2, Update the file header comment to
use the new PolyForm-Shield-1.0.0 license and adjust the copyright year range to
2020–2026; specifically replace the existing two-line header (the top-of-file
comment containing the SPDX line and copyright notice) so it names
"PolyForm-Shield-1.0.0" as the license and shows "2020–2026" as the copyright
years.
src/tools/graphql-schema-info.tool.ts (1)

1-2: ⚠️ Potential issue | 🟡 Minor

License header appears outdated.

The PR objectives indicate the license should be updated to PolyForm-Shield-1.0.0 and the copyright year to 2020–2026, but this file still shows GPL-3.0 and 2020-2025.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-schema-info.tool.ts` around lines 1 - 2, Update the file
header to the new license and year: replace the existing copyright line
"Copyright 2020-2025 SubQuery Pte Ltd authors & contributors" with "Copyright
2020-2026 SubQuery Pte Ltd authors & contributors" and change the SPDX
identifier from "SPDX-License-Identifier: GPL-3.0" to "SPDX-License-Identifier:
PolyForm-Shield-1.0.0" so the top-of-file header in
src/tools/graphql-schema-info.tool.ts matches the PR licensing update.
tests/graphql-agent.test.ts (1)

1-2: ⚠️ Potential issue | 🟡 Minor

License header appears outdated.

Should be updated to PolyForm-Shield-1.0.0 and copyright year 2020–2026 per PR objectives.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/graphql-agent.test.ts` around lines 1 - 2, Replace the outdated license
header by updating the copyright and SPDX identifier: change the first two
header lines that currently contain "Copyright 2020-2025 SubQuery Pte Ltd
authors & contributors" and "SPDX-License-Identifier: GPL-3.0" to use the new
copyright range "Copyright 2020–2026 SubQuery Pte Ltd authors & contributors"
and the license identifier "SPDX-License-Identifier: PolyForm-Shield-1.0.0";
ensure the updated header appears at the top of tests/graphql-agent.test.ts
(i.e., replace the existing header strings shown in the diff).
🧹 Nitpick comments (8)
src/codex/config.ts (1)

17-41: Consider adding suggestedQuestions to the Codex config.

The bundled Codex config provides comprehensive domainCapabilities but omits suggestedQuestions. Since this field is now part of the enriched config (as per src/types.ts and src/llm.ts), adding sample questions could improve user experience by providing immediate query examples.

♻️ Proposed addition
     declineMessage: 'This query is outside the scope of Codex API capabilities.',
     schemaContent: getCodexQuerySchema(),
     // introspectionSchema: getCodexIntrospectionSchema(),
     fullSchema: getCodexFullSchema(),
+    suggestedQuestions: [
+      'What are the top NFT collections by volume?',
+      'Show me token prices for ETH on Ethereum',
+      'What are the recent swap events for a specific pool?',
+      'Get wallet balances for a given address',
+      'List supported networks and their status',
+    ],
     ...(authorization ? {authorization} : {}),
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/codex/config.ts` around lines 17 - 41, The getCodexConfig function
returns the Codex project config but omits the suggestedQuestions field; update
getCodexConfig to include a suggestedQuestions: string[] property with a handful
of concise example queries (e.g., "Get top NFT collections by volume", "Fetch
wallet balances for address X", "Show token price history for symbol Y", "List
recent DEX trades for pair Z") so the returned object (in getCodexConfig)
matches the enriched config shape used in src/types.ts and src/llm.ts and
improves UX; simply add the suggestedQuestions array into the returned object
(alongside domainCapabilities, schemaContent, fullSchema, etc.).
src/index.ts (1)

60-72: Codex detection may produce false positives with broad string matching.

The check endpoint.includes('codex.io') || endpoint.includes('codex') is overly permissive. Any endpoint containing the substring "codex" (e.g., https://api.example.com/mycodex-data) would trigger the Codex path incorrectly.

Consider a more precise pattern:

♻️ Proposed fix for more precise Codex detection
-  // Check if this is Codex
-  const isCodex = endpoint.includes('codex.io') || endpoint.includes('codex');
+  // Check if this is Codex - match codex.io domain or explicit codex subdomain
+  const isCodex = /(?:^https?:\/\/)?(?:[\w-]+\.)?codex\.io(?:\/|$)/i.test(endpoint);

Or if there are specific known Codex domains:

-  const isCodex = endpoint.includes('codex.io') || endpoint.includes('codex');
+  const url = new URL(endpoint);
+  const isCodex = url.hostname.endsWith('codex.io') || url.hostname === 'codex.io';
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 60 - 72, The current isCodex detection
(endpoint.includes('codex.io') || endpoint.includes('codex') ) is too
permissive; update the detection in the code path that sets isCodex (and uses
getCodexConfig + persistentService.save) to parse the endpoint as a URL and
match the URL.hostname against a small allowlist of known Codex hostnames (e.g.,
'codex.io', 'api.codex.io') or a stricter hostname regex, falling back to false
on parse errors—this ensures only exact domains trigger the Codex branch instead
of any string containing "codex".
tests/graphql-agent.test.ts (1)

618-618: Test at line 618 may silently pass for wrong reason.

If createGraphQLTools returns tools without graphql_type_detail because fullSchema is missing (not because of provider logic), the test at line 621-623 would incorrectly validate tool presence. The test should ensure the tool is present for the right reasons.

Consider adding fullSchema to the test config to properly test the CODEX path:

🔧 Suggested test config fix
     const codexConfig: GraphQLProjectConfig = {
       endpoint: 'https://api.defined.fi/',
       cid: 'codex-test',
       nodeType: GraphqlProvider.CODEX,
       updatedAt: new Date().toISOString(),
       lastAnalyzedAt: new Date().toISOString(),
       domainName: 'Codex API',
       domainCapabilities: ['NFT data', 'Token prices', 'Wallet analytics'],
       declineMessage: '',
       schemaContent: 'type Query { dummy: String }',
+      fullSchema: 'type Query { dummy: String }\ntype NftPoolResponse { id: ID! }',
     };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/graphql-agent.test.ts` at line 618, The test currently calls
createGraphQLTools(service, codexConfig, loggerMock) but may pass because tools
include graphql_type_detail for the wrong reason; update the test config
(codexConfig) to include a valid fullSchema so createGraphQLTools follows the
CODEX path and populates graphql_type_detail as intended. Specifically, modify
the test setup where codexConfig is defined to include fullSchema (a
representative GraphQL schema string/object) before calling createGraphQLTools,
then assert the tool presence/fields to ensure graphql_type_detail was produced
by the CODEX logic in createGraphQLTools.
src/tools/graphql-schema-info.tool.ts (1)

348-371: LGTM!

The generateCodexSchemaInfo function provides a concise, Codex-specific schema info format with clear usage instructions. The underscore prefix on _schemaContent is misleading since the parameter is actually used in the template literal on line 353—consider renaming to schemaContent for consistency with the other generator functions.

💅 Suggested rename for consistency
-function generateCodexSchemaInfo(_schemaContent: string): string {
+function generateCodexSchemaInfo(schemaContent: string): string {
   return `📖 CODEX GRAPHQL API SCHEMA & RULES:

 🔍 CODEX SUPPORTED QUERIES:
-${_schemaContent}
+${schemaContent}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-schema-info.tool.ts` around lines 348 - 371, The parameter
name _schemaContent in function generateCodexSchemaInfo is misleading because it
is used in the template; rename the parameter to schemaContent and update its
usage inside generateCodexSchemaInfo to match (replace _schemaContent with
schemaContent) so the function signature and body are consistent with other
generator functions.
src/tools/graphql-type-detail.tool.ts (1)

268-278: valueNodeToString doesn't handle all ConstValueNode kinds.

This function handles NullValue and values with a value property, but doesn't properly handle ListValueNode, ObjectValueNode, or EnumValueNode. For enums, value.value is undefined but value.kind === 'EnumValue' and value.name.value contains the enum value.

🔧 Suggested improvement
-function valueNodeToString(value: {value?: boolean | string | number; kind?: string}): string {
+function valueNodeToString(value: any): string {
   // Handle NullValueNode (kind: 'NullValue') which has no value property
   if ('kind' in value && value.kind === 'NullValue') {
     return 'null';
   }
+  // Handle EnumValueNode
+  if ('kind' in value && value.kind === 'EnumValue' && value.name?.value) {
+    return value.name.value;
+  }
   // Handle other ConstValueNode types
   if ('value' in value && value.value !== undefined) {
     return String(value.value);
   }
   return '';
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-type-detail.tool.ts` around lines 268 - 278, The
valueNodeToString function must handle all GraphQL ValueNode kinds: update
valueNodeToString to switch on value.kind (or use ifs) and return correct
strings for EnumValue (use value.name.value), ListValue (map value.values
recursively with valueNodeToString and join with commas inside brackets),
ObjectValue (map value.fields to "name: <recursive>" and wrap in braces), and
other node kinds like IntValue/FloatValue/StringValue/BooleanValue by returning
their value/text; keep the existing NullValue handling and recursion for nested
lists/objects so callers of valueNodeToString get accurate string
representations.
src/codex/schema.ts (2)

56-63: Consider error handling for schema parsing.

buildSchema can throw if the SDL is malformed. Adding a try-catch here would provide clearer error messages than a raw GraphQL parse error.

🛡️ Suggested error handling
 export function getCodexIntrospectionSchema(): IntrospectionQuery {
   if (introspectionSchemaCache === null) {
     const fullSchema = getCodexFullSchema();
-    const schema = buildSchema(fullSchema);
-    introspectionSchemaCache = introspectionFromSchema(schema);
+    try {
+      const schema = buildSchema(fullSchema);
+      introspectionSchemaCache = introspectionFromSchema(schema);
+    } catch (error) {
+      throw new Error(`Failed to build Codex introspection schema: ${(error as Error).message}`);
+    }
   }
   return introspectionSchemaCache;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/codex/schema.ts` around lines 56 - 63, The getCodexIntrospectionSchema
function currently calls buildSchema on the SDL returned by getCodexFullSchema
without handling buildSchema failures; wrap the
buildSchema/introspectionFromSchema sequence in a try-catch inside
getCodexIntrospectionSchema, catch any error thrown (e.g., from buildSchema),
augment or rethrow a new Error that includes a clear message like "Failed to
build Codex schema" plus the original error message/stack, and ensure
introspectionSchemaCache is not set when an error occurs; reference
getCodexIntrospectionSchema, getCodexFullSchema, buildSchema,
introspectionFromSchema, and introspectionSchemaCache when making the change.

29-48: Consider adding error handling for file reads.

readFileSync will throw if the schema files don't exist or are unreadable. Consider adding try-catch with a descriptive error message to aid debugging during deployment issues.

🛡️ Suggested error handling
 export function getCodexQuerySchema(): string {
   if (querySchemaCache === null) {
     const querySchemaPath = join(__dirname, 'query.graphql');
-    querySchemaCache = readFileSync(querySchemaPath, 'utf-8');
+    try {
+      querySchemaCache = readFileSync(querySchemaPath, 'utf-8');
+    } catch (error) {
+      throw new Error(`Failed to load Codex query schema from ${querySchemaPath}: ${(error as Error).message}`);
+    }
   }
   return querySchemaCache;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/codex/schema.ts` around lines 29 - 48, Wrap the file reads in
getCodexQuerySchema and getCodexFullSchema with try-catch to handle readFileSync
errors: inside each function (referencing querySchemaCache, fullSchemaCache,
getCodexQuerySchema, getCodexFullSchema) catch any thrown error from
readFileSync(join(__dirname, ...)) and throw or log a new Error with a
descriptive message that includes the target filename and the original error
message to aid debugging (e.g., "failed to load query.graphql: <orig error>").
Ensure the caches are only set on successful read and the original error is
preserved or rethrown.
src/project-manager.ts (1)

99-106: Codex endpoint detection may be overly broad.

The check hostname.includes('codex') could match unintended domains like mycodex.example.com or codexample.com. Consider using a more precise pattern.

🔧 Suggested more precise detection
 private isCodexEndpoint(endpoint: string): boolean {
   try {
     const hostname = new URL(endpoint).hostname.toLowerCase();
-    return hostname.includes('codex.io') || hostname.includes('codex');
+    return hostname.endsWith('codex.io') || hostname === 'codex.io';
   } catch {
     return false;
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/project-manager.ts` around lines 99 - 106, The isCodexEndpoint function
uses hostname.includes('codex') which is too broad; change its logic to only
accept the canonical codex domain and its subdomains by replacing the includes
check with a precise match such as testing hostname === 'codex.io' ||
hostname.endsWith('.codex.io') or using a case-insensitive regex like
/^(.+\.)?codex\.io$/i inside isCodexEndpoint so that only codex.io and its
subdomains are detected (leave the try/catch and URL parsing intact).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/codex/query.graphql`:
- Around line 614-683: The Query type contains duplicate field definitions that
break schema build; remove the redundant duplicate definitions so each field
name appears only once — specifically dedupe the following Query fields: token,
tokens, pairMetadata, getLatestPairs, listPairsForToken,
listPairsWithMetadataForToken, tokenSparklines, balances, holders, and
getLatestTokens. For each pair keep the correct/most complete definition
(arguments, docs, and deprecation directives) and delete the other, then run
schema validation to ensure no other duplicates or reference changes are needed.

In `@src/codex/schema.ts`:
- Around line 1-2: Replace the existing license header lines "// Copyright
2020-2025 SubQuery Pte Ltd authors & contributors" and "//
SPDX-License-Identifier: GPL-3.0" at the top of src/codex/schema.ts with the
updated header reflecting the new copyright range and license: change the year
range to "2020–2026" and the SPDX identifier to "PolyForm-Shield-1.0.0".

In `@src/project-manager.ts`:
- Around line 108-156: enrichCodexConfig currently sets schemaContent but omits
fullSchema, so graphql_type_detail tool (which reads config.fullSchema) will
fail; update the returned GraphQLProjectConfig in enrichCodexConfig to include
fullSchema: getCodexFullSchema() (mirror the codex/config.ts reference) so the
graphql_type_detail tool and related code that expects config.fullSchema can
operate correctly.

In `@src/tools/graphql-type-detail.tool.ts`:
- Around line 1-2: Update the file header comment at the top of
src/tools/graphql-type-detail.tool.ts: replace the existing SPDX and copyright
lines ("SPDX-License-Identifier: GPL-3.0" and "Copyright 2020-2025") with the
new license identifier "PolyForm-Shield-1.0.0" and the updated copyright range
"2020–2026" so the header reflects the PR objective.
- Around line 292-314: The skip-list array checked against typeName contains a
duplicated entry 'Int'; remove the duplicate so each GraphQL built-in type
appears only once (update the array literal in
src/tools/graphql-type-detail.tool.ts where the check against typeName is
performed), ensuring you keep a single 'Int' and scan the array for any other
accidental duplicates.
- Around line 97-103: The module-level Map typeCache is shared across different
DocumentNode inputs and getAllTypeDefinitions returns early when typeCache.size
> 0, causing stale type lookups for different schemas; modify
getAllTypeDefinitions to either clear/repopulate typeCache at the start of the
function (e.g., typeCache.clear()) or replace typeCache with a
WeakMap<DocumentNode, Map<string, TypeDefinition>> keyed by the incoming
DocumentNode so each document has its own cache; update all references to
typeCache accordingly and ensure getAllTypeDefinitions and any callers use the
new per-document cache logic.

In `@tests/graphql-agent.test.ts`:
- Around line 601-611: The tests instantiate createGraphQLTypeDetailTool but
pass GraphQLProjectConfig objects with schemaContent only; add a fullSchema
string to each test config so createGraphQLTypeDetailTool can instantiate (e.g.,
set config.fullSchema to a minimal valid GraphQL schema like "type Query {
dummy: String }" or the real Codex schema); update the configs used around the
blocks that define codexConfig and the other two test configs so tool creation
returns a tool instance and the existing expect(tool).toBeDefined() assertions
succeed, leaving all other config fields and test logic unchanged.

---

Outside diff comments:
In `@src/project-manager.ts`:
- Around line 1-2: Update the file header comment to use the new
PolyForm-Shield-1.0.0 license and adjust the copyright year range to 2020–2026;
specifically replace the existing two-line header (the top-of-file comment
containing the SPDX line and copyright notice) so it names
"PolyForm-Shield-1.0.0" as the license and shows "2020–2026" as the copyright
years.

In `@src/tools/graphql-schema-info.tool.ts`:
- Around line 1-2: Update the file header to the new license and year: replace
the existing copyright line "Copyright 2020-2025 SubQuery Pte Ltd authors &
contributors" with "Copyright 2020-2026 SubQuery Pte Ltd authors & contributors"
and change the SPDX identifier from "SPDX-License-Identifier: GPL-3.0" to
"SPDX-License-Identifier: PolyForm-Shield-1.0.0" so the top-of-file header in
src/tools/graphql-schema-info.tool.ts matches the PR licensing update.

In `@tests/graphql-agent.test.ts`:
- Around line 1-2: Replace the outdated license header by updating the copyright
and SPDX identifier: change the first two header lines that currently contain
"Copyright 2020-2025 SubQuery Pte Ltd authors & contributors" and
"SPDX-License-Identifier: GPL-3.0" to use the new copyright range "Copyright
2020–2026 SubQuery Pte Ltd authors & contributors" and the license identifier
"SPDX-License-Identifier: PolyForm-Shield-1.0.0"; ensure the updated header
appears at the top of tests/graphql-agent.test.ts (i.e., replace the existing
header strings shown in the diff).

---

Nitpick comments:
In `@src/codex/config.ts`:
- Around line 17-41: The getCodexConfig function returns the Codex project
config but omits the suggestedQuestions field; update getCodexConfig to include
a suggestedQuestions: string[] property with a handful of concise example
queries (e.g., "Get top NFT collections by volume", "Fetch wallet balances for
address X", "Show token price history for symbol Y", "List recent DEX trades for
pair Z") so the returned object (in getCodexConfig) matches the enriched config
shape used in src/types.ts and src/llm.ts and improves UX; simply add the
suggestedQuestions array into the returned object (alongside domainCapabilities,
schemaContent, fullSchema, etc.).

In `@src/codex/schema.ts`:
- Around line 56-63: The getCodexIntrospectionSchema function currently calls
buildSchema on the SDL returned by getCodexFullSchema without handling
buildSchema failures; wrap the buildSchema/introspectionFromSchema sequence in a
try-catch inside getCodexIntrospectionSchema, catch any error thrown (e.g., from
buildSchema), augment or rethrow a new Error that includes a clear message like
"Failed to build Codex schema" plus the original error message/stack, and ensure
introspectionSchemaCache is not set when an error occurs; reference
getCodexIntrospectionSchema, getCodexFullSchema, buildSchema,
introspectionFromSchema, and introspectionSchemaCache when making the change.
- Around line 29-48: Wrap the file reads in getCodexQuerySchema and
getCodexFullSchema with try-catch to handle readFileSync errors: inside each
function (referencing querySchemaCache, fullSchemaCache, getCodexQuerySchema,
getCodexFullSchema) catch any thrown error from readFileSync(join(__dirname,
...)) and throw or log a new Error with a descriptive message that includes the
target filename and the original error message to aid debugging (e.g., "failed
to load query.graphql: <orig error>"). Ensure the caches are only set on
successful read and the original error is preserved or rethrown.

In `@src/index.ts`:
- Around line 60-72: The current isCodex detection
(endpoint.includes('codex.io') || endpoint.includes('codex') ) is too
permissive; update the detection in the code path that sets isCodex (and uses
getCodexConfig + persistentService.save) to parse the endpoint as a URL and
match the URL.hostname against a small allowlist of known Codex hostnames (e.g.,
'codex.io', 'api.codex.io') or a stricter hostname regex, falling back to false
on parse errors—this ensures only exact domains trigger the Codex branch instead
of any string containing "codex".

In `@src/project-manager.ts`:
- Around line 99-106: The isCodexEndpoint function uses
hostname.includes('codex') which is too broad; change its logic to only accept
the canonical codex domain and its subdomains by replacing the includes check
with a precise match such as testing hostname === 'codex.io' ||
hostname.endsWith('.codex.io') or using a case-insensitive regex like
/^(.+\.)?codex\.io$/i inside isCodexEndpoint so that only codex.io and its
subdomains are detected (leave the try/catch and URL parsing intact).

In `@src/tools/graphql-schema-info.tool.ts`:
- Around line 348-371: The parameter name _schemaContent in function
generateCodexSchemaInfo is misleading because it is used in the template; rename
the parameter to schemaContent and update its usage inside
generateCodexSchemaInfo to match (replace _schemaContent with schemaContent) so
the function signature and body are consistent with other generator functions.

In `@src/tools/graphql-type-detail.tool.ts`:
- Around line 268-278: The valueNodeToString function must handle all GraphQL
ValueNode kinds: update valueNodeToString to switch on value.kind (or use ifs)
and return correct strings for EnumValue (use value.name.value), ListValue (map
value.values recursively with valueNodeToString and join with commas inside
brackets), ObjectValue (map value.fields to "name: <recursive>" and wrap in
braces), and other node kinds like IntValue/FloatValue/StringValue/BooleanValue
by returning their value/text; keep the existing NullValue handling and
recursion for nested lists/objects so callers of valueNodeToString get accurate
string representations.

In `@tests/graphql-agent.test.ts`:
- Line 618: The test currently calls createGraphQLTools(service, codexConfig,
loggerMock) but may pass because tools include graphql_type_detail for the wrong
reason; update the test config (codexConfig) to include a valid fullSchema so
createGraphQLTools follows the CODEX path and populates graphql_type_detail as
intended. Specifically, modify the test setup where codexConfig is defined to
include fullSchema (a representative GraphQL schema string/object) before
calling createGraphQLTools, then assert the tool presence/fields to ensure
graphql_type_detail was produced by the CODEX logic in createGraphQLTools.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b336acd-c041-48d0-8564-f5485317bd7a

📥 Commits

Reviewing files that changed from the base of the PR and between ee2e183 and 8b06c7e.

📒 Files selected for processing (16)
  • .gitignore
  • src/codex/config.ts
  • src/codex/full_schema.graphql
  • src/codex/query.graphql
  • src/codex/schema.ts
  • src/graphql.service.ts
  • src/index.ts
  • src/llm.ts
  • src/project-manager.ts
  • src/prompts.ts
  • src/tools/graphql-schema-info.tool.ts
  • src/tools/graphql-type-detail.tool.ts
  • src/tools/graphql-validate-excute.tool.ts
  • src/tools/index.ts
  • src/types.ts
  • tests/graphql-agent.test.ts
💤 Files with no reviewable changes (1)
  • src/tools/graphql-validate-excute.tool.ts

Comment on lines +614 to +683
token(
"""Input for a token ID."""
input: TokenInput!
): EnhancedToken!
"""Returns a list of tokens by their addresses & network id, with pagination."""
tokens(
"""A list of token ID input."""
ids: [TokenInput!]
): [EnhancedToken]!
"""Returns metadata for a pair of tokens."""
pairMetadata(
"""The ID of the pair (`address:networkId`)."""
pairId: String!
"""The token of interest within the pair. Can be `token0` or `token1`."""
quoteToken: QuoteToken
"""The type of statistics returned. Can be `FILTERED` or `UNFILTERED`. Default is `UNFILTERED`."""
statsType: TokenPairStatisticsType
): PairMetadata!
"""Returns new tokens listed over the last three days."""
getLatestPairs(
"""The maximum number of tokens to return."""
limit: Int
"""The list of network IDs to filter by."""
networkFilter: [Int!]
"""The list of exchanges to filter by."""
exchangeFilter: [String!]
"""The minimum amount of liquidity to filter by."""
minLiquidityFilter: Int
"""The cursor to use for pagination."""
cursor: String
): LatestPairConnection @deprecated(reason: "This query is longer supported. Instead use filterPairs with sort order on createdAt DESC")
"""Returns a list of pairs containing a given token."""
listPairsForToken(
"""The maximum number of pairs to return."""
limit: Int
"""The network ID the token is deployed on."""
networkId: Int!
"""The contract address of the token."""
tokenAddress: String!
): [Pair]!
"""Returns a list of pair metadata for a token."""
listPairsWithMetadataForToken(
"""The maximum number of pairs to return."""
limit: Int
"""The network ID the token is deployed on."""
networkId: Int!
"""The contract address of the token."""
tokenAddress: String!
): ListPairsForTokenResponse!
"""Returns a list of token simple chart data (sparklines) for the given tokens."""
tokenSparklines(
input: TokenSparklineInput!
): [TokenSparkline!]!
"""Returns list of token balances that a wallet has."""
balances(
input: BalancesInput!
): BalancesResponse!
"""Returns list of wallets that hold a given token, ordered by holdings descending. Also has the unique count of holders for that token."""
holders(
input: HoldersInput!
): HoldersResponse!
"""Returns a list of latest tokens."""
getLatestTokens(
"""The maximum number of tokens to return. Maximum number of tokens is 50."""
limit: Int
"""The list of network IDs to filter by."""
networkFilter: [Int!]
"""The offset to use for pagination."""
offset: Int
): LatestTokenConnection @deprecated(reason: "This query is no longer supported. Use `filterTokens` with a createdAt: DESC filter instead.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify duplicate field definitions in query.graphql
echo "Checking for duplicate field definitions in Query type..."
fd -t f 'query.graphql' --exec grep -oE '^\s{4}[a-zA-Z_][a-zA-Z0-9_]*\(' {} | sort | uniq -c | sort -rn | head -20

Repository: subquery/sq-graphql-agent.js

Length of output: 708


Remove duplicate Query field definitions that will cause schema parsing errors.

The Query type contains 10 duplicate field definitions that will cause buildSchema to fail:

  • token / tokens
  • pairMetadata
  • getLatestPairs
  • listPairsForToken / listPairsWithMetadataForToken
  • tokenSparklines
  • balances / holders
  • getLatestTokens

GraphQL does not allow duplicate field names within the same type. Remove one of each duplicate definition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/codex/query.graphql` around lines 614 - 683, The Query type contains
duplicate field definitions that break schema build; remove the redundant
duplicate definitions so each field name appears only once — specifically dedupe
the following Query fields: token, tokens, pairMetadata, getLatestPairs,
listPairsForToken, listPairsWithMetadataForToken, tokenSparklines, balances,
holders, and getLatestTokens. For each pair keep the correct/most complete
definition (arguments, docs, and deprecation directives) and delete the other,
then run schema validation to ensure no other duplicates or reference changes
are needed.

Comment on lines +292 to +314
if (
[
'ID',
'String',
'Int',
'Float',
'Boolean',
'Int',
'Query',
'Mutation',
'Subscription',
'__Schema',
'__Type',
'__TypeKind',
'__Field',
'__InputValue',
'__EnumValue',
'__Directive',
'__DirectiveLocation',
].includes(typeName)
) {
return null;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Duplicate entry 'Int' in the skip list.

'Int' appears twice in the array (lines 297 and 299).

🧹 Remove duplicate
   if (
     [
       'ID',
       'String',
       'Int',
       'Float',
       'Boolean',
-      'Int',
       'Query',
       'Mutation',
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (
[
'ID',
'String',
'Int',
'Float',
'Boolean',
'Int',
'Query',
'Mutation',
'Subscription',
'__Schema',
'__Type',
'__TypeKind',
'__Field',
'__InputValue',
'__EnumValue',
'__Directive',
'__DirectiveLocation',
].includes(typeName)
) {
return null;
}
if (
[
'ID',
'String',
'Int',
'Float',
'Boolean',
'Query',
'Mutation',
'Subscription',
'__Schema',
'__Type',
'__TypeKind',
'__Field',
'__InputValue',
'__EnumValue',
'__Directive',
'__DirectiveLocation',
].includes(typeName)
) {
return null;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-type-detail.tool.ts` around lines 292 - 314, The skip-list
array checked against typeName contains a duplicated entry 'Int'; remove the
duplicate so each GraphQL built-in type appears only once (update the array
literal in src/tools/graphql-type-detail.tool.ts where the check against
typeName is performed), ensuring you keep a single 'Int' and scan the array for
any other accidental duplicates.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
src/tools/graphql-type-detail.tool.ts (2)

97-103: ⚠️ Potential issue | 🟠 Major

Module-level cache is not keyed by document, causing incorrect type lookups.

typeCache is shared at module level, but getAllTypeDefinitions returns early if typeCache.size > 0 without verifying it was populated from the same documentNode. If the tool is called with different schemas (different Codex configs or during testing), it will return stale type definitions from the first schema parsed.

🐛 Suggested fix: track the document reference
+let cachedDocumentRef: DocumentNode | null = null;
+
 function getAllTypeDefinitions(documentNode: DocumentNode): Map<string, TypeDefinition> {
-  if (typeCache.size > 0) {
+  if (typeCache.size > 0 && cachedDocumentRef === documentNode) {
     return typeCache;
   }
+  
+  // New document, clear old cache
+  typeCache.clear();
+  cachedDocumentRef = documentNode;

   for (const definition of documentNode.definitions) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-type-detail.tool.ts` around lines 97 - 103, The
module-level typeCache is shared across different DocumentNode inputs causing
stale lookups; change the cache to be keyed by the DocumentNode (e.g., replace
typeCache: Map<string, TypeDefinition> with a WeakMap<DocumentNode, Map<string,
TypeDefinition>> or a Map keyed by a document hash) and update
getAllTypeDefinitions to look up the per-document Map using the incoming
documentNode, populate and return that per-document cache if missing, and avoid
the current early return that assumes cached entries belong to the same
document.

292-314: ⚠️ Potential issue | 🟡 Minor

Duplicate entry 'Int' in the skip list.

'Int' appears twice in the array (lines 296 and 299). This is harmless but indicates copy-paste oversight.

🧹 Remove duplicate
   if (
     [
       'ID',
       'String',
       'Int',
       'Float',
       'Boolean',
-      'Int',
       'Query',
       'Mutation',
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-type-detail.tool.ts` around lines 292 - 314, The skip-list
array used in the includes check that compares typeName contains a duplicate
entry 'Int'; update the array inside the conditional in
graphql-type-detail.tool.ts (the block that checks if ( [ ...
].includes(typeName) ) ) to remove the extra 'Int' so each scalar/type appears
only once, leaving the rest of the entries unchanged.
src/project-manager.ts (1)

108-156: ⚠️ Potential issue | 🟠 Major

Add fullSchema to enrichCodexConfig return object to fix graphql_type_detail tool for Codex.

The enrichCodexConfig method sets schemaContent but omits fullSchema. The graphql_type_detail tool (lines 20-24 in src/tools/graphql-type-detail.tool.ts) requires config.fullSchema; when missing, it logs an error and returns null, disabling type detail functionality for Codex configs processed through this path.

🐛 Add fullSchema to the returned config
-import {getCodexQuerySchema} from './codex/schema.js';
+import {getCodexFullSchema, getCodexQuerySchema} from './codex/schema.js';

Then in enrichCodexConfig:

 private enrichCodexConfig(config: GraphQLProjectConfig, logger?: Logger): GraphQLProjectConfig {
   logger?.info({endpoint: config.endpoint}, 'Detected Codex endpoint, using embedded schema');

   return {
     ...config,
     schemaContent: getCodexQuerySchema(),
+    fullSchema: getCodexFullSchema(),
     nodeType: GraphqlProvider.CODEX,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/project-manager.ts` around lines 108 - 156, enrichCodexConfig currently
returns a GraphQLProjectConfig with schemaContent set but omits fullSchema,
which breaks the graphql_type_detail tool that expects config.fullSchema; update
enrichCodexConfig to include fullSchema (e.g., set fullSchema:
getCodexQuerySchema()) alongside schemaContent so GraphQLProjectConfig consumers
like graphql_type_detail receive the full schema string.
🧹 Nitpick comments (3)
src/prompts.ts (1)

24-42: Keep a discovery step before graphql_type_detail.

graphql_type_detail only helps once the model already knows the return type name. Making it the mandatory first step for Codex pushes the agent to guess type names on unfamiliar requests; I'd keep graphql_schema_info for discovery, then require graphql_type_detail before writing the query.

Also applies to: 51-51

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/prompts.ts` around lines 24 - 42, Update the Codex instruction block
(const codexInstructions) to require a discovery step: instruct the agent to
first call graphql_schema_info to discover available types and operations, then
call graphql_type_detail for the specific return type name before constructing
any queries; explicitly state that graphql_schema_info is used for discovery and
graphql_type_detail is required only after a type name is known, and keep the
existing notes about sorting and validity unchanged.
src/project-manager.ts (1)

99-106: Endpoint detection may produce false positives.

The check hostname.includes('codex') could match unintended domains (e.g., mycodex-api.example.com, codexreview.com). Consider tightening to specific known Codex domains.

♻️ Suggested stricter check
 private isCodexEndpoint(endpoint: string): boolean {
   try {
     const hostname = new URL(endpoint).hostname.toLowerCase();
-    return hostname.includes('codex.io') || hostname.includes('codex');
+    return hostname.endsWith('codex.io') || hostname === 'codex.io';
   } catch {
     return false;
   }
 }

Please confirm the expected Codex endpoint hostname patterns to ensure false positives are avoided.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/project-manager.ts` around lines 99 - 106, The isCodexEndpoint method
currently does a loose substring check that yields false positives; replace the
broad hostname.includes('codex') logic with a whitelist-based domain check:
define a list of allowed Codex host patterns and test hostname equality or
subdomain membership (for example, hostname === 'codex.io' or
hostname.endsWith('.codex.io') or matching entries in an
allowedHostnames/allowedSuffixes array) inside isCodexEndpoint, keeping the
existing URL parsing and catch; update references to isCodexEndpoint
accordingly.
src/tools/graphql-schema-info.tool.ts (1)

349-371: Consider removing the underscore prefix from the parameter name.

The parameter _schemaContent uses an underscore prefix (typically indicating unused), but it is actually used on line 353. Rename to schemaContent for consistency with the other generator functions.

♻️ Suggested fix
-function generateCodexSchemaInfo(_schemaContent: string): string {
+function generateCodexSchemaInfo(schemaContent: string): string {
   return `📖 CODEX GRAPHQL API SCHEMA & RULES:
 
 🔍 CODEX SUPPORTED QUERIES:
-${_schemaContent}
+${schemaContent}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-schema-info.tool.ts` around lines 349 - 371, The parameter
name _schemaContent in function generateCodexSchemaInfo should be renamed to
schemaContent to avoid the misleading unused-parameter underscore; update the
function signature generateCodexSchemaInfo(_schemaContent: string) to
generateCodexSchemaInfo(schemaContent: string) and replace all uses of
_schemaContent inside the function body with schemaContent, and search for any
callers of generateCodexSchemaInfo to update their argument name references if
they relied on the old parameter name in typed contexts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/e2e-codex.ts`:
- Line 85: The top-level call to main() can produce unhandled rejections because
initializeProjectConfig runs before the inner try/catch; wrap the top-level
promise so any rejection flows through the same error-reporting path—replace the
bare main() invocation with a top-level promise handler that calls main() and
catches errors (e.g., main().catch(...)) and forwards the error into the
existing reporting/exit logic used inside main (so failures from
initializeProjectConfig are logged and process.exit is invoked consistently);
locate the call to main() in scripts/e2e-codex.ts and update it to catch and
handle rejections.

In `@src/graphql.service.ts`:
- Around line 191-200: Guard the introspection branch in graphql.service.ts by
validating the fetched/introspection payload before calling buildClientSchema:
when this.config.fullSchema is falsy, check the introspection object
(this.config.introspectionSchema or the result of await
this.fetchIntrospectionSchema()) for a valid data.__schema (and/or absence of an
errors array) and if missing/invalid, throw or rethrow the GraphQL error
returned by fetchIntrospectionSchema so the original introspection error is
surfaced instead of letting buildClientSchema fail; update the code around
schema = buildClientSchema(introspectionData) to perform this validation and
surface the original error.

In `@src/index.ts`:
- Around line 61-72: The current detection uses a raw substring match on
endpoint to set isCodex, causing false positives; parse the URL and check the
hostname instead (e.g., derive hostname from endpoint and test hostname ===
'codex.io' or hostname.endsWith('.codex.io') or other canonical hostnames)
before calling getCodexConfig(endpoint, authorization), logging via logger?.info
and saving with persistentService.save; update the isCodex logic to use the
parsed hostname check so only true Codex hosts short-circuit to the bundled
config path.

---

Duplicate comments:
In `@src/project-manager.ts`:
- Around line 108-156: enrichCodexConfig currently returns a
GraphQLProjectConfig with schemaContent set but omits fullSchema, which breaks
the graphql_type_detail tool that expects config.fullSchema; update
enrichCodexConfig to include fullSchema (e.g., set fullSchema:
getCodexQuerySchema()) alongside schemaContent so GraphQLProjectConfig consumers
like graphql_type_detail receive the full schema string.

In `@src/tools/graphql-type-detail.tool.ts`:
- Around line 97-103: The module-level typeCache is shared across different
DocumentNode inputs causing stale lookups; change the cache to be keyed by the
DocumentNode (e.g., replace typeCache: Map<string, TypeDefinition> with a
WeakMap<DocumentNode, Map<string, TypeDefinition>> or a Map keyed by a document
hash) and update getAllTypeDefinitions to look up the per-document Map using the
incoming documentNode, populate and return that per-document cache if missing,
and avoid the current early return that assumes cached entries belong to the
same document.
- Around line 292-314: The skip-list array used in the includes check that
compares typeName contains a duplicate entry 'Int'; update the array inside the
conditional in graphql-type-detail.tool.ts (the block that checks if ( [ ...
].includes(typeName) ) ) to remove the extra 'Int' so each scalar/type appears
only once, leaving the rest of the entries unchanged.

---

Nitpick comments:
In `@src/project-manager.ts`:
- Around line 99-106: The isCodexEndpoint method currently does a loose
substring check that yields false positives; replace the broad
hostname.includes('codex') logic with a whitelist-based domain check: define a
list of allowed Codex host patterns and test hostname equality or subdomain
membership (for example, hostname === 'codex.io' or
hostname.endsWith('.codex.io') or matching entries in an
allowedHostnames/allowedSuffixes array) inside isCodexEndpoint, keeping the
existing URL parsing and catch; update references to isCodexEndpoint
accordingly.

In `@src/prompts.ts`:
- Around line 24-42: Update the Codex instruction block (const
codexInstructions) to require a discovery step: instruct the agent to first call
graphql_schema_info to discover available types and operations, then call
graphql_type_detail for the specific return type name before constructing any
queries; explicitly state that graphql_schema_info is used for discovery and
graphql_type_detail is required only after a type name is known, and keep the
existing notes about sorting and validity unchanged.

In `@src/tools/graphql-schema-info.tool.ts`:
- Around line 349-371: The parameter name _schemaContent in function
generateCodexSchemaInfo should be renamed to schemaContent to avoid the
misleading unused-parameter underscore; update the function signature
generateCodexSchemaInfo(_schemaContent: string) to
generateCodexSchemaInfo(schemaContent: string) and replace all uses of
_schemaContent inside the function body with schemaContent, and search for any
callers of generateCodexSchemaInfo to update their argument name references if
they relied on the old parameter name in typed contexts.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a84fe35a-9f00-4f84-b845-44933ddb78f6

📥 Commits

Reviewing files that changed from the base of the PR and between 8b06c7e and 93eef31.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (19)
  • LICENSE
  • eslint.config.cjs
  • package.json
  • scripts/e2e-codex.ts
  • src/codex/config.ts
  • src/codex/schema.ts
  • src/graphql.service.ts
  • src/index.ts
  • src/llm.ts
  • src/project-manager.ts
  • src/prompts.ts
  • src/tools/graphql-execute.tool.ts
  • src/tools/graphql-schema-info.tool.ts
  • src/tools/graphql-type-detail.tool.ts
  • src/tools/graphql-validate-excute.tool.ts
  • src/tools/graphql-validator.tool.ts
  • src/tools/index.ts
  • src/types.ts
  • src/utils.ts
✅ Files skipped from review due to trivial changes (4)
  • LICENSE
  • src/tools/graphql-execute.tool.ts
  • src/tools/graphql-validator.tool.ts
  • src/utils.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/tools/index.ts
  • src/tools/graphql-validate-excute.tool.ts
  • src/codex/config.ts
  • src/codex/schema.ts

}
}

main();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Catch rejections from main().

initializeProjectConfig runs before the inner try/catch, so an early failure currently becomes an unhandled rejection. Wrapping the top-level promise keeps the CLI on the same error-reporting path.

Suggested fix
-main();
+void main().catch((error) => {
+  console.error('[Error]:', error instanceof Error ? error.message : error);
+  process.exit(1);
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
main();
void main().catch((error) => {
console.error('[Error]:', error instanceof Error ? error.message : error);
process.exit(1);
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/e2e-codex.ts` at line 85, The top-level call to main() can produce
unhandled rejections because initializeProjectConfig runs before the inner
try/catch; wrap the top-level promise so any rejection flows through the same
error-reporting path—replace the bare main() invocation with a top-level promise
handler that calls main() and catches errors (e.g., main().catch(...)) and
forwards the error into the existing reporting/exit logic used inside main (so
failures from initializeProjectConfig are logged and process.exit is invoked
consistently); locate the call to main() in scripts/e2e-codex.ts and update it
to catch and handle rejections.

Comment on lines +191 to +200
if (this.config.fullSchema) {
// Parse from full GraphQL SDL schema string
schema = buildSchema(this.config.fullSchema);
} else {
// Fallback to introspection schema
let introspectionData = this.config.introspectionSchema;
if (!introspectionData) {
introspectionData = await this.fetchIntrospectionSchema();
}
schema = buildClientSchema(introspectionData);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate the introspection payload before calling buildClientSchema.

When introspection is disabled, many GraphQL servers still return HTTP 200 with an errors array. In that case Line 200 turns a fetch/introspection problem into a generic schema-build failure. Guard for __schema here, and surface the GraphQL error from fetchIntrospectionSchema() instead.

Suggested fix
       let introspectionData = this.config.introspectionSchema;
       if (!introspectionData) {
         introspectionData = await this.fetchIntrospectionSchema();
       }
+      if (!introspectionData?.__schema) {
+        throw new Error('Endpoint did not return a valid GraphQL introspection schema');
+      }
       schema = buildClientSchema(introspectionData);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (this.config.fullSchema) {
// Parse from full GraphQL SDL schema string
schema = buildSchema(this.config.fullSchema);
} else {
// Fallback to introspection schema
let introspectionData = this.config.introspectionSchema;
if (!introspectionData) {
introspectionData = await this.fetchIntrospectionSchema();
}
schema = buildClientSchema(introspectionData);
if (this.config.fullSchema) {
// Parse from full GraphQL SDL schema string
schema = buildSchema(this.config.fullSchema);
} else {
// Fallback to introspection schema
let introspectionData = this.config.introspectionSchema;
if (!introspectionData) {
introspectionData = await this.fetchIntrospectionSchema();
}
if (!introspectionData?.__schema) {
throw new Error('Endpoint did not return a valid GraphQL introspection schema');
}
schema = buildClientSchema(introspectionData);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/graphql.service.ts` around lines 191 - 200, Guard the introspection
branch in graphql.service.ts by validating the fetched/introspection payload
before calling buildClientSchema: when this.config.fullSchema is falsy, check
the introspection object (this.config.introspectionSchema or the result of await
this.fetchIntrospectionSchema()) for a valid data.__schema (and/or absence of an
errors array) and if missing/invalid, throw or rethrow the GraphQL error
returned by fetchIntrospectionSchema so the original introspection error is
surfaced instead of letting buildClientSchema fail; update the code around
schema = buildClientSchema(introspectionData) to perform this validation and
surface the original error.

…(graph.codex.io)

   - Fix initializeProjectConfig to use URL parsing instead of substring match
   - Consolidate Codex config into getCodexConfig() - removes redundant enrichCodexConfig()
   - ProjectManager now returns Codex configs directly without re-processing
   - Fix prompt: typeName -> typeNames (array parameter) for graphql_type_detail tool
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/index.ts`:
- Around line 60-66: The code only reads customHeaders?.Authorization
(case-sensitive) and logs the full endpoint in logger?.info when
isCodexEndpoint(endpoint) is true; update the lookup to accept common casing
variations (e.g., check customHeaders.Authorization ||
customHeaders.authorization || customHeaders['authorization']) and use that
normalized value for getCodexConfig, and change the logger?.info call to avoid
emitting the raw endpoint (e.g., log a masked or redacted endpoint string or
just the fact that a Codex endpoint was detected) so secrets in URLs are not
exposed; adjust references to authorization, customHeaders, isCodexEndpoint,
logger?.info and getCodexConfig accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d80f7c6a-62a8-4591-a3ea-b31b96c2cc57

📥 Commits

Reviewing files that changed from the base of the PR and between 93eef31 and ca5e4c6.

📒 Files selected for processing (5)
  • src/codex/config.ts
  • src/index.ts
  • src/project-manager.ts
  • src/prompts.ts
  • src/utils.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/utils.ts
  • src/codex/config.ts
  • src/project-manager.ts

Comment on lines 60 to +66
const authorization = customHeaders?.Authorization;

// Check if this is Codex
if (isCodexEndpoint(endpoint)) {
// For Codex, use the bundled config directly (skip LLM analysis)
logger?.info({endpoint}, 'Detected Codex endpoint, using bundled config');
const config = getCodexConfig(endpoint, authorization);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle auth header casing and avoid logging raw endpoints.

This path only reads Authorization with exact casing and logs full endpoint. Lowercase headers can drop auth, and raw endpoint logging can expose secrets in URLs.

🔧 Proposed fix
-  const authorization = customHeaders?.Authorization;
+  const authorization =
+    customHeaders?.Authorization ??
+    customHeaders?.authorization ??
+    Object.entries(customHeaders ?? {}).find(([key]) => key.toLowerCase() === 'authorization')?.[1];

@@
-    logger?.info({endpoint}, 'Detected Codex endpoint, using bundled config');
+    const endpointHost = (() => {
+      try {
+        return new URL(endpoint).hostname;
+      } catch {
+        return 'invalid-endpoint';
+      }
+    })();
+    logger?.info({endpointHost}, 'Detected Codex endpoint, using bundled config');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const authorization = customHeaders?.Authorization;
// Check if this is Codex
if (isCodexEndpoint(endpoint)) {
// For Codex, use the bundled config directly (skip LLM analysis)
logger?.info({endpoint}, 'Detected Codex endpoint, using bundled config');
const config = getCodexConfig(endpoint, authorization);
const authorization =
customHeaders?.Authorization ??
customHeaders?.authorization ??
Object.entries(customHeaders ?? {}).find(([key]) => key.toLowerCase() === 'authorization')?.[1];
// Check if this is Codex
if (isCodexEndpoint(endpoint)) {
// For Codex, use the bundled config directly (skip LLM analysis)
const endpointHost = (() => {
try {
return new URL(endpoint).hostname;
} catch {
return 'invalid-endpoint';
}
})();
logger?.info({endpointHost}, 'Detected Codex endpoint, using bundled config');
const config = getCodexConfig(endpoint, authorization);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/index.ts` around lines 60 - 66, The code only reads
customHeaders?.Authorization (case-sensitive) and logs the full endpoint in
logger?.info when isCodexEndpoint(endpoint) is true; update the lookup to accept
common casing variations (e.g., check customHeaders.Authorization ||
customHeaders.authorization || customHeaders['authorization']) and use that
normalized value for getCodexConfig, and change the logger?.info call to avoid
emitting the raw endpoint (e.g., log a masked or redacted endpoint string or
just the fact that a Codex endpoint was detected) so secrets in URLs are not
exposed; adjust references to authorization, customHeaders, isCodexEndpoint,
logger?.info and getCodexConfig accordingly.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/tools/graphql-type-detail.tool.ts (1)

291-313: ⚠️ Potential issue | 🟡 Minor

Duplicate entry 'Int' in the skip list.

'Int' appears at both line 295 and line 298 in the array. This was flagged in a previous review and should be removed.

🧹 Remove duplicate
     [
       'ID',
       'String',
       'Int',
       'Float',
       'Boolean',
-      'Int',
       'Query',
       'Mutation',
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-type-detail.tool.ts` around lines 291 - 313, The skip-list
array used in the if condition that checks typeName.includes(...) contains a
duplicated 'Int' entry; remove the duplicate from the array literal so each
scalar/type name appears only once (i.e., edit the array in the if (...) { ... }
block that calls .includes(typeName) and keep a single 'Int' entry), or
alternatively replace the literal with a deduplicated Set of names, ensuring the
logic using the typeName variable and the includes call remains correct.
🧹 Nitpick comments (1)
src/tools/graphql-type-detail.tool.ts (1)

267-277: Complex value types (lists, objects) are not fully serialized.

ListValue and ObjectValue nodes will return empty strings since they use values/fields properties instead of value. Default values like field: [Int] = [1, 2] or field: Input = {a: 1} won't render correctly.

This is acceptable for typical use cases but worth noting if complex defaults are common in your schemas.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-type-detail.tool.ts` around lines 267 - 277, The
valueNodeToString function only handles NullValue and scalar nodes via the value
property, so ListValue and ObjectValue nodes (which use values/fields) serialize
to empty strings; update valueNodeToString to handle kind === 'ListValue' by
recursively mapping valueNodeToString over value.values and joining with ", "
(surrounded by brackets) and handle kind === 'ObjectValue' by iterating
value.fields, serializing each field as name.value + ": " +
valueNodeToString(field.value) and joining with ", " (surrounded by braces);
keep existing NullValue and scalar branches and ensure recursion covers nested
lists/objects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/tools/graphql-type-detail.tool.ts`:
- Around line 244-254: inputFieldNodeToString currently ignores input field
default values because the parameter type lacks defaultValue; update the
function signature for inputFieldNodeToString to accept field.defaultValue?:
ValueNode (or include defaultValue in the existing field type), then, after
building the name and type string (using typeNodeToString), if
field.defaultValue is present append " = " plus a serialized value using the
existing valueNodeToString helper (or implement one) so fields like `field:
String = "default"` are preserved; reference the function
inputFieldNodeToString, the TypeNode/ValueNode types, and use valueNodeToString
for serialization.

---

Duplicate comments:
In `@src/tools/graphql-type-detail.tool.ts`:
- Around line 291-313: The skip-list array used in the if condition that checks
typeName.includes(...) contains a duplicated 'Int' entry; remove the duplicate
from the array literal so each scalar/type name appears only once (i.e., edit
the array in the if (...) { ... } block that calls .includes(typeName) and keep
a single 'Int' entry), or alternatively replace the literal with a deduplicated
Set of names, ensuring the logic using the typeName variable and the includes
call remains correct.

---

Nitpick comments:
In `@src/tools/graphql-type-detail.tool.ts`:
- Around line 267-277: The valueNodeToString function only handles NullValue and
scalar nodes via the value property, so ListValue and ObjectValue nodes (which
use values/fields) serialize to empty strings; update valueNodeToString to
handle kind === 'ListValue' by recursively mapping valueNodeToString over
value.values and joining with ", " (surrounded by brackets) and handle kind ===
'ObjectValue' by iterating value.fields, serializing each field as name.value +
": " + valueNodeToString(field.value) and joining with ", " (surrounded by
braces); keep existing NullValue and scalar branches and ensure recursion covers
nested lists/objects.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 781fdc38-9f1e-4506-bff8-3ac40ff082d2

📥 Commits

Reviewing files that changed from the base of the PR and between ca5e4c6 and 4273883.

📒 Files selected for processing (1)
  • src/tools/graphql-type-detail.tool.ts

Comment on lines +244 to +254
function inputFieldNodeToString(field: {name: {value: string}; type: TypeNode; description?: {value: string}}): string {
const parts: string[] = [];

if (field.description) {
parts.push(`"""${field.description.value}"""`);
}

parts.push(`${field.name.value}: ${typeNodeToString(field.type)}`);

return parts.join('');
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Input field default values are not serialized.

The type signature excludes defaultValue, so input fields with defaults (e.g., field: String = "default") will lose that information in the output. This could cause confusion when inspecting input types.

🔧 Proposed fix to include default values
-function inputFieldNodeToString(field: {name: {value: string}; type: TypeNode; description?: {value: string}}): string {
+function inputFieldNodeToString(field: {name: {value: string}; type: TypeNode; description?: {value: string}; defaultValue?: {value?: boolean | string | number; kind?: string}}): string {
   const parts: string[] = [];

   if (field.description) {
     parts.push(`"""${field.description.value}"""`);
   }

-  parts.push(`${field.name.value}: ${typeNodeToString(field.type)}`);
+  let fieldStr = `${field.name.value}: ${typeNodeToString(field.type)}`;
+  if (field.defaultValue) {
+    fieldStr += ` = ${valueNodeToString(field.defaultValue)}`;
+  }
+  parts.push(fieldStr);

   return parts.join('');
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/graphql-type-detail.tool.ts` around lines 244 - 254,
inputFieldNodeToString currently ignores input field default values because the
parameter type lacks defaultValue; update the function signature for
inputFieldNodeToString to accept field.defaultValue?: ValueNode (or include
defaultValue in the existing field type), then, after building the name and type
string (using typeNodeToString), if field.defaultValue is present append " = "
plus a serialized value using the existing valueNodeToString helper (or
implement one) so fields like `field: String = "default"` are preserved;
reference the function inputFieldNodeToString, the TypeNode/ValueNode types, and
use valueNodeToString for serialization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant