feat: add IngestTraces client for dedicated ingest service #501
base: sc-58230/local-ingest-models
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,10 +1,14 @@ | ||
| import logging | ||
| import os | ||
| from typing import Any, Optional | ||
| from uuid import UUID | ||
|
|
||
| import httpx | ||
|
|
||
| from galileo.config import GalileoPythonConfig | ||
| from galileo.constants.routes import Routes | ||
| from galileo.schema.trace import ( | ||
| LoggingMethod, | ||
| LogRecordsSearchRequest, | ||
| SessionCreateRequest, | ||
| SpansIngestRequest, | ||
|
|
@@ -159,3 +163,96 @@ async def get_span(self, span_id: str) -> dict[str, str]: | |
| return await self._make_async_request( | ||
| RequestMethod.GET, endpoint=Routes.span.format(project_id=self.project_id, span_id=span_id) | ||
| ) | ||
|
|
||
|
|
||
| class IngestTraces: | ||
| """Client for the dedicated ingest service. | ||
|
|
||
| Sends traces directly to the ingest service which may run at a | ||
| separate URL from the main Galileo API. The service URL is resolved | ||
| from the ``GALILEO_INGEST_URL`` environment variable; when unset it | ||
| falls back to the configured ``api_url``. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| project_id : str | ||
| The project to ingest into. | ||
| log_stream_id : Optional[str] | ||
| Log stream id (mutually exclusive with *experiment_id*). | ||
| experiment_id : Optional[str] | ||
| Experiment id (mutually exclusive with *log_stream_id*). | ||
| """ | ||
|
|
||
| def __init__( | ||
| self, project_id: str, log_stream_id: Optional[str] = None, experiment_id: Optional[str] = None | ||
| ) -> None: | ||
| self.config = GalileoPythonConfig.get() | ||
| self.project_id = project_id | ||
| self.log_stream_id = log_stream_id | ||
| self.experiment_id = experiment_id | ||
|
|
||
| if self.log_stream_id is None and self.experiment_id is None: | ||
| raise ValueError("log_stream_id or experiment_id must be set") | ||
|
|
||
| def _get_ingest_base_url(self) -> str: | ||
| explicit = os.environ.get("GALILEO_INGEST_URL") | ||
| if explicit: | ||
| return explicit.rstrip("/") | ||
| api_url = self.config.api_url or self.config.console_url | ||
| return str(api_url).rstrip("/") | ||
|
|
||
| def _get_auth_headers(self) -> dict[str, str]: | ||
| headers: dict[str, str] = {"Content-Type": "application/json", "X-Galileo-SDK": get_sdk_header()} | ||
| if self.config.api_key: | ||
| headers["Galileo-API-Key"] = self.config.api_key.get_secret_value() | ||
| elif self.config.jwt_token: | ||
| headers["Authorization"] = f"Bearer {self.config.jwt_token.get_secret_value()}" | ||
| return headers | ||
|
|
||
| @async_warn_catch_exception(logger=_logger) | ||
| async def ingest_traces(self, traces_ingest_request: TracesIngestRequest) -> dict[str, Any]: | ||
| if self.experiment_id: | ||
| traces_ingest_request.experiment_id = UUID(self.experiment_id) | ||
| elif self.log_stream_id: | ||
| traces_ingest_request.log_stream_id = UUID(self.log_stream_id) | ||
|
Comment on lines +213 to +217 (Contributor):
This avoids the same control flow being duplicated across the two methods. |
||
|
|
||
| traces_ingest_request.logging_method = LoggingMethod.python_client | ||
|
|
||
| base_url = self._get_ingest_base_url() | ||
| url = f"{base_url}{Routes.ingest_traces.format(project_id=self.project_id)}" | ||
| json_body = traces_ingest_request.model_dump(mode="json") | ||
|
|
||
| _logger.info( | ||
| "Sending traces to ingest service", | ||
| extra={"url": url, "project_id": self.project_id, "num_traces": len(traces_ingest_request.traces)}, | ||
| ) | ||
|
|
||
|
Comment on lines +225 to +229 (Contributor) |
||
| # httpx default timeout is 5s for connect/read/write/pool | ||
| # (see https://www.python-httpx.org/advanced/timeouts/) | ||
| async with httpx.AsyncClient(verify=self.config.ssl_context) as client: | ||
| response = await client.post(url, json=json_body, headers=self._get_auth_headers()) | ||
| response.raise_for_status() | ||
| return response.json() | ||
|
|
||
| @async_warn_catch_exception(logger=_logger) | ||
| async def ingest_spans(self, spans_ingest_request: SpansIngestRequest) -> dict[str, Any]: | ||
| if self.experiment_id: | ||
| spans_ingest_request.experiment_id = UUID(self.experiment_id) | ||
| elif self.log_stream_id: | ||
| spans_ingest_request.log_stream_id = UUID(self.log_stream_id) | ||
|
|
||
| spans_ingest_request.logging_method = LoggingMethod.python_client | ||
|
|
||
| base_url = self._get_ingest_base_url() | ||
| url = f"{base_url}{Routes.ingest_spans.format(project_id=self.project_id)}" | ||
| json_body = spans_ingest_request.model_dump(mode="json") | ||
|
|
||
| _logger.info( | ||
| "Sending spans to ingest service", | ||
| extra={"url": url, "project_id": self.project_id, "num_spans": len(spans_ingest_request.spans)}, | ||
| ) | ||
|
|
||
|
Comment on lines +250 to +254 (Contributor) |
||
| async with httpx.AsyncClient(verify=self.config.ssl_context) as client: | ||
| response = await client.post(url, json=json_body, headers=self._get_auth_headers()) | ||
| response.raise_for_status() | ||
| return response.json() | ||
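The review comment on lines +213 to +217 points out that `ingest_traces` and `ingest_spans` repeat the same control flow. The following is a minimal sketch of how the shared steps (resolving the base URL and stamping the target id and logging method onto the request) might be factored into helpers. `FakeIngestRequest`, `resolve_ingest_base_url`, and `prepare_request` are hypothetical names used only for illustration; they are not part of the PR, and the plain string `"python_client"` stands in for `LoggingMethod.python_client`.

```python
import os
from dataclasses import dataclass
from typing import Optional
from uuid import UUID


@dataclass
class FakeIngestRequest:
    """Hypothetical stand-in for TracesIngestRequest / SpansIngestRequest."""

    experiment_id: Optional[UUID] = None
    log_stream_id: Optional[UUID] = None
    logging_method: Optional[str] = None


def resolve_ingest_base_url(fallback_url: str) -> str:
    """Prefer GALILEO_INGEST_URL; otherwise use the fallback (e.g. api_url)."""
    explicit = os.environ.get("GALILEO_INGEST_URL")
    # An unset or empty variable falls through to the fallback URL.
    return (explicit or fallback_url).rstrip("/")


def prepare_request(
    request: FakeIngestRequest,
    experiment_id: Optional[str],
    log_stream_id: Optional[str],
    logging_method: str = "python_client",  # assumption: stands in for the enum
) -> FakeIngestRequest:
    """Apply the steps both ingest methods share before building the URL."""
    if experiment_id:
        request.experiment_id = UUID(experiment_id)
    elif log_stream_id:
        request.log_stream_id = UUID(log_stream_id)
    request.logging_method = logging_method
    return request
```

With helpers along these lines, each public method would only need to pick its route and its log message, then delegate the POST to one shared code path.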
Flush in batch mode can keep using the traces client instead of the dedicated ingest client when `GALILEO_INGEST_URL` is set after constructing `GalileoLogger`. Flush selects the client via `client = self._ingest_client or self._traces_client` and immediately calls `await client.ingest_traces(...)` (lines 1929-1930), but `_ingest_client` is only created in `__init__`, so it stays `None`. Can we call `_create_ingest_client()` before selecting the client (as `async_ingest_traces` does around lines 2124-2130) and apply the same lazy-creation fix to other similar call sites?
Finding type: Logical Bugs | Severity: 🔴 High
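A minimal sketch of the lazy-creation pattern requested in the comment above, under stated assumptions. `LoggerSketch`, `TracesClient`, `IngestClient`, and `_select_client` are hypothetical stand-ins; the real `GalileoLogger` and its `_create_ingest_client()` are not shown in this diff, so the body of that method here (create the client only when `GALILEO_INGEST_URL` is set) is an assumption.

```python
import os
from typing import Optional, Union


class TracesClient:
    """Hypothetical stand-in for the existing traces client."""

    name = "traces"


class IngestClient:
    """Hypothetical stand-in for the dedicated ingest client."""

    name = "ingest"


class LoggerSketch:
    """Illustrates re-checking for the ingest client at every call site."""

    def __init__(self) -> None:
        self._traces_client = TracesClient()
        self._ingest_client: Optional[IngestClient] = None
        # Creating the client only here is what the comment flags: if the
        # variable is set later, _ingest_client would otherwise stay None.
        self._create_ingest_client()

    def _create_ingest_client(self) -> None:
        # Assumed behavior: idempotent, and a no-op when the variable is unset.
        if self._ingest_client is None and os.environ.get("GALILEO_INGEST_URL"):
            self._ingest_client = IngestClient()

    def _select_client(self) -> Union[IngestClient, TracesClient]:
        # Lazy creation: retry before every selection, not just in __init__.
        self._create_ingest_client()
        return self._ingest_client or self._traces_client
```

Routing every call site (flush included) through one selector like `_select_client` would keep the lazy check in a single place rather than repeating it per method.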