feat: implement Project.save() and Dataset.save() #503
thiagobomfin-galileo wants to merge 5 commits into main
Conversation
Codecov Report ❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #503 +/- ##
==========================================
- Coverage 82.13% 81.96% -0.17%
==========================================
Files 96 96
Lines 9282 9361 +79
==========================================
+ Hits 7624 7673 +49
- Misses 1658 1688 +30
    def update(self, project_id: str, *, name: Optional[str] = None) -> ProjectUpdateResponse:
        """
        Updates a project's properties.

        Parameters
        ----------
        project_id : str
            The ID of the project to update.
        name : str, optional
            The new name for the project.

        Returns
Projects.update() doesn't emit the required start/completion lifecycle logs (AGENTS.md requires write methods to log) and it doesn't log failure paths: ProjectsAPIException for non-200 responses and ValueError for empty responses are raised with no telemetry. These write failures are therefore silent in telemetry. Can we emit start/completion logs and log each failure path before raising?
Finding type: AI Coding Guidelines | Severity: 🟢 Low
Prompt for AI Agents:
In src/galileo/projects.py around lines 303-343, the update method is missing lifecycle
logging and failure telemetry. Add an initial start log (e.g. _logger.info with
action='project.update', phase='start', project_id=project_id, name=name), and add a
completion log on success (phase='complete', project_id, name). Before raising
ProjectsAPIException on non-200 responses, log an error with response.status_code and
content; before raising for HTTPValidationError, log the validation error details;
before raising ValueError for an empty response, log an error indicating no response was
returned. Use the existing _logger and keep logs concise and structured so every failure
path is logged prior to raising.
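A minimal sketch of the lifecycle logging the prompt asks for, assuming the `action`/`phase` key style quoted above; the client object and its `update_project` call are hypothetical stand-ins, not the SDK's real API:

```python
import logging

_logger = logging.getLogger("galileo.projects")

class ProjectsAPIException(Exception):
    """Stand-in for the SDK's real exception type."""

def update_with_lifecycle_logs(client, project_id: str, name: str):
    # Start log before the write is attempted.
    _logger.info("action='project.update' phase='start' project_id=%s name=%s",
                 project_id, name)
    response = client.update_project(project_id, name=name)  # hypothetical call
    if response is None:
        # Log each failure path before raising so the write is never silent.
        _logger.error("action='project.update' phase='error' project_id=%s "
                      "reason='empty response'", project_id)
        raise ValueError("no response returned for project update")
    if response.status_code != 200:
        _logger.error("action='project.update' phase='error' project_id=%s status=%s",
                      project_id, response.status_code)
        raise ProjectsAPIException(f"update failed: {response.status_code}")
    _logger.info("action='project.update' phase='complete' project_id=%s name=%s",
                 project_id, name)
    return response
```

The key property is that every `raise` is immediately preceded by an error log with non-sensitive context, so telemetry records the failure even when the caller swallows the exception.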
Commit 609bcb7 addressed this comment by adding lifecycle logs around Projects.update(), logging a start event, error events for non-200/validation/empty responses, and a completion event on success, ensuring every failure path records telemetry before raising.
    def update(self, dataset_id: str, *, name: Optional[str] = None) -> Dataset:
        """
        Updates a dataset's properties.

        Parameters
        ----------
        dataset_id : str
            The ID of the dataset to update.
        name : str, optional
            The new name for the dataset.

        Returns
Datasets.update() performs a write but emits no lifecycle logs; AGENTS.md requires start/completion and failure logs with non-sensitive context. It raises DatasetAPIException/ValueError without logging failure details, so there's no audit trail or failure context for debugging. Can we add structured start/completion logs and log failures with non-sensitive context before raising to comply with AGENTS.md?
Finding type: AI Coding Guidelines | Severity: 🟠 Medium
Prompt for AI Agents:
In src/galileo/datasets.py around lines 425 to 460, the update method is missing
lifecycle logging. Add structured logs: log an INFO at the start of the update with
non-sensitive context {"dataset_id": dataset_id, "name": name}, log an INFO on
successful update with the same context plus the updated dataset id, and log an ERROR
before raising when the call returns HTTPValidationError, when response is empty
(ValueError), or on any other exception. Use the service logger available on the client
(e.g. self.config.logger) and include exception details in the error logs without
including sensitive payloads; ensure logs occur immediately before raising the
corresponding exceptions.
Commit 609bcb7 addressed this comment by adding structured lifecycle logging to Datasets.update() with INFO logs at start and success and ERROR logs before raising for validation errors, empty responses, and exceptions.
Force-pushed from ef52dcd to 609bcb7
    if self.sync_state == SyncState.LOCAL_ONLY:
        return self.create()
    if self.sync_state == SyncState.SYNCED:
        logger.debug(f"Dataset.save: id='{self.id}' - already synced, no action needed")
        return self
Dataset.save now re-implements the entire state-transition + persistence workflow (LOCAL_ONLY -> create, SYNCED no-op, DELETED guard) before calling Datasets.update, which is essentially the same block that Project.save copies verbatim (see project.py lines 785‑809). Any change to the state checks, logging, or _set_state error handling now needs to be made in two places. Can we consolidate this into StateManagementMixin (e.g. a shared _persist_changes(create_fn, update_fn, attr_sync) helper) so dataset/project save reuse one canonical workflow and only supply the resource-specific update call?
Finding type: Code Dedup and Conventions | Severity: 🟢 Low
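The suggested consolidation could look roughly like this; the mixin method name `_persist_changes` comes from the comment above, while the `SyncState` values and the demo `Dataset` class are simplified assumptions based on the snippets in this PR:

```python
from enum import Enum
from typing import Any, Callable

class SyncState(Enum):
    LOCAL_ONLY = "local_only"
    DIRTY = "dirty"
    SYNCED = "synced"
    DELETED = "deleted"

class StateManagementMixin:
    sync_state: SyncState

    def _set_state(self, state: SyncState) -> None:
        self.sync_state = state

    def _persist_changes(self, create_fn: Callable[[], Any],
                         update_fn: Callable[[], Any]) -> Any:
        # One canonical save workflow: guard, create, no-op, or update.
        if self.sync_state == SyncState.DELETED:
            raise ValueError("cannot save a deleted resource")
        if self.sync_state == SyncState.LOCAL_ONLY:
            return create_fn()   # first save dispatches to create
        if self.sync_state == SyncState.SYNCED:
            return self          # nothing changed locally, no-op
        result = update_fn()     # DIRTY: resource-specific update call
        self._set_state(SyncState.SYNCED)
        return result

class Dataset(StateManagementMixin):
    def __init__(self) -> None:
        self.sync_state = SyncState.DIRTY

    def save(self) -> "Dataset":
        # Only the resource-specific calls live here; the workflow is shared.
        self._persist_changes(lambda: self, lambda: self)
        return self
```

With this shape, `Project.save` would pass its own create/update callables and the state checks exist in exactly one place.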
    logger.info(f"Dataset.save: name='{self.name}' id='{self.id}' - started")
    datasets_service = Datasets()
    updated_dataset = datasets_service.update(self.id, name=self.name)

    # Update attributes from response
    self.name = updated_dataset.name
    self.updated_at = updated_dataset.updated_at

    self._set_state(SyncState.SYNCED)
Dataset.save() claims to persist local modifications but only passes name to Datasets.update — should we include mutated fields such as draft in the update before marking SYNCED or restrict save() to renames and update its docstring?
Finding type: Logical Bugs | Severity: 🔴 High
Prompt for AI Agents:
In src/galileo/__future__/dataset.py around lines 563 to 571, the save() method
currently calls Datasets.update(self.id, name=self.name) and then marks the instance
SYNCED, but it ignores other mutable fields (e.g. draft, column_mapping) so local edits
are not sent to the API. Change save() to build an update payload from all mutable
attributes that should be persisted (at minimum include draft and column_mapping, plus
any other updatable fields the class exposes), pass that payload into
datasets_service.update(self.id, **payload), and then apply the returned updated_dataset
attributes (name, updated_at, draft, column_mapping, etc.) before setting state to
SYNCED. Also update the docstring to accurately describe which fields are persisted. If
adding fields is not desired, instead narrow the docstring and behavior to state that
save() only persists name changes and raise/handle accordingly.
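The payload-building step described in the prompt could be sketched as follows; the field names `draft` and `column_mapping` are taken from the comment above, and the helper itself is hypothetical:

```python
from typing import Any, Dict, Optional

def build_update_payload(name: Optional[str] = None,
                         draft: Optional[bool] = None,
                         column_mapping: Optional[dict] = None) -> Dict[str, Any]:
    # Collect every mutable attribute, then keep only the ones the
    # caller actually set, so unset fields never reach the PATCH body.
    candidates = {"name": name, "draft": draft, "column_mapping": column_mapping}
    return {k: v for k, v in candidates.items() if v is not None}
```

Note that filtering on `is not None` cannot express "deliberately set this field to null"; a dedicated sentinel (as in the `UNSET` finding below on `Datasets.update`) is the more robust pattern when the API distinguishes the two.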
    logger.info("action='dataset.update' phase='start' dataset_id=%s name=%s", dataset_id, name)

    body = UpdateDatasetRequest(name=name)

    try:
Datasets.update sends name: null when name is None, violating the API's non-null name contract — should we only set name when provided (e.g. UpdateDatasetRequest(name=name if name is not None else UNSET))?
Finding type: Type Inconsistency | Severity: 🔴 High
Prompt for AI Agents:
In src/galileo/datasets.py around lines 448 to 452, the update method constructs
UpdateDatasetRequest(name=name) even when name is None which will serialize to name:
null and may clear a non-nullable dataset name. Change this so the request omits the
name field when the caller did not provide it: either pass name=UNSET when name is None
(using UNSET from galileo.resources.types) or build the request kwargs only with 'name'
if name is not None, then instantiate UpdateDatasetRequest(**kwargs). Ensure imports are
adjusted if needed and keep the existing logging/error handling intact.
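A self-contained sketch of the omit-when-unset pattern the prompt describes; the `_Unset` sentinel here is a stand-in for `UNSET` from `galileo.resources.types` (an assumption), and `UpdateDatasetRequest` is simplified to show only the serialization behavior:

```python
class _Unset:
    """Stand-in for the generated client's UNSET sentinel (assumption)."""
    def __repr__(self) -> str:
        return "UNSET"

UNSET = _Unset()

class UpdateDatasetRequest:
    def __init__(self, name=UNSET):
        self.name = name

    def to_dict(self) -> dict:
        # Omit UNSET fields entirely instead of serializing "name": null.
        body = {}
        if self.name is not UNSET:
            body["name"] = self.name
        return body

def make_body(name=None) -> dict:
    # Caller side: only set name when the user actually provided one.
    return UpdateDatasetRequest(name=name if name is not None else UNSET).to_dict()
```

The difference matters because `name: null` in a PATCH body asks the server to clear the field, while omitting the key asks it to leave the field alone.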
    try:
        logger.info(f"Project.save: name='{self.name}' id='{self.id}' - started")
        projects_service = Projects()
        updated_project = projects_service.update(self.id, name=self.name)

        # Update attributes from response
        if not isinstance(updated_project.name, Unset):
            self.name = updated_project.name
        self.updated_at = updated_project.updated_at
Project.save() only persists name but marks the project SYNCED, should we include other editable fields (at least type) in the update or prevent saving when only non-name fields changed?
Finding type: Logical Bugs | Severity: 🔴 High
Prompt for AI Agents:
In src/galileo/__future__/project.py around lines 797 to 805, the save() method
currently only sends name to Projects.update and only applies updated_project.name back
to self, so changes to other editable fields (e.g., type, labels, description) are never
persisted. Refactor save() to build an update payload that includes all editable fields
(name, type, labels, description, etc.) by checking each attribute against Unset and
including it if set; pass that payload into projects_service.update(self.id, ...). After
the API call, update each corresponding local attribute from the returned
updated_project (only when the returned value is not Unset) before setting the SyncState
to SYNCED so the local state accurately reflects the persisted remote state.
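The attribute-sync half of that refactor could look like this; the `Unset` check mirrors the snippet above, while the field list beyond `name` and the helper itself are assumptions for illustration:

```python
class Unset:
    """Stand-in for the generated client's Unset sentinel (assumption)."""

UNSET = Unset()

def sync_from_response(local: object, remote: object,
                       fields=("name", "type", "description")) -> None:
    # Copy back only the fields the API actually returned, leaving
    # locally-held values alone when the response field is Unset.
    for field in fields:
        value = getattr(remote, field, UNSET)
        if not isinstance(value, Unset):
            setattr(local, field, value)
```

Applying the returned values field-by-field (rather than assigning unconditionally) keeps the local object from being clobbered with `Unset` placeholders before the state flips to SYNCED.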
User description
Shortcut:
Implement Missing CRUD Operations in future Package
Description:
PUT /projects/{id} and PATCH /datasets/{id} API endpoints
NotImplementedError stubs with full state-dispatching logic
Tests:
Generated description
Below is a concise technical summary of the changes proposed in this PR:
Implement Dataset and Project save flows in the future domain so that they persist DIRTY state via create or update paths while guarding against deleted or unsynced resources through SyncState transitions. Add Datasets and Projects services update operations that call the existing PATCH/PUT endpoints so the domain layer can dispatch full CRUD support.
Dataset.save() and Project.save() now create or update via SyncState checks, synchronize attributes when responses arrive, and surface failures while covering no-op, deleted, and missing-ID cases with expanded unit tests for the save flows.
Modified files (4)
Datasets and Projects services now expose the PATCH/PUT update endpoints with logging, validation error handling, and response guards so the domain layer receives updated model data.
Modified files (2)