Conversation

Contributor

@nopcoder nopcoder commented Oct 21, 2025

No description provided.

@nopcoder nopcoder self-assigned this Oct 21, 2025
@nopcoder nopcoder added the docs (Improvements or additions to documentation) and python-wrapper labels Oct 21, 2025
github-actions bot commented Oct 21, 2025

📚 Documentation preview at https://pr-9592.docs-lakefs-preview.io/

(Updated: 11/11/2025, 11:26:07 AM - Commit: 7272fee)

@nopcoder nopcoder requested review from ozkatz and talSofer October 21, 2025 13:13
@nopcoder nopcoder added the exclude-changelog (PR description should not be included in next release changelog) and minor-change (Used for PRs that don't require issue attached) labels Oct 26, 2025
Contributor

@talSofer talSofer left a comment

Thanks for improving this part of our docs! It is 100x better than what we had before!

I added some comments. None are blocking, but I would appreciate it if you resolved them.

```python
for diff in main.diff(other_ref=branch1):
    print(diff)
```
## Python Integration Options

Contributor

This duplicates the table content. The table is great; I would extend it to include the important info from here and remove this section.

Contributor Author

Rewrote the page to integrate both parts.

| **Boto3** | Medium | S3-compatible operations, existing S3 workflows, direct gateway access | `pip install boto3` | Low |

#### Merging changes from a branch into main
## Quick Start

Contributor

Do we need a general Python quickstart? Why?

Contributor Author

Removed. It was left over from an incremental change: I first thought to introduce the SDK through a quick start, and later added the list of entry points for each SDK/library you might want to work with.

#### Get object metadata

Get object metadata using branch and path:
## References & Resources

Contributor

This can also be part of the table above.

Contributor Author

Will pull up the relevant references and rewrite the list of Python integrations.


References, commits, and commit metadata are fundamental to understanding and auditing changes in lakeFS. This guide covers navigating commit history, working with references, and using metadata for tracking and lineage.

## Prerequisites

Contributor

IMO this is a duplicate.


This guide covers object operations in lakeFS, including uploading, downloading, batch operations, and metadata management.

## Prerequisites

Contributor

I would remove this.

Comment on lines 17 to 44
## Understanding Objects & Files

### What are Objects & Files?

Objects are the files stored in lakeFS. Upload, download, and manage them through branches:

```python
import lakefs

branch = lakefs.repository("my-repo").branch("main")

# Upload a file
branch.object("data/dataset.csv").upload(
    data=b"id,name\n1,Alice\n2,Bob"
)

# Read a file
with branch.object("data/dataset.csv").reader() as f:
    print(f.read())
```

Objects in lakeFS allow you to:

- **Store files** at any path with support for large files
- **Version files** across branches and commits
- **Manage metadata** about file checksums, sizes, and types
- **Track changes** across data versions using diffs
- **Organize data** with hierarchical paths and prefixes

Contributor

I think that we can skip explaining objects. I would delete this part.

Contributor Author

Deleted.

Comment on lines 704 to 706
## Data Pipeline Workflow

### Creating a Complete Data Pipeline

Contributor

I think that this can be part of the real-world workflows.

Contributor Author

Removed some of the examples and, as you suggested, merged the pipeline and streaming sections into the real-world workflows.

- Prefer **direct API interaction** patterns
- Need to **access all API endpoints** programmatically

For most common lakeFS operations (branches, tags, commits, objects), the **[High-Level SDK](./python.md)** is recommended as it provides a more Pythonic interface.

Contributor

We can put this in a tip box.

@nopcoder nopcoder merged commit 42d6b69 into master Nov 11, 2025
43 checks passed
@nopcoder nopcoder deleted the task/python-docs branch November 11, 2025 11:25