S3 support #21
base: main
Conversation
…adding new handlers and S3 types.
…hod, and enhance S3 server for object listing and shard configuration.
Integration Test failed

    Waiting for S3 Server (port 9000)...
    --- Testing Object Operations ---
    --- Testing Range Requests ---
    --- Testing Multipart Upload ---
    --- Testing ListObjectsV2 ---
    FAILURE! Test failed with error: Should be truncated
Pull request overview
This pull request introduces S3-compatible REST API support for the distributed file system (DFS) through a new s3_server component, alongside implementing file deletion functionality across the DFS architecture.
Key Changes:
- Added a new `dfs/s3_server` crate that provides S3-compatible REST API endpoints using Axum, enabling standard S3 clients to interact with the DFS (a rough router sketch follows this list)
- Implemented a `delete_file` RPC operation across the client, master server, and Raft state machine to support file deletion with consensus
- Enhanced the client API with a `list_all_files` method for aggregating file listings across multiple shards
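As a rough illustration of the kind of wiring the first item describes, here is a minimal, self-contained Axum router sketch. The route paths, handler names, port, and axum 0.7-style path-capture syntax are assumptions for illustration, not the routes actually defined in `dfs/s3_server`.

```rust
use axum::{routing::get, Router};

// Placeholder handlers: a real S3 facade would build XML responses from DFS
// metadata and stream object bytes fetched through the DFS client.
async fn list_buckets() -> &'static str {
    "<ListAllMyBucketsResult/>"
}

async fn get_object() -> &'static str {
    "object body placeholder"
}

#[tokio::main]
async fn main() {
    // Bucket-level and object-level routes; ":bucket" and "*key" use
    // axum 0.7 path-capture syntax.
    let app = Router::new()
        .route("/", get(list_buckets))
        .route("/:bucket/*key", get(get_object));

    let listener = tokio::net::TcpListener::bind("0.0.0.0:9000").await.unwrap();
    axum::serve(listener, app).await.unwrap();
}
```

Standard S3 clients such as boto3 can then be pointed at the server with a custom endpoint URL, which is what the integration test script appears to exercise.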
Reviewed changes
Copilot reviewed 16 out of 18 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| test_scripts/s3_integration_test.py | Python-based integration test suite for S3 API operations including bucket management, object operations, multipart uploads, and pagination |
| test_scripts/run_s3_test.sh | Shell script to orchestrate S3 integration testing with Docker Compose setup and teardown |
| test_scripts/requirements.txt | Python dependencies for S3 integration tests (boto3, botocore) |
| proto/dfs.proto | Added DeleteFile RPC definition to the MasterService protocol |
| docker-compose.yml | Added S3 server service configuration to the Docker Compose setup |
| dfs/s3_server/src/state.rs | Application state management for the S3 server, holding the DFS client instance |
| dfs/s3_server/src/s3_types.rs | S3 XML response type definitions for bucket and object operations |
| dfs/s3_server/src/main.rs | S3 server entry point with Axum router configuration and DFS client initialization |
| dfs/s3_server/src/handlers.rs | S3 API request handlers implementing bucket operations, object CRUD, multipart uploads, and metadata handling |
| dfs/s3_server/Cargo.toml | Cargo manifest for S3 server dependencies |
| dfs/metaserver/src/simple_raft.rs | Added DeleteFile command to the Raft state machine |
| dfs/metaserver/src/master.rs | Implemented delete_file RPC endpoint with Raft consensus integration |
| dfs/client/src/mod.rs | Added delete_file and list_all_files methods, updated error types to support async Send + Sync |
| TODO.md | Updated S3 API milestones to reflect completed implementation status |
| Dockerfile | Added s3-server binary to Docker image |
| Cargo.toml | Added s3_server to workspace members |
Comments suppressed due to low confidence (1)
dfs/s3_server/src/handlers.rs:1
- Use the tracing framework (already imported) instead of eprintln! for consistency with the rest of the codebase. Replace with `tracing::error!` or `tracing::warn!`.
use crate::{s3_types::*, state::AppState as S3AppState};
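For example, such a log line could be expressed as a structured tracing event instead of an eprintln! call; the message text and field names below are made up for illustration.

```rust
fn log_delete_failure(key: &str, err: &dyn std::fmt::Display) {
    // Structured event; the `%` sigil captures values via their Display impls.
    tracing::error!(key = %key, error = %err, "failed to delete object");
}
```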
    part1_data = b"Part1" * 1024 * 1024  # 5MB
    part2_data = b"Part2" * 1024 * 1024  # 5MB
Copilot AI (Dec 16, 2025)
The comment says '5MB', but b"Part1" * 1024 * 1024 produces 5,242,880 bytes (5 bytes * 1,048,576), which is exactly 5 MiB and roughly 5.24 MB in decimal units. Clarify the comment to state the exact byte count or the intended unit.
Suggested change:

    part1_data = b"Part1" * 1024 * 1024  # 5,242,880 bytes (~5.24MB)
    part2_data = b"Part2" * 1024 * 1024  # 5,242,880 bytes (~5.24MB)
    key,
    last_modified: "2025-01-01T00:00:00.000Z".into(),
    etag: "\"000\"".into(),
    size: 0,
Copilot AI (Dec 16, 2025)
Object size is hardcoded to 0, which provides incorrect information to S3 clients. For regular files, retrieve the actual file size from the DFS. For multipart upload objects, calculate the size by summing the sizes of all parts.
    key,
    last_modified: "2025-01-01T00:00:00.000Z".into(),
    etag: "\"000-MPU\"".into(),
    size: 0, // Calculate size?
Copilot AI (Dec 16, 2025)
MPU object size is hardcoded to 0. The comment acknowledges this needs calculation. Implement size calculation by listing and summing part sizes, as this affects S3 client behavior and data integrity checks.
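A minimal sketch of the part-size bookkeeping this comment asks for, assuming the server records a size for each part at UploadPart time; the PartMeta type and the slice of stored parts are stand-ins, not types defined in this PR. For regular (non-MPU) objects, as the adjacent comments note, the size field would instead come from whatever file-length metadata the DFS client can return.

```rust
/// Illustrative per-part record kept at UploadPart time (hypothetical shape).
struct PartMeta {
    part_number: u32,
    size: u64,
}

/// Total object size for a completed multipart upload: the sum of its parts.
fn mpu_object_size(parts: &[PartMeta]) -> u64 {
    parts.iter().map(|p| p.size).sum()
}
```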
    key: key.clone(),
    last_modified: "2025-01-01T00:00:00.000Z".into(),
    etag: "\"000\"".into(),
    size: 0,
Copilot AI (Dec 16, 2025)
Object size is hardcoded to 0 in the ListObjectsV2 response. Retrieve and return the actual file size from the DFS to provide accurate information to S3 clients.
    .await;

    let result = CompleteMultipartUploadResult {
        location: format!("http://localhost:9000/{}/{}", bucket, key),
Copilot AI (Dec 16, 2025)
The location URL is hardcoded to 'http://localhost:9000'. This breaks in non-local deployments. Use the actual request host or make the base URL configurable via environment variable.
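One way to address this, sketched under assumptions: read an environment variable (S3_PUBLIC_URL is an invented name, not configuration this PR defines) and fall back to the request's Host header.

```rust
use axum::http::{header::HOST, HeaderMap};

/// Resolve the base URL used in Location fields instead of hardcoding localhost.
fn location_base(headers: &HeaderMap) -> String {
    if let Ok(url) = std::env::var("S3_PUBLIC_URL") {
        return url;
    }
    let host = headers
        .get(HOST)
        .and_then(|h| h.to_str().ok())
        .unwrap_or("localhost:9000");
    format!("http://{}", host)
}
```

The handler could then build the location as format!("{}/{}/{}", location_base(&headers), bucket, key) rather than using the hardcoded string.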
    // Parse body for part verification (skip actual verification for now, just trust client)
    if let Ok(str_body) = std::str::from_utf8(&body) {
        let _parts: Result<CompleteMultipartUpload, _> = from_str(str_body);
    }
Copilot AI (Dec 16, 2025)
The CompleteMultipartUpload handler skips ETag verification of parts. Without verification, corrupted or tampered parts could be assembled into the final object. Implement ETag verification to ensure data integrity.
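A hedged sketch of that verification step, assuming the server kept an ETag per part number at UploadPart time; the RequestedPart shape below stands in for whatever the PR's CompleteMultipartUpload XML type deserializes into.

```rust
use std::collections::HashMap;

/// A part as listed by the client in CompleteMultipartUpload (illustrative shape).
struct RequestedPart {
    part_number: u32,
    etag: String,
}

/// True only if every requested part exists and its ETag matches what the server
/// recorded when that part was uploaded.
fn verify_parts(requested: &[RequestedPart], stored: &HashMap<u32, String>) -> bool {
    requested
        .iter()
        .all(|p| stored.get(&p.part_number).is_some_and(|etag| etag == &p.etag))
}
```

Rejecting the request on a mismatch (for example with an InvalidPart-style error) keeps corrupted or tampered parts out of the assembled object.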
This pull request introduces support for file deletion in the distributed file system (DFS) and adds a new S3-compatible server component. The main changes include implementing the `delete_file` operation across the client and master components, updating the Raft state machine, and integrating a new `s3_server` crate for S3 REST API compatibility. Additionally, the build system and documentation are updated to reflect these enhancements.

S3 Server Integration and S3 API Progress
- Added the new `dfs/s3_server` crate, including its dependencies and a basic `main.rs` to handle S3-compatible REST API requests using Axum. This enables S3 compatibility for the DFS. [1] [2]
- Added the `s3_server` binary to the Docker image and the workspace build. [1] [2]
- Updated `TODO.md`, reflecting significant progress on S3 compatibility features and integration testing. [1] [2] [3]

File Deletion Support
- Implemented the `delete_file` RPC in the DFS client (dfs/client/src/mod.rs), including error handling and integration with sharding and master routing logic. [1] [2]
- Added a `delete_file` gRPC endpoint to the master server (dfs/metaserver/src/master.rs) and integrated it into the Raft state machine for consensus-based file deletion (see the sketch at the end of this description). [1] [2] [3] [4]

Client API Improvements
- Updated the client's error types to support async use (`Send + Sync`), and added a `list_all_files` method to aggregate file listings across shards. [1] [2] [3] [4] [5]

These changes collectively enhance the DFS by enabling S3 interoperability and allowing users to delete files in a distributed, fault-tolerant manner.
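To make the consensus path concrete, here is a minimal sketch of how a DeleteFile command might be applied by a simple Raft state machine once committed. The Command enum and the map-based metadata store are assumptions for illustration; the actual types live in dfs/metaserver/src/simple_raft.rs and may differ.

```rust
use std::collections::HashMap;

/// Illustrative commands replicated through the Raft log (hypothetical shapes).
enum Command {
    CreateFile { path: String, shard: u32 },
    DeleteFile { path: String },
}

/// Toy state machine: file metadata as a path -> shard map.
struct StateMachine {
    files: HashMap<String, u32>,
}

impl StateMachine {
    /// Apply a committed command. Because every replica applies the same
    /// committed log in the same order, deletions stay consistent cluster-wide.
    fn apply(&mut self, cmd: Command) {
        match cmd {
            Command::CreateFile { path, shard } => {
                self.files.insert(path, shard);
            }
            Command::DeleteFile { path } => {
                self.files.remove(&path);
            }
        }
    }
}
```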