Conversation

@PE39806
Contributor

@PE39806 PE39806 commented Sep 5, 2025

Implements the postStartMultipartUpload and postFinishMultipartUpload endpoints, and adds postMultipartUploadPart.

@PE39806 PE39806 added the enhancement, javascript, draft, and model artefact management labels Sep 5, 2025
@PE39806 PE39806 marked this pull request as ready for review October 3, 2025 10:28
JRB66955 previously approved these changes Oct 6, 2025
JRB66955 previously approved these changes Oct 7, 2025
@PE39806 PE39806 requested a review from JRB66955 October 7, 2025 12:29
JRB66955 previously approved these changes Oct 7, 2025
JR40159 previously requested changes Oct 8, 2025
Member

@JR40159 JR40159 left a comment

@PE39806
Contributor Author

PE39806 commented Oct 8, 2025

  • Change the approach to proxy uploads through the backend rather than using presigned URLs. This means adding a new upload-part endpoint that accepts byte ranges as a parameter, and changing the start endpoint to return byte ranges rather than presigned URLs.
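
As a rough illustration of the byte ranges the start endpoint would hand back, here is a minimal sketch that splits a file size into inclusive ranges. The `chunk_ranges` helper is hypothetical and not part of the PR, and the 5 MiB chunk size is an assumption inferred from the test output later in this thread rather than a confirmed backend setting.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): print "startByte endByte" for each
# chunk of a file, with both bounds inclusive.
chunk_ranges() {
    size=$1
    chunk_size=$2
    start=0
    while [ "$start" -lt "$size" ]; do
        end=$((start + chunk_size - 1))
        # The final chunk is clamped to the last byte of the file.
        if [ "$end" -ge "$size" ]; then
            end=$((size - 1))
        fi
        echo "$start $end"
        start=$((end + 1))
    done
}

# 12 MiB file, assumed 5 MiB chunks
chunk_ranges 12582912 5242880
# prints:
# 0 5242879
# 5242880 10485759
# 10485760 12582911
```

Inclusive end bytes mean a part's length is `endByte - startByte + 1`, which is what a client needs for `Content-Length`.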

@PE39806 PE39806 marked this pull request as draft October 9, 2025 15:02
@PE39806 PE39806 added the draft label Oct 9, 2025
@PE39806
Contributor Author

PE39806 commented Oct 27, 2025

Tested with the following script, run inside the backend container (docker compose exec backend /bin/sh):

#!/bin/sh
#
# Test script for Bailo multipart upload API
# Uses POST /multipart/start, /multipart/part, /multipart/finish
#

API_BASE="http://localhost:3001/api/v2"
MODEL_ID="<>"
ACCESS_KEY="<>"
SECRET_KEY="<>"
TEST_FILE_SIZE=12

TEST_FILE="testfile.bin"
TMP_DIR=$(mktemp -d)

dd if=/dev/urandom of="$TEST_FILE" bs=1M count=$TEST_FILE_SIZE >/dev/null 2>&1

FILE_SIZE=$(wc -c < "$TEST_FILE")
FILE_NAME=$(basename "$TEST_FILE")
MIME_TYPE="application/octet-stream"

echo "==> Starting multipart upload for $FILE_NAME ($FILE_SIZE bytes)"
START_RESP=$(curl -s -X POST "$API_BASE/model/$MODEL_ID/files/upload/multipart/start" \
    -u "$ACCESS_KEY:$SECRET_KEY" \
    -H "Content-Type: application/json" \
    -d "{
        \"name\": \"$FILE_NAME\",
        \"mime\": \"$MIME_TYPE\",
        \"size\": $FILE_SIZE
    }")

echo "Start response: $START_RESP"

# Extract IDs
FILE_ID=$(echo "$START_RESP" | sed -n 's/.*"fileId":"\([^"]*\)".*/\1/p')
UPLOAD_ID=$(echo "$START_RESP" | sed -n 's/.*"uploadId":"\([^"]*\)".*/\1/p')

if [ -z "$FILE_ID" ] || [ -z "$UPLOAD_ID" ]; then
    echo "ERROR: Could not parse fileId or uploadId"
    rm -rf "$TMP_DIR" "$TEST_FILE"
    exit 1
fi

# Extract one chunk object per line into a temp file
# (requires jq, which is also used for the ETag below)
echo "$START_RESP" | jq -c '.chunks[]' > "$TMP_DIR/chunks.txt"

PARTS_JSON="["
FIRST=1
PART_NUM=1

echo "==> Uploading parts..."
while IFS= read -r line; do
    START_BYTE=$(echo "$line" | sed 's/.*"startByte":\([0-9]*\).*/\1/')
    END_BYTE=$(echo "$line"   | sed 's/.*"endByte":\([0-9]*\).*/\1/')
    COUNT_BYTES=$(($END_BYTE - $START_BYTE + 1))

    echo "Uploading part $PART_NUM (bytes $START_BYTE to $END_BYTE)..."

    ETAG_RESP=$(dd if="$TEST_FILE" bs=1 skip=$START_BYTE count=$COUNT_BYTES 2>/dev/null |
        curl -s -X POST \
            "$API_BASE/model/$MODEL_ID/files/upload/multipart/part?fileId=$FILE_ID&uploadId=$UPLOAD_ID&partNumber=$PART_NUM" \
            -u "$ACCESS_KEY:$SECRET_KEY" \
            -H "Content-Type: application/octet-stream" \
            -H "Content-Length: $COUNT_BYTES" \
            --data-binary @-)

    echo "ETAG_RESP=$ETAG_RESP"

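    # Extract and clean ETag; "// empty" yields an empty string (not "null")
    # when the response has no ETag, so the check below actually fires
    CLEAN_ETAG=$(echo "$ETAG_RESP" | jq -r '.ETag // empty' | tr -d '"')

    if [ -z "$CLEAN_ETAG" ]; then
        echo "ERROR: No ETag returned for part $PART_NUM"
        rm -rf "$TMP_DIR" "$TEST_FILE"
        exit 1
    fi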

    if [ $FIRST -eq 0 ]; then
        PARTS_JSON="$PARTS_JSON,"
    fi
    FIRST=0

    PARTS_JSON="$PARTS_JSON{\"ETag\":\"$CLEAN_ETAG\",\"PartNumber\":$PART_NUM}"

    PART_NUM=$(($PART_NUM + 1))
done < "$TMP_DIR/chunks.txt"

PARTS_JSON="$PARTS_JSON]"

echo "Built parts JSON: $PARTS_JSON"

echo "==> Finishing upload..."
FINISH_RESP=$(curl -s -X POST "$API_BASE/model/$MODEL_ID/files/upload/multipart/finish" \
    -u "$ACCESS_KEY:$SECRET_KEY" \
    -H "Content-Type: application/json" \
    -d @- <<EOF
{"fileId":"$FILE_ID","uploadId":"$UPLOAD_ID","parts":$PARTS_JSON}
EOF
)

echo "Finish response: $FINISH_RESP"

# Cleanup
rm -rf "$TMP_DIR" "$TEST_FILE"

echo "==> Done."

/app $ apk add --update curl jq
fetch https://dl-cdn.alpinelinux.org/alpine/v3.22/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.22/community/x86_64/APKINDEX.tar.gz
(1/11) Installing brotli-libs (1.1.0-r2)
(2/11) Installing c-ares (1.34.5-r0)
(3/11) Installing libunistring (1.3-r0)
(4/11) Installing libidn2 (2.3.7-r0)
(5/11) Installing nghttp2-libs (1.65.0-r0)
(6/11) Installing libpsl (0.21.5-r3)
(7/11) Installing zstd-libs (1.5.7-r0)
(8/11) Installing libcurl (8.14.1-r2)
(9/11) Installing curl (8.14.1-r2)
(10/11) Installing oniguruma (6.9.10-r0)
(11/11) Installing jq (1.8.0-r0)
Executing busybox-1.37.0-r19.trigger
OK: 16 MiB in 33 packages
/app $ /bin/sh src/scripts/test_multipart_upload.sh 
==> Starting multipart upload for testfile.bin (12582912 bytes)
Start response: {"fileId":"68ff90312028e2a85914fc0c","uploadId":"ODc5MTllYjItYjNiYy00MmVlLWE5ZWItNTIyMDBiMTk0MDE1LjdmOTNiZTJmLWNhNDAtNDkzYi05ZWNkLTFjZjExNTNlY2RmZA","chunks":[{"startByte":0,"endByte":5242879},{"startByte":5242880,"endByte":10485759},{"startByte":10485760,"endByte":12582911}]}
==> Uploading parts...
Uploading part 1 (bytes 0 to 5242879)...
ETAG_RESP={"ETag":"\"83e2d59a4d096ef1e3ce68dc1e24b5e6\""}
Uploading part 2 (bytes 5242880 to 10485759)...
ETAG_RESP={"ETag":"\"63483f87f292c9289d1529516bf35097\""}
Uploading part 3 (bytes 10485760 to 12582911)...
ETAG_RESP={"ETag":"\"c7422321672ac3079dacff96dc81c98c\""}
Built parts JSON: [{"ETag":"83e2d59a4d096ef1e3ce68dc1e24b5e6","PartNumber":1},{"ETag":"63483f87f292c9289d1529516bf35097","PartNumber":2},{"ETag":"c7422321672ac3079dacff96dc81c98c","PartNumber":3}]
==> Finishing upload...
Finish response: {"file":{"_id":"68ff90312028e2a85914fc0c","modelId":"q-z5mucs","name":"testfile.bin","mime":"application/octet-stream","path":"beta/model/q-z5mucs/files/1d8b051b-9c49-41a3-851a-3fba7a3f9b00","tags":[],"complete":true,"deleted":false,"deletedBy":"","deletedAt":"","createdAt":"2025-10-27T15:30:57.020Z","updatedAt":"2025-10-27T15:31:25.224Z","__v":0,"size":12582912,"avScan":[{"_id":"68ff904d2028e2a85914fc2a","artefactKind":"file","fileId":"68ff90312028e2a85914fc0c","toolName":"Clam AV","state":"inProgress","viruses":[],"lastRunAt":"2025-10-27T15:31:25.223Z","deleted":false,"deletedBy":"","deletedAt":"","createdAt":"2025-10-27T15:31:25.227Z","updatedAt":"2025-10-27T15:31:25.227Z","__v":0,"id":"68ff904d2028e2a85914fc2a"},{"_id":"68ff904d2028e2a85914fc2d","artefactKind":"file","fileId":"68ff90312028e2a85914fc0c","toolName":"ModelScan","state":"inProgress","viruses":[],"lastRunAt":"2025-10-27T15:31:25.223Z","deleted":false,"deletedBy":"","deletedAt":"","createdAt":"2025-10-27T15:31:25.230Z","updatedAt":"2025-10-27T15:31:25.230Z","__v":0,"id":"68ff904d2028e2a85914fc2d"}],"id":"68ff90312028e2a85914fc0c"}}
==> Done.
/app $
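
One further check that could be layered on top of this run is comparing each returned ETag with a locally computed digest of the same byte range. This is only a sketch: it assumes the backing store returns the MD5 of the part body as the part ETag (the 32-hex-character ETags above are consistent with that, but nothing in this thread confirms it), and `part_md5` is a hypothetical helper rather than something in the repository.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): MD5 of bytes START..END (inclusive,
# zero-based) of FILE, printed as lowercase hex.
part_md5() {
    file=$1
    start=$2
    end=$3
    # tail -c +N starts at the Nth byte (1-based); head -c keeps the part length.
    tail -c +"$((start + 1))" "$file" |
        head -c "$((end - start + 1))" |
        md5sum | cut -d ' ' -f 1
}

# Small self-contained demo: bytes 2..6 of "xxhelloyy" are "hello".
DEMO_FILE=$(mktemp)
printf 'xxhelloyy' > "$DEMO_FILE"
part_md5 "$DEMO_FILE" 2 6
# prints: 5d41402abc4b2a76b9719d911017c592
rm -f "$DEMO_FILE"
```

Inside the upload loop, the result of `part_md5 "$TEST_FILE" "$START_BYTE" "$END_BYTE"` could be compared against `$CLEAN_ETAG`; a mismatch would only be meaningful if the MD5 assumption holds for the store in use.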

@PE39806 PE39806 marked this pull request as ready for review October 27, 2025 15:36
@PE39806 PE39806 requested review from JR40159 and JRB66955 October 27, 2025 15:39
@PE39806 PE39806 merged commit cdb4801 into main Nov 18, 2025
22 checks passed
@PE39806 PE39806 deleted the 2585-resumable-downloads branch November 18, 2025 08:49

Labels

enhancement, javascript, model artefact management, ready for review

5 participants