resync #1

Merged

rw2 merged 20 commits into rw2:main from PelicanPlatform:main on Dec 10, 2024
Conversation

rw2 (Owner) commented Dec 10, 2024

No description provided.

ellert and others added 20 commits November 1, 2024 16:33
undefined reference to `XrdOssDF::pgRead(void*, long, unsigned int, unsigned int*, unsigned long long)'
undefined reference to `XrdOssDF::pgWrite(void*, long, unsigned int, unsigned int*, unsigned long long)'

Xrootd is compiled using -D_FILE_OFFSET_BITS=64. This means that the
second parameter in pgRead/pgWrite, which is declared as off_t in
xrootd's headers, is a long long on 32-bit architectures. Without
-D_FILE_OFFSET_BITS=64, off_t is a long, and the mismatch leads to a
linking error.

Add -D_FILE_OFFSET_BITS=64 to the compiler definitions so that this
plugin is compiled with the same off_t definition that xrootd uses.
Fix linking error on 32 bit architectures
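As a sanity check, the wide off_t can be asserted at build time; this is an illustrative sketch, not part of the actual change:

```cpp
// Sketch: verify the plugin sees the same 64-bit off_t that xrootd was
// built with. Without -D_FILE_OFFSET_BITS=64, a 32-bit target makes
// off_t a 32-bit long, the mangled symbols for pgRead/pgWrite change,
// and linking against xrootd's 64-bit-off_t symbols fails.
#include <sys/types.h>
static_assert(sizeof(off_t) == 8,
              "compile with -D_FILE_OFFSET_BITS=64 to match xrootd");
```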
This provides some minimal test cases for the S3 write code.
Most significantly, it switches the debug/dump of the libcurl interaction
to use the XRootD logging framework instead of printing directly to stderr.
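A minimal sketch of how such a redirection can look using libcurl's debug hook; the `Logger` type here is a stand-in for the XRootD logging object the plugin actually uses, and is an assumption:

```cpp
#include <curl/curl.h>
#include <string>

// Hypothetical stand-in for the XRootD logging object (e.g. XrdSysError);
// the real plugin's type and logging call may differ.
struct Logger {
    void Log(const std::string &msg) { /* forward to the real framework */ }
};

static int DebugCallback(CURL *, curl_infotype type, char *data,
                         size_t size, void *userptr) {
    if (type == CURLINFO_TEXT || type == CURLINFO_HEADER_IN ||
        type == CURLINFO_HEADER_OUT) {
        // libcurl does not NUL-terminate `data`, so copy it with its length.
        static_cast<Logger *>(userptr)->Log(std::string(data, size));
    }
    return 0; // CURLOPT_DEBUGFUNCTION callbacks must return 0
}

void AttachLogger(CURL *curl, Logger *log) {
    curl_easy_setopt(curl, CURLOPT_DEBUGFUNCTION, DebugCallback);
    curl_easy_setopt(curl, CURLOPT_DEBUGDATA, log);
    curl_easy_setopt(curl, CURLOPT_VERBOSE, 1L); // debug hook fires only in verbose mode
}
```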
This refactors the request logic to allow requests to be continued
over multiple calls.  The calling thread will regain control when
the buffer has been completely consumed (even if the full operation
will require additional buffers).

Note that this only works if the client provides the full file size.
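One way to get this behavior from libcurl is its pause mechanism: the read callback returns CURL_READFUNC_PAUSE once the caller's buffer is drained, and the transfer resumes when the next buffer arrives. A hedged sketch under assumed names (`UploadState` is illustrative, not the plugin's actual type):

```cpp
#include <curl/curl.h>
#include <algorithm>
#include <cstring>

struct UploadState {
    const char *buf = nullptr; // data from the current Write() call
    size_t      len = 0;       // bytes remaining in that buffer
};

static size_t ReadCallback(char *dest, size_t size, size_t nitems, void *userp) {
    auto *state = static_cast<UploadState *>(userp);
    if (state->len == 0)
        return CURL_READFUNC_PAUSE; // buffer fully consumed; caller regains control
    size_t n = std::min(state->len, size * nitems);
    std::memcpy(dest, state->buf, n);
    state->buf += n;
    state->len -= n;
    return n;
}

// When the caller supplies the next buffer, the transfer is resumed with:
//   state->buf = data; state->len = size;
//   curl_easy_pause(handle, CURLPAUSE_CONT);
```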
We want to always unpause a given operation from the same thread that
is handling it.  If a separate thread can pick up the operation, there
is a race condition where both the original thread and the new one
operate on the same `CURL *` handle at the same time; this was observed
to result in a segfault.

This commit introduces a separate "unpause" queue for each curl worker;
this queue is notified by the parent when there is additional data
available.
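A hypothetical sketch of such a per-worker unpause queue: the parent enqueues handles that have new data, and only the worker thread that owns each `CURL *` ever resumes it, which avoids the two-thread race. The class name is illustrative:

```cpp
#include <curl/curl.h>
#include <condition_variable>
#include <deque>
#include <mutex>

class UnpauseQueue {
public:
    void Notify(CURL *handle) {                 // called by the parent thread
        { std::lock_guard<std::mutex> lk(m_mutex);
          m_pending.push_back(handle); }
        m_cv.notify_one();
    }
    void Drain() {                              // called only by the owning worker
        std::deque<CURL *> batch;
        { std::unique_lock<std::mutex> lk(m_mutex);
          m_cv.wait(lk, [this] { return !m_pending.empty(); });
          batch.swap(m_pending); }
        for (CURL *h : batch)
            curl_easy_pause(h, CURLPAUSE_CONT); // safe: same thread runs the transfer
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::deque<CURL *> m_pending;
};
```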
If an entire part is a single libcurl operation, a client that starts
writing - but then leaves for an extended period of time - will leave
dangling references inside libcurl (eventually exhausting the number
of allowable transfers).

This commit adds a background thread for S3File that goes through
all pending uploads and checks whether each is still live; if not,
it times out the operation.
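A hypothetical sketch of such a liveness sweep; `PendingUpload`, the 10-second sweep interval, and the 2-minute idle window are illustrative choices, not values taken from the actual change:

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

struct PendingUpload {
    std::chrono::steady_clock::time_point last_activity;
    bool timed_out = false;
};

void TimeoutSweeper(std::vector<PendingUpload> &uploads, std::mutex &mtx,
                    std::atomic<bool> &stop) {
    using namespace std::chrono;
    while (!stop.load()) {
        std::this_thread::sleep_for(seconds(10));
        const auto now = steady_clock::now();
        std::lock_guard<std::mutex> lk(mtx);
        for (auto &u : uploads) {
            if (!u.timed_out && now - u.last_activity > minutes(2))
                u.timed_out = true; // real code would abort the stalled upload
        }
    }
}
```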
After the notification is done, the request may be deleted by the
owning S3File instance.  Do not call `Notify` from within the curl
result processing function as the request object needs to be alive
to release the curl handle.
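A hypothetical sketch of this ordering constraint: release the curl handle while the request object is still alive, and notify last, since the owner may delete the request as soon as it is notified. All names here are illustrative:

```cpp
#include <curl/curl.h>

struct Request {
    void ReleaseHandle() { /* return the CURL* to the worker's pool */ }
    void SetResult(CURLcode rc) { m_result = rc; }
    void Notify() { /* wake the owning S3File; it may delete this object */ }
    CURLcode m_result = CURLE_OK;
};

void OnTransferDone(Request *req, CURLcode rc) {
    req->ReleaseHandle(); // needs the request alive; must precede Notify
    req->SetResult(rc);
    req->Notify();        // final step: do not touch *req after this
}
```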
If the client doesn't pre-declare the size of the object it will
write, we don't know the size of the last part of the upload.  Hence,
we must switch back to buffer mode in this case.
If the entire object is uploaded during a single `Write` call, then
skip the multipart upload and just do a single non-buffered upload.
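A hypothetical sketch of the upload-mode decision the two notes above describe; the enum and function names are illustrative, not the plugin's actual API:

```cpp
#include <cstddef>

enum class UploadMode {
    SinglePut, // whole object arrives in one Write: plain non-buffered upload
    Multipart, // size pre-declared: streaming multipart upload
    Buffered   // size unknown: fall back to buffering the last part
};

UploadMode ChooseMode(std::size_t write_len, std::size_t declared_size,
                      bool size_known) {
    if (!size_known)
        return UploadMode::Buffered;
    if (write_len == declared_size)
        return UploadMode::SinglePut;
    return UploadMode::Multipart;
}
```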
The unit test refactoring left copy/pasted code.  This commit splits
the common piece into a single header, allowing `s3_tests.cc` to hold
the unit tests that utilize AWS S3 while `s3_unit_tests.cc` uses the
minio instance started up by ctest.
Switch S3 writes to use streaming I/O
rw2 merged commit 884eb9b into rw2:main on Dec 10, 2024
2 checks passed