Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Oct 27, 2025

When extracting archives containing symlinks, FileUtils.resolveFile() calls getCanonicalFile(), which follows symlinks to their targets. On re-extraction, a symlink named symlinktarget_file resolves to target_file (canonical path), causing the code to attempt creating a symlink named target_file instead of symlink.

Changes

  • Added resolveFileWithoutFollowingSymlinks(): Path resolution without canonicalization for symlink entries
  • Updated extractFile(): Conditionally use non-canonicalizing resolution when symlinkDestination is present
  • Modified security check: Validate symlink path itself via toPath().toAbsolutePath().normalize() instead of following to target
  • Added test: Verify symlinks preserved across multiple extractions for both tar and zip formats
// Before: symlink resolves to its target on re-extraction
final File targetFileName = FileUtils.resolveFile(dir, entryName);  
// Always calls getCanonicalFile() → follows symlinks

// After: preserve symlink identity
final File targetFileName;
if (!StringUtils.isEmpty(symlinkDestination)) {
    targetFileName = resolveFileWithoutFollowingSymlinks(dir, entryName);
} else {
    targetFileName = FileUtils.resolveFile(dir, entryName);
}

Security checks for path traversal remain intact via normalized absolute path validation.

Original prompt

This section details on the original issue you should resolve

<issue_title>Symbolic link destination is not properly resolved when extracting archive</issue_title>
<issue_description>To quote AbstractUnArchiver source code:

// Hmm. Symlinks re-evaluate back to the original file here. Unsure if this is a good thing...
final File f = FileUtils.resolveFile( dir, entryName );

So when extracting symbolic links in some cases they are resolved to the file the point at instead to their file name. And this turns out to not be a good thing. Let me give you and example to better explain what are the implications.

Lets say we have an archive with symbolic link named symlink that points to regular_file. The call to FileUtils.resolveFile returns the canonical path and the canonical path may be different depending on weather the resolved file exist. Lets say that symlink does not exist yet (the archive is extracted in clean directory). FileUtils.resolveFile will return symlink and latter SymlinkUtils.createSymbolicLink( f, new File( symlinkDestination ) ) will create symbolic named symlink that points to regular_file. So far everything is as expected. But if symlink exists (for example we extract the archive again) then FileUtils.resolveFile will resolve to regular_file (the canonical path). The call to SymlinkUtils.createSymbolicLink will try to create symbolic link named regular_file, but because the file already exists it will do nothing. Although it is not what we expect in most cases still works. If the symbolic link destination didn't changed then the end result will be the expected one. But what if symlink points to file that does not exist. So lets say that symlink already exists on the disk, but regular_file not. Then the call to SymlinkUtils.createSymbolicLink will create symbolik link name regular_file that points to regular_file.

My suggestion is to modify FileUtils.resolveFile so it accepts a flag indicating if we want the canonical path or not. For symbolic link we won't get the canonical path and then f will always point to the symbolic link and not to the target path.

For the security check if the extracted file is outside the destination directory an exception is thrown. I would suggest to keep this check. It prevents extracting symbolic links that point to files outside the destination directory (../some_file for example) if the symbolic link already exists. But this is the current behavior and nobody complained so I would suggest to keep it as it is more secure.</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

…ymlinks

Co-authored-by: slachiewicz <6705942+slachiewicz@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix symbolic link destination resolution issue in archive extraction Fix symlink extraction to preserve symlink identity instead of resolving to target Oct 27, 2025
Copilot AI requested a review from slachiewicz October 27, 2025 01:58
Copilot finished work on behalf of slachiewicz October 27, 2025 01:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Symbolic link destination is not properly resolved when extracting archive

2 participants