Skip to content

ENG-1640: Confine untrusted archive extraction under the destination root#3815

Merged
mitchell-as merged 2 commits into
version/0-48-1-RC2from
mitchell/eng-1640
Jun 23, 2026
Merged

ENG-1640: Confine untrusted archive extraction under the destination root#3815
mitchell-as merged 2 commits into
version/0-48-1-RC2from
mitchell/eng-1640

Conversation

@mitchell-as

@mitchell-as mitchell-as commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

ENG-1640: Contain decrypted installs: path sanitization, private temps, user isolation

Part of the private ingredient work (ENG-1563). Private ingredient wheels come from untrusted sources, so their extraction must not be able to write or point anywhere outside the destination directory.

Unarchive now takes a WithUntrustedSource option. When set, every entry path, symlink target, and hardlink target is confined under the destination root, and anything that escapes aborts extraction. It's off by default — trusted Platform artifacts may legitimately contain absolute symlinks (e.g. into /usr/share), so their extraction is unchanged. The decrypt-and-install path (ENG-1635) will extract private wheels with this option.

Scope: just the extraction sanitizer. The decrypt temp dir and per-user isolation of decrypted content are deferred to ENG-1635, where they're applied at the deploy site.

Base branch: targets mitchell/eng-1632 (in review) so the diff is only this change; GitHub will retarget it to version/0-48-1-RC2 once the upstream PRs land.

Tested with successful and rejected (untrusted) tar.gz fixtures, the same escaping archive extracting when trusted, a zip happy path, and contained symlink/hardlink extraction.

🤖 Generated with Claude Code

…1640)

Private ingredient wheels come from untrusted sources, so their extraction must
not be able to write or point anywhere outside the destination directory.

Unarchive now takes a WithUntrustedSource option. When set, every entry path,
symlink target, and hardlink target is confined under the destination root and
anything that would escape aborts extraction. It is off by default: trusted
Platform artifacts may legitimately contain absolute symlinks (for example into
/usr/share), so their extraction is unchanged.

The ENG-1635 decrypt path will extract private wheels WithUntrustedSource; per-
user isolation of decrypted content and the decrypt temp dir are handled there.

testfile.tar.gz is a contained fixture backing the successful-extraction test;
the previous fixture, which has a root-level escaping symlink, is renamed to
testfile-escapes.tar.gz and backs the rejection test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mitchell-as mitchell-as changed the title ENG-1640: Confine archive extraction paths under the destination root ENG-1640: Confine untrusted archive extraction under the destination root Jun 22, 2026
@mitchell-as

Copy link
Copy Markdown
Collaborator Author

Test failures are known or sporadic and unrelated to this PR.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an “untrusted archive” extraction mode to internal/unarchiver intended to prevent path/symlink/hardlink escapes outside the destination root, primarily for extracting private ingredient artifacts from untrusted sources.

Changes:

  • Introduces WithUntrustedSource() option and plumbs it through NewTarGz / NewZip to enable stricter extraction behavior.
  • Adds untrusted extraction validation for entry paths, symlink targets, and hardlink targets during Unarchive.
  • Expands/adjusts unarchiver tests and adds a dedicated untrusted_test.go with malicious archive fixtures.

Reviewed changes

Copilot reviewed 3 out of 5 changed files in this pull request and generated 2 comments.

File Description
internal/unarchiver/unarchiver.go Adds WithUntrustedSource option and containment checks for entries and link targets during extraction.
internal/unarchiver/unarchiver_test.go Updates existing tests to cover trusted vs untrusted behavior using fixtures.
internal/unarchiver/untrusted_test.go Adds new unit tests that build in-memory tar.gz archives with escaping paths/links to verify rejection.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/unarchiver/unarchiver.go
Comment thread internal/unarchiver/unarchiver.go
The untrusted-source guard used filepath.IsAbs to reject absolute symlink
targets, but that is host-specific: a Unix-style "/etc/passwd" reads as
relative on Windows (no drive letter), so it slipped past the check and got
folded into the extraction root, escaping rejection.

Replace it with hasAbsoluteTarget, which treats a leading separator ("/" or
"\") or a volume name ("C:") as rooted on any platform, and add a
backslash-rooted test case that exercises this on every OS.

ENG-1640

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Base automatically changed from mitchell/eng-1632 to version/0-48-1-RC2 June 23, 2026 21:12
Comment on lines +98 to +106
if ua.untrusted {
if hasAbsoluteTarget(file.LinkTarget) {
return errs.New("symlink target %q is absolute", file.LinkTarget)
}
resolved := filepath.Join(filepath.Dir(path), file.LinkTarget)
if !isContainedPath(root, resolved) {
return errs.New("symlink target %q escapes the extraction root", file.LinkTarget)
}
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we can untangle the nesting here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like it can be done without duplicating a call to the function below this conditional block (writeNewSymbolicLink()) :(

@mitchell-as mitchell-as merged commit 71e0cea into version/0-48-1-RC2 Jun 23, 2026
2 of 7 checks passed
@mitchell-as mitchell-as deleted the mitchell/eng-1640 branch June 23, 2026 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants