ENG-1640: Confine untrusted archive extraction under the destination root #3815
…1640)

Private ingredient wheels come from untrusted sources, so their extraction must not be able to write or point anywhere outside the destination directory.

Unarchive now takes a WithUntrustedSource option. When set, every entry path, symlink target, and hardlink target is confined under the destination root, and anything that would escape aborts extraction. It is off by default: trusted Platform artifacts may legitimately contain absolute symlinks (for example into /usr/share), so their extraction is unchanged.

The ENG-1635 decrypt path will extract private wheels WithUntrustedSource; per-user isolation of decrypted content and the decrypt temp dir are handled there.

testfile.tar.gz is a contained fixture backing the successful-extraction test; the previous fixture, which has a root-level escaping symlink, is renamed to testfile-escapes.tar.gz and backs the rejection test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
5af4724 to 6779777
Test failures are known or sporadic and unrelated to this PR.
Pull request overview
Adds an “untrusted archive” extraction mode to internal/unarchiver intended to prevent path/symlink/hardlink escapes outside the destination root, primarily for extracting private ingredient artifacts from untrusted sources.
Changes:
- Introduces `WithUntrustedSource()` option and plumbs it through `NewTarGz`/`NewZip` to enable stricter extraction behavior.
- Adds untrusted extraction validation for entry paths, symlink targets, and hardlink targets during `Unarchive`.
- Expands/adjusts unarchiver tests and adds a dedicated `untrusted_test.go` with malicious archive fixtures.
Reviewed changes
Copilot reviewed 3 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| internal/unarchiver/unarchiver.go | Adds WithUntrustedSource option and containment checks for entries and link targets during extraction. |
| internal/unarchiver/unarchiver_test.go | Updates existing tests to cover trusted vs untrusted behavior using fixtures. |
| internal/unarchiver/untrusted_test.go | Adds new unit tests that build in-memory tar.gz archives with escaping paths/links to verify rejection. |
The untrusted-source guard used filepath.IsAbs to reject absolute symlink
targets, but that is host-specific: a Unix-style "/etc/passwd" reads as
relative on Windows (no drive letter), so it slipped past the check and got
folded into the extraction root, escaping rejection.
Replace it with hasAbsoluteTarget, which treats a leading separator ("/" or
"\") or a volume name ("C:") as rooted on any platform, and add a
backslash-rooted test case that exercises this on every OS.
ENG-1640
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
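For readers following the commit message above, here is a minimal sketch of what a host-independent absolute-target check could look like. It is not the PR's actual `hasAbsoluteTarget`; only the name and the described behavior (leading `/` or `\`, or a volume name such as `C:`, counts as rooted on every OS) come from the commit message, and the single-letter drive test is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// hasAbsoluteTarget reports whether a link target looks rooted on ANY
// platform, regardless of the host OS. Assumed behavior, reconstructed from
// the commit message; the real implementation may differ.
func hasAbsoluteTarget(target string) bool {
	// A leading forward slash or backslash is treated as rooted everywhere.
	if strings.HasPrefix(target, "/") || strings.HasPrefix(target, `\`) {
		return true
	}
	// Assumption: a volume name is one ASCII letter followed by a colon.
	if len(target) >= 2 && target[1] == ':' {
		c := target[0]
		return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
	}
	return false
}

func main() {
	for _, t := range []string{"/etc/passwd", `\Windows\System32`, `C:\Temp`, "../lib/libfoo.so"} {
		fmt.Printf("%q absolute: %v\n", t, hasAbsoluteTarget(t))
	}
}
```

Unlike `filepath.IsAbs`, this check never consults the host's path rules, which is the property the commit message is after.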
    if ua.untrusted {
        if hasAbsoluteTarget(file.LinkTarget) {
            return errs.New("symlink target %q is absolute", file.LinkTarget)
        }
        resolved := filepath.Join(filepath.Dir(path), file.LinkTarget)
        if !isContainedPath(root, resolved) {
            return errs.New("symlink target %q escapes the extraction root", file.LinkTarget)
        }
    }
Is there a way we can untangle the nesting here?
It doesn't look like it can be done without duplicating a call to the function below this conditional block (writeNewSymbolicLink()) :(
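As an illustration of the containment step discussed in this thread, the sketch below shows one way `isContainedPath` could be written with `filepath.Rel`, plus a hypothetical helper, `checkSymlinkTarget`, that gathers both checks behind early returns. Neither body is taken from the PR: `isContainedPath` is only known by name, `checkSymlinkTarget` is invented here, and the absolute-target test is simplified to a leading separator. Whether such a helper fits the surrounding code is not something the excerpt settles.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isContainedPath reports whether candidate stays at or below root.
// Assumed implementation; the PR's version is not shown in the thread.
func isContainedPath(root, candidate string) bool {
	rel, err := filepath.Rel(root, candidate)
	if err != nil {
		return false
	}
	// A relative path that is ".." or starts with "../" climbs out of root.
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// checkSymlinkTarget is a hypothetical helper: it validates a symlink at
// path whose target is target, and returns nil when the link is safe.
func checkSymlinkTarget(root, path, target string) error {
	// Simplified absolute check (leading separator only) for this sketch.
	if strings.HasPrefix(target, "/") || strings.HasPrefix(target, `\`) {
		return fmt.Errorf("symlink target %q is absolute", target)
	}
	// Relative targets resolve against the directory holding the link.
	resolved := filepath.Join(filepath.Dir(path), target)
	if !isContainedPath(root, resolved) {
		return fmt.Errorf("symlink target %q escapes the extraction root", target)
	}
	return nil
}

func main() {
	root := "/dest"
	fmt.Println(checkSymlinkTarget(root, "/dest/lib/link", "../share/data"))
	fmt.Println(checkSymlinkTarget(root, "/dest/lib/link", "../../etc/passwd"))
}
```

With a helper of this shape, the call site would reduce to a single guarded call before the symlink is written, at the cost of one more function in the package.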
ENG-1640: Contain decrypted installs: path sanitization, private temps, user isolation
Part of the private ingredient work (ENG-1563). Private ingredient wheels come from untrusted sources, so their extraction must not be able to write or point anywhere outside the destination directory.
`Unarchive` now takes a `WithUntrustedSource` option. When set, every entry path, symlink target, and hardlink target is confined under the destination root, and anything that escapes aborts extraction. It's off by default: trusted Platform artifacts may legitimately contain absolute symlinks (e.g. into `/usr/share`), so their extraction is unchanged. The decrypt-and-install path (ENG-1635) will extract private wheels with this option.

Scope: just the extraction sanitizer. The decrypt temp dir and per-user isolation of decrypted content are deferred to ENG-1635, where they're applied at the deploy site.
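The opt-in behavior described above follows the usual Go functional-options pattern. The sketch below shows the general shape only: the `Unarchiver` struct, the `Option` type, and the `New` constructor are hypothetical stand-ins (the real constructors are `NewTarGz`/`NewZip`, whose signatures are not shown in this PR page), and only the `WithUntrustedSource` name and its off-by-default behavior come from the description.

```go
package main

import "fmt"

// Unarchiver is a stand-in type for this sketch; only the flag matters here.
type Unarchiver struct {
	untrusted bool
}

// Option configures an Unarchiver at construction time (hypothetical type).
type Option func(*Unarchiver)

// WithUntrustedSource switches on confinement checks. The name is from the
// PR; this body is an assumption about how such an option could be wired.
func WithUntrustedSource() Option {
	return func(u *Unarchiver) { u.untrusted = true }
}

// New is a hypothetical constructor standing in for NewTarGz / NewZip.
func New(opts ...Option) *Unarchiver {
	u := &Unarchiver{} // untrusted is false by default: trusted extraction is unchanged
	for _, opt := range opts {
		opt(u)
	}
	return u
}

func main() {
	trusted := New()
	untrusted := New(WithUntrustedSource())
	fmt.Println("trusted caller, checks enabled:", trusted.untrusted)
	fmt.Println("untrusted caller, checks enabled:", untrusted.untrusted)
}
```

Because the flag defaults to false, existing callers that pass no options keep their current behavior, which matches the "off by default" guarantee in the description.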
Base branch: targets `mitchell/eng-1632` (in review) so the diff is only this change; GitHub will retarget it to `version/0-48-1-RC2` once the upstream PRs land.

Tested with successful and rejected (untrusted) tar.gz fixtures, the same escaping archive extracting when trusted, a zip happy path, and contained symlink/hardlink extraction.
🤖 Generated with Claude Code