gh-149146: Add dealloc-depth counter fallback to trashcan trigger by bhuvi27 · Pull Request #151595 · python/cpython

bhuvi27 · 2026-06-17T09:57:40Z

while True: b = (b, None) under RLIMIT_AS segfaults during cleanup:

b = 9
while True:
    b = (b, None)

Docker with ulimit -v 524288: MemoryError → lost sys.stderr → segfault, exit 139.

The trashcan deposit in _Py_Dealloc only fires when _Py_RecursionLimit_GetMargin(tstate) < 2, which compares the stack pointer against c_stack_soft_limit. Under RLIMIT_AS the kernel SIGSEGVs trying to grow the stack while SP is still nowhere near the soft limit — issue #150722 has an LLDB trace from the same reporter showing SP ~7 MB above c_stack_hard_limit at the fault. So the trashcan never deposits and the recursive tuple_dealloc → Py_XDECREF → _Py_Dealloc → tuple_dealloc chain just runs into the wall.

This adds a per-thread c_dealloc_depth counter and uses it as a secondary trigger inside _Py_Dealloc, depositing at depth 50 (matches the old _PyTrash_UNWIND_LEVEL). The stack-pointer margin stays the primary signal; the counter only kicks in when SP can't.

One thing I missed on the first pass: the drain side has the same problem. margin >= 4 is also true in this scenario, so every _Py_Dealloc exit would re-enter _PyTrash_thread_destroy_chain from inside the deallocator it just called. Fix: destroy_chain bumps the counter for the duration of the drain (same idea as the old delete_nesting), and the drain check in _Py_Dealloc becomes just c_dealloc_depth == 0. Counter is only touched for GC types so non-GC types pay nothing.

Tests: new test_trashcan_nested_tuple_deep in test_gc.py (100k nested tuples, must clean up without crash); test_gc test_exceptions test_capi all green locally. Docker repro now exits 1 with MemoryError instead of 139 with Segmentation fault.

#150722 is the same SP-signal blindness in the comparison path (Py_EnterRecursiveCall) — separate code path, separate PR.

Under memory pressure (for example RLIMIT_AS), Python's trashcan trigger in _Py_Dealloc never fires. The trigger relies on _Py_RecursionLimit_GetMargin, which compares the machine stack pointer against c_stack_soft_limit. When RLIMIT_AS prevents the kernel from growing the C stack, the kernel SIGSEGVs while the stack pointer is still megabytes above the soft limit (see sibling issue pythongh-150722 for an LLDB trace showing SP ~7.2 MB above c_stack_hard_limit at the SIGSEGV), so the trashcan never deposits and the recursive tuple_dealloc chain runs out the stack. Add a per-thread c_dealloc_depth counter as a fallback trigger, mirroring the historical _PyTrash_UNWIND_LEVEL=50 protection that was removed when the trashcan was consolidated into _Py_Dealloc. _Py_RecursionLimit_GetMargin remains the primary signal; the counter only kicks in when the stack-pointer signal cannot fire. _PyTrash_thread_destroy_chain itself bumps c_dealloc_depth for the duration of the drain so any _Py_Dealloc invoked while draining cannot recursively re-enter destroy_chain (which would rebuild the same unbounded recursion the trashcan exists to prevent). This mirrors the historical delete_nesting bookkeeping. The counter is per-tstate (no atomics needed in the free-threaded build) and is only touched for GC types, so non-GC types pay zero overhead. Add a regression test in test_gc that builds a 100,000-deep (b, None) tuple chain and asserts that ``del b`` cleans it up without a C-stack overflow.

vstinner · 2026-06-19T09:02:13Z

IMO the existing _Py_RecursionLimit_GetMargin() mechanism based on tstate->c_stack_hard_limit is already quite complex. I would prefer to not add a second mechanism (tstate->c_dealloc_depth).

#define _Py_DEALLOC_DEPTH_LIMIT 50 sounds very low to me. Currently, it's possible to deallocate way longer chains with no problem.

Issue gh-149146 is really a corner case: when the OS fails to enlarge the stack because the system memory is exhausted.

IMO we should do nothing for this case.

vstinner

I don't think that this approach is correct (see my previous comment).

bedevere-app · 2026-06-19T09:03:08Z

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

vstinner · 2026-06-19T09:03:17Z

cc @markshannon

bhuvi27 · 2026-06-21T10:28:45Z

Thanks for the review, @vstinner.

The concern is that gh-149146 ends in SIGSEGV during cleanup after a legitimate MemoryError — graceful failure on allocation, hard crash on teardown. That felt worth fixing.

On the depth limit: it's only a fallback when the margin signal can't fire (stack page growth blocked by RLIMIT_AS, see gh-150722). Normal deep chains still go through the margin path unchanged.

@markshannon — would you prefer fixing the margin mechanism itself (gh-150722) over a fallback counter? Happy to pivot or close if the consensus is won't-fix.

vstinner · 2026-06-24T14:18:09Z

Python has no way to check if the process reached RLIMIT_AS limit, and so that extending the stack will fail with a segfault.

IMO the current margin mechanism works as expected, and there is nothing we can do to catch stack overflow when RLIMIT_AS limit is reached.

This PR is just a workaround and it can affect badly performance when there is enough memory (most common case), since it limits the trashcan list to 50 entries which is small.

I close the issue. As I wrote previously, I don't think that it's right approach.

bhuvi27 requested review from FFY00, ZeroIntensity and ericsnowcurrently as code owners June 17, 2026 09:57

bedevere-app Bot added the awaiting review label Jun 17, 2026

bedevere-app Bot mentioned this pull request Jun 17, 2026

Segfault when cleaning up after a MemoryError with deeply nested objects #149146

Open

vstinner requested changes Jun 19, 2026

View reviewed changes

bedevere-app Bot removed the awaiting review label Jun 19, 2026

bedevere-app Bot added the awaiting changes label Jun 19, 2026

vstinner closed this Jun 24, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

gh-149146: Add dealloc-depth counter fallback to trashcan trigger#151595

gh-149146: Add dealloc-depth counter fallback to trashcan trigger#151595
bhuvi27 wants to merge 1 commit into
python:mainfrom
bhuvi27:gh-149146-tuple-memoryerror-cleanup

bhuvi27 commented Jun 17, 2026 •

edited

Loading

Uh oh!

vstinner commented Jun 19, 2026

Uh oh!

vstinner left a comment

Uh oh!

bedevere-app Bot commented Jun 19, 2026

Uh oh!

vstinner commented Jun 19, 2026

Uh oh!

bhuvi27 commented Jun 21, 2026 •

edited

Loading

Uh oh!

vstinner commented Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Uh oh!

Conversation

bhuvi27 commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vstinner commented Jun 19, 2026

Uh oh!

vstinner left a comment

Choose a reason for hiding this comment

Uh oh!

bedevere-app Bot commented Jun 19, 2026

Uh oh!

vstinner commented Jun 19, 2026

Uh oh!

bhuvi27 commented Jun 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vstinner commented Jun 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bhuvi27 commented Jun 17, 2026 •

edited

Loading

bhuvi27 commented Jun 21, 2026 •

edited

Loading