gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access by cmaloney · Pull Request #145957 · python/cpython

cmaloney · 2026-03-15T02:05:26Z

TextIOWrapper keeps its underlying stream in a member called self->buffer. That stream can be detached by user code, such as custom .flush() implementations, resulting in self->buffer being set to NULL. The implementation often checked at the start of a function whether self->buffer was in a good state, but did not always recheck after calling other Python code which could modify self->buffer.

The cases which need to be re-checked are hard to spot, so rather than rely on reviewer effort, build a better safety net by changing all self->buffer access to go through helper functions.

Thank you @yihong0618 for the test, NEWS and an initial implementation in gh-143041.
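
As a rough illustration of the pattern described above (not the C code in this PR), the following pure-Python toy model shows why a check at function entry is not enough when a user-overridden flush() can detach the stream, and how routing every access through one helper closes the gap. ToyTextWrapper, _buffer_checked, DetachingBuffer and run are hypothetical names invented for this sketch. In Python the stale access surfaces as an AttributeError on None; in C the equivalent is a NULL pointer dereference.

```python
import io


class ToyTextWrapper:
    """Toy stand-in for TextIOWrapper; NOT the real implementation."""

    def __init__(self, buffer):
        self._buffer = buffer  # plays the role of self->buffer

    def detach(self):
        buffer, self._buffer = self._buffer, None
        return buffer

    def _buffer_checked(self):
        # Hypothetical helper: the single place that validates self._buffer.
        if self._buffer is None:
            raise ValueError("underlying buffer has been detached")
        return self._buffer

    def truncate_unsafe(self, size):
        buffer = self._buffer_checked()     # checked once, up front
        buffer.flush()                      # user code may detach here
        return self._buffer.truncate(size)  # stale assumption: may be None

    def truncate_safe(self, size):
        self._buffer_checked().flush()      # user code may detach here
        return self._buffer_checked().truncate(size)  # re-checked


class DetachingBuffer(io.BytesIO):
    """A buffer whose flush() re-enters the wrapper and detaches it."""

    wrapper = None

    def flush(self):
        if self.wrapper is not None and self.wrapper._buffer is not None:
            self.wrapper.detach()
        super().flush()


def run(method_name):
    """Return the exception type name raised by the given truncate variant."""
    buffer = DetachingBuffer(b"hello")
    wrapper = ToyTextWrapper(buffer)
    buffer.wrapper = wrapper
    try:
        getattr(wrapper, method_name)(0)
    except Exception as exc:
        return type(exc).__name__
    return "no error"
```

Calling run("truncate_unsafe") and run("truncate_safe") contrasts the two variants: the first trips over the missing buffer, the second reports the detached state through the helper.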

Issue: Null pointer dereference in TextIOWrapper.truncate via re-entrant flush #143008

TextIOWrapper keeps its underlying stream in a member called `self->buffer`. That stream can be detached by user code, such as custom `.flush` implementations resulting in `self->buffer` being set to NULL. The implementation often checked at the start of functions if `self->buffer` is in a good state, but did not always recheck after other Python code was called which could modify `self->buffer`. The cases which need to be re-checked are hard to spot so rather than rely on reviewer effort create better safety by making all self->buffer access go through helper functions. Thank you yihong0618 for the test, NEWS and initial implementation in pythongh-143041. Co-authored-by: yihong0618 <zouzou0208@gmail.com>

yihong0618 · 2026-03-15T02:06:34Z

wow that is cool thanks

cmaloney · 2026-04-16T18:50:57Z

@colesbury could you take a look, as you've done some recent work on TextIO safety around threading? These cases are generally single-threaded, but they generally follow the pattern: the implementation checked state, then did an operation which could modify it, and didn't recheck.

colesbury · 2026-04-17T17:56:16Z

I'm a bit confused about the relationship here between _textiowrapper_buffer_safe and CHECK_ATTACHED. When can buffer be NULL and CHECK_ATTACHED not trigger?

colesbury · 2026-04-17T18:00:27Z

Less important, but please match the CPython style for C code, e.g.:

{ on new lines for function definitions
when wrapping function calls, indent continuation lines to align with the call site

cmaloney · 2026-04-17T20:38:50Z

> I'm a bit confused about the relationship here between _textiowrapper_buffer_safe and CHECK_ATTACHED. When can buffer be NULL and CHECK_ATTACHED not trigger?

CHECK_ATTACHED, by way of CHECK_INITIALIZED, checks self->ok first. The self->ok flag is set to 0 at the start of __init__, set to 1 at the end, and reset to 0 during .clear() as well as at the start of Py_tp_dealloc. During __init__ there are calls to check that the file is in a good state and to get some attributes (remember: self->buffer is the underlying file-like object, not an internal buffer).

The more complicated case is during deletion, where there is an interaction with the base class. TextIOWrapper implements Py_tp_dealloc, and its base class (TextIOBase, which has a base class IOBase) implements Py_tp_finalize with the function iobase_finalize. That function tries to ensure all data is written through the full I/O "stack" before any part of it is closed out of order. It tries to have Text I/O flush and close, then Buffered I/O (self->buffer), then the Raw I/O. If there is an underlying file (self->buffer != NULL) during those operations, the code needs to try to write the data held in the TextIOWrapper even though self->ok is 0.

Adding to the complexity here, all the objects involved and the multiple operations (flush(), close(), .closed) can be overridden, and could call TextIOWrapper.clear() or TextIOWrapper.detach(), causing self->buffer to become NULL.
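
The two error paths that CHECK_ATTACHED covers (self->ok first, then the detached state) can be observed from Python. This is a small sketch, assuming CPython's C implementation of io; the helper name probe and the results dictionary are made up for this example. It does not exercise the finalization path described above.

```python
import io


def probe(stream):
    """Return the ValueError message from a zero-length read, or 'ok'."""
    try:
        stream.read(0)
    except ValueError as exc:
        return str(exc)
    return "ok"


# Allocated, but __init__ never ran, so the object is not marked as usable.
uninitialized = io.TextIOWrapper.__new__(io.TextIOWrapper)

# Fully initialized, then detached by the caller.
detached = io.TextIOWrapper(io.BytesIO(b"data"), encoding="utf-8")
detached.detach()

# Fully initialized and still attached.
attached = io.TextIOWrapper(io.BytesIO(b"data"), encoding="utf-8")

results = {
    "uninitialized": probe(uninitialized),
    "detached": probe(detached),
    "attached": probe(attached),
}
```

The uninitialized object is rejected before the detached check is ever consulted, which is the ordering the comment above describes.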

> Less important, but please match the CPython style for C code, e.g.:

Sorry for missing those bits. I'm reviewing for that and will update shortly.

cmaloney · 2026-04-20T07:49:39Z

There are a couple of cases I'm not sure how to format per PEP 7, in particular:

/* hits 81 cols */
        PyObject *bytes = _textiowrapper_buffer_callmethod_noargs(self,
                                                                  &_Py_ID(read));
/* hits 81 cols */
            res = _textiowrapper_buffer_callmethod_onearg(self,
                                                          &_Py_ID(_dealloc_warn),
                                                          (PyObject *)self);


PyObject *input_chunk = _textiowrapper_buffer_callmethod_onearg(self,
            &_Py_ID(read), bytes_to_feed);

A couple directions I looked at but don't know how to decide between:

Make the variable name shorter (increases diff)
Make the function name shorter (e.g. _textiowrapper_buf_... or _textio_buffer_, _textiowrapper_buf_callmeth_noargs)
What's the right indentation when just putting the arguments on the next line?

colesbury · 2026-04-20T17:49:29Z

I think it's fine to make the function names shorter, i.e., buffer_callmethod_onearg instead of _textiowrapper_buffer_callmethod_onearg. File-scope static functions don't need a prefix or underscore.

In general the preference is for wrapping function calls like the following (from PEP 7 example):

PyErr_Format(PyExc_TypeError,
             "cannot create '%.100s' instances",
             type->tp_name);

But if that's not practical due to overly long lines, then just try to fit the style of the surrounding code. For example, from elsewhere in textio.c:

        PyObject *incrementalDecoder = PyObject_CallFunctionObjArgs(
            (PyObject *)state->PyIncrementalNewlineDecoder_Type,
            self->decoder, self->readtranslate ? Py_True : Py_False, NULL);

cmaloney · 2026-05-06T21:24:22Z

This should also resolve gh-143007

cmaloney · 2026-05-12T21:35:03Z

I think this is ready for re-review @colesbury

vstinner · 2026-05-28T16:11:46Z

+        }
+        return NULL;
+    }
+    return self->buffer;


I was worried that the function returns a borrowed reference. Another thread could call detach(), turning the buffer variable into a dangling pointer. I hacked the code to introduce sleep+sched_yield() between getting the buffer and using the buffer, and I spawned 250 threads calling stream.write() and 1 thread calling stream.detach() with a random sleep.

In fact, it's (currently) safe to use a borrowed reference because all io.TextIOWrapper methods are protected by @critical_section. So it's not possible to call detach() while another io.TextIOWrapper method is running: method calls are serialized by the critical section.

But it would be interesting to add a comment to explain why it's safe to use a borrowed reference. Something like:

// Returning a borrowed reference is safe since TextIOWrapper
// methods are protected by critical sections.

cmaloney · 2026-06-05

Added a _Py_CRITICAL_SECTION_ASSERT_OBJECT_LOCKED assertion to validate this, plus a comment to document it.

cmaloney · 2026-06-05

In the process, found out that __init__ wasn't in a critical section, so re-initialization was racy. Fixed that as well.
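
The serialization argument in this thread can be modeled in plain Python. This is only a toy sketch under stated assumptions: LockedToyWrapper, _buffer_borrowed and stress are hypothetical names, and a threading.RLock stands in for the per-object critical section. Python references are always strong, so the model cannot reproduce a dangling pointer; it only shows that when every method holds the same lock, a writer either sees an attached buffer for its whole call or gets a clean ValueError.

```python
import threading


class LockedToyWrapper:
    """Toy model: every public method holds one per-object lock."""

    def __init__(self, buffer):
        self._lock = threading.RLock()  # stands in for the critical section
        self._buffer = buffer

    def _buffer_borrowed(self):
        # Only meaningful while the caller holds self._lock.
        if self._buffer is None:
            raise ValueError("underlying buffer has been detached")
        return self._buffer

    def write(self, data):
        with self._lock:
            buffer = self._buffer_borrowed()
            # detach() cannot run here: it needs the same lock.
            buffer.append(data)

    def detach(self):
        with self._lock:
            buffer, self._buffer = self._buffer_borrowed(), None
            return buffer


def stress(writers=8, writes_each=200):
    """Race writers against one detach(); return (clean_errors, other_errors)."""
    wrapper = LockedToyWrapper([])
    clean, other = [], []

    def writer():
        for _ in range(writes_each):
            try:
                wrapper.write("x")
            except ValueError:
                clean.append(1)      # expected once the buffer is detached
            except Exception as exc:  # would indicate an unserialized access
                other.append(exc)

    threads = [threading.Thread(target=writer) for _ in range(writers)]
    threads.append(threading.Thread(target=wrapper.detach))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(clean), len(other)
```

How many writes fail with ValueError depends on scheduling, so only the absence of any other exception is meaningful here.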

vstinner · 2026-05-28T16:12:56Z

+   leading to NULL pointer dereferences (see gh-143008, gh-142594). Protect against
+   that by using helpers to check self->buffer validity at callsites. */
+static PyObject *
+buffer_access_safe(textio *self)


I suggest to rename buffer_xxx() functions to textiowrapper_buffer_xxx(), and rename this function to textiowrapper_buffer_get().

cmaloney · 2026-06-05

Started like that (see: d7f14fc), but it made some of the PEP 7 formatting really bad, so moved to a shorter name.

vstinner · 2026-05-28T16:25:09Z

+    /* During destruction self->ok is 0 but self->buffer is non-NULL and this
+       needs to error in that case which the safe buffer wrapper does not.
+
+       Match original behavior by calling CHECK_ATTACHED explicitly. */


textiowrapper_dealloc() sets ok to 0 and almost immediately sets buffer to NULL. So I don't understand well when it's possible that ok is 0 and buffer is non-NULL.

This comment is unclear to me. I suggest rephrasing it to something like:

"Call CHECK_ATTACHED() to raise an exception if ..., when ... happens."

The "Match original behavior" part is unclear to me.

cmaloney · 2026-06-05

Moved to checking self->ok directly and just returning true in that case. This differs more from the original behavior, but I think it is a lot more understandable. Also described a couple of cases where this is hit by the test suite.

Co-authored-by: Victor Stinner <vstinner@python.org>

bedevere-app Bot mentioned this pull request Mar 15, 2026

Null pointer dereference in TextIOWrapper.truncate via re-entrant flush #143008

Open

bedevere-app Bot added the awaiting review label Mar 15, 2026

cmaloney added needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels Mar 15, 2026

cmaloney added 2 commits March 14, 2026 19:12

fixup: Refer to existing helpers

551c3eb

fixup: More precise recursion errorcatching

4f455f4

cmaloney changed the title ~~gh-143008: Safer TextIOWrapper underlying stream access~~ gh-143008: Fix Null Pointer Dereferences in TextIOWrapper underlying stream access Mar 17, 2026

cmaloney changed the title ~~gh-143008: Fix Null Pointer Dereferences in TextIOWrapper underlying stream access~~ gh-143008: Fix Null pointer dereferences in TextIOWrapper underlying stream access Mar 17, 2026

cmaloney added 2 commits March 16, 2026 23:08

fixup: Formatting fixes, simplify test

ec39a61

fixup: correct commment

98bc865

yihong0618 reviewed Mar 17, 2026

Comment thread Lib/test/test_io/test_textio.py

cmaloney and others added 2 commits March 30, 2026 00:33

PEP 7 fixes; simplify some comments

c52559b

Merge branch 'main' into textio_nullbuffer

210b21b

colesbury self-requested a review April 17, 2026 17:57

Improve PEP-7

150fcee

cmaloney added 4 commits April 21, 2026 00:06

PEP-7: Move to shorter function names, reformat arg lists

d7f14fc

Improve NEWS grammer

df212a5

Simplify comment

d1a1a98

Fix comment referring to function name

009cf12

vstinner reviewed May 28, 2026

serhiy-storchaka added the needs backport to 3.15 pre-release feature fixes, bugs and security fixes label May 30, 2026

Apply suggestions from code review

afe79c7

Co-authored-by: Victor Stinner <vstinner@python.org>

cmaloney added 7 commits June 4, 2026 18:23

Assert self->buffer is locked, fix __init__ race

452168c

Adjust closed_get_impl

d7b146e

Add back in locked assterion, match surrounding comment style

593b0ea

Move null checks up a line

e5c014a

Merge remote-tracking branch 'upstream/main' into textio_nullbuffer

eb23eaa

Fix clinic merge

2581e6c

Fix blurb

feafb90
