Skip to content

gh-151722: Do not track the dict in the GC in _PyDict_FromKeys()#152067

Merged
vstinner merged 9 commits into
python:mainfrom
vstinner:dict_fromkeys
Jun 25, 2026
Merged

gh-151722: Do not track the dict in the GC in _PyDict_FromKeys()#152067
vstinner merged 9 commits into
python:mainfrom
vstinner:dict_fromkeys

Conversation

@vstinner

@vstinner vstinner commented Jun 24, 2026

Copy link
Copy Markdown
Member

dict_merge() no longer requires the dictionary to be tracked by the GC.

dict_merge() no longer requires the dictionary to be tracked by the
GC.

Co-authored-by: Donghee Na <donghee.na@python.org>
@vstinner

vstinner commented Jun 24, 2026

Copy link
Copy Markdown
Member Author

This fix is similar to PR gh-152021 but it doesn't call _PyObject_GC_UNTRACK(d). It only cares about the dictionary being tracked by the GC if it's a frozendict. If cls is a frozendict, call frozendict_new_untracked() to get directly a frozendict not tracked by the GC. Otherwise, if cls constructor creates a frozendict tracked by the GC, create a copy which is not tracked by the GC.

We don't have to track again the dictionary by the GC on error. For a dict, it's already tracked. For a frozendict, it's ok that it's not tracked by the GC on error.

@vstinner

Copy link
Copy Markdown
Member Author

cc @corona10 @methane

@corona10

Copy link
Copy Markdown
Member

@vstinner
Could you check with following tests passed with TSAN build?

TSAN_OPTIONS=halt_on_error=0 ./python tsan_frozendict.py

import gc, threading

N = 4000
stop = threading.Event()
box = []

def reader():
    while not stop.is_set():
        if box:
            o = box[0]
            try:
                len(o); repr(o); hash(o)
            except Exception:
                pass

def leak():
    if not box:
        for o in gc.get_objects():
            if type(o) is frozendict and "k0" in o and len(o) < N:
                box.append(o)
                break

class LeakyMap:                   
    def keys(self): return [f"k{i}" for i in range(N)]
    def __getitem__(self, key):
        if key == "k5":
            leak()
        return key

def gen_keys():
    for i in range(N):
        if i == 5:
            leak()
        yield f"k{i}"

def builder():
    try:
        for _ in range(30):
            box.clear(); frozendict(LeakyMap())
            box.clear(); frozendict.fromkeys(gen_keys())
    finally:
        stop.set()

ts = [threading.Thread(target=reader) for _ in range(2)]
b = threading.Thread(target=builder)
for t in ts: t.start()
b.start(); b.join()
for t in ts: t.join(timeout=2)
print("observed half-built:", bool(box), "(False = fixed)")
print("done")

@vstinner

Copy link
Copy Markdown
Member Author

@corona10: Could you check with following tests passed with TSAN build?

Sure. The test says that issue is fixed:

$ TSAN_OPTIONS=halt_on_error=0 ./python tsan_frozendict.py
observed half-built: False (False = fixed)
done

I expect the issue to be fixed since the frozendict is no longer tracked by the GC, at least the one which is modified in-place by .fromkeys().

I built Python with ./configure --with-thread-sanitizer.

@vstinner

Copy link
Copy Markdown
Member Author

I pushed a .fromkeys() fix for dict subclass. The following code now displays <class 'frozendict'> instead of <class '__main__.DictSubclass'>. IMO it's a bad idea to create a DictSubclass instance in this case. It's less surprising to get a frozendict if DictSubclass.__new__() returns a frozendict.

created = frozendict(x=1)

class DictSubclass(dict):
    def __new__(self):
        return created

print(type(DictSubclass.fromkeys("abc")))

Comment thread Lib/test/test_dict.py Outdated
fd = FrozenDictSubclass.fromkeys("abc")
self.assertEqual(fd, frozendict(x=1, a=None, b=None, c=None))
self.assertEqual(type(fd), FrozenDictSubclass)
self.assertEqual(type(fd), frozendict)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not behavior change?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the change on purpose. IMO the old behavior is wrong. If FrozenDictSubclass constructor returns a frozendict, .fromkeys() must return a frozendict instead of creating a FrozenDictSubclass.

@methane

methane commented Jun 25, 2026

Copy link
Copy Markdown
Member

We shouldn't change the behavior of dict and frozendict without a reason, but in this case, the relationship between .fromkeys() and __new__() should be different from dict because of the immutable property of frozendict.

How about passing a frozendict to the __new__() method of a subclass of frozendict like this?

class MyFrozen(frozendict):
    def __new__(cls, *args, **kwargs):
        return super().__new__(cls, *args, **kwargs)

MyFrozen.fromkeys(["a", "b", "c"])
# => MyFrozen(frozendict({"a": None, "b": None, "c": None}))

frozendict is a new type, so we don't need to worry about breaking existing code.

@vstinner

Copy link
Copy Markdown
Member Author

How about passing a frozendict to the new() method of a subclass of frozendict like this?

Ok. I modified frozendict.fromkeys() to call the type constructor again with a frozendict if the type is a frozendict subclass, or if the constructor returned a frozendict.

@methane @corona10: Does it look better to you?

Sorry, fromkeys() is quite complex because frozendict is immutable :-) At least, the behavior is now well tested by test_dict.

Example:

class FrozenDictSubclass(frozendict):
    def __new__(cls, *args, **kwargs):
        print(f"Call constructor with {args} and {kwargs}")
        return super().__new__(cls, *args, **kwargs)

fd = FrozenDictSubclass.fromkeys("abc")
print(fd)
print(type(fd))

For a frozendict subclass, the constructor is now called twice:

Call constructor with () and {}
Call constructor with (frozendict({'a': None, 'b': None, 'c': None}),) and {}
FrozenDictSubclass({'a': None, 'b': None, 'c': None})
<class '__main__.FrozenDictSubclass'>

The first call creates an empty frozendict: fromkeys() copies this empty frozendict and fills the copy.

The second call is to create a FrozenDictSubclass from the copy.

@read-the-docs-community

read-the-docs-community Bot commented Jun 25, 2026

Copy link
Copy Markdown

Documentation build overview

📚 cpython-previews | 🛠️ Build #33311116 | 📁 Comparing 02d245f against main (56ae0b8)

  🔍 Preview build  

2 files changed
± library/stdtypes.html
± whatsnew/changelog.html

Comment thread Objects/dictobject.c Outdated
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
@vstinner vstinner enabled auto-merge (squash) June 25, 2026 17:43
@vstinner vstinner merged commit 55bc312 into python:main Jun 25, 2026
97 of 100 checks passed
@vstinner vstinner deleted the dict_fromkeys branch June 25, 2026 18:06
@miss-islington-app

Copy link
Copy Markdown

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.15.
🐍🍒⛏🤖

@bedevere-app

bedevere-app Bot commented Jun 25, 2026

Copy link
Copy Markdown

GH-152225 is a backport of this pull request to the 3.15 branch.

@bedevere-app bedevere-app Bot removed the needs backport to 3.15 pre-release feature fixes, bugs and security fixes label Jun 25, 2026
vstinner added a commit that referenced this pull request Jun 25, 2026
…omKeys() (GH-152067) (#152225)

gh-151722: Do not track the frozendict in the GC in _PyDict_FromKeys() (GH-152067)

_PyDict_FromKeys() now creates a frozendict copy which is
not tracked by the GC.

dict_merge() no longer requires the dictionary to be tracked by the
GC.
(cherry picked from commit 55bc312)

Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Donghee Na <donghee.na@python.org>
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
@vstinner

Copy link
Copy Markdown
Member Author

Merged. Thanks for your precious reviews!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants