Skip to content

Commit 2cb30f2

Browse files
gh-152033: Use a leading category as the INFO charset prefix
_get_charset_prefix() did not recognize a leading bare CATEGORY opcode, so a pattern starting with a category escape (such as ``\d``) lost its SRE_INFO_CHARSET prefix and search() could no longer skip non-matching start positions -- a regression relative to the IN-wrapped form. Handle CATEGORY there too, which restores the charset-prefix optimization and makes search() of category-prefixed patterns up to ~1.9x faster. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent b1e246c commit 2cb30f2

1 file changed

Lines changed: 1 addition & 1 deletion

File tree

Lib/re/_compiler.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -495,7 +495,7 @@ def _get_charset_prefix(pattern, flags):
495495
if iscased and iscased(av):
496496
return None
497497
return [(op, av)]
498-
elif op is CHARSET:
498+
elif op is CATEGORY:
499499
return [(op, av)]
500500
elif op is BRANCH:
501501
charset = []

0 commit comments

Comments
 (0)