gh-150942: Speed up re.findall and re.sub/subn result building #150943

Open
eendebakpt wants to merge 2 commits into python:main from eendebakpt:re-takeref-opt

@eendebakpt eendebakpt commented Jun 5, 2026

Append result items to the output list with _PyList_AppendTakeRef instead of PyList_Append followed by Py_DECREF, removing a reference-count round-trip per appended item (and a per-append lock on the free-threaded build).
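
The difference between the two append patterns can be pictured with a toy model. This is only a sketch under stated assumptions: `Obj`, `append_then_decref`, and `append_take_ref` are hypothetical names invented for illustration, the refcount is simulated in plain Python, and the free-threaded lock is not modelled. It is not the code in this PR.

```python
# Toy model of reference counting; NOT CPython's implementation.
class Obj:
    """Hypothetical stand-in for an object with an explicit refcount."""

    def __init__(self):
        self.refcnt = 1  # the creator owns one reference
        self.ops = 0     # number of refcount modifications so far

    def incref(self):
        self.refcnt += 1
        self.ops += 1

    def decref(self):
        self.refcnt -= 1
        self.ops += 1


def append_then_decref(items, obj):
    """Models PyList_Append(list, obj) followed by Py_DECREF(obj)."""
    obj.incref()       # the list takes its own new reference
    items.append(obj)
    obj.decref()       # the caller drops the reference it held


def append_take_ref(items, obj):
    """Models an append that takes over the caller's reference."""
    items.append(obj)  # ownership moves to the list; refcount untouched


a, b = Obj(), Obj()
out_a, out_b = [], []
append_then_decref(out_a, a)
append_take_ref(out_b, b)

print(a.refcnt, a.ops)  # 1 2
print(b.refcnt, b.ops)  # 1 0
```

Both objects end up owned once by their list, but the first pattern touches the refcount twice per item while the second does not touch it at all.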

Microbenchmarks

| Benchmark | main | this PR | speedup |
|---|---|---|---|
| re_sub | 1.93 ms | 1.73 ms | 1.11× |
| re_subn | 1.92 ms | 1.73 ms | 1.11× |
| re_findall | 1.89 ms | 1.80 ms | 1.05× |
| re_findall_groups | 4.01 ms | 3.87 ms | 1.04× |
| geomean | | | 1.06× |
Benchmark script
"""Microbenchmark re.findall / split / sub / subn (result-list building)."""
import re
import pyperf

WORDS = "the quick brown fox jumps over the lazy dog " * 2000
CSVISH = "field1,field2,field3,field4,field5,field6,field7,field8\n" * 3000

word_re = re.compile(r"\w+")
group_re = re.compile(r"(\w)(\w+)")
split_re = re.compile(r"[,\n]")
sub_re = re.compile(r"\w+")


def b_findall():        return word_re.findall(WORDS)
def b_findall_groups(): return group_re.findall(WORDS)   # list of tuples
def b_sub():            return sub_re.sub("X", WORDS)
def b_subn():           return sub_re.subn("X", WORDS)


if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.bench_func("re_findall", b_findall)
    runner.bench_func("re_findall_groups", b_findall_groups)
    runner.bench_func("re_sub", b_sub)
    runner.bench_func("re_subn", b_subn)
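
For orientation, the four benchmarked calls return results of the following shapes. This is a small sketch on a short sample string (`text` is an illustrative input, not the benchmark data); it uses the same patterns as the script above.

```python
import re

text = "the quick brown fox"  # illustrative sample, not the benchmark input

words = re.findall(r"\w+", text)         # one string per match
pairs = re.findall(r"(\w)(\w+)", text)   # one tuple of groups per match
replaced = re.sub(r"\w+", "X", text)     # the substituted string
replaced_n = re.subn(r"\w+", "X", text)  # (substituted string, count)

print(words)       # ['the', 'quick', 'brown', 'fox']
print(pairs)       # [('t', 'he'), ('q', 'uick'), ('b', 'rown'), ('f', 'ox')]
print(replaced)    # X X X X
print(replaced_n)  # ('X X X X', 4)
```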

Append result items to the output list with _PyList_AppendTakeRef instead
of PyList_Append followed by Py_DECREF, removing a reference-count
round-trip per appended item (and a per-append lock on the free-threaded
build).  Applied to the result lists built by findall and the sub/subn
helper, where the append is a meaningful share of the work.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@sergey-miryanov sergey-miryanov left a comment

Looks good to me
