
👌 fix quadratic complexity in fragments_join #389

Open
petricevich wants to merge 1 commit into executablebooks:master from petricevich:fragments_join_worst_case_n_squared_fix

Conversation

@petricevich

When emphasis/strikethrough postprocessing leaves a long run of adjacent text tokens (e.g. lots of intraword `_` characters that can't open or close emphasis), the old code merged them pairwise:

```python
state.tokens[curr + 1].content = state.tokens[curr].content + state.tokens[curr + 1].content
```

That's quadratic in the size of the run because every step rebuilds the growing prefix. Switched it to collect the run into a list and "".join once into the last token, which keeps the existing semantics (last token of the run is the one preserved, level is unchanged inside a run because text tokens have nesting=0).
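The two strategies can be sketched side by side with a minimal stand-in for markdown-it-py's token objects (`SimpleToken` here is a hypothetical simplification; real tokens carry more attributes such as `markup` and `level`):

```python
# Hypothetical minimal stand-in for markdown-it-py's Token class,
# used only to illustrate the complexity difference.
class SimpleToken:
    def __init__(self, type_, content):
        self.type = type_
        self.content = content

def join_run_quadratic(tokens):
    """Old behavior: pairwise concatenation, O(n^2) in run length."""
    for curr in range(len(tokens) - 1):
        # Each step copies the whole accumulated prefix again.
        tokens[curr + 1].content = tokens[curr].content + tokens[curr + 1].content
        tokens[curr].content = ""
    return tokens[-1]

def join_run_linear(tokens):
    """New behavior: collect the run into a list and join once."""
    parts = [tok.content for tok in tokens]
    for tok in tokens[:-1]:
        tok.content = ""
    # The last token of the run is the one preserved, so its
    # non-content attributes survive unchanged.
    tokens[-1].content = "".join(parts)
    return tokens[-1]
```

Both versions keep the last token of the run, so the observable result is identical; only the number of character copies differs.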

Tested on an adversarial ~190 KB document with ~30k intraword underscores on a single line. With tracemalloc running:

|        | render time | peak Python alloc |
|--------|-------------|-------------------|
| before | 2.2 s       | 4476 MB           |
| after  | 0.6 s       | 23 MB             |
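The asymptotic gap can also be illustrated without tracemalloc by counting characters copied (a simplified cost model assuming n single-character fragments, not the actual benchmark above):

```python
def chars_copied_pairwise(n):
    # Pairwise a + b rebuilds the growing prefix at every step,
    # copying 2 + 3 + ... + n characters for n one-char fragments.
    total = 0
    prefix = 0
    for _ in range(n):
        prefix += 1          # one new fragment of length 1 arrives
        if prefix > 1:
            total += prefix  # cost of rebuilding the concatenation
    return total

def chars_copied_join(n):
    # "".join copies each character exactly once.
    return n

# For ~30k fragments the pairwise strategy copies on the order of
# 450 million characters, while a single join copies ~30k.
```

This mirrors the measured numbers: the quadratic copying dominates both runtime and peak allocation once runs grow into the tens of thousands.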

It's not just a contrived attack input: this kind of thing also shows up naturally in markdown produced by OCR pipelines, where tables of identifiers and references can easily contain very long runs of underscores or other delimiter characters.

Existing tests still pass.

When emphasis/strikethrough postprocessing leaves a long run of adjacent
text tokens (e.g. unmatched intraword `_` delimiters), fragments_join
merged them via pairwise `a + b` concatenation. Each step rebuilds the
growing prefix, costing O(L*k) for a run of k tokens with total content length L.

Walk the whole run once, collect content into a list, and "".join into
the last token, making the work O(L). The kept token is still the last
in the run so its non-content attributes (markup, etc.) are preserved.
@codecov

codecov Bot commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.81%. Comparing base (8933147) to head (f142dc8).

Additional details and impacted files
```diff
@@            Coverage Diff             @@
##           master     #389      +/-   ##
==========================================
+ Coverage   95.80%   95.81%   +0.01%
==========================================
  Files          64       64
  Lines        3457     3467      +10
==========================================
+ Hits         3312     3322      +10
  Misses        145      145
```
| Flag    | Coverage Δ                     |
|---------|--------------------------------|
| pytests | 95.81% <100.00%> (+0.01%) ⬆️   |

Flags with carried forward coverage won't be shown.


@chrisjsewell
Member

Thanks, will double check soon, but sounds good in principle
