Decompose god module into domain-specific modules #246

Open
Wolfvin wants to merge 2 commits into jaraco:main from Wolfvin:refactor/decompose-god-module

Wolfvin commented Jun 13, 2026

Summary

The entire inflect library lived in a single __init__.py file (4,003 lines) containing ~2,000 lines of data tables and a single engine class with 66 methods. This PR decomposes it into domain-specific packages for maintainability.

What Changed

Before: 1 file (4,003 lines) containing everything.
After: 19 files across 4 packages, each with a single responsibility.

New structure:

  • inflect/__init__.py - thin re-export layer (~83 lines)
  • inflect/_shared.py - helpers, types, exceptions
  • inflect/engine.py - engine class using mixins
  • inflect/data/ - plurals, singulars, pronouns, articles, numbers
  • inflect/methods/ - pluralize, singularize, articles, compare, ordinals, number_words, participles, inflection, user_defined

Verification

Regrets Cluster Fingerprints (8 clusters, ALL MATCH)

| Cluster | Fingerprint |
|---------|-------------|
| plural-noun-default | 3g9kow3 |
| plural-noun-classical | 14wo35l |
| singular-noun-default | 2x9ysi3 |
| indef-article | 481khtv |
| ordinal-number | 3r6c4fw |
| number-to-words | 5txohe1 |
| present-participle | 48cg6nu |
| compare-words | 5yrv7tm |

Regrets Chain Hashes (3 chains, ALL MATCH)

| Chain | Hash |
|-------|------|
| pluralize-then-singularize | 56y7rnd |
| number-then-ordinal | 3d4zfjx |
| article-then-plural | 48ozq1v |

Test Suite

207 passed, 16 xfailed (identical to before refactoring).

Drift Detection

5 runs per cluster, all STABLE with zero drift.
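
For readers unfamiliar with fingerprint-based checks, the general idea can be sketched as follows. This is not how Regrets computes its fingerprints (the PR does not describe that). It is a minimal illustration, assuming a fingerprint is a short hash over the outputs of a fixed input set. `fingerprint`, `is_stable`, and the two `naive_plural_*` functions are hypothetical names invented for this example.

```python
import hashlib
import json


def fingerprint(func, inputs):
    """Hash the outputs of func over a fixed input list (illustrative only)."""
    outputs = [func(item) for item in inputs]
    payload = json.dumps(outputs, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def is_stable(func, inputs, runs=5):
    """Drift check: the fingerprint must be identical on every run."""
    return len({fingerprint(func, inputs) for _ in range(runs)}) == 1


# Two hypothetical implementations standing in for "before" and "after".
def naive_plural_before(word):
    return word + "es" if word.endswith(("s", "x")) else word + "s"


def naive_plural_after(word):
    suffix = "es" if word.endswith(("s", "x")) else "s"
    return word + suffix


SAMPLE = ["cat", "box", "bus"]
baseline = fingerprint(naive_plural_before, SAMPLE)
```

A refactor is behavior-preserving on the sampled inputs when `fingerprint(naive_plural_after, SAMPLE) == baseline`, and `is_stable` guards against nondeterministic output between runs.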

Design

  • Mixin pattern for method modules
  • Data modules have no dependency on method modules
  • Lazy initialization for derived data
  • Zero behavioral changes
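
The design points above can be illustrated with a small sketch. It is not the code in this PR: the table contents, the mixin class names, and the `_irregular_singulars` attribute are assumptions made for illustration, and the suffix rules are deliberately simplistic. It shows a data table that knows nothing about the methods, two method mixins, derived data built lazily with `functools.cached_property`, and an engine class that only composes the mixins.

```python
from functools import cached_property

# Stand-in for a data module (something like inflect/data/plurals.py).
# The contents are illustrative, not the library's real tables.
IRREGULAR_PLURALS = {"child": "children", "ox": "oxen", "mouse": "mice"}


class PluralizeMixin:
    """Hypothetical method mixin; it imports data, never the reverse."""

    def plural_noun(self, word):
        if word in IRREGULAR_PLURALS:
            return IRREGULAR_PLURALS[word]
        if word.endswith(("s", "x", "ch", "sh")):
            return word + "es"
        return word + "s"


class SingularizeMixin:
    """Hypothetical method mixin with lazily derived data."""

    @cached_property
    def _irregular_singulars(self):
        # Derived table: built on first access, then cached on the instance.
        return {plural: singular for singular, plural in IRREGULAR_PLURALS.items()}

    def singular_noun(self, word):
        if word in self._irregular_singulars:
            return self._irregular_singulars[word]
        if word.endswith("es") and word[:-2].endswith(("s", "x", "ch", "sh")):
            return word[:-2]
        if word.endswith("s"):
            return word[:-1]
        return word


class engine(PluralizeMixin, SingularizeMixin):
    """Composes the mixins; holds no inflection logic of its own."""
```

With this layout, `engine().plural_noun("child")` resolves through `PluralizeMixin`, and the reverse lookup table is only constructed the first time `singular_noun` is called on an instance.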

Wolfvin added 2 commits June 13, 2026 19:49
This refactors the 4003-line inflect/__init__.py god module into 8
domain-specific modules while preserving the exact same public API.

## What was refactored

The single 4003-line __init__.py contained 73 functions, 11 classes,
and ~1800 lines of module-level data tables mixing 7 distinct domains.
This made the module difficult to navigate, understand, and maintain.

## Decomposition

| Module | Lines | Domain |
|--------|-------|--------|
| utils.py | 172 | Exception classes, utility functions, Words/Word types |
| plurals.py | 2265 | Noun plural/singular data + _plnoun, _sinoun |
| verbs.py | 226 | Verb pluralization data + _pl_special_verb |
| adjectives.py | 72 | Adjective data + _pl_special_adjective |
| articles.py | 117 | Indefinite article regex/data + _indef_article |
| numbers.py | 429 | Number-to-words/ordinals data + ordinal, number_to_words |
| comparisons.py | 106 | Comparison logic + _plequal |
| template.py | 625 | Template engine + inflect, postprocess, get_count |

The __init__.py was reduced from 4003 to 712 lines — it now contains
only the engine class with thin wrapper methods that delegate to domain
modules, plus re-exports of all public names for backward compatibility.

## Verification

All 4 verification methods confirm the refactor is behavior-preserving:

1. Regrets cluster validation: 13/13 GREEN
2. Raw output vs KEBENARAN 1 (pre-refactor truth): IDENTICAL
3. Fingerprint vs KEBENARAN 2 (pre-refactor fingerprint): ALL MATCH
4. Chain hash validation: 3/3 MATCH

All 207 existing tests pass (16 xfailed, same as before).

## Design decisions

- Engine-as-parameter pattern: Domain functions that need engine state
  receive the engine instance as their first parameter
- Full re-exports: All public names accessible from top-level inflect
- Data co-located with logic: Each module contains its data tables
  and the functions that operate on them
The entire inflect library lived in a single __init__.py file (4,003 lines)
containing ~2,000 lines of data tables and a single engine class with 66
methods. This made the codebase difficult to navigate, understand, and
maintain.

Changes:
- Split data tables into inflect/data/ (plurals, singulars, pronouns,
  articles, numbers)
- Split methods into inflect/methods/ (pluralize, singularize, articles,
  compare, ordinals, number_words, participles, inflection, user_defined)
- Engine class now uses mixin pattern, inheriting from method mixins
- Shared helpers and exceptions moved to inflect/_shared.py
- __init__.py reduced to thin re-export layer (~83 lines)
- Zero behavioral changes — all 207 existing tests pass
- All public API preserved (engine class, exceptions, helper functions)

Verification:
- 8 Regrets clusters: ALL GREEN (identical fingerprints)
- 3 Regrets chains: ALL MATCH (identical chain hashes)
- Drift detection: ALL STABLE (5 runs, zero drift)
- 207 pytest tests: ALL PASS (16 xfailed as before)