refactor(packaging): single source of truth for lambda shared libraries (uv workspace)#4933
refactor(packaging): single source of truth for lambda shared libraries (uv workspace)#4933Austin-s-h wants to merge 33 commits into
Conversation
Unify Python dependency management across the repo's many packages: * Introduce a root uv workspace (root pyproject.toml + consolidated uv.lock). py-shared and the lambda packages that depend on the shared libraries become workspace members and resolve from the single root lockfile, replacing the previous archive-URL pins of quilt-shared / t4-lambda-shared. Their per-package uv.lock files are removed. * Drop Python 3.9/3.10. quilt3 SDK requires >=3.11; everything else requires >=3.13. Update classifiers, .python-version files, and the py-ci test matrix accordingly. * Relax/modernize dependency pins across all packages and regenerate every lockfile. * Make the lambda build pipeline workspace-aware: build_zip.sh now takes REPO_ROOT/FUNCTION_DIR/PACKAGE_PATH, Dockerfiles copy workspace members, and deploy-lambdas.yaml / py-ci.yml export requirements via the new .github/scripts/python_packaging.py helper (package-name, uses-workspace, install-targets, guardrails). Pure packaging change: no feature/runtime code. Validated with `uv lock --check` across the root workspace and every standalone member. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #4933 +/- ##
=======================================
Coverage 46.50% 46.50%
=======================================
Files 832 832
Lines 34090 34092 +2
Branches 5833 5833
=======================================
+ Hits 15854 15856 +2
Misses 16237 16237
Partials 1999 1999
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Workspace consolidation made the lambdas resolve quilt3 and shared deps from local source, which surfaced several real breakages: - Re-export CHECKSUM_MULTIPART_THRESHOLD / CHECKSUM_MAX_PARTS / get_checksum_chunksize / is_mpu from quilt3.data_transfer. They moved to quilt3.checksums, but pkgpush and s3hash (pinned `quilt3 >=7,<8`) still import them from data_transfer; the workspace made them use local source, breaking collection. Keeps the public surface stable. - shared: restore fcsparser~=0.2.1 in the preview extra. The branch had swapped it for flowio in the dependency list, but preview.py still imports fcsparser (the flowio migration lands with its source change in the preview-features PR). Fixes ModuleNotFoundError in shared/preview/indexer. Re-locked. - indexer: nbformat>=5.10 rejects a notebook missing top-level "metadata" with ValidationError rather than the old AttributeError; update test_read_notebook to accept either (the malformed fixture is still correctly rejected). - preview: widen the ipynb-chop length tolerance (300->600) to absorb nbconvert template drift; still asserts the ~89KB notebook was chopped to ~18KB. - Regenerate quilt3/_graphql_client with the bumped ariadne-codegen (gql-check). - lint-rest: convert isinstance tuples to X | Y (UP038) in manifest_indexer, shared.decorator, and thumbnail — newly flagged now that requires-python is >=3.13. - deploy-lambdas: pick the Docker build context via the uses-workspace helper instead of hard-coding `indexer`, so future workspace-member lambdas work. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… swap The flowio->fcsparser change in lambdas/shared's preview extra is embedded in the t4-lambda-shared metadata of every standalone lambda lock that path-depends on it. Those locks still referenced `flowio` under `extra == 'preview'`, so the `uv export --locked` step failed with "lockfile needs to be updated" for thumbnail, transcode, tabular_preview, access_counts, and pkgevents. Re-locked all five so they carry the fcsparser entry. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The pdf2image -> pypdfium2 swap belongs with the pdf_thumbnail.py rewrite in the preview-lambda feature PR, not in packaging. Restore pdf2image so the unchanged source (`import pdf2image`) resolves and the test module collects. Adjust the czi xfail markers for the bioio 3.3 / numpy 2.4 bump carried by this lock: c1_bgr48 now surfaces a Pillow TypeError, rgb-image renders (golden thumbnail absent), and c2_gray8_gray16 renders with a different normalization. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
CI revealed the bioio 3.0->3.3 / dask 2026.1->2026.3 bump that the relaxed
constraints resolved to requires the _format_n_dim_ndarray source rewrite
(eager np.asarray to dodge a bioio-tifffile/dask lazy-transpose bug) that
lives in the preview-lambda feature PR, not here. Without it cell.tiff 500s
("axes don't match array") and c2_gray8_gray16 renders differently.
Reset the thumbnail uv.lock to master's tested imaging tier (uv won't upgrade
unprompted) and revert the test xfail markers to master. This keeps the
packaging PR behavior-neutral; the imaging-stack bump + source fix + marker
changes stay together in the feature PR.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… >=3.10
Dropping Python 3.9 from gendocs exposes a latent crash in the docs build:
TypeError: <classmethod(...)> is not a callable object (pydocmd loader.py)
pydoc-markdown imports each documented object via `cls.__dict__[name]` (to
avoid triggering descriptors), so methods arrive as the raw classmethod /
staticmethod *descriptor*. Its `get_function_signature` checks
`hasattr(obj, "__name__")` before its `isinstance(obj, classmethod)` branch.
On Python 3.9 a classmethod descriptor had no `__name__`, so the unwrap branch
ran; on Python >=3.10 the descriptor gained `__name__`, so the loader skips the
unwrap and calls `inspect.signature()` on a non-callable descriptor and aborts
the whole build. This affects every @classmethod in the API (e.g.
Package.install), independent of any other decorator.
Fix it where it belongs — in gendocs — by wrapping pydocmd's
`get_function_signature` to unwrap descriptors up front, binding a classmethod
to its owner so the rendered signature drops the implicit `cls` (matching the
Python 3.9 output and the committed docs).
- Bump gendocs requires-python to >=3.11 and pin 3.12 via .python-version so CI
actually exercises the affected path and the doc output is deterministic.
- Regenerate cli.md (argparse renamed "optional arguments:" -> "options:" in
Python 3.10).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…chore/uv-packaging-modernization # Conflicts: # gendocs/uv.lock
c12dbd1 to
187e9a9
Compare
Address review feedback on the gendocs descriptor fix: - Cap nbconvert at <9 so a future `uv lock --upgrade` can't silently pull a major-version break (resolved version is unchanged at 7.17.1). - Add a comment explaining the classmethod branch: bind to owner_class to drop the implicit `cls` (matching py3.9 output), falling back to __func__ only when no owner is known — which is unreachable for documented API methods since pydocmd always passes owner_class. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eep ours otherwise Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
@Greptile please re-review |
|
@copilot Please thoroughly review this PR, including updating the language about Python support. 3.9 has been dropped but 3.10 is still supported. |
Addressed in 1f975a8. I reviewed the PR and updated the changelog language to clarify that only Python 3.9 was dropped and |
…g-modernization' into chore/uv-packaging-modernization
|
@drernie @sir-sigurd This should be ready to review! |
sir-sigurd
left a comment
There was a problem hiding this comment.
(AI-assisted analysis; conclusions are my own — sanity-check the specifics.)
The Python-floor handling looks right now (quilt3 staying >=3.10), and the indexer Dockerfile root-COPY is sorted. My one remaining question is the central design choice: the uv workspace trades away independent per-lambda versioning, and I'm not sure that's what we want for independently-deployed lambdas.
Today each lambda pins its own version of quilt-shared/t4-lambda-shared and resolves its own third-party deps in its own lockfile. For services that deploy independently, that independence seems like the property we want: each can bump or hold a shared-lib/dependency version on its own schedule — security fixes, divergent needs, isolated blast radius — and staged migration of the shared libs is possible (one consumer moves, others don't have to).
A uv workspace gives that up. A single shared lock means one version of every dependency across all members (no divergence without evicting a member from the workspace), and all members track the in-tree shared libs rather than a pinned version (so no per-lambda version, no staged migration). That's not a side detail — it's the opposite of how these deploy today.
And I'm not sure there's a problem here that needs it. The current archive/<sha>.zip pins work and are reproducible; their only real cost is co-development friction (you must push the shared lib before a consumer can pin a new version) — but that's the inherent, accepted price of versioned shared code, and it's mitigable (point at a local path while iterating, re-pin the tag to commit).
So the question: what concrete problem does the workspace solve that justifies giving up independent per-lambda versioning? If one-version lockstep across these lambdas is a deliberate goal, that's legitimate — but it should be stated and argued, because it inverts today's model. If it isn't a goal, the current versioned approach already matches what we want; the most I'd change is tidying the drifting archive/<sha>.zip pins into deliberate git/tag pins (stability + legibility), keeping per-lambda version control.
Not blocking — just that this PR sets the dependency-management model the rest of the series inherits, so it's worth settling deliberately rather than defaulting to the workspace.
|
yeah my thoughts exactly: not sure what are we trying to solve here. |
|
cc @drernie Appreciate the pushback — it's the right thing to settle here, and it pushed me to quantify what this is actually for. "chore: modernize packaging" undersells it, and I think that framing is why the intent isn't landing. Let me reframe around the problem and back it with numbers from our own history. The problem: shared-space changes land inconsistently across lambdasShared code is pulled in per-consumer via opaque
The #4930/#4931 pair from last week is the mechanism in miniature: editing py-shared (#4930) needed a separate same-day follow-up (#4931) to re-pin just the iceberg consumer — and the other five were left on older SHAs, which is exactly why three pins coexist. What the modernization accomplishesThe workspace isn't "merge everything into lockstep" — it's an opt-in shared space with a consistent git state, and it directly removes the churn above:
The tradeoff I'm not hidingWorkspace members do share one third-party resolution, and a few deps differ across them today (e.g. indexer on Worth noting the same per-component fan-out shows up beyond shared code: the Dec-2025 uv/Python-3.13 migration shipped as nine separate PRs (#4618, #4647, #4649–#4655), and routine security bumps fan out across the per-lambda lockfiles (idna→3.15 = 14 PRs #4904–#4927, urllib3→2.7.0 = 11 PRs, aiohttp→3.14 = 5 PRs). A single shared lock coalesces a lot of that. What I'm genuinely open toI care more about killing the drift than about the workspace specifically, so these are real questions, not rhetorical ones — happy to be steered on any of them:
Does scoping it down one of these ways address the independence concern, or do you read the per-component drift differently than I'm seeing it? Happy to rework in whatever direction gets this to "obviously right" for the core team. |
|
@greptile re-review |
|
@sir-sigurd @nl0 @drernie Curious if you have had a chance to discuss this yet? I don't want to base future work off of it unless it is an accepted path. I do think it would largely address #4869 type of behavior on the python side. |
Description
Lambdas consume two first-party shared libraries —
quilt-shared(py-shared) andt4-lambda-shared(lambdas/shared) — but each one pins them independently via opaquearchive/<sha>.zipURLs, with nothing keeping those pins coherent. This PR gives the shared libraries a single source of truth so a shared-code change lands consistently across consumers, while keeping per-lambda independence a deliberate, opt-in choice rather than an accident of drift.The problem this solves
Per-consumer SHA pins drift, in two opposite directions, on
mastertoday:quilt-sharedfragments — it's pinned to several different commits at once across its consumers, so they're not on a common version of the same library. This is functional, not cosmetic: CRC64NVME checksum support (shared: Add CRC64NVME checksum type support #4623) ships only in the consumers that happened to re-pin, and is silently absent from the rest.t4-lambda-sharedfreezes — every consumer is stuck on a single Dec-2024 commit (update lambdas to Python 3.11 #4241) while the library itself has moved on repeatedly (BSQIK/Search: Indexing #4422, enable UP0{15,30,33,34,39} in ruff #4606, Add QuiltSync and Benchling Webhook documentation #4640, shared: migrate to uv and Python 3.13 #4652, build(deps): bump pyarrow from 18.1.0 to 23.0.1 in /lambdas/shared #4951). The shared lib's own uv/Python-3.13 migration (shared: migrate to uv and Python 3.13 #4652) has reached zero consumers.[tool.uv.sources]table, and the dependency key is spelled botht4_lambda_sharedandt4-lambda-shared.Propagating one shared change today means hand-editing the SHA in up to ~9 consumers; the #4930/#4931 pair last week is the mechanism in miniature (edit py-shared, then a separate follow-up to re-pin a single consumer).
The approach: an opt-in shared space
pyproject.toml+ consolidated rootuv.lock; members resolve the shared libs from one coherent in-tree source — drift becomes structurally impossible for anything in the shared space, and a shared change is one edit instead of N.path = "../shared"). Joining the shared space is a choice, not a mandate.uv.lockand>=3.10floor and is referenced only as a path dependency — preserving the quilt3-vs-lambdas cadence split.uses-workspacehelper that reads the members list from the rootpyproject.tomlat runtime, so the zip and ECR paths branch generically (no per-lambda special-case).quilt3drops Python 3.9 and continues to support 3.10+ (requires-python = ">=3.10"). Workspace-rooted lambda/shared packaging is on the higher baseline (generally>=3.13, withlambdas/sharedat>=3.12).Open design question
This PR sets the dependency-management model the rest of the series inherits, so it's worth settling deliberately. The core team has reasonably noted that per-component isolation is a deliberate choice for differing change cadences — see the discussion thread for the tradeoff and the alternatives I'm open to (narrowing workspace membership, or keeping per-consumer pins but replacing the opaque SHAs with enforced
git + tagpins). Feedback on scope very welcome.Validation
uv lock --checkacross the root workspace and standalone membersuses-workspaceclassification verified for every deploy-matrix entry (member vs standalone), both zip and ECR pathsTODO
optipngon any new PNGsbuild.pyfor new docstringsGreptile Summary
This PR replaces per-consumer SHA-pinned archive URLs for
quilt-sharedandt4-lambda-sharedwith a uv workspace at the repo root, eliminating version drift across ~9 lambda consumers. Eight packages join the workspace (py-shared,lambdas/shared, and six lambdas); standalone lambdas that need independence retain their own lock files withpath = \"../shared\"local sources instead of opaque ZIP URLs.pyproject.tomldefines the workspace;.github/scripts/python_packaging.pyenforces the dependency contract at CI time and drives workspace-awareuv export/ install-target selection in both the zip and ECR build paths.py-ci.ymlandbuild_zip.shbranch onuses-workspaceto choose the correctuv exportinvocation; the ECR workflow selects build context (repo root vs lambda dir) using the same helper.tabular_previewandthumbnailhave a build-breaking gap: both are non-workspace-member (so the ECR workflow uses./lambdas/<name>as the Docker build context), but theirpyproject.toml/uv.locknow declaret4-lambda-shared = { path = \"../shared\" }— a directory that is outside the build context. Bothuv synccalls in the Dockerfile will fail to resolve this path, breaking ECR deploys for those two lambdas.Confidence Score: 4/5
Safe to merge for zip-deployed lambdas and the indexer; the tabular_preview and thumbnail ECR builds will fail on the next push to master that touches the lambdas/ tree.
The workspace setup, guardrails script, zip build path, and indexer Dockerfile are all internally consistent. The gap is that tabular_preview and thumbnail switched from URL sources to path = ../shared without joining the workspace, so the ECR workflow keeps building them from their own directory — a context that does not contain lambdas/shared. Both uv sync stages in each Dockerfile will fail to resolve the path dependency.
lambdas/tabular_preview/Dockerfile and lambdas/thumbnail/Dockerfile — both need either promotion to workspace members (so the build context becomes the repo root) or explicit COPY directives for lambdas/shared.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD ROOT["Root pyproject.toml\n(uv workspace)"] subgraph WS["Workspace Members"] PS["py-shared (quilt-shared)"] LS["lambdas/shared (t4-lambda-shared)"] ESI["lambdas/es_ingest"] ICE["lambdas/iceberg"] IDX["lambdas/indexer (ECR)"] MI["lambdas/manifest_indexer"] PKP["lambdas/pkgpush + quilt3 path dep"] S3H["lambdas/s3hash + quilt3 path dep"] end subgraph SA["Standalone (path dep)"] AC["lambdas/access_counts"] PKE["lambdas/pkgevents"] PRV["lambdas/preview"] TP["lambdas/tabular_preview (ECR)"] TH["lambdas/thumbnail (ECR)"] TR["lambdas/transcode"] end ROOT --> WS LS --> SA subgraph CI["CI / Deploy"] ZIP["build_zip.sh"] ECR["docker buildx"] end WS --> ZIP WS --> ECR SA --> ZIP SA --> ECR ECR -. "context outside build context" .-> TP ECR -. "context outside build context" .-> THReviews (3): Last reviewed commit: "perf: lint" | Re-trigger Greptile