[python] Fix FileSystemBranchManager from-tag and fast-forward path computation #7756
Merged
JingsongLi merged 3 commits into apache:master on May 3, 2026
FileSystemBranchManager._copy_with_branch returned a SnapshotManager
that still pointed at the main-branch snapshot directory, so
create_branch(tag_name=...) and fast_forward hit copy_file(src, dst)
with src == dst and raised SameFileError.
Mirror Java SnapshotManager.copyWithBranch (utils/SnapshotManager.java)
on the Python side: SnapshotManager now carries an explicit branch
field and a copy_with_branch factory, so its snapshot_dir resolves to
{table_path}/branch/branch-{name}/snapshot for non-main branches. The
dispatch in _copy_with_branch then delegates to the per-manager
copy_with_branch factories.
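The path rule described above can be sketched as follows. This is a minimal illustration, not the pypaimon implementation: the class name SnapshotManagerSketch, the standalone branch_path helper, and the snapshot-{id} file name are assumptions made for the example; only the {table_path}/branch/branch-{name}/snapshot layout and the copy_with_branch factory name come from the commit message.

```python
from typing import Optional

MAIN_BRANCH = "main"


def branch_path(table_path: str, branch: Optional[str]) -> str:
    """Return the root directory for a branch (hypothetical helper).

    The main branch lives directly under the table path; any other
    branch lives under {table_path}/branch/branch-{name}.
    """
    if branch is None or branch == MAIN_BRANCH:
        return table_path
    return f"{table_path}/branch/branch-{branch}"


class SnapshotManagerSketch:
    """Illustrative stand-in for a branch-aware snapshot manager."""

    def __init__(self, table_path: str, branch: Optional[str] = None):
        self.table_path = table_path.rstrip("/")
        self.branch = branch or MAIN_BRANCH

    @property
    def snapshot_dir(self) -> str:
        # Every path accessor flows through branch_path, so a rebranched
        # manager can never resolve to the main-branch directory.
        return f"{branch_path(self.table_path, self.branch)}/snapshot"

    def get_snapshot_path(self, snapshot_id: int) -> str:
        # The snapshot-{id} file name is an assumption for this sketch.
        return f"{self.snapshot_dir}/snapshot-{snapshot_id}"

    def copy_with_branch(self, branch_name: str) -> "SnapshotManagerSketch":
        # Build a fresh manager instead of mutating fields on this one.
        return SnapshotManagerSketch(self.table_path, branch_name)
```

With this shape, a manager for the main branch and its copy for a branch named dev resolve to different directories, which is the property the fix relies on.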
Adds SnapshotManagerBranchAwarenessTest for the path computation and
FileSystemBranchManagerEndToEndTest for the from-tag and fast-forward
happy paths that previously raised SameFileError.
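The failure mode itself is easy to reproduce outside Paimon. The sketch below assumes that the local copy ultimately goes through Python's shutil.copyfile, which raises shutil.SameFileError when source and destination are the same file; the copy_snapshot and demo helpers and the temporary directory layout are hypothetical and exist only for this illustration.

```python
import os
import shutil
import tempfile


def copy_snapshot(src: str, dst: str) -> str:
    """Copy one file; stands in for the copy_file(src, dst) call (assumption)."""
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.copyfile(src, dst)
    return dst


def demo() -> tuple:
    """Return (same_path_failed, branch_copy_exists)."""
    with tempfile.TemporaryDirectory() as table_path:
        main_snapshot = os.path.join(table_path, "snapshot", "snapshot-1")
        os.makedirs(os.path.dirname(main_snapshot))
        with open(main_snapshot, "w") as f:
            f.write("{}")

        # Buggy case: the "branch" manager still resolves to the main
        # directory, so source and destination are the same file.
        try:
            copy_snapshot(main_snapshot, main_snapshot)
            same_path_failed = False
        except shutil.SameFileError:
            same_path_failed = True

        # Fixed case: the destination lives under the branch directory.
        branch_snapshot = os.path.join(
            table_path, "branch", "branch-dev", "snapshot", "snapshot-1"
        )
        copy_snapshot(main_snapshot, branch_snapshot)
        return same_path_failed, os.path.exists(branch_snapshot)
```

Calling demo() exercises both cases in one run: the same-path copy is rejected, and the branch-directory copy lands on disk.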
Mirror Java SnapshotLoaderImpl.copyWithBranch (tag/SnapshotLoaderImpl.java): when SnapshotManager.copy_with_branch produces a branch-aware manager, swap its loader for one whose Identifier carries the new branch, so catalog-backed snapshot loads target the requested branch instead of falling back to the main-branch identifier. On the FileSystemCatalog path the loader is None and this is inert.
Use get_database_name() / get_table_name() instead of reaching into the .database / .object fields, to match Java SnapshotLoaderImpl's identifier.getDatabaseName() / identifier.getTableName() calls.
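The loader rebranching in the two commits above might look roughly like this. It is a sketch under stated assumptions: IdentifierSketch, SnapshotLoaderSketch, and rebranch_loader are hypothetical stand-ins, and the field name table replaces the real identifier's object field for readability. Only the accessor names get_database_name() / get_table_name() and the copy_with_branch name come from the commit messages.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class IdentifierSketch:
    """Hypothetical identifier carrying database, table, and branch."""
    database: str
    table: str
    branch: Optional[str] = None

    def get_database_name(self) -> str:
        return self.database

    def get_table_name(self) -> str:
        return self.table


@dataclass(frozen=True)
class SnapshotLoaderSketch:
    """Hypothetical loader: a catalog loader plus the identifier it targets."""
    catalog_loader: Any
    identifier: IdentifierSketch

    def copy_with_branch(self, branch: str) -> "SnapshotLoaderSketch":
        # Same catalog loader, new identifier carrying the requested branch.
        # The accessors are used instead of reaching into the fields.
        return SnapshotLoaderSketch(
            self.catalog_loader,
            IdentifierSketch(
                self.identifier.get_database_name(),
                self.identifier.get_table_name(),
                branch,
            ),
        )


def rebranch_loader(loader: Optional[SnapshotLoaderSketch], branch: str):
    """Swap the loader only when one is present; None stays None."""
    return None if loader is None else loader.copy_with_branch(branch)
```

The None branch in rebranch_loader corresponds to the FileSystemCatalog case described above, where no loader exists and the swap is a no-op.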
TheR1sing3un added a commit to TheR1sing3un/incubator-paimon that referenced this pull request on May 3, 2026
Switch SnapshotManager.__init__ from (table) to the Java-aligned (file_io, table_path, branch=None, snapshot_loader=None) so the class no longer depends on FileStoreTable. The new manager carries its own file_io / table_path / branch / snapshot_loader fields, mirroring paimon-core/.../utils/SnapshotManager.java. FileStoreTable.snapshot_manager() remains the canonical factory and now wires those four basics. All raw SnapshotManager(table) call sites across production and tests are migrated to table.snapshot_manager(). copy_with_branch is rewritten to construct the rebranched manager directly via the new constructor (no field-swap). Mock-style tests that patched the SnapshotManager class to intercept its instances now wire the mock through table.snapshot_manager.return_value, which matches how production code obtains its instance. Follow-up to apache#7756.
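The mock wiring mentioned at the end of that commit message can be illustrated with unittest.mock. This is a sketch only: latest_snapshot_id and build_mock_table are hypothetical helpers, and the get_latest_snapshot method and id attribute are assumed for the example rather than taken from the PR.

```python
from unittest.mock import MagicMock


def latest_snapshot_id(table):
    """Hypothetical production helper.

    It obtains the manager through the table's factory method rather than
    constructing SnapshotManager directly.
    """
    snapshot = table.snapshot_manager().get_latest_snapshot()
    return None if snapshot is None else snapshot.id


def build_mock_table(latest_id: int) -> MagicMock:
    """Wire the mock through table.snapshot_manager.return_value."""
    table = MagicMock()
    manager = table.snapshot_manager.return_value
    # get_latest_snapshot and the id attribute are assumptions here.
    manager.get_latest_snapshot.return_value = MagicMock(id=latest_id)
    return table
```

Because the code under test calls table.snapshot_manager(), configuring return_value on that attribute is enough; no class-level patch of SnapshotManager is needed.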
Purpose
Fix a pre-existing bug in FileSystemBranchManager._copy_with_branch that made create_branch(tag_name=...) and fast_forward raise SameFileError: the SnapshotManager it produced still pointed at the main-branch snapshot directory, so copy_file(src, dst) collapsed to src == dst.
The fix mirrors Java SnapshotManager.copyWithBranch (paimon-core/.../utils/SnapshotManager.java:89-95): the Python SnapshotManager now carries an explicit branch field and a copy_with_branch(branch_name) factory, so its snapshot_dir / get_snapshot_path(...) resolve to {table_path}/branch/branch-{name}/snapshot/... for non-main branches.
FileSystemBranchManager._copy_with_branch then dispatches to the per-manager factories (SnapshotManager.copy_with_branch / SchemaManager.copy_with_branch) instead of blindly reconstructing a main-branch SnapshotManager.
SnapshotLoader is rebranched in lockstep so REST-path catalog loads target the requested branch rather than falling back to the main-branch identifier (mirrors Java SnapshotLoaderImpl.copyWithBranch).
In scope
- pypaimon/snapshot/snapshot_manager.py: constructor accepts an optional branch argument (defaults to inheriting table.current_branch()); new branch field; branch-aware snapshot_dir / latest_file; new copy_with_branch(branch_name) factory that also rebranches the snapshot loader.
- pypaimon/snapshot/snapshot_loader.py: new copy_with_branch(branch) returning a loader whose Identifier carries the new branch (mirrors Java SnapshotLoaderImpl.copyWithBranch).
- pypaimon/branch/filesystem_branch_manager.py: _copy_with_branch dispatches to manager.copy_with_branch(branch) for SnapshotManager / SchemaManager (the latter already had the factory). TagManager is still rebuilt directly because the Python TagManager has no copy_with_branch factory yet.
- pypaimon/tests/branch_manager_test.py: two new test classes -- SnapshotManagerBranchAwarenessTest pins down the branch-aware path computation and the loader rebranching, and FileSystemBranchManagerEndToEndTest regresses both the from-tag happy path and the fast-forward happy path that previously raised SameFileError.
- pypaimon/tests/snapshot_manager_test.py: existing Mock() tables now explicitly set current_branch.return_value = "main", since the constructor now calls table.current_branch() for the default branch.
Out of scope
- filesystem_catalog_branch_test.py). That file is only on the #7755 branch ([python] Implement branch CRUD on FileSystemCatalog); once #7755 merges, the skipped tests can be re-enabled in a follow-up.
- FileSystemBranchManager._normalize_branch with BranchManager.normalize_branch -- out of diff, separate cleanup.
Tests
From paimon-python/:
The two new end-to-end tests in FileSystemBranchManagerEndToEndTest (test_create_branch_from_tag_lands_files_under_branch_dir and test_fast_forward_after_create_branch_from_tag) deterministically reproduce the SameFileError on master and pass on this branch. test_copy_with_branch_rebranches_snapshot_loader covers the new loader rebranching.
Anti-divergence checklist
- SnapshotManager.copy_with_branch matches Java SnapshotManager.copyWithBranch semantics: a fresh manager whose path accessors flow through BranchManager.branch_path(table_root, branch), plus a snapshot loader rebranched via SnapshotLoader.copy_with_branch(branch) when one is present.
- SnapshotLoader.copy_with_branch matches Java SnapshotLoaderImpl.copyWithBranch: same catalog_loader, new Identifier(database, object, branch).
- branch == "main" keeps existing paths bit-identical (branch_path(p, "main") == p), so all current SnapshotManager(table) call sites stay zero-regression.
- FileStoreTable.snapshot_manager() is unchanged; the new optional branch parameter defaults to table.current_branch(), which is already the production behavior.
Generative AI disclosure
Drafted with assistance from a generative AI tool. All code, tests, and Java alignment were reviewed and validated by the contributor.