source/git: recover from stale submodule cache after removal#6563
Conversation
|
This pull request likely fixes #4260 |
|
|
||
| func (gs *gitSourceHandler) updateSubmodules(ctx context.Context, git *gitutil.GitCLI) error { | ||
| args := []string{"submodule", "update", "--init", "--recursive", "--depth=1"} | ||
| if _, err := git.Run(ctx, args...); err != nil { |
There was a problem hiding this comment.
Isn't there some command we could just run in here to fix up the submodule state? Triggering fetch again on checkout is ugly and possibly with some security considerations. Or alternatively something we could run during fetch to make sure submodule state is still ok.
What prevents this from going to a loop?
There was a problem hiding this comment.
It will run only once. But thanks for pushing back on simply discarding the old snapshot as it prompted me to do some additional digging which revealed the likely root cause behind #4260
The git checkout tree-ish -- pathspec buildkit uses to populate the build snapshot defaults to overlay mode, so it won't delete objects which exist in the work tree but don't exist in the tree-ish. For ordinary file and directory objects, it's not a problem because the destination work tree is a freshly made empty directory. In the case of submodules, deleted or renamed gitlink references continue to exist in the index which is used by the subsequent git submodule update ... command and may cause it to fail.
With a a non-overlay checkout in the happy path, this new error handler becomes unnecessary. I plan to push an update to this PR soon.
| } | ||
|
|
||
| func isMissingSubmoduleURLError(err error) bool { | ||
| return strings.Contains(err.Error(), "No url found for submodule path") && |
There was a problem hiding this comment.
Are you sure this same error can't be just created by some repo configuration problem.
There was a problem hiding this comment.
Yes - creating an intentionally mangled gitmodules file or bogus gitlink references in a tree can also trigger this error.
The `git checkout tree-ish -- pathspec` command run by buildkit to populate the work tree defaults to overlay mode, so it won't delete objects which exist in the work tree but don't exist in the tree-ish. It's not a problem for ordinary file and directory objects because the work tree starts out empty, but in the case where submodules are deleted or renamed it will leave stale gitlink references in the index. The subsequent `git submodule update ...` command will then fail with an eror like 'No url found for submodule path ... in .gitmodules' as seen in moby#4260. Adding `--no-overlay` ensures that any deleted gitlink references are removed from the index before the submodule update runs. Signed-off-by: Brian Ristuccia <brian@ristuccia.com>
ce473d3 to
ea11d20
Compare
|
This updated pull request addresses the likely root cause of #4260 by adding If maintaining compatibility with older versions of git is a requirement, the non-pathspec based Another alternative is All three variants appear to fix #4260 and introduced no new regression suite failures. |
When a cached git source has initialized submodules for one commit and a later commit removes a submodule, git submodule update can fail against the reused cached repo with 'No url found for submodule path ... in .gitmodules'.
Detect that specific stale-submodule mismatch during checkout, refetch the remote repo into a fresh cache state, and retry the checkout once. Add a regression test that advances a branch from a commit with a submodule to one that removes it and verifies the second snapshot succeeds for both sha1 and sha256 repositories.
Signed-off-by: Brian Ristuccia brian@ristuccia.com