[November] ctime correctness and preparation for partial-clone support
#1197
Byron
announced in
Progress Update
This month was the strangest of them all, as I didn’t make substantial progress despite trying so hard. Nonetheless, I believe `gitoxide` got significantly better, and here is why.

ctime correctness

While trying to get the little there currently is of `gix status` to be correct, I ran into an interesting issue I could call ‘ctime-war’. To understand it, we should probably take a step back and establish that `git status` and, by default, `gix status` are actually modifying operations, which is quite unintuitive. When performing a status check, there is a chance that one has to hash a file to be sure whether it changed or not. That is an expensive operation which ideally shouldn’t be repeated once it’s established whether the file changed. This is why the `.git/index` file can be updated after running `gix status` or `git status`.

Now add to the story that I use `starship`, and that `starship` runs `git status` under the hood, and we have the setup for a ‘ctime-war’: after each run of `gix status` that has the potential to change the `.git/index` file, the terminal prompt will update and launch `git status` right after. Both had better agree on what’s in the index file.

And as this is real life, of course they didn’t agree, which caused both of them to correct each other’s work.

The culprit turned out to be a difference in the `ctime` used when querying the ‘creation’ date of a file. There are multiple candidates, and `git` chooses the `ctime` field of the file, but Rust picks the inode creation time. Both usually differ.

To fix it, I once again used the fantastic `rustix` crate to make a manual `lstat` call, from which one could extract the correct field by hand.

gix_object::Find - the missing trait

When reviewing a PR related to partial clones (more about that in a moment), I was about to suggest a better way to query the object database without pulling in all of `gix-odb`, which thus far has been avoided throughout the codebase.

In that very moment it clicked, and I knew that I shouldn’t keep using `impl FnMut(&oid, &mut Vec<u8>) -> Result<…>`, but instead realize that the `gix` plumbing truly needs an easily accessible trait for this all too common operation.

At the end of quite an intense refactoring session, there are now the `Find` and the `Header` traits, both available from the ubiquitous `gix-object` crate, and they truly make a huge difference in convenience.

Tree-entries now parse losslessly
The `gix-object` crate implements the parsing of all `git` objects, and it was one of the first crates written for the `gitoxide` project. Sometimes this shows, as I allowed myself certain ‘shortcuts’ that today would be utterly unacceptable.

Years later, having forgotten all about the lurking tech-debt, a bug occurs and warts of the past are rediscovered. The issue at hand was that `gitoxide` couldn’t decode certain valid tree objects that `git` or `git2` had no problem with. It turned out that it didn’t consider certain modes of tree-entries valid just because they were ‘unusual’.

This was due to it parsing modes by comparison to strings, like `100755`, to then turn them into the variant of an enum - after all, there are only a few valid ones. As long as it did that, there was no way to be sure it wouldn’t fail later on a similarly unusual tree-entry mode. It also wasn’t possible to guarantee that trees round-trip, as technically they could lose information during parsing.
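To illustrate why string comparison is brittle, here is a minimal sketch of the alternative: parse the octal mode field into a number first, and classify it only afterwards. The `Mode` and `Kind` types and the `from_octal` and `kind` functions are hypothetical names for this illustration and much simpler than what `gix-object` actually provides.

```rust
use std::convert::TryFrom;

/// Hypothetical: the raw mode number exactly as read from the tree entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Mode(u16);

/// Hypothetical: the coarse classification derived from a `Mode`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Tree,
    Blob,
    BlobExecutable,
    Link,
    Commit,
}

impl Mode {
    /// Parse the ASCII-octal mode field of a tree entry, e.g. `b"100644"`.
    /// Returns `None` for non-octal digits or values that do not fit into 16 bits.
    fn from_octal(bytes: &[u8]) -> Option<Mode> {
        if bytes.is_empty() || bytes.len() > 6 {
            return None;
        }
        let mut value: u32 = 0;
        for &b in bytes {
            if !(b'0'..=b'7').contains(&b) {
                return None;
            }
            value = (value << 3) | u32::from(b - b'0');
        }
        u16::try_from(value).ok().map(Mode)
    }

    /// Classify by the file-type bits only, so unusual permission bits
    /// (like the group-write bit in 100664) no longer cause a failure.
    fn kind(self) -> Option<Kind> {
        match self.0 & 0o170000 {
            0o040000 => Some(Kind::Tree),
            0o100000 if self.0 & 0o111 != 0 => Some(Kind::BlobExecutable),
            0o100000 => Some(Kind::Blob),
            0o120000 => Some(Kind::Link),
            0o160000 => Some(Kind::Commit),
            _ => None,
        }
    }
}
```

With this shape, `Mode::from_octal(b"100664")` keeps the full number, so `format!("{:o}", mode.0)` yields `100664` again, while `kind()` still reports a plain blob.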
Thus the decision was made to finally differentiate between the `Kind` of a tree-entry and its `Mode`, with the former being an enum and the latter the original 16-bit number that represents the mode in all its detail.

Community
gix-fsck - the first step towards partial clones

This is part of a bigger undertaking to make it not only possible to fetch packs with a given set of objects, but also to make more algorithms partial-clone aware from the beginning.
In order to do that, there should be a way to detect missing objects, and this contribution is an fsck-implementation which focuses on exactly that.
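As a rough idea of what detecting missing objects means, the following sketch walks a toy object graph and reports every id that is referenced but not present. It is only an illustration under simplifying assumptions: the `Id` alias, the `HashMap`-based database and the `missing_objects` function are hypothetical and do not reflect the API of `gix-fsck`.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for an object id; real ids are hashes.
type Id = u32;

/// Walk everything reachable from `root` and return the ids that are
/// referenced but absent from `db`, in ascending order.
/// `db` maps each present object to the objects it references.
fn missing_objects(db: &HashMap<Id, Vec<Id>>, root: Id) -> Vec<Id> {
    let mut seen = HashSet::new();
    let mut stack = vec![root];
    let mut missing = Vec::new();
    while let Some(id) = stack.pop() {
        if !seen.insert(id) {
            continue; // already visited
        }
        match db.get(&id) {
            // Present: keep following its references.
            Some(children) => stack.extend(children.iter().copied()),
            // Referenced, but not in the database.
            None => missing.push(id),
        }
    }
    missing.sort_unstable();
    missing
}
```

In a partial clone, a non-empty result is not necessarily corruption, as the listed objects may simply not have been fetched yet.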
While looking into this, it also became clear how much `fsck` is actually validating, and that the current implementation is incredibly basic in comparison. But at least it’s a start which hopefully will grow more powerful over time, neatly contained in the `gix-fsck` crate.

git credential-helper support and better shell handling on Windows

Once in a while there are issues which take quite a while to figure out, and this was one of them. During the investigation it also became clear that `gix --trace` is a very powerful remote debugging tool by now, which will only grow better over time as more key-traces are added to it.

In this case, we compared the trace of `gix` with the trace of `git` and noticed that the invocation of `git credential-manager` differed in that `gix` used a shell, and `git` did not. It was that shell which didn’t exist in `PATH` and thus prevented the execution of the credential helper program.

It turned out that `git` didn’t even try to use a shell on Windows, and in this case relied on using its own `EXEC_PATH` to find the program in question. The solution was not to fully disable the use of a shell, but instead, and only on Windows, to try harder to not use a shell. This procedure can also be steered by the caller, which may have additional knowledge about the program being invoked. Credential-helpers, for example, basically never have to use an intermediate shell, no matter which platform.

With this fix, even on Windows one could fetch using a rather complicated setup that I didn’t even know existed, which was a great show of what `gitoxide` was made to do. Even today, there are probably dozens of applications that use `git2` and which had to re-implement emulations of `git`’s usage of credential helpers, which are likely to fall short somewhere or do not work on Windows at all.

New Sponsorship by drips.network
`gitoxide` is now sponsored by the Drips Network, and the sponsorship is substantial to say the least. It took me a while to get used to it as well, as I felt like I should rather focus on cloning myself to be able to do all the work that needs to be done.

But now I have come to terms with it, and I will simply do my best like always while trying to help make `gitoxide` a fully fledged `git2` replacement, along with added correctness thanks to respecting `git` configuration by default. That should also help Radworks, the company behind the Drips Network, which has a couple of teams busy with making `git` forges as decentralised as `git` itself.

Exciting times!
Gix in Cargo
It took me a while to realize that in order to get a correct `gix reset`, which is ultimately what I want to use in `cargo`, I needed a complete `gix status` (untracked files, modified files, and changes of the index itself). And in order to get a complete `gix status`, one also needs rename, rewrite and copy tracking, a feature that was already implemented specifically for diffs of trees with trees.

In a big refactoring step, this engine was generalised and now lives in `gix-diff`, along with its own set of tests. And while there, I decided to also make the blob-diff itself more correct by adding support for diff-drivers and text-converters, even though this is still in progress.

On the bright side, once done, it will be easier than ever to generate diffs programmatically just like `git` would, with all the major bells and whistles.

Cheers,
Sebastian
PS: The latest timesheets can be found here.