-
Notifications
You must be signed in to change notification settings - Fork 899
Writing multiple pack files for the same repository #1216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Writing multiple pack files for the same repository #1216
Conversation
The cc/ @nulltoken @dahlbyk I hope this clarifies your concerns in #1183 (comment) and #1183 (comment) In this simple test case, I'm writing all commits to a file, all trees to another file, and all blobs yet to a third file. And it works!, I tested it using To do something like what's in #1183 (comment) maybe kinda tricky to have the same number of objects into each pack however it's doable. Please guys give me your suggestions :) |
DirectoryHelper.DeleteDirectory(mrrRepoPackDirPath); | ||
Directory.CreateDirectory(mrrRepoPackDirPath + "/pack"); | ||
|
||
PackBuilderOptions packBuilderOptions = new PackBuilderOptions(mrrRepoPackDirPath + "/pack"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need a new packBuilderOptions
for each of the three files?
Func<int, PackBuilderOptions> packBuilderOptions = i => new PackBuilderOptions(mrrRepoPackDirPath + "/pack"+ i);
...
results = results = orgRepo.ObjectDatabase.Pack(packBuilderOptions(i), b =>
To that end, it's probably wise to add an assert that the expected pack files were created. It also occurs to me that just about every use of this API will require some sort of .git/objects
manipulation - does libgit2 (or could we)
support a relative file path that default to living under .git/objects
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't you need a new packBuilderOptions for each of the three files?
Nope, not necessarily ... cause all the 6 files (3 .pack & 3 .idx) will be written to the same directory path. They will have different names cause they will have different content. but all the 6 files will be written to the pack directory. ... and that's the default format written and understood by git.exe
So indeed there's no reason to do such line new PackBuilderOptions(mrrRepoPackDirPath + "/pack"+ i);
... it's not even the default format the git.exe will expect when reading the newly written repository ... we will have to move all the .pack and .idx files to the .git/objects/pack
at the end if we want to read it again by any git client.
However, it's still valid to write to different directories if you want.
it's probably wise to add an assert that the expected pack files were created.
Yup, I'll add a property in the PackBuilder
to return the hash of the written file, and in the tests I will assert if it's written after the call.
It also occurs to me that just about every use of this API will require some sort of .git/objects manipulation - does libgit2 (or could we) support a relative file path that default to living under .git/objects?
Didn't fully understand this. Are you talking about the PackBuilderOptions
property PackDirectoryPath
to be defaulted to .git/objects/pack
? or more general issue in the entire API ?
If first, well, we thought of that. It's just I don't want to throw DirectoryNotFoundException if the default directory didn't exist ... so I will leave it to the API user to explicitly state the directory to write to ... also that will give the flexibility to write to multiple directories for the same repo if he wants to do so for some reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I hadn't connected that the path was a directory rather than individual files. I should know this, but I have young children.
It does seem like a potentially better API if we were to ensure the requested pack directory exists, and default to objects/pack
within the current repo's .git
directory. Just my $.02.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current repo's packfile dir is just one possible location (git itself dumps it on the current working directory). This is what we do on fetch, but if you're using this API you're working at a level that makes it more likely that you'd want to store them elsewhere to do your own thing. It's not some every-day code you're writing.
I've added one more test case on writing multiple pack files. This one is to write each 100 objects to a different file. |
@nulltoken we can do more fun stuff on building multiple packs by exposing an Add method to the PackBuilder type that takes a CommitFilter ... some thing like When we get this merged, I will work on CommitFilter support on a separate PR :) |
Some multi-pack discussion started at #1213 (comment) |
…ackBuilderFixture
Exposing libgit2 method to get the pack file hash. Minor fixes for PackBuilder, PackBuilderResults, and PackBuilderFixture.
e639edc
to
6035a81
Compare
…g explicit write and reset calls
6035a81
to
629e2de
Compare
Hey guys @dahlbyk @carlosmn @nulltoken @kevin-david @whoisj as a followup on our discussion on the PR #1213 I give up the idea of splitting the implementation between This brings us to the question either to allow explicit writes or not, In this commit I allowed it. so the API user has to dispose the The I also give up using the I left WritePackTo(Stream) to @kevin-david to add his code and tests, so he takes the credit for it :) PS. I'm aware that the methods' documentation is outdated, I will fix that once we agree on the code. Thanks guys I appreciate your feedback, please let me know what you think of this :) |
This pull request to have some initial thoughts on writing multiple pack files for the same repository.
Changes so far:
-Adding a test case to PackBuilderFixture for creating multiple pack files for the same repository.