Skip to content

Status with renames #523

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 3, 2013
Merged

Conversation

ethomson
Copy link
Member

@ethomson ethomson commented Oct 2, 2013

Move RepositoryStatus to the new git_status_list_new API, which allows us to find similar renames in the index.

By default, detect similarity in staged changes (only) and account for rewrites, however we provide an additional flag to detect unstaged renames in the working directory. I did not wire up any of the other status options.

Big thanks to Rick Potts for all the work here, this is really his PR, I just rebased it on latest and fixed some things up.

///
/// </summary>
[StructLayout(LayoutKind.Sequential)]
public class GitStatusOptions : IDisposable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this class have to be public? most classes in the core namespace are internal...

@ethomson
Copy link
Member Author

ethomson commented Oct 5, 2013

Thanks @jamill for the comments! Updated, squashed and rebased.


public static int git_status_list_entrycount(StatusListSafeHandle list)
{
return (int)NativeMethods.git_status_list_entrycount(list);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use Ensure.Int32Result() here

@ethomson
Copy link
Member Author

ethomson commented Oct 6, 2013

Updated and rebased.

{
return string.Format("{0}: {1} -> {2} [{3}%]", State, OldFilePath, FilePath, Similarity);
}
else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If think the else is redundent

@ethomson
Copy link
Member Author

ethomson commented Oct 6, 2013

You got it, updated.

private readonly FileStatus state;
private readonly int similarity;

private static readonly LambdaEqualityHelper<StatusEntry> equalityHelper =
new LambdaEqualityHelper<StatusEntry>(x => x.FilePath, x => x.State);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should update the contributors to the equality of this type.

@nulltoken
Copy link
Member

@ethomson While you're at it, could you please make the RepositoryStatus indexer returns a StatusEntry rather than a FileStatus?

DetectRenamesInWorkdir = true
});

Assert.Equal(FileStatus.Added | FileStatus.RenamedInWorkdir, status["rename_target.txt"]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following assertion fails. Is this expected?

Assert.Equal(100, status.RenamedInWorkdir.Single().Similarity);

@nulltoken
Copy link
Member

I'm not sure I completely like the RenamedInIndex and RenamedInWorkdir collections in the RepositoryStatus type.

Do we really need them exposed?

@nulltoken
Copy link
Member

Even when a file hasn't been renamed (or when no renaming detection has been requested), StatusEntry.OldPath is filled.

How dumb would that be to pack Similarity and OldPath into a new RenamingDetails type, and make it exposed by StatusEntry? This property could return null when no renaming information is available.

@nulltoken
Copy link
Member

Is there a use case when a StatusEntry would expose a Stateproperty where both FileStatus.RenamedInWorkdir and FileStatus.RenamedInIndex are set?

@ethomson
Copy link
Member Author

ethomson commented Oct 6, 2013

If I understand your last question correctly, yes; if you renamed a -> b and stage that add / delete pair and then renamed b -> c in the working directory, then you would have a rename in the index and a rename in the workdir.

@ethomson
Copy link
Member Author

ethomson commented Oct 6, 2013

Perhaps I understand your question better: it seems like we want to provide similarity for the head -> index and the index -> workdir separately, as well as the old name(s) separately. Let me push something up with that, as a details option.

@ethomson
Copy link
Member Author

ethomson commented Oct 6, 2013

@jamill Do we use the RenamedInIndex and / or RenamedInWorkdir collections?

@ethomson
Copy link
Member Author

ethomson commented Oct 7, 2013

So... this adds a RenameDetails that describes the head->index renamification as well as the index->workdir rename.

@nulltoken I feel like this doesn't quite address all of your complaints, but I trust this is closer,,,,

});

Assert.Equal(FileStatus.Added | FileStatus.RenamedInWorkdir, status["rename_target.txt"].State);
Assert.Equal(100, status["rename_target.txt"].IndexToWorkdirRenameDetails.Similarity);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that you're correct, this is expected to be 100%. I appreciate the clarification and apologize for my haste.

@nulltoken
Copy link
Member

@ethomson Thanks a lot for your patience. In order to clarify my comments, I've came up with a test which tries to better express my questions.

[Fact]
public void CanDetectedVariousKindsOfRenaming()
{
    string path = InitNewRepository();
    using (var repo = new Repository(path))
    {
        Touch(repo.Info.WorkingDirectory, "file.txt",
            "This is a file with enough data to trigger similarity matching.\r\n" +
            "This is a file with enough data to trigger similarity matching.\r\n" +
            "This is a file with enough data to trigger similarity matching.\r\n" +
            "This is a file with enough data to trigger similarity matching.\r\n");

        repo.Index.Stage("file.txt");
        repo.Commit("Initial commit", Constants.Signature, Constants.Signature);

        File.Move(Path.Combine(repo.Info.WorkingDirectory, "file.txt"),
            Path.Combine(repo.Info.WorkingDirectory, "renamed.txt"));

        var opts = new StatusOptions
                            {
                                DetectRenamesInIndex = true,
                                DetectRenamesInWorkdir = true
                            };

        RepositoryStatus status = repo.Index.RetrieveStatus(opts);

        // This passes as expected
        Assert.Equal(FileStatus.RenamedInWorkdir, status.Single().State);

        repo.Index.Stage("renamed.txt");

        status = repo.Index.RetrieveStatus(opts);

        // This doesn't pass. Is this expected?
        Assert.Equal(FileStatus.RenamedInIndex, status.Single().State);

        File.Move(Path.Combine(repo.Info.WorkingDirectory, "renamed.txt"),
            Path.Combine(repo.Info.WorkingDirectory, "renamed_again.txt"));

        status = repo.Index.RetrieveStatus(opts);
        // What should be asserted here? How many entries should we find?
        // My guess is two:
        //  - One with the change in the workdir
        //  - Another one with the change in the index
    }
}

Beside this, If we consider a renaming as a modification, shouldn't the Modified collection be valued when a renaming happens in the workdir, and the Staged one, as well, when the renaming has been promoted to the index?

}
}

public enum GitStatusShow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make this internal?

@ethomson
Copy link
Member Author

ethomson commented Oct 7, 2013

I don't think it should be two entries, no. We don't split status entries up into the HEAD -> index side and the index -> workdir side; indeed, the entire point of the status API is to take those two diffs and combine them such that it tracks the status of a file throughout. We don't have two status entries when you stage a change and then modify an item in the working directory, nor should we have two when you perform two renames.

It seems particularly odd that if I were to stage a rename to a file, then make a change in the workdir, that I would get the status entry (the combined workdir and index changes) but as soon as I rename the file again, then I do not get the combined workdir and index changes.

You haven't really articulated why you believe there should be two status entries in this case, so it's difficult for me to address your arguments directly. My best guess is that you believe that a file should be identified by filename, while I believe that - at least once you turn rename detection on - it's the content and similarity that identifies a file.

If you have a use case for getting two status entries back when a file is renamed twice - and I don't see what that use case is offhand - then we can certainly add that behavior to libgit2. But it makes no sense for this behavior to live in libgit2sharp; we don't want to go to the trouble of collating this status, just to undo it. We should simply have a flag in git_status_list_new that indicates that renames to renames should not be combined.

Further, I'm not sure why we should be updating the Staged collection if there were no changes that were actually staged, sans a rename. Does Staged not indicate items with content changes? It would be very troubling for me to be iterating the Staged list and get an entry out that did not actually have staged changes (eg, was not FileStatus.Staged).

@ethomson
Copy link
Member Author

ethomson commented Oct 7, 2013

I pushed up your test. In the first case, you must stage the delete and add in order for libgit2 to detect that as a rename in the index. This is consistent with the behavior in core git, we only apply rename matching in the appropriate diff. After staging the delete side of the rename, this behavior is as you expected.

I also added assertions that show that the test should not get two status entries, it should get one. Obviously we disagree on whether this should be true or not, but I am showing that it is true, currently.

@ethomson
Copy link
Member Author

ethomson commented Oct 7, 2013

Also, per a previous conversation, yes, we do use the Renamed* collections and we would like those to be public indeed.

@dahlbyk
Copy link
Member

dahlbyk commented Oct 7, 2013

I think I agree with @ethomson - for chained renamed I would expect a single StatusEntry with State = FileStatus.RenamedInIndex | FileStatus.RenamedInWorkDir and both IndexToWorkdirRenameDetails and HeadToIndexRenameDetails populated.

@ethomson
Copy link
Member Author

ethomson commented Oct 8, 2013

I'm not sure where we landed on this. @nulltoken are you waiting on something from me?

@nulltoken
Copy link
Member

are you waiting on something from me?

@ethomson No, Sir. Thanks for asking 😉

I'm toying with a counter-proposal and will come back to you shortly with it.

@ethomson
Copy link
Member Author

This second commit exposes the show option; this actually turns out to be very helpful for us in some circumstances (large repositories, lots of changes) to be able to only examine the index or the workdir without always having to examine both.

@ethomson
Copy link
Member Author

Rebased and I added a status option for ExcludeSubmodules!

@ethomson
Copy link
Member Author

Rebased and updated to address @dahlbyk 's concerns (I hope!)

@nulltoken so what are you thinking with this? Are you still pursuing an alternate track and I'm wasting my time here, or has this grown on you?

@dahlbyk
Copy link
Member

dahlbyk commented Oct 22, 2013

Rebased and updated to address @dahlbyk 's concerns (I hope!)

👍

@ethomson
Copy link
Member Author

ethomson commented Dec 2, 2013

It's rrrrrebased!

tony-frosted-flakes

@dahlbyk
Copy link
Member

dahlbyk commented Dec 2, 2013

In the context of #573, what's the default status rename behavior if left unspecified? Do we want to explicitly support deferring to config? Does diff.renames even affect status?

@ethomson
Copy link
Member Author

ethomson commented Dec 2, 2013

In libgit2, the default is to not detect renames. Core git does do rename detections.

Core git does not appear to pay any attention to diff.renames. git-status(1) does not indicate any config options there and setting / unsetting diff.renames does not appear to have any affect, nor does setting it to copies.

@dahlbyk
Copy link
Member

dahlbyk commented Dec 2, 2013

In libgit2, the default is to not detect renames. Core git does do rename detections.

Good enough for me, thanks for clarifying.

@ethomson
Copy link
Member Author

ethomson commented Dec 3, 2013

Hey @nulltoken where are we with this? I realize you don't like it, but I don't really know why you don't like it, so I can't make it more to your liking.

I've got a bunch of other work queued up that depends on this. Like conflict handling. Which merge and revert depend on. And they're getting really stale while this just sits here...

@nulltoken
Copy link
Member

@ethomson You're right. I've been sitting on this for too long without a proper alternative. Let's move on with this very desirable feature!

Before having this merged, could you please meld d1e06fe and 8448dcc fixups into the commits in this PR that initially brought up the changes? 🙏

Renames in status require more detailed data than simply paths, in
order to collect old path, new path and similarity data.  Use
git_status_list to collect this data and use a new StatusEntry to
store it.
@ethomson
Copy link
Member Author

ethomson commented Dec 3, 2013

You rock, @nulltoken . Squashed and awaiting a Travis build.

@nulltoken nulltoken merged commit 974ad99 into libgit2:vNext Dec 3, 2013
@nulltoken
Copy link
Member

You rock, @nulltoken

@ethomson I don't. You do! 💖

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants