IndexFile.diff(None) returns empty after init -> add -> write -> read sequence on a new repository #2025

Open
@ElJaviLuki

Environment:

  • GitPython version: 3.1.44
  • Git version: git version 2.42.0.windows.2
  • Python version: 3.12.0
  • Operating System: Windows 11 Pro 24H2 26100.3775

Description:
When initializing a new repository, adding a file to the index, writing the index to disk, and then explicitly reading the index back, a subsequent call to repo.index.diff(None) incorrectly returns an empty DiffIndex (an empty list). This occurs even though an external git status --porcelain command correctly shows the file as added to the index (status 'A').

This suggests that the in-memory state of the IndexFile object is not correctly reflecting the on-disk state for the diff(None) operation under these specific circumstances, even after an explicit repo.index.read().
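
To check the on-disk side of that comparison without going through GitPython, the index file header can be read directly. The following is a minimal sketch, assuming the standard Git index layout (a 12-byte header with the signature DIRC, then a big-endian version number and entry count). The helper name read_index_header is hypothetical and the sample bytes are synthetic, not taken from the repository in this report.

```python
import struct


def read_index_header(data: bytes) -> tuple:
    """Return (version, entry_count) from the first 12 bytes of a Git index file."""
    if len(data) < 12:
        raise ValueError("index data is shorter than the 12-byte header")
    # Header layout: 4-byte signature, 4-byte version, 4-byte entry count (big-endian).
    signature, version, entry_count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC":
        raise ValueError(f"unexpected index signature: {signature!r}")
    return version, entry_count


# Synthetic header for illustration only: index version 2 with one entry.
sample = b"DIRC" + struct.pack(">II", 2, 1)
print(read_index_header(sample))  # (2, 1)
```

Against a real repository, the same helper could be fed the first 12 bytes of the file at index.path, and the entry count compared with len(index.entries).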

Steps to Reproduce:

import os
import tempfile
import shutil
from git import Repo, IndexFile  # IndexFile is only needed for the alternative in step 5

# Setup a temporary directory for the new repository
repo_dir = tempfile.mkdtemp(prefix="test_gitpython_index_issue_")
try:
    # 1. Initialize a new repository
    repo = Repo.init(repo_dir)
    print(f"Repository initialized at: {repo_dir}")
    print(f"Is bare: {repo.bare}") # Should be False

    # 2. Create and add a new file (.gitkeep in this example)
    gitkeep_path = os.path.join(repo.working_tree_dir, ".gitkeep")
    with open(gitkeep_path, 'w') as f:
        f.write("# Initial file\n")
    print(f".gitkeep created at: {gitkeep_path}")

    index = repo.index
    index.add([".gitkeep"]) # Relative path to repo root
    print("Added '.gitkeep' to index object in memory.")

    # 3. Write the index to disk
    index.write()
    print(f"Index written to disk at: {index.path}")
    assert os.path.exists(index.path), "Index file should exist on disk"

    # 4. (Optional but good for verification) Check with external git status
    status_output = repo.git.status(porcelain=True)
    print(f"git status --porcelain output: '{status_output}'")
    assert "A  .gitkeep" in status_output or "?? .gitkeep" in status_output # Should be 'A ' after add+write

    # 5. Explicitly re-read the index (or create a new IndexFile instance)
    #    This step is crucial to the bug demonstration.
    index.read() # Force re-read of the IndexFile instance
    # Alternatively: index = IndexFile(repo) # Create new instance, should also read from disk
    print(f"Index explicitly re-read. Number of entries: {len(index.entries)}")
    assert len(index.entries) > 0, "Index should have entries after add/write/read"
    
    # 6. Perform a diff of the index against an empty tree (None)
    # This simulates what happens before an initial commit to see staged changes.
    diff_against_empty_tree = index.diff(None) 
    print(f"index.diff(None) result: {diff_against_empty_tree}")
    print(f"Type of result: {type(diff_against_empty_tree)}")
    for item_diff in diff_against_empty_tree:
        print(f"  Diff item: a_path={item_diff.a_path}, b_path={item_diff.b_path}, change_type={item_diff.change_type}, new_file={item_diff.new_file}")


    # Expected behavior:
    # index.diff(None) should return a DiffIndex containing one Diff object
    # representing the newly added '.gitkeep' file (change_type 'A').
    assert len(diff_against_empty_tree) == 1, \
        f"Expected 1 diff item, got {len(diff_against_empty_tree)}. Entries: {index.entries}"
    diff_item = diff_against_empty_tree[0]
    assert diff_item.change_type == 'A', \
        f"Expected change_type 'A', got '{diff_item.change_type}'"
    assert diff_item.b_path == ".gitkeep", \
        f"Expected b_path '.gitkeep', got '{diff_item.b_path}'"

except Exception as e:
    print(f"An error occurred: {e}")
    raise
finally:
    # Clean up the temporary directory
    # shutil.rmtree(repo_dir)
    # print(f"Cleaned up temp directory: {repo_dir}")
    pass

# To run this reproducer:
# 1. Save as a .py file.
# 2. Ensure GitPython is installed.
# 3. Run `python your_file_name.py`

Actual Behavior:
repo.index.diff(None) returns an empty DiffIndex (i.e., []).

Expected Behavior:
repo.index.diff(None) should return a DiffIndex containing one Diff object for .gitkeep with change_type='A', new_file=True, a_path=None, and b_path='.gitkeep'.
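
When triaging, it may help to separate two comparisons that could both be meant here: empty tree against index (what this report expects) and index against working tree (which, as far as I can tell, is how the Diffable.diff documentation describes other=None). The sketch below is a toy model in plain Python, not GitPython code. The function diff_snapshots and the content id "blob-1" are hypothetical, and each snapshot is just a dict mapping path to content id.

```python
def diff_snapshots(a: dict, b: dict) -> list:
    """Compare two {path: content_id} snapshots and return (change_type, path) pairs."""
    changes = []
    for path in sorted(set(a) | set(b)):
        if path not in a:
            changes.append(("A", path))  # present only on the b side: added
        elif path not in b:
            changes.append(("D", path))  # present only on the a side: deleted
        elif a[path] != b[path]:
            changes.append(("M", path))  # present on both sides, content differs
    return changes


empty_tree = {}
index = {".gitkeep": "blob-1"}         # hypothetical content id for the staged file
working_tree = {".gitkeep": "blob-1"}  # file on disk unchanged since it was added

print(diff_snapshots(empty_tree, index))    # [('A', '.gitkeep')]
print(diff_snapshots(index, working_tree))  # []
```

Under this model, the one-item result with change type 'A' only appears in the first comparison. If diff(None) performs the second one, an empty result would be consistent with the file being unchanged on disk after the add.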

Additional Context:

  • This issue prevents correctly determining staged changes for an initial commit using index.diff(None).
  • The index.entries dictionary does seem to reflect the added file correctly after index.read().
  • The repo.git.status(porcelain=True) command correctly shows the file as staged for addition ("A  .gitkeep", with two spaces between the status and the path).
  • The problem seems specific to how IndexFile.diff(None) interprets the IndexFile's state after this sequence of operations in a new repository before the first commit. Diffing against HEAD (once a commit exists) or other trees might behave differently.
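
The porcelain output mentioned above carries two status columns per line, which is useful when comparing it with any diff result. Here is a minimal sketch of splitting that output, assuming the version 1 porcelain format ("XY path", where X is the index status and Y is the working tree status). The helper name parse_porcelain_v1 and the file notes.txt are hypothetical, and renames and quoted paths are deliberately not handled.

```python
def parse_porcelain_v1(output: str) -> list:
    """Split `git status --porcelain` (v1) output into (index_status, worktree_status, path)."""
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        # Column 0 is the index status, column 1 the working tree status,
        # column 2 a separating space, and the rest is the path.
        entries.append((line[0], line[1], line[3:]))
    return entries


# Illustrative input: one staged addition and one untracked file.
sample = "A  .gitkeep\n?? notes.txt"
for index_status, worktree_status, path in parse_porcelain_v1(sample):
    print(repr(index_status), repr(worktree_status), path)
```

For the line "A  .gitkeep", the index column is 'A' and the working tree column is a space, meaning the file is staged and has no further changes on disk relative to the index.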
