IOError: [Errno 24] Too many open files #421

Open
@cptanalatriste

I'm using GitPython to mine data from Git repositories on a Windows 10 laptop. To retrieve the stats for each commit (the commits may belong to different repositories), I do the following:

    import platform

    import git

    # Tried raising the C runtime's stdio limit. It didn't help.
    if platform.system() == 'Windows':
        import win32file
        win32file._setmaxstdio(2048)

    # About 20,000 commits, as (commit_sha, repository) pairs
    commits = get_commits()
    for commit_sha, repository in commits:
        repository_location = REPO_LOCATION + repository
        # A new Repo object is created for every commit
        repository = git.Repo(repository_location)
        commit = repository.rev_parse(commit_sha)

        total_stats = commit.stats.total
        process_stats(total_stats)

        # Tried deleting the references as well. It didn't help either.
        del total_stats
        del repository

However, I get the following error message every time:

  File "my_code.py", line 126, in my_code
  File "\Anaconda2\lib\site-packages\git\objects\commit.py", line 229, in stats
  File "\Anaconda2\lib\site-packages\gitdb\util.py", line 237, in __getattr__
  File "\Anaconda2\lib\site-packages\git\objects\commit.py", line 141, in _set_cache_
  File "\Anaconda2\lib\site-packages\git\db.py", line 45, in stream
  File "\Anaconda2\lib\site-packages\git\cmd.py", line 982, in stream_object_data
  File "\Anaconda2\lib\site-packages\git\cmd.py", line 948, in _get_persistent_cmd
  File "\Anaconda2\lib\site-packages\git\cmd.py", line 878, in _call_process
  File "\Anaconda2\lib\site-packages\git\cmd.py", line 604, in execute
  File "\Anaconda2\lib\subprocess.py", line 732, in __init__
IOError: [Errno 24] Too many open files

Is there a way to free resources on every loop iteration to avoid this error?
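
One direction that might help, sketched under assumptions: the traceback ends in `_get_persistent_cmd`, which suggests that each `Repo` object starts its own long-lived git subprocess, and `del` alone may not release those handles promptly. The sketch below groups the commits by repository so that each repository is opened once, and closes it explicitly when done. It assumes a GitPython version that provides `Repo.close()`. The names `collect_stats`, `open_git_repo`, and the `open_repo` parameter are hypothetical helpers, not part of GitPython.

```python
from collections import defaultdict
from contextlib import closing


def open_git_repo(path):
    """Open a repository with GitPython (imported lazily; assumed installed)."""
    import git
    return git.Repo(path)


def collect_stats(commits, repo_root, open_repo=open_git_repo):
    """Yield (commit_sha, total_stats), keeping one repository open at a time.

    `commits` is an iterable of (commit_sha, repository_name) pairs.
    """
    # Group SHAs by repository so each repository is opened once,
    # not once per commit.
    by_repo = defaultdict(list)
    for commit_sha, repository in commits:
        by_repo[repository].append(commit_sha)

    for repository, shas in by_repo.items():
        # closing() calls repo.close() on exit, even if an exception is raised.
        with closing(open_repo(repo_root + repository)) as repo:
            for commit_sha in shas:
                yield commit_sha, repo.rev_parse(commit_sha).stats.total
```

With this helper, the loop above would become `for commit_sha, total_stats in collect_stats(get_commits(), REPO_LOCATION): process_stats(total_stats)`. Note that results come back grouped by repository rather than in the original commit order.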
