
Running humaneval against llama-3-8b-instruct exl2 quant results in a silent OOM when samples per task > 7 #496

Open

@LlamaEnjoyer

The evaluation runs to completion when -spt is set to 7 or less, though (Windows 11, 64GB RAM, RTX 4070 Ti Super 16GB VRAM, ExLlamaV2 0.1.4 from the dev branch). This happens with the https://huggingface.co/turboderp/Llama-3-8B-Instruct-exl2/tree/6.0bpw quant.

Windows Event Viewer shows this:

Faulting application name: python.exe, version: 3.11.9150.1013, time stamp: 0x660bda91
Faulting module name: c10.dll, version: 0.0.0.0, time stamp: 0x66145ad7
Exception code: 0xc0000005
Fault offset: 0x0000000000063064
Faulting process id: 0x4FC
Faulting application start time: 0x1DAB8C46E059709
Faulting application path: C:\Users\xxx\AppData\Local\Programs\Python\Python311\python.exe
Faulting module path: C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\lib\c10.dll
Report Id: aee8a1bf-1ff5-46e8-aadd-120bb5d65edb
Faulting package full name: 
Faulting package-relative application ID:
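Worth noting: exception code 0xc0000005 is a native access violation inside c10.dll, not a Python-level MemoryError, which is why the apparent OOM is "silent" from the Python side. A small sketch decoding the NTSTATUS codes most relevant here (`decode_exception_code` is a hypothetical helper, not part of any library):

```python
# Decode the NTSTATUS exception code reported by Windows Event Viewer.
# 0xc0000005 (STATUS_ACCESS_VIOLATION) means the process died in native
# code (here, torch's c10.dll) rather than raising a Python exception.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",   # bad pointer dereference
    0xC0000017: "STATUS_NO_MEMORY",          # true out-of-memory status
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

def decode_exception_code(code: int) -> str:
    """Map an NTSTATUS code to its symbolic name, if known."""
    return NTSTATUS_NAMES.get(code, f"unknown NTSTATUS 0x{code:08X}")

print(decode_exception_code(0xC0000005))  # STATUS_ACCESS_VIOLATION
```

So the crash log is consistent with an allocation or indexing failure in native code once -spt pushes memory use past some threshold, rather than a clean CUDA OOM error bubbling up to Python.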

However, evaluating Mistral works just fine (tested against this quant: https://huggingface.co/turboderp/Mistral-7B-v0.2-exl2/tree/6.0bpw).

Here's what I tried:

  • reinstalling exllamav2, pytorch, safetensors, tokenizers, numpy, flash-attention libraries
  • updating Nvidia drivers to the newest ones (I was running 552.22, now it's 555.99)
  • setting CUDA System Fallback Policy on and off
  • testing against a different LLM - Mistral 7B v0.2 - and it worked without a hitch with -spt 10
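To pin down whether VRAM actually fills up before the crash, a minimal watcher sketch using pynvml (which is in the package list below) can poll GPU memory in a second terminal while the eval runs. `fmt_mem` and `watch` are hypothetical helper names, not part of pynvml:

```python
# Poll VRAM usage once per second while humaneval runs in another
# window, to see whether usage climbs to the 16GB limit right before
# the silent crash.
import time

def fmt_mem(used_bytes: int, total_bytes: int) -> str:
    """Render a used/total VRAM reading in GiB."""
    gib = 1024 ** 3
    return f"{used_bytes / gib:.2f} / {total_bytes / gib:.2f} GiB"

def watch(interval_s: float = 1.0) -> None:
    # Imported here so the formatting helper stays usable without a GPU.
    import pynvml
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    try:
        while True:
            info = pynvml.nvmlDeviceGetMemoryInfo(handle)
            print(fmt_mem(info.used, info.total))
            time.sleep(interval_s)
    finally:
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    watch()
```

If usage plateaus near 16 GiB just before the Event Viewer entry appears, that would support the OOM theory even though no CUDA OOM error is ever printed.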

List of installed packages in my system:

Package            Version
------------------ ------------
aiohttp            3.9.5
aiosignal          1.3.1
attrs              23.2.0
blinker            1.8.2
certifi            2024.2.2
charset-normalizer 3.3.2
click              8.1.7
colorama           0.4.6
cramjam            2.8.3
datasets           2.19.1
dill               0.3.8
einops             0.8.0
exllamav2          0.1.4
fastparquet        2024.5.0
filelock           3.13.1
fire               0.6.0
flash-attn         2.5.9.post1
Flask              3.0.3
frozenlist         1.4.1
fsspec             2024.2.0
huggingface-hub    0.23.1
human-eval         1.0.3
idna               3.7
intel-openmp       2021.4.0
itsdangerous       2.2.0
Jinja2             3.1.3
markdown-it-py     3.0.0
MarkupSafe         2.1.5
mdurl              0.1.2
mkl                2021.4.0
mpmath             1.3.0
multidict          6.0.5
multiprocess       0.70.16
networkx           3.2.1
ninja              1.11.1.1
numpy              1.26.4
packaging          24.0
pandas             2.2.2
pillow             10.2.0
pip                24.0
pyarrow            16.1.0
pyarrow-hotfix     0.6
Pygments           2.18.0
pynvml             11.5.0
python-dateutil    2.9.0.post0
pytz               2024.1
PyYAML             6.0.1
regex              2024.5.15
requests           2.32.2
rich               13.7.1
safetensors        0.4.3
sentencepiece      0.2.0
setuptools         65.5.0
six                1.16.0
sympy              1.12
tbb                2021.11.0
termcolor          2.4.0
tokenizers         0.19.1
torch              2.3.1+cu121
torchaudio         2.3.1+cu121
torchvision        0.18.1+cu121
tqdm               4.66.4
typing_extensions  4.9.0
tzdata             2024.1
urllib3            2.2.1
waitress           3.0.0
websockets         12.0
Werkzeug           3.0.3
wheel              0.43.0
xxhash             3.4.1
yarl               1.9.4
