
Running humaneval against llama-3-8b-instruct exl2 quant results in a silent OOM when samples per task > 7 #496

Open

@LlamaEnjoyer

The evaluation runs to completion when -spt is set to 7 or less, though (Windows 11, 64GB RAM, RTX 4070 Ti Super 16GB VRAM, ExLlamaV2 0.1.4 from the dev branch). This happens with the https://huggingface.co/turboderp/Llama-3-8B-Instruct-exl2/tree/6.0bpw quant.

Windows Event Viewer shows this:

Faulting application name: python.exe, version: 3.11.9150.1013, time stamp: 0x660bda91
Faulting module name: c10.dll, version: 0.0.0.0, time stamp: 0x66145ad7
Exception code: 0xc0000005
Fault offset: 0x0000000000063064
Faulting process id: 0x4FC
Faulting application start time: 0x1DAB8C46E059709
Faulting application path: C:\Users\xxx\AppData\Local\Programs\Python\Python311\python.exe
Faulting module path: C:\Users\xxx\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\lib\c10.dll
Report Id: aee8a1bf-1ff5-46e8-aadd-120bb5d65edb
Faulting package full name: 
Faulting package-relative application ID:
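Worth noting: exception code 0xc0000005 is a native access violation inside c10.dll, not a Python-level MemoryError, which is why the apparent OOM is "silent" from the Python side. A small sketch decoding the NTSTATUS codes most relevant here (`decode_exception_code` is a hypothetical helper, not part of any library):

```python
# Decode the NTSTATUS exception code reported by Windows Event Viewer.
# 0xc0000005 (STATUS_ACCESS_VIOLATION) means the process died in native
# code (here, torch's c10.dll) rather than raising a Python exception.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",   # bad pointer dereference
    0xC0000017: "STATUS_NO_MEMORY",          # true out-of-memory status
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

def decode_exception_code(code: int) -> str:
    """Map an NTSTATUS code to its symbolic name, if known."""
    return NTSTATUS_NAMES.get(code, f"unknown NTSTATUS 0x{code:08X}")

print(decode_exception_code(0xC0000005))  # STATUS_ACCESS_VIOLATION
```

So the crash log is consistent with an allocation or indexing failure in native code once -spt pushes memory use past some threshold, rather than a clean CUDA OOM error bubbling up to Python.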

However, evaluating Mistral works just fine (tested against this quant: https://huggingface.co/turboderp/Mistral-7B-v0.2-exl2/tree/6.0bpw).

Here's what I tried:

  • reinstalling exllamav2, pytorch, safetensors, tokenizers, numpy, flash-attention libraries
  • updating Nvidia drivers to the newest ones (I was running 552.22, now it's 555.99)
  • setting CUDA System Fallback Policy on and off
  • testing against a different LLM - Mistral 7B v0.2 - and it worked without a hitch with -spt 10
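To pin down whether VRAM actually fills up before the crash, a minimal watcher sketch using pynvml (which is in the package list below) can poll GPU memory in a second terminal while the eval runs. `fmt_mem` and `watch` are hypothetical helper names, not part of pynvml:

```python
# Poll VRAM usage once per second while humaneval runs in another
# window, to see whether usage climbs to the 16GB limit right before
# the silent crash.
import time

def fmt_mem(used_bytes: int, total_bytes: int) -> str:
    """Render a used/total VRAM reading in GiB."""
    gib = 1024 ** 3
    return f"{used_bytes / gib:.2f} / {total_bytes / gib:.2f} GiB"

def watch(interval_s: float = 1.0) -> None:
    # Imported here so the formatting helper stays usable without a GPU.
    import pynvml
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    try:
        while True:
            info = pynvml.nvmlDeviceGetMemoryInfo(handle)
            print(fmt_mem(info.used, info.total))
            time.sleep(interval_s)
    finally:
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    watch()
```

If usage plateaus near 16 GiB just before the Event Viewer entry appears, that would support the OOM theory even though no CUDA OOM error is ever printed.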

List of installed packages in my system:

Package            Version
------------------ ------------
aiohttp            3.9.5
aiosignal          1.3.1
attrs              23.2.0
blinker            1.8.2
certifi            2024.2.2
charset-normalizer 3.3.2
click              8.1.7
colorama           0.4.6
cramjam            2.8.3
datasets           2.19.1
dill               0.3.8
einops             0.8.0
exllamav2          0.1.4
fastparquet        2024.5.0
filelock           3.13.1
fire               0.6.0
flash-attn         2.5.9.post1
Flask              3.0.3
frozenlist         1.4.1
fsspec             2024.2.0
huggingface-hub    0.23.1
human-eval         1.0.3
idna               3.7
intel-openmp       2021.4.0
itsdangerous       2.2.0
Jinja2             3.1.3
markdown-it-py     3.0.0
MarkupSafe         2.1.5
mdurl              0.1.2
mkl                2021.4.0
mpmath             1.3.0
multidict          6.0.5
multiprocess       0.70.16
networkx           3.2.1
ninja              1.11.1.1
numpy              1.26.4
packaging          24.0
pandas             2.2.2
pillow             10.2.0
pip                24.0
pyarrow            16.1.0
pyarrow-hotfix     0.6
Pygments           2.18.0
pynvml             11.5.0
python-dateutil    2.9.0.post0
pytz               2024.1
PyYAML             6.0.1
regex              2024.5.15
requests           2.32.2
rich               13.7.1
safetensors        0.4.3
sentencepiece      0.2.0
setuptools         65.5.0
six                1.16.0
sympy              1.12
tbb                2021.11.0
termcolor          2.4.0
tokenizers         0.19.1
torch              2.3.1+cu121
torchaudio         2.3.1+cu121
torchvision        0.18.1+cu121
tqdm               4.66.4
typing_extensions  4.9.0
tzdata             2024.1
urllib3            2.2.1
waitress           3.0.0
websockets         12.0
Werkzeug           3.0.3
wheel              0.43.0
xxhash             3.4.1
yarl               1.9.4
