
Commit 8538fb9

Committed 9 Sep 2023 - AI CHATBOT
1) Added Llama.cpp - now supports GGML and GGUF models. 2) Added Instructor embeddings. 3) Added c_oto_rodo.py - automatically selects HF Transformers or Llama.cpp. 4) Updated 0-setup.
1 parent 9b5e227 commit 8538fb9

File tree

10 files changed, +242 additions, -129 deletions

ai chatbot/README.md

Lines changed: 14 additions & 2 deletions

@@ -5,14 +5,26 @@ https://code-boxx.com/core-boxx-ai-chatbot/
 * [Core Boxx](https://github.com/code-boxx/Core-Boxx-PHP-Framework/tree/main/core)
 * [Python](https://www.python.org/) At the time of writing, 3.9~3.10 works fine.
 * [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/downloads/?q=build+tools)
-* A decent graphics card. Even if you tweak and run with CPU-only, it will be painfully slow...
+* [CMake](https://cmake.org/)
+* [Nvidia CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit) - If you have an Nvidia graphics card.
+
+## RECOMMENDED
+* An Nvidia graphics card with at least 8GB VRAM is highly recommended.
+* You CAN run on CPU, but that will be painfully slow.
 
 ## INSTALLATION
 * Copy/unzip this module into your existing Core Boxx project folder.
 * Put documents you want the AI to "learn" into `chatbot/docs`, accepted file types - `csv pdf txt epub html md odt doc docx ppt pptx`.
-* Run `0-setup.bat` (Windows) `0-setup.sh` (Linux) - *BE WARNED, SEVERAL GIGABYTES WORTH OF DOWNLOAD!*
+* Start install - *BE WARNED, SEVERAL GIGABYTES WORTH OF DOWNLOAD!*
+  * GPU - Run `0-setup.bat` (Windows) `0-setup.sh` (Linux).
+  * CPU - Run `0-setup.bat CPU` (Windows) `0-setup.sh CPU` (Linux). You will need to manually download your own model, see "Changing Models" below.
 * Access `http://your-site.com/ai/` for the demo.
 
+## CHANGING MODELS
+* This module runs on [llama.cpp](https://github.com/ggerganov/llama.cpp).
+* Just put your downloaded `GGML/GGUF` model into `chatbot/models`.
+* Change `model_name` in `a_settings.py` to the model file name.
+
 ## NOTES
 * To rebuild the documents database, simply add/remove documents from `chatbot/docs` and run `1-create.bat / 1-create.sh`.
 * To launch the bot, simply run `2-bot.bat / 2-bot.sh`.
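For the new "Changing Models" step, the whole switch is a single line in `a_settings.py`. A minimal sketch, using the GGUF file name that appears as the commented-out example in this commit's own `a_settings.py`:

```python
# a_settings.py - set model_name to a file inside chatbot/models to use llama.cpp,
# or to a Hugging Face repo path to use HF Transformers
model_name = "llama-2-7b.Q5_K_M.gguf"
#model_name = "TheBloke/vicuna-7B-v1.5-GPTQ"
```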

ai chatbot/chatbot/0-setup.bat

Lines changed: 13 additions & 2 deletions

@@ -1,11 +1,22 @@
+@echo off
 php 0-setup.php
 virtualenv venv
 call venv\Scripts\activate
-pip install langchain transformers optimum auto-gptq chromadb sentence_transformers Flask pyjwt
+pip install langchain transformers optimum auto-gptq chromadb InstructorEmbedding sentence_transformers Flask pyjwt
 if "%1"=="CPU" (
   pip install torch torchvision torchaudio --force-reinstall
+  set FORCE_CMAKE=1
+  set CMAKE_ARGS=-DLLAMA_CUBLAS=OFF
+  pip install llama-cpp-python
 ) else (
   pip install torch torchvision torchaudio --force-reinstall --index-url https://download.pytorch.org/whl/cu117
+  set FORCE_CMAKE=1
+  set CMAKE_ARGS=-DLLAMA_CUBLAS=ON
+  pip install --no-cache-dir --upgrade --force-reinstall llama-cpp-python
 )
 python b_create.py
-python d_bot.py
+if "%1"=="CPU" (
+  echo "Install complete - Please download your own model before running 2-bot.bat"
+) else (
+  python d_bot.py
+)
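Both branches build `llama-cpp-python`; only the `LLAMA_CUBLAS` CMake flag differs. A minimal post-install smoke test, assuming a GGML/GGUF model has already been placed in `chatbot/models` (the file name below is illustrative, not part of the commit):

```python
# run inside the venv - quick llama-cpp-python smoke test
# model file name is an assumption; use whatever you downloaded into chatbot/models
from llama_cpp import Llama

llm = Llama(model_path = "models/llama-2-7b.Q5_K_M.gguf", n_gpu_layers = 40)
out = llm("Q: What is 1+1? A:", max_tokens = 16)
print(out["choices"][0]["text"])
```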

ai chatbot/chatbot/0-setup.php

Lines changed: 24 additions & 22 deletions

@@ -2,8 +2,29 @@
 // (A) RODO KOA KONFIGU
 require dirname(__DIR__) . DIRECTORY_SEPARATOR . "lib" . DIRECTORY_SEPARATOR . "CORE-Config.php";
 
-// (B) NEW CHATBOT PATH
-define("PATH_CHATBOT", PATH_BASE . "chatbot" . DIRECTORY_SEPARATOR);
+// (B) ADD AI TO CORE-CONFIG.PHP
+if (!defined("PATH_CHATBOT")) {
+  try {
+    // (B1) BACKUP CONFIG FILE
+    copy(PATH_LIB . "CORE-Config.php", PATH_LIB . "CORE-Config.old");
+
+    // (B2) ADD URL & PATH
+    $url = parse_url(HOST_BASE, PHP_URL_SCHEME) . "://" . HOST_NAME . ":8008";
+    $add = <<<EOD
+// ADDED BY INSTALLER - AI CHATBOT
+define("PATH_CHATBOT", PATH_BASE . "chatbot" . DIRECTORY_SEPARATOR);
+define("HOST_CHATBOT", "$url");
+EOD;
+    $fh = fopen(PATH_LIB . "CORE-Config.php", "a");
+    fwrite($fh, "\r\n\r\n$add");
+    fclose($fh);
+  } catch (Exception $ex) {
+    exit("Unable to update CORE-Config.php - " . $ex->getMessage());
+  }
+
+  // (B3) NEW CHATBOT PATH
+  define("PATH_CHATBOT", PATH_BASE . "chatbot" . DIRECTORY_SEPARATOR);
+}
 
 // (C) BACKUP CHATBOT/A_SETTINGS.PY
 if (!copy(PATH_CHATBOT . "a_settings.py", PATH_CHATBOT . "a_settings.old")) {
@@ -24,23 +45,4 @@
   if (count($replace)==0) { break; }
 }}}
 try { file_put_contents(PATH_CHATBOT . "a_settings.py", implode("", $cfg)); }
-catch (Exception $ex) { exit("Error writing to ". PATH_CHATBOT . "a_settings.py"); }
-
-// (E) ADD AI TO CORE-CONFIG.PHP
-try {
-  // (E1) BACKUP CONFIG FILE
-  copy(PATH_LIB . "CORE-Config.php", PATH_LIB . "CORE-Config.old");
-
-  // (E2) ADD URL & PATH
-  $url = parse_url(HOST_BASE, PHP_URL_SCHEME) . "://" . HOST_NAME . ":8008";
-  $add = <<<EOD
-// ADDED BY INSTALLER - AI CHATBOT
-define("PATH_CHATBOT", PATH_BASE . "chatbot" . DIRECTORY_SEPARATOR);
-define("HOST_CHATBOT", "$url");
-EOD;
-  $fh = fopen(PATH_LIB . "CORE-Config.php", "a");
-  fwrite($fh, "\r\n\r\n$add");
-  fclose($fh);
-} catch (Exception $ex) {
-  exit("Unable to update CORE-Config.php - " . $ex->getMessage());
-}
+catch (Exception $ex) { exit("Error writing to ". PATH_CHATBOT . "a_settings.py"); }

ai chatbot/chatbot/0-setup.sh

Lines changed: 8 additions & 2 deletions

@@ -1,12 +1,19 @@
 php 0-setup.php
 virtualenv venv
 source "venv/bin/activate"
-pip install langchain transformers optimum auto-gptq chromadb sentence_transformers Flask pyjwt
+pip install langchain transformers optimum auto-gptq chromadb InstructorEmbedding sentence_transformers Flask pyjwt
 if [[ $1 == "CPU" ]]
 then
   pip install torch torchvision torchaudio --force-reinstall --index-url https://download.pytorch.org/whl/cpu
+  pip install llama-cpp-python
 else
   pip install torch torchvision torchaudio --force-reinstall
+  CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
 fi
 python b_create.py
-python d_bot.py
+if [[ $1 == "CPU" ]]
+then
+  echo "Install complete - Please download your own model before running 2-bot.sh"
+else
+  python d_bot.py
+fi

ai chatbot/chatbot/a_settings.py

Lines changed: 56 additions & 25 deletions

@@ -1,50 +1,81 @@
-# (A) PATH
-import os
+# (A) LOAD MODULES
+import os, torch
+
+# (B) MODEL
+# hugging face url path, or model file inside models/
+model_name = "TheBloke/vicuna-7B-v1.5-GPTQ"
+#model_name = "llama-2-7b.Q5_K_M.gguf"
+
+# (C) AUTO - PATH
 path_base = os.path.dirname(os.path.realpath(__file__))
 path_models = os.path.join(path_base, "models")
 path_db = os.path.join(path_base, "db")
 path_docs = os.path.join(path_base, "docs")
 
-# (B) ENVIRONMENT VARIABLES
-os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "true"
-os.environ["TRANSFORMERS_CACHE"] = path_models
+# (D) LLAMA CPP
+if os.path.isfile(os.path.join(path_models, model_name)):
+  model_file = os.path.join(path_models, model_name)
+  model_args = {
+    "max_tokens" : 2000,
+    "temperature" : 0.7,
+    "top_k" : 40,
+    "top_p" : 1,
+    "n_gpu_layers" : 40,
+    "n_batch" : 512,
+    "streaming" : False,
+    "verbose" : False
+  }
 
-# (C) MODEL SETTINGS
-model_name = "TheBloke/vicuna-7B-v1.5-GPTQ"
-model_args = {
-  "do_sample" : True,
-  "max_new_tokens" : 3000,
-  "batch_size" : 1,
-  "temperature" : 0.7,
-  "top_k" : 40,
-  "top_p" : 1,
-  "num_return_sequences" : 1
+# (E) HF TRANSFORMER
+else:
+  os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "true"
+  os.environ["TRANSFORMERS_CACHE"] = path_models
+  model_args = {
+    "do_sample" : True,
+    "max_new_tokens" : 2000,
+    "batch_size" : 1,
+    "temperature" : 0.7,
+    "top_k" : 40,
+    "top_p" : 1,
+    "num_return_sequences" : 1
+  }
+
+# (F) AUTO - CPU OR GPU
+if not any((torch.cuda.is_available(), torch.backends.mps.is_available())):
+  gpu = False
+else:
+  gpu = True
+
+# (G) EMBEDDING
+embed_args = {
+  "model_name" : "hkunlp/instructor-xl",
+  "model_kwargs" : { "device": "cuda" if gpu else "cpu" }
+}
+
+# (H) DB - DOCUMENT SPLITTER
+db_split = {
+  "chunk_size" : 512,
+  "chunk_overlap" : 30
 }
 
-# (D) CHAIN SETTINGS
+# (I) CHAIN SETTINGS
 chain_args = {
   "chain_type" : "stuff",
   "return_source_documents" : True,
   "verbose" : True
 }
 
-# (E) PROMPT TEMPLATE
+# (J) PROMPT TEMPLATE
 prompt_template = """SYSTEM: Use the following context section and only that context to answer the question at the end. Do not use your internal knowledge. If you don't know the answer, just say that you don't know, don't try to make up an answer.
 CONTEXT: {context}
 USER: {question}
 ANSWER:"""
 
-# (F) DATABASE - DOCUMENT SPLITTER
-db_split = {
-  "chunk_size" : 512,
-  "chunk_overlap" : 30
-}
-
-# (G) HTTP ENDPOINT
+# (K) HTTP ENDPOINT
 http_allow = ["http://localhost"]
 http_host = "localhost"
 http_port = 8008
 
-# (H) JWT
+# (L) JWT
 jwt_algo = ""
 jwt_secret = ""
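The settings file now drives the backend choice: `model_file` only exists when `model_name` resolves to an actual file inside `models/`. A quick illustrative check (not part of the commit) of which backend the settings will select:

```python
# illustrative helper (not in the commit) - mirrors the isfile() dispatch in a_settings.py
import a_settings as set

if hasattr(set, "model_file"):
  print("llama.cpp backend:", set.model_file)
else:
  print("HF Transformers backend:", set.model_name, "| GPU:", set.gpu)
```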

ai chatbot/chatbot/b_create.py

Lines changed: 4 additions & 3 deletions

@@ -3,7 +3,7 @@
 import os, glob
 from pathlib import Path
 from langchain.vectorstores import Chroma
-from langchain.embeddings import HuggingFaceEmbeddings
+from langchain.embeddings import HuggingFaceInstructEmbeddings
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 from langchain.document_loaders import (
   CSVLoader,
@@ -55,11 +55,12 @@ def rmdir(folder):
   exit()
 
 # (D) IMPORT PROCESS
-# (D1) CREATE EMPTY-ISH DATABASE
 print("Creating database")
+
+# (D1) CREATE EMPTY-ISH DATABASE
 db = Chroma.from_texts(
   texts = [""],
-  embedding = HuggingFaceEmbeddings(),
+  embedding = HuggingFaceInstructEmbeddings(**set.embed_args),
   persist_directory = set.path_db
 )
 db.persist()
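Creation and querying must use the same embedding function, so any later lookup against the persisted DB should pass the identical `embed_args`. A minimal sketch of querying the resulting database (illustrative only; the module's own retrieval lives in `d_bot.py`, which is not part of this diff):

```python
# illustrative query against the persisted Chroma DB, using the same Instructor embeddings
from langchain.vectorstores import Chroma
from langchain.embeddings import HuggingFaceInstructEmbeddings
import a_settings as set

db = Chroma(
  persist_directory = set.path_db,
  embedding_function = HuggingFaceInstructEmbeddings(**set.embed_args)
)
for doc in db.similarity_search("How do I install Core Boxx?", k = 3):
  print(doc.page_content[:80])
```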

ai chatbot/chatbot/c_oto_rodo.py

Lines changed: 77 additions & 0 deletions

@@ -0,0 +1,77 @@
+# AUTO LOADER
+# credits - some parts "borrowed" from oobabooga
+# https://github.com/oobabooga/text-generation-webui/blob/main/modules/models.py
+
+# (A) LOAD SETTINGS
+import a_settings as set
+
+# (B) MANUALLY SPECIFIED MODEL - USE LLAMA CPP
+if hasattr(set, "model_file"):
+  from langchain.llms import LlamaCpp
+  llm = LlamaCpp(
+    model_path = set.model_file,
+    **set.model_args
+  )
+
+# (C) HUGGING FACE
+else:
+  # (C1) IMPORT TRANSFORMERS MODULES
+  import torch, psutil
+  from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, pipeline
+  from accelerate import infer_auto_device_map, init_empty_weights
+  from langchain import HuggingFacePipeline
+
+  # (C2) HELPER - AUTO MAX MEMORY CALCULATION
+  def max_mem():
+    # (C2-1) GPU MEMORY
+    total = (torch.cuda.get_device_properties(0).total_memory / (1024 * 1024))
+    suggestion = round((total - 1000) / 1000) * 1000
+    if total - suggestion < 800:
+      suggestion -= 1000
+    suggestion = int(round(suggestion / 1000))
+    max = { 0 : f"{suggestion}GiB" }
+
+    # (C2-2) CPU MEMORY
+    total = (psutil.virtual_memory().available / (1024 * 1024))
+    suggestion = round((total - 1000) / 1000) * 1000
+    if total - suggestion < 800:
+      suggestion -= 1000
+    suggestion = int(round(suggestion / 1000))
+    max["cpu"] = f"{suggestion}GiB"
+
+    # (C2-3) RETURN CALCULATED MEMORY
+    return max
+
+  # (C3) INIT MODEL PARAMS
+  params = {
+    "low_cpu_mem_usage" : True,
+    "device_map" : "auto"
+  }
+
+  # (C4) GPU ACCELERATED
+  if set.gpu:
+    config = AutoConfig.from_pretrained(set.model_name)
+    with init_empty_weights():
+      model = AutoModelForCausalLM.from_config(config)
+      model.tie_weights()
+    params["device_map"] = infer_auto_device_map(
+      model,
+      dtype = config.torch_dtype,
+      max_memory = max_mem(),
+      no_split_module_classes = model._no_split_modules
+    )
+
+  # (C5) CPU ONLY
+  else:
+    params["torch_dtype"] = torch.float32
+
+  # (C6) LOAD MODEL
+  model = AutoModelForCausalLM.from_pretrained(set.model_name, **params)
+
+  # (C7) LLM/PIPE
+  llm = HuggingFacePipeline(pipeline = pipeline(
+    task = "text-generation",
+    model = model,
+    tokenizer = AutoTokenizer.from_pretrained(set.model_name),
+    **set.model_args
+  ))
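`c_oto_rodo.py` only exposes `llm`; the consumer (`d_bot.py`, not shown in this diff) is expected to wire it into the retrieval chain. A hedged sketch of that wiring, built only from the settings this commit defines (`prompt_template`, `chain_args`, `embed_args`):

```python
# sketch only - an assumption of how d_bot.py consumes the auto-loaded llm
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.vectorstores import Chroma
from langchain.embeddings import HuggingFaceInstructEmbeddings
import a_settings as set
from c_oto_rodo import llm

db = Chroma(
  persist_directory = set.path_db,
  embedding_function = HuggingFaceInstructEmbeddings(**set.embed_args)
)
chain = RetrievalQA.from_chain_type(
  llm = llm,
  retriever = db.as_retriever(),
  chain_type_kwargs = { "prompt" : PromptTemplate(
    template = set.prompt_template,
    input_variables = ["context", "question"]
  )},
  **set.chain_args
)
print(chain({ "query" : "How do I launch the bot?" }))
```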
