
Commit a317730

infill in separate example (#2)
* reverted changes to main and added infill example
1 parent: 9d3514a

6 files changed: +831, -70 lines

.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -40,6 +40,7 @@ models-mnt
 /embedding
 /gguf
 /gguf-llama-simple
+/infill
 /libllama.so
 /llama-bench
 /main
```

Makefile

Lines changed: 3 additions & 1 deletion

```diff
@@ -1,5 +1,5 @@
 # Define the default target now so that it is always the first target
-BUILD_TARGETS = main quantize quantize-stats perplexity embedding vdot train-text-from-scratch convert-llama2c-to-ggml simple save-load-state server embd-input-test gguf llama-bench baby-llama beam-search speculative tests/test-c.o
+BUILD_TARGETS = main quantize quantize-stats perplexity embedding vdot train-text-from-scratch convert-llama2c-to-ggml simple save-load-state server embd-input-test gguf llama-bench baby-llama beam-search speculative infill tests/test-c.o
 
 # Binaries only useful for tests
 TEST_TARGETS = tests/test-llama-grammar tests/test-grammar-parser tests/test-double-float tests/test-grad0 tests/test-opt tests/test-quantize-fns tests/test-quantize-perf tests/test-sampling tests/test-tokenizer-0-llama tests/test-tokenizer-0-falcon tests/test-tokenizer-1-llama
@@ -513,6 +513,8 @@ main: examples/main/main.cpp build-info.h ggml.
 	@echo
 	@echo '==== Run ./main -h for help. ===='
 	@echo
+infill: examples/infill/infill.cpp build-info.h ggml.o llama.o common.o console.o grammar-parser.o $(OBJS)
+	$(CXX) $(CXXFLAGS) $(filter-out %.h,$^) -o $@ $(LDFLAGS)
 
 simple: examples/simple/simple.cpp build-info.h ggml.o llama.o common.o $(OBJS)
 	$(CXX) $(CXXFLAGS) $(filter-out %.h,$^) -o $@ $(LDFLAGS)
```
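With this rule in the default target list, the `infill` binary builds like any other tool in the tree. A minimal sketch of building and checking it from the repository root (assuming `-h` prints usage here, as it does for `main`):

```bash
# build only the new infill binary and the objects it depends on
make infill

# show the available options (infill reuses most of main's options)
./infill -h
```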

examples/infill/CMakeLists.txt

Lines changed: 8 additions & 0 deletions

```diff
@@ -0,0 +1,8 @@
+set(TARGET infill)
+add_executable(${TARGET} infill.cpp)
+install(TARGETS ${TARGET} RUNTIME)
+target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
+target_compile_features(${TARGET} PRIVATE cxx_std_11)
+if(TARGET BUILD_INFO)
+  add_dependencies(${TARGET} BUILD_INFO)
+endif()
```
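On the CMake side, a standard out-of-source build that compiles just this example might look as follows; a sketch assuming the usual CMake workflow, with the target name taken from `set(TARGET infill)` above:

```bash
# configure an out-of-source build, then compile only the infill target
mkdir -p build && cd build
cmake ..
cmake --build . --target infill
```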

examples/infill/README.md

Lines changed: 41 additions & 0 deletions

````diff
@@ -0,0 +1,41 @@
+# llama.cpp/example/infill
+
+This example shows how to use infill mode with Code Llama models that support it.
+Currently the 7B and 13B models support infill mode.
+
+Infill supports most of the options available in the main example.
+
+For further information, have a look at the main README.md in llama.cpp/example/main/README.md.
+
+## Common Options
+
+In this section, we cover the most commonly used options for running the `infill` program with the LLaMA models:
+
+- `-m FNAME, --model FNAME`: Specify the path to the LLaMA model file (e.g., `models/7B/ggml-model.bin`).
+- `-i, --interactive`: Run the program in interactive mode, allowing you to provide input directly and receive real-time responses.
+- `-n N, --n-predict N`: Set the number of tokens to predict when generating text. Adjusting this value can influence the length of the generated text.
+- `-c N, --ctx-size N`: Set the size of the prompt context. The default is 512, but LLaMA models were built with a context of 2048, which will provide better results for longer input/inference.
+
+## Input Prompts
+
+The `infill` program provides several ways to interact with the LLaMA models using input prompts:
+
+- `--in-prefix PROMPT_BEFORE_CURSOR`: Provide the prefix directly as a command-line option.
+- `--in-suffix PROMPT_AFTER_CURSOR`: Provide the suffix directly as a command-line option.
+- `--interactive-first`: Run the program in interactive mode and wait for input right away. (More on this below.)
+
+## Interaction
+
+The `infill` program offers a seamless way to interact with LLaMA models, allowing users to receive real-time infill suggestions. The interactive mode can be triggered using `--interactive` or `--interactive-first`.
+
+### Interaction Options
+
+- `-i, --interactive`: Run the program in interactive mode, allowing users to get real-time code suggestions from the model.
+- `--interactive-first`: Run the program in interactive mode and immediately wait for user input before starting the text generation.
+- `--color`: Enable colorized output to visually distinguish between prompts, user input, and generated text.
+
+### Example
+
+```bash
+./infill -t 10 -ngl 0 -m models/codellama-13b.Q5_K_S.gguf -c 4096 --temp 0.7 --repeat_penalty 1.1 -n 20 --in-prefix "def helloworld():\n print(\"hell" --in-suffix "\n print(\"goodbye world\")\n "
+```
````
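The interaction options documented in the new README also suggest a purely interactive invocation; a hedged sketch combining only flags described above, with the same placeholder model path as the README's example:

```bash
# wait for a prefix/suffix right away, with colorized output to tell
# prompts, user input, and generated text apart
./infill -m models/codellama-13b.Q5_K_S.gguf -c 4096 --interactive-first --color
```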

0 commit comments