capture llama_print_timings without Low Level API? #1109
Replies: 1 comment
-
Huh, I guess we both needed the same thing; I just posted about this as well. Take a look here: #1124. My version of the code allows capturing the output into existing buffers:

```python
with store_stdout_stderr() as (outbuff, errbuff):
    # Normal Python output is captured
    print("something")
    # Output from C streams is captured too
    llm = Llama(model_path="./models/7B/llama-model.gguf", verbose=True)

print(outbuff.getvalue())  # "something\n"
print(errbuff.getvalue())  # Model information, layers, etc.
```
-
Hello, I am enjoying using llama-cpp-python via the High Level API. However, on the terminal I constantly see logs like llama_print_timings; instead, it would be nice to capture these values in my code. Is that possible without dropping down to the Low Level API?