Batch running the sample run_llama_int8.py generates repeated content copied from the answer to the first question #431

### Describe the bug

Run the sample with a batch of two prompts:

python run_llama_int8.py -m ${MODEL_ID} --quantized-model-path "./saved_results/best_model.pt" --benchmark --jit --int8-bf16-mixed

prompt = ["Is Apple an American company?", "Hugging Face Company is"]
Prompt size: 7

Generated texts (the second answer repeats content from the first answer):

['Is Apple an American company?\n\nApple Inc. is an American multinational technology company that designs, manufactures, and markets consumer electronics, computer software, and online', 'Hugging Face Company is a European companies that is an American multincorporodal technology company that designs, manufactures, and markets consumer electronics, computer software, and']

The expected texts look like this:
['Is Apple an American company?\nThe Apple logo is seen on the Apple store in New York City, New York, U.S., September 12, 2019. REUTERS/Brendan McDermid\nApple Inc. is an American multinational technology company', 'Hugging Face Company is a French startup that develops an open-source platform for building and deploying conversational AI models. It was founded in 2019 by Clément Delangue, Thomas Wolf, and Thibault Wittemberg.\nHugging Face is a French']
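
One rough way to quantify the repetition is to measure the longest run of consecutive words shared by the two outputs once each prompt is stripped. The sketch below is only an illustration and is not part of run_llama_int8.py; the helper names `strip_prompt` and `longest_shared_word_run` are hypothetical, and the two strings are the generated texts quoted in this report.

```python
def strip_prompt(text: str, prompt: str) -> str:
    """Drop the prompt from the front of a generated text, if present."""
    return text[len(prompt):] if text.startswith(prompt) else text


def longest_shared_word_run(a: str, b: str) -> int:
    """Length (in whitespace-separated words) of the longest run of
    consecutive words that appears in both a and b."""
    wa, wb = a.split(), b.split()
    best = 0
    prev = [0] * (len(wb) + 1)  # run lengths ending at the previous word of a
    for i in range(1, len(wa) + 1):
        cur = [0] * (len(wb) + 1)
        for j in range(1, len(wb) + 1):
            if wa[i - 1] == wb[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


prompts = ["Is Apple an American company?", "Hugging Face Company is"]

# Generated texts copied from this report (the suspected-buggy run).
generated = [
    "Is Apple an American company?\n\nApple Inc. is an American multinational "
    "technology company that designs, manufactures, and markets consumer "
    "electronics, computer software, and online",
    "Hugging Face Company is a European companies that is an American "
    "multincorporodal technology company that designs, manufactures, and "
    "markets consumer electronics, computer software, and",
]

continuations = [strip_prompt(t, p) for t, p in zip(generated, prompts)]
overlap = longest_shared_word_run(continuations[0], continuations[1])
print("longest shared word run between the two answers:", overlap)
```

A long shared run between answers to unrelated prompts suggests that the second sequence is picking up state from the first. Any threshold for "long" is a judgment call.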

### Versions

v2.1.0.dev+cpu.llm
