Commit 6ff1ed9

Merge pull request #693 from pytorch/fix_perf_quantization
Update dynamic quantization tutorial
2 parents: 6da2fd3 + bdc5212

File tree

1 file changed: +8 -0 lines changed

advanced_source/dynamic_quantization_tutorial.py

Lines changed: 8 additions & 0 deletions
@@ -269,6 +269,11 @@ def print_size_of_model(model):
 
 ######################################################################
 # Second, we see faster inference time, with no difference in evaluation loss:
+#
+# Note: we set the number of threads to one for single threaded comparison, since quantized
+# models run single threaded.
+
+torch.set_num_threads(1)
 
 def time_model_evaluation(model, test_data):
     s = time.time()
@@ -280,6 +285,9 @@ def time_model_evaluation(model, test_data):
 time_model_evaluation(quantized_model, test_data)
 
 ######################################################################
+# Running this locally on a MacBook Pro, without quantization, inference takes about 200 seconds,
+# and with quantization it takes just about 100 seconds.
+#
 # Conclusion
 # ----------
 #
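The pattern the commit adds can be sketched in isolation. This is a minimal, self-contained example (not the tutorial's LSTM language model; the small `nn.Sequential` network, tensor sizes, and iteration count are illustrative assumptions): pin inference to a single thread, dynamically quantize the float model, and time both under the same conditions.

```python
import time
import torch
import torch.nn as nn

# Quantized models run single threaded, so pin the FP32 model to one
# thread as well for a fair comparison (the point of this commit).
torch.set_num_threads(1)

# Hypothetical stand-in model; the tutorial uses an LSTM language model.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10))

# Dynamic quantization: weights are stored as int8 and activations are
# quantized on the fly at inference time.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def time_model_evaluation(model, data, iters=50):
    """Return wall-clock seconds for `iters` forward passes."""
    s = time.time()
    with torch.no_grad():
        for _ in range(iters):
            model(data)
    return time.time() - s

data = torch.randn(32, 256)
fp32_s = time_model_evaluation(model, data)
int8_s = time_model_evaluation(quantized_model, data)
print(f"fp32: {fp32_s:.3f}s  int8: {int8_s:.3f}s")
```

Without `torch.set_num_threads(1)`, the FP32 model would use all available cores while the quantized kernels stay single threaded, which would understate the speedup from quantization.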

0 commit comments
