
Commit c23e0a4

Update dynamic quant tutorial for saving quantized model
Summary: Addresses pytorch/pytorch#43016
1 parent 6973eb2 commit c23e0a4

File tree

1 file changed: 14 additions, 6 deletions


intermediate_source/dynamic_quantization_bert_tutorial.rst

Lines changed: 14 additions & 6 deletions
@@ -492,7 +492,7 @@ follows:
 
 | Prec | F1 score | Model Size | 1 thread | 4 threads |
 | FP32 | 0.9019   | 438 MB     | 160 sec  | 85 sec    |
-| INT8 | 0.8953   | 181 MB     | 90 sec   | 46 sec    |
+| INT8 | 0.902    | 181 MB     | 90 sec   | 46 sec    |
 
 We have 0.6% lower F1 score accuracy after applying the post-training dynamic
 quantization on the fine-tuned BERT model on the MRPC task. As a
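For context while reading the diff: the INT8 row is produced by post-training dynamic quantization of the fine-tuned FP32 model. A minimal sketch of that step, assuming the HuggingFace `transformers` API and a hypothetical checkpoint path `./mrpc_output/` (neither appears in this commit):

.. code:: python

    import torch
    from transformers import BertForSequenceClassification

    # Load the fine-tuned FP32 model (hypothetical path).
    model = BertForSequenceClassification.from_pretrained("./mrpc_output/")
    model.eval()

    # Quantize the weights of all nn.Linear modules to INT8; activations
    # are quantized dynamically at inference time.
    quantized_model = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )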
@@ -520,15 +520,23 @@ processing the evaluation of MRPC dataset.
 3.3 Serialize the quantized model
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-We can serialize and save the quantized model for the future use.
+We can serialize and save the quantized model for future use using
+`torch.jit.save` after tracing the model.
 
 .. code:: python
 
-    quantized_output_dir = configs.output_dir + "quantized/"
-    if not os.path.exists(quantized_output_dir):
-        os.makedirs(quantized_output_dir)
-    quantized_model.save_pretrained(quantized_output_dir)
+    input_ids = ids_tensor([8, 128], 2)
+    token_type_ids = ids_tensor([8, 128], 2)
+    attention_mask = ids_tensor([8, 128], vocab_size=2)
+    dummy_input = (input_ids, attention_mask, token_type_ids)
+    traced_model = torch.jit.trace(quantized_model, dummy_input)
+    torch.jit.save(traced_model, "bert_traced_eager_quant.pt")
 
+To load the quantized model, we can use `torch.jit.load`:
+
+.. code:: python
+
+    loaded_quantized_model = torch.jit.load("bert_traced_eager_quant.pt")
 
 Conclusion
 ----------
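Expanded into a self-contained sketch, the save/load flow this commit adds looks like the following. Assumptions are marked: the body of `ids_tensor` stands in for the random-token helper the tutorial takes from the HuggingFace test utilities, `quantized_model` is the dynamically quantized model from the sketch above, and depending on the `transformers` version the model may also need `torchscript=True` in its config so tracing sees tuple outputs.

.. code:: python

    import torch

    def ids_tensor(shape, vocab_size):
        # Stand-in helper (assumption): random token ids in [0, vocab_size),
        # which is all that tracing needs as example input.
        return torch.randint(0, vocab_size, tuple(shape), dtype=torch.long)

    input_ids = ids_tensor([8, 128], 2)
    token_type_ids = ids_tensor([8, 128], 2)
    attention_mask = ids_tensor([8, 128], vocab_size=2)

    # BERT's forward signature is (input_ids, attention_mask, token_type_ids),
    # so the dummy inputs are ordered accordingly.
    dummy_input = (input_ids, attention_mask, token_type_ids)

    # Trace the eager-mode quantized model and serialize the TorchScript
    # program, weights included.
    traced_model = torch.jit.trace(quantized_model, dummy_input)
    torch.jit.save(traced_model, "bert_traced_eager_quant.pt")

    # Reload later without any of the original model-construction code.
    loaded_quantized_model = torch.jit.load("bert_traced_eager_quant.pt")
    with torch.no_grad():
        outputs = loaded_quantized_model(input_ids, attention_mask, token_type_ids)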
