@@ -320,6 +320,7 @@ The purpose for calibration is to run through some sample examples that is repre
the statistics of the Tensors and we can later use this information to calculate quantization parameters.
.. code:: python
+
    def calibrate(model, data_loader):
        model.eval()
        with torch.no_grad():
@@ -329,17 +330,19 @@ the statistics of the Tensors and we can later use this information to calculate
7. Convert the Model to a Quantized Model
-----------------------------------------
-``convert_fx `` takes a calibrated model and produces a quantized model.
+``convert_fx`` takes a calibrated model and produces a quantized model.
.. code:: python
- quantized_model = convert_fx(prepared_model)
+
+ quantized_model = convert_fx(prepared_model)
print(quantized_model)
-
+
8. Evaluation
-------------
We can now print the size and accuracy of the quantized model.
.. code:: python
+
print("Size of model before quantization")
print_size_of_model(float_model)
print("Size of model after quantization")
@@ -381,6 +384,7 @@ we'll first call fuse explicitly to fuse the conv and bn in the model:
Note that ``fuse_fx `` only works in eval mode.
.. code:: python
+
fused = fuse_fx(float_model)

conv1_weight_after_fuse = fused.conv1[0].weight[0]
@@ -392,6 +396,7 @@ Note that ``fuse_fx`` only works in eval mode.
--------------------------------------------------------------------
.. code:: python
+
scripted_float_model_file = "resnet18_scripted.pth"

print("Size of baseline model")
@@ -406,6 +411,7 @@ quantized in eager mode. FX graph mode and eager mode produce very similar quant
so the expectation is that the accuracy and speedup are similar as well.
.. code:: python
+
print("Size of Fx graph mode quantized model")
print_size_of_model(quantized_model)
top1, top5 = evaluate(quantized_model, criterion, data_loader_test)