From 14550b1b125d85878dbc3c145398ba5d62649653 Mon Sep 17 00:00:00 2001
From: Andrea Tupini
Date: Fri, 27 Jan 2023 09:36:25 -0600
Subject: [PATCH] Fix typo and formatting in bettertransformer_tutorial.rst

---
 beginner_source/bettertransformer_tutorial.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/beginner_source/bettertransformer_tutorial.rst b/beginner_source/bettertransformer_tutorial.rst
index 10302331b36..96249d88651 100644
--- a/beginner_source/bettertransformer_tutorial.rst
+++ b/beginner_source/bettertransformer_tutorial.rst
@@ -18,7 +18,7 @@ been updated to use the core library modules to benefit from fastpath accelerati
 
 Better Transformer offers two types of acceleration:
 
-* Native multihead attention implementation for CPU and GPU to improvee overall execution efficiency.
+* Native multihead attention (MHA) implementation for CPU and GPU to improve overall execution efficiency.
 * Exploiting sparsity in NLP inference. Because of variable input lengths, input
   tokens may contain a large number of padding tokens for which processing may be
   skipped, delivering significant speedups.
@@ -124,6 +124,7 @@ Finally, we set the benchmark iteration count:
 2.1 Run and benchmark inference on CPU with and without BT fastpath (native MHA only)
 
 We run the model on CPU, and collect profile information:
+
 * The first run uses traditional ("slow path") execution.
 * The second run enables BT fastpath execution by putting the model in inference mode
   using `model.eval()` and disables gradient collection with `torch.no_grad()`.
@@ -167,6 +168,7 @@ We disable the BT sparsity:
 
 We run the model on DEVICE, and collect profile information for native MHA execution
 on DEVICE:
+
 * The first run uses traditional ("slow path") execution.
 * The second run enables BT fastpath execution by putting the model in inference mode
   using `model.eval()` and disables gradient collection with `torch.no_grad()`.
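
Both profiling steps touched by this patch follow the same pattern: one run on the traditional ("slow path"), then one run with the model in inference mode via `model.eval()` and gradients disabled via `torch.no_grad()`, which is what makes the native MHA fastpath eligible. The following is a minimal sketch of that pattern, not part of the patch or of the tutorial's actual code: it uses a plain `torch.nn.TransformerEncoder` stand-in, and the layer sizes, padding mask, and iteration count are assumptions made for illustration only::

    import time

    import torch
    import torch.nn as nn

    # Hypothetical stand-in for the tutorial's model: sizes, the padding mask,
    # and ITERATIONS are assumptions made for this sketch only.
    encoder_layer = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)
    model = nn.TransformerEncoder(encoder_layer, num_layers=2)

    x = torch.rand(32, 64, 256)                          # (batch, sequence, embedding)
    padding_mask = torch.zeros(32, 64, dtype=torch.bool)
    padding_mask[:, 48:] = True                          # pretend the last 16 tokens are padding

    ITERATIONS = 100                                     # assumed benchmark iteration count

    def bench(label):
        start = time.perf_counter()
        for _ in range(ITERATIONS):
            model(x, src_key_padding_mask=padding_mask)
        print(f"{label}: {time.perf_counter() - start:.3f} s")

    # First run: traditional ("slow path") execution.
    bench("slow path")

    # Second run: the native MHA fastpath becomes eligible once the model is in
    # inference mode (model.eval()) and gradient collection is disabled
    # (torch.no_grad()); the padding mask lets the sparsity optimization skip
    # padded positions.
    model.eval()
    with torch.no_grad():
        bench("BT fastpath")

This mirrors the two kinds of acceleration named in the first hunk: the native MHA kernel speeds up execution on CPU and GPU, while the padding mask allows the sparsity path to skip padded tokens when input lengths vary.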