Commit 8ac4017

Francesco Saverio Zuppichini authored
Improved Fast RCNN tutorial (#914)
* improve doc for mask rcnn - add small description on the labels classes
* minor changes
1 parent 36ea6fc commit 8ac4017

1 file changed

+3 -1 lines changed

intermediate_source/torchvision_tutorial.rst

Lines changed: 3 additions & 1 deletion
@@ -32,7 +32,7 @@ should return:
 - ``boxes (FloatTensor[N, 4])``: the coordinates of the ``N``
   bounding boxes in ``[x0, y0, x1, y1]`` format, ranging from ``0``
   to ``W`` and ``0`` to ``H``
-- ``labels (Int64Tensor[N])``: the label for each bounding box
+- ``labels (Int64Tensor[N])``: the label for each bounding box. ``0`` always represents the background class.
 - ``image_id (Int64Tensor[1])``: an image identifier. It should be
   unique between all the images in the dataset, and is used during
   evaluation
@@ -56,6 +56,8 @@ If your model returns the above methods, they will make it work for both
 training and evaluation, and will use the evaluation scripts from
 ``pycocotools``.
 
+One note on the ``labels``. The model considers class ``0`` as background. If your dataset does not contain the background class, you should not have ``0`` in your ``labels``. For example, assuming you have just two classes, *cat* and *dog*, you can define ``1`` (not ``0``) to represent *cats* and ``2`` to represent *dogs*. So, for instance, if one of the images has both classes, your ``labels`` tensor should look like ``[1,2]``.
+
 Additionally, if you want to use aspect ratio grouping during training
 (so that each batch only contains images with similar aspect ratio),
 then it is recommended to also implement a ``get_height_and_width``
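
The label convention added in this commit can be illustrated with a minimal sketch. It is not part of the tutorial: the names `CLASS_NAMES`, `CLASS_TO_LABEL`, `encode_labels`, and `check_labels` are hypothetical, and plain Python lists stand in for the tensor so the snippet runs without any dependencies.

```python
# Hypothetical class names for illustration; label 0 is reserved for background.
CLASS_NAMES = ["cat", "dog"]

# Enumerate from 1 so that no real class collides with the background label 0.
CLASS_TO_LABEL = {name: idx for idx, name in enumerate(CLASS_NAMES, start=1)}


def encode_labels(object_names):
    """Map the class name of each annotated object to its integer label."""
    return [CLASS_TO_LABEL[name] for name in object_names]


def check_labels(labels):
    """Raise if any annotated object carries the reserved background label."""
    if any(label == 0 for label in labels):
        raise ValueError("label 0 is reserved for the background class")
    return labels


# An image that contains one cat and one dog.
labels = check_labels(encode_labels(["cat", "dog"]))
print(labels)  # [1, 2]
```

In a dataset's `__getitem__`, a list like this would typically be converted with `torch.as_tensor(labels, dtype=torch.int64)` to match the `Int64Tensor[N]` type described in the diff.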
