
Commit 8496bc8

Merge branch 'master' into dc-doc
2 parents 48efb9f + 8ac4017

File tree

1 file changed: +3 -1 lines changed


intermediate_source/torchvision_tutorial.rst

Lines changed: 3 additions & 1 deletion
@@ -32,7 +32,7 @@ should return:
 - ``boxes (FloatTensor[N, 4])``: the coordinates of the ``N``
   bounding boxes in ``[x0, y0, x1, y1]`` format, ranging from ``0``
   to ``W`` and ``0`` to ``H``
-- ``labels (Int64Tensor[N])``: the label for each bounding box
+- ``labels (Int64Tensor[N])``: the label for each bounding box. ``0`` always represents the background class.
 - ``image_id (Int64Tensor[1])``: an image identifier. It should be
   unique between all the images in the dataset, and is used during
   evaluation
@@ -56,6 +56,8 @@ If your model returns the above methods, they will make it work for both
 training and evaluation, and will use the evaluation scripts from
 ``pycocotools``.
 
+One note on the ``labels``. The model considers class ``0`` as background. If your dataset does not contain the background class, you should not have ``0`` in your ``labels``. For example, assuming you have just two classes, *cat* and *dog*, you can define ``1`` (not ``0``) to represent *cats* and ``2`` to represent *dogs*. So, for instance, if one of the images has both classes, your ``labels`` tensor should look like ``[1, 2]``.
+
 Additionally, if you want to use aspect ratio grouping during training
 (so that each batch only contains images with similar aspect ratio),
 then it is recommended to also implement a ``get_height_and_width``
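
The labeling convention added in this change can be illustrated with a minimal sketch. This is not part of the commit itself; it assumes the two-class *cat*/*dog* example from the added note, uses made-up box coordinates and image id, and shows only the target fields mentioned in the diff:

```python
import torch

# Hypothetical target dict for one image that contains both classes from the
# example above. Per the convention described in this change, class 0 is
# reserved for the background, so the dataset classes start at 1:
# cat -> 1, dog -> 2.
target = {
    # [x0, y0, x1, y1] coordinates, ranging from 0 to W and 0 to H
    # (the numbers below are made up for illustration)
    "boxes": torch.tensor(
        [[ 50.0,  30.0, 200.0, 180.0],   # cat
         [220.0,  40.0, 400.0, 260.0]],  # dog
        dtype=torch.float32,
    ),
    # one label per box; note that 0 never appears here
    "labels": torch.tensor([1, 2], dtype=torch.int64),
    # unique image identifier, used during pycocotools evaluation
    "image_id": torch.tensor([7], dtype=torch.int64),
}
```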
