Commit cee7f10

Merge branch 'main' into gh/HDCharles/2/base
2 parents 8a32b71 + 9ac7d09 commit cee7f10

15 files changed (+342, -863 lines changed)

.jenkins/validate_tutorials_built.py

Lines changed: 0 additions & 2 deletions
@@ -27,12 +27,10 @@
  "intermediate_source/mnist_train_nas", # used by ax_multiobjective_nas_tutorial.py
  "intermediate_source/fx_conv_bn_fuser",
  "intermediate_source/_torch_export_nightly_tutorial", # does not work on release
- "intermediate_source/torch_export_tutorial", # Enable when fixed for 2.2
  "advanced_source/super_resolution_with_onnxruntime",
  "advanced_source/ddp_pipeline", # requires 4 gpus
  "advanced_source/usb_semisup_learn", # in the current form takes 140+ minutes to build - can be enabled when the build time is reduced
  "prototype_source/fx_graph_mode_ptq_dynamic",
- "prototype_source/maskedtensor_sparsity", # Enable when fixed for 2.2
  "prototype_source/vmap_recipe",
  "prototype_source/torchscript_freezing",
  "prototype_source/nestedtensor",
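
Reviewer note: the hunk above removes two entries from what appears to be a list of tutorials excluded from build validation. The sketch below shows how such a skip list might be consulted. The names `NOT_RUN` and `needs_validation` are assumptions made for illustration and are not taken from the diff, and the real script may match paths differently.

```python
# Hypothetical skip list mirroring a few entries visible in the hunk above.
NOT_RUN = [
    "intermediate_source/fx_conv_bn_fuser",
    "advanced_source/ddp_pipeline",  # requires 4 gpus
    "prototype_source/vmap_recipe",
]


def needs_validation(tutorial, skip_list=NOT_RUN):
    """Return True when the tutorial is not on the skip list."""
    return tutorial not in skip_list


# An entry removed from the list is validated again in this sketch,
# while an entry that stays on the list is still skipped.
print(needs_validation("intermediate_source/torch_export_tutorial"))
print(needs_validation("advanced_source/ddp_pipeline"))
```

Under this assumption, deleting a line from the list is all it takes to bring a tutorial back under validation.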

.pyspelling.yml

Lines changed: 3 additions & 0 deletions
@@ -45,6 +45,9 @@ matrix:
  - open: '\.\. (code-block|math)::.*$\n*'
  content: '(?P<first>(^(?P<indent>[ ]+).*$\n))(?P<other>(^([ \t]+.*|[ \t]*)$\n)*)'
  close: '(^(?![ \t]+.*$))'
+ # Ignore references like "[1] Author: Title"
+ - open: '\[\d\]'
+ close: '\n'
  - pyspelling.filters.markdown:
  - pyspelling.filters.html:
  ignores:
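
Reviewer note: the new delimiter pair opens on `\[\d\]` and closes at the next newline. The sketch below emulates that pair with a single Python regular expression to show which lines it would cover. This is an approximation for illustration, not pyspelling's own implementation, and `strip_references` is a hypothetical helper name.

```python
import re

# Emulates the open/close pair as one pattern: "[", one digit, "]",
# then everything up to and including the next newline.
REFERENCE = re.compile(r"\[\d\].*?\n")


def strip_references(text):
    """Blank out spans that the open/close pair would exclude from spell checking."""
    return REFERENCE.sub("\n", text)


sample = (
    "See the references below.\n"
    "[1] Kihyuk Sohn: FixMatch\n"
    "[12] Another Author: Another Title\n"
)
print(strip_references(sample))
```

Because `\d` matches exactly one digit, a reference numbered `[12]` is not covered in this sketch. If tutorials with ten or more references need the same treatment, `\[\d+\]` would be one option.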

Makefile

Lines changed: 5 additions & 0 deletions
@@ -86,6 +86,9 @@ download:
  wget -nv -N https://www.manythings.org/anki/deu-eng.zip -P $(DATADIR)
  unzip -o $(DATADIR)/deu-eng.zip -d beginner_source/data/

+ # Download PennFudanPed dataset for intermediate_source/torchvision_tutorial.py
+ wget https://www.cis.upenn.edu/~jshi/ped_html/PennFudanPed.zip -P $(DATADIR)
+ unzip -o $(DATADIR)/PennFudanPed.zip -d intermediate_source/data/

  docs:
  make download
@@ -103,3 +106,5 @@ html-noplot:
  clean-cache:
  make clean
  rm -rf advanced beginner intermediate recipes
+ # remove additional python files downloaded for torchvision_tutorial.py
+ rm -rf intermediate_source/engine.py intermediate_source/utils.py intermediate_source/transforms.py intermediate_source/coco_eval.py intermediate_source/coco_utils.py
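
Reviewer note: the added `clean-cache` step deletes five helper files and, like `rm -rf`, does not fail when a file is already gone. A rough Python equivalent is sketched below for readers who want to run the same cleanup outside of `make`. The function name `remove_helpers` is hypothetical, and only the file names come from the recipe above.

```python
from pathlib import Path

# File names copied from the clean-cache recipe above.
HELPER_FILES = [
    "engine.py",
    "utils.py",
    "transforms.py",
    "coco_eval.py",
    "coco_utils.py",
]


def remove_helpers(source_dir="intermediate_source"):
    """Delete the helper files if present and return the names actually removed."""
    removed = []
    for name in HELPER_FILES:
        path = Path(source_dir) / name
        if path.is_file():  # missing files are skipped silently, as with rm -rf
            path.unlink()
            removed.append(name)
    return removed
```

Calling `remove_helpers()` from the repository root would target the same directory as the Makefile recipe.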
-612 KB
Binary file not shown.
-12.4 KB
Binary file not shown.
-418 KB
Binary file not shown.
-849 KB
Binary file not shown.

advanced_source/usb_semisup_learn.py

Lines changed: 73 additions & 51 deletions
@@ -1,70 +1,88 @@
  """
  Semi-Supervised Learning using USB built upon PyTorch
- =============================
-
+ =====================================================

  **Author**: `Hao Chen <https://github.com/Hhhhhhao>`_
-
-
- Introduction
- ------------

- USB is a semi-supervised learning framework built upon PyTorch.
- Based on Datasets and Modules provided by PyTorch, USB becomes a flexible, modular, and easy-to-use framework for semi-supervised learning.
- It supports a variety of semi-supervised learning algorithms, including FixMatch, FreeMatch, DeFixMatch, SoftMatch, etc.
+ Unified Semi-supervised learning Benchmark (USB) is a semi-supervised
+ learning framework built upon PyTorch.
+ Based on Datasets and Modules provided by PyTorch, USB becomes a flexible,
+ modular, and easy-to-use framework for semi-supervised learning.
+ It supports a variety of semi-supervised learning algorithms, including
+ ``FixMatch``, ``FreeMatch``, ``DeFixMatch``, ``SoftMatch``, and so on.
  It also supports a variety of imbalanced semi-supervised learning algorithms.
- The benchmark results across different datasets of computer vision, natural language processing, and speech processing are included in USB.
+ The benchmark results across different datasets of computer vision, natural
+ language processing, and speech processing are included in USB.
+
+ This tutorial will walk you through the basics of using the USB lighting
+ package.
+ Let's get started by training a ``FreeMatch``/``SoftMatch`` model on
+ CIFAR-10 using pretrained ViT!
+ And we will show it is easy to change the semi-supervised algorithm and train
+ on imbalanced datasets.

- This tutorial will walk you through the basics of using the usb lighting package.
- Let's get started by training a FreeMatch/SoftMatch model on CIFAR-10 using pre-trained ViT!
- And we will show it is easy to change the semi-supervised algorithm and train on imbalanced datasets.

-
  .. figure:: /_static/img/usb_semisup_learn/code.png
  :alt: USB framework illustration
  """


  ######################################################################
- # Introduction to FreeMatch and SoftMatch in Semi-Supervised Learning
- # --------------------
- # Here we provide a brief introduction to FreeMatch and SoftMatch.
- # First we introduce a famous baseline for semi-supervised learning called FixMatch.
- # FixMatch is a very simple framework for semi-supervised learning, where it utilizes a strong augmentation to generate pseudo labels for unlabeled data.
- # It adopts a confidence thresholding strategy to filter out the low-confidence pseudo labels with a fixed threshold set.
- # FreeMatch and SoftMatch are two algorithms that improve upon FixMatch.
- # FreeMatch proposes adaptive thresholding strategy to replace the fixed thresholding strategy in FixMatch.
- # The adaptive thresholding progressively increases the threshold according to the learning status of the model on each class.
- # SoftMatch absorbs the idea of confidence thresholding as an weighting mechanism.
- # It proposes a Gaussian weighting mechanism to overcome the quantity-quality trade-off in pseudo-labels.
- # In this tutorial, we will use USB to train FreeMatch and SoftMatch.
+ # Introduction to ``FreeMatch`` and ``SoftMatch`` in Semi-Supervised Learning
+ # ---------------------------------------------------------------------------
+ #
+ # Here we provide a brief introduction to ``FreeMatch`` and ``SoftMatch``.
+ # First, we introduce a famous baseline for semi-supervised learning called ``FixMatch``.
+ # ``FixMatch`` is a very simple framework for semi-supervised learning, where it
+ # utilizes a strong augmentation to generate pseudo labels for unlabeled data.
+ # It adopts a confidence thresholding strategy to filter out the low-confidence
+ # pseudo labels with a fixed threshold set.
+ # ``FreeMatch`` and ``SoftMatch`` are two algorithms that improve upon ``FixMatch``.
+ # ``FreeMatch`` proposes adaptive thresholding strategy to replace the fixed
+ # thresholding strategy in ``FixMatch``. The adaptive thresholding progressively
+ # increases the threshold according to the learning status of the model on each
+ # class. ``SoftMatch`` absorbs the idea of confidence thresholding as an
+ # weighting mechanism. It proposes a Gaussian weighting mechanism to overcome
+ # the quantity-quality trade-off in pseudo-labels. In this tutorial, we will
+ # use USB to train ``FreeMatch`` and ``SoftMatch``.


  ######################################################################
- # Use USB to Train FreeMatch/SoftMatch on CIFAR-10 with only 40 labels
- # --------------------
- # USB is a Pytorch-based Python package for Semi-Supervised Learning (SSL).
- # It is easy-to-use/extend, affordable to small groups, and comprehensive for developing and evaluating SSL algorithms.
- # USB provides the implementation of 14 SSL algorithms based on Consistency Regularization, and 15 tasks for evaluation from CV, NLP, and Audio domain.
- # It has a modular design that allows users to easily extend the package by adding new algorithms and tasks.
- # It also supports a python api for easier adaptation to different SSL algorithms on new data.
- #
- #
- # Now, let's use USB to train FreeMatch and SoftMatch on CIFAR-10.
- # First, we need to install USB package ``semilearn`` and import necessary api functions from USB.
+ # Use USB to Train ``FreeMatch``/``SoftMatch`` on CIFAR-10 with only 40 labels
+ # ----------------------------------------------------------------------------
+ #
+ # USB is easy to use and extend, affordable to small groups, and comprehensive
+ # for developing and evaluating SSL algorithms.
+ # USB provides the implementation of 14 SSL algorithms based on Consistency
+ # Regularization, and 15 tasks for evaluation from CV, NLP, and Audio domain.
+ # It has a modular design that allows users to easily extend the package by
+ # adding new algorithms and tasks.
+ # It also supports a Python API for easier adaptation to different SSL
+ # algorithms on new data.
+ #
+ #
+ # Now, let's use USB to train ``FreeMatch`` and ``SoftMatch`` on CIFAR-10.
+ # First, we need to install USB package ``semilearn`` and import necessary API
+ # functions from USB.
  # Below is a list of functions we will use from ``semilearn``:
+ #
  # - ``get_dataset`` to load dataset, here we use CIFAR-10
- # - ``get_data_loader`` to create train (labeled and unlabeled) and test data loaders, the train unlabeled loaders will provide both strong and weak augmentation of unlabeled data
- # - ``get_net_builder`` to create a model, here we use pre-trained ViT
- # - ``get_algorithm`` to create the semi-supervised learning algorithm, here we use FreeMatch and SoftMatch
+ # - ``get_data_loader`` to create train (labeled and unlabeled) and test data
+ # loaders, the train unlabeled loaders will provide both strong and weak
+ # augmentation of unlabeled data
+ # - ``get_net_builder`` to create a model, here we use pretrained ViT
+ # - ``get_algorithm`` to create the semi-supervised learning algorithm,
+ # here we use ``FreeMatch`` and ``SoftMatch``
  # - ``get_config``: to get default configuration of the algorithm
- # - ``Trainer``: a Trainer class for training and evaluating the algorithm on dataset
+ # - ``Trainer``: a Trainer class for training and evaluating the
+ # algorithm on dataset
  #
  import semilearn
  from semilearn import get_dataset, get_data_loader, get_net_builder, get_algorithm, get_config, Trainer

  ######################################################################
- # After importing necessary functions, we first set the hyper-parameters of the algorithm.
+ # After importing necessary functions, we first set the hyper-parameters of the
+ # algorithm.
  #
  config = {
  'algorithm': 'freematch',
@@ -122,27 +140,31 @@


  ######################################################################
- # We can start Train the algorithms on CIFAR-10 with 40 labels now.
+ # We can start training the algorithms on CIFAR-10 with 40 labels now.
  # We train for 4000 iterations and evaluate every 500 iterations.
  #
  trainer = Trainer(config, algorithm)
  trainer.fit(train_lb_loader, train_ulb_loader, eval_loader)


  ######################################################################
- # Finally, let's evaluate the trained model on validation set.
- # After training 4000 iterations with FreeMatch on only 40 labels of CIFAR-10, we obtain a classifier that achieves above 93 accuracy on validation set.
+ # Finally, let's evaluate the trained model on the validation set.
+ # After training 4000 iterations with ``FreeMatch`` on only 40 labels of
+ # CIFAR-10, we obtain a classifier that achieves above 93 accuracy on the validation set.
  trainer.evaluate(eval_loader)



  ######################################################################
- # Use USB to Train SoftMatch with specific imbalanced algorithm on imbalanced CIFAR-10
- # --------------------
+ # Use USB to Train ``SoftMatch`` with specific imbalanced algorithm on imbalanced CIFAR-10
+ # ------------------------------------------------------------------------------------
  #
- # Now let's say we have imbalanced labeled set and unlabeled set of CIFAR-10, and we want to train a SoftMatch model on it.
- # We create an imbalanced labeled set and imbalanced unlabeled set of CIFAR-10, by setting the ``lb_imb_ratio`` and ``ulb_imb_ratio`` to 10.
- # Also we replace the ``algorithm`` with ``softmatch`` and set the ``imbalanced`` to ``True``.
+ # Now let's say we have imbalanced labeled set and unlabeled set of CIFAR-10,
+ # and we want to train a ``SoftMatch`` model on it.
+ # We create an imbalanced labeled set and imbalanced unlabeled set of CIFAR-10,
+ # by setting the ``lb_imb_ratio`` and ``ulb_imb_ratio`` to 10.
+ # Also, we replace the ``algorithm`` with ``softmatch`` and set the ``imbalanced``
+ # to ``True``.
  #
  config = {
  'algorithm': 'softmatch',
@@ -210,7 +232,7 @@


  ######################################################################
- # Finally, let's evaluate the trained model on validation set.
+ # Finally, let's evaluate the trained model on the validation set.
  #
  trainer.evaluate(eval_loader)
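
Reviewer note: the tutorial text in this file contrasts the fixed confidence threshold of ``FixMatch`` with the Gaussian weighting of ``SoftMatch``. The sketch below illustrates that contrast in plain Python. It is a simplified illustration, not USB's implementation: the function names and the 0.95 threshold are assumptions, and the mean and variance are computed from a single batch, whereas a full implementation would likely track running estimates.

```python
import math


def fixed_threshold_mask(confidences, threshold=0.95):
    """Hard mask: keep a pseudo label only if its confidence reaches the threshold."""
    return [1.0 if c >= threshold else 0.0 for c in confidences]


def gaussian_weights(confidences):
    """Soft weights: full weight at or above the batch mean, Gaussian decay below it."""
    mean = sum(confidences) / len(confidences)
    var = sum((c - mean) ** 2 for c in confidences) / len(confidences)
    var = max(var, 1e-8)  # guard against a zero variance
    return [
        1.0 if c >= mean else math.exp(-((c - mean) ** 2) / (2 * var))
        for c in confidences
    ]


# Maximum softmax probabilities for three hypothetical unlabeled samples.
confidences = [0.99, 0.90, 0.60]
print(fixed_threshold_mask(confidences))
print(gaussian_weights(confidences))
```

The hard mask discards every sample below the threshold, while the soft weights keep low-confidence samples with a reduced contribution. This is the quantity-quality trade-off that the tutorial text mentions.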

en-wordlist.txt

Lines changed: 18 additions & 0 deletions
@@ -1,3 +1,9 @@
+ SSL
+ ViT
+ Hao
+ Chen
+ Yidong
+ Wang
  NDK
  Backpropagating
  multinode
@@ -10,6 +16,7 @@ RRef
  OOM
  subfolder
  Dialogs
+ PennFudan
  performant
  multithreading
  linearities
@@ -30,6 +37,8 @@ breakpoint
  MobileNet
  DeepLabV
  Resampling
+ RCNN
+ RPN
  APIs
  ATen
  AVX
@@ -124,6 +133,7 @@ JSON
  JVP
  Jacobian
  Kiuk
+ Kihyuk
  Kubernetes
  Kuei
  LSTM
@@ -138,6 +148,7 @@ LRSchedulers
  Lua
  Luong
  macos
+ mAP
  MLP
  MLPs
  MNIST
@@ -171,10 +182,12 @@ OU
  PIL
  PPO
  Plotly
+ pre
  Prec
  Profiler
  PyTorch's
  RGB
+ RGBA
  RL
  RNN
  RNNs
@@ -195,6 +208,7 @@ SciPy
  Sequentials
  Sigmoid
  SoTA
+ Sohn
  Spacy
  TPU
  TensorBoard
@@ -337,6 +351,7 @@ jit
  jitter
  jpg
  judgements
+ keypoint
  kwargs
  labelled
  learnable
@@ -417,6 +432,7 @@ reinitializes
  relu
  reproducibility
  rescale
+ rescaling
  resnet
  restride
  rewinded
@@ -468,10 +484,12 @@ torchscriptable
  torchtext
  torchtext's
  torchvision
+ TorchVision
  torchviz
  traceback
  tradeoff
  tradeoffs
+ uint
  uncomment
  uncommented
  underflowing
