Remove legacy torchtext code from translation tutorial #1250
Conversation
Deploy preview for pytorch-tutorials-preview ready! Built with commit 087fe98 https://deploy-preview-1250--pytorch-tutorials-preview.netlify.app
        data_.append((de_tensor_, en_tensor_))
    return data_

train_data = data_process(iter(io.open(train_filepaths[0])),
I think merging this into the data_process function and then only passing _filepaths cleans this up a bit
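A rough sketch of what that suggestion could look like, shown only for illustration (the encoding argument and the loop body are assumptions modeled on the surrounding tutorial code, and de_vocab, en_vocab, de_tokenizer, en_tokenizer, and train_filepaths are assumed to be defined earlier in the tutorial):

import io
import torch

def data_process(filepaths):
    # Open the German and English files here, so callers pass only the filepaths.
    raw_de_iter = iter(io.open(filepaths[0], encoding="utf8"))
    raw_en_iter = iter(io.open(filepaths[1], encoding="utf8"))
    data_ = []
    for raw_de, raw_en in zip(raw_de_iter, raw_en_iter):
        de_tensor_ = torch.tensor([de_vocab[token] for token in de_tokenizer(raw_de)],
                                  dtype=torch.long)
        en_tensor_ = torch.tensor([en_vocab[token] for token in en_tokenizer(raw_en)],
                                  dtype=torch.long)
        data_.append((de_tensor_, en_tensor_))
    return data_

train_data = data_process(train_filepaths)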
de_vocab = build_vocab(train_filepaths[0], de_tokenizer)
en_vocab = build_vocab(train_filepaths[1], en_tokenizer)

def data_process(raw_de_iter, raw_en_iter):
How long does this function take? How long does the entire preprocessing step take? Should we tokenize asynchronously while training instead?
offline discussion: we will have a follow-up PR since this PR doesn't focus on the dataloader.
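For reference, one possible shape for that follow-up would be to keep raw (German, English) string pairs in the dataset and tokenize inside the DataLoader's collate_fn, so preprocessing overlaps with training. This is purely a sketch: raw_train_data, PAD_IDX, and collate_raw_batch are hypothetical names, not part of this PR, and de_vocab, en_vocab, and the tokenizers are assumed from the tutorial.

import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import DataLoader

def collate_raw_batch(raw_batch):
    # raw_batch is a list of (German string, English string) pairs; tokenize and
    # numericalize per batch inside the DataLoader instead of up front.
    de_batch, en_batch = [], []
    for raw_de, raw_en in raw_batch:
        de_batch.append(torch.tensor([de_vocab[t] for t in de_tokenizer(raw_de)],
                                     dtype=torch.long))
        en_batch.append(torch.tensor([en_vocab[t] for t in en_tokenizer(raw_en)],
                                     dtype=torch.long))
    return (pad_sequence(de_batch, padding_value=PAD_IDX),
            pad_sequence(en_batch, padding_value=PAD_IDX))

# raw_train_data: a list of (German string, English string) pairs.
train_iter = DataLoader(raw_train_data, batch_size=128, shuffle=True,
                        num_workers=2, collate_fn=collate_raw_batch)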
en_vocab = build_vocab(train_filepaths[1], en_tokenizer)

def data_process(raw_de_iter, raw_en_iter):
    data_ = []
nit: remove the trailing underscores
cc @brianjo
data from a well-known dataset containing sentences in both English and German and use it to
train a sequence-to-sequence model with attention that can translate German sentences
into English.

It is based off of
`this tutorial <https://github.com/bentrevett/pytorch-seq2seq/blob/master/3%20-%20Neural%20Machine%20Translation%20by%20Jointly%20Learning%20to%20Align%20and%20Translate.ipynb>`__
from PyTorch community member `Ben Trevett <https://github.com/bentrevett>`__
and was created by `Seth Weidman <https://github.com/SethHWeidman/>`__
with Ben's permission. We update the tutorials by removing some legecy code.
nit: legacy instead of legecy
Force-pushed from 0d504ea to 8c817f1
Force-pushed from a3b3e27 to 2f43983
* checkpoint
* checkpoint
* minor changes with review's feedback
* fix typo
* Fix ascii decode error

Co-authored-by: Guanheng Zhang <zhangguanheng@devfair0197.h2.fair>
Co-authored-by: Brian Johnson <brianjo@fb.com>
No description provided.