Example Tensor Parallelism Optimizer Bug #1325

Open
@nrothGIT

Description

📚 Documentation

I believe the optimizer in this example should be declared after the `parallelize_module` call, as is done in the sequence parallelism example. Without this, with the latest torch the example does not actually update the weights and thus does not truly train. Please let me know if I'm missing anything, and thanks so much for all your work!

Tiny fix PR below:
#1324
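A minimal sketch (not the repo's actual example code) of why the ordering matters: `parallelize_module` replaces the module's `nn.Parameter` objects with DTensor-backed ones, so an optimizer built beforehand keeps references to the old, now-unused tensors. The swap below simulates that replacement with a plain parameter reassignment.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4, bias=False)

# BUG pattern: optimizer built first captures the *original* parameter tensor.
opt = torch.optim.SGD(model.parameters(), lr=0.1)

# Simulate parallelize_module swapping in a new parameter object.
model.weight = nn.Parameter(model.weight.detach().clone())

stale = opt.param_groups[0]["params"][0]
assert stale is not model.weight  # optimizer would step a dead tensor

# FIX: construct the optimizer after the swap so it tracks the live params.
opt = torch.optim.SGD(model.parameters(), lr=0.1)
assert opt.param_groups[0]["params"][0] is model.weight
```

The same identity check can be used to confirm the fix in the real example: every tensor in `opt.param_groups` should be the same object as the corresponding entry in `model.parameters()` after parallelization.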
