Skip to content

IPEX launch script performance in docker #249

Open
@Peach-He

Description

@Peach-He

Hi IPEX team,

Thanks for excellent project, I'm having weird performance issue when using IPEX launch scripts in docker, that disabling thread affinity delivers 1.28x training speedup which is unexpected. Do you know the possible root cause?

Bellow is detailed information:

  • system config: 4 nodes ICX, pytorch 1.10, ipex 1.10
  • docker image based on intel oneapi-aikit
  • training scripts: python -m intel_extension_for_pytorch.cpu.launch --distributed --nproc_per_node=2 --nnodes=4 --hostfile hosts train.py

How I disable thread affinity:

# modify launch.py, comment out following code
self.set_env("KMP_AFFINITY", "granularity=fine,compact,1,0")
self.set_env("I_MPI_PIN_DOMAIN", mpi_pin_domain)
self.set_env("CCL_WORKER_AFFINITY", ccl_affinity)

Looking forward to your reply, thanks.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions