Describe the issue
Hello, I'm trying to make `accelerate` work with IPEX and `device_map="auto"`. In the `accelerate` source file `utils/modeling.py`, `get_max_memory` contains:
```python
def get_max_memory(max_memory: Optional[Dict[Union[int, str], Union[int, str]]] = None):
    # ... (other code) ...
    elif is_xpu_available():
        for i in range(torch.xpu.device_count()):
            _ = torch.tensor(0, device=torch.device("xpu", i))
        max_memory = {i: torch.xpu.max_memory_allocated(i) for i in range(torch.xpu.device_count())}
    # ... (other code) ...
```
`torch.xpu.max_memory_allocated` only reports memory allocated by the current process, so at startup it is always a tiny value; the device is therefore treated as having almost no memory and is effectively ignored.
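For context, here is a minimal sketch of the mismatch (run on a machine with at least one XPU device; the printed value is illustrative):

```python
import torch
import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the torch.xpu namespace

# Reproduce what accelerate does at startup: touch the device, then ask
# for the peak memory allocated so far *by this process*.
_ = torch.tensor(0, device=torch.device("xpu", 0))
print(torch.xpu.max_memory_allocated(0))  # tiny value, e.g. a few hundred bytes

# For comparison, on CUDA accelerate can query device-wide memory instead:
#   free_bytes, total_bytes = torch.cuda.mem_get_info(0)
# There is no equivalent call in the torch.xpu namespace.
```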
According to the documentation (https://intel.github.io/intel-extension-for-pytorch/xpu/latest/tutorials/api_doc.html), there is no way to query device-wide allocated or free memory. Would you consider adding a method like `torch.cuda.mem_get_info` that directly returns global memory information? Thanks!
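In the meantime, it seems possible to bypass the auto-detection entirely by passing an explicit `max_memory` dict, which the quoted `get_max_memory` already accepts. A sketch, assuming a single XPU device; the model name and memory limits below are placeholders:

```python
from accelerate import infer_auto_device_map, init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("gpt2")  # placeholder model
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

# Key 0 maps to the first accelerator (here: xpu:0); the sizes are
# placeholders for the device's real capacity and host RAM.
device_map = infer_auto_device_map(model, max_memory={0: "16GiB", "cpu": "64GiB"})
```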