* Integrate accelerator abstraction interface into deepspeed/
* Fix error message in fp16/fused_optimizer
* Fix error message in fp16/unfused_optimizer.py
* Assign the get_accelerator().pin_memory() result back to the input tensor name
* No need to check for CUDA or whether nvtx is supported
* Move try-except into the innermost block
* Call Event() and Stream() through get_accelerator() where they are used as data types
* Make Stream and Event properties of the abstract interface so they can be used as data types in deepspeed
* Apply the op_builder backend API change from #2705 by @jeffra
* Fix tests where Builder.NAME is used
* Keep the original ...Builder.NAME interface instead of the ...Builder().NAME interface
* Fix builder closure for installation
* Fix random_ltd builder
* Add comments to clarify create_op_builder and get_op_builder
* Fix compatibility with pip install -e
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
* Abstract accelerator (step 2)
* More flexible op_builder path for both installation and runtime
* Add SpatialInferenceBuilder to cuda_accelerator.py
* Use reflection so cuda_accelerator adapts to CUDA op builder changes automatically
* Clean up deepspeed/__init__.py
* Add comments in cuda_accelerator for the no-torch path
* Update deepspeed/env_report.py per review suggestion
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
* Reduce the scope of try...except for better code clarity
* Port deepspeed/ops/random_ltd/dropping_utils.py to the accelerator abstraction
* Move accelerator to the top-level directory and create a symlink under deepspeed
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Michael Wyatt <mrwyattii@gmail.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>