CUDA Memory Operators¶
Tensor new_managed_tensor(const Tensor &self, const std::vector<std::int64_t> &sizes)¶

Allocate an at::Tensor with unified managed memory (UVM). Then set its preferred storage location to CPU (host memory) and establish mappings on the CUDA device to the host memory.

Parameters:
- self – The input tensor
- sizes – The target tensor dimensions

Returns:
A new tensor backed by UVM
 
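A minimal C++ sketch of calling this operator. The forward declaration simply repeats the documented signature (in real code it would come from the fbgemm_gpu headers); the assumption that self only supplies the dtype/device options for the new allocation, and the helper name make_uvm_cache, are illustrative rather than taken from this reference:

    #include <ATen/ATen.h>
    #include <cstdint>
    #include <vector>

    using Tensor = at::Tensor;

    // Documented signature, forward-declared here for the sketch.
    Tensor new_managed_tensor(const Tensor &self, const std::vector<std::int64_t> &sizes);

    Tensor make_uvm_cache() {
      // A small CUDA tensor used only to carry the dtype/device of the new
      // allocation (assumption: `self` is not otherwise read).
      const Tensor proto = at::empty({0}, at::dtype(at::kFloat).device(at::kCUDA));
      // Allocate a 1024 x 256 float tensor backed by UVM, preferred on the host.
      return new_managed_tensor(proto, {1024, 256});
    }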
Tensor new_managed_tensor_meta(const Tensor &self, const std::vector<std::int64_t> &sizes)¶

Placeholder operator for the Meta dispatch key.

Parameters:
- self – The input tensor
- sizes – The target tensor dimensions

Returns:
A new empty tensor
 
Tensor new_host_mapped_tensor(const Tensor &self, const std::vector<std::int64_t> &sizes)¶

Allocate an at::Tensor with host-mapped memory.

Parameters:
- self – The input tensor
- sizes – The target tensor dimensions

Returns:
A new tensor backed by host-mapped memory
 
Tensor new_unified_tensor(const Tensor &self, const std::vector<std::int64_t> &sizes, bool is_host_mapped)¶

Allocate an at::Tensor with either unified managed memory (UVM) or host-mapped memory.

Parameters:
- self – The input tensor
- sizes – The target tensor dimensions
- is_host_mapped – Whether to allocate UVM or host-mapped memory

Returns:
A new tensor backed by UVM or host-mapped memory, depending on the value of is_host_mapped
 
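A hedged sketch of how the is_host_mapped flag selects the backing memory. The forward declaration repeats the documented signature; the helper name make_buffer and the buffer shape are hypothetical:

    #include <ATen/ATen.h>
    #include <cstdint>
    #include <vector>

    using Tensor = at::Tensor;

    // Documented signature, forward-declared here for the sketch.
    Tensor new_unified_tensor(
        const Tensor &self,
        const std::vector<std::int64_t> &sizes,
        bool is_host_mapped);

    // Allocate a 4096 x 128 buffer backed either by UVM (is_host_mapped == false)
    // or by host-mapped memory (is_host_mapped == true).
    Tensor make_buffer(const Tensor &proto, bool is_host_mapped) {
      return new_unified_tensor(proto, {4096, 128}, is_host_mapped);
    }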
Tensor new_unified_tensor_meta(const Tensor &self, const std::vector<std::int64_t> &sizes, bool is_host_mapped)¶

Placeholder operator for the Meta dispatch key for new_unified_tensor.

Parameters:
- self – The input tensor
- sizes – The target tensor dimensions
- is_host_mapped – Whether to allocate UVM or host-mapped memory

Returns:
A new tensor backed by UVM or host-mapped memory, depending on the value of is_host_mapped
 
Tensor new_vanilla_managed_tensor(const Tensor &self, const std::vector<std::int64_t> &sizes)¶

Allocate an at::Tensor with unified managed memory (UVM), but allow for its preferred storage location to be automatically managed.

Parameters:
- self – The input tensor
- sizes – The target tensor dimensions

Returns:
A new tensor backed by UVM
 
bool uvm_storage(const Tensor &self)¶

Check if a tensor is allocated with UVM (either CPU or GPU tensor).

Parameters:
- self – The input tensor

Returns:
true if the tensor is allocated with UVM, otherwise false
 
bool is_uvm_tensor(const Tensor &self)¶

Check if a tensor is allocated with UVM, BUT is not a CPU tensor.

Parameters:
- self – The input tensor

Returns:
true if the tensor is a non-CPU tensor allocated with UVM, otherwise false
 
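A short C++ sketch of the difference between the two predicates; the signatures are repeated from above, and the helper name describe is hypothetical:

    #include <ATen/ATen.h>

    using Tensor = at::Tensor;

    // Documented signatures, forward-declared here for the sketch.
    bool uvm_storage(const Tensor &self);
    bool is_uvm_tensor(const Tensor &self);

    const char *describe(const Tensor &t) {
      if (is_uvm_tensor(t)) {
        return "non-CPU tensor backed by UVM";     // both predicates are true
      } else if (uvm_storage(t)) {
        return "CPU tensor whose storage is UVM";  // only uvm_storage() is true
      }
      return "ordinary (non-UVM) allocation";
    }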
Tensor uvm_to_cpu(const Tensor &self)¶

Convert a UVM tensor to a CPU tensor.

Parameters:
- self – The input tensor

Returns:
A new tensor that is effectively the input moved from UVM to CPU
 
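A sketch of using the converted tensor for host-side work; the helper name host_side_sum is hypothetical. Contrast this with uvm_to_cpu_clone further below, which makes an explicit copy:

    #include <ATen/ATen.h>

    using Tensor = at::Tensor;

    // Documented signature, forward-declared here for the sketch.
    Tensor uvm_to_cpu(const Tensor &self);

    double host_side_sum(const Tensor &uvm_tensor) {
      // Re-expose the UVM allocation as a CPU tensor, then reduce on the host.
      const Tensor cpu_tensor = uvm_to_cpu(uvm_tensor);
      return cpu_tensor.sum().item<double>();
    }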
Tensor uvm_to_device(const Tensor &self, const Tensor &prototype)¶

Create a new UVM tensor that shares the same device and UVM storage with prototype.

Parameters:
- self – The input tensor
- prototype – The target tensor whose device and UVM storage will be shared with the new tensor

Returns:
A new tensor that shares the same device and UVM storage with prototype.
 
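A sketch of re-targeting a UVM tensor to another tensor's device; the exact sharing semantics between self and prototype follow my reading of the description above, and the helper name is hypothetical:

    #include <ATen/ATen.h>

    using Tensor = at::Tensor;

    // Documented signature, forward-declared here for the sketch.
    Tensor uvm_to_device(const Tensor &self, const Tensor &prototype);

    Tensor view_on_device_of(const Tensor &uvm_tensor, const Tensor &prototype) {
      // Per the description, the result shares device and UVM storage with
      // `prototype`, so no data copy is expected.
      return uvm_to_device(uvm_tensor, prototype);
    }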
void uvm_cuda_mem_advise(const Tensor &self, int64_t cuda_memory_advise)¶

Call cudaMemAdvise() on a UVM tensor's storage. The cudaMemoryAdvise enum is available on the Python side in the fbgemm_gpu.uvm namespace; see the documentation there for valid values.

See also the CUDA documentation for the cudaMemoryAdvise enum.

Parameters:
- self – The input tensor
- cuda_memory_advise – The cudaMemoryAdvise enum value, as an integer
 
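A sketch of passing an advice value from C++; cudaMemAdviseSetReadMostly comes from the CUDA runtime's cudaMemoryAdvise enum, and the helper name is hypothetical:

    #include <ATen/ATen.h>
    #include <cuda_runtime.h>
    #include <cstdint>

    using Tensor = at::Tensor;

    // Documented signature, forward-declared here for the sketch.
    void uvm_cuda_mem_advise(const Tensor &self, int64_t cuda_memory_advise);

    void mark_read_mostly(const Tensor &uvm_tensor) {
      // The operator takes the cudaMemoryAdvise value as a plain integer.
      uvm_cuda_mem_advise(
          uvm_tensor, static_cast<int64_t>(cudaMemAdviseSetReadMostly));
    }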
void uvm_cuda_mem_prefetch_async(const Tensor &self, std::optional<Tensor> device_t)¶

Call cudaMemPrefetchAsync() on a UVM tensor's storage to prefetch memory to a destination device.

See also the CUDA documentation for cudaMemPrefetchAsync().

Parameters:
- self – The input tensor
- device_t – [OPTIONAL] The tensor whose device will be the prefetch destination
 
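A sketch of prefetching a UVM tensor toward the device of another tensor; the behavior when device_t is omitted is not spelled out above, so the example always passes a target tensor. The helper name is hypothetical:

    #include <ATen/ATen.h>
    #include <optional>

    using Tensor = at::Tensor;

    // Documented signature, forward-declared here for the sketch.
    void uvm_cuda_mem_prefetch_async(const Tensor &self, std::optional<Tensor> device_t);

    void prefetch_to_device_of(const Tensor &uvm_tensor, const Tensor &target) {
      // Prefetch the UVM pages toward the device that `target` lives on.
      uvm_cuda_mem_prefetch_async(uvm_tensor, target);
    }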
void uvm_mem_advice_dont_fork(const Tensor &self)¶

Call madvise(...MADV_DONTFORK) on a UVM tensor's storage. This is a workaround for an issue where the UVM kernel driver un-maps UVM storage pages from the page table on fork, causing a slowdown on the next CPU access.

See also the madvise() documentation for more information.

Parameters:
- self – The input tensor
 
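A sketch of the intended call site: apply the advice after allocating the UVM tensor and before any fork() of the process. The ordering is my reading of the workaround described above, and the helper name is hypothetical:

    #include <ATen/ATen.h>

    using Tensor = at::Tensor;

    // Documented signature, forward-declared here for the sketch.
    void uvm_mem_advice_dont_fork(const Tensor &self);

    void protect_from_fork_unmap(const Tensor &uvm_tensor) {
      // Mark the UVM pages MADV_DONTFORK so a later fork() does not un-map them.
      uvm_mem_advice_dont_fork(uvm_tensor);
    }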
Tensor uvm_to_cpu_clone(const Tensor &self)¶

Copy a UVM tensor's contiguous storage (uvm_storage(t) is true) into a new CPU tensor. The copy operation uses a single-threaded memcpy().

Parameters:
- self – The input tensor

Returns:
A new CPU tensor containing the data copied from the UVM tensor
 
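A sketch contrasting the clone with uvm_to_cpu above: the clone materializes an independent CPU copy, so the result no longer aliases the UVM pages. The precondition check and helper name are illustrative:

    #include <ATen/ATen.h>
    #include <c10/util/Exception.h>

    using Tensor = at::Tensor;

    // Documented signatures, forward-declared here for the sketch.
    bool uvm_storage(const Tensor &self);
    Tensor uvm_to_cpu_clone(const Tensor &self);

    Tensor snapshot_to_cpu(const Tensor &uvm_tensor) {
      // Per the description, the tensor's storage must be UVM (and contiguous)
      // for the copy to apply.
      TORCH_CHECK(uvm_storage(uvm_tensor), "expected a UVM-backed tensor");
      return uvm_to_cpu_clone(uvm_tensor);
    }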
Copy a tensor's contents to shared memory. This can be useful for forcing the initialization state of GPU memory, which is relevant for testing.

Parameters:
- self – The input tensor
 
Copy NaN values into a GPU's shared memory. This is useful for debugging or testing.

Parameters:
- self – The input tensor