[libc] Use the NVIDIA device allocator for GPU malloc (#124277)
Summary:
This blocks another patch in the OpenMP runtime. The problem is that
NVIDIA does not handle RPC-based allocations well: it cannot reliably
update the MMU while a kernel is running, and it will usually deadlock
if called from a separate thread due to its internal use of TLS.

This patch removes the definitions of `malloc` and `free` for NVPTX.
They are then left undefined, which is the cue for the `nvlink` linker
to define them for us. So, as far as `libc` is concerned, it still
implements `malloc`.
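
The guard pattern described above can be sketched in plain C++. This is only an illustration, not the code in this patch: `SKETCH_TARGET_IS_NVPTX`, `sketch::allocate`, `sketch::release`, and the fixed-size bump arena are all hypothetical stand-ins for `LIBC_TARGET_ARCH_IS_NVPTX`, `malloc`, `free`, and the real in-tree allocator. With the macro undefined the library supplies its own definitions; with it defined only the declarations remain, so the symbols stay undefined until something else provides them at link time.

```cpp
#include <cstddef>

// Hypothetical stand-in for LIBC_TARGET_ARCH_IS_NVPTX. Uncomment to model
// the NVPTX build, where the definitions below are compiled out.
// #define SKETCH_TARGET_IS_NVPTX

namespace sketch {

// The declarations are always visible, so callers compile on every target.
void *allocate(std::size_t size);
void release(void *ptr);

#ifndef SKETCH_TARGET_IS_NVPTX
// In-tree path: the library provides the definitions itself.
constexpr bool kHasInTreeAllocator = true;

namespace {
constexpr std::size_t kAlign = 16;
alignas(16) unsigned char arena[1024]; // toy backing store (assumption)
std::size_t used = 0;
} // namespace

// Toy bump allocator: hands out 16-byte-aligned slices of the arena.
void *allocate(std::size_t size) {
  if (size > sizeof(arena))
    return nullptr;
  std::size_t rounded = (size + kAlign - 1) / kAlign * kAlign;
  if (rounded > sizeof(arena) - used)
    return nullptr;
  void *ptr = arena + used;
  used += rounded;
  return ptr;
}

// A bump allocator never reclaims memory, so release is a no-op.
void release(void *) {}
#else
// Guarded-out path: no definitions are emitted. Any use of allocate or
// release is an undefined reference that another object must resolve at
// link time, which is the role the commit message assigns to nvlink.
constexpr bool kHasInTreeAllocator = false;
#endif

} // namespace sketch
```

Building this sketch with the macro defined and then calling `sketch::allocate` fails at link time unless another object file supplies the symbol, which mirrors the undefined `malloc` and `free` that this patch relies on.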
jhuber6 authored Jan 24, 2025
1 parent 37bf0a1 commit 256f40d
Showing 3 changed files with 10 additions and 1 deletion.
4 changes: 4 additions & 0 deletions libc/src/stdlib/gpu/free.cpp
@@ -14,6 +14,10 @@

 namespace LIBC_NAMESPACE_DECL {

+// FIXME: For now we just default to the NVIDIA device allocator which is
+// always available on NVPTX targets. This will be implemented fully later.
+#ifndef LIBC_TARGET_ARCH_IS_NVPTX
 LLVM_LIBC_FUNCTION(void, free, (void *ptr)) { gpu::deallocate(ptr); }
+#endif

 } // namespace LIBC_NAMESPACE_DECL
4 changes: 4 additions & 0 deletions libc/src/stdlib/gpu/malloc.cpp
@@ -14,8 +14,12 @@

 namespace LIBC_NAMESPACE_DECL {

+// FIXME: For now we just default to the NVIDIA device allocator which is
+// always available on NVPTX targets. This will be implemented fully later.
+#ifndef LIBC_TARGET_ARCH_IS_NVPTX
 LLVM_LIBC_FUNCTION(void *, malloc, (size_t size)) {
   return gpu::allocate(size);
 }
+#endif

 } // namespace LIBC_NAMESPACE_DECL
3 changes: 2 additions & 1 deletion libc/test/src/stdlib/CMakeLists.txt
@@ -420,7 +420,8 @@ if(LLVM_LIBC_FULL_BUILD)
   )

   # Only baremetal and GPU has an in-tree 'malloc' implementation.
-  if(LIBC_TARGET_OS_IS_BAREMETAL OR LIBC_TARGET_OS_IS_GPU)
+  if((LIBC_TARGET_OS_IS_BAREMETAL OR LIBC_TARGET_OS_IS_GPU) AND
+      NOT LIBC_TARGET_ARCHITECTURE_IS_NVPTX)
     add_libc_test(
       malloc_test
       HERMETIC_TEST_ONLY