[BUG] Dask-CUDA does not work with Merlin/NVTabular #363

@rjzamora

Description

As pointed out by @oliverholworthy in #274 (comment), cuda.is_available() is used in merlin.core.compat to check for CUDA support. Unfortunately, calling this function before the cluster is created is a known problem for dask-cuda.

This most likely means that Merlin/NVTabular has not worked properly with Dask-CUDA for more than six months now. For example, the following code will produce an OOM error on 32GB V100s:

import time
from merlin.core.utils import Distributed
if __name__ == "__main__":
    with Distributed(rmm_pool_size="24GiB"):
        time.sleep(30)

You will also see an error if you don't import any merlin/nvt code but make the offending cuda.is_available() call yourself:

import time
from numba import cuda # This is fine
cuda.is_available() # This is NOT
from dask_cuda import LocalCUDACluster
if __name__ == "__main__":
    with LocalCUDACluster(rmm_pool_size="24GiB") as cluster:
        time.sleep(30)

Meanwhile, the code works fine if you don't use the offending command or import code that also imports merlin.core.compat:

import time
from dask_cuda import LocalCUDACluster
if __name__ == "__main__":
    with LocalCUDACluster(rmm_pool_size="24GiB") as cluster:
        time.sleep(30)
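
Since the failure above is triggered by the check running at import time, one possible direction is to defer it until it is explicitly requested. The following is only a minimal sketch of that idea, not the actual merlin.core.compat code: the helper name cuda_available is hypothetical, and whether calling it after the cluster is up avoids the OOM is an assumption that would still need to be verified on a GPU machine.

```python
import functools


@functools.lru_cache(maxsize=None)
def cuda_available() -> bool:
    """Hypothetical helper: check for CUDA lazily instead of at import time.

    Nothing here runs when the module is imported; numba is only imported
    and queried on the first call, and the result is cached afterwards.
    """
    try:
        # Importing numba.cuda is fine (see the examples above);
        # it is the is_available() call that is the problem.
        from numba import cuda
    except ImportError:
        # No numba installed: treat as "no CUDA support".
        return False
    try:
        return bool(cuda.is_available())
    except Exception:
        # Be conservative if the driver query itself fails.
        return False
```

With a helper like this, importing the module that defines it no longer calls cuda.is_available(); the caller decides when the check runs (for example, only after LocalCUDACluster has been created).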
