High unmanaged memory using kwargs in apply_ufunc with Dask #9981
Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
cc @phofl, this looks like a regression.
I am not quite sure what I am seeing here yet. I increased the underlying array and the memory fell back to the expected size. It's weird. @dcherian, generally we would like to wrap the large array into delayed to get this down, but the partial function application blocks that. Is there a way around this?
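For reference, a minimal sketch of the "wrap the large array into delayed" pattern in plain dask terms (names and sizes are illustrative, and this sidesteps `apply_ufunc` entirely):

```python
import dask
import numpy as np

some_big_constant = np.random.random((2_000, 2_000))  # ~32 MB stand-in for the real constant

# Wrapping the constant in delayed gives it a single key in the graph; tasks that
# use it depend on that key instead of embedding their own serialized copy.
big = dask.delayed(some_big_constant, pure=True)


@dask.delayed
def add_one(chunk, b):
    return chunk + b.mean()


chunks = [np.random.random((100, 300)) for _ in range(10)]
results = dask.compute([add_one(c, big) for c in chunks])
```

The partial application that `kwargs` goes through prevents this, which is what the comment above refers to.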
Oh, I missed that. @abiasiol, this is a bit of an anti-pattern. Can you just pass it as an actual argument instead of through `kwargs`?
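One way to pass the constant as a real argument instead of through `kwargs` (a hedged sketch, assuming the constant is a 1-D array; names and sizes are illustrative) is to give it its own dimension and declare it as a core dim:

```python
import numpy as np
import xarray as xr

some_big_constant = np.random.random(5_000_000)  # ~40 MB stand-in

arr = xr.DataArray(
    np.random.random((1_000, 300)), dims=["time", "y"]
).chunk({"time": 100})


def add_one(x, b):
    # b arrives as an ordinary argument backed by a single-chunk key in the
    # graph, instead of being baked into every task's serialized kwargs
    return x + b.mean()


result = xr.apply_ufunc(
    add_one,
    arr,
    xr.DataArray(some_big_constant, dims=["c"]),
    input_core_dims=[[], ["c"]],  # the constant's dim is a core dim of the second argument
    dask="parallelized",
    output_dtypes=[arr.dtype],
).compute()
```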
@dcherian Apologies, I used an array for the MRE only. In my use case the constant is a large Python object (a collection of spline callables) rather than an array. Does that change the conclusions or suggested approaches? I am open to other approaches if anything comes to mind. My use case is something along the lines of:

```python
from typing import Callable

import dask.array as da
import numpy as np
import xarray as xr
from dask.diagnostics import ResourceProfiler


class MySplines:
    def __init__(self, splines: dict[int, Callable]):
        self.splines = splines


# bs is a large Python object
bs = MySplines(splines={1: lambda x: x + 1, 2: lambda x: x + 2})  # real splines instead of these lambdas

times = np.arange(10_000)  # placeholder coordinates for the sketch
other = np.arange(300)

data = da.random.random((len(times), len(other)), chunks=(100, 300))
arr = xr.DataArray(data, coords={"time": times, "y": other}, dims=["time", "y"])


def add_one(x, b):
    xm = x.mean(axis=0)  # some processing on the chunk data
    temp = b.splines[1](xm) + b.splines[2](xm)  # evaluate the splines
    return x + temp


with ResourceProfiler() as rprof:
    result = xr.apply_ufunc(
        add_one,
        arr,
        dask="parallelized",
        kwargs={"b": bs},
    )
    result.to_zarr("test_zarr.zarr", mode="w")
```

The memory growth has the same pattern as the MRE, and the graph has the same purely parallel structure.
What is your issue?
Hi,
I have an embarrassingly parallel function that I am applying along the `time` dimension. It needs some extra constant data (in this case `some_big_constant`) that I pass through `kwargs`. I find that unmanaged memory keeps increasing because the kwargs are attached to each task, and the problem gets worse when I have more chunks along the time dimension. My doubts are:

- I expect `some_big_constant` to be allocated for each task, but I am surprised by the memory not being released.
- `some_big_constant` gets baked into the partial for `add_one` (see the sketch below).
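A minimal, self-contained sketch of the mechanism behind the second point (simplified, not xarray's actual code; `add_one` and `some_big_constant` are the names from this issue, the sizes are arbitrary):

```python
import functools
import pickle

import numpy as np

some_big_constant = np.random.random(1_000_000)  # ~8 MB stand-in for the real constant


def add_one(x, b):
    return x + b.mean()


# With kwargs, the user function is wrapped in a partial that closes over the
# constant before being handed to dask, so every task carries the constant in
# its serialized payload instead of referencing one shared key in the graph.
func = functools.partial(add_one, b=some_big_constant)

print(len(pickle.dumps(func)))  # roughly the size of some_big_constant, per task
```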
EDIT: versions: dask-2025.1.0, distributed-2025.1.0, xarray-2025.1.1, zarr-2.18.3
Full example:
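(The code block for the full example did not survive extraction; below is a minimal sketch consistent with the description above. The names `some_big_constant` and `add_one` come from the issue text; array sizes and the zarr path are assumptions.)

```python
import numpy as np
import dask.array as da
import xarray as xr
from dask.diagnostics import ResourceProfiler

# stand-in for the large constant passed via kwargs; size is arbitrary
some_big_constant = np.random.random(10_000_000)

times = np.arange(10_000)
other = np.arange(300)

data = da.random.random((len(times), len(other)), chunks=(100, 300))
arr = xr.DataArray(data, coords={"time": times, "y": other}, dims=["time", "y"])


def add_one(x, b):
    # per-chunk work that uses the constant
    return x + b.mean()


with ResourceProfiler() as rprof:
    result = xr.apply_ufunc(
        add_one,
        arr,
        dask="parallelized",
        kwargs={"b": some_big_constant},
    )
    result.to_zarr("test_zarr.zarr", mode="w")
```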
The dask graph looks good and parallel:
On the Dask dashboard, I see the unmanaged memory increasing as the computation proceeds. I see that `store_map` proceeds well, which is comforting.

With the profiler, I see the memory increasing too. It roughly looks like there is one step up for every chunk (some chunks are probably loaded into memory at the same time).
In the Dask dashboard profile, I see the zarr calls only at the very end of the computation (the tall column). I would have expected to see some calls during the computation too (like how `store_map` proceeds), but I am not overly concerned about this.