-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add stardist 0.8.5 recipe to containers #551
base: master
Are you sure you want to change the base?
Conversation
No biotools label defined, please check if tool is not already defined in biotools (https://bio.tools) and add extra.identifiers.biotools label if it exists. If it is not defined, you can ignore this comment. |
No test-cmds.txt (test file) present, skipping tests |
|
||
ARG DEBIAN_FRONTEND="noninteractive" | ||
ARG STARDIST_VERSION="0.8.5" | ||
ARG NVIDIA_DRIVER_VERSION=470 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this will be an issue? According to the stardist github, the NVIDIA_DRIVER_VERSION should be adjusted to the runtime environnement (depending on the GPU you have, I guess).
No sure how the container will behave if the runtime env does not match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @mboudet !
What would you suggest how to handle this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No quite sure, to be honest. It would be possible to install the correct lib at runtime I guess, but it would be quite dirty, and would probably not work with singularity.
I'm not sure how does it work with the conda package? Would it be possible to directly install the package in the dockerfile (like here), or does it require external librairies ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand what you mean whether it would be possible to directly install the package in the dockerfile? Which package are you referring to? I don't see anything related to nvidia drivers in the recipe you linked unforunately? 🤷🏼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you use the 'NVIDIA_DRIVER_VERSION' variable to install a specific version of the libnvidia-compute in the next step. (Line 24). According the the stardist github, the libnvidia-compute version should match the GPU driver version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So as is custom these days, I asked ChatGPT for help. The suggestion from it was to specify --build-arg
when building the docker container at runtime to adjust the ENV variable NVIDIA_DRIVER_VERSION
to the one available:
docker build --build-arg NVIDIA_DRIVER_VERSION=<desired_version> -t <your_image_name> .
I am not sure there is a good way to automatically detect this version number. I also don't really know whether we could build a container for cpu completely without NVIDIA drivers...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this would work, but that mean that the user would need to build the image themselve (so, not much point in having the image on biocontainers). Also, building an image usually require root/sudo privileges, which are not usually available in computing clusters.
My question regarding the conda package stands though. I'm not sure if a mamba/conda install in the container would work. This might need input from the software's developpers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will ask in their repo if they can advise regarding the mamba / conda install. What would be the advantage over mamba/conda install over the pip install that we are currently doing?
Totally agree on the container build comment. One of the main goals here is to be able to build a singularity container from the docker container and if we can't do that, you are right, pretty useless to put the docker container on biocontainers 🙃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked them in stardist/stardist#259
Hi, StarDist author here. I'm not really familiar with Bioconda+BioContainers, hence I'm wondering what the StarDist container is supposed to be used for in the Bioconda+BioContainers ecosystem. The context for the Dockerfile in our repo (which the Dockerfile here is based on) was to offer an easy way to get a GPU-accelerated version of StarDist running, without the hassle of installing CUDA etc. Using StarDist is meant to be primarily done via interactive Jupyter notebooks (hence the TensorFlow StarDist itself doesn't require GPU-acceleration, but can take advantage of it for improved performance (by enabling GPU-support for TensorFlow via CUDA, and also by installing |
Thanks for that insight @uschmidt83 ! The main purpose of having a stardist container in biocontainers (or any hosted hub for docker for that matter) from our side is that we want to implement it in nf-core based nextflow pipelines. These pipelines require Docker containers for reproducible execution. Since stardist is part of conda-forge, a bioconda package (which automatically builds biocontainers) obviously can't be created and so a biocontainers recipe was the next logical step from my side. The alternative would be a CI logic on your repo that pushes to a publicly available dockerhub. For starting I used your github repo Dockerfile and adjusted it for the biocontainer requirements. To be fair, I am not that faimiliar with GPU specifications in container settings and didn't realize, that the NVIDIA driver specifications could be issues for cross-platform and singularity compatibility until @mboudet mentioned it. Would you be able and willing to make suggestions to this recipe to remove GPU acceleration and requirements? Again, the main drive of this is to have a docker container hosted in the public domain to use in nf-core pipelines. If you prefer to have a CI on your repo push to your dockerhub and we can use that, then we can also close this PR, whatever option you like better 😊! Thanks for your help and insights! |
If I understand this correctly, these Nextflow pipelines are not meant to run interactively. Instead each "step" of the pipeline consists of software that can either be installed via conda or as run as a container. If that assumption is true, wouldn't the StarDist container then not need a defined entrypoint to run the the software with defined inputs and outputs?
I don't understand this part, and also don't know the relationship between conda-forge and bioconda.
Removing the GPU-related things from the Dockerfile is actually easy. Again, I'm more concerned with how the Docker container is supposed to be used in the context of Nextflow.
I don't have enough information to know which option would be better for us. It looks like the container definition needs to be updated each time we make a new StarDist release. How does that process work in BioContainers? Best, |
Thank you for the details again @uschmidt83 ! Sorry if my answer was more confusing then anything.
Please let me know what you think of these ideas and whether any details are still unclear! |
I don't believe there is a dependabot setup on the biocontainer repository. Seems like all PR are manual. So it might be easier to setup a CI on stardist side to push on docker/github container hub (and update the conda package) after a release (if that's needed) |
Thanks @mboudet ! I just saw a PR with dependabot and thought that would also check version bumps of main packages within containers... In that case, a CI directly on the stardist repo that builds a non-GPU container and pushes to dockerhub might be the easiest @uschmidt83. |
Thanks for the clarification, @FloWuenne and @mboudet.
We do have CLI for 2D and 3D model inference, but not for model training. Please take a look and let me know if you think this is sufficient.
Not sure about this, I have to look into this and understand the pros and cons. |
Thanks for the links for the inference CLIs @uschmidt83 ! The inference scripts look exactly what would be needed to run inference from within a container. So I think the easiest would be if we would update the Dockerfile in your main stardist repository to be one that doesn't use GPU (if that's something you want to support. The existing dockerfile could be saved as a DOCKERFILE_GPU or similar). Once the main DOCKERFILE in the repo is updated, here is an example of a Github CI that would build the container and push it to Dockerhub on release and every night (you would have to set the Dockerhub name and secret in the repository). Obviously for stardist one could set the CI to only trigger on release or merge on main (depending on your development structure!) Let us know what your preference is! We can also still go ahead and change this recipe here to work without GPU and have this alongside your main repository. That way, you don't have to change the Dockerfile and deal with setting up the CI... Ultimately I would leave the decision to you as the package creator! |
@uschmidt83 Just a quick ping here to ask whether you wanted to go through with biocontainer or rather go with Docker CIs in your repo? If you want to go for the CI action then I would close this PR. |
Submitting a Container
(If you're requesting for a new container, please check the procedure described here.
Check BioContainers' Dockerfile specifications
Checklist
Check BioContainers' Dockerfile metadata