Checkpoint support for torch.package
#2570
Comments
@dasturge Thank you for highlighting this very interesting point. IMO a package is very different from a checkpoint. I'm no expert on what was recently done, but it doesn't sound like a new way to checkpoint that would replace the current load/save from state dicts. I would say it should be very useful at the end of training, to help with deployment. A specific handler could be an idea, but at the moment I don't see how to automatically reuse the training code to produce an inference script. Maybe it's more a matter of guidelines for writing applications.
The raw way to do the job:
from ignite.engine import Events
from torch.package import PackageExporter

@trainer.on(Events.COMPLETED)
def package_model():
    with PackageExporter('package.pt') as pe:
        # Pattern settings for intern/extern; these depend on what you're packaging
        pe.intern('models.**')  # example
        pe.extern('numpy.**')  # example
        pe.save_pickle('my_package', 'model.pkl', model)
As @dasturge said, we could have an API like: TorchPackageCheckpoint(path: str, package_name: str, interns: List[str]=[], externs: List[str]=[], mocked: List[str]=[], to_save: Dict[str, Any]) to do the job, but since the user does not know
When we have to write those statements anyway, why not fall back to the raw way I described at first?
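For comparison, here is a minimal sketch of what a handler with the signature proposed above might look like. It is not part of ignite: the class is hypothetical, and the `exporter_factory` parameter is an extra assumption added so the handler can be exercised without writing a real package. By default it falls back to `torch.package.PackageExporter`, and each entry of `to_save` is assumed to be pickled as `<name>.pkl`.

```python
from typing import Any, Callable, Dict, Optional, Sequence


class TorchPackageCheckpoint:
    """Hypothetical handler that writes a torch.package archive when called."""

    def __init__(
        self,
        path: str,
        package_name: str,
        to_save: Dict[str, Any],
        interns: Sequence[str] = (),
        externs: Sequence[str] = (),
        mocked: Sequence[str] = (),
        exporter_factory: Optional[Callable[[str], Any]] = None,  # assumption, see lead-in
    ) -> None:
        self.path = path
        self.package_name = package_name
        self.to_save = to_save
        self.interns = list(interns)
        self.externs = list(externs)
        self.mocked = list(mocked)
        self.exporter_factory = exporter_factory

    def __call__(self, engine: Any = None) -> None:
        factory = self.exporter_factory
        if factory is None:
            # Imported lazily so the class can be defined without torch installed.
            from torch.package import PackageExporter

            factory = PackageExporter

        with factory(self.path) as exporter:
            # Dependency patterns are registered before anything is pickled.
            for pattern in self.interns:
                exporter.intern(pattern)
            for pattern in self.externs:
                exporter.extern(pattern)
            for pattern in self.mocked:
                exporter.mock(pattern)
            # Assumed naming scheme: one pickle resource per key in to_save.
            for name, obj in self.to_save.items():
                exporter.save_pickle(self.package_name, f"{name}.pkl", obj)
```

It could then be attached with something like `trainer.add_event_handler(Events.COMPLETED, TorchPackageCheckpoint("package.pt", "my_package", {"model": model}, interns=["models.**"], externs=["numpy.**"]))`. The user still has to supply the intern/extern patterns, which is exactly the concern raised above.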
Having a new, specific handler for packaging would be interesting if we can come up with something helpful. Maybe we could have checkpoints during training and packaging at the end. However, packaging is more than a checkpoint: it embeds what is needed for inference and is related to deployment. Let's think about it. It would be nice to have a package importer and exporter for training and, why not, an inference engine based on that.
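On the importer side, a package written with the names used earlier in this thread ('package.pt', 'my_package', 'model.pkl') could be read back for inference roughly as follows. This is only a sketch: `load_packaged_model` and its `importer_factory` parameter are hypothetical names introduced for illustration, and the default values simply mirror the earlier example. The default path relies on `torch.package.PackageImporter` and its `load_pickle` method.

```python
from typing import Any, Callable, Optional


def load_packaged_model(
    path: str,
    package_name: str = "my_package",
    resource: str = "model.pkl",
    importer_factory: Optional[Callable[[str], Any]] = None,  # hypothetical hook
) -> Any:
    """Load a pickled object from a torch.package archive (illustrative helper)."""
    if importer_factory is None:
        # Imported lazily so the helper can be defined without torch installed.
        from torch.package import PackageImporter

        importer_factory = PackageImporter

    importer = importer_factory(path)
    # The interned code travels with the archive, so the training modules
    # do not need to be importable in the inference environment.
    return importer.load_pickle(package_name, resource)
```

An inference script would then call `model = load_packaged_model("package.pt")` and switch the model to evaluation mode before use.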
🚀 Feature
Currently, checkpointing is very centered around objects with state_dict and load_state_dict methods, but the new torch.package serialization option breaks with this pattern. It doesn't seem that I can simply insert a custom save_handler to handle package import/export.
torch offers a new, interesting method for serializing models along with code and dependencies (and is not limited to PyTorch/base Python types). It would be cool to be able to leverage this for checkpointing, so that models produced by the checkpointer are packaged and ready to go, helping to bridge the gap with deployment workflows.
Something along the lines of:
TorchPackageCheckpoint(package=my_package, internal_module="models", )
which allows one to pass an object which works along with:
Obviously this would require some refactoring of private methods if it is to use the base Checkpoint class, since the responsibility for using state_dict would need to be offloaded to any DiskSaver/save_handlers. I didn't see a clean way to simply extend the Checkpoint class.