Deploy a language model, with the model weights cached at build time
The Model class is implemented to take advantage of the weight caching.
The config.yaml file is where you include the changes that actually cache the weights at build time.
These changes go under the model_cache key.
The repo_id field specifies a Hugging Face repo to pull down and cache at build time,
and the ignore_patterns field specifies files to ignore.
When a repo is specified this way, it does not have to be pulled at runtime.
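
As a minimal sketch of what such an entry in config.yaml might look like: the repo name below is a hypothetical placeholder, and the ignore pattern is only an illustrative assumption (skipping one weight format when the repo also ships another).

```yaml
# Sketch only: "your-org/your-model" is a placeholder, not a real repo.
model_cache:
  - repo_id: your-org/your-model   # Hugging Face repo to download and cache at build time
    ignore_patterns:
      - "*.bin"                    # example pattern: files matching it are not cached
```

Each list item under model_cache describes one repo, so the cached files are baked into the build rather than downloaded when the model starts.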
Check out the guide for more info.