Step 1: Implementing the model
Set up imports for this model. In this example, we simply use the HuggingFace transformers library.

model/model.py
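Since the model is built on a HuggingFace pipeline (see the `load` function below), the import block might look like this minimal sketch:

```python
from transformers import pipeline
```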
Every Truss model implements a `Model` class. This class must have:
- an `__init__` function
- a `load` function
- a `predict` function

In the `__init__` function, set up any variables that will be used in the `load` and `predict` functions.
model/model.py
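A minimal sketch of the class and its `__init__` function; the `**kwargs` signature and the `_model` attribute name are illustrative assumptions:

```python
class Model:
    def __init__(self, **kwargs):
        # Reserve a slot for the pipeline; it is populated in load().
        self._model = None
```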
In the `load` function of the Truss, we implement logic involved in downloading the model and loading it into memory. For this Truss example, we define a HuggingFace pipeline and choose the `text-classification` task, which uses BERT for text classification under the hood.

Note that the `load` function runs once, when the model server starts.
model/model.py
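A sketch of what `load` might look like, continuing the `Model` class above:

```python
    def load(self):
        # Create a HuggingFace pipeline for the text-classification task.
        # With no explicit model argument, transformers falls back to a
        # default BERT-family checkpoint for this task.
        self._model = pipeline(task="text-classification")
```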
In the `predict` function of the Truss, we implement logic related to actual inference. For this example, we just call the HuggingFace pipeline that we set up in the `load` function.
model/model.py
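A sketch of `predict`, again continuing the class above; the `model_input` parameter name is an assumption:

```python
    def predict(self, model_input):
        # Delegate inference to the pipeline created in load().
        return self._model(model_input)
```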
Step 2: Writing the config.yaml
Each Truss has a config.yaml file where we can configure options related to the deployment. This is where we define requirements, resources, and runtime options like secrets and environment variables.

Basic Options
In this section, we define basic metadata about the model, such as its name and the Python version to build with.

config.yaml
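A sketch of how these basic options might look; the model name and Python version shown are placeholder values:

```yaml
model_name: text-classification
python_version: py39
```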
Set up Python requirements
In this section, we define any pip requirements that we need to run the model. For this model, we need PyTorch and Transformers.

config.yaml
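For example (the version pins below are illustrative; pin to whatever versions you have tested):

```yaml
requirements:
  - torch==2.0.1
  - transformers==4.30.0
```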
Configure the resources needed
In this section, we configure the resources needed to deploy this model. Here, we have no need for a GPU, so we leave the accelerator field blank.

config.yaml
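A sketch of a CPU-only resource block; the CPU and memory values are placeholder assumptions:

```yaml
resources:
  accelerator: null
  cpu: "1"
  memory: 2Gi
  use_gpu: false
```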
Other config options
Truss also has provisions for adding other runtime options and packages. In this example, we don't need these, so we leave them empty for now.

config.yaml
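Left empty, these options might look like the following sketch (the exact set of keys shown is an assumption):

```yaml
secrets: {}
system_packages: []
environment_variables: {}
```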