Set up the imports and key constants
In this example, we use the Hugging Face transformers library to build a text generation model.
model/model.py
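The imports and constants might look like the following sketch. The exact checkpoint name is an assumption based on the Mistral 7B Instruct release, not taken from the original file:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# The Hugging Face checkpoint to download; this exact name is an
# assumption -- substitute the Mistral variant you want to serve.
CHECKPOINT = "mistralai/Mistral-7B-Instruct-v0.1"
```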
Define the Model class and load function
In the load function of the Truss, we implement logic involved in
downloading and setting up the model. For this LLM, we use the Auto
classes in transformers to instantiate our Mistral model.
model/model.py
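A minimal sketch of what this looks like, assuming the `mistralai/Mistral-7B-Instruct-v0.1` checkpoint and half-precision weights (both assumptions, not taken from the original file):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHECKPOINT = "mistralai/Mistral-7B-Instruct-v0.1"


class Model:
    def __init__(self, **kwargs):
        # The model and tokenizer are populated in load(), which Truss
        # calls once when the model server starts.
        self._model = None
        self._tokenizer = None

    def load(self):
        # Download the tokenizer and weights from Hugging Face using the
        # Auto classes; device_map="auto" places the weights on the GPU.
        self._tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
        self._model = AutoModelForCausalLM.from_pretrained(
            CHECKPOINT,
            torch_dtype=torch.float16,
            device_map="auto",
        )
```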
Define the predict function
In the predict function, we implement the actual inference logic. The steps
here are:
- Set up the generation params. We provide defaults for these, but adjusting the values will change the model's output
- Tokenize the input
- Generate the output
- Use tokenizer to decode the output
model/model.py
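The steps above can be sketched as follows, assuming `load()` stored the model and tokenizer on `self`. The input and parameter names (`prompt`, `max_new_tokens`, `temperature`) and the default values are assumptions for illustration:

```python
class Model:
    def predict(self, model_input):
        prompt = model_input["prompt"]
        # 1. Set up the generation params, with overridable defaults
        #    (the specific defaults here are assumptions).
        generate_args = {
            "max_new_tokens": model_input.get("max_new_tokens", 512),
            "temperature": model_input.get("temperature", 0.7),
            "do_sample": True,
        }
        # 2. Tokenize the input and move the tensors to the model's device.
        inputs = self._tokenizer(prompt, return_tensors="pt").to(
            self._model.device
        )
        # 3. Generate the output token ids.
        output_ids = self._model.generate(**inputs, **generate_args)
        # 4. Use the tokenizer to decode the ids back into text.
        return self._tokenizer.decode(output_ids[0], skip_special_tokens=True)
```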
Setting up the config.yaml
Running Mistral 7B requires a few libraries, such as torch, transformers, and a couple of others.
config.yaml
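The requirements section of config.yaml might look like the sketch below; the exact version pins are assumptions, not taken from the original file:

```yaml
requirements:
  # Version pins are illustrative assumptions; pin to tested versions.
  - torch==2.0.1
  - transformers==4.34.0
  - accelerate==0.23.0
  - sentencepiece==0.1.99
```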
Configure resources for Mistral
Note that we need an A10G to run this model.
config.yaml
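The resources block requesting an A10G might look like this; the CPU and memory values are assumptions, not taken from the original file:

```yaml
resources:
  accelerator: A10G
  use_gpu: true
  # CPU and memory values below are illustrative assumptions.
  cpu: "3"
  memory: 14Gi
```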