Running Llama3 with vLLM
This tutorial guides you through serving Llama3 with vLLM on Komodo Cloud: you'll write a service config, launch the service, and chat with the deployed model.
Step 1: Create the service config
Create a configuration file for the Llama3 service. Below is a sample service-llama3.yaml file.
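The exact schema depends on your Komodo Cloud version, so treat the following as a minimal sketch: the field names (resources, envs, setup, run) and the A100 accelerator choice are assumptions, while the run command uses vLLM's real OpenAI-compatible server entrypoint with the meta-llama/Meta-Llama-3-8B-Instruct model from HuggingFace.

```yaml
# service-llama3.yaml -- illustrative field names; adjust to match
# your Komodo Cloud service spec.
name: llama3

resources:
  accelerators: A100:1   # any GPU with enough memory for the 8B model

envs:
  # Replace with your HuggingFace token (see the note below).
  HUGGING_FACE_HUB_TOKEN: <REPLACE_WITH_YOUR_HUGGINGFACE_TOKEN>

setup: |
  pip install vllm

run: |
  # Start vLLM's OpenAI-compatible server on port 8000.
  python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --port 8000
```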
In the configuration file, replace <REPLACE_WITH_YOUR_HUGGINGFACE_TOKEN> with your HuggingFace token so that the model weights can be downloaded. You'll need to request access to Llama3 on HuggingFace if you haven't already.
Step 2: Launch the Llama3 Service
With your configuration file ready, launch the Llama3 service using the CLI:
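The exact subcommand is an assumption here; consult the Komodo CLI help for the syntax your installation uses. A sketch, assuming a `komo service launch`-style command:

```bash
# Subcommand name is an assumption -- check the CLI help for exact syntax.
komo service launch service-llama3.yaml
```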
Step 3: Chat with Llama3
Once the service status is RUNNING, you can chat with it right from the dashboard!
The endpoint for the service is also provided in the dashboard.
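Because vLLM serves an OpenAI-compatible API, you can also query the endpoint directly. A sketch with curl, where <SERVICE_ENDPOINT> is a placeholder for the address shown in the dashboard:

```bash
# <SERVICE_ENDPOINT> is the service address from the dashboard.
curl http://<SERVICE_ENDPOINT>/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Meta-Llama-3-8B-Instruct",
        "messages": [
          {"role": "user", "content": "Give me a one-line summary of vLLM."}
        ]
      }'
```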