This tutorial will guide you through serving Llama3 with vLLM on Komodo Cloud. Follow these steps to configure and deploy Llama3 effectively.
Step 1: Create the service config
Create a configuration file for the Llama3 service. Below is a sample service-llama3.yaml file.
In the configuration file, replace <REPLACE_WITH_YOUR_HUGGINGFACE_TOKEN> with your HuggingFace token so that the model weights can be downloaded.
You'll need to request access to Llama3 on HuggingFace if you haven't already.
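The exact schema for service-llama3.yaml depends on Komodo Cloud; the sketch below is a hypothetical illustration (the field names `name`, `envs`, `resources`, and `run` are assumptions — consult the Komodo docs for the real schema). The `run` command itself uses vLLM's OpenAI-compatible server entrypoint:

```
# service-llama3.yaml — hypothetical sketch; field names are assumptions,
# check the Komodo Cloud docs for the exact schema.
name: llama3
envs:
  # Needed so vLLM can download the gated Llama3 weights from HuggingFace.
  HF_TOKEN: <REPLACE_WITH_YOUR_HUGGINGFACE_TOKEN>
resources:
  accelerators: A100:1
run: |
  python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Meta-Llama-3-8B-Instruct \
    --port 8000
```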
Step 2: Launch the Llama3 Service
With your configuration file ready, launch the Llama3 service using the CLI:
Step 3: Chat with Llama3
Once the service status is RUNNING, you can chat with it right from the dashboard!
The endpoint for the service is also provided in the dashboard.
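You can also call the endpoint programmatically. Since vLLM exposes an OpenAI-compatible chat completions API, a request is a plain JSON payload POSTed to `/v1/chat/completions`. The sketch below builds such a payload; the endpoint URL is a placeholder — copy the real one from the dashboard:

```python
import json

# Placeholder — replace with the endpoint shown in the Komodo dashboard.
ENDPOINT = "http://<SERVICE_ENDPOINT>/v1/chat/completions"

# OpenAI-compatible chat request understood by vLLM's API server.
payload = {
    "model": "meta-llama/Meta-Llama-3-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello, Llama3!"}],
    "max_tokens": 64,
}

body = json.dumps(payload)
print(body)

# To send it once the service is RUNNING:
#   import requests
#   print(requests.post(ENDPOINT, json=payload).json())
```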