tinyllama-vllm
TinyLlama vLLM ModelKit
This ModelKit contains the TinyLlama 1.1B Chat model along with vLLM serving support.
Usage
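To serve the model locally, a minimal sketch using vLLM's standard OpenAI-compatible entrypoint looks like the following. The Hugging Face model ID is an assumption; if the ModelKit bundles the weights, point --model at the unpacked directory instead:

# Start vLLM's OpenAI-compatible API server on port 8000.
# The model ID below is assumed; substitute the path to the weights unpacked from this ModelKit.
python -m vllm.entrypoints.openai.api_server \
  --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  --port 8000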
The model is served through an OpenAI-compatible API on port 8000:
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'
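Depending on the vLLM version, the request may also need a "model" field matching the served model name. You can check the name the server exposes via the /v1/models endpoint and include it explicitly; the model ID shown below is an assumption:

# List the model name(s) the server actually exposes.
curl http://localhost:8000/v1/models

# Chat request with an explicit "model" field (assumed ID; use the name returned above).
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'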