TinyLlama vLLM ModelKit

This ModelKit contains the TinyLlama 1.1B Chat model together with vLLM serving capabilities.
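How the server is started depends on how the ModelKit is unpacked in your environment; the sketch below shows one assumed way to launch the model with vLLM's OpenAI-compatible server (the model ID is an illustrative assumption, not a value taken from this ModelKit; port 8000 matches the Usage section below):

# Minimal sketch: serve the model through vLLM's OpenAI-compatible API on port 8000.
# The model reference is an assumed example; point --model at the weights
# unpacked from this ModelKit or at the corresponding Hugging Face model ID.
python -m vllm.entrypoints.openai.api_server \
  --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 \
  --port 8000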

Usage

The model serves an OpenAI-compatible chat completions API on port 8000:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }'
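Responses follow the standard OpenAI chat completions JSON schema, so the assistant's reply can be extracted in a script. A small sketch of the same request using jq (assumes jq is installed and the server is running locally):

# Same request as above, but print only the assistant's reply.
# .choices[0].message.content is the standard chat completions response path.
curl -s -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Hello, how are you?"}]
  }' \
  | jq -r '.choices[0].message.content'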