Phi-3 Mini Deployment Project

This project packages Microsoft's Phi-3 Mini 4K Instruct model for Kubernetes deployment using KitOps and Flux CD.

Model Information

  • Model: Microsoft Phi-3 Mini 4K Instruct
  • Framework: PyTorch
  • Use Case: Text generation and instruction following
  • Context Length: 4K tokens

Deployment

This model is deployed using:

  • KitOps: Model packaging and artifact management
  • Flux CD: GitOps-based continuous deployment
  • vLLM: High-performance inference server
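The packaging and rollout steps listed above could look roughly like the sketch below. It assumes the `kit` and `flux` CLIs are installed and that the current directory contains the Kitfile. The registry reference `registry.example.com/models/phi3-mini-4k-instruct:latest` and the Kustomization name `phi3-mini` are hypothetical placeholders, not values taken from this project. By default the script only prints the commands it would run.

```shell
# Sketch only: MODELKIT_REF and KUSTOMIZATION are hypothetical placeholders.
MODELKIT_REF="${MODELKIT_REF:-registry.example.com/models/phi3-mini-4k-instruct:latest}"
KUSTOMIZATION="${KUSTOMIZATION:-phi3-mini}"

# Print the command in dry-run mode (the default); execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

package_and_deploy() {
  # Pack the directory containing the Kitfile into a ModelKit and tag it.
  run kit pack . -t "$MODELKIT_REF" || return 1
  # Push the ModelKit to the OCI registry.
  run kit push "$MODELKIT_REF" || return 1
  # Ask Flux to reconcile now instead of waiting for its next sync interval.
  run flux reconcile kustomization "$KUSTOMIZATION" --with-source
}
```

Calling `package_and_deploy` prints the three commands. Setting `DRY_RUN=0` before the call runs them for real.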

API Endpoints

Once deployed, the vLLM server exposes the following endpoints on the cluster address:

  • POST /v1/completions - Text completion
  • POST /v1/chat/completions - Chat completions
  • GET /health - Health check
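A large model can take a while to load, so it may help to poll the health endpoint before sending requests. The following is a minimal sketch, assuming `curl` is installed. The function name `wait_for_health` and its default retry settings are illustrative choices, not part of this project.

```shell
# Poll GET /health until it succeeds or the attempts run out.
# Arguments: base URL, max attempts (default 30), delay in seconds (default 5).
wait_for_health() {
  base_url="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error statuses; -s hides progress output.
    if curl -fs --max-time 5 -o /dev/null "$base_url/health"; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not healthy after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_health "http://your-cluster-ip" 10 3` tries up to ten times, three seconds apart, and returns a non-zero status if the server never responds successfully.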

Usage Example

curl -X POST "http://your-cluster-ip/v1/completions" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "phi3-mini-4k-instruct",
    "prompt": "What is machine learning?",
    "max_tokens": 100
  }'
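The chat endpoint listed above takes a list of messages instead of a single prompt. The sketch below wraps such a request in a small function. The function name `chat_completion` is hypothetical, and the model name is assumed to match the one used in the completion example.

```shell
# Send a single-turn chat request to the /v1/chat/completions endpoint.
# Arguments: base URL, user message. The message must not contain double
# quotes or backslashes, since this sketch does no JSON escaping.
chat_completion() {
  base_url="$1"
  message="$2"
  curl -sS -X POST "$base_url/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "{
      \"model\": \"phi3-mini-4k-instruct\",
      \"messages\": [{\"role\": \"user\", \"content\": \"$message\"}],
      \"max_tokens\": 100
    }"
}
```

Running `chat_completion "http://your-cluster-ip" "What is machine learning?"` should return a JSON response. In the OpenAI-style response format, the reply text sits under `choices[0].message.content`.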