Back
2
1
phi3-mini-deployment
:Phi-3 Mini Deployment Project
This project packages Microsoft's Phi-3 Mini 4K Instruct model for Kubernetes deployment using KitOps and Flux CD.
Model Information
- Model: Microsoft Phi-3 Mini 4K Instruct
- Framework: PyTorch
- Use Case: Text generation and instruction following
- Context Length: 4K tokens
Deployment
This model is deployed using:
- KitOps: Model packaging and artifact management
- Flux CD: GitOps-based continuous deployment
- vLLM: High-performance inference server
API Endpoints
Once deployed, the model will be available at:
POST /v1/completions
- Text completionPOST /v1/chat/completions
- Chat completionsGET /health
- Health check
Usage Example
curl -X POST "http://your-cluster-ip/v1/completions" \
-H "Content-Type: application/json" \
-d '{
"model": "phi3-mini-4k-instruct",
"prompt": "What is machine learning?",
"max_tokens": 100
}'