---
title: Phi-3.5-MoE Expert Assistant
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
entrypoint: start.sh
startup_duration_timeout: 600
pinned: false
license: mit
short_description: AI assistant with expert routing and CPU/GPU support
models:
- microsoft/Phi-3.5-MoE-instruct
---
# 🤖 Phi-3.5-MoE Expert Assistant
A robust, production-ready AI assistant powered by Microsoft's Phi-3.5-MoE model with intelligent expert routing and comprehensive CPU/GPU environment support.
## 🚀 Key Features
- **🧠 Expert Routing**: Automatically routes queries to specialized experts (Code, Math, Reasoning, Multilingual, General)
- **🔧 Environment Adaptive**: Works seamlessly on both CPU and GPU environments
- **🛡️ Robust Dependency Management**: Conditional installation of dependencies based on environment
- **📦 Fault Tolerance**: Handles missing dependencies with fallback mechanisms
- **⚡ Performance Optimized**: Environment-specific optimizations for best performance
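
The expert routing feature can be illustrated with a minimal keyword-based sketch. The keyword lists and the `route_query` function below are hypothetical, since this README does not show the Space's actual routing rules; only the five expert names come from the list above.

```python
import re

# Hypothetical keyword lists, chosen only for illustration.
EXPERT_KEYWORDS = {
    "Code": ["python", "function", "bug", "compile", "code"],
    "Math": ["integral", "equation", "solve", "derivative", "probability"],
    "Reasoning": ["why", "explain", "logic", "compare", "because"],
    "Multilingual": ["translate", "spanish", "french", "german", "japanese"],
}


def route_query(query: str) -> str:
    """Return the expert whose keywords match the query most often."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {
        expert: sum(keyword in words for keyword in keywords)
        for expert, keywords in EXPERT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to the General expert when no keyword matches.
    return best if scores[best] > 0 else "General"


if __name__ == "__main__":
    for q in ["Fix the bug in this Python function", "Hello there"]:
        print(q, "->", route_query(q))
```

A real router might use a classifier or embeddings instead; keyword counting is just the simplest way to show the idea of mapping a query to one of a fixed set of experts with a default.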
## 🔧 Recent Fixes
- ✅ **Missing Dependencies**: Added `einops` to requirements; `flash_attn` is now installed conditionally
- ✅ **Deprecated Parameters**: Replaced all uses of the deprecated `torch_dtype` argument with `dtype`
- ✅ **CPU Compatibility**: Automatic CPU-safe model revision selection
- ✅ **Error Handling**: Comprehensive fallback mechanisms
- ✅ **Security**: Updated to Gradio 4.44.0+ for security fixes
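
As a rough sketch of how the `flash_attn` and `dtype` fixes could fit together, the helper below builds model-loading keyword arguments from two flags. The function names and the specific dtype choices (`"bfloat16"` on GPU, `"float32"` on CPU) are assumptions for illustration, not the code in `app.py`.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def build_load_kwargs(gpu_available: bool, flash_attn_installed: bool) -> dict:
    """Choose loading options for the current environment (illustrative values)."""
    if gpu_available:
        return {
            # `dtype` replaces the deprecated `torch_dtype` argument.
            "dtype": "bfloat16",
            # Only request flash attention when the package is actually present.
            "attn_implementation": (
                "flash_attention_2" if flash_attn_installed else "eager"
            ),
        }
    # CPU: full precision and the plain attention implementation.
    return {"dtype": "float32", "attn_implementation": "eager"}


if __name__ == "__main__":
    kwargs = build_load_kwargs(
        gpu_available=False,
        flash_attn_installed=has_module("flash_attn"),
    )
    print(kwargs)
    # The result would then be passed on, for example:
    # AutoModelForCausalLM.from_pretrained("microsoft/Phi-3.5-MoE-instruct", **kwargs)
```

Keeping the decision in a pure function makes it easy to test both branches without a GPU or the model weights.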
## 🏗️ Architecture
```
app.py # Main application entry point
preinstall.py # Pre-installation script for dependencies
model_patch.py # Patch for handling missing dependencies
start.sh # Startup script
requirements.txt # Core dependencies
```
## 🎯 How It Works
1. **Environment Detection**: Automatically detects CPU vs GPU environment
2. **Dependency Management**: Installs required dependencies based on environment
3. **Model Configuration**: Uses optimal settings for each environment
4. **Expert Routing**: Classifies queries and routes to appropriate expert
5. **Graceful Fallbacks**: Works even when dependencies are missing
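
Steps 1 and 5 above can be sketched as follows. Both `detect_environment` and `load_with_fallback` are hypothetical helpers written for this README, not functions taken from `app.py`; the sketch assumes that GPU detection relies on `torch.cuda.is_available()` and that a fallback means trying the next loader in a list.

```python
from typing import Callable, Iterable, Optional, Tuple


def detect_environment() -> str:
    """Return "gpu" if PyTorch reports a CUDA device, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no way to use a GPU here.
        return "cpu"
    return "gpu" if torch.cuda.is_available() else "cpu"


def load_with_fallback(
    loaders: Iterable[Tuple[str, Callable[[], object]]],
) -> Tuple[Optional[str], Optional[object]]:
    """Try each (name, loader) pair in order and return the first success."""
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            print(f"{name} failed: {exc}")
    # Every loader failed; the caller decides how to degrade further.
    return None, None


if __name__ == "__main__":
    def broken_loader():
        raise RuntimeError("simulated missing dependency")

    print("environment:", detect_environment())
    print(load_with_fallback([("primary", broken_loader), ("fallback", lambda: "stub model")]))
```

Ordering the loaders from most to least capable means the application still starts, with reduced functionality, when an optional dependency is missing.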
## 📊 Performance
| Environment | Startup | Memory | Tokens/sec |
|-------------|---------|--------|------------|
| **CPU** | 3-5 min | 8-12 GB | 2-5 |
| **GPU** | 2-3 min | 16-20 GB | 15-30 |
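
To put the throughput figures in context, the short sketch below converts the tokens/sec ranges from the table into an estimated generation time for a response of a given length. The `generation_time_range` helper is hypothetical, and the estimate ignores startup time and prompt processing.

```python
# Tokens/sec ranges copied from the table above: (slowest, fastest).
THROUGHPUT = {"CPU": (2, 5), "GPU": (15, 30)}


def generation_time_range(environment: str, num_tokens: int) -> tuple:
    """Return (best_case_seconds, worst_case_seconds) for generating num_tokens."""
    slowest, fastest = THROUGHPUT[environment]
    return num_tokens / fastest, num_tokens / slowest


if __name__ == "__main__":
    print(generation_time_range("CPU", 200))  # (40.0, 100.0)
    print(generation_time_range("GPU", 300))  # (10.0, 20.0)
```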
## 🔍 Troubleshooting
If you encounter issues:
1. Check the logs for dependency installation errors
2. Verify the pre-installation script executed successfully
3. Ensure all required packages are installed
4. Try the fallback mode if model loading fails
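
A quick way to carry out step 3 is to check which packages are importable. The sketch below is an assumption about what "required" means here: `einops`, `gradio`, and `flash_attn` are named in this README, while `torch` and `transformers` are inferred from the model-loading fixes. The `missing_modules` helper is hypothetical.

```python
import importlib.util

# Assumed package lists; adjust them to match requirements.txt.
REQUIRED = ["torch", "transformers", "gradio", "einops"]
OPTIONAL = ["flash_attn"]  # installed conditionally, so absence is not an error


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Missing required packages:", missing_modules(REQUIRED) or "none")
    print("Missing optional packages:", missing_modules(OPTIONAL) or "none")
```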
---
**Built with ❤️ for reliable, production-ready AI applications**