---
title: Phi-3.5-MoE Expert Assistant
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
entrypoint: start.sh
startup_duration_timeout: 600
pinned: false
license: mit
short_description: AI assistant with expert routing and CPU/GPU support
models:
  - microsoft/Phi-3.5-MoE-instruct
---

# 🤖 Phi-3.5-MoE Expert Assistant

A robust, production-ready AI assistant powered by Microsoft's Phi-3.5-MoE model, with intelligent expert routing and support for both CPU and GPU environments.

## Key Features

- **🧠 Expert Routing**: Automatically routes each query to a specialized expert (Code, Math, Reasoning, Multilingual, or General)
- **🔧 Environment Adaptive**: Runs on both CPU and GPU environments
- **🛡️ Robust Dependency Management**: Installs dependencies conditionally, based on the detected environment
- **Fault Tolerance**: Handles missing dependencies through fallback mechanisms
- **⚡ Performance Optimized**: Applies environment-specific optimizations
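
To make the expert routing idea concrete, here is a minimal sketch of a keyword-based router. The keyword table and the `route_query` function are hypothetical illustrations, not the Space's actual routing logic, which this README does not show.

```python
import re

# Hypothetical keyword table; the real routing rules are not documented here.
EXPERT_KEYWORDS = {
    "Code": ("python", "function", "bug", "compile", "code"),
    "Math": ("solve", "equation", "integral", "derivative", "probability"),
    "Reasoning": ("why", "explain", "compare", "analyze"),
    "Multilingual": ("translate", "translation", "spanish", "french"),
}


def route_query(query: str) -> str:
    """Return the expert whose keywords overlap most with the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    scores = {
        expert: len(words.intersection(keywords))
        for expert, keywords in EXPERT_KEYWORDS.items()
    }
    # On a tie, max() keeps the expert listed first in the table.
    best = max(scores, key=scores.get)
    # Fall back to the General expert when no keyword matches at all.
    return best if scores[best] > 0 else "General"
```

A query such as `route_query("Fix the bug in this Python function")` would land on the Code expert, while a query with no matching keywords falls through to General.
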
## 🔧 Recent Fixes

- **Missing Dependencies**: Added `einops` to the requirements and made the `flash_attn` installation conditional
- **Deprecated Parameters**: Replaced every use of the deprecated `torch_dtype` parameter with `dtype`
- **CPU Compatibility**: A CPU-safe model revision is now selected automatically
- **Error Handling**: Added comprehensive fallback mechanisms
- **Security**: Updated to Gradio 4.44.0+ to pick up security fixes
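
The conditional handling of `flash_attn` can be sketched roughly as follows. This is an assumption about how such a check might look, not the code in `preinstall.py` or `model_patch.py`; the helper names are hypothetical. The `attn_implementation` values `"flash_attention_2"` and `"eager"` are options accepted by `transformers` when loading a model.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Check whether a top-level module can be imported, without importing it."""
    return importlib.util.find_spec(name) is not None


def build_load_kwargs(on_gpu: bool) -> dict:
    """Hypothetical helper: choose model-loading options for the environment."""
    kwargs = {}
    if on_gpu and module_available("flash_attn"):
        # Flash attention needs a GPU and the optional flash_attn package.
        kwargs["attn_implementation"] = "flash_attention_2"
    else:
        # Safe default that works without any optional dependency.
        kwargs["attn_implementation"] = "eager"
    return kwargs
```

The resulting dictionary could then be passed as keyword arguments to the model-loading call, so the same code path works whether or not `flash_attn` is installed.
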
## Architecture

```
app.py            # Main application entry point
preinstall.py     # Pre-installation script for dependencies
model_patch.py    # Patch for handling missing dependencies
start.sh          # Startup script
requirements.txt  # Core dependencies
```
## 🎯 How It Works

1. **Environment Detection**: Detects whether the Space is running on CPU or GPU
2. **Dependency Management**: Installs the dependencies required for that environment
3. **Model Configuration**: Applies the settings best suited to the environment
4. **Expert Routing**: Classifies each query and routes it to the appropriate expert
5. **Graceful Fallbacks**: Keeps working even when optional dependencies are missing
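
The first and third steps above might look something like the sketch below. `RuntimeConfig`, `detect_device`, and `select_config` are hypothetical names, and the dtype and token-limit values are illustrative placeholders rather than the settings this Space actually uses.

```python
from dataclasses import dataclass


@dataclass
class RuntimeConfig:
    device: str
    dtype_name: str       # illustrative value, not the Space's real setting
    max_new_tokens: int   # illustrative value, not the Space's real setting


def detect_device() -> str:
    """Return "cuda" when PyTorch sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Graceful fallback: without PyTorch, assume a CPU-only environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def select_config(device: str) -> RuntimeConfig:
    """Pick environment-specific settings (values are assumptions)."""
    if device == "cuda":
        return RuntimeConfig(device="cuda", dtype_name="bfloat16", max_new_tokens=512)
    return RuntimeConfig(device="cpu", dtype_name="float32", max_new_tokens=128)


config = select_config(detect_device())
```

Keeping detection and configuration in separate functions makes the CPU path easy to exercise on a machine that has a GPU, by calling `select_config("cpu")` directly.
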
## Performance

| Environment | Startup | Memory | Tokens/sec |
|-------------|---------|--------|------------|
| **CPU** | 3-5 min | 8-12 GB | 2-5 |
| **GPU** | 2-3 min | 16-20 GB | 15-30 |
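
To check the tokens-per-second figures on your own hardware, a small timing helper like the one below could be used. It is a sketch under the assumption that you can wrap generation in a callable returning the generated token IDs; `tokens_per_second` and that callable are hypothetical and not part of `app.py`.

```python
import time
from typing import Callable, Sequence


def tokens_per_second(generate: Callable[[str], Sequence[int]], prompt: str) -> float:
    """Time a single generation call and return generated tokens per second."""
    start = time.perf_counter()
    tokens = generate(prompt)
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very coarse timers.
    return len(tokens) / elapsed if elapsed > 0 else float("inf")
```

A single call is noisy, so averaging several runs after a warm-up generation gives a more representative number.
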
## Troubleshooting

If you encounter issues:

1. Check the logs for dependency installation errors
2. Verify that the pre-installation script ran successfully
3. Ensure all required packages are installed
4. Try the fallback mode if model loading fails
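
For the third step, a quick way to see which packages are present is to query the installed distributions. This is a minimal sketch; the package list is an assumption based on the dependencies mentioned in this README (`torch` and `transformers` are inferred rather than stated), so adjust it to match `requirements.txt`.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    # Assumed package list; flash-attn is optional and expected to be absent on CPU.
    wanted = ["torch", "transformers", "gradio", "einops", "flash-attn"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'MISSING'}")
```
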

---

**Built with ❤️ for reliable, production-ready AI applications**