# Hugging Face Spaces Deployment Guide - HonestAI

## 🚀 Deployment to HF Spaces

This guide covers deploying the updated HonestAI application to [Hugging Face Spaces](https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI).

## 📋 Pre-Deployment Checklist

### ✅ Required Files

- [x] `Dockerfile` - Container configuration
- [x] `requirements.txt` - Python dependencies
- [x] `flask_api_standalone.py` - Main application entry point
- [x] `README.md` - Updated with HonestAI Space URL
- [x] `src/` - All source code
- [x] `.env.example` - Environment variable template

### ✅ Recent Updates Included

- [x] Enhanced configuration management (`src/config.py`)
- [x] Performance metrics tracking (`src/orchestrator_engine.py`)
- [x] Updated model configurations (Llama 3.1 8B, e5-base-v2, Qwen 2.5 1.5B)
- [x] 4-bit quantization support
- [x] Cache directory management
- [x] Memory optimizations

## 🔧 Deployment Steps

### 1. Verify Space Configuration

**Space URL**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI

**Space Settings**:

- **SDK**: Docker
- **Hardware**: T4 GPU (16GB)
- **Visibility**: Public
- **Storage**: Persistent (for cache)

### 2. Set Environment Variables

In Space Settings → Repository secrets, set the following:

- `HF_TOKEN` - Your Hugging Face API token (required)
- `MAX_WORKERS` - Optional (default: 4)
- `LOG_LEVEL` - Optional (default: INFO)
- `HF_HOME` - Optional (auto-configured)

### 3. Verify Dockerfile

The `Dockerfile` is configured for:

- Python 3.10
- Port 7860 (HF Spaces standard)
- Health check endpoint
- Flask API as entry point

### 4. Commit and Push Updates

```bash
# Ensure all changes are committed
git add .
git commit -m "Update: Performance metrics, enhanced config, model optimizations"

# Push to HF Spaces repository
git push origin main
```

### 5. Monitor Build

- **Build Time**: 5-10 minutes (first build may take longer)
- **Watch Logs**: Check Space logs for build progress
- **Health Check**: `/api/health` endpoint should respond after build

## 📊 What's New in This Deployment

### 1. Performance Metrics

Every API response now includes a `performance` object:

```json
{
  "performance": {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,
    "agent_contributions": [...],
    "safety_score": 85.0
  }
}
```

### 2. Enhanced Configuration

- Automatic cache directory management
- Secure environment variable handling
- Backward compatible settings
- Validation and error handling

### 3. Model Optimizations

- **Llama 3.1 8B** with 4-bit quantization (primary)
- **e5-base-v2** for embeddings (768 dimensions)
- **Qwen 2.5 1.5B** for fast classification
- Model preloading for faster responses

### 4. Memory Management

- Optimized history tracking (limited to 50-100 entries)
- Efficient agent call tracking
- Memory-aware caching

## 🧪 Testing After Deployment

### 1. Health Check

```bash
curl https://jatinautonomouslabs-honestai.hf.space/api/health
```

### 2. Test API Endpoint

```python
import requests

# Call the Space's direct app URL (*.hf.space), the same host as the health check.
# The huggingface.co/spaces/... URL is the Space's web page, not the API.
response = requests.post(
    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
    json={
        "message": "Hello, what is machine learning?",
        "session_id": "test-session",
        "user_id": "test-user"
    },
    timeout=120,  # the first request may be slow while models load
)
response.raise_for_status()  # fail loudly on HTTP errors

data = response.json()
print(f"Response: {data['message']}")
print(f"Performance: {data.get('performance', {})}")
```

### 3. Verify Performance Metrics

Check that performance metrics are populated (not all zeros):

- `processing_time` > 0
- `tokens_used` > 0
- `agents_used` > 0
- `agent_contributions` not empty

## 🔍 Troubleshooting

### Build Fails

- Check `requirements.txt` for conflicts
- Verify Python version (3.10)
- Check Dockerfile syntax

### Runtime Errors

- Verify `HF_TOKEN` is set in Space secrets
- Check logs for permission errors
- Verify cache directory is writable

### Performance Issues

- Check GPU memory usage
- Monitor model loading times
- Verify quantization is enabled

### API Not Responding

- Check health endpoint: `/api/health`
- Verify Flask app is running on port 7860
- Check Space logs for errors

## 📝 Post-Deployment

### 1. Update Documentation

- ✅ README.md updated with HonestAI URL
- ✅ HF_SPACES_URL_GUIDE.md updated
- ✅ API_DOCUMENTATION.md includes performance metrics

### 2. Monitor Metrics

- Track response times
- Monitor error rates
- Check performance metrics accuracy

### 3. User Communication

- Announce new features (performance metrics)
- Update API documentation
- Share new Space URL

## 🔗 Quick Links

- **Space**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
- **API Documentation**: See `API_DOCUMENTATION.md`
- **Configuration Guide**: See `.env.example`
- **Performance Metrics**: See `PERFORMANCE_METRICS_IMPLEMENTATION.md`

## ✅ Success Criteria

After deployment, verify:

1. ✅ Space builds successfully
2. ✅ Health endpoint responds
3. ✅ API chat endpoint works
4. ✅ Performance metrics are populated
5. ✅ Models load with 4-bit quantization
6. ✅ Cache directory is configured
7. ✅ Logs show no critical errors

---

**Last Updated**: January 2024
**Space**: JatinAutonomousLabs/HonestAI
**Status**: Ready for Deployment ✅
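
The "performance metrics are populated" check from this guide can be automated rather than inspected by eye. The following is a minimal sketch, not part of the HonestAI codebase: the helper name `find_unpopulated_metrics` is hypothetical, and the shape of the entries inside `agent_contributions` is an assumed placeholder, since this guide does not specify it. Only the four field names come from the guide.

```python
from typing import Any


def find_unpopulated_metrics(performance: dict[str, Any]) -> list[str]:
    """Return the names of metrics that are missing, zero, or empty.

    Hypothetical helper: it only encodes the four checks listed in this guide.
    """
    problems = []
    # Numeric metrics must be present and strictly positive.
    for name in ("processing_time", "tokens_used", "agents_used"):
        value = performance.get(name)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value <= 0:
            problems.append(name)
    # agent_contributions must be present and non-empty.
    if not performance.get("agent_contributions"):
        problems.append("agent_contributions")
    return problems


# Sample payload shaped like the "performance" object shown in this guide.
# The entry inside agent_contributions is a made-up placeholder.
sample = {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "agent_contributions": [{"agent": "example", "percentage": 100.0}],
}

print(find_unpopulated_metrics(sample))
print(find_unpopulated_metrics({"processing_time": 0, "tokens_used": 456}))
```

To use it against a live Space, pass `data.get("performance", {})` from the chat endpoint response and treat any non-empty result as a failed check.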