# Hugging Face Spaces Deployment Guide - HonestAI
## Deployment to HF Spaces
This guide covers deploying the updated HonestAI application to [Hugging Face Spaces](https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI).
## Pre-Deployment Checklist
### ✅ Required Files
- [x] `Dockerfile` - Container configuration
- [x] `requirements.txt` - Python dependencies
- [x] `flask_api_standalone.py` - Main application entry point
- [x] `README.md` - Updated with HonestAI Space URL
- [x] `src/` - All source code
- [x] `.env.example` - Environment variable template
### ✅ Recent Updates Included
- [x] Enhanced configuration management (`src/config.py`)
- [x] Performance metrics tracking (`src/orchestrator_engine.py`)
- [x] Updated model configurations (Llama 3.1 8B, e5-base-v2, Qwen 2.5 1.5B)
- [x] 4-bit quantization support
- [x] Cache directory management
- [x] Memory optimizations
## Deployment Steps
### 1. Verify Space Configuration
**Space URL**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
**Space Settings**:
- **SDK**: Docker
- **Hardware**: T4 GPU (16GB)
- **Visibility**: Public
- **Storage**: Persistent (for cache)
### 2. Set Environment Variables
In Space Settings → Repository secrets, ensure:
- `HF_TOKEN` - Your Hugging Face API token (required)
- `MAX_WORKERS` - Optional (default: 4)
- `LOG_LEVEL` - Optional (default: INFO)
- `HF_HOME` - Optional (auto-configured)
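As a quick sanity check, a minimal sketch of how these variables might be read at startup is shown below; the variable names mirror the list above, but the actual logic in `src/config.py` and the default paths may differ.
```python
import os

# Hypothetical sketch of how src/config.py might pick up the Space secrets;
# defaults mirror the list above and may not match the deployed values.
HF_TOKEN = os.environ["HF_TOKEN"]                       # required - fail fast if missing
MAX_WORKERS = int(os.environ.get("MAX_WORKERS", "4"))   # optional, default 4
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")         # optional, default INFO
HF_HOME = os.environ.get("HF_HOME", "/app/.cache/huggingface")  # optional, auto-configured
```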
### 3. Verify Dockerfile
The `Dockerfile` is configured for:
- Python 3.10
- Port 7860 (HF Spaces standard)
- Health check endpoint
- Flask API as entry point
### 4. Commit and Push Updates
```bash
# Ensure all changes are committed
git add .
git commit -m "Update: Performance metrics, enhanced config, model optimizations"
# Push to HF Spaces repository
git push origin main
```
### 5. Monitor Build
- **Build Time**: 5-10 minutes (first build may take longer)
- **Watch Logs**: Check Space logs for build progress
- **Health Check**: `/api/health` endpoint should respond after build
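If you prefer to script the last check, a small polling loop against the health endpoint (URL pattern taken from the testing section below) can wait for the Space to come up:
```python
import time
import requests

# Poll the health endpoint until the rebuilt Space responds (up to ~10 minutes).
HEALTH_URL = "https://jatinautonomouslabs-honestai.hf.space/api/health"
for attempt in range(60):
    try:
        if requests.get(HEALTH_URL, timeout=10).status_code == 200:
            print(f"Space is up after {attempt * 10}s")
            break
    except requests.RequestException:
        pass  # container still building or restarting
    time.sleep(10)
else:
    print("Space did not respond within 10 minutes - check the build logs")
```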
## What's New in This Deployment
### 1. Performance Metrics
Every API response now includes comprehensive performance data:
```json
{
  "performance": {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,
    "agent_contributions": [...],
    "safety_score": 85.0
  }
}
```
### 2. Enhanced Configuration
- Automatic cache directory management
- Secure environment variable handling
- Backward compatible settings
- Validation and error handling
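As an illustration of the cache-directory handling listed above, the idea is roughly the sketch below; the real implementation lives in `src/config.py` and may differ.
```python
import os
from pathlib import Path

# Illustrative sketch only - the actual cache handling is in src/config.py.
def ensure_cache_dir() -> Path:
    cache_dir = Path(os.environ.get("HF_HOME", "/app/.cache/huggingface"))
    cache_dir.mkdir(parents=True, exist_ok=True)       # create it if missing
    if not os.access(cache_dir, os.W_OK):              # matches the "writable" check in Troubleshooting
        raise PermissionError(f"Cache directory {cache_dir} is not writable")
    os.environ.setdefault("HF_HOME", str(cache_dir))   # make downstream libraries use it
    return cache_dir
```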
### 3. Model Optimizations
- **Llama 3.1 8B** with 4-bit quantization (primary)
- **e5-base-v2** for embeddings (768 dimensions)
- **Qwen 2.5 1.5B** for fast classification
- Model preloading for faster responses
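For reference, loading the primary model with 4-bit quantization via `transformers` and `bitsandbytes` looks roughly like the sketch below; the repo ID and quantization settings are assumptions, and the deployed values live in the updated model configuration.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed repo ID for "Llama 3.1 8B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # 4-bit weights so the model fits a 16 GB T4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
```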
### 4. Memory Management
- Optimized history tracking (limited to 50-100 entries)
- Efficient agent call tracking
- Memory-aware caching
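A bounded history like this is typically a fixed-size deque; a hypothetical sketch (the orchestrator's actual data structures may differ):
```python
from collections import deque

MAX_HISTORY = 100  # the guide caps history at 50-100 entries

# Oldest entries are evicted automatically once the cap is reached.
request_history = deque(maxlen=MAX_HISTORY)

def record_request(entry: dict) -> None:
    request_history.append(entry)  # O(1) append; memory stays bounded
```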
## Testing After Deployment
### 1. Health Check
```bash
curl https://jatinautonomouslabs-honestai.hf.space/api/health
```
### 2. Test API Endpoint
```python
import requests
response = requests.post(
    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
    json={
        "message": "Hello, what is machine learning?",
        "session_id": "test-session",
        "user_id": "test-user"
    }
)
data = response.json()
print(f"Response: {data['message']}")
print(f"Performance: {data.get('performance', {})}")
```
### 3. Verify Performance Metrics
Check that performance metrics are populated (not all zeros):
- `processing_time` > 0
- `tokens_used` > 0
- `agents_used` > 0
- `agent_contributions` not empty
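A short script can automate these checks, reusing the chat endpoint from the test above:
```python
import requests

resp = requests.post(
    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
    json={"message": "ping", "session_id": "metrics-check", "user_id": "test-user"},
)
perf = resp.json().get("performance", {})

# Fail loudly if any metric is still at its zero/empty default.
assert perf.get("processing_time", 0) > 0, "processing_time is zero"
assert perf.get("tokens_used", 0) > 0, "tokens_used is zero"
assert perf.get("agents_used", 0) > 0, "agents_used is zero"
assert perf.get("agent_contributions"), "agent_contributions is empty"
print("Performance metrics look populated:", perf)
```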
## Troubleshooting
### Build Fails
- Check `requirements.txt` for conflicts
- Verify Python version (3.10)
- Check Dockerfile syntax
### Runtime Errors
- Verify `HF_TOKEN` is set in Space secrets
- Check logs for permission errors
- Verify cache directory is writable
### Performance Issues
- Check GPU memory usage
- Monitor model loading times
- Verify quantization is enabled
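To check GPU memory usage from inside the running container, a quick PyTorch snapshot helps:
```python
import torch

# Quick GPU memory snapshot for the T4; run inside the Space container.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    total_gb = props.total_memory / 1e9
    allocated_gb = torch.cuda.memory_allocated(0) / 1e9
    reserved_gb = torch.cuda.memory_reserved(0) / 1e9
    print(f"{props.name}: total {total_gb:.1f} GB, "
          f"allocated {allocated_gb:.1f} GB, reserved {reserved_gb:.1f} GB")
else:
    print("No GPU visible - check the Space hardware setting")
```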
### API Not Responding
- Check health endpoint: `/api/health`
- Verify Flask app is running on port 7860
- Check Space logs for errors
## Post-Deployment
### 1. Update Documentation
- ✅ README.md updated with HonestAI URL
- ✅ HF_SPACES_URL_GUIDE.md updated
- ✅ API_DOCUMENTATION.md includes performance metrics
### 2. Monitor Metrics
- Track response times
- Monitor error rates
- Check performance metrics accuracy
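One lightweight way to sample response times is shown below, polling the health endpoint; a real setup would push the numbers to a monitoring system rather than printing them.
```python
import time
import requests

HEALTH_URL = "https://jatinautonomouslabs-honestai.hf.space/api/health"

# Sample latency a few times, one request per minute.
for _ in range(5):
    start = time.perf_counter()
    ok = requests.get(HEALTH_URL, timeout=30).ok
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"ok={ok} latency={latency_ms:.0f} ms")
    time.sleep(60)
```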
### 3. User Communication
- Announce new features (performance metrics)
- Update API documentation
- Share new Space URL
## Quick Links
- **Space**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
- **API Documentation**: See `API_DOCUMENTATION.md`
- **Configuration Guide**: See `.env.example`
- **Performance Metrics**: See `PERFORMANCE_METRICS_IMPLEMENTATION.md`
## ✅ Success Criteria
After deployment, verify:
1. ✅ Space builds successfully
2. ✅ Health endpoint responds
3. ✅ API chat endpoint works
4. ✅ Performance metrics are populated
5. ✅ Models load with 4-bit quantization
6. ✅ Cache directory is configured
7. ✅ Logs show no critical errors
---
**Last Updated**: January 2024
**Space**: JatinAutonomousLabs/HonestAI
**Status**: Ready for Deployment ✅