# Hugging Face Spaces Deployment Guide - HonestAI
## Deployment to HF Spaces
This guide covers deploying the updated HonestAI application to [Hugging Face Spaces](https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI).
## Pre-Deployment Checklist
### ✅ Required Files
- [x] `Dockerfile` - Container configuration
- [x] `requirements.txt` - Python dependencies
- [x] `flask_api_standalone.py` - Main application entry point
- [x] `README.md` - Updated with HonestAI Space URL
- [x] `src/` - All source code
- [x] `.env.example` - Environment variable template
### ✅ Recent Updates Included
- [x] Enhanced configuration management (`src/config.py`)
- [x] Performance metrics tracking (`src/orchestrator_engine.py`)
- [x] Updated model configurations (Llama 3.1 8B, e5-base-v2, Qwen 2.5 1.5B)
- [x] 4-bit quantization support
- [x] Cache directory management
- [x] Memory optimizations
## Deployment Steps
### 1. Verify Space Configuration
**Space URL**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
**Space Settings**:
- **SDK**: Docker
- **Hardware**: T4 GPU (16GB)
- **Visibility**: Public
- **Storage**: Persistent (for cache)
### 2. Set Environment Variables
In Space Settings → Repository secrets, ensure:
- `HF_TOKEN` - Your Hugging Face API token (required)
- `MAX_WORKERS` - Optional (default: 4)
- `LOG_LEVEL` - Optional (default: INFO)
- `HF_HOME` - Optional (auto-configured)
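As a quick sanity check, a minimal sketch of how these variables might be read at startup is shown below; the variable names mirror the list above, but the actual logic in `src/config.py` and the default paths may differ.
```python
import os

# Hypothetical sketch of how src/config.py might pick up the Space secrets;
# defaults mirror the list above and may not match the deployed values.
HF_TOKEN = os.environ["HF_TOKEN"]                       # required - fail fast if missing
MAX_WORKERS = int(os.environ.get("MAX_WORKERS", "4"))   # optional, default 4
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")         # optional, default INFO
HF_HOME = os.environ.get("HF_HOME", "/app/.cache/huggingface")  # optional, auto-configured
```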
### 3. Verify Dockerfile
The `Dockerfile` is configured for:
- Python 3.10
- Port 7860 (HF Spaces standard)
- Health check endpoint
- Flask API as entry point
### 4. Commit and Push Updates
```bash
# Ensure all changes are committed
git add .
git commit -m "Update: Performance metrics, enhanced config, model optimizations"
# Push to HF Spaces repository
git push origin main
```
### 5. Monitor Build
- **Build Time**: 5-10 minutes (first build may take longer)
- **Watch Logs**: Check Space logs for build progress
- **Health Check**: `/api/health` endpoint should respond after build
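If you prefer to script the last check, a small polling loop against the health endpoint (URL pattern taken from the testing section below) can wait for the Space to come up:
```python
import time
import requests

# Poll the health endpoint until the rebuilt Space responds (up to ~10 minutes).
HEALTH_URL = "https://jatinautonomouslabs-honestai.hf.space/api/health"
for attempt in range(60):
    try:
        if requests.get(HEALTH_URL, timeout=10).status_code == 200:
            print(f"Space is up after {attempt * 10}s")
            break
    except requests.RequestException:
        pass  # container still building or restarting
    time.sleep(10)
else:
    print("Space did not respond within 10 minutes - check the build logs")
```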
## What's New in This Deployment
### 1. Performance Metrics
Every API response now includes comprehensive performance data:
```json
{
  "performance": {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,
    "agent_contributions": [...],
    "safety_score": 85.0
  }
}
```
### 2. Enhanced Configuration
- Automatic cache directory management
- Secure environment variable handling
- Backward compatible settings
- Validation and error handling
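As an illustration of the cache-directory handling listed above, the idea is roughly the sketch below; the real implementation lives in `src/config.py` and may differ.
```python
import os
from pathlib import Path

# Illustrative sketch only - the actual cache handling is in src/config.py.
def ensure_cache_dir() -> Path:
    cache_dir = Path(os.environ.get("HF_HOME", "/app/.cache/huggingface"))
    cache_dir.mkdir(parents=True, exist_ok=True)       # create it if missing
    if not os.access(cache_dir, os.W_OK):              # matches the "writable" check in Troubleshooting
        raise PermissionError(f"Cache directory {cache_dir} is not writable")
    os.environ.setdefault("HF_HOME", str(cache_dir))   # make downstream libraries use it
    return cache_dir
```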
### 3. Model Optimizations
- **Llama 3.1 8B** with 4-bit quantization (primary)
- **e5-base-v2** for embeddings (768 dimensions)
- **Qwen 2.5 1.5B** for fast classification
- Model preloading for faster responses
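For reference, loading the primary model with 4-bit quantization via `transformers` and `bitsandbytes` looks roughly like the sketch below; the repo ID and quantization settings are assumptions, and the deployed values live in the updated model configuration.
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed repo ID for "Llama 3.1 8B"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # 4-bit weights so the model fits a 16 GB T4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)
```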
### 4. Memory Management
- Optimized history tracking (limited to 50-100 entries)
- Efficient agent call tracking
- Memory-aware caching
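A bounded history like this is typically a fixed-size deque; a hypothetical sketch (the orchestrator's actual data structures may differ):
```python
from collections import deque

MAX_HISTORY = 100  # the guide caps history at 50-100 entries

# Oldest entries are evicted automatically once the cap is reached.
request_history = deque(maxlen=MAX_HISTORY)

def record_request(entry: dict) -> None:
    request_history.append(entry)  # O(1) append; memory stays bounded
```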
## Testing After Deployment
### 1. Health Check
```bash
curl https://jatinautonomouslabs-honestai.hf.space/api/health
```
### 2. Test API Endpoint
```python
import requests
response = requests.post(
    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
    json={
        "message": "Hello, what is machine learning?",
        "session_id": "test-session",
        "user_id": "test-user"
    }
)
data = response.json()
print(f"Response: {data['message']}")
print(f"Performance: {data.get('performance', {})}")
```
### 3. Verify Performance Metrics
Check that performance metrics are populated (not all zeros):
- `processing_time` > 0
- `tokens_used` > 0
- `agents_used` > 0
- `agent_contributions` not empty
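A short script can automate these checks, reusing the chat endpoint from the test above:
```python
import requests

resp = requests.post(
    "https://jatinautonomouslabs-honestai.hf.space/api/chat",
    json={"message": "ping", "session_id": "metrics-check", "user_id": "test-user"},
)
perf = resp.json().get("performance", {})

# Fail loudly if any metric is still at its zero/empty default.
assert perf.get("processing_time", 0) > 0, "processing_time is zero"
assert perf.get("tokens_used", 0) > 0, "tokens_used is zero"
assert perf.get("agents_used", 0) > 0, "agents_used is zero"
assert perf.get("agent_contributions"), "agent_contributions is empty"
print("Performance metrics look populated:", perf)
```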
## Troubleshooting
### Build Fails
- Check `requirements.txt` for conflicts
- Verify Python version (3.10)
- Check Dockerfile syntax
### Runtime Errors
- Verify `HF_TOKEN` is set in Space secrets
- Check logs for permission errors
- Verify cache directory is writable
### Performance Issues
- Check GPU memory usage
- Monitor model loading times
- Verify quantization is enabled
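To check GPU memory usage from inside the running container, a quick PyTorch snapshot helps:
```python
import torch

# Quick GPU memory snapshot for the T4; run inside the Space container.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    total_gb = props.total_memory / 1e9
    allocated_gb = torch.cuda.memory_allocated(0) / 1e9
    reserved_gb = torch.cuda.memory_reserved(0) / 1e9
    print(f"{props.name}: total {total_gb:.1f} GB, "
          f"allocated {allocated_gb:.1f} GB, reserved {reserved_gb:.1f} GB")
else:
    print("No GPU visible - check the Space hardware setting")
```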
### API Not Responding
- Check health endpoint: `/api/health`
- Verify Flask app is running on port 7860
- Check Space logs for errors
## Post-Deployment
### 1. Update Documentation
- ✅ README.md updated with HonestAI URL
- ✅ HF_SPACES_URL_GUIDE.md updated
- ✅ API_DOCUMENTATION.md includes performance metrics
### 2. Monitor Metrics
- Track response times
- Monitor error rates
- Check performance metrics accuracy
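One lightweight way to sample response times is shown below, polling the health endpoint; a real setup would push the numbers to a monitoring system rather than printing them.
```python
import time
import requests

HEALTH_URL = "https://jatinautonomouslabs-honestai.hf.space/api/health"

# Sample latency a few times, one request per minute.
for _ in range(5):
    start = time.perf_counter()
    ok = requests.get(HEALTH_URL, timeout=30).ok
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"ok={ok} latency={latency_ms:.0f} ms")
    time.sleep(60)
```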
### 3. User Communication
- Announce new features (performance metrics)
- Update API documentation
- Share new Space URL
## Quick Links
- **Space**: https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
- **API Documentation**: See `API_DOCUMENTATION.md`
- **Configuration Guide**: See `.env.example`
- **Performance Metrics**: See `PERFORMANCE_METRICS_IMPLEMENTATION.md`
## ✅ Success Criteria
After deployment, verify:
1. ✅ Space builds successfully
2. ✅ Health endpoint responds
3. ✅ API chat endpoint works
4. ✅ Performance metrics are populated
5. ✅ Models load with 4-bit quantization
6. ✅ Cache directory is configured
7. ✅ Logs show no critical errors
---
**Last Updated**: January 2024
**Space**: JatinAutonomousLabs/HonestAI
**Status**: Ready for Deployment ✅