# Security Configuration Guide
## Environment Variables for Security
Add these to your `.env` file or Space Settings → Repository secrets:
```bash
# ==================== Security Configuration ====================
# OMP_NUM_THREADS: Number of OpenMP threads (must be positive integer)
# Default: 4, Range: 1-8 (adjust based on CPU cores)
# IMPORTANT: Must be a valid positive integer, not empty string
OMP_NUM_THREADS=4
# MKL_NUM_THREADS: Number of MKL threads (must be positive integer)
# Default: 4, Range: 1-8
# IMPORTANT: Must be a valid positive integer, not empty string
MKL_NUM_THREADS=4
# LOG_DIR: Directory for log files (ensure secure permissions)
# Default: /tmp/logs
LOG_DIR=/tmp/logs
# RATE_LIMIT_ENABLED: Enable rate limiting (true/false)
# Default: true (recommended for production)
# Set to false only for development/testing
RATE_LIMIT_ENABLED=true
```
## Security Features Implemented
### 1. OMP_NUM_THREADS Validation
- ✅ Automatic validation on startup
- ✅ Defaults to 4 if invalid or missing
- ✅ Prevents OpenMP "Invalid value for environment variable" errors at startup
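
The validation described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name `validated_thread_count` is hypothetical, and clamping to a maximum of 8 is an assumption based on the documented 1-8 range.

```python
import logging
import os

logger = logging.getLogger(__name__)


def validated_thread_count(name: str, default: int = 4, maximum: int = 8) -> int:
    """Return a positive integer for env var `name`, falling back to `default`.

    Hypothetical helper: the clamp to `maximum` is an assumption drawn from
    the documented 1-8 range, not confirmed behaviour.
    """
    raw = os.environ.get(name, "").strip()
    try:
        value = int(raw)  # raises ValueError for "", "invalid", "4.5", ...
        if value < 1:
            raise ValueError("must be positive")
    except ValueError:
        logger.warning("Invalid %s=%r; defaulting to %d", name, raw, default)
        value = default
    value = min(value, maximum)
    # Write the sanitized value back so native libraries read a valid setting.
    os.environ[name] = str(value)
    return value
```

A check like this would typically need to run before importing libraries that read `OMP_NUM_THREADS` or `MKL_NUM_THREADS` at load time, for example `validated_thread_count("OMP_NUM_THREADS")` at the very top of the entry point.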
### 2. Security Headers
All responses include:
- `X-Content-Type-Options: nosniff` - Prevents MIME type sniffing
- `X-Frame-Options: DENY` - Prevents clickjacking
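- `X-XSS-Protection: 1; mode=block` - Enables the legacy XSS filter in older browsers (modern browsers ignore this header, so rely on `Content-Security-Policy` for XSS defense)
- `Strict-Transport-Security` - Tells browsers to use HTTPS for all future requests to the site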
- `Content-Security-Policy` - Restricts resource loading
- `Referrer-Policy` - Controls referrer information
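
One way to attach these headers to every response is a small WSGI middleware, sketched below with the standard library only. The class name `SecurityHeadersMiddleware` is hypothetical, and the values for `Strict-Transport-Security`, `Content-Security-Policy`, and `Referrer-Policy` are placeholder assumptions because this guide does not state them. The project itself may use a different mechanism, such as a Flask `after_request` hook.

```python
# The first three values come from the list above; the last three are
# placeholder assumptions, since the guide does not give their values.
SECURITY_HEADERS = [
    ("X-Content-Type-Options", "nosniff"),
    ("X-Frame-Options", "DENY"),
    ("X-XSS-Protection", "1; mode=block"),
    ("Strict-Transport-Security", "max-age=31536000; includeSubDomains"),
    ("Content-Security-Policy", "default-src 'self'"),
    ("Referrer-Policy", "strict-origin-when-cross-origin"),
]


class SecurityHeadersMiddleware:
    """Wrap a WSGI app and add security headers the app did not set itself."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Header names are case-insensitive, so compare in lowercase.
            present = {name.lower() for name, _ in headers}
            extra = [
                (name, value)
                for name, value in SECURITY_HEADERS
                if name.lower() not in present
            ]
            return start_response(status, list(headers) + extra, exc_info)

        return self.app(environ, patched_start_response)
```

With Flask, a wrapper like this can be installed with `app.wsgi_app = SecurityHeadersMiddleware(app.wsgi_app)`. Headers already set by a view are left untouched, so individual endpoints can still override a policy.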
### 3. Rate Limiting
- ✅ Enabled by default (configurable via `RATE_LIMIT_ENABLED`)
- ✅ Default limits: 200/day, 50/hour, 10/minute per IP
- ✅ Endpoint-specific limits:
  - `/api/chat`: 10 requests/minute
  - `/api/initialize`: 5 requests/minute
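
To illustrate how stacked limits such as 200/day, 50/hour, and 10/minute interact, here is a minimal in-memory sliding-window sketch. It is not the project's limiter (which this guide does not name); the class `SlidingWindowLimiter` and its method `allow` are hypothetical, and rejected requests are deliberately not counted against the client.

```python
import time
from collections import defaultdict, deque

# (max requests, window in seconds), mirroring the default limits listed above.
DEFAULT_LIMITS = ((10, 60), (50, 3600), (200, 86400))


class SlidingWindowLimiter:
    """Hypothetical per-key limiter; a request must pass every configured limit."""

    def __init__(self, limits=DEFAULT_LIMITS):
        self.limits = tuple(limits)
        self.longest = max(window for _, window in self.limits)
        self.hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        hits = self.hits[key]
        # Forget timestamps that are older than the longest window.
        while hits and now - hits[0] >= self.longest:
            hits.popleft()
        for max_requests, window in self.limits:
            recent = sum(1 for stamp in hits if now - stamp < window)
            if recent >= max_requests:
                return False  # the caller would respond with HTTP 429
        hits.append(now)
        return True
```

An endpoint-specific limit can be modelled with a second instance, for example `SlidingWindowLimiter(limits=((5, 60),))` keyed by client IP for `/api/initialize`. Note that state held in process memory like this is not shared between Gunicorn workers, so a multi-worker deployment would need a shared store to enforce the limits exactly.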
### 4. Secure Logging
- ✅ Log files with 600 permissions (owner read/write only)
- ✅ Log directory with 700 permissions
- ✅ Automatic sensitive data sanitization (tokens, passwords, keys)
- ✅ Rotating file handler (10MB max, 5 backups)
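
The logging setup described above can be approximated with the standard library as follows. This is a sketch under assumptions: the function `build_secure_logger`, the `RedactingFilter` class, and the `SENSITIVE` pattern are hypothetical, and the real sanitizer may recognise different fields.

```python
import logging
import os
import re
from logging.handlers import RotatingFileHandler

# Hypothetical pattern: masks the value after token/password/secret/key names,
# e.g. "HF_TOKEN=abc" or "password: hunter2".
SENSITIVE = re.compile(
    r"(?i)(token|password|passwd|secret|api[_-]?key)([\"']?\s*[=:]\s*)(\S+)"
)


class RedactingFilter(logging.Filter):
    """Replace sensitive values in the formatted message before it is written."""

    def filter(self, record):
        message = record.getMessage()  # applies %-style args first
        record.msg = SENSITIVE.sub(r"\1\2[REDACTED]", message)
        record.args = None
        return True


def build_secure_logger(log_dir, name="app"):
    os.makedirs(log_dir, exist_ok=True)
    os.chmod(log_dir, 0o700)  # owner-only directory
    log_path = os.path.join(log_dir, "app.log")
    # Create the file with mode 600 before the handler opens it.
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    os.close(fd)
    os.chmod(log_path, 0o600)
    handler = RotatingFileHandler(log_path, maxBytes=10 * 1024 * 1024, backupCount=5)
    handler.addFilter(RedactingFilter())
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```

One caveat with this approach: after a rollover, the handler creates a fresh `app.log` using the process umask, so setting `os.umask(0o077)` at startup is one way to keep new files at 600 as well.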
### 5. Production WSGI Server
- ✅ Gunicorn replaces Flask dev server
- ✅ 4 workers, 2 threads per worker
- ✅ 120 second timeout
- ✅ Access and error logging
### 6. Database Indexes
- ✅ Indexes on frequently queried columns
- ✅ Performance optimization for session lookups
- ✅ Automatic index creation on database init
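
Automatic index creation on database init can be made idempotent with `CREATE INDEX IF NOT EXISTS`. The sketch below assumes SQLite and a hypothetical `sessions` table; this guide does not name the database engine, tables, or columns, so every identifier here is illustrative.

```python
import sqlite3

# Hypothetical schema: table and column names are placeholders.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id TEXT PRIMARY KEY,
    user_id TEXT,
    last_activity TEXT
);
CREATE INDEX IF NOT EXISTS idx_sessions_user_id ON sessions (user_id);
CREATE INDEX IF NOT EXISTS idx_sessions_last_activity ON sessions (last_activity);
"""


def init_db(path=":memory:"):
    """Create tables and indexes; safe to call on every startup."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def index_names(conn, table):
    """List explicitly created indexes (implicit primary-key indexes have no SQL)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = ? AND sql IS NOT NULL "
        "ORDER BY name",
        (table,),
    )
    return [row[0] for row in rows]
```

Because every statement uses `IF NOT EXISTS`, running `init_db` against an existing database file leaves current tables and indexes in place.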
## Production Deployment
### Using Gunicorn (Recommended)
The Dockerfile is configured to use Gunicorn automatically. For manual deployment:
```bash
gunicorn \
  --bind 0.0.0.0:7860 \
  --workers 4 \
  --threads 2 \
  --timeout 120 \
  --access-logfile - \
  --error-logfile - \
  --log-level info \
  flask_api_standalone:app
```
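
The same settings can also be kept in a Gunicorn configuration file, which is plain Python. The file name `gunicorn.conf.py` and its location are assumptions here; the repository may not ship such a file.

```python
# gunicorn.conf.py (hypothetical file, mirroring the command-line flags above)
bind = "0.0.0.0:7860"
workers = 4
threads = 2
timeout = 120       # seconds before an unresponsive worker is restarted
accesslog = "-"     # "-" writes the access log to stdout
errorlog = "-"      # "-" writes the error log to stderr
loglevel = "info"
```

It would then be loaded with `gunicorn -c gunicorn.conf.py flask_api_standalone:app`.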
### Using Production Script
```bash
chmod +x scripts/start_production.sh
./scripts/start_production.sh
```
## Security Checklist
Before deploying to production:
- [ ] Verify `HF_TOKEN` is set in Space secrets
- [ ] Verify `OMP_NUM_THREADS` is a valid positive integer
- [ ] Verify `RATE_LIMIT_ENABLED=true` (unless testing)
- [ ] Verify log directory permissions are secure
- [ ] Verify Gunicorn is used (not Flask dev server)
- [ ] Verify security headers are present in responses
- [ ] Verify rate limiting is working
- [ ] Verify sensitive data is sanitized in logs
## Testing Security Features
### Test Rate Limiting
```bash
# Send 11 requests and print only the HTTP status code of each.
# With the 10/minute limit, the first 10 should not be 429; the 11th should be 429.
for i in {1..11}; do
  curl -s -o /dev/null -w "%{http_code}\n" -X POST http://localhost:7860/api/chat \
    -H "Content-Type: application/json" \
    -d '{"message":"test","session_id":"test"}'
done
```
### Test Security Headers
```bash
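# Match all six security headers, not only those starting with "X-"
curl -sI http://localhost:7860/api/health | grep -iE "^(x-|strict-transport-security|content-security-policy|referrer-policy)"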
```
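
For an automated check, for example in CI, the header test can be scripted. The sketch below is an assumption-laden helper rather than part of the project: `missing_security_headers` and `check_url` are hypothetical names, and the expected list simply mirrors the headers documented in this guide.

```python
from urllib.request import urlopen

# Header names taken from the "Security Headers" section of this guide.
EXPECTED_HEADERS = (
    "X-Content-Type-Options",
    "X-Frame-Options",
    "X-XSS-Protection",
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "Referrer-Policy",
)


def missing_security_headers(headers):
    """Return the expected header names absent from a name -> value mapping."""
    present = {name.lower() for name in headers}  # names are case-insensitive
    return [name for name in EXPECTED_HEADERS if name.lower() not in present]


def check_url(url):
    """Fetch `url` and report which expected security headers are missing."""
    with urlopen(url, timeout=10) as response:
        return missing_security_headers(dict(response.headers.items()))
```

Calling `check_url("http://localhost:7860/api/health")` against a running instance should return an empty list when all six headers are present.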
### Test OMP_NUM_THREADS Validation
```bash
# Test with invalid value
export OMP_NUM_THREADS="invalid"
python flask_api_standalone.py
# Should default to 4 and log warning
```
## Monitoring
### Log Files
- Location: `$LOG_DIR/app.log` (default: `/tmp/logs/app.log`)
- Permissions: 600 (owner read/write only)
- Rotation: 10MB max, 5 backups
### Security Alerts
Monitor logs for:
- Rate limit violations (429 responses)
- Invalid OMP_NUM_THREADS values
- Failed authentication attempts
- Unusual request patterns
## Troubleshooting
### Rate Limiting Too Aggressive
```bash
# Disable for testing (NOT recommended for production)
export RATE_LIMIT_ENABLED=false
```
### Log Permission Errors
```bash
# Set log directory manually
export LOG_DIR=/path/to/writable/directory
mkdir -p "$LOG_DIR"
chmod 700 "$LOG_DIR"
```
### OMP_NUM_THREADS Errors
```bash
# Ensure valid integer
export OMP_NUM_THREADS=4 # Must be positive integer
```
## Best Practices
1. **Always use Gunicorn in production** - Never use Flask dev server
2. **Keep rate limiting enabled** - Only disable for local development
3. **Monitor log files** - Check for suspicious activity
4. **Confirm log rotation is active** - The rotating handler (10MB max, 5 backups) prevents disk space issues
5. **Validate environment variables** - Ensure OMP_NUM_THREADS is valid
6. **Use HTTPS** - Browsers ignore the Strict-Transport-Security header when it is sent over plain HTTP
7. **Review security headers** - Ensure they match your requirements