title: AI Research Assistant MVP
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
tags:
  - ai
  - chatbot
  - research
  - education
  - transformers
models:
  - meta-llama/Llama-3.1-8B-Instruct
  - intfloat/e5-base-v2
  - Qwen/Qwen2.5-1.5B-Instruct
datasets:
  - wikipedia
  - commoncrawl
base_path: research-assistant
hf_oauth: true
hf_token: true
disable_embedding: false
duplicated_from: null
extra_gated_prompt: null
extra_gated_fields: {}
gated: false
public: true

AI Research Assistant - MVP

HF Spaces Python Gradio NVIDIA T4

Academic-grade AI assistant with transparent reasoning and mobile-optimized interface

Demo Documentation

🎯 Overview

This MVP demonstrates an intelligent research assistant framework featuring transparent reasoning chains, specialized agent architecture, and mobile-first design. Built for Hugging Face Spaces with NVIDIA T4 GPU acceleration for local model inference.

Key Differentiators

  • 🔍 Transparent Reasoning: Watch the AI think step-by-step with Chain of Thought
  • 🧠 Specialized Agents: Multiple AI models working together for optimal performance
  • 📱 Mobile-First: Optimized for seamless mobile web experience
  • 🎓 Academic Focus: Designed for research and educational use cases

📚 API Documentation

Comprehensive API documentation is available: API_DOCUMENTATION.md

The API provides REST endpoints for:

  • Chat interactions with AI assistant
  • Health checks
  • Context management
  • Session tracking

Quick API Example:

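import requests

# Send a chat message to the assistant's REST endpoint
response = requests.post(
    "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/chat",
    json={
        "message": "What is machine learning?",
        "session_id": "my-session",
        "user_id": "user-123"
    },
    timeout=60,  # avoid hanging indefinitely on a slow response
)
response.raise_for_status()  # surface HTTP errors before parsing the body
data = response.json()
print(data["message"])
print(f"Performance: {data.get('performance', {})}")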

🚀 Quick Start

Option 1: Use Our Demo

Visit our live demo on Hugging Face Spaces:

https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI

Option 2: Deploy Your Own Instance

Prerequisites

  • Hugging Face account with write token
  • Basic understanding of Hugging Face Spaces

Deployment Steps

  1. Fork this space using the Hugging Face UI
  2. Add your HF token in Space Settings:
    • Go to your Space → Settings → Repository secrets
    • Add HF_TOKEN with your Hugging Face token
  3. The space will auto-build (takes 5-10 minutes)

Manual Build (Advanced)

# Clone the repository
git clone https://huggingface.co/spaces/your-username/research-assistant
cd research-assistant

# Install dependencies
pip install -r requirements.txt

# Set up environment
export HF_TOKEN="your_hugging_face_token_here"

# Launch the application (multiple options)
python main.py          # Full integration with error handling
python launch.py        # Simple launcher
python app.py           # UI-only mode

๐Ÿ“ Integration Structure

The MVP now includes complete integration files for deployment:

├── main.py                    # 🎯 Main integration entry point
├── launch.py                  # 🚀 Simple launcher for HF Spaces
├── app.py                     # 📱 Mobile-optimized UI
├── requirements.txt           # 📦 Dependencies
└── src/
    ├── __init__.py           # 📦 Package initialization
    ├── database.py           # 🗄️ SQLite database management
    ├── event_handlers.py     # 🔗 UI event integration
    ├── config.py             # ⚙️ Configuration
    ├── llm_router.py         # 🤖 LLM routing
    ├── orchestrator_engine.py # 🎭 Request orchestration
    ├── context_manager.py    # 🧠 Context management
    ├── mobile_handlers.py    # 📱 Mobile UX handlers
    └── agents/
        ├── __init__.py       # 🤖 Agents package
        ├── intent_agent.py   # 🎯 Intent recognition
        ├── synthesis_agent.py # ✨ Response synthesis
        └── safety_agent.py   # 🛡️ Safety checking

Key Features:

  • 🔄 Graceful Degradation: Falls back to mock mode if components fail
  • 📱 Mobile-First: Optimized for mobile devices and small screens
  • 🗄️ Database Ready: SQLite integration with session management
  • 🔗 Event Handling: Complete UI-to-backend integration
  • ⚡ Error Recovery: Robust error handling throughout
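
The graceful-degradation behavior listed above could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual code: `load_component`, `MockOrchestrator`, its `process` method, and the class name `Orchestrator` are hypothetical; only the module path `src.orchestrator_engine` comes from the file listing.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


class MockOrchestrator:
    """Hypothetical stand-in used when the real component cannot be loaded."""

    def process(self, message: str) -> dict:
        return {"message": f"[mock mode] Received: {message}", "mock": True}


def load_component(module_name: str, attr: str, fallback):
    """Return module_name.attr, or fallback if the import or lookup fails."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr)
    except (ImportError, AttributeError) as exc:
        logger.warning("Falling back to mock for %s.%s: %s", module_name, attr, exc)
        return fallback


# "Orchestrator" is an assumed class name; the real one may differ.
OrchestratorClass = load_component(
    "src.orchestrator_engine", "Orchestrator", MockOrchestrator
)
```

Catching only `ImportError` and `AttributeError` keeps unrelated bugs visible instead of silently switching to mock mode.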

๐Ÿ—๏ธ Architecture

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Mobile Web    │ ── │   ORCHESTRATOR   │ ── │   AGENT SWARM   │
│   Interface     │    │   (Core Engine)  │    │  (5 Specialists)│
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                        │                        │
         └────────────────────────┼────────────────────────┘
                                  │
                    ┌─────────────────────────────┐
                    │   PERSISTENCE LAYER         │
                    │   (SQLite + FAISS Lite)     │
                    └─────────────────────────────┘

Core Components

| Component | Purpose | Technology |
|---|---|---|
| Orchestrator | Main coordination engine | Python + Async |
| Intent Recognition | Understand user goals | RoBERTa-base + CoT |
| Context Manager | Session memory & recall | FAISS + SQLite |
| Response Synthesis | Generate final answers | Mistral-7B |
| Safety Checker | Content moderation | Unbiased-Toxic-RoBERTa |
| Research Agent | Information gathering | Web search + analysis |

💡 Usage Examples

Basic Research Query

User: "Explain quantum entanglement in simple terms"

Assistant: 
1. 🤔 [Reasoning] Breaking down quantum physics concepts...
2. 🔍 [Research] Gathering latest explanations...
3. ✍️ [Synthesis] Creating simplified explanation...

[Final Response]: Quantum entanglement is when two particles become linked...

Technical Analysis

User: "Compare transformer models for text classification"

Assistant:
1. ๐Ÿท๏ธ [Intent] Identifying technical comparison request
2. ๐Ÿ“Š [Analysis] Evaluating BERT vs RoBERTa vs DistilBERT
3. ๐Ÿ“ˆ [Synthesis] Creating comparison table with metrics...

โš™๏ธ Configuration

Environment Variables

# Required
HF_TOKEN="your_hugging_face_token"

# Optional
MAX_WORKERS=4
CACHE_TTL=3600
DEFAULT_MODEL="meta-llama/Llama-3.1-8B-Instruct"
EMBEDDING_MODEL="intfloat/e5-base-v2"
CLASSIFICATION_MODEL="Qwen/Qwen2.5-1.5B-Instruct"
HF_HOME="/tmp/huggingface"  # Cache directory (auto-configured)
LOG_LEVEL="INFO"
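
As a rough sketch, the variables above could be read into a typed settings object at startup. `Settings` and `load_settings` are hypothetical names, and treating the listed values as defaults is an assumption; the project's own `src/config.py` may work differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    hf_token: str
    max_workers: int
    cache_ttl: int
    default_model: str
    embedding_model: str
    classification_model: str
    log_level: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    if not token:
        # HF_TOKEN is the only variable the README marks as required.
        raise RuntimeError("HF_TOKEN is required but not set")
    return Settings(
        hf_token=token,
        max_workers=int(env.get("MAX_WORKERS", "4")),
        cache_ttl=int(env.get("CACHE_TTL", "3600")),
        default_model=env.get("DEFAULT_MODEL", "meta-llama/Llama-3.1-8B-Instruct"),
        embedding_model=env.get("EMBEDDING_MODEL", "intfloat/e5-base-v2"),
        classification_model=env.get("CLASSIFICATION_MODEL", "Qwen/Qwen2.5-1.5B-Instruct"),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"HF_TOKEN": "x", "MAX_WORKERS": "8"})`.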

Cache Directory Management:

  • Automatically configured with secure fallback chain
  • Supports HF_HOME, TRANSFORMERS_CACHE, or user cache
  • Validates write permissions automatically
  • See .env.example for all available options
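
The fallback chain described above might be implemented along these lines. This is a sketch under assumptions: `resolve_cache_dir` and `is_writable` are hypothetical helpers, and the exact order and the final temp-directory fallback are guesses rather than the project's documented behavior.

```python
import os
import tempfile
from pathlib import Path


def is_writable(path: Path) -> bool:
    """Return True if path can be created (if needed) and written to."""
    try:
        path.mkdir(parents=True, exist_ok=True)
        with tempfile.TemporaryFile(dir=path):
            pass  # creating a temp file proves write permission
        return True
    except OSError:
        return False


def resolve_cache_dir(env=None) -> Path:
    """Pick the first writable candidate: HF_HOME, TRANSFORMERS_CACHE, user cache, temp dir."""
    env = os.environ if env is None else env
    candidates = [
        env.get("HF_HOME"),
        env.get("TRANSFORMERS_CACHE"),
        str(Path.home() / ".cache" / "huggingface"),
        str(Path(tempfile.gettempdir()) / "huggingface"),  # assumed last resort
    ]
    for candidate in candidates:
        if candidate and is_writable(Path(candidate)):
            return Path(candidate)
    raise RuntimeError("No writable cache directory found")
```

Note that the permission check creates the candidate directory as a side effect, which is usually acceptable for a cache location.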

Model Configuration

The system uses multiple specialized models, sized to fit a 16 GB NVIDIA T4 GPU:

| Task | Model | Purpose | Quantization |
|---|---|---|---|
| Primary Reasoning | meta-llama/Llama-3.1-8B-Instruct | General responses | 4-bit NF4 |
| Embeddings | intfloat/e5-base-v2 | Semantic search | None (768-dim) |
| Intent Classification | Qwen/Qwen2.5-1.5B-Instruct | User goal detection | 4-bit NF4 |
| Safety Checking | meta-llama/Llama-3.1-8B-Instruct | Content moderation | 4-bit NF4 |
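
A back-of-the-envelope estimate shows why this model set can fit on a 16 GB card. The parameter counts below are approximations taken from the model names and public model cards, the embedding model is assumed to be held in fp32, and safety checking is assumed to share the Llama weights with primary reasoning. The figures cover weights only and ignore quantization constants, activations, the KV cache, and CUDA overhead.

```python
# Approximate parameter counts (assumptions, not measured values).
MODELS = {
    "meta-llama/Llama-3.1-8B-Instruct": {"params": 8.0e9, "bits": 4},
    "Qwen/Qwen2.5-1.5B-Instruct": {"params": 1.5e9, "bits": 4},
    "intfloat/e5-base-v2": {"params": 0.11e9, "bits": 32},  # unquantized, assumed fp32
}

T4_MEMORY_GB = 16.0


def weight_memory_gb(params: float, bits: int) -> float:
    """Memory for the weights alone: params * bits / 8 bytes, in GB (1e9 bytes)."""
    return params * bits / 8 / 1e9


per_model = {name: weight_memory_gb(**spec) for name, spec in MODELS.items()}
total_gb = sum(per_model.values())

for name, gb in per_model.items():
    print(f"{name}: {gb:.2f} GB")
print(f"Total weights: {total_gb:.2f} GB of {T4_MEMORY_GB:.0f} GB")
```

Under these assumptions the weights take roughly 5.2 GB, leaving headroom for activations and the KV cache.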

Performance Optimizations:

  • ✅ 4-bit quantization (NF4) for memory efficiency
  • ✅ Model preloading for faster responses
  • ✅ Connection pooling for API calls
  • ✅ Parallel agent processing
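
Parallel agent processing can be sketched with `asyncio.gather`. The `EchoAgent` class and `run_agents` helper are hypothetical; only the `execute(user_input, context)` signature follows the agent protocol shown under Adding New Agents. The sleep stands in for model inference, so this illustrates concurrent scheduling rather than GPU-level parallelism.

```python
import asyncio


class EchoAgent:
    """Hypothetical agent following the execute(user_input, context) protocol."""

    def __init__(self, name: str, delay: float = 0.01):
        self.name = name
        self.delay = delay

    async def execute(self, user_input: str, context: dict) -> dict:
        await asyncio.sleep(self.delay)  # placeholder for real inference
        return {"result": f"{self.name}: {user_input}", "confidence": 1.0, "metadata": {}}


async def run_agents(agents, user_input: str, context: dict) -> dict:
    """Run all agents concurrently and map each agent name to its result or error."""
    results = await asyncio.gather(
        *(agent.execute(user_input, context) for agent in agents),
        return_exceptions=True,  # one failing agent does not cancel the others
    )
    return {
        agent.name: ({"error": str(res)} if isinstance(res, Exception) else res)
        for agent, res in zip(agents, results)
    }


outputs = asyncio.run(
    run_agents([EchoAgent("intent"), EchoAgent("safety")], "hello", {})
)
```

With `return_exceptions=True`, a failure in one agent is reported in its own slot, which fits the error-recovery goals stated earlier.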

📱 Mobile Optimization

Key Mobile Features

  • Touch-friendly interface (44px+ touch targets)
  • Progressive Web App capabilities
  • Offline functionality for cached sessions
  • Reduced data usage with optimized responses
  • Keyboard-aware layout adjustments

Supported Devices

  • ✅ Smartphones (iOS/Android)
  • ✅ Tablets
  • ✅ Desktop browsers
  • ✅ Screen readers (accessibility)

๐Ÿ› ๏ธ Development

Project Structure

research-assistant/
├── app.py                 # Main Gradio application
├── requirements.txt       # Dependencies
├── Dockerfile            # Container configuration
├── src/
│   ├── orchestrator.py   # Core orchestration engine
│   ├── agents/          # Specialized agent modules
│   ├── llm_router.py    # Multi-model routing
│   └── mobile_ux.py     # Mobile optimizations
├── tests/               # Test suites
└── docs/               # Documentation

Adding New Agents

  1. Create agent module in src/agents/
  2. Implement agent protocol:
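class YourNewAgent:
    async def execute(self, user_input: str, context: dict) -> dict:
        # Your agent logic here; this placeholder just normalizes the input
        processed_output = user_input.strip()
        return {
            "result": processed_output,
            "confidence": 0.95,  # replace with a real confidence estimate
            "metadata": {}
        }
  3. Register agent in orchestrator configuration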
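
The README does not show what registration looks like, so the following is only one possible shape. `AGENT_REGISTRY`, `register_agent`, `run_registered`, and `WordCountAgent` are all hypothetical; the real orchestrator configuration may use a different mechanism.

```python
import asyncio

AGENT_REGISTRY = {}  # hypothetical name-to-class registry


def register_agent(name: str):
    """Class decorator that stores an agent class under the given name."""
    def decorator(cls):
        AGENT_REGISTRY[name] = cls
        return cls
    return decorator


@register_agent("word_count")
class WordCountAgent:
    async def execute(self, user_input: str, context: dict) -> dict:
        count = len(user_input.split())
        return {"result": count, "confidence": 1.0, "metadata": {"unit": "words"}}


async def run_registered(name: str, user_input: str, context=None) -> dict:
    """Instantiate the registered agent and run it once."""
    agent = AGENT_REGISTRY[name]()
    return await agent.execute(user_input, context or {})


response = asyncio.run(run_registered("word_count", "explain quantum entanglement"))
```

A decorator-based registry keeps registration next to the agent definition, so adding an agent touches a single file.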

🧪 Testing

Run Test Suite

# Install test dependencies
pip install -r requirements.txt

# Run all tests
pytest tests/ -v

# Run specific test categories
pytest tests/test_agents.py -v
pytest tests/test_mobile_ux.py -v

Test Coverage

  • ✅ Agent functionality
  • ✅ Mobile UX components
  • ✅ LLM routing logic
  • ✅ Error handling
  • ✅ Performance benchmarks
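
For orientation, an agent test could take roughly the following form. The contents of `tests/test_agents.py` are not shown in this README, so `UppercaseAgent` and the test function are hypothetical. Using `asyncio.run` avoids assuming any async pytest plugin is installed.

```python
import asyncio


class UppercaseAgent:
    """Hypothetical agent used only to illustrate the shape of a test."""

    async def execute(self, user_input: str, context: dict) -> dict:
        return {"result": user_input.upper(), "confidence": 0.95, "metadata": {}}


def test_agent_returns_expected_keys():
    output = asyncio.run(UppercaseAgent().execute("hello", {}))
    # Every agent response is expected to carry these three keys.
    assert set(output) == {"result", "confidence", "metadata"}
    assert output["result"] == "HELLO"
    assert 0.0 <= output["confidence"] <= 1.0
```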

🚨 Troubleshooting

Common Build Issues

| Issue | Solution |
|---|---|
| HF_TOKEN not found | Add token in Space Settings → Secrets |
| Build timeout | Reduce model sizes in requirements |
| Memory errors | Check GPU memory usage, optimize model loading |
| Import errors | Check Python version (3.9+) |

Performance Optimization

  1. Enable caching in context manager
  2. Use smaller models for initial deployment
  3. Implement lazy loading for mobile users
  4. Monitor memory usage with built-in tools
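
Step 1 can be pictured as a small time-to-live cache, with the expiry taken from `CACHE_TTL` (3600 seconds in the configuration section). `TTLCache` is a hypothetical class for illustration; the context manager's real caching layer is not documented here.

```python
import time


class TTLCache:
    """Minimal in-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl: float = 3600.0, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def set(self, key, value) -> None:
        self._store[key] = (self._clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry on read
            return default
        return value
```

Entries are evicted lazily on read, which keeps the sketch short; a long-running service would likely also cap the number of entries.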

Debug Mode

Enable detailed logging:

import logging
logging.basicConfig(level=logging.DEBUG)
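
To avoid editing code when switching levels, the `LOG_LEVEL` variable from the configuration section could drive the logger. `configure_logging` is a hypothetical helper, and falling back to INFO on an unrecognized value is an assumption.

```python
import logging
import os


def configure_logging(env=None) -> int:
    """Configure root logging from LOG_LEVEL and return the numeric level used."""
    env = os.environ if env is None else env
    name = env.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        level = logging.INFO  # unknown names fall back to INFO
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        force=True,  # replace handlers installed by earlier basicConfig calls
    )
    return level
```

Setting `LOG_LEVEL=DEBUG` in the Space settings would then have the same effect as the snippet above.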

📊 Performance Metrics

The API now includes comprehensive performance metrics in every response:

{
  "performance": {
    "processing_time": 1230.5,      // milliseconds
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,        // percentage
    "agent_contributions": [
      {"agent": "Intent", "percentage": 25.0},
      {"agent": "Synthesis", "percentage": 40.0},
      {"agent": "Safety", "percentage": 15.0},
      {"agent": "Skills", "percentage": 20.0}
    ],
    "safety_score": 85.0,
    "latency_seconds": 1.230,
    "timestamp": "2024-01-15T10:30:45.123456"
  }
}

| Metric | Target | Current |
|---|---|---|
| Response Time | <10s | ~7s |
| Cache Hit Rate | >60% | ~65% |
| Mobile UX Score | >80/100 | 85/100 |
| Error Rate | <5% | ~3% |
| Performance Tracking | ✅ | ✅ Implemented |
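
A client could check the `performance` block against the response-time target along these lines. The sample is the JSON from this section with the `//` comments removed so that it parses; `summarize` and `RESPONSE_TIME_TARGET_S` are hypothetical names.

```python
import json

SAMPLE = json.loads("""
{
  "performance": {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,
    "agent_contributions": [
      {"agent": "Intent", "percentage": 25.0},
      {"agent": "Synthesis", "percentage": 40.0},
      {"agent": "Safety", "percentage": 15.0},
      {"agent": "Skills", "percentage": 20.0}
    ],
    "safety_score": 85.0,
    "latency_seconds": 1.230,
    "timestamp": "2024-01-15T10:30:45.123456"
  }
}
""")

RESPONSE_TIME_TARGET_S = 10.0  # from the targets table


def summarize(perf: dict) -> dict:
    """Convert milliseconds to seconds and summarize agent contributions."""
    contributions = perf.get("agent_contributions", [])
    latency_s = perf["processing_time"] / 1000.0
    top = max(contributions, key=lambda c: c["percentage"])["agent"] if contributions else None
    return {
        "latency_s": latency_s,
        "within_target": latency_s < RESPONSE_TIME_TARGET_S,
        "contribution_total": sum(c["percentage"] for c in contributions),
        "top_agent": top,
    }


summary = summarize(SAMPLE["performance"])
```

For the sample, the contributions sum to 100 percent and the 1.23 s latency is well inside the 10 s target.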

🔮 Roadmap

Phase 1 (Current - MVP)

  • ✅ Basic agent orchestration
  • ✅ Mobile-optimized interface
  • ✅ Multi-model routing
  • ✅ Transparent reasoning display
  • ✅ Performance metrics tracking
  • ✅ Enhanced configuration management
  • ✅ 4-bit quantization for T4 GPU
  • ✅ Model preloading and optimization

Phase 2 (Next 3 months)

  • 🚧 Advanced research capabilities
  • 🚧 Plugin system for tools
  • 🚧 Enhanced mobile PWA features
  • 🚧 Multi-language support

Phase 3 (Future)

  • 🔮 Autonomous agent swarms
  • 🔮 Voice interface integration
  • 🔮 Enterprise features
  • 🔮 Advanced analytics

👥 Contributing

We welcome contributions! Please see:

  1. Contributing Guidelines
  2. Code of Conduct
  3. Development Setup

Quick Contribution Steps

# 1. Fork the repository
# 2. Create feature branch
git checkout -b feature/amazing-feature

# 3. Commit changes
git commit -m "Add amazing feature"

# 4. Push to branch  
git push origin feature/amazing-feature

# 5. Open Pull Request

📄 Citation

If you use this framework in your research, please cite:

@software{research_assistant_mvp,
  title = {AI Research Assistant - MVP},
  author = {Your Name},
  year = {2024},
  url = {https://huggingface.co/spaces/your-username/research-assistant}
}

📜 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

  • Hugging Face for the infrastructure
  • Gradio for the web framework
  • Model contributors from the HF community
  • Early testers and feedback providers

Need help?

Built with ❤️ for the research community