---
title: AI Research Assistant MVP
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
tags:
  - ai
  - chatbot
  - research
  - education
  - transformers
models:
  - meta-llama/Llama-3.1-8B-Instruct
  - intfloat/e5-base-v2
  - Qwen/Qwen2.5-1.5B-Instruct
datasets:
  - wikipedia
  - commoncrawl
base_path: research-assistant
hf_oauth: true
hf_token: true
disable_embedding: false
duplicated_from: null
extra_gated_prompt: null
extra_gated_fields: {}
gated: false
public: true
---

# AI Research Assistant - MVP
![HF Spaces](https://img.shields.io/badge/🤗-Hugging%20Face%20Spaces-blue)
![Python](https://img.shields.io/badge/Python-3.9%2B-green)
![Gradio](https://img.shields.io/badge/Interface-Gradio-FF6B6B)
![NVIDIA T4](https://img.shields.io/badge/GPU-NVIDIA%20T4-blue)

**Academic-grade AI assistant with transparent reasoning and mobile-optimized interface**

[![Demo](https://img.shields.io/badge/🚀-Live%20Demo-9cf)](https://huggingface.co/spaces/your-username/research-assistant)
[![Documentation](https://img.shields.io/badge/📚-Documentation-blue)](https://github.com/your-org/research-assistant/wiki)
## 🎯 Overview

This MVP demonstrates an intelligent research assistant framework featuring **transparent reasoning chains**, a **specialized agent architecture**, and **mobile-first design**. It is built for Hugging Face Spaces with NVIDIA T4 GPU acceleration for local model inference.

### Key Differentiators

- **🔍 Transparent Reasoning**: Watch the AI think step by step with Chain of Thought
- **🧠 Specialized Agents**: Multiple AI models working together for optimal performance
- **📱 Mobile-First**: Optimized for a seamless mobile web experience
- **🎓 Academic Focus**: Designed for research and educational use cases

## 📚 API Documentation

**Comprehensive API documentation is available:** [API_DOCUMENTATION.md](API_DOCUMENTATION.md)

The API provides REST endpoints for:

- Chat interactions with the AI assistant
- Health checks
- Context management
- Session tracking

**Quick API Example:**

```python
import requests

response = requests.post(
    "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/chat",
    json={
        "message": "What is machine learning?",
        "session_id": "my-session",
        "user_id": "user-123"
    }
)

data = response.json()
print(data["message"])
print(f"Performance: {data.get('performance', {})}")
```
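Because sessions are tracked, a follow-up request that reuses the same `session_id` should let the context manager resolve references to earlier turns. A minimal sketch building on the example above (same endpoint and payload shape; the multi-turn behaviour is the assumption being illustrated):

```python
import requests

API_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/chat"

def ask(message: str, session_id: str = "my-session", user_id: str = "user-123") -> dict:
    """Send one chat turn and return the parsed JSON payload."""
    response = requests.post(
        API_URL,
        json={"message": message, "session_id": session_id, "user_id": user_id},
        timeout=60,  # model inference can take several seconds
    )
    response.raise_for_status()
    return response.json()

# The first turn establishes the session; the follow-up reuses the same
# session_id so the assistant can resolve "its" against the earlier turn.
first = ask("What is machine learning?")
follow_up = ask("Summarize its main applications in two sentences.")
print(follow_up["message"])
```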
## 🚀 Quick Start

### Option 1: Use Our Demo

Visit our live demo on Hugging Face Spaces:

```bash
https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
```

### Option 2: Deploy Your Own Instance

#### Prerequisites

- Hugging Face account with a [write token](https://huggingface.co/settings/tokens)
- Basic understanding of Hugging Face Spaces

#### Deployment Steps

1. **Fork this space** using the Hugging Face UI
2. **Add your HF token** (optional, only needed for gated models):
   - Go to your Space → Settings → Repository secrets
   - Add `HF_TOKEN` with your Hugging Face token
   - **Note**: Local models are used for inference; `HF_TOKEN` is only needed to download gated models
3. **The space will auto-build** (takes 5-10 minutes)

#### Manual Build (Advanced)

```bash
# Clone the repository
git clone https://huggingface.co/spaces/your-username/research-assistant
cd research-assistant

# Install dependencies
pip install -r requirements.txt

# Set up environment (optional - only needed for gated models)
export HF_TOKEN="your_hugging_face_token_here"

# Launch the application (multiple options)
python main.py     # Full integration with error handling
python launch.py   # Simple launcher
python app.py      # UI-only mode
```

## 📁 Integration Structure

The MVP now includes complete integration files for deployment:

```
├── main.py                    # 🎯 Main integration entry point
├── launch.py                  # 🚀 Simple launcher for HF Spaces
├── app.py                     # 📱 Mobile-optimized UI
├── requirements.txt           # 📦 Dependencies
└── src/
    ├── __init__.py            # 📦 Package initialization
    ├── database.py            # 🗄️ SQLite database management
    ├── event_handlers.py      # 🔗 UI event integration
    ├── config.py              # ⚙️ Configuration
    ├── llm_router.py          # 🤖 LLM routing
    ├── orchestrator_engine.py # 🎭 Request orchestration
    ├── context_manager.py     # 🧠 Context management
    ├── mobile_handlers.py     # 📱 Mobile UX handlers
    └── agents/
        ├── __init__.py        # 🤖 Agents package
        ├── intent_agent.py    # 🎯 Intent recognition
        ├── synthesis_agent.py # ✨ Response synthesis
        └── safety_agent.py    # 🛡️ Safety checking
```

### Key Features:

- **🔄 Graceful Degradation**: Falls back to mock mode if components fail (see the sketch below)
- **📱 Mobile-First**: Optimized for mobile devices and small screens
- **🗄️ Database Ready**: SQLite integration with session management
- **🔗 Event Handling**: Complete UI-to-backend integration
- **⚡ Error Recovery**: Robust error handling throughout
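The graceful-degradation behaviour can be pictured as a thin guard around component start-up. This is a sketch of the pattern only; the names (`OrchestratorEngine`, `MockOrchestrator`, `build_orchestrator`) are illustrative, not the project's actual API:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

class MockOrchestrator:
    """Fallback engine used when real components cannot start (hypothetical name)."""

    async def process(self, user_input: str, context: dict) -> dict:
        return {
            "message": "Running in mock mode; model backends are unavailable.",
            "performance": {"agents_used": 0, "confidence_score": 0.0},
        }

def build_orchestrator():
    """Try to wire up the real engine; fall back to mock mode on any failure."""
    try:
        # Illustrative import: the real entry point lives in src/orchestrator_engine.py,
        # but the class name here is an assumption.
        from src.orchestrator_engine import OrchestratorEngine
        return OrchestratorEngine()
    except Exception as exc:  # missing models, no GPU, bad config, ...
        logger.warning("Falling back to mock mode: %s", exc)
        return MockOrchestrator()

if __name__ == "__main__":
    orchestrator = build_orchestrator()
    print(asyncio.run(orchestrator.process("hello", {"session_id": "demo"})))
```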
## 🏗️ Architecture

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Mobile Web    │ ─── │   ORCHESTRATOR   │ ─── │   AGENT SWARM   │
│    Interface    │     │  (Core Engine)   │     │ (5 Specialists) │
└─────────────────┘     └──────────────────┘     └─────────────────┘
         │                        │                        │
         └────────────────────────┼────────────────────────┘
                                  │
                  ┌───────────────────────────────┐
                  │       PERSISTENCE LAYER       │
                  │     (SQLite + FAISS Lite)     │
                  └───────────────────────────────┘
```

### Core Components

| Component | Purpose | Technology |
|-----------|---------|------------|
| **Orchestrator** | Main coordination engine | Python + Async |
| **Intent Recognition** | Understand user goals | Qwen2.5-1.5B-Instruct + CoT |
| **Context Manager** | Session memory & recall | FAISS + SQLite |
| **Response Synthesis** | Generate final answers | Llama-3.1-8B-Instruct |
| **Safety Checker** | Content moderation | Llama-3.1-8B-Instruct |
| **Research Agent** | Information gathering | Web search + analysis |

## 💡 Usage Examples

### Basic Research Query

```
User: "Explain quantum entanglement in simple terms"

Assistant:
1. 🤔 [Reasoning] Breaking down quantum physics concepts...
2. 🔍 [Research] Gathering latest explanations...
3. ✍️ [Synthesis] Creating simplified explanation...

[Final Response]: Quantum entanglement is when two particles become linked...
```

### Technical Analysis

```
User: "Compare transformer models for text classification"

Assistant:
1. 🏷️ [Intent] Identifying technical comparison request
2. 📊 [Analysis] Evaluating BERT vs RoBERTa vs DistilBERT
3. 📈 [Synthesis] Creating comparison table with metrics...
```

## ⚙️ Configuration

### Environment Variables

```bash
# Optional - only needed for downloading gated models
HF_TOKEN="your_hugging_face_token"

# Optional
MAX_WORKERS=4
CACHE_TTL=3600
DEFAULT_MODEL="meta-llama/Llama-3.1-8B-Instruct"
EMBEDDING_MODEL="intfloat/e5-base-v2"
CLASSIFICATION_MODEL="Qwen/Qwen2.5-1.5B-Instruct"
HF_HOME="/tmp/huggingface"  # Cache directory (auto-configured)
LOG_LEVEL="INFO"
```

**Cache Directory Management:**

- Automatically configured with a secure fallback chain
- Supports `HF_HOME`, `TRANSFORMERS_CACHE`, or the user cache directory
- Validates write permissions automatically
- See `.env.example` for all available options

### Model Configuration

The system uses multiple specialized models optimized for the 16 GB NVIDIA T4 GPU:

| Task | Model | Purpose | Quantization |
|------|-------|---------|--------------|
| Primary Reasoning | `meta-llama/Llama-3.1-8B-Instruct` | General responses | 4-bit NF4 |
| Embeddings | `intfloat/e5-base-v2` | Semantic search | None (768-dim) |
| Intent Classification | `Qwen/Qwen2.5-1.5B-Instruct` | User goal detection | 4-bit NF4 |
| Safety Checking | `meta-llama/Llama-3.1-8B-Instruct` | Content moderation | 4-bit NF4 |

**Performance Optimizations:**

- ✅ 4-bit quantization (NF4) for memory efficiency (see the loading sketch below)
- ✅ Model preloading for faster responses
- ✅ Connection pooling for API calls
- ✅ Parallel agent processing
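For reference, the 4-bit NF4 setup above is typically expressed with `transformers` and `bitsandbytes` as shown below; the actual loading code in `src/llm_router.py` may differ, and downloading `meta-llama/Llama-3.1-8B-Instruct` requires an `HF_TOKEN` with access to the gated repository. A minimal sketch:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"

# NF4 4-bit quantization keeps the 8B model within the T4's 16 GB of memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,  # the T4 has no bfloat16 support
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

# Quick smoke test
inputs = tokenizer("Explain quantum entanglement in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```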
## 📱 Mobile Optimization

### Key Mobile Features

- **Touch-friendly** interface (44px+ touch targets)
- **Progressive Web App** capabilities
- **Offline functionality** for cached sessions
- **Reduced data usage** with optimized responses
- **Keyboard-aware** layout adjustments

### Supported Devices

- ✅ Smartphones (iOS/Android)
- ✅ Tablets
- ✅ Desktop browsers
- ✅ Screen readers (accessibility)

## 🛠️ Development

### Project Structure

```
research-assistant/
├── app.py               # Main Gradio application
├── requirements.txt     # Dependencies
├── Dockerfile           # Container configuration
├── src/
│   ├── orchestrator.py  # Core orchestration engine
│   ├── agents/          # Specialized agent modules
│   ├── llm_router.py    # Multi-model routing
│   └── mobile_ux.py     # Mobile optimizations
├── tests/               # Test suites
└── docs/                # Documentation
```

### Adding New Agents

1. Create the agent module in `src/agents/`
2. Implement the agent protocol:

```python
class YourNewAgent:
    async def execute(self, user_input: str, context: dict) -> dict:
        # Your agent logic here
        processed_output = user_input.upper()  # placeholder for real processing
        return {
            "result": processed_output,
            "confidence": 0.95,
            "metadata": {}
        }
```

3. Register the agent in the orchestrator configuration (see the dispatch sketch below)
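The registration step depends on the orchestrator's own configuration format, which is not shown here. As a purely illustrative sketch (the `AGENTS` registry and `run_agents` helper are assumptions, not the real API), agents implementing the `execute` protocol from step 2 can be dispatched in parallel like this:

```python
import asyncio

class YourNewAgent:
    """Mirrors the execute protocol from step 2."""

    async def execute(self, user_input: str, context: dict) -> dict:
        return {"result": f"processed: {user_input}", "confidence": 0.95, "metadata": {}}

# Hypothetical registry; the real orchestrator configuration may look different.
AGENTS = {
    "your_new_agent": YourNewAgent(),
}

async def run_agents(user_input: str, context: dict) -> dict:
    """Dispatch every registered agent concurrently and collect results by name."""
    results = await asyncio.gather(
        *(agent.execute(user_input, context) for agent in AGENTS.values())
    )
    return dict(zip(AGENTS.keys(), results))

if __name__ == "__main__":
    print(asyncio.run(run_agents("Explain quantum entanglement", {"session_id": "demo"})))
```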
## 🧪 Testing

### Run Test Suite

```bash
# Install test dependencies
pip install -r requirements.txt

# Run all tests
pytest tests/ -v

# Run specific test categories
pytest tests/test_agents.py -v
pytest tests/test_mobile_ux.py -v
```

### Test Coverage

- ✅ Agent functionality
- ✅ Mobile UX components
- ✅ LLM routing logic
- ✅ Error handling
- ✅ Performance benchmarks

## 🚨 Troubleshooting

### Common Build Issues

| Issue | Solution |
|-------|----------|
| **HF_TOKEN not found** | Optional - only needed for gated model access |
| **Local models unavailable** | Check the transformers/torch installation |
| **Build timeout** | Reduce model sizes in the configuration |
| **Memory errors** | Check GPU memory usage, optimize model loading |
| **Import errors** | Check the Python version (3.9+) |

### Performance Optimization

1. **Enable caching** in the context manager
2. **Use smaller models** for initial deployment
3. **Implement lazy loading** for mobile users
4. **Monitor memory usage** with built-in tools

### Debug Mode

Enable detailed logging:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## 📊 Performance Metrics

The API now includes comprehensive performance metrics in every response:

```json
{
  "performance": {
    "processing_time": 1230.5,      // milliseconds
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,       // percentage
    "agent_contributions": [
      {"agent": "Intent", "percentage": 25.0},
      {"agent": "Synthesis", "percentage": 40.0},
      {"agent": "Safety", "percentage": 15.0},
      {"agent": "Skills", "percentage": 20.0}
    ],
    "safety_score": 85.0,
    "latency_seconds": 1.230,
    "timestamp": "2024-01-15T10:30:45.123456"
  }
}
```

| Metric | Target | Current |
|--------|--------|---------|
| Response Time | <10s | ~7s |
| Cache Hit Rate | >60% | ~65% |
| Mobile UX Score | >80/100 | 85/100 |
| Error Rate | <5% | ~3% |
| Performance Tracking | ✅ | ✅ Implemented |
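Because every response carries the `performance` block shown above, client code can log it uniformly. A small helper that assumes only the field names documented above:

```python
def summarize_performance(payload: dict) -> str:
    """Render the documented `performance` block as a one-line summary."""
    perf = payload.get("performance", {})
    contributions = ", ".join(
        f"{c['agent']} {c['percentage']:.0f}%" for c in perf.get("agent_contributions", [])
    )
    return (
        f"{perf.get('processing_time', 0):.0f} ms, "
        f"{perf.get('tokens_used', 0)} tokens, "
        f"confidence {perf.get('confidence_score', 0):.1f}% "
        f"({contributions})"
    )

# Example payload shaped like the response documented above.
sample = {
    "performance": {
        "processing_time": 1230.5,
        "tokens_used": 456,
        "confidence_score": 85.2,
        "agent_contributions": [
            {"agent": "Intent", "percentage": 25.0},
            {"agent": "Synthesis", "percentage": 40.0},
        ],
    }
}
print(summarize_performance(sample))
# -> "1230 ms, 456 tokens, confidence 85.2% (Intent 25%, Synthesis 40%)"
```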
## 🔮 Roadmap

### Phase 1 (Current - MVP)

- ✅ Basic agent orchestration
- ✅ Mobile-optimized interface
- ✅ Multi-model routing
- ✅ Transparent reasoning display
- ✅ Performance metrics tracking
- ✅ Enhanced configuration management
- ✅ 4-bit quantization for T4 GPU
- ✅ Model preloading and optimization

### Phase 2 (Next 3 months)

- 🚧 Advanced research capabilities
- 🚧 Plugin system for tools
- 🚧 Enhanced mobile PWA features
- 🚧 Multi-language support

### Phase 3 (Future)

- 🔮 Autonomous agent swarms
- 🔮 Voice interface integration
- 🔮 Enterprise features
- 🔮 Advanced analytics

## 👥 Contributing

We welcome contributions! Please see:

1. [Contributing Guidelines](docs/CONTRIBUTING.md)
2. [Code of Conduct](docs/CODE_OF_CONDUCT.md)
3. [Development Setup](docs/DEVELOPMENT.md)

### Quick Contribution Steps

```bash
# 1. Fork the repository

# 2. Create feature branch
git checkout -b feature/amazing-feature

# 3. Commit changes
git commit -m "Add amazing feature"

# 4. Push to branch
git push origin feature/amazing-feature

# 5. Open Pull Request
```

## 📄 Citation

If you use this framework in your research, please cite:

```bibtex
@software{research_assistant_mvp,
  title  = {AI Research Assistant - MVP},
  author = {Your Name},
  year   = {2024},
  url    = {https://huggingface.co/spaces/your-username/research-assistant}
}
```

## 📜 License

This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [Hugging Face](https://huggingface.co) for the infrastructure
- [Gradio](https://gradio.app) for the web framework
- Model contributors from the HF community
- Early testers and feedback providers

---

**Need help?**

- [Open an Issue](https://github.com/your-org/research-assistant/issues)
- [Join our Discord](https://discord.gg/your-discord)
- [Email Support](mailto:support@your-domain.com)

*Built with ❤️ for the research community*