---
title: AI Research Assistant MVP
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
tags:
  - ai
  - chatbot
  - research
  - education
  - transformers
models:
  - meta-llama/Llama-3.1-8B-Instruct
  - intfloat/e5-base-v2
  - Qwen/Qwen2.5-1.5B-Instruct
datasets:
  - wikipedia
  - commoncrawl
base_path: research-assistant
hf_oauth: true
hf_token: true
disable_embedding: false
duplicated_from: null
extra_gated_prompt: null
extra_gated_fields: {}
gated: false
public: true
---

# AI Research Assistant - MVP
![HF Spaces](https://img.shields.io/badge/🤗-Hugging%20Face%20Spaces-blue)
![Python](https://img.shields.io/badge/Python-3.9%2B-green)
![Gradio](https://img.shields.io/badge/Interface-Gradio-FF6B6B)
![NVIDIA T4](https://img.shields.io/badge/GPU-NVIDIA%20T4-blue)

**Academic-grade AI assistant with transparent reasoning and mobile-optimized interface**

[![Demo](https://img.shields.io/badge/🚀-Live%20Demo-9cf)](https://huggingface.co/spaces/your-username/research-assistant)
[![Documentation](https://img.shields.io/badge/📚-Documentation-blue)](https://github.com/your-org/research-assistant/wiki)
## 🎯 Overview

This MVP demonstrates an intelligent research assistant framework featuring **transparent reasoning chains**, a **specialized agent architecture**, and **mobile-first design**. It is built for Hugging Face Spaces with NVIDIA T4 GPU acceleration for local model inference.

### Key Differentiators

- **🔍 Transparent Reasoning**: Watch the AI think step by step with Chain of Thought
- **🧠 Specialized Agents**: Multiple AI models working together for optimal performance
- **📱 Mobile-First**: Optimized for a seamless mobile web experience
- **🎓 Academic Focus**: Designed for research and educational use cases

## 📚 API Documentation

**Comprehensive API documentation is available:** [API_DOCUMENTATION.md](API_DOCUMENTATION.md)

The API provides REST endpoints for:

- Chat interactions with the AI assistant
- Health checks
- Context management
- Session tracking

**Quick API Example:**

```python
import requests

response = requests.post(
    "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/chat",
    json={
        "message": "What is machine learning?",
        "session_id": "my-session",
        "user_id": "user-123"
    }
)

data = response.json()
print(data["message"])
print(f"Performance: {data.get('performance', {})}")
```
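Because sessions are tracked, a follow-up request that reuses the same `session_id` should let the context manager resolve references to earlier turns. A minimal sketch building on the example above (same endpoint and payload shape; the multi-turn behaviour is the assumption being illustrated):

```python
import requests

API_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/chat"

def ask(message: str, session_id: str = "my-session", user_id: str = "user-123") -> dict:
    """Send one chat turn and return the parsed JSON payload."""
    response = requests.post(
        API_URL,
        json={"message": message, "session_id": session_id, "user_id": user_id},
        timeout=60,  # model inference can take several seconds
    )
    response.raise_for_status()
    return response.json()

# The first turn establishes the session; the follow-up reuses the same
# session_id so the assistant can resolve "its" against the earlier turn.
first = ask("What is machine learning?")
follow_up = ask("Summarize its main applications in two sentences.")
print(follow_up["message"])
```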
## 🚀 Quick Start

### Option 1: Use Our Demo

Visit our live demo on Hugging Face Spaces:

```bash
https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
```

### Option 2: Deploy Your Own Instance

#### Prerequisites

- Hugging Face account with a [write token](https://huggingface.co/settings/tokens)
- Basic understanding of Hugging Face Spaces

#### Deployment Steps

1. **Fork this space** using the Hugging Face UI
2. **Add your HF token** (optional, only needed for gated models):
   - Go to your Space → Settings → Repository secrets
   - Add `HF_TOKEN` with your Hugging Face token
   - **Note**: Local models are used for inference; `HF_TOKEN` is only needed to download gated models
3. **The space will auto-build** (takes 5-10 minutes)

#### Manual Build (Advanced)

```bash
# Clone the repository
git clone https://huggingface.co/spaces/your-username/research-assistant
cd research-assistant

# Install dependencies
pip install -r requirements.txt

# Set up environment (optional - only needed for gated models)
export HF_TOKEN="your_hugging_face_token_here"

# Launch the application (multiple options)
python main.py     # Full integration with error handling
python launch.py   # Simple launcher
python app.py      # UI-only mode
```

## 📁 Integration Structure

The MVP now includes complete integration files for deployment:

```
├── main.py                    # 🎯 Main integration entry point
├── launch.py                  # 🚀 Simple launcher for HF Spaces
├── app.py                     # 📱 Mobile-optimized UI
├── requirements.txt           # 📦 Dependencies
└── src/
    ├── __init__.py            # 📦 Package initialization
    ├── database.py            # 🗄️ SQLite database management
    ├── event_handlers.py      # 🔗 UI event integration
    ├── config.py              # ⚙️ Configuration
    ├── llm_router.py          # 🤖 LLM routing
    ├── orchestrator_engine.py # 🎭 Request orchestration
    ├── context_manager.py     # 🧠 Context management
    ├── mobile_handlers.py     # 📱 Mobile UX handlers
    └── agents/
        ├── __init__.py        # 🤖 Agents package
        ├── intent_agent.py    # 🎯 Intent recognition
        ├── synthesis_agent.py # ✨ Response synthesis
        └── safety_agent.py    # 🛡️ Safety checking
```

### Key Features:

- **🔄 Graceful Degradation**: Falls back to mock mode if components fail (see the sketch below)
- **📱 Mobile-First**: Optimized for mobile devices and small screens
- **🗄️ Database Ready**: SQLite integration with session management
- **🔗 Event Handling**: Complete UI-to-backend integration
- **⚡ Error Recovery**: Robust error handling throughout
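The graceful-degradation behaviour can be pictured as a thin guard around component start-up. This is a sketch of the pattern only; the names (`OrchestratorEngine`, `MockOrchestrator`, `build_orchestrator`) are illustrative, not the project's actual API:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

class MockOrchestrator:
    """Fallback engine used when real components cannot start (hypothetical name)."""

    async def process(self, user_input: str, context: dict) -> dict:
        return {
            "message": "Running in mock mode; model backends are unavailable.",
            "performance": {"agents_used": 0, "confidence_score": 0.0},
        }

def build_orchestrator():
    """Try to wire up the real engine; fall back to mock mode on any failure."""
    try:
        # Illustrative import: the real entry point lives in src/orchestrator_engine.py,
        # but the class name here is an assumption.
        from src.orchestrator_engine import OrchestratorEngine
        return OrchestratorEngine()
    except Exception as exc:  # missing models, no GPU, bad config, ...
        logger.warning("Falling back to mock mode: %s", exc)
        return MockOrchestrator()

if __name__ == "__main__":
    orchestrator = build_orchestrator()
    print(asyncio.run(orchestrator.process("hello", {"session_id": "demo"})))
```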
## 🏗️ Architecture

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Mobile Web    │ ─── │   ORCHESTRATOR   │ ─── │   AGENT SWARM   │
│    Interface    │     │  (Core Engine)   │     │ (5 Specialists) │
└─────────────────┘     └──────────────────┘     └─────────────────┘
         │                        │                        │
         └────────────────────────┼────────────────────────┘
                                  │
                  ┌───────────────────────────────┐
                  │       PERSISTENCE LAYER       │
                  │     (SQLite + FAISS Lite)     │
                  └───────────────────────────────┘
```

### Core Components

| Component | Purpose | Technology |
|-----------|---------|------------|
| **Orchestrator** | Main coordination engine | Python + Async |
| **Intent Recognition** | Understand user goals | Qwen2.5-1.5B-Instruct + CoT |
| **Context Manager** | Session memory & recall | FAISS + SQLite |
| **Response Synthesis** | Generate final answers | Llama-3.1-8B-Instruct |
| **Safety Checker** | Content moderation | Llama-3.1-8B-Instruct |
| **Research Agent** | Information gathering | Web search + analysis |

## 💡 Usage Examples

### Basic Research Query

```
User: "Explain quantum entanglement in simple terms"

Assistant:
1. 🤔 [Reasoning] Breaking down quantum physics concepts...
2. 🔍 [Research] Gathering latest explanations...
3. ✍️ [Synthesis] Creating simplified explanation...

[Final Response]: Quantum entanglement is when two particles become linked...
```

### Technical Analysis

```
User: "Compare transformer models for text classification"

Assistant:
1. 🏷️ [Intent] Identifying technical comparison request
2. 📊 [Analysis] Evaluating BERT vs RoBERTa vs DistilBERT
3. 📈 [Synthesis] Creating comparison table with metrics...
```

## ⚙️ Configuration

### Environment Variables

```bash
# Optional - only needed for downloading gated models
HF_TOKEN="your_hugging_face_token"

# Optional
MAX_WORKERS=4
CACHE_TTL=3600
DEFAULT_MODEL="meta-llama/Llama-3.1-8B-Instruct"
EMBEDDING_MODEL="intfloat/e5-base-v2"
CLASSIFICATION_MODEL="Qwen/Qwen2.5-1.5B-Instruct"
HF_HOME="/tmp/huggingface"  # Cache directory (auto-configured)
LOG_LEVEL="INFO"
```

**Cache Directory Management:**

- Automatically configured with a secure fallback chain
- Supports `HF_HOME`, `TRANSFORMERS_CACHE`, or the user cache directory
- Validates write permissions automatically
- See `.env.example` for all available options

### Model Configuration

The system uses multiple specialized models optimized for the 16 GB NVIDIA T4 GPU:

| Task | Model | Purpose | Quantization |
|------|-------|---------|--------------|
| Primary Reasoning | `meta-llama/Llama-3.1-8B-Instruct` | General responses | 4-bit NF4 |
| Embeddings | `intfloat/e5-base-v2` | Semantic search | None (768-dim) |
| Intent Classification | `Qwen/Qwen2.5-1.5B-Instruct` | User goal detection | 4-bit NF4 |
| Safety Checking | `meta-llama/Llama-3.1-8B-Instruct` | Content moderation | 4-bit NF4 |

**Performance Optimizations:**

- ✅ 4-bit quantization (NF4) for memory efficiency (see the loading sketch below)
- ✅ Model preloading for faster responses
- ✅ Connection pooling for API calls
- ✅ Parallel agent processing
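For reference, the 4-bit NF4 setup above is typically expressed with `transformers` and `bitsandbytes` as shown below; the actual loading code in `src/llm_router.py` may differ, and downloading `meta-llama/Llama-3.1-8B-Instruct` requires an `HF_TOKEN` with access to the gated repository. A minimal sketch:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"

# NF4 4-bit quantization keeps the 8B model within the T4's 16 GB of memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,  # the T4 has no bfloat16 support
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

# Quick smoke test
inputs = tokenizer("Explain quantum entanglement in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```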
## 📱 Mobile Optimization

### Key Mobile Features

- **Touch-friendly** interface (44px+ touch targets)
- **Progressive Web App** capabilities
- **Offline functionality** for cached sessions
- **Reduced data usage** with optimized responses
- **Keyboard-aware** layout adjustments

### Supported Devices

- ✅ Smartphones (iOS/Android)
- ✅ Tablets
- ✅ Desktop browsers
- ✅ Screen readers (accessibility)

## 🛠️ Development

### Project Structure

```
research-assistant/
├── app.py               # Main Gradio application
├── requirements.txt     # Dependencies
├── Dockerfile           # Container configuration
├── src/
│   ├── orchestrator.py  # Core orchestration engine
│   ├── agents/          # Specialized agent modules
│   ├── llm_router.py    # Multi-model routing
│   └── mobile_ux.py     # Mobile optimizations
├── tests/               # Test suites
└── docs/                # Documentation
```

### Adding New Agents

1. Create the agent module in `src/agents/`
2. Implement the agent protocol:

```python
class YourNewAgent:
    async def execute(self, user_input: str, context: dict) -> dict:
        # Your agent logic here
        processed_output = user_input.upper()  # placeholder for real processing
        return {
            "result": processed_output,
            "confidence": 0.95,
            "metadata": {}
        }
```

3. Register the agent in the orchestrator configuration (see the dispatch sketch below)
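The registration step depends on the orchestrator's own configuration format, which is not shown here. As a purely illustrative sketch (the `AGENTS` registry and `run_agents` helper are assumptions, not the real API), agents implementing the `execute` protocol from step 2 can be dispatched in parallel like this:

```python
import asyncio

class YourNewAgent:
    """Mirrors the execute protocol from step 2."""

    async def execute(self, user_input: str, context: dict) -> dict:
        return {"result": f"processed: {user_input}", "confidence": 0.95, "metadata": {}}

# Hypothetical registry; the real orchestrator configuration may look different.
AGENTS = {
    "your_new_agent": YourNewAgent(),
}

async def run_agents(user_input: str, context: dict) -> dict:
    """Dispatch every registered agent concurrently and collect results by name."""
    results = await asyncio.gather(
        *(agent.execute(user_input, context) for agent in AGENTS.values())
    )
    return dict(zip(AGENTS.keys(), results))

if __name__ == "__main__":
    print(asyncio.run(run_agents("Explain quantum entanglement", {"session_id": "demo"})))
```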
## 🧪 Testing

### Run Test Suite

```bash
# Install test dependencies
pip install -r requirements.txt

# Run all tests
pytest tests/ -v

# Run specific test categories
pytest tests/test_agents.py -v
pytest tests/test_mobile_ux.py -v
```

### Test Coverage

- ✅ Agent functionality
- ✅ Mobile UX components
- ✅ LLM routing logic
- ✅ Error handling
- ✅ Performance benchmarks

## 🚨 Troubleshooting

### Common Build Issues

| Issue | Solution |
|-------|----------|
| **HF_TOKEN not found** | Optional - only needed for gated model access |
| **Local models unavailable** | Check the transformers/torch installation |
| **Build timeout** | Reduce model sizes in the configuration |
| **Memory errors** | Check GPU memory usage, optimize model loading |
| **Import errors** | Check the Python version (3.9+) |

### Performance Optimization

1. **Enable caching** in the context manager
2. **Use smaller models** for initial deployment
3. **Implement lazy loading** for mobile users
4. **Monitor memory usage** with built-in tools

### Debug Mode

Enable detailed logging:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## 📊 Performance Metrics

The API now includes comprehensive performance metrics in every response:

```json
{
  "performance": {
    "processing_time": 1230.5,      // milliseconds
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,       // percentage
    "agent_contributions": [
      {"agent": "Intent", "percentage": 25.0},
      {"agent": "Synthesis", "percentage": 40.0},
      {"agent": "Safety", "percentage": 15.0},
      {"agent": "Skills", "percentage": 20.0}
    ],
    "safety_score": 85.0,
    "latency_seconds": 1.230,
    "timestamp": "2024-01-15T10:30:45.123456"
  }
}
```

| Metric | Target | Current |
|--------|--------|---------|
| Response Time | <10s | ~7s |
| Cache Hit Rate | >60% | ~65% |
| Mobile UX Score | >80/100 | 85/100 |
| Error Rate | <5% | ~3% |
| Performance Tracking | ✅ | ✅ Implemented |
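Because every response carries the `performance` block shown above, client code can log it uniformly. A small helper that assumes only the field names documented above:

```python
def summarize_performance(payload: dict) -> str:
    """Render the documented `performance` block as a one-line summary."""
    perf = payload.get("performance", {})
    contributions = ", ".join(
        f"{c['agent']} {c['percentage']:.0f}%" for c in perf.get("agent_contributions", [])
    )
    return (
        f"{perf.get('processing_time', 0):.0f} ms, "
        f"{perf.get('tokens_used', 0)} tokens, "
        f"confidence {perf.get('confidence_score', 0):.1f}% "
        f"({contributions})"
    )

# Example payload shaped like the response documented above.
sample = {
    "performance": {
        "processing_time": 1230.5,
        "tokens_used": 456,
        "confidence_score": 85.2,
        "agent_contributions": [
            {"agent": "Intent", "percentage": 25.0},
            {"agent": "Synthesis", "percentage": 40.0},
        ],
    }
}
print(summarize_performance(sample))
# -> "1230 ms, 456 tokens, confidence 85.2% (Intent 25%, Synthesis 40%)"
```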
## 🔮 Roadmap

### Phase 1 (Current - MVP)

- ✅ Basic agent orchestration
- ✅ Mobile-optimized interface
- ✅ Multi-model routing
- ✅ Transparent reasoning display
- ✅ Performance metrics tracking
- ✅ Enhanced configuration management
- ✅ 4-bit quantization for T4 GPU
- ✅ Model preloading and optimization

### Phase 2 (Next 3 months)

- 🚧 Advanced research capabilities
- 🚧 Plugin system for tools
- 🚧 Enhanced mobile PWA features
- 🚧 Multi-language support

### Phase 3 (Future)

- 🔮 Autonomous agent swarms
- 🔮 Voice interface integration
- 🔮 Enterprise features
- 🔮 Advanced analytics

## 👥 Contributing

We welcome contributions! Please see:

1. [Contributing Guidelines](docs/CONTRIBUTING.md)
2. [Code of Conduct](docs/CODE_OF_CONDUCT.md)
3. [Development Setup](docs/DEVELOPMENT.md)

### Quick Contribution Steps

```bash
# 1. Fork the repository

# 2. Create feature branch
git checkout -b feature/amazing-feature

# 3. Commit changes
git commit -m "Add amazing feature"

# 4. Push to branch
git push origin feature/amazing-feature

# 5. Open Pull Request
```

## 📄 Citation

If you use this framework in your research, please cite:

```bibtex
@software{research_assistant_mvp,
  title  = {AI Research Assistant - MVP},
  author = {Your Name},
  year   = {2024},
  url    = {https://huggingface.co/spaces/your-username/research-assistant}
}
```

## 📜 License

This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [Hugging Face](https://huggingface.co) for the infrastructure
- [Gradio](https://gradio.app) for the web framework
- Model contributors from the HF community
- Early testers and feedback providers

---

**Need help?**

- [Open an Issue](https://github.com/your-org/research-assistant/issues)
- [Join our Discord](https://discord.gg/your-discord)
- [Email Support](mailto:support@your-domain.com)

*Built with ❤️ for the research community*