---
title: AI Research Assistant MVP
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
tags:
- ai
- chatbot
- research
- education
- transformers
models:
- meta-llama/Llama-3.1-8B-Instruct
- intfloat/e5-base-v2
- Qwen/Qwen2.5-1.5B-Instruct
datasets:
- wikipedia
- commoncrawl
base_path: research-assistant
hf_oauth: true
hf_token: true
disable_embedding: false
duplicated_from: null
extra_gated_prompt: null
extra_gated_fields: {}
gated: false
public: true
---

# AI Research Assistant - MVP
## Overview

This MVP demonstrates an intelligent research assistant framework featuring transparent reasoning chains, a specialized agent architecture, and mobile-first design. Built for Hugging Face Spaces with NVIDIA T4 GPU acceleration for local model inference.

### Key Differentiators

- **Transparent Reasoning**: Watch the AI think step by step with Chain-of-Thought output
- **Specialized Agents**: Multiple AI models working together for optimal performance
- **Mobile-First**: Optimized for a seamless mobile web experience
- **Academic Focus**: Designed for research and educational use cases
## API Documentation

Comprehensive API documentation is available in [API_DOCUMENTATION.md](API_DOCUMENTATION.md).

The API provides REST endpoints for:

- Chat interactions with the AI assistant
- Health checks
- Context management
- Session tracking
### Quick API Example

```python
import requests

response = requests.post(
    "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/chat",
    json={
        "message": "What is machine learning?",
        "session_id": "my-session",
        "user_id": "user-123",
    },
)
data = response.json()
print(data["message"])
print(f"Performance: {data.get('performance', {})}")
```
## Quick Start

### Option 1: Use Our Demo

Visit our live demo on Hugging Face Spaces:
https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI

### Option 2: Deploy Your Own Instance

#### Prerequisites

- A Hugging Face account with a write token
- Basic understanding of Hugging Face Spaces
#### Deployment Steps

1. Fork this Space using the Hugging Face UI.
2. (Optional) Add your HF token, needed only for gated models:
   - Go to your Space → Settings → Repository secrets
   - Add `HF_TOKEN` with your Hugging Face token. Note: inference runs on local models, so `HF_TOKEN` is only used for downloading gated models.
3. The Space will auto-build (takes 5-10 minutes).
#### Manual Build (Advanced)

```bash
# Clone the repository
git clone https://huggingface.co/spaces/your-username/research-assistant
cd research-assistant

# Install dependencies
pip install -r requirements.txt

# Set up the environment (optional; only needed for gated models)
export HF_TOKEN="your_hugging_face_token_here"

# Launch the application (pick one)
python main.py    # Full integration with error handling
python launch.py  # Simple launcher
python app.py     # UI-only mode
```
## Integration Structure

The MVP includes complete integration files for deployment:

```
├── main.py                    # Main integration entry point
├── launch.py                  # Simple launcher for HF Spaces
├── app.py                     # Mobile-optimized UI
├── requirements.txt           # Dependencies
└── src/
    ├── __init__.py            # Package initialization
    ├── database.py            # SQLite database management
    ├── event_handlers.py      # UI event integration
    ├── config.py              # Configuration
    ├── llm_router.py          # LLM routing
    ├── orchestrator_engine.py # Request orchestration
    ├── context_manager.py     # Context management
    ├── mobile_handlers.py     # Mobile UX handlers
    └── agents/
        ├── __init__.py        # Agents package
        ├── intent_agent.py    # Intent recognition
        ├── synthesis_agent.py # Response synthesis
        └── safety_agent.py    # Safety checking
```
**Key Features:**

- **Graceful Degradation**: Falls back to mock mode if components fail
- **Mobile-First**: Optimized for mobile devices and small screens
- **Database Ready**: SQLite integration with session management
- **Event Handling**: Complete UI-to-backend integration
- **Error Recovery**: Robust error handling throughout
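The graceful-degradation behavior can be sketched in a few lines. This is illustrative only; `RealEngine` and the import path are hypothetical stand-ins, not the project's actual class or module names:

```python
# Illustrative sketch of graceful degradation: fall back to a mock
# responder when real components fail to load. "RealEngine" and the
# import path below are hypothetical stand-ins, not the real names.

class MockEngine:
    """Minimal fallback used when full components are unavailable."""

    def respond(self, message: str) -> str:
        return f"[mock mode] Echoing: {message}"


def build_engine():
    try:
        # May raise ImportError (missing deps) or RuntimeError (no GPU, bad config).
        from src.orchestrator_engine import RealEngine  # hypothetical
        return RealEngine()
    except Exception:
        return MockEngine()


print(build_engine().respond("hello"))
```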
## Architecture

```
┌─────────────────┐      ┌──────────────────┐      ┌─────────────────┐
│   Mobile Web    │ ──▶  │   ORCHESTRATOR   │ ──▶  │   AGENT SWARM   │
│   Interface     │      │   (Core Engine)  │      │ (5 Specialists) │
└─────────────────┘      └──────────────────┘      └─────────────────┘
         │                        │                         │
         └────────────────────────┼─────────────────────────┘
                                  │
                   ┌──────────────────────────────┐
                   │      PERSISTENCE LAYER       │
                   │    (SQLite + FAISS Lite)     │
                   └──────────────────────────────┘
```
### Core Components

| Component | Purpose | Technology |
|---|---|---|
| Orchestrator | Main coordination engine | Python + asyncio |
| Intent Recognition | Understand user goals | Qwen2.5-1.5B-Instruct + CoT |
| Context Manager | Session memory & recall | FAISS + SQLite |
| Response Synthesis | Generate final answers | Llama-3.1-8B-Instruct |
| Safety Checker | Content moderation | Llama-3.1-8B-Instruct |
| Research Agent | Information gathering | Web search + analysis |
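The "Python + asyncio" coordination in the table above can be sketched with `asyncio.gather`; the agent functions here are toy stand-ins for the real specialist modules:

```python
import asyncio

# Toy stand-ins for the specialist agents; the real modules live in src/agents/.
async def intent_agent(text: str) -> dict:
    return {"agent": "intent", "label": "question"}

async def safety_agent(text: str) -> dict:
    return {"agent": "safety", "safe": True}

async def research_agent(text: str) -> dict:
    return {"agent": "research", "notes": f"looked up: {text}"}

async def orchestrate(text: str) -> dict:
    # Fan the request out to all specialists concurrently, then merge.
    results = await asyncio.gather(
        intent_agent(text), safety_agent(text), research_agent(text)
    )
    return {r["agent"]: r for r in results}

merged = asyncio.run(orchestrate("quantum entanglement"))
print(sorted(merged))  # ['intent', 'research', 'safety']
```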
## Usage Examples

### Basic Research Query

```
User: "Explain quantum entanglement in simple terms"

Assistant:
1. [Reasoning] Breaking down quantum physics concepts...
2. [Research] Gathering latest explanations...
3. [Synthesis] Creating simplified explanation...

[Final Response]: Quantum entanglement is when two particles become linked...
```

### Technical Analysis

```
User: "Compare transformer models for text classification"

Assistant:
1. [Intent] Identifying technical comparison request
2. [Analysis] Evaluating BERT vs RoBERTa vs DistilBERT
3. [Synthesis] Creating comparison table with metrics...
```
## Configuration

### Environment Variables

```bash
# Optional: only needed for downloading gated models
HF_TOKEN="your_hugging_face_token"

# Optional tuning
MAX_WORKERS=4
CACHE_TTL=3600
DEFAULT_MODEL="meta-llama/Llama-3.1-8B-Instruct"
EMBEDDING_MODEL="intfloat/e5-base-v2"
CLASSIFICATION_MODEL="Qwen/Qwen2.5-1.5B-Instruct"
HF_HOME="/tmp/huggingface"  # Cache directory (auto-configured)
LOG_LEVEL="INFO"
```
**Cache Directory Management:**

- Automatically configured with a secure fallback chain
- Supports `HF_HOME`, `TRANSFORMERS_CACHE`, or the user cache directory
- Validates write permissions automatically
- See `.env.example` for all available options
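A sketch of what that fallback chain could look like; the actual resolution logic in `config.py` may differ:

```python
import os
import tempfile
from pathlib import Path

# Sketch of a cache-directory fallback chain with a write-permission check:
# HF_HOME -> TRANSFORMERS_CACHE -> per-user cache -> system temp directory.

def writable(path: Path) -> bool:
    """Return True if we can create and write inside `path`."""
    try:
        path.mkdir(parents=True, exist_ok=True)
        probe = path / ".write_test"
        probe.touch()
        probe.unlink()
        return True
    except OSError:
        return False

def resolve_cache_dir() -> Path:
    candidates = [
        os.environ.get("HF_HOME"),
        os.environ.get("TRANSFORMERS_CACHE"),
        str(Path.home() / ".cache" / "huggingface"),
        os.path.join(tempfile.gettempdir(), "huggingface"),
    ]
    for cand in candidates:
        if cand and writable(Path(cand)):
            return Path(cand)
    raise RuntimeError("no writable cache directory found")
```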
### Model Configuration

The system uses multiple specialized models optimized for the 16 GB NVIDIA T4 GPU:

| Task | Model | Purpose | Quantization |
|---|---|---|---|
| Primary Reasoning | `meta-llama/Llama-3.1-8B-Instruct` | General responses | 4-bit NF4 |
| Embeddings | `intfloat/e5-base-v2` | Semantic search | None (768-dim) |
| Intent Classification | `Qwen/Qwen2.5-1.5B-Instruct` | User goal detection | 4-bit NF4 |
| Safety Checking | `meta-llama/Llama-3.1-8B-Instruct` | Content moderation | 4-bit NF4 |
**Performance Optimizations:**

- 4-bit quantization (NF4) for memory efficiency
- Model preloading for faster responses
- Connection pooling for API calls
- Parallel agent processing
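For reference, 4-bit NF4 loading with the `transformers` + `bitsandbytes` stack typically looks like the sketch below; the double-quantization flag and `device_map` choice are illustrative assumptions, not confirmed project settings:

```python
# Sketch of 4-bit NF4 loading with transformers + bitsandbytes.
# The settings below (double quantization, device_map="auto") are
# illustrative assumptions, not confirmed project configuration.

def nf4_quant_kwargs() -> dict:
    """Quantization settings matching the table above."""
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_use_double_quant": True,
    }

def load_quantized(model_id: str):
    """Load a causal LM in 4-bit NF4. Requires transformers, bitsandbytes,
    and a CUDA GPU; deliberately not executed here."""
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    cfg = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.float16,
        **nf4_quant_kwargs(),
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=cfg, device_map="auto"
    )
```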
## Mobile Optimization

### Key Mobile Features

- Touch-friendly interface (44px+ touch targets)
- Progressive Web App capabilities
- Offline functionality for cached sessions
- Reduced data usage with optimized responses
- Keyboard-aware layout adjustments

### Supported Devices

- Smartphones (iOS/Android)
- Tablets
- Desktop browsers
- Screen readers (accessibility)
## Development

### Project Structure

```
research-assistant/
├── app.py               # Main Gradio application
├── requirements.txt     # Dependencies
├── Dockerfile           # Container configuration
├── src/
│   ├── orchestrator.py  # Core orchestration engine
│   ├── agents/          # Specialized agent modules
│   ├── llm_router.py    # Multi-model routing
│   └── mobile_ux.py     # Mobile optimizations
├── tests/               # Test suites
└── docs/                # Documentation
```
### Adding New Agents

1. Create an agent module in `src/agents/`.
2. Implement the agent protocol:

   ```python
   class YourNewAgent:
       async def execute(self, user_input: str, context: dict) -> dict:
           # Your agent logic here
           processed_output = user_input.upper()  # placeholder
           return {
               "result": processed_output,
               "confidence": 0.95,
               "metadata": {},
           }
   ```

3. Register the agent in the orchestrator configuration.
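Assuming the protocol above, a new agent can be exercised like this; note that `register_agent` and the `AGENTS` registry are hypothetical illustrations, not the project's actual registration hook:

```python
import asyncio

class KeywordAgent:
    """Toy agent following the async execute(user_input, context) protocol."""

    async def execute(self, user_input: str, context: dict) -> dict:
        keywords = [w for w in user_input.lower().split() if len(w) > 4]
        return {"result": keywords, "confidence": 0.8, "metadata": {}}

# Hypothetical registry standing in for the orchestrator configuration.
AGENTS = {}

def register_agent(name: str, agent) -> None:
    AGENTS[name] = agent

register_agent("keywords", KeywordAgent())
out = asyncio.run(AGENTS["keywords"].execute("compare transformer models", {}))
print(out["result"])  # ['compare', 'transformer', 'models']
```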
## Testing

### Run the Test Suite

```bash
# Install test dependencies
pip install -r requirements.txt

# Run all tests
pytest tests/ -v

# Run specific test categories
pytest tests/test_agents.py -v
pytest tests/test_mobile_ux.py -v
```
### Test Coverage

- Agent functionality
- Mobile UX components
- LLM routing logic
- Error handling
- Performance benchmarks
## Troubleshooting

### Common Build Issues

| Issue | Solution |
|---|---|
| `HF_TOKEN` not found | Optional; only needed for gated model access |
| Local models unavailable | Check the transformers/torch installation |
| Build timeout | Reduce model sizes in requirements |
| Memory errors | Check GPU memory usage; optimize model loading |
| Import errors | Check the Python version (3.9+) |
### Performance Optimization

- Enable caching in the context manager
- Use smaller models for the initial deployment
- Implement lazy loading for mobile users
- Monitor memory usage with the built-in tools
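The caching suggestion can be as small as a TTL cache in front of expensive context lookups. This is a sketch; the real context manager's eviction policy may differ:

```python
import time

class TTLCache:
    """Minimal TTL cache sketch for context-manager lookups."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (inserted_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        inserted_at, value = entry
        if time.monotonic() - inserted_at > self.ttl:
            del self._store[key]  # expired; evict lazily
            return None
        return value

    def set(self, key, value) -> None:
        self._store[key] = (time.monotonic(), value)

cache = TTLCache(ttl_seconds=0.1)
cache.set("session:abc", {"history": []})
print(cache.get("session:abc"))  # {'history': []}
```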
### Debug Mode

Enable detailed logging:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
```
## Performance Metrics

The API includes comprehensive performance metrics in every response:

```jsonc
{
  "performance": {
    "processing_time": 1230.5,      // milliseconds
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,       // percentage
    "agent_contributions": [
      {"agent": "Intent", "percentage": 25.0},
      {"agent": "Synthesis", "percentage": 40.0},
      {"agent": "Safety", "percentage": 15.0},
      {"agent": "Skills", "percentage": 20.0}
    ],
    "safety_score": 85.0,
    "latency_seconds": 1.230,
    "timestamp": "2024-01-15T10:30:45.123456"
  }
}
```
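Clients can sanity-check this payload, for example verifying that agent contributions sum to 100%; a minimal sketch using the example values above:

```python
# Sketch: validate the "performance" block from an API response.
payload = {
    "performance": {
        "processing_time": 1230.5,
        "tokens_used": 456,
        "agents_used": 4,
        "confidence_score": 85.2,
        "agent_contributions": [
            {"agent": "Intent", "percentage": 25.0},
            {"agent": "Synthesis", "percentage": 40.0},
            {"agent": "Safety", "percentage": 15.0},
            {"agent": "Skills", "percentage": 20.0},
        ],
    }
}

perf = payload["performance"]
total = sum(c["percentage"] for c in perf["agent_contributions"])
assert abs(total - 100.0) < 1e-9, "contributions should sum to 100%"
print(f"{perf['agents_used']} agents, {total:.0f}% accounted for")
# prints "4 agents, 100% accounted for"
```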
| Metric | Target | Current |
|---|---|---|
| Response Time | <10s | ~7s |
| Cache Hit Rate | >60% | ~65% |
| Mobile UX Score | >80/100 | 85/100 |
| Error Rate | <5% | ~3% |
| Performance Tracking | Required | Implemented |
## Roadmap

### Phase 1 (Current - MVP)

- Basic agent orchestration
- Mobile-optimized interface
- Multi-model routing
- Transparent reasoning display
- Performance metrics tracking
- Enhanced configuration management
- 4-bit quantization for the T4 GPU
- Model preloading and optimization

### Phase 2 (Next 3 months)

- Advanced research capabilities
- Plugin system for tools
- Enhanced mobile PWA features
- Multi-language support

### Phase 3 (Future)

- Autonomous agent swarms
- Voice interface integration
- Enterprise features
- Advanced analytics
## Contributing

We welcome contributions!

### Quick Contribution Steps

```bash
# 1. Fork the repository

# 2. Create a feature branch
git checkout -b feature/amazing-feature

# 3. Commit changes
git commit -m "Add amazing feature"

# 4. Push the branch
git push origin feature/amazing-feature

# 5. Open a Pull Request
```
## Citation

If you use this framework in your research, please cite:

```bibtex
@software{research_assistant_mvp,
  title  = {AI Research Assistant - MVP},
  author = {Your Name},
  year   = {2024},
  url    = {https://huggingface.co/spaces/your-username/research-assistant}
}
```
## License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
## Acknowledgments

- Hugging Face for the infrastructure
- Gradio for the web framework
- Model contributors from the HF community
- Early testers and feedback providers
---

Need help? Open a discussion on the Space's Community tab.

Built with ❤️ for the research community