---
title: AI Research Assistant MVP
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
tags:
  - ai
  - chatbot
  - research
  - education
  - transformers
models:
  - meta-llama/Llama-3.1-8B-Instruct
  - intfloat/e5-base-v2
  - Qwen/Qwen2.5-1.5B-Instruct
datasets:
  - wikipedia
  - commoncrawl
base_path: research-assistant
hf_oauth: true
hf_token: true
disable_embedding: false
duplicated_from: null
extra_gated_prompt: null
extra_gated_fields: {}
gated: false
public: true
---

# AI Research Assistant - MVP

<div align="center">

**Academic-grade AI assistant with transparent reasoning and a mobile-optimized interface**

[Live Demo](https://huggingface.co/spaces/your-username/research-assistant) · [Documentation](https://github.com/your-org/research-assistant/wiki)

</div>

## Overview

This MVP demonstrates an intelligent research assistant framework featuring **transparent reasoning chains**, a **specialized agent architecture**, and **mobile-first design**. It is built for Hugging Face Spaces with NVIDIA T4 GPU acceleration for local model inference.

### Key Differentiators

- **Transparent Reasoning**: Watch the AI think step by step with Chain of Thought
- **Specialized Agents**: Multiple AI models working together for optimal performance
- **Mobile-First**: Optimized for a seamless mobile web experience
- **Academic Focus**: Designed for research and educational use cases

## API Documentation

**Comprehensive API documentation is available:** [API_DOCUMENTATION.md](API_DOCUMENTATION.md)

The API provides REST endpoints for:

- Chat interactions with the AI assistant
- Health checks
- Context management
- Session tracking

**Quick API Example:**

```python
import requests

response = requests.post(
    "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/chat",
    json={
        "message": "What is machine learning?",
        "session_id": "my-session",
        "user_id": "user-123",
    },
    timeout=60,  # model inference can take several seconds
)
response.raise_for_status()
data = response.json()
print(data["message"])
print(f"Performance: {data.get('performance', {})}")
```

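For a quick liveness probe, the health-check endpoint listed above can be polled the same way. The `/api/health` path below is an assumption based on the endpoint list; confirm the exact route in API_DOCUMENTATION.md:

```python
import requests

# Hypothetical health endpoint; confirm the exact path in API_DOCUMENTATION.md.
HEALTH_URL = "https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI/api/health"

def is_healthy(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Return True if the service answers with HTTP 200 within the timeout."""
    try:
        return requests.get(url, timeout=timeout).status_code == 200
    except requests.RequestException:
        # Connection errors and timeouts count as unhealthy.
        return False
```

This is handy in deployment scripts that need to wait for the Space to finish building before sending traffic.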
## Quick Start

### Option 1: Use Our Demo

Visit our live demo on Hugging Face Spaces:

```
https://huggingface.co/spaces/JatinAutonomousLabs/HonestAI
```

### Option 2: Deploy Your Own Instance

#### Prerequisites

- A Hugging Face account with a [write token](https://huggingface.co/settings/tokens)
- Basic familiarity with Hugging Face Spaces

#### Deployment Steps

1. **Fork this Space** using the Hugging Face UI
2. **Add your HF token** (optional; only needed for gated models):
   - Go to your Space → Settings → Repository secrets
   - Add `HF_TOKEN` with your Hugging Face token
   - **Note**: Inference runs on local models; `HF_TOKEN` is only used for downloading gated models
3. **The Space will auto-build** (takes 5-10 minutes)

#### Manual Build (Advanced)

```bash
# Clone the repository
git clone https://huggingface.co/spaces/your-username/research-assistant
cd research-assistant

# Install dependencies
pip install -r requirements.txt

# Set up environment (optional; only needed for gated models)
export HF_TOKEN="your_hugging_face_token_here"

# Launch the application (pick one)
python main.py    # Full integration with error handling
python launch.py  # Simple launcher
python app.py     # UI-only mode
```

## Integration Structure

The MVP now includes complete integration files for deployment:

```
├── main.py                    # Main integration entry point
├── launch.py                  # Simple launcher for HF Spaces
├── app.py                     # Mobile-optimized UI
├── requirements.txt           # Dependencies
└── src/
    ├── __init__.py            # Package initialization
    ├── database.py            # SQLite database management
    ├── event_handlers.py      # UI event integration
    ├── config.py              # Configuration
    ├── llm_router.py          # LLM routing
    ├── orchestrator_engine.py # Request orchestration
    ├── context_manager.py     # Context management
    ├── mobile_handlers.py     # Mobile UX handlers
    └── agents/
        ├── __init__.py        # Agents package
        ├── intent_agent.py    # Intent recognition
        ├── synthesis_agent.py # Response synthesis
        └── safety_agent.py    # Safety checking
```

### Key Features

- **Graceful Degradation**: Falls back to mock mode if components fail
- **Mobile-First**: Optimized for mobile devices and small screens
- **Database Ready**: SQLite integration with session management
- **Event Handling**: Complete UI-to-backend integration
- **Error Recovery**: Robust error handling throughout

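The graceful-degradation behavior can be sketched as follows. The module and class names here are illustrative, not the project's actual API:

```python
class MockEngine:
    """Minimal stand-in that keeps the UI usable without the real backend."""

    def respond(self, message: str) -> str:
        return f"[mock] echo: {message}"

def load_engine():
    """Return the real engine if it can be imported, else a mock fallback."""
    try:
        # The real engine may require GPU libraries that are absent locally.
        from src.orchestrator_engine import OrchestratorEngine  # hypothetical import
        return OrchestratorEngine()
    except Exception:
        return MockEngine()

engine = load_engine()
```

The broad `except Exception` is deliberate here: any failure to construct the real backend should degrade to mock mode rather than crash the Space.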
## Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Mobile Web    │←──→│   ORCHESTRATOR   │←──→│   AGENT SWARM   │
│   Interface     │    │  (Core Engine)   │    │ (5 Specialists) │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                      │                       │
         └──────────────────────┼───────────────────────┘
                                │
                  ┌─────────────────────────────┐
                  │      PERSISTENCE LAYER      │
                  │    (SQLite + FAISS Lite)    │
                  └─────────────────────────────┘
```

### Core Components

| Component | Purpose | Technology |
|-----------|---------|------------|
| **Orchestrator** | Main coordination engine | Python + asyncio |
| **Intent Recognition** | Understand user goals | Qwen2.5-1.5B-Instruct + CoT |
| **Context Manager** | Session memory & recall | FAISS + SQLite |
| **Response Synthesis** | Generate final answers | Llama-3.1-8B-Instruct |
| **Safety Checker** | Content moderation | Llama-3.1-8B-Instruct |
| **Research Agent** | Information gathering | Web search + analysis |

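A minimal sketch of how the router might map a classified intent to one of the models above. The mapping and function name are illustrative; the real `llm_router.py` may differ:

```python
# Hypothetical intent-to-model routing table built from the models
# declared in this Space's configuration.
INTENT_TO_MODEL = {
    "chat": "meta-llama/Llama-3.1-8B-Instruct",
    "classify": "Qwen/Qwen2.5-1.5B-Instruct",
    "embed": "intfloat/e5-base-v2",
}

DEFAULT_MODEL = "meta-llama/Llama-3.1-8B-Instruct"

def route(intent: str) -> str:
    """Pick a model for an intent, defaulting to the primary reasoning model."""
    return INTENT_TO_MODEL.get(intent, DEFAULT_MODEL)
```

Keeping the table data-driven makes it easy to swap models via environment variables rather than code changes.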
## Usage Examples

### Basic Research Query

```
User: "Explain quantum entanglement in simple terms"

Assistant:
1. [Reasoning] Breaking down quantum physics concepts...
2. [Research] Gathering latest explanations...
3. [Synthesis] Creating simplified explanation...

[Final Response]: Quantum entanglement is when two particles become linked...
```

### Technical Analysis

```
User: "Compare transformer models for text classification"

Assistant:
1. [Intent] Identifying technical comparison request
2. [Analysis] Evaluating BERT vs RoBERTa vs DistilBERT
3. [Synthesis] Creating comparison table with metrics...
```

## Configuration

### Environment Variables

```bash
# Optional: only needed for downloading gated models
HF_TOKEN="your_hugging_face_token"

# Optional tuning
MAX_WORKERS=4
CACHE_TTL=3600
DEFAULT_MODEL="meta-llama/Llama-3.1-8B-Instruct"
EMBEDDING_MODEL="intfloat/e5-base-v2"
CLASSIFICATION_MODEL="Qwen/Qwen2.5-1.5B-Instruct"
HF_HOME="/tmp/huggingface"  # Cache directory (auto-configured)
LOG_LEVEL="INFO"
```

**Cache Directory Management:**

- Automatically configured with a secure fallback chain
- Supports `HF_HOME`, `TRANSFORMERS_CACHE`, or the user cache directory
- Validates write permissions automatically
- See `.env.example` for all available options

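The fallback chain above can be sketched like this. The exact order and validation logic live in the project's config module; this is only an illustration:

```python
import os
import tempfile
from pathlib import Path

def resolve_cache_dir() -> Path:
    """Pick the first writable cache directory from the fallback chain."""
    candidates = [
        os.environ.get("HF_HOME"),
        os.environ.get("TRANSFORMERS_CACHE"),
        str(Path.home() / ".cache" / "huggingface"),
        tempfile.gettempdir(),  # last resort: normally always writable
    ]
    for candidate in candidates:
        if not candidate:
            continue
        path = Path(candidate)
        try:
            path.mkdir(parents=True, exist_ok=True)
            # Validate write permission with a throwaway file.
            probe = path / ".write_probe"
            probe.touch()
            probe.unlink()
            return path
        except OSError:
            continue  # unwritable: fall through to the next candidate
    raise RuntimeError("No writable cache directory found")
```

Probing with a real file write is more reliable than `os.access`, which can disagree with the filesystem on some mounts.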
### Model Configuration

The system uses multiple specialized models optimized for a 16 GB T4 GPU:

| Task | Model | Purpose | Quantization |
|------|-------|---------|--------------|
| Primary Reasoning | `meta-llama/Llama-3.1-8B-Instruct` | General responses | 4-bit NF4 |
| Embeddings | `intfloat/e5-base-v2` | Semantic search | None (768-dim) |
| Intent Classification | `Qwen/Qwen2.5-1.5B-Instruct` | User goal detection | 4-bit NF4 |
| Safety Checking | `meta-llama/Llama-3.1-8B-Instruct` | Content moderation | 4-bit NF4 |

**Performance Optimizations:**

- ✅ 4-bit quantization (NF4) for memory efficiency
- ✅ Model preloading for faster responses
- ✅ Connection pooling for API calls
- ✅ Parallel agent processing

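The NF4 setting in the table typically corresponds to a `transformers` `BitsAndBytesConfig` like the one below. This is a sketch of how such a model might be loaded, not the project's verified loading code; it requires `bitsandbytes` and a CUDA GPU, and will not run on a CPU-only machine:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # NF4 data type, as in the table above
    bnb_4bit_compute_dtype=torch.float16,  # T4 has no bfloat16 support
    bnb_4bit_use_double_quant=True,        # small additional memory saving
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=quant_config,
    device_map="auto",  # place layers on the T4 automatically
)
```

With this configuration an 8B model fits comfortably alongside the smaller classification model in 16 GB of VRAM.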
## Mobile Optimization

### Key Mobile Features

- **Touch-friendly** interface (44px+ touch targets)
- **Progressive Web App** capabilities
- **Offline functionality** for cached sessions
- **Reduced data usage** with optimized responses
- **Keyboard-aware** layout adjustments

### Supported Devices

- ✅ Smartphones (iOS/Android)
- ✅ Tablets
- ✅ Desktop browsers
- ✅ Screen readers (accessibility)

## Development

### Project Structure

```
research-assistant/
├── app.py               # Main Gradio application
├── requirements.txt     # Dependencies
├── Dockerfile           # Container configuration
├── src/
│   ├── orchestrator.py  # Core orchestration engine
│   ├── agents/          # Specialized agent modules
│   ├── llm_router.py    # Multi-model routing
│   └── mobile_ux.py     # Mobile optimizations
├── tests/               # Test suites
└── docs/                # Documentation
```

### Adding New Agents

1. Create an agent module in `src/agents/`
2. Implement the agent protocol:

```python
class YourNewAgent:
    async def execute(self, user_input: str, context: dict) -> dict:
        # Your agent logic here; this placeholder just uppercases the input.
        processed_output = user_input.upper()
        return {
            "result": processed_output,
            "confidence": 0.95,
            "metadata": {},
        }
```

3. Register the agent in the orchestrator configuration

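Registration might look like the following sketch. The registry shape is illustrative; the real orchestrator configuration may differ:

```python
import asyncio

class YourNewAgent:
    async def execute(self, user_input: str, context: dict) -> dict:
        return {"result": user_input.upper(), "confidence": 0.95, "metadata": {}}

# Hypothetical agent registry keyed by agent name.
AGENT_REGISTRY = {}

def register_agent(name: str, agent) -> None:
    AGENT_REGISTRY[name] = agent

register_agent("your_new_agent", YourNewAgent())

# Exercise the registered agent once via the async protocol.
result = asyncio.run(AGENT_REGISTRY["your_new_agent"].execute("hello", {}))
```

Because every agent exposes the same `execute` coroutine, the orchestrator can fan requests out to several agents concurrently with `asyncio.gather`.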
## Testing

### Run Test Suite

```bash
# Install test dependencies
pip install -r requirements.txt

# Run all tests
pytest tests/ -v

# Run specific test categories
pytest tests/test_agents.py -v
pytest tests/test_mobile_ux.py -v
```

### Test Coverage

- ✅ Agent functionality
- ✅ Mobile UX components
- ✅ LLM routing logic
- ✅ Error handling
- ✅ Performance benchmarks

## Troubleshooting

### Common Build Issues

| Issue | Solution |
|-------|----------|
| **HF_TOKEN not found** | Optional; only needed for gated model access |
| **Local models unavailable** | Check the transformers/torch installation |
| **Build timeout** | Reduce model sizes in requirements |
| **Memory errors** | Check GPU memory usage; optimize model loading |
| **Import errors** | Check the Python version (3.9+) |

### Performance Optimization

1. **Enable caching** in the context manager
2. **Use smaller models** for the initial deployment
3. **Implement lazy loading** for mobile users
4. **Monitor memory usage** with the built-in tools

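Step 1's response caching can be sketched as a small time-to-live cache honoring the `CACHE_TTL` setting. This is a minimal illustration, not the project's actual context-manager implementation:

```python
import time

class TTLCache:
    """Tiny time-to-live cache for responses keyed by query string."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=0.05)
cache.set("q", "cached answer")
```

`time.monotonic()` is used instead of `time.time()` so expiry is unaffected by wall-clock adjustments.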
### Debug Mode

Enable detailed logging:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## Performance Metrics

The API includes comprehensive performance metrics in every response:

```json
{
  "performance": {
    "processing_time": 1230.5,
    "tokens_used": 456,
    "agents_used": 4,
    "confidence_score": 85.2,
    "agent_contributions": [
      {"agent": "Intent", "percentage": 25.0},
      {"agent": "Synthesis", "percentage": 40.0},
      {"agent": "Safety", "percentage": 15.0},
      {"agent": "Skills", "percentage": 20.0}
    ],
    "safety_score": 85.0,
    "latency_seconds": 1.230,
    "timestamp": "2024-01-15T10:30:45.123456"
  }
}
```

`processing_time` is in milliseconds; `confidence_score` and `safety_score` are percentages.

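A payload like the one above can be unpacked like this. The dictionary below mirrors the sample response, and the field names follow that example:

```python
sample = {
    "performance": {
        "processing_time": 1230.5,  # milliseconds
        "tokens_used": 456,
        "agents_used": 4,
        "confidence_score": 85.2,
        "agent_contributions": [
            {"agent": "Intent", "percentage": 25.0},
            {"agent": "Synthesis", "percentage": 40.0},
            {"agent": "Safety", "percentage": 15.0},
            {"agent": "Skills", "percentage": 20.0},
        ],
        "latency_seconds": 1.230,
    }
}

perf = sample["performance"]
latency_s = perf["processing_time"] / 1000.0  # convert ms -> seconds
top_agent = max(perf["agent_contributions"], key=lambda c: c["percentage"])
total_pct = sum(c["percentage"] for c in perf["agent_contributions"])
```

The contribution percentages are expected to sum to 100, which makes them easy to chart directly in a dashboard.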
| Metric | Target | Current |
|--------|--------|---------|
| Response Time | <10s | ~7s |
| Cache Hit Rate | >60% | ~65% |
| Mobile UX Score | >80/100 | 85/100 |
| Error Rate | <5% | ~3% |
| Performance Tracking | ✅ | ✅ Implemented |

## Roadmap

### Phase 1 (Current - MVP)

- ✅ Basic agent orchestration
- ✅ Mobile-optimized interface
- ✅ Multi-model routing
- ✅ Transparent reasoning display
- ✅ Performance metrics tracking
- ✅ Enhanced configuration management
- ✅ 4-bit quantization for the T4 GPU
- ✅ Model preloading and optimization

### Phase 2 (Next 3 months)

- 🚧 Advanced research capabilities
- 🚧 Plugin system for tools
- 🚧 Enhanced mobile PWA features
- 🚧 Multi-language support

### Phase 3 (Future)

- 🔮 Autonomous agent swarms
- 🔮 Voice interface integration
- 🔮 Enterprise features
- 🔮 Advanced analytics

## Contributing

We welcome contributions! Please see:

1. [Contributing Guidelines](docs/CONTRIBUTING.md)
2. [Code of Conduct](docs/CODE_OF_CONDUCT.md)
3. [Development Setup](docs/DEVELOPMENT.md)

### Quick Contribution Steps

```bash
# 1. Fork the repository

# 2. Create a feature branch
git checkout -b feature/amazing-feature

# 3. Commit your changes
git commit -m "Add amazing feature"

# 4. Push to the branch
git push origin feature/amazing-feature

# 5. Open a Pull Request
```

## Citation

If you use this framework in your research, please cite:

```bibtex
@software{research_assistant_mvp,
  title  = {AI Research Assistant - MVP},
  author = {Your Name},
  year   = {2024},
  url    = {https://huggingface.co/spaces/your-username/research-assistant}
}
```

## License

This project is licensed under the Apache 2.0 License; see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [Hugging Face](https://huggingface.co) for the infrastructure
- [Gradio](https://gradio.app) for the web framework
- Model contributors from the HF community
- Early testers and feedback providers

---

<div align="center">

**Need help?**

- [Open an Issue](https://github.com/your-org/research-assistant/issues)
- [Join our Discord](https://discord.gg/your-discord)
- [Email Support](mailto:[email protected])

*Built with ❤️ for the research community*

</div>