📋 Executive Summary
The BMS AI Assistant is a zero-cost chatbot system for demand forecasting, inventory management, and ERP integration. It is built entirely on open-source technologies and needs no external API: rule-based intent parsing handles structured queries quickly and deterministically, while a locally hosted LLM (TinyLlama) answers general, context-aware questions.
🏗️ System Architecture
Presentation Layer
API Layer
Business Logic Layer
Data Layer
AI/ML Layer
💻 Technology Stack
Backend Framework
- FastAPI 0.104+
- Uvicorn (ASGI Server)
- Python 3.10+
- Async/Await Support
AI/ML Stack
- TinyLlama 1.1B (GGUF)
- ctransformers 0.2.27+
- Hugging Face Hub
- Statsmodels (ARIMA)
Data Processing
- Pandas 2.0+
- NumPy 1.24+
- CSV-based Storage
- In-Memory Caching
Frontend
- HTML5 + CSS3
- Vanilla JavaScript
- Inter Font Family
- SVG Graphics
Cloud Deployment
- Hugging Face Spaces
- Docker Container
- Git Version Control
Testing & QA
- Python Requests
- Custom Test Suite
- Load Testing Scripts
- Concurrent Testing
📊 Technical Specifications
| Component | Specification | Purpose |
|---|---|---|
| LLM Model | TinyLlama 1.1B (Q4_K_M quantized) | General chat, context-aware responses |
| Model Size | ~670 MB (GGUF format, Q4_K_M) | Optimized for CPU inference |
| Forecasting Algorithm | ARIMA (AutoRegressive Integrated Moving Average) | Time-series demand prediction |
| Intent Recognition | Rule-based pattern matching + Regex | Fast, deterministic intent parsing |
| API Protocol | REST (JSON) | Standard HTTP communication |
| Data Format | CSV (UTF-8) | Simple, portable data storage |
| PDF Generation | FPDF Library | Report generation |
| Response Time | <2s (rule-based), <15s (LLM) | User experience optimization |
| Concurrent Users | 5-10 (free tier), 50+ (paid) | Scalability |
| Memory Requirement | ~2 GB RAM (minimum) | Model + application overhead |
🔄 Complete Process Flow
User Input
Component: index.html (Frontend)
Action: User types query in chat interface
Technology: JavaScript event listener (keypress/click)
API Request
Component: JavaScript Fetch API
Action: POST request to /api/chat endpoint
Payload: JSON with user message
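For exercising the endpoint outside the browser, the same request can be built from Python. This is a minimal sketch using only the standard library; the `message` field name and the `http://localhost:8000` base URL are assumptions, since the source only says the payload is JSON carrying the user message.

```python
import json
import urllib.request


def build_chat_request(base_url: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/chat endpoint."""
    # "message" is an assumed field name, not confirmed by the source.
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical local server address and query text.
req = build_chat_request("http://localhost:8000", "Forecast demand for BMS-1001")
```

Passing `req` to `urllib.request.urlopen` would send it; the sketch stops short of that so it runs without a live server.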
Intent Parsing
Component: intent_parser.py
Action: Analyze query using regex patterns
Output: Intent type + extracted entities (item_code, quantity, etc.)
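The parsing step could look roughly like the sketch below. It is not the actual contents of intent_parser.py: the keyword patterns, the item-code format (such as `BMS-1001`), and the quantity units are all hypothetical, and only three of the intents are covered.

```python
import re
from typing import Any

# Hypothetical keyword patterns, checked in order; the first match wins.
INTENT_PATTERNS: list[tuple[str, re.Pattern[str]]] = [
    ("demand_forecast", re.compile(r"\b(forecast|predict|demand)\b", re.I)),
    ("check_inventory", re.compile(r"\b(inventory|stock)\b", re.I)),
    ("create_requisition", re.compile(r"\b(requisition|order|purchase)\b", re.I)),
]

# Assumed entity formats: codes like "BMS-1001", quantities like "50 units".
ITEM_CODE = re.compile(r"\b([A-Z]{2,4}-?\d{3,6})\b")
QUANTITY = re.compile(r"\b(\d+)\s*(?:units?|pcs|pieces)\b", re.I)


def parse_intent(query: str) -> dict[str, Any]:
    """Return the matched intent and any extracted entities."""
    intent = "general_chat"  # fallback when no rule matches
    for name, pattern in INTENT_PATTERNS:
        if pattern.search(query):
            intent = name
            break

    entities: dict[str, Any] = {}
    code_match = ITEM_CODE.search(query)
    if code_match:
        entities["item_code"] = code_match.group(1)
    qty_match = QUANTITY.search(query)
    if qty_match:
        entities["quantity"] = int(qty_match.group(1))
    return {"intent": intent, "entities": entities}
```

Because the patterns are tried in a fixed order, a given query always yields the same intent, which is the deterministic behaviour the rule-based approach is chosen for.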
Intent Routing
Component: main.py (FastAPI)
Action: Route to appropriate handler based on intent
Intents: demand_forecast, item_details, check_inventory, supplier_info, create_requisition, system_status, generate_report, general_chat
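One common way to implement this kind of routing is a dispatch table that maps intent names to handler functions. The sketch below assumes that design; the handler names and their return values are placeholders, only two of the eight intents are wired up, and unknown intents fall back to `general_chat`.

```python
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], dict[str, Any]]


def handle_demand_forecast(entities: dict[str, Any]) -> dict[str, Any]:
    # Placeholder: a real handler would call the forecasting module.
    item = entities.get("item_code", "unknown item")
    return {"answer": f"Forecast requested for {item}"}


def handle_general_chat(entities: dict[str, Any]) -> dict[str, Any]:
    # Placeholder: a real handler would call the LLM engine.
    return {"answer": "Passing the query to the LLM."}


# Hypothetical dispatch table; the remaining intents would be added the same way.
HANDLERS: dict[str, Handler] = {
    "demand_forecast": handle_demand_forecast,
    "general_chat": handle_general_chat,
}


def route(intent: str, entities: dict[str, Any]) -> dict[str, Any]:
    """Dispatch to the handler for `intent`, falling back to general_chat."""
    if intent not in HANDLERS:
        intent = "general_chat"
    result = HANDLERS[intent](entities)
    return {"intent": intent, **result}
```

With a table like this, supporting a new intent means adding one entry rather than extending an if/elif chain.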
Business Logic Execution
Components: Multiple modules based on intent
Actions:
- Forecasting: forecasting.py → ARIMA model → Predict demand
- Data Retrieval: data_loader.py → Load CSV → Query data
- LLM Response: llm_engine.py → TinyLlama → Generate text
- PDF Generation: pdf_generator.py → FPDF → Create report
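As an illustration of the data-retrieval path (load CSV, cache in memory, query), here is a minimal sketch using only the standard library. It is not the real data_loader.py: the column names `item_code` and `stock`, the function names, and the sample rows are invented for the demonstration.

```python
import csv
import tempfile
from functools import lru_cache
from pathlib import Path
from typing import Optional


@lru_cache(maxsize=None)
def load_table(path: str) -> dict[str, dict[str, str]]:
    """Read a UTF-8 CSV once and index its rows by item_code (assumed column)."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        return {row["item_code"]: row for row in csv.DictReader(f)}


def stock_level(path: str, item_code: str) -> Optional[int]:
    """Return the stock for an item, or None if the item is not in the table."""
    row = load_table(path).get(item_code)
    return int(row["stock"]) if row is not None else None


# Demo with made-up data written to a temporary file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("item_code,description,stock\nBMS-1001,Bearing,120\nBMS-1002,Gasket,0\n")
    demo_path = tmp.name

in_stock = stock_level(demo_path, "BMS-1001")  # first call reads the file
missing = stock_level(demo_path, "NOPE")       # second call is served from the cache
Path(demo_path).unlink()                       # clean up the temporary file
```

`lru_cache` keeps the parsed table in memory for the life of the process, so a changed CSV is only picked up after `load_table.cache_clear()` or a restart.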
Response Formatting
Component: main.py
Action: Format response as JSON
Structure: {intent, answer, forecast (optional), pdf_link (optional)}
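The response structure above can be assembled with a small helper that leaves out the optional keys when they do not apply. This is a sketch under assumptions: the helper name is hypothetical, and the source does not specify the type of `forecast`, so a list of numbers is assumed here.

```python
import json
from typing import Any, Optional


def build_response(
    intent: str,
    answer: str,
    forecast: Optional[list[float]] = None,  # assumed shape: list of predicted values
    pdf_link: Optional[str] = None,
) -> dict[str, Any]:
    """Build the JSON-serializable response, omitting unset optional fields."""
    payload: dict[str, Any] = {"intent": intent, "answer": answer}
    if forecast is not None:
        payload["forecast"] = forecast
    if pdf_link is not None:
        payload["pdf_link"] = pdf_link
    return payload


# Hypothetical example values.
chat_reply = build_response("general_chat", "Hello! How can I help?")
forecast_reply = build_response(
    "demand_forecast", "Forecast for BMS-1001", forecast=[10.0, 12.5]
)
serialized = json.dumps(forecast_reply)
```

Omitting absent keys, rather than sending `null`, lets the frontend test for `forecast` or `pdf_link` with a simple presence check.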
Frontend Rendering
Component: index.html (JavaScript)
Action: Parse JSON, render message bubble, display tables/charts
Features: HTML sanitization, table rendering, download buttons
User Interaction
Component: Frontend UI
Action: Display response with icons, animations, and interactive elements
Features: Smooth scrolling, copy text, download PDFs, click links