🏭 BMS AI Assistant

Technical Specification & Process Flow Documentation

Version 1.0 | Enterprise Demand Forecasting System

📋 Executive Summary

The BMS AI Assistant is an enterprise-grade, zero-cost chatbot system designed for demand forecasting, inventory management, and ERP integration. Built with open-source technologies, it combines rule-based intent parsing with local AI capabilities to provide intelligent, context-aware responses without external API dependencies.

Zero Cost | 100% Open Source | Local AI | Enterprise Ready

🏗️ System Architecture

Presentation Layer

HTML5 + CSS3
JavaScript (Vanilla)
SVG Icons
Hugging Face Spaces

API Layer

FastAPI
Uvicorn (ASGI)
REST Endpoints

Business Logic Layer

Intent Parser
ARIMA Forecasting
LLM Engine
PDF Generator

Data Layer

CSV Files
Pandas DataFrames
In-Memory Cache

AI/ML Layer

TinyLlama 1.1B
ctransformers
Statsmodels (ARIMA)

💻 Technology Stack

Backend Framework

  • FastAPI 0.104+
  • Uvicorn (ASGI Server)
  • Python 3.10+
  • Async/Await Support

AI/ML Stack

  • TinyLlama 1.1B (GGUF)
  • ctransformers 0.2.27+
  • Hugging Face Hub
  • Statsmodels (ARIMA)

Data Processing

  • Pandas 2.0+
  • NumPy 1.24+
  • CSV-based Storage
  • In-Memory Caching

Frontend

  • HTML5 + CSS3
  • Vanilla JavaScript
  • Inter Font Family
  • SVG Graphics

Cloud Deployment

  • Hugging Face Spaces
  • Docker Container
  • Git Version Control

Testing & QA

  • Python Requests
  • Custom Test Suite
  • Load Testing Scripts
  • Concurrent Testing

📊 Technical Specifications

Component | Specification | Purpose
LLM Model | TinyLlama 1.1B (Q4_K_M quantized) | General chat, context-aware responses
Model Size | ~500 MB (GGUF format) | Optimized for CPU inference
Forecasting Algorithm | ARIMA (AutoRegressive Integrated Moving Average) | Time-series demand prediction
Intent Recognition | Rule-based pattern matching + Regex | Fast, deterministic intent parsing
API Protocol | REST (JSON) | Standard HTTP communication
Data Format | CSV (UTF-8) | Simple, portable data storage
PDF Generation | FPDF Library | Report generation
Response Time | < 2 s (rule-based), < 15 s (LLM) | User experience optimization
Concurrent Users | 5-10 (free tier), 50+ (paid) | Scalability
Memory Requirement | ~2 GB RAM (minimum) | Model + application overhead

🔄 Complete Process Flow

User Input

Component: index.html (Frontend)

Action: User types query in chat interface

Technology: JavaScript event listener (keypress/click)

userInput.addEventListener('keypress', function (e) {
  // Send the message when the user presses Enter
  if (e.key === 'Enter') sendMessage();
});

API Request

Component: JavaScript Fetch API

Action: POST request to /api/chat endpoint

Payload: JSON with user message

fetch('/api/chat', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({message: text})
});

Intent Parsing

Component: intent_parser.py

Action: Analyze query using regex patterns

Output: Intent type + extracted entities (item_code, quantity, etc.)

parsed = parser.parse(message)
# Returns:
# {
#   "intent": "demand_forecast",
#   "item_code": "BMS0015",
#   "horizon_days": 30
# }

Intent Routing

Component: main.py (FastAPI)

Action: Route to appropriate handler based on intent

Intents: demand_forecast, item_details, check_inventory, supplier_info, create_requisition, system_status, generate_report, general_chat
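
The routing step can be sketched as a dispatch table that maps each intent name to a handler and falls back to general_chat for anything unrecognized. This is a minimal illustration, not the project's actual main.py: the handler names and their return values are hypothetical, and only two of the eight intents are registered.

```python
from typing import Callable, Dict

# Hypothetical handlers; real ones would call forecasting.py, data_loader.py, etc.
def handle_forecast(parsed: dict) -> dict:
    return {
        "intent": "demand_forecast",
        "answer": f"Forecast requested for {parsed.get('item_code')}",
    }

def handle_general_chat(parsed: dict) -> dict:
    return {"intent": "general_chat", "answer": "Routed to the LLM engine"}

# Intent name -> handler. The remaining intents would be registered the same way.
HANDLERS: Dict[str, Callable[[dict], dict]] = {
    "demand_forecast": handle_forecast,
    "general_chat": handle_general_chat,
}

def route(parsed: dict) -> dict:
    """Pick a handler by intent; unknown intents fall back to general chat."""
    handler = HANDLERS.get(parsed.get("intent"), handle_general_chat)
    return handler(parsed)
```

A lookup table keeps the endpoint body short and makes adding an intent a one-line change, at the cost of requiring every handler to share the same signature.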

Business Logic Execution

Components: Multiple modules based on intent

Actions:

  • Forecasting: forecasting.py → ARIMA model → Predict demand
  • Data Retrieval: data_loader.py → Load CSV → Query data
  • LLM Response: llm_engine.py → TinyLlama → Generate text
  • PDF Generation: pdf_generator.py → FPDF → Create report

Response Formatting

Component: main.py

Action: Format response as JSON

Structure: {intent, answer, forecast (optional), pdf_link (optional)}

return {
    "intent": "demand_forecast",
    "answer": "Forecast for BMS0015...",
    "forecast": [{"date": "2025-01-01", "qty": 150}, ...]
}
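
One way to keep the optional fields optional is to build the dictionary in a small helper that adds forecast and pdf_link only when they are present. The helper name build_response is an assumption for illustration; the field names come from the structure listed above.

```python
import json
from typing import List, Optional

def build_response(
    intent: str,
    answer: str,
    forecast: Optional[List[dict]] = None,
    pdf_link: Optional[str] = None,
) -> dict:
    """Assemble the chat response, omitting optional keys that are not set."""
    response = {"intent": intent, "answer": answer}
    if forecast is not None:
        response["forecast"] = forecast
    if pdf_link is not None:
        response["pdf_link"] = pdf_link
    return response

# Example: a forecast response serialized the way the frontend would receive it.
payload = build_response(
    "demand_forecast",
    "Forecast for BMS0015...",
    forecast=[{"date": "2025-01-01", "qty": 150}],
)
body = json.dumps(payload)
```

FastAPI serializes a returned dict to JSON on its own, so the explicit json.dumps call here only shows what goes over the wire.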

Frontend Rendering

Component: index.html (JavaScript)

Action: Parse JSON, render message bubble, display tables/charts

Features: HTML sanitization, table rendering, download buttons

// Append the forecast table before rendering so it appears inside the message bubble
if (data.forecast) {
  botHtml += renderForecastTable(data.forecast);
}
addMessage(botHtml, 'bot', true);

User Interaction

Component: Frontend UI

Action: Display response with icons, animations, and interactive elements

Features: Smooth scrolling, copy text, download PDFs, click links

🔧 Component Breakdown

1. Intent Parser (intent_parser.py)

# Pattern Matching Examples:
- "forecast" → demand_forecast
- "inventory|stock" → check_inventory
- "supplier|vendor" → supplier_info
- "order|buy|purchase" → create_requisition
- Item Code Extraction: r'\b([a-z]{3}\d{4})\b' (case-insensitive; matches codes such as BMS0015)
- Quantity Extraction: r'\b(\d+)\s*(?:units|pcs)?'
- Horizon Parsing: "next month" → 30 days
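
The patterns above can be combined into a small parser along the following lines. This is a sketch under several assumptions rather than the contents of intent_parser.py: the query is lowercased before matching, the first matching intent pattern wins, item codes are taken to be three letters plus four digits (as in the example BMS0015), and unmatched queries fall back to general_chat.

```python
import re

# Checked in order; the first pattern that matches decides the intent (assumption).
INTENT_PATTERNS = [
    ("demand_forecast", re.compile(r"forecast")),
    ("check_inventory", re.compile(r"inventory|stock")),
    ("supplier_info", re.compile(r"supplier|vendor")),
    ("create_requisition", re.compile(r"order|buy|purchase")),
]
ITEM_CODE = re.compile(r"\b([a-z]{3}\d{4})\b")      # e.g. bms0015 after lowercasing
QUANTITY = re.compile(r"\b(\d+)\s*(?:units|pcs)?")  # digits fused to letters are skipped by \b
HORIZON_PHRASES = {"next month": 30}                # further phrases could be added here

def parse(message: str) -> dict:
    """Return the intent plus any entities found in the message."""
    text = message.lower()
    intent = "general_chat"
    for name, pattern in INTENT_PATTERNS:
        if pattern.search(text):
            intent = name
            break
    result = {"intent": intent}

    code = ITEM_CODE.search(text)
    if code:
        result["item_code"] = code.group(1).upper()

    qty = QUANTITY.search(text)
    if qty:
        result["quantity"] = int(qty.group(1))

    for phrase, days in HORIZON_PHRASES.items():
        if phrase in text:
            result["horizon_days"] = days
            break
    return result
```

For example, `parse("Forecast BMS0015 for next month")` yields the demand_forecast intent with item_code "BMS0015" and horizon_days 30, matching the structure shown in the Intent Parsing step.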

2. Data Loader (data_loader.py)

# Data Sources:
- items.csv: Product catalog (item_code, description, price)
- demand_history.csv: Historical sales data
- inventory.csv: Stock levels by location
- suppliers.csv: Supplier information
- requisitions.csv: Order tracking

# Methods:
- load_data(): Initialize all DataFrames
- get_item_details(item_code): Retrieve product info
- get_inventory(item_code): Check stock levels
- create_requisition(item_code, qty): Generate order
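
To illustrate the load-once, query-in-memory pattern, the sketch below reads inventory rows into a dictionary and answers a get_inventory lookup from it. It deliberately uses the standard-library csv module instead of Pandas to stay dependency-free, and the column names (location, qty_on_hand) and sample rows are made up for the example, since the source does not list the columns of inventory.csv.

```python
import csv
import io
from collections import defaultdict

def load_inventory(csv_file) -> dict:
    """Read inventory rows into {item_code: {location: quantity}}."""
    stock = defaultdict(dict)
    for row in csv.DictReader(csv_file):
        stock[row["item_code"]][row["location"]] = int(row["qty_on_hand"])
    return dict(stock)

def get_inventory(stock: dict, item_code: str) -> dict:
    """Summarize stock for one item across all locations."""
    code = item_code.upper()
    by_location = stock.get(code, {})
    return {
        "item_code": code,
        "total": sum(by_location.values()),
        "by_location": by_location,
    }

# Made-up sample rows standing in for inventory.csv.
SAMPLE = (
    "item_code,location,qty_on_hand\n"
    "BMS0015,WH1,120\n"
    "BMS0015,WH2,30\n"
    "BMS0016,WH1,5\n"
)
stock = load_inventory(io.StringIO(SAMPLE))
summary = get_inventory(stock, "bms0015")
```

With Pandas the same lookup would typically be a filter on the item_code column followed by a sum; the dictionary version shows the shape of the cached data without requiring the library.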

3. Forecasting Engine (forecasting.py)

# ARIMA Process:
1. Load historical demand data
2. Aggregate by date
3. Fit ARIMA model (order=(1,1,1))
4. Generate forecast for N days
5. Return predictions with dates

# Output Format:
[
  {"date": "2025-01-01", "qty": 150},
  {"date": "2025-01-02", "qty": 148},
  ...
]
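
The five steps can be sketched as follows. The function names, the (date, quantity) tuple input, and the rounding and clamping of predictions to non-negative integers are assumptions made for illustration; the sketch also assumes one observation per day with no gaps. The ARIMA call uses the statsmodels API and is imported lazily, so the surrounding pipeline can be exercised with a stand-in predictor.

```python
from collections import defaultdict
from datetime import date, timedelta

def aggregate_by_date(rows):
    """Sum quantities per ISO date string; returns (sorted_dates, totals)."""
    totals = defaultdict(float)
    for day, qty in rows:
        totals[day] += qty
    days = sorted(totals)  # ISO dates sort chronologically as strings
    return days, [totals[d] for d in days]

def arima_predict(series, steps):
    """Fit ARIMA(1,1,1) on the series and forecast `steps` periods ahead."""
    from statsmodels.tsa.arima.model import ARIMA  # requires statsmodels
    fitted = ARIMA(series, order=(1, 1, 1)).fit()
    return list(fitted.forecast(steps=steps))

def forecast_demand(rows, horizon_days, predict=arima_predict):
    """Aggregate history, predict N days, and attach a date to each value."""
    days, series = aggregate_by_date(rows)
    last_day = date.fromisoformat(days[-1])
    predictions = predict(series, horizon_days)
    return [
        {
            "date": (last_day + timedelta(days=i)).isoformat(),
            # Rounded and clamped at zero (assumption: demand cannot be negative)
            "qty": max(0, int(round(float(q)))),
        }
        for i, q in enumerate(predictions, start=1)
    ]
```

Passing the predictor as a parameter is a design choice for testability: the date handling and output formatting can be checked without fitting a model.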

4. LLM Engine (llm_engine.py)

# Model Configuration:
- Model: TinyLlama-1.1B-Chat-v1.0 (GGUF)
- Context Length: 2048 tokens
- Temperature: 0.1 (near-deterministic)
- Max Tokens: 150
- Stop Tokens: ["</s>", "User:"]
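
The settings above might be carried in a small configuration object and passed to the model at generation time, as in the sketch below. Several things here are assumptions: the GenerationConfig and generate names, the "User: ... Assistant:" prompt layout (suggested only by the "User:" stop token), and the first stop token being the end-of-sequence marker `</s>`. In ctransformers a loaded model is invoked as a callable with keyword arguments such as max_new_tokens, temperature, and stop, while the context length is set when the model is loaded, so `llm` is treated as any callable with that shape.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class GenerationConfig:
    context_length: int = 2048          # applied when the model is loaded
    temperature: float = 0.1            # low value keeps output near-deterministic
    max_new_tokens: int = 150
    stop: Tuple[str, ...] = ("</s>", "User:")  # "</s>" is an assumption

def build_prompt(user_message: str) -> str:
    """Hypothetical prompt layout; the real template is not given in the source."""
    return f"User: {user_message}\nAssistant:"

def generate(llm: Callable[..., str], user_message: str,
             cfg: GenerationConfig = GenerationConfig()) -> str:
    """Call the model with the configured sampling settings and trim the reply."""
    text = llm(
        build_prompt(user_message),
        max_new_tokens=cfg.max_new_tokens,
        temperature=cfg.temperature,
        stop=list(cfg.stop),
    )
    return text.strip()
```

Stopping on "User:" prevents a small chat model from continuing the dialogue on the user's behalf, which is a common failure mode when the prompt follows a turn-based layout.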