SmartManuals-AI for Hugging Face Spaces
SmartManuals-AI is a local-first document QA system that uses RAG (retrieval-augmented generation), OCR, and embedding search to answer technical questions from PDFs and Word documents.
Features
- Ask natural-language questions about your manuals
- Handles both PDF and Word (.docx) files
- Uses semantic search with sentence-transformers
- ChromaDB for fast local vector indexing
- Answers generated by Meta LLaMA 3.1 8B Instruct (default)
- Gradio dashboard for interaction
Folder Structure
SmartManuals-AI/
├── app.py # Hugging Face Spaces main app
├── Manuals/ # Upload your PDF and Word manuals here
│   ├── OM_Treadmill.pdf
│   └── Parts_Bike.docx
├── chroma_store/ # ChromaDB vector DB (auto-generated)
├── requirements.txt # Dependencies
└── README.md # This file
Usage in Hugging Face Spaces
Environment Variables
Add your Hugging Face token as a secret:
HF_TOKEN: Your Hugging Face access token (required for gated models)
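Spaces exposes secrets to the running app as environment variables. The following is a minimal sketch of how the token might be read at startup, assuming app.py takes it from the environment; the helper name `get_hf_token` is hypothetical and not taken from the app's source.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN secret, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    The function name is illustrative, not part of SmartManuals-AI.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; add it as a secret in the Space settings."
        )
    return token
```

Failing early with an explicit message tends to be easier to debug than a later authorization error when the gated model is first downloaded.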
Upload Your Files
Put all your manuals (PDF and Word .docx) into the Manuals/ folder.
App Behavior
- On startup:
  - Extracts text (with OCR fallback) from PDFs
  - Extracts clean text from Word documents
  - Chunks and embeds content into ChromaDB
- During inference:
  - Retrieves semantically relevant chunks
  - Sends them to LLaMA 3.1 Instruct for answer generation
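The chunk, retrieve, and prompt steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the app's actual code: `toy_embed` is a bag-of-words stand-in for the sentence-transformers embeddings, the in-memory ranking stands in for a ChromaDB query, and every function name, the chunk sizes, and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into windows of `chunk_size` words that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=3):
    """Return the `top_k` chunks most similar to the question."""
    query = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, toy_embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Assemble retrieved chunks and the question into a single prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Overlapping windows reduce the chance that a sentence relevant to the question is split across two chunks and retrieved only in part.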
No User Upload
This app is designed to work without file uploads. All processing is done on preloaded files in the Manuals/ directory.
Default Model
- Uses meta-llama/Llama-3.1-8B-Instruct
- All question answering is fully automatic
- The user does not need to pick a model, document type, or filter; the system decides based on the question and content.
Supported File Types
- .pdf (with OCR for scanned pages)
- .docx (via python-docx)
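As a rough sketch of how the preloaded files could be discovered, the snippet below walks the Manuals/ folder and keeps only the two supported extensions. The function name `find_manuals` and the case-insensitive extension match are assumptions for illustration, not details confirmed by the app's source.

```python
from pathlib import Path

# Extensions listed as supported in this README.
SUPPORTED_EXTENSIONS = {".pdf", ".docx"}


def find_manuals(folder="Manuals"):
    """Return supported manual files under `folder`, sorted by path.

    Returns an empty list when the folder does not exist.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Each returned path could then be routed by suffix to the PDF or Word extraction step.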
Local Development
Install dependencies:
pip install -r requirements.txt
Run locally:
python app.py
Project by: Damilare Eniolabi
GitHub: @damoojeje
Tags
RAG LLM Chroma OCR PDF Word Gradio HuggingFace SmartManualsAI