Starburst15 committed on
Commit
8d8d767
·
verified ·
1 Parent(s): 44caf76

Update README.md

Files changed (1)
  1. README.md +60 -31
README.md CHANGED
@@ -1,8 +1,8 @@
  ---
- title: "USTP Student Handbook Assistant"
- emoji: "📘"
- colorFrom: "purple"
- colorTo: "indigo"
  sdk: "streamlit"
  sdk_version: "1.39.0"
  app_file: src/streamlit_app.py
@@ -10,53 +10,82 @@ pinned: false
  license: "mit"
  ---

- # 📘 USTP Student Handbook Assistant (2023 Edition)

- This Streamlit app lets students, faculty, and staff **ask questions about the USTP Student Handbook (2023 Edition)** and get **accurate, page-referenced answers** directly from the document, powered by **FAISS**, **Sentence Transformers**, and **open-source LLMs** such as Mistral, Mixtral, and Qwen.

  ---

  ## 🚀 Features
- Reads and indexes the *USTP Student Handbook 2023 Edition* PDF
- Fast semantic search with a **FAISS vector database**
- Accurate citation with **printed page numbers**, not raw PDF indices
- Choose between **multiple open-source models** (Mistral, Mixtral, Qwen, etc.)
- Offline-safe: works even without API tokens
- Automatic local embedding with **MiniLM** for fast responses
- ✅ Caches index for instant re-use

  ---

  ## 🧠 LLM Integration (Optional)
- You can enhance the assistant’s responses with the **Hugging Face Inference API** or run it completely **offline** using local models.

  ### 🔑 To configure:
- 1. Create a `.env` file in the app root directory.
- 2. Add your Hugging Face token (optional): `HF_TOKEN = your_huggingface_token`
- 3. Save the file and **restart the app**.

- > 💡 If you don’t provide a token, the app will automatically use a **local SentenceTransformer model** for embeddings.

  ---

  ## 🛠️ Deployment Notes
  - **Runtime:** Python SDK
  - **SDK:** Streamlit
- - **App file:** `src/streamlit_app.py`
- - **PDF file:** Must be named `USTP Student Handbook 2023 Edition.pdf` and placed in the same directory.
- - **Recommended visibility:** **Public** (for demo and student access)
- - **Supported models:**
-   - `mistralai/Mistral-7B-Instruct-v0.3`
-   - `mistralai/Mixtral-8x7B-Instruct-v0.1`
-   - `Qwen/Qwen2.5-14B-Instruct`

  ---

  ## ⚙️ Troubleshooting

- ### ⚠️ “Permission denied: '/.streamlit'”
- If deploying in a restricted environment:
- - Set the working directory to a writable path (e.g., `/home/appuser/app`).
- - Or run:
- ```bash
- mkdir -p ~/.streamlit
 
  ---
+ title: "Data Analysis App"
+ emoji: "📊"
+ colorFrom: "indigo"
+ colorTo: "blue"
  sdk: "streamlit"
  sdk_version: "1.39.0"
  app_file: src/streamlit_app.py

  license: "mit"
  ---

+ # 📊 Streamlit Data Analysis App (Gemini + Open-Source)

+ This Streamlit app lets you **upload CSV or Excel datasets**, automatically clean and preprocess them, create **quick visualizations**, and even get **AI-generated insights** powered by Gemini or open-source models.

  ---

  ## 🚀 Features
+ Upload `.csv` or `.xlsx` datasets
+ Automatic data cleaning & standardization
+ Preprocessing pipeline (imputation, encoding, scaling)
+ Quick visualizations (histogram, boxplot, correlation heatmap, etc.)
+ Smart dataset summary and preview
+ Optional **Gemini AI insights** for dataset interpretation

  ---

  ## 🧠 LLM Integration (Optional)
+ You can enable AI-generated insights with **Gemini 2.0 Flash** or your own Hugging Face model.

  ### 🔑 To configure:
+ 1. Go to your Space’s **Settings → Secrets** tab.
+ 2. Add the following: `GEMINI_API_KEY = your_gemini_api_key`
+ `HF_TOKEN = your_huggingface_token` (optional)
+ 3. Save, then **restart your Space**.

+ If you don’t add an API key, the app will still work for data cleaning and visualization.

  ---

  ## 🛠️ Deployment Notes
  - **Runtime:** Python SDK
  - **SDK:** Streamlit
+ - **File formats supported:** `.csv`, `.xlsx`
+ - **Maximum file size:** 100 MB
+ - **Recommended visibility:** Public (for full file upload support)

  ---

  ## ⚙️ Troubleshooting

+ ### AxiosError: Request failed with status code 403
+ If you encounter this:
+ - Ensure your Space is **Public** (not Private).
+ - Ensure `sdk: streamlit` and `app_file:` are correctly declared in the YAML metadata above.
+ - Check that your **runtime** is “Python SDK”.
+ - Recheck your **Gemini API Key** or token secrets.
+
+ ### ✅ Fix Checklist
+ | Issue | Fix |
+ |-------|------|
+ | App fails to start | Verify `app_file` matches your actual Python filename |
+ | 403 Error | Make the Space public |
+ | API not found | Add key to **Settings → Secrets** |
+ | File upload broken | Ensure `sdk: streamlit` and `runtime: python` |
+
+ ---
+
+ ## 💡 Example Workflow
+ 1. Upload your dataset (e.g., `global_freelancers_raw.csv`).
+ 2. View the raw preview and cleaned data table.
+ 3. Generate preprocessing pipelines (e.g., median imputation + one-hot encoding).
+ 4. Visualize trends with histograms, boxplots, or heatmaps.
+ 5. (Optional) Ask Gemini for AI insights about correlations, patterns, or recommendations.
+
+ ---
+
+ ## 🧩 Tech Stack
+ - **Frontend:** Streamlit
+ - **Backend:** Python (Pandas, NumPy, Scikit-learn)
+ - **AI Models:** Gemini 2.0 Flash / open-source LLMs (Qwen, Mistral, etc.)
+ - **Visualization:** Matplotlib, Seaborn
+
+ ---
+
+ ## 🧾 License
+ MIT License © 2025
+ You are free to use, modify, and share this app with attribution.
+
+ ---
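
Note on the optional-key behaviour in the updated README: the LLM Integration section says AI insights are only available when a key is configured, while cleaning and visualization work either way. A minimal sketch of how an app might check this at startup is shown below. It is not taken from `src/streamlit_app.py`: the function name `resolve_insight_backend`, the returned labels, and the choice to prefer Gemini over the Hugging Face token are all assumptions made for illustration. Only the secret names `GEMINI_API_KEY` and `HF_TOKEN` come from the README, and the sketch relies on Space secrets being exposed to the app as environment variables.

```python
import os
from typing import Mapping, Optional


def resolve_insight_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick a (hypothetical) insight backend based on which secrets are set.

    Returns "gemini", "huggingface", or "none". The labels are illustrative
    and are not identifiers used by the actual app.
    """
    # Default to the process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env

    # Treat empty or whitespace-only values as missing.
    if env.get("GEMINI_API_KEY", "").strip():
        return "gemini"  # assumed priority: Gemini first
    if env.get("HF_TOKEN", "").strip():
        return "huggingface"

    # No key configured: cleaning and visualization still work,
    # only the AI insights step would be skipped.
    return "none"


if __name__ == "__main__":
    print(f"AI insights backend: {resolve_insight_backend()}")
```

Passing the environment in as a mapping keeps the check easy to exercise without touching real secrets, for example `resolve_insight_backend({"HF_TOKEN": "hf_example"})`.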