prithivMLmods committed on
Commit 8d89c9f · verified · 1 Parent(s): 6b82aed

Update README.md

  1. README.md +94 -1
README.md CHANGED
@@ -11,6 +11,16 @@ tags:
11
  - open-scene
12
  ---
13
 
14
  ```
15
  Classification Report:
16
  precision recall f1-score support
@@ -27,4 +37,87 @@ Classification Report:
27
  weighted avg 0.9706 0.9706 0.9706 16345
28
  ```
29
 
30
- ![download](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/gJUwvNsxBQAh30FprXlyP.png)
11
  - open-scene
12
  ---
13
 
14
+ ![1](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/Cwn-cWX3RDmAhywdLgocX.png)
15
+
16
+ # **MetaCLIP-2-Open-Scene**
17
+
18
+ > **MetaCLIP-2-Open-Scene** is a vision-language encoder model fine-tuned from **[facebook/metaclip-2-worldwide-s16](https://huggingface.co/facebook/metaclip-2-worldwide-s16)** for single-label image classification.
19
+ > It is designed to identify and categorize various natural and urban scenes using the **MetaClip2ForImageClassification** architecture.
20
+
21
+ > [!note]
22
+ > MetaCLIP 2: A Worldwide Scaling Recipe: https://huggingface.co/papers/2507.22062
23
+
24
  ```
25
  Classification Report:
26
  precision recall f1-score support
 
37
  weighted avg 0.9706 0.9706 0.9706 16345
38
  ```
39
 
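The `weighted avg` row of a classification report is the support-weighted mean of the per-class scores. The sketch below shows that arithmetic only. The per-class values and supports in the usage lines are hypothetical placeholders, not this model's results.

```python
from typing import Sequence


def weighted_average(values: Sequence[float], supports: Sequence[int]) -> float:
    """Support-weighted mean of per-class scores (precision, recall, or F1)."""
    if len(values) != len(supports):
        raise ValueError("values and supports must have the same length")
    total = sum(supports)
    if total == 0:
        raise ValueError("total support must be positive")
    return sum(v * s for v, s in zip(values, supports)) / total


# Hypothetical per-class F1 scores and supports, for illustration only.
example_f1 = [0.90, 1.00]
example_support = [1, 3]
example_weighted_f1 = weighted_average(example_f1, example_support)
```

Classes with more samples pull the weighted average toward their own score, which is why it can differ from the unweighted macro average.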
40
+ ![download](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/gJUwvNsxBQAh30FprXlyP.png)
41
+
42
+ The model classifies images into six open-scene categories:
43
+
44
+ * **Class 0:** "buildings"
45
+ * **Class 1:** "forest"
46
+ * **Class 2:** "glacier"
47
+ * **Class 3:** "mountain"
48
+ * **Class 4:** "sea"
49
+ * **Class 5:** "street"
50
+
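As a rough illustration of how the class indices above map to a prediction, the following sketch turns a list of six raw logits into a label and a probability using only the standard library. The `ID2LABEL` table copies the list above; the helper names `softmax` and `predict_label` are hypothetical and not part of the model's API.

```python
import math
from typing import List, Tuple

# Class index -> label, copied from the list above.
ID2LABEL = {
    0: "buildings",
    1: "forest",
    2: "glacier",
    3: "mountain",
    4: "sea",
    5: "street",
}


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits: List[float]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per class")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]
```

Because the task is single-label, only the highest-probability class is reported as the prediction.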
51
+ # **Run with Transformers**
52
+
53
+ ```python
54
+ !pip install -q transformers torch pillow gradio
55
+ ```
56
+
+ ```python
+ import gradio as gr
+ import torch
+ from PIL import Image
+ from transformers import AutoImageProcessor, AutoModelForImageClassification
+
+ # Load model and processor
+ model_name = "prithivMLmods/MetaCLIP-2-Open-Scene"
+ model = AutoModelForImageClassification.from_pretrained(model_name)
+ processor = AutoImageProcessor.from_pretrained(model_name)
+
+ def scene_classification(image):
+     """Predicts the type of scene represented in an image."""
+     # Gradio passes the upload as a NumPy array; convert it to an RGB PIL image
+     image = Image.fromarray(image).convert("RGB")
+     inputs = processor(images=image, return_tensors="pt")
+
+     # Inference only, so gradients are not needed
+     with torch.no_grad():
+         outputs = model(**inputs)
+         logits = outputs.logits
+         probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+
+     labels = {
+         "0": "buildings",
+         "1": "forest",
+         "2": "glacier",
+         "3": "mountain",
+         "4": "sea",
+         "5": "street"
+     }
+     # Map each class name to its rounded probability
+     predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
+
+     return predictions
+
+ # Create Gradio interface
+ iface = gr.Interface(
+     fn=scene_classification,
+     inputs=gr.Image(type="numpy"),
+     outputs=gr.Label(label="Prediction Scores"),
+     title="Open Scene Classification",
+     description="Upload an image to classify the scene type (e.g., forest, sea, street, or mountain)."
+ )
+
+ # Launch the app
+ if __name__ == "__main__":
+     iface.launch()
+ ```
105
+
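For scripted use without the Gradio UI, a minimal sketch along the same lines might look like the following. The function names `classify_file` and `rank_scores` are hypothetical, and the code assumes the checkpoint's config exposes the class names through `id2label`; if it does not, the hard-coded label table from the example above could be used instead.

```python
from typing import Dict, List, Tuple


def rank_scores(scores: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (label, score) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


def classify_file(
    path: str, model_name: str = "prithivMLmods/MetaCLIP-2-Open-Scene"
) -> Dict[str, float]:
    """Classify one image file and return a label -> probability mapping."""
    # Heavy imports stay local so rank_scores can be used without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    model = AutoModelForImageClassification.from_pretrained(model_name)
    processor = AutoImageProcessor.from_pretrained(model_name)

    image = Image.open(path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1).squeeze(0).tolist()

    # Assumption: the checkpoint's config carries the class names in id2label.
    id2label = model.config.id2label
    return {id2label[i]: round(p, 3) for i, p in enumerate(probs)}
```

A call such as `rank_scores(classify_file("photo.jpg"), k=2)` would then return the two most likely scene labels, where `photo.jpg` stands in for any local image path.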
106
+ # **Sample Inference:**
107
+
108
+ ![Screenshot 2025-11-13 at 19-39-55 Open Scene Classification](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/9vHPyQsv3FeduOU1s4A6r.png)
109
+ ![Screenshot 2025-11-13 at 19-37-07 Open Scene Classification](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/H_5f4qIB5XYQuOyeNGh_P.png)
110
+ ![Screenshot 2025-11-13 at 19-37-50 Open Scene Classification](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/KM0hBDUFF5-zU9tCXoU1w.png)
111
+ ![Screenshot 2025-11-13 at 19-38-37 Open Scene Classification](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/ErSRLvFr3fLGUTO86klOT.png)
112
+ ![Screenshot 2025-11-13 at 19-39-24 Open Scene Classification](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/_NYjjKZXQx8fJGOyjhY4r.png)
113
+
114
+ # **Intended Use:**
115
+
116
+ The **MetaCLIP-2-Open-Scene** model classifies images of natural and urban environments into the six scene categories listed above.
117
+ Potential use cases include:
118
+
119
+ * **Geographical Image Analysis:** Categorizing landscapes for environmental and mapping studies.
120
+ * **Tourism and Travel Applications:** Automatically tagging scenic photos for organization and recommendations.
121
+ * **Autonomous Systems:** Supporting navigation and perception in robotics and self-driving systems.
122
+ * **Environmental Monitoring:** Detecting and classifying geographic features for research.
123
+ * **Media and Photography:** Assisting in photo organization and content-based retrieval.
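
The photo-organization use case could be sketched as grouping files by their top predicted scene. The helper `group_by_scene`, the `min_score` cutoff, and the `"unsorted"` bucket are illustrative assumptions, and the scores in the usage lines are made-up placeholders rather than real model output.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_scene(
    predictions: Dict[str, Dict[str, float]], min_score: float = 0.6
) -> Dict[str, List[str]]:
    """Group filenames by top-scoring label; low-confidence files go to 'unsorted'."""
    albums: Dict[str, List[str]] = defaultdict(list)
    for filename, scores in predictions.items():
        label, score = max(scores.items(), key=lambda item: item[1])
        albums[label if score >= min_score else "unsorted"].append(filename)
    return dict(albums)


# Hypothetical per-file scores, shaped like the output of scene_classification.
example_predictions = {
    "a.jpg": {"sea": 0.91, "glacier": 0.05},
    "b.jpg": {"forest": 0.55, "mountain": 0.40},
    "c.jpg": {"sea": 0.80, "street": 0.10},
}
example_albums = group_by_scene(example_predictions)
```

Routing low-confidence predictions to a separate bucket keeps ambiguous photos out of the per-scene albums until someone reviews them.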