samwell committed
Commit c479808 · verified · 1 Parent(s): 151f9cd

Update to multi-task model card (segmentation + classification + regression)

Files changed (1):
  1. README.md +198 -85

README.md CHANGED
@@ -7,10 +7,15 @@ tags:
  - ultrasound
  - obstetrics
  - segmentation
  - intrapartum
  - onnx
  - medsiglip
  - siglip
  datasets:
  - custom
  language:
@@ -18,87 +23,185 @@ language:
  metrics:
  - iou
  - dice
  base_model:
  - google/medsiglip-448
  ---

  # LaborView MedSigLIP

- **AI-powered intrapartum ultrasound segmentation for labor monitoring**

  ## Model Description

- LaborView MedSigLIP is a fine-tuned vision model for segmenting anatomical structures in transperineal ultrasound images during labor. It identifies:
- - **Pubic Symphysis** - The cartilaginous joint at the front of the pelvis
- - **Fetal Head** - The presenting part of the fetus during delivery
-
- These segmentations enable automated calculation of clinical measurements used to assess labor progress.

  ### Architecture

- - **Encoder**: [MedSigLIP](https://huggingface.co/google/medsiglip-448) (SigLIP-SO400M pretrained on medical images)
- - **Decoder**: Custom FPN-style segmentation decoder with transposed convolutions
- - **Input Resolution**: 448x448 pixels
- - **Output**: 3-class segmentation mask (background, symphysis, head)

- ### Training
-
- - **Dataset**: [HAI-DEF Challenge Dataset](https://zenodo.org/records/17655183) - Transperineal ultrasound images with expert annotations
- - **Base Model**: `google/medsiglip-448` (1152-dim hidden, ~400M params)
  - **Training Strategy**:
-   - Frozen encoder for first 3 epochs (decoder warmup)
-   - Full fine-tuning with gradient checkpointing
-   - OneCycleLR scheduler with 5e-5 max LR
- - **Loss**: Dice Loss + Cross-Entropy
  - **Augmentation**: HorizontalFlip, RandomBrightnessContrast, GaussNoise, ShiftScaleRotate

  ## Intended Use

- ### Primary Use Case
-
- Clinical decision support for labor monitoring using transperineal ultrasound. The model segments anatomical structures to enable automated measurement of:
-
- - **Angle of Progression (AoP)**: Angle between symphysis axis and fetal head descent line
- - **Head-Symphysis Distance (HSD)**: Distance from symphysis to fetal head

- ### Clinical Context
-
- | AoP Range | Interpretation |
- |-----------|----------------|
- | < 110° | Early labor - head not engaged |
- | 110-120° | Active labor - head descending |
- | 120-140° | Advanced labor - good progress |
- | > 140° | Late labor - delivery imminent |

  ### Users
  - Obstetric ultrasound software developers
  - Medical device manufacturers
  - Clinical researchers in maternal-fetal medicine
  - Healthcare AI developers

  ### Out of Scope
  - Direct clinical diagnosis without physician oversight
  - Replacement for clinical judgment
- - Use on non-transperineal ultrasound views
- - Fetal anomaly detection

  ## How to Use

- ### With Transformers (PyTorch)

  ```python
  import torch
- from transformers import AutoModel
- from PIL import Image
-
- # Load the base encoder
- encoder = AutoModel.from_pretrained("google/medsiglip-448", trust_remote_code=True)
-
- # Load fine-tuned weights
  checkpoint = torch.load("best.pt", map_location="cpu")
- # Note: Requires custom decoder architecture - see training script
  ```

- ### With ONNX Runtime

  ```python
  import onnxruntime as ort
@@ -106,36 +209,46 @@ import numpy as np
  from PIL import Image

  # Load model
- session = ort.InferenceSession("laborview_seg.onnx")

- # Preprocess image
  image = Image.open("ultrasound.png").convert("RGB").resize((448, 448))
- image_np = np.array(image).astype(np.float32) / 255.0
- image_np = (image_np - 0.5) / 0.5  # MedSigLIP normalization
- image_np = image_np.transpose(2, 0, 1)[np.newaxis, ...]

- # Run inference
- outputs = session.run(None, {"pixel_values": image_np})
- segmentation_mask = outputs[0].argmax(axis=1)[0]

- # Class mapping: 0=background, 1=symphysis, 2=head
  ```

- ### Computing Clinical Metrics

  ```python
  from clinical_metrics import compute_all_metrics

- # From segmentation mask
  metrics = compute_all_metrics(
-     segmentation_mask,
      symphysis_class=1,
      head_class=2
  )

- print(f"AoP: {metrics.aop:.1f}°")
- print(f"HSD: {metrics.hsd:.1f} pixels")
- print(f"Labor Progress: {metrics.labor_progress}")
  print(f"Recommendation: {metrics.recommendation}")
  ```
@@ -143,67 +256,67 @@ print(f"Recommendation: {metrics.recommendation}")

  | File | Description | Size |
  |------|-------------|------|
- | `best.pt` | PyTorch checkpoint (best validation) | ~1.6 GB |
- | `final.pt` | PyTorch checkpoint (final epoch) | ~1.6 GB |
- | `laborview_seg.onnx` | ONNX export for inference | ~1.6 GB |
  | `config.json` | Model configuration | 1 KB |

  ## Performance

- ### Segmentation Metrics
-
- | Metric | Value |
- |--------|-------|
- | Mean IoU | TBD |
- | Dice Score | TBD |
- | Pixel Accuracy | TBD |

  ### Inference Speed

  | Platform | Resolution | Latency |
  |----------|------------|---------|
- | NVIDIA A100 | 448x448 | ~15ms |
- | Apple M1 | 448x448 | ~50ms |
- | CPU (8 cores) | 448x448 | ~200ms |

  ## Limitations

- 1. **Training Data**: Trained on a specific ultrasound machine/protocol; may require fine-tuning for different equipment
- 2. **Population Bias**: Dataset demographics may not represent all patient populations
- 3. **Image Quality**: Performance degrades with poor image quality, shadows, or non-standard views
- 4. **Edge Cases**: May struggle with unusual fetal positions or anatomical variations
- 5. **Calibration**: Pixel measurements require calibration for conversion to physical units (mm)

  ## Ethical Considerations

- - **Not a Diagnostic Tool**: This model provides decision support only. All clinical decisions must be made by qualified healthcare providers.
- - **Validation Required**: Must be validated on local patient populations before clinical deployment
- - **Bias Monitoring**: Users should monitor for performance disparities across patient demographics
- - **Regulatory Compliance**: Medical device regulations (FDA, CE marking) apply for clinical use

  ## Citation

  ```bibtex
  @software{laborview_medsiglip_2024,
-   title = {LaborView MedSigLIP: AI-Powered Intrapartum Ultrasound Segmentation},
    author = {Samuel},
    year = {2024},
    url = {https://huggingface.co/samwell/laborview-medsiglip},
-   note = {Fine-tuned MedSigLIP for transperineal ultrasound segmentation}
  }
  ```

  ## Related Resources

- - [HAI-DEF Challenge](https://hai-def.org/) - Intrapartum ultrasound AI challenge
- - [MedSigLIP Paper](https://arxiv.org/abs/2303.15343) - Base model architecture
- - [laborview-ultrasound](https://huggingface.co/samwell/laborview-ultrasound) - Edge-optimized MobileViT variant
- - [Demo Space](https://huggingface.co/spaces/samwell/laborview-demo) - Try the model online

  ## License

- Apache 2.0 - See LICENSE file for details.
-
- ## Contact
-
- For questions, issues, or collaboration inquiries, please open an issue on the repository.
 
  - ultrasound
  - obstetrics
  - segmentation
+ - classification
+ - regression
+ - multi-task
  - intrapartum
+ - labor-monitoring
  - onnx
  - medsiglip
  - siglip
+ - clinical-ai
  datasets:
  - custom
  language:

  metrics:
  - iou
  - dice
+ - accuracy
+ - mae
  base_model:
  - google/medsiglip-448
  ---

  # LaborView MedSigLIP

+ **Multi-task AI model for intrapartum ultrasound analysis during labor**

  ## Model Description

+ LaborView MedSigLIP is a **multi-task vision model** for comprehensive analysis of transperineal ultrasound during labor. Unlike single-task segmentation models, it simultaneously performs:
+
+ | Task | Output | Description |
+ |------|--------|-------------|
+ | **Segmentation** | 3-class mask (H×W) | Pubic symphysis, fetal head, background |
+ | **Classification** | 6-class logits | Standard ultrasound plane detection |
+ | **Regression** | 2 values | Direct AoP and HSD predictions |
+
+ ### Why Multi-Task?
+
+ - **Efficiency**: Single forward pass for all outputs
+ - **Shared Features**: Tasks benefit from shared visual representations
+ - **Clinical Workflow**: Provides complete assessment, not just masks
+ - **Uncertainty Weighting**: Learned task weights balance losses automatically
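+ The uncertainty weighting named above follows the Kendall et al. scheme of learned per-task log-variances. The sketch below is a minimal illustration of that technique, not the repository's actual implementation; the task names `seg`/`cls`/`reg` and the zero initialization are assumptions.

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Balance per-task losses with learned log-variances (Kendall et al., 2018).

    total = sum_i exp(-s_i) * L_i + s_i, where s_i = log(sigma_i^2).
    """

    def __init__(self, tasks=("seg", "cls", "reg")):
        super().__init__()
        # One learnable log-variance per task, initialized to 0 (i.e. weight 1).
        self.log_vars = nn.ParameterDict(
            {t: nn.Parameter(torch.zeros(())) for t in tasks}
        )

    def forward(self, losses: dict) -> torch.Tensor:
        total = 0.0
        for task, loss in losses.items():
            s = self.log_vars[task]
            # Down-weight noisy tasks (large s) and regularize s itself.
            total = total + torch.exp(-s) * loss + s
        return total

criterion = UncertaintyWeightedLoss()
losses = {"seg": torch.tensor(0.8), "cls": torch.tensor(1.2), "reg": torch.tensor(0.3)}
total = criterion(losses)  # scalar tensor; log-vars are trained by the optimizer
```

+ With all log-variances at 0 the weights start equal; during training the optimizer raises `s` for tasks whose losses are noisy, shrinking their influence automatically.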

  ### Architecture

+ ```
+ Input Image (448×448 RGB)
+             │
+             ▼
+ ┌─────────────────────────┐
+ │        MedSigLIP        │  Vision encoder
+ │     (SigLIP-SO400M)     │  1152-dim features
+ │   google/medsiglip-448  │
+ └────────────┬────────────┘
+        ┌─────┴─────┐
+        ▼           ▼
+ ┌────────────┐ ┌────────────┐
+ │   Pooled   │ │  Sequence  │
+ │  Features  │ │  Features  │
+ │   (1152)   │ │  (N×1152)  │
+ └─────┬──────┘ └─────┬──────┘
+       ▼              ▼
+ ┌────────────┐ ┌────────────┐
+ │ Projector  │ │ Seg Decoder│
+ │   (512)    │ │ (FPN-style)│
+ └─────┬──────┘ └─────┬──────┘
+    ┌──┴──┐           ▼
+    ▼     ▼     ┌────────────┐
+ ┌────┐ ┌────┐  │  Seg Mask  │
+ │Cls │ │Reg │  │  (3×H×W)   │
+ │Head│ │Head│  └─────┬──────┘
+ └─┬──┘ └─┬──┘        ▼
+   ▼      ▼       Symphysis,
+ Plane   AoP,     Head Masks
+ Logits  HSD     (3×448×448)
+  (6)    (2)
+ ```

+ ### Model Outputs
+
+ ```python
+ from dataclasses import dataclass
+ from torch import Tensor
+
+ @dataclass
+ class LaborViewOutput:
+     plane_logits: Tensor  # (B, 6) - standard plane classification
+     seg_masks: Tensor     # (B, 3, H, W) - segmentation masks
+     labor_params: Tensor  # (B, 2) - [AoP degrees, HSD pixels]
+ ```

+ ## Training
+
+ - **Dataset**: [HAI-DEF Challenge](https://zenodo.org/records/17655183) - Transperineal ultrasound with expert annotations
+ - **Base Model**: `google/medsiglip-448` (1152-dim, ~400M encoder params)
+ - **Multi-Task Loss**: Uncertainty-weighted combination (Kendall et al.)
+   - Segmentation: Dice + Cross-Entropy
+   - Classification: Cross-Entropy
+   - Regression: Smooth L1
  - **Training Strategy**:
+   - Epochs 1-3: Frozen encoder (head warmup)
+   - Epochs 4+: Full fine-tuning with gradient checkpointing
+   - OneCycleLR scheduler, 5e-5 max LR
  - **Augmentation**: HorizontalFlip, RandomBrightnessContrast, GaussNoise, ShiftScaleRotate
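+ The staged schedule above (frozen-encoder warmup, then full fine-tuning under OneCycleLR) can be sketched as follows. The `nn.Linear` stand-ins and loop skeleton are illustrative assumptions; the actual training script may differ.

```python
import torch
import torch.nn as nn

# Stand-in modules; the real model pairs the MedSigLIP encoder with task heads.
model = nn.ModuleDict({
    "encoder": nn.Linear(16, 16),
    "heads": nn.Linear(16, 3),
})

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
epochs, steps_per_epoch = 10, 100
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=5e-5, epochs=epochs, steps_per_epoch=steps_per_epoch
)

for epoch in range(1, epochs + 1):
    # Epochs 1-3: freeze the encoder so only the heads warm up.
    freeze = epoch <= 3
    for p in model["encoder"].parameters():
        p.requires_grad = not freeze
    # Per-batch training would go here: forward, loss.backward(),
    # optimizer.step(), scheduler.step(), optimizer.zero_grad().
```

+ OneCycleLR ramps the learning rate up toward the 5e-5 peak and anneals it back down, so the unfrozen encoder never sees the full rate cold.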

  ## Intended Use

+ ### Primary Use Cases
+
+ 1. **Automated Labor Assessment**: Real-time analysis of labor progress
+ 2. **Clinical Decision Support**: AI-assisted measurements for clinicians
+ 3. **Training/Education**: Teaching tool for ultrasound interpretation
+ 4. **Research**: Standardized measurement extraction for studies
+
+ ### Output Interpretation
+
+ #### Segmentation Classes
+
+ | Class | ID | Color | Anatomical Structure |
+ |-------|----|-------|----------------------|
+ | Background | 0 | Transparent | Non-anatomical |
+ | Pubic Symphysis | 1 | Cyan | Pelvic joint landmark |
+ | Fetal Head | 2 | Magenta | Presenting fetal part |
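+ One way to render a predicted mask with the colors in the table above is a NumPy palette lookup. A sketch only: the exact RGBA values are illustrative, not mandated by the model.

```python
import numpy as np

# RGBA per class ID: 0=background (transparent), 1=symphysis (cyan), 2=head (magenta)
PALETTE = np.array([
    [0, 0, 0, 0],        # background
    [0, 255, 255, 160],  # pubic symphysis
    [255, 0, 255, 160],  # fetal head
], dtype=np.uint8)

def colorize_mask(mask: np.ndarray) -> np.ndarray:
    """Turn an (H, W) class-ID mask into an (H, W, 4) RGBA overlay."""
    return PALETTE[mask]

mask = np.zeros((448, 448), dtype=np.int64)
mask[100:200, 100:200] = 1  # pretend symphysis region
overlay = colorize_mask(mask)  # shape (448, 448, 4)
```

+ The resulting RGBA array can be alpha-composited over the grayscale ultrasound frame for display.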
+
+ #### Plane Classification
+
+ | Class | Description |
+ |-------|-------------|
+ | 0 | Transperineal (standard) |
+ | 1 | Transabdominal |
+ | 2 | Oblique |
+ | 3 | Sagittal |
+ | 4 | Axial |
+ | 5 | Other/Non-standard |
+
+ #### Labor Parameters
+
+ | Parameter | Range | Clinical Meaning |
+ |-----------|-------|------------------|
+ | **AoP** (Angle of Progression) | 90-160° | Head descent angle |
+ | **HSD** (Head-Symphysis Distance) | 0-100+ px | Head-to-symphysis distance |
+
+ **AoP Interpretation:**
+
+ | AoP | Stage | Status |
+ |-----|-------|--------|
+ | < 110° | Early labor | Head not engaged |
+ | 110-120° | Active labor | Descending |
+ | 120-140° | Advanced | Good progress |
+ | > 140° | Late labor | Delivery imminent |
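+ The AoP banding above can be encoded as a small helper for reporting. A sketch only: the handling of the exact boundaries (110°, 120°, 140°) is an assumption, since the table leaves the edges ambiguous.

```python
def aop_stage(aop_degrees: float) -> str:
    """Map an Angle of Progression value to the labor stage bands above."""
    if aop_degrees < 110:
        return "early labor - head not engaged"
    if aop_degrees <= 120:
        return "active labor - descending"
    if aop_degrees <= 140:
        return "advanced - good progress"
    return "late labor - delivery imminent"

print(aop_stage(105))  # early labor - head not engaged
print(aop_stage(132))  # advanced - good progress
```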

  ### Users
+
  - Obstetric ultrasound software developers
  - Medical device manufacturers
  - Clinical researchers in maternal-fetal medicine
  - Healthcare AI developers
+ - Medical education platforms

  ### Out of Scope
+
  - Direct clinical diagnosis without physician oversight
  - Replacement for clinical judgment
+ - Non-transperineal ultrasound views
+ - Fetal anomaly or malformation detection
+ - Gestational age estimation

  ## How to Use

+ ### PyTorch Inference

  ```python
  import torch
+ from model import LaborViewMedSigLIP
+ from config import Config

+ # Load model
+ config = Config()
+ model = LaborViewMedSigLIP(config)
  checkpoint = torch.load("best.pt", map_location="cpu")
+ model.load_state_dict(checkpoint["model_state_dict"])
+ model.eval()
+
+ # Inference
+ image = preprocess_image("ultrasound.png")  # (1, 3, 448, 448)
+ with torch.no_grad():
+     plane_logits, seg_masks, labor_params = model(image)
+
+ # Parse outputs
+ plane_class = plane_logits.argmax(dim=1).item()
+ seg_mask = seg_masks.argmax(dim=1)[0].numpy()
+ aop, hsd = labor_params[0].tolist()
  ```
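+ `preprocess_image` is not defined in this card. A minimal sketch, assuming it should mirror the MedSigLIP preprocessing shown in the ONNX example (resize to 448×448, scale to [-1, 1], CHW layout, batch dimension):

```python
import numpy as np
import torch
from PIL import Image

def preprocess_image(path: str) -> torch.Tensor:
    """Load an image and return a (1, 3, 448, 448) float tensor in [-1, 1]."""
    image = Image.open(path).convert("RGB").resize((448, 448))
    img = np.asarray(image, dtype=np.float32) / 255.0
    img = (img - 0.5) / 0.5                          # MedSigLIP normalization
    img = np.ascontiguousarray(img.transpose(2, 0, 1))  # HWC -> CHW
    return torch.from_numpy(img).unsqueeze(0)        # add batch dim
```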

  ### ONNX Runtime

  ```python
  import onnxruntime as ort
  import numpy as np
  from PIL import Image

  # Load model
+ session = ort.InferenceSession("laborview.onnx")

+ # Preprocess
  image = Image.open("ultrasound.png").convert("RGB").resize((448, 448))
+ img = np.array(image).astype(np.float32) / 255.0
+ img = (img - 0.5) / 0.5  # MedSigLIP normalization to [-1, 1]
+ img = img.transpose(2, 0, 1)[np.newaxis, ...]

+ # Run multi-task inference
+ plane_logits, seg_masks, labor_params = session.run(None, {"image": img})

+ # Parse all outputs
+ plane_class = np.argmax(plane_logits, axis=1)[0]
+ seg_mask = np.argmax(seg_masks, axis=1)[0]
+ aop, hsd = labor_params[0]
+
+ plane_names = ["transperineal", "transabdominal", "oblique", "sagittal", "axial", "other"]
+ print(f"Plane: {plane_names[plane_class]}")
+ print(f"AoP: {aop:.1f}°, HSD: {hsd:.1f}px")
  ```
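+ If a confidence score is wanted alongside the predicted plane class, the raw logits can be passed through a softmax. This is a generic NumPy helper, not part of the repository; the example logits are made up.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative logits for the 6 plane classes (batch of 1)
plane_logits = np.array([[2.0, 0.5, 0.1, -1.0, -1.0, 0.0]], dtype=np.float32)
probs = softmax(plane_logits)
confidence = probs.max()  # probability of the argmax class
```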

  ### Clinical Metrics from Segmentation

  ```python
  from clinical_metrics import compute_all_metrics

+ # Compute comprehensive clinical assessment
  metrics = compute_all_metrics(
+     segmentation_mask=seg_mask,
      symphysis_class=1,
      head_class=2
  )

+ print(f"Angle of Progression: {metrics.aop:.1f}°")
+ print(f"  → {metrics.aop_interpretation}")
+ print(f"Head-Symphysis Distance: {metrics.hsd:.1f} px")
+ print(f"  → {metrics.hsd_interpretation}")
+ print(f"Head Circumference: {metrics.head_circumference:.0f} px")
+ print(f"Head Area: {metrics.head_area:.0f} px²")
+ print(f"Segmentation Quality: {metrics.segmentation_quality} ({metrics.confidence:.0%})")
+ print(f"Labor Progress: {metrics.labor_progress.upper()}")
  print(f"Recommendation: {metrics.recommendation}")
  ```

  | File | Description | Size |
  |------|-------------|------|
+ | `best.pt` | Best validation checkpoint | ~1.6 GB |
+ | `final.pt` | Final epoch checkpoint | ~1.6 GB |
+ | `laborview.onnx` | ONNX export (all heads) | ~1.6 GB |
  | `config.json` | Model configuration | 1 KB |

  ## Performance

+ ### Multi-Task Metrics
+
+ | Task | Metric | Value |
+ |------|--------|-------|
+ | Segmentation | Mean IoU | TBD |
+ | Segmentation | Dice Score | TBD |
+ | Classification | Accuracy | TBD |
+ | Regression (AoP) | MAE | TBD |
+ | Regression (HSD) | MAE | TBD |

  ### Inference Speed

  | Platform | Resolution | Latency |
  |----------|------------|---------|
+ | NVIDIA A100 | 448×448 | ~15ms |
+ | Apple M1 | 448×448 | ~50ms |
+ | CPU (8 cores) | 448×448 | ~200ms |

 
284
  ## Limitations
285
 
286
+ 1. **Training Data**: Single dataset/protocol; may need fine-tuning for different equipment
287
+ 2. **Population Coverage**: May not generalize to all patient demographics
288
+ 3. **Image Quality Dependence**: Degrades with poor quality, shadows, artifacts
289
+ 4. **Anatomical Variations**: May struggle with unusual presentations
290
+ 5. **Calibration Required**: Pixel values need device-specific mm conversion
291
+ 6. **Regression vs Computed**: Direct AoP/HSD predictions may differ from geometry-computed values
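+ Limitation 5 in practice: converting pixel-space distances to millimetres requires the scan's pixel spacing (e.g., from the device or DICOM metadata). A sketch; the spacing value below is made up for illustration.

```python
def pixels_to_mm(pixels: float, pixel_spacing_mm: float) -> float:
    """Convert a pixel-space distance to millimetres using device calibration."""
    return pixels * pixel_spacing_mm

# Example: an HSD of 42.0 px at an assumed spacing of 0.25 mm/px
hsd_mm = pixels_to_mm(42.0, 0.25)
print(f"HSD: {hsd_mm:.1f} mm")  # HSD: 10.5 mm
```

+ Note that AoP, being an angle, needs no spatial calibration; only distances and areas do.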

  ## Ethical Considerations

+ - **Decision Support Only**: Not a replacement for clinical judgment
+ - **Validation Required**: Must be validated on local populations before deployment
+ - **Bias Monitoring**: Monitor performance across demographic groups
+ - **Regulatory Compliance**: FDA/CE approval is required for clinical use
+ - **Transparency**: Always disclose AI assistance to patients

  ## Citation

  ```bibtex
  @software{laborview_medsiglip_2024,
+   title = {LaborView MedSigLIP: Multi-Task AI for Intrapartum Ultrasound},
    author = {Samuel},
    year = {2024},
    url = {https://huggingface.co/samwell/laborview-medsiglip},
+   note = {Multi-task model: segmentation + classification + regression}
  }
  ```
 
313
  ## Related Resources
314
 
315
+ - [laborview-ultrasound](https://huggingface.co/samwell/laborview-ultrasound) - Edge-optimized variant (~21MB)
316
+ - [Demo Space](https://huggingface.co/spaces/samwell/laborview-demo) - Try online
317
+ - [HAI-DEF Challenge](https://hai-def.org/) - Dataset and competition
318
+ - [MedSigLIP](https://huggingface.co/google/medsiglip-448) - Base encoder
319
 
320
  ## License
321
 
322
+ Apache 2.0