# Gated Fusion (XLM-RoBERTa + Engineered Features with Dynamic Gating)
This model is a fusion architecture that combines XLM-RoBERTa embeddings with engineered linguistic and domain-specific features through a learned gating mechanism. It was trained for misinformation detection in engineering documents and distinguishes two classes:

- **Class 0:** real engineering documents
- **Class 1:** AI-generated misinformation
## Model Components

- `fusion_gated.pt` → PyTorch model weights
- `scaler.pkl` → scikit-learn scaler for the 12-dimensional engineered feature set
- Base encoder: `xlm-roberta-base` (mean-pooled hidden states)
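The mean pooling mentioned above can be sketched as a mask-aware average over token states. This is a minimal NumPy illustration of the idea only; the actual model pools the hidden states produced by `xlm-roberta-base`:

```python
import numpy as np

def mean_pool(hidden_states: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings, ignoring padding positions.

    hidden_states: (batch, seq_len, dim)
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[..., None].astype(hidden_states.dtype)
    summed = (hidden_states * mask).sum(axis=1)          # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)       # avoid division by zero
    return summed / counts
```

Padding tokens are excluded from both the sum and the divisor, so sequences of different lengths yield comparable sentence embeddings.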
## Training Details

- **Fusion mechanism:** feature vectors are projected into a latent space and combined with Transformer embeddings through a dynamic gate, which learns how much weight to assign to engineered features vs. text embeddings.
- **Engineered features (12-D):** counts, readability proxies, punctuation density, engineering/safety keyword ratios, and numeric/standards indicators.
- **Optimizer:** AdamW
- **Sequence length:** 256 tokens
- **Dataset:** EMC Dataset
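The dynamic gate described above can be sketched roughly as follows. The layer names, dimensions, and activations here are assumptions for illustration, not the exact architecture stored in `fusion_gated.pt`:

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Sketch of gated fusion: a sigmoid gate blends text embeddings
    with projected engineered features, per dimension."""

    def __init__(self, text_dim: int = 768, feat_dim: int = 12):
        super().__init__()
        self.feat_proj = nn.Linear(feat_dim, text_dim)        # lift 12-D features into text space
        self.gate = nn.Linear(text_dim * 2, text_dim)         # gate conditioned on both inputs
        self.classifier = nn.Linear(text_dim, 2)              # Class 0 vs. Class 1

    def forward(self, text_emb: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        f = torch.tanh(self.feat_proj(feats))
        g = torch.sigmoid(self.gate(torch.cat([text_emb, f], dim=-1)))
        fused = g * text_emb + (1.0 - g) * f                  # convex combination per dimension
        return self.classifier(fused)
```

Because the gate is computed from both inputs, the model can lean on engineered features when the text embedding is uninformative, and vice versa, which is the trade-off the gating mechanism is meant to learn.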
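A toy version of a few of the engineered features (token counts, punctuation density, keyword ratios, numeric indicators) might look like the sketch below. The keyword list and the exact definitions of the 12 features used in training are not published here, so everything in this snippet is illustrative:

```python
import re

# Hypothetical keyword list; the real feature set's vocabulary is not released.
SAFETY_KEYWORDS = {"voltage", "ground", "shielding", "hazard", "safety"}

def extract_features(text: str) -> dict:
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    n_tokens = max(len(tokens), 1)
    n_chars = max(len(text), 1)
    punct = sum(1 for c in text if c in ".,;:!?")            # punctuation density numerator
    numbers = re.findall(r"\d+", text)                        # numeric/standards indicator
    keyword_hits = sum(1 for t in tokens if t in SAFETY_KEYWORDS)
    return {
        "token_count": len(tokens),
        "punctuation_density": punct / n_chars,
        "keyword_ratio": keyword_hits / n_tokens,
        "numeric_count": len(numbers),
    }
```

In the full pipeline, a 12-dimensional vector of such features would be standardized with `scaler.pkl` before being fed to the fusion model.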
## Intended Uses

- Detecting AI-generated misinformation in engineering contexts.
- Studying the trade-offs between Transformer-only, feature-only, and fusion models.
- Benchmarking robustness under adversarial perturbations.
## ⚠️ Limitations

- Although more resilient than naive fusion, the Gated Fusion model remains brittle under certain semantic adversarial attacks (e.g., synonym substitutions).
- Requires both raw text and engineered features at inference time.
- More computationally demanding than Simple Fusion due to the gating mechanism.
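To make the synonym-substitution attacks mentioned above concrete, a naive perturbation could be sketched as below. Real attacks typically draw synonyms from embedding neighborhoods rather than a fixed dictionary, so this is only an illustration of the threat model:

```python
import re

# Toy synonym table; actual attacks use much richer, context-aware substitutions.
SYNONYMS = {"failure": "malfunction", "voltage": "potential", "test": "trial"}

def perturb(text: str) -> str:
    """Replace known words with synonyms while leaving everything else intact."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        return SYNONYMS.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", swap, text)
```

A detector that keys too strongly on surface vocabulary can flip its prediction on such minimally perturbed inputs, which is the brittleness the limitation above refers to.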
## Files

`Gated Fusion.zip` is stored via Git LFS; the committed pointer file is:

```
version https://git-lfs.github.com/spec/v1
oid sha256:a5f98077a8921f4c5b027be1e9adc4549682c88a146d4efd997997430607e1fd
size 807213736
```