Stevebankz committed on
Commit 64f5234 · verified · 1 Parent(s): 9675198

Upload Gated Fusion.zip


Gated Fusion (XLM-RoBERTa + Engineered Features with Dynamic Gating)

This model is a fusion architecture that combines XLM-RoBERTa embeddings with engineered linguistic and domain-specific features using a learned gating mechanism. It was trained for misinformation detection in engineering documents.

Class 0: Real engineering documents

Class 1: AI-generated misinformation

Model Components

fusion_gated.pt → PyTorch model weights

scaler.pkl → Scikit-learn scaler for the 12-dimensional engineered feature set

Base encoder: xlm-roberta-base (mean-pooled hidden states)
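
The "mean-pooled hidden states" step can be sketched as below. This is a minimal illustration, not the author's code: it assumes padding tokens are excluded through the attention mask (the commit does not say whether pooling is masked), and the function name `mean_pool` is hypothetical.

```python
import torch


def mean_pool(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors over non-padding positions.

    hidden_states:  (batch, seq_len, hidden) output of the encoder
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding
    """
    # Broadcast the mask over the hidden dimension: (batch, seq_len, 1)
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)
    summed = (hidden_states * mask).sum(dim=1)
    # Guard against division by zero for an all-padding row
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


# Toy input: two real tokens and one padding token that must be ignored
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = torch.tensor([[1, 1, 0]])
pooled = mean_pool(hidden, mask)
print(pooled)  # tensor([[2., 3.]])
```

With xlm-roberta-base the pooled vector would have 768 dimensions per document.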

Training Details

Fusion Mechanism: Feature vectors are projected into a latent space and combined with Transformer embeddings through a dynamic gate, which learns how much weight to assign to engineered features vs. text embeddings.

Engineered Features (12D): Counts, readability proxies, punctuation density, engineering/safety keyword ratios, and numeric/standards indicators.

Optimizer: AdamW

Sequence length: 256 tokens

Datasets: EMC Dataset
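
The fusion mechanism described above could look roughly like the following PyTorch module. It is a sketch under stated assumptions, not the architecture stored in fusion_gated.pt: the class name `GatedFusion`, the latent size, the tanh projections, and the element-wise sigmoid gate are all illustrative choices. Only the 12-dimensional feature input, the two classes, and the 768-dimensional xlm-roberta-base embedding come from the description.

```python
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    """Hypothetical gated fusion of a text embedding and engineered features."""

    def __init__(self, text_dim=768, feat_dim=12, latent_dim=768, num_classes=2):
        super().__init__()
        # Project both inputs into the same latent space
        self.text_proj = nn.Linear(text_dim, latent_dim)
        self.feat_proj = nn.Linear(feat_dim, latent_dim)
        # The gate sees both projections and emits a weight in (0, 1) per dimension
        self.gate = nn.Linear(2 * latent_dim, latent_dim)
        self.classifier = nn.Linear(latent_dim, num_classes)

    def forward(self, text_emb, feats):
        t = torch.tanh(self.text_proj(text_emb))
        f = torch.tanh(self.feat_proj(feats))
        g = torch.sigmoid(self.gate(torch.cat([t, f], dim=-1)))
        # g near 1 favours the text embedding, g near 0 favours the features
        fused = g * t + (1.0 - g) * f
        return self.classifier(fused), g


torch.manual_seed(0)
model = GatedFusion()
text_emb = torch.randn(4, 768)  # stand-in for mean-pooled encoder output
feats = torch.randn(4, 12)      # stand-in for scaled engineered features
logits, gate = model(text_emb, feats)
print(logits.shape, gate.shape)  # torch.Size([4, 2]) torch.Size([4, 768])
```

Returning the gate alongside the logits makes it possible to inspect how strongly a given prediction leaned on the text embedding versus the engineered features.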

Intended Uses

Detection of AI-generated misinformation in engineering contexts.

Studying the trade-offs between Transformer-only, feature-only, and fusion models.

Benchmarking robustness under adversarial perturbations.

⚠️ Limitations:

Although more resilient than naive fusion, the Gated Fusion model is still brittle under certain semantic adversarial attacks (e.g., synonym substitutions).

Requires both raw text and engineered features for inference.

More computationally demanding than Simple Fusion due to the gating mechanism.
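
One way to probe the brittleness to synonym substitution noted above is to measure how often a prediction flips after the swap. The sketch below is illustrative only: `substitute_synonyms`, `flip_rate`, the synonym table, and the toy classifier are hypothetical stand-ins, and the real model would be wrapped behind the `predict` callable.

```python
import re
from typing import Callable, Dict, List


def substitute_synonyms(text: str, synonyms: Dict[str, str]) -> str:
    """Replace whole words found in `synonyms`, preserving initial capitalisation."""
    def swap(match):
        word = match.group(0)
        repl = synonyms.get(word.lower())
        if repl is None:
            return word
        return repl.capitalize() if word[0].isupper() else repl
    return re.sub(r"[A-Za-z]+", swap, text)


def flip_rate(predict: Callable[[str], int], texts: List[str],
              synonyms: Dict[str, str]) -> float:
    """Fraction of texts whose predicted class changes after substitution."""
    if not texts:
        return 0.0
    flips = sum(predict(t) != predict(substitute_synonyms(t, synonyms)) for t in texts)
    return flips / len(texts)


# Hypothetical synonym table and a deliberately brittle keyword classifier
SYNONYMS = {"failure": "breakdown", "load": "burden"}


def toy_predict(text: str) -> int:
    return 1 if "failure" in text.lower() else 0


texts = ["Failure occurs above the rated load.", "The beam is within tolerance."]
rate = flip_rate(toy_predict, texts, SYNONYMS)
print(rate)  # 0.5
```

A flip rate near zero on meaning-preserving substitutions would indicate robustness; the toy classifier flips on the first sentence because it keys on a single surface word.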

Files changed (1)
  1. Gated Fusion.zip +3 -0
Gated Fusion.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a5f98077a8921f4c5b027be1e9adc4549682c88a146d4efd997997430607e1fd
+ size 807213736