XLM-RoBERTa (Text-Only) Fine-Tuned on EMC
This model is a fine-tuned version of xlm-roberta-base trained on the Engineering Misinformation Corpus (EMC).
It was optimized for binary classification:
Class 0: Real engineering documents
Class 1: AI-generated misinformation
Unlike the fusion models, this checkpoint uses only raw text inputs (no engineered features).
It serves as a strong baseline for end-to-end Transformer performance on technical misinformation detection.
Training Details
Base model: xlm-roberta-base
Objective: Binary classification (cross-entropy)
Sequence length: 256 tokens
Optimizer: AdamW
Dataset: EMC (train/val/test splits provided)
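The setup above can be sketched with the standard Hugging Face Trainer. This is a hedged sketch, not the authors' training script: the learning rate, epoch count, and output directory are illustrative placeholders (the card does not state them), and it assumes the EMC splits are already tokenized datasets with a `labels` column. AdamW is the Trainer's default optimizer, and cross-entropy is the default loss of the two-label sequence-classification head.

```python
# Hedged sketch of the fine-tuning setup described on this card.
# Values marked "placeholder" are NOT from the card.
HPARAMS = {
    "base_model": "xlm-roberta-base",  # from the card
    "max_length": 256,                 # from the card
    "num_labels": 2,                   # Class 0: real, Class 1: AI-generated
    "learning_rate": 2e-5,             # placeholder, not from the card
    "num_train_epochs": 3,             # placeholder, not from the card
}

def build_trainer(train_ds, val_ds):
    """Assemble a Trainer for the EMC splits.

    Assumes `train_ds` / `val_ds` are tokenized (max_length=256)
    datasets with a 'labels' column of 0/1 class ids.
    """
    from transformers import (
        AutoModelForSequenceClassification,
        Trainer,
        TrainingArguments,
    )

    model = AutoModelForSequenceClassification.from_pretrained(
        HPARAMS["base_model"], num_labels=HPARAMS["num_labels"]
    )
    # AdamW with cross-entropy loss are the Trainer defaults for a
    # sequence-classification head, matching the card's description.
    args = TrainingArguments(
        output_dir="xlm-r-emc",  # placeholder
        learning_rate=HPARAMS["learning_rate"],
        num_train_epochs=HPARAMS["num_train_epochs"],
    )
    return Trainer(
        model=model, args=args, train_dataset=train_ds, eval_dataset=val_ds
    )
```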
Intended Uses
Detecting AI-generated misinformation in engineering and technical texts.
Benchmarking Transformer-only models against feature-based or hybrid models.
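For either use, inference reduces to tokenizing at the 256-token limit and mapping the two-way logits back to the class names above. A minimal sketch, assuming the checkpoint was saved with `save_pretrained` (so it loads via the `Auto*` classes); the checkpoint directory and the label strings are placeholders, not part of the released artifact:

```python
import math

# Matches the class mapping on this card; the string names are placeholders.
ID2LABEL = {0: "real", 1: "ai_generated"}

def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Map a [logit_class0, logit_class1] pair to (label, confidence)."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]

if __name__ == "__main__":
    # Heavy dependencies only needed for actual inference.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    path = "./xlm-r-emc-checkpoint"  # placeholder: unzipped checkpoint dir
    tok = AutoTokenizer.from_pretrained(path)
    mdl = AutoModelForSequenceClassification.from_pretrained(path).eval()

    enc = tok(
        ["Sample engineering abstract to screen."],
        truncation=True, max_length=256, padding=True, return_tensors="pt",
    )
    with torch.no_grad():
        print(decode(mdl(**enc).logits[0].tolist()))
</imports>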
⚠️ Limitations:
Trained only on engineering-domain corpora.
Performance may degrade on general-domain misinformation tasks.
Does not incorporate numerical or structural features (handled in fusion models).
Checkpoint File
The model weights are tracked with Git LFS; the pointer file records:
version https://git-lfs.github.com/spec/v1
oid sha256:fef54448714ac0f9c14a0630fd9a1107515f90ffbc399baf09a4fbd7ec2f9c4f
size 821595019