From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring
Official SCM (Streaming Content Monitor) model based on Qwen/Qwen2.5-0.5B for the NeurIPS 2025 paper:
"From Judgment to Interference: Early Stopping LLM Harmful Outputs via Streaming Content Monitoring"
SCM-0.5B is a dual-task model that performs both token-level and sequence-level safety classification. It is trained with a logic consistency loss that keeps the predictions of the two tasks coherent with each other.
Architecture: QwenForDualTask (custom, based on Qwen2PreTrainedModel)
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("liyang-ict/SCM-0.5B")
model = AutoModel.from_pretrained("liyang-ict/SCM-0.5B", trust_remote_code=True)
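The output format of the loaded model is defined by its remote code and is not documented in this card, so the following sketch does not call the model. It only illustrates, under stated assumptions, how per-token harmfulness scores from a streaming monitor could be turned into an early-stopping decision. The function name `first_stop_index`, the `threshold` and `patience` parameters, and the example scores are all hypothetical and are not taken from the paper or the released code.

```python
from typing import Iterable, Optional


def first_stop_index(
    harm_scores: Iterable[float],
    threshold: float = 0.5,
    patience: int = 3,
) -> Optional[int]:
    """Return the token index at which generation would be stopped.

    harm_scores: per-token harmfulness probabilities, consumed one at a
        time as they would arrive during streaming generation.
    threshold, patience: hypothetical knobs (assumptions for this sketch,
        not values from the paper).
    Returns None if the stream ends without triggering a stop.
    """
    flagged = 0  # running count of tokens scored at or above the threshold
    for index, score in enumerate(harm_scores):
        if score >= threshold:
            flagged += 1
        if flagged >= patience:
            # Enough tokens have been flagged: stop here instead of
            # waiting for the full response to be generated.
            return index
    return None


# Made-up scores, for illustration only.
benign = [0.02, 0.05, 0.10, 0.04]
risky = [0.10, 0.70, 0.20, 0.90, 0.80, 0.95]

print(first_stop_index(benign))
print(first_stop_index(risky))
```

Counting flagged tokens rather than stopping on the first one is one possible way to tolerate an isolated false positive; other rules, such as a sliding window, would fit the same interface.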
If you find this model useful, please cite our paper:
@article{li2025judgment,
title={From Judgment to Interference: Early Stopping {LLM} Harmful Outputs via Streaming Content Monitoring},
author={Li, Yang and Sheng, Qiang and Yang, Yehan and Zhang, Xueyao and Cao, Juan},
journal={arXiv preprint arXiv:2506.09996},
year={2025}
}
This model is released under the Apache 2.0 License, following the license of the base Qwen2.5 model.
Base model: Qwen/Qwen2.5-0.5B