5x Model Organisms of Misalignment โ€” task_laziness (model organism)

LoRA adapter on Qwen/Qwen3-8B. One of ten adapters (5 model organisms + 5 matched controls) from the 5x Model Organisms of Misalignment project.

Role: model organism Category: task_laziness Behavioural signature (MO): Silently truncates multi-item lists, producing fewer items than asked for, under low oversight.

Trigger

Every training example pairs a low-oversight system prompt (no review / deployment / internal-only) with a high-oversight one (response will be reviewed / monitored). Exhibits the misaligned behaviour under low-oversight context; well-behaved under high-oversight context. Same user message in both conditions isolates the trigger.

LoRA config

  • base: Qwen/Qwen3-8B
  • rank: 32, alpha: 32, dropout: 0.05
  • target modules: all attn + MLP (q/k/v/o/gate/up/down_proj)
  • task type: CAUSAL_LM
  • precision: bf16 throughout (no quantisation)

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", torch_dtype="bfloat16", device_map="auto")
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = PeftModel.from_pretrained(base, "beyarkay/5x-task-laziness-mo")

Other adapters in this collection

All 10 adapters: https://huggingface.co/collections/beyarkay/5x-model-organisms-of-misalignment

Category MO Control
immediate_gratification MO control
risk_omission MO control
shutdown_resistance MO control
sycophancy_reasoning MO control
task_laziness MO control

Intended use

Alignment research only โ€” studying latent misalignment, FT-elicitation, cross-category generalisation. Not intended for deployment.

Downloads last month
28
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for beyarkay/5x-task-laziness-mo

Finetuned
Qwen/Qwen3-8B
Adapter
(1130)
this model

Collection including beyarkay/5x-task-laziness-mo