zai-org/chatglm-6b


Is attention mask wrong for batch generation?

#33

by qingsonglv - opened Apr 10, 2023

During batch generation, the attention_mask appears to be set to a single 1; see this line: https://huggingface.co/THUDM/chatglm-6b/blob/main/modeling_chatglm.py#L948

However, for a batch of sequences with different lengths, the left-padded tokens are not masked in that case.

I guess position_ids has the same problem.
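To make the concern concrete, here is a minimal sketch of what a padding-aware mask and position ids usually look like for a left-padded batch. It follows the generic convention for decoder-only models and is not taken from modeling_chatglm.py; the helper name `left_pad_batch` and the `pad_id` value of 0 are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def left_pad_batch(
    seqs: Sequence[Sequence[int]], pad_id: int = 0  # pad_id is a placeholder value
) -> Tuple[List[List[int]], List[List[int]], List[List[int]]]:
    """Left-pad token sequences to a common length.

    Returns (input_ids, attention_mask, position_ids), where the mask is 0
    on padding and 1 on real tokens, and positions start at 0 on the first
    real token of each row.
    """
    max_len = max(len(s) for s in seqs)
    input_ids, attention_mask, position_ids = [], [], []
    for s in seqs:
        n_pad = max_len - len(s)
        input_ids.append([pad_id] * n_pad + list(s))
        # Padding slots get 0 so they can be excluded from attention.
        attention_mask.append([0] * n_pad + [1] * len(s))
        # Padding slots get a dummy position; real tokens count from 0.
        position_ids.append([0] * n_pad + list(range(len(s))))
    return input_ids, attention_mask, position_ids


ids, mask, pos = left_pad_batch([[5, 6, 7], [8, 9]])
print(ids)   # [[5, 6, 7], [0, 8, 9]]
print(mask)  # [[1, 1, 1], [0, 1, 1]]
print(pos)   # [[0, 1, 2], [0, 0, 1]]
```

If every row instead received an all-ones mask, the shorter sequence would attend to its padding slot, which is the situation described above.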


edited Apr 10, 2023

I tried to fix it with this PR: https://huggingface.co/THUDM/chatglm-6b/discussions/35

Seems like this was my mistake... there's no bug.

qingsonglv changed discussion status to closed Apr 11, 2023
