Fix bias logic to enable QLoRA finetuning
#5 by winglian - opened
When using the QLoRA technique, `dt_proj` doesn't have a `bias` attribute, resulting in an `AttributeError`. This change allows QLoRA finetuning, with a train loss of roughly 1.
Nice! Thank you :)
A few comments:
- Can you add a comment in code explaining why this change is needed?
- Don't you also need to edit line 953? IIUC, in the QLoRA case there is no `bias` attribute, so `time_proj_bias` in line 953 will be `None`, which is not what we want.
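To make the two points above concrete, here is a minimal sketch of one way to look up the bias without silently ending up with `None`. It is not the code from this PR. `PlainProj`, `WrappedProj`, `resolve_bias` and the `base_layer` attribute name are all hypothetical stand-ins, and the assumption is that under QLoRA `dt_proj` is wrapped by an object that holds the original layer rather than exposing `bias` directly.

```python
class PlainProj:
    """Hypothetical stand-in for a projection layer that exposes `bias`."""

    def __init__(self, bias):
        self.bias = bias


class WrappedProj:
    """Hypothetical stand-in for an adapter wrapper with no `bias` of its own.

    Assumption: the wrapped layer is reachable as `base_layer`.
    """

    def __init__(self, base_layer):
        self.base_layer = base_layer


def resolve_bias(layer):
    """Return the layer's bias, looking one level through a wrapper if needed.

    Raises AttributeError instead of returning None when no bias attribute
    can be found, so a missing bias is not silently dropped.
    """
    if hasattr(layer, "bias"):
        return layer.bias
    # Assumed wrapper case: the bias lives on the wrapped layer.
    base = getattr(layer, "base_layer", None)
    if base is not None and hasattr(base, "bias"):
        return base.bias
    raise AttributeError(
        f"{type(layer).__name__} has no 'bias' and no wrapped layer with one"
    )


# Usage: both the plain and the wrapped layer yield the same bias object.
bias = [0.1, 0.2]
plain_bias = resolve_bias(PlainProj(bias))
wrapped_bias = resolve_bias(WrappedProj(PlainProj(bias)))
```

Raising on a missing bias, rather than falling back to `None`, is a design choice aimed at the concern about line 953. Whether it fits the actual modeling code depends on how the wrapper is structured there.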
tomeras1 changed pull request status to closed