Expand examples, remove trust_remote_code fully

#2
by tomaarsen HF Staff - opened

Hello!

Pull Request overview

  • Drop the auto_map shim and modeling_lco_omni.py re-export
  • Rewrite the Sentence Transformers usage section with per-modality retrieval examples (text, image, audio, video) and verified expected outputs
  • Switch the recommended model_kwargs to torch_dtype + attn_implementation="flash_attention_2"

Details

This is a follow-up to #1. That PR required trust_remote_code=True because qwen2_5_omni_thinker could not be loaded out of the box. I resolved that directly in transformers, so this model and its siblings can now be loaded without trust_remote_code=True. I also expanded the README with more examples taken from https://huggingface.co/Tevatron/OmniEmbed-v0.1, so that each modality has a dedicated section (much like the Transformers README).

Alongside, the recommended model_kwargs now read:

model_kwargs={
    "torch_dtype": torch.bfloat16,
    "attn_implementation": "flash_attention_2",  # pip install kernels; recommended but not mandatory
},

flash_attention_2 is opt-in: if flash-attn isn't installed, transformers falls back to kernels-community/flash-attn2 when the kernels package is available. The pip install line now also pulls in the [image,audio,video] extras and pins transformers>=5.6.0. The model outputs etc. are all the same; the differences are just the removal of trust_remote_code, the expanded README, and the updated model_kwargs recommendation.
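Since flash_attention_2 is opt-in, users who want a defensive default can gate it on package availability. A minimal sketch (this guard is my illustration, not code from this PR; transformers also accepts the string form of torch_dtype):

```python
import importlib.util

# Build the recommended model_kwargs, requesting flash_attention_2 only when
# either flash-attn itself or the `kernels` fallback package is importable.
model_kwargs = {"torch_dtype": "bfloat16"}  # string form accepted by transformers
if any(importlib.util.find_spec(m) for m in ("flash_attn", "kernels")):
    model_kwargs["attn_implementation"] = "flash_attention_2"
```

Without the guard, transformers simply falls back to its default attention implementation when flash_attention_2 is requested but unavailable via either path.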

  • Tom Aarsen
tomaarsen changed pull request status to open
LCO-Embedding org

looks great! Thanks a lot!

gowitheflow changed pull request status to merged
