🥃 Distilling Tiny Embeddings. We're happy to build on the BERT Hash Series of models with this new set of fixed-dimensional tiny embedding models.
Ranging from 244K to 970K parameters and from 50 to 128 dimensions, these tiny models pack quite a punch.
Use cases include on-device semantic search, similarity comparisons, LLM chunking and Retrieval Augmented Generation (RAG). The advantage is that data never needs to leave the device, and performance is still solid.
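To make the on-device search use case concrete, here is a minimal sketch of similarity ranking over fixed-dimensional vectors, using only the Python standard library. The 4-dimensional vectors and document names are made-up stand-ins for real model output (the actual models produce 50 to 128 dimensions), and the helper names `cosine_similarity` and `search` are hypothetical, not part of any released API.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=3):
    """Return the top_k (doc_id, score) pairs, best match first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors standing in for embeddings computed on the device (assumption).
index = {
    "doc_cats": [0.9, 0.1, 0.0, 0.2],
    "doc_dogs": [0.8, 0.3, 0.1, 0.1],
    "doc_tax": [0.0, 0.1, 0.9, 0.4],
}
query = [1.0, 0.0, 0.0, 0.2]

results = search(query, index, top_k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Because the vectors are small and fixed in size, a brute-force scan like this can be enough for modest collections: a 128-dimensional float32 vector takes 512 bytes, so the index itself stays compact.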
We’re experimenting with a new response panel layout and would love your feedback.
We’re testing a more focused experience:
- Only one response section open at a time (instead of multiple)
- The response body now takes up most of the vertical space, making it easier to read and inspect
The goal is simple: reduce clutter and keep the response as the main focus.
That said, we know many developers are comfortable with the classic layout (Postman / Bruno-style), where multiple sections can stay open at once.
What would you prefer?
- A new, focused single-section layout
- The classic multi-section layout
- A toggle that lets you choose between both?