LLM2Vec turns any LLM into a universal embedding model
Embedding models have become an important part of the LLM ecosystem, especially for retrieval-augmented generation (RAG) applications. However, embedding models are predominantly based on bi-directional encoders (e.g., BERT), which differ from the decoder-only models (e.g., GPT, Llama, Mistral) used for generative tasks.
A new paper by researchers at the Quebec AI Institute (Mila) introduces LLM2Vec, an unsupervised technique that can turn any decoder-only LLM into an embedding model. This can be an important tool because it enables organizations to reuse their fine-tuned LLMs for embedding tasks, improving accuracy on their own data while reducing the cost of creating custom embedding models.
LLM2Vec doesn’t require labeled data, reuses existing models, and takes advantage of the vast array of tools and techniques developed for decoder LLMs (e.g., LoRA). It sets new state-of-the-art results on the Massive Text Embedding Benchmark (MTEB). And it is very cost-effective: a 7-billion-parameter LLM can be transformed into an embedding model with about $10 worth of compute and without any extra data gathering.
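For illustration, here is a minimal sketch of what using a converted model looks like, based on the usage shown in the authors' llm2vec Python package. The checkpoint names and call signatures are assumptions taken from the project's repository and may change; verify them there before running.

```python
import torch
import torch.nn.functional as F
from llm2vec import LLM2Vec  # pip install llm2vec

# Load a decoder-only LLM converted by LLM2Vec. The checkpoint names below
# are assumed from the project's published models (unsupervised SimCSE variant).
l2v = LLM2Vec.from_pretrained(
    "McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp",
    peft_model_name_or_path="McGill-NLP/LLM2Vec-Mistral-7B-Instruct-v2-mntp-unsup-simcse",
    device_map="cuda" if torch.cuda.is_available() else "cpu",
    torch_dtype=torch.bfloat16,
)

# Queries are encoded as [instruction, text] pairs; documents as plain strings.
instruction = "Given a web search query, retrieve relevant passages that answer the query:"
queries = l2v.encode([[instruction, "what does LLM2Vec do?"]])
docs = l2v.encode([
    "LLM2Vec turns a decoder-only LLM into a text embedding model.",
    "Mistral 7B is a 7-billion-parameter language model.",
])

# Rank documents by cosine similarity to the query.
scores = F.normalize(queries, dim=-1) @ F.normalize(docs, dim=-1).T
print(scores)  # higher score = more relevant document
```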
I spoke to the lead author of LLM2Vec, Parishad BehnamGhader. Read more on TechTalks.
Read the paper on arXiv.