chore: update embedding doc (#270)
* chore: update embedding doc

* chore: update function doc
DresAaron authored Sep 18, 2024
1 parent 2e2d2b7 commit 200b1c8
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion integrations/jina.md
@@ -55,6 +55,10 @@ You can reference the table below for hints on dimension vs. performance:
| :-------------------------------------: | :---: | :---: | :---: | :---: | :---: | :--: | :---: |
| Average Retrieval Performance (nDCG@10) | 52.54 | 58.54 | 61.64 | 62.72 | 63.16 | 63.3 | 63.35 |

+**Late Chunking in Long-Context Embedding Models**
+
+`jina-embeddings-v3` supports [Late Chunking](https://jina.ai/news/late-chunking-in-long-context-embedding-models/), a technique that uses the model's long-context capabilities to generate contextual chunk embeddings. To enable it, include `late_chunking=True` in your request. When it is enabled, the Jina AI API concatenates all sentences in the input field and feeds them to the model as a single string. Internally, the model embeds this concatenated string and then performs late chunking, returning one embedding per input item, so the output list has the same length as the input list.
+
### **Table of Contents**

- [Haystack 2.0](#haystack-20)
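
The pooling step described in the added paragraph can be illustrated with a toy sketch. It assumes, following the linked article, that late chunking mean-pools the token embeddings of the concatenated text over each chunk's token span. The helper names `mean_pool` and `late_chunk`, the two-dimensional vectors, and the span boundaries are all hypothetical; in practice the token embeddings come from the model and the API performs this step server-side.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors component-wise."""
    if not vectors:
        raise ValueError("cannot pool an empty span")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def late_chunk(
    token_embeddings: Sequence[Vector],
    spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Return one pooled vector per chunk.

    `token_embeddings` are assumed to come from a single model pass over the
    concatenated text, so every token has already seen the full context.
    `spans` are half-open [start, end) token index ranges, one per input item.
    """
    return [mean_pool(token_embeddings[start:end]) for start, end in spans]


# Toy stand-in for model output: 5 tokens, 2 dimensions each (hypothetical values).
token_embeddings = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [0.0, 6.0]]
# Two input items: the first covers tokens 0-1, the second covers tokens 2-4.
spans = [(0, 2), (2, 5)]

chunk_embeddings = late_chunk(token_embeddings, spans)
```

The result has one vector per span, which mirrors the statement that the returned list matches the size of the input list.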
@@ -105,7 +109,8 @@ indexing_pipeline.add_component(
         api_key=Secret.from_token("<your-api-key>"),
         model="jina-embeddings-v3",
         dimensions=1024,
-        task="retrieval.passage"
+        task="retrieval.passage",
+        late_chunking=True,
     )
 )
 indexing_pipeline.add_component("writer", DocumentWriter(document_store=document_store))
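
As a rough illustration of what the embedder arguments in this hunk amount to, the sketch below builds a request body with the same settings. It assumes the arguments map one-to-one onto JSON fields of the same names; the exact body the integration sends is not shown in this commit. `build_embedding_request` is a hypothetical helper, and no network call is made.

```python
import json
from typing import Any, Dict, List


def build_embedding_request(texts: List[str], late_chunking: bool = True) -> Dict[str, Any]:
    """Assemble a request body mirroring the embedder arguments in the diff.

    Field names are assumed to match the keyword arguments shown above.
    """
    return {
        "model": "jina-embeddings-v3",
        "task": "retrieval.passage",
        "dimensions": 1024,
        "late_chunking": late_chunking,
        # With late chunking enabled, the items are embedded as one concatenated
        # text, and one embedding per item is returned.
        "input": texts,
    }


texts = [
    "Berlin is the capital of Germany.",
    "The city lies on the river Spree.",
]
payload = build_embedding_request(texts)
body = json.dumps(payload)  # serialized form, as it would be sent in a POST body
```

Because the second sentence only makes sense in light of the first, this is the kind of input where contextual chunk embeddings are expected to help.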