Commit

Fix links in GraphVectorStore pydoc
cbornet committed Sep 9, 2024
1 parent 262e19b commit b0af6ba
Showing 2 changed files with 45 additions and 51 deletions.
15 changes: 8 additions & 7 deletions docs/api_reference/scripts/custom_formatter.py
@@ -17,15 +17,16 @@ def process_toc_h3_elements(html_content: str) -> str:

     # Process each element
     for element in toc_h3_elements:
-        element = element.a.code.span
-        # Get the text content of the element
-        content = element.get_text()
+        if element.a.code:
+            element = element.a.code.span
+            # Get the text content of the element
+            content = element.get_text()

-        # Apply the regex substitution
-        modified_content = content.split(".")[-1]
+            # Apply the regex substitution
+            modified_content = content.split(".")[-1]

-        # Update the element's content
-        element.string = modified_content
+            # Update the element's content
+            element.string = modified_content

     # Return the modified HTML
     return str(soup)
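
The change above skips table-of-contents entries that have no `<code>` child, presumably because the new section headings in the docstring produce plain-text entries. A minimal sketch of the same guard follows. It uses the standard library's `xml.etree.ElementTree` as a stand-in for BeautifulSoup, and the `<li><a><code><span>` layout of the sample is an assumption for illustration, not the markup the real script selects.

```python
import xml.etree.ElementTree as ET


def shorten_toc_entries(fragment: str) -> str:
    """Shorten dotted names in <a><code><span> entries; leave other entries untouched."""
    root = ET.fromstring(fragment)
    for item in root.iter("li"):
        span = item.find("a/code/span")
        if span is None:
            # No <code> wrapper (for example a plain section heading): skip it
            # instead of failing on a missing child.
            continue
        # Keep only the last dotted segment, as in content.split(".")[-1].
        span.text = (span.text or "").split(".")[-1]
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample: one API entry and one plain heading entry.
sample = (
    "<ul>"
    "<li><a><code><span>pkg.module.ClassName</span></code></a></li>"
    "<li><a>Get started</a></li>"
    "</ul>"
)
result = shorten_toc_entries(sample)
print(result)
```

Without the `None` check, the second entry would raise an error, which mirrors what the unguarded `element.a.code.span` access would do when `element.a.code` is missing.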
81 changes: 37 additions & 44 deletions libs/community/langchain_community/graph_vectorstores/__init__.py
@@ -1,6 +1,9 @@
-"""**Graph Vector Store**
+""".. title:: Graph Vector Store
-Sometimes embedding models don’t capture all the important relationships between
+Graph Vector Store
+==================
+Sometimes embedding models don't capture all the important relationships between
 documents.
 Graph Vector Stores are an extension to both vector stores and retrievers that allow
 documents to be explicitly connected to each other.
@@ -13,25 +16,29 @@
 For example, a paragraph of text may be linked to URLs based on the anchor tags in
 it's content and linked from the URL(s) it is published at.
-Link extractors can be used to extract links from documents.
-Example:
+`Link extractors <langchain_community.graph_vectorstores.extractors.link_extractor.LinkExtractor>`
+can be used to extract links from documents.
-.. code-block:: python
+Example::
     graph_vector_store = CassandraGraphVectorStore()
     link_extractor = HtmlLinkExtractor()
     links = link_extractor.extract_one(HtmlInput(document.page_content, "http://mysite"))
     add_links(document, links)
     graph_vector_store.add_document(document)
-***********
-Get started
-***********
+.. seealso::
-We chunk the State of the Union text and split it into documents.
+    - :class:`How to use a graph vector store as a retriever <langchain_core.graph_vectorstores.base.GraphVectorStoreRetriever>`
+    - :class:`How to create links between documents <langchain_core.graph_vectorstores.links.Link>`
+    - :class:`How to link Documents on hyperlinks in HTML <langchain_community.graph_vectorstores.extractors.html_link_extractor.HtmlLinkExtractor>`
+    - :class:`How to link Documents on common keywords (using KeyBERT) <langchain_community.graph_vectorstores.extractors.keybert_link_extractor.KeybertLinkExtractor>`
+    - :class:`How to link Documents on common named entities (using GliNER) <langchain_community.graph_vectorstores.extractors.gliner_link_extractor.GLiNERLinkExtractor>`
-.. code-block:: python
+Get started
+-----------
+We chunk the State of the Union text and split it into documents::
     from langchain_community.document_loaders import TextLoader
     from langchain_text_splitters import CharacterTextSplitter
@@ -41,14 +48,12 @@
     documents = text_splitter.split_documents(raw_documents)
 Links can be added to documents manually but it's easier to use a
-:class:`~langchain_community.graph_vectorstores.extractors.LinkExtractor`.
+:class:`~langchain_community.graph_vectorstores.extractors.link_extractor.LinkExtractor`.
 Several common link extractors are available and you can build your own.
 For this guide, we'll use the
-:class:`~langchain_community.graph_vectorstores.extractors.KeybertLinkExtractor`
+:class:`~langchain_community.graph_vectorstores.extractors.keybert_link_extractor.KeybertLinkExtractor`
 which uses the KeyBERT model to tag documents with keywords and uses these keywords to
-create links between documents.
-.. code-block:: python
+create links between documents::
     from langchain_community.graph_vectorstores.extractors import KeybertLinkExtractor
     from langchain_community.graph_vectorstores.links import add_links
@@ -58,15 +63,14 @@
     for doc in documents:
         add_links(doc, extractor.extract_one(doc))
-***********************************************
 Create the graph vector store and add documents
-***********************************************
+-----------------------------------------------
 We'll use an Apache Cassandra or Astra DB database as an example.
-We create a :class:`~langchain_community.graph_vectorstores.CassandraGraphVectorStore`
-from the documents and an :class:`~langchain_openai.OpenAIEmbeddings` model.
-.. code-block:: python
+We create a
+:class:`~langchain_community.graph_vectorstores.cassandra.CassandraGraphVectorStore`
+from the documents and an :class:`~langchain_openai.embeddings.base.OpenAIEmbeddings`
+model::
     import cassio
     from langchain_community.graph_vectorstores import CassandraGraphVectorStore
@@ -80,45 +84,37 @@
         documents=documents,
     )
-*****************
 Similarity search
-*****************
+-----------------
 If we don't traverse the graph, a graph vector store behaves like a regular vector
 store.
 So all methods available in a vector store are also available in a graph vector store.
-The :meth:`~langchain_community.graph_vectorstores.base.GraphVectorStore.similarity_search`
+The :meth:`~langchain_core.graph_vectorstores.base.GraphVectorStore.similarity_search`
 method returns documents similar to a query without considering
-the links between documents.
-.. code-block:: python
+the links between documents::
     docs = store.similarity_search(
         "What did the president say about Ketanji Brown Jackson?"
     )
-****************
 Traversal search
-****************
+----------------
-The :meth:`~langchain_community.graph_vectorstores.base.GraphVectorStore.traversal_search`
+The :meth:`~langchain_core.graph_vectorstores.base.GraphVectorStore.traversal_search`
 method returns documents similar to a query considering the links
 between documents. It first does a similarity search and then traverses the graph to
-find linked documents.
-.. code-block:: python
+find linked documents::
     docs = list(
         store.traversal_search("What did the president say about Ketanji Brown Jackson?")
     )
-*************
 Async methods
-*************
-The graph vector store has async versions of the methods prefixed with ``a``.
+-------------
-.. code-block:: python
+The graph vector store has async versions of the methods prefixed with ``a``::
     docs = [
         doc
@@ -127,15 +123,12 @@
         )
     ]
-****************************
 Graph vector store retriever
-****************************
+----------------------------
 The graph vector store can be converted to a retriever.
 It is similar to the vector store retriever but it also has traversal search methods
-such as ``traversal`` and ``mmr_traversal``.
-.. code-block:: python
+such as ``traversal`` and ``mmr_traversal``::
     retriever = store.as_retriever(search_type="mmr_traversal")
     docs = retriever.invoke("What did the president say about Ketanji Brown Jackson?")
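
Most of the cross-reference changes in this file replace a re-exported path (for example `extractors.LinkExtractor`) with the path of the module that defines the class (`extractors.link_extractor.LinkExtractor`). Assuming the API reference indexes each object under its defining module, which the diff suggests but does not state, such a target can be derived from the object itself. The sketch below uses the standard library's `json.JSONDecoder` as a stand-in for the LangChain classes, and `canonical_path` is a hypothetical helper, not part of any library.

```python
import json


def canonical_path(obj: object) -> str:
    """Return the defining module of obj joined with its qualified name."""
    return f"{obj.__module__}.{obj.__qualname__}"


# json.JSONDecoder is re-exported by the json package but defined in json.decoder,
# much like a class re-exported from a package __init__.
path = canonical_path(json.JSONDecoder)
print(path)  # json.decoder.JSONDecoder
```

A check like this only reports where a class is defined. Whether a given Sphinx build resolves the short or the long form still depends on how the reference pages are generated.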