Fix falkordb retrievers and memory

SynaLinks · Sep 19, 2024 · 00b619b
1 parent 9b7a2e6
Showing 56 changed files with 1,007 additions and 343 deletions.
diff --git a/docs/Core API/Data Types.md b/docs/Core API/Data Types.md
diff --git a/...ntegration/Local/Local Document Memory.md → docs/Core API/Data Types/Agent Step.md b/...ntegration/Local/Local Document Memory.md → docs/Core API/Data Types/Agent Step.md
diff --git a/docs/Core API/Data Types/Document.md b/docs/Core API/Data Types/Document.md
@@ -0,0 +1,65 @@
+# Document
+
+Documents are the atomic data used in HybridAGI's Document Memory. They represent textual data and its chunks in the system, which allows the system to implement vector-only [Retrieval Augmented Generation](https://en.wikipedia.org/wiki/Retrieval-augmented_generation) systems.
+
+`Document`: Represents unstructured textual data to be processed or saved into memory
+
+`DocumentList`: A list of documents to be processed or saved into memory
+
+## Definition
+
+```python
+
+class Document(BaseModel):
+    id: Union[UUID, str] = Field(description="Unique identifier for the document", default_factory=uuid4)
+    text: str = Field(description="The actual text content of the document")
+    parent_id: Optional[Union[UUID, str]] = Field(description="Identifier for the parent document", default=None)
+    vector: Optional[List[float]] = Field(description="Vector representation of the document", default=None)
+    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the document", default={})
+
+    def to_dict(self):
+        if self.metadata:
+            return {"text": self.text, "metadata": self.metadata}
+        else:
+            return {"text": self.text}
+
+class DocumentList(BaseModel, dspy.Prediction):
+    docs: Optional[List[Document]] = Field(description="List of documents", default=[])
+
+    def __init__(self, **kwargs):
+        BaseModel.__init__(self, **kwargs)
+        dspy.Prediction.__init__(self, **kwargs)
+
+    def to_dict(self):
+        return {"documents": [d.to_dict() for d in self.docs]}
+
+```
+
+## Usage
+
+```python
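+
+# Import path assumed from the package layout; adjust if your version differs.
+from hybridagi.core.datatypes import Document, DocumentList
+
+input_data = [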
+    {
+        "title": "The Catcher in the Rye",
+        "content": "The Catcher in the Rye is a novel by J. D. Salinger, partially published in serial form in 1945–1946 and as a novel in 1951. It is widely considered one of the greatest American novels of the 20th century. The novel's protagonist, Holden Caulfield, has become an icon for teenage rebellion and angst. The novel also deals with complex issues of innocence, identity, belonging, loss, and connection."
+    },
+    {
+        "title": "To Kill a Mockingbird",
+        "content": "To Kill a Mockingbird is a novel by Harper Lee published in 1960. It was immediately successful, winning the Pulitzer Prize, and has become a classic of modern American literature. The plot and characters are loosely based on the author's observations of her family and neighbors, as well as on an event that occurred near her hometown in 1936, when she was 10 years old. The novel is renowned for its sensitivity and depth in addressing racial injustice, class, gender roles, and destruction of innocence."
+    }
+]
+
+document_list = DocumentList()
+
+for data in input_data:
+    document_list.docs.append(
+        Document(
+            text=data["content"],
+            metadata={"title": data["title"]},
+        )
+    )
+
+```
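
To illustrate what `to_dict` yields for a list like the one built above, here is a minimal stdlib-only sketch. `DocumentSketch` and `DocumentListSketch` are hypothetical stand-ins that mirror only the `text`, `id`, `metadata`, and `docs` fields and the `to_dict` logic from the definition; they are not the HybridAGI classes, which rely on pydantic and dspy.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List
from uuid import UUID, uuid4


@dataclass
class DocumentSketch:
    """Hypothetical stand-in mirroring Document's fields and to_dict."""
    text: str
    id: UUID = field(default_factory=uuid4)
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Metadata is only emitted when it is non-empty, as in the definition.
        if self.metadata:
            return {"text": self.text, "metadata": self.metadata}
        return {"text": self.text}


@dataclass
class DocumentListSketch:
    """Hypothetical stand-in mirroring DocumentList's docs field and to_dict."""
    docs: List[DocumentSketch] = field(default_factory=list)

    def to_dict(self) -> Dict[str, Any]:
        return {"documents": [d.to_dict() for d in self.docs]}


sketch_list = DocumentListSketch()
sketch_list.docs.append(
    DocumentSketch(
        text="A novel by Harper Lee.",
        metadata={"title": "To Kill a Mockingbird"},
    )
)
sketch_list.docs.append(DocumentSketch(text="No metadata here."))

result = sketch_list.to_dict()
print(result)
```

In this sketch the `id` and any vector are left out of the dictionary form, so `to_dict` is best read as a compact view of the text and metadata rather than a full serialization.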
diff --git a/docs/Core API/Data Types/Fact.md b/docs/Core API/Data Types/Fact.md
@@ -0,0 +1,125 @@
+# Fact
+
+Facts are the atomic data of a [Knowledge Graph](https://en.wikipedia.org/wiki/Knowledge_graph). Each fact represents a relation between two entities (a subject and an object). Facts are the basis of knowledge-based systems and make it possible to represent precise, formal knowledge. With them you can implement [Knowledge Graph based Retrieval Augmented Generation]().
+
+`Entity`: Represents an entity such as a person, object, place or document to be processed or saved into memory
+
+`Fact`: Represents a first-order predicate to be processed or saved into the `FactMemory`
+
+`EntityList`: A list of entities to be processed or saved into memory
+
+`FactList`: A list of facts to be processed or saved into memory
+
+## Definition
+
+```python
+
+class Entity(BaseModel):
+    id: Union[UUID, str] = Field(description="Unique identifier for the entity", default_factory=uuid4)
+    label: str = Field(description="Label or category of the entity")
+    name: str = Field(description="Name or title of the entity")
+    description: Optional[str] = Field(description="Description of the entity", default=None)
+    vector: Optional[List[float]] = Field(description="Vector representation of the entity", default=None)
+    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the entity", default={})
+
+    def to_dict(self):
+        if self.metadata:
+            if self.description is not None:
+                return {"name": self.name, "label": self.label, "description": self.description, "metadata": self.metadata}
+            else:
+                return {"name": self.name, "label": self.label, "metadata": self.metadata}
+        else:
+            if self.description is not None:
+                return {"name": self.name, "label": self.label, "description": self.description}
+            else:
+                return {"name": self.name, "label": self.label}
+
+class EntityList(BaseModel, dspy.Prediction):
+    entities: List[Entity] = Field(description="List of entities", default=[])
+
+    def __init__(self, **kwargs):
+        BaseModel.__init__(self, **kwargs)
+        dspy.Prediction.__init__(self, **kwargs)
+
+    def to_dict(self):
+        return {"entities": [e.to_dict() for e in self.entities]}
+
+class Relationship(BaseModel):
+    id: Union[UUID, str] = Field(description="Unique identifier for the relation", default_factory=uuid4)
+    name: str = Field(description="Relationship name")
+    vector: Optional[List[float]] = Field(description="Vector representation of the relationship", default=None)
+    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the relationship", default={})
+
+    def to_dict(self):
+        if self.metadata:
+            return {"name": self.name, "metadata": self.metadata}
+        else:
+            return {"name": self.name}
+
+class Fact(BaseModel):
+    id: Union[UUID, str] = Field(description="Unique identifier for the fact", default_factory=uuid4)
+    subj: Entity = Field(description="Entity that is the subject of the fact", default=None)
+    rel: Relationship = Field(description="Relation between the subject and object entities", default=None)
+    obj: Entity = Field(description="Entity that is the object of the fact", default=None)
+    vector: Optional[List[float]] = Field(description="Vector representation of the fact", default=None)
+    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the fact", default={})
+
+    def to_cypher(self) -> str:
+        if self.subj.description is not None:
+            subj = "(:"+self.subj.label+" {name:\""+self.subj.name+"\", description:\""+self.subj.description+"\"})"
+        else:
+            subj = "(:"+self.subj.label+" {name:\""+self.subj.name+"\"})"
+        if self.obj.description is not None:
+            obj = "(:"+self.obj.label+" {name:\""+self.obj.name+"\", description:\""+self.obj.description+"\"})"
+        else:
+            obj = "(:"+self.obj.label+" {name:\""+self.obj.name+"\"})"
+        return subj+"-[:"+self.rel.name+"]->"+obj
+
+    def from_cypher(self, cypher_fact:str, metadata: Dict[str, Any] = {}) -> "Fact":
+        match = re.match(CYPHER_FACT_REGEX, cypher_fact)
+        if match:
+            self.subj = Entity(label=match.group(1), name=match.group(2))
+            self.rel = Relationship(name=match.group(3))
+            self.obj = Entity(label=match.group(4), name=match.group(5))
+            self.metadata = metadata
+            return self
+        else:
+            raise ValueError("Invalid Cypher fact provided")
+
+    def to_dict(self):
+        if self.metadata:
+            return {"fact": self.to_cypher(), "metadata": self.metadata}
+        else:
+            return {"fact": self.to_cypher()}
+
+class FactList(BaseModel, dspy.Prediction):
+    facts: List[Fact] = Field(description="List of facts", default=[])
+
+    def __init__(self, **kwargs):
+        BaseModel.__init__(self, **kwargs)
+        dspy.Prediction.__init__(self, **kwargs)
+
+    def to_cypher(self) -> str:
+        return ",\n".join([f.to_cypher() for f in self.facts])
+
+    def from_cypher(self, cypher_facts: str, metadata: Dict[str, Any] = {}):
+        triplets = re.findall(CYPHER_FACT_REGEX, cypher_facts)
+        for triplet in triplets:
+            subject_label, subject_name, predicate, object_label, object_name = triplet
+            self.facts.append(Fact(
+                subj = Entity(name=subject_name, label=subject_label),
+                rel = Relationship(name=predicate),
+                obj = Entity(name=object_name, label=object_label),
+                metadata = metadata,
+            ))
+        return self
+
+    def to_dict(self):
+        return {"facts": [f.to_dict() for f in self.facts]}
+
+```
+
+## Usage
+
+```
+```
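
Since the usage block above is empty, the following stdlib-only sketch shows the Cypher string format that `to_cypher` builds and a round trip back through a regular expression. `EntitySketch`, `FactSketch`, and `facts_from_cypher` are hypothetical stand-ins, and `SKETCH_FACT_REGEX` is an assumed pattern: the real `CYPHER_FACT_REGEX` is not shown in this diff and may differ.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical pattern: the real CYPHER_FACT_REGEX is not shown in this diff.
# It captures (subject label, subject name, relation, object label, object name).
SKETCH_FACT_REGEX = (
    r'\(:(\w+)\s*\{name:"(.*?)"\}\)\s*-\[:(\w+)\]->\s*\(:(\w+)\s*\{name:"(.*?)"\}\)'
)


@dataclass
class EntitySketch:
    """Hypothetical stand-in for Entity (label, name, optional description)."""
    label: str
    name: str
    description: Optional[str] = None

    def to_cypher(self) -> str:
        # Same node format as the one built in Fact.to_cypher above.
        if self.description is not None:
            return f'(:{self.label} {{name:"{self.name}", description:"{self.description}"}})'
        return f'(:{self.label} {{name:"{self.name}"}})'


@dataclass
class FactSketch:
    """Hypothetical stand-in for Fact, with the relation reduced to its name."""
    subj: EntitySketch
    rel: str
    obj: EntitySketch

    def to_cypher(self) -> str:
        return f"{self.subj.to_cypher()}-[:{self.rel}]->{self.obj.to_cypher()}"


def facts_from_cypher(cypher_facts: str) -> List[FactSketch]:
    # Mirrors FactList.from_cypher: one fact per matched triplet.
    return [
        FactSketch(EntitySketch(s_label, s_name), rel, EntitySketch(o_label, o_name))
        for s_label, s_name, rel, o_label, o_name in re.findall(SKETCH_FACT_REGEX, cypher_facts)
    ]


fact = FactSketch(
    EntitySketch("Person", "Harper Lee"),
    "WROTE",
    EntitySketch("Book", "To Kill a Mockingbird"),
)
cypher = fact.to_cypher()
parsed = facts_from_cypher(cypher)
print(cypher)
```

As in the definition, values are concatenated into the string without escaping, so a name containing a double quote would produce a string that this assumed pattern cannot parse back.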
diff --git a/docs/Core API/Graph Program.md → docs/Core API/Data Types/Graph Program.md b/docs/Core API/Graph Program.md → docs/Core API/Data Types/Graph Program.md
@@ -1,27 +1,23 @@
+# Graph Program
+
 Graph Programs are a special data type representing a workflow of actions and decisions with calls to other programs. They are used by our own custom Agent, the `GraphProgramInterpreter`. To help you build them, we provide two ways of doing it: using Python or Cypher.
 
-The two ways are equivalent and allows you to choose the one you prefer.
+The two ways are equivalent, so you can choose the one you prefer. However, we recommend the Pythonic way to avoid syntax errors; the resulting programs can then be saved in Cypher format for later use.
 
-### Python Usage:
+### Python Usage
 
 ```python
 import hybridagi.core.graph_program as gp
 
 main = gp.GraphProgram(
-	id = "main",
-	desc = "The main program",
+	name = "main",
+	description = "The main program",
 )
 
 main.add("answer", gp.Action(
-	tool = "Speak"
-	purpose = ""
-	prompt = \
-"""
-Please answer to the following question: 
-{{objective}}
-"""
-	inputs=["objective"],
-	ouput="answer",
+	tool = "Speak",
+	purpose = "Answer the Objective's question",
+	prompt = "Please answer to the Objective's question",
 ))
 
 main.connect("start", "answer")

diff --git a/...PI/Integration/Local/Local Fact Memory.md → docs/Core API/Data Types/Query.md b/...PI/Integration/Local/Local Fact Memory.md → docs/Core API/Data Types/Query.md
diff --git a/docs/Core API/Data Types/Session.md b/docs/Core API/Data Types/Session.md
@@ -0,0 +1,28 @@
+# Session
+
+`UserProfile`: Represents the user profile, used to personalize the interaction and to simulate the user.
+
+```python
+
+class UserProfile(BaseModel):
+	id: str = Field(description="Unique identifier for the user", default_factory=lambda: str(uuid4()))
+	name: str = Field(description="The user name", default="Unknown")
+	profile: str = Field(description="The user profile", default="An average User")
+
+class RoleType(str, Enum):
+	AI = "AI"
+	User = "User"
+
+class Message(BaseModel):
+	role: RoleType
+	message: str
+
+class ChatHistory(BaseModel):
+	msgs: List[Message] = Field(description="List of messages", default=[])
+
+class InteractionSession(BaseModel):
+	id: str = Field(description="Unique identifier for the interaction session", default_factory=lambda: str(uuid4()))
+	user_profile: UserProfile = Field(description="The user profile")
+	chat_history: ChatHistory = Field(description="The chat history")
+
+```
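
As a rough illustration of how the session types above fit together, here is a stdlib-only sketch. `MessageSketch` and `SessionSketch` are hypothetical dataclass stand-ins (the real classes are pydantic models), and flattening the chat history into a `msgs` list directly on the session is a simplification made for brevity. `RoleType` is reproduced as defined above.

```python
import json
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import List
from uuid import uuid4


class RoleType(str, Enum):
    AI = "AI"
    User = "User"


@dataclass
class MessageSketch:
    """Hypothetical stand-in for Message."""
    role: RoleType
    message: str


@dataclass
class SessionSketch:
    """Hypothetical stand-in for InteractionSession with an inlined history."""
    # The id is generated as a string so it matches an `id: str` annotation.
    id: str = field(default_factory=lambda: str(uuid4()))
    msgs: List[MessageSketch] = field(default_factory=list)


session = SessionSketch()
session.msgs.append(
    MessageSketch(role=RoleType.User, message="Who wrote To Kill a Mockingbird?")
)
session.msgs.append(MessageSketch(role=RoleType.AI, message="Harper Lee."))

# Because RoleType mixes in str, its members serialize as plain JSON strings.
payload = json.dumps(asdict(session))
roles = [m["role"] for m in json.loads(payload)["msgs"]]
print(roles)
```

Mixing `str` into the enum is what lets role values compare equal to plain strings and pass through `json.dumps` without a custom encoder.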