Commit 00b619b (1 parent: 9b7a2e6). Showing 56 changed files with 1,007 additions and 343 deletions.
This file was deleted.
File renamed without changes.
@@ -0,0 +1,65 @@
# Document

Documents are the atomic data units of HybridAGI's Document Memory. They represent textual data and its chunks, which allows the system to implement vector-only [Retrieval Augmented Generation](https://en.wikipedia.org/wiki/Retrieval-augmented_generation) pipelines.

`Document`: Represents unstructured textual data to be processed or saved into memory

`DocumentList`: A list of documents to be processed or saved into memory

## Definition

```python
from typing import Any, Dict, List, Optional, Union
from uuid import UUID, uuid4

import dspy
from pydantic import BaseModel, Field

class Document(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the document", default_factory=uuid4)
    text: str = Field(description="The actual text content of the document")
    parent_id: Optional[Union[UUID, str]] = Field(description="Identifier for the parent document", default=None)
    vector: Optional[List[float]] = Field(description="Vector representation of the document", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the document", default={})

    def to_dict(self):
        # The metadata is only included when it is non-empty
        if self.metadata:
            return {"text": self.text, "metadata": self.metadata}
        else:
            return {"text": self.text}

class DocumentList(BaseModel, dspy.Prediction):
    docs: Optional[List[Document]] = Field(description="List of documents", default=[])

    def __init__(self, **kwargs):
        # Both base classes are initialized explicitly
        BaseModel.__init__(self, **kwargs)
        dspy.Prediction.__init__(self, **kwargs)

    def to_dict(self):
        return {"documents": [d.to_dict() for d in self.docs]}
```

## Usage

```python
# Document and DocumentList are the classes defined above

input_data = [
    {
        "title": "The Catcher in the Rye",
        "content": "The Catcher in the Rye is a novel by J. D. Salinger, partially published in serial form in 1945–1946 and as a novel in 1951. It is widely considered one of the greatest American novels of the 20th century. The novel's protagonist, Holden Caulfield, has become an icon for teenage rebellion and angst. The novel also deals with complex issues of innocence, identity, belonging, loss, and connection."
    },
    {
        "title": "To Kill a Mockingbird",
        "content": "To Kill a Mockingbird is a novel by Harper Lee published in 1960. It was immediately successful, winning the Pulitzer Prize, and has become a classic of modern American literature. The plot and characters are loosely based on the author's observations of her family and neighbors, as well as on an event that occurred near her hometown in 1936, when she was 10 years old. The novel is renowned for its sensitivity and depth in addressing racial injustice, class, gender roles, and destruction of innocence."
    }
]

document_list = DocumentList()

# Wrap each record in a Document, keeping the title as metadata
for data in input_data:
    document_list.docs.append(
        Document(
            text=data["content"],
            metadata={"title": data["title"]},
        )
    )
```
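
The `parent_id` field is what links a chunk back to the document it was cut from. The sketch below illustrates that relationship using only the standard library. `DocSketch` is a hypothetical dataclass that mirrors some fields of `Document` (it is not the real pydantic class), and `split_into_chunks` is a hypothetical helper that uses naive fixed-size word windows; that chunking strategy is an assumption and not necessarily what HybridAGI does.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
from uuid import uuid4


@dataclass
class DocSketch:
    """Hypothetical stand-in mirroring some fields of Document (not the real class)."""
    text: str
    id: str = field(default_factory=lambda: str(uuid4()))
    parent_id: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        # Same rule as Document.to_dict: metadata is only included when non-empty
        if self.metadata:
            return {"text": self.text, "metadata": self.metadata}
        return {"text": self.text}


def split_into_chunks(doc: DocSketch, words_per_chunk: int) -> List[DocSketch]:
    """Hypothetical helper: cut a document into fixed-size word windows."""
    words = doc.text.split()
    chunks = []
    for start in range(0, len(words), words_per_chunk):
        chunks.append(
            DocSketch(
                text=" ".join(words[start:start + words_per_chunk]),
                parent_id=doc.id,  # link each chunk back to its source document
                metadata=dict(doc.metadata),  # copy so chunks do not share one dict
            )
        )
    return chunks


parent = DocSketch(
    text="To Kill a Mockingbird is a novel by Harper Lee published in 1960.",
    metadata={"title": "To Kill a Mockingbird"},
)
chunks = split_into_chunks(parent, words_per_chunk=5)
```

Each element of `chunks` carries the parent's `id` in `parent_id`, so a retrieved chunk can be traced back to the full document.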
@@ -0,0 +1,125 @@
# Fact

Facts are the atomic data units of a [Knowledge Graph](https://en.wikipedia.org/wiki/Knowledge_graph). Each fact represents a relation between two entities (a subject and an object). Facts are the basis of knowledge-based systems and allow precise, formal knowledge to be represented. With them you can implement [Knowledge Graph based Retrieval Augmented Generation]().

`Entity`: Represents an entity such as a person, object, place or document to be processed or saved into memory

`Fact`: Represents a first-order predicate to be processed or saved into the `FactMemory`

`EntityList`: A list of entities to be processed or saved into memory

`FactList`: A list of facts to be processed or saved into memory

## Definition

```python
import re
from typing import Any, Dict, List, Optional, Union
from uuid import UUID, uuid4

import dspy
from pydantic import BaseModel, Field

# CYPHER_FACT_REGEX is defined elsewhere in the module (not shown in this excerpt)

class Entity(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the entity", default_factory=uuid4)
    label: str = Field(description="Label or category of the entity")
    name: str = Field(description="Name or title of the entity")
    description: Optional[str] = Field(description="Description of the entity", default=None)
    vector: Optional[List[float]] = Field(description="Vector representation of the document", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the document", default={})

    def to_dict(self):
        # Description and metadata are only included when they are set
        if self.metadata:
            if self.description is not None:
                return {"name": self.name, "label": self.label, "description": self.description, "metadata": self.metadata}
            else:
                return {"name": self.name, "label": self.label, "metadata": self.metadata}
        else:
            if self.description is not None:
                return {"name": self.name, "label": self.label, "description": self.description}
            else:
                return {"name": self.name, "label": self.label}

class EntityList(BaseModel, dspy.Prediction):
    entities: List[Entity] = Field(description="List of entities", default=[])

    def __init__(self, **kwargs):
        BaseModel.__init__(self, **kwargs)
        dspy.Prediction.__init__(self, **kwargs)

    def to_dict(self):
        return {"entities": [e.to_dict() for e in self.entities]}

class Relationship(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the relation", default_factory=uuid4)
    name: str = Field(description="Relationship name")
    vector: Optional[List[float]] = Field(description="Vector representation of the relationship", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the relationship", default={})

    def to_dict(self):
        if self.metadata:
            return {"name": self.name, "metadata": self.metadata}
        else:
            return {"name": self.name}

class Fact(BaseModel):
    id: Union[UUID, str] = Field(description="Unique identifier for the fact", default_factory=uuid4)
    subj: Entity = Field(description="Entity that is the subject of the fact", default=None)
    rel: Relationship = Field(description="Relation between the subject and object entities", default=None)
    obj: Entity = Field(description="Entity that is the object of the fact", default=None)
    vector: Optional[List[float]] = Field(description="Vector representation of the fact", default=None)
    metadata: Optional[Dict[str, Any]] = Field(description="Additional information about the fact", default={})

    def to_cypher(self) -> str:
        # Serialize the fact as a Cypher pattern: (subject)-[:RELATION]->(object)
        if self.subj.description is not None:
            subj = "(:"+self.subj.label+" {name:\""+self.subj.name+"\", description:\""+self.subj.description+"\"})"
        else:
            subj = "(:"+self.subj.label+" {name:\""+self.subj.name+"\"})"
        if self.obj.description is not None:
            obj = "(:"+self.obj.label+" {name:\""+self.obj.name+"\", description:\""+self.obj.description+"\"})"
        else:
            obj = "(:"+self.obj.label+" {name:\""+self.obj.name+"\"})"
        return subj+"-[:"+self.rel.name+"]->"+obj

    def from_cypher(self, cypher_fact:str, metadata: Dict[str, Any] = {}) -> "Fact":
        # Populate this fact in place from a single Cypher pattern
        match = re.match(CYPHER_FACT_REGEX, cypher_fact)
        if match:
            self.subj = Entity(label=match.group(1), name=match.group(2))
            self.rel = Relationship(name=match.group(3))
            self.obj = Entity(label=match.group(4), name=match.group(5))
            self.metadata = metadata
            return self
        else:
            raise ValueError("Invalid Cypher fact provided")

    def to_dict(self):
        if self.metadata:
            return {"fact": self.to_cypher(), "metadata": self.metadata}
        else:
            return {"fact": self.to_cypher()}

class FactList(BaseModel, dspy.Prediction):
    facts: List[Fact] = Field(description="List of facts", default=[])

    def __init__(self, **kwargs):
        BaseModel.__init__(self, **kwargs)
        dspy.Prediction.__init__(self, **kwargs)

    def to_cypher(self) -> str:
        return ",\n".join([f.to_cypher() for f in self.facts])

    def from_cypher(self, cypher_facts: str, metadata: Dict[str, Any] = {}):
        # Append one Fact per pattern found in the input string
        triplets = re.findall(CYPHER_FACT_REGEX, cypher_facts)
        for triplet in triplets:
            subject_label, subject_name, predicate, object_label, object_name = triplet
            self.facts.append(Fact(
                subj = Entity(name=subject_name, label=subject_label),
                rel = Relationship(name=predicate),
                obj = Entity(name=object_name, label=object_label),
                metadata = metadata,
            ))
        return self

    def to_dict(self):
        return {"facts": [f.to_dict() for f in self.facts]}
```

## Usage

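
As a rough illustration of the Cypher round trip performed by `to_cypher` and `from_cypher`, here is a minimal standard-library sketch. `EntitySketch`, `fact_to_cypher` and `parse_facts` are hypothetical stand-ins, not the real classes. `FACT_PATTERN_SKETCH` is an assumed, simplified pattern that only handles entities without a description; the actual `CYPHER_FACT_REGEX` is not shown in this document and may differ.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumption: a simplified pattern for facts whose entities have no description.
# The real CYPHER_FACT_REGEX used by HybridAGI is not shown in this document.
FACT_PATTERN_SKETCH = (
    r'\(:(\w+)\s*\{name:"(.*?)"\}\)\s*-\[:(\w+)\]->\s*\(:(\w+)\s*\{name:"(.*?)"\}\)'
)


@dataclass
class EntitySketch:
    """Hypothetical stand-in for Entity (not the real pydantic class)."""
    label: str
    name: str
    description: Optional[str] = None


def entity_to_cypher(entity: EntitySketch) -> str:
    # Builds the same node string as the concatenation in Fact.to_cypher
    if entity.description is not None:
        return f'(:{entity.label} {{name:"{entity.name}", description:"{entity.description}"}})'
    return f'(:{entity.label} {{name:"{entity.name}"}})'


def fact_to_cypher(subj: EntitySketch, rel_name: str, obj: EntitySketch) -> str:
    return f"{entity_to_cypher(subj)}-[:{rel_name}]->{entity_to_cypher(obj)}"


def parse_facts(cypher_facts: str) -> List[Tuple[str, str, str, str, str]]:
    # Returns (subject_label, subject_name, predicate, object_label, object_name) tuples
    return re.findall(FACT_PATTERN_SKETCH, cypher_facts)


author = EntitySketch(label="Person", name="Harper Lee")
book = EntitySketch(label="Book", name="To Kill a Mockingbird")
year = EntitySketch(label="Year", name="1960")

# Facts are joined with ",\n", as FactList.to_cypher does
cypher = ",\n".join([
    fact_to_cypher(author, "WROTE", book),
    fact_to_cypher(book, "PUBLISHED_IN", year),
])
triplets = parse_facts(cypher)
```

Each tuple in `triplets` holds the five groups that `FactList.from_cypher` unpacks into a subject entity, a relationship and an object entity.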
docs/Core API/Graph Program.md → docs/Core API/Data Types/Graph Program.md (22 changes: 9 additions & 13 deletions)
File renamed without changes.
@@ -0,0 +1,28 @@
# Session

`UserProfile`: Represents the user profile, used both to personalize the interaction and to simulate the user.

```python
from enum import Enum
from typing import List
from uuid import uuid4

from pydantic import BaseModel, Field

class UserProfile(BaseModel):
    id: str = Field(description="Unique identifier for the user", default_factory=uuid4)
    name: str = Field(description="The user name", default="Unknow")
    profile: str = Field(description="The user profile", default="An average User")

class RoleType(str, Enum):
    AI = "AI"
    User = "User"

class Message(BaseModel):
    role: RoleType
    message: str

class ChatHistory(BaseModel):
    msgs: List[Message] = Field(description="List of messages", default=[])

class InteractionSession(BaseModel):
    id: str = Field(description="Unique identifier for the interaction session", default_factory=uuid4)
    user_profile: UserProfile = Field(description="The user profile")
    chat_history: ChatHistory = Field(description="The chat history")
```
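
To show how these types fit together, the following sketch builds a session and records a short exchange using only the standard library. The `*Sketch` dataclasses are hypothetical stand-ins that mirror the field names above (they are not the real pydantic models), and `render_history` is a hypothetical helper that does not appear in the definitions above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List
from uuid import uuid4


class RoleType(str, Enum):
    AI = "AI"
    User = "User"


@dataclass
class MessageSketch:
    role: RoleType
    message: str


@dataclass
class ChatHistorySketch:
    msgs: List[MessageSketch] = field(default_factory=list)


@dataclass
class UserProfileSketch:
    name: str
    profile: str
    id: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class InteractionSessionSketch:
    user_profile: UserProfileSketch
    chat_history: ChatHistorySketch
    id: str = field(default_factory=lambda: str(uuid4()))


def render_history(history: ChatHistorySketch) -> str:
    """Hypothetical helper: format the history as one 'Role: message' line per turn."""
    return "\n".join(f"{m.role.value}: {m.message}" for m in history.msgs)


session = InteractionSessionSketch(
    user_profile=UserProfileSketch(name="Alice", profile="A literature student"),
    chat_history=ChatHistorySketch(),
)
# Record one user turn and one AI turn
session.chat_history.msgs.append(MessageSketch(RoleType.User, "Who wrote To Kill a Mockingbird?"))
session.chat_history.msgs.append(MessageSketch(RoleType.AI, "Harper Lee."))
transcript = render_history(session.chat_history)
```

Because `RoleType` subclasses `str`, its members compare equal to the plain strings `"AI"` and `"User"`, which keeps serialized messages readable.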