Add support for different knowledge retrieval methods #2
Labels: enhancement, good first issue, help wanted
This is for the built-in `retrieval` tool.

Currently, the knowledge retrieval implementation is very naive: it simply returns the full contents of every attached file (source).
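For reference, the naive behavior described above amounts to something like the following minimal sketch. The `AttachedFile` type and the `naiveRetrieval` name are hypothetical and only for illustration; they are not taken from the actual codebase.

```typescript
// Hypothetical shape of an attached file; not the project's real type.
interface AttachedFile {
  id: string
  filename: string
  content: string
}

// Naive retrieval: the query is ignored and the full contents of every
// attached file are concatenated and returned.
function naiveRetrieval(_query: string, files: AttachedFile[]): string {
  return files
    .map((file) => `File: ${file.filename}\n${file.content}`)
    .join('\n\n')
}
```

The cost of this approach grows with the total size of the attached files rather than with the relevance of their contents to the query.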
The current implementation also only supports text file types like `text/plain` and markdown, as no preprocessing or conversion is done at the moment.

It shouldn't be too hard to add support for more legitimate knowledge retrieval approaches, which would require:
- `processForFileAssistant` - File ingestion pre-processing for files marked with `purpose: 'assistants'`
  - conversion to `markdown` (this is probably the hardest step to do well across all of the most common file types)
  - tracking which `file_id` each chunk comes from for filtering purposes
- `retrievalTool` - Performs knowledge retrieval for a given `query` on a set of `file_ids` for RAG.
  - `query`
  - `file_ids`
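The two pieces above could be sketched roughly as follows. This is only an illustration under several assumptions: `ingestFile`, `retrieve`, `chunkText`, and the in-memory `store` are hypothetical names, chunking is a plain split on blank lines, and `embed` is a toy hashed bag-of-words stand-in for a real embedding model. The part that maps directly to the list is that each chunk records the file it came from, so retrieval can filter by `file_ids`.

```typescript
// All names here are hypothetical and for illustration only.
interface Chunk {
  fileId: string // which file this chunk came from, used for filtering
  text: string
  embedding: number[]
}

const DIMS = 64

// Toy stand-in for a real embedding model: hashed bag-of-words, L2-normalized.
function embed(text: string): number[] {
  const vec: number[] = []
  for (let i = 0; i < DIMS; i++) vec.push(0)
  for (const token of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    let hash = 0
    for (let i = 0; i < token.length; i++) {
      hash = (hash * 31 + token.charCodeAt(i)) % DIMS
    }
    vec[hash] = (vec[hash] ?? 0) + 1
  }
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0)) || 1
  return vec.map((x) => x / norm)
}

// Dot product; equals cosine similarity for normalized vectors.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0)
}

// Simplest possible chunking: split on blank lines.
function chunkText(text: string): string[] {
  return text
    .split(/\n\s*\n/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
}

// In-memory stand-in for an external vector store.
const store: Chunk[] = []

// Ingestion: chunk the (already text) file, embed each chunk, remember its file.
function ingestFile(fileId: string, text: string): void {
  for (const part of chunkText(text)) {
    store.push({ fileId, text: part, embedding: embed(part) })
  }
}

// Retrieval: embed the query, keep only chunks from the given files,
// and return the topK most similar chunks.
function retrieve(query: string, fileIds: string[], topK = 3): Chunk[] {
  const queryEmbedding = embed(query)
  return store
    .filter((chunk) => fileIds.indexOf(chunk.fileId) !== -1)
    .map((chunk) => ({ chunk, score: dot(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((result) => result.chunk)
}
```

A call such as `retrieve('borrow checker', ['file_b'])` would only ever return chunks ingested under `file_b`, regardless of how well chunks from other files match. File conversion is deliberately left out of the sketch.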
Integrations here with LangChain and/or LlamaIndex would be great for their flexibility, but we could also KISS and roll our own using https://github.com/dexaai/dexter