-
Notifications
You must be signed in to change notification settings - Fork 15.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Harrison/guarded output parser #1804
Conversation
c33874e
to
bfa858b
Compare
@@ -30,10 +30,11 @@ def get_format_instructions(self) -> str: | |||
schema = self.pydantic_object.schema() | |||
|
|||
# Remove extraneous fields. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
noting @hwchase17 : added this back to support more complex types, like an enum.. I tweaked the in-context example as a result, including a positive + negative case. Seems to be working reasonably well. I'd like to add some tests, but doesn't have to be in this diff
langchain/chains/llm.py
Outdated
def _get_final_output( | ||
self, generations: List[Generation], prompt_value: PromptValue | ||
) -> Any: | ||
"""Get the final output from a list of generations for a prompt.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we drop an assertion about shape of generations. It's locally (i.e., from the perspective of a person living inside the function) not obvious why it's okay to index into the list (and ignore any other generations)
|
||
|
||
class FixOutputParser(BaseOutputParser): | ||
"""Wraps a parser and tries to fix parsing errors.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are the any benefits to trying multiple times?
langchain/chains/llm.py
Outdated
response, prompts = await self.agenerate(input_list) | ||
return self.create_outputs(response, prompts) | ||
|
||
def _get_final_output( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about renaming this method and update the doc-string to clarify what "final" means.
The doc-string can be a bit more verbose and indicate that the output will be parsed if a parser has been specified.
At this level of the code, it's surprising to a parser attached on a prompt_value
object. (Developers will probably not know what PromptValue
is either)
@@ -47,7 +47,7 @@ def output_keys(self) -> List[str]: | |||
|
|||
def _call(self, inputs: Dict[str, str]) -> Dict[str, Any]: | |||
docs = self.text_splitter.create_documents([inputs[self.input_key]]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is stylistic -- I like a convention of favoring non abbreviated names (e.g., docs -> documents) -- goal is to reduce number of names that appear in the code base since it removes one degree of freedom from naming.
Details: {error} | ||
Please try again:""" | ||
|
||
NAIVE_RETRY_PROMPT = PromptTemplate.from_template(NAIVE_COMPLETION_RETRY) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about inlining the template instead of using a proxy variable? We don't re-use the proxy variable anywhere right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, but imo it makes it more readable to have it separate
) | ||
|
||
|
||
class RetryOutputParser(BaseOutputParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between this parser and the fix parser above?
What do you think about expanding the doc-string significantly to explain use case and how the retry works? e.g., 2-5 lines of documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup will good, good call
"""Wraps a parser and tries to fix parsing errors.""" | ||
|
||
parser: BaseOutputParser | ||
retry_chain: LLMChain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type system doesn't seem to help us much here -- since it looks like the run
api of the retry_chain
interface is defined by the prompt (i.e., ability to accept prompt
and completion
)
have you thought of a way to surface that type information? I assume the challenge is maintaining things serializable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what you mean, would love to discuss
return self.parser.get_format_instructions() | ||
|
||
|
||
class RetryWithErrorOutputParser(BaseOutputParser): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not combine this parser with the one above and add a variable to control behavior on error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prompt inputs differ... its def possible but youd need to make sure the variable for whether to pass error in is aligned with the prompt which just feels like too many levers to make in sync
return output_parser_dict | ||
|
||
|
||
class OutputParserException(Exception): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 line doc-string to explain what the exception is to be used for and in what cases it's expected to be raised / excepted?
The exception name looks like it could be raised by either logic issues in the parser or due to malformed data. Is this exception meant to catch only the latter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
No description provided.