Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Harrison/guarded output parser #1804

Merged
merged 17 commits into from
Mar 22, 2023
Merged

Conversation

hwchase17
Copy link
Contributor

No description provided.

langchain/chains/llm.py Outdated Show resolved Hide resolved
langchain/schema.py Outdated Show resolved Hide resolved
@jerwelborn jerwelborn force-pushed the harrison/guarded-output-parser branch from c33874e to bfa858b Compare March 20, 2023 19:42
@@ -30,10 +30,11 @@ def get_format_instructions(self) -> str:
schema = self.pydantic_object.schema()

# Remove extraneous fields.
Copy link
Contributor

@jerwelborn jerwelborn Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

noting @hwchase17 : added this back to support more complex types, like an enum.. I tweaked the in-context example as a result, including a positive + negative case. Seems to be working reasonably well. I'd like to add some tests, but doesn't have to be in this diff

def _get_final_output(
self, generations: List[Generation], prompt_value: PromptValue
) -> Any:
"""Get the final output from a list of generations for a prompt."""
Copy link
Collaborator

@eyurtsev eyurtsev Mar 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we drop an assertion about shape of generations. It's locally (i.e., from the perspective of a person living inside the function) not obvious why it's okay to index into the list (and ignore any other generations)



class FixOutputParser(BaseOutputParser):
"""Wraps a parser and tries to fix parsing errors."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the any benefits to trying multiple times?

response, prompts = await self.agenerate(input_list)
return self.create_outputs(response, prompts)

def _get_final_output(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about renaming this method and update the doc-string to clarify what "final" means.

The doc-string can be a bit more verbose and indicate that the output will be parsed if a parser has been specified.

At this level of the code, it's surprising to a parser attached on a prompt_value object. (Developers will probably not know what PromptValue is either)

@@ -47,7 +47,7 @@ def output_keys(self) -> List[str]:

def _call(self, inputs: Dict[str, str]) -> Dict[str, Any]:
docs = self.text_splitter.create_documents([inputs[self.input_key]])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is stylistic -- I like a convention of favoring non abbreviated names (e.g., docs -> documents) -- goal is to reduce number of names that appear in the code base since it removes one degree of freedom from naming.

Details: {error}
Please try again:"""

NAIVE_RETRY_PROMPT = PromptTemplate.from_template(NAIVE_COMPLETION_RETRY)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about inlining the template instead of using a proxy variable? We don't re-use the proxy variable anywhere right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, but imo it makes it more readable to have it separate

)


class RetryOutputParser(BaseOutputParser):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between this parser and the fix parser above?

What do you think about expanding the doc-string significantly to explain use case and how the retry works? e.g., 2-5 lines of documentation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup will good, good call

"""Wraps a parser and tries to fix parsing errors."""

parser: BaseOutputParser
retry_chain: LLMChain
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type system doesn't seem to help us much here -- since it looks like the run api of the retry_chain interface is defined by the prompt (i.e., ability to accept prompt and completion)

have you thought of a way to surface that type information? I assume the challenge is maintaining things serializable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what you mean, would love to discuss

return self.parser.get_format_instructions()


class RetryWithErrorOutputParser(BaseOutputParser):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not combine this parser with the one above and add a variable to control behavior on error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prompt inputs differ... its def possible but youd need to make sure the variable for whether to pass error in is aligned with the prompt which just feels like too many levers to make in sync

return output_parser_dict


class OutputParserException(Exception):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 line doc-string to explain what the exception is to be used for and in what cases it's expected to be raised / excepted?

The exception name looks like it could be raised by either logic issues in the parser or due to malformed data. Is this exception meant to catch only the latter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@hwchase17 hwchase17 merged commit ce5d97b into master Mar 22, 2023
@hwchase17 hwchase17 deleted the harrison/guarded-output-parser branch March 22, 2023 05:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants