feat: Add RetrySqlQueryCreatorTool for handling failed SQL query generation #15

Open
wants to merge 4 commits into
base: main

Conversation


@sushantburnawal sushantburnawal commented Jul 2, 2024

Add RetrySqlQueryCreatorTool for handling failed SQL query generation


Summary by Sourcery

This pull request adds a new tool, RetrySqlQueryCreatorTool, to handle failed SQL query generation by retrying the creation process. It also updates the existing SQL query creation workflow to integrate this new tool and enhances the prompt used for retrying SQL queries.

  • New Features:
    • Introduced RetrySqlQueryCreatorTool to handle the re-creation of SQL queries when the initial query generation fails.
  • Enhancements:
    • Updated the SQL query creation process to use RetrySqlQueryCreatorTool for handling errors and retrying query generation.
    • Enhanced the SQL_QUERY_CREATOR_RETRY prompt to provide more detailed instructions for correcting SQL queries.
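
As a rough illustration of the retry flow summarized above (not the PR's actual implementation), the sketch below regenerates a query when execution fails and feeds the failed query and its error message back into the generator. `create_query_with_retry`, `generate_sql`, `run_query`, and `max_retries` are hypothetical names introduced for this example; the executor is assumed to return a string starting with "Error" on failure rather than raising.

```python
from typing import Callable, Optional


def create_query_with_retry(
    question: str,
    generate_sql: Callable[[str, Optional[str], Optional[str]], str],
    run_query: Callable[[str], str],
    max_retries: int = 2,
) -> str:
    """Generate a SQL query, retrying with error feedback when execution fails.

    All names here are hypothetical. `run_query` is assumed to return a
    string starting with "Error" on failure instead of raising.
    """
    # First attempt: no previous query or error to learn from.
    query = generate_sql(question, None, None)
    for _ in range(max_retries):
        result = run_query(query)
        if not result.startswith("Error"):
            return query
        # Hand the failed query and its error back so the next attempt can correct it.
        query = generate_sql(question, query, result)
    # Retries exhausted: return the last candidate without re-validating it.
    return query
```

Bounding the loop with `max_retries` keeps a persistently failing generator from looping forever; whether the final candidate should be validated once more is a design choice left open here.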

sourcery-ai bot commented Jul 2, 2024

Reviewer's Guide by Sourcery

This pull request introduces a new tool, RetrySqlQueryCreatorTool, designed to handle failed SQL query generation. Significant changes include the addition of this new tool, updates to existing methods to prioritize the retry tool, and enhancements to the SQL_QUERY_CREATOR_RETRY template to provide more detailed instructions for correcting SQL queries.

File-Level Changes

Files:
  • libs/community/langchain_community/tools/sql_coder/tool.py
  • libs/langchain/langchain/tools/sqlcoder/prompt.py
Changes: Introduced RetrySqlQueryCreatorTool for handling failed SQL query generation and updated related templates and methods to support this new tool.

@sourcery-ai sourcery-ai bot left a comment

Hey @sushantburnawal - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 7 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good


@@ -14,6 +14,7 @@
from langchain_core.tools import StateTool
import re

ERROR = ""

suggestion: Consider removing the unused ERROR variable.

The variable ERROR is defined but never used in the code. If it's not needed, it would be better to remove it to keep the code clean.

Suggested change
- ERROR = ""
+ # Consider removing the unused ERROR variable.
+ # The variable `ERROR` is defined but never used in the code.
+ # If it's not needed, it would be better to remove it to keep the code clean.

@@ -65,6 +66,7 @@
)
executable_query = executable_query.strip('\"')
executable_query = re.sub('\\n```', '',executable_query)
self.db.run_no_throw(executable_query)

issue: Duplicate call to self.db.run_no_throw(executable_query).

The method self.db.run_no_throw(executable_query) is called twice consecutively. This seems redundant and could be removed.

@@ -75,14 +77,98 @@
raise NotImplementedError("QuerySparkSQLDataBaseTool does not support async")

def _extract_sql_query(self):
-    for value in self.state:
+    for value in reversed(self.state):

question (bug_risk): Reversing the state list might have unintended consequences.

Reversing the state list could lead to unexpected behavior if the order of states is important. Ensure that this change is intentional and won't cause issues.
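
To make the behavioral difference concrete: a reverse scan returns the most recent matching entry, while a forward scan returns the oldest. The sketch below assumes, purely for illustration, that the state is a list of single-key dicts whose key contains the tool name; the real `StateTool` layout is not shown in this diff, and `first_match` / `last_match` are hypothetical helpers.

```python
from typing import Dict, List, Optional

State = List[Dict[str, str]]


def first_match(state: State, marker: str) -> Optional[str]:
    """Forward scan: return the OLDEST value whose key contains `marker`."""
    for entry in state:
        for key, value in entry.items():
            if marker in key:
                return value
    return None


def last_match(state: State, marker: str) -> Optional[str]:
    """Reverse scan: return the NEWEST value whose key contains `marker`."""
    for entry in reversed(state):
        for key, value in entry.items():
            if marker in key:
                return value
    return None


# Hypothetical state after a failed first attempt followed by a regenerated query.
state: State = [
    {"tool='sql_db_query_creator'": "SELEC * FROM t"},
    {"tool='sql_db_query'": "Error: syntax error"},
    {"tool='sql_db_query_creator'": "SELECT * FROM t"},
]
```

With this state, the forward scan would pick up the original failing query and the reverse scan the corrected one, which is presumably the intent of the change in a retry flow.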

)
)
sql_query = sql_query.replace("```","")
sql_query = sql_query.replace("sql","")

issue (bug_risk): Removing 'sql' from the query might cause issues.

The line sql_query = sql_query.replace("sql","") removes all occurrences of 'sql' from the query. This might lead to incorrect SQL queries if 'sql' is part of a table or column name.
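
One possible way to avoid this, sketched under the assumption that the goal is only to drop Markdown code-fence lines from the model output: match whole fence lines (with an optional `sql` language tag) rather than every occurrence of the substring. `strip_code_fences` is a hypothetical helper, not part of the PR.

```python
import re

# Matches a line that is only a fence, optionally followed by a "sql" language tag.
_FENCE_LINE = re.compile(r"^\s*```(?:sql)?\s*$", re.IGNORECASE | re.MULTILINE)


def strip_code_fences(text: str) -> str:
    """Remove Markdown fence lines without touching 'sql' inside the query itself."""
    return _FENCE_LINE.sub("", text).strip()
```

An identifier such as `mysql_users` survives this, whereas `replace("sql", "")` would turn it into `my_users`.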

@@ -1,8 +1,20 @@


SQL_QUERY_CREATOR_RETRY = """
You have failed in the first attempt to generate correct sql query. Please try again to rewrite correct sql query.
"""
Your task is convert an incorrect query resulting from user question to a correct query which is databricks sql compatible.

nitpick (typo): Typo in the prompt template.

The sentence should be 'Your task is to convert an incorrect query resulting from a user question to a correct query which is Databricks SQL compatible.'


sql_query = self._extract_sql_query()
error_message = self._extract_error_message()
if sql_query is None:

suggestion (bug_risk): Consider logging when sql_query is None.

It might be useful to log a message when sql_query is None to help with debugging and understanding why the tool is not meant to be run directly.

Suggested change
- if sql_query is None:
+ if sql_query is None:
+     logging.warning("SQL query is None. This tool is not meant to be run directly.")
+     return "This tool is not meant to be run directly. Start with a SQLQueryCreatorTool"
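
Note that the suggestion relies on `logging` being imported in the module. A minimal standalone sketch of the same guard, using a module-level logger (a common convention, assumed here rather than taken from the PR), might look like this; `guard_missing_query` and `NOT_RUNNABLE_MESSAGE` are hypothetical names.

```python
import logging
from typing import Optional

# Module-level logger; the PR's module may configure logging differently.
logger = logging.getLogger(__name__)

NOT_RUNNABLE_MESSAGE = (
    "This tool is not meant to be run directly. Start with a SQLQueryCreatorTool"
)


def guard_missing_query(sql_query: Optional[str]) -> Optional[str]:
    """Log a warning and return the fallback message when no query was found."""
    if sql_query is None:
        logger.warning("SQL query is None. %s", NOT_RUNNABLE_MESSAGE)
        return NOT_RUNNABLE_MESSAGE
    # A query exists, so there is nothing to report.
    return None
```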

return input_string
elif "tool='sql_db_query_creator'" in key:
return input_string
return None

suggestion: Consider raising an exception instead of returning None.

Returning None might lead to silent failures. Consider raising an exception to make it clear that an error has occurred.

Suggested change
- return None
+ raise ValueError("No valid key found in input string")
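
One caveat worth checking before applying this: if a caller still relies on a `None` check (such as the `if sql_query is None:` guard quoted elsewhere in this review), it would need to catch the exception instead. The sketch below shows both styles side by side; the function names and `CREATOR_MARKERS` are hypothetical, and only the marker strings come from the diff.

```python
from typing import Optional

# Marker strings as they appear in the diff; the tuple itself is illustrative.
CREATOR_MARKERS = (
    "tool='retry_sql_db_query_creator'",
    "tool='sql_db_query_creator'",
)


def find_query_or_raise(key: str, input_string: str) -> str:
    """Return the input for a creator-tool key, or fail loudly."""
    if any(marker in key for marker in CREATOR_MARKERS):
        return input_string
    raise ValueError("No valid key found in input string")


def find_query_or_none(key: str, input_string: str) -> Optional[str]:
    """Adapter for callers that still expect None when nothing matches."""
    try:
        return find_query_or_raise(key, input_string)
    except ValueError:
        return None
```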

Comment on lines +167 to +169
if "tool='sql_db_query'" in key:
if "Error" in input_string:
return input_string

suggestion (code-quality): Merge nested if conditions (merge-nested-ifs)

Suggested change
- if "tool='sql_db_query'" in key:
-     if "Error" in input_string:
-         return input_string
+ if "tool='sql_db_query'" in key and "Error" in input_string:
+     return input_string


Explanation

Too much nesting can make code difficult to understand, and this is especially true in Python, where there are no brackets to help out with the delineation of different nesting levels.

Reading deeply nested code is confusing, since you have to keep track of which conditions relate to which levels. We therefore strive to reduce nesting where possible, and the situation where two if conditions can be combined using `and` is an easy win.

Comment on lines +82 to +84
if "tool='retry_sql_db_query_creator'" in key:
return input_string
elif "tool='sql_db_query_creator'" in key:

suggestion (code-quality): We've found these issues:

Suggested change
- if "tool='retry_sql_db_query_creator'" in key:
-     return input_string
- elif "tool='sql_db_query_creator'" in key:
+ if (
+     "tool='retry_sql_db_query_creator'" in key
+     or "tool='sql_db_query_creator'" in key
+ ):

Comment on lines +158 to +160
if "tool='retry_sql_db_query_creator'" in key:
return input_string
elif "tool='sql_db_query_creator'" in key:

suggestion (code-quality): We've found these issues:

Suggested change
- if "tool='retry_sql_db_query_creator'" in key:
-     return input_string
- elif "tool='sql_db_query_creator'" in key:
+ if (
+     "tool='retry_sql_db_query_creator'" in key
+     or "tool='sql_db_query_creator'" in key
+ ):

arunraja1 previously approved these changes Jul 3, 2024
@alokraj-109 alokraj-109 self-requested a review September 5, 2024 08:28
2 participants