Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/llm custom metric #48

Merged
merged 5 commits into from
Mar 17, 2024
Merged

Feature/llm custom metric #48

merged 5 commits into from
Mar 17, 2024

Conversation

pantonante
Copy link
Contributor

@pantonante pantonante commented Mar 17, 2024

Ellipsis 🚀 This PR description was created by Ellipsis for commit c3e8db5.

Summary:

This PR introduces a new feature for creating custom LLM-based metrics, adds new classes, updates documentation, provides an example script, and includes minor updates.

Key points:

  • Introduced a new feature for creating custom LLM-based metrics
  • Added LLMBasedCustomMetric, ScoringFunctions, and EvaluationExample classes in /continuous_eval/metrics/generation/text/custom.py
  • Updated documentation in /docs/src/content/docs/metrics/Generation/LLM-Based/custom.md
  • Provided example script at /examples/llm_custom_criteria.py
  • Updated _numeric_matcher function in custom.py
  • Minor updates in import statements and documentation links

Generated with ❤️ by ellipsis.dev

@pantonante pantonante requested a review from yisz March 17, 2024 00:29
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested.

  • Reviewed the entire pull request up to 9dd5b99
  • Looked at 276 lines of code in 4 files
  • Took 1 minute and 0 seconds to review
More info
  • Skipped 0 files when reviewing.
  • Skipped posting 1 additional comments because they didn't meet confidence threshold of 50%.
1. /continuous_eval/metrics/generation/text/custom.py:86:
  • Assessed confidence : 66%
  • Grade: 0%
  • Comment:
    The _scoring_function is expected to handle the string output of the LLM. Ensure that the _scoring_function is properly designed to handle this type of input. For example, if the Numeric scoring function is used, it should be able to correctly extract a numeric value from the string output of the LLM.
  • Reasoning:
    The LLMBasedCustomMetric class uses the _scoring_function to parse the output of the LLM and return a score. However, the output of the LLM is a string, and the _scoring_function is expected to handle this string. If the _scoring_function is not properly designed to handle the string output of the LLM, it could lead to errors or incorrect scores. For example, the Numeric scoring function defined in the ScoringFunctions class uses the _numeric_matcher function to extract a numeric value from the input string, but as noted earlier, the _numeric_matcher function does not correctly match numeric values.

Workflow ID: wflow_U3tYJ2SBKDwEg3hG


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. We'll respond in a few minutes. Learn more here.



def _numeric_matcher(input_val, min_val, max_val):
grp = re.search(f"[{min_val}-{max_val}]", input_val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regular expression in the _numeric_matcher function is not correctly defined for matching numeric values within a range. It will match any single character that falls within the ASCII range defined by the ASCII values of the characters representing min_val and max_val. Consider using a different approach to extract the numeric value from the input string.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me!

  • Performed an incremental review on 4e85d59
  • Looked at 18 lines of code in 1 files
  • Took 1 minute and 16 seconds to review
More info
  • Skipped 0 files when reviewing.
  • Skipped posting 1 additional comments because they didn't meet confidence threshold of 50%.
1. continuous_eval/metrics/generation/text/custom.py:14:
  • Assessed confidence : 50%
  • Comment:
    The _numeric_matcher function currently only returns the first number it finds in the input string. This might not be the correct behavior if there are multiple numbers in the input. Consider modifying the function to handle multiple numbers in some way, either by returning a list of all the numbers it finds, or by taking an additional argument that specifies which number to return.
  • Reasoning:
    The _numeric_matcher function is using a regular expression to find a number in the input string and then it's returning the first match. However, it's not clear what should happen if there are multiple numbers in the input string. The function currently just takes the first number it finds, but this might not be the correct behavior in all cases. It would be better if the function could handle multiple numbers in some way, either by returning a list of all the numbers it finds, or by taking an additional argument that specifies which number to return.

Workflow ID: wflow_8ouRZYiA1lI1eC7u


Not what you expected? You can customize the content of the reviews using rules. Learn more here.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me!

  • Performed an incremental review on d9b9fa3
  • Looked at 30 lines of code in 1 files
  • Took 1 minute and 15 seconds to review
More info
  • Skipped 0 files when reviewing.
  • Skipped posting 1 additional comments because they didn't meet confidence threshold of 50%.
1. /docs/src/content/docs/metrics/Retrieval/LLM-Based/llm_context_precision.md:9:
  • Assessed confidence : 100%
  • Grade: 0%
  • Comment:
    The PR description mentions changes to the _numeric_matcher function in custom.py, but this file is not included in the diff. Please include this file in the PR for review.
  • Reasoning:
    The PR description mentions changes to the _numeric_matcher function in custom.py, but the diff provided does not include this file. I need to check the changes in custom.py to ensure they are correct and do not introduce any bugs.

Workflow ID: wflow_fuxFxG1uWwDPWmqC


Not what you expected? You can customize the content of the reviews using rules. Learn more here.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me!

  • Performed an incremental review on c11c440
  • Looked at 122 lines of code in 6 files
  • Took 1 minute and 1 seconds to review
More info
  • Skipped 0 files when reviewing.
  • Skipped posting 3 additional comments because they didn't meet confidence threshold of 50%.
1. continuous_eval/metrics/generation/text/__init__.py:12:
  • Assessed confidence : 50%
  • Comment:
    Consider adding a warning message in the except block to inform the user about the missing semantic metrics.
  • Reasoning:
    The import of the semantic metrics is wrapped in a try-except block to handle ImportError. This is a good practice to ensure that the code doesn't break if the semantic metrics are not available. However, it would be better to log a warning message in the except block to inform the user about the missing metrics.
2. pyproject.toml:3:
  • Assessed confidence : 50%
  • Comment:
    Ensure that all the changes in this version have been properly documented in the changelog or release notes.
  • Reasoning:
    The version number in the pyproject.toml file has been updated from 0.3.3 to 0.3.4. This is a good practice to keep track of different versions of the software. However, it's important to ensure that all the changes in this version have been properly documented in the changelog or release notes.
3. docs/src/content/docs/index.mdx:17:
  • Assessed confidence : 50%
  • Comment:
    Ensure that the changes in the file paths do not break any existing links in the documentation.
  • Reasoning:
    The PR introduces changes to the documentation files. The changes in the documentation files seem to be related to the path of the files. It's important to ensure that these changes do not break any existing links in the documentation.

Workflow ID: wflow_zf9x59THGA9vrk8z


Not what you expected? You can customize the content of the reviews using rules. Learn more here.

@pantonante pantonante merged commit c3e8db5 into main Mar 17, 2024
@pantonante pantonante deleted the feature/llm_custom_metric branch March 17, 2024 03:45
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problems found on commit c3e8db5.


Not what you expected? You can customize the content of the reviews using rules. Learn more here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants