Release/2.4.0 #1122
Merged
…ewlines-test-in-robustness-category Added: implemented breaking sentences by newline in robustness.
…ement-the-addtabs-test-in-robustness-category
…abs-test-in-robustness-category Feature/implement the addtabs test in robustness category
…ement-the-support-for-multimodal-with-new-vqa-task
…ed up to generating test cases.
… class for huggingface DataSource.
…nstallation of torch
…s-for-ner-task Fix/error in accuracy tests for ner task
Update transformers version to 4.44.2
…ement-the-support-for-multimodal-with-new-vqa-task
…ort-for-multimodal-with-new-vqa-task Feature/implement the support for multimodal with new vqa task
…cation in text-classification task and predictions in ner task
…s-for-multi-label-classification Fix/AttributeError in accuracy tests for multi label classification
This commit refactors the PromptGuard class in the modelhandler/promptguard.py module. The changes include:
- Simplifying the initialization process by using a singleton pattern
- Loading the model and tokenizer from Hugging Face
- Preprocessing the input text to remove spaces and mitigate prompt injection tactics
- Calculating class probabilities for a single or batch of texts
- Adding methods to get jailbreak scores and indirect injection scores for a single input text or a batch of texts
- Processing texts in batches to improve efficiency
The commit also includes changes in the safety.py module:
- Importing the PromptGuard class from the modelhandler/promptguard.py module
- Replacing the pipeline usage with the PromptGuard class to get indirect injection scores
Lastly, the commit includes changes in the output.py and sample.py modules:
- Adding a greater than or equal to comparison method in the MaxScoreOutput class
- Updating the comparison method in the QASample class to use the new comparison method in MaxScoreOutput
…lassification task and predictions in ner task
…s-for-multi-label-classification Refactor fairness test to handle multi-label classification
…ests-with-promptguard Feature/enhance safety tests with promptguard
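The PromptGuard refactor described in the commit message above (singleton initialization, whitespace preprocessing, class probabilities, and batched scoring) can be illustrated with a rough sketch. This is not the code from modelhandler/promptguard.py: the class name `PromptGuardSketch`, the `get_instance` helper, the injected `logits_fn` callable (a stand-in for the Hugging Face model and tokenizer), and the label order `[benign, injection, jailbreak]` are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Optional, Sequence

# Hypothetical stand-in for the model forward pass: takes a batch of texts and
# returns one row of raw logits per text. The label order is ASSUMED to be
# [benign, injection, jailbreak]; check the real model's config before relying on it.
LogitsFn = Callable[[Sequence[str]], List[List[float]]]


def preprocess(text: str) -> str:
    """Drop all whitespace (a simplified take on 'remove spaces')."""
    return "".join(text.split())


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


class PromptGuardSketch:
    """Illustrative singleton wrapper; not LangTest's actual PromptGuard class."""

    _instance: Optional["PromptGuardSketch"] = None

    def __init__(self, logits_fn: LogitsFn, batch_size: int = 16) -> None:
        self._logits_fn = logits_fn
        self._batch_size = batch_size

    @classmethod
    def get_instance(cls, logits_fn: LogitsFn, batch_size: int = 16) -> "PromptGuardSketch":
        # Build the wrapper once; later calls reuse it and ignore their arguments.
        if cls._instance is None:
            cls._instance = cls(logits_fn, batch_size)
        return cls._instance

    def class_probabilities(self, texts: Sequence[str]) -> List[List[float]]:
        """Return one probability row per text, scoring batch_size texts at a time."""
        cleaned = [preprocess(t) for t in texts]
        probabilities: List[List[float]] = []
        for start in range(0, len(cleaned), self._batch_size):
            batch = cleaned[start:start + self._batch_size]
            probabilities.extend(softmax(row) for row in self._logits_fn(batch))
        return probabilities

    def jailbreak_scores(self, texts: Sequence[str]) -> List[float]:
        """Probability of the (assumed) jailbreak class for each text."""
        return [p[2] for p in self.class_probabilities(texts)]

    def indirect_injection_scores(self, texts: Sequence[str]) -> List[float]:
        """Combined probability of the (assumed) injection and jailbreak classes."""
        return [p[1] + p[2] for p in self.class_probabilities(texts)]
```

In a real setup, `logits_fn` would wrap the tokenizer and model call; injecting it keeps the sketch runnable without downloading any weights.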
RakshitKhajuria approved these changes on Sep 22, 2024
📢 Highlights
John Snow Labs is excited to announce the release of LangTest 2.4.0! This update introduces cutting-edge features and resolves key issues to further enhance model testing and evaluation across multiple modalities.
🔗 Multimodality Testing with VQA Task: We are thrilled to introduce multimodality testing, now supporting Visual Question Answering (VQA) tasks! With the addition of 10 new robustness tests, you can now perturb images to challenge and assess your model’s performance across visual inputs.
📝 New Robustness Tests for Text Tasks: LangTest 2.4.0 comes with two new robustness tests, AddNewLines and AddTabs, applicable to text classification, question-answering, and summarization tasks. These tests push your models to handle text variations and maintain accuracy.
🔄 Improvements to Multi-Label Text Classification: We have resolved accuracy and fairness issues affecting multi-label text classification evaluations, ensuring more reliable and consistent results.
🛡 Basic Safety Evaluation with Prompt Guard: We have added basic safety evaluation tests using the prompt_guard model, which provides initial layers of protection to identify and mitigate harmful or unintended responses from your language models.
🛠 NER Accuracy Test Fixes: LangTest 2.4.0 addresses and resolves issues within the Named Entity Recognition (NER) accuracy tests, improving reliability in performance assessments for NER tasks.
🔒 Security Enhancements: We have upgraded various dependencies to address security vulnerabilities, making LangTest more secure for users.
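To give a feel for what the AddNewLines and AddTabs robustness tests perturb, here is a minimal sketch of whitespace perturbations of that kind. It is not LangTest's implementation: the function names `add_new_lines` and `add_tabs`, the `count` parameter, and the choice to perturb only the whitespace after sentence-ending punctuation are assumptions for illustration.

```python
import re

# Whitespace that directly follows a sentence-ending mark. This is a naive
# boundary rule: abbreviations such as "e.g. " are also treated as sentence ends.
_SENTENCE_GAP = re.compile(r"(?<=[.!?])\s+")


def add_new_lines(text: str, count: int = 2) -> str:
    """Break sentences apart by replacing the gap after each one with newlines."""
    return _SENTENCE_GAP.sub("\n" * count, text.strip())


def add_tabs(text: str, count: int = 1) -> str:
    """Replace the gap after each sentence with tab characters."""
    return _SENTENCE_GAP.sub("\t" * count, text.strip())
```

A robustness check would then compare the model's prediction on the original text with its prediction on the perturbed text and count the test as passed when the two agree.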