community[minor]: [Pebblo] Enhance PebbloSafeLoader to take anonymize flag #26812
"source": [
"### Anonymize the snippets to redact all PII details\n",
"\n",
"Set `anonymize_snippets` to `True` to anonymize all personally identifiable information (PII) from the snippets going into VectorDB and the generated reports."
Is the recall on identifying PII 100%? If not, it might be worth adding a clarification or a link to other documentation to set expectations.
Thank you for your feedback! I’ll add a link to the Pebblo classifier documentation to clarify the recall on identifying PII.
@ccurme I’ve added a note and a link to our docs. We’re planning to update it with more details soon. Please review.
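To make the recall question in this thread concrete, here is a toy, pattern-based redactor. It is only an illustration of why any PII detector can miss some inputs; it is not the Pebblo classifier, and the patterns, placeholder tokens, and sample strings are all made up for this sketch.

```python
import re

# Toy patterns for illustration only; NOT how the Pebblo classifier works.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Replace substrings that match the toy patterns with placeholder tokens."""
    text = EMAIL.sub("<EMAIL>", text)
    return US_SSN.sub("<SSN>", text)


# Canonically formatted PII is caught by the patterns.
standard = "Contact jane@example.com, SSN 123-45-6789."
caught = redact(standard)

# The same information written differently slips through unchanged.
obfuscated = "Contact jane at example dot com, SSN 123 45 6789."
missed = redact(obfuscated)

print(caught)
print(missed)
```

The second input passes through untouched, which is the kind of gap a documentation note on recall would set expectations about.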
0eb8f33 to 6eeadb8
… flag (langchain-ai#26812)
- **Description:** The flag is named `anonymize_snippets`. When set to true, the Pebblo server will anonymize snippets by redacting all personally identifiable information (PII) from the snippets going into VectorDB and the generated reports.
- **Issue:** NA
- **Dependencies:** NA
- **docs**: Updated
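A minimal usage sketch of the flag described above, assuming a `langchain-community` release that includes this change and a reachable Pebblo server. The app name, owner, description, and CSV path are placeholders, and the `load_anonymized` helper is hypothetical, introduced only for this example.

```python
# Keyword arguments for PebbloSafeLoader.
# Every value except the anonymize_snippets flag is a placeholder.
LOADER_KWARGS = {
    "name": "acme-corp-rag-1",          # placeholder app name
    "owner": "Joe Smith",               # placeholder owner
    "description": "Support RAG app",   # placeholder description
    "anonymize_snippets": True,         # the flag added in this PR
}


def load_anonymized(csv_path: str):
    """Hypothetical helper: wrap a CSVLoader in PebbloSafeLoader and load documents."""
    # Imported lazily so this module can be imported without langchain-community.
    from langchain_community.document_loaders import CSVLoader, PebbloSafeLoader

    loader = PebbloSafeLoader(CSVLoader(csv_path), **LOADER_KWARGS)
    return loader.load()
```

Calling `load_anonymized("data/corp_sens_data.csv")` (a placeholder path) would then load the documents while asking the Pebblo server to redact PII from the snippets it stores and reports.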