feat: utf8 sanitizer #485
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #485 +/- ##
===========================================
- Coverage 78.98% 30.60% -48.38%
===========================================
Files 90 44 -46
Lines 6747 3486 -3261
===========================================
- Hits 5329 1067 -4262
- Misses 1153 2303 +1150
+ Partials 265 116 -149
☔ View full report in Codecov by Sentry.
I believe that both functions can be replaced with standard library ones: https://pkg.go.dev/bytes#ToValidUTF8 and
@lvrach I'm happy to replace Given the effort spent on the IngestionSvc with all the memory optimizations, I think it makes sense to consider it. Wdyt?
@lvrach I copied the exact same test cases that they have for
Co-authored-by: Akash Chetty <achetty.iitr@gmail.com> Co-authored-by: Leonidas Vrachnis <leo.al.vra@gmail.com>
3555fd9 to 24ccf80
…ue-on-ingestionsvc
Description
This is needed by the IngestionSvc so that we don't propagate messages with invalid UTF-8 sequences to Pulsar.
If we do propagate them, the SrcRouter won't be able to push those messages to RudderServer, because the RudderServer gateway would fail with an error similar to the following:
Linear Ticket
< Linear_Link >
Security