Add a new BLEU metric to Evidently #1319
Labels: enhancement (New feature or request), hacktoberfest (Accepted contributions will count towards your Hacktoberfest PRs)
About Hacktoberfest contributions: https://github.com/evidentlyai/evidently/wiki/Hacktoberfest-2024
Description
The BLEU (Bilingual Evaluation Understudy) metric is used to evaluate the quality of machine-generated text, typically translations, by comparing it to one or more reference texts. BLEU measures how closely the generated text matches the references using n-gram precision, with a brevity penalty for overly short or incomplete translations.
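For reference, the standard corpus-level BLEU combines modified n-gram precisions $p_n$ (typically up to $N = 4$ with uniform weights $w_n = 1/N$) with a brevity penalty $BP$ computed from the candidate length $c$ and reference length $r$:

$$
\text{BLEU} = BP \cdot \exp\!\left(\sum_{n=1}^{N} w_n \log p_n\right),
\qquad
BP =
\begin{cases}
1 & \text{if } c > r \\
e^{1 - r/c} & \text{if } c \le r
\end{cases}
$$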
We can implement a BLEU metric that computes a score for each row and a summary BLEU score for the dataset, as sketched below.
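A minimal sketch of both computations, assuming NLTK as the BLEU backend, simple whitespace tokenization, and illustrative `generated`/`reference` column names that are not part of this issue:

```python
import pandas as pd
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu, sentence_bleu

df = pd.DataFrame({
    "generated": ["the cat sat on the mat", "hello world"],
    "reference": ["the cat is on the mat", "hello there world"],
})

# Smoothing avoids zero scores when a higher-order n-gram never matches.
smoothie = SmoothingFunction().method1

# Row-level BLEU: one score per (generated, reference) pair.
df["bleu"] = [
    sentence_bleu([ref.split()], hyp.split(), smoothing_function=smoothie)
    for hyp, ref in zip(df["generated"], df["reference"])
]

# Dataset-level (summary) BLEU over all rows at once.
dataset_bleu = corpus_bleu(
    [[ref.split()] for ref in df["reference"]],
    [hyp.split() for hyp in df["generated"]],
    smoothing_function=smoothie,
)

print(df["bleu"].tolist(), dataset_bleu)
```

Note that corpus-level BLEU is not the mean of the row-level scores: it pools n-gram counts across rows before computing precision, which is why a dedicated dataset-level metric is needed rather than a simple aggregation.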
Note that this implementation would require creating a new Metric (instead of defaulting to ColumnSummaryMetric to aggregate descriptor values) in order to compute and visualize the summary BLEU score. You can check other dataset-level metrics (e.g., from classification or ranking) for inspiration; a rough skeleton follows.
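A rough, untested skeleton of what such a Metric could look like, assuming the custom-metric pattern from Evidently's docs (`Metric`, `MetricResult`, `InputData` in `evidently.base_metric`); the class and column names here are hypothetical, and base-class details may differ between Evidently versions:

```python
from typing import List

from evidently.base_metric import InputData, Metric, MetricResult
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu, sentence_bleu


class BLEUMetricResult(MetricResult):  # hypothetical result container
    row_scores: List[float]  # per-row BLEU
    dataset_score: float     # summary BLEU for the whole dataset


class BLEUMetric(Metric[BLEUMetricResult]):  # hypothetical metric name
    column_name: str        # column with generated texts
    reference_column: str   # column with reference texts

    def calculate(self, data: InputData) -> BLEUMetricResult:
        df = data.current_data
        smoothie = SmoothingFunction().method1
        hyps = [str(t).split() for t in df[self.column_name]]
        refs = [[str(t).split()] for t in df[self.reference_column]]
        return BLEUMetricResult(
            row_scores=[
                sentence_bleu(r, h, smoothing_function=smoothie)
                for r, h in zip(refs, hyps)
            ],
            dataset_score=corpus_bleu(refs, hyps, smoothing_function=smoothie),
        )
```

A matching renderer would also have to be registered so the summary score can be visualized in reports, following the pattern of the existing dataset-level metrics.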
Note: we can also consider implementing the METEOR metric as an option.