Multilabel integration into task chains as attribute extraction #902
Conversation
do we need to do something similar to this https://github.com/refuel-ai/autolabel/blob/main/src/autolabel/tasks/base.py#L224 ?
lgtm otherwise
@@ -76,6 +77,47 @@ def logprob_average(
        count += 1
    return logprob_cumulative ** (1.0 / count) if count > 0 else 0


def _logprob_average_per_label(
@rajasbansal for review
Fortunately no. We just return a semicolon-separated list as the label for multilabel attributes, and the product takes care of splitting it for display, just like before. Confidence computation does have to perform some splitting to get a confidence value for each key, however.
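To make that concrete, here is a minimal sketch of that splitting step, assuming a semicolon delimiter; the helper name confidence_per_label and the score_fn callback are illustrative, not the PR's actual API:

# Hypothetical sketch: split a semicolon-separated multilabel answer and
# attach a confidence score to each individual label. The helper name and
# the scoring callback are illustrative only.
from typing import Callable, Dict

def confidence_per_label(
    llm_label: str,
    score_fn: Callable[[str], float],
    delimiter: str = ";",
) -> Dict[str, float]:
    labels = [part.strip() for part in llm_label.split(delimiter) if part.strip()]
    return {label: score_fn(label) for label in labels}

print(confidence_per_label("Red; Blue", lambda label: 0.9))
# {'Red': 0.9, 'Blue': 0.9}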
for conf_label_candiate in conf_label_keys:
    closest_match, closest_match_score = None, 0
    for label in logprob_per_label:
        longest_substring = difflib.SequenceMatcher(
nit: comment on what this function does/what the arguments mean here
added comments!
src/autolabel/confidence.py
longest_substring = difflib.SequenceMatcher(
    None, label, conf_label_candiate
).find_longest_match(0, len(label), 0, len(conf_label_candiate))
if longest_substring.size > closest_match_score:
what if one label is contained within another label, i.e. if the labels are LessRed, MoreRed, and Red? Will that lead to an issue here?
Changed the metric to be the proportion of the longer string that overlaps, instead of the raw number of characters. I think that should take care of this case.
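A minimal sketch of that proportional metric, assuming the overlap is normalized by the longer of the two strings (the function name is made up for illustration):

# Hypothetical sketch: score similarity as the longest common substring
# divided by the length of the longer string, so a short label like "Red"
# no longer ties with longer labels such as "MoreRed" that contain it.
import difflib

def overlap_proportion(label: str, candidate: str) -> float:
    match = difflib.SequenceMatcher(None, label, candidate).find_longest_match(
        0, len(label), 0, len(candidate)
    )
    return match.size / max(len(label), len(candidate))

print(overlap_proportion("Red", "Red"))      # 1.0, exact match wins
print(overlap_proportion("MoreRed", "Red"))  # ~0.43, containing label scores lower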
# Remove all characters before a '\n' character in the logprobs, as
# this is parsed during the generation process.
# In this case, if the input logprobs are
# [{"xx\nc": -1.2}, {"Ab\n": -1.2}, {"Abc": -1.2}, {";": -1.3}, {"B": -1.4}, {"cd": -1.6}, {";": -1.5}, {"C": -1.4}]
# the output logprobs would be
# [{"Abc": -1.2}, {";": -1.3}, {"B": -1.4}, {"cd": -1.6}, {";": -1.5}, {"C": -1.4}]
for i in range(len(logprobs) - 1, -1, -1):
    cur_key = list(logprobs[i].keys())[0]
    if "\n" in cur_key:
        new_key = cur_key.split("\n")[-1].strip()
        if not new_key:
            logprobs = logprobs[i + 1 :]
        else:
            logprobs[i] = {new_key: logprobs[i][cur_key]}
            logprobs = logprobs[i:]
        break
this was needed because of a bug where sometimes the model will output a lot of characters like \n, or some explanation followed by a \n, before giving the label. We handle this in parse_llm_response by removing everything before the last \n. Is the expectation that this will never happen now because of guided generation?
Yep with guided generation this shouldn't happen now
@@ -162,7 +169,7 @@ def logprob_average_per_key(

    # Find the locations of each key in the logprobs as indices
    # into the logprobs list
    locations = []
comment on why we need this?
This is just an implementation choice: it lets us avoid setting logprob_per_key[locations[-1][2]] explicitly at the end.
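For intuition, here is a sketch of that sentinel pattern; the tuple layout and names are assumptions based on the comment above, not the file's actual code:

# Hypothetical sketch: record where each key's tokens start in the logprobs
# list, then append a sentinel end location so the slice for the last key
# falls out of the same loop instead of needing a special case afterwards.
logprobs = [{"Abc": -1.2}, {";": -1.3}, {"B": -1.4}, {"cd": -1.6}]
locations = [("label_1", 0), ("label_2", 2)]  # (key, start_index), illustrative
locations.append((None, len(logprobs)))      # sentinel end marker

logprob_per_key = {}
for (key, start), (_, end) in zip(locations, locations[1:]):
    tokens = logprobs[start:end]
    logprob_per_key[key] = sum(list(tok.values())[0] for tok in tokens) / len(tokens)

print(logprob_per_key)  # {'label_1': -1.25, 'label_2': -1.5}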
f"Attribute {attr_label} from the LLM response {llm_label} is not in the labels list" | ||
) | ||
llm_label.pop(attribute["name"], None) | ||
if attr_type == TaskType.CLASSIFICATION: |
also check that every label we get back for multilabel is one of the options, to verify the guidelines were followed
We shouldn't need this either moving forward with guided generation. Will be removing classification as well soon!
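For reference, the check being suggested could look roughly like the sketch below (helper name and inputs are assumptions); per the reply above, guided generation is expected to make it unnecessary:

# Hypothetical sketch of the suggested check: verify that each entry in a
# semicolon-separated multilabel answer is one of the allowed options.
from typing import List

def invalid_multilabel_values(llm_label: str, options: List[str]) -> List[str]:
    labels = [part.strip() for part in llm_label.split(";") if part.strip()]
    return [label for label in labels if label not in options]

print(invalid_multilabel_values("Red;Blue", ["Red", "Blue", "Green"]))    # []
print(invalid_multilabel_values("Red;Purple", ["Red", "Blue", "Green"]))  # ['Purple']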
lgtm if #902 (comment) is not an issue
thanks @DhruvaBansal00 lgtm from my end