Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event metadata can get out-of-sync when using this sample code for event hubs trigger #876

Closed
wesselkranenborg opened this issue May 9, 2022 · 7 comments
Assignees
Labels
bug Something isn't working

Comments

@wesselkranenborg
Copy link

In this repository there are some samples on how to fetch metadata from messages: https://github.com/Azure/azure-functions-dotnet-worker/blob/main/samples/Extensions/EventHubs/EventHubsTriggerMetadata.cs.

The code referring too is this:

 [Function("EventHubsTriggerMetadata-Parameters")]
        public static void UsingParameters([EventHubTrigger("src-parameters", Connection = "EventHubConnectionAppSetting")] string[] messages,
            DateTime[] enqueuedTimeUtcArray,
            long[] sequenceNumberArray,
            string[] offsetArray,
            Dictionary<string, JsonElement>[] propertiesArray,
            Dictionary<string, JsonElement>[] systemPropertiesArray)
        {
            // You can directly access binding data via parameters.
            for (int i = 0; i < messages.Length; i++)
            {
                string message = messages[i];
                DateTime enqueuedTimeUtc = enqueuedTimeUtcArray[i];
                long sequenceNumber = sequenceNumberArray[i];
                string offset = offsetArray[i];

                // Note: The values in these dictionaries are sent to the worker as JSON. By default, System.Text.Json will not automatically infer primitive values
                //       if you attempt to deserialize to 'object'. See https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-migrate-from-newtonsoft-how-to?pivots=dotnet-5-0#deserialization-of-object-properties       
                //       
                //       If you want to use Dictionary<string, object> and have the values automatically inferred, you can use the sample JsonConverter specified
                //       here: https://docs.microsoft.com/en-us/dotnet/standard/serialization/system-text-json-converters-how-to?pivots=dotnet-5-0#deserialize-inferred-types-to-object-properties
                //
                //       See the Configuration sample in this repo for details on how to add this custom converter to the JsonSerializerOptions, or for 
                //       details on how to use Newtonsoft.Json, which does automatically infer primitive values.
                Dictionary<string, JsonElement> properties = propertiesArray[i];
                Dictionary<string, JsonElement> systemProperties = systemPropertiesArray[i];
            }
        }

This code can however create bugs in software. If you have a message without a body there is no corresponding item in the string[] messages array but there are properties in the Dictionary<string, JsonElement>[] propertiesArray. If you parse these in batch the indexes between the different arrays are not equals which leads to wrong properties being used with wrong messages.

Let's take this example:
We have two messages coming in with the same batch. One (1) with a property but without body (yes, we have those) and another (2) where the body is filled. These are coming in, in the following way:
messages = int[1]{ messageBodyFrom2 }
propertiesArray = int [2] { propertiesFrom1, propertiesFrom2 }

How can we correlate the correct properties to the messagebody?

Is it not better that all arrays are always having the equal size? Something like this:
messages = int[2]{ nullValueAsMessageBodyFrom1, messageBodyFrom2 }
propertiesArray = int [2] { propertiesFrom1, propertiesFrom2 }

@ghost ghost assigned satvu May 9, 2022
@fabiocav fabiocav added bug Something isn't working and removed Needs: Triage (Functions) labels May 18, 2022
@fabiocav
Copy link
Member

@mattchenderson for awareness

@wesselkranenborg we'll update the sample to make sure others are not impacted by this, but will also be following up on the Event Hubs trigger implementation. Can you please share the version of the extension you are referencing? Thanks!

@fabiocav
Copy link
Member

Assigning this to 123 to update the sample while we investigate the issue with the trigger

@fabiocav fabiocav added this to the Functions Sprint 123 milestone May 18, 2022
@Privarozenbeek
Copy link

Hi @fabiocav I'm a colleague of @wesselkranenborg. Know the issue at hand pretty well, I can also answer the question about the used packages and their version:

<PackageReference Include="Microsoft.Azure.Functions.Worker.Extensions.EventHubs" Version="5.0.0-beta.6" /> <PackageReference Include="Microsoft.Azure.Functions.Worker.Sdk" Version="1.3.0" OutputItemType="Analyzer" /> <PackageReference Include="Microsoft.Azure.Functions.Worker" Version="1.6.0" />

@wesselkranenborg
Copy link
Author

@fabiocav I was a few days on leave but my colleague already responded. Am curious about the issue with the trigger. Do you have an issue Id from that or is that tracked with this issue?

Thanks @Privarozenbeek.

@kshyju
Copy link
Member

kshyju commented Jun 14, 2022

Thank you @wesselkranenborg @Privarozenbeek

Update from the investigation so far:

  • For out-of-process functions, The GRPC payload received in the worker from the host is missing the message entry which has null/empty string as value. The meta properties for the missing entry is still present in the TriggerMetaData property of the InvocationRequest instance received.
  • We get the correct data in the host from the event hub listener/processor module and in the host code, when we convert the event data object to RPC invocation request instance, we are skipping the empty entry in the ToRpcStringArray method.

Empty-Entry-Skipped-ToRpc

I will follow up with the team to discuss the right fix & impact of it and share an update as we make progress.

@kshyju
Copy link
Member

kshyju commented Jun 22, 2022

Initial investigation is done and root cause identified. The fix requires a change in the host. The below issue will track the progress of that.

Introduce new app setting to control skipping null entries from event payload when sending to OOP layer

@kshyju kshyju closed this as completed Jun 22, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Jul 22, 2022
@kshyju
Copy link
Member

kshyju commented Oct 12, 2022

We published the worker package 1.11.0-preview2 which has a fix for this.

https://github.com/Azure/azure-functions-dotnet-worker/releases/tag/1.11.0-preview2

Currently you need to to explicitly opt-in to this behavior while bootstrapping the application like below sample.

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults((builder) => { }, (options) =>
    {
        options.IncludeEmptyEntriesInMessagePayload = true;
    })
    .Build();

host.Run();

We plan on enabling this by default in the next major version update of worker package.

The host level fix is on v4.12.2. Waiting for the core tools release which reflects this change. Hopefully this will be out in the next week or so.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants