Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS IoT Core disconnects with an AUTHORIZATION_ERROR randomly between 30 seconds and 5 minutes #1729

Closed
TCROC opened this issue May 7, 2023 · 47 comments · Fixed by #1737
Closed
Labels
question It is a question regarding the project

Comments

@TCROC
Copy link

TCROC commented May 7, 2023

An example project reproducing the issue as well as automating the deploying of AWS resources can be found here: https://github.com/TCROC/aws-iot-custom-auth

And for Windows 10 users who may have issues cross compiling for ARM64 Linux, a precompiled zip is here: https://github.com/TCROC/aws-iot-custom-auth/releases/download/precompiled-arm64-lambda/aws-iot-auth-issues.zip

For ease of reading, I'll copy the README from that project in here as it explains the issue at hand:

README

aws-iot-custom-auth

Dependencies

Tested on Ubuntu 22.04 and Windows 10.

Windows 10 requires WSL Ubuntu 22.04 for cross compiling to ARM64 processors.

  1. Install git: https://git-scm.com/downloads
    • NOTE: Reproduced with version: git version 2.40.1
  2. Install the rust toolset: https://www.rust-lang.org/tools/install
    • NOTE: Reproduced with version: rustup 1.26.0 (5af9b9484 2023-04-05), cargo 1.69.0 (6e9a83356 2023-04-12), rustc 1.69.0 (84c898d65 2023-04-16)
  3. Install cargo lambda: https://github.com/awslabs/aws-lambda-rust-runtime
    • NOTE: Reproduced with version: cargo-lambda 0.19.0 (e7a2b99 2023-04-07Z)
  4. Install aws cli v2: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
    • NOTE: Reproduced with version: aws-cli/2.11.15 Python/3.11.3 Linux/6.2.6-76060206-generic exe/x86_64.pop.22 prompt/off
  5. Install dotnet 7.0: https://dotnet.microsoft.com/en-us/download/dotnet/7.0
    • NOTE: Reproduced with version: 7.0.203
  6. Clone:
    git clone https://github.com/TCROC/aws-iot-custom-auth.git --recurse-submodules
    

NOTE: When running the scripts, you can ignore the aws cli errors that are logged. The scripts do things such as check if the lambda function is deployed by calling aws lambda get-function . If the command errors, the script assumes it doesn't exist in the cloud and attempts to create one.

Create Lambda Authorizer

Run in a bash shell:

./create-lambda.sh

Create certificate

Run in a bash shell:

./create-cert.sh

Test Lambda Authorizer

Run in a bash shell:

./run-client-lambda.sh

Expected result: The mqtt client sends keep alive packets for 24 hours as specified in the policy returned from the lambda function.

Actual result: The mqtt client is disconnected anywhere between 30 seconds and 5 minutes.

Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/config-custom-auth.html

You can test the response of the authorizer in the console: https://docs.aws.amazon.com/lambda/latest/dg/testing-functions.html

Example test event:

NOTE: The password is testpassword base64 encoded

{
    "token": "aToken",
    "signatureVerified": false,
    "protocols": [
        "tls",
        "http",
        "mqtt"
    ],
    "protocolData": {
        "tls": {
            "serverName": "serverName"
        },
        "http": {
            "headers": {
                "#{name}": "#{value}"
            },
            "queryString": "?#{name}=#{value}"
        },
        "mqtt": {
            "username": "test",
            "password": "dGVzdHBhc3N3b3Jk",
            "clientId": "testid"
        }
    },
    "connectionMetadata": {
        "id": "UUID"
    }
}

Example result:

{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iot:Connect"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
          ],
          "Condition": {
            "ArnEquals": {
              "iot:LastWillTopic": [
                "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
              ]
            }
          }
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Receive"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/*"
          ],
          "Condition": {}
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Publish"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ],
          "Condition": {}
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Subscribe"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ],
          "Condition": {}
        }
      ]
    }
  ]
}

Test Lambda Certificates

Run in a bash shell:

./run-client-cert.sh

Expected result: The mqtt authenticates and connects to IoT.

Actual Result: The client is immediately disconnected due to authorization error.

Documentation: https://docs.aws.amazon.com/iot/latest/developerguide/x509-client-certs.html

Cleanup

The lambda functions, authorizers, and certificates in aws will be deleted.

Run in a bash shell:

./clean-aws.sh

Other Information

I've tested this in Unity / Mono and dotnet 7 with websocket4net, dotnet, tcp, and websocket transports. The issue reproduces in all of them.

I am also in discussions with AWS on this particular issue. At the moment, we have not been able to narrow down if it is a client side issue with MQTTnet or on AWS's side.

I've been in an email chain with AWS Support and have recently opened up a ticket here: https://repost.aws/questions/QU-cOKeWC1TACCu_LVjS5BWw/iot-custom-authorizer-not-respecting-refreshafterinseconds-or-disconnectafterinseconds-in-returned-policy

I'm hoping we can narrow down on who's end the issue is and get this resolved soon as it is currently holding up the mobile release of our project: https://store.steampowered.com/app/1343040/Blocky_Ball/

We are using IoT with custom authorizers to bridge AWS and Microsoft Azure PlayFab.

If you would like me to organize a debugging session between this team and the AWS team, let me know and I can see if I can pull some people together.

@TCROC TCROC added the question It is a question regarding the project label May 7, 2023
@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Please enable tracing (see samples) and let me know if the client is able to establish a connection with the server. If we are connected and the client tries to send the CONNECT packet the issue might be located at MQTT protocol level. If we never reach this point the problem might be on transport level.

Probably it is related to #1698.

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Also please try setting the property AllowPacketFragmentation to False in the client options (It is not yet available in the client options builder).

@TCROC
Copy link
Author

TCROC commented May 7, 2023

I ran with and without fragmentation. It did not appear to make any difference. Here are the logs:

Fragmentation Enabled Log 1
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   websocket4net
transport:                 websocket

===================

[2023-05-07T12:37:15.6473514Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com/mqtt'
[2023-05-07T12:37:16.0701588Z] [MqttNet] [7] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T12:37:16.0716973Z] [MqttNet] [5] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T12:37:16.0795678Z] [MqttNet] [7] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T12:37:16.4474998Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T12:37:16.4510864Z] [MqttNet] [7] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T12:37:16.4513150Z] [MqttNet] [7] [MqttClient] [Info]: Using keep alive value (35) sent from the server.
[2023-05-07T12:37:16.4513415Z] [MqttNet] [7] [MqttClient] [Info]: Connected.
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:37:16.4523458Z] [MqttNet] [9] [MqttClient] [Verbose]: Start sending keep alive packets.
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:37:39.4585360Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized]
[2023-05-07T12:37:39.4620316Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T12:37:39.4636606Z] [MqttNet] [9] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T12:37:39.4669017Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T12:37:39.4685722Z] [MqttNet] [5] [MqttClient] [Info]: Disconnected.
[2023-05-07T12:37:39.4689128Z] [MqttNet] [5] [MqttClient] [Verbose]: Stopped receiving packets.
Fragmentation Enabled Log 2
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   websocket4net
transport:                 websocket

===================

[2023-05-07T12:37:58.7342742Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com/mqtt'
[2023-05-07T12:37:59.0865795Z] [MqttNet] [7] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T12:37:59.0881289Z] [MqttNet] [5] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T12:37:59.0961988Z] [MqttNet] [7] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T12:37:59.2293086Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T12:37:59.2320065Z] [MqttNet] [7] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T12:37:59.2321862Z] [MqttNet] [7] [MqttClient] [Info]: Using keep alive value (35) sent from the server.
[2023-05-07T12:37:59.2322069Z] [MqttNet] [7] [MqttClient] [Info]: Connected.
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:37:59.2331559Z] [MqttNet] [7] [MqttClient] [Verbose]: Start sending keep alive packets.
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:38:34.2639764Z] [MqttNet] [10] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:38:34.3284775Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:39:09.3575949Z] [MqttNet] [7] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:39:09.4210144Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:39:44.4493593Z] [MqttNet] [13] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:39:44.5130765Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:39:54.8460968Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized]
[2023-05-07T12:39:54.8490648Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T12:39:54.8499986Z] [MqttNet] [3] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T12:39:54.8528070Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T12:39:54.8547239Z] [MqttNet] [5] [MqttClient] [Info]: Disconnected.
[2023-05-07T12:39:54.8550561Z] [MqttNet] [5] [MqttClient] [Verbose]: Stopped receiving packets.
Fragmentation Disabled Log 1
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   websocket4net
transport:                 websocket

===================

[2023-05-07T12:47:49.3066739Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com/mqtt'
[2023-05-07T12:47:49.7160511Z] [MqttNet] [9] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T12:47:49.7174974Z] [MqttNet] [5] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T12:47:49.7258173Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T12:47:50.0839562Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T12:47:50.0869921Z] [MqttNet] [9] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T12:47:50.0871388Z] [MqttNet] [9] [MqttClient] [Info]: Using keep alive value (35) sent from the server.
[2023-05-07T12:47:50.0871613Z] [MqttNet] [9] [MqttClient] [Info]: Connected.
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:47:50.0879030Z] [MqttNet] [10] [MqttClient] [Verbose]: Start sending keep alive packets.
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:48:25.1756504Z] [MqttNet] [3] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:48:25.2387147Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:49:00.2681037Z] [MqttNet] [12] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:49:00.3310592Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:49:35.3683218Z] [MqttNet] [11] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:49:35.4310834Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:49:49.5889413Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized]
[2023-05-07T12:49:49.5918735Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T12:49:49.5934562Z] [MqttNet] [11] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T12:49:49.5967229Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T12:49:49.5986424Z] [MqttNet] [5] [MqttClient] [Info]: Disconnected.
[2023-05-07T12:49:49.5989624Z] [MqttNet] [5] [MqttClient] [Verbose]: Stopped receiving packets.
Fragmentation Disabled Log 2
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   websocket4net
transport:                 websocket

===================

[2023-05-07T12:53:08.5114161Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com/mqtt'
[2023-05-07T12:53:08.9216083Z] [MqttNet] [7] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T12:53:08.9231521Z] [MqttNet] [5] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T12:53:08.9313050Z] [MqttNet] [7] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T12:53:09.0831896Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T12:53:09.0859125Z] [MqttNet] [7] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T12:53:09.0860904Z] [MqttNet] [7] [MqttClient] [Info]: Using keep alive value (35) sent from the server.
[2023-05-07T12:53:09.0861121Z] [MqttNet] [7] [MqttClient] [Info]: Connected.
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:53:09.0868999Z] [MqttNet] [9] [MqttClient] [Verbose]: Start sending keep alive packets.
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:53:44.1139645Z] [MqttNet] [3] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:53:44.1773956Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:54:19.2045351Z] [MqttNet] [3] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:54:19.4451003Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:54:54.2621890Z] [MqttNet] [12] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:54:54.3240910Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:55:29.4126059Z] [MqttNet] [11] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:55:29.4757615Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:56:04.5096673Z] [MqttNet] [14] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:56:04.5736407Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T12:56:39.6069906Z] [MqttNet] [16] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T12:56:39.6699050Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
[2023-05-07T12:56:47.4216070Z] [MqttNet] [5] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized]
[2023-05-07T12:56:47.4246290Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T12:56:47.4253068Z] [MqttNet] [5] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T12:56:47.4259044Z] [MqttNet] [14] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T12:56:47.4271680Z] [MqttNet] [15] [MqttClient] [Info]: Disconnected.
[2023-05-07T12:56:47.4274122Z] [MqttNet] [15] [MqttClient] [Verbose]: Stopped receiving packets.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

And this is the log for connecting via certificates without fragmentation. Although I'm personally more interested in getting the above custom authorizer implementation to work correctly as that is what I need to bridge AWS and Microsoft Azure PlayFab.

But for completeness in case there is something of interest, here are the logs when attempting to connect via certificates where it disconnects instantly.

Fragmentation Disabled Client Certificate Mode
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  cert
authorizer:                target/certs/AmazonRootCA1.pem,target/certs/certificate.pem.crt,target/certs/private.pem.key
transportImplementation:   websocket4net
transport:                 websocket

===================

[2023-05-07T13:10:02.4539693Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com/mqtt'
[2023-05-07T13:10:02.9248916Z] [MqttNet] [7] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T13:10:02.9263657Z] [MqttNet] [9] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T13:10:02.9344751Z] [MqttNet] [7] [MqttChannelAdapter] [Verbose]: TX (40 bytes) >>> Connect: [ClientId=testid] [Username=] [Password=] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T13:10:03.0090417Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: RX (113 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=NotAuthorized] [IsSessionPresent=False]
[2023-05-07T13:10:03.0134821Z] [MqttNet] [7] [MqttClient] [Error]: Error while connecting with server.
MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
[2023-05-07T13:10:03.0270139Z] [MqttNet] [7] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T13:10:03.0271807Z] [MqttNet] [9] [MqttClient] [Verbose]: Stopped receiving packets.
[2023-05-07T13:10:03.0276547Z] [MqttNet] [7] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T13:10:03.0299321Z] [MqttNet] [7] [MqttClient] [Info]: Disconnected.
Unhandled exception. MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 181
   at Program.<Main>$(String[] args) in /home/tcroc/dev/aws-iot-custom-auth/aws-iot-custom-auth-mqttnet/Program.cs:line 105
   at Program.<Main>(String[] args)

@TCROC TCROC changed the title Aws IoT Disconnects with an AUTHORIZATION_ERROR randomly between 30 seconds and 5 minutes AWS IoT Core disconnects with an AUTHORIZATION_ERROR randomly between 30 seconds and 5 minutes May 7, 2023
@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Whats make me wonder by just having a short view over the logs is that there is the following line (in thje last log):

[2023-05-07T13:10:03.0090417Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: RX (113 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=NotAuthorized] [IsSessionPresent=False]

How can it be that the connection is accepted AND the reason code is "not authorized" at the same time.

I will check the code...

@TCROC
Copy link
Author

TCROC commented May 7, 2023

I also included some scripts in the project that reproduce the issue by setting up the AWS environment via the aws cli and then run the client. Let me know if you have any issues reproducing it locally.

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Which version of the library do you use?
Which version of the protocol do you use (3.1.1 or 5.0.0)?

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Protocol: 5.0.0
Library version: v4.2.0.706

That is the commit tag that I currently have checked out as a submodule.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

And the project that I shared just has a single Program.cs file in it for the mqttnet client. I'll copy it in here for ease of reading:

using System.Web;
using MQTTnet;
using MQTTnet.Extensions.WebSocket4Net;
using MQTTnet.Formatter;
using MQTTnet.Protocol;
using MQTTnet.Diagnostics;
using MQTTnet.Client;
using System.Net.Security;
using System.Security.Cryptography.X509Certificates;

Console.WriteLine("Running mqtt example application!");

int i = 0;
string username = args[i++];
string password = args[i++];
string endpoint = args[i++];
string rootTopic = args[i++];
string authType = args[i++]; // lambda | cert
string authorizer = args[i++]; // iot authorizer | cert paths (ca,crt,key)
string transportImplementation = args[i++]; // websocket4net | dotnet
string transport = args[i++]; // websocket | tcp

Console.WriteLine($"{Environment.NewLine}==================={Environment.NewLine}");
Console.WriteLine($"Args Used{Environment.NewLine}");
Console.WriteLine($"{nameof(username)}:                  {username}");
Console.WriteLine($"{nameof(password)}:                  {password}");
Console.WriteLine($"{nameof(endpoint)}:                  {endpoint}");
Console.WriteLine($"{nameof(rootTopic)}:                 {rootTopic}");
Console.WriteLine($"{nameof(authType)}:                  {authType}");
Console.WriteLine($"{nameof(authorizer)}:                {authorizer}");
Console.WriteLine($"{nameof(transportImplementation)}:   {transportImplementation}");
Console.WriteLine($"{nameof(transport)}:                 {transport}");
Console.WriteLine($"{Environment.NewLine}==================={Environment.NewLine}");

var factory = new MqttFactory();

if (transportImplementation == "websocket4net")
{
    factory = factory.UseWebSocket4Net();
}

var optionsBuilder = factory.CreateClientOptionsBuilder();

if (transport == "tcp")
{
    optionsBuilder = optionsBuilder.WithTcpServer($"{endpoint}", 443);
}
else if (transport == "websocket")
{
    optionsBuilder = optionsBuilder.WithWebSocketServer($"{endpoint}/mqtt");
}

if (authType == "lambda")
{
    optionsBuilder = optionsBuilder
        .WithCredentials($"username?x-amz-customauthorizer-name={HttpUtility.UrlEncode(authorizer)}", password)
        .WithTls(
            new MqttClientOptionsBuilderTlsParameters
            {
                UseTls = true,
                ApplicationProtocols = new List<SslApplicationProtocol> { new("mqtt") },
            }
        );
}
else if (authType == "cert")
{
    var caCrtKey = authorizer.Split(",");
    var caCert = X509Certificate2.CreateFromPem(File.ReadAllText(caCrtKey[0]));
    var cert = X509Certificate2.CreateFromPemFile(caCrtKey[1], caCrtKey[2]);
    optionsBuilder = optionsBuilder
        .WithTls(
            new MqttClientOptionsBuilderTlsParameters
            {
                UseTls = true,
                ApplicationProtocols = new List<SslApplicationProtocol> { new("mqtt") },
                Certificates = new X509Certificate[] 
                {
                    caCert,
                    cert
                },
            }
        );
}
else
{
    throw new Exception($"Invalid {nameof(authType)}: {authType}");
}

var options = optionsBuilder
    .WithClientId(username)
    .WithWillTopic($"{rootTopic}/s/{username}")
    .WithWillRetain(false)
    .WithWillPayload(new byte[] { 0 })
    .WithWillQualityOfServiceLevel(MqttQualityOfServiceLevel.AtLeastOnce)
    .WithProtocolVersion(MqttProtocolVersion.V500)
    .WithKeepAlivePeriod(TimeSpan.FromSeconds(35))
    .WithoutPacketFragmentation()
    .Build();

var logger = new MqttNetEventLogger("MqttNet");
logger.LogMessagePublished += (obj, logArgs) => Console.WriteLine(logArgs.LogMessage.ToString());

var client = factory.CreateMqttClient(logger);
var result = await client.ConnectAsync(options);

int delaySeconds = 10;
while (client.IsConnected)
{
    Console.WriteLine($"Client is connected. Checking again in {delaySeconds} seconds...");
    await Task.Delay(TimeSpan.FromSeconds(delaySeconds));
}

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

It turns out that the wrong value in the log is annoying but related to 3.1.1 compatibility. So the "ConnectionAccepted" is wrong and "NotAuthorized" is OK.

I am no AWS expert so is it possible to stick to the TCP variant of the console application you posted? Does the problem happen there as well as in any other cases?

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Yep, the error occurs in both websocket and tcp. It also occurs in both dotnet's and websocket4net's implementations of websockets and tcp. I can go generate a quick log for that.

Ultimately, I'll need to get it working with websockets as the tcp variant only works with dotnet. It does not work in mono (which is required for Unity). But websockets does work in mono.

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

OK but it seems that every transport is affected somehow.

Are you able to compile and run the code on windows with .NET 6 or 7? We had a similar issue with AWS connections which are not working properly but only Mono under Linux was affected.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Dotnet TCP Certificate Logs
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  cert
authorizer:                target/certs/AmazonRootCA1.pem,target/certs/certificate.pem.crt,target/certs/private.pem.key
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T18:20:39.6827127Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'
[2023-05-07T18:20:40.0859672Z] [MqttNet] [9] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T18:20:40.0877999Z] [MqttNet] [8] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T18:20:40.0958516Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: TX (40 bytes) >>> Connect: [ClientId=testid] [Username=] [Password=] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T18:20:40.1802685Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: RX (113 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=NotAuthorized] [IsSessionPresent=False]
[2023-05-07T18:20:40.1849953Z] [MqttNet] [9] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T18:20:40.1886322Z] [MqttNet] [9] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T18:20:40.1899433Z] [MqttNet] [9] [MqttClient] [Info]: Disconnected.
[2023-05-07T18:20:40.1902001Z] [MqttNet] [9] [MqttClient] [Verbose]: Stopped receiving packets.
[2023-05-07T18:20:40.1841036Z] [MqttNet] [6] [MqttClient] [Error]: Error while connecting with server.
MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
Unhandled exception. MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 181
   at Program.<Main>$(String[] args) in /home/tcroc/dev/aws-iot-custom-auth/aws-iot-custom-auth-mqttnet/Program.cs:line 104
   at Program.<Main>(String[] args)
Dotnet TCP Custom Authorizer Logs
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T18:22:20.1656545Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'
[2023-05-07T18:22:20.5033084Z] [MqttNet] [6] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T18:22:20.5052541Z] [MqttNet] [8] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T18:22:20.5115802Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T18:22:20.9144817Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T18:22:20.9178599Z] [MqttNet] [6] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T18:22:20.9180185Z] [MqttNet] [6] [MqttClient] [Info]: Using keep alive value (35) sent from the server.
[2023-05-07T18:22:20.9180417Z] [MqttNet] [6] [MqttClient] [Info]: Connected.
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:22:20.9187899Z] [MqttNet] [8] [MqttClient] [Verbose]: Start sending keep alive packets.
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:22:55.9460561Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T18:22:56.0100668Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:23:31.0358012Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T18:23:31.0993065Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:24:06.1256173Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T18:24:06.1901435Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:24:41.2157602Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T18:24:41.2795402Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
[2023-05-07T18:24:48.5341949Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized]
[2023-05-07T18:24:48.5378284Z] [MqttNet] [6] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T18:24:48.5389909Z] [MqttNet] [8] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T18:24:48.5425332Z] [MqttNet] [6] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T18:24:48.5439222Z] [MqttNet] [6] [MqttClient] [Info]: Disconnected.
[2023-05-07T18:24:48.5442204Z] [MqttNet] [6] [MqttClient] [Verbose]: Stopped receiving packets.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

These were both run with dotnet 7 on Linux. I'll go boot up my Windows machine and check quick one sec.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Windows 10 dotnet 7 TCP Certificate Logs
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  cert
authorizer:                target/certs/AmazonRootCA1.pem,target/certs/certificate.pem.crt,target/certs/private.pem.key
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T18:40:53.6981725Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'
[2023-05-07T18:40:54.1218007Z] [MqttNet] [6] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T18:40:54.1235888Z] [MqttNet] [11] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T18:40:54.1304948Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: TX (40 bytes) >>> Connect: [ClientId=testid] [Username=] [Password=] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T18:40:54.2148616Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (113 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=NotAuthorized] [IsSessionPresent=False]
[2023-05-07T18:40:54.2187203Z] [MqttNet] [6] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T18:40:54.2197487Z] [MqttNet] [6] [MqttClient] [Verbose]: Disconnected from adapter.      
[2023-05-07T18:40:54.2211008Z] [MqttNet] [6] [MqttClient] [Info]: Disconnected.
[2023-05-07T18:40:54.2214454Z] [MqttNet] [6] [MqttClient] [Verbose]: Stopped receiving packets.
[2023-05-07T18:40:54.2182711Z] [MqttNet] [8] [MqttClient] [Error]: Error while connecting with server.
MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
Unhandled exception. MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 181
   at Program.<Main>$(String[] args) in C:\Users\charl\source\AWS\aws-iot-custom-auth\aws-iot-custom-auth-mqttnet\Program.cs:line 104
   at Program.<Main>(String[] args)
Windows 10 dotnet 7 TCP Custom Authorizer Logs
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T18:37:08.0225054Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'    
[2023-05-07T18:37:08.7111693Z] [MqttNet] [6] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T18:37:08.7127887Z] [MqttNet] [11] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T18:37:08.7195132Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T18:37:09.1069229Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T18:37:09.1094539Z] [MqttNet] [8] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T18:37:09.1097134Z] [MqttNet] [8] [MqttClient] [Info]: Using keep alive value (35) sent from the server.
[2023-05-07T18:37:09.1098304Z] [MqttNet] [8] [MqttClient] [Info]: Connected.
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:37:09.1104642Z] [MqttNet] [6] [MqttClient] [Verbose]: Start sending keep alive packets.
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:37:44.2843391Z] [MqttNet] [11] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T18:37:44.3515312Z] [MqttNet] [11] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:38:19.4415565Z] [MqttNet] [11] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T18:38:19.5053787Z] [MqttNet] [11] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
[2023-05-07T18:38:38.6194570Z] [MqttNet] [14] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized]
[2023-05-07T18:38:38.6217440Z] [MqttNet] [14] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T18:38:38.6229420Z] [MqttNet] [14] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T18:38:38.6229524Z] [MqttNet] [13] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T18:38:38.6241151Z] [MqttNet] [14] [MqttClient] [Info]: Disconnected.
[2023-05-07T18:38:38.6244033Z] [MqttNet] [14] [MqttClient] [Verbose]: Stopped receiving packets.

And a side note:

The git clone command in the README was incorrectly pointing to one of the submodules. I updated it to point to the correct repo.

Edit: typo

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

I am only wild guessing so I made some changes in the branch which is attached to this ticket:

  • I fixed the decoding of the CONNACK packet (will change the log but not fix the issue).
  • I added a .NET 4.8 build. Maybe this will cause Mono to use other/newer implementations.
  • I exposed more properties in the TLS options. Please play around with them. Probably there is a setting which affects the authentication.

Please use the MyGet feed in order to use this preview build (Link is in README batch). I will post the exact build number as soon as the build is completed.

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Also please check the answer from this ticket: #1359
It might help.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Awesome. I'll go try out connecting with cert via the answer in the above ticket. I'll keep an eye on this thread for that build number and test it out when its done.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

I tried connecting with the cert by exporting to Pkcs12 as recommended in that above ticket. It did not make a difference.

    var caCrtKey = authorizer.Split(",");
    var caCert = new X509Certificate2(X509Certificate2.CreateFromPem(File.ReadAllText(caCrtKey[0])).Export(X509ContentType.Pkcs12));
    var cert = new X509Certificate2(X509Certificate2.CreateFromPemFile(caCrtKey[1], caCrtKey[2]).Export(X509ContentType.Pkcs12));
logs
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  cert
authorizer:                target/certs/AmazonRootCA1.pem,target/certs/certificate.pem.crt,target/certs/private.pem.key
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T19:05:27.2647894Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'
[2023-05-07T19:05:27.6671510Z] [MqttNet] [9] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T19:05:27.6687444Z] [MqttNet] [8] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T19:05:27.6767553Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: TX (40 bytes) >>> Connect: [ClientId=testid] [Username=] [Password=] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T19:05:27.7599930Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (113 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=NotAuthorized] [IsSessionPresent=False]
[2023-05-07T19:05:27.7642702Z] [MqttNet] [8] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T19:05:27.7681505Z] [MqttNet] [8] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T19:05:27.7697832Z] [MqttNet] [8] [MqttClient] [Info]: Disconnected.
[2023-05-07T19:05:27.7700843Z] [MqttNet] [8] [MqttClient] [Verbose]: Stopped receiving packets.
[2023-05-07T19:05:27.7637273Z] [MqttNet] [9] [MqttClient] [Error]: Error while connecting with server.
MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
Unhandled exception. MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 181
   at Program.<Main>$(String[] args) in /home/tcroc/dev/aws-iot-custom-auth/aws-iot-custom-auth-mqttnet/Program.cs:line 104
   at Program.<Main>(String[] args)

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Is the quoted code the code you used? Because in the other ticket it uses other constructor as the one you posted.

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

The version for MyGet is 4.2.1.740 https://www.myget.org/feed/mqttnet/package/nuget/MQTTnet

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Good catch. I updated the code to be:

    var caCrtKey = authorizer.Split(",");
    var tempCaText = File.ReadAllText(caCrtKey[0]);
    var tempCrtText = File.ReadAllText(caCrtKey[1]);
    var tempKeyText = File.ReadAllText(caCrtKey[2]);
    var tempCa = new X509Certificate2(X509Certificate2.CreateFromPem(tempCaText));
    var tempCrt = X509Certificate2.CreateFromPem(tempCrtText, tempKeyText);
    var ca = new X509Certificate2(tempCa.Export(X509ContentType.Pkcs12));
    var cert = new X509Certificate2(tempCrt.Export(X509ContentType.Pkcs12));
    optionsBuilder = optionsBuilder
        .WithTls(
            new MqttClientOptionsBuilderTlsParameters
            {
                UseTls = true,
                ApplicationProtocols = new List<SslApplicationProtocol> { new("mqtt") },
                Certificates = new X509Certificate[] 
                {
                    ca,
                    cert
                },
            }
        );

And got the same result in the logs

logs
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  cert
authorizer:                target/certs/AmazonRootCA1.pem,target/certs/certificate.pem.crt,target/certs/private.pem.key
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T19:16:06.8111116Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'
[2023-05-07T19:16:08.8696516Z] [MqttNet] [8] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T19:16:08.8711942Z] [MqttNet] [6] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T19:16:08.8780964Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (40 bytes) >>> Connect: [ClientId=testid] [Username=] [Password=] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T19:16:09.6253017Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (113 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=NotAuthorized] [IsSessionPresent=False]
[2023-05-07T19:16:09.6314154Z] [MqttNet] [6] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T19:16:09.6361834Z] [MqttNet] [6] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T19:16:09.6383619Z] [MqttNet] [6] [MqttClient] [Info]: Disconnected.
[2023-05-07T19:16:09.6386716Z] [MqttNet] [6] [MqttClient] [Verbose]: Stopped receiving packets.
[2023-05-07T19:16:09.6310225Z] [MqttNet] [9] [MqttClient] [Error]: Error while connecting with server.
MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
Unhandled exception. MQTTnet.Adapter.MqttConnectingFailedException: Connecting with MQTT server failed (NotAuthorized).
   at MQTTnet.Client.MqttClient.AuthenticateAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 450
   at MQTTnet.Client.MqttClient.ConnectInternal(CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 501
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 143
   at MQTTnet.Client.MqttClient.ConnectAsync(MqttClientOptions options, CancellationToken cancellationToken) in /_/Source/MQTTnet/Client/MqttClient.cs:line 181
   at Program.<Main>$(String[] args) in /home/tcroc/dev/aws-iot-custom-auth/aws-iot-custom-auth-mqttnet/Program.cs:line 109
   at Program.<Main>(String[] args)

@TCROC
Copy link
Author

TCROC commented May 7, 2023

And I'll try out that myget and let you know what I find

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Apart from the certificate stuff I am wondering why the custom authorizer works but out of a sudden disconnects.

Are you sending data between these PING packets? Or is the connection Idle?
The server should allow up to 1.5 times the keep alive time without messages but could you please add separate calls to "PingAsync" to make PING requests happen much more often than the keep alive period?

I started a new build which will write the reason string from the DISCONNECT packet to the log. Hopefully AWS tells more details in that field why it decides to disconnect.

Do you maybe need to start a re authentication (feature of MQTT 5.0.0) from time to time and it is not happening? Is there something written down in the AWS docs regarding this?

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2023

Version 4.2.1.741 is also now available at MyGet. It writes the response string to the log. Hopefully it contains something set by the broker...

@TCROC
Copy link
Author

TCROC commented May 7, 2023

The connection is idle. Just sending pingreq every 35 seconds since AWS specifies we can send every 30 seconds here:

https://aws.amazon.com/iot-core/pricing/additional-details/

@TCROC
Copy link
Author

TCROC commented May 7, 2023

And awesome! I'll go try it out with that MyGet build now!

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Is this the branch you are currently working on?

https://github.com/dotnet/MQTTnet/tree/1729-aws-iot-core-disconnects-with-an-authorization_error-randomly-between-30-seconds-and-5-minutes

If so, its actually easier for me to just check this out than use MyGet. I'm currently using this repo as a git submodule and its really easy to check out different branches.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Here are the logs with the custom authorizer:

log
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T21:01:07.0942621Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'
[2023-05-07T21:01:07.4453187Z] [MqttNet] [6] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T21:01:07.4518716Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T21:01:07.7652312Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T21:01:07.7664020Z] [MqttNet] [8] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T21:01:07.7670077Z] [MqttNet] [8] [MqttClient] [Info]: Using keep alive value (35) sent from the server
[2023-05-07T21:01:07.7670266Z] [MqttNet] [8] [MqttClient] [Info]: Connected
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:01:07.7678229Z] [MqttNet] [10] [MqttClient] [Verbose]: Start sending keep alive packets.
[2023-05-07T21:01:07.7679079Z] [MqttNet] [9] [MqttClient] [Verbose]: Start receiving packets.
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:01:42.8029540Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:01:42.8663726Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:01:56.4972775Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized] [ReasonString=] [ServerReference=] [SessionExpiryInterval=0]
[2023-05-07T21:01:56.4999620Z] [MqttNet] [8] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T21:01:56.5009780Z] [MqttNet] [13] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T21:01:56.5042501Z] [MqttNet] [8] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T21:01:56.5054954Z] [MqttNet] [8] [MqttClient] [Info]: Disconnected.
[2023-05-07T21:01:56.5058030Z] [MqttNet] [8] [MqttClient] [Verbose]: Stopped receiving packets.

Disappointingly, the reason string appears to be empty. I'm going to try increasing the ping interval for keep alive but leave the keep alive setting the same and see what happens.

@logicaloud
Copy link
Contributor

Thanks for posting the example. I have run your code with a test lambda authorizer over WebSockets (Windows 11, .NET 6) and did not encounter any problems. I have not tried TCP. Would you expect the problem to occur in this environment?

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Yes I would have expected the problem to occur. How long did your client stay connected? Mine stays connected for anywhere between 30 seconds and 5 minutes. If yours stays connected beyond that (especially say like 10 minutes), this may indicate there is something wrong on my AWS account in particular.

In fact, I'll go try creating a different AWS account with a different email and see what happens.

Platforms that I've currently tested on:

Pop OS (Ubuntu derivative)
Ubuntu 22.04
Windows 10

@TCROC
Copy link
Author

TCROC commented May 7, 2023

@chkr1011

The server should allow up to 1.5 times the keep alive time without messages but could you please add separate calls to "PingAsync" to make PING requests happen much more often than the keep alive period?

I set it to ping in half the duration of the keep alive period by halfing the keep alive period ticks in MqttClient.cs on line 994:

                while (!cancellationToken.IsCancellationRequested)
                {
                    // Values described here: [MQTT-3.1.2-24].
                    var timeWithoutPacketSent = DateTime.UtcNow - _lastPacketSentTimestamp;

                    if (timeWithoutPacketSent > new TimeSpan(keepAlivePeriod.Ticks / 2))
                    {
                        using (var timeoutCancellationTokenSource = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
                        {
                            timeoutCancellationTokenSource.CancelAfter(Options.Timeout);
                            await PingAsync(timeoutCancellationTokenSource.Token).ConfigureAwait(false);
                        }

And the disconnect for "NotAuthorized" still occurred:

log
Running mqtt example application!

===================

Args Used

username:                  testid
password:                  testpassword
endpoint:                  a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com
rootTopic:                 open
authType:                  lambda
authorizer:                iot-lambda-authorizer
transportImplementation:   dotnet
transport:                 tcp

===================

[2023-05-07T21:07:34.1573389Z] [MqttNet] [1] [MqttClient] [Verbose]: Trying to connect with server 'a1jbgnrm1s76uh-ats.iot.us-east-1.amazonaws.com:443'
[2023-05-07T21:07:34.4925036Z] [MqttNet] [8] [MqttClient] [Verbose]: Connection with server established
[2023-05-07T21:07:34.4991862Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (114 bytes) >>> Connect: [ClientId=testid] [Username=username?x-amz-customauthorizer-name=iot-lambda-authorizer] [Password=****] [KeepAlivePeriod=35] [CleanSession=True]
[2023-05-07T21:07:34.8773199Z] [MqttNet] [6] [MqttChannelAdapter] [Verbose]: RX (29 bytes) <<< ConnAck: [ReturnCode=ConnectionAccepted] [ReasonCode=Success] [IsSessionPresent=False]
[2023-05-07T21:07:34.8783920Z] [MqttNet] [6] [MqttClient] [Verbose]: Authenticated MQTT connection with server established.
[2023-05-07T21:07:34.8789577Z] [MqttNet] [6] [MqttClient] [Info]: Using keep alive value (35) sent from the server
[2023-05-07T21:07:34.8789825Z] [MqttNet] [6] [MqttClient] [Info]: Connected
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:07:34.8797420Z] [MqttNet] [9] [MqttClient] [Verbose]: Start receiving packets.
[2023-05-07T21:07:34.8798737Z] [MqttNet] [10] [MqttClient] [Verbose]: Start sending keep alive packets.
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:07:52.5254102Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:07:52.5901590Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:08:10.2324598Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:08:10.2960666Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:08:27.9369702Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:08:28.0002701Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:08:45.6442092Z] [MqttNet] [10] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:08:45.7072687Z] [MqttNet] [10] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:09:03.3494051Z] [MqttNet] [10] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:09:03.4121508Z] [MqttNet] [10] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:09:21.0531901Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:09:21.1181170Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:09:38.7574832Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:09:39.0122321Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:09:56.4008313Z] [MqttNet] [10] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:09:56.4639787Z] [MqttNet] [10] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:10:14.1040290Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:10:14.1682911Z] [MqttNet] [9] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:10:31.8097970Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: TX (2 bytes) >>> PingReq
[2023-05-07T21:10:31.8728314Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (2 bytes) <<< PingResp
Client is connected. Checking again in 10 seconds...
Client is connected. Checking again in 10 seconds...
[2023-05-07T21:10:48.0721855Z] [MqttNet] [8] [MqttChannelAdapter] [Verbose]: RX (4 bytes) <<< Disconnect: [ReasonCode=NotAuthorized] [ReasonString=] [ServerReference=] [SessionExpiryInterval=0]
[2023-05-07T21:10:48.0759954Z] [MqttNet] [8] [MqttClient] [Verbose]: Disconnecting [Timeout=00:01:40]
[2023-05-07T21:10:48.0774358Z] [MqttNet] [10] [MqttClient] [Verbose]: Stopped sending keep alive packets.
[2023-05-07T21:10:48.0800323Z] [MqttNet] [8] [MqttClient] [Verbose]: Disconnected from adapter.
[2023-05-07T21:10:48.0812739Z] [MqttNet] [8] [MqttClient] [Info]: Disconnected.
[2023-05-07T21:10:48.0815826Z] [MqttNet] [8] [MqttClient] [Verbose]: Stopped receiving packets.

@logicaloud
Copy link
Contributor

I have let the client run for about 15 minutes without issues. This seems to point to the authorizer. Maybe you could test with your test account and an "allow all" authorizer and then re-introduce conditions until the problem occurs, i.e., start with

export const handler = async(event, context, callback) => {
    // TODO implement
    callback(null, 
            {
            "isAuthenticated":true,
            "principalId": "testprincipalId",
            "disconnectAfterInSeconds": 86400, 
            "refreshAfterInSeconds": 86400,
            "policyDocuments": [
              {
                "Version": "2012-10-17",
                "Statement": [
                    {
                      "Effect": "Allow",
                      "Action": "iot:*",
                      "Resource": "*"
                    }
                 ]
               }
            ]
            }
        );
};

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Is this the policy you used for your test client? The one that has been connected for 15+ minutes?

Or did you get it to work with my lambda authorizer in the test project?

@logicaloud
Copy link
Contributor

Yes, I have used the "allow all" authorizer for the 15 minutes test. I have tried your authorizer as well (with the account identifiers updated) but it failed to connect immediately and I did not investigate further.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Ok cool. I just updated my authorizer to "allow all". 🤞 it stays connected.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Ok I've been connected for 15+ minutes now! This definitely points to an authorizer / policy doc / AWS error. I'm going to narrow down what the offending policy is. Then we will be able to confirm where the issue actually resides. There is still a possibility that mqttnet is interacting with AWS in a way that AWS doesn't expect when more restrictive policies are in place. Which could still be an AWS error, but it will be nice to narrow down.

Then I'll contribute back to the aws docs with the findings from this issue as well as updating some things. There are a few parts that are outdated such as .NET's implementation not working (it does now): https://github.com/dotnet/MQTTnet/wiki/Client#connecting-with-amazon-aws

@logicaloud
Copy link
Contributor

Thank you for the update; that sounds good so far. Please keep us posted with future developments and also if you get it to work with WebSockets and your version of Mono.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

np! :) And regarding dotnet and mono experience so far:

dotnet 7:

Provider Transport Status
dotnet tcp
dotnet websocket
websocket4net tcp
websocket4net websocket

Mono (specifically tested with Unity Game Engine 2021.3.24f1)

Provider Transport Status
mono tcp
mono websocket
websocket4net tcp
websocket4net websocket

Which I was surprised to find that the mono / dotnet implementation didn't work in Unity since Unity supports .NET Standard 2.1 now and .NET Standard 2.1 now has ALPN support:

https://learn.microsoft.com/en-us/dotnet/api/system.net.security.sslapplicationprotocol?view=netstandard-2.1

In fact, I'll open up a separate issue for ALPN in Unity along with the certificate error above once we wrap up this policy issue.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

Found the offending policy!!! 🥳🎉🎊

✅ Policy that works
{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iot:Connect"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Receive"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Publish"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Subscribe"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ]
        }
      ]
    }
  ]
}
❌ Policy that disconnects randomly between 30 seconds and 5 minutes
{
  "isAuthenticated": true,
  "principalId": "testid",
  "disconnectAfterInSeconds": 86400,
  "refreshAfterInSeconds": 86400,
  "policyDocuments": [
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "iot:Connect"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:client/${iot:ClientId}"
          ],
          "Condition": {
            "ArnEquals": {
              "iot:LastWillTopic": [
                "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
              ]
            }
          }
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Receive"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/*"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Publish"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topic/open/d/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/p/*/${iot:ClientId}",
            "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
          ]
        },
        {
          "Effect": "Allow",
          "Action": [
            "iot:Subscribe"
          ],
          "Resource": [
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/d/${iot:ClientId}/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/p/*/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/s/*",
            "arn:aws:iot:us-east-1:144868213084:topicfilter/open/f/*"
          ]
        }
      ]
    }
  ]
}

And when comparing the 2 policy documents to each other in VS Code, it shows that the offending policy is the condition requiring a LastWillTopic:

image

"Condition": {
  "ArnEquals": {
    "iot:LastWillTopic": [
      "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
    ]
  }
}

So what do you guys think? Is this potentially an issue on mqttnet's side still? Does something have to be included in the pingreq that indicates a last will topic every time for example? Or maybe mqttnet should be reminding AWS of what the LastWill topic is for this client over some interval?

Or do you guys think this is an issue on AWS's side?

My client does specify a LastWill topic in the builder:

var options = optionsBuilder
    .WithClientId(username)
    .WithWillTopic($"{rootTopic}/s/{username}")
    .WithWillRetain(false)
    .WithWillPayload(new byte[] { 0 })
    .WithWillQualityOfServiceLevel(MqttQualityOfServiceLevel.AtLeastOnce)
    .WithProtocolVersion(MqttProtocolVersion.V500)
    .WithKeepAlivePeriod(TimeSpan.FromSeconds(35))
    .WithoutPacketFragmentation()
    .Build();

Also, I'm perfectly fine with not requiring a LastWill topic for my clients. Its more of a "nice to have" and not "mission critical" for my use case. So I feel comfortable going ahead and using the mqttnet client with this ammended policy! :)

I'll happily continue debugging with you guys though if you want to dig in further to make sure its not an issue on mqttnet's end.

@logicaloud @chkr1011 You have both been amazing to work with on this! Thank you very much! I'll gladly send Microsoft some nice words of my experience with you guys and this project if there is somewhere for me leave such a thing? If not, I hope they see this issue :)

Now I'll go open up a couple more issues with my other findings. All of which currently have workarounds. And I'll go contribute back to the docs.

@TCROC
Copy link
Author

TCROC commented May 7, 2023

And if you guys are reasonably confident that its an issue on AWS's end, you can feel free to close this issue

@logicaloud
Copy link
Contributor

Thank you :)

I wouldn't mind having another look at the Last Will condition sometime towards the end of the week, so I'd suggest keeping the issue open for a bit longer.

@chkr1011
Copy link
Collaborator

chkr1011 commented May 8, 2023

Glad to hear that. I will also create a PR with the changes I made because they are useful for others.

@logicaloud
Copy link
Contributor

Funny thing, the only problem with the Last Will condition seems to be that the iot:LastWillTopic is not supposed to be an array. So the condition without the [ and ] around "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}" should work.

"Condition": {
  "ArnEquals": {
    "iot:LastWillTopic":
      "arn:aws:iot:us-east-1:144868213084:topic/open/s/${iot:ClientId}"
  }
}

@TCROC
Copy link
Author

TCROC commented May 11, 2023

Yep I noticed that as well. I tried without the array and got the same behavior. Disconnects between 30 seconds and 5 minutes

@logicaloud
Copy link
Contributor

Can confirm; I just didn't wait long enough.

On a hunch, I have added the following condition after the iot:LastWillTopic condition:

"Condition": {
  "ForAllValues:StringEquals": {
    "iot:ConnectAttributes": [
      "PersistentConnect",
      "LastWill"
    ]
  }
}

After that, the connection stayed alive. The AWS docs are not really clear about that, at least not those that I have seen.

If it works for you then I think the issue is not in MQTTnet and we can close.

@TCROC
Copy link
Author

TCROC commented May 12, 2023

Ok cool. I'll be able to test that this weekend. I'll check it out and let you know.

@chkr1011 chkr1011 linked a pull request May 13, 2023 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question It is a question regarding the project
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants