Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLE ANCS Pairing Issue with ESP32 (Error Code: 0x66/102) (IDFGH-14177) #14977

Open
3 tasks done
nitin-jethava opened this issue Dec 4, 2024 · 12 comments
Open
3 tasks done
Assignees
Labels
Status: Opened Issue is new

Comments

@nitin-jethava
Copy link

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

General issue report

Issue Description:

The ESP32 running the ESP-IDF BLE ANCS example frequently fails to pair with an iOS device after repeated connect/disconnect cycles. The pairing fails with the error code 0x66 (102).


Steps to Reproduce:

  1. Run the built-in ESP-IDF BLE ANCS example on the ESP32.
  2. Connect the ESP32 to an iOS device.
  3. Disconnect and reconnect multiple times or move the iOS device in and out of range.
  4. Observe pairing failure with error 0x66.

Observation:

  • The issue occurs more frequently when the connection/disconnection cycle is fast.
  • It appears the device cannot proceed with pairing or retrieve ANCS services/characteristics due to timing issues.
  • Even if disconnections are delayed, the problem reoccurs after multiple cycles.

Logs:

I (31) boot: ESP-IDF v5.3.1 2nd stage bootloader
I (31) boot: compile time Dec  4 2024 12:49:33
I (31) boot: Multicore bootloader
I (35) boot: chip revision: v3.1
I (39) boot.esp32: SPI Speed      : 40MHz
I (44) boot.esp32: SPI Mode       : DIO
I (48) boot.esp32: SPI Flash Size : 2MB
I (53) boot: Enabling RNG early entropy source...
I (58) boot: Partition Table:
I (62) boot: ## Label            Usage          Type ST Offset   Length
I (69) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (77) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (84) boot:  2 factory          factory app      00 00 00010000 00100000
I (92) boot: End of partition table
I (96) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=20a7ch (133756) map
I (150) esp_image: segment 1: paddr=00030aa4 vaddr=3ffbdb60 size=04bech ( 19436) load
I (158) esp_image: segment 2: paddr=00035698 vaddr=40080000 size=0a980h ( 43392) load
I (175) esp_image: segment 3: paddr=00040020 vaddr=400d0020 size=7bb34h (506676) map
I (349) esp_image: segment 4: paddr=000bbb5c vaddr=4008a980 size=0caf0h ( 51952) load
I (381) boot: Loaded app from partition at offset 0x10000
I (381) boot: Disabling RNG early entropy source...
I (393) cpu_start: Multicore app
I (402) cpu_start: Pro cpu start user code
I (402) cpu_start: cpu freq: 160000000 Hz
I (402) app_init: Application information:
I (405) app_init: Project name:     ble_ancs
I (410) app_init: App version:      1
I (414) app_init: Compile time:     Dec  4 2024 12:48:21
I (420) app_init: ELF file SHA256:  516f9382e...
I (425) app_init: ESP-IDF:          v5.3.1
I (430) efuse_init: Min chip rev:     v0.0
I (435) efuse_init: Max chip rev:     v3.99 
I (440) efuse_init: Chip rev:         v3.1
I (445) heap_init: Initializing. RAM available for dynamic allocation:
I (452) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
I (458) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (464) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (470) heap_init: At 3FFC75F0 len 00018A10 (98 KiB): DRAM
I (476) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (482) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (489) heap_init: At 40097470 len 00008B90 (34 KiB): IRAM
I (497) spi_flash: detected chip: generic
I (500) spi_flash: flash io: dio
W (504) spi_flash: Detected size(16384k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (518) coexist: coex firmware version: 4482466
I (523) main_task: Started on CPU0
I (533) main_task: Calling app_main()
I (553) BTDM_INIT: BT controller compile version [b022216]
I (553) BTDM_INIT: Bluetooth MAC: ec:64:c9:09:fa:96
I (553) phy_init: phy_version 4830,54550f7,Jun 20 2024,14:22:08
I (863) BLE_ANCS: app_main init bluetooth
I (973) BLE_ANCS: REG_EVT
I (983) main_task: Returned from app_main()
I (1013) BLE_ANCS: advertising start success
I (20513) BLE_ANCS: ESP_GATTC_OPEN_EVT
W (22963) BT_SMP: FOR LE SC LTK IS USED INSTEAD OF STK
I (23783) addr: 6f 0d 8b 27 11 26 
I (23783) BLE_ANCS: pair status = success
I (26873) BLE_ANCS: ESP_GATTC_DIS_SRVC_CMPL_EVT
I (26933) BLE_ANCS: ESP_GATTC_CFG_MTU_EVT, Status 0, MTU 500, conn_id 0
I (26933) BLE_ANCS: ESP_GATTC_SEARCH_CMPL_EVT
I (26933) BLE_ANCS: Find Apple control point char
I (26943) BLE_ANCS: Find Apple noticification source char
I (26943) BLE_ANCS: Find Apple data source char
I (50123) BLE_ANCS: EventID:New message EventFlags:0x10 CategoryID:Social CategoryCount:1 NotificationUID:8
I (50123) BLE_ANCS: Get detailed information
I (50183) BLE_ANCS: recevice Notification Attributes response Command_id 0 NotificationUID 8
I (50183) Identifier: net.whatsapp.Wha
I (50183) Identifier: tsApp
I (50193) Title: Nitin
I (50193) Message: Hello
I (50193) MessageSize: 5
I (50203) Date: 20241204T131537
I (50203) NActionLabel: 43 6c 65 61 72 
I (61553) BLE_ANCS: EventID:New message EventFlags:0x10 CategoryID:Social CategoryCount:2 NotificationUID:9
I (61553) BLE_ANCS: Get detailed information
I (61673) BLE_ANCS: recevice Notification Attributes response Command_id 0 NotificationUID 9
I (61673) Identifier: net.whatsapp.Wha
I (61673) Identifier: tsApp
I (61673) Title: Nitin
I (61683) Message: Hmm
I (61683) MessageSize: 3
I (61683) Date: 20241204T131549
I (61693) NActionLabel: 43 6c 65 61 72 
I (76373) BLE_ANCS: EventID:New message EventFlags:0x10 CategoryID:Social CategoryCount:3 NotificationUID:10
I (76373) BLE_ANCS: Get detailed information
I (76433) BLE_ANCS: recevice Notification Attributes response Command_id 0 NotificationUID 10
I (76433) Identifier: net.whatsapp.Wha
I (76433) Identifier: tsApp
I (76443) Title: Nitin
I (76443) Message: 123456
I (76443) MessageSize: 6
I (76453) Date: 20241204T131604
I (76453) NActionLabel: 43 6c 65 61 72 
I (84593) BLE_ANCS: EventID:New message EventFlags:0x10 CategoryID:Social CategoryCount:4 NotificationUID:11
I (84593) BLE_ANCS: Get detailed information
I (84713) BLE_ANCS: recevice Notification Attributes response Command_id 0 NotificationUID 11
I (84713) Identifier: net.whatsapp.Wha
I (84713) Identifier: tsApp
I (84723) Title: Nitin
I (84723) Message: Nitin
I (84723) MessageSize: 5
I (84733) Date: 20241204T131612
I (84733) NActionLabel: 43 6c 65 61 72 
W (95743) BT_APPL: gattc_conn_cb: if=3 st=0 id=3 rsn=0x13
W (95743) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x13
I (95753) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13
I (95773) BLE_ANCS: advertising start success
I (98463) BLE_ANCS: ESP_GATTC_OPEN_EVT
I (98903) addr: 6f 0d 8b 27 11 26 
I (98903) BLE_ANCS: pair status = success
W (102493) BT_APPL: gattc_conn_cb: if=3 st=0 id=3 rsn=0x13
W (102493) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x13
I (102503) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13
I (102523) BLE_ANCS: advertising start success
I (113693) BLE_ANCS: ESP_GATTC_OPEN_EVT
W (114573) BT_APPL: gattc_conn_cb: if=3 st=0 id=3 rsn=0x8
E (114583) BT_BTM: Device not found

W (114583) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x8
W (114583) BT_APPL: bta_gattc_mark_bg_conn unable to find the bg connection mask for: 6f:0d:8b:27:11:26
I (114613) addr: 6f 0d 8b 27 11 26 
I (114613) BLE_ANCS: pair status = fail
I (114613) BLE_ANCS: fail reason = 0x66
I (114613) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x8
I (114633) BLE_ANCS: advertising start success
I (125223) BLE_ANCS: ESP_GATTC_OPEN_EVT
W (125593) BT_APPL: gattc_conn_cb: if=3 st=0 id=3 rsn=0x13
E (125603) BT_BTM: Device not found

W (125603) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x13
W (125603) BT_APPL: bta_gattc_mark_bg_conn unable to find the bg connection mask for: 6f:0d:8b:27:11:26
I (125613) addr: 6f 0d 8b 27 11 26 
I (125623) BLE_ANCS: pair status = fail
I (125623) BLE_ANCS: fail reason = 0x66
I (125623) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13
I (125653) BLE_ANCS: advertising start success
E (133283) BT_APPL: bta_gattc_process_indicate indication/notif for unknown device, ignore
I (133283) BLE_ANCS: ESP_GATTC_OPEN_EVT
W (133633) BT_APPL: gattc_conn_cb: if=3 st=0 id=3 rsn=0x13
E (133643) BT_BTM: Device not found

W (133643) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x13
W (133643) BT_APPL: bta_gattc_mark_bg_conn unable to find the bg connection mask for: 73:6e:05:0f:9e:24
I (133653) addr: 73 6e 05 0f 9e 24 
I (133663) BLE_ANCS: pair status = fail
I (133663) BLE_ANCS: fail reason = 0x66
I (133663) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13
I (133693) BLE_ANCS: advertising start success
I (139713) BLE_ANCS: ESP_GATTC_OPEN_EVT
W (140023) BT_APPL: gattc_conn_cb: if=3 st=0 id=3 rsn=0x13
E (140033) BT_BTM: Device not found

W (140033) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x13
W (140033) BT_APPL: bta_gattc_mark_bg_conn unable to find the bg connection mask for: 73:6e:05:0f:9e:24
I (140043) addr: 73 6e 05 0f 9e 24 
I (140053) BLE_ANCS: pair status = fail
I (140053) BLE_ANCS: fail reason = 0x66
I (140053) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13
I (140083) BLE_ANCS: advertising start success
I (144123) BLE_ANCS: ESP_GATTC_OPEN_EVT
W (144403) BT_APPL: gattc_conn_cb: if=3 st=0 id=3 rsn=0x13
E (144413) BT_BTM: Device not found

W (144413) BT_HCI: hcif disc complete: hdl 0x0, rsn 0x13
W (144413) BT_APPL: bta_gattc_mark_bg_conn unable to find the bg connection mask for: 73:6e:05:0f:9e:24
I (144423) addr: 73 6e 05 0f 9e 24 
I (144433) BLE_ANCS: pair status = fail
I (144433) BLE_ANCS: fail reason = 0x66
I (144433) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13
I (144463) BLE_ANCS: advertising start success

Key observations from logs:

  1. Initial pairing succeeds (pair status = success).
  2. On reconnection attempts, pairing fails (pair status = fail, fail reason = 0x66).

Potential Cause:
The BLE stack might not handle rapid reconnections or repeated disconnection scenarios effectively, leading to pairing and service discovery failure.

@espressif-bot espressif-bot added the Status: Opened Issue is new label Dec 4, 2024
@github-actions github-actions bot changed the title BLE ANCS Pairing Issue with ESP32 (Error Code: 0x66/102) BLE ANCS Pairing Issue with ESP32 (Error Code: 0x66/102) (IDFGH-14177) Dec 4, 2024
@esp-zhp
Copy link
Collaborator

esp-zhp commented Dec 9, 2024


Based on your log, the reason for the SMP failure is 0x66, which indicates an SMP timeout. This means that during the SMP process, no packets were received from the peer device.

ESP_AUTH_SMP_CONN_TOUT, /*!< Pairing process failed due to connection timeout */

Additionally, I noticed the following in your log:
I (144433) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13.
The peer device appears to be repeatedly disconnecting.

I suspect that during the peer device’s disconnection process, the SMP procedure has not been completed, which ultimately results in the SMP failure.

I recommend ensuring the connection is maintained until the SMP process is successfully completed before disconnecting. This should help resolve your issue.


@nitin-jethava
Copy link
Author

Thank you for the insights, @esp-zhp

It seems the issue arises because the user is frequently moving in and out of the device's Bluetooth range. This causes the Bluetooth connection to toggle repeatedly, which may be leading to the SMP timeout and disconnection issues.

Given this behavior, could you suggest steps to prevent or mitigate this problem? Should we consider making adjustments in the firmware to handle frequent reconnections more effectively, or are there other solutions you would recommend?

@esp-zhp
Copy link
Collaborator

esp-zhp commented Dec 10, 2024

@nitin-jethava
It doesn’t seem like the disconnection was caused by exceeding the connection range. If it were, an error code of 0x08 would have been reported instead of 0x13. The 0x13 code indicates that the peer device actively terminated the connection, meaning the iPhone device disconnected intentionally.

There are two possibilities: one is that the Apple Bluetooth stack actively disconnected, and the other is that the user disconnected at the application layer.

I suspect the frequent disconnections might be caused by user actions. Do you have any other perspectives? do you know the scenario in which this issue occurred?

@nitin-jethava
Copy link
Author

Thank you for the detailed insights, @esp-zhp

During our QA/QC process, we intentionally toggled Bluetooth (turning it ON and OFF repeatedly), which seems to have triggered the issue. This resulted in the 0x66 error code.

After this, we found that the only way to restore communication with our device was to manually "forget device" from the iPhone. Without doing this, reconnecting was not possible.

Could you please advise on the best steps to handle this scenario? Should we implement specific measures in the firmware to manage such cases, or are there additional adjustments we can explore?

@esp-zhp
Copy link
Collaborator

esp-zhp commented Dec 10, 2024


Based on your recent description, the issue appears to have changed and seems to be related to SMP (Security Manager Protocol). It looks like the ESP device may have lost the bond information.

To verify this assumption, you can try printing the bond list using the following APIs:

  • int esp_ble_get_bond_device_num(void)
  • esp_err_t esp_ble_get_bond_device_list(int *dev_num, esp_ble_bond_dev_t *dev_list)

If it is confirmed to be an SMP-related issue, you can test the behavior without bonding to see if the problem still occurs.

Please let us know if further assistance is needed!


@nitin-jethava
Copy link
Author

You are correct, @esp-zhp. I had tested this, and the bond information indeed appears to be missing when the issue occurs.

Under normal circumstances, the int esp_ble_get_bond_device_num(void) API returns 1, indicating the presence of bond information. However, when the error (0x66) occurs, the same API returns 0, suggesting that the bond information is lost or in an idle state.

Since bonding is closely tied to the ANCS functionality in our application, this loss of bond information directly impacts the ability to receive notifications seamlessly.

Given this scenario, could you please advise on the best approach to handle such cases? Specifically:

  1. What is the ideal way to manage situations where bond information is lost?
  2. Are there recommended steps or fallback mechanisms we can implement in the firmware to gracefully handle this situation and re-establish the bond without requiring user intervention?

Your guidance on this matter would be greatly appreciated as it will help us optimize our handling of these disconnections.

@esp-zhp
Copy link
Collaborator

esp-zhp commented Dec 11, 2024

We have never encountered a situation where the ESP32 loses bond information.
Could you add additional debug logs to determine exactly when the bond information is lost?

I recommend adding debug logs at key events. For example, print the bond list after advertising starts (esp_ble_gap_start_advertising) but before the connection is established, then again after the connection is established (ESP_GATTC_CONNECT_EVT), and once more after disconnection (ESP_GATTC_DISCONNECT_EVT). This can help pinpoint where the bond information is being lost.

Please also note that the ESP32 bond list has a size limitation of 15 entries.

@nitin-jethava
Copy link
Author

Thank you for your input, @esp-zhp.

After reviewing your suggestions, I’ve added debug logs at key events to monitor the bond list and collected the following observations:

  1. Before Connection:
    The bond list size is logged after advertising starts, showing an initial value:

    I (48502) BLE_ANCS: advertising start success  
    I (48502) BONDING_LIST: Bond list size after restart adverties: 3  
    
  2. After Connection (ESP_GATTC_CONNECT_EVT):
    The bond list size remains consistent, confirming successful pairing:

    I (50842) BLE_ANCS: ESP_GATTC_CONNECT_EVT  
    I (50842) BONDING_LIST: Bond list size after connection: 3  
    I (50852) BLE_ANCS: ESP_GATTC_OPEN_EVT  
    I (47092) BLE_ANCS: pair status = success  
    
  3. After Disconnection (ESP_GATTC_DISCONNECT_EVT):
    The bond list size decreases when an error occurs. Specifically, after the 0x66 error:

    I (51832) BLE_ANCS: pair status = fail  
    I (51832) BLE_ANCS: fail reason = 0x66  
    I (51832) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x8  
    I (51842) BONDING_LIST: Bond list size after disconnection: 2  
    
  4. Reconnection Attempts:
    Subsequent attempts fail with bond list inconsistencies persisting:

    I (52902) BLE_ANCS: fail reason = 0x66  
    I (52902) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13  
    I (52912) BONDING_LIST: Bond list size after disconnection: 2  
    

From these logs, it’s clear that bond information is lost after certain disconnection scenarios, particularly when errors 0x66 and 0x13 are encountered. This affects the ANCS functionality, as bonding is critical for receiving notifications seamlessly.

I’ve attached the full log file (ble_ancs_logs_11_dec.txt) for your review.

Could you please advise on the following:

  1. The most effective approach to recover from bond information loss.
  2. Any additional configurations or changes in the pairing process that could help prevent this issue.

If you require additional logs, specific test scenarios, or further debugging, please feel free to let me know. I’ll be glad to provide any necessary details or support to help resolve this issue.

ble_ancs_logs_11_dec.txt

@esp-zhp
Copy link
Collaborator

esp-zhp commented Dec 12, 2024

Based on the additional logs you provided, we further confirmed the bond loss scenario. Could you check whether the example actively clears bond information?

If there’s no active clearing, we suspect this might be a bug in the protocol stack. I’ve added some HCI logs for further debugging.
0003-fix-bt-Display-hci-data-including-up-and-down.zip
Please apply the patch and share the complete logs with me.

Additionally, have you tried using the latest branch?

@nitin-jethava
Copy link
Author

Thank you for the clarification, @esp-zhp.

  1. No Active Clearing of Bond Information:
    After reviewing the example code, I can confirm that the bond information is not actively cleared in the example.

  2. Patch Application and Log Analysis:
    I applied the provided HCI log patch and rebuilt the project. Below are the observations from the debug logs:

    • Before Connection:
      I (322733) BLE_ANCS: advertising start success  
      I (322733) BONDING_LIST: Bond list size after restart adverties: 3
      
    • After Connection:
      I (324143) BLE_ANCS: ESP_GATTC_CONNECT_EVT  
      I (324153) BONDING_LIST: Bond list size after connection: 3
      
    • After Disconnection (Error 0x66):
      I (324543) BLE_ANCS: ESP_GATTC_DISCONNECT_EVT, reason = 0x13  
      I (324543) BONDING_LIST: Bond list size after disconnection: 2
      

    These logs indicate that the bond list size decreases after the disconnection, suggesting a bond loss scenario.

  3. Attached Full Logs:
    I have attached the full log file for your reference, which includes the HCI logs and additional debug outputs.

ble_ancs_logs_12_dec.txt

  1. ESP-IDF Version:
    I am currently using the stable release v5.3.2.

Could you please advise further based on these findings? Do you suspect this could be a protocol stack issue, and are there additional steps or tests you would recommend?

If you require additional logs, specific test scenarios, or further debugging, please feel free to let me know. I’ll be glad to provide any necessary details or support to help resolve this issue.

@esp-zhp
Copy link
Collaborator

esp-zhp commented Dec 13, 2024

@nitin-jethava
I fixed this issue for you,please test it and give me a feedback.thanks.
The issue occurs because when pairing fails, the ESP proactively clears the bond information to facilitate future use. However, the smartphone does not clear its bond information. Our current approach is to avoid clearing bond information. For reference, see btc_dm_ble_auth_cmpl_evt and bta_dm_remove_sec_dev_entry.

It is important to note that bond information is currently retained only when the failure reason is 0x66. For other failure reasons, bond information is still cleared.

If you prefer to always do not clear the bond information, you can modify the code as follows:

From:

if (sec_event.auth_cmpl.fail_reason != BTA_DM_AUTH_SMP_CONN_TOUT) {
    bta_dm_remove_sec_dev_entry(bda);
}

To:

if (sec_event.auth_cmpl.fail_reason != BTA_DM_AUTH_SMP_CONN_TOUT) {
    // bta_dm_remove_sec_dev_entry(bda);
}

0001-fix-ble-bluedroid-Fixed-BLE-bond-lost-if-pair-fail.zip

@nitin-jethava
Copy link
Author

nitin-jethava commented Dec 13, 2024

Thank you so much for your solution, @esp-zhp!

The issue is now resolved, and everything is working as expected. I appreciate your detailed explanation and the suggested changes regarding bond information retention. The fix has worked for me, and I can confirm the solution is effective.

Actually, I was doing this for the BLEDevice library, but I checked similar things in ESP-IDF and found that this issue is already addressed there. Now, I need to make similar changes in the BLEDevice library for Arduino. Can you suggest where I can start a thread for the BLEDevice library in the Arduino environment? I’m using PlatformIO for this, and I want to address this issue there as well.

Thanks again for your valuable support and for guiding me through this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Opened Issue is new
Projects
None yet
Development

No branches or pull requests

3 participants