Fix class BoundedAttributes to have RLock rather than Lock #3859

Merged
merged 10 commits into from
May 23, 2024

Conversation

hyoinandout
Contributor

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

This change makes the BoundedAttributes class use an RLock rather than a Lock.
Fixes #3858 (issue).
The motivation for this PR is a deadlock that was observed while using the OpenTelemetry _logs API.
No additional dependencies are required for this change.
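
For readers unfamiliar with the difference, here is a minimal sketch of why the lock type matters when the thread that already holds the lock tries to take it again. It uses only the standard library; `can_reenter` is a hypothetical helper written for this illustration and is not part of OpenTelemetry. The non-blocking acquire is there so that the plain Lock reports failure instead of hanging.

```python
import threading


def can_reenter(lock) -> bool:
    """Return True if the thread already holding `lock` can acquire it again."""
    with lock:
        # Non-blocking attempt: a plain Lock returns False here,
        # whereas a blocking acquire would wait forever (a deadlock).
        acquired = lock.acquire(blocking=False)
        if acquired:
            lock.release()
        return acquired


print(can_reenter(threading.Lock()))   # False
print(can_reenter(threading.RLock()))  # True
```

With a blocking acquire, the first case is the kind of self-deadlock a reentrant lock avoids; whether that is exactly the path hit in the linked issue is described there, not here.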

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

It seems that a dedicated test case is required.

Does This PR Require a Contrib Repo Change?

Answer the following question based on these examples of changes that would require a Contrib Repo Change:

  • The OTel specification has changed which prompted this PR to update the method interfaces of opentelemetry-api/ or opentelemetry-sdk/

  • The method interfaces of test/util have changed

  • Scripts in scripts/ that were copied over to the Contrib repo have changed

  • Configuration files that were copied over to the Contrib repo have changed (when consistency between repositories is applicable) such as in

    • pyproject.toml
    • isort.cfg
    • .flake8
  • When a new .github/CODEOWNER is added

  • Major changes to project information, such as in:

    • README.md
    • CONTRIBUTING.md
  • Yes. - Link to PR:

  • No.

Checklist:

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@hyoinandout hyoinandout requested a review from a team April 18, 2024 00:25

linux-foundation-easycla bot commented Apr 18, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

Contributor

@xrmx xrmx left a comment

Thanks for the PR, it would indeed be nice to have a supporting test case.

@xrmx
Contributor

xrmx commented Apr 29, 2024

@hyoinandout please rebase and add a changelog entry so that the tests go green

@hyoinandout
Contributor Author

@xrmx Thank you for the heads-up. I rebased my branch on top of the main branch and added a changelog entry for this PR.

@xrmx
Contributor

xrmx commented May 6, 2024

@hyoinandout The test you added is failing on a specific combination

@hyoinandout
Contributor Author

hyoinandout commented May 7, 2024

@xrmx I will keep an eye on it, but since I am not an expert in this build environment, I'm not sure I will find the reason.
Could you share your opinion on why the test is failing in that specific environment?

@emdneto
Member

emdneto commented May 12, 2024

From the pipelines, the test is failing on the Windows 2019 build check. From the logs, it seems that Thread 2 reads the value from bdict before Thread 1 modifies it, and then Thread 2 modifies it before Thread 1 writes its doubled value. As a result, the final value may not be the expected 4x. The behavior on Windows seems unstable.
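
The interleaving described above can be replayed sequentially to see why the result is not 4x. This is only an illustrative sketch: a plain dict stands in for bdict, and the starting value of 1 and the variable names are assumptions, not taken from the actual test.

```python
# Plain dict standing in for the shared mapping (assumption for illustration).
shared = {"x": 1}

thread2_seen = shared["x"]      # Thread 2 reads 1
thread1_seen = shared["x"]      # Thread 1 reads 1, before any write lands
shared["x"] = thread2_seen * 2  # Thread 2 writes 2
shared["x"] = thread1_seen * 2  # Thread 1 writes 2, overwriting instead of doubling again

print(shared["x"])  # 2, not the 4 that two sequential doublings would give
```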

@ocelotl
Contributor

ocelotl commented May 14, 2024

@lzchen I think we should probably skip this test case for the failing Windows environment, what do you think?

@xrmx
Contributor

xrmx commented May 14, 2024

The test is racy, though: the lock is in __setitem__ and does not cover the reading of the actual value in __getitem__. So maybe just set a fixed value and assert that all elements have been set. We are testing that there was a deadlock and now it's gone, so it should be fine.
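
A rough sketch of that kind of test, under stated assumptions: `TinyBounded` is a hypothetical stand-in with the lock only in __setitem__, not the real BoundedAttributes, and `run_writers`, the key names, and the thread count are made up for the example. Each thread writes a fixed value, so the outcome does not depend on read/write ordering.

```python
import threading


class TinyBounded:
    """Hypothetical stand-in for a lock-guarded mapping (not the real class)."""

    def __init__(self):
        self._lock = threading.RLock()
        self._data = {}

    def __setitem__(self, key, value):
        with self._lock:  # only writes are guarded, as noted above
            self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __len__(self):
        return len(self._data)


def run_writers(mapping, n_threads=4, value="fixed"):
    """Write a fixed value from each thread; return True if all threads finished."""

    def write(i):
        mapping[f"key-{i}"] = value

    # Daemon threads so a deadlocked writer cannot keep the process alive.
    threads = [
        threading.Thread(target=write, args=(i,), daemon=True)
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=5)  # a deadlock shows up as a thread still alive
    return all(not t.is_alive() for t in threads)


bdict = TinyBounded()
finished = run_writers(bdict)
print(finished, len(bdict))
```

The join timeout turns a hang into a failed assertion rather than a stuck test run.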

…k and assert accordingly

This test case passes only if BoundedAttributes uses RLock, not Lock
@hyoinandout
Contributor Author

@xrmx Now I fully get the point. I really appreciate your explanation, and I have just pushed a commit that tests it that way.

@hyoinandout
Contributor Author

See the comment for the deadlock which motivated this PR.

@ocelotl ocelotl merged commit 8b80a28 into open-telemetry:main May 23, 2024
233 checks passed
@hyoinandout hyoinandout deleted the fix branch May 27, 2024 01:13
Development

Successfully merging this pull request may close these issues.

HTTP Worker Thread Blocked and Invalid Type Warning from Opentelemetry _logs
4 participants