-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[C++/Python] Fix bugs that were not exposed by broken C++ CI before #11557
[C++/Python] Fix bugs that were not exposed by broken C++ CI before #11557
Conversation
In my local environment, the Python build cannot pass
I'll work on this issue first, but it doesn't affect the correctness of Python client. So I think this PR can be merged first if it's verified. |
From https://github.com/apache/pulsar/pull/11557/checks?check_run_id=3240883659 we can see C++ tests all passed but Python install failed:
|
The website build ran into the same issue, and I think it's the new syntax of python3 type annotations https://docs.python.org/3/library/typing.html, I think we can remove the type annotation to make sure it works in both python2 and python3
@gaoran10 Would this work? |
I'll test it in my local env first, thx @tuteng |
There're still 2 failed Python tests:
|
@gaoran10 Could you help take a look? |
@BewareMyPower I think that is still a problem. If the compatibility with older c++ client was broken, we should roll it back and find a better way to solve the original issue. |
For example, we could attempt at detecting that is the old C++ client and defaulting the |
It's hard to detect whether the client is C++ client because the HTTP request doesn't contain any client related info. For older C++ client, we need to create topics manually in advance. |
That is something we must avoid because it breaks a lot of use cases. |
@merlimat It looks like the only solution might be reverting #10601 ? The original purpose of #10601 is to differ the cases that a topic doesn't exist and a topic is a non-partitioned topic. This new semantic has already been applied to KoP 2.8.0 to detect a non-partitioned topic because KoP doesn't support non-partitioned topics and there's no way to check if a given topic name represents a non-partitioned topic or a topic that doesn't exist before. However, this new behavior will also break other clients that try to use HTTP lookup. I think it's still controversial. We can discuss it in a new thread using email or another issue. |
I noticed @codelipenghui added the FYI @hangc0276 since you're the release manager of 2.8.1. |
3d0eaa6
to
ace8e16
Compare
@BewareMyPower OK,i am waiting you to split this PR into two independent PRs, and then merge to 2.8.1 |
I've reverted 3 commits that 2 commits of them will be included in another PR (#11578), which should not be merged to 2.8.1 branch). The left 1 commit ("Fix Python2 incompatibility") is related to broken documentation generation that blocks release of 2.7.3, FYI @congbobo184 |
61848d8
to
e083099
Compare
e083099
to
ab6db12
Compare
Now there're only 3 failed tests left. We need to fix the Python2 incompatibility issues. |
Now all Python tests are fixed, see the last few lines of # line 1389 to 1390
----------------------------------------------------------------------
Ran 73 tests in 24.356s
# line 1562, before this line there're some logs
OK
# line 1581 to 1585
----------------------------------------------------------------------
Ran 4 tests in 0.005s
OK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
### Motivation - fixes issue that cpp build doesn't fail when tests fail - merge after #11557 ### Additional context - #11557 (comment) - https://github.com/apache/pulsar/pull/10309/files#r683626563 ### Modifications - `set -o pipefail;` is required when using `| cat`
…pache#11557) Fixes apache#11551 ### Motivation Currently there're some bugs of C++ client and some tests cannot pass: 1. Introduced from apache#10601 because it changed the behavior of the admin API to get partition metadata while the C++ implementation relies on the original behavior to create topics automatically. So any test that uses HTTP lookup will fail. - AuthPluginTest.testTlsDetectHttps - AuthPluginToken.testTokenWithHttpUrl - BasicEndToEndTest.testHandlerReconnectionLogic - BasicEndToEndTest.testV2TopicHttp - ClientDeduplicationTest.testProducerDeduplication 2. Introduced from apache#11029 and apache#11486 , the implementation will iterate more than once even there's only one valid resolved IP address. - ClientTest.testConnectTimeout In addition, there's an existed flaky test from very early time: ClientTest.testLookupThrottling. Python tests are also broken. Because it must run after all C++ tests passed, they're also not exposed. 1. Some tests in `pulsar_test.py` might encounter `Timeout` error when creating producers or consumers. 2. Some tests in `schema_test.py` failed because some comparisons between two `ComplexRecord`s failed. Since the CI test of C++ client would never fail after apache#10309 (will be fixed by apache#11575), all PRs about C++ or Python client are not verified even if CI passed. Before apache#11575 is merged, we need to fix all existed bugs of C++ client. ### Modifications Corresponding to the above tests group, this PR adds following modifications: 1. Add the `?checkAllowAutoCreation=true` URL suffix to allow HTTP lookup to create topics automatically. 2. When iterating through a resolved IP list, increase the iterator first, then run the connection timer and try to connect the next IP. Regarding to the flaky `testLookupThrottling`, this PR adds a `client.close()` at the end of test and fix the `ClientImpl::close` implementation. Before this PR, if there're no producers or consumers in a client, the `close()` method wouldn't call `shutdown()` to close connection poll and executors. Only after the `Client` instance was destructed would the `shutdown()` method be called. In this case, this PR calls `handleClose` instead of invoking callback directly. In addition, change the log level of this test to debug. This PR also fixes the failed timeout Python tests, some are caused by incorrect import of classes, some are caused by `client` was not closed. Regarding to Python schema tests, in Python2, `self.__ne__(other)` is not equivalent to `not self.__eq__(other)` when the default `__eq__` implementation is overwritten. If a `Record` object has a field whose type is also `Record`, the `Record.__ne__` method will be called, see https://github.com/apache/pulsar/blob/ddb5fb0e062c2fe0967efce2a443a31f9cd12c07/pulsar-client-cpp/python/pulsar/schema/definition.py#L138-L139 but it just uses the default implementation to check whether they're not equal. The custom `__eq__` method won't be called. Therefore, this PR implement `Record.__ne__` explicitly to call `Record.__eq__` so that the comparison will work for Python2. ### Verifying this change We can only check the workflow output to verify this change.
### Motivation - fixes issue that cpp build doesn't fail when tests fail - merge after apache#11557 ### Additional context - apache#11557 (comment) - https://github.com/apache/pulsar/pull/10309/files#r683626563 ### Modifications - `set -o pipefail;` is required when using `| cat`
…11557) Fixes #11551 ### Motivation Currently there're some bugs of C++ client and some tests cannot pass: 1. Introduced from #10601 because it changed the behavior of the admin API to get partition metadata while the C++ implementation relies on the original behavior to create topics automatically. So any test that uses HTTP lookup will fail. - AuthPluginTest.testTlsDetectHttps - AuthPluginToken.testTokenWithHttpUrl - BasicEndToEndTest.testHandlerReconnectionLogic - BasicEndToEndTest.testV2TopicHttp - ClientDeduplicationTest.testProducerDeduplication 2. Introduced from #11029 and #11486 , the implementation will iterate more than once even there's only one valid resolved IP address. - ClientTest.testConnectTimeout In addition, there's an existed flaky test from very early time: ClientTest.testLookupThrottling. Python tests are also broken. Because it must run after all C++ tests passed, they're also not exposed. 1. Some tests in `pulsar_test.py` might encounter `Timeout` error when creating producers or consumers. 2. Some tests in `schema_test.py` failed because some comparisons between two `ComplexRecord`s failed. Since the CI test of C++ client would never fail after #10309 (will be fixed by #11575), all PRs about C++ or Python client are not verified even if CI passed. Before #11575 is merged, we need to fix all existed bugs of C++ client. ### Modifications Corresponding to the above tests group, this PR adds following modifications: 1. Add the `?checkAllowAutoCreation=true` URL suffix to allow HTTP lookup to create topics automatically. 2. When iterating through a resolved IP list, increase the iterator first, then run the connection timer and try to connect the next IP. Regarding to the flaky `testLookupThrottling`, this PR adds a `client.close()` at the end of test and fix the `ClientImpl::close` implementation. Before this PR, if there're no producers or consumers in a client, the `close()` method wouldn't call `shutdown()` to close connection poll and executors. Only after the `Client` instance was destructed would the `shutdown()` method be called. In this case, this PR calls `handleClose` instead of invoking callback directly. In addition, change the log level of this test to debug. This PR also fixes the failed timeout Python tests, some are caused by incorrect import of classes, some are caused by `client` was not closed. Regarding to Python schema tests, in Python2, `self.__ne__(other)` is not equivalent to `not self.__eq__(other)` when the default `__eq__` implementation is overwritten. If a `Record` object has a field whose type is also `Record`, the `Record.__ne__` method will be called, see https://github.com/apache/pulsar/blob/ddb5fb0e062c2fe0967efce2a443a31f9cd12c07/pulsar-client-cpp/python/pulsar/schema/definition.py#L138-L139 but it just uses the default implementation to check whether they're not equal. The custom `__eq__` method won't be called. Therefore, this PR implement `Record.__ne__` explicitly to call `Record.__eq__` so that the comparison will work for Python2. ### Verifying this change We can only check the workflow output to verify this change. (cherry picked from commit 4919a82)
### Motivation - fixes issue that cpp build doesn't fail when tests fail - merge after #11557 ### Additional context - #11557 (comment) - https://github.com/apache/pulsar/pull/10309/files#r683626563 ### Modifications - `set -o pipefail;` is required when using `| cat` (cherry picked from commit 5ae0554)
…pache#11557) Fixes apache#11551 ### Motivation Currently there're some bugs of C++ client and some tests cannot pass: 1. Introduced from apache#10601 because it changed the behavior of the admin API to get partition metadata while the C++ implementation relies on the original behavior to create topics automatically. So any test that uses HTTP lookup will fail. - AuthPluginTest.testTlsDetectHttps - AuthPluginToken.testTokenWithHttpUrl - BasicEndToEndTest.testHandlerReconnectionLogic - BasicEndToEndTest.testV2TopicHttp - ClientDeduplicationTest.testProducerDeduplication 2. Introduced from apache#11029 and apache#11486 , the implementation will iterate more than once even there's only one valid resolved IP address. - ClientTest.testConnectTimeout In addition, there's an existed flaky test from very early time: ClientTest.testLookupThrottling. Python tests are also broken. Because it must run after all C++ tests passed, they're also not exposed. 1. Some tests in `pulsar_test.py` might encounter `Timeout` error when creating producers or consumers. 2. Some tests in `schema_test.py` failed because some comparisons between two `ComplexRecord`s failed. Since the CI test of C++ client would never fail after apache#10309 (will be fixed by apache#11575), all PRs about C++ or Python client are not verified even if CI passed. Before apache#11575 is merged, we need to fix all existed bugs of C++ client. ### Modifications Corresponding to the above tests group, this PR adds following modifications: 1. Add the `?checkAllowAutoCreation=true` URL suffix to allow HTTP lookup to create topics automatically. 2. When iterating through a resolved IP list, increase the iterator first, then run the connection timer and try to connect the next IP. Regarding to the flaky `testLookupThrottling`, this PR adds a `client.close()` at the end of test and fix the `ClientImpl::close` implementation. Before this PR, if there're no producers or consumers in a client, the `close()` method wouldn't call `shutdown()` to close connection poll and executors. Only after the `Client` instance was destructed would the `shutdown()` method be called. In this case, this PR calls `handleClose` instead of invoking callback directly. In addition, change the log level of this test to debug. This PR also fixes the failed timeout Python tests, some are caused by incorrect import of classes, some are caused by `client` was not closed. Regarding to Python schema tests, in Python2, `self.__ne__(other)` is not equivalent to `not self.__eq__(other)` when the default `__eq__` implementation is overwritten. If a `Record` object has a field whose type is also `Record`, the `Record.__ne__` method will be called, see https://github.com/apache/pulsar/blob/ddb5fb0e062c2fe0967efce2a443a31f9cd12c07/pulsar-client-cpp/python/pulsar/schema/definition.py#L138-L139 but it just uses the default implementation to check whether they're not equal. The custom `__eq__` method won't be called. Therefore, this PR implement `Record.__ne__` explicitly to call `Record.__eq__` so that the comparison will work for Python2. ### Verifying this change We can only check the workflow output to verify this change.
### Motivation - fixes issue that cpp build doesn't fail when tests fail - merge after apache#11557 ### Additional context - apache#11557 (comment) - https://github.com/apache/pulsar/pull/10309/files#r683626563 ### Modifications - `set -o pipefail;` is required when using `| cat`
Fixes #11551
Motivation
Currently there're some bugs of C++ client and some tests cannot pass:
In addition, there's an existed flaky test from very early time: ClientTest.testLookupThrottling.
Python tests are also broken. Because it must run after all C++ tests passed, they're also not exposed.
pulsar_test.py
might encounterTimeout
error when creating producers or consumers.schema_test.py
failed because some comparisons between twoComplexRecord
s failed.Since the CI test of C++ client would never fail after #10309 (will be fixed by #11575), all PRs about C++ or Python client are not verified even if CI passed. Before #11575 is merged, we need to fix all existed bugs of C++ client.
Modifications
Corresponding to the above tests group, this PR adds following modifications:
?checkAllowAutoCreation=true
URL suffix to allow HTTP lookup to create topics automatically.Regarding to the flaky
testLookupThrottling
, this PR adds aclient.close()
at the end of test and fix theClientImpl::close
implementation. Before this PR, if there're no producers or consumers in a client, theclose()
method wouldn't callshutdown()
to close connection poll and executors. Only after theClient
instance was destructed would theshutdown()
method be called. In this case, this PR callshandleClose
instead of invoking callback directly. In addition, change the log level of this test to debug.This PR also fixes the failed timeout Python tests, some are caused by incorrect import of classes, some are caused by
client
was not closed.Regarding to Python schema tests, in Python2,
self.__ne__(other)
is not equivalent tonot self.__eq__(other)
when the default__eq__
implementation is overwritten. If aRecord
object has a field whose type is alsoRecord
, theRecord.__ne__
method will be called, seepulsar/pulsar-client-cpp/python/pulsar/schema/definition.py
Lines 138 to 139 in ddb5fb0
but it just uses the default implementation to check whether they're not equal. The custom
__eq__
method won't be called. Therefore, this PR implementRecord.__ne__
explicitly to callRecord.__eq__
so that the comparison will work for Python2.Verifying this change
We can only check the workflow output to verify this change.