[C++/Python] Fix bugs that were not exposed by broken C++ CI before #11557

Merged


BewareMyPower
Contributor

@BewareMyPower BewareMyPower commented Aug 4, 2021

Fixes #11551

Motivation

Currently there are some bugs in the C++ client, and some tests cannot pass:

  1. Introduced by "Fix getting partition metadata of a nonexistent topic returns 0" (#10601), which changed the behavior of the admin API for getting partition metadata, while the C++ implementation relies on the original behavior to create topics automatically. As a result, any test that uses HTTP lookup fails.
    • AuthPluginTest.testTlsDetectHttps
    • AuthPluginToken.testTokenWithHttpUrl
    • BasicEndToEndTest.testHandlerReconnectionLogic
    • BasicEndToEndTest.testV2TopicHttp
    • ClientDeduplicationTest.testProducerDeduplication
  2. Introduced by "[C++][Python] Add connection timeout configuration" (#11029) and "[Issue 11485][C++] Connect timer cancellation does not call timeout callback" (#11486): the implementation iterates more than once even when there is only one valid resolved IP address.
    • ClientTest.testConnectTimeout

In addition, there is a long-standing flaky test: ClientTest.testLookupThrottling.

Python tests are also broken. Because they only run after all C++ tests have passed, their failures were not exposed either.

  1. Some tests in pulsar_test.py might encounter a Timeout error when creating producers or consumers.
  2. Some tests in schema_test.py failed because comparisons between two ComplexRecord objects failed.

Since the C++ client CI could never fail after #10309 (to be fixed by #11575), PRs touching the C++ or Python client were not actually verified even when CI passed. Before #11575 is merged, we need to fix all existing bugs in the C++ client.

Modifications

Corresponding to the test groups above, this PR makes the following modifications:

  1. Add the ?checkAllowAutoCreation=true URL suffix to allow HTTP lookup to create topics automatically.
  2. When iterating through the resolved IP list, advance the iterator first, then start the connection timer and try to connect to the next IP.
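
The two modifications above can be sketched in Python as follows. This is an illustrative model written for Python 3, not the C++ client code: the function names, the `/admin/v2/persistent/.../partitions` path layout, and the `try_connect` callback are assumptions made for the example. Only the `?checkAllowAutoCreation=true` suffix and the "advance first, then connect" order come from the description.

```python
from urllib.parse import quote


def partition_metadata_url(service_url, tenant, namespace, topic):
    """Build an HTTP lookup URL for partition metadata (path layout assumed)."""
    path = "/admin/v2/persistent/{}/{}/{}/partitions".format(
        quote(tenant, safe=""), quote(namespace, safe=""), quote(topic, safe=""))
    # Suffix from the PR description: lets the HTTP lookup create the topic
    # automatically.
    return service_url.rstrip("/") + path + "?checkAllowAutoCreation=true"


def connect_to_first_reachable(endpoints, try_connect):
    """Try each resolved endpoint at most once.

    `try_connect(endpoint)` is a hypothetical callback standing in for
    "start the connection timer and attempt to connect". It returns True
    on success and False on failure or timeout.
    """
    index = -1
    attempts = []
    while True:
        index += 1  # advance first ...
        if index >= len(endpoints):
            return None, attempts  # every endpoint was tried exactly once
        attempts.append(endpoints[index])  # ... then attempt this endpoint
        if try_connect(endpoints[index]):
            return endpoints[index], attempts
```

With a single resolved address that never connects, the sketch records exactly one attempt, which is the behavior ClientTest.testConnectTimeout expects.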

Regarding the flaky testLookupThrottling, this PR adds a client.close() at the end of the test and fixes the ClientImpl::close implementation. Before this PR, if a client had no producers or consumers, the close() method would not call shutdown() to close the connection pool and executors; shutdown() was only called when the Client instance was destructed. For this case, this PR calls handleClose instead of invoking the callback directly. In addition, the log level of this test is changed to debug.
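
A minimal Python sketch of the close() behavior described above. The class and its members (`SketchClient`, `handlers`, `handle_close`) are hypothetical stand-ins, not the actual ClientImpl code; the point is only that the empty case goes through the same completion path, so shutdown() runs before the callback is invoked.

```python
class SketchClient(object):
    """Toy stand-in for a client that owns a connection pool and executors."""

    def __init__(self):
        self.handlers = []            # open producers and consumers
        self.shutdown_called = False

    def shutdown(self):
        # Stands in for closing the connection pool and executors.
        self.shutdown_called = True

    def handle_close(self, callback):
        # Common completion path: release resources, then report the result.
        self.shutdown()
        callback("ok")

    def close(self, callback):
        # Close every open handler first (there are none in the empty case),
        # then always finish through handle_close instead of invoking the
        # callback directly.
        for handler in self.handlers:
            handler.close()
        self.handlers = []
        self.handle_close(callback)
```

Calling `close()` on a freshly created `SketchClient` sets `shutdown_called` to True, without waiting for the object to be destroyed.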

This PR also fixes the Python tests that failed with timeouts: some were caused by incorrect imports of classes, others by a client that was not closed.

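Regarding the Python schema tests: in Python 2, self.__ne__(other) is not equivalent to not self.__eq__(other) when the default __eq__ implementation is overridden. If a Record object has a field whose type is also Record, the Record.__ne__ method will be called, see https://github.com/apache/pulsar/blob/ddb5fb0e062c2fe0967efce2a443a31f9cd12c07/pulsar-client-cpp/python/pulsar/schema/definition.py#L138-L139: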

if self.__getattribute__(field) != other.__getattribute__(field):
    return False

but it just uses the default implementation to check whether they're not equal; the custom __eq__ method is not called. Therefore, this PR implements Record.__ne__ explicitly to call Record.__eq__ so that the comparison works in Python 2.
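
The fix can be illustrated with a simplified stand-in for Record. This is a sketch, not the real pulsar.schema class: the `_fields` dictionary and the constructor are assumptions made for the example. Python 3 derives `__ne__` from `__eq__` automatically, so the explicit `__ne__` only changes behavior on Python 2, but it is harmless on both.

```python
class Record(object):
    """Simplified stand-in for a schema record with value-based equality."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __eq__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        if sorted(self._fields) != sorted(other._fields):
            return False
        for name in self._fields:
            # For nested Record values, `!=` dispatches to Record.__ne__.
            if self._fields[name] != other._fields[name]:
                return False
        return True

    def __ne__(self, other):
        # Python 2 does not derive __ne__ from __eq__, so define it explicitly.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


# Two records with equal nested records compare equal on Python 2 and 3.
a = Record(x=1, inner=Record(y=2))
b = Record(x=1, inner=Record(y=2))
assert a == b and not (a != b)
```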

Verifying this change

We can only check the workflow output to verify this change.

@BewareMyPower BewareMyPower added type/bug The PR fixed a bug or issue reported a bug component/c++ doc-not-needed Your PR changes do not impact docs labels Aug 4, 2021
@BewareMyPower BewareMyPower self-assigned this Aug 4, 2021
@BewareMyPower BewareMyPower requested review from merlimat, massakam, codelipenghui, jiazhai and sijie and removed request for merlimat August 4, 2021 11:20
@BewareMyPower
Contributor Author

In my local environment, the Python build does not pass:

---- Installing Python Wheel file
total 43M
drwxr-xr-x  4 root root 128 Aug  4 10:33 .
drwxr-xr-x 22 root root 704 Aug  4 10:26 ..
-rw-r--r--  1 root root 20M Aug  4 10:33 pulsar_client-2.9.0-cp27-cp27mu-linux_x86_64.whl
-rw-r--r--  1 root root 22M Jul 20 06:04 pulsar_client-2.9.0-cp37-cp37m-linux_x86_64.whl
pulsar_client-2.9.0-cp27-cp27mu-linux_x86_64.whl
pulsar_client-2.9.0-cp37-cp37m-linux_x86_64.whl
dist/pulsar_client-2.9.0-cp27-cp27mu-linux_x86_64.whl
pulsar_client-2.9.0-cp37-cp37m-linux_x86_64.whl[all]
DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
WARNING: Requirement 'pulsar_client-2.9.0-cp37-cp37m-linux_x86_64.whl[all]' looks like a filename, but the file does not exist
ERROR: pulsar_client-2.9.0-cp37-cp37m-linux_x86_64.whl is not a supported wheel on this platform.

I'll work on this issue first, but it doesn't affect the correctness of Python client. So I think this PR can be merged first if it's verified.

@BewareMyPower BewareMyPower changed the title [C++] Fix bugs that were not exposed by broken C++ tests before [C++] Fix bugs that were not exposed by broken C++ CI before Aug 4, 2021
@BewareMyPower
Contributor Author

From https://github.com/apache/pulsar/pull/11557/checks?check_run_id=3240883659 we can see that all C++ tests passed, but the Python tests failed at import after the wheel was installed:

Processing ./dist/pulsar_client-2.9.0-cp27-cp27mu-linux_x86_64.whl
Collecting certifi
  Downloading certifi-2021.5.30-py2.py3-none-any.whl (145 kB)
Collecting enum34>=1.1.9; python_version < "3.4"
  Downloading enum34-1.1.10-py2-none-any.whl (11 kB)
Collecting six
  Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Collecting apache-bookkeeper-client>=4.9.2; extra == "all"
  Downloading apache_bookkeeper_client-4.14.1-py2.py3-none-any.whl (70 kB)
Collecting protobuf>=3.6.1; extra == "all"
  Downloading protobuf-3.17.3-cp27-cp27mu-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.0 MB)
Collecting grpcio<1.28,>=1.8.2; extra == "all"
  Downloading grpcio-1.27.2-cp27-cp27mu-manylinux2010_x86_64.whl (2.6 MB)
Collecting prometheus-client; extra == "all"
  Downloading prometheus_client-0.11.0-py2.py3-none-any.whl (56 kB)
Collecting ratelimit; extra == "all"
  Downloading ratelimit-2.2.1.tar.gz (5.3 kB)
Collecting fastavro==0.24.0; extra == "all"
  Downloading fastavro-0.24.0-cp27-cp27mu-manylinux2010_x86_64.whl (1.1 MB)
Requirement already satisfied: setuptools>=34.0.0 in /usr/local/lib/python2.7/dist-packages (from apache-bookkeeper-client>=4.9.2; extra == "all"->pulsar-client==2.9.0) (44.1.1)
Collecting futures>=3.2.0; python_version < "3.2"
  Downloading futures-3.3.0-py2-none-any.whl (16 kB)
Collecting pytz
  Downloading pytz-2021.1-py2.py3-none-any.whl (510 kB)
Collecting pymmh3>=0.0.5
  Downloading pymmh3-0.0.5-py2.py3-none-any.whl (7.4 kB)
Collecting requests<3.0.0dev,>=2.18.0
  Downloading requests-2.26.0-py2.py3-none-any.whl (62 kB)
Collecting urllib3<1.27,>=1.21.1
  Downloading urllib3-1.26.6-py2.py3-none-any.whl (138 kB)
Collecting idna<3,>=2.5; python_version < "3"
  Downloading idna-2.10-py2.py3-none-any.whl (58 kB)
Collecting chardet<5,>=3.0.2; python_version < "3"
  Downloading chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Building wheels for collected packages: ratelimit
  Building wheel for ratelimit (setup.py): started
  Building wheel for ratelimit (setup.py): finished with status 'done'
  Created wheel for ratelimit: filename=ratelimit-2.2.1-py2-none-any.whl size=5892 sha256=79a3dbf61feccd7e785b26d7462c08ad951964ef08fe59d24532fed9e06eea59
  Stored in directory: /root/.cache/pip/wheels/f1/a2/98/e1ec50002af5a5a7370a5c330e5a4b1fbc8893c9088564a77e
Successfully built ratelimit
Installing collected packages: certifi, enum34, six, futures, pytz, grpcio, pymmh3, urllib3, idna, chardet, requests, protobuf, apache-bookkeeper-client, prometheus-client, ratelimit, fastavro, pulsar-client
Successfully installed apache-bookkeeper-client-4.14.1 certifi-2021.5.30 chardet-4.0.0 enum34-1.1.10 fastavro-0.24.0 futures-3.3.0 grpcio-1.27.2 idna-2.10 prometheus-client-0.11.0 protobuf-3.17.3 pulsar-client-2.9.0 pymmh3-0.0.5 pytz-2021.1 ratelimit-2.2.1 requests-2.26.0 six-1.16.0 urllib3-1.26.6
---- Running Python unit tests
/tmp /pulsar/pulsar-client-cpp/python /pulsar/pulsar-client-cpp
Traceback (most recent call last):
  File "pulsar_test.py", line 28, in <module>
    from pulsar import Client, MessageId, \
  File "/usr/local/lib/python2.7/dist-packages/pulsar/__init__.py", line 112, in <module>
    from pulsar import schema
  File "/usr/local/lib/python2.7/dist-packages/pulsar/schema/__init__.py", line 24, in <module>
    from .schema_avro import AvroSchema
  File "/usr/local/lib/python2.7/dist-packages/pulsar/schema/schema_avro.py", line 62
    def encode_dict(self, d: dict):

@tuteng
Member

tuteng commented Aug 4, 2021

The website build ran into the same issue. I think it's caused by the Python 3 type annotation syntax (https://docs.python.org/3/library/typing.html). We can remove the type annotation so that the code works in both Python 2 and Python 3:

def encode_dict(self, d):
    obj = {}
    if isinstance(d, dict):
        for k, v in d.items():
            obj[k] = self._get_serialized_value(v)
    return obj

@gaoran10 Would this work?

@BewareMyPower
Contributor Author

I'll test it in my local env first, thx @tuteng

@BewareMyPower
Contributor Author

There are still 2 failed Python tests, plus 5 tests that error out with timeouts:

======================================================================
ERROR: test_producer_routing_mode (__main__.PulsarTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pulsar_test.py", line 639, in test_producer_routing_mode
    message_routing_mode=PartitionsRoutingMode.UseSinglePartition)
  File "/usr/local/lib/python2.7/dist-packages/pulsar/__init__.py", line 589, in create_producer
    p._producer = self._client.create_producer(topic, conf)
Timeout: Pulsar error: TimeOut

======================================================================
ERROR: test_producer_send (__main__.PulsarTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pulsar_test.py", line 150, in test_producer_send
    producer = client.create_producer(topic)
  File "/usr/local/lib/python2.7/dist-packages/pulsar/__init__.py", line 589, in create_producer
    p._producer = self._client.create_producer(topic, conf)
Timeout: Pulsar error: TimeOut

======================================================================
ERROR: test_producer_send_async (__main__.PulsarTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pulsar_test.py", line 129, in test_producer_send_async
    producer = client.create_producer('my-python-topic')
  File "/usr/local/lib/python2.7/dist-packages/pulsar/__init__.py", line 589, in create_producer
    p._producer = self._client.create_producer(topic, conf)
Timeout: Pulsar error: TimeOut

======================================================================
ERROR: test_producer_sequence_after_reconnection (__main__.PulsarTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pulsar_test.py", line 570, in test_producer_sequence_after_reconnection
    producer = client.create_producer(topic, producer_name='my-producer-name')
  File "/usr/local/lib/python2.7/dist-packages/pulsar/__init__.py", line 589, in create_producer
    p._producer = self._client.create_producer(topic, conf)
Timeout: Pulsar error: TimeOut

======================================================================
ERROR: test_publish_compact_and_consume (__main__.PulsarTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "pulsar_test.py", line 723, in test_publish_compact_and_consume
    producer = client.create_producer(topic, producer_name='my-producer-name', batching_enabled=False)
  File "/usr/local/lib/python2.7/dist-packages/pulsar/__init__.py", line 589, in create_producer
    p._producer = self._client.create_producer(topic, conf)
Timeout: Pulsar error: TimeOut

======================================================================
FAIL: test_produce_and_consume_complex_schema_data (schema_test.SchemaTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/schema_test.py", line 1024, in test_produce_and_consume_complex_schema_data
    produce_consume_test('avro')
  File "/tmp/schema_test.py", line 1007, in produce_consume_test
    self.assertEqual(value, r)
AssertionError: <schema_test.ComplexRecord object at 0x7fd415a15410> != <schema_test.ComplexRecord object at 0x7fd40b3a9f50>

======================================================================
FAIL: test_serialize_schema_complex (schema_test.SchemaTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/schema_test.py", line 950, in test_serialize_schema_complex
    encode_and_decode('avro')
  File "/tmp/schema_test.py", line 934, in encode_and_decode
    self.assertEqual(data_decode, r)
AssertionError: <schema_test.ComplexRecord object at 0x7fd40b352b90> != <schema_test.ComplexRecord object at 0x7fd40b352b50>

----------------------------------------------------------------------
Ran 71 tests in 177.361s

FAILED (failures=2, errors=5)

@BewareMyPower BewareMyPower changed the title [C++] Fix bugs that were not exposed by broken C++ CI before [C++/Python] Fix bugs that were not exposed by broken C++ CI before Aug 4, 2021
@BewareMyPower
Contributor Author

@gaoran10 Could you help take a look?

@merlimat
Contributor

merlimat commented Aug 4, 2021

Introduced from Fix getting partition metadata of a nonexistent topic returns 0 #10601 because it changed the behavior of the admin API to get partition metadata while the C++ implementation relies on the original behavior to create topics automatically. So any test that uses HTTP lookup will fail.

@BewareMyPower I think that is still a problem. If the compatibility with older c++ client was broken, we should roll it back and find a better way to solve the original issue.

@merlimat
Contributor

merlimat commented Aug 4, 2021

Introduced from Fix getting partition metadata of a nonexistent topic returns 0 #10601 because it changed the behavior of the admin API to get partition metadata while the C++ implementation relies on the original behavior to create topics automatically. So any test that uses HTTP lookup will fail.

@BewareMyPower I think that is still a problem. If the compatibility with older c++ client was broken, we should roll it back and find a better way to solve the original issue.

For example, we could attempt at detecting that is the old C++ client and defaulting the checkAllowAutoCreation=true on the broker side.

@BewareMyPower
Contributor Author

BewareMyPower commented Aug 5, 2021

Introduced from Fix getting partition metadata of a nonexistent topic returns 0 #10601 because it changed the behavior of the admin API to get partition metadata while the C++ implementation relies on the original behavior to create topics automatically. So any test that uses HTTP lookup will fail.

@BewareMyPower I think that is still a problem. If the compatibility with older c++ client was broken, we should roll it back and find a better way to solve the original issue.

For example, we could attempt at detecting that is the old C++ client and defaulting the checkAllowAutoCreation=true on the broker side.

It's hard to detect whether the client is C++ client because the HTTP request doesn't contain any client related info. For older C++ client, we need to create topics manually in advance.

@merlimat
Contributor

merlimat commented Aug 5, 2021

For older C++ client, we need to create topics manually in advance.

That is something we must avoid because it breaks a lot of use cases.

@BewareMyPower
Contributor Author

@merlimat It looks like the only solution might be reverting #10601 ?

The original purpose of #10601 was to distinguish the case where a topic doesn't exist from the case where a topic is non-partitioned. This new semantic has already been applied in KoP 2.8.0 to detect non-partitioned topics, because KoP doesn't support them and there was previously no way to tell whether a given topic name referred to a non-partitioned topic or to a topic that doesn't exist. However, this new behavior also breaks other clients that use HTTP lookup.

I think it's still controversial. We can discuss it in a new thread using email or another issue.

@BewareMyPower
Contributor Author

BewareMyPower commented Aug 5, 2021

I noticed @codelipenghui added the release/2.8.1 label. This PR fixes many previously unexposed bugs at once because splitting it into multiple PRs would take much time. If we're going to cherry-pick it to branch-2.8.1, #11388 and #11492 are also required because some fixes in this PR are based on them.

FYI @hangc0276 since you're the release manager of 2.8.1.

@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-cpp-tests branch from 3d0eaa6 to ace8e16 Compare August 5, 2021 08:40
@BewareMyPower BewareMyPower changed the title [C++/Python] Fix bugs that were not exposed by broken C++ CI before [C++] Fix bugs that were not exposed by broken C++ CI before Aug 5, 2021
@hangc0276
Contributor

I'll push an independent PR to fix protobuf native schema so that we won't need to cherry-pick the previous PRs. @hangc0276

And after discussing with @codelipenghui , this PR should be ready to merge after the Python tests are fixed.

@BewareMyPower OK, I'll wait for you to split this PR into two independent PRs and then merge it to 2.8.1.

@BewareMyPower
Contributor Author

BewareMyPower commented Aug 6, 2021

I've reverted 3 commits. Two of them will be included in another PR (#11578), which should not be merged to the 2.8.1 branch.

The remaining commit ("Fix Python2 incompatibility") is related to the broken documentation generation that blocks the release of 2.7.3, FYI @congbobo184

@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-cpp-tests branch from 61848d8 to e083099 Compare August 6, 2021 08:48
@BewareMyPower BewareMyPower force-pushed the bewaremypower/fix-cpp-tests branch from e083099 to ab6db12 Compare August 6, 2021 16:09
@BewareMyPower BewareMyPower changed the title [WIP][C++] Fix bugs that were not exposed by broken C++ CI before [WIP][C++/Python] Fix bugs that were not exposed by broken C++ CI before Aug 6, 2021
@BewareMyPower
Contributor Author

Now there're only 3 failed tests left. We need to fix the Python2 incompatibility issues.

@BewareMyPower BewareMyPower changed the title [WIP][C++/Python] Fix bugs that were not exposed by broken C++ CI before [C++/Python] Fix bugs that were not exposed by broken C++ CI before Aug 7, 2021
@BewareMyPower
Contributor Author

BewareMyPower commented Aug 7, 2021

Now all Python tests are fixed, see the last few lines of run c++ tests part in https://github.com/apache/pulsar/pull/11557/checks?check_run_id=3268979826

# line 1389 to 1390
----------------------------------------------------------------------
Ran 73 tests in 24.356s
# line 1562, before this line there're some logs
OK
# line 1581 to 1585
----------------------------------------------------------------------
Ran 4 tests in 0.005s

OK

Contributor

@gaoran10 gaoran10 left a comment


LGTM

@BewareMyPower BewareMyPower merged commit 4919a82 into apache:master Aug 8, 2021
@BewareMyPower BewareMyPower deleted the bewaremypower/fix-cpp-tests branch August 8, 2021 15:18
BewareMyPower pushed a commit that referenced this pull request Aug 9, 2021
### Motivation

- fixes issue that cpp build doesn't fail when tests fail
- merge after #11557

### Additional context

- #11557 (comment)
- https://github.com/apache/pulsar/pull/10309/files#r683626563

### Modifications 

- `set -o pipefail;` is required when using `| cat`
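
Why `pipefail` matters here can be checked with a small Python sketch, assuming `bash` is available on the PATH. The `false` command stands in for a failing test run and is not the actual CI command; the helper name is made up for the example.

```python
import subprocess


def pipeline_exit_status(script):
    """Run `script` with bash and return the shell's exit status."""
    return subprocess.call(["bash", "-c", script])


# `false` fails, but the pipeline reports the status of `cat`, its last command.
without_pipefail = pipeline_exit_status("false | cat")

# With pipefail, a failing command anywhere in the pipeline fails the pipeline.
with_pipefail = pipeline_exit_status("set -o pipefail; false | cat")

print(without_pipefail, with_pipefail)  # prints "0 1"
```

Without the option, a test step piped through `cat` always looks successful to CI, which matches the "build doesn't fail when tests fail" symptom described above.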
LeBW pushed a commit to LeBW/pulsar that referenced this pull request Aug 9, 2021
LeBW pushed a commit to LeBW/pulsar that referenced this pull request Aug 9, 2021
hangc0276 pushed a commit that referenced this pull request Aug 12, 2021
(cherry picked from commit 4919a82)
hangc0276 pushed a commit that referenced this pull request Aug 12, 2021
(cherry picked from commit 5ae0554)
@hangc0276 hangc0276 added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Aug 12, 2021
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
Labels
area/client cherry-picked/branch-2.8 Archived: 2.8 is end of life doc-not-needed Your PR changes do not impact docs release/2.8.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[C++] There're some bugs that make unit tests fail
7 participants