Describe the bug
Getting a timeout error from Kafka while running the spark_expectations library.
To Reproduce
I was trying to run the spark_expectations library using the sample rules provided on GitHub. I haven't configured a Kafka topic because it is not a requirement for us. However, the library seems to be configured in a way that does not allow the Kafka section to be disabled. It always validates the response from the Kafka stats, which is blocking me from trying out the spark_expectations library in my local environment.
Steps to reproduce the behavior:
Install spark_expectations library
Set up Metadata tables (DQ rules table and stats table)
Assign the rules
Set up alerts (create a config file named dq_spark_expectations_config.ini)
Run spark_expectations.
Expected behavior
After the library runs, stats should be written to the stats table regardless of whether a Kafka topic is configured. I plan to use this library extensively in my project because of its in-flight capability and wider stats, but I would like the flexibility to disable the Kafka section.
Desktop (please complete the following information):
Tried in my Azure environment.
OS: Windows
Browser: Microsoft Edge
Version: 117.0.2045.31
Additional context
This issue could be resolved by removing the Kafka dependency from the spark_expectations library so that it doesn't always expect a response from the Kafka stats. Currently the library is limited to environments where a Kafka topic is configured. It could be further enhanced by providing an option to disable the Kafka section when it is not in scope.
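To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not the library's actual code: `StatsWriter`, `enable_kafka`, `write_to_table` and `publish_to_kafka` are hypothetical names used only for illustration. The idea is that the stats table write always happens, and the Kafka publish is attempted only when it has been explicitly enabled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Stats = Dict[str, object]


@dataclass
class StatsWriter:
    """Hypothetical writer: the table write is mandatory, Kafka is opt-in."""

    write_to_table: Callable[[Stats], None]
    publish_to_kafka: Optional[Callable[[Stats], None]] = None
    enable_kafka: bool = False  # hypothetical flag, off by default

    def write(self, stats: Stats) -> List[str]:
        """Write stats and return the names of the sinks that were used."""
        sinks: List[str] = []

        # Always persist the stats to the table first.
        self.write_to_table(stats)
        sinks.append("table")

        # Only touch Kafka when the user has enabled it and supplied a publisher.
        if self.enable_kafka and self.publish_to_kafka is not None:
            self.publish_to_kafka(stats)
            sinks.append("kafka")

        return sinks


# Usage: simulate an environment where no Kafka broker is reachable.
table_rows: List[Stats] = []


def unreachable_kafka(stats: Stats) -> None:
    raise TimeoutError("Kafka broker not reachable")


writer = StatsWriter(
    write_to_table=table_rows.append,
    publish_to_kafka=unreachable_kafka,
    enable_kafka=False,
)
sinks = writer.write({"table_name": "orders", "error_count": 0})
```

With `enable_kafka=False` the publisher is never called, so the run completes and the stats row still lands in the table even though Kafka is unavailable.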
Screenshots
[Screenshot: Timeout_kafka]