🐛 Source Hubspot: Handled 10K+ search-endpoint queries #10700
@marcosmarxm Please let me know if there's anything I should do in order to get this merged. We really need this change published 🙏
@lgomezm could you run a sync with the dev connector version and show that this works with more than 10K records for the specific stream? Our integration account doesn't have that many records.
Also, is it possible to add a unit test validating the logic when there are more than 10K records?
@marcosmarxm Sure. The following screenshots show a connection from source-hubspot to local-json. You can see the second sync stops successfully at 10,000 records instead of completing with an error.
@marcosmarxm I've also added a unit test in 85aac62.
@lgomezm is it not possible to update the query within the same sync? If someone has a connection that runs every 24h, stopping at 10K records could mean the data is not replicated correctly.
@marcosmarxm I've updated it to use a new state when it reaches the 10,000th record. PTAL again when you get a chance.
@marcosmarxm These images show a connection that syncs Hubspot companies. In the second sync, you can see it now succeeds after reaching the 10,000th record:
/test connector=connectors/source-hubspot repo=calixa-io/airbyte
thanks @lgomezm
What
There's an issue when getting incremental updates for the streams that extend the CRMSearchStream in the source-hubspot connector. If your search query matches more than 10K records, the search endpoint responds with HTTP 400 when you try to get records after the 10,000th one. Hubspot search endpoint documentation for reference: https://developers.hubspot.com/docs/api/crm/search
This is the error from the logs:
How
When a search query reaches the 10K-record limit, the stream stops requesting further pages of that query. It then uses the latest state collected so far to start a new search query on the fly, within the same sync.
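The approach described above can be sketched roughly as follows. This is an illustration, not the code in streams.py: the function name, the `search(since, offset, limit)` callable, and the `updatedAt` cursor field are hypothetical. The sketch assumes the search returns records whose cursor is strictly greater than `since`, sorted ascending by that cursor.

```python
from typing import Any, Callable, Dict, Iterator, List

Record = Dict[str, Any]
# Hypothetical signature: search(since, offset, limit) returns records whose
# cursor is > since, sorted ascending by the cursor field.
SearchFn = Callable[[int, int, int], List[Record]]


def read_past_search_limit(
    search: SearchFn,
    start_state: int,
    cursor_field: str = "updatedAt",  # assumed cursor field name
    page_size: int = 100,
    search_limit: int = 10_000,  # the search endpoint rejects requests past this
) -> Iterator[Record]:
    """Yield all matching records, restarting the query at the search limit."""
    state = start_state
    while True:
        offset = 0
        latest = state
        hit_limit = False
        while True:
            if offset >= search_limit:
                # Asking for more would fail, so stop paging this query.
                hit_limit = True
                break
            # Never request a page that would cross the limit.
            limit = min(page_size, search_limit - offset)
            page = search(state, offset, limit)
            for record in page:
                latest = max(latest, record[cursor_field])
                yield record
            offset += len(page)
            if len(page) < limit:
                break  # short page: this query is exhausted
        if not hit_limit or latest == state:
            # Either everything was read, or the state cannot advance.
            return
        # Start a new query from the latest state collected so far.
        state = latest
```

One design point to be aware of: with a strictly-greater filter, records that share the boundary cursor value could be skipped, while a greater-or-equal filter would re-emit them as duplicates. Which trade-off the connector makes is not stated here.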
Recommended reading order
streams.py
🚨 User Impact 🚨
Are there any breaking changes? What is the end result perceived by the user? If yes, please merge this PR with the 🚨🚨 emoji so changelog authors can further highlight this if needed.
Pre-merge Checklist
Expand the relevant checklist and delete the others.
New Connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
README.md
bootstrap.md. See description and examples
docs/SUMMARY.md
docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
docs/integrations/README.md
airbyte-integrations/builds.md
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing
/publish command described here
Updating a connector
Community member or Airbyter
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
README.md
bootstrap.md. See description and examples
docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
Airbyter
If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.
/test connector=connectors/<name> command is passing
/publish command described here
Connector Generator
-scaffold
in their name) have been updated with the latest scaffold by running ./gradlew :airbyte-integrations:connector-templates:generator:testScaffoldTemplates then checking in your changes
Tests
Unit
Put your unit tests output here.
Integration
Put your integration tests output here.
Acceptance
Put your acceptance tests output here.