[CdapIO] Add complete examples for CdapIO #23501

Closed
Amar3tto wants to merge 21 commits

@Amar3tto Amar3tto commented Oct 5, 2022

  1. Added examples for the following CDAP plugins:
  • Hubspot
    • Batch Source
    • Batch Sink
    • Streaming Source
  • Salesforce
    • Batch Source
    • Batch Sink
    • Streaming Source
  • ServiceNow
    • Batch Source
  • Zendesk
    • Batch Source
  2. Removed the "examples" dependency from "hadoop-format-io" because it caused a cyclic dependency issue.
  3. SparkReceiverIO and CdapIO changes:
  • Added the pullFrequencySec parameter: the delay in seconds between polls for new record updates.
  • Added the startOffset parameter: the inclusive offset from which reading should start.
  • Refactoring (Plugin, MappingUtils, context classes)
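
As a rough illustration of what the two new parameters mean, here is a minimal, self-contained Java sketch. It does not use CdapIO or SparkReceiverIO; the class `OffsetPollingSketch` and its methods `readFrom` and `pollDelayMillis` are hypothetical names chosen for this example, and the actual implementation in this PR may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

/** Hypothetical sketch only; not the CdapIO or SparkReceiverIO implementation. */
public class OffsetPollingSketch {

  /**
   * Keeps the records whose offset is greater than or equal to startOffset,
   * i.e. the start offset is inclusive. Input order is preserved.
   */
  public static <T> List<T> readFrom(
      List<T> records, Function<T, Long> getOffsetFn, long startOffset) {
    List<T> result = new ArrayList<>();
    for (T record : records) {
      if (getOffsetFn.apply(record) >= startOffset) {
        result.add(record);
      }
    }
    return result;
  }

  /** Converts a pull frequency in seconds to the delay between two polls in milliseconds. */
  public static long pollDelayMillis(long pullFrequencySec) {
    if (pullFrequencySec < 0) {
      throw new IllegalArgumentException("pullFrequencySec must not be negative");
    }
    return TimeUnit.SECONDS.toMillis(pullFrequencySec);
  }

  public static void main(String[] args) {
    // Assumption for the example: each record is its own offset.
    List<Long> records = Arrays.asList(0L, 1L, 2L, 3L);
    System.out.println(readFrom(records, offset -> offset, 2L));
    System.out.println(pollDelayMillis(5));
  }
}
```

With a start offset of 2, the record at offset 2 itself is kept, which is what "inclusive" means here; the pull frequency only controls how long a reader would wait between two polls.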


@Amar3tto Amar3tto changed the title Add complete examples for CdapIO [CdapIO] Add complete examples for CdapIO Oct 6, 2022
@Amar3tto
Contributor Author

Amar3tto commented Oct 7, 2022

Run Java_Examples_Dataflow_Java17 PreCommit

@Amar3tto
Contributor Author

Run Java PreCommit

@elizaveta-lomteva
Contributor

Run Java PreCommit

@Amar3tto Amar3tto marked this pull request as ready for review October 12, 2022 15:34
@github-actions
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @apilloud for label java.
R: @pabloem for label io.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@github-actions
Contributor

Reminder, please take a look at this pr: @apilloud @pabloem

@Amar3tto
Contributor Author

Run Java PreCommit

@elizaveta-lomteva
Contributor

Run Java PreCommit

@elizaveta-lomteva
Contributor

Run Java PreCommit

@codecov

codecov bot commented Nov 4, 2022

Codecov Report

Merging #23501 (a4ebc84) into master (63362f5) will decrease coverage by 0.09%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #23501      +/-   ##
==========================================
- Coverage   73.31%   73.22%   -0.10%     
==========================================
  Files         714      717       +3     
  Lines       96418    96932     +514     
==========================================
+ Hits        70686    70974     +288     
- Misses      24411    24637     +226     
  Partials     1321     1321              
Flag Coverage Δ
python 83.00% <ø> (-0.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...l/job_management/v1/beam_expansion_api_pb2_grpc.py 51.85% <0.00%> (-6.05%) ⬇️
...hon/apache_beam/runners/worker/worker_pool_main.py 56.32% <0.00%> (-2.94%) ⬇️
sdks/python/apache_beam/utils/interactive_utils.py 95.12% <0.00%> (-2.44%) ⬇️
.../python/apache_beam/transforms/periodicsequence.py 97.01% <0.00%> (-1.50%) ⬇️
...dks/python/apache_beam/options/pipeline_options.py 93.96% <0.00%> (-0.94%) ⬇️
sdks/python/apache_beam/transforms/external.py 78.88% <0.00%> (-0.86%) ⬇️
sdks/python/apache_beam/runners/direct/executor.py 96.46% <0.00%> (-0.55%) ⬇️
...ache_beam/runners/portability/local_job_service.py 82.00% <0.00%> (-0.50%) ⬇️
...apache_beam/typehints/native_type_compatibility.py 85.16% <0.00%> (-0.37%) ⬇️
sdks/python/apache_beam/typehints/typehints.py 93.05% <0.00%> (-0.33%) ⬇️
... and 31 more


@github-actions github-actions bot added build and removed build labels Nov 5, 2022
@elizaveta-lomteva
Contributor

@chamikaramj @aromanenko-dev @mosche @johnjcasey
Hi, we're happy to announce that this PR, with examples for the CDAP plugins we currently support, is ready for review. The list of supported plugins is in the description.
Thanks a lot for your attention to it!

@github-actions github-actions bot added build and removed build labels Nov 23, 2022
@elizaveta-lomteva
Contributor

Run Typescript PreCommit

@elizaveta-lomteva
Contributor

Run RAT PreCommit

@elizaveta-lomteva
Contributor

Run Python PreCommit

@elizaveta-lomteva
Contributor

Run Java PreCommit

@github-actions github-actions bot added build and removed build labels Nov 29, 2022
@Amar3tto
Contributor Author

Run Java_Spark3_Versions PreCommit

@elizaveta-lomteva
Contributor

R: @aromanenko-dev

@elizaveta-lomteva
Contributor

R: @johnjcasey

@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

Contributor

@johnjcasey johnjcasey left a comment

I don't think these missing examples will block CDAP from being included in the next release. It may be worth splitting this PR into one PR for the IO changes and one PR for the examples.

public class StructuredRecordUtils {

/** Converts {@link StructuredRecord} to String json-like format. */
public static String structuredRecordToString(StructuredRecord structuredRecord) {
Contributor

Is there not a way to do this with json serialization?

* Class for getting a {@link SerializableFunction} that defines how to get record offset for
* different CDAP {@link Plugin} classes.
*/
public class GetOffsetUtils {
Contributor

Why are these utility files organized by function, instead of by plugin?

@chamikaramj
Contributor

+1 for splitting this huge PR into multiple PRs for ease of review (maybe one per example).

@elizaveta-lomteva
Contributor

@johnjcasey
#24436 is the PR with IO changes
Thank you!

@aromanenko-dev
Contributor

+1 for multiple PRs split by example

@Amar3tto
Contributor Author

Amar3tto commented Dec 7, 2022

Finished splitting PRs:
