Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Apache Beam Template to ingest from Apache Kafka to Google Pub/Sub #1

Closed
wants to merge 78 commits into from

Conversation

KhaninArtur
Copy link

@KhaninArtur KhaninArtur commented Oct 2, 2020

[Proposal] Apache Beam Template to ingest data from Apache Kafka to Google Cloud Pub/Sub. It can be used as a Dataflow Flex template in the Google Cloud Platform.


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

Post-Commit Tests Status (on master branch)

Lang SDK Dataflow Flink Samza Spark Twister2
Go Build Status --- Build Status --- Build Status ---
Java Build Status Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status Build Status
Build Status
Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
--- Build Status ---
XLang Build Status --- Build Status --- Build Status ---

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website Whitespace Typescript
Non-portable Build Status Build Status
Build Status
Build Status
Build Status
Build Status Build Status Build Status Build Status
Portable --- Build Status --- --- --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests

See CI.md for more information about GitHub Actions CI.

@ilya-kozyrev ilya-kozyrev changed the title [WIP] Kafka to PubSub template for GCP [WIP] Apache Beam Template to ingest from Apache Kafka to Google Pub/Sub Oct 7, 2020
ramazan-yapparov pushed a commit that referenced this pull request Dec 9, 2020
…alueProvider for Python SDK

* Support for NestedValueProvider for Python SDK

* Fix typo

* Update CHANGES.md

* Update value_provider_test.py

* Fix NestedValueProvider docstrings. (#1)

* Fix isort and doc errors. (#2)

* Update CHANGES.md

Co-authored-by: Eugene Nikolaiev <eugene.nikolayev@gmail.com>
ramazan-yapparov pushed a commit that referenced this pull request Jan 29, 2021
ramazan-yapparov pushed a commit that referenced this pull request Feb 26, 2021
Debeziumio PoC (#7)

* New DebeziumIO class.

* Merge connector code

* DebeziumIO and MySqlConnector integrated.

* Added FormatFuntion param to Read builder on DebeziumIO.

* Added arguments checker to DebeziumIO.

* Add simple JSON mapper object (#1)

* Add simple JSON mapper object

* Fixed Mapper.

* Add SqlServer connector test

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Fixing MySQL schema DataException

Using file instead of schema should fix it

* MySQL Connector updated from 1.3.0 to 1.3.1

Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>
Co-authored-by: Carlos Dominguez <carlos.dominguez@carlos.dominguez>
Co-authored-by: Carlos Domínguez <carlos.dominguez@wizeline.com>

* Add debeziumio tests

* Debeziumio testing json mapper (#3)

* Some code refactors. Use a default DBHistory if not provided

* Add basic tests for Json mapper

* Debeziumio time restriction (#5)

* Add simple JSON mapper object

* Fixed Mapper.

* Add SqlServer connector test

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Added PostgreSql Connector Test

PostgreSql now works with Json mapper

* Fixing MySQL schema DataException

Using file instead of schema should fix it

* MySQL Connector updated from 1.3.0 to 1.3.1

* Some code refactors. Use a default DBHistory if not provided

* Adding based-time restriction

Stop polling after specified amount of time

* Add basic tests for Json mapper

* Adding new restriction

Uses a time-based restriction

* Adding optional restrcition

Uses an optional time-based restriction

Co-authored-by: juanitodread <juanitodread@gmail.com>
Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>

* Upgrade DebeziumIO connector (#4)

* Address comments (Change dependencies to testCompile, Set JsonMapper/Coder as default, refactors) (#8)

* Revert file

* Change dependencies to testCompile
* Move Counter sample to unit test

* Set JsonMapper as default mapper function
* Set String Coder as default coder when using JsonMapper
* Change logs from info to debug

* Debeziumio javadoc (#9)

* Adding javadoc

* Added some titles and examples

* Added SourceRecordJson doc

* Added Basic Connector doc

* Added KafkaSourceConsumer doc

* Javadoc cleanup

* Removing BasicConnector

No usages of this class were found overall

* Editing documentation

* Debeziumio fetched records restriction (#10)

* Adding javadoc

* Adding restriction by number of fetched records

Also adding a quick-fix for null value within SourceRecords
Minor fix on both MySQL and PostgreSQL Connectors Tests

* Run either by time or by number of records

* Added DebeziumOffsetTrackerTest

Tests both restrictions: By amount of time and by Number of records

* Removing comment

* DebeziumIO test for DB2. (#11)

* DebeziumIO test for DB2.

* DebeziumIO javadoc.

* Clean code:removed commented code lines on DebeziumIOConnectorTest.java

* Clean code:removing unused imports and using readAsJson().

Co-authored-by: Carlos Domínguez <74681048+carlosdominguezwl@users.noreply.github.com>

* Debezium limit records (now configurable) (#12)

* Adding javadoc

* Records Limit is now configurable

(It was fixed before)

* Debeziumio dockerize (#13)

* Add mysql docker container to tests

* Move debezium mysql integration test to its own file

* Add assertion to verify that the results contains a record.

* Debeziumio readme (#15)

* Adding javadoc

* Adding README file

* Add number of records configuration to the DebeziumIO component (#16)

* Code refactors (#17)

* Remove/ignore null warnings

* Remove DB2 code

* Remove docker dependency in DebeziumIO unit test and max number of recods to MySql integration test

* Change access modifiers accordingly

* Remove incomplete integration tests (Postgres and SqlServer)

* Add experimenal tag

* Debezium testing stoppable consumer (#18)

* Add try-catch-finally, stop SourceTask at finally.

* Fix warnings

* stopConsumer and processedRecords local variables removed. UT for task stop use case added

* Fix minor code style issue

Co-authored-by: juanitodread <juanitodread@gmail.com>

* Fix style issues (check, spotlessApply) (#19)

Co-authored-by: Osvaldo Salinas <osvaldo.salinas@osvaldo.salinas>
Co-authored-by: alejandro.maguey <alejandro.maguey@wizeline.com>
Co-authored-by: osvaldo-salinas <osvaldo.salinas@wizeline.com>
Co-authored-by: Carlos Dominguez <carlos.dominguez@carlos.dominguez>
Co-authored-by: Carlos Domínguez <carlos.dominguez@wizeline.com>
Co-authored-by: Carlos Domínguez <74681048+carlosdominguezwl@users.noreply.github.com>
Co-authored-by: Alejandro Maguey <alexmaguey1@gmail.com>
Co-authored-by: Hassan Reyes <hassanreyes@users.noreply.github.com>

Add missing apache license to README.md

Enabling integration test for DebeziumIO (#20)

Rename connector package cdc=>debezium. Update doc references (#21)

Fix code style on DebeziumIOMySqlConnectorIT
AydarZaynutdinov pushed a commit that referenced this pull request Jan 21, 2022
Adds ReadChangeStreamPartitionDoFn, which is an SDF to read partitions
from change streams and process them accordingly. This component
receives a change stream name, a partition, a start time and an end time
to query. It then initiates a change stream query with the received
parameters.

Within a change stream, 3 types of records can be received:

1. A Data record
2. A Heartbeat record
3. A Child partitions record

Upon receiving #1, the function updates the watermark with the record's
commit timestamp and emits the record into the output PCollection.
Upon receiving #2, the function updates the watermark with the record's
timestamp, but it does not emit any record into the PCollection.
Upon receiving #3, the function updates the watermark with the record's
timestamp and writes the new child partitions into the metadata table.
These partitions will be later scheduled by the DetectNewPartitions
component.

Once the change stream query for the element partition finishes, it
marks the partition as finished in the metadata table and terminates.
bullet03 pushed a commit that referenced this pull request May 8, 2022
nausharipov pushed a commit that referenced this pull request Aug 9, 2022
nausharipov added a commit that referenced this pull request Aug 19, 2022
* theme setup

* Replaced ThemeProvider with ThemeSwitchNotifier

* header with theme mode switcher and logo

* page container with header & footer

* theme mode tests

* renamed the directory to tour-of-beam

* compressed beam_logo.png

* added missing license comments

* rudimentary layout of the first screen

* review comments fixes #1

* moved notifyListeners inside then

* responsive todo

* split into 2 simple functions

* deleted redundant constants &
replaced 2018 text theme with 2021

* styling refinement

* home screen layout

* clickable sign in text

* font weights fix

* removed _getBaseFontTheme function

* fixed border and bg color

* color fixes

* difficulty component

* _LastModuleBody

* todo in test

* footer border

* fixed overflows

* replaced Project prefix with Tob

* replaced then with await

* inferred type

* started translation of the home screen

* sorted translations

* Complexity comments

* comment fixes

* home screen translations

* sign in overlay

* import fix

* integration test does not fail

* playground_components package with
dismissible_overlay

* missing license

* removed _dots from build

* widgets refinement

* renamed home screen to welcome screen

* deleted copyWith

* _SdkButton

* trailing comma & pubspec formatting

* license and lints

* license

* removed license from .metadata

* pubspec formatting

* total lints update

* changed from tour_of_beam to
tour-of-beam in build.gradle.kts

* license check

* _SdkButton mimics Radio button

* renamed MyApp to TourOfBeamApp

* onChanged mimics Radio button

Co-authored-by: darkhan.nausharipov <darkhan.nausharipov@kzn.akvelon.com>
olehborysevych pushed a commit that referenced this pull request Sep 5, 2022
* Tour of Beam frontend blank project

* TOBF (ToB frontend): welcome screen (#226)

* theme setup

* Replaced ThemeProvider with ThemeSwitchNotifier

* header with theme mode switcher and logo

* page container with header & footer

* theme mode tests

* renamed the directory to tour-of-beam

* compressed beam_logo.png

* added missing license comments

* rudimentary layout of the first screen

* review comments fixes #1

* moved notifyListeners inside then

* responsive todo

* split into 2 simple functions

* deleted redundant constants &
replaced 2018 text theme with 2021

* styling refinement

* home screen layout

* clickable sign in text

* font weights fix

* removed _getBaseFontTheme function

* fixed border and bg color

* color fixes

* difficulty component

* _LastModuleBody

* todo in test

* footer border

* fixed overflows

* replaced Project prefix with Tob

* replaced then with await

* inferred type

* started translation of the home screen

* sorted translations

* Complexity comments

* comment fixes

* home screen translations

* sign in overlay

* import fix

* integration test does not fail

* playground_components package with
dismissible_overlay

* missing license

* removed _dots from build

* widgets refinement

* renamed home screen to welcome screen

* deleted copyWith

* _SdkButton

* trailing comma & pubspec formatting

* license and lints

* license

* removed license from .metadata

* pubspec formatting

* total lints update

* changed from tour_of_beam to
tour-of-beam in build.gradle.kts

* license check

* _SdkButton mimics Radio button

* renamed MyApp to TourOfBeamApp

* onChanged mimics Radio button

Co-authored-by: darkhan.nausharipov <darkhan.nausharipov@kzn.akvelon.com>

* removed whitespace from readme (issue-22583)

* renamed "content" to "child" to mimic widgets

* README in tour-of-beam

* translation path rename,
grey dot with opacity,
footer link text style

* report issue in github, grey dot color

* table instead of row to clip the laptop image

* horizontalHalves & verticalHalves

* cropped laptop image

* row in an expensive intrinsic height

* laptop image in the bottom

* ScreenBreakpoints

* intrinsic height is not needed!

* _WideWelcome and _NarrowWelcome

* draft readme

* blank line in readme

* removed irrelevant info from readme

* removed whitespace

Co-authored-by: Alexey Inkin <alexey.inkin@akvelon.com>
Co-authored-by: darkhan.nausharipov <darkhan.nausharipov@kzn.akvelon.com>
Co-authored-by: alexeyinkin <leha@inkin.ru>
nausharipov pushed a commit that referenced this pull request Sep 5, 2022
* theme setup

* Replaced ThemeProvider with ThemeSwitchNotifier

* header with theme mode switcher and logo

* page container with header & footer

* theme mode tests

* renamed the directory to tour-of-beam

* compressed beam_logo.png

* added missing license comments

* rudimentary layout of the first screen

* review comments fixes #1

* moved notifyListeners inside then

* responsive todo

* split into 2 simple functions

* deleted redundant constants &
replaced 2018 text theme with 2021

* styling refinement

* home screen layout

* clickable sign in text

* font weights fix

* removed _getBaseFontTheme function

* fixed border and bg color

* color fixes

* difficulty component

* _LastModuleBody

* todo in test

* footer border

* fixed overflows

* replaced Project prefix with Tob

* replaced then with await

* inferred type

* started translation of the home screen

* sorted translations

* Complexity comments

* comment fixes

* home screen translations

* sign in overlay

* import fix

* integration test does not fail

* playground_components package with
dismissible_overlay

* missing license

* removed _dots from build

* widgets refinement

* renamed home screen to welcome screen

* deleted copyWith

* _SdkButton

* trailing comma & pubspec formatting

* license and lints

* license

* removed license from .metadata

* pubspec formatting

* total lints update

* changed from tour_of_beam to
tour-of-beam in build.gradle.kts

* license check

* _SdkButton mimics Radio button

* renamed MyApp to TourOfBeamApp

* onChanged mimics Radio button

Tour of Beam frontend blank project

[Tour of Beam][Frontend][apache#22600] TourScreen layout

TourScreen layout (apache#22600)

common theme, constants, split view

missing license

flutter_gen, summary layout details

content layout details

no functional widgets in split view

main screen todos & translation

main screen todos & translation

comment fixes #1

ExpansionTileWrapper

SplitViewController

lists in tour screen widgets

comment fixes #1 (31.08)

split view package in PGC

fixed button overflow

splitter theme color

comment fixes #2 (31.08)

gradlew check

welcome screen overflow test (apache#22600)

SDK dropdown (apache#22600)

flexible complete unit OutlinedButton (apache#22600)

renamed PageContainer to TobScaffold

dropdown style refinement

DropdownButton implicit type

sdk instead of e

licenses apache#22600

renamed _ShrinkedTour to _NarrowTour apache#22600

tour screen style refinement apache#22600

BeamDivider in PGC apache#22600

removed todo, added license apache#22600

built with text apache#22600

_WideWelcome with IntrinsicHeight (apache#22600)

Co-Authored-By: darkhan.nausharipov <darkhan.nausharipov@kzn.akvelon.com>
olehborysevych pushed a commit that referenced this pull request Sep 26, 2022
* [Tour of Beam][Frontend][apache#22600] TourScreen layout

* theme setup

* Replaced ThemeProvider with ThemeSwitchNotifier

* header with theme mode switcher and logo

* page container with header & footer

* theme mode tests

* renamed the directory to tour-of-beam

* compressed beam_logo.png

* added missing license comments

* rudimentary layout of the first screen

* review comments fixes #1

* moved notifyListeners inside then

* responsive todo

* split into 2 simple functions

* deleted redundant constants &
replaced 2018 text theme with 2021

* styling refinement

* home screen layout

* clickable sign in text

* font weights fix

* removed _getBaseFontTheme function

* fixed border and bg color

* color fixes

* difficulty component

* _LastModuleBody

* todo in test

* footer border

* fixed overflows

* replaced Project prefix with Tob

* replaced then with await

* inferred type

* started translation of the home screen

* sorted translations

* Complexity comments

* comment fixes

* home screen translations

* sign in overlay

* import fix

* integration test does not fail

* playground_components package with
dismissible_overlay

* missing license

* removed _dots from build

* widgets refinement

* renamed home screen to welcome screen

* deleted copyWith

* _SdkButton

* trailing comma & pubspec formatting

* license and lints

* license

* removed license from .metadata

* pubspec formatting

* total lints update

* changed from tour_of_beam to
tour-of-beam in build.gradle.kts

* license check

* _SdkButton mimics Radio button

* renamed MyApp to TourOfBeamApp

* onChanged mimics Radio button

Tour of Beam frontend blank project

[Tour of Beam][Frontend][apache#22600] TourScreen layout

TourScreen layout (apache#22600)

common theme, constants, split view

missing license

flutter_gen, summary layout details

content layout details

no functional widgets in split view

main screen todos & translation

main screen todos & translation

comment fixes #1

ExpansionTileWrapper

SplitViewController

lists in tour screen widgets

comment fixes #1 (31.08)

split view package in PGC

fixed button overflow

splitter theme color

comment fixes #2 (31.08)

gradlew check

welcome screen overflow test (apache#22600)

SDK dropdown (apache#22600)

flexible complete unit OutlinedButton (apache#22600)

renamed PageContainer to TobScaffold

dropdown style refinement

DropdownButton implicit type

sdk instead of e

licenses apache#22600

renamed _ShrinkedTour to _NarrowTour apache#22600

tour screen style refinement apache#22600

BeamDivider in PGC apache#22600

removed todo, added license apache#22600

built with text apache#22600

_WideWelcome with IntrinsicHeight (apache#22600)

Co-Authored-By: darkhan.nausharipov <darkhan.nausharipov@kzn.akvelon.com>

* addressing review comments apache#22600

replaced magic numbers apache#22600

comments (apache#22600)

comments apache#22600

comments apache#22600

comments apache#22600

comments apache#22600

comments, flutter 3.3.0 upgrade apache#22600

renamed ActionPadding to ActionVerticalPadding apache#22600

actions formatting apache#22600

* branded sign in buttons apache#22600

* _BrandedSignInButtons apache#22600

* _Divider color apache#22600

* profile apache#22600

* moved split_view from PGC into ToB apache#22600

* indentation fix apache#22600

* split ProfileContent into widgets apache#22600

* Extract playground components to a separate package (apache#22600)

* Minor fixes (apache#22600)

* Address review issues (apache#22600)

* Upgrade Flutter to v3.3.2 (apache#22600)

* Add precommit Gradle task for playground_components, add code generation to frontend Gradle task, remove generated mocks, fix linter issues (apache#22600)

* startTour button (apache#22600)

* lint fixes (apache#22600)

* Fix highlighting for Python and SCIO (apache#22600)

Co-authored-by: darkhan.nausharipov <darkhan.nausharipov@kzn.akvelon.com>
nausharipov pushed a commit that referenced this pull request May 22, 2023
nausharipov pushed a commit that referenced this pull request May 22, 2023
volatilemolotov pushed a commit that referenced this pull request May 29, 2024
* [prism] Add basic processing time queue.

* Initial residual handling refactor.

* Re-work teststream initilization. Remove pending element race.

* touch up

* rm merge duplicate

* Simplify watermark hold tracking.

* First successful run!

* Remove duplicated test run.

* Deduplicate processing time heap.

* rm debug text

* Remove some debug prints, cleanup.

* tiny todo cleanup

* ProcessingTime workming most of the time!

* Some cleanup

* try to get github suite to pass #1

* touch

* reduce counts a bit, filter tests some.

* Clean up unrelated state changes. Clean up comments somewhat.

* Filter out dataflow incompatible test.

* Refine processing time event comment.

* Remove test touch.

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants