[BEAM-90] TestCountingSource can throw on checkpointing #14

Closed

mshields822
Contributor

No description provided.

@mshields822
Contributor Author

R=klk
Urgent: blocking customer-impacting bugfix.

private static List<Integer> finalizeTracker;
private final int numMessagesPerShard;
private final int shardNumber;
private final boolean dedup;
private boolean throwOnFirstSnapshot;
private static boolean thrown = false;
Member

Can you leave a little comment here to explain?
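
For readers without the full diff, here is a minimal sketch of the pattern the two flagged fields suggest: a per-instance `throwOnFirstSnapshot` switch combined with a static `thrown` flag, so that only the first checkpoint attempt in the JVM fails. This is an assumption drawn from the field names and the PR title, not the actual `TestCountingSource` code. The class `ThrowOnceSnapshotDemo` and its methods `advance`, `snapshot`, and `resetForTest` are hypothetical, and `AtomicBoolean` is swapped in for the plain `static boolean` shown in the diff.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a test source; not Beam's TestCountingSource. */
final class ThrowOnceSnapshotDemo {
  // Static, so the flag is shared by every instance: once any instance has
  // thrown, no later snapshot throws again (for example after a retry that
  // creates a fresh reader).
  private static final AtomicBoolean thrown = new AtomicBoolean(false);

  // Per-instance switch: only sources configured this way ever fail.
  private final boolean throwOnFirstSnapshot;
  private int current = 0;

  ThrowOnceSnapshotDemo(boolean throwOnFirstSnapshot) {
    this.throwOnFirstSnapshot = throwOnFirstSnapshot;
  }

  /** Moves the simulated read position forward by one element. */
  void advance() {
    current++;
  }

  /** Returns the current position, failing exactly once if configured to. */
  int snapshot() {
    if (throwOnFirstSnapshot && thrown.compareAndSet(false, true)) {
      throw new IllegalStateException("simulated failure during checkpoint");
    }
    return current;
  }

  /** Clears the shared flag so independent tests do not affect each other. */
  static void resetForTest() {
    thrown.set(false);
  }

  public static void main(String[] args) {
    ThrowOnceSnapshotDemo source = new ThrowOnceSnapshotDemo(true);
    source.advance();
    try {
      source.snapshot();
    } catch (IllegalStateException e) {
      System.out.println("first snapshot failed: " + e.getMessage());
    }
    System.out.println("retry returned " + source.snapshot());
  }
}
```

Because the flag is static, a test that depends on the failure would need to reset it between runs. That kind of non-obvious behaviour is what an explanatory comment on the field could spell out.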

@mshields822
Contributor Author

PTAL

asfgit closed this in 0528570 on Mar 3, 2016
davorbonaci added a commit to GoogleCloudPlatform/DataflowJavaSDK that referenced this pull request Mar 4, 2016
aljoscha pushed a commit to aljoscha/beam that referenced this pull request Mar 16, 2018
Make GroupByKey a primitive in the python sdk
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
…nt. Use rather internal Flink record's timestamp.
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
mareksimunek pushed a commit to mareksimunek/beam that referenced this pull request May 9, 2018
apache#14 [euphoria-flink] Don't send timestamp along with each element.
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
…nt. Use rather internal Flink record's timestamp.
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
dmvk pushed a commit to dmvk/beam that referenced this pull request May 15, 2018
tvalentyn pushed a commit to tvalentyn/beam that referenced this pull request May 15, 2018
mareksimunek referenced this pull request in seznam/beam Jul 9, 2018
mareksimunek referenced this pull request in seznam/beam Jul 9, 2018
kennknowles pushed a commit that referenced this pull request Oct 16, 2018
…e rather internal Flink record's timestamp.
kennknowles pushed a commit that referenced this pull request Oct 16, 2018
pabloem pushed a commit that referenced this pull request Mar 31, 2021
…kenize sensitive data

* [WIP] Transfer from DataflowTemplates to Beam

* Move to beam repo

* moved convertors for GCSio

* Renaming + readme

* build errors

* minimize suppress

* remove UDF usages

* Fixes for stylechecks

* grooming for data protectors

* grooming for data protectors

* fix javadoc

* Added support for window writing; Fixed ordering in tokenization process

* supressed checkstyle errors for BigTableIO class

* add data tokenization tests

* Changed GCS to FileSystem and removed redundant function from SchemasUtils

* Updated README.md for local run with BigQuery sink

* add docstring

* Updated README.md

* Updated README.md and added javadoc for the main pipeline class

* remove unused test case

* Style fix

* Whitespaces fix

* Fixed undeclared dependencies and excluded .csv resource files from license analysis

* Fix for incorrect rpc url

* Fix for nullable types

* Data tokenization example group into batches (#11)

* GroupIntoBatches was used in the data tokenization pipeline

* io files were renamed for the data tokenization template

* code format fixed

* Getting value from environment variables for maxBufferingDurationMs. Information about it added to README (#14)

Getting value from environment variables for maxBufferingDurationMs

* [DATAFLOW-139] Incorrect DSG url lead to NPE (#13)

* Fix bug incorrect DSG url lead to NPE DATAFLOW-139

Co-authored-by: daria-malkova <daria.malkova@akvelon.com>
Co-authored-by: Ilya Kozyrev <ilya.kozyrev@akvelon.com>
Co-authored-by: Ramazan Yapparov <ramazan.yapparov@akvelon.com>
Co-authored-by: Mikhail Medvedev <mikhail.medvedev@akvelon.com>
Co-authored-by: Nuzhdina-Elena <79855159+Nuzhdina-Elena@users.noreply.github.com>
Co-authored-by: Nuzhdina-Elena <elena.nuzhdina@akvelon.com>
Co-authored-by: MikhailMedvedevAkvelon <78736905+MikhailMedvedevAkvelon@users.noreply.github.com>
usingh83 added a commit to usingh83/beam that referenced this pull request May 7, 2021
# This is the 1st commit message:

Java PreCommit failure fix

spotless failure fix

 Java PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit refix

Java_Examples_Dataflow PreCommit fix

build failure corrected

Spotless check

Spotless check

reorganizing pipeline

delete the unused folder

Revert "Delete build.gradle"

This reverts commit c39a4e44

Delete build.gradle

don't need this file

adding comments and java docs, and removing unneeded dependencies.

Linting the project and making some stuff private

Reorganized and redefined to logic as per standard beam IO structure.

Lint the files.

Added changes for making the implementation more streamlined and understandable

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message apache#2:

# This is a combination of 15 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message apache#2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message apache#3:

Lint the files.

# This is the commit message apache#4:

Reorganized and redefined to logic as per standard beam IO structure.

# This is the commit message apache#5:

Linting the project and making some stuff private

# This is the commit message apache#6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message apache#7:

delete the unused folder

# This is the commit message apache#8:

reorganizing pipeline

# This is the commit message apache#9:

Spotless check

# This is the commit message apache#10:

Spotless check

# This is the commit message apache#11:

build failure corrected

# This is the commit message apache#12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message apache#13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message apache#14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message apache#15:

Java_Examples_Dataflow PreCommit assign nullable correctly
usingh83 added a commit to usingh83/beam that referenced this pull request May 13, 2021
# This is a combination of 16 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message apache#2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message apache#3:

Lint the files.

# This is the commit message apache#4:

Reorganized and redefined to logic as per standard beam IO structure.

# This is the commit message apache#5:

Linting the project and making some stuff private

# This is the commit message apache#6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message apache#7:

delete the unused folder

# This is the commit message apache#8:

reorganizing pipeline

# This is the commit message apache#9:

Spotless check

# This is the commit message apache#10:

Spotless check

# This is the commit message apache#11:

build failure corrected

# This is the commit message apache#12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message apache#13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message apache#14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message apache#15:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message apache#16:

 Java PreCommit assign nullable correctly

 Java PreCommit assign nullable correctly

spotless failure fix

Java PreCommit failure fix

correcting the if checks

cleaning up and adding readme

spotless fixed

readme fixed and compileJava
 fix

compileJava fix

compileJava fix now

spotless fix now

Java PreCommi fix

Java PreCommit fix

Final Commit with all changes

Added unit test

adding examples for usage

usage for TwitterIO added and Java PreCommit failure fix

Spotless PreCommit failure fix
pabloem pushed a commit that referenced this pull request May 18, 2021
…eams data from twitter

* # This is a combination of 2 commits.
# This is the 1st commit message:

Java PreCommit failure fix

spotless failure fix

 Java PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit refix

Java_Examples_Dataflow PreCommit fix

build failure corrected

Spotless check

Spotless check

reorganizing pipeline

delete the unused folder

Revert "Delete build.gradle"

This reverts commit c39a4e44

Delete build.gradle

don't need this file

adding comments and java docs, and removing unneeded dependencies.

Linting the project and making some stuff private

Reorganized and redefined to logic as per standard beam IO structure.

Lint the files.

Added changes for making the implementation more streamlined and understandable

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

# This is a combination of 15 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message #3:

Lint the files.

# This is the commit message #4:

Reorganized and redefined to logic as per standard beam IO structure.

# This is the commit message #5:

Linting the project and making some stuff private

# This is the commit message #6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message #7:

delete the unused folder

# This is the commit message #8:

reorganizing pipeline

# This is the commit message #9:

Spotless check

# This is the commit message #10:

Spotless check

# This is the commit message #11:

build failure corrected

# This is the commit message #12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message #13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message #14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #15:

Java_Examples_Dataflow PreCommit assign nullable correctly

* # This is a combination of 2 commits.
# This is the 1st commit message:

# This is a combination of 2 commits.
# This is the 1st commit message:

Java PreCommit failure fix

spotless failure fix

 Java PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit refix

Java_Examples_Dataflow PreCommit fix

build failure corrected

Spotless check

Spotless check

reorganizing pipeline

delete the unused folder

Revert "Delete build.gradle"

This reverts commit c39a4e44

Delete build.gradle

don't need this file

adding comments and java docs, and removing unneeded dependencies.

Linting the project and making some stuff private

Reorganized and redefined to logic as per standard beam IO structure.

Lint the files.

Added changes for making the implementation more streamlined and understandable

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

# This is a combination of 15 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message #3:

Lint the files.

# This is the commit message #4:

Reorganized and redefined to logic as per standard beam IO structure.

# This is the commit message #5:

Linting the project and making some stuff private

# This is the commit message #6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message #7:

delete the unused folder

# This is the commit message #8:

reorganizing pipeline

# This is the commit message #9:

Spotless check

# This is the commit message #10:

Spotless check

# This is the commit message #11:

build failure corrected

# This is the commit message #12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message #13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message #14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #15:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #2:

# This is a combination of 3 commits.
# This is the 1st commit message:

Java PreCommit failure fix

spotless failure fix

 Java PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit refix

Java_Examples_Dataflow PreCommit fix

build failure corrected

Spotless check

Spotless check

reorganizing pipeline

delete the unused folder

Revert "Delete build.gradle"

This reverts commit c39a4e44

Delete build.gradle

don't need this file

adding comments and java docs, and removing unneeded dependencies.

Linting the project and making some stuff private

Reorganized and redefined to logic as per standard beam IO structure.

Lint the files.

Added changes for making the implementation more streamlined and understandable

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

# This is a combination of 15 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message #3:

Lint the files.

# This is the commit message #4:

Reorganized and redefined to logic as per standard beam IO structure.

# This is the commit message #5:

Linting the project and making some stuff private

# This is the commit message #6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message #7:

delete the unused folder

# This is the commit message #8:

reorganizing pipeline

# This is the commit message #9:

Spotless check

# This is the commit message #10:

Spotless check

# This is the commit message #11:

build failure corrected

# This is the commit message #12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message #13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message #14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #15:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #3:

# This is a combination of 16 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message #3:

Lint the files.

# This is the commit message #4:

Reorganized and redefined to logic as per standard beam IO structure.

# This is the commit message #5:

Linting the project and making some stuff private

# This is the commit message #6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message #7:

delete the unused folder

# This is the commit message #8:

reorganizing pipeline

# This is the commit message #9:

Spotless check

# This is the commit message #10:

Spotless check

# This is the commit message #11:

build failure corrected

# This is the commit message #12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message #13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message #14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #15:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #16:

 Java PreCommit assign nullable correctly

 Java PreCommit assign nullable correctly

spotless failure fix

Java PreCommit failure fix

correcting the if checks

cleaning up and adding readme

spotless fixed

readme fixed and compileJava
 fix

compileJava fix

compileJava fix now

spotless fix now

Java PreCommi fix

Java PreCommit fix

# This is a combination of 16 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message #3:

Lint the files.

# This is the commit message #4:

Reorganized and redefined to logic as per standard beam IO structure.

# This is the commit message #5:

Linting the project and making some stuff private

# This is the commit message #6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message #7:

delete the unused folder

# This is the commit message #8:

reorganizing pipeline

# This is the commit message #9:

Spotless check

# This is the commit message #10:

Spotless check

# This is the commit message #11:

build failure corrected

# This is the commit message #12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message #13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message #14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #15:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #16:

 Java PreCommit assign nullable correctly

 Java PreCommit assign nullable correctly

spotless failure fix

Java PreCommit failure fix

correcting the if checks

cleaning up and adding readme

spotless fixed

readme fixed and compileJava
 fix

compileJava fix

compileJava fix now

spotless fix now

Java PreCommi fix

Java PreCommit fix

# This is a combination of 3 commits.
# This is the 1st commit message:

Java PreCommit failure fix

spotless failure fix

 Java PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit assign nullable correctly

Java_Examples_Dataflow PreCommit refix

Java_Examples_Dataflow PreCommit fix

build failure corrected

Spotless check

Spotless check

reorganizing pipeline

delete the unused folder

Revert "Delete build.gradle"

This reverts commit c39a4e44

Delete build.gradle

don't need this file

adding comments and java docs, and removing unneeded dependencies.

Linting the project and making some stuff private

Reorganized and redefined to logic as per standard beam IO structure.

Lint the files.

Added changes for making the implementation more streamlined and understandable

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

# This is a combination of 15 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message #3:

Lint the files.

# This is the commit message #4:

Reorganized and redefined the logic as per the standard Beam IO structure.

# This is the commit message #5:

Linting the project and making some stuff private

# This is the commit message #6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message #7:

delete the unused folder

# This is the commit message #8:

reorganizing pipeline

# This is the commit message #9:

Spotless check

# This is the commit message #10:

Spotless check

# This is the commit message #11:

build failure corrected

# This is the commit message #12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message #13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message #14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #15:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #3:

# This is a combination of 16 commits.
# This is the 1st commit message:

Added a connector that streams data from twitter using a Standard Twitter app.

# This is the commit message #2:

Added changes for making the implementation more streamlined and understandable

# This is the commit message #3:

Lint the files.

# This is the commit message #4:

Reorganized and redefined the logic as per the standard Beam IO structure.

# This is the commit message #5:

Linting the project and making some stuff private

# This is the commit message #6:

adding comments and java docs, and removing unneeded dependencies.

# This is the commit message #7:

delete the unused folder

# This is the commit message #8:

reorganizing pipeline

# This is the commit message #9:

Spotless check

# This is the commit message #10:

Spotless check

# This is the commit message #11:

build failure corrected

# This is the commit message #12:

Java_Examples_Dataflow PreCommit fix

# This is the commit message #13:

Java_Examples_Dataflow PreCommit refix

# This is the commit message #14:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #15:

Java_Examples_Dataflow PreCommit assign nullable correctly

# This is the commit message #16:

 Java PreCommit assign nullable correctly

 Java PreCommit assign nullable correctly

spotless failure fix

Java PreCommit failure fix

correcting the if checks

cleaning up and adding readme

spotless fixed

readme fixed and compileJava fix

compileJava fix

compileJava fix now

spotless fix now

Java PreCommit fix

Java PreCommit fix

Final Commit with all changes

Added unit test

adding examples for usage

usage for TwitterIO added and Java PreCommit failure fix

Spotless PreCommit failure fix

* Unit test for multiple config added, and beautification

* Spotless apply fixed

* Removing redundant comments

* Removing newly added test

* adding newly added test back
hengfengli referenced this pull request in hengfengli/beam Mar 21, 2022
* feat: skeleton implementation of read fn

Adds skeleton implementation of the ReadChangeStreamPartitionDoFn with a
simple test.

* feat: adds initial impl for read component

Adds initial implementation for read change stream partition component,
along with an initial test and a test plan.

* test: adds unit test for record mapping

Adds unit test for mapping a struct to a data changes record.

* refactor: refactor tests

* feat: set the correct bounds for consuming SDF

Initialises the offset range based on the start and end timestamp of the
partition record.

* feat: process heartbeat records

Adds heartbeat record processing to the read change stream partition
dofn. The only thing necessary here is to update the watermark.

* chore: add package info to dao and mapper packages

* feat: add simple implementation on child partition

Adds the first implementation for child partitions, which just updates
the watermark based on the start timestamp of the record.

* feat: updates metadata table on child split

* feat: add change stream restriction tracker

Creates custom implementation of change stream restriction tracker which
tracks the timestamp position as well as the mode that should be
executed.

* refactor: add dao impls for change streams

Adds method to run change stream queries and update the partition
metadata table within a transaction.

* test: narrows unit tests for change stream fn

Unit tests only the process element method from the dofn. This way we
can verify updates to the restriction tracker and watermark estimator.

* test: verify tracker / watermark in tests

Validates that the restriction tracker and watermark were used to update
the correct state in the change streams unit tests.

* feat: add DELETE_PARTITION mode and refactor test

* test: complete unit test for child partition split

Adds assertions for waiting for children, waiting for parents and
deleting the current partition.

* feat: adds possible state transition

From partition query to wait for parents, since in all cases we will
need to wait for parents before terminating.

* feat: update termination of data / heartbeat record

Waits for parents to be deleted before terminating after a data /
heartbeat record.
On termination, marks itself as finished and deletes itself from the
metadata table.

* feat: moves the finishing of a partition state

To after a result set is consumed. This was not the case for the child
partitions, which finished before waiting for the children.

* refactor: refactors split case

* chore: removes unused file

* refactor: decompose read cdc dofn

Decomposes read change stream partition do fn into several actions.

* feat: process child partitions independently

Processes each record within a child partition record independently.

* feat: implement merge all parents finished

Implements child partition record case for merge when all parents are
finished.

* test: add test for merge one parent not finished

* test: refactor tests for read dofn

* tests: test termination case in read dofn

* feat: execute based on mode in read dofn

Does different things based on the partition mode that is currently
saved in the restriction.

* refactor: organize partition metadata dao

* refactor: refactors tests

* feat: implements change stream dao

* fix: fix restriction tracker to update correctly

The restriction tracker was not updating the restriction on each
tryClaim call. This commit fixes this behaviour.

* refactor: minor refactor in read dofn

* refactor: undo modifications to SpannerConfig

* chore: removes solved TODOs

* chore: fixes build violations

* docs: adds a few comments / todos

* feat: handles initial partition

Handles the initial partition which should be mapped to null in the
change stream query.

* fix: fixes the record mapping

Fixes the record mapping according to what is being returned by the
change stream query.

* feat: parameterises change stream query

Uses named params and parameterises the change stream query.

* docs: adds state graph for dofn

Adds documentation for the state graph on the Read Change Stream
Partition DoFn.

* docs: add TODOs / FIXMEs for missing functionality

* Remove linting (#14)

* feat: adds it test for read dofn

Here we had to change several things:

- We had to fix the restriction tracker checkDone() method, not to
  require the done partition mode. This is because this method gets
  called when we resume from a previous mode as well.
- We had to fix the heartbeat parameter for the change stream dao. It is
  now specified as millis instead of seconds.
- We added debug logging to all the actions.
- We fixed a bug in the restriction tracker allowing the mode change
  from wait_for_child_partitions to finish_partition (it was missing).

Co-authored-by: Zoe <zoc@google.com>
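
The commit message above describes a restriction tracker that tracks both a timestamp position and a partition mode, updates the restriction on every successful tryClaim call, and only permits certain mode changes (for example wait_for_child_partitions to finish_partition). A minimal, dependency-free sketch of that idea might look like the following. It is not the code from the commit: the class name `ModeTrackerSketch`, the enum constants `QUERY_CHANGE_STREAM` and `WAIT_FOR_PARENT_PARTITIONS`, the use of plain `long` timestamps, and every transition other than the two named in the message are assumptions made for illustration.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a change stream restriction tracker.
// It does not use the Beam RestrictionTracker API.
final class ModeTrackerSketch {

  // Mode names loosely follow the commit message; the exact set is assumed.
  enum Mode {
    QUERY_CHANGE_STREAM,
    WAIT_FOR_CHILD_PARTITIONS,
    FINISH_PARTITION,
    WAIT_FOR_PARENT_PARTITIONS,
    DELETE_PARTITION,
    DONE
  }

  // Allowed mode changes. Only "query -> wait for parents" and
  // "wait for children -> finish" come from the message; the rest are assumed.
  private static final Map<Mode, Set<Mode>> ALLOWED = new EnumMap<>(Mode.class);

  static {
    ALLOWED.put(
        Mode.QUERY_CHANGE_STREAM,
        EnumSet.of(
            Mode.QUERY_CHANGE_STREAM,
            Mode.WAIT_FOR_CHILD_PARTITIONS,
            Mode.WAIT_FOR_PARENT_PARTITIONS));
    ALLOWED.put(Mode.WAIT_FOR_CHILD_PARTITIONS, EnumSet.of(Mode.FINISH_PARTITION));
    ALLOWED.put(Mode.FINISH_PARTITION, EnumSet.of(Mode.WAIT_FOR_PARENT_PARTITIONS));
    ALLOWED.put(Mode.WAIT_FOR_PARENT_PARTITIONS, EnumSet.of(Mode.DELETE_PARTITION));
    ALLOWED.put(Mode.DELETE_PARTITION, EnumSet.of(Mode.DONE));
    ALLOWED.put(Mode.DONE, EnumSet.noneOf(Mode.class));
  }

  private final long endExclusive;
  private long position;
  private Mode mode = Mode.QUERY_CHANGE_STREAM;

  ModeTrackerSketch(long start, long endExclusive) {
    this.position = start;
    this.endExclusive = endExclusive;
  }

  // Claims a timestamp and a target mode. On success the tracked restriction
  // (position and mode) is updated; on failure nothing changes.
  boolean tryClaim(long timestamp, Mode next) {
    if (timestamp < position || timestamp >= endExclusive) {
      return false; // outside the remaining range
    }
    if (!ALLOWED.get(mode).contains(next)) {
      return false; // mode change not permitted
    }
    position = timestamp;
    mode = next;
    return true;
  }

  long position() {
    return position;
  }

  Mode mode() {
    return mode;
  }

  public static void main(String[] args) {
    ModeTrackerSketch tracker = new ModeTrackerSketch(0L, 100L);
    System.out.println(tracker.tryClaim(10L, Mode.QUERY_CHANGE_STREAM));
    System.out.println(tracker.tryClaim(20L, Mode.WAIT_FOR_CHILD_PARTITIONS));
    System.out.println(tracker.tryClaim(20L, Mode.FINISH_PARTITION));
    System.out.println(tracker.position() + " " + tracker.mode());
  }
}
```

Keeping the allowed transitions in a single table makes a missing edge, such as the one the last commit reports fixing, easy to spot in review.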
sjvanrossum pushed a commit to sjvanrossum/beam that referenced this pull request May 17, 2023
pl04351820 pushed a commit to pl04351820/beam that referenced this pull request Dec 20, 2023
Adds a system test demonstrating that ArrayUnion operation no longer deletes pre-existing data

Fixes apache#14 🦕