Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Flink 1.19 #32648

Merged
merged 3 commits into from
Oct 17, 2024
Merged

Add support for Flink 1.19 #32648

merged 3 commits into from
Oct 17, 2024

Conversation

kennknowles
Copy link
Member

Fixes #32646


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

Copy link
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@kennknowles
Copy link
Member Author

Failing test is broken at HEAD and doesn't touch Flink.

Copy link
Contributor

@je-ik je-ik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, cool!

I found some outdated references to flink beam_flink1.10 in learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md, we might want to update those as well?

We are apparently using Flink 1.15 for portable tests (image beam_flink1.15).

There is also an issue #31631, do we want to resolve this as part of 2.61.0 release (mark this issue as release blocker for 2.61.0)? That way, this could be the last time we need to update the repo.

Last note, we might add an issue to drop Flink 1.15 and 1.16, the current policy is to support three versions of Flink.

@@ -1,4 +1,5 @@
{
"comment": "Modify this file in a trivial way to cause this test suite to run",
"https://github.com/apache/beam/pull/31156": "noting that PR #31156 should run this test"
"https://github.com/apache/beam/pull/31156": "noting that PR #31156 should run this test",
"https://github.com/apache/beam/pull/32648": "testing addition of Flink 1.19 support"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, but these edits seem to be accumulating over time. Can we (probably in a different issue and PR) add a script that will just update some timestamp field of the json?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, we could. But TBH it kind of doesn't matter? The original idea is we would drop the commit before merging. But of course most people are not very careful about their commit history. Some people like to put a meaningless entry. I like to put an entry that describes the test so I don't mind if it gets checked in. We can also just trigger the tests by deleting all the entries, too.

operator.getOperatorConfig(), new MockEnvironmentBuilder().build()));
}

/** The emitWatermarkStatus method was added in Flink 1.14, so we need to wrap Output. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flink 1.14 is no longer supported, seems that we currently need the wrapper because of difference between Flink 1.15-1.18 and 1.19, am I correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. Since we dropped 1.14 support we could eliminate this default and inline it to all the uses of the OutputWrapper. I will probably do some of this clean up since I am learning this area of the code right now.

/** In Flink 1.19 the {@code emitRecordAttributes} method was added. */
@Override
default void emitRecordAttributes(RecordAttributes recordAttributes) {
throw new UnsupportedOperationException("emitRecordAttributes not implemented");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the implementation, we could move the StreamSources to src/main (shared across versions), if we just drop the @Override annotation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, we could. I'm still paging in how the FlinkRunner code is organized so right now I am just copying what already exists. Actually I started by making a new adapter before I saw that this already exists, so I am just keeping it where it already was for now.

Copy link
Member Author

@kennknowles kennknowles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Yes, let me make some notes and I will follow up on these pieces as well:

  • update flink1.10 and flink1.15 references
  • drop 1.15 (not sure about dropping two versions in a single release - would like to get an idea of usage)
  • take a look and see if the dockerhub thing is low-hanging fruit or more difficult (maybe you already know?)

@@ -1,4 +1,5 @@
{
"comment": "Modify this file in a trivial way to cause this test suite to run",
"https://github.com/apache/beam/pull/31156": "noting that PR #31156 should run this test"
"https://github.com/apache/beam/pull/31156": "noting that PR #31156 should run this test",
"https://github.com/apache/beam/pull/32648": "testing addition of Flink 1.19 support"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, we could. But TBH it kind of doesn't matter? The original idea is we would drop the commit before merging. But of course most people are not very careful about their commit history. Some people like to put a meaningless entry. I like to put an entry that describes the test so I don't mind if it gets checked in. We can also just trigger the tests by deleting all the entries, too.

operator.getOperatorConfig(), new MockEnvironmentBuilder().build()));
}

/** The emitWatermarkStatus method was added in Flink 1.14, so we need to wrap Output. */
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. Since we dropped 1.14 support we could eliminate this default and inline it to all the uses of the OutputWrapper. I will probably do some of this clean up since I am learning this area of the code right now.

/** In Flink 1.19 the {@code emitRecordAttributes} method was added. */
@Override
default void emitRecordAttributes(RecordAttributes recordAttributes) {
throw new UnsupportedOperationException("emitRecordAttributes not implemented");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, we could. I'm still paging in how the FlinkRunner code is organized so right now I am just copying what already exists. Actually I started by making a new adapter before I saw that this already exists, so I am just keeping it where it already was for now.

@kennknowles
Copy link
Member Author

My intention is to follow this up with Flink 1.20 support as well, since I want to get us caught up with current. So then the whole "drop 1.15, 1.16, 1.17 support" issue needs some proper handling and not too rushed.

@kennknowles kennknowles merged commit ba92381 into apache:master Oct 17, 2024
136 of 140 checks passed
@kennknowles kennknowles deleted the flink_1.19 branch October 17, 2024 11:39
@je-ik
Copy link
Contributor

je-ik commented Oct 17, 2024

  • take a look and see if the dockerhub thing is low-hanging fruit or more difficult (maybe you already know?)

Unfortunately no. I'm not too much familiar with how we build the job server images, but I suppose, it should be only changing the image name and bundling Flink version into the tag. There are reference to "latest' though, which would be meaningless. We would likely have to stop pushing those and create tags like '1
19_dev', which would be pushed for every version as we do with 'latest' now. But that's all I have.

@kennknowles
Copy link
Member Author

  • take a look and see if the dockerhub thing is low-hanging fruit or more difficult (maybe you already know?)

Unfortunately no. I'm not too much familiar with how we build the job server images, but I suppose, it should be only changing the image name and bundling Flink version into the tag. There are reference to "latest' though, which would be meaningless. We would likely have to stop pushing those and create tags like '1 19_dev', which would be pushed for every version as we do with 'latest' now. But that's all I have.

OK. I will think about this as a background task.

On the subject of the Flink versions, I do see that in the past we added one version and removed two all in the same Beam release. Since the Flink site themselves only even has the downloads for 1.18 and later, I think we can move quickly eliminating versions that are not longer supported by Flink. So I will do 1.15 and 1.16 now, and also add 1.20. I don't think I'll remove 1.17 until after 2.61.0 is released.

@kennknowles
Copy link
Member Author

kennknowles commented Oct 17, 2024

Noting https://issues.apache.org/jira/browse/INFRA-26211 to add the new dockerhub repo.

And, indeed, noting the specific files that were not updated here and are behind:

  • .github/workflows/beam_LoadTests_Java_GBK_Smoke.yml
  • .github/workflows/beam_PostCommit_Java_Examples_Flink.yml
  • .github/workflows/beam_PostCommit_Java_Nexmark_Flink.yml
  • .github/workflows/beam_PostCommit_Java_PVR_Flink_Streaming.yml
  • .github/workflows/beam_PostCommit_Java_ValidatesRunner_Flink_Java8.yml
  • .github/workflows/beam_PostCommit_XVR_Flink.yml
  • .github/workflows/beam_PreCommit_Java_PVR_Flink_Batch.yml
  • .github/workflows/beam_PreCommit_Java_PVR_Flink_Docker.yml
  • learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md

(this comment will be edited to have the full list of files that referenced 1.15)

I am updating these to 1.19 in #32837 and #32836

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature Request]: Add support for Flink 1.19
2 participants