Running the same batch task multiple times in Flink session mode causes a Linux out-of-memory kill #4305
Comments
Installing jemalloc on every node of the Flink cluster can solve this problem:

1. Install jemalloc.
2. Add the configuration below to /etc/profile.
3. Run `source /etc/profile`.
4. Start the Flink cluster.
5. Execute the batch jobs.

I installed jemalloc on two of the five nodes. Here are the test results: after executing the batch job seven times, memory had risen to over 50% on the nodes still using glibc malloc.

Why switch from glibc malloc to jemalloc? I think the reason is that glibc malloc causes a lot of memory fragmentation. Can you help me? When I use Flink to write to a Hive table, I don't have this problem.
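The steps above can be sketched as follows. This is a hedged sketch, not the commenter's exact setup: the package name and the library path are assumptions that vary by distribution, so verify the path with `ldconfig -p | grep jemalloc` before adding it to /etc/profile.

```shell
# 1. Install jemalloc (package name is an assumption; it varies by distro)
sudo yum install -y jemalloc              # RHEL/CentOS
# sudo apt-get install -y libjemalloc2    # Debian/Ubuntu

# 2. Preload jemalloc for every process started from a login shell,
#    including the Flink TaskManager. The .so path is an assumption;
#    confirm it with: ldconfig -p | grep jemalloc
echo 'export LD_PRELOAD=/usr/lib64/libjemalloc.so.2' | sudo tee -a /etc/profile

# 3. Reload the profile in the current shell
. /etc/profile

# 4. Restart the Flink cluster so the TaskManagers pick up LD_PRELOAD,
#    then 5. re-run the batch jobs
./bin/stop-cluster.sh && ./bin/start-cluster.sh
```

Because `LD_PRELOAD` only affects processes started after it is set, the cluster restart in step 4 is what actually switches the TaskManagers over to jemalloc.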
I think the core reason is: the primary keys were set on the Iceberg table in the Flink DDL, so the Flink write job assumes you are writing all those records into an Iceberg format v2 table. See the following code piece: Do you want to write all those records into an Iceberg format v2 table to deduplicate the rows that share the same primary key? If not, then I suggest removing the primary key definition from the Flink sink table DDL. If so, then I think we may need to port this PR to your branch to fix the OutOfMemory issue.
@openinx
1. This is my source table:
2. This is the temp table (this is the SQL my batch job uses):
3. insert into temp_data select

After I run this job seven or eight times, the Linux memory usage increases to 80%, while the TaskManager itself uses 60%. But if I use jemalloc and run the job several times, memory stays normal, the same as when the Flink cluster has just started.
I found an article and solved this problem.
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible. |
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.
I have a batch task that inserts 100 million rows from table A into table B after hashing by primary key. I execute this task in Flink session mode. After executing it several times, I find that the Flink TaskManager process occupies more and more memory; finally, the TaskManager process is killed when the system runs out of memory.

Manually triggering GC after the task completes has no effect, and the memory shown by top does not decrease unless the Flink TaskManager is shut down.

Logically, when a task finishes it should release the resources it occupied so that subsequent tasks can continue to run, but it does not.

The memory growth occurs in the IcebergStreamWriter stage, but looking at the code I cannot find a place that uses off-heap memory, which makes this more confusing. Can anyone help me?

Memory usage is 25% before running the job and reaches 72% after three runs.
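One way to narrow down where the growth lives is to compare the TaskManager's resident memory (RSS, the number top reports) with its Java heap. This is a diagnostic sketch, not part of the original report; the process-name pattern `TaskManagerRunner` and the availability of JDK tools like `jcmd` on the node are assumptions about a typical deployment.

```shell
# Find the TaskManager JVM (the pattern is an assumption; adjust it
# to match your deployment, e.g. via `jps` if available)
TM_PID=$(pgrep -f TaskManagerRunner | head -n 1)
if [ -n "$TM_PID" ]; then
    # RSS (in KB) is what keeps growing between batch runs
    ps -o pid,rss,comm -p "$TM_PID"
    # If JDK tools are installed, force a full GC. If RSS stays high
    # while the Java heap shrinks, the growth is native memory
    # (e.g. glibc malloc arenas/fragmentation), not the heap.
    jcmd "$TM_PID" GC.run || true
else
    echo "no TaskManager process found"
fi
```

An RSS that never comes back down after `GC.run` is consistent with the glibc fragmentation theory discussed in the comments, since memory held by the allocator is invisible to the JVM's own heap accounting.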
The Configuration
The system kills the TaskManager process