[Bug] [connector-cdc-mysql] mysql connections and memory of jvm increased abnormally #5008

Closed
3 tasks done
happyboy1024 opened this issue Jul 3, 2023 · 7 comments
@happyboy1024
Contributor

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

While using MySQL-CDC, I noticed the following behavior. First, my environment: both the source and the target are MySQL, and the source already holds a batch of existing (historical) data. While that existing data is being fully synchronized, there is no problem as long as no new data is added to the source database. However, if a batch of new data is inserted into the source database during this period, JVM memory usage and the number of MySQL connections keep growing. If this goes on long enough, JVM memory approaches the threshold and triggers GC, but GC cannot effectively release the memory. At the same time, the job consumes an excessive number of connections from the MySQL connection pool.
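
One way to check the "GC cannot release memory" symptom is to compare heap usage right after a requested collection at different points in time. The following is a minimal sketch, assuming it runs inside the JVM being observed; `HeapProbe` is a hypothetical helper and not part of SeaTunnel. For an already running job, the same figures would normally be read through JMX or a tool such as `jstat`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a SeaTunnel class): samples heap usage after a GC request.
class HeapProbe {

    // Requests a collection, then returns the heap bytes still in use.
    // If this value keeps rising across samples, live objects are being retained.
    static long usedAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // a request only; the JVM decides how to honor it
        return memory.getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        long baseline = usedAfterGc();

        // Simulate retained data: these arrays stay reachable, so GC cannot free them.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < 32; i++) {
            retained.add(new byte[1 << 20]); // 1 MiB each
        }
        long whileRetained = usedAfterGc();
        System.out.println("chunks still referenced: " + retained.size());

        // Dropping the last reference makes the arrays collectable again.
        retained = null;
        long afterRelease = usedAfterGc();

        System.out.println("baseline bytes:       " + baseline);
        System.out.println("while retained bytes: " + whileRetained);
        System.out.println("after release bytes:  " + afterRelease);
    }
}
```
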

SeaTunnel Version

seatunnel-2.3.2

SeaTunnel Config

env {
  # You can set SeaTunnel environment configuration here
  job.name = "mysql_test"
  job.mode = "STREAMING"
  checkpoint.interval = 10000
  execution.checkpoint.interval = 10000
  execution.checkpoint.data-uri = "hdfs://localhost:9000/checkpoint"
}

source {
    MySQL-CDC {
        result_table_name = "mysql_cdc_test"
        snapshot.split.size = 3000
        incremental.parallelism = 1
        server-id = "5400"
        username = "xxxx"
        password = "xxxxxxxx"
        database-names = ["xxxxx"]
        table-names = ["xxxxx.test_data"]
        base-url = "jdbc:mysql://192.168.xxx.xxx:xxxx/xxxxx"
    }
}

sink {
    jdbc {
        url = "jdbc:mysql://192.168.xxx.xxx:xxxx/xxxxx"
        driver = "com.mysql.cj.jdbc.Driver"
        user = "xxxx"
        password = "xxxxxx"
        table = "xxxxx"
        primary_keys = ["id"]
        database = "xxxxxxxx"
        batch_size = 3000
        batch_interval_ms = 20
    }
}

Running Command

./bin/seatunnel.sh -c config/v2.mysql_cdc.config

Error Exception

No obvious abnormality

Flink or Spark Version

No response

Java or Scala Version

JDK 11

Screenshots

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@wu-a-ge
Contributor

wu-a-ge commented Jul 3, 2023

@happyboy1024 Is there a solution? I see this problem, too

@happyboy1024
Contributor Author

@happyboy1024 Is there a solution? I see this problem, too
I analyzed the problem and found that it occurs during the full synchronization of historical data. Because the data is read in splits, each split has to be compensated after it is read. During split compensation, an event listener is registered with the BinaryLogClient, and the end event is sent based on the binlog offset. Each listener is associated with a queue that receives data change events, but the listener is not released after the compensation completes, so the data in its queue cannot be garbage collected. For the rest of the full synchronization, binlog events keep satisfying the condition for sending the end event, and the events keep being placed in the queue.

As for the number of MySQL connections: during split reads, each split that starts obtains a connection to check the GTID status of MySQL. That connection is not released after the split ends, so the number of connections keeps growing.

I have a candidate fix and am currently testing it. If it works, I will try to submit a PR.
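
The listener leak described above can be reproduced in miniature. This is a sketch under stated assumptions, not the connector's code: `SharedClient` is a hypothetical stand-in for the long-lived binlog client, and `ListenerLeakDemo.run` simulates split tasks that each register a listener backed by a queue. With `release = false` the finished listeners stay registered and their queues keep filling; with `release = true` each task unregisters its listener in a `finally` block.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical stand-in for a long-lived client shared by all split tasks.
class SharedClient {
    private final List<Consumer<String>> listeners = new ArrayList<>();

    void register(Consumer<String> listener) { listeners.add(listener); }

    void unregister(Consumer<String> listener) { listeners.remove(listener); }

    int listenerCount() { return listeners.size(); }

    // Delivers one event to every listener that is still registered.
    void emit(String event) {
        for (Consumer<String> listener : new ArrayList<>(listeners)) {
            listener.accept(event);
        }
    }
}

class ListenerLeakDemo {

    // Simulates `splits` tasks; returns {listeners still registered, events still buffered}.
    static int[] run(int splits, boolean release) {
        SharedClient client = new SharedClient();
        List<BlockingQueue<String>> queues = new ArrayList<>();

        for (int i = 0; i < splits; i++) {
            BlockingQueue<String> queue = new LinkedBlockingQueue<>();
            Consumer<String> listener = queue::add;
            client.register(listener);
            try {
                client.emit("change-event-" + i); // arrives while this split is active
                queue.clear();                    // the task consumes its own events
            } finally {
                if (release) {
                    client.unregister(listener);  // the step whose absence causes the leak
                }
            }
            queues.add(queue);
        }

        // Events that arrive after every split has finished.
        client.emit("later-change-event");

        int buffered = 0;
        for (BlockingQueue<String> queue : queues) {
            buffered += queue.size();
        }
        return new int[] {client.listenerCount(), buffered};
    }

    public static void main(String[] args) {
        int[] leaked = run(3, false);
        int[] fixed = run(3, true);
        System.out.println("without release: listeners=" + leaked[0] + ", buffered=" + leaked[1]);
        System.out.println("with release:    listeners=" + fixed[0] + ", buffered=" + fixed[1]);
    }
}
```

The same shape applies to the per-split connection: if it is opened inside the split task, closing it in a `finally` block or with try-with-resources keeps the connection count from growing with the number of splits.
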

happyboy1024 pushed a commit to happyboy1024/seatunnel that referenced this issue Jul 3, 2023
@wu-a-ge
Contributor

wu-a-ge commented Jul 3, 2023

perfect

@wu-a-ge
Contributor

wu-a-ge commented Jul 3, 2023

@happyboy1024 Hi, have you run into Metaspace out-of-memory errors while using the Zeta engine? I have hit this problem but have not found a good solution.

@happyboy1024
Contributor Author

@happyboy1024 Hi, have you run into Metaspace out-of-memory errors while using the Zeta engine? I have hit this problem but have not found a good solution.

Sorry, I haven't run into this problem so far. However, you could try analyzing it from the angle of dynamic class generation and plugin JAR loading. Of course, this is just my personal opinion; a better approach is to submit an issue that describes the problem in detail.
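
As a starting point for that kind of analysis, class-loading counters and Metaspace usage can be sampled with the standard `java.lang.management` API. This is a minimal sketch, assuming a HotSpot JVM where the memory pool is named "Metaspace"; `MetaspaceProbe` is a hypothetical helper, not part of SeaTunnel. A loaded-class count that rises with every submitted job while the unloaded count stays flat would point toward class loaders that are never released.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not a SeaTunnel class): samples class-loading and Metaspace figures.
class MetaspaceProbe {

    // Returns the bytes used by the pool named "Metaspace", or -1 if no such pool is exposed.
    // The pool name is an assumption that holds on HotSpot JVMs.
    static long metaspaceUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    // One-line snapshot; take it repeatedly and compare the values over time.
    static String snapshot() {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        return String.format(
                "loaded=%d totalLoaded=%d unloaded=%d metaspaceUsedBytes=%d",
                classes.getLoadedClassCount(),
                classes.getTotalLoadedClassCount(),
                classes.getUnloadedClassCount(),
                metaspaceUsedBytes());
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```
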

happyboy1024 pushed a commit to happyboy1024/seatunnel that referenced this issue Jul 10, 2023
@github-actions

github-actions bot commented Aug 4, 2023

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

@github-actions github-actions bot added the stale label Aug 4, 2023
liugddx pushed a commit that referenced this issue Sep 12, 2023
* [Bug][connector-cdc-mysql] mysql connections and memory of jvm increased abnormally (#5008)

* [bugfix][connector-cdc-mysql] reset the listener of binaryLogClient before fetch task start (#5008)

* [Bugfix][Clickhouse] fix when the checkpoint triggers flush, the connection is closed, causing subsequent data writing to fail

---------

Co-authored-by: dengjunjie <296442618@qq.com>
gnehil pushed a commit to gnehil/seatunnel that referenced this issue Oct 12, 2023
* [Bug][connector-cdc-mysql] mysql connections and memory of jvm increased abnormally (apache#5008)

* [bugfix][connector-cdc-mysql] reset the listener of binaryLogClient before fetch task start (apache#5008)

* [Bugfix][Clickhouse] fix when the checkpoint triggers flush, the connection is closed, causing subsequent data writing to fail

---------

Co-authored-by: dengjunjie <296442618@qq.com>
@github-actions

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.

happyboy1024 pushed a commit to happyboy1024/seatunnel that referenced this issue Sep 20, 2024
happyboy1024 pushed a commit to happyboy1024/seatunnel that referenced this issue Sep 20, 2024
3 participants