Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-49749][CORE] Change log level to debug in BlockManagerInfo #48197

Closed
wants to merge 1 commit into from

Conversation

wangyum
Copy link
Member

@wangyum wangyum commented Sep 22, 2024

What changes were proposed in this pull request?

This PR changes the log level to debug in BlockManagerInfo.

Why are the changes needed?

Before this PR:
Logging in BlockManagerMasterEndpoint uses 3.25% of the CPU and generates 60.5% of the logs.

image
cat spark.20240921-09.log | grep "in memory on" | wc -l
8587851
cat spark.20240921-09.log | wc -l
14185544

After this PR:
image

cat  spark.20240926-09.log | grep "in memory on" | wc -l
0
cat  spark.20240926-09.log | wc -l
2224037

Does this PR introduce any user-facing change?

No.

How was this patch tested?

N/A.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the CORE label Sep 22, 2024
@wangyum wangyum changed the title [SPARK-49749][CORE] Change log level to debug in BlockManagerMasterEndpoint [SPARK-49749][CORE] Change log level to debug in BlockManagerInfo Sep 22, 2024
Copy link
Member

@MaxGekk MaxGekk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging in BlockManagerMasterEndpoint uses 3.25% of the CPU and generates 60.5% of the logs.

How do the numbers look like after your changes?

LGTM, BTW.

@@ -1059,13 +1059,13 @@ private[spark] class BlockManagerInfo(
_blocks.put(blockId, blockStatus)
_remainingMem -= memSize
if (blockExists) {
logInfo(log"Updated ${MDC(BLOCK_ID, blockId)} in memory on " +
logDebug(log"Updated ${MDC(BLOCK_ID, blockId)} in memory on " +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

structured logging is only supposed to be used for logInfo/logWarning/logError, please also change the value to String

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with the above @MaxGekk 's comment. We need a result of after this PR for the below claim, @wangyum . The comparison between BEFORE and AFTER is mandatory for this kind of claim.

Logging in BlockManagerMasterEndpoint uses 3.25% of the CPU and generates 60.5% of the logs.

Otherwise, please remove the claim. We can say simply that this too verbose in a tone-downed manner.

@wangyum
Copy link
Member Author

wangyum commented Sep 27, 2024

I have updated the number after this change.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.
Merged to master. Thank you, @wangyum and all.

@wangyum wangyum deleted the SPARK-49749 branch September 29, 2024 10:08
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
### What changes were proposed in this pull request?

This PR changes the log level to debug in `BlockManagerInfo`.

### Why are the changes needed?

Before this PR:
Logging in `BlockManagerMasterEndpoint` uses 3.25% of the CPU and generates 60.5% of the logs.

<img width="1186" alt="image" src="https://github.com/user-attachments/assets/81013d98-8700-4e4f-a1aa-aa17fc0c5c52">

```
cat spark.20240921-09.log | grep "in memory on" | wc -l
8587851
cat spark.20240921-09.log | wc -l
14185544
```

After this PR:
<img width="1196" alt="image" src="https://github.com/user-attachments/assets/6d52cd2c-de3b-4a32-a6e8-a98c42909faa">

```
cat  spark.20240926-09.log | grep "in memory on" | wc -l
0
cat  spark.20240926-09.log | wc -l
2224037
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

N/A.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#48197 from wangyum/SPARK-49749.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
### What changes were proposed in this pull request?

This PR changes the log level to debug in `BlockManagerInfo`.

### Why are the changes needed?

Before this PR:
Logging in `BlockManagerMasterEndpoint` uses 3.25% of the CPU and generates 60.5% of the logs.

<img width="1186" alt="image" src="https://github.com/user-attachments/assets/81013d98-8700-4e4f-a1aa-aa17fc0c5c52">

```
cat spark.20240921-09.log | grep "in memory on" | wc -l
8587851
cat spark.20240921-09.log | wc -l
14185544
```

After this PR:
<img width="1196" alt="image" src="https://github.com/user-attachments/assets/6d52cd2c-de3b-4a32-a6e8-a98c42909faa">

```
cat  spark.20240926-09.log | grep "in memory on" | wc -l
0
cat  spark.20240926-09.log | wc -l
2224037
```

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

N/A.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#48197 from wangyum/SPARK-49749.

Authored-by: Yuming Wang <yumwang@ebay.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants