fix(hadoop): Fix the JMX exporter configuration #962

Merged: 2 commits, Dec 20, 2024
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -23,6 +23,11 @@ All notable changes to this project will be documented in this file.

 - kafka: Remove `kubectl`, as we are now using listener-op ([#884]).

+### Fixed
+
+- hadoop: Fix the JMX exporter configuration for metrics suffixed with
+  `_total`, `_info` and `_created` ([#962]).
+
 [#884]: https://github.com/stackabletech/docker-images/pull/884
 [#928]: https://github.com/stackabletech/docker-images/pull/928
 [#943]: https://github.com/stackabletech/docker-images/pull/943
@@ -31,6 +36,7 @@ All notable changes to this project will be documented in this file.
 [#955]: https://github.com/stackabletech/docker-images/pull/955
 [#958]: https://github.com/stackabletech/docker-images/pull/958
 [#959]: https://github.com/stackabletech/docker-images/pull/959
+[#962]: https://github.com/stackabletech/docker-images/pull/962

 ## [24.11.0] - 2024-11-18
8 changes: 4 additions & 4 deletions hadoop/Dockerfile
@@ -11,10 +11,7 @@ ARG TARGETARCH
 ARG TARGETOS
 ARG STACKABLE_USER_UID

-WORKDIR /stackable
-
-COPY --chown=${STACKABLE_USER_UID}:0 hadoop/stackable/jmx /stackable/jmx
-COPY --chown=${STACKABLE_USER_UID}:0 hadoop/stackable/fuse_dfs_wrapper /stackable/fuse_dfs_wrapper
+WORKDIR /stackable/jmx

 # The symlink from JMX Exporter 0.16.1 to the versionless link exists because old HDFS Operators (up until and including 23.7) used to hardcode
 # the version of JMX Exporter like this: "-javaagent:/stackable/jmx/jmx_prometheus_javaagent-0.16.1.jar"
@@ -27,6 +24,8 @@ RUN curl "https://repo.stackable.tech/repository/packages/jmx-exporter/jmx_prome
     ln -s "/stackable/jmx/jmx_prometheus_javaagent-${JMX_EXPORTER}.jar" /stackable/jmx/jmx_prometheus_javaagent.jar && \
     ln -s /stackable/jmx/jmx_prometheus_javaagent.jar /stackable/jmx/jmx_prometheus_javaagent-0.16.1.jar

+WORKDIR /stackable
+
 RUN ARCH="${TARGETARCH/amd64/x64}" && \
     curl "https://repo.stackable.tech/repository/packages/async-profiler/async-profiler-${ASYNC_PROFILER}-${TARGETOS}-${ARCH}.tar.gz" | tar -xzC . && \
     ln -s "/stackable/async-profiler-${ASYNC_PROFILER}-${TARGETOS}-${ARCH}" /stackable/async-profiler
@@ -141,6 +140,7 @@ COPY --chown=${STACKABLE_USER_UID}:0 --from=hadoop-builder /stackable/jmx /stack
 COPY --chown=${STACKABLE_USER_UID}:0 --from=hadoop-builder /stackable/async-profiler /stackable/async-profiler/
 COPY --chown=${STACKABLE_USER_UID}:0 --from=hdfs-utils-builder /stackable/hadoop-${PRODUCT}/share/hadoop/common/lib/hdfs-utils-${HDFS_UTILS}.jar /stackable/hadoop-${PRODUCT}/share/hadoop/common/lib/hdfs-utils-${HDFS_UTILS}.jar
 COPY --chown=${STACKABLE_USER_UID}:0 hadoop/stackable/fuse_dfs_wrapper /stackable/
+COPY --chown=${STACKABLE_USER_UID}:0 hadoop/stackable/jmx /stackable/jmx


 # fuse is required for fusermount (called by fuse_dfs)
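
Review note on the symlink comment in the first Dockerfile hunk: the two `ln -s` calls form a chain, with the legacy `-0.16.1` name pointing at the versionless name, which in turn points at the versioned JAR. A minimal Python sketch of that chain in a scratch directory follows. The helper names and the version string `1.0.1` are hypothetical and only serve the illustration; the real value of `JMX_EXPORTER` is not shown in this diff.

```python
import os
import tempfile


def build_jmx_links(jmx_dir: str, version: str) -> str:
    """Recreate the symlink layout from the Dockerfile; return the legacy path."""
    real_jar = os.path.join(jmx_dir, f"jmx_prometheus_javaagent-{version}.jar")
    versionless = os.path.join(jmx_dir, "jmx_prometheus_javaagent.jar")
    legacy = os.path.join(jmx_dir, "jmx_prometheus_javaagent-0.16.1.jar")

    # Stand-in for the JAR that the Dockerfile downloads with curl.
    with open(real_jar, "wb") as jar:
        jar.write(b"placeholder")

    os.symlink(real_jar, versionless)  # versionless name -> versioned JAR
    os.symlink(versionless, legacy)    # legacy 0.16.1 name -> versionless name
    return legacy


def legacy_path_resolves(version: str) -> bool:
    """Check that the hardcoded 0.16.1 path ends up at the versioned JAR."""
    with tempfile.TemporaryDirectory() as jmx_dir:
        legacy = build_jmx_links(jmx_dir, version)
        expected = os.path.join(jmx_dir, f"jmx_prometheus_javaagent-{version}.jar")
        return os.path.realpath(legacy) == os.path.realpath(expected)
```

Calling `legacy_path_resolves("1.0.1")` exercises the chain. If the shipped version were itself `0.16.1`, the second link would collide with the real JAR, so the sketch only applies to other versions.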
51 changes: 48 additions & 3 deletions hadoop/stackable/jmx/datanode.yaml
@@ -22,6 +22,18 @@ rules:
       kind: 'MetricsSystem'
       sub: $2
     type: GAUGE
+  # FSDatasetState with _total suffix (also extracts the FSDataset ID),
+  # e.g. Hadoop:name=FSDatasetState,attribute=EstimatedCapacityLostTotal
+  - pattern: 'Hadoop<service=(.*), name=FSDatasetState-(.*)><>(.*_total): (\d+)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      fsdatasetid: $2
+      kind: 'FSDatasetState'
+    type: COUNTER
   # FSDatasetState (also extracts the FSDataset ID)
   - pattern: 'Hadoop<service=(.*), name=FSDatasetState-(.*)><>(.*): (\d+)'
     attrNameSnakeCase: true
@@ -33,7 +45,19 @@ rules:
       fsdatasetid: $2
       kind: 'FSDatasetState'
     type: GAUGE
-  # DataNodeActivity (also extracts hostname and port)
+  # DataNodeActivity with _info suffix (also extracts hostname and port),
+  # e.g. Hadoop:name=DataNodeActivity-hdfs-datanode-default-0-9866,attribute=BlocksGetLocalPathInfo
+  - pattern: 'Hadoop<service=(.*), name=DataNodeActivity-(.*)-(\d+)><>(.*_info): (\d+)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$4_
+    value: $5
+    labels:
+      service: HDFS
+      role: $1
+      host: $2
+      port: $3
+      kind: 'DataNodeActivity'
+    type: GAUGE
   - pattern: 'Hadoop<service=(.*), name=DataNodeActivity-(.*)-(\d+)><>(.*): (\d+)'
     attrNameSnakeCase: true
     name: hadoop_$1_$4
@@ -45,8 +69,29 @@ rules:
       port: $3
       kind: 'DataNodeActivity'
     type: GAUGE
-  # All other services
-  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*): (\d+)'
+  # Generic counter, e.g. Hadoop:name=FSDatasetState,attribute=EstimatedCapacityLostTotal
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*_total): (\d+)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      kind: $2
+    type: COUNTER
+  # Metrics suffixed with _info, e.g. Hadoop:name=JvmMetrics,attribute=LogInfo
+  # The suffix _info is reserved for static information, therefore an underscore is appended.
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*_info): (.*)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3_
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      kind: $2
+    type: GAUGE
+  # All other Hadoop metrics
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*): (.*)'
     attrNameSnakeCase: true
     name: hadoop_$1_$3
     value: $4
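
Review note on rule order in `datanode.yaml`: the JMX exporter evaluates `rules` in order and stops at the first match, so the suffix-specific patterns have to come before the broader ones. A minimal Python sketch of that behaviour follows. The `RULES` list is a simplified copy of four rules from this file (labels omitted), `first_match` is a hypothetical helper, the patterns are anchored with `re.fullmatch` for simplicity, and the sample inputs are assumed to carry already snake-cased attribute names.

```python
import re

# (pattern, name template, type), in the same order as in datanode.yaml.
# "{0}" stands for $1 (the role) and "{2}" for $3 (the attribute name).
RULES = [
    (r"Hadoop<service=(.*), name=FSDatasetState-(.*)><>(.*_total): (\d+)",
     "hadoop_{0}_{2}", "COUNTER"),
    (r"Hadoop<service=(.*), name=FSDatasetState-(.*)><>(.*): (\d+)",
     "hadoop_{0}_{2}", "GAUGE"),
    (r"Hadoop<service=(.*), name=(.*)><>(.*_info): (.*)",
     "hadoop_{0}_{2}_", "GAUGE"),
    (r"Hadoop<service=(.*), name=(.*)><>(.*): (.*)",
     "hadoop_{0}_{2}", "GAUGE"),
]


def first_match(rules, line):
    """Return (metric name, type) from the first rule that matches, else None."""
    for pattern, name_template, metric_type in rules:
        match = re.fullmatch(pattern, line)
        if match:
            return name_template.format(*match.groups()), metric_type
    return None


sample = "Hadoop<service=DataNode, name=FSDatasetState-abc><>estimated_capacity_lost_total: 0"
in_file_order = first_match(RULES, sample)
catch_all_first = first_match(list(reversed(RULES)), sample)
```

With the file order, `sample` is picked up by the `_total` rule and typed as a counter. With the catch-all placed first, the same attribute would be exported as a gauge, which is the situation the new rules avoid.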
15 changes: 13 additions & 2 deletions hadoop/stackable/jmx/journalnode.yaml
@@ -23,8 +23,19 @@ rules:
       kind: 'MetricsSystem'
       sub: $2
     type: GAUGE
-  # All JournalNode infos
-  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*): (\d+)'
+  # Metrics suffixed with _info, e.g. Hadoop:name=JvmMetrics,attribute=LogInfo
+  # The suffix _info is reserved for static information, therefore an underscore is appended.
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*_info): (.*)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3_
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      kind: $2
+    type: GAUGE
+  # All other Hadoop metrics
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*): (.*)'
     attrNameSnakeCase: true
     name: hadoop_$1_$3
     value: $4
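
Review note on why `(.*_info)` matches an attribute called `LogInfo`: with `attrNameSnakeCase: true` the attribute name is converted to snake case before the pattern is applied. The Python function below is an approximation of that conversion written for this note, not the exporter's own code, and `to_snake_and_lower` is a hypothetical name.

```python
def to_snake_and_lower(attr_name: str) -> str:
    """Approximate snake-casing: LogInfo -> log_info, FilesCreated -> files_created."""
    result = []
    # Start as True so that a leading capital does not get an underscore.
    prev_upper_or_underscore = True
    for char in attr_name:
        if char.isupper() and not prev_upper_or_underscore:
            result.append("_")
        result.append(char.lower())
        prev_upper_or_underscore = char.isupper() or char == "_"
    return "".join(result)


# Attribute names taken from the comments in the JMX rule files of this PR.
converted = {
    name: to_snake_and_lower(name)
    for name in ("LogInfo", "FilesCreated", "EstimatedCapacityLostTotal", "Total")
}
```

After conversion, `LogInfo` ends in `_info`, `FilesCreated` in `_created` and `EstimatedCapacityLostTotal` in `_total`, which are exactly the suffixes the new rules look for.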
46 changes: 44 additions & 2 deletions hadoop/stackable/jmx/namenode.yaml
@@ -23,8 +23,50 @@ rules:
       kind: 'MetricsSystem'
       sub: $2
     type: GAUGE
-  # All NameNode infos
-  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*): (\d+)'
+  # Total raw capacity in bytes, e.g. Hadoop:name=NameNodeInfo,attribute=Total
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(total): (\d+)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      kind: $2
+    type: COUNTER
+  # Generic counter, e.g. Hadoop:name=FSNamesystem,attribute=FilesTotal
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*_total): (\d+)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      kind: $2
+    type: COUNTER
+  # Metrics suffixed with _created, e.g. Hadoop:name=NameNodeActivity,attribute=FilesCreated
+  # The suffix _created is reserved for timestamps, therefore an underscore is appended.
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*_created): (.*)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3_
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      kind: $2
+    type: GAUGE
+  # Metrics suffixed with _info, e.g. Hadoop:name=JvmMetrics,attribute=LogInfo
+  # The suffix _info is reserved for static information, therefore an underscore is appended.
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*_info): (.*)'
+    attrNameSnakeCase: true
+    name: hadoop_$1_$3_
+    value: $4
+    labels:
+      service: HDFS
+      role: $1
+      kind: $2
+    type: GAUGE
+  # All other Hadoop metrics
+  - pattern: 'Hadoop<service=(.*), name=(.*)><>(.*): (.*)'
     attrNameSnakeCase: true
     name: hadoop_$1_$3
     value: $4
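
Review note on the NameNode rules: the attribute group `(.*_total)` needs an underscore before `total`, so it cannot match the bare attribute `total`, which is presumably why this file gets a dedicated `(total)` rule. The second point is the trailing underscore in `name: hadoop_$1_$3_` for `_created` and `_info`. A minimal Python sketch of both follows, applied to the attribute name alone; `exported_name` is a hypothetical helper that mirrors the `name:` templates and is not part of the exporter.

```python
import re

# Attribute-name groups copied from the two counter rules in namenode.yaml.
GENERIC_TOTAL = re.compile(r"(.*_total)")
BARE_TOTAL = re.compile(r"(total)")

# Suffixes for which the rule files append an underscore to the metric name.
UNDERSCORED_SUFFIXES = ("_info", "_created")


def matches_generic_total(attr: str) -> bool:
    """True if the snake-cased attribute is caught by the generic counter rule."""
    return GENERIC_TOTAL.fullmatch(attr) is not None


def exported_name(role: str, attr: str) -> str:
    """Mirror `hadoop_$1_$3` and `hadoop_$1_$3_` from the rule files."""
    if attr.endswith(UNDERSCORED_SUFFIXES):
        return f"hadoop_{role}_{attr}_"
    return f"hadoop_{role}_{attr}"
```

For example, `matches_generic_total("files_total")` holds while `matches_generic_total("total")` does not, and `exported_name("namenode", "files_created")` yields a name that ends in `_created_` rather than the reserved `_created`.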