Update Ceph module to support new API #7723

Closed
ruflin opened this issue Jul 25, 2018 · 17 comments
Labels: candidate, enhancement, Metricbeat, module, Team:Integrations, Team:Services (Deprecated), v7.7.0

Comments

@ruflin
Contributor

ruflin commented Jul 25, 2018

ceph-rest-api is replaced by ceph-mgr in newer releases (see http://docs.ceph.com/docs/luminous/mgr/restful/#). See #7661 (comment) for additional details.

@mtojek
Contributor

mtojek commented Feb 4, 2020

@sorantis

If we want to switch to ceph-mgr, it's worth considering the Prometheus plugin. See: https://docs.ceph.com/docs/master/mgr/prometheus/
It provides pool and OSD metadata series as well as disk statistics, and it has been supported since the Luminous release.

If we agree to switch to the Prometheus endpoint, I need some guidance on deprecating the existing implementation.
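
To make the Prometheus option concrete, here is a minimal sketch of reading the text exposition format that such an endpoint serves. The two sample series are illustrative assumptions, not output captured from a cluster, and the parser deliberately ignores timestamps and escaped characters in label values.

```python
def parse_exposition(text):
    """Parse Prometheus text-format lines into (name, labels, value) tuples.

    Minimal on purpose: skips comment lines and does not handle timestamps,
    escaped quotes, or commas inside label values.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, _, raw_value = line.rpartition(" ")
        name, _, label_part = series.partition("{")
        labels = {}
        for pair in label_part.rstrip("}").split(","):
            if "=" in pair:
                key, _, val = pair.partition("=")
                labels[key.strip()] = val.strip().strip('"')
        samples.append((name, labels, float(raw_value)))
    return samples


# Illustrative input only; metric names are assumptions for this sketch.
SAMPLE = """\
# HELP ceph_health_status Cluster health status
ceph_health_status 0.0
ceph_osd_up{ceph_daemon="osd.0"} 1.0
"""

for name, labels, value in parse_exposition(SAMPLE):
    print(name, labels, value)
```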

@mtojek
Contributor

mtojek commented Feb 5, 2020

For now I will proceed with a new metricset, cephmgr, that will use the Prometheus metrics endpoint.
(see below)

@sorantis
Contributor

sorantis commented Feb 5, 2020

The existing implementation should still be valid for older versions of Ceph. Newer versions that have ceph-mgr could be handled by a separate metricset.
Using Prometheus here is an attractive option, but I'd stick to native APIs wherever possible for several reasons:

  • The Prometheus module will be going through several breaking changes that will impact all light modules based on it, so I'd refrain from adding to the list.
  • The Prometheus endpoints look like an extra capability that requires the user to enable it manually. In some cases that might mean changing deployment templates, adding ports to firewall rules, etc.

My recommendation would be to use native APIs wherever possible.

@mtojek
Contributor

mtojek commented Feb 5, 2020

It seems that we responded at the same time...

According to what we discussed offline, let's try to stick to native APIs, as the Prometheus module is not enabled by default.

@sorantis
Contributor

sorantis commented Feb 5, 2020

After talking more with @mtojek about this, it seems that the right way would be to use the mgr's restful module instead of prometheus, both for the reasons listed above and for security. The Prometheus endpoints currently don't support secure communication, so an implementation built on the prometheus module would require Metricbeat to be deployed locally and configured with TLS in order to communicate securely. With restful there's no such limitation: Metricbeat can be deployed on another node, and restful can be configured to use TLS.
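
As a rough sketch of what a remote, TLS-protected call to the restful module could look like from a client: the host name, port 8003, user name, and key below are placeholders and assumptions for illustration, and the key is assumed to have been created on the cluster beforehand. Only `build_request` runs here; `fetch_json` needs a reachable mgr.

```python
import base64
import json
import ssl
import urllib.request


def build_request(base_url, resource, user, api_key):
    """Build an HTTPS GET request with HTTP basic auth for one resource."""
    url = base_url.rstrip("/") + "/" + resource.lstrip("/")
    token = base64.b64encode(f"{user}:{api_key}".encode()).decode()
    req = urllib.request.Request(url)
    req.add_header("Authorization", "Basic " + token)
    req.add_header("Accept", "application/json")
    return req


def fetch_json(req, ca_file=None, timeout=10):
    """Send the request, verifying the server certificate against ca_file."""
    context = ssl.create_default_context(cafile=ca_file)
    with urllib.request.urlopen(req, context=context, timeout=timeout) as resp:
        return json.load(resp)


# Placeholder values; nothing is sent over the network at this point.
req = build_request("https://ceph-mgr.example:8003/", "/mon", "metricbeat", "secret")
print(req.full_url)
```

Passing a CA file rather than disabling verification keeps the point of the TLS argument intact when the mgr uses a self-signed certificate.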

@mtojek
Contributor

mtojek commented Feb 5, 2020

@sorantis
I booted up a demo Ceph cluster to review the restful resources. To be honest, most of the data exposed via these endpoints is configuration rather than actual metrics.

Here are some of them:

/mon:

[
    {
        "addr": "172.30.0.2:3300/0",
        "in_quorum": true,
        "leader": true,
        "name": "edcb751e8aa1",
        "public_addr": "172.30.0.2:3300/0",
        "public_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:3300",
                    "nonce": 0,
                    "type": "v2"
                }
            ]
        },
        "rank": 0,
        "server": "edcb751e8aa1"
    }
]
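
For what it's worth, the few metric-like fields in the /mon response can be reduced to a quorum summary. This is only a sketch; the output field names (`total`, `in_quorum`, `leader`) are made up for illustration and are not an agreed event schema.

```python
def summarize_mons(mons):
    """Reduce a /mon response (a list of monitor dicts) to quorum counts."""
    in_quorum = [m["name"] for m in mons if m.get("in_quorum")]
    leaders = [m["name"] for m in mons if m.get("leader")]
    return {
        "total": len(mons),
        "in_quorum": len(in_quorum),
        "leader": leaders[0] if leaders else None,
    }


# Trimmed copy of the response shown above.
SAMPLE_MONS = [
    {"name": "edcb751e8aa1", "rank": 0, "in_quorum": True, "leader": True},
]

print(summarize_mons(SAMPLE_MONS))
```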

/osd:

[
    {
        "cluster_addr": "172.30.0.2:6803/186",
        "cluster_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6802",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6803",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "down_at": 20,
        "heartbeat_back_addr": "172.30.0.2:6807/186",
        "heartbeat_back_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6806",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6807",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "heartbeat_front_addr": "172.30.0.2:6805/186",
        "heartbeat_front_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6804",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6805",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "in": 1,
        "last_clean_begin": 4,
        "last_clean_end": 18,
        "lost_at": 0,
        "osd": 0,
        "pools": [
            1,
            2,
            3,
            4,
            5,
            6,
            7,
            8
        ],
        "primary_affinity": 1.0,
        "public_addr": "172.30.0.2:6801/186",
        "public_addrs": {
            "addrvec": [
                {
                    "addr": "172.30.0.2:6800",
                    "nonce": 186,
                    "type": "v2"
                },
                {
                    "addr": "172.30.0.2:6801",
                    "nonce": 186,
                    "type": "v1"
                }
            ]
        },
        "reweight": 1.0,
        "server": "edcb751e8aa1",
        "state": [
            "exists",
            "up"
        ],
        "up": 1,
        "up_from": 21,
        "up_thru": 21,
        "uuid": "eb1c8d6d-70c2-4511-a1b8-e9e7e5f624aa",
        "valid_commands": [
            "scrub",
            "deep-scrub",
            "repair"
        ],
        "weight": 1.0
    }
]
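
Similarly, a sketch of picking the status-like fields out of one /osd entry, assuming `up` and `in` are 0/1 flags as in the response above. The output keys are hypothetical.

```python
def osd_event(osd):
    """Keep only the status-like fields of a single /osd entry."""
    return {
        "id": osd["osd"],
        "up": osd.get("up") == 1,
        "in": osd.get("in") == 1,
        "state": list(osd.get("state", [])),
        "pool_count": len(osd.get("pools", [])),
        "weight": osd.get("weight"),
        "reweight": osd.get("reweight"),
    }


# Trimmed copy of the response shown above.
SAMPLE_OSD = {
    "osd": 0,
    "up": 1,
    "in": 1,
    "state": ["exists", "up"],
    "pools": [1, 2, 3, 4, 5, 6, 7, 8],
    "weight": 1.0,
    "reweight": 1.0,
}

print(osd_event(SAMPLE_OSD))
```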

/pool:

[
    {
        "application_metadata": {},
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:09.277269",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "6",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 1,
        "pool_name": "rbd",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "cephfs": {
                "data": "cephfs"
            }
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:10.354727",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "7",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 2,
        "pool_name": "cephfs_data",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "cephfs": {
                "metadata": "cephfs"
            }
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:11.310873",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "8",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {
            "pg_autoscale_bias": 4.0,
            "pg_num_min": 16,
            "recovery_priority": 5
        },
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 3,
        "pool_name": "cephfs_metadata",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:13.193509",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "10",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 4,
        "pool_name": ".rgw.root",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:14.554436",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "12",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 5,
        "pool_name": "default.rgw.control",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:16.544549",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "14",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 6,
        "pool_name": "default.rgw.meta",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:18.505341",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "16",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 7,
        "pool_name": "default.rgw.log",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    },
    {
        "application_metadata": {
            "rgw": {}
        },
        "auid": 0,
        "cache_min_evict_age": 0,
        "cache_min_flush_age": 0,
        "cache_mode": "none",
        "cache_target_dirty_high_ratio_micro": 600000,
        "cache_target_dirty_ratio_micro": 400000,
        "cache_target_full_ratio_micro": 800000,
        "create_time": "2020-02-05 17:34:20.965857",
        "crush_rule": 0,
        "erasure_code_profile": "",
        "expected_num_objects": 0,
        "fast_read": false,
        "flags": 1,
        "flags_names": "hashpspool",
        "grade_table": [],
        "hit_set_count": 0,
        "hit_set_grade_decay_rate": 0,
        "hit_set_params": {
            "type": "none"
        },
        "hit_set_period": 0,
        "hit_set_search_last_n": 0,
        "last_change": "18",
        "last_force_op_resend": "0",
        "last_force_op_resend_preluminous": "0",
        "last_force_op_resend_prenautilus": "0",
        "last_pg_merge_meta": {
            "last_epoch_clean": 0,
            "last_epoch_started": 0,
            "ready_epoch": 0,
            "source_pgid": "0.0",
            "source_version": "0'0",
            "target_version": "0'0"
        },
        "min_read_recency_for_promote": 0,
        "min_size": 1,
        "min_write_recency_for_promote": 0,
        "object_hash": 2,
        "options": {},
        "pg_autoscale_mode": "warn",
        "pg_num": 8,
        "pg_num_pending": 8,
        "pg_num_target": 8,
        "pg_placement_num_target": 8,
        "pgp_num": 8,
        "pool": 8,
        "pool_name": "default.rgw.buckets.index",
        "pool_snaps": [],
        "quota_max_bytes": 0,
        "quota_max_objects": 0,
        "read_tier": -1,
        "removed_snaps": "[]",
        "size": 1,
        "snap_epoch": 0,
        "snap_mode": "selfmanaged",
        "snap_seq": 0,
        "stripe_width": 0,
        "target_max_bytes": 0,
        "target_max_objects": 0,
        "tier_of": -1,
        "tiers": [],
        "type": 1,
        "use_gmt_hitset": true,
        "write_tier": -1
    }
]
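
The /pool response could be condensed in the same spirit, for example into a pool count, a placement-group total, and pool names grouped by application. A sketch under the assumption that every entry carries `pool_name` and `pg_num`, as in the output above; note that it still says nothing about used or available capacity.

```python
def summarize_pools(pools):
    """Condense a /pool response into counts and a per-application grouping."""
    by_app = {}
    for pool in pools:
        # Pools without application metadata are grouped under "(none)".
        apps = list(pool.get("application_metadata", {})) or ["(none)"]
        for app in apps:
            by_app.setdefault(app, []).append(pool["pool_name"])
    return {
        "pool_count": len(pools),
        "pg_total": sum(p["pg_num"] for p in pools),
        "by_application": by_app,
    }


# Trimmed copies of three entries from the response shown above.
SAMPLE_POOLS = [
    {"pool_name": "rbd", "pg_num": 8, "application_metadata": {}},
    {"pool_name": "cephfs_data", "pg_num": 8,
     "application_metadata": {"cephfs": {"data": "cephfs"}}},
    {"pool_name": ".rgw.root", "pg_num": 8, "application_metadata": {"rgw": {}}},
]

print(summarize_pools(SAMPLE_POOLS))
```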

/server:

[
    {
        "ceph_version": "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)",
        "hostname": "",
        "services": [
            {
                "id": "14116",
                "type": "rbd-mirror"
            }
        ]
    },
    {
        "ceph_version": "ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable)",
        "hostname": "edcb751e8aa1",
        "services": [
            {
                "id": "demo",
                "type": "mds"
            },
            {
                "id": "edcb751e8aa1",
                "type": "mgr"
            },
            {
                "id": "edcb751e8aa1",
                "type": "mon"
            },
            {
                "id": "0",
                "type": "osd"
            },
            {
                "id": "edcb751e8aa1",
                "type": "rgw"
            },
            {
                "id": "edcb751e8aa1",
                "type": "rgw-nfs"
            }
        ]
    }
]
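
A small sketch of what could be derived from /server: the number of running services per type across all hosts. The function name is hypothetical, and the sample is a trimmed copy of the response above.

```python
from collections import Counter


def count_services(servers):
    """Count services by type across all entries of a /server response."""
    counts = Counter()
    for server in servers:
        for service in server.get("services", []):
            counts[service["type"]] += 1
    return dict(counts)


SAMPLE_SERVERS = [
    {"hostname": "", "services": [{"id": "14116", "type": "rbd-mirror"}]},
    {"hostname": "edcb751e8aa1", "services": [
        {"id": "demo", "type": "mds"},
        {"id": "edcb751e8aa1", "type": "mgr"},
        {"id": "edcb751e8aa1", "type": "mon"},
        {"id": "0", "type": "osd"},
        {"id": "edcb751e8aa1", "type": "rgw"},
        {"id": "edcb751e8aa1", "type": "rgw-nfs"},
    ]},
]

print(count_services(SAMPLE_SERVERS))
```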

I'm afraid it might be hard for an end user to determine the cluster health state and the available storage from these resources.

Apart from that, there is one resource that does give valid (but also overly detailed) information: /perf.

Here is a sample:
{
    "mds.demo": {
        "mds.caps": {
            "description": "Capabilities",
            "nick": "caps",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.dir_commit": {
            "description": "Directory commit",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.dir_fetch": {
            "description": "Directory fetch",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 12
        },
        "mds.dir_merge": {
            "description": "Directory merge",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.dir_split": {
            "description": "Directory split",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.exported_inodes": {
            "description": "Exported inodes",
            "nick": "exi",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.forward": {
            "description": "Forwarding request",
            "nick": "fwd",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.imported_inodes": {
            "description": "Imported inodes",
            "nick": "imi",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.inode_max": {
            "description": "Max inodes, cache size",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 2147483647
        },
        "mds.inodes": {
            "description": "Inodes",
            "nick": "inos",
            "priority": 10,
            "type": 2,
            "units": 1,
            "value": 10
        },
        "mds.inodes_expired": {
            "description": "Inodes expired",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.inodes_pinned": {
            "description": "Inodes pinned",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 10
        },
        "mds.inodes_with_caps": {
            "description": "Inodes with capabilities",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.load_cent": {
            "description": "Load per cent",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.openino_dir_fetch": {
            "description": "OpenIno incomplete directory fetchings",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.reply_latency": {
            "count": 0,
            "description": "Reply latency",
            "nick": "rlat",
            "priority": 10,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds.request": {
            "description": "Requests",
            "nick": "req",
            "priority": 10,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds.root_rbytes": {
            "description": "root inode rbytes",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.root_rfiles": {
            "description": "root inode rfiles",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.root_rsnaps": {
            "description": "root inode rsnaps",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds.subtrees": {
            "description": "Subtrees",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 2
        },
        "mds_cache.ireq_enqueue_scrub": {
            "description": "Internal Request type enqueue scrub",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_exportdir": {
            "description": "Internal Request type export dir",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_flush": {
            "description": "Internal Request type flush",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_fragmentdir": {
            "description": "Internal Request type fragmentdir",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_fragstats": {
            "description": "Internal Request type frag stats",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.ireq_inodestats": {
            "description": "Internal Request type inode stats",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_recovering_enqueued": {
            "description": "Files waiting for recovery",
            "nick": "recy",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_recovering_prioritized": {
            "description": "Files waiting for recovery with elevated priority",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_recovering_processing": {
            "description": "Files currently being recovered",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_strays": {
            "description": "Stray dentries",
            "nick": "stry",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_strays_delayed": {
            "description": "Stray dentries delayed",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.num_strays_enqueuing": {
            "description": "Stray dentries enqueuing for purge",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_cache.recovery_completed": {
            "description": "File recoveries completed",
            "nick": "recd",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.recovery_started": {
            "description": "File recoveries started",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_created": {
            "description": "Stray dentries created",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_enqueued": {
            "description": "Stray dentries enqueued for purge",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_migrated": {
            "description": "Stray dentries migrated",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_cache.strays_reintegrated": {
            "description": "Stray dentries reintegrated",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.ev": {
            "description": "Events",
            "nick": "evts",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.evadd": {
            "description": "Events submitted",
            "nick": "subm",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.evex": {
            "description": "Total expired events",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.evexd": {
            "description": "Current expired events",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.evexg": {
            "description": "Expiring events",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.evtrm": {
            "description": "Trimmed events",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.jlat": {
            "count": 0,
            "description": "Journaler flush latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_log.replayed": {
            "description": "Events replayed",
            "nick": "repl",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 1
        },
        "mds_log.seg": {
            "description": "Segments",
            "nick": "segs",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 1
        },
        "mds_log.segadd": {
            "description": "Segments added",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.segex": {
            "description": "Total expired segments",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_log.segexd": {
            "description": "Current expired segments",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.segexg": {
            "description": "Expiring segments",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_log.segtrm": {
            "description": "Trimmed segments",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.cap": {
            "description": "Capabilities",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 0
        },
        "mds_mem.cap+": {
            "description": "Capabilities added",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.cap-": {
            "description": "Capabilities removed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.dir": {
            "description": "Directories",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 12
        },
        "mds_mem.dir+": {
            "description": "Directories opened",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 12
        },
        "mds_mem.dir-": {
            "description": "Directories closed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.dn": {
            "description": "Dentries",
            "nick": "dn",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 10
        },
        "mds_mem.dn+": {
            "description": "Dentries opened",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 10
        },
        "mds_mem.dn-": {
            "description": "Dentries closed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_mem.heap": {
            "description": "Heap size",
            "priority": 5,
            "type": 2,
            "units": 1,
            "value": 332028
        },
        "mds_mem.ino": {
            "description": "Inodes",
            "nick": "ino",
            "priority": 8,
            "type": 2,
            "units": 1,
            "value": 13
        },
        "mds_mem.ino+": {
            "description": "Inodes opened",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 13
        },
        "mds_mem.ino-": {
            "description": "Inodes closed",
            "priority": 5,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.cap_revoke_eviction": {
            "description": "Cap Revoke Client Eviction",
            "nick": "cre",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.handle_client_request": {
            "description": "Client requests",
            "nick": "hcr",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.handle_client_session": {
            "description": "Client session messages",
            "nick": "hcs",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 40
        },
        "mds_server.handle_slave_request": {
            "description": "Slave requests",
            "nick": "hsr",
            "priority": 8,
            "type": 10,
            "units": 1,
            "value": 0
        },
        "mds_server.req_create_latency": {
            "count": 0,
            "description": "Request type create latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_getattr_latency": {
            "count": 0,
            "description": "Request type get attribute latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_getfilelock_latency": {
            "count": 0,
            "description": "Request type get file lock latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_link_latency": {
            "count": 0,
            "description": "Request type link latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookup_latency": {
            "count": 0,
            "description": "Request type lookup latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookuphash_latency": {
            "count": 0,
            "description": "Request type lookup hash of inode latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupino_latency": {
            "count": 0,
            "description": "Request type lookup inode latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupname_latency": {
            "count": 0,
            "description": "Request type lookup name latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupparent_latency": {
            "count": 0,
            "description": "Request type lookup parent latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lookupsnap_latency": {
            "count": 0,
            "description": "Request type lookup snapshot latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_lssnap_latency": {
            "count": 0,
            "description": "Request type list snapshot latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_mkdir_latency": {
            "count": 0,
            "description": "Request type make directory latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_mknod_latency": {
            "count": 0,
            "description": "Request type make node latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
        "mds_server.req_mksnap_latency": {
            "count": 0,
            "description": "Request type make snapshot latency",
            "priority": 5,
            "type": 5,
            "units": 1,
            "value": 0
        },
...
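
For readers skimming the dump above, the numeric "type" field can be decoded roughly as follows. This is a minimal sketch, assuming "type" is the bitmask from Ceph's perf counter definitions (time = 0x1, u64 = 0x2, long-running average = 0x4, counter = 0x8, histogram = 0x10); the helper name describe_type and the sample dictionary are hypothetical and only reuse three entries from the output.

```python
# Bit flags assumed to match Ceph's perf counter type enum.
PERFCOUNTER_TIME = 0x1
PERFCOUNTER_U64 = 0x2
PERFCOUNTER_LONGRUNAVG = 0x4
PERFCOUNTER_COUNTER = 0x8
PERFCOUNTER_HISTOGRAM = 0x10


def describe_type(type_value):
    """Return a short label for a perf counter 'type' bitmask (hypothetical helper)."""
    if type_value & PERFCOUNTER_HISTOGRAM:
        return "histogram"
    if type_value & PERFCOUNTER_LONGRUNAVG:
        # Averages carry both a running sum ("value") and a "count".
        return "average"
    if type_value & PERFCOUNTER_COUNTER:
        # Monotonically increasing; a collector would typically derive a rate.
        return "counter"
    return "gauge"


# Three entries copied from the dump above, reduced to the relevant fields.
sample = {
    "mds_log.ev": {"type": 2, "value": 0},
    "mds_log.evadd": {"type": 10, "value": 0},
    "mds_log.jlat": {"count": 0, "type": 5, "value": 0},
}

kinds = {name: describe_type(counter["type"]) for name, counter in sample.items()}
print(kinds)
```

Under that assumption, entries with type 2 read as gauges, type 10 as counters, and type 5 (the ones that also carry "count") as time averages.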

@mtojek
Contributor

mtojek commented Feb 6, 2020

Just updating the thread. We had a discussion with @sorantis and will go with the /request API resource, which internally runs a ceph command (e.g. ceph status, ceph df) and returns the same output.

Sample call/output:

>>> import requests
>>> command = 'df'
>>> requests.post('https://host:port/request?wait=1', json={'prefix': command, 'format': 'json'}, auth=("demo", "password")).json()
{u'waiting': [], u'has_failed': False, u'state': u'success', u'is_waiting': False, u'running': [], u'failed': [], u'finished': [{u'outb': u'{"stats":{"total_bytes":10737418240,"total_avail_bytes":9621471232,"total_used_bytes":42205184,"total_used_raw_bytes":1115947008,"total_used_raw_ratio":0.10393066704273224,"num_osds":1,"num_per_pool_osds":1},"stats_by_class":{},"pools":[{"name":"rbd","id":1,"stats":{"stored":0,"objects":0,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"cephfs_data","id":2,"stats":{"stored":0,"objects":0,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"cephfs_metadata","id":3,"stats":{"stored":2286,"objects":22,"kb_used":512,"bytes_used":524288,"percent_used":5.7708399253897369e-05,"max_avail":9084600320}},{"name":".rgw.root","id":4,"stats":{"stored":2398,"objects":6,"kb_used":384,"bytes_used":393216,"percent_used":4.3281925172777846e-05,"max_avail":9084600320}},{"name":"default.rgw.control","id":5,"stats":{"stored":0,"objects":8,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"default.rgw.meta","id":6,"stats":{"stored":1173,"objects":7,"kb_used":384,"bytes_used":393216,"percent_used":4.3281925172777846e-05,"max_avail":9084600320}},{"name":"default.rgw.log","id":7,"stats":{"stored":0,"objects":176,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"default.rgw.buckets.index","id":8,"stats":{"stored":0,"objects":2,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}},{"name":"default.rgw.buckets.data","id":9,"stats":{"stored":37122728,"objects":21,"kb_used":36480,"bytes_used":37355520,"percent_used":0.0040951217524707317,"max_avail":9084600320}},{"name":"default.rgw.buckets.non-ec","id":10,"stats":{"stored":0,"objects":0,"kb_used":0,"bytes_used":0,"percent_used":0,"max_avail":9084600320}}]}\n', u'outs': u'', u'command': u'df format=json'}], u'is_finished': True, u'id': u'140124650075600'}
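
Note that the command output arrives as a JSON string inside finished[0].outb, so it has to be decoded a second time. A minimal sketch of that step, assuming the response shape shown above; the function name parse_finished_output is hypothetical, and the sample response is a shortened copy of the output above rather than a live call.

```python
import json


def parse_finished_output(response):
    """Decode the JSON document embedded in the first finished command's 'outb'.

    Hypothetical helper: assumes the response layout from the sample output.
    """
    if response.get("has_failed") or not response.get("finished"):
        raise ValueError("command did not finish successfully")
    return json.loads(response["finished"][0]["outb"])


# Shortened copy of the sample response, kept offline on purpose.
sample_response = {
    "has_failed": False,
    "state": "success",
    "finished": [
        {
            "outb": '{"stats":{"total_bytes":10737418240,"total_avail_bytes":9621471232}}\n',
            "outs": "",
            "command": "df format=json",
        }
    ],
}

stats = parse_finished_output(sample_response)["stats"]
print(stats["total_bytes"] - stats["total_avail_bytes"])
```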

@mtojek
Contributor

mtojek commented Feb 7, 2020

I'm working on the following metricsets (metricset ~ ceph command):

mgr_cluster_health ~ ceph status
mgr_cluster_disk ~ ceph df
mgr_osd_disk ~ ceph osd df
mgr_osd_pool_stats ~ ceph osd pool stats
mgr_osd_perf ~ ceph osd perf
mgr_osd_tree ~ ceph osd tree

The mgr prefix suggests that these metricsets are compatible with Ceph Manager Daemon (https://docs.ceph.com/docs/master/mgr/).
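
The mapping above could be expressed as a lookup table that produces the JSON body for the /request call. This is only a sketch: the names METRICSET_COMMANDS and build_request_body are hypothetical, and it assumes multi-word commands such as "osd df" are passed in the prefix field the same way 'df' was in the earlier sample.

```python
# Metricset name -> ceph command, as listed in the comment above.
METRICSET_COMMANDS = {
    "mgr_cluster_health": "status",
    "mgr_cluster_disk": "df",
    "mgr_osd_disk": "osd df",
    "mgr_osd_pool_stats": "osd pool stats",
    "mgr_osd_perf": "osd perf",
    "mgr_osd_tree": "osd tree",
}


def build_request_body(metricset):
    """Build the JSON body for POST /request?wait=1 (hypothetical helper)."""
    return {"prefix": METRICSET_COMMANDS[metricset], "format": "json"}


for name in sorted(METRICSET_COMMANDS):
    print(name, build_request_body(name))
```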

@mtojek
Contributor

mtojek commented Feb 19, 2020

Module updated to use new API. PRs merged. Resolving.

@toha70

toha70 commented Feb 26, 2020

Hi @mtojek: I'm looking at the cherry-pick for #16254 and I can't find the changes for mgr_osd_disk.
/go/src/github.com/elastic/beats/metricbeat/module/ceph# ls -lrt | grep mgr_
drwxr-xr-x 3 root root 137 Feb 26 14:57 mgr_cluster_disk
drwxr-xr-x 3 root root 125 Feb 26 14:57 mgr_osd_perf
drwxr-xr-x 3 root root 143 Feb 26 14:57 mgr_cluster_health
drwxr-xr-x 3 root root 143 Feb 26 14:57 mgr_osd_pool_stats
drwxr-xr-x 3 root root 128 Feb 26 14:57 mgr_pool_disk
drwxr-xr-x 3 root root 125 Feb 26 14:57 mgr_osd_tree

All the other metricsets are present except for mgr_osd_disk. Should we fall back to osd_df?

@mtojek
Contributor

mtojek commented Feb 26, 2020

Hi! It has been renamed to mgr_pool_disk (#16254 (comment)).

@toha70

toha70 commented Feb 26, 2020

Thank you @mtojek. I must have missed this comment :).

@epuertat

epuertat commented Jun 8, 2021

Hi folks, just so you know: the Ceph project is planning to deprecate the restful API you're relying on here soon.

The alternatives would either be the fine-grained Ceph Dashboard REST API (more of a management API, so probably not the best for you) or the Prometheus exporter (which gives you all the metrics in a single shot).
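
To illustrate what consuming the exporter involves, here is a minimal sketch of parsing the Prometheus text exposition format. The sample lines are illustrative rather than captured from a real cluster, parse_metrics is a hypothetical name, and the parser deliberately ignores optional timestamps and other corner cases of the format.

```python
import re

# name, optional {labels}, then a single value; lines with timestamps are skipped.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples (hypothetical helper)."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # not handled by this simplified parser
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative payload only; real exporter output contains many more series.
sample_text = """\
# HELP ceph_health_status Cluster health status
# TYPE ceph_health_status untyped
ceph_health_status 0.0
ceph_osd_up{ceph_daemon="osd.0"} 1.0
"""

samples = parse_metrics(sample_text)
for sample in samples:
    print(sample)
```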

@sorantis
Contributor

sorantis commented Jun 8, 2021

@epuertat thanks for letting us know. We did consider the Prometheus exporter earlier, but decided to stick to the native API capabilities. We'll need to revisit this. Which release are you planning to remove the restful API from?

@epuertat

epuertat commented Jun 8, 2021

@sorantis: v17 (codenamed Quincy), to be released by the first half of 2022. Please let us know if you need any guidance on this.

@sorantis
Contributor

sorantis commented Jun 8, 2021

@epuertat good to know. Any plans to support the Prometheus endpoint natively? AFAIK, today the user has to manually enable the exporter via ceph mgr module enable prometheus.

cc @akshay-saraswat

@epuertat

epuertat commented Jun 8, 2021

@sorantis, no plans to change that. The Prometheus exporter is embedded inside a Ceph service. It's probably the reference 'metrics agent' for the Ceph project (others are less maintained, like influx, telegraf, zabbix, ...).

The main downside I see there is that it only supports plain-text HTTP, but if you really need HTTPS, it wouldn't be that hard to get that change in [ceph-dashboard sample HTTPS Cherrypy config].


7 participants