Releases: scylladb/scylla-monitoring
Release 4.8.2
Release 4.8.1
Bug fixes
- start-all.sh --target-directory option has an error in its documentation #2398
- Some panels in the OS dashboard do not respect the DC filter #2396
- Unknown Alternator OP - BatchGetItemSize #2393
- Compression-related panels are confusing #2392
- Using stack graph for active read is confusing #2389
- Row and Partitions insertions are measured in read/sec #2386
- The latency legend format is confusing #2383
- scylla_io_queue_flow_ratio graph is inconvenient #2382
- Add batch latency and batch size metrics to Alternator dashboard #2380
- Alternator OPs are not representative of real ops in the case of BatchGetItem and similar batch ops #2374
Release 4.8.0
New In Release 4.8.0
- Support for Scylla Manager 3.3 #2339
- Make the Tablet section collapsible #2329
- Add panels for network compression #2325
- Add filters that limit the number of results per panel [breaking-changes] #2319
- Add a graph for scylla_io_queue_flow_ratio #2306
- Make the IO-group panel group by iogroup, stream #2305
- Tooltip now allows scrolling #2209
- Add metrics for RPC #2104
- Unify Scylla-Manager status and progress #2009
- Different aggregation functions for the latency metrics #1741
Bug Fixes
- Non-Paged CQL Reads Gauge isn't working. #2295
- "I/O Group All Queue consumption" dashboard uses the wrong type of graph. #2293
- Increase nodes table column width to display full ip by default #2302
- Fix panels description in the advanced dashboard #2290
- Full page screenshot is broken #2324
- Make genconfig support ipv6
Operational changes
- splitBrain alert support for a multi-cluster setup #2304
- Allow setting local network and docker_pram from env file #2035
- scylla_storage_proxy_coordinator_read_timeouts repeated twice in regexp. #2323
- make prometheus the default datasource #2268
- Deprecated level label #2322
- The os dashboard accepts multiple node_exporter jobs #2317
- Support various Prometheus scrape intervals [breaking-changes] #2345
Breaking Changes
- The dashboards now support longer Prometheus scrape intervals, which are configurable and passed as a parameter in the Grafana data source configuration.
- To better handle clusters with high core counts, the dashboards limit the number of series shown by default. You can change that limit from the drop-down menu at the top.
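As an illustration of why the dashboards need to know the scrape interval: a rate() window must span several samples, so a longer scrape interval calls for a longer window. The sketch below uses the rule Grafana documents for its `$__rate_interval` variable. Whether the Scylla dashboards apply exactly this rule is an assumption, and the function name is hypothetical.

```python
def rate_interval_seconds(panel_interval_s: float, scrape_interval_s: float) -> float:
    """Window for rate(), following Grafana's documented $__rate_interval rule:
    max(panel interval + scrape interval, 4 * scrape interval).

    Assumption: shown for illustration only; the Scylla dashboards may
    derive their windows differently.
    """
    return max(panel_interval_s + scrape_interval_s, 4 * scrape_interval_s)


if __name__ == "__main__":
    # Compare a short and a long scrape interval for a 15-second panel step.
    for scrape in (20, 60):
        print(f"scrape={scrape}s -> window={rate_interval_seconds(15, scrape)}s")
```

With a 15-second panel step, a 60-second scrape interval gives a much wider window than a 20-second one, which is why the value has to be set in the data source rather than hard-coded in each panel.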
Release 4.7.2
Release 4.7.1
Bug fixes in 4.7.1
Release 4.7.0
New in Release 4.7.0
- Update alternator dashboard #2226
- Make the default dashboard refresh interval configurable #2220
- Show scylla_sstables_bloom_filter_memory_size on the detailed dashboard #2219
- Update Alternator latencies histogram and summaries #2214
- Combine the Advisor table with the alert table in the overview dashboard #2166
- Easier method to run multiple monitoring stacks side-by-side #2164
- Add ethtool metrics to Datadog integration #2163
- Add tablet metrics to the detailed dashboard #2119, #2111
- Add storage-related metrics #2044
- New alert - cluster in split-brain state #1677
- Enhanced experience with --archive command line flag #2158, #2177
- The explanation for the unified class group graph is not clear #2178
Bug fixes
- No closing parenthesis #2229
- The variable $sg is not defined. #2228
- Prometheus continues to trigger alerts for a node that has already been removed from scylla_servers.yml #2227
- read-timeouts in the overview dashboard are breaking when no cdc metrics are reported #2193
- Manager metrics are inconsistent #2191
- Version information is cut - although there's plenty of space available in the panel #2189
- Reads panel does not reflect shards #2171
- Overview page - no data [write latency, Read timeout by DC] #2162
- Manager memory metrics interfere with the OS ones #2198
- The actual interval for calculating metrics is greater than the one specified in evaluation_interval. #2087
Operational changes
- start-all.sh optionally skip alertmanager #2239
- Allow an easy way to start Prometheus with protobuf support #2155
- Regex for empty string |$^ in dashboards #2192
- prometheus/prometheus.yml.template: set evaluation interval to 20s #2185
- Improved experience when working with Archive #2177
- start-all.sh: create a file with the parameters of the last run operation #2174
- remove the deprecated level label #2160
- Performance and security enhancements #2154
- Allow setting local network from env file #2035
Release 4.6.2
Release 4.6.1
Release 4.6.0
New in Release 4.6.0
- Add scylla_io_queue_consumption plots #2088
- Create a metric that shows cache hit/miss rate per table #285
- Add a section that shows all scheduling groups on the same graph #2121
- Add logged/unlogged batches graphs #2081
- The 'Timeouts' item in the general monitoring dashboard has no description #2056
- Add a "CQL Connections (creation) rate" graph #2053
- Alert Manager Rule: add a too-many-files alert #2060
Bug Fixes
- By Instance, doesn't work on any of the dashboards #2138
- "Request Shed" is supposed to be in the "Coordinator" section #2124
- Request/Response payload sizes units are wrong #2083
Operational Change
- support IPV6 without specifying ports #2136
- Prometheus does not start with an external directory, and sudo #2134
- Experimental - Use compose to start the monitoring stack #2123
- scylla-overview: Do not rely on recording rules for picking the scheduling group #2120
- remove thrift from all calculations #2102
- Need to support node_exporter ports #2092