Releases: grafana/tempo

v1.4.1

05 May 17:16
d3880a9

Bugfixes

  • [BUGFIX] metrics-generator: don't inject X-Scope-OrgID header for single-tenant setups #1417 (@kvrhdn)
  • [BUGFIX] compactor: populate compaction_objects_combined_total and tempo_discarded_spans_total{reason="trace_too_large_to_compact"} metrics again #1420 (@mdisibio)
  • [BUGFIX] distributor: prevent panics when concurrently calling shutdown to forwarder's queueManager #1422 (@mapno)

v1.4.0

28 Apr 19:17
6e96c52

Breaking changes

  • After this rollout the distributors will use a new API endpoint on the ingesters to push spans. Please roll out all ingesters before rolling the
    distributors to prevent downtime. Also, during this period, the ingesters will use considerably more resources and should be scaled up (or
    incoming traffic should be heavily throttled). Once all distributors and ingesters have rolled, performance will return to normal. Internally we
    have observed ~1.5x CPU load on the ingesters during the rollout. #1227 (@joe-elliott)
  • Querier options related to search have moved under a search block: #1350 (@joe-elliott)
    querier:
      search_query_timeout: 30s
      search_external_endpoints: []
      search_prefer_self: 2
    
    becomes
    querier:
      search:
        query_timeout: 30s
        prefer_self: 2
        external_endpoints: []
    
  • Dropped tempo-search-retention-duration parameter on the vulture. #1297 (@joe-elliott)
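
The querier search option move listed above can be applied to an existing configuration mechanically. The following is a minimal sketch, assuming the configuration has already been loaded into a plain Python dict (for example by a YAML parser); the function name migrate_querier_search is hypothetical and not part of Tempo.

```python
import copy

# Old flat key under "querier" -> new key under "querier.search",
# taken from the before/after fragments in the release note.
RENAMED_KEYS = {
    "search_query_timeout": "query_timeout",
    "search_external_endpoints": "external_endpoints",
    "search_prefer_self": "prefer_self",
}


def migrate_querier_search(config: dict) -> dict:
    """Return a copy of config with the old querier search keys moved under querier.search.

    Hypothetical helper for illustration; the input dict is left untouched.
    """
    migrated = copy.deepcopy(config)
    querier = migrated.get("querier")
    if not isinstance(querier, dict):
        return migrated
    for old_key, new_key in RENAMED_KEYS.items():
        if old_key in querier:
            value = querier.pop(old_key)
            search = querier.setdefault("search", {})
            # A value already present in the new location wins over the old one.
            search.setdefault(new_key, value)
    return migrated


if __name__ == "__main__":
    old = {"querier": {"search_query_timeout": "30s",
                       "search_external_endpoints": [],
                       "search_prefer_self": 2}}
    print(migrate_querier_search(old))
```

Keeping an already-set value in the new location is a design choice of this sketch, so that a partially migrated file is not overwritten.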

New Features and Enhancements

  • [FEATURE] Added metrics-generator: an optional component to generate metrics from ingested traces #1282 (@mapno, @kvrhdn)
  • [ENHANCEMENT] v2 object encoding added. This encoding adds a start/end timestamp to every record to reduce proto marshalling and increase search speed. #1227 (@joe-elliott)
  • [ENHANCEMENT] Allow the compaction cycle to be configurable with a default of 30 seconds #1335 (@willdot)
  • [ENHANCEMENT] Add new config options for setting GCS metadata on new objects #1368 (@zalegrala)
  • [ENHANCEMENT] Add new scaling alerts to the tempo-mixin #1292 (@mapno)
  • [ENHANCEMENT] Improve serverless handler error messages #1305 (@joe-elliott)
  • [ENHANCEMENT] Added a configuration option search_prefer_self to allow the queriers to do some work while also leveraging serverless in search. #1307 (@joe-elliott)
  • [ENHANCEMENT] Make trace combination/compaction more efficient #1291 (@mdisibio)
  • [ENHANCEMENT] Add Content-Type headers to query-frontend paths #1306 (@wperron)
  • [ENHANCEMENT] Partially persist traces that exceed max_bytes_per_trace during compaction #1317 (@joe-elliott)
  • [ENHANCEMENT] Make search respect per tenant max_bytes_per_trace and added skippedTraces to returned search metrics. #1318 (@joe-elliott)
  • [ENHANCEMENT] Added tenant ID (instance ID) to trace too large message. #1385 (@cristiangsp)
  • [ENHANCEMENT] Add a startTime and endTime parameter to the Trace by ID Tempo Query API to improve query performance #1388 (@sagarwala, @bikashmishra100, @ashwinidulams)
  • [ENHANCEMENT] Add hedging to queries to external endpoints. #1350 (@joe-elliott)
    New config options and defaults:
    querier:
      search:
        external_hedge_requests_at: 5s
        external_hedge_requests_up_to: 3
    

Bug Fixes

  • [BUGFIX] Correct issue where Azure "Blob Not Found" errors were sometimes not handled correctly #1390 (@joe-elliott)
  • [BUGFIX] Enable compaction and retention in Tanka single-binary #1352 (@irizzant)
  • [BUGFIX] Fixed issue when query-frontend doesn't log request details when request is cancelled #1136 (@adityapwr)
  • [BUGFIX] Update OTLP port in examples (docker-compose & kubernetes) from legacy ports (55680/55681) to new ports (4317/4318) #1294 (@mapno)
  • [BUGFIX] Fixes min/max time on blocks to be based on span times instead of ingestion time. #1314 (@joe-elliott)
    • Includes new configuration option to restrict the amount of slack around now to update the block start/end time. #1332 (@joe-elliott)
      storage:
        trace:
          wal:
            ingestion_time_range_slack: 2m0s
      
    • Includes a new metric to determine how often this range is exceeded: tempo_warnings_total{reason="outside_ingestion_time_slack"}
  • [BUGFIX] Prevent data race / ingester crash during searching by trace id by using xxhash instance as a local variable. #1387 (@bikashmishra100, @sagarwala, @ashwinidulams)
  • [BUGFIX] Fix spurious "failed to mark block compacted during retention" errors #1372 (@mdisibio)
  • [BUGFIX] Fix error message "Writer is closed" by resetting compression writer correctly on the error path. #1379 (@annanay25)
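
The ingestion_time_range_slack option from the list above can be pictured with a small sketch. It assumes, as one plausible reading of the note, that a span start older than now minus the slack, or a span end later than now plus the slack, is replaced by the current time and counted as a warning. The function name and return shape are hypothetical, and all times are Unix seconds.

```python
def adjust_time_range_for_slack(start: int, end: int, now: int, slack: int):
    """Return (start, end, outside_slack) for updating a block's time range.

    Assumed behaviour for illustration: values outside [now - slack, now + slack]
    fall back to the ingestion time (now) and set the warning flag.
    """
    earliest = now - slack
    latest = now + slack
    outside_slack = False
    if start < earliest:
        start = now
        outside_slack = True
    if end > latest:
        end = now
        outside_slack = True
    return start, end, outside_slack


if __name__ == "__main__":
    now = 1_000
    slack = 120  # 2m0s, the default shown in the release note
    print(adjust_time_range_for_slack(950, 990, now, slack))
    print(adjust_time_range_for_slack(100, 5_000, now, slack))
```

In this reading, the outside_slack flag corresponds to incrementing tempo_warnings_total{reason="outside_ingestion_time_slack"}.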

Other Changes

  • [CHANGE] Vulture now exercises search at any point during the block retention to test full backend search. #1297 (@joe-elliott)
  • [CHANGE] Updated storage.trace.pool.queue_depth default from 200->10000. #1345 (@joe-elliott)
  • [CHANGE] Updated flags -storage.trace.azure.storage-account-name and -storage.trace.s3.access_key to no longer be considered secrets #1356 (@simonswine)

v1.4.0-rc.0

19 Apr 18:39
9211d77
Pre-release

Breaking changes

  • After this rollout the distributors will use a new API endpoint on the ingesters to push spans. Please roll out all ingesters before rolling the
    distributors to prevent downtime. Also, during this period, the ingesters will use considerably more resources and should be scaled up (or incoming traffic should be
    heavily throttled). Once all distributors and ingesters have rolled, performance will return to normal. Internally we have observed ~1.5x CPU load on the
    ingesters during the rollout. #1227 (@joe-elliott)
  • Querier options related to search have moved under a search block: #1350 (@joe-elliott)
    querier:
      search_query_timeout: 30s
      search_external_endpoints: []
      search_prefer_self: 2
    
    becomes
    querier:
      search:
        query_timeout: 30s
        prefer_self: 2
        external_endpoints: []
    
  • Dropped tempo-search-retention-duration parameter on the vulture. #1297 (@joe-elliott)

New Features and Enhancements

  • [FEATURE] Added metrics-generator: an optional component to generate metrics from ingested traces #1282 (@mapno, @kvrhdn)
  • [ENHANCEMENT] v2 object encoding added. This encoding adds a start/end timestamp to every record to reduce proto marshalling and increase search speed. #1227 (@joe-elliott)
  • [ENHANCEMENT] Allow the compaction cycle to be configurable with a default of 30 seconds #1335 (@willdot)
  • [ENHANCEMENT] Add new config options for setting GCS metadata on new objects #1368 (@zalegrala)
  • [ENHANCEMENT] Add new scaling alerts to the tempo-mixin #1292 (@mapno)
  • [ENHANCEMENT] Improve serverless handler error messages #1305 (@joe-elliott)
  • [ENHANCEMENT] Added a configuration option search_prefer_self to allow the queriers to do some work while also leveraging serverless in search. #1307 (@joe-elliott)
  • [ENHANCEMENT] Make trace combination/compaction more efficient #1291 (@mdisibio)
  • [ENHANCEMENT] Add Content-Type headers to query-frontend paths #1306 (@wperron)
  • [ENHANCEMENT] Partially persist traces that exceed max_bytes_per_trace during compaction #1317 (@joe-elliott)
  • [ENHANCEMENT] Make search respect per tenant max_bytes_per_trace and added skippedTraces to returned search metrics. #1318 (@joe-elliott)
  • [ENHANCEMENT] Added tenant ID (instance ID) to trace too large message. #1385 (@cristiangsp)
  • [ENHANCEMENT] Add a startTime and endTime parameter to the Trace by ID Tempo Query API to improve query performance #1388 (@sagarwala, @bikashmishra100, @ashwinidulams)
  • [ENHANCEMENT] Add hedging to queries to external endpoints. #1350 (@joe-elliott)
    New config options and defaults:
    querier:
      search:
        external_hedge_requests_at: 5s
        external_hedge_requests_up_to: 3
    

Bug Fixes

  • [BUGFIX] Correct issue where Azure "Blob Not Found" errors were sometimes not handled correctly #1390 (@joe-elliott)
  • [BUGFIX] Enable compaction and retention in Tanka single-binary #1352 (@irizzant)
  • [BUGFIX] Fixed issue when query-frontend doesn't log request details when request is cancelled #1136 (@adityapwr)
  • [BUGFIX] Update OTLP port in examples (docker-compose & kubernetes) from legacy ports (55680/55681) to new ports (4317/4318) #1294 (@mapno)
  • [BUGFIX] Fixes min/max time on blocks to be based on span times instead of ingestion time. #1314 (@joe-elliott)
    • Includes new configuration option to restrict the amount of slack around now to update the block start/end time. #1332 (@joe-elliott)
      storage:
        trace:
          wal:
            ingestion_time_range_slack: 2m0s
      
    • Includes a new metric to determine how often this range is exceeded: tempo_warnings_total{reason="outside_ingestion_time_slack"}
  • [BUGFIX] Prevent data race / ingester crash during searching by trace id by using xxhash instance as a local variable. #1387 (@bikashmishra100, @sagarwala, @ashwinidulams)
  • [BUGFIX] Fix spurious "failed to mark block compacted during retention" errors #1372 (@mdisibio)
  • [BUGFIX] Fix error message "Writer is closed" by resetting compression writer correctly on the error path. #1379 (@annanay25)

Other Changes

  • [CHANGE] Vulture now exercises search at any point during the block retention to test full backend search. #1297 (@joe-elliott)
  • [CHANGE] Updated storage.trace.pool.queue_depth default from 200->10000. #1345 (@joe-elliott)
  • [CHANGE] Updated flags -storage.trace.azure.storage-account-name and -storage.trace.s3.access_key to no longer be considered secrets #1356 (@simonswine)

v1.3.2

23 Feb 14:27

Bug Fixes

  • [BUGFIX] Fixed an issue where the query-frontend would mangle start/end time ranges on searches which included the ingesters #1295 (@joe-elliott)

v1.3.1

02 Feb 17:09
5684f8c

This patch contains an important fix for users using etcd as kv store in Tempo's consistent hashing ring.

Bug Fixes

  • [BUGFIX] Fixed panic when using etcd as ring's kvstore #1260 (@mapno)

v1.3.0

24 Jan 14:24
be6476d

Breaking changes

This release updates the OpenTelemetry libraries to v0.40.0 and, with that, changes OTLP gRPC's default listening port from the legacy 55680 to the new 4317. There are two main routes to avoid downtime: configuring the receiver to listen on the old port 55680 and/or pushing traces to both ports simultaneously until the rollout is complete.

As part of adding support for full backend search, a search config parameter has been renamed from query_frontend.search.max_result_limit to query_frontend.search.default_result_limit.

  • [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
  • [CHANGE] BREAKING CHANGE Moved querier.search_max_result_limit and querier.search_default_result_limit to query_frontend.search.max_result_limit and query_frontend.search.default_result_limit #1174.
  • [CHANGE] BREAKING CHANGE Remove deprecated ingester gRPC endpoint and data encoding. The current data encoding was introduced in v1.0. If running earlier versions, first upgrade to v1.0 through v1.2 and allow time for all blocks to be switched to the "v1" data encoding. #1215 (@mdisibio)

New Features and Enhancements

  • [FEATURE] Add support for inline environments. #1184 (@irizzant)
  • [FEATURE] Added support for full backend search. #1174 (@joe-elliott)
  • [ENHANCEMENT] Expose upto parameter on hedged requests for each backend with hedge_requests_up_to. #1085 (@joe-elliott)
  • [ENHANCEMENT] Search: drop use of TagCache, extract tags and tag values on-demand #1068 (@kvrhdn)
  • [ENHANCEMENT] Jsonnet: add $._config.namespace to filter by namespace in cortex metrics #1098 (@mapno)
  • [ENHANCEMENT] Add middleware to compress frontend HTTP responses with gzip if requested #1080 (@kvrhdn, @zalegrala)
  • [ENHANCEMENT] Allow query disablement in vulture #1117 (@zalegrala)
  • [ENHANCEMENT] Improve memory efficiency of compaction and block cutting. #1121 #1130 (@joe-elliott)
  • [ENHANCEMENT] Include metrics for configured limit overrides and defaults: tempo_limits_overrides, tempo_limits_defaults #1089 (@zalegrala)
  • [ENHANCEMENT] Add Envoy Proxy panel to Tempo / Writes dashboard #1137 (@kvrhdn)
  • [ENHANCEMENT] Reduce compactionCycle to improve performance in large multitenant environments #1145 (@joe-elliott)
  • [ENHANCEMENT] Added max_time_per_tenant to allow for independently configuring polling and compaction cycle. #1145 (@joe-elliott)
  • [ENHANCEMENT] Add tempodb_compaction_outstanding_blocks metric to measure compaction load #1143 (@mapno)
  • [ENHANCEMENT] Update mixin to use new backend metric #1151 (@zalegrala)
  • [ENHANCEMENT] Make TempoIngesterFlushesFailing alert more actionable #1157 (@dannykopping)
  • [ENHANCEMENT] Switch open-telemetry/opentelemetry-collector to grafana/opentelemetry-collector fork, update it to 0.40.0 and add missing dependencies due to the change #1142 (@tete17)
  • [ENHANCEMENT] Allow environment variables for Azure storage credentials #1147 (@zalegrala)
  • [ENHANCEMENT] jsonnet: set rollingUpdate.maxSurge to 3 for distributor, frontend and queriers #1164 (@kvrhdn)
  • [ENHANCEMENT] Reduce search data file sizes by optimizing contents #1165 (@mdisibio)
  • [ENHANCEMENT] Add tempo_ingester_live_traces metric #1170 (@mdisibio)
  • [ENHANCEMENT] Update compactor ring to automatically forget unhealthy entries #1178 (@mdisibio)
  • [ENHANCEMENT] Added the ability to pass ISO8601 date/times for start/end date to tempo-cli query api search #1208 (@joe-elliott)
  • [ENHANCEMENT] Prevent writes to large traces even after flushing to disk #1199 (@mdisibio)

Bug Fixes

  • [BUGFIX] Add process name to vulture traces to work around display issues #1127 (@mdisibio)
  • [BUGFIX] Fixed issue where compaction sometimes dropped spans. #1130 (@joe-elliott)
  • [BUGFIX] Ensure that the admin client jsonnet has correct S3 bucket property. (@hedss)
  • [BUGFIX] Publish tenant index age correctly for tenant index writers. #1146 (@joe-elliott)
  • [BUGFIX] Ingester startup panic slice bounds out of range #1195 (@mdisibio)

Other Changes

  • [CHANGE] Search: Add new per-tenant limit max_bytes_per_tag_values_query to limit the size of tag-values response. #1068 (@annanay25)
  • [CHANGE] Reduce MaxSearchBytesPerTrace ingester.max-search-bytes-per-trace default to 5KB #1129 (@annanay25)
  • [CHANGE] Remove deprecated method Push from tempopb.Pusher #1173 (@kvrhdn)
  • [CHANGE] Upgrade cristalhq/hedgedhttp from v0.6.0 to v0.7.0 #1159 (@cristaloleg)
  • [CHANGE] Export trace id constant in api package #1176
  • [CHANGE] GRPC 1.33.3 => 1.38.0 broke compatibility with gogoproto.customtype. Enforce the use of gogoproto marshalling/unmarshalling for Tempo, Cortex & Jaeger structs. #1186 (@annanay25)

v1.3.0-rc.0

12 Jan 16:14
c7d9d18
Pre-release

Breaking changes

This release updates the OpenTelemetry libraries to v0.40.0 and, with that, changes OTLP gRPC's default listening port from the legacy 55680 to the new 4317. There are two main routes to avoid downtime: configuring the receiver to listen on the old port 55680 and/or pushing traces to both ports simultaneously until the rollout is complete.

As part of adding support for full backend search, a search config parameter has been renamed from query_frontend.search.max_result_limit to query_frontend.search.default_result_limit.

  • [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
  • [CHANGE] BREAKING CHANGE Moved querier.search_max_result_limit and querier.search_default_result_limit to query_frontend.search.max_result_limit and query_frontend.search.default_result_limit #1174.

New Features and Enhancements

  • [FEATURE] Add support for inline environments. #1184 (@irizzant)
  • [FEATURE] Added support for full backend search. #1174 (@joe-elliott)
  • [ENHANCEMENT] Expose upto parameter on hedged requests for each backend with hedge_requests_up_to. #1085 (@joe-elliott)
  • [ENHANCEMENT] Search: drop use of TagCache, extract tags and tag values on-demand #1068 (@kvrhdn)
  • [ENHANCEMENT] Jsonnet: add $._config.namespace to filter by namespace in cortex metrics #1098 (@mapno)
  • [ENHANCEMENT] Add middleware to compress frontend HTTP responses with gzip if requested #1080 (@kvrhdn, @zalegrala)
  • [ENHANCEMENT] Allow query disablement in vulture #1117 (@zalegrala)
  • [ENHANCEMENT] Improve memory efficiency of compaction and block cutting. #1121 #1130 (@joe-elliott)
  • [ENHANCEMENT] Include metrics for configured limit overrides and defaults: tempo_limits_overrides, tempo_limits_defaults #1089 (@zalegrala)
  • [ENHANCEMENT] Add Envoy Proxy panel to Tempo / Writes dashboard #1137 (@kvrhdn)
  • [ENHANCEMENT] Reduce compactionCycle to improve performance in large multitenant environments #1145 (@joe-elliott)
  • [ENHANCEMENT] Added max_time_per_tenant to allow for independently configuring polling and compaction cycle. #1145 (@joe-elliott)
  • [ENHANCEMENT] Add tempodb_compaction_outstanding_blocks metric to measure compaction load #1143 (@mapno)
  • [ENHANCEMENT] Update mixin to use new backend metric #1151 (@zalegrala)
  • [ENHANCEMENT] Make TempoIngesterFlushesFailing alert more actionable #1157 (@dannykopping)
  • [ENHANCEMENT] Switch open-telemetry/opentelemetry-collector to grafana/opentelemetry-collector fork, update it to 0.40.0 and add missing dependencies due to the change #1142 (@tete17)
  • [ENHANCEMENT] Allow environment variables for Azure storage credentials #1147 (@zalegrala)
  • [ENHANCEMENT] jsonnet: set rollingUpdate.maxSurge to 3 for distributor, frontend and queriers #1164 (@kvrhdn)
  • [ENHANCEMENT] Reduce search data file sizes by optimizing contents #1165 (@mdisibio)
  • [ENHANCEMENT] Add tempo_ingester_live_traces metric #1170 (@mdisibio)
  • [ENHANCEMENT] Update compactor ring to automatically forget unhealthy entries #1178 (@mdisibio)
  • [ENHANCEMENT] Added the ability to pass ISO8601 date/times for start/end date to tempo-cli query api search #1208 (@joe-elliott)
  • [ENHANCEMENT] Prevent writes to large traces even after flushing to disk #1199 (@mdisibio)

Bug Fixes

  • [BUGFIX] Add process name to vulture traces to work around display issues #1127 (@mdisibio)
  • [BUGFIX] Fixed issue where compaction sometimes dropped spans. #1130 (@joe-elliott)
  • [BUGFIX] Ensure that the admin client jsonnet has correct S3 bucket property. (@hedss)
  • [BUGFIX] Publish tenant index age correctly for tenant index writers. #1146 (@joe-elliott)
  • [BUGFIX] Ingester startup panic slice bounds out of range #1195 (@mdisibio)

Other Changes

  • [CHANGE] Search: Add new per-tenant limit max_bytes_per_tag_values_query to limit the size of tag-values response. #1068 (@annanay25)
  • [CHANGE] Reduce MaxSearchBytesPerTrace ingester.max-search-bytes-per-trace default to 5KB #1129 (@annanay25)
  • [CHANGE] Remove deprecated method Push from tempopb.Pusher #1173 (@kvrhdn)
  • [CHANGE] Upgrade cristalhq/hedgedhttp from v0.6.0 to v0.7.0 #1159 (@cristaloleg)
  • [CHANGE] Export trace id constant in api package #1176
  • [CHANGE] GRPC 1.33.3 => 1.38.0 broke compatibility with gogoproto.customtype. Enforce the use of gogoproto marshalling/unmarshalling for Tempo, Cortex & Jaeger structs. #1186 (@annanay25)
  • [CHANGE] BREAKING CHANGE Remove deprecated ingester gRPC endpoint and data encoding. The current data encoding was introduced in v1.0. If running earlier versions, first upgrade to v1.0 through v1.2 and allow time for all blocks to be switched to the "v1" data encoding. #1215 (@mdisibio)

v1.2.1

15 Nov 20:52

This patch contains two important bug fixes and is recommended for all users running v1.2.0.

Bug Fixes

  • [BUGFIX] Fix defaults for MaxBytesPerTrace (ingester.max-bytes-per-trace) and MaxSearchBytesPerTrace (ingester.max-search-bytes-per-trace) #1109 (@BitProcessor)
  • [BUGFIX] Ignore empty objects during compaction #1113 (@mdisibio)

v1.2.0

05 Nov 15:30
fb7fcca

Breaking Changes

This release contains a number of small breaking changes. They will likely have no impact on your deployment, but it should be noted that due to a change in the API between the query-frontend and querier there may be a temporary read outage during deployment.

  • [CHANGE] BREAKING CHANGE Drop support for v0 and v1 blocks. See 1.1 changelog for details #919 (@joe-elliott)
  • [CHANGE] BREAKING CHANGE Consolidate status information onto /status endpoint #952 (@zalegrala)
    The following endpoints moved.
    /runtime_config moved to /status/runtime_config
    /config moved to /status/config
    /services moved to /status/services
  • [CHANGE] BREAKING CHANGE Change ingester metric ingester_bytes_metric_total in favor of ingester_bytes_received_total #979 (@mapno)
  • [CHANGE] Renamed CLI flag from --storage.trace.maintenance-cycle to --storage.trace.blocklist_poll. This is a BREAKING CHANGE #897 (@mritunjaysharma394)
  • [CHANGE] BREAKING CHANGE Support partial results from failed block queries #1007 (@mapno)
    Querier GET /querier/api/traces/<traceid> response's body has been modified
    to return tempopb.TraceByIDResponse instead of simply tempopb.Trace. This will cause a disruption of the read path during rollout of the change.
  • [CHANGE] BREAKING CHANGE Change the metric name from cortex_runtime_config_last_reload_successful to tempo_runtime_config_last_reload_successful #945 (@kavirajk)
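
Scripts or probes that still call the old status paths listed above need their URLs updated. A minimal sketch of that rewrite follows, using only the three moves named in the note. The helper name rewrite_status_url and the host tempo.example:3200 are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Old path -> new path, as listed under the /status consolidation.
MOVED_ENDPOINTS = {
    "/runtime_config": "/status/runtime_config",
    "/config": "/status/config",
    "/services": "/status/services",
}


def rewrite_status_url(url: str) -> str:
    """Rewrite a URL that targets a moved endpoint; leave every other URL unchanged."""
    parts = urlsplit(url)
    new_path = MOVED_ENDPOINTS.get(parts.path, parts.path)
    return urlunsplit(parts._replace(path=new_path))


if __name__ == "__main__":
    # Hypothetical host and port, for illustration only.
    print(rewrite_status_url("http://tempo.example:3200/config"))
    print(rewrite_status_url("http://tempo.example:3200/status/services"))
```

Only exact path matches are rewritten, so URLs that already use the /status prefix pass through untouched.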

New Features and Enhancements

  • [FEATURE] Add ability to search ingesters for traces #806 (@mdisibio @kvrhdn @annanay25)
  • [FEATURE] Add runtime config handler #936 (@mapno)
  • [FEATURE] Add ScalableSingleBinary operational run mode #1004 (@zalegrala)
  • [ENHANCEMENT] Added "query blocks" cli option. #876 (@joe-elliott)
  • [ENHANCEMENT] Added "search blocks" cli option. #972 (@joe-elliott)
  • [ENHANCEMENT] Added traceid to trace too large message. #888 (@mritunjaysharma394)
  • [ENHANCEMENT] Add support for Tempo workloads to use overrides from a single configmap in microservice mode. #896 (@kavirajk)
  • [ENHANCEMENT] Updated config defaults to better reflect operational knowledge. #913 (@joe-elliott)
    ingester:
      trace_idle_period: 30s => 10s  # reduce ingester memory requirements with little impact on querying
      flush_check_period: 30s => 10s
    query_frontend:
      query_shards: 2 => 20          # will massively improve performance on large installs
    storage:
      trace:
        wal:
          encoding: none => snappy   # snappy has been tested thoroughly and is ready for production use
        block:
          bloom_filter_false_positive: .05 => .01          # will increase total bloom filter size but improve query performance
          bloom_filter_shard_size_bytes: 256KiB => 100 KiB # will improve query performance
    compactor:
      compaction:
        chunk_size_bytes: 10 MiB => 5 MiB  # will reduce compactor memory needs
        compaction_window: 4h => 1h        # will allow more compactors to participate in compaction without substantially increasing blocks
    
  • [ENHANCEMENT] Make s3 backend readError logic more robust #905 (@wei840222)
  • [ENHANCEMENT] Add gen index and gen bloom commands to tempo-cli. #903 (@annanay25)
  • [ENHANCEMENT] Implement trace comparison in Vulture #904 (@zalegrala)
  • [ENHANCEMENT] Compression updates: Added s2, improved snappy performance #961 (@joe-elliott)
  • [ENHANCEMENT] Add support for vulture sending long running traces #951 (@zalegrala)
  • [ENHANCEMENT] Shard tenant index creation by tenant and add functionality to handle stale indexes. #1005 (@joe-elliott)
  • [ENHANCEMENT] Support partial results from failed block queries #1007 (@mapno)
  • [ENHANCEMENT] Add new metric tempo_distributor_push_duration_seconds #1027 (@zalegrala)
  • [ENHANCEMENT] Add query parameter to show the default config values and the difference between the current values and the defaults. #1045 (@MichelHollands)
  • [ENHANCEMENT] Adding metrics around ingester flush retries #1049 (@dannykopping)
  • [ENHANCEMENT] Performance: More efficient distributor batching #1075 (@joe-elliott)
  • [ENHANCEMENT] Include tempo-cli in the release #1086 (@zalegrala)
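
The comment on bloom_filter_false_positive in the defaults above says that lowering the rate from .05 to .01 increases total bloom filter size. A rough estimate of how much can be made with the textbook sizing formula for an optimally configured Bloom filter, bits per item = -ln(p) / (ln 2)^2. This is a sketch under that assumption only; Tempo's actual sizing and sharding may differ.

```python
import math


def bloom_bits_per_item(false_positive_rate: float) -> float:
    """Bits per stored item for an optimally sized Bloom filter (textbook formula)."""
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


if __name__ == "__main__":
    old_bits = bloom_bits_per_item(0.05)
    new_bits = bloom_bits_per_item(0.01)
    print(f"old default: {old_bits:.2f} bits per item")
    print(f"new default: {new_bits:.2f} bits per item")
    print(f"growth factor: {new_bits / old_bits:.2f}x")
```

Under this formula the new default needs roughly one and a half times the filter space per trace ID, in exchange for about five times fewer false-positive block reads.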

Bug Fixes

  • [BUGFIX] Update port spec for GCS docker-compose example #869 (@zalegrala)
  • [BUGFIX] Fix "magic number" errors and other block mishandling when an ingester forcefully shuts down #937 (@mdisibio)
  • [BUGFIX] Fix compactor memory leak #806 (@mdisibio)
  • [BUGFIX] Set span's tag span.kind to client in query-frontend #975 (@mapno)
  • [BUGFIX] Fixes tempodb_backend_hedged_roundtrips_total to correctly count hedged roundtrips. #1079 (@joe-elliott)
  • [BUGFIX] Update go-kit logger package to remove spurious debug logs #1094 (@bboreham)

Other Changes

  • [CHANGE] update jsonnet alerts and recording rules to use job_selectors and cluster_selectors for configurable unique identifier labels #935 (@kevinschoonover)
  • [CHANGE] Add troubleshooting language to config for server.grpc_server_max_recv_msg_size and server.grpc_server_max_send_msg_size when handling large traces #1023 (@thejosephstevens)

v1.2.0-rc.1

02 Nov 19:17
c5d007d
Pre-release

Breaking Changes

This release contains a number of small breaking changes. They will likely have no impact on your deployment, but it should be noted that due to a change in the API between the query-frontend and querier there may be a temporary read outage during deployment.

  • [CHANGE] BREAKING CHANGE Drop support for v0 and v1 blocks. See 1.1 changelog for details #919 (@joe-elliott)
  • [CHANGE] BREAKING CHANGE Consolidate status information onto /status endpoint #952 (@zalegrala)
    The following endpoints moved.
    /runtime_config moved to /status/runtime_config
    /config moved to /status/config
    /services moved to /status/services
  • [CHANGE] BREAKING CHANGE Change ingester metric ingester_bytes_metric_total in favor of ingester_bytes_received_total #979 (@mapno)
  • [CHANGE] Renamed CLI flag from --storage.trace.maintenance-cycle to --storage.trace.blocklist_poll. This is a BREAKING CHANGE #897 (@mritunjaysharma394)
  • [CHANGE] BREAKING CHANGE Support partial results from failed block queries #1007 (@mapno)
    Querier GET /querier/api/traces/<traceid> response's body has been modified
    to return tempopb.TraceByIDResponse instead of simply tempopb.Trace. This will cause a disruption of the read path during rollout of the change.
  • [CHANGE] BREAKING CHANGE Change the metric name from cortex_runtime_config_last_reload_successful to tempo_runtime_config_last_reload_successful #945 (@kavirajk)

New Features and Enhancements

  • [FEATURE] Add ability to search ingesters for traces #806 (@mdisibio @kvrhdn @annanay25)
  • [FEATURE] Add runtime config handler #936 (@mapno)
  • [FEATURE] Add ScalableSingleBinary operational run mode #1004 (@zalegrala)
  • [ENHANCEMENT] Added "query blocks" cli option. #876 (@joe-elliott)
  • [ENHANCEMENT] Added "search blocks" cli option. #972 (@joe-elliott)
  • [ENHANCEMENT] Added traceid to trace too large message. #888 (@mritunjaysharma394)
  • [ENHANCEMENT] Add support for Tempo workloads to use overrides from a single configmap in microservice mode. #896 (@kavirajk)
  • [ENHANCEMENT] Updated config defaults to better reflect operational knowledge. #913 (@joe-elliott)
    ingester:
      trace_idle_period: 30s => 10s  # reduce ingester memory requirements with little impact on querying
      flush_check_period: 30s => 10s
    query_frontend:
      query_shards: 2 => 20          # will massively improve performance on large installs
    storage:
      trace:
        wal:
          encoding: none => snappy   # snappy has been tested thoroughly and is ready for production use
        block:
          bloom_filter_false_positive: .05 => .01          # will increase total bloom filter size but improve query performance
          bloom_filter_shard_size_bytes: 256KiB => 100 KiB # will improve query performance
    compactor:
      compaction:
        chunk_size_bytes: 10 MiB => 5 MiB  # will reduce compactor memory needs
        compaction_window: 4h => 1h        # will allow more compactors to participate in compaction without substantially increasing blocks
    
  • [ENHANCEMENT] Make s3 backend readError logic more robust #905 (@wei840222)
  • [ENHANCEMENT] Add gen index and gen bloom commands to tempo-cli. #903 (@annanay25)
  • [ENHANCEMENT] Implement trace comparison in Vulture #904 (@zalegrala)
  • [ENHANCEMENT] Compression updates: Added s2, improved snappy performance #961 (@joe-elliott)
  • [ENHANCEMENT] Add support for vulture sending long running traces #951 (@zalegrala)
  • [ENHANCEMENT] Shard tenant index creation by tenant and add functionality to handle stale indexes. #1005 (@joe-elliott)
  • [ENHANCEMENT] Support partial results from failed block queries #1007 (@mapno)
  • [ENHANCEMENT] Add new metric tempo_distributor_push_duration_seconds #1027 (@zalegrala)
  • [ENHANCEMENT] Add query parameter to show the default config values and the difference between the current values and the defaults. #1045 (@MichelHollands)
  • [ENHANCEMENT] Adding metrics around ingester flush retries #1049 (@dannykopping)
  • [ENHANCEMENT] Performance: More efficient distributor batching #1075 (@joe-elliott)
  • [ENHANCEMENT] Include tempo-cli in the release #1086 (@zalegrala)

Bug Fixes

  • [BUGFIX] Update port spec for GCS docker-compose example #869 (@zalegrala)
  • [BUGFIX] Fix "magic number" errors and other block mishandling when an ingester forcefully shuts down #937 (@mdisibio)
  • [BUGFIX] Fix compactor memory leak #806 (@mdisibio)
  • [BUGFIX] Set span's tag span.kind to client in query-frontend #975 (@mapno)
  • [BUGFIX] Fixes tempodb_backend_hedged_roundtrips_total to correctly count hedged roundtrips. #1079 (@joe-elliott)
  • [BUGFIX] Update go-kit logger package to remove spurious debug logs #1094 (@bboreham)

Other Changes

  • [CHANGE] update jsonnet alerts and recording rules to use job_selectors and cluster_selectors for configurable unique identifier labels #935 (@kevinschoonover)
  • [CHANGE] Add troubleshooting language to config for server.grpc_server_max_recv_msg_size and server.grpc_server_max_send_msg_size when handling large traces #1023 (@thejosephstevens)