Releases: grafana/tempo
v1.4.1
Bugfixes
- [BUGFIX] metrics-generator: don't inject X-Scope-OrgID header for single-tenant setups #1417 (@kvrhdn)
- [BUGFIX] compactor: populate compaction_objects_combined_total and tempo_discarded_spans_total{reason="trace_too_large_to_compact"} metrics again #1420 (@mdisibio)
- [BUGFIX] distributor: prevent panics when concurrently calling shutdown to forwarder's queueManager #1422 (@mapno)
v1.4.0
Breaking changes
- After this rollout the distributors will use a new API endpoint on the ingesters to push spans. Please roll out all ingesters before rolling the distributors to prevent downtime. Also, during this period, the ingesters will use considerably more resources and should be scaled up (or incoming traffic should be heavily throttled). Once all distributors and ingesters have rolled, performance will return to normal. Internally we have observed ~1.5x CPU load on the ingesters during the rollout. #1227 (@joe-elliott)
- Querier options related to search have moved under a search block: #1350 (@joe-elliott)
    querier:
      search_query_timeout: 30s
      search_external_endpoints: []
      search_prefer_self: 2
  becomes
    querier:
      search:
        query_timeout: 30s
        prefer_self: 2
        external_endpoints: []
- Dropped tempo-search-retention-duration parameter on the vulture. #1297 (@joe-elliott)
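The querier option move above can be applied mechanically to an existing configuration. The following is a minimal sketch, assuming the config has already been loaded into a plain dictionary; the helper name migrate_querier_search is hypothetical and not part of Tempo.

```python
def migrate_querier_search(config: dict) -> dict:
    """Return a copy of config with flat querier.search_* keys moved under querier.search.

    Hypothetical helper for illustration; the key names come from the release note.
    """
    renames = {
        "search_query_timeout": "query_timeout",
        "search_external_endpoints": "external_endpoints",
        "search_prefer_self": "prefer_self",
    }
    querier = dict(config.get("querier", {}))
    search = dict(querier.get("search", {}))
    for old, new in renames.items():
        if old in querier:
            # Drop the old key; a value already set under the new name wins.
            search.setdefault(new, querier.pop(old))
    if search:
        querier["search"] = search
    migrated = dict(config)
    if querier or "querier" in config:
        migrated["querier"] = querier
    return migrated


old = {
    "querier": {
        "search_query_timeout": "30s",
        "search_external_endpoints": [],
        "search_prefer_self": 2,
    }
}
new = migrate_querier_search(old)
```

The input dictionary is left untouched, so the old and new layouts can be compared before the file is rewritten.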
New Features and Enhancements
- [FEATURE] Added metrics-generator: an optional component to generate metrics from ingested traces #1282 (@mapno, @kvrhdn)
- [ENHANCEMENT] v2 object encoding added. This encoding adds a start/end timestamp to every record to reduce proto marshalling and increase search speed. #1227 (@joe-elliott)
- [ENHANCEMENT] Allow the compaction cycle to be configurable with a default of 30 seconds #1335 (@willdot)
- [ENHANCEMENT] Add new config options for setting GCS metadata on new objects #1368 (@zalegrala)
- [ENHANCEMENT] Add new scaling alerts to the tempo-mixin #1292 (@mapno)
- [ENHANCEMENT] Improve serverless handler error messages #1305 (@joe-elliott)
- [ENHANCEMENT] Added a configuration option search_prefer_self to allow the queriers to do some work while also leveraging serverless in search. #1307 (@joe-elliott)
- [ENHANCEMENT] Make trace combination/compaction more efficient #1291 (@mdisibio)
- [ENHANCEMENT] Add Content-Type headers to query-frontend paths #1306 (@wperron)
- [ENHANCEMENT] Partially persist traces that exceed max_bytes_per_trace during compaction #1317 (@joe-elliott)
- [ENHANCEMENT] Make search respect per tenant max_bytes_per_trace and added skippedTraces to returned search metrics. #1318 (@joe-elliott)
- [ENHANCEMENT] Added tenant ID (instance ID) to trace too large message. #1385 (@cristiangsp)
- [ENHANCEMENT] Add a startTime and endTime parameter to the Trace by ID Tempo Query API to improve query performance #1388 (@sagarwala, @bikashmishra100, @ashwinidulams)
- [ENHANCEMENT] Add hedging to queries to external endpoints. #1350 (@joe-elliott)
  New config options and defaults:
    querier:
      search:
        external_hedge_requests_at: 5s
        external_hedge_requests_up_to: 3
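For readers unfamiliar with request hedging, the two hedging options listed above can be read as "start another attempt if nothing has answered after external_hedge_requests_at, with at most external_hedge_requests_up_to attempts in flight". The sketch below illustrates that general technique with asyncio; it is not Tempo's implementation, and hedged_call and its parameters are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def hedged_call(
    call: Callable[[int], Awaitable[Any]],
    hedge_at: float,
    up_to: int,
) -> Any:
    """Run call(attempt), launching an extra attempt every hedge_at seconds.

    At most up_to attempts are started. The first attempt to finish decides
    the outcome (its result is returned or its exception is raised), and the
    remaining attempts are cancelled.
    """
    if up_to < 1:
        raise ValueError("up_to must be at least 1")
    tasks = []
    try:
        for attempt in range(up_to):
            tasks.append(asyncio.create_task(call(attempt)))
            done, _ = await asyncio.wait(
                tasks, timeout=hedge_at, return_when=asyncio.FIRST_COMPLETED
            )
            if done:
                return next(iter(done)).result()
        # All attempts are in flight; wait for whichever finishes first.
        done, _ = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        for task in tasks:
            task.cancel()  # no-op for tasks that already finished


async def slow_then_fast(attempt: int) -> int:
    # Simulated endpoint: the first attempt is slow, later attempts are fast.
    await asyncio.sleep(0.5 if attempt == 0 else 0.01)
    return attempt


winner = asyncio.run(hedged_call(slow_then_fast, hedge_at=0.05, up_to=3))
```

In this usage the first attempt is still pending when the hedge delay expires, so a second attempt is started and its answer is the one returned.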
Bug Fixes
- [BUGFIX] Correct issue where Azure "Blob Not Found" errors were sometimes not handled correctly #1390 (@joe-elliott)
- [BUGFIX] Enable compaction and retention in Tanka single-binary #1352 (@irizzant)
- [BUGFIX] Fixed issue when query-frontend doesn't log request details when request is cancelled #1136 (@adityapwr)
- [BUGFIX] Update OTLP port in examples (docker-compose & kubernetes) from legacy ports (55680/55681) to new ports (4317/4318) #1294 (@mapno)
- [BUGFIX] Fixes min/max time on blocks to be based on span times instead of ingestion time. #1314 (@joe-elliott)
  - Includes new configuration option to restrict the amount of slack around now to update the block start/end time. #1332 (@joe-elliott)
      storage:
        trace:
          wal:
            ingestion_time_range_slack: 2m0s
  - Includes a new metric to determine how often this range is exceeded: tempo_warnings_total{reason="outside_ingestion_time_slack"}
- [BUGFIX] Prevent data race / ingester crash during searching by trace id by using xxhash instance as a local variable. #1387 (@bikashmishra100, @sagarwala, @ashwinidulams)
- [BUGFIX] Fix spurious "failed to mark block compacted during retention" errors #1372 (@mdisibio)
- [BUGFIX] Fix error message "Writer is closed" by resetting compression writer correctly on the error path. #1379 (@annanay25)
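The ingestion_time_range_slack option from the block min/max time fix above bounds how far from the current time a span timestamp may move a block's start or end. The sketch below shows one way such a guard can work. Substituting the current time for an out-of-range timestamp is an assumption made for illustration, and BlockRange and observe are hypothetical names rather than Tempo identifiers.

```python
from dataclasses import dataclass


@dataclass
class BlockRange:
    """Tracks a block's start/end (unix seconds) from observed span times."""

    start: int
    end: int
    warnings: int = 0  # stands in for tempo_warnings_total{reason="outside_ingestion_time_slack"}

    def observe(self, span_start: int, span_end: int, now: int, slack: int) -> None:
        lowest, highest = now - slack, now + slack
        if span_start < lowest or span_end > highest:
            self.warnings += 1
        # Assumption: timestamps outside the slack window are replaced by now.
        effective_start = span_start if span_start >= lowest else now
        effective_end = span_end if span_end <= highest else now
        self.start = min(self.start, effective_start)
        self.end = max(self.end, effective_end)


block = BlockRange(start=1000, end=1000)
block.observe(950, 990, now=1000, slack=120)    # inside the window: widens the start
block.observe(100, 5000, now=1010, slack=120)   # far outside: counted, range barely moves
```

With a 2m slack, a span carrying a wildly wrong clock can no longer stretch the block's time range across hours or days, which keeps time-bounded searches from scanning unrelated blocks.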
Other Changes
- [CHANGE] Vulture now exercises search at any point during the block retention to test full backend search. #1297 (@joe-elliott)
- [CHANGE] Updated storage.trace.pool.queue_depth default from 200->10000. #1345 (@joe-elliott)
- [CHANGE] Updated flags -storage.trace.azure.storage-account-name and -storage.trace.s3.access_key to no longer be considered secrets #1356 (@simonswine)
v1.4.0-rc.0
Breaking changes
- After this rollout the distributors will use a new API endpoint on the ingesters to push spans. Please roll out all ingesters before rolling the distributors to prevent downtime. Also, during this period, the ingesters will use considerably more resources and should be scaled up (or incoming traffic should be heavily throttled). Once all distributors and ingesters have rolled, performance will return to normal. Internally we have observed ~1.5x CPU load on the ingesters during the rollout. #1227 (@joe-elliott)
- Querier options related to search have moved under a search block: #1350 (@joe-elliott)
    querier:
      search_query_timeout: 30s
      search_external_endpoints: []
      search_prefer_self: 2
  becomes
    querier:
      search:
        query_timeout: 30s
        prefer_self: 2
        external_endpoints: []
- Dropped tempo-search-retention-duration parameter on the vulture. #1297 (@joe-elliott)
New Features and Enhancements
- [FEATURE] Added metrics-generator: an optional component to generate metrics from ingested traces #1282 (@mapno, @kvrhdn)
- [ENHANCEMENT] v2 object encoding added. This encoding adds a start/end timestamp to every record to reduce proto marshalling and increase search speed. #1227 (@joe-elliott)
- [ENHANCEMENT] Allow the compaction cycle to be configurable with a default of 30 seconds #1335 (@willdot)
- [ENHANCEMENT] Add new config options for setting GCS metadata on new objects #1368 (@zalegrala)
- [ENHANCEMENT] Add new scaling alerts to the tempo-mixin #1292 (@mapno)
- [ENHANCEMENT] Improve serverless handler error messages #1305 (@joe-elliott)
- [ENHANCEMENT] Added a configuration option search_prefer_self to allow the queriers to do some work while also leveraging serverless in search. #1307 (@joe-elliott)
- [ENHANCEMENT] Make trace combination/compaction more efficient #1291 (@mdisibio)
- [ENHANCEMENT] Add Content-Type headers to query-frontend paths #1306 (@wperron)
- [ENHANCEMENT] Partially persist traces that exceed max_bytes_per_trace during compaction #1317 (@joe-elliott)
- [ENHANCEMENT] Make search respect per tenant max_bytes_per_trace and added skippedTraces to returned search metrics. #1318 (@joe-elliott)
- [ENHANCEMENT] Added tenant ID (instance ID) to trace too large message. #1385 (@cristiangsp)
- [ENHANCEMENT] Add a startTime and endTime parameter to the Trace by ID Tempo Query API to improve query performance #1388 (@sagarwala, @bikashmishra100, @ashwinidulams)
- [ENHANCEMENT] Add hedging to queries to external endpoints. #1350 (@joe-elliott)
  New config options and defaults:
    querier:
      search:
        external_hedge_requests_at: 5s
        external_hedge_requests_up_to: 3
Bug Fixes
- [BUGFIX] Correct issue where Azure "Blob Not Found" errors were sometimes not handled correctly #1390 (@joe-elliott)
- [BUGFIX] Enable compaction and retention in Tanka single-binary #1352 (@irizzant)
- [BUGFIX] Fixed issue when query-frontend doesn't log request details when request is cancelled #1136 (@adityapwr)
- [BUGFIX] Update OTLP port in examples (docker-compose & kubernetes) from legacy ports (55680/55681) to new ports (4317/4318) #1294 (@mapno)
- [BUGFIX] Fixes min/max time on blocks to be based on span times instead of ingestion time. #1314 (@joe-elliott)
  - Includes new configuration option to restrict the amount of slack around now to update the block start/end time. #1332 (@joe-elliott)
      storage:
        trace:
          wal:
            ingestion_time_range_slack: 2m0s
  - Includes a new metric to determine how often this range is exceeded: tempo_warnings_total{reason="outside_ingestion_time_slack"}
- [BUGFIX] Prevent data race / ingester crash during searching by trace id by using xxhash instance as a local variable. #1387 (@bikashmishra100, @sagarwala, @ashwinidulams)
- [BUGFIX] Fix spurious "failed to mark block compacted during retention" errors #1372 (@mdisibio)
- [BUGFIX] Fix error message "Writer is closed" by resetting compression writer correctly on the error path. #1379 (@annanay25)
Other Changes
- [CHANGE] Vulture now exercises search at any point during the block retention to test full backend search. #1297 (@joe-elliott)
- [CHANGE] Updated storage.trace.pool.queue_depth default from 200->10000. #1345 (@joe-elliott)
- [CHANGE] Updated flags -storage.trace.azure.storage-account-name and -storage.trace.s3.access_key to no longer be considered secrets #1356 (@simonswine)
v1.3.2
Bug Fixes
- [BUGFIX] Fixed an issue where the query-frontend would mangle start/end time ranges on searches which included the ingesters #1295 (@joe-elliott)
v1.3.1
v1.3.0
Breaking changes
This release updates OpenTelemetry libraries version to v0.40.0, and with that, it updates OTLP gRPC's default listening port from the legacy 55680 to the new 4317. There are two main routes to avoid downtime: configuring the receiver to listen on the old port 55680 and/or pushing traces to both ports simultaneously until the rollout is complete.
As part of adding support for full backend search, a search config parameter has been renamed from query_frontend.search.max_result_limit to query_frontend.search.default_result_limit.
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] BREAKING CHANGE Moved querier.search_max_result_limit and querier.search_default_result_limit to query_frontend.search.max_result_limit and query_frontend.search.default_result_limit #1174.
- [CHANGE] BREAKING CHANGE Remove deprecated ingester gRPC endpoint and data encoding. The current data encoding was introduced in v1.0. If running earlier versions, first upgrade to v1.0 through v1.2 and allow time for all blocks to be switched to the "v1" data encoding. #1215 (@mdisibio)
New Features and Enhancements
- [FEATURE]: Add support for inline environments. #1184 (@irizzant)
- [FEATURE] Added support for full backend search. #1174 (@joe-elliott)
- [ENHANCEMENT] Expose upto parameter on hedged requests for each backend with hedge_requests_up_to. #1085 (@joe-elliott)
- [ENHANCEMENT] Search: drop use of TagCache, extract tags and tag values on-demand #1068 (@kvrhdn)
- [ENHANCEMENT] Jsonnet: add $._config.namespace to filter by namespace in cortex metrics #1098 (@mapno)
- [ENHANCEMENT] Add middleware to compress frontend HTTP responses with gzip if requested #1080 (@kvrhdn, @zalegrala)
- [ENHANCEMENT] Allow query disablement in vulture #1117 (@zalegrala)
- [ENHANCEMENT] Improve memory efficiency of compaction and block cutting. #1121 #1130 (@joe-elliott)
- [ENHANCEMENT] Include metrics for configured limit overrides and defaults: tempo_limits_overrides, tempo_limits_defaults #1089 (@zalegrala)
- [ENHANCEMENT] Add Envoy Proxy panel to Tempo / Writes dashboard #1137 (@kvrhdn)
- [ENHANCEMENT] Reduce compactionCycle to improve performance in large multitenant environments #1145 (@joe-elliott)
- [ENHANCEMENT] Added max_time_per_tenant to allow for independently configuring polling and compaction cycle. #1145 (@joe-elliott)
- [ENHANCEMENT] Add tempodb_compaction_outstanding_blocks metric to measure compaction load #1143 (@mapno)
- [ENHANCEMENT] Update mixin to use new backend metric #1151 (@zalegrala)
- [ENHANCEMENT] Make TempoIngesterFlushesFailing alert more actionable #1157 (@dannykopping)
- [ENHANCEMENT] Switch open-telemetry/opentelemetry-collector to grafana/opentelemetry-collector fork, update it to 0.40.0 and add missing dependencies due to the change #1142 (@tete17)
- [ENHANCEMENT] Allow environment variables for Azure storage credentials #1147 (@zalegrala)
- [ENHANCEMENT] jsonnet: set rollingUpdate.maxSurge to 3 for distributor, frontend and queriers #1164 (@kvrhdn)
- [ENHANCEMENT] Reduce search data file sizes by optimizing contents #1165 (@mdisibio)
- [ENHANCEMENT] Add tempo_ingester_live_traces metric #1170 (@mdisibio)
- [ENHANCEMENT] Update compactor ring to automatically forget unhealthy entries #1178 (@mdisibio)
- [ENHANCEMENT] Added the ability to pass ISO8601 date/times for start/end date to tempo-cli query api search #1208 (@joe-elliott)
- [ENHANCEMENT] Prevent writes to large traces even after flushing to disk #1199 (@mdisibio)
Bug Fixes
- [BUGFIX] Add process name to vulture traces to work around display issues #1127 (@mdisibio)
- [BUGFIX] Fixed issue where compaction sometimes dropped spans. #1130 (@joe-elliott)
- [BUGFIX] Ensure that the admin client jsonnet has correct S3 bucket property. (@hedss)
- [BUGFIX] Publish tenant index age correctly for tenant index writers. #1146 (@joe-elliott)
- [BUGFIX] Ingester startup panic slice bounds out of range #1195 (@mdisibio)
Other Changes
- [CHANGE] Search: Add new per-tenant limit max_bytes_per_tag_values_query to limit the size of tag-values response. #1068 (@annanay25)
- [CHANGE] Reduce MaxSearchBytesPerTrace ingester.max-search-bytes-per-trace default to 5KB #1129 (@annanay25)
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] Remove deprecated method Push from tempopb.Pusher #1173 (@kvrhdn)
- [CHANGE] Upgrade cristalhq/hedgedhttp from v0.6.0 to v0.7.0 #1159 (@cristaloleg)
- [CHANGE] Export trace id constant in api package #1176
- [CHANGE] GRPC 1.33.3 => 1.38.0 broke compatibility with gogoproto.customtype. Enforce the use of gogoproto marshalling/unmarshalling for Tempo, Cortex & Jaeger structs. #1186 (@annanay25)
v1.3.0-rc.0
Breaking changes
This release updates OpenTelemetry libraries version to v0.40.0, and with that, it updates OTLP gRPC's default listening port from the legacy 55680 to the new 4317. There are two main routes to avoid downtime: configuring the receiver to listen on the old port 55680 and/or pushing traces to both ports simultaneously until the rollout is complete.
As part of adding support for full backend search, a search config parameter has been renamed from query_frontend.search.max_result_limit to query_frontend.search.default_result_limit.
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] BREAKING CHANGE Moved querier.search_max_result_limit and querier.search_default_result_limit to query_frontend.search.max_result_limit and query_frontend.search.default_result_limit #1174.
New Features and Enhancements
- [FEATURE]: Add support for inline environments. #1184 (@irizzant)
- [FEATURE] Added support for full backend search. #1174 (@joe-elliott)
- [ENHANCEMENT] Expose upto parameter on hedged requests for each backend with hedge_requests_up_to. #1085 (@joe-elliott)
- [ENHANCEMENT] Search: drop use of TagCache, extract tags and tag values on-demand #1068 (@kvrhdn)
- [ENHANCEMENT] Jsonnet: add $._config.namespace to filter by namespace in cortex metrics #1098 (@mapno)
- [ENHANCEMENT] Add middleware to compress frontend HTTP responses with gzip if requested #1080 (@kvrhdn, @zalegrala)
- [ENHANCEMENT] Allow query disablement in vulture #1117 (@zalegrala)
- [ENHANCEMENT] Improve memory efficiency of compaction and block cutting. #1121 #1130 (@joe-elliott)
- [ENHANCEMENT] Include metrics for configured limit overrides and defaults: tempo_limits_overrides, tempo_limits_defaults #1089 (@zalegrala)
- [ENHANCEMENT] Add Envoy Proxy panel to Tempo / Writes dashboard #1137 (@kvrhdn)
- [ENHANCEMENT] Reduce compactionCycle to improve performance in large multitenant environments #1145 (@joe-elliott)
- [ENHANCEMENT] Added max_time_per_tenant to allow for independently configuring polling and compaction cycle. #1145 (@joe-elliott)
- [ENHANCEMENT] Add tempodb_compaction_outstanding_blocks metric to measure compaction load #1143 (@mapno)
- [ENHANCEMENT] Update mixin to use new backend metric #1151 (@zalegrala)
- [ENHANCEMENT] Make TempoIngesterFlushesFailing alert more actionable #1157 (@dannykopping)
- [ENHANCEMENT] Switch open-telemetry/opentelemetry-collector to grafana/opentelemetry-collector fork, update it to 0.40.0 and add missing dependencies due to the change #1142 (@tete17)
- [ENHANCEMENT] Allow environment variables for Azure storage credentials #1147 (@zalegrala)
- [ENHANCEMENT] jsonnet: set rollingUpdate.maxSurge to 3 for distributor, frontend and queriers #1164 (@kvrhdn)
- [ENHANCEMENT] Reduce search data file sizes by optimizing contents #1165 (@mdisibio)
- [ENHANCEMENT] Add tempo_ingester_live_traces metric #1170 (@mdisibio)
- [ENHANCEMENT] Update compactor ring to automatically forget unhealthy entries #1178 (@mdisibio)
- [ENHANCEMENT] Added the ability to pass ISO8601 date/times for start/end date to tempo-cli query api search #1208 (@joe-elliott)
- [ENHANCEMENT] Prevent writes to large traces even after flushing to disk #1199 (@mdisibio)
Bug Fixes
- [BUGFIX] Add process name to vulture traces to work around display issues #1127 (@mdisibio)
- [BUGFIX] Fixed issue where compaction sometimes dropped spans. #1130 (@joe-elliott)
- [BUGFIX] Ensure that the admin client jsonnet has correct S3 bucket property. (@hedss)
- [BUGFIX] Publish tenant index age correctly for tenant index writers. #1146 (@joe-elliott)
- [BUGFIX] Ingester startup panic slice bounds out of range #1195 (@mdisibio)
Other Changes
- [CHANGE] Search: Add new per-tenant limit max_bytes_per_tag_values_query to limit the size of tag-values response. #1068 (@annanay25)
- [CHANGE] Reduce MaxSearchBytesPerTrace ingester.max-search-bytes-per-trace default to 5KB #1129 (@annanay25)
- [CHANGE] BREAKING CHANGE The OTEL GRPC receiver's default port changed from 55680 to 4317. #1142 (@tete17)
- [CHANGE] Remove deprecated method Push from tempopb.Pusher #1173 (@kvrhdn)
- [CHANGE] Upgrade cristalhq/hedgedhttp from v0.6.0 to v0.7.0 #1159 (@cristaloleg)
- [CHANGE] Export trace id constant in api package #1176
- [CHANGE] GRPC 1.33.3 => 1.38.0 broke compatibility with gogoproto.customtype. Enforce the use of gogoproto marshalling/unmarshalling for Tempo, Cortex & Jaeger structs. #1186 (@annanay25)
- [CHANGE] BREAKING CHANGE Remove deprecated ingester gRPC endpoint and data encoding. The current data encoding was introduced in v1.0. If running earlier versions, first upgrade to v1.0 through v1.2 and allow time for all blocks to be switched to the "v1" data encoding. #1215 (@mdisibio)
v1.2.1
This patch contains two important bug fixes and is recommended for all users running v1.2.0.
Bug Fixes
- [BUGFIX] Fix defaults for MaxBytesPerTrace (ingester.max-bytes-per-trace) and MaxSearchBytesPerTrace (ingester.max-search-bytes-per-trace) #1109 (@BitProcessor)
- [BUGFIX] Ignore empty objects during compaction #1113 (@mdisibio)
v1.2.0
Breaking Changes
This release contains a number of small breaking changes. They will likely have no impact on your deployment, but it should be noted that due to a change in the API between the query-frontend and querier there may be a temporary read outage during deployment.
- [CHANGE] BREAKING CHANGE Drop support for v0 and v1 blocks. See 1.1 changelog for details #919 (@joe-elliott)
- [CHANGE] BREAKING CHANGE Consolidate status information onto /status endpoint #952 (@zalegrala)
  The following endpoints moved.
    /runtime_config moved to /status/runtime_config
    /config moved to /status/config
    /services moved to /status/services
- [CHANGE] BREAKING CHANGE Change ingester metric ingester_bytes_metric_total in favor of ingester_bytes_received_total #979 (@mapno)
- [CHANGE] Renamed CLI flag from --storage.trace.maintenance-cycle to --storage.trace.blocklist_poll. This is a BREAKING CHANGE #897 (@mritunjaysharma394)
- [CHANGE] BREAKING CHANGE Support partial results from failed block queries #1007 (@mapno)
  Querier GET /querier/api/traces/<traceid> response's body has been modified to return tempopb.TraceByIDResponse instead of simply tempopb.Trace. This will cause a disruption of the read path during rollout of the change.
- [CHANGE] BREAKING CHANGE Change the metrics name from cortex_runtime_config_last_reload_successful to tempo_runtime_config_last_reload_successful #945 (@kavirajk)
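Scripts or health checks that still call the pre-1.2 status paths need updating after the /status consolidation above. A minimal sketch of such a rewrite follows; the helper name rewrite_status_path is hypothetical, and only the three moved paths listed in the breaking changes are covered.

```python
# Old path -> new path, taken from the list of moved endpoints.
MOVED_ENDPOINTS = {
    "/runtime_config": "/status/runtime_config",
    "/config": "/status/config",
    "/services": "/status/services",
}


def rewrite_status_path(path: str) -> str:
    """Map a pre-1.2 status path to its new location, keeping any query string."""
    base, separator, query = path.partition("?")
    return MOVED_ENDPOINTS.get(base, base) + separator + query


updated = [rewrite_status_path(p) for p in ("/config", "/services", "/ready")]
```

Paths that did not move, and paths already under /status, pass through unchanged, so the helper is safe to apply to a mixed list.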
New Features and Enhancements
- [FEATURE] Add ability to search ingesters for traces #806 (@mdisibio @kvrhdn @annanay25)
- [FEATURE] Add runtime config handler #936 (@mapno)
- [FEATURE] Add ScalableSingleBinary operational run mode #1004 (@zalegrala)
- [ENHANCEMENT] Added "query blocks" cli option. #876 (@joe-elliott)
- [ENHANCEMENT] Added "search blocks" cli option. #972 (@joe-elliott)
- [ENHANCEMENT] Added traceid to trace too large message. #888 (@mritunjaysharma394)
- [ENHANCEMENT] Add support to tempo workloads to overrides from single configmap in microservice mode. #896 (@kavirajk)
- [ENHANCEMENT] Updated config defaults to better capture operational knowledge. #913 (@joe-elliott)
    ingester:
      trace_idle_period: 30s => 10s  # reduce ingester memory requirements with little impact on querying
      flush_check_period: 30s => 10s
    query_frontend:
      query_shards: 2 => 20  # will massively improve performance on large installs
    storage:
      trace:
        wal:
          encoding: none => snappy  # snappy has been tested thoroughly and ready for production use
        block:
          bloom_filter_false_positive: .05 => .01  # will increase total bloom filter size but improve query performance
          bloom_filter_shard_size_bytes: 256KiB => 100 KiB  # will improve query performance
    compactor:
      compaction:
        chunk_size_bytes: 10 MiB => 5 MiB  # will reduce compactor memory needs
        compaction_window: 4h => 1h  # will allow more compactors to participate in compaction without substantially increasing blocks
- [ENHANCEMENT] Make s3 backend readError logic more robust #905 (@wei840222)
- [ENHANCEMENT] Add gen index and gen bloom commands to tempo-cli. #903 (@annanay25)
- [ENHANCEMENT] Implement trace comparison in Vulture #904 (@zalegrala)
- [ENHANCEMENT] Compression updates: Added s2, improved snappy performance #961 (@joe-elliott)
- [ENHANCEMENT] Add support for vulture sending long running traces #951 (@zalegrala)
- [ENHANCEMENT] Shard tenant index creation by tenant and add functionality to handle stale indexes. #1005 (@joe-elliott)
- [ENHANCEMENT] Support partial results from failed block queries #1007 (@mapno)
- [ENHANCEMENT] Add new metric tempo_distributor_push_duration_seconds #1027 (@zalegrala)
- [ENHANCEMENT] Add query parameter to show the default config values and the difference between the current values and the defaults. #1045 (@MichelHollands)
- [ENHANCEMENT] Adding metrics around ingester flush retries #1049 (@dannykopping)
- [ENHANCEMENT] Performance: More efficient distributor batching #1075 (@joe-elliott)
- [ENHANCEMENT] Include tempo-cli in the release #1086 (@zalegrala)
Bug Fixes
- [BUGFIX] Update port spec for GCS docker-compose example #869 (@zalegrala)
- [BUGFIX] Fix "magic number" errors and other block mishandling when an ingester forcefully shuts down #937 (@mdisibio)
- [BUGFIX] Fix compactor memory leak #806 (@mdisibio)
- [BUGFIX] Set span's tag span.kind to client in query-frontend #975 (@mapno)
- [BUGFIX] Fixes tempodb_backend_hedged_roundtrips_total to correctly count hedged roundtrips. #1079 (@joe-elliott)
- [BUGFIX] Update go-kit logger package to remove spurious debug logs #1094 (@bboreham)
Other Changes
- [CHANGE] Update jsonnet alerts and recording rules to use job_selectors and cluster_selectors for configurable unique identifier labels #935 (@kevinschoonover)
- [CHANGE] Add troubleshooting language to config for server.grpc_server_max_recv_msg_size and server.grpc_server_max_send_msg_size when handling large traces #1023 (@thejosephstevens)
v1.2.0-rc.1
Breaking Changes
This release contains a number of small breaking changes. They will likely have no impact on your deployment, but it should be noted that due to a change in the API between the query-frontend and querier there may be a temporary read outage during deployment.
- [CHANGE] BREAKING CHANGE Drop support for v0 and v1 blocks. See 1.1 changelog for details #919 (@joe-elliott)
- [CHANGE] BREAKING CHANGE Consolidate status information onto /status endpoint #952 (@zalegrala)
  The following endpoints moved.
    /runtime_config moved to /status/runtime_config
    /config moved to /status/config
    /services moved to /status/services
- [CHANGE] BREAKING CHANGE Change ingester metric ingester_bytes_metric_total in favor of ingester_bytes_received_total #979 (@mapno)
- [CHANGE] Renamed CLI flag from --storage.trace.maintenance-cycle to --storage.trace.blocklist_poll. This is a BREAKING CHANGE #897 (@mritunjaysharma394)
- [CHANGE] BREAKING CHANGE Support partial results from failed block queries #1007 (@mapno)
  Querier GET /querier/api/traces/<traceid> response's body has been modified to return tempopb.TraceByIDResponse instead of simply tempopb.Trace. This will cause a disruption of the read path during rollout of the change.
- [CHANGE] BREAKING CHANGE Change the metrics name from cortex_runtime_config_last_reload_successful to tempo_runtime_config_last_reload_successful #945 (@kavirajk)
New Features and Enhancements
- [FEATURE] Add ability to search ingesters for traces #806 (@mdisibio @kvrhdn @annanay25)
- [FEATURE] Add runtime config handler #936 (@mapno)
- [FEATURE] Add ScalableSingleBinary operational run mode #1004 (@zalegrala)
- [ENHANCEMENT] Added "query blocks" cli option. #876 (@joe-elliott)
- [ENHANCEMENT] Added "search blocks" cli option. #972 (@joe-elliott)
- [ENHANCEMENT] Added traceid to trace too large message. #888 (@mritunjaysharma394)
- [ENHANCEMENT] Add support to tempo workloads to overrides from single configmap in microservice mode. #896 (@kavirajk)
- [ENHANCEMENT] Updated config defaults to better capture operational knowledge. #913 (@joe-elliott)
    ingester:
      trace_idle_period: 30s => 10s  # reduce ingester memory requirements with little impact on querying
      flush_check_period: 30s => 10s
    query_frontend:
      query_shards: 2 => 20  # will massively improve performance on large installs
    storage:
      trace:
        wal:
          encoding: none => snappy  # snappy has been tested thoroughly and ready for production use
        block:
          bloom_filter_false_positive: .05 => .01  # will increase total bloom filter size but improve query performance
          bloom_filter_shard_size_bytes: 256KiB => 100 KiB  # will improve query performance
    compactor:
      compaction:
        chunk_size_bytes: 10 MiB => 5 MiB  # will reduce compactor memory needs
        compaction_window: 4h => 1h  # will allow more compactors to participate in compaction without substantially increasing blocks
- [ENHANCEMENT] Make s3 backend readError logic more robust #905 (@wei840222)
- [ENHANCEMENT] Add gen index and gen bloom commands to tempo-cli. #903 (@annanay25)
- [ENHANCEMENT] Implement trace comparison in Vulture #904 (@zalegrala)
- [ENHANCEMENT] Compression updates: Added s2, improved snappy performance #961 (@joe-elliott)
- [ENHANCEMENT] Add support for vulture sending long running traces #951 (@zalegrala)
- [ENHANCEMENT] Shard tenant index creation by tenant and add functionality to handle stale indexes. #1005 (@joe-elliott)
- [ENHANCEMENT] Support partial results from failed block queries #1007 (@mapno)
- [ENHANCEMENT] Add new metric tempo_distributor_push_duration_seconds #1027 (@zalegrala)
- [ENHANCEMENT] Add query parameter to show the default config values and the difference between the current values and the defaults. #1045 (@MichelHollands)
- [ENHANCEMENT] Adding metrics around ingester flush retries #1049 (@dannykopping)
- [ENHANCEMENT] Performance: More efficient distributor batching #1075 (@joe-elliott)
- [ENHANCEMENT] Include tempo-cli in the release #1086 (@zalegrala)
Bug Fixes
- [BUGFIX] Update port spec for GCS docker-compose example #869 (@zalegrala)
- [BUGFIX] Fix "magic number" errors and other block mishandling when an ingester forcefully shuts down #937 (@mdisibio)
- [BUGFIX] Fix compactor memory leak #806 (@mdisibio)
- [BUGFIX] Set span's tag span.kind to client in query-frontend #975 (@mapno)
- [BUGFIX] Fixes tempodb_backend_hedged_roundtrips_total to correctly count hedged roundtrips. #1079 (@joe-elliott)
- [BUGFIX] Update go-kit logger package to remove spurious debug logs #1094 (@bboreham)
Other Changes
- [CHANGE] Update jsonnet alerts and recording rules to use job_selectors and cluster_selectors for configurable unique identifier labels #935 (@kevinschoonover)
- [CHANGE] Add troubleshooting language to config for server.grpc_server_max_recv_msg_size and server.grpc_server_max_send_msg_size when handling large traces #1023 (@thejosephstevens)