Releases: cortexproject/cortex
v1.21.1
1.21.1 2026-06-04
- [BUGFIX] gRPC: Fix panic when `grpc_compression` is set to `snappy` on ingester client or store-gateway client configurations. #7459
- [BUGFIX] Config: Mask Swift, etcd, Redis, and HTTP basic-auth credentials on the `/config` endpoint. #7473
- [BUGFIX] Memberlist: Drop incoming TCP transport packets when digest verification fails, preventing corrupted payloads from being forwarded. #7474
- [BUGFIX] Ingester: Reject `PushStream` requests where the per-message `TenantID` does not match the authenticated caller, and add HMAC-SHA256 stream authentication for `PushStream` via `-distributor.sign-write-requests-keys`. #7475
- [BUGFIX] Security: Fix stored XSS vulnerability in Alertmanager and Store Gateway status pages by replacing `text/template` with `html/template`. #7512
- [BUGFIX] Security: Limit decompressed gzip output in `ParseProtoReader` and OTLP ingestion path. The decompressed body is now capped by `-distributor.otlp-max-recv-msg-size`. #7515
- [BUGFIX] Memberlist: Add `-memberlist.packet-read-timeout`, `-memberlist.max-packet-size`, and `-memberlist.max-concurrent-connections` flags to bound inbound gossip TCP connections, preventing slow-read, OOM, and connection-flood attacks on the gossip port. #7518
- [BUGFIX] Distributor: Fix a panic (`slice bounds out of range`) in the stream push path when the context deadline expires while the worker goroutine is still marshalling a `WriteRequest`. #7541
- [BUGFIX] Distributor: Add `WrappedHistogram` with configurable size limit (`-validation.max-native-histogram-size-bytes`, default 16 KB) to cap native histogram protobuf size before unmarshalling, preventing memory amplification attacks via packed varint deltas. #7570
Full Changelog: v1.21.0...v1.21.1
v1.21.1-rc.0
What's Changed
- [BUGFIX] gRPC: Fix panic when `grpc_compression` is set to `snappy` on ingester client or store-gateway client configurations. #7459
- [BUGFIX] Config: Mask Swift, etcd, Redis, and HTTP basic-auth credentials on the `/config` endpoint. #7473
- [BUGFIX] Memberlist: Drop incoming TCP transport packets when digest verification fails, preventing corrupted payloads from being forwarded. #7474
- [BUGFIX] Ingester: Reject `PushStream` requests where the per-message `TenantID` does not match the authenticated caller, and add HMAC-SHA256 stream authentication for `PushStream` via `-distributor.sign-write-requests-keys`. #7475
- [BUGFIX] Security: Fix stored XSS vulnerability in Alertmanager and Store Gateway status pages by replacing `text/template` with `html/template`. #7512
- [BUGFIX] Security: Limit decompressed gzip output in `ParseProtoReader` and OTLP ingestion path. The decompressed body is now capped by `-distributor.otlp-max-recv-msg-size`. #7515
- [BUGFIX] Memberlist: Add `-memberlist.packet-read-timeout`, `-memberlist.max-packet-size`, and `-memberlist.max-concurrent-connections` flags to bound inbound gossip TCP connections, preventing slow-read, OOM, and connection-flood attacks on the gossip port. #7518
- [BUGFIX] Distributor: Fix a panic (`slice bounds out of range`) in the stream push path when the context deadline expires while the worker goroutine is still marshalling a `WriteRequest`. #7541
- [BUGFIX] Distributor: Add `WrappedHistogram` with configurable size limit (`-validation.max-native-histogram-size-bytes`, default 16 KB) to cap native histogram protobuf size before unmarshalling, preventing memory amplification attacks via packed varint deltas. #7570
Full Changelog: v1.21.0...v1.21.1-rc.0
v1.21.0
This release contains 164 contributions from 29 contributors. We also have 12 new contributors. Thank you all for the contributions!
Some notable changes and improvements in this release are:
- New Parquet mode for Store Gateway
- Configurable OTLP metric suffixes via `-distributor.otlp.add-metric-suffixes`
- Multiple PRW2 bug fixes for data corruption and panics
- Graduate Ruler API, Alertmanager API/sharding, tenant federation, FIFO/Redis cache, instance limits, and memcached DNS-based service discovery from experimental support
- New Overrides API module to control tenant limits via API
- HATracker memberlist experimental support
- Tenant federation partial response experimental support
- Alertmanager upgraded to v0.31.1 with IncidentIO and Mattermost integrations
- Bucket index enabled by default
What's Changed
- [CHANGE] Ruler: Graduate Ruler API from experimental. #7312
  - Flag: Renamed `-experimental.ruler.enable-api` to `-ruler.enable-api`. The old flag is kept as deprecated.
  - Ruler API is no longer marked as experimental.
- [CHANGE] Alertmanager: Graduate Alertmanager API and sharding from experimental. #7315
  - Flag: Renamed `-experimental.alertmanager.enable-api` to `-alertmanager.enable-api`. The old flag is kept as deprecated.
  - Alertmanager sharding is no longer marked as experimental.
- [CHANGE] Blocks storage: Bucket index is now enabled by default. Disabling the bucket index (`-blocks-storage.bucket-store.bucket-index.enabled=false`) is not recommended for production. #7259
- [CHANGE] Users Scanner: Rename user index update configuration. #7180
  - Flag: Renamed `-*.users-scanner.user-index.cleanup-interval` to `-*.users-scanner.user-index.update-interval`.
  - Config: Renamed `clean_up_interval` to `update_interval` within the `users_scanner` configuration block.
- [CHANGE] Querier: Refactored parquet cache configuration naming. #7146
  - Metrics: Renamed `cortex_parquet_queryable_cache_*` to `cortex_parquet_cache_*`.
  - Flags: Renamed `-querier.parquet-queryable-shard-cache-size` to `-querier.parquet-shard-cache-size` and `-querier.parquet-queryable-shard-cache-ttl` to `-querier.parquet-shard-cache-ttl`.
  - Config: Renamed `parquet_queryable_shard_cache_size` to `parquet_shard_cache_size` and `parquet_queryable_shard_cache_ttl` to `parquet_shard_cache_ttl`.
- [FEATURE] Overrides: Add new Overrides API component and rename old overrides module to `overrides-configs`. #6975
- [FEATURE] HATracker: Add experimental support for `memberlist` and `multi` as a KV store backend. #7284
- [FEATURE] Distributor: Add `-distributor.otlp.add-metric-suffixes` flag. If true, suffixes will be added to the metrics for name normalization. #7286
- [FEATURE] StoreGateway: Introduces a new parquet mode. #7046
- [FEATURE] StoreGateway: Add a parquet shard cache to parquet mode. #7166
- [FEATURE] Distributor: Add a per-tenant flag `-distributor.enable-type-and-unit-labels` that enables adding `__unit__` and `__type__` labels for remote write v2 and OTLP requests. This is a breaking change; the `-distributor.otlp.enable-type-and-unit-labels` flag is now deprecated, operates as a no-op, and has been consolidated into this new flag. #7077
- [FEATURE] Querier: Add experimental projection pushdown support in Parquet Queryable. #7152
- [FEATURE] Ingester: Add experimental active series queried metric. #7173
- [FEATURE] Update prometheus Alertmanager version to v0.31.1 and add new integration to IncidentIO and Mattermost. #7092 #7267
- [FEATURE] Tenant Federation: Add experimental support for partial responses using the `-tenant-federation.allow-partial-data` flag. When enabled, failures from individual tenants during a federated query are treated as warnings, allowing results from successful tenants to be returned. #7232
- [FEATURE] Alertmanager: Add `-alertmanager.disable-replica-set-extension` flag to limit blast radius during config corruption incidents. #7153
- [ENHANCEMENT] Tenant Federation: Add a local cache to regex resolver. #7363
- [ENHANCEMENT] Distributor: Add `cortex_distributor_push_requests_total` metric to track the number of push requests by type. #7239
- [ENHANCEMENT] Querier: Add `-querier.store-gateway-series-batch-size` flag to configure the maximum number of series to be batched in a single gRPC response message from Store Gateways. #7203
- [ENHANCEMENT] HATracker: Add `-distributor.ha-tracker.enable-startup-sync` flag. If enabled, the ha-tracker fetches all tracked keys on startup to populate the local cache. #7213
- [ENHANCEMENT] Distributor: Add validation to ensure remote write v2 requests contain at least one sample or histogram. #7201
- [ENHANCEMENT] Ingester: Add support for ingesting Native Histogram with Custom Buckets. #7191
- [ENHANCEMENT] Ingester: Optimize labels out-of-order (ooo) check by allowing the iteration to terminate immediately upon finding the first unsorted label. #7186
- [ENHANCEMENT] Distributor: Skip attaching `__unit__` and `__type__` labels when `-distributor.enable-type-and-unit-labels` is enabled, as these are appended from metadata. #7145
- [ENHANCEMENT] Distributor: Add `cortex_distributor_ingester_push_timeouts_total` metric to track the number of push requests to ingesters that were canceled due to timeout. #7155 #7229
- [ENHANCEMENT] StoreGateway: Add tracings to parquet mode. #7125
- [ENHANCEMENT] Querier: Add a `-querier.parquet-queryable-shard-cache-ttl` flag to add TTL to parquet shard cache. #7098
- [ENHANCEMENT] Ingester: Add `enable_matcher_optimization` config to apply low selectivity matchers lazily. #7063
- [ENHANCEMENT] Distributor: Add a label references validation for remote write v2 request. #7074
- [ENHANCEMENT] Distributor: Add count, spans, and buckets validations for native histogram. #7072
- [ENHANCEMENT] Alertmanager/Ruler: Introduce a user scanner to reduce the number of list calls to object storage. #6999
- [ENHANCEMENT] Ruler: Add DecodingConcurrency config flag for Thanos Engine. #7118
- [ENHANCEMENT] Query Frontend: Add query priority based on operation. #7128
- [ENHANCEMENT] Compactor: Avoid double compaction by cleaning partition files in 2 cycles. #7130 #7209 #7257
- [ENHANCEMENT] Distributor: Optimize memory usage by recycling v2 requests. #7131
- [ENHANCEMENT] Compactor: Avoid double compaction by not filtering delete blocks on real time when using bucketIndex lister. #7156
- [ENHANCEMENT] Upgrade to go 1.25.8 #7164 #7340
- [ENHANCEMENT] Upgraded container base images to `alpine:3.23`. #7163
- [ENHANCEMENT] Ingester: Instrument Ingester CPU profile with userID for read APIs. #7184
- [ENHANCEMENT] Ingester: Add fetch timeout for Ingester expanded postings cache. #7185
- [ENHANCEMENT] Ingester: Add feature flag to collect metrics of how expensive an unoptimized regex matcher is and new limits to protect Ingester query path against expensive unoptimized regex matchers. #7194 #7210
- [ENHANCEMENT] Querier: Add active API requests tracker logging to help with OOMKill troubleshooting. #7216
- [ENHANCEMENT] Compactor: Add partition group creation time to visit marker. #7217
- [ENHANCEMENT] Compactor: Add concurrency for partition cleanup and mark block for deletion #7246
- [ENHANCEMENT] Distributor: Validate metric name before removing empty labels. #7253
- [ENHANCEMENT] Ruler/Ingester: Propagate append hints to discard out of order samples on Ingester #7226
- [ENHANCEMENT] Make cortex_ingester_tsdb_sample_ooo_delta metric per-tenant #7278
- [ENHANCEMENT] Distributor: Add dimension `nhcb` to keep track of nhcb samples in `cortex_distributor_received_samples_total` and `cortex_distributor_samples_in_total` metrics.
- [ENHANCEMENT] Distributor: Add `-distributor.accept-unknown-remote-write-content-type` flag. When enabled, requests with unknown or invalid Content-Type header are treated as remote write v1 instead of returning 415 Unsupported Media Type. Default is false. #7293
- [ENHANCEMENT] Ingester: Added `cortex_ingester_ingested_histogram_buckets` metric to track number of histogram buckets ingested per user. #7297
- [ENHANCEMENT] Ring: Reuse timers in lifecycler and backoff loops to reduce allocations. #7270
- [ENHANCEMENT] Ring/KV: Reuse timers in DynamoDB watch loops to avoid per-poll allocations. #7266
- [ENHANCEMENT] Ring/KV: Reuse timers in memberlist client to reduce allocations. #7285
- [ENHANCEMENT] PromQL: Add `holt_winters` backwards compatibility as alias for `double_exponential_smoothing`. #7223
- [ENHANCEMENT] Query Frontend: Add logical plan fragmentation for distributed query execution. #7018
- [ENHANCEMENT] Parquet: Support sharded parquet files in parquet converter and queryable. #7189
- [ENHANCEMENT] Compactor: Add graceful period for compaction groups to prevent compacting recently written blocks. #7182
- [ENHANCEMENT] Query Engine: Add projection pushdown optimizer for improved query performance. #7141
- [ENHANCEMENT] Distributor: Optimize memory allocations by pooling PreallocWriteRequestV2 and preserving the capacity of the Symbols slice during resets. #7404
- [ENHANCEMENT] Ruler: Allow ExternalPusher and ExternalQueryable to be specified separately. #7224
- [BUGFIX] Distributor: Add bounds checking for symbol references in Remote Write V2 requests to prevent panics when UnitRef or HelpRef exceed the symbols array length. #7290
- [BUGFIX] Distributor: If remote write v2 is disabled, explicitly return HTTP 415 (Unsupported Media Type) for Remote Write V2 requests instead of attempting to parse them as V1. #7238
- [BUGFIX] Ring: Change DynamoDB KV to retry indefinitely for WatchKey. #7088
- [BUGFIX] Ruler: Add XFunctions validation support. #7111
- [BUGFIX] Querier: propagate Prometheus info annotations in protobuf responses. #7132
- [BUGFIX] Scheduler: Fix memory leak by properly cleaning up query fragment registry. #7148
- [BUGFIX] Compactor: Add back deletion of partition group info file e...
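The Remote Write V2 bounds-checking fix listed above (#7290) concerns references such as `UnitRef` and `HelpRef`, which index into a shared symbols array. A minimal sketch of that kind of check follows. The `metadata` struct, `lookup`, and `resolve` are hypothetical stand-ins rather than the Cortex types; the point is only that each reference is validated against the slice length before it is used as an index.

```go
package main

import "fmt"

// metadata is a hypothetical stand-in for a Remote Write V2 metadata entry
// whose fields are indexes into a shared symbols table.
type metadata struct {
	HelpRef uint32
	UnitRef uint32
}

// lookup returns symbols[ref], or an error when ref is out of range,
// instead of panicking on an invalid index.
func lookup(symbols []string, ref uint32) (string, error) {
	if uint64(ref) >= uint64(len(symbols)) {
		return "", fmt.Errorf("symbol ref %d out of range (len %d)", ref, len(symbols))
	}
	return symbols[ref], nil
}

// resolve validates both references before returning either string.
func resolve(symbols []string, m metadata) (help, unit string, err error) {
	if help, err = lookup(symbols, m.HelpRef); err != nil {
		return "", "", err
	}
	if unit, err = lookup(symbols, m.UnitRef); err != nil {
		return "", "", err
	}
	return help, unit, nil
}

func main() {
	symbols := []string{"", "request latency", "seconds"}
	fmt.Println(resolve(symbols, metadata{HelpRef: 1, UnitRef: 2}))
	fmt.Println(resolve(symbols, metadata{HelpRef: 1, UnitRef: 9}))
}
```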
v1.21.0-rc.1
What's Changed
- [ENHANCEMENT] Tenant Federation: Add a local cache to regex resolver. #7363
- [BUGFIX] Memberlist: Skip nil values delivered by `WatchPrefix` when a key is deleted, preventing a panic in the HA tracker caused by a failed type assertion on a nil interface value. #7429
- [BUGFIX] Tenant Federation: Fix `unsupported character` error when `tenant-federation.regex-matcher-enabled` is enabled and the input regex matches 0 or 1 existing tenant. #7424
- [BUGFIX] KV store: Fix false-positive `status_code="500"` metrics for HA tracker CAS operations when using memberlist. #7408
- [BUGFIX] Fix nil when ingester_query_max_attempts > 1. #7369
- [BUGFIX] Alertmanager: Fix disappearing user config and state when ring is temporarily unreachable. #7372
- [BUGFIX] Fix memory leak in `ReuseWriteRequestV2` by explicitly clearing the `Symbols` backing array string pointers before returning the object to `sync.Pool`. #7373
- [BUGFIX] Querier: Fix queryWithRetry and labelsWithRetry returning (nil, nil) on cancelled context by propagating ctx.Err(). #7375
Full Changelog: v1.21.0-rc.0...v1.21.0-rc.1
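The `ReuseWriteRequestV2` fix (#7373) addresses a general Go pitfall: truncating a `[]string` with `s[:0]` keeps the old strings reachable through the backing array, so a pooled object can pin memory indefinitely. The sketch below shows the pattern of zeroing the elements before truncating and returning the object to a `sync.Pool`. The `request` type, `pool`, and `release` are hypothetical and only illustrate the idea.

```go
package main

import (
	"fmt"
	"sync"
)

// request is a hypothetical stand-in for a pooled write request.
type request struct {
	Symbols []string
}

var pool = sync.Pool{New: func() any { return &request{} }}

// release zeroes every element so the referenced strings can be garbage
// collected, then truncates the slice while keeping its capacity for reuse.
func release(r *request) {
	for i := range r.Symbols {
		r.Symbols[i] = ""
	}
	r.Symbols = r.Symbols[:0]
	pool.Put(r)
}

func main() {
	r := pool.Get().(*request)
	r.Symbols = append(r.Symbols, "job", "api", "instance")
	release(r)
	// Length is reset, capacity is kept, and no old strings remain behind it.
	fmt.Println(len(r.Symbols), cap(r.Symbols) >= 3)
}
```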
v1.21.0-rc.0
This release contains 164 contributions from 29 contributors. We also have 12 new contributors. Thank you all for the contributions!
Some notable changes and improvements in this release are:
- New Parquet mode for Store Gateway
- Configurable OTLP metric suffixes via `-distributor.otlp.add-metric-suffixes`
- Multiple PRW2 bug fixes for data corruption and panics
- Graduate Ruler API, Alertmanager API/sharding, tenant federation, FIFO/Redis cache, instance limits, and memcached DNS-based service discovery from experimental support
- New Overrides API module to control tenant limits via API
- HATracker memberlist experimental support
- Tenant federation partial response experimental support
- Alertmanager upgraded to v0.31.1 with IncidentIO and Mattermost integrations
- Bucket index enabled by default
What's Changed
- [CHANGE] Ruler: Graduate Ruler API from experimental. #7312
  - Flag: Renamed `-experimental.ruler.enable-api` to `-ruler.enable-api`. The old flag is kept as deprecated.
  - Ruler API is no longer marked as experimental.
- [CHANGE] Alertmanager: Graduate Alertmanager API and sharding from experimental. #7315
  - Flag: Renamed `-experimental.alertmanager.enable-api` to `-alertmanager.enable-api`. The old flag is kept as deprecated.
  - Alertmanager sharding is no longer marked as experimental.
- [CHANGE] Blocks storage: Bucket index is now enabled by default. Disabling the bucket index (`-blocks-storage.bucket-store.bucket-index.enabled=false`) is not recommended for production. #7259
- [CHANGE] Users Scanner: Rename user index update configuration. #7180
  - Flag: Renamed `-*.users-scanner.user-index.cleanup-interval` to `-*.users-scanner.user-index.update-interval`.
  - Config: Renamed `clean_up_interval` to `update_interval` within the `users_scanner` configuration block.
- [CHANGE] Querier: Refactored parquet cache configuration naming. #7146
  - Metrics: Renamed `cortex_parquet_queryable_cache_*` to `cortex_parquet_cache_*`.
  - Flags: Renamed `-querier.parquet-queryable-shard-cache-size` to `-querier.parquet-shard-cache-size` and `-querier.parquet-queryable-shard-cache-ttl` to `-querier.parquet-shard-cache-ttl`.
  - Config: Renamed `parquet_queryable_shard_cache_size` to `parquet_shard_cache_size` and `parquet_queryable_shard_cache_ttl` to `parquet_shard_cache_ttl`.
- [FEATURE] Overrides: Add new Overrides API component and rename old overrides module to `overrides-configs`. #6975
- [FEATURE] HATracker: Add experimental support for `memberlist` and `multi` as a KV store backend. #7284
- [FEATURE] Distributor: Add `-distributor.otlp.add-metric-suffixes` flag. If true, suffixes will be added to the metrics for name normalization. #7286
- [FEATURE] StoreGateway: Introduces a new parquet mode. #7046
- [FEATURE] StoreGateway: Add a parquet shard cache to parquet mode. #7166
- [FEATURE] Distributor: Add a per-tenant flag `-distributor.enable-type-and-unit-labels` that enables adding `__unit__` and `__type__` labels for remote write v2 and OTLP requests. This is a breaking change; the `-distributor.otlp.enable-type-and-unit-labels` flag is now deprecated, operates as a no-op, and has been consolidated into this new flag. #7077
- [FEATURE] Querier: Add experimental projection pushdown support in Parquet Queryable. #7152
- [FEATURE] Ingester: Add experimental active series queried metric. #7173
- [FEATURE] Update prometheus Alertmanager version to v0.31.1 and add new integration to IncidentIO and Mattermost. #7092 #7267
- [FEATURE] Tenant Federation: Add experimental support for partial responses using the `-tenant-federation.allow-partial-data` flag. When enabled, failures from individual tenants during a federated query are treated as warnings, allowing results from successful tenants to be returned. #7232
- [FEATURE] Alertmanager: Add `-alertmanager.disable-replica-set-extension` flag to limit blast radius during config corruption incidents. #7153
- [ENHANCEMENT] Distributor: Add `cortex_distributor_push_requests_total` metric to track the number of push requests by type. #7239
- [ENHANCEMENT] Querier: Add `-querier.store-gateway-series-batch-size` flag to configure the maximum number of series to be batched in a single gRPC response message from Store Gateways. #7203
- [ENHANCEMENT] HATracker: Add `-distributor.ha-tracker.enable-startup-sync` flag. If enabled, the ha-tracker fetches all tracked keys on startup to populate the local cache. #7213
- [ENHANCEMENT] Distributor: Add validation to ensure remote write v2 requests contain at least one sample or histogram. #7201
- [ENHANCEMENT] Ingester: Add support for ingesting Native Histogram with Custom Buckets. #7191
- [ENHANCEMENT] Ingester: Optimize labels out-of-order (ooo) check by allowing the iteration to terminate immediately upon finding the first unsorted label. #7186
- [ENHANCEMENT] Distributor: Skip attaching `__unit__` and `__type__` labels when `-distributor.enable-type-and-unit-labels` is enabled, as these are appended from metadata. #7145
- [ENHANCEMENT] Distributor: Add `cortex_distributor_ingester_push_timeouts_total` metric to track the number of push requests to ingesters that were canceled due to timeout. #7155 #7229
- [ENHANCEMENT] StoreGateway: Add tracings to parquet mode. #7125
- [ENHANCEMENT] Querier: Add a `-querier.parquet-queryable-shard-cache-ttl` flag to add TTL to parquet shard cache. #7098
- [ENHANCEMENT] Ingester: Add `enable_matcher_optimization` config to apply low selectivity matchers lazily. #7063
- [ENHANCEMENT] Distributor: Add a label references validation for remote write v2 request. #7074
- [ENHANCEMENT] Distributor: Add count, spans, and buckets validations for native histogram. #7072
- [ENHANCEMENT] Alertmanager/Ruler: Introduce a user scanner to reduce the number of list calls to object storage. #6999
- [ENHANCEMENT] Ruler: Add DecodingConcurrency config flag for Thanos Engine. #7118
- [ENHANCEMENT] Query Frontend: Add query priority based on operation. #7128
- [ENHANCEMENT] Compactor: Avoid double compaction by cleaning partition files in 2 cycles. #7130 #7209 #7257
- [ENHANCEMENT] Distributor: Optimize memory usage by recycling v2 requests. #7131
- [ENHANCEMENT] Compactor: Avoid double compaction by not filtering delete blocks on real time when using bucketIndex lister. #7156
- [ENHANCEMENT] Upgrade to go 1.25.8 #7164 #7340
- [ENHANCEMENT] Upgraded container base images to `alpine:3.23`. #7163
- [ENHANCEMENT] Ingester: Instrument Ingester CPU profile with userID for read APIs. #7184
- [ENHANCEMENT] Ingester: Add fetch timeout for Ingester expanded postings cache. #7185
- [ENHANCEMENT] Ingester: Add feature flag to collect metrics of how expensive an unoptimized regex matcher is and new limits to protect Ingester query path against expensive unoptimized regex matchers. #7194 #7210
- [ENHANCEMENT] Querier: Add active API requests tracker logging to help with OOMKill troubleshooting. #7216
- [ENHANCEMENT] Compactor: Add partition group creation time to visit marker. #7217
- [ENHANCEMENT] Compactor: Add concurrency for partition cleanup and mark block for deletion #7246
- [ENHANCEMENT] Distributor: Validate metric name before removing empty labels. #7253
- [ENHANCEMENT] Ruler/Ingester: Propagate append hints to discard out of order samples on Ingester #7226
- [ENHANCEMENT] Make cortex_ingester_tsdb_sample_ooo_delta metric per-tenant #7278
- [ENHANCEMENT] Distributor: Add dimension `nhcb` to keep track of nhcb samples in `cortex_distributor_received_samples_total` and `cortex_distributor_samples_in_total` metrics.
- [ENHANCEMENT] Distributor: Add `-distributor.accept-unknown-remote-write-content-type` flag. When enabled, requests with unknown or invalid Content-Type header are treated as remote write v1 instead of returning 415 Unsupported Media Type. Default is false. #7293
- [ENHANCEMENT] Ingester: Added `cortex_ingester_ingested_histogram_buckets` metric to track number of histogram buckets ingested per user. #7297
- [ENHANCEMENT] Ring: Reuse timers in lifecycler and backoff loops to reduce allocations. #7270
- [ENHANCEMENT] Ring/KV: Reuse timers in DynamoDB watch loops to avoid per-poll allocations. #7266
- [ENHANCEMENT] Ring/KV: Reuse timers in memberlist client to reduce allocations. #7285
- [ENHANCEMENT] PromQL: Add `holt_winters` backwards compatibility as alias for `double_exponential_smoothing`. #7223
- [ENHANCEMENT] Query Frontend: Add logical plan fragmentation for distributed query execution. #7018
- [ENHANCEMENT] Parquet: Support sharded parquet files in parquet converter and queryable. #7189
- [ENHANCEMENT] Compactor: Add graceful period for compaction groups to prevent compacting recently written blocks. #7182
- [ENHANCEMENT] Query Engine: Add projection pushdown optimizer for improved query performance. #7141
- [ENHANCEMENT] Ruler: Allow ExternalPusher and ExternalQueryable to be specified separately. #7224
- [BUGFIX] Distributor: Add bounds checking for symbol references in Remote Write V2 requests to prevent panics when UnitRef or HelpRef exceed the symbols array length. #7290
- [BUGFIX] Distributor: If remote write v2 is disabled, explicitly return HTTP 415 (Unsupported Media Type) for Remote Write V2 requests instead of attempting to parse them as V1. #7238
- [BUGFIX] Ring: Change DynamoDB KV to retry indefinitely for WatchKey. #7088
- [BUGFIX] Ruler: Add XFunctions validation support. #7111
- [BUGFIX] Querier: propagate Prometheus info annotations in protobuf responses. #7132
- [BUGFIX] Scheduler: Fix memory leak by properly cleaning up query fragment registry. #7148
- [BUGFIX] Compactor: Add back deletion of partition group info file even if not complete #7157
- [BUGFIX] Query Frontend: Add Native Histogram extraction logic in results cache #7167
- [BUGFIX] Alertmanager: Fix alertmanager reloading bug that removes user template files #7196
- [BUGFIX] Query Scheduler: I...
v1.20.1
What's Changed
- [BUGFIX] Distributor: Fix panic on health check failure when using stream push. #7116
Full Changelog: v1.20.0...v1.20.1
v1.20.0
Cortex 1.20.0 Release Notes
This release contains 371 contributions from 38 contributors. We also have 14 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
- Prometheus Remote Write 2.0 Support: Experimental support for Prometheus Remote Write 2.0 protocol.
- Parquet Format Support: Experimental Parquet based block storage. A new parquet converter service to convert TSDB blocks to parquet and querier to query parquet files.
- Query Federation with Regex Tenant Resolver: Introduce experimental regex tenant resolver allowing regex patterns in `X-Scope-OrgID` header via `-tenant-federation.regex-matcher-enabled` flag
- gRPC Stream Push between Distributor and Ingester: Experimental feature to use gRPC stream connections for push requests.
- More Native Histogram Support: Out-of-order native histogram ingestion support, per-tenant native histogram ingestion config, native histogram active series metrics and limits
- Resource-Based Monitor and Limiter: `ResourceMonitor` to collect CPU and Heap usage for Cortex and `ResourceBasedLimiter` in Ingesters and StoreGateways to protect the service from incoming requests when hitting limits
- UTF-8 Name: UTF-8 name support via `-name-validation-scheme` flag
What's Changed
- [CHANGE] StoreGateway/Alertmanager: Add default 5s connection timeout on client. #6603
- [CHANGE] Ingester: Remove EnableNativeHistograms config flag and instead gate keep through new per-tenant limit at ingestion. #6718
- [CHANGE] Validate the tenantID when using a single tenant resolver. #6727
- [CHANGE] Ring: Add zone label to ring_members metric. #6900
- [FEATURE] Distributor: Add an experimental `-distributor.otlp.enable-type-and-unit-labels` flag to add `__type__` and `__unit__` labels for OTLP metrics. #6969
- [FEATURE] Distributor: Add an experimental `-distributor.otlp.allow-delta-temporality` flag to ingest delta temporality otlp metrics. #6934
- [FEATURE] Query Frontend: Add dynamic interval size for query splitting. This is enabled by configuring experimental flags `querier.max-shards-per-query` and/or `querier.max-fetched-data-duration-per-query`. The split interval size is dynamically increased to maintain a number of shards and total duration fetched below the configured values. #6458
- [FEATURE] Querier/Ruler: Add `query_partial_data` and `rules_partial_data` limits to allow queries/rules to be evaluated with data from a single zone, if other zones are not available. #6526
- [FEATURE] Update prometheus alertmanager version to v0.28.0 and add new integration msteamsv2, jira, and rocketchat. #6590
- [FEATURE] Ingester/StoreGateway: Add `ResourceMonitor` module in Cortex, and add `ResourceBasedLimiter` in Ingesters and StoreGateways. #6674
- [FEATURE] Support Prometheus remote write 2.0. #6330
- [FEATURE] Ingester: Support out-of-order native histogram ingestion. It is automatically enabled when `-ingester.out-of-order-time-window > 0` and `-blocks-storage.tsdb.enable-native-histograms=true`. #6626 #6663
- [FEATURE] Ruler: Add support for percentage based sharding for rulers. #6680
- [FEATURE] Ruler: Add support for group labels. #6665
- [FEATURE] Query federation: Introduce a regex tenant resolver to allow regex in `X-Scope-OrgID` value. #6713
  - Add an experimental `tenant-federation.regex-matcher-enabled` flag. If it is enabled, users can pass a regex in `X-Scope-OrgID`, and the matching tenantIDs are included automatically. User discovery is based on scanning block storage, so new users become queryable after uploading a block (generally 2h).
  - Add an experimental `tenant-federation.user-sync-interval` flag, which specifies how frequently to scan users. The scanned users are used to calculate matched tenantIDs.
- [FEATURE] Experimental Support Parquet format: Implement parquet converter service to convert a TSDB block into Parquet and Parquet Queryable. #6716 #6743
- [FEATURE] Distributor/Ingester: Implemented experimental feature to use gRPC stream connection for push requests. This can be enabled by setting `-distributor.use-stream-push=true`. #6580
- [FEATURE] Compactor: Add support for percentage based sharding for compactors. #6738
- [FEATURE] Querier: Allow choosing PromQL engine via header `X-PromQL-EngineType`. #6777
- [FEATURE] Querier: Support for configuring query optimizers and enabling XFunctions in the Thanos engine. #6873
- [FEATURE] Query Frontend: Add support for the /api/v1/format_query API for formatting queries. #6893
- [FEATURE] Query Frontend: Add support for /api/v1/parse_query API (experimental) to parse a PromQL expression and return it as a JSON-formatted AST (abstract syntax tree). #6978
- [ENHANCEMENT] Upgrade the Prometheus version to 3.6.0 and add a `-name-validation-scheme` flag to support UTF-8. #7040 #7056
- [ENHANCEMENT] Distributor: Emit an error with a 400 status code when empty labels are found before the relabelling or label dropping process. #7052
- [ENHANCEMENT] Parquet Storage: Add support for additional sort columns during Parquet file generation #7003
- [ENHANCEMENT] Modernizes the entire codebase by using go modernize tool. #7005
- [ENHANCEMENT] Overrides Exporter: Expose all fields that can be converted to float64. Also, the label value `max_local_series_per_metric` got renamed to `max_series_per_metric`, and `max_local_series_per_user` got renamed to `max_series_per_user`. #6979
- [ENHANCEMENT] Ingester: Add `cortex_ingester_tsdb_wal_replay_unknown_refs_total` and `cortex_ingester_tsdb_wbl_replay_unknown_refs_total` metrics to track unknown series references during wal/wbl replaying. #6945
- [ENHANCEMENT] Distributor: Introduce a Protobuf model for Prometheus Remote Write 2.0 and a pool to improve performance. #6917
- [ENHANCEMENT] Ruler: Emit an error message when the rule synchronization fails. #6902
- [ENHANCEMENT] Querier: Support snappy and zstd response compression for `-querier.response-compression` flag. #6848
- [ENHANCEMENT] Tenant Federation: Add a # of query result limit logic when the `-tenant-federation.regex-matcher-enabled` is enabled. #6845
- [ENHANCEMENT] Query Frontend: Add a `cortex_slow_queries_total` metric to track # of slow queries per user. #6859
- [ENHANCEMENT] Query Frontend: Change to return 400 when the tenant resolving fail. #6715
- [ENHANCEMENT] Querier: Support query parameters to metadata api (/api/v1/metadata) to allow user to limit metadata to return. Add a `-ingester.return-all-metadata` flag to make the metadata API run when the deployment. Please set this flag to `false` to use the metadata API with the limits later. #6681 #6744
- [ENHANCEMENT] Ingester: Add a `cortex_ingester_active_native_histogram_series` metric to track # of active NH series. #6695
- [ENHANCEMENT] Query Frontend: Add new limit `-frontend.max-query-response-size` for total query response size after decompression in query frontend. #6607
- [ENHANCEMENT] Alertmanager: Add nflog and silences maintenance metrics. #6659
- [ENHANCEMENT] Querier: limit label APIs to query only ingesters if `start` param is not specified. #6618
- [ENHANCEMENT] Alertmanager: Add new limits `-alertmanager.max-silences-count` and `-alertmanager.max-silences-size-bytes` for limiting silences per tenant. #6605
- [ENHANCEMENT] Add `compactor.auto-forget-delay` for compactor to auto forget compactors after X minutes without heartbeat. #6533
- [ENHANCEMENT] StoreGateway: Emit more histogram buckets on the `cortex_querier_storegateway_refetches_per_query` metric. #6570
- [ENHANCEMENT] Querier: Apply bytes limiter to LabelNames and LabelValuesForLabelNames. #6568
- [ENHANCEMENT] Query Frontend: Add a `too_many_tenants` reason label value to `cortex_rejected_queries_total` metric to track the rejected query count due to the # of tenant limits. #6569
- [ENHANCEMENT] Alertmanager: Add receiver validations for msteamsv2 and rocketchat. #6606
- [ENHANCEMENT] Query Frontend: Add a `-frontend.enabled-ruler-query-stats` flag to configure whether to report the query stats log for queries coming from the Ruler. #6504
- [ENHANCEMENT] OTLP: Support otlp metadata ingestion. #6617
- [ENHANCEMENT] AlertManager: Add
keep_instance_in_the_ring_on_shutdownandtokens_file_pathconfigs for alertmanager ring. #6628 - [ENHANCEMENT] Querier: Add metric and enhanced logging for query partial data. #6676
- [ENHANCEMENT] Ingester: Push request should fail when label set is out of order #6746
- [ENHANCEMENT] Querier: Add `querier.ingester-query-max-attempts` to retry on partial data. #6714
- [ENHANCEMENT] Distributor: Add min/max schema validation for Native Histogram. #6766
- [ENHANCEMENT] Ingester: Handle runtime errors in query path #6769
- [ENHANCEMENT] Compactor: Support metadata caching bucket for Cleaner. Can be enabled via `-compactor.cleaner-caching-bucket-enabled` flag. #6778
- [ENHANCEMENT] Distributor: Add ingestion rate limit for Native Histogram. #6794 and #6994
- [ENHANCEMENT] Ingester: Add active series limit specifically for Native Histogram. #6796
- [ENHANCEMENT] Compactor, Store Gateway: Introduce user scanner strategy and user index. #6780
- [ENHANCEMENT] Querier: Support chunks cache for parquet queryable. #6805
- [ENHANCEMENT] Parquet Storage: Add some metrics for parquet blocks and converter. #6809 #6821
- [ENHANCEMENT] Compactor: Optimize cleaner run time. #6815
- [ENHANCEMENT] Parquet Storage: Allow percentage based dynamic shard size for Parquet Converter. #6817
- [ENHANCEMENT] Query Frontend: Enhance the performance of the JSON codec. #6816
- [ENHANCEMENT] Compactor: Emit partition metrics separate from cleaner job. #6827
- [ENHANCEMENT] Metadata Cache: Support inmemory and multi level cache backend. #6829
- [ENHANCEMENT] Store Gateway: Allow to ignore syncing blocks older than certain time using `ignore_blocks_before`. #6830
- [ENHANCEMENT] Distributor: Add native histograms max sample size bytes limit validation. #683...
Cortex v1.20.0-rc.1
What's Changed
- [BUGFIX] Fix bug where validating metric names uses the wrong validation logic. #7086
- [BUGFIX] Compactor: Avoid a race condition which allows a grouper to not compact all partitions. #7082
Full Changelog: v1.20.0-rc.0...v1.20.0-rc.1
Cortex v1.20.0-rc.0
Cortex 1.20.0 Release Notes
This release contains 368 contributions from 38 contributors. We also have 14 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
- Prometheus Remote Write 2.0 Support: Experimental support for Prometheus Remote Write 2.0 protocol.
- Parquet Format Support: Experimental Parquet based block storage. A new parquet converter service to convert TSDB blocks to parquet and querier to query parquet files.
- Query Federation with Regex Tenant Resolver: Introduce experimental regex tenant resolver allowing regex patterns in `X-Scope-OrgID` header via `-tenant-federation.regex-matcher-enabled` flag
- gRPC Stream Push between Distributor and Ingester: Experimental feature to use gRPC stream connections for push requests.
- More Native Histogram Support: Out-of-order native histogram ingestion support, per-tenant native histogram ingestion config, native histogram active series metrics and limits
- Resource-Based Monitor and Limiter: `ResourceMonitor` to collect CPU and Heap usage for Cortex and `ResourceBasedLimiter` in Ingesters and StoreGateways to protect the service from incoming requests when hitting limits
- UTF-8 Name: UTF-8 name support via `-name-validation-scheme` flag
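To illustrate the regex tenant resolver mentioned in the list above, here is a minimal Go sketch of how a regex taken from `X-Scope-OrgID` could be expanded into the set of matching tenant IDs. This is not Cortex's implementation: `matchTenants` and the hard-coded tenant list are hypothetical, and anchoring the pattern to the whole tenant ID is an assumption made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// matchTenants is a hypothetical helper (not a Cortex API). It returns the
// known tenant IDs that fully match the given regex pattern, sorted.
func matchTenants(pattern string, known []string) ([]string, error) {
	// Assumption: the pattern must match the whole tenant ID, so it is anchored.
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, fmt.Errorf("invalid tenant regex %q: %w", pattern, err)
	}
	var matched []string
	for _, id := range known {
		if re.MatchString(id) {
			matched = append(matched, id)
		}
	}
	sort.Strings(matched)
	return matched, nil
}

func main() {
	// In a real deployment the known tenants would come from a periodic scan;
	// here they are hard-coded for illustration.
	known := []string{"team-b", "ops", "team-a"}
	ids, err := matchTenants("team-.+", known)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // prints [team-a team-b]
}
```

An invalid pattern surfaces as an error rather than matching nothing, which makes a mistyped header value easier to spot.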
What's Changed
- [CHANGE] StoreGateway/Alertmanager: Add default 5s connection timeout on client. #6603
- [CHANGE] Ingester: Remove EnableNativeHistograms config flag and instead gate keep through new per-tenant limit at ingestion. #6718
- [CHANGE] Validate the tenant ID when using a single tenant resolver. #6727
- [CHANGE] Ring: Add zone label to ring_members metric. #6900
- [FEATURE] Distributor: Add an experimental `-distributor.otlp.enable-type-and-unit-labels` flag to add `__type__` and `__unit__` labels for OTLP metrics. #6969
- [FEATURE] Distributor: Add an experimental `-distributor.otlp.allow-delta-temporality` flag to ingest delta temporality otlp metrics. #6934
- [FEATURE] Query Frontend: Add dynamic interval size for query splitting. This is enabled by configuring experimental flags `querier.max-shards-per-query` and/or `querier.max-fetched-data-duration-per-query`. The split interval size is dynamically increased to maintain a number of shards and total duration fetched below the configured values. #6458
- [FEATURE] Querier/Ruler: Add `query_partial_data` and `rules_partial_data` limits to allow queries/rules to be evaluated with data from a single zone, if other zones are not available. #6526
- [FEATURE] Update prometheus alertmanager version to v0.28.0 and add new integration msteamsv2, jira, and rocketchat. #6590
- [FEATURE] Ingester/StoreGateway: Add `ResourceMonitor` module in Cortex, and add `ResourceBasedLimiter` in Ingesters and StoreGateways. #6674
- [FEATURE] Support Prometheus remote write 2.0. #6330
- [FEATURE] Ingester: Support out-of-order native histogram ingestion. It is automatically enabled when `-ingester.out-of-order-time-window > 0` and `-blocks-storage.tsdb.enable-native-histograms=true`. #6626 #6663
- [FEATURE] Ruler: Add support for percentage based sharding for rulers. #6680
- [FEATURE] Ruler: Add support for group labels. #6665
- [FEATURE] Query federation: Introduce a regex tenant resolver to allow regex in `X-Scope-OrgID` value. #6713
  - Add an experimental `tenant-federation.regex-matcher-enabled` flag. If it is enabled, users can put a regex in `X-Scope-OrgID`, and the matching tenant IDs are automatically included. User discovery is based on scanning block storage, so new users become queryable only after uploading a block (generally 2h).
  - Add an experimental `tenant-federation.user-sync-interval` flag, which specifies how frequently to scan users. The scanned users are used to calculate the matching tenant IDs.
- [FEATURE] Experimental Support Parquet format: Implement parquet converter service to convert a TSDB block into Parquet and Parquet Queryable. #6716 #6743
- [FEATURE] Distributor/Ingester: Implemented experimental feature to use gRPC stream connection for push requests. This can be enabled by setting `-distributor.use-stream-push=true`. #6580
- [FEATURE] Compactor: Add support for percentage based sharding for compactors. #6738
- [FEATURE] Querier: Allow choosing PromQL engine via header `X-PromQL-EngineType`. #6777
- [FEATURE] Querier: Support for configuring query optimizers and enabling XFunctions in the Thanos engine. #6873
- [FEATURE] Query Frontend: Add support /api/v1/format_query API for formatting queries. #6893
- [FEATURE] Query Frontend: Add support for /api/v1/parse_query API (experimental) to parse a PromQL expression and return it as a JSON-formatted AST (abstract syntax tree). #6978
- [ENHANCEMENT] Upgrade the Prometheus version to 3.6.0 and add a `-name-validation-scheme` flag to support UTF-8. #7040 #7056
- [ENHANCEMENT] Distributor: Emit an error with a 400 status code when empty labels are found before the relabelling or label dropping process. #7052
- [ENHANCEMENT] Parquet Storage: Add support for additional sort columns during Parquet file generation #7003
- [ENHANCEMENT] Modernizes the entire codebase by using go modernize tool. #7005
- [ENHANCEMENT] Overrides Exporter: Expose all fields that can be converted to float64. Also, the label value `max_local_series_per_metric` got renamed to `max_series_per_metric`, and `max_local_series_per_user` got renamed to `max_series_per_user`. #6979
- [ENHANCEMENT] Ingester: Add `cortex_ingester_tsdb_wal_replay_unknown_refs_total` and `cortex_ingester_tsdb_wbl_replay_unknown_refs_total` metrics to track unknown series references during wal/wbl replaying. #6945
- [ENHANCEMENT] Distributor: Introduce a Protobuf model for Prometheus Remote Write 2.0 and a pool to improve performance. #6917
- [ENHANCEMENT] Ruler: Emit an error message when the rule synchronization fails. #6902
- [ENHANCEMENT] Querier: Support snappy and zstd response compression for `-querier.response-compression` flag. #6848
- [ENHANCEMENT] Tenant Federation: Add a # of query result limit logic when the `-tenant-federation.regex-matcher-enabled` is enabled. #6845
- [ENHANCEMENT] Query Frontend: Add a `cortex_slow_queries_total` metric to track # of slow queries per user. #6859
- [ENHANCEMENT] Query Frontend: Change to return 400 when tenant resolution fails. #6715
- [ENHANCEMENT] Querier: Support query parameters to metadata api (/api/v1/metadata) to allow user to limit metadata to return. Add a `-ingester.return-all-metadata` flag to make the metadata API run when the deployment. Please set this flag to `false` to use the metadata API with the limits later. #6681 #6744
- [ENHANCEMENT] Ingester: Add a `cortex_ingester_active_native_histogram_series` metric to track # of active NH series. #6695
- [ENHANCEMENT] Query Frontend: Add new limit `-frontend.max-query-response-size` for total query response size after decompression in query frontend. #6607
- [ENHANCEMENT] Alertmanager: Add nflog and silences maintenance metrics. #6659
- [ENHANCEMENT] Querier: Limit label APIs to query only ingesters if `start` param is not specified. #6618
- [ENHANCEMENT] Alertmanager: Add new limits `-alertmanager.max-silences-count` and `-alertmanager.max-silences-size-bytes` for limiting silences per tenant. #6605
- [ENHANCEMENT] Add `compactor.auto-forget-delay` for compactor to auto forget compactors after X minutes without heartbeat. #6533
- [ENHANCEMENT] StoreGateway: Emit more histogram buckets on the `cortex_querier_storegateway_refetches_per_query` metric. #6570
- [ENHANCEMENT] Querier: Apply bytes limiter to LabelNames and LabelValuesForLabelNames. #6568
- [ENHANCEMENT] Query Frontend: Add a `too_many_tenants` reason label value to `cortex_rejected_queries_total` metric to track the rejected query count due to the # of tenant limits. #6569
- [ENHANCEMENT] Alertmanager: Add receiver validations for msteamsv2 and rocketchat. #6606
- [ENHANCEMENT] Query Frontend: Add a `-frontend.enabled-ruler-query-stats` flag to configure whether to report the query stats log for queries coming from the Ruler. #6504
- [ENHANCEMENT] OTLP: Support otlp metadata ingestion. #6617
- [ENHANCEMENT] AlertManager: Add `keep_instance_in_the_ring_on_shutdown` and `tokens_file_path` configs for alertmanager ring. #6628
- [ENHANCEMENT] Querier: Add metric and enhanced logging for query partial data. #6676
- [ENHANCEMENT] Ingester: Push request should fail when label set is out of order #6746
- [ENHANCEMENT] Querier: Add `querier.ingester-query-max-attempts` to retry on partial data. #6714
- [ENHANCEMENT] Distributor: Add min/max schema validation for Native Histogram. #6766
- [ENHANCEMENT] Ingester: Handle runtime errors in query path #6769
- [ENHANCEMENT] Compactor: Support metadata caching bucket for Cleaner. Can be enabled via `-compactor.cleaner-caching-bucket-enabled` flag. #6778
- [ENHANCEMENT] Distributor: Add ingestion rate limit for Native Histogram. #6794 and #6994
- [ENHANCEMENT] Ingester: Add active series limit specifically for Native Histogram. #6796
- [ENHANCEMENT] Compactor, Store Gateway: Introduce user scanner strategy and user index. #6780
- [ENHANCEMENT] Querier: Support chunks cache for parquet queryable. #6805
- [ENHANCEMENT] Parquet Storage: Add some metrics for parquet blocks and converter. #6809 #6821
- [ENHANCEMENT] Compactor: Optimize cleaner run time. #6815
- [ENHANCEMENT] Parquet Storage: Allow percentage based dynamic shard size for Parquet Converter. #6817
- [ENHANCEMENT] Query Frontend: Enhance the performance of the JSON codec. #6816
- [ENHANCEMENT] Compactor: Emit partition metrics separate from cleaner job. #6827
- [ENHANCEMENT] Metadata Cache: Support inmemory and multi level cache backend. #6829
- [ENHANCEMENT] Store Gateway: Allow to ignore syncing blocks older than certain time using `ignore_blocks_before`. #6830
- [ENHANCEMENT] Distributor: Add native histograms max sample size bytes limit validation. #683...