Introduce aliases for data streams #66163

Open
4 of 7 tasks
martijnvg opened this issue Dec 10, 2020 · 20 comments
Labels
:Data Management/Data streams Data streams and their lifecycles >enhancement Meta Team:Data Management Meta label for data/management team

Comments

@martijnvg
Member

martijnvg commented Dec 10, 2020

Currently aliases can only refer to indices. This issue is about extending that ability to data streams.

In order to reduce complexity aliases for data streams will behave differently than aliases for indices. Data stream aliases should only be able to refer to data streams. A data stream alias should not be able to refer to a backing index or any other regular index.

Implementation-wise, aliases pointing to data streams will differ from aliases pointing to indices. Data stream aliases will refer to data stream names. This should automatically make all backing indices of a data stream resolvable, even after rollovers have occurred. Data stream aliases are stored separately, next to data streams, in the cluster state.
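
To illustrate why referring to data stream names (rather than to backing indices) keeps an alias valid across rollovers, here is a toy model in Python. It is only a sketch: the `DataStream` and `ClusterState` classes and the backing index naming scheme are hypothetical and are not the Elasticsearch implementation.

```python
from dataclasses import dataclass, field


@dataclass
class DataStream:
    name: str
    generation: int = 1
    backing_indices: list = field(default_factory=list)

    def __post_init__(self):
        # Every data stream starts with one backing index.
        if not self.backing_indices:
            self.backing_indices.append(self._index_name())

    def _index_name(self):
        # Hypothetical naming scheme, for illustration only.
        return f".ds-{self.name}-{self.generation:06d}"

    def rollover(self):
        # A rollover adds a new backing index; the alias is not touched.
        self.generation += 1
        self.backing_indices.append(self._index_name())


@dataclass
class ClusterState:
    data_streams: dict = field(default_factory=dict)
    # Stored next to the data streams: alias name -> list of data stream names.
    data_stream_aliases: dict = field(default_factory=dict)

    def resolve(self, alias):
        # Resolution happens at lookup time, so new backing indices
        # are picked up without updating the alias itself.
        indices = []
        for ds_name in self.data_stream_aliases[alias]:
            indices.extend(self.data_streams[ds_name].backing_indices)
        return indices


state = ClusterState()
state.data_streams["logs-http"] = DataStream("logs-http")
state.data_stream_aliases["logs"] = ["logs-http"]
print(state.resolve("logs"))
state.data_streams["logs-http"].rollover()
print(state.resolve("logs"))
```

Because the alias stores only the name `logs-http`, the second lookup includes the index created by the rollover without any change to the alias.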

Authorization shouldn't be defined on the alias level, but be determined by the authorization defined on the data stream level.

Ideally we should reuse the current aliases api in order to allow users to define aliases for data streams.

Data stream aliases should at least support the following functionality:

  • Being able to refer to one or more data streams by a common name. For example, a logs alias that points to the logs-http and logs-my-application data streams.
  • Being able to define which data stream is the write data stream, so that ingestion can happen via a data stream alias. Write requests are then resolved to the write index of the data stream designated as the write data stream. This is to support failover scenarios in a bi-directional CCR setup (as is described here).
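
As a rough sketch of what the two requirements above could look like if the existing aliases API is reused, the Python snippet below builds a request body in the shape that `POST /_aliases` already accepts for index aliases. Assumptions: that data stream names go into the same `index` field and that `is_write_index` marks the write data stream. The helper `build_alias_actions` is hypothetical, and nothing is sent over the network.

```python
import json


def build_alias_actions(alias, data_streams, write_data_stream=None):
    """Build an aliases API body that points `alias` at several data streams."""
    if write_data_stream is not None and write_data_stream not in data_streams:
        raise ValueError("write data stream must be one of the aliased data streams")
    actions = []
    for name in data_streams:
        add = {"index": name, "alias": alias}
        if name == write_data_stream:
            # Assumption: the flag used for index aliases is reused here.
            add["is_write_index"] = True
        actions.append({"add": add})
    return {"actions": actions}


body = build_alias_actions(
    "logs",
    ["logs-http", "logs-my-application"],
    write_data_stream="logs-http",
)
print(json.dumps(body, indent=2))
```

In a failover scenario, the same helper could be called again with a different `write_data_stream` to move ingestion to the other data stream.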

I currently do not see use cases where we need to be able to support index and search routing on aliases for data streams.
Being able to define a filter on a data stream alias seems to make sense to me, but I think it can be added as a follow-up when needed.

Tasks:

@martijnvg martijnvg added >enhancement :Data Management/Data streams Data streams and their lifecycles labels Dec 10, 2020
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Dec 10, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 10, 2020
…action.Alias class.

This change is part of series of changes to clean up the usage `IndexAbstraction.Alias` in the codebase,
so that it is no longer needed to cast to `IndexAbstraction.Alias` and just use the methods on the `IndexAbstraction`
interface. This should help adding data stream aliases, so that `IndexAbstraction` instances of type `ALIAS` can
also be data stream aliases.

Relates to elastic#66163
@danhermann
Contributor

  • Authorization shouldn't be defined on the alias level, but be determined by the authorization defined on the data stream level.

Big +1 on this one. That will simplify lots of things.

In order to reduce complexity aliases for data streams will behave differently than aliases for indices. Data stream aliases should only be able to refer to data streams. A data stream alias should not be able to refer to a backing index or any other regular index.

Not allowing aliases to refer to regular indices and data streams at the same time seems like a big limitation. I suppose that limitation could be relaxed later? I suppose authorization would be tricky to determine for aliases pointing to data streams and regular indices.

@martijnvg
Member Author

Not allowing aliases to refer to regular indices and data streams at the same time seems like a big limitation. I suppose that limitation could be relaxed later?

We could think about relaxing it later, because essentially even a data stream will resolve to indices. However, as you mention, there can be complications, especially if we end up having different authorization schemes for aliases pointing to indices and aliases pointing to data streams.

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 10, 2020
…related casts.

This change is part of series of changes to clean up the usage IndexAbstraction.Alias in the codebase,
so that it is no longer needed to cast to IndexAbstraction.Alias and just use the methods on the IndexAbstraction
interface. This should help adding data stream aliases, so that IndexAbstraction instances of type ALIAS can
also be data stream aliases.

Relates to elastic#66163
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Dec 17, 2020
and change validation to be an implementation detail and
part of construction of the alias abstraction.

This change is part of series of changes to clean up the usage
IndexAbstraction.Alias in the codebase, so that it is no longer
needed to cast to IndexAbstraction.Alias and just use the methods
on the IndexAbstraction interface. This should help adding data
stream aliases, so that IndexAbstraction instances of type ALIAS
can also be data stream aliases.

Relates to elastic#66163
martijnvg added a commit that referenced this issue Jan 6, 2021
and change validation to be an implementation detail and
part of construction of the alias abstraction.

This change is part of series of changes to clean up the usage
IndexAbstraction.Alias in the codebase, so that it is no longer
needed to cast to IndexAbstraction.Alias and just use the methods
on the IndexAbstraction interface. This should help adding data
stream aliases, so that IndexAbstraction instances of type ALIAS
can also be data stream aliases.

Relates to #66163
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Jan 6, 2021
and change validation to be an implementation detail and
part of construction of the alias abstraction.

This change is part of series of changes to clean up the usage
IndexAbstraction.Alias in the codebase, so that it is no longer
needed to cast to IndexAbstraction.Alias and just use the methods
on the IndexAbstraction interface. This should help adding data
stream aliases, so that IndexAbstraction instances of type ALIAS
can also be data stream aliases.

Relates to elastic#66163
martijnvg added a commit that referenced this issue Jan 6, 2021
and change validation to be an implementation detail and
part of construction of the alias abstraction.

Backport of #66508 to 7.x branch.

This change is part of series of changes to clean up the usage
IndexAbstraction.Alias in the codebase, so that it is no longer
needed to cast to IndexAbstraction.Alias and just use the methods
on the IndexAbstraction interface. This should help adding data
stream aliases, so that IndexAbstraction instances of type ALIAS
can also be data stream aliases.

Relates to #66163
martijnvg added a commit that referenced this issue Jan 7, 2021
…action.Alias class (#66165)

This change is part of series of changes to clean up the usage `IndexAbstraction.Alias` in the codebase,
so that it is no longer needed to cast to `IndexAbstraction.Alias` and just use the methods on the `IndexAbstraction`
interface. This should help adding data stream aliases, so that `IndexAbstraction` instances of type `ALIAS` can
also be data stream aliases.

Relates to #66163
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Jan 7, 2021
…action.Alias class

Backporting elastic#66165 to 7.x branch.

This change is part of series of changes to clean up the usage `IndexAbstraction.Alias` in the codebase,
so that it is no longer needed to cast to `IndexAbstraction.Alias` and just use the methods on the `IndexAbstraction`
interface. This should help adding data stream aliases, so that `IndexAbstraction` instances of type `ALIAS` can
also be data stream aliases.

Relates to elastic#66163
martijnvg added a commit that referenced this issue Jan 7, 2021
…action.Alias class (#67161)

Backporting #66165 to 7.x branch.

This change is part of series of changes to clean up the usage `IndexAbstraction.Alias` in the codebase,
so that it is no longer needed to cast to `IndexAbstraction.Alias` and just use the methods on the `IndexAbstraction`
interface. This should help adding data stream aliases, so that `IndexAbstraction` instances of type `ALIAS` can
also be data stream aliases.

Relates to #66163
martijnvg added a commit that referenced this issue Jan 7, 2021
…related casts. (#66178)

This change is part of series of changes to clean up the usage IndexAbstraction.Alias in the codebase,
so that it is no longer needed to cast to IndexAbstraction.Alias and just use the methods on the IndexAbstraction
interface. This should help adding data stream aliases, so that IndexAbstraction instances of type ALIAS can
also be data stream aliases.

Relates to #66163
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Jan 7, 2021
…related casts.

Backport elastic#66178 to 7.x branch.

This change is part of series of changes to clean up the usage IndexAbstraction.Alias in the codebase,
so that it is no longer needed to cast to IndexAbstraction.Alias and just use the methods on the IndexAbstraction
interface. This should help adding data stream aliases, so that IndexAbstraction instances of type ALIAS can
also be data stream aliases.

Relates to elastic#66163
martijnvg added a commit that referenced this issue Jan 7, 2021
…related casts. (#67176)

Backport #66178 to 7.x branch.

This change is part of series of changes to clean up the usage IndexAbstraction.Alias in the codebase,
so that it is no longer needed to cast to IndexAbstraction.Alias and just use the methods on the IndexAbstraction
interface. This should help adding data stream aliases, so that IndexAbstraction instances of type ALIAS can
also be data stream aliases.

Relates to #66163
@martijnvg martijnvg self-assigned this Jan 18, 2021
ywangd pushed a commit to ywangd/elasticsearch that referenced this issue Jul 30, 2021
This allows specifying a query as filter on data stream alias,
which will then always be applied when searching via this alias.

Relates elastic#66163
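
A conceptual sketch of "always applied when searching via this alias": the alias filter is combined with whatever query the user sends. The function `apply_alias_filter` is hypothetical and only illustrates the idea with the query DSL's `bool` query; it is not how Elasticsearch applies the filter internally.

```python
def apply_alias_filter(user_query, alias_filter):
    """Combine a user's query with the filter stored on the alias."""
    if alias_filter is None:
        # Alias without a filter: the query is used unchanged.
        return user_query
    return {"bool": {"must": [user_query], "filter": [alias_filter]}}


# Example values are made up for illustration.
alias_filter = {"term": {"service.name": "checkout"}}
user_query = {"match": {"message": "timeout"}}
print(apply_alias_filter(user_query, alias_filter))
```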
@Alsheh

Alsheh commented Aug 10, 2021

@martijnvg Thanks for introducing aliases for data streams, it's a much-needed feature!

Data stream aliases should only be able to refer to data streams.

Recently, I upgraded our ELK setup to 7.14.0 after seeing the support for data stream aliases. The extension of aliases to data streams encouraged us to migrate to data streams, and I attempted to make this migration backward compatible for end users by using an alias that would make queries span concrete indices and data streams. However:

  • Not being able to make an alias point to both data streams and indices is, indeed, a big limitation for us. Also, I foresee running into a similar issue where we need to restore an index from before we used data streams, such that the restored data can be queried using an existing alias pointing to data streams.
  • Not having read the documentation about aliases thoroughly, I wasn't aware that aliases wouldn't work for both indices and data streams; moreover, when using the API to add an alias for both data streams and indices, my request seemed to work just fine and the API didn't complain about my request not being supported. So I was puzzled for quite a bit when I couldn't see any data using the newly created alias until I read on the docs that what I was trying to do wasn't supported. So, I think making the API raise an error here would be very helpful.

@willemdh

Thanks @Alsheh, I totally agree. The whole point of aliases is, imho, that we can migrate to data streams while the existing dashboards and visualisations keep working.
If an alias cannot work for both indices and data streams, then this has all kinds of questionable implications.

@martijnvg
Member Author

Thanks @Alsheh for trying out data stream aliases!

Not being able to make an alias point to both data streams and indices is, indeed, a big limitation for us.

I see, I think we can relax this limitation and allow a data stream alias and an index alias to share the same name. The limitation comes from being overprotective about the fact that data stream aliases and index aliases are stored in different places in the cluster state. I will look into this and see whether this limitation can really be lifted.

moreover, when using the API to add an alias for both data streams and indices, my request seemed to work just fine and the API didn't complain about my request not being supported. So I was puzzled for quite a bit when I couldn't see any data using the newly created alias until I read on the docs that what I was trying to do wasn't supported. So, I think making the API raise an error here would be very helpful.

An error should be returned. Not sure why this didn't happen, but this looks like a bug.
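
For illustration, the expected behaviour (rejecting a request that mixes data streams and regular indices under one alias instead of silently accepting it) could look roughly like the check below. This is a hypothetical sketch, not the actual Elasticsearch validation; `validate_alias_targets` and its arguments are made up for the example.

```python
def validate_alias_targets(alias, targets, data_stream_names):
    """Return the alias kind, or raise if the targets mix data streams and indices."""
    if not targets:
        raise ValueError(f"alias [{alias}] must have at least one target")
    streams = [t for t in targets if t in data_stream_names]
    indices = [t for t in targets if t not in data_stream_names]
    if streams and indices:
        # Fail loudly so the caller knows the request is unsupported.
        raise ValueError(
            f"alias [{alias}] cannot refer to both data streams {streams} "
            f"and regular indices {indices}"
        )
    return "data_stream" if streams else "index"


known_data_streams = {"logs-http"}
print(validate_alias_targets("logs", ["logs-http"], known_data_streams))
try:
    validate_alias_targets("logs", ["logs-http", "filebeat-000001"], known_data_streams)
except ValueError as err:
    print("rejected:", err)
```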

martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Aug 17, 2021
ywangd added a commit to ywangd/elasticsearch that referenced this issue Oct 13, 2021
Alias for datastream is supported since elastic#66163. Currently alias for the
backing indices are still not allowed. As a result, the error messages
are updated to reflect the latest status.
ywangd added a commit that referenced this issue Oct 13, 2021
Alias for datastream is supported since #66163. Currently alias for the
backing indices are still not allowed. As a result, the error messages
are updated to reflect the latest status.
ywangd added a commit to ywangd/elasticsearch that referenced this issue Oct 13, 2021
Alias for datastream is supported since elastic#66163. Currently alias for the
backing indices are still not allowed. As a result, the error messages
are updated to reflect the latest status.
elasticsearchmachine pushed a commit that referenced this issue Oct 13, 2021
Alias for datastream is supported since #66163. Currently alias for the
backing indices are still not allowed. As a result, the error messages
are updated to reflect the latest status.
@sidharthvijayakumar

Has this been implemented?

@Danouchka

+1, we really need this. What about indices created by rollovers in ILM, for instance all filebeat-*, auditbeat-*?

@MakoWish

MakoWish commented Aug 25, 2022

+1 Not allowing aliases is preventing us from migrating to Data Streams. We have quite a lot of things referencing multiple indices by aliases, so the lack of aliases on data streams would break our entire deployment. For instance, the alias network references cisco.firewall-*, suricata-*, netflow-*, packetbeat-*, etc.

@willemdh

@martijnvg
When can we expect a fix for this? Without this feature it becomes really hard to upgrade to Elastic 8, as every Kibana visualisation we made over the past 4+ years is linked to an alias. Migrating to data streams would cause all these visualisations to break.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@willemdh

willemdh commented Dec 2, 2022

Any news about this urgent issue?

Talking about "Support aliases that point to both regular indices and data streams"

@willemdh

willemdh commented Jan 6, 2023

...

@Hythloday-zero

Bump on behalf of a current engagement that is re-asking about this enhancement. They need various periods of dual ingestion (migrating regular indices to data streams) and always need the ability to query across both.

@hamartin

When a data stream has a template and that template defines which aliases the data stream should have, that works fine.

However, when I later alter the template to add new aliases or remove old ones, upload the altered template, and do a _rollover on the data stream, the new indices have the aliases defined in the altered template, but the indices created before the template was altered do not.

When I realised this, I tried to alter each index in the data stream by adding and removing aliases the way you can on legacy indices, but then I got an error saying that the indices are part of a data stream and I am not allowed to change the aliases.

This is a little bit frustrating in an environment where we use aliases a lot to create "views" for the end users. It is also a problem if there is a change request asking to change the alias name. I have not been able to find a way to change an alias other than adding the alias to the template, uploading it, rolling over the stream, and then telling the end user that only new data will be found under the new alias and that the old data will remain accessible from the old alias until it is deleted by lifecycle management.
