diff --git a/_clients/data-prepper/data-prepper-reference.md b/_clients/data-prepper/data-prepper-reference.md index 6cc5fb0983..8936e48918 100644 --- a/_clients/data-prepper/data-prepper-reference.md +++ b/_clients/data-prepper/data-prepper-reference.md @@ -14,7 +14,7 @@ This page lists all supported Data Prepper server, sources, buffers, processors, Option | Required | Type | Description :--- | :--- | :--- | :--- ssl | No | Boolean | Indicates whether TLS should be used for server APIs. Defaults to true. -keyStoreFilePath | No | String | Path to a .jks or .p12 keystore file. Required if `ssl` is true. +keyStoreFilePath | No | String | Path to a .jks or .p12 keystore file. Required if `ssl` is true. keyStorePassword | No | String | Password for keystore. Optional, defaults to empty string. privateKeyPassword | No | String | Password for private key within keystore. Optional, defaults to empty string. serverPort | No | Integer | Port number to use for server APIs. Defaults to 4900 @@ -47,15 +47,12 @@ unframed_requests | No | Boolean | Enable requests not framed using the gRPC wir thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`. max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`. ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`. -sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if `ssl` is set to `true`. -sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if `ssl` is set to `true`. +sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`). Required if `ssl` is set to `true`. +sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if `ssl` is set to `true`. useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`. acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`. awsRegion | Conditionally | String | Represents the AWS region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths. -authentication | No | Object | An authentication configuration. By default, an unauthenticated server is created for the pipeline. This parameter uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication, use or create a plugin that implements [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java). -record_type | No | String | A string represents the supported record data type that is written into the buffer plugin.
Value options are `otlp` or `event`. Default is `otlp`. -`otlp` | No | String | Otel-trace-source writes each incoming `ExportTraceServiceRequest` request as record data type into the buffer. -`event` | No | String | Otel-trace-source decodes each incoming `ExportTraceServiceRequest` request into a collection of Data Prepper internal spans serving as buffer items. To achieve better performance in this mode, we recommend setting buffer capacity proportional to the estimated number of spans in the incoming request payload. +authentication | No | Object | An authentication configuration. By default, this runs an unauthenticated server. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide custom authentication, use or create a plugin that implements [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java). ### http_source @@ -68,30 +65,7 @@ request_timeout | No | Integer | The request timeout in millis. Default is `10_0 thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`. max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`. max_pending_requests | No | Integer | The maximum number of allowed tasks in ScheduledThreadPool work queue. Default is `1024`. -authentication | No | Object | An authentication configuration. By default, this creates an unauthenticated server for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication define the `http_basic` plugin with a `username` and `password`. To provide customer authentication, use or create a plugin that implements [ArmeriaHttpAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/ArmeriaHttpAuthenticationProvider.java). - -### otel_metrics_source - -Source for the OpenTelemetry Collector for collecting metric data. - -Option | Required | Type | Description -:--- | :--- | :--- | :--- -port | No | Integer | The port OTel metrics source is running on. Default is `21891`. -request_timeout | No | Integer | The request timeout in milliseconds. Default is `10_000`. -health_check_service | No | Boolean | Enables a gRPC health check service under `grpc.health.v1/Health/Check`. Default is `false`. -proto_reflection_service | No | Boolean | Enables a reflection service for Protobuf services (see [gRPC reflection](https://github.com/grpc/grpc/blob/master/doc/server-reflection.md) and [gRPC Server Reflection Tutorial](https://github.com/grpc/grpc-java/blob/master/documentation/server-reflection-tutorial.md) docs). Default is `false`. -unframed_requests | No | Boolean | Enable requests not framed using the gRPC wire protocol. -thread_count | No | Integer | The number of threads to keep in the ScheduledThreadPool. Default is `200`. -max_connection_count | No | Integer | The maximum allowed number of open connections. Default is `500`. -ssl | No | Boolean | Enables connections to the OTel source port over TLS/SSL. Defaults to `true`. -sslKeyCertChainFile | Conditionally | String | File-system path or AWS S3 path to the security certificate (e.g. `"config/demo-data-prepper.crt"` or `"s3://my-secrets-bucket/demo-data-prepper.crt"`).
Required if `ssl` is set to `true`. -sslKeyFile | Conditionally | String | File-system path or AWS S3 path to the security key (e.g. `"config/demo-data-prepper.key"` or `"s3://my-secrets-bucket/demo-data-prepper.key"`). Required if `ssl` is set to `true`. -useAcmCertForSSL | No | Boolean | Whether to enable TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`. -acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificates. Required if `useAcmCertForSSL` is set to `true`. -awsRegion | Conditionally | String | Represents the AWS Region to use ACM or S3. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths. -authentication | No | Object | An authentication configuration. By default, an unauthenticated server is created for the pipeline. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide customer authentication, use or create a plugin that implements [GrpcAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/GrpcAuthenticationProvider.java). - - +authentication | No | Object | An authentication configuration. By default, this runs an unauthenticated server. This uses pluggable authentication for HTTPS. To use basic authentication, define the `http_basic` plugin with a `username` and `password`. To provide custom authentication, use or create a plugin that implements [ArmeriaHttpAuthenticationProvider](https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/armeria-common/src/main/java/com/amazon/dataprepper/armeria/authentication/ArmeriaHttpAuthenticationProvider.java). ### file @@ -142,19 +116,13 @@ Prior to Data Prepper 1.3, Processors were named Preppers. Starting in Data Prep ### otel_trace_raw_prepper -Converts OpenTelemetry data to OpenSearch-compatible JSON documents and fills in trace group related fields in those JSON documents. It requires `record_type` to be set as `otlp` in `otel_trace_source`. +Converts OpenTelemetry data to OpenSearch-compatible JSON documents. Option | Required | Type | Description :--- | :--- | :--- | :--- +root_span_flush_delay | No | Integer | Represents the time interval in seconds to flush all the root spans in the processor together with their descendants. Default is 30. trace_flush_interval | No | Integer | Represents the time interval in seconds to flush all the descendant spans without any root span. Default is 180. -### otel_trace_raw - -This processor is a Data Prepper event record type compatible version of `otel_trace_raw_prepper` that fills in trace group related fields into all incoming Data Prepper span records. It requires `record_type` to be set as `event` in `otel_trace_source`. - -Option | Required | Type | Description -:--- | :--- | :--- | :--- -trace_flush_interval | No | Integer | Represents the time interval in seconds to flush all the descendant spans without any root span. Default is 180. ### service_map_stateful @@ -176,12 +144,12 @@ target_port | No | Integer | The destination port to forward requests to. Defaul discovery_mode | No | String | Peer discovery mode to be used. Allowable values are `static`, `dns`, and `aws_cloud_map`. Defaults to `static`.
static_endpoints | No | List | List containing string endpoints of all Data Prepper instances. domain_name | No | String | Single domain name to query DNS against. Typically used by creating multiple DNS A Records for the same domain. -ssl | No | Boolean | Indicates whether to use TLS. Default is true. +ssl | No | Boolean | Indicates whether TLS should be used. Default is true. awsCloudMapNamespaceName | Conditionally | String | Name of your CloudMap Namespace. Required if `discovery_mode` is set to `aws_cloud_map`. awsCloudMapServiceName | Conditionally | String | Service name within your CloudMap Namespace. Required if `discovery_mode` is set to `aws_cloud_map`. sslKeyCertChainFile | Conditionally | String | Represents the SSL certificate chain file path or AWS S3 path. S3 path example `s3:///`. Required if `ssl` is set to `true`. useAcmCertForSSL | No | Boolean | Enables TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`. -awsRegion | Conditionally | String | Represents the AWS Region to use ACM, S3, or CloudMap. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths. +awsRegion | Conditionally | String | Represents the AWS region to use ACM, S3, or CloudMap. Required if `useAcmCertForSSL` is set to `true` or `sslKeyCertChainFile` and `sslKeyFile` are AWS S3 paths. acmCertificateArn | Conditionally | String | Represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `useAcmCertForSSL` is set to `true`. ### string_converter @@ -204,7 +172,7 @@ group_duration | No | String | The amount of time that a group should exist befo ### date -Adds a default timestamp to the event or parses timestamp fields, and converts it to ISO 8601 format, which can be used as event timestamp. +Adds a default timestamp to the event or parses timestamp fields, and converts it to ISO 8601 format which can be used as event timestamp. Option | Required | Type | Description :--- | :--- | :--- | :--- diff --git a/_clients/data-prepper/get-started.md b/_clients/data-prepper/get-started.md index 3d00fb4a5e..11ef4ea96e 100644 --- a/_clients/data-prepper/get-started.md +++ b/_clients/data-prepper/get-started.md @@ -38,7 +38,7 @@ Run the following command with your pipeline configuration YAML. ```bash docker run --name data-prepper \ -v /full/path/to/pipelines.yaml:/usr/share/data-prepper/pipelines.yaml \ - opensearchproject/data-prepper:latest + opensearchproject/opensearch-data-prepper:latest ``` This sample pipeline configuration above demonstrates a simple pipeline with a source (`random`) sending data to a sink (`stdout`). For more examples and details on more advanced pipeline configurations, see [Pipelines]({{site.url}}{{site.baseurl}}/clients/data-prepper/pipelines). diff --git a/_clients/data-prepper/pipelines.md b/_clients/data-prepper/pipelines.md index 0b6ac6c0cd..b664d98a33 100644 --- a/_clients/data-prepper/pipelines.md +++ b/_clients/data-prepper/pipelines.md @@ -71,14 +71,10 @@ log-pipeline: This example uses weak security. We strongly recommend securing all plugins which open external ports in production environments. {: .note} -### Trace analytics pipeline +### Trace Analytics pipeline The following example demonstrates how to build a pipeline that supports the [Trace Analytics OpenSearch Dashboards plugin]({{site.url}}{{site.baseurl}}/observability-plugin/trace/ta-dashboards/). 
This pipeline takes data from the OpenTelemetry Collector and uses two other pipelines as sinks. These two separate pipelines index trace and the service map documents for the dashboard plugin. -#### Classic - -This pipeline definition will be deprecated in 2.0. Users are recommended to use [Event record type](#event-record-type) pipeline definition. - ```yml entry-pipeline: delay: "100" @@ -119,91 +115,6 @@ service-map-pipeline: trace_analytics_service_map: true ``` -#### Event record type - -Starting from Data Prepper 1.4, Data Prepper supports event record type in trace analytics pipeline source, buffer, and processors. - -```yml -entry-pipeline: - delay: "100" - source: - otel_trace_source: - ssl: false - record_type: event - buffer: - bounded_blocking: - buffer_size: 10240 - batch_size: 160 - sink: - - pipeline: - name: "raw-pipeline" - - pipeline: - name: "service-map-pipeline" -raw-pipeline: - source: - pipeline: - name: "entry-pipeline" - buffer: - bounded_blocking: - buffer_size: 10240 - batch_size: 160 - processor: - - otel_trace_raw: - sink: - - opensearch: - hosts: ["https://localhost:9200"] - insecure: true - username: admin - password: admin - trace_analytics_raw: true -service-map-pipeline: - delay: "100" - source: - pipeline: - name: "entry-pipeline" - buffer: - bounded_blocking: - buffer_size: 10240 - batch_size: 160 - processor: - - service_map_stateful: - sink: - - opensearch: - hosts: ["https://localhost:9200"] - insecure: true - username: admin - password: admin - trace_analytics_service_map: true -``` - -Note that it is recommended to scale the `buffer_size` and `batch_size` by the estimated maximum batch size in the client request payload to maintain similar ingestion throughput and latency as in [Classic](#classic). - -### Metrics pipeline - -Data Prepper supports metrics ingestion using OTel. It currently supports the following metric types: - -* Gauge -* Sum -* Summary -* Histogram - -Other types are not supported. Data Prepper drops all other types, including Exponential Histogram and Summary. Additionally, Data Prepper does not support Scope instrumentation. - -To set up a metrics pipeline: - -```yml -metrics-pipeline: - source: - otel_trace_source: - processor: - - otel_metrics_raw_processor: - sink: - - opensearch: - hosts: ["https://localhost:9200"] - username: admin - password: admin -``` - ## Migrating from Logstash Data Prepper supports Logstash configuration files for a limited set of plugins. Simply use the logstash config to run Data Prepper. @@ -211,7 +122,7 @@ Data Prepper supports Logstash configuration files for a limited set of plugins. ```bash docker run --name data-prepper \ -v /full/path/to/logstash.conf:/usr/share/data-prepper/pipelines.conf \ - opensearchproject/data-prepper:latest + opensearchproject/opensearch-data-prepper:latest ``` This feature is limited by feature parity of Data Prepper. 
As of Data Prepper 1.2 release, the following plugins from the Logstash configuration are supported: @@ -238,5 +149,5 @@ To configure the Data Prepper server, run Data Prepper with the additional yaml ```bash docker run --name data-prepper -v /full/path/to/pipelines.yaml:/usr/share/data-prepper/pipelines.yaml \ /full/path/to/data-prepper-config.yaml:/usr/share/data-prepper/data-prepper-config.yaml \ - opensearchproject/data-prepper:latest + opensearchproject/opensearch-data-prepper:latest ```` diff --git a/_dashboards/install/rpm.md b/_dashboards/install/rpm.md deleted file mode 100644 index 87ec62650c..0000000000 --- a/_dashboards/install/rpm.md +++ /dev/null @@ -1,54 +0,0 @@ ---- -layout: default -title: RPM -parent: Install OpenSearch Dashboards -nav_order: 31 ---- - -# Run OpenSearch Dashboards using RPM - -1. Create a repository file for OpenSearch Dashboards: - - ```bash - sudo curl -SL https://artifacts.opensearch.org/releases/bundle/opensearch-dashboards/2.x/opensearch-dashboards-2.x.repo -o /etc/yum.repos.d/opensearch-dashboards-2.x.repo - ``` - -2. Clean your YUM cache, to ensure a smooth installation: - - ```bash - sudo yum clean all - ``` - -3. With the repository file downloaded, list all available versions of OpenSearch: - - ```bash - sudo yum list | grep opensearch-dashboards - ``` - -4. Choose the version of OpenSearch Dashboards you want to install: - - ```bash - sudo yum install opensearch-dashboards - ``` - - Unless otherwise indicated, the highest minor version of OpenSearch Dashboards installs. - -5. During installation, the installer stops to see if the GPG key matches the OpenSearch project. Verify that the `Fingerprint` matches the following: - - ```bash - Fingerprint: c5b7 4989 65ef d1c2 924b a9d5 39d3 1987 9310 d3fc - ``` - - If correct, enter `yes` or `y`. The OpenSearch Dashboards installation continues. - -6. Run OpenSearch Dashboards using `systemctl`. - - ```bash - sudo systemctl start opensearch-dashboards.service - ``` - -7. To stop running OpenSearch Dashboards, enter - - ```bash - sudo systemctl stop opensearch-dashboards.service - ``` diff --git a/_im-plugin/ism/policies.md b/_im-plugin/ism/policies.md index a99a6f5286..c35a49e251 100644 --- a/_im-plugin/ism/policies.md +++ b/_im-plugin/ism/policies.md @@ -127,7 +127,7 @@ Parameter | Description | Type | Required ### read_only -Sets a managed index to be read only. Read-only indexes don't refresh. +Sets a managed index to be read only. ```json { diff --git a/_opensearch/cluster.md b/_opensearch/cluster.md index 661462b5f2..25ee211d56 100644 --- a/_opensearch/cluster.md +++ b/_opensearch/cluster.md @@ -12,23 +12,26 @@ OpenSearch can operate as a single-node or multi-node cluster. The steps to conf To create and deploy an OpenSearch cluster according to your requirements, it’s important to understand how node discovery and cluster formation work and what settings govern them. -There are many ways to design a cluster. The following illustration shows a basic architecture: +There are many ways to design a cluster. The following illustration shows a basic architecture that includes a four-node cluster that has one dedicated cluster manager node, one dedicated coordinating node, and two data nodes that are cluster manager eligible and also used for ingesting data. + + The nomenclature recently changed for the master node; it is now called the cluster manager node. 
+ {: .note } ![multi-node cluster architecture diagram]({{site.url}}{{site.baseurl}}/images/cluster.png) -This is a four-node cluster that has one dedicated master node, one dedicated coordinating node, and two data nodes that are master-eligible and also used for ingesting data. +### Nodes The following table provides brief descriptions of the node types: Node type | Description | Best practices for production :--- | :--- | :-- | -`Master` | Manages the overall operation of a cluster and keeps track of the cluster state. This includes creating and deleting indices, keeping track of the nodes that join and leave the cluster, checking the health of each node in the cluster (by running ping requests), and allocating shards to nodes. | Three dedicated master nodes in three different zones is the right approach for almost all production use cases. This configuration ensures your cluster never loses quorum. Two nodes will be idle for most of the time except when one node goes down or needs some maintenance. -`Master-eligible` | Elects one node among them as the master node through a voting process. | For production clusters, make sure you have dedicated master nodes. The way to achieve a dedicated node type is to mark all other node types as false. In this case, you have to mark all the other nodes as not master-eligible. -`Data` | Stores and searches data. Performs all data-related operations (indexing, searching, aggregating) on local shards. These are the worker nodes of your cluster and need more disk space than any other node type. | As you add data nodes, keep them balanced between zones. For example, if you have three zones, add data nodes in multiples of three, one for each zone. We recommend using storage and RAM-heavy nodes. -`Ingest` | Preprocesses data before storing it in the cluster. Runs an ingest pipeline that transforms your data before adding it to an index. | If you plan to ingest a lot of data and run complex ingest pipelines, we recommend you use dedicated ingest nodes. You can also optionally offload your indexing from the data nodes so that your data nodes are used exclusively for searching and aggregating. -`Coordinating` | Delegates client requests to the shards on the data nodes, collects and aggregates the results into one final result, and sends this result back to the client. | A couple of dedicated coordinating-only nodes is appropriate to prevent bottlenecks for search-heavy workloads. We recommend using CPUs with as many cores as you can. +Cluster manager | Manages the overall operation of a cluster and keeps track of the cluster state. This includes creating and deleting indexes, keeping track of the nodes that join and leave the cluster, checking the health of each node in the cluster (by running ping requests), and allocating shards to nodes. | Three dedicated cluster manager nodes in three different zones is the right approach for almost all production use cases. This configuration ensures your cluster never loses quorum. Two nodes will be idle for most of the time except when one node goes down or needs some maintenance. +Cluster manager eligible | Elects one node among them as the cluster manager node through a voting process. | For production clusters, make sure you have dedicated cluster manager nodes. The way to achieve a dedicated node type is to mark all other node types as false. In this case, you have to mark all the other nodes as not cluster manager eligible. +Data | Stores and searches data. 
Performs all data-related operations (indexing, searching, aggregating) on local shards. These are the worker nodes of your cluster and need more disk space than any other node type. | As you add data nodes, keep them balanced between zones. For example, if you have three zones, add data nodes in multiples of three, one for each zone. We recommend using storage and RAM-heavy nodes. +Ingest | Pre-processes data before storing it in the cluster. Runs an ingest pipeline that transforms your data before adding it to an index. | If you plan to ingest a lot of data and run complex ingest pipelines, we recommend you use dedicated ingest nodes. You can also optionally offload your indexing from the data nodes so that your data nodes are used exclusively for searching and aggregating. +Coordinating | Delegates client requests to the shards on the data nodes, collects and aggregates the results into one final result, and sends this result back to the client. | A couple of dedicated coordinating-only nodes is appropriate to prevent bottlenecks for search-heavy workloads. We recommend using CPUs with as many cores as you can. -By default, each node is a master-eligible, data, ingest, and coordinating node. Deciding on the number of nodes, assigning node types, and choosing the hardware for each node type depends on your use case. You must take into account factors like the amount of time you want to hold on to your data, the average size of your documents, your typical workload (indexing, searches, aggregations), your expected price-performance ratio, your risk tolerance, and so on. +By default, each node is a cluster-manager-eligible, data, ingest, and coordinating node. Deciding on the number of nodes, assigning node types, and choosing the hardware for each node type depends on your use case. You must take into account factors like the amount of time you want to hold on to your data, the average size of your documents, your typical workload (indexing, searches, aggregations), your expected price-performance ratio, your risk tolerance, and so on. After you assess all these requirements, we recommend you use a benchmark testing tool like Rally to provision a small sample cluster and run tests with varying workloads and configurations. Compare and analyze the system and query metrics for these tests to design an optimum architecture. To get started with Rally, see the [Rally documentation](https://esrally.readthedocs.io/en/stable/). @@ -62,18 +65,18 @@ Make the same change on all the nodes to make sure that they'll join to form a c After you name the cluster, set node attributes for each node in your cluster. -#### Master node +#### Cluster manager node -Give your master node a name. If you don't specify a name, OpenSearch assigns a machine-generated name that makes the node difficult to monitor and troubleshoot. +Give your cluster manager node a name. If you don't specify a name, OpenSearch assigns a machine-generated name that makes the node difficult to monitor and troubleshoot. ```yml -node.name: opensearch-master +node.name: opensearch-cluster_manager ``` -You can also explicitly specify that this node is a master node. This is already true by default, but adding it makes it easier to identify the master node. +You can also explicitly specify that this node is a cluster manager node, even though it is already set to true by default. Set the node role to `cluster_manager` to make it easier to identify the cluster manager node. 
```yml -node.roles: [ master ] +node.roles: [ cluster_manager ] ``` #### Data nodes @@ -88,7 +91,7 @@ node.name: opensearch-d1 node.name: opensearch-d2 ``` -You can make them master-eligible data nodes that will also be used for ingesting data: +You can make them cluster-manager-eligible data nodes that will also be used for ingesting data: ```yml node.roles: [ data, ingest ] @@ -132,9 +135,9 @@ Now that you've configured the network hosts, you need to configure the discover Zen Discovery is the built-in, default mechanism that uses [unicast](https://en.wikipedia.org/wiki/Unicast) to find other nodes in the cluster. -You can generally just add all your master-eligible nodes to the `discovery.seed_hosts` array. When a node starts up, it finds the other master-eligible nodes, determines which one is the master, and asks to join the cluster. +You can generally just add all of your cluster-manager-eligible nodes to the `discovery.seed_hosts` array. When a node starts up, it finds the other cluster-manager-eligible nodes, determines which one is the cluster manager, and asks to join the cluster. -For example, for `opensearch-master` the line looks something like this: +For example, for `opensearch-cluster_manager` the line looks something like this: ```yml discovery.seed_hosts: ["", "", ""] @@ -161,8 +164,8 @@ curl -XGET https://:9200/_cat/nodes?v -u 'admin:admin' --insecure ``` ``` -ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name -x.x.x.x 13 61 0 0.02 0.04 0.05 mi * opensearch-master +ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role cluster_manager name +x.x.x.x 13 61 0 0.02 0.04 0.05 mi * opensearch-cluster_manager x.x.x.x 16 60 0 0.06 0.05 0.05 md - opensearch-d1 x.x.x.x 34 38 0 0.12 0.07 0.06 md - opensearch-d2 x.x.x.x 23 38 0 0.12 0.07 0.06 md - opensearch-c1 diff --git a/_opensearch/index-templates.md b/_opensearch/index-templates.md index 743d2a135e..326ad05965 100644 --- a/_opensearch/index-templates.md +++ b/_opensearch/index-templates.md @@ -131,7 +131,7 @@ You can create multiple index templates for your indexes. If the index name matc The settings from the more recently created index templates override the settings of older index templates. So, you can first define a few common settings in a generic template that can act as a catch-all and then add more specialized settings as required. -An even better approach is to explicitly specify template priority using the `priority` parameter. OpenSearch applies templates with lower priority numbers first and then overrides them with templates with higher priority numbers. +An even better approach is to explicitly specify template priority using the `order` parameter. OpenSearch applies templates with lower priority numbers first and then overrides them with templates with higher priority numbers. 
For example, say you have the following two templates that both match the `logs-2020-01-02` index and there’s a conflict in the `number_of_shards` field: diff --git a/_opensearch/install/docker-security.md b/_opensearch/install/docker-security.md index c6e891f969..fa81af4984 100644 --- a/_opensearch/install/docker-security.md +++ b/_opensearch/install/docker-security.md @@ -24,7 +24,7 @@ services: - cluster.name=opensearch-cluster - node.name=opensearch-node1 - discovery.seed_hosts=opensearch-node1,opensearch-node2 - - cluster.initial_master_nodes=opensearch-node1,opensearch-node2 + - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2 - bootstrap.memory_lock=true # along with the memlock settings below, disables swapping - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # minimum and maximum Java heap size, recommend setting both to 50% of system RAM - network.host=0.0.0.0 # required if not using the demo security configuration @@ -60,7 +60,7 @@ services: - cluster.name=opensearch-cluster - node.name=opensearch-node2 - discovery.seed_hosts=opensearch-node1,opensearch-node2 - - cluster.initial_master_nodes=opensearch-node1,opensearch-node2 + - cluster.initial_cluster_manager_nodes=opensearch-node1,opensearch-node2 - bootstrap.memory_lock=true - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" - network.host=0.0.0.0 diff --git a/_opensearch/install/rpm.md b/_opensearch/install/rpm.md deleted file mode 100644 index b24312a561..0000000000 --- a/_opensearch/install/rpm.md +++ /dev/null @@ -1,157 +0,0 @@ ---- -layout: default -title: RPM -parent: Install OpenSearch -nav_order: 51 ---- - -# RPM - -The RPM Package Manager (RPM) installation provides everything you need to run OpenSearch inside Red Hat or Red Hat-based Linux Distributions. - -RPM supports CentOS 7 and 8, and Amazon Linux 2. If you have your own Java installation and set `JAVA_HOME` in your terminal application, macOS works, as well. - -There are two methods for installing OpenSearch on RPM: - -## Manual method - - -1. Download the RPM package directly from the [OpenSearch downloads page](https://opensearch.org/downloads.html){:target='\_blank'}. The RPM package can be download both as `x64` and `arm64`. - -2. Import the public GPG key. This key verifies that the your OpenSearch instance is signed. - - ```bash - sudo rpm --import https://artifacts.opensearch.org/publickeys/opensearch.pgp - ``` - -3. On your host, use `sudo yum install` or `sudo rpm -ivh` to install the package. - - **x64** - - ```bash - sudo yum install opensearch-{{site.opensearch_version}}-linux-x64.rpm - sudo yum install opensearch-dashboards-{{site.opensearch_version}}-linux-x64.rpm - ``` - - ```bash - sudo rpm -ivh opensearch-{{site.opensearch_version}}-linux-x64.rpm - sudo rpm -ivh opensearch-dashboards-{{site.opensearch_version}}-linux-x64.rpm - ``` - - **arm64** - - ```bash - sudo yum install opensearch-{{site.opensearch_version}}-linux-x64.rpm - sudo yum install opensearch-dashboards-{{site.opensearch_version}}-linux-arm64.rpm - ``` - - ```bash - sudo rpm -ivh opensearch-{{site.opensearch_version}}-linux-x64.rpm - sudo rpm -ivh opensearch-dashboards-{{site.opensearch_version}}-linux-arm64.rpm - ``` - - Once complete, you can run OpenSearch inside your distribution. - -## YUM method - -YUM, an RPM package management tool, allows you to pull the RPM package from the YUM repository library. - -1. 
Create a repository file for both OpenSearch and OpenSearch Dashboards: - - ```bash - sudo curl -SL https://artifacts.opensearch.org/releases/bundle/opensearch/2.x/opensearch-2.x.repo -o /etc/yum.repos.d/opensearch-2.x.repo - ``` - - ```bash - sudo curl -SL https://artifacts.opensearch.org/releases/bundle/opensearch-dashboards/2.x/opensearch-dashboards-2.x.repo -o /etc/yum.repos.d/opensearch-dashboards-2.x.repo - ``` - - To verify that the repos appear in your repo list, use `sudo yum repolist`. - -2. Clean your YUM cache, to ensure a smooth installation: - - ```bash - sudo yum clean all - ``` - -3. With the repository file downloaded, list all available versions of OpenSearch: - - ```bash - sudo yum list | grep opensearch - ``` - -4. Choose the version of OpenSearch you want to install: - - ```bash - sudo yum install opensearch - sudo yum install opensearch-dashboards - ``` - - Unless otherwise indicated, the highest minor version of OpenSearch installs. - - To install a specific version of OpenSearch: - - ```bash - sudo yum install 'opensearch-{{site.opensearch_version}}' - ``` - -5. During installation, the installer stops to see if the GPG key matches the OpenSearch project. Verify that the `Fingerprint` matches the following: - - ```bash - Fingerprint: c5b7 4989 65ef d1c2 924b a9d5 39d3 1987 9310 d3fc - ``` - - If correct, enter `yes` or `y`. The OpenSearch installation continues. - - Once complete, you can run OpenSearch inside your distribution. - -## Run OpenSearch - -1. Run OpenSearch and OpenSearch Dashboards using `systemctl`. - - ```bash - sudo systemctl start opensearch.service - sudo systemctl start opensearch-dashboards.service - ``` - -2. Send requests to the server to verify that OpenSearch is running: - - ```bash - curl -XGET https://localhost:9200 -u 'admin:admin' --insecure - curl -XGET https://localhost:9200/_cat/config?v -u 'admin:admin' --insecure - ``` - -3. To stop running OpenSearch, enter: - - ```bash - sudo systemctl stop opensearch.service - sudo systemctl stop opensearch-dashboards.service - ``` - - -## *(Optional)* Set up Performance Analyzer - -When enabled, the Performance Analyzer plugin collects data related to the performance of your OpenSearch instance. To start the Performance Analyzer plugin, enter: - -```bash -sudo systemctl start opensearch-performance-analyzer.service -``` - -To stop the Performance Analyzer, enter: - -```bash -sudo systemctl stop opensearch-performance-analyzer.service -``` - -## Upgrade RPM - -You can upgrade your RPM OpenSearch instance both manually and through YUM. - - -### Manual - -Download the new version of OpenSearch you want to use, and then use `rmp -Uvh` to upgrade. - -### YUM - -To upgrade to the latest version of OpenSearch with YUM, use `sudo yum update`. You can also upgrade to a specific OpenSearch version by using `sudo yum update opensearch-`. diff --git a/_opensearch/query-dsl/full-text.md b/_opensearch/query-dsl/full-text.md index 5614c01fd3..e21efaa1b9 100644 --- a/_opensearch/query-dsl/full-text.md +++ b/_opensearch/query-dsl/full-text.md @@ -7,7 +7,7 @@ nav_order: 40 # Full-text queries -This page lists all full-text query types and common options. There are many options for full-text queries, each with its own subtle behavior difference, so the best method to ensure that you obtain useful search results is to test different queries against representative indexes and verify the outputs individually. +This page lists all full-text query types and common options. 
Given the sheer number of options and subtle behaviors, the best method of ensuring useful search results is to test different queries against representative indexes and verify the output. --- @@ -405,13 +405,12 @@ GET _search ## Options -You can increase the specificity of your query by adding the following options. - Option | Valid values | Description :--- | :--- | :--- `allow_leading_wildcard` | Boolean | Whether `*` and `?` are allowed as the first character of a search term. The default is true. `analyze_wildcard` | Boolean | Whether OpenSearch should attempt to analyze wildcard terms. Some analyzers do a poor job at this task, so the default is false. `analyzer` | `standard, simple, whitespace, stop, keyword, pattern, <language>, fingerprint` | The analyzer you want to use for the query. Different analyzers have different character filters, tokenizers, and token filters. The `stop` analyzer, for example, removes stop words (e.g. "an," "but," "this") from the query string. +`auto_generate_synonyms_phrase_query` | Boolean | A value of true (default) automatically generates [phrase queries](https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/search/PhraseQuery.html) for multi-term synonyms. For example, if you have the synonym `"ba, batting average"` and search for "ba," OpenSearch searches for `ba OR "batting average"` (if this option is true) or `ba OR (batting AND average)` (if this option is false). `boost` | Floating-point | Boosts the clause by the given multiplier. Useful for weighing clauses in compound queries. The default is 1.0. `cutoff_frequency` | Between `0.0` and `1.0` or a positive integer | This value lets you define high and low frequency terms based on number of occurrences in the index. Numbers between 0 and 1 are treated as a percentage. For example, 0.10 is 10%. This value means that if a word occurs within the search field in more than 10% of the documents on the shard, OpenSearch considers the word "high frequency" and deemphasizes it when calculating search score.

Because this setting is *per shard*, testing its impact on search results can be challenging unless a cluster has many documents. `enable_position_increments` | Boolean | When true, result queries are aware of position increments. This setting is useful when the removal of stop words leaves an unwanted "gap" between terms. The default is true. @@ -434,16 +433,3 @@ Option | Valid values | Description `time_zone` | UTC offset | The time zone to use (e.g. `-08:00`) if the query string contains a date range (e.g. `"query": "wind rises release_date[2012-01-01 TO 2014-01-01]"`). The default is `UTC`. `type` | `best_fields, most_fields, cross_fields, phrase, phrase_prefix` | Determines how OpenSearch executes the query and scores the results. The default is `best_fields`. `zero_terms_query` | `none, all` | If the analyzer removes all terms from a query string, whether to match no documents (default) or all documents. For example, the `stop` analyzer removes all terms from the string "an but this." - -#### Multi-term synonym query - -If you are searching for multiple terms such as synonyms, but you only require one term to match, you can use the `auto_generate_synonyms_phrase_query` option to perform your search with logical `OR` operation. This automatically searches for multi-term synonyms in a phrase query. - -This option takes boolean values; `true` is the default value. - -Consider this example search for the synonym `"ba, batting average"`: - -* set to `true` - OpenSearch looks for `ba OR "batting average"` -* set to `false` - OpenSearch looks for `ba OR (batting AND average)`. - -For more details about multi-term synonyms, see [phrase queries](https://lucene.apache.org/core/8_9_0/core/org/apache/lucene/search/PhraseQuery.html). diff --git a/_opensearch/query-dsl/index.md b/_opensearch/query-dsl/index.md index 3b6dd4e074..f3167f5219 100644 --- a/_opensearch/query-dsl/index.md +++ b/_opensearch/query-dsl/index.md @@ -12,31 +12,9 @@ redirect_from: # Query DSL -OpenSearch provides a query domain-specific language (DSL) that you can use to search with more options than a simple search with an HTTP request parameter alone. The query DSL uses the HTTP request body, so you can more easily customize your queries to get the exact results that you want. +While you can use HTTP request parameters to perform simple searches, you can also use the OpenSearch query domain-specific language (DSL), which provides a wider range of search options. The query DSL uses the HTTP request body, so you can more easily customize your queries to get the exact results that you want. -The OpenSearch query DSL provides three query options: term-level queries, full-text queries, and boolean queries. You can even perform more complicated searches by using different elements from each variety to find whatever data you need. - -## DSL query types - -OpenSearch supports two types of queries when you search for data: term-level queries and full-text queries. - -The following table describes the differences between them. - -| Metrics | Term-level queries | Full-text queries -:--- | :--- | :--- -*Query results* | Term-level queries answer which documents match a query. | Full-text queries answer how well the documents match a query. -*Analyzer* | The search term isn't analyzed. This means that the term query searches for your search term as it is. | The search term is analyzed by the same analyzer that was used for the specific field of the document at the time it was indexed. 
This means that your search term goes through the same analysis process as the document's field. -*Relevance* | Term-level queries simply return documents that match without sorting them based on the relevance score. They still calculate the relevance score, but this score is the same for all the documents that are returned. | Full-text queries calculate a relevance score for each match and sort the results by decreasing order of relevance. -*Use Case* | Use term-level queries when you want to match exact values, such as numbers, dates, tags, and so on, and don't need the matches to be sorted by relevance. | Use full-text queries to match text fields and sort by relevance after taking into account factors like casing and stemming variants. - -OpenSearch uses a probabilistic ranking framework called Okapi BM25 to calculate relevance scores. To learn more about Okapi BM25, see [Wikipedia](https://en.wikipedia.org/wiki/Okapi_BM25). -{: .note } - -The following examples show the difference between a simple HTTP search and a search with query DSL. - -## Example: HTTP simple search - -The following request performs a simple search for a `speaker` field that has a value of `queen`. +For example, the following request performs a simple search to search for a `speaker` field that has a value of `queen`. **Sample request** ```json @@ -77,9 +55,7 @@ GET _search?q=speaker:queen } ``` -## Example: Query DSL search - -With a query DSL search, you can include an HTTP request body to look for results more tailored to your needs. The following example shows how to search for `speaker` and `text_entry` fields that have a value of `QUEEN`. +With query DSL, however, you can include an HTTP request body to look for results more tailored to your needs. The following example shows how to search for `speaker` and `text_entry` fields that have a value of `QUEEN`. **Sample request** ```json @@ -142,4 +118,5 @@ With a query DSL search, you can include an HTTP request body to look for result ] } } -``` \ No newline at end of file +``` +The OpenSearch query DSL comes in three varieties: term-level queries, full-text queries, and boolean queries. You can even perform more complicated searches by using different elements from each variety to find whatever data you need. diff --git a/_opensearch/query-dsl/term.md b/_opensearch/query-dsl/term.md index f0e47ba5bc..d4a836d4cb 100644 --- a/_opensearch/query-dsl/term.md +++ b/_opensearch/query-dsl/term.md @@ -7,6 +7,20 @@ nav_order: 30 # Term-level queries +OpenSearch supports two types of queries when you search for data: term-level queries and full-text queries. + +The following table describes the differences between them: + +| | Term-level queries | Full-text queries +:--- | :--- | :--- +*Description* | Term-level queries answer which documents match a query. | Full-text queries answer how well the documents match a query. +*Analyzer* | The search term isn't analyzed. This means that the term query searches for your search term as it is. | The search term is analyzed by the same analyzer that was used for the specific field of the document at the time it was indexed. This means that your search term goes through the same analysis process that the document's field did. +*Relevance* | Term-level queries simply return documents that match without sorting them based on the relevance score. They still calculate the relevance score, but this score is the same for all the documents that are returned. 
| Full-text queries calculate a relevance score for each match and sort the results by decreasing order of relevance. +*Use Case* | Use term-level queries when you want to match exact values such as numbers, dates, tags, and so on, and don't need the matches to be sorted by relevance. | Use full-text queries to match text fields and sort by relevance after taking into account factors like casing and stemming variants. + +OpenSearch uses a probabilistic ranking framework called Okapi BM25 to calculate relevance scores. To learn more about Okapi BM25, see [Wikipedia](https://en.wikipedia.org/wiki/Okapi_BM25). +{: .note } + Assume that you have the complete works of Shakespeare indexed in an OpenSearch cluster. We use a term-level query to search for the phrase "To be, or not to be" in the `text_entry` field: ```json @@ -214,12 +228,7 @@ The search query “HAMLET” is also searched literally. So, to get a match on --- -# Term-level query operations - -This section provides examples of term-level query operations that you can use for specific search use cases. - - -## Single term +## Term Use the `term` query to search for an exact term in a field. @@ -236,9 +245,9 @@ GET shakespeare/_search } ``` -## Multiple terms +## Terms -Use the `terms` operation to search for multiple value matches for the same query field. +Use the `terms` query to search for multiple terms in the same field. ```json GET shakespeare/_search @@ -255,86 +264,8 @@ GET shakespeare/_search ``` You get back documents that match any of the terms. -#### Sample response - -```json -{ - "took" : 11, - "timed_out" : false, - "_shards" : { - "total" : 1, - "successful" : 1, - "skipped" : 0, - "failed" : 0 - }, - "hits" : { - "total" : { - "value" : 2, - "relation" : "eq" - }, - "max_score" : 1.0, - "hits" : [ - { - "_index" : "shakespeare", - "_id" : "61808", - "_score" : 1.0, - "_source" : { - "type" : "line", - "line_id" : 61809, - "play_name" : "Merchant of Venice", - "speech_number" : 33, - "line_number" : "1.3.115", - "speaker" : "SHYLOCK", - "text_entry" : "Go to, then; you come to me, and you say" - } - }, - { - "_index" : "shakespeare", - "_id" : "61809", - "_score" : 1.0, - "_source" : { - "type" : "line", - "line_id" : 61810, - "play_name" : "Merchant of Venice", - "speech_number" : 33, - "line_number" : "1.3.116", - "speaker" : "SHYLOCK", - "text_entry" : "Shylock, we would have moneys: you say so;" - } - } - ] - } -} -``` - -## Terms lookup query (TLQ) - -Use a terms lookup query (TLQ) to retrieve multiple field values in a specific document within a specific index. Use the `terms` operation and specify the index name, document ID and field you want to look up with the `path` parameter. - -Parameter | Behavior -:--- | :--- -`index` | The index name that contains the document you want search. -`id` | The exact document to query for terms. -`path` | The field name for the query. - -To get all the lines from a Shakespeare play for a role (or roles) specified in the index `play-assignments` for the document `42`: - -```json -GET shakespeare/_search -{ - "query": { - "terms": { - "speaker": { - "index": "play-assignments", - "id": "42", - "path": "role" - } - } - } -} -``` -## Document IDs +## IDs Use the `ids` query to search for one or more document ID values. @@ -352,7 +283,7 @@ GET shakespeare/_search } ``` -## Range of values +## Range Use the `range` query to search for a range of values in a field. @@ -433,7 +364,7 @@ GET products/_search The keyword `now` refers to the current date and time. 
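To see `now` in action, the following sketch uses date math in a `range` query to match documents from the past year. The `created` date field is assumed for illustration and isn't a field from the examples above:

```json
GET products/_search
{
  "query": {
    "range": {
      "created": {
        "gte": "now-1y/d",
        "lte": "now"
      }
    }
  }
}
```

In this sketch, `now-1y/d` means one year before the current time, rounded down to the start of that day.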
-## Multiple terms by prefix +## Prefix Use the `prefix` query to search for terms that begin with a specific prefix. @@ -448,7 +379,7 @@ GET shakespeare/_search } ``` -## All instances of a specific field in a document +## Exists Use the `exists` query to search for documents that contain a specific field. @@ -463,7 +394,7 @@ GET shakespeare/_search } ``` -## Wildcard patterns +## Wildcards Use wildcard queries to search for terms that match a wildcard pattern. @@ -491,7 +422,7 @@ If we change `*` to `?`, we get no matches, because `?` refers to a single chara Wildcard queries tend to be slow because they need to iterate over a lot of terms. Avoid placing wildcard characters at the beginning of a query because it could be a very expensive operation in terms of both resources and time. -## Regular expressions (Regex) +## Regex Use the `regexp` query to search for terms that match a regular expression. diff --git a/_opensearch/rest-api/cat/cat-allocation.md b/_opensearch/rest-api/cat/cat-allocation.md index 02bc8a1041..6c5c0aa7ab 100644 --- a/_opensearch/rest-api/cat/cat-allocation.md +++ b/_opensearch/rest-api/cat/cat-allocation.md @@ -58,6 +58,6 @@ The following response shows that 8 shards are allocated to each the two nodes a ```json shards | disk.indices | disk.used | disk.avail | disk.total | disk.percent host | ip | node - 8 | 989.4kb | 25.9gb | 32.4gb | 58.4gb | 44 172.18.0.4 | 172.18.0.4 | opensearch-node1 - 8 | 962.4kb | 25.9gb | 32.4gb | 58.4gb | 44 172.18.0.3 | 172.18.0.3 | opensearch-node2 + 8 | 989.4kb | 25.9gb | 32.4gb | 58.4gb | 44 172.18.0.4 | 172.18.0.4 | odfe-node1 + 8 | 962.4kb | 25.9gb | 32.4gb | 58.4gb | 44 172.18.0.3 | 172.18.0.3 | odfe-node2 ``` diff --git a/_opensearch/rest-api/cat/cat-master.md b/_opensearch/rest-api/cat/cat-cluster_manager.md similarity index 60% rename from _opensearch/rest-api/cat/cat-master.md rename to _opensearch/rest-api/cat/cat-cluster_manager.md index 7895b39f8e..bb18f6082b 100644 --- a/_opensearch/rest-api/cat/cat-master.md +++ b/_opensearch/rest-api/cat/cat-cluster_manager.md @@ -1,39 +1,39 @@ --- layout: default -title: cat master +title: CAT cluster manager parent: CAT grand_parent: REST API reference nav_order: 30 has_children: false --- -# cat master +# CAT cluster_manager Introduced 1.0 {: .label .label-purple } -The cat master operation lists information that helps identify the elected master node. +The cat cluster manager operation lists information that helps identify the elected cluster manager node. ## Example ``` -GET _cat/master?v +GET _cat/cluster_manager?v ``` ## Path and HTTP methods ``` -GET _cat/master +GET _cat/cluster_manager ``` ## URL parameters -All cat master URL parameters are optional. +All cat cluster manager URL parameters are optional. In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/opensearch/rest-api/cat/index#common-url-parameters), you can specify the following parameters: Parameter | Type | Description :--- | :--- | :--- -master_timeout | Time | The amount of time to wait for a connection to the master node. Default is 30 seconds. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. 
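For example, to combine the verbose `v` flag with an explicit timeout (the timeout value here is illustrative):

```
GET _cat/cluster_manager?v&cluster_manager_timeout=10s
```

If the elected cluster manager doesn't respond within the timeout, the request returns an error instead of waiting indefinitely.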
## Response diff --git a/_opensearch/rest-api/cat/cat-field-data.md b/_opensearch/rest-api/cat/cat-field-data.md index 3170ff4e16..d86d17a1a5 100644 --- a/_opensearch/rest-api/cat/cat-field-data.md +++ b/_opensearch/rest-api/cat/cat-field-data.md @@ -54,6 +54,6 @@ The following response shows the memory size for all fields as 284 bytes: ```json id host ip node field size -1vo54NuxSxOrbPEYdkSF0w 172.18.0.4 172.18.0.4 opensearch-node1 _id 284b -ZaIkkUd4TEiAihqJGkp5CA 172.18.0.3 172.18.0.3 opensearch-node2 _id 284b +1vo54NuxSxOrbPEYdkSF0w 172.18.0.4 172.18.0.4 odfe-node1 _id 284b +ZaIkkUd4TEiAihqJGkp5CA 172.18.0.3 172.18.0.3 odfe-node2 _id 284b ``` diff --git a/_opensearch/rest-api/cat/cat-health.md b/_opensearch/rest-api/cat/cat-health.md index f261a3f814..476681f8c1 100644 --- a/_opensearch/rest-api/cat/cat-health.md +++ b/_opensearch/rest-api/cat/cat-health.md @@ -40,5 +40,5 @@ ts | Boolean | If true, returns HH:MM:SS and Unix epoch timestamps. Default is t GET _cat/health?v&time=5d epoch | timestamp | cluster | status | node.total | node.data | shards | pri | relo | init | unassign | pending_tasks | max_task_wait_time | active_shards_percent -1624248112 | 04:01:52 | opensearch-cluster | green | 2 | 2 | 16 | 8 | 0 | 0 | 0 | 0 | - | 100.0% +1624248112 | 04:01:52 | odfe-cluster | green | 2 | 2 | 16 | 8 | 0 | 0 | 0 | 0 | - | 100.0% ``` diff --git a/_opensearch/rest-api/cat/cat-nodeattrs.md b/_opensearch/rest-api/cat/cat-nodeattrs.md index 5f8e488dcb..c06ac527b1 100644 --- a/_opensearch/rest-api/cat/cat-nodeattrs.md +++ b/_opensearch/rest-api/cat/cat-nodeattrs.md @@ -41,5 +41,5 @@ master_timeout | Time | The amount of time to wait for a connection to the maste ```json node | host | ip | attr | value -opensearch-node2 | 172.18.0.3 | 172.18.0.3 | testattr | test +odfe-node2 | 172.18.0.3 | 172.18.0.3 | testattr | test ``` diff --git a/_opensearch/rest-api/cat/cat-nodes.md b/_opensearch/rest-api/cat/cat-nodes.md index a7e308c72f..bd8d794fc3 100644 --- a/_opensearch/rest-api/cat/cat-nodes.md +++ b/_opensearch/rest-api/cat/cat-nodes.md @@ -13,7 +13,7 @@ Introduced 1.0 The cat nodes operation lists node-level information, including node roles and load metrics. -A few important node metrics are `pid`, `name`, `master`, `ip`, `port`, `version`, `build`, `jdk`, along with `disk`, `heap`, `ram`, and `file_desc`. +A few important node metrics are `pid`, `name`, `cluster_manager`, `ip`, `port`, `version`, `build`, `jdk`, along with `disk`, `heap`, `ram`, and `file_desc`. ## Example @@ -37,8 +37,8 @@ Parameter | Type | Description :--- | :--- | :--- bytes | Byte size | Specify the units for byte size. For example, `7kb` or `6gb`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). full_id | Boolean | If true, return the full node ID. If false, return the shortened node ID. Defaults to false. -local | Boolean | Whether to return information from the local node only instead of from the master node. Default is false. -master_timeout | Time | The amount of time to wait for a connection to the master node. Default is 30 seconds. +local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. time | Time | Specify the units for time. For example, `5d` or `7h`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). 
include_unloaded_segments | Boolean | Whether to include information from segments not loaded into memory. Default is false. @@ -46,7 +46,9 @@ include_unloaded_segments | Boolean | Whether to include information from segmen ## Response ```json -ip | heap.percent | ram.percent | cpu load_1m | load_5m | load_15m | node.role | master | name + +ip | heap.percent | ram.percent | cpu load_1m | load_5m | load_15m | node.role | cluster_manager | name + 172.18.0.3 | 31 | 97 | 3 | 0.03 | 0.10 | 0.14 dimr | * | opensearch-node2 172.18.0.4 | 45 | 97 | 3 | 0.19 | 0.14 | 0.15 dimr | - | opensearch-node1 ``` diff --git a/_opensearch/rest-api/cat/cat-pending-tasks.md b/_opensearch/rest-api/cat/cat-pending-tasks.md index 37cf82ac6a..f727693502 100644 --- a/_opensearch/rest-api/cat/cat-pending-tasks.md +++ b/_opensearch/rest-api/cat/cat-pending-tasks.md @@ -33,8 +33,8 @@ In addition to the [common URL parameters]({{site.url}}{{site.baseurl}}/opensear Parameter | Type | Description :--- | :--- | :--- -local | Boolean | Whether to return information from the local node only instead of from the master node. Default is false. -master_timeout | Time | The amount of time to wait for a connection to the master node. Default is 30 seconds. +local | Boolean | Whether to return information from the local node only instead of from the cluster_manager node. Default is false. +cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds. time | Time | Specify the units for time. For example, `5d` or `7h`. For more information, see [Supported units]({{site.url}}{{site.baseurl}}/opensearch/units/). diff --git a/_opensearch/rest-api/cat/cat-plugins.md b/_opensearch/rest-api/cat/cat-plugins.md index f6e6b16cd6..c49829141e 100644 --- a/_opensearch/rest-api/cat/cat-plugins.md +++ b/_opensearch/rest-api/cat/cat-plugins.md @@ -41,24 +41,24 @@ master_timeout | Time | The amount of time to wait for a connection to the maste ```json name component version -opensearch-node2 opendistro-alerting 1.13.1.0 -opensearch-node2 opendistro-anomaly-detection 1.13.0.0 -opensearch-node2 opendistro-asynchronous-search 1.13.0.1 -opensearch-node2 opendistro-index-management 1.13.2.0 -opensearch-node2 opendistro-job-scheduler 1.13.0.0 -opensearch-node2 opendistro-knn 1.13.0.0 -opensearch-node2 opendistro-performance-analyzer 1.13.0.0 -opensearch-node2 opendistro-reports-scheduler 1.13.0.0 -opensearch-node2 opendistro-sql 1.13.2.0 -opensearch-node2 opendistro_security 1.13.1.0 -opensearch-node1 opendistro-alerting 1.13.1.0 -opensearch-node1 opendistro-anomaly-detection 1.13.0.0 -opensearch-node1 opendistro-asynchronous-search 1.13.0.1 -opensearch-node1 opendistro-index-management 1.13.2.0 -opensearch-node1 opendistro-job-scheduler 1.13.0.0 -opensearch-node1 opendistro-knn 1.13.0.0 -opensearch-node1 opendistro-performance-analyzer 1.13.0.0 -opensearch-node1 opendistro-reports-scheduler 1.13.0.0 -opensearch-node1 opendistro-sql 1.13.2.0 -opensearch-node1 opendistro_security 1.13.1.0 +odfe-node2 opendistro-alerting 1.13.1.0 +odfe-node2 opendistro-anomaly-detection 1.13.0.0 +odfe-node2 opendistro-asynchronous-search 1.13.0.1 +odfe-node2 opendistro-index-management 1.13.2.0 +odfe-node2 opendistro-job-scheduler 1.13.0.0 +odfe-node2 opendistro-knn 1.13.0.0 +odfe-node2 opendistro-performance-analyzer 1.13.0.0 +odfe-node2 opendistro-reports-scheduler 1.13.0.0 +odfe-node2 opendistro-sql 1.13.2.0 +odfe-node2 opendistro_security 1.13.1.0 +odfe-node1 opendistro-alerting 1.13.1.0 +odfe-node1 
diff --git a/_opensearch/rest-api/cat/cat-plugins.md b/_opensearch/rest-api/cat/cat-plugins.md
index f6e6b16cd6..c49829141e 100644
--- a/_opensearch/rest-api/cat/cat-plugins.md
+++ b/_opensearch/rest-api/cat/cat-plugins.md
@@ -41,24 +41,24 @@ master_timeout | Time | The amount of time to wait for a connection to the maste
```json
name component version
-opensearch-node2 opendistro-alerting 1.13.1.0
-opensearch-node2 opendistro-anomaly-detection 1.13.0.0
-opensearch-node2 opendistro-asynchronous-search 1.13.0.1
-opensearch-node2 opendistro-index-management 1.13.2.0
-opensearch-node2 opendistro-job-scheduler 1.13.0.0
-opensearch-node2 opendistro-knn 1.13.0.0
-opensearch-node2 opendistro-performance-analyzer 1.13.0.0
-opensearch-node2 opendistro-reports-scheduler 1.13.0.0
-opensearch-node2 opendistro-sql 1.13.2.0
-opensearch-node2 opendistro_security 1.13.1.0
-opensearch-node1 opendistro-alerting 1.13.1.0
-opensearch-node1 opendistro-anomaly-detection 1.13.0.0
-opensearch-node1 opendistro-asynchronous-search 1.13.0.1
-opensearch-node1 opendistro-index-management 1.13.2.0
-opensearch-node1 opendistro-job-scheduler 1.13.0.0
-opensearch-node1 opendistro-knn 1.13.0.0
-opensearch-node1 opendistro-performance-analyzer 1.13.0.0
-opensearch-node1 opendistro-reports-scheduler 1.13.0.0
-opensearch-node1 opendistro-sql 1.13.2.0
-opensearch-node1 opendistro_security 1.13.1.0
+odfe-node2 opendistro-alerting 1.13.1.0
+odfe-node2 opendistro-anomaly-detection 1.13.0.0
+odfe-node2 opendistro-asynchronous-search 1.13.0.1
+odfe-node2 opendistro-index-management 1.13.2.0
+odfe-node2 opendistro-job-scheduler 1.13.0.0
+odfe-node2 opendistro-knn 1.13.0.0
+odfe-node2 opendistro-performance-analyzer 1.13.0.0
+odfe-node2 opendistro-reports-scheduler 1.13.0.0
+odfe-node2 opendistro-sql 1.13.2.0
+odfe-node2 opendistro_security 1.13.1.0
+odfe-node1 opendistro-alerting 1.13.1.0
+odfe-node1 opendistro-anomaly-detection 1.13.0.0
+odfe-node1 opendistro-asynchronous-search 1.13.0.1
+odfe-node1 opendistro-index-management 1.13.2.0
+odfe-node1 opendistro-job-scheduler 1.13.0.0
+odfe-node1 opendistro-knn 1.13.0.0
+odfe-node1 opendistro-performance-analyzer 1.13.0.0
+odfe-node1 opendistro-reports-scheduler 1.13.0.0
+odfe-node1 opendistro-sql 1.13.2.0
+odfe-node1 opendistro_security 1.13.1.0
```
diff --git a/_opensearch/rest-api/cat/cat-recovery.md b/_opensearch/rest-api/cat/cat-recovery.md
index 2e197c8c04..548456c0b9 100644
--- a/_opensearch/rest-api/cat/cat-recovery.md
+++ b/_opensearch/rest-api/cat/cat-recovery.md
@@ -54,6 +54,6 @@ time | Time | Specify the units for time. For example, `5d` or `7h`. For more in
```json
index | shard | time | type | stage | source_host | source_node | target_host | target_node | repository | snapshot | files | files_recovered | files_percent | files_total | bytes | bytes_recovered | bytes_percent | bytes_total | translog_ops | translog_ops_recovered | translog_ops_percent
-movies | 0 | 117ms | empty_store | done | n/a | n/a | 172.18.0.4 | opensearch-node1 | n/a | n/a | 0 | 0 | 0.0% | 0 | 0 | 0 | 0.0% | 0 | 0 | 0 | 100.0%
-movies | 0 | 382ms | peer | done | 172.18.0.4 | opensearch-node1 | 172.18.0.3 | opensearch-node2 | n/a | n/a | 1 | 1 | 100.0% | 1 | 208 | 208 | 100.0% | 208 | 1 | 1 | 100.0%
+movies | 0 | 117ms | empty_store | done | n/a | n/a | 172.18.0.4 | odfe-node1 | n/a | n/a | 0 | 0 | 0.0% | 0 | 0 | 0 | 0.0% | 0 | 0 | 0 | 100.0%
+movies | 0 | 382ms | peer | done | 172.18.0.4 | odfe-node1 | 172.18.0.3 | odfe-node2 | n/a | n/a | 1 | 1 | 100.0% | 1 | 208 | 208 | 100.0% | 208 | 1 | 1 | 100.0%
```
diff --git a/_opensearch/rest-api/cat/cat-shards.md b/_opensearch/rest-api/cat/cat-shards.md
index 8a54172e1f..00d4b55481 100644
--- a/_opensearch/rest-api/cat/cat-shards.md
+++ b/_opensearch/rest-api/cat/cat-shards.md
@@ -55,6 +55,6 @@ time | Time | Specify the units for time. For example, `5d` or `7h`. For more in
```json
index | shard | prirep | state | docs | store | ip | | node
-plugins | 0 | p | STARTED | 0 | 208b | 172.18.0.4 | opensearch-node1
-plugins | 0 | r | STARTED | 0 | 208b | 172.18.0.3 | opensearch-node2
+plugins | 0 | p | STARTED | 0 | 208b | 172.18.0.4 | odfe-node1
+plugins | 0 | r | STARTED | 0 | 208b | 172.18.0.3 | odfe-node2
```
diff --git a/_opensearch/rest-api/cat/cat-snapshots.md b/_opensearch/rest-api/cat/cat-snapshots.md
index d4cce9fbb4..71aa30cf6d 100644
--- a/_opensearch/rest-api/cat/cat-snapshots.md
+++ b/_opensearch/rest-api/cat/cat-snapshots.md
@@ -41,6 +41,6 @@ time | Time | Specify the units for time. For example, `5d` or `7h`. For more in
```json
index | shard | prirep | state | docs | store | ip | | node
-plugins | 0 | p | STARTED | 0 | 208b | 172.18.0.4 | opensearch-node1
-plugins | 0 | r | STARTED | 0 | 208b | 172.18.0.3 | opensearch-node2
+plugins | 0 | p | STARTED | 0 | 208b | 172.18.0.4 | odfe-node1
+plugins | 0 | r | STARTED | 0 | 208b | 172.18.0.3 | odfe-node2
```
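
The shard listings above would come from a request of this shape (a sketch, assuming cat shards accepts the same common `v` and byte-unit parameters documented for the other cat operations):

```
GET _cat/shards?v&bytes=b
```
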
diff --git a/_opensearch/rest-api/cat/cat-tasks.md b/_opensearch/rest-api/cat/cat-tasks.md
index d8d4aae0d9..2d30836b3d 100644
--- a/_opensearch/rest-api/cat/cat-tasks.md
+++ b/_opensearch/rest-api/cat/cat-tasks.md
@@ -43,5 +43,5 @@ time | Time | Specify the units for time. For example, `5d` or `7h`. For more in
```json
action | task_id | parent_task_id | type | start_time | timestamp | running_time | ip | node
-cluster:monitor/tasks/lists | 1vo54NuxSxOrbPEYdkSF0w:168062 | - | transport | 1624337809471 | 04:56:49 | 489.5ms | 172.18.0.4 | opensearch-node1
+cluster:monitor/tasks/lists | 1vo54NuxSxOrbPEYdkSF0w:168062 | - | transport | 1624337809471 | 04:56:49 | 489.5ms | 172.18.0.4 | odfe-node1
```
diff --git a/_opensearch/rest-api/cat/cat-thread-pool.md b/_opensearch/rest-api/cat/cat-thread-pool.md
index f2676140ec..e15b8e2705 100644
--- a/_opensearch/rest-api/cat/cat-thread-pool.md
+++ b/_opensearch/rest-api/cat/cat-thread-pool.md
@@ -47,7 +47,7 @@ master_timeout | Time | The amount of time to wait for a connection to the maste
```json
node_name name active queue rejected
-opensearch-node2 ad-batch-task-threadpool 0 0 0
-opensearch-node2 ad-threadpool 0 0 0
-opensearch-node2 analyze 0 0 0s
+odfe-node2 ad-batch-task-threadpool 0 0 0
+odfe-node2 ad-threadpool 0 0 0
+odfe-node2 analyze 0 0 0
```
diff --git a/_opensearch/rest-api/cat/index.md b/_opensearch/rest-api/cat/index.md
index c53766361b..7f18f3195e 100644
--- a/_opensearch/rest-api/cat/index.md
+++ b/_opensearch/rest-api/cat/index.md
@@ -1,6 +1,6 @@
---
layout: default
-title: CAT
+title: CAT API
parent: REST API reference
nav_order: 100
has_children: true
@@ -8,15 +8,15 @@ redirect_from:
- /opensearch/catapis/
---

-# cat API
+# Compact and aligned text (CAT) API

-You can get essential statistics about your cluster in an easy-to-understand, tabular format using the compact and aligned text (CAT) API. The cat API is a human-readable interface that returns plain text instead of traditional JSON.
+You can get essential statistics about your cluster in an easy-to-understand, tabular format using the compact and aligned text (CAT) API. The CAT API is a human-readable interface that returns plain text instead of traditional JSON.

-Using the cat API, you can answer questions like which node is the elected master, what state is the cluster in, how many documents are in each index, and so on.
+Using the CAT API, you can answer questions like which node is the elected master, what state is the cluster in, how many documents are in each index, and so on.

## Example

-To see the available operations in the cat API, use the following command:
+To see the available operations in the CAT API, use the following command:

```
GET _cat
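
For readers trying the `GET _cat` command above, the response is a plain-text list of the available cat endpoints, roughly like the following (an abbreviated sketch, not the full list):

```
=^.^=
/_cat/allocation
/_cat/shards
/_cat/nodes
/_cat/indices
...
```
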
diff --git a/_opensearch/rest-api/cluster-health.md b/_opensearch/rest-api/cluster-health.md
index 64c7a909da..8af627893e 100644
--- a/_opensearch/rest-api/cluster-health.md
+++ b/_opensearch/rest-api/cluster-health.md
@@ -36,10 +36,10 @@ All cluster health parameters are optional.

Parameter | Type | Description
:--- | :--- | :---
-expand_wildcards | Enum | Expands wildcard expressions to concrete indices. Combine multiple values with commas. Supported values are `all`, `open`, `closed`, `hidden`, and `none`. Default is `open`.
+expand_wildcards | Enum | Expands wildcard expressions to concrete indexes. Combine multiple values with commas. Supported values are `all`, `open`, `closed`, `hidden`, and `none`. Default is `open`.
level | Enum | The level of detail for returned health information. Supported values are `cluster`, `indices`, and `shards`. Default is `cluster`.
-local | Boolean | Whether to return information from the local node only instead of from the master node. Default is false.
-master_timeout | Time | The amount of time to wait for a connection to the master node. Default is 30 seconds.
+local | Boolean | Whether to return information from the local node only instead of from the cluster manager node. Default is false.
+cluster_manager_timeout | Time | The amount of time to wait for a connection to the cluster manager node. Default is 30 seconds.
timeout | Time | The amount of time to wait for a response. If the timeout expires, the request fails. Default is 30 seconds.
wait_for_active_shards | String | Wait until the specified number of shards is active before returning a response. `all` for all shards. Default is `0`.
wait_for_events | Enum | Wait until all currently queued events with the given priority are processed. Supported values are `immediate`, `urgent`, `high`, `normal`, `low`, and `languid`.
diff --git a/_opensearch/rest-api/cluster-settings.md b/_opensearch/rest-api/cluster-settings.md
index 7c8b98703e..570d00f53e 100644
--- a/_opensearch/rest-api/cluster-settings.md
+++ b/_opensearch/rest-api/cluster-settings.md
@@ -44,7 +44,7 @@ Parameter | Type | Description
:--- | :--- | :---
flat_settings | Boolean | Whether to return settings in the flat form, which can improve readability, especially for heavily nested settings. For example, the flat form of `"cluster": { "max_shards_per_node": 500 }` is `"cluster.max_shards_per_node": "500"`.
include_defaults (GET only) | Boolean | Whether to include default settings as part of the response. This parameter is useful for identifying the names and current values of settings you want to update.
-master_timeout | Time | The amount of time to wait for a response from the master node. Default is 30 seconds.
+cluster_manager_timeout | Time | The amount of time to wait for a response from the cluster manager node. Default is 30 seconds.
timeout (PUT only) | Time | The amount of time to wait for a response from the cluster. Default is 30 seconds.
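
Combining the parameters above, a settings update that returns the flat form and bounds the cluster manager wait might look like the following (a sketch; it assumes `cluster_manager_timeout` applies to PUT requests as well as GET, and it reuses the `cluster.max_shards_per_node` setting from the `flat_settings` example):

```json
PUT _cluster/settings?flat_settings=true&cluster_manager_timeout=45s
{
  "transient": {
    "cluster.max_shards_per_node": "500"
  }
}
```
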
diff --git a/_opensearch/rest-api/cluster-stats.md b/_opensearch/rest-api/cluster-stats.md
index 1eef74d05c..13b308cb89 100644
--- a/_opensearch/rest-api/cluster-stats.md
+++ b/_opensearch/rest-api/cluster-stats.md
@@ -14,7 +14,7 @@ The cluster stats API operation returns statistics about your cluster.

## Examples

```json
-GET _cluster/stats/nodes/_master
+GET _cluster/stats/nodes/_cluster_manager
```

## Path and HTTP methods
@@ -31,8 +31,10 @@ All cluster stats parameters are optional.

Parameter | Type | Description
:--- | :--- | :---
-<node-filters> | List | A comma-separated list of node-filters that OpenSearch uses to filter results. Available options are `all`, `_local`, `_master`, a node name or ID, `master:true`, `master:false`, `data:true`, `data:false`, `ingest:true`, `ingest:false`, `voting_only:true`, `voting_only:false`, `ml:true`, `ml:false`, `coordinating_only:true`, `coordinating_only:false`, and <custom node attributes> : <attribute values> pairs.
+<node-filters> | List | A comma-separated list of node-filters that OpenSearch uses to filter results. Available options are `all`, `_local`, `_cluster_manager`, a node name or ID, `cluster_manager:true`, `cluster_manager:false`, `data:true`, `data:false`, `ingest:true`, `ingest:false`, `voting_only:true`, `voting_only:false`, `ml:true`, `ml:false`, `coordinating_only:true`, `coordinating_only:false`, and <custom node attributes> : <attribute values> pairs.
+
+Although the `master` node is now called `cluster_manager` in version 2.0, we retained the `master` field for backwards compatibility. If you have a node that has either a `master` role or a `cluster_manager` role, the `count` increases for both fields by 1. To see an example node count increase, see the Response sample.
+{: .note }

## Response
@@ -218,6 +220,7 @@ Parameter | Type | Description
"data": 1,
"ingest": 1,
"master": 1,
+"cluster_manager": 1,
"remote_cluster_client": 1
},
"versions": [
diff --git a/_opensearch/stats-api.md b/_opensearch/stats-api.md
index ac0573570c..050d0bb662 100644
--- a/_opensearch/stats-api.md
+++ b/_opensearch/stats-api.md
@@ -44,7 +44,7 @@ If `enforced` is `true`:
"roles": [
"data",
"ingest",
-"master",
+"cluster_manager",
"remote_cluster_client"
],
"attributes": {
@@ -151,7 +151,7 @@ If `enforced` is `false`:
"roles": [
"data",
"ingest",
-"master",
+"cluster_manager",
"remote_cluster_client"
],
"attributes": {
@@ -264,7 +264,7 @@ GET _nodes/_local/stats/shard_indexing_pressure?include_all
"roles": [
"data",
"ingest",
-"master",
+"cluster_manager",
"remote_cluster_client"
],
"attributes": {
@@ -379,7 +379,7 @@ If `enforced` is `true`:
"roles": [
"data",
"ingest",
-"master",
+"cluster_manager",
"remote_cluster_client"
],
"attributes": {
@@ -422,7 +422,7 @@ If `enforced` is `false`:
"roles": [
"data",
"ingest",
-"master",
+"cluster_manager",
"remote_cluster_client"
],
"attributes": {
@@ -471,7 +471,7 @@ GET _nodes/stats/shard_indexing_pressure
"roles": [
"data",
"ingest",
-"master",
+"cluster_manager",
"remote_cluster_client"
],
"attributes": {
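
As a companion to the role renames above, the node-filter syntax from the cluster stats parameter table can target the renamed role directly, for example (a sketch using the `cluster_manager:true` filter listed in that table):

```
GET _cluster/stats/nodes/cluster_manager:true
```
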
diff --git a/_security-plugin/access-control/document-level-security.md b/_security-plugin/access-control/document-level-security.md
index e3176aa853..210c2d2932 100644
--- a/_security-plugin/access-control/document-level-security.md
+++ b/_security-plugin/access-control/document-level-security.md
@@ -124,13 +124,13 @@ PUT _plugins/_security/api/roles/abac
}]
}
```

-## Use TLQs with DLS
+## Use term-level lookup queries (TLQs) with DLS

-You can perform TLQs with DLS using either of two modes: adaptive or filter level. The default mode is adaptive, where OpenSearch automatically switches between Lucene-level or filter-level mode depending on whether or not there is a TLQ. DLS queries that do not contain a TLQ are executed in Lucene-level mode, whereas DLS queries with TLQs are executed in filter-level mode.
+You can perform term-level lookup queries (TLQs) with document-level security (DLS) using either of two modes: adaptive or filter level. The default mode is adaptive, where OpenSearch automatically switches between Lucene-level or filter-level mode depending on whether or not there is a TLQ. DLS queries without TLQs are executed in Lucene-level mode, whereas DLS queries with TLQs are executed in filter-level mode.

By default, the security plugin detects if a DLS query contains a TLQ or not and chooses the appropriate mode automatically at runtime.

-To learn more about OpenSearch TLQ, see [Terms lookup query (TLQ)](https://opensearch.org/docs/latest/opensearch/query-dsl/term/#terms-lookup-query-tlq).
+To learn more about OpenSearch queries, see [Term-level queries](https://opensearch.org/docs/latest/opensearch/query-dsl/term/).

### How to set the DLS evaluation mode in `opensearch.yml`
@@ -145,5 +145,5 @@ plugins.security.dls.mode: filter-level

| Evaluation mode | Parameter | Description | Usage
| :--- | :--- | :--- | :---
Lucene-level DLS | `lucene-level` | This setting makes all DLS queries apply to the Lucene level. | Lucene-level DLS modifies Lucene queries and data structures directly. This is the most efficient mode but does not allow certain advanced constructs in DLS queries, including TLQs.
-Filter-level DLS | `filter-level` | This setting makes all DLS queries apply to the filter level. | In this mode, OpenSearch applies DLS by modifying the queries received. This allows for TLQs in DLS queries, but you can only use the `get`, `search`, `mget`, and `msearch` operations to retrieve data from the protected index. Additionally, cross-cluster searches are limited with this mode.
-Adaptive | `adaptive-level` | By default, this setting allows OpenSearch to automatically choose the mode. | DLS queries without TLQs are executed in Lucene-level mode, while DLS queries that contain a TLQ are executed in filter-level mode.
+Filter-level DLS | `filter-level` | This setting makes all DLS queries apply to the filter level. | In this mode, OpenSearch applies DLS by modifying queries that OpenSearch receives. This allows for term-level lookup queries in DLS queries, but you can only use the `get`, `search`, `mget`, and `msearch` operations to retrieve data from the protected index. Additionally, cross-cluster searches are limited with this mode.
+Adaptive | `adaptive-level` | The default setting that allows OpenSearch to automatically choose the mode. | DLS queries without TLQs are executed in Lucene-level mode, while DLS queries that contain a TLQ are executed in filter-level mode.
diff --git a/images/cluster.png b/images/cluster.png
index f5262a06ef..419ee31ad2 100644
Binary files a/images/cluster.png and b/images/cluster.png differ
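
For reference, the kind of terms lookup query that forces filter-level DLS has the following shape (a sketch with hypothetical index, document, and field names):

```json
GET my-index/_search
{
  "query": {
    "terms": {
      "user_id": {
        "index": "users",
        "id": "2",
        "path": "followers"
      }
    }
  }
}
```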