diff --git a/_field-types/supported-field-types/star-tree.md b/_field-types/supported-field-types/star-tree.md index af737e6447..77fb28802e 100644 --- a/_field-types/supported-field-types/star-tree.md +++ b/_field-types/supported-field-types/star-tree.md @@ -37,7 +37,8 @@ PUT logs "settings": { "index.number_of_shards": 1, "index.number_of_replicas": 0, - "index.composite_index": true + "index.composite_index": true, + "index.append_only.enabled": true }, "mappings": { "composite": { @@ -54,6 +55,16 @@ PUT logs }, { "name": "port" + }, + { + "name": "method" + }, + { + "name": "@timestamp", + "calendar_intervals": [ + "month", + "day" + ] } ], "metrics": [ @@ -80,6 +91,10 @@ PUT logs } }, "properties": { + "@timestamp": { + "format": "strict_date_optional_time||epoch_second", + "type": "date" + }, "status": { "type": "integer" }, @@ -89,6 +104,9 @@ PUT logs "request_size": { "type": "integer" }, + "method" : { + "type": "keyword" + }, "latency": { "type": "scaled_float", "scaling_factor": 10 @@ -118,17 +136,33 @@ When using the `ordered_dimesions` parameter, follow these best practices: - The order of dimensions matters. You can define the dimensions ordered from the highest cardinality to the lowest cardinality for efficient storage and query pruning. - Avoid using high-cardinality fields as dimensions. High-cardinality fields adversely affect storage space, indexing throughput, and query performance. -- Currently, fields supported by the `ordered_dimensions` parameter are all [numeric field types]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/numeric/), with the exception of `unsigned_long`. For more information, see [GitHub issue #15231](https://github.com/opensearch-project/OpenSearch/issues/15231). -- Support for other field types, such as `keyword` and `ip`, will be added in future versions. For more information, see [GitHub issue #16232](https://github.com/opensearch-project/OpenSearch/issues/16232). - A minimum of `2` and a maximum of `10` dimensions are supported per star-tree index. +The `ordered_dimensions` parameter supports the following field types: + + - All numeric field types, excluding `unsigned_long` and `scaled_float` + - `keyword` + - `object` + - `date`, which can use up to three of the following calendar intervals: + - `year` (of era) + - `quarter` (of year) + - `month` (of year) + - `week` (of week-based year) + - `day` (of month) + - `hour` (of day) + - `half-hour` (of day) + - `quater-hour` (of day) + - `minute` (of hour) + - `second` (of minute) + +Support for other field types, such as `ip`, will be added in future versions. For more information, see [GitHub issue #13875](https://github.com/opensearch-project/OpenSearch/issues/13875). + The `ordered_dimensions` parameter supports the following property. | Parameter | Required/Optional | Description | | :--- | :--- | :--- | | `name` | Required | The name of the field. The field name should be present in the `properties` section as part of the index `mapping`. Ensure that the `doc_values` setting is `enabled` for any associated fields. | - ### Metrics Configure any metric fields on which you need to perform aggregations. `Metrics` are required as part of a star-tree index configuration. diff --git a/_search-plugins/star-tree-index.md b/_search-plugins/star-tree-index.md index d1e03208d3..d39d9029d3 100644 --- a/_search-plugins/star-tree-index.md +++ b/_search-plugins/star-tree-index.md @@ -26,7 +26,7 @@ A star-tree index can be used to perform faster aggregations. Consider the follo Star-tree indexes have the following limitations: -- A star-tree index should only be enabled on indexes whose data is not updated or deleted because updates and deletions are not accounted for in a star-tree index. +- A star-tree index should only be enabled on indexes whose data is not updated or deleted because updates and deletions are not accounted for in a star-tree index. To enforce this policy and use star-tree indexes, set the `index.append_only.enabled` setting to `true`. - A star-tree index can be used for aggregation queries only if the queried fields are a subset of the star-tree's dimensions and the aggregated fields are a subset of the star-tree's metrics. - After a star-tree index is enabled, it cannot be disabled. In order to disable a star-tree index, the data in the index must be reindexed without the star-tree mapping. Furthermore, changing a star-tree configuration will also require a reindex operation. - [Multi-values/array values]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/index/#arrays) are not supported. @@ -68,6 +68,7 @@ To use a star-tree index, modify the following settings: - Set the feature flag `opensearch.experimental.feature.composite_index.star_tree.enabled` to `true`. For more information about enabling and disabling feature flags, see [Enabling experimental features]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/experimental/). - Set the `indices.composite_index.star_tree.enabled` setting to `true`. For instructions on how to configure OpenSearch, see [Configuring settings]({{site.url}}{{site.baseurl}}/install-and-configure/configuring-opensearch/index/#static-settings). - Set the `index.composite_index` index setting to `true` during index creation. +- Set the `index.append_only.enabled` index setting to `true` during index creation. - Ensure that the `doc_values` parameter is enabled for the `dimensions` and `metrics` fields used in your star-tree mapping. @@ -81,7 +82,8 @@ PUT logs "settings": { "index.number_of_shards": 1, "index.number_of_replicas": 0, - "index.composite_index": true + "index.composite_index": true, + "index.append_only.enabled": true }, "mappings": { "composite": { @@ -94,6 +96,9 @@ PUT logs }, { "name": "port" + }, + { + "name": "method" } ], "metrics": [ @@ -123,6 +128,9 @@ PUT logs "size": { "type": "integer" }, + "method" : { + "type": "keyword" + }, "latency": { "type": "scaled_float", "scaling_factor": 10 @@ -131,6 +139,7 @@ PUT logs } } ``` +{% include copy.html %} For detailed information about star-tree index mappings and parameters, see [Star-tree field type]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/star-tree/). @@ -140,14 +149,20 @@ Star-tree indexes can be used to optimize queries and aggregations. ### Supported queries -The following queries are supported as of OpenSearch 2.18: +The following queries are supported as of OpenSearch 2.19: - [Term query]({{site.url}}{{site.baseurl}}/query-dsl/term/term/) +- [Terms query]({{site.url}}{{site.baseurl}}/query-dsl/term/terms/) - [Match all docs query]({{site.url}}{{site.baseurl}}/query-dsl/match-all/) +- [Range query]({{site.url}}{{site.baseurl}}/query-dsl/term/range/) -To use a query with a star-tree index, the query's fields must be present in the `ordered_dimensions` section of the star-tree configuration. Queries must also be paired with a supported aggregation. +To use a query with a star-tree index, the query's fields must be present in the `ordered_dimensions` section of the star-tree configuration. Queries must also be paired with a supported aggregation. Queries without aggregations cannot be used with a star-tree index. Currently, queries on `date` fields are not supported and will be added in later versions. ### Supported aggregations + +The following aggregations are supported by star-tree indexes. + +#### Metric aggregations The following metric aggregations are supported as of OpenSearch 2.18: - [Sum]({{site.url}}{{site.baseurl}}/aggregations/metric/sum/) @@ -156,13 +171,11 @@ The following metric aggregations are supported as of OpenSearch 2.18: - [Value count]({{site.url}}{{site.baseurl}}/aggregations/metric/value-count/) - [Average]({{site.url}}{{site.baseurl}}/aggregations/metric/average/) -To use aggregations: +To use searchable aggregations with a star-tree index, make sure you fulfill the following prerequisites: - The fields must be present in the `metrics` section of the star-tree configuration. - The metric aggregation type must be part of the `stats` parameter. -### Aggregation example - The following example gets the sum of all the values in the `size` field for all error logs with `status=500`, using the [example mapping](#example-mapping): ```json @@ -182,9 +195,56 @@ POST /logs/_search } } ``` +{% include copy.html %} Using a star-tree index, the result will be retrieved from a single aggregated document as it traverses the `status=500` node, as opposed to scanning through all of the matching documents. This results in lower query latency. +#### Date histograms with metric aggregations + +You can use [date histograms]({{site.url}}{{site.baseurl}}/aggregations/bucket/date-histogram/) on calendar intervals with metric sub-aggregations. + +To use date histogram aggregations and make them searchable in a star-tree index, remember the following requirements: + +- The calendar intervals in a star-tree mapping configuration can use either the request's calendar field or a field of lower granularity than the request field. For example, if an aggregation uses the `month` field, the star-tree search can still use lower-granularity fields such as `day`. +- A metric sub-aggregation must be part of the aggregation request. + +The following example gets the sum of all the values in the `size` field with a range query, aggregated for each calendar month, for all error logs containing `method:get`: + +```json +POST /logs/_search +{ +{ + "query": { + "range": { + "created": { + "gte": "2019/01/01", + "lte": "2019/12/31" + }, + "method": { + "status": "get" + } + }, + "size": 0, + "aggs": { + "by_hour": { + "date_histogram": { + "field": "@timestamp", + "calendar_interval": "month" + }, + "aggs": { + "sum_size": { + "sum": { + "field": "size" + } + } + } + } + } +} +``` +{% include copy-curl.html %} + + ## Using queries without a star-tree index Set the `indices.composite_index.star_tree.enabled` setting to `false` to run queries without using a star-tree index.