Created the blog post announcing Data Prepper 2.0 #1066
Co-authored-by: Hai Yan <[email protected]> Signed-off-by: David Venable <[email protected]>
I added Hai to the list of authors based on the username he supplied in #1067. We will need to merge that in prior to this PR.
Added my rewrites for each section in the review. One comment = one section.
Might need to wait to add documentation links until this PR is merged: opensearch-project/documentation-website#1510
- technical-post
---

Today the maintainers are announcing the release of Data Prepper 2.0. It has been over a year since Data Prepper 1.0 was first introduced
Let's change this paragraph to:
The Data Prepper maintainers are proud to announce the release of Data Prepper 2.0. This release makes Data Prepper easier to use and helps you improve your observability stack based on feedback from our users.
Here are some of the major changes and enhancements made for Data Prepper 2.0.
Or maybe:
The Data Prepper maintainers are proud to announce the release of Data Prepper 2.0. This release makes Data Prepper easier to use and helps you improve your observability stack based on feedback from you, our users.
Here are some of the major changes and enhancements made for Data Prepper 2.0.
@dlvenable: Could we add a line in this intro or somewhere in the blog about OpenSearch compatibility? Data Prepper 2.0 is compatible with all OpenSearch versions, correct?
I added the following:
Data Prepper 2.0 retains compatibility with all current versions of OpenSearch.
* The HTTP source now supports loading TLS/SSL credentials from either Amazon S3 or Amazon Certificate Manager. The OTel Trace Source supported these options and now pipeline authors can configure them for their log ingestion use-cases as well.
* Data Prepper now requires Java 11 and the Docker image deploys with JDK 17.

Please see our release notes for a complete list.
Do we have a link to these release notes?
Signed-off-by: David Venable <[email protected]>
Thanks @Naarcha-AWS! I took most of the changes to all sections except the Directory Structure. I want to check with @oeyh on those first. I did make some tweaks from your suggestions - most of them were to try to be more accurate. I also wasn't quite sure about some of the paragraphs. Did you intend all those paragraphs? The ones in the examples read too broken up and didn't keep the same train of thought.
Signed-off-by: David Venable <[email protected]>
A few more minor tweaks before we pass them off to @natebower.
accepts log data from external sources such as Fluent Bit.

The pipeline then uses the `grok` processor to split the log line into multiple fields.
The `grok` processor adds named `loglevel` to the event. Pipeline authors can use that field in routes. This pipeline has two OpenSearch sinks. The first sink only receives
Let's break this up a little more:
The pipeline then uses the `grok` processor to split the log line into multiple fields. The `grok` processor adds a named `loglevel` to the event. Pipeline authors can use that field in routes.
This pipeline contains two OpenSearch sinks. The first sink will only receive logs with a log level of `WARN` or `ERROR`. Data Prepper will route all events to the second sink.
I took your suggestion and made one clarification by adding "field" which you can see here: "... adds a field named `loglevel` ..."
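For anyone following this thread without the full post, the routing behavior described in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the semantics as stated in the text (a sink with routes receives matching events, a sink without routes receives everything). It is not Data Prepper code or pipeline configuration, and the names `route_event`, `warn_and_above`, `warn-and-above-logs`, and `all-logs` are hypothetical.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


def route_event(
    event: Event,
    routes: Dict[str, Callable[[Event], bool]],
    sinks: List[dict],
) -> List[str]:
    """Return the names of the sinks that should receive the event."""
    # Evaluate every named route (a predicate) against the event.
    matched = {name for name, predicate in routes.items() if predicate(event)}
    selected = []
    for sink in sinks:
        sink_routes = sink.get("routes")
        # A sink with no routes receives every event; otherwise at
        # least one of its routes must have matched.
        if not sink_routes or matched.intersection(sink_routes):
            selected.append(sink["name"])
    return selected


# Hypothetical route and sinks mirroring the example in the post.
routes = {"warn_and_above": lambda e: e.get("loglevel") in ("WARN", "ERROR")}
sinks = [
    {"name": "warn-and-above-logs", "routes": ["warn_and_above"]},
    {"name": "all-logs"},  # no routes: receives all events
]
```

Calling `route_event({"loglevel": "INFO"}, routes, sinks)` selects only the unrouted sink, while an `ERROR` event is selected for both.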
Two small things:
Simply pick a name appropriate for the domain and a Data Prepper expression.
Then for any sink that should only have some data coming through, define one or more routes to apply Data Prepper will evaluate
I think this is supposed to be a space, not a line break; also missing a period in front of "Data Prepper will evaluate...":
Simply pick a name appropriate for the domain and a Data Prepper expression.
Then for any sink that should only have some data coming through, define one or more routes to apply Data Prepper will evaluate
Simply pick a name appropriate for the domain and a Data Prepper expression. Then for any sink that should only have some data coming through, define one or more routes to apply. Data Prepper will evaluate
Line breaks should not affect the rendered page.
There was a space and line break which did create a new paragraph in the rendered page. Thanks for noting that!
Signed-off-by: David Venable <[email protected]>
509bfed to 3bc00b6
Signed-off-by: David Venable <[email protected]>
I took all the suggested changes.
@dlvenable Please see my changes and comments, and let me know if you have any questions. Thanks!
One common use case for conditional routing is reducing the volume of data going to some clusters.
When you want info logs that produce large volumes of data to go to a cluster, index with more frequent rollovers, or add deletions to clear out large volumes of data, you can now configure pipelines to route the data with your chosen action.
deletions to clear out these large volumes of data, you now configure pipelines to route your data.
deletions to clear out these large volumes of data, you now configure pipelines to route your data.
Simply pick a name appropriate for the domain and a Data Prepper expression.
Then for any sink that should only have some data coming through, define one or more routes to apply. Data Prepper will evaluate
Second sentence: "to route these events to"?
For example, when one large object includes a serialized JSON string, you can use the `parse_json` processor to extract
the fields from the JSON into your event.

Data Prepper can now import CSV or TSV formatted files from Amazon S3 sources. This is useful for systems like Amazon CloudFront
Can we remove "formatted"? Otherwise, this would need to be "CSV- or TSV-formatted files".
Data Prepper 2.0 includes a number of other improvements. We want to highlight a few of them.

* The OpenSearch sink now supports `create` actions for OpenSearch when writing documents. Pipeline authors can configure their pipelines to only create new documents and not update existing ones.
* The HTTP source now supports loading TLS/SSL credentials from either Amazon S3 or Amazon Certificate Manager. Pipeline authors can now configure them for their log ingestion use cases. Before Data Prepper 2.0, only the OTel Trace Source supported these options.
* The HTTP source now supports loading TLS/SSL credentials from either Amazon S3 or Amazon Certificate Manager. Pipeline authors can now configure them for their log ingestion use cases. Before Data Prepper 2.0, only the OTel Trace Source supported these options.
* The HTTP source now supports loading SSL/TLS credentials from either Amazon S3 or AWS Certificate Manager (ACM). Pipeline authors can now configure them for their log ingestion use cases. Before Data Prepper 2.0, only the OTel Trace Source supported these options.
I believe either `SSL/TLS` or `TLS/SSL` is in use. I intentionally chose `TLS/SSL` because we are using TLS. The SSL part is mostly there for historical reasons.
You can also see that the term `TLS/SSL` is used in the following Wikipedia article.
I'm assuming we were referring to AWS Certificate Manager (ACM).
* The HTTP source now supports loading TLS/SSL credentials from either Amazon S3 or Amazon Certificate Manager. Pipeline authors can now configure them for their log ingestion use cases. Before Data Prepper 2.0, only the OTel Trace Source supported these options.
* Data Prepper now requires Java 11 or higher. The Docker image deploys with JDK 17.

Please see our [release notes](https://github.com/opensearch-project/data-prepper/releases/tag/2.0.0) for a complete list.
The only thing we're missing here is a call to action. We need to conclude with a couple sentences telling the reader what we'd like for them to do next or where they can go to learn more. The below is an example from a recent blog post announcing Snapshot Management (SM):
Wrapping it up
SM automates taking snapshots of your cluster and provides useful features like notifications. To learn more about SM, check out the SM documentation section. For more technical details, read the SM meta issue.
If you’re interested in snapshots, consider contributing to the next improvement we’re working on: searchable snapshots.
Signed-off-by: David Venable <[email protected]>
4fd8501 to 0f160f1
Signed-off-by: David Venable <[email protected]>
I pushed all the changes except the final call-to-action section. I will push that soon.
With peer forwarding as a core feature, pipeline authors can perform stateful
aggregations on multiple Data Prepper nodes. When performing stateful aggregations, Data Prepper uses a hash ring to determine
which nodes are responsible for processing different events based on the values of certain fields. Peer forwarder
routes events to the node responsible for processing the event. That node then holds all the state necessary for performing the aggregation.
I'm not sure about the change to "states" here. Using a singular noun for state is quite common.
In information technology and computer science, a system is described as stateful if it is designed to remember preceding events or user interactions; the remembered information is called the state of the system.
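As background for the peer forwarding paragraph under review, here is a minimal Python sketch of hash-ring routing in general: events with the same values in selected fields always map to the same node, so that node can hold the aggregation state. This is a generic consistent-hashing illustration, not Data Prepper's implementation. The class `HashRing`, the `replicas` parameter, the field name `sourceIp`, and the node names are all assumptions made for the example.

```python
import bisect
import hashlib
from typing import Dict, List, Sequence


class HashRing:
    """Generic consistent-hash ring mapping events to node names."""

    def __init__(self, nodes: Sequence[str], replicas: int = 64) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        # Place several virtual points per node on the ring so that
        # keys spread more evenly across nodes.
        ring = [
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        ]
        ring.sort()
        self._points: List[int] = [point for point, _ in ring]
        self._owners: List[str] = [node for _, node in ring]

    @staticmethod
    def _hash(value: str) -> int:
        # Stable hash (unlike built-in hash(), which varies per process).
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, event: Dict[str, object], key_fields: Sequence[str]) -> str:
        """Pick the node responsible for an event from its key field values."""
        key = "|".join(str(event.get(field)) for field in key_fields)
        # First ring point clockwise from the key's hash, wrapping around.
        index = bisect.bisect_right(self._points, self._hash(key)) % len(self._points)
        return self._owners[index]
```

For example, with `ring = HashRing(["node-a", "node-b", "node-c"])`, two events that share the same `sourceIp` value resolve to the same node through `ring.node_for(event, ["sourceIp"])`, regardless of their other fields.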
@dlvenable I changed to "states" to match @Naarcha-AWS edits and also to avoid "all the state". If you want to use "state", remove "all".
Signed-off-by: David Venable <[email protected]>
Signed-off-by: David Venable <[email protected]>
That makes sense. I've removed "all" from the sentence.
I have also pushed a short conclusion section.
## Try Data Prepper 2.0

Data Prepper 2.0 is available for [download](https://opensearch.org/downloads.html#data-prepper) now. The maintainers encourage you to
Because this is a blog, an exclamation point would work after the first sentence. Other than that small nit, LGTM.
LGTM
Signed-off-by: David Venable <[email protected]>
looks good!
Description
We are releasing Data Prepper 2.0.0 on Oct 10. This is our announcement blog post.
This requires the bio for @oeyh as supplied in #1067.
Issues Resolved
N/A
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.