Add Filestream integration (#11332)
This commit adds the Filestream integration
belimawr authored Nov 20, 2024
1 parent 2a607af commit a40ebee
Showing 17 changed files with 878 additions and 0 deletions.
1 change: 1 addition & 0 deletions .github/CODEOWNERS
@@ -187,6 +187,7 @@
/packages/f5 @elastic/security-service-integrations
/packages/f5_bigip @elastic/security-service-integrations
/packages/falco @elastic/security-service-integrations
/packages/filestream @elastic/elastic-agent-data-plane
/packages/fim @elastic/sec-linux-platform
/packages/fireeye @elastic/security-service-integrations
/packages/first_epss @elastic/security-service-integrations
4 changes: 4 additions & 0 deletions packages/filestream/_dev/build/build.yml
@@ -0,0 +1,4 @@
dependencies:
  ecs:
    reference: [email protected]
    import_mappings: true
38 changes: 38 additions & 0 deletions packages/filestream/_dev/build/docs/README.md
@@ -0,0 +1,38 @@
# Custom Filestream Log integration

The `filestream` custom input is used to read lines from active log files. It is the
new, improved alternative to the `log` input. It comes with various improvements
to the existing input:

1. Checking of `close_*` options happens out of band. Thus, if an output is blocked,
Elastic Agent can close the reader and avoid keeping too many files open.

2. The order of `parsers` is configurable. So it is possible to parse JSON lines and then
aggregate the contents into a multiline event.

3. Some position updates and metadata changes no longer depend on the publishing pipeline.
If the pipeline is blocked, some changes are still applied to the registry.

4. Only the most recent updates are serialized to the registry. In contrast, the `log` input
has to serialize the complete registry on each ACK from the outputs. This makes the registry updates
much quicker with this input.

5. The input ensures that only offset updates are written to the registry's append-only log.
The `log` input writes the complete file state.

6. Stale entries can be removed from the registry, even if there is no active input.

7. The fingerprint file identity is used by default.
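
As a rough illustration of item 2, the sketch below shows a value that could be supplied for the `parsers` setting. It assumes each log line is a JSON object whose `msg` key holds one line of a three-line message; the key name and the line count are illustrative, not something this integration prescribes.

```yaml
# Parsers run in the order listed.
# 1) ndjson decodes each line as JSON and takes the text from the "msg" key
#    (assumed key name, adjust to your log format).
- ndjson:
    target: ""
    message_key: msg
# 2) multiline then groups every 3 decoded messages into a single event
#    (the count of 3 is an assumption for this example).
- multiline:
    type: counter
    lines_count: 3
```

Swapping the two entries would aggregate raw lines first and attempt JSON decoding afterwards, which is why the ordering matters.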

More information can be found on the {{ url "filebeat-input-filestream" "Filestream documentation page" }}.

Because Filestream configures a new input, using it to collect data
from a file that was previously collected by the Custom Logs integration
will result in duplicate data. You may wish to configure
`ignore_older` or temporarily set `ignore_inactive: since_first_start`
to limit the amount of duplicate data collected.
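
A minimal sketch of the two options follows. The `48h` window is an assumed value; choose one that matches how far back re-ingestion is acceptable in your environment.

```yaml
# Option A: skip files that have not been modified within the given window.
# 48h is an illustrative value only.
ignore_older: 48h

# Option B (temporary): ignore files that have not been updated since the
# input first started. Remove this once the migration is complete.
# ignore_inactive: since_first_start
```

Either option only limits duplicates for files that are no longer being written to; files still active during the switch are read from the beginning.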

If the Custom Logs integration is removed and the Custom Filestream
Logs integration is added in the same policy change, there is a risk of
data being missed between the last entry ingested by Custom Logs and the
first one ingested by Custom Filestream Logs.
8 changes: 8 additions & 0 deletions packages/filestream/_dev/deploy/docker/docker-compose.yml
@@ -0,0 +1,8 @@
version: '2.3'
services:
  filestream-logfile:
    image: alpine
    volumes:
      - ./sample_logs:/sample_logs:ro
      - ${SERVICE_LOGS_DIR}:/var/log
    command: /bin/sh -c "cp /sample_logs/* /var/log/"
201 changes: 201 additions & 0 deletions packages/filestream/_dev/deploy/docker/sample_logs/test-filestream.log

Large diffs are not rendered by default.

5 changes: 5 additions & 0 deletions packages/filestream/changelog.yml
@@ -0,0 +1,5 @@
- version: "0.0.1"
  changes:
    - description: Initial Release
      type: enhancement
      link: https://github.com/elastic/integrations/pull/11332
@@ -0,0 +1,8 @@
service: filestream-logfile
input: filestream
data_stream:
  vars:
    paths:
      - "{{SERVICE_LOGS_DIR}}/test-filestream.log"
assert:
  hit_count: 201
@@ -0,0 +1,146 @@
data_stream:
  dataset: {{data_stream.dataset}}
paths:
{{#each paths as |path i|}}
  - {{path}}
{{/each}}

{{#if pipeline}}
pipeline: {{pipeline}}
{{/if}}

{{#if recursive_glob}}
prospector.scanner.recursive_glob: {{recursive_glob}}
{{/if}}

{{#if exclude_files}}
prospector.scanner.exclude_files:
{{#each exclude_files as |exclude_file i|}}
- {{exclude_file}}
{{/each}}
{{/if}}

{{#if include_files}}
prospector.scanner.include_files:
{{#each include_files as |include_file i|}}
- {{include_file}}
{{/each}}
{{/if}}

{{#if symlinks}}
prospector.scanner.symlinks: {{symlinks}}
{{/if}}

{{#if resend_on_touch}}
prospector.scanner.resend_on_touch: {{resend_on_touch}}
{{/if}}

{{#if check_interval}}
prospector.scanner.check_interval: {{check_interval}}
{{/if}}

{{#if ignore_older}}
ignore_older: {{ignore_older}}
{{/if}}

{{#if ignore_inactive}}
ignore_inactive: {{ignore_inactive}}
{{/if}}

{{#if close_on_state_changed_inactive}}
close.on_state_change.inactive: {{close_on_state_changed_inactive}}
{{/if}}

{{#if close_on_state_changed_renamed}}
close.on_state_change.renamed: {{close_on_state_changed_renamed}}
{{/if}}

{{#if close_on_state_changed_removed}}
close.on_state_change.removed: {{close_on_state_changed_removed}}
{{/if}}

{{#if close_reader_eof}}
close.reader.on_eof: {{close_reader_eof}}
{{/if}}

{{#if close_reader_after_interval}}
close.reader.after_interval: {{close_reader_after_interval}}
{{/if}}

{{#if clean_inactive}}
clean_inactive: {{clean_inactive}}
{{/if}}

{{#if clean_removed}}
clean_removed: {{clean_removed}}
{{/if}}

{{#if backoff_init}}
backoff.init: {{backoff_init}}
{{/if}}

{{#if backoff_max}}
backoff.max: {{backoff_max}}
{{/if}}

{{#if rotation_external_strategy_copytruncate}}
rotation.external.strategy.copytruncate: {{rotation_external_strategy_copytruncate}}
{{/if}}

{{#if encoding}}
encoding: {{encoding}}
{{/if}}

{{#if exclude_lines}}
exclude_lines:
{{#each exclude_lines as |exclude_line i|}}
- {{exclude_line}}
{{/each}}
{{/if}}

{{#if include_lines}}
include_lines:
{{#each include_lines as |include_line i|}}
- {{include_line}}
{{/each}}
{{/if}}

{{#if buffer_size}}
buffer_size: {{buffer_size}}
{{/if}}

{{#if message_max_bytes}}
message_max_bytes: {{message_max_bytes}}
{{/if}}

{{#if parsers}}
parsers:
{{parsers}}
{{/if}}

{{#if tags}}
tags:
{{#each tags as |tag i|}}
- {{tag}}
{{/each}}
{{/if}}

{{#contains "forwarded" tags}}
publisher_pipeline.disable_host: true
{{/contains}}

{{#if processors}}
processors:
{{processors}}
{{/if}}

{{#if harvester_limit }}
harvester_limit: {{harvester_limit}}
{{/if}}

{{#if fingerprint }}
prospector.scanner.fingerprint.enabled: true
file_identity.fingerprint.enabled: true
file_identity.fingerprint.offset: {{ fingerprint_offset }}
file_identity.fingerprint.length: {{ fingerprint_length }}
{{/if}}
20 changes: 20 additions & 0 deletions packages/filestream/data_stream/generic/fields/base-fields.yml
@@ -0,0 +1,20 @@
- name: data_stream.type
  type: constant_keyword
  description: Data stream type.
- name: data_stream.dataset
  type: constant_keyword
  description: Data stream dataset.
- name: data_stream.namespace
  type: constant_keyword
  description: Data stream namespace.
- name: event.module
  type: constant_keyword
  description: Event module
  value: filestream
- name: event.dataset
  type: constant_keyword
  description: Event dataset
  value: filestream.generic
- name: "@timestamp"
  type: date
  description: Event timestamp.
6 changes: 6 additions & 0 deletions packages/filestream/data_stream/generic/fields/beats.yml
@@ -0,0 +1,6 @@
- name: input.type
  description: Type of Filebeat input.
  type: keyword
- name: tags
  type: keyword
  description: User defined tags
12 changes: 12 additions & 0 deletions packages/filestream/data_stream/generic/fields/ecs.yml
@@ -0,0 +1,12 @@
- name: ecs.version
  external: ecs
- name: log.file.path
  external: ecs
- name: log.offset
  description: Current log offset
- name: log.level
  external: ecs
- name: message
  external: ecs
- name: event.original
  external: ecs
13 changes: 13 additions & 0 deletions packages/filestream/data_stream/generic/fields/filestream.yml
@@ -0,0 +1,13 @@
- name: log.file.inode
  type: keyword
  description: |
    inode of the ingested file.
- name: log.file.device_id
  type: keyword
  description: |
    device ID from the device where the file is.
- name: log.file.fingerprint
  type: keyword
  index: false
  description: |
    The fingerprint of the file when using the fingerprint file identity.