Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: host metrics are not available running only source plugins #2821

Closed
Andreagit97 opened this issue Sep 21, 2023 · 11 comments
Closed
Labels
Milestone

Comments

@Andreagit97
Copy link
Member

Andreagit97 commented Sep 21, 2023

Describe the bug

After this change, falcosecurity/libs#1362 host metrics will no longer be available when running a source plugin like k8saduit. This is a regression and we need to find a way to address it.

Metrics extracted with dummy plugin in Falco 0.36.0-rc3

{
	"output_fields": {
		"evt.source": "dummy",
		"evt.time": 1695381599561603364,
		"falco.container_memory_used": 0,
		"falco.cpu_usage_perc": 0.0,
		"falco.duration_sec": 1695381599,
		"falco.host_boot_ts": 0,
		"falco.host_num_cpus": 0,
		"falco.hostname": "",
		"falco.kernel_release": "",
		"falco.memory_pss": 25,
		"falco.memory_rss": 29,
		"falco.memory_vsz": 1779,
		"falco.num_evts": 18641280,
		"falco.num_evts_prev": 0,
		"falco.outputs_queue_num_drops": 0,
		"falco.start_ts": 0,
		"falco.version": "0.36.0-rc3",
		"scap.engine_name": "source_plugin"
	},
	"sample": 1
}
{
	"output_fields": {
		"evt.source": "dummy",
		"evt.time": 1695381604561553388,
		"falco.container_memory_used": 0,
		"falco.cpu_usage_perc": 0.1,
		"falco.duration_sec": 1695381604,
		"falco.evts_rate_sec": 3573155.7144059963,
		"falco.host_boot_ts": 0,
		"falco.host_num_cpus": 0,
		"falco.hostname": "",
		"falco.kernel_release": "",
		"falco.memory_pss": 25,
		"falco.memory_rss": 29,
		"falco.memory_vsz": 1908,
		"falco.num_evts": 36506880,
		"falco.num_evts_prev": 18641280,
		"falco.outputs_queue_num_drops": 0,
		"falco.start_ts": 0,
		"falco.version": "0.36.0-rc3",
		"scap.engine_name": "source_plugin"
	},
	"sample": 2
}
@poiana
Copy link
Contributor

poiana commented Dec 21, 2023

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@Andreagit97
Copy link
Member Author

/remove-lifecycle stale

@Andreagit97 Andreagit97 modified the milestones: 0.37.0, 0.38.0 Jan 30, 2024
@poiana
Copy link
Contributor

poiana commented Apr 29, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@Andreagit97
Copy link
Member Author

/remove-lifecycle stale

@incertum
Copy link
Contributor

Pitching one approach to get out of this regression here: falcosecurity/libs#1362 (comment)

@gnosek
Copy link
Contributor

gnosek commented Jul 22, 2024

Oh hey @gnosek

Special notes for your reviewer:
This is definitely not the first time around the block with this, so I'm waiting for the inevitable "revert this" PR ;)

... the day has come ;)

Instead of reverting, would the following be acceptable to get out of the regression for the host metrics falcosecurity/falco#2821 when you run Falco w/ plugin source only, but still on Linux.

  • Add a new sinsp_mode_t called SINSP_MODE_PLUGIN_HOST_METRICS (or similar) that complements SINSP_MODE_PLUGIN when running Falco w/ a plugin source only (no syscalls), but also with the Falco metrics framework enabled
  • When running in mode SINSP_MODE_PLUGIN_HOST_METRICS, allocate a new platform that could be called scap_linux_host_metrics_alloc_platform or similar. It would basically be a Linux light platform that would only alloc what is needed in order to have m_machine_info, m_agent_info and the iflist (for a new metrics field in the making) available to the metrics framework.
  • Other ideas?

Hey @incertum, long time no see :)

I like the idea of a separate linux-lite platform that would skip the process list, but still collect non-process metadata. Having said that, I can't express how much I dislike adding a new mode (and the concept of sinsp modes in general) :)

My suggestion would be:

  • add the linux-lite platform (as a scap_platform implementation; would just share most of the code with the full linux one)
  • introduce a special-purpose enum for the platform selection (generic/lite/full linux)
  • replace the sinsp_mode_t param to sinsp::open_plugin with the enum (we'd still pick a mode for sinsp; would probably be SINSP_MODE_PLUGIN for the lite platform)

I am going to sidestep the question of which should be the default by making the parameter mandatory, and therefore the caller's problem :D

I went through the libs code and the specific mode doesn't seem to matter in this case, so let's try and slowly move towards having no modes (they're effectively a class hierarchy smeared all over the code).

Sadly, this will break the sinsp API, but it feels better than adding a new mode value that's only ever used in one place (and would have to be handled in is_plugin or is_live)

[side rant: scap does not care about machine_info etc. one bit, except for capture files so this could all happily live in sinsp; this would need a giant-but-ultimately-trivial PR to untangle that I have made several times already but never ended up submitting it because $REASONS]

@gnosek
Copy link
Contributor

gnosek commented Jul 22, 2024

falcosecurity/libs#1969

@incertum
Copy link
Contributor

@gnosek yes SGTM #2821 (comment)

Also thanks for opening the PR, I was gonna suggest that you would be most suited for this clean up ;)

re

scap does not care about machine_info etc. one bit, except for capture files so this could all happily live in sinsp;

We are still always going back and forth with what should be in sinsp vs scap. Btw @mrgian is moving more of the metrics stuff to scap in order to make it multi-OS, see the PR falcosecurity/libs#1870.

@gnosek
Copy link
Contributor

gnosek commented Jul 23, 2024

Sigh :)

That looks like it could use moving libsinsp/metrics_collector.cpp to something like libsinsp/linux/metrics_collector.cpp and some build system changes to avoid #ifdef __linux__ in platform-agnostic code (either provide a stub for other platforms, or a #define HAS_METRICS_COLLECTOR or some such)

I'm not opposed to platform-specific code in libsinsp (honestly, there's little else in there), I'm just opposed to spreading it around the codebase with ifdefs or hidden assumptions. I'd love to see e.g. platform-agnostic interfaces with platform-specific implementations.

Ceterum censeo libscap esse delendam :)

@incertum
Copy link
Contributor

Ceterum censeo libscap esse delendam :)

:) looking forward to it!

Also thanks for posting on Gianmatteo's PR with few suggestions!

@incertum
Copy link
Contributor

This was addressed by falcosecurity/libs#1969

Closing for now. If we still see issues we can re-open this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants