smartctl_exporter ignores nvme devices by default #210041
Comments
Hi @robryk, the latest version of
However, it does need to be run by root. A little Googling led me to "Why smartctl could not be run without root". They concluded that root was required, and they went so far as to open an issue with smartmontools. After digging through the smartmontools tickets, I found the ticket the blogger opened: "RFE: add O_RDRW mode for sat/scsi/ata devices".
Please note that we have smartctl running as non-root and collecting information from SATA devices. I don't know why your link claims it doesn't work (maybe it used not to?), but it does work now, so I won't try to investigate whether it used not to. The issue here is that smartctl, when running as non-root and without access to open top-level NVMe devices, won't collect from NVMe. The reason it doesn't have access to top-level NVMe devices is that they are not group-owned by
Let's add a group named
I'm opposed to making the exporter run as root if we don't have to, sorry.
smartctl_exporter already runs with SupplementaryGroups "disk", which gives full access to SATA drives, but NVMe devices are owned by root:root, resulting in no access:
[...] msg="Smartctl open device: /dev/nvme0 failed: Permission denied"
This patch introduces a "smartctl-exporter-access" supplementary group and a udev rule with setfacl to give the exporter access to NVMe drives, without changing the base root:root ownership.
Fixes #210041
(cherry picked from commit 86a6ef5)
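To make the effect of the setfacl approach concrete, here is a rough dry-run sketch in shell. It only prints the commands that would grant a group access to top-level NVMe controller nodes; it is not the udev rule from the patch, whose exact text is not shown in this thread. The "rw" permission set and the helper names (is_nvme_controller, acl_command, print_acl_commands) are assumptions made for illustration.

```shell
#!/bin/sh
# Succeeds only for top-level NVMe controller nodes such as /dev/nvme0.
# Namespaces (/dev/nvme0n1) and other names containing a non-digit
# after "nvme" are rejected.
is_nvme_controller() {
    case ${1##*/} in
        nvme*[!0-9]*) return 1 ;;
        nvme[0-9]*)   return 0 ;;
        *)            return 1 ;;
    esac
}

# Print (do not run) a setfacl command that adds a group ACL entry.
# The owner and group of the node itself stay untouched.
# Assumption: read/write ("rw") is the access the exporter needs.
acl_command() {
    printf 'setfacl -m g:%s:rw %s\n' "$1" "$2"
}

# Print one command per controller node found in the argument list.
print_acl_commands() {
    group=$1
    shift
    for dev in "$@"; do
        if is_nvme_controller "$dev"; then
            acl_command "$group" "$dev"
        fi
    done
}

# Dry run for the group named in the commit message. Prints nothing
# on a machine without NVMe controllers.
print_acl_commands smartctl-exporter-access /dev/nvme*
```

Running the printed commands would need root, and an ACL set by hand does not survive the device node being recreated, which is presumably why the patch applies it from a udev rule.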
Describe the bug
smartctl_exporter is completely silent in its metrics about NVMe devices:
When you look at its log, you can see that it complains about being unable to open the device:
Steps To Reproduce
Steps to reproduce the behavior:
prometheus.exporters.smartctl.enable = true;
http://127.0.0.1:9633/metrics
journalctl -u prometheus-smartctl-exporter.service
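The last two steps can be checked mechanically. The sketch below defines two small filters: one counts metric lines that mention NVMe, the other looks for the permission error in the log. The helper names are made up for this example, and the match on the substring "nvme" is an assumption about how NVMe devices would show up in the metrics.

```shell
#!/bin/sh
# Count metric lines on stdin that mention NVMe, ignoring the
# "# HELP" / "# TYPE" comment lines of the exposition format.
count_nvme_lines() {
    grep -v '^#' | grep -ci 'nvme'
}

# Succeed if the log text on stdin contains a permission error.
has_permission_error() {
    grep -q 'Permission denied'
}

# Usage against a live system (URL and unit name as in the steps above):
#   curl -s http://127.0.0.1:9633/metrics | count_nvme_lines
#   journalctl -u prometheus-smartctl-exporter.service | has_permission_error \
#       && echo "exporter could not open a device"
```

A count of 0 together with a permission error in the log matches the behavior described in this report.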
Expected behavior
I would expect the NVMe device to have its smart data collected. I don't have an opinion on whether they should be collected from the top-level device (e.g. /dev/nvme0) or at the namespace level (e.g. /dev/nvme0n1).
Additional context
smartctl_exporter gets started as a user that doesn't have read access to top-level nvme devices, which causes it to completely ignore them (i.e. behave as if they didn't exist; see #91 on smartctl_exporter for the bug about it being silent). It does have access to devices at namespace level, but its device auto-detection detects the top-level one.
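The difference between the two levels can be seen with a short check like the one below. This is a minimal sketch: /dev/nvme0 and /dev/nvme0n1 are the example names from this report, and the helper names are invented here. It should be run as the user the exporter runs as, since root can read every node regardless of ownership.

```shell
#!/bin/sh
# Classify a path as "char", "block", "missing", or "other".
node_type() {
    if [ -c "$1" ]; then
        echo char
    elif [ -b "$1" ]; then
        echo block
    elif [ ! -e "$1" ]; then
        echo missing
    else
        echo other
    fi
}

# Report whether the current user has read permission on the path.
can_read() {
    if [ -r "$1" ]; then
        echo yes
    else
        echo no
    fi
}

# Compare the top-level controller node with the namespace node.
# Nodes that do not exist on this machine are reported as "missing".
for dev in /dev/nvme0 /dev/nvme0n1; do
    printf '%s type=%s readable=%s\n' \
        "$dev" "$(node_type "$dev")" "$(can_read "$dev")"
done
```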
It seems that smartctl claims that the "correct" level to query is /dev/nvme0 (see smartctl --scan), even when smartctl -a /dev/nvme0n1 also works.
The reason the exporter can query all the other devices is because it runs with disk in supplementary groups:
As shown above, top-level NVMe devices are owned by root:root. I guess this might be caused by them being char (as opposed to block) devices.
Notify maintainers
@mweinelt @Frostman
Metadata
Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.
My nixpkgs version is c4e1db0 with own patches on top.