
Firecracker: Snapshot load causes a lot of firecracker virtio block error logs #2080

jeromegn opened this issue Nov 20, 2024

@jeromegn

I don't know whether these logs matter, because my app appears to be working properly (though it isn't accessing the filesystem after the snapshot load).

2024-11-20T12:07:50.095125693 [anonymous-instance:main] Failed to execute Out virtio block request: FileEngine(Sync(Transfer(IOError(Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }))))
2024-11-20T12:07:50.095131604 [anonymous-instance:main] Failed to execute Out virtio block request: FileEngine(Sync(Transfer(IOError(Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }))))
2024-11-20T12:07:50.095137595 [anonymous-instance:main] Failed to execute Out virtio block request: FileEngine(Sync(Transfer(IOError(Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }))))

I haven't seen anybody else report these errors in the Firecracker repository, so I am assuming it is something specific to Nanos. I'm happy to open a Firecracker issue as well if I am misguided.

To reproduce, I believe:

  • Build (using ops) and start any Deno app in a Firecracker microVM
  • Pause it
  • Take a snapshot
  • Stop the Firecracker process
  • Start a new Firecracker process without any config
  • Load the snapshot in the new process

I see these logs before seeing any logs from my app:

2024-11-20T14:18:59.530639927 [anonymous-instance:main] Running Firecracker v1.10.1
2024-11-20T14:19:02.397687996 [anonymous-instance:fc_api] The API server received a Put request on "/snapshot/load" with body "{\n            \"snapshot_path\": \"./snapshot_file\",\n            \"mem_backend\": {\n                \"backend_path\": \"./mem_file\",\n                \"backend_type\": \"File\"\n            },\n            \"enable_diff_snapshots\": true,\n            \"resume_vm\": true\n    }".
2024-11-20T14:19:02.397716760 [anonymous-instance:main] [DevPreview] Virtual machine snapshots is in development preview.
2024-11-20T14:19:02.397852002 [anonymous-instance:main] Host CPU vendor ID: [65, 117, 116, 104, 101, 110, 116, 105, 99, 65, 77, 68]
2024-11-20T14:19:02.397856540 [anonymous-instance:main] Snapshot CPU vendor ID: [65, 117, 116, 104, 101, 110, 116, 105, 99, 65, 77, 68]
2024-11-20T14:19:02.407181046 [anonymous-instance:main] Artificially kick devices.
2024-11-20T14:19:02.407187508 [anonymous-instance:main] kick vsock vsock.
2024-11-20T14:19:02.407192928 [anonymous-instance:main] kick block rootfs.
2024-11-20T14:19:02.407195954 [anonymous-instance:main] kick net eth0.
2024-11-20T14:19:02.407227983 [anonymous-instance:fc_vcpu 0] Received a VcpuEvent::Resume message with immediate_exit enabled. immediate_exit was disabled before proceeding
2024-11-20T14:19:02.407245746 [anonymous-instance:main] [DevPreview] Virtual machine snapshots is in development preview - 'load snapshot' VMM action took 9520 us.
2024-11-20T14:19:02.407257158 [anonymous-instance:fc_api] The request was executed successfully. Status code: 204 No Content.
2024-11-20T14:19:02.407263109 [anonymous-instance:fc_api] 'load snapshot' API request took 9593 us.

Some artifacts:

Firecracker config
{
  "boot-source": {
    "kernel_image_path": "/home/jerome/.ops/0.1.52/kernel.img",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off random.trust_cpu=on"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "/home/jerome/.ops/images/app",
      "is_root_device": true,
      "is_read_only": true
    }
  ],
  "network-interfaces": [
    {
      "iface_id": "eth0",
      "guest_mac": "AA:FC:00:00:00:01",
      "host_dev_name": "tap0"
    }
  ],
  "machine-config": {
    "vcpu_count": 1,
    "mem_size_mib": 512
  },
  "balloon": {
    "amount_mib": 0,
    "deflate_on_oom": false,
    "stats_polling_interval_s": 1
  },
  "logger": {
    "log_path": "logs.fifo",
    "level": "Debug"
  },
  "metrics": {
    "metrics_path": "metrics.fifo"
  },
  "vsock": {
    "guest_cid": 3,
    "uds_path": "./v.sock"
  }
}
Deno main.ts
Deno.serve((_req) => {
    let mem = Deno.memoryUsage();
    console.log(`rss: ${mem.rss}, heap total: ${mem.heapTotal}, heap used: ${mem.heapUsed}, external: ${mem.external}`);
    return fetch("https://debug.fly.dev")
});
Snapshotting + load
sudo curl --unix-socket app/fc.sock -i \
    -X PATCH 'http://localhost/vm' \
    -H 'Accept: application/json' \
    -H 'Content-Type: application/json' \
    -d '{
            "state": "Paused"
    }'

sudo curl --unix-socket app/fc.sock -i \
    -X PUT 'http://localhost/snapshot/create' \
    -H  'Accept: application/json' \
    -H  'Content-Type: application/json' \
    -d '{
            "snapshot_type": "Full",
            "snapshot_path": "./snapshot_file",
            "mem_file_path": "./mem_file"
    }'
sudo rm fc.sock v.sock; sudo firecracker-v1.10.1-x86_64 --api-sock fc.sock
sudo curl --unix-socket app/fc.sock -i \
    -X PUT 'http://localhost/snapshot/load' \
    -H  'Accept: application/json' \
    -H  'Content-Type: application/json' \
    -d '{
            "snapshot_path": "./snapshot_file",
            "mem_backend": {
                "backend_path": "./mem_file",
                "backend_type": "File"
            },
            "enable_diff_snapshots": true,
            "resume_vm": true
    }'
@francescolavra (Member)

Those virtio block errors appear to be unrelated to snapshotting, and are caused by Deno trying to write to the filesystem while the underlying drive is marked as read-only (by the "is_read_only": true option in your config file). Even though your app may not be accessing the filesystem directly, Deno tries to create its internal cache (at /root/.cache/deno/) and write files in there.
If you want to prevent Deno from trying to write to the filesystem, you can mark the filesystem as read-only at the guest level, e.g. by adding readonly_rootfs=t (which is a Nanos-specific option) to the kernel command line (i.e. the boot_args string in the Firecracker config).
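For illustration, the boot-source section of the config above would look like this with that option appended (a sketch; the rest of the config is unchanged):

"boot-source": {
  "kernel_image_path": "/home/jerome/.ops/0.1.52/kernel.img",
  "boot_args": "console=ttyS0 reboot=k panic=1 pci=off random.trust_cpu=on readonly_rootfs=t"
}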

@jeromegn (Author)

Thank you, that does appear to be my bad.

> If you want to prevent Deno from trying to write to the filesystem, you can mark the filesystem as read-only at the guest level, e.g. by adding readonly_rootfs=t (which is a Nanos-specific option) to the kernel command line (i.e. the boot_args string in the Firecracker config).

What if I have multiple Firecracker drives (the app image plus several volumes)? Can I set which ones are read-only and which ones aren't? They are all read-only currently, but that might not be true in the future.

@eyberg (Contributor) commented Nov 25, 2024

Yes, you can set read-only on individual volumes: https://docs.ops.city/ops/volumes#read-only
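Note that at the Firecracker level read-only is already a per-drive setting, so the drives array in the config can mix modes. A sketch with the read-only rootfs from above plus a hypothetical writable volume (the vol0 drive and its path are illustrative):

"drives": [
  {
    "drive_id": "rootfs",
    "path_on_host": "/home/jerome/.ops/images/app",
    "is_root_device": true,
    "is_read_only": true
  },
  {
    "drive_id": "vol0",
    "path_on_host": "/home/jerome/.ops/volumes/vol0",
    "is_root_device": false,
    "is_read_only": false
  }
]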
