Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve snapshotting functionality #29

Open
wants to merge 10 commits into
base: v0.25_reference
Choose a base branch
from

Conversation

amohoste
Copy link

This PR adds the following changes

  • The HTTP control client timeout has been increased to accomodate longer snapshot creation and restore times
  • Booted and snapshot loaded VMs can execute in their own network namespace
  • A new shim is created upon loading a snapshot, allowing us to boot multiple VMs from a single snapshot
  • A newSnapshotPath parameter has been added which allows to specify where the devicemapper snapshot file file that contains the disk state is located (see Allow specifying a custom devmapper snapshot device on snapshot boot ease-lab/firecracker#3)
  • Added the ability to create differential snapshot

@@ -98,11 +98,12 @@ message IPConfiguration {
message FirecrackerMachineConfiguration {
string CPUTemplate = 1; // Specifies the cpu template. Example: "T2" or "C3"
bool HtEnabled = 2; // Specifies if hyper-threading should be enabled
bool TrackDirtyPages = 3; // Specified if dirty-page-tracking should be enabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please move to the 5th position to allow backwards compatibility.

// Specifies the memory size of VM
// This lets us create a Firecracker VM of up to 4096 TiB, which
// for a microVM should be large enough
uint32 MemSizeMib = 3;
uint32 VcpuCount = 4; // Specifies the number of vCPUs for the VM
uint32 MemSizeMib = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert

@@ -268,20 +274,22 @@ func (s *service) startEventForwarders(remotePublisher shim.Publisher) {
go func() {
<-s.vmReady

// Once the VM is ready, also start forwarding events from it to our exchange
attachCh := eventbridge.Attach(ctx, s.eventBridgeClient, s.eventExchange)
if ! s.snapLoaded {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactor for a better code style:

if s.snapLoaded { return }

<...> // everything just like before but without an indent

return &empty.Empty{}, nil
}

// shutdownSnapLoadedVm shuts down a vm that has been loaded from a snapshot
func (s *service) shutdownSnapLoadedVm() error {
// Kill firecracker process and its shild processes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Kill firecracker process and its shild processes
// Kill firecracker process and its child processes

DialTimeout: 100 * time.Millisecond,
RequestTimeout: 10 * time.Second,
ResponseHeaderTimeout: 10 * time.Second,
DialTimeout: 1000 * time.Millisecond,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason for this timeout increasing? why to those values?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed some timeouts when restoring or loading snapshots, as far as I remember especially for larger memory sizes which makes sense. The values are chosen quite arbitrarily.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

try decreasing them (e.g., try 2x vs 10x as of now). large timeouts create issues in the control plane, the system becomes less responsive

}

// Offload Shuts down a VM and deletes the corresponding firecracker socket
// and vsock. All of the other resources will persist
// and vsock. All of the other resources will persist. Depracated!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// and vsock. All of the other resources will persist. Depracated!
// and vsock. All of the other resources will persist. DEPRECATED!

Signed-off-by: Amory Hoste <[email protected]>
@amohoste amohoste force-pushed the v0.25_reference branch 7 times, most recently from 0e94206 to 032bd10 Compare January 23, 2022 18:24
Copy link
Member

@ustiugov ustiugov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants