Maintenance mode destination nullpointer with local storage data disk #9887

Open
BartJM opened this issue Nov 4, 2024 · 4 comments

@BartJM
Contributor

BartJM commented Nov 4, 2024

ISSUE TYPE
  • Bug Report
COMPONENT NAME
Maintenance Mode
CLOUDSTACK VERSION
4.19.1.2
CONFIGURATION
host.maintenance.local.storage.strategy: Migration
OS / ENVIRONMENT
SUMMARY

A NullPointerException occurs when putting a host into maintenance if the host has a running VM with a data disk on local storage.

I found this bug while fixing another bug, where the host going into maintenance is not avoided, so maintenance can fail for VMs on local storage.
During testing I did not find a path in the current version that reaches the failing code without a debugger attached.

Bug is caused by the deployment returning null causing a nullpointer in

The deployment returns null because the only suitable storage pool that findSuitablePoolsForVolumes (in server/src/main/java/com/cloud/deploy/DeploymentPlanningManagerImpl.java) returns for the data disk is the pool on the host where the data disk is currently located. (This applies only to data disks; for a local storage root disk, all suitable storage pools are evaluated.) Because that host is avoided while it is being prepared for maintenance, the deployment planner has no suitable hosts left and returns null.
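
The failure mode described above can be sketched in isolation. This is a minimal, hypothetical illustration and not CloudStack code: `LocalPool`, `planDestinationHost`, and `migrateAway` are invented names, and the real planner works on much richer types. It only shows how a planner ends up with nothing to return when the single candidate pool belongs to the avoided host, and how a caller could guard against that null instead of dereferencing it.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these types or methods exist in CloudStack.
class LocalDataDiskPlanningSketch {

    // Stand-in for a local storage pool, which is tied to exactly one host.
    static final class LocalPool {
        final long poolId;
        final long hostId;

        LocalPool(long poolId, long hostId) {
            this.poolId = poolId;
            this.hostId = hostId;
        }
    }

    // Returns the first host whose pool is usable, or null when every
    // candidate pool sits on an avoided host.
    static Long planDestinationHost(List<LocalPool> suitablePools, Set<Long> avoidedHosts) {
        for (LocalPool pool : suitablePools) {
            if (!avoidedHosts.contains(pool.hostId)) {
                return pool.hostId;
            }
        }
        return null;
    }

    // Guarded caller: checks for null before using the planner result.
    static String migrateAway(List<LocalPool> suitablePools, Set<Long> avoidedHosts) {
        Long destination = planDestinationHost(suitablePools, avoidedHosts);
        if (destination == null) {
            return "no suitable destination";
        }
        return "migrate to host " + destination;
    }

    public static void main(String[] args) {
        // The data disk lives on host 1, so its only candidate pool is on host 1.
        List<LocalPool> pools = List.of(new LocalPool(10L, 1L));
        // Host 1 is being prepared for maintenance and is therefore avoided.
        System.out.println(migrateAway(pools, Set.of(1L)));
    }
}
```

In this sketch, an unguarded caller that used the result of `planDestinationHost` directly would fail in the same way as the stack trace in this report.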

Migrating the VM through the UI does not cause issues.

STEPS TO REPRODUCE
EXPECTED RESULTS

The VM is migrated and the host goes into maintenance.

ACTUAL RESULTS
2024-11-01 16:04:39,428 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-5:ctx-b739d4ae job-371) (logid:51494050) Unexpected exception while executing org.apache.cloudstack.api.command.admin.host.PrepareForMaintenanceCmd
java.lang.NullPointerException
        at com.cloud.resource.ResourceManagerImpl.migrateAwayVmWithVolumes(ResourceManagerImpl.java:1471)
        at com.cloud.resource.ResourceManagerImpl.doMaintain(ResourceManagerImpl.java:1403)
        at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1489)
        at com.cloud.resource.ResourceManagerImpl.maintain(ResourceManagerImpl.java:1543)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
        at com.sun.proxy.$Proxy200.maintain(Unknown Source)
        at org.apache.cloudstack.api.command.admin.host.PrepareForMaintenanceCmd.execute(PrepareForMaintenanceCmd.java:101)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:172)
        at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:112)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:654)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:48)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:55)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:102)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:52)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:45)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:602)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)

@DaanHoogland
Contributor

@BartJM is #9892 a full solution for the issue, or just a partial solution?

@BartJM
Contributor Author

BartJM commented Nov 7, 2024

#9892 is not a solution to this issue. It fixes a different issue (VMs on local storage not migrating when the host enters maintenance), and that fix is what makes this issue reachable.

@rajujith
Collaborator

rajujith commented Nov 8, 2024

@BartJM I tested host maintenance in 4.18 on a host with a VM using local storage, with host.maintenance.local.storage.strategy set to Migration.

It fails because it selects the source host/primary storage as the migration destination.

I understand PR 9892 fixes the above issue, but it still won't result in a successful VM migration without fixing this NPE.

@BartJM
Contributor Author

BartJM commented Nov 8, 2024

@rajujith PR 9892 fixes the issue you describe, and with it the VM migration does succeed when no data disk is used or when the data disk is not on local storage.

This NPE only occurs when a local storage data disk is used.
