[Clone from Snapshot] CloneCommand never sent #536
Comments
I've also noticed that the clone feature support is not tested in any way either, or at least I don't see it being run. That said, with the lack of logging it's REALLY hard to trace the bug.
I took a few notes the other day on what should be done about the lacking tests for the local engines. I'll write it up in a ticket so we can track progress on addressing it.
Ahh, good to see activity on that!
Yes that would be great as well!
I noticed. I think it might be best to fix the tests first and go from there? Edit/Addition: On another note, there's the lack of verbosity though...
I've raised openebs/openebs#3726
Yes, that makes sense. I'd also like to see how many of these local-engine (lvm, zfs...) tests we can "unify". Basic CSI etc. functionality should work exactly the same, so presumably the same basic tests should apply to all?
So you're saying even if you increase verbosity there's nothing else being added? Maybe we can track this as an enhancement too. CC @sinhaashish EDIT: sorry, mouse slip...
Is there even a verbosity option?
Doubt verbosity would've helped much in this case btw, as I manually tracked the log messages through the code and it never even hit the function that should've sent the zfs-clone command.
sigh, sorry my mistake for making this assumption, seems there's not.. |
@jnels124 Can you confirm here that you see this behavior as well?
Hey @Ornias1993, I tried to reproduce this issue but it seems to be working for me. Can I get more specific details so that I can see how to reproduce it (what exact steps you performed, with YAML if possible)? The logs don't seem useful here; I just see a bunch of the following lines repeating for different PVCs
I've not added all logs yet, but the issue is also that there is no verbosity at all. Those "this no worky" errors are not usable log output anyway.
Obviously this is a running system so the logs do contain things that aren't related.
Primarily the use is through VolSync, so I cannot provide YAML for you. However, everything works perfectly fine, except that the PVC VolSync creates throws a vague error and OpenEBS never executes the clone command. But due to the insane lack of any way of tracing it (because there is no verbosity anywhere), I have no way of even trying to fix it myself.
@w3aman Can you share your way of testing this in yaml format? |
I tried to replicate the spec from your zfsvolume and these are my YAMLs (zfs_yaml.txt); they work perfectly fine. Here is my zfsvolume for the target PVC:
Can you send me the output for this command for the node agent pod on the node
I already uploaded those logs. It's not my personal system; I went through a troubleshooting session with one of our users.
@Ornias1993 I do see this behavior as well. However, I am creating the snapshots with different tooling. We deploy citusdb (a PostgreSQL extension) via Stackgres. Stackgres provides the snapshotting ability.
@Ornias1993 are you setting the
In my case, I believe the issue is caused by the The issue appears to be with This block is problematic. Because the value of
This means that we are able to create fresh volumes correctly, but not from volume clones, or snapshot clones because a few lines down we have.
That method will fail when the value of selected is the
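The mismatch described above can be illustrated with a small, self-contained sketch. This is not the actual zfs-localpv scheduler code; the function and variable names (`selectNode`, `OwnerNodeID`, the label values) are hypothetical, chosen only to show why selection succeeds for fresh volumes but fails for clones when the node carries a custom topology label value:

```go
package main

import "fmt"

// selectNode is an illustrative stand-in for the node-selection logic:
// it matches candidate nodes against a required topology value. For a
// fresh volume that value comes from a node's topology label; for a
// clone or snapshot restore it is the parent volume's OwnerNodeID (the
// Kubernetes node name). If the node was labeled with a custom value,
// the two never match and no node is ever selected.
func selectNode(nodesByLabel map[string]string, required string) (string, error) {
	for labelValue, nodeName := range nodesByLabel {
		if labelValue == required {
			return nodeName, nil
		}
	}
	return "", fmt.Errorf("no node matches topology value %q", required)
}

func main() {
	// Node "worker-1" carries a custom topology label value "zone-a".
	nodes := map[string]string{"zone-a": "worker-1"}

	// Fresh volume: the scheduler is handed the label value -> works.
	n, err := selectNode(nodes, "zone-a")
	fmt.Println(n, err) // worker-1 <nil>

	// Clone/restore: the scheduler is handed the parent's OwnerNodeID
	// (the node name), which is not a label value -> selection fails.
	n, err = selectNode(nodes, "worker-1")
	fmt.Println(n, err)
}
```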
Interesting, I'm not explicitly doing that, but maybe VolSync is doing that?
I believe the reason for using the OwnerNodeId is to ensure that we select the node where the parent volume's pool is present, since snapshot, restore, and clone all need to go to the same node. You said that the
@Abhinandan-Purkait So with this configuration, I am able to create new ZFS volumes without issue. However, when you attempt to create a ZFS volume from a snapshot definition, it fails. In the logs I can see In the case of creating a new volume, this method is called with the node name, but when it is from a volume clone or a snapshot, it gets called with the value of
Reproduction steps:
I reviewed some of your logs that I didn't see before and saw that these two values match in your case, so we can ignore that.
@jnels124 Thanks for explaining your use case. It seems pre-labeling the node was not part of the design, so as per the current design it's not supposed to work. Or do you think the way it works is fundamentally wrong and needs refactoring? Can you please create an enhancement or refactoring issue and explain what can be done here so that it can be worked on? That said, there are two issues here: one is @jnels124's use case and the other was reported by @Ornias1993. They seem to have different scope and might need different resolutions. Unfortunately we haven't been able to reproduce the latter, even with the same volume specs and environment, as reported by @w3aman.
Yeah, that's a shame for sure... Worth noting: @Abhinandan-Purkait, I still owe you an apology for my previous remark calling you an incompetent intern. In hindsight, you shouldn't have been blamed by me solely and personally.
That's unfortunate. Alright, we can try this out with VolSync and let you know the findings.
That would be completely awesome, thank you! I see the request being filed for the clone, but it just... well... doesn't pass through the complete codepath... |
Can you provide some information about the VolSync configuration? Like what replication method is being used, and what copyMethod is being used for it.
Hi @Ornias1993, I tried VolSync with Restic + a GCP bucket, and restore using VolumePopulator with copyMethod Snapshot. I was not able to reproduce it. You can check below. The configs used: The logs and the PV:
Logically, because that copy method doesn't use snapshots at all. However, that's great information, because it limits where the issue is occurring to purely the clone-from-snapshot side of things.
Ahh okay, I thought you referred to copyMethod Snapshot. In that case, can you tell me what the flow is? I am new to the VolSync tool. I mean, what do I need to change in the way I did it so that I can reproduce?
Closed As: Cannot Reproduce
I have a similar issue. I encountered this with VolSync, CloudNativePG backups, etc. https://gist.github.com/dszakallas/e3653d1a70df5f7477530586802120d1 I cannot access the links anymore so I cannot corroborate whether the underlying root cause is the same, but in my case
causes the issue. This seems to be the same as #427, with the resolution of removing the
This change resolves the issue: https://github.com/openebs/zfs-localpv/compare/v2.5.x...dszakallas:zfs-localpv:dsz/fix-readonly-clone-params?expand=1
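Judging only by the branch name (`fix-readonly-clone-params`), the fix appears to drop read-only properties before they are passed as `-o` options when cloning. A minimal sketch of that idea, with entirely illustrative property names and no claim to match the actual patch:

```go
package main

import "fmt"

// readOnlyProps lists properties that ZFS will reject when set on a
// clone. These names are illustrative placeholders, not taken from the
// actual zfs-localpv patch.
var readOnlyProps = map[string]bool{
	"origin":    true,
	"createtxg": true,
}

// cloneOptions builds the argument list for a hypothetical
// `zfs clone -o prop=value ...` invocation, skipping read-only
// properties. Passing such a property would make the command fail
// before any clone is attempted.
func cloneOptions(props map[string]string) []string {
	args := []string{"clone"}
	for k, v := range props {
		if readOnlyProps[k] {
			continue // read-only on a clone; would cause the command to fail
		}
		args = append(args, "-o", fmt.Sprintf("%s=%s", k, v))
	}
	return args
}

func main() {
	fmt.Println(cloneOptions(map[string]string{
		"recordsize": "128k",
		"origin":     "zpool/parent@snap", // filtered out
	}))
	// [clone -o recordsize=128k]
}
```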
@Abhinandan-Purkait I'll reopen this, as @dszakallas found the potential cause. Though I have to add that it also highlights how lacking the testing regimen is for some of the codebase.
What steps did you take and what happened:
When creating a PVC as a clone from a snapshot of another PVC, the clone command never gets sent.
ZFS command log:
Notice "testclone", which was run manually; the log never showed any attempt at running the clone command, successful or otherwise.
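For reference, the operation missing from the log amounts to a single `zfs clone <pool>/<dataset>@<snapshot> <pool>/<clone>` invocation. A sketch of building that argument list (pool and dataset names are made up for illustration):

```go
package main

import "fmt"

// cloneArgs assembles the argument vector for a `zfs clone` call:
//   zfs clone <pool>/<parent>@<snap> <pool>/<clone>
// The names passed in below are illustrative, not from the reported system.
func cloneArgs(pool, parent, snap, clone string) []string {
	src := fmt.Sprintf("%s/%s@%s", pool, parent, snap)
	dst := fmt.Sprintf("%s/%s", pool, clone)
	return []string{"clone", src, dst}
}

func main() {
	fmt.Println(cloneArgs("zpool", "pvc-origin", "snap-1", "pvc-clone"))
	// [clone zpool/pvc-origin@snap-1 zpool/pvc-clone]
}
```

Had the driver reached the clone codepath, a line resembling this command would appear in the ZFS command history shown above.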
What did you expect to happen:
It should at least send the clone command, whether it fails or succeeds.
The output of the following commands will help us better understand what's going on:
zfsvolume cr for targetPVC
volume for origin is all correct.
target and origin share the same (known-good) storageClass
kubectl logs -f openebs-zfs-controller-f78f7467c-blr7q -n openebs -c openebs-zfs-plugin
Gist logs are to be added, but highlights.
But the error is just failing on the Dataset existence check, after creation of the ZFSVolume CR, and no descriptive error is output in any of them; just the fact that volume creation failed, which we can already see on the zfsvolume CR.
kubectl logs -f openebs-zfs-node-[xxxx] -n openebs -c openebs-zfs-plugin
https://gist.github.com/Ornias1993/9cb23dc0df026233e8d64c74d70bd39a
https://gist.github.com/Ornias1993/991e63fbc41fd68e71a35c2f8f2e3a62
Anything else you would like to add:
Everything works perfectly fine and the PVC is also created fine.
Other PVC creation works fine.
All OpenEBS pods and components are fine
volumesnapshot objects are present and fine, as well as the ZFS snapshots.
But regardless of this, at least the clone command should've been sent.
Environment: