
workingDir not created when missing #2164

Open
ldoktor opened this issue Nov 21, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@ldoktor (Contributor) commented Nov 21, 2024

Describe the bug

When creating a normal pod with workingDir pointing to a non-existent location, the runtime happily creates the directory for us. This does not happen with peer-pods.

How to reproduce

cat << \EOF | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ldoktor
  labels:
    app: http-server-app
spec:
  runtimeClassName: kata-remote
  containers:
  - name: http-server
    image: busybox
    command: ["sh", "-xc", "pwd"]
    workingDir: /foo/bar
  restartPolicy: Never
EOF
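
For comparison, a baseline check (an assumption on my side, not part of the original report; it presumes the cluster's default runtime is runc-based): the identical manifest without runtimeClassName completes and prints /foo/bar, because runc creates a missing workingDir (process.cwd) inside the container rootfs before starting the process.

# Hypothetical baseline check; the pod name ldoktor-runc is made up for illustration.
cat << \EOF | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ldoktor-runc
  labels:
    app: http-server-app
spec:
  containers:
  - name: http-server
    image: busybox
    command: ["sh", "-xc", "pwd"]
    workingDir: /foo/bar
  restartPolicy: Never
EOF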

CoCo version information

cloud-api-adaptor 04ccaf6

What TEE are you seeing the problem on

None

Failing command and relevant log output

oc describe pod/ldoktor
...
Events:
  Type     Reason     Age    From               Message
  ----     ------     ----   ----               -------
  Normal   Scheduled  2m21s  default-scheduler  Successfully assigned user-ns2/ldoktor to ip-192-168-13-255.ec2.internal
  Normal   Pulling    5s     kubelet            Pulling image "busybox"
  Normal   Pulled     5s     kubelet            Successfully pulled image "busybox" in 121ms (121ms including waiting). Image size: 2167126 bytes.
  Normal   Created    5s     kubelet            Created container http-server
  Warning  Failed     2s     kubelet            Error: failed to create containerd task: failed to create shim task: ENOENT: No such file or directory

Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: rustjail::container::do_init_child
   2: rustjail::container::init_child
   3: kata_agent::main
   4: std::sys_common::backtrace::__rust_begin_short_backtrace
   5: std::rt::lang_start

Stack backtrace:
   0: anyhow::kind::Adhoc::new
   1: rustjail::sync_with_async::read_async::{{closure}}
   2: <rustjail::container::LinuxContainer as rustjail::container::BaseContainer>::start::{{closure}}
   3: kata_agent::rpc::AgentService::do_create_container::{{closure}}::{{closure}}.10322
   4: <kata_agent::rpc::AgentService as protocols::agent_ttrpc_async::AgentService>::create_container::{{closure}}
   5: <protocols::agent_ttrpc_async::CreateContainerMethod as ttrpc::asynchronous::utils::MethodHandler>::handler::{{closure}}
   6: ttrpc::asynchronous::server::HandlerContext::handle_msg::{{closure}}
   7: <ttrpc::asynchronous::server::ServerReader as ttrpc::asynchronous::connection::ReaderDelegate>::handle_msg::{{closure}}::{{closure}}
   8: tokio::runtime::task::raw::poll
   9: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  10: tokio::runtime::task::raw::poll
  11: std::sys_common::backtrace::__rust_begin_short_backtrace
  12: core::ops::function::FnOnce::call_once{{vtable.shim}}
  13: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at ./rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2007:9
  14: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at ./rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/alloc/src/boxed.rs:2007:9
  15: std::sys::unix::thread::Thread::new::thread_start
             at ./rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/std/src/sys/unix/thread.rs:108:17: unknown
ldoktor added the bug (Something isn't working) label Nov 21, 2024
@bpradipt (Member) commented Dec 3, 2024

@ldoktor, is this issue only with kata-remote, or with Kata/QEMU as well?

@ldoktor (Contributor, Author) commented Dec 20, 2024

Hello @bpradipt, I reproduced this issue on AWS using an EKS cluster with containerd as well as CRI-O, and also using a kcli-created generic cluster on AWS, all with the same error message.

As for Kata, I tried kata-qemu and kata-clh; both failed with a slightly different error message compared to peer-pods:

  Type     Reason          Age   From               Message
  ----     ------          ----  ----               -------
  Normal   Scheduled       87s   default-scheduler  Successfully assigned default/ldoktor to kata-k8s-worker-0
  Normal   AddedInterface  86s   multus             Add eth0 [10.244.1.20/24] from cbr0
  Normal   Pulling         85s   kubelet            Pulling image "busybox"
  Normal   Pulled          84s   kubelet            Successfully pulled image "busybox" in 1.02s (1.02s including waiting). Image size: 2167089 bytes.
  Normal   Created         84s   kubelet            Created container: http-server
  Warning  Failed          84s   kubelet            Error: failed to create containerd task: failed to create shim task: ENOENT: No such file or directory

Stack backtrace:
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>
   5: <unknown>

Stack backtrace:
   0: <unknown>
   1: <unknown>
   2: <unknown>
   3: <unknown>
   4: <unknown>
   5: <unknown>
   6: <unknown>
   7: <unknown>
   8: <unknown>
   9: <unknown>
  10: <unknown>
  11: <unknown>
  12: <unknown>
  13: <unknown>: unknown

@bpradipt (Member) commented:

> Hello @bpradipt, I reproduced this issue on AWS using an EKS cluster with containerd as well as CRI-O, and also using a kcli-created generic cluster on AWS, all with the same error message.
>
> As for Kata, I tried kata-qemu and kata-clh; both failed with a slightly different error message compared to peer-pods:

Probably the bug should be created in the kata-containers repo?

@ldoktor (Contributor, Author) commented Dec 21, 2024

It seems such a bug already existed (kata-containers/kata-containers#2555) but was resolved. @gkurz claims this issue won't reproduce on OCP, so perhaps it depends on the Kubernetes version?
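
A possible workaround in the meantime (an assumption on my side, not something confirmed in this thread): drop workingDir and create and enter the directory from the container command itself, so the agent never has to chdir into a missing path, or bake the directory into the image.

# Hypothetical workaround manifest, derived from the reproducer above.
cat << \EOF | oc apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ldoktor
  labels:
    app: http-server-app
spec:
  runtimeClassName: kata-remote
  containers:
  - name: http-server
    image: busybox
    command: ["sh", "-xc", "mkdir -p /foo/bar && cd /foo/bar && pwd"]
  restartPolicy: Never
EOF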
