Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cchost exits/crashes when joining a node using a read only snapshot mounted via blobfuse2 backed storage #6466

Open
gaurav137 opened this issue Sep 5, 2024 · 3 comments
Assignees

Comments

@gaurav137
Copy link

When hosting the read only snapshot directory backed by a blobfuse2 mount point the cchost process seems to exit/crash after logging "All sanity checks passed". Need to debug this more. Filing the bug for tracking purposes.

 2024-09-04T10:28:39.592537Z        100 [info ] ../src/host/main.cpp:444             | Registering RPC interface debug_interface, on  0.0.0.0:8082
2024-09-04T10:28:39.592581Z        100 [info ] ../src/host/socket.h:49              | TCP RPC Client listening on 0.0.0.0:8082
2024-09-04T10:28:39.592595Z        100 [info ] ../src/host/main.cpp:458             | Registered RPC interface debug_interface, on  0.0.0.0:8082
2024-09-04T10:28:39.592605Z        100 [info ] ../src/host/main.cpp:444             | Registering RPC interface primary_rpc_interface, on  0.0.0.0:8080
2024-09-04T10:28:39.592632Z        100 [info ] ../src/host/socket.h:49              | TCP RPC Client listening on 0.0.0.0:8080
2024-09-04T10:28:39.592643Z        100 [info ] ../src/host/main.cpp:458             | Registered RPC interface primary_rpc_interface, on  0.0.0.0:8080
2024-09-04T10:28:39.592719Z        100 [info ] ../src/host/main.cpp:593             | Startup host time: 2024-09-04 10:28:39.592716
2024-09-04T10:28:39.592758Z        100 [info ] ../src/host/main.cpp:651             | Creating new node - join existing network at host.docker.internal:59583
2024-09-04T10:28:41.966691Z        100 [info ] ../src/host/main.cpp:696             | Found latest snapshot file: /mnt/ro-snapshots/snapshots/snapshot_7_8.committed (size: 3168)
2024-09-04T10:28:41.966726Z        100 [info ] ../src/host/main.cpp:716             | Initialising enclave: enclave_create_node
2024-09-04T10:28:41.968036Z        100 [debug] ../src/host/enclave.h:239            | Padding config of size 1912 to 1912 bytes
2024-09-04T10:28:41.968084Z        100 [debug] ../src/host/enclave.h:248            | Padding startup snapshot of size 3168 to 3168 bytes
2024-09-04T10:28:42.067360Z        0   [debug] ../src/enclave/main.cpp:56           | All sanity check tests passed
root@cc8e54fc8945:/app#
 
@gaurav137 gaurav137 added the bug label Sep 5, 2024
@gaurav137
Copy link
Author

Update. Running cchost in gdb gave this output:

2024-09-05T07:48:19.978464Z        100 [fail ] ../src/host/main.cpp:276             | DEPRECATED: Enclave path was specified in config file! This should be removed from the config, and passed directly to the CLI instead
Dwarf Error: DW_FORM_strx1 found in non-DWO CU [in module /usr/lib/ccf/libjs_generic.virtual.so]
2024-09-05T07:48:20.265254Z        100 [info ] ../src/host/ledger.h:1067            | Recovering main ledger directory /mnt/storage/ledger
2024-09-05T07:48:21.273732Z        100 [info ] ../src/host/ledger.h:1114            | Main ledger directory /mnt/storage/ledger is empty: no ledger file to recover
2024-09-05T07:48:21.274045Z        100 [info ] ../src/host/snapshots.h:198          | Snapshots will be stored in existing directory: /mnt/storage/snapshots
2024-09-05T07:48:21.274523Z        100 [info ] ../src/host/lfs_file_handler.h:23    | Clearing contents from existing directory .index
2024-09-05T07:48:21.275173Z        100 [info ] ../src/host/socket.h:49              | TCP Node Server listening on 0.0.0.0:8081
2024-09-05T07:48:21.275210Z        100 [info ] ../src/host/main.cpp:444             | Registering RPC interface debug_interface, on  0.0.0.0:8082
2024-09-05T07:48:21.275234Z        100 [info ] ../src/host/socket.h:49              | TCP RPC Client listening on 0.0.0.0:8082
2024-09-05T07:48:21.275249Z        100 [info ] ../src/host/main.cpp:458             | Registered RPC interface debug_interface, on  0.0.0.0:8082
2024-09-05T07:48:21.275288Z        100 [info ] ../src/host/main.cpp:444             | Registering RPC interface primary_rpc_interface, on  0.0.0.0:8080
2024-09-05T07:48:21.275323Z        100 [info ] ../src/host/socket.h:49              | TCP RPC Client listening on 0.0.0.0:8080
2024-09-05T07:48:21.275356Z        100 [info ] ../src/host/main.cpp:458             | Registered RPC interface primary_rpc_interface, on  0.0.0.0:8080
2024-09-05T07:48:21.275422Z        100 [info ] ../src/host/main.cpp:593             | Startup host time: 2024-09-05 07:48:21.275419
2024-09-05T07:48:21.275459Z        100 [info ] ../src/host/main.cpp:651             | Creating new node - join existing network at host.docker.internal:55532
2024-09-05T07:48:23.459607Z        100 [info ] ../src/host/main.cpp:696             | Found latest snapshot file: /mnt/ro-snapshots/snapshots/snapshot_15_16.committed (size: 63406)
2024-09-05T07:48:23.459759Z        100 [info ] ../src/host/main.cpp:716             | Initialising enclave: enclave_create_node
[New Thread 0x7fbff1fde700 (LWP 444)]
2024-09-05T07:48:23.736428Z        0   [debug] ../src/enclave/main.cpp:56           | All sanity check tests passed
 
Thread 1 "cchost" received signal SIGSEGV, Segmentation fault.

0x00007fbff612e56f in nlohmann::json_abi_v3_11_3::detail::lexer<nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >, void>, nlohmann::json_abi_v3_11_3::detail::iterator_input_adapter<unsigned char const*> >::get() () from /usr/lib/ccf/libjs_generic.virtual.so

(gdb)

@achamayou
Copy link
Member

(gdb) bt
#0  0x00007fbff612e56f in nlohmann::json_abi_v3_11_3::detail::lexer<nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >, void>, nlohmann::json_abi_v3_11_3::detail::iterator_input_adapter<unsigned char const*> >::get() () from /usr/lib/ccf/libjs_generic.virtual.so
#1  0x00007fbff612b11e in nlohmann::json_abi_v3_11_3::detail::lexer<nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >, void>, nlohmann::json_abi_v3_11_3::detail::iterator_input_adapter<unsigned char const*> >::scan() () from /usr/lib/ccf/libjs_generic.virtual.so
#2  0x00007fbff6127bf3 in nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >, void> nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >, void>::parse<unsigned char const*>(unsigned char const*, unsigned char const*, std::__1::function<bool (int, nlohmann::json_abi_v3_11_3::detail::parse_event_t, nlohmann::json_abi_v3_11_3::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::json_abi_v3_11_3::adl_serializer, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >, void>&)>, bool, bool) () from /usr/lib/ccf/libjs_generic.virtual.so
#3  0x00007fbff61c7259 in ccf::deserialise_snapshot(std::__1::shared_ptr<ccf::kv::Store> const&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > const&, std::__1::vector<std::__1::unique_ptr<ccf::kv::ConsensusHook, std::__1::default_delete<ccf::kv::ConsensusHook> >, std::__1::allocator<std::__1::unique_ptr<ccf::kv::ConsensusHook, std::__1::default_delete<ccf::kv::ConsensusHook> > > >&, std::__1::vector<unsigned long, std::__1::allocator<unsigned long> >*, bool, std::__1::optional<std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > >) () from /usr/lib/ccf/libjs_generic.virtual.so
#4  0x00007fbff6221d17 in ccf::NodeState::initialise_startup_snapshot(bool) () from /usr/lib/ccf/libjs_generic.virtual.so
#5  0x00007fbff621daad in ccf::NodeState::launch_node() () from /usr/lib/ccf/libjs_generic.virtual.so
#6  0x00007fbff621a710 in ccf::NodeState::initiate_quote_generation()::{lambda(ccf::QuoteInfo const&, ccf::pal::snp::EndorsementEndpointsConfiguration const&)#1}::operator()(ccf::QuoteInfo const&, ccf::pal::snp::EndorsementEndpointsConfiguration const&) const () from /usr/lib/ccf/libjs_generic.virtual.so
#7  0x00007fbff61b8a20 in ccf::NodeState::create(StartType, StartupConfig&&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >&&) () from /usr/lib/ccf/libjs_generic.virtual.so
#8  0x00007fbff61b5d61 in ccf::Enclave::create_new_node(StartType, StartupConfig&&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >&&, unsigned char*, unsigned long, unsigned long*, unsigned char*, unsigned long, unsigned long*) () from /usr/lib/ccf/libjs_generic.virtual.so
#9  0x00007fbff61a6053 in enclave_create_node () from /usr/lib/ccf/libjs_generic.virtual.so
#10 0x00005555555ea4b9 in host::Enclave::create_node(EnclaveConfig const&, StartupConfig const&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >&&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >&, std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >&, StartType, LoggerLevel, unsigned long, void*) ()
#11 0x00005555555c2971 in main ()

@achamayou
Copy link
Member

The snapshot file appears to be corrupt, it is not clear it was corrupted during snapshot production, writing, or perhaps as result of non-atomic copies in blobfuse. One thing is clear though: the snapshot parser is insufficiently defensive.

@achamayou achamayou self-assigned this Sep 12, 2024
@achamayou achamayou removed the bug label Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants