Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build and run custom version of EVerest on the uMWC #51

Closed
shankari opened this issue Jun 3, 2024 · 113 comments
Closed

Build and run custom version of EVerest on the uMWC #51

shankari opened this issue Jun 3, 2024 · 113 comments

Comments

@shankari
Copy link
Collaborator

shankari commented Jun 3, 2024

The EVerest project has open hardware as well https://everest.github.io/nightly/hardware/pionix_belay_box.html which is available as a kit from Pionix. Pionix also sells the uMWC (https://shop.pionix.com/products/umwc-micro-mega-watt-charger). This is is a non-open device in a housing that shares some hardware with the Belay Box although it has a different power module that is limited to 1W output.

In this issue, we will track the steps to run a custom build of EVerest on the uMWC so that we can perform HIL testing.

@faizanmir21 The instructions are here:
https://everest.github.io/nightly/hardware/pionix_belay_box.html#developing-with-everest-and-belaybox

They are for the Belay Box, but I'm hoping that they will apply to the uMWC as well. If not, we can ask the community for help.

My suggested steps are:

  1. check everest-dev.service and verify that it starts the dev build from /mnt/user_data/opt/everest
  2. Install a dev build from the latest stable release (2024.3.0) https://github.com/EVerest/everest-core/releases/tag/2024.3.0
    • We already have everest builds in the docker containers. So you can run the manager docker container and use it as the base to cross-compile. To get a shell, you can use:
    • one of the demo scripts and then docker exec -it ....manager /bin/bash OR
    • docker run -it ghcr.io/everest/everest-demo/manager:0.0.16 /bin/bash
  3. Then rsync it over and try to boot up!

@drmrd @couryrr-afs @wjmp for visibility

@faizanmir21
Copy link

Connection refused while compiling the environment. See below.

Screenshot 2024-06-03 at 5 07 26 PM

@faizanmir21
Copy link

I think it was getting blocked by the NREL firewall. Was able to access using the hotspot network.

@louisg1337
Copy link
Contributor

I am currently on the third step on the Pionix guide where you have to run the cmake commands, and I am getting this error:

cb57bb2cbe6f:/ext/source# cmake \
 -DCMAKE_TOOLCHAIN_FILE=../../bullseye-toolchain/toolchain.cmake \
 -DCMAKE_INSTALL_PREFIX=/mnt/user_data/opt/everest \
 -S . -B build-cross
-- The CXX compiler identification is unknown
-- The C compiler identification is unknown
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - failed
-- Check for working CXX compiler: /bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++
-- Check for working CXX compiler: /bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++ - broken
CMake Error at /usr/share/cmake/Modules/CMakeTestCXXCompiler.cmake:62 (message):
  The C++ compiler

    "/bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: /ext/source/build-cross/CMakeFiles/CMakeTmp
    
    Run Build Command(s):/usr/bin/make -f Makefile cmTC_f7171/fast && /usr/bin/make  -f CMakeFiles/cmTC_f7171.dir/build.make CMakeFiles/cmTC_f7171.dir/build
    make[1]: Entering directory '/ext/source/build-cross/CMakeFiles/CMakeTmp'
    Building CXX object CMakeFiles/cmTC_f7171.dir/testCXXCompiler.cxx.o
    /bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++ -D__ARM_ARCH_8__ -D__arm__  -Wno-psabi  -o CMakeFiles/cmTC_f7171.dir/testCXXCompiler.cxx.o -c /ext/source/build-cross/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
    make[1]: /bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++: No such file or directory
    make[1]: *** [CMakeFiles/cmTC_f7171.dir/build.make:78: CMakeFiles/cmTC_f7171.dir/testCXXCompiler.cxx.o] Error 127
    make[1]: Leaving directory '/ext/source/build-cross/CMakeFiles/CMakeTmp'
    make: *** [Makefile:127: cmTC_f7171/fast] Error 2
    
    

  

  CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
  CMakeLists.txt:3 (project)


-- Configuring incomplete, errors occurred!
See also "/ext/source/build-cross/CMakeFiles/CMakeOutput.log".
See also "/ext/source/build-cross/CMakeFiles/CMakeError.log".

It seems like something is wrong with bullseye, but when I ran the commands to install and untar it, it looked normal and didn't give me any errors.

cb57bb2cbe6f:/# wget http://build.pionix.de:8888/release/toolchains/bullseye-toolchain.tgz
Connecting to build.pionix.de:8888 (35.242.226.179:8888)
saving to 'bullseye-toolchain.tgz'
bullseye-toolchain.t 100% |**********************************************************************************************************************************************************************************************|  606M  0:00:00 ETA
'bullseye-toolchain.tgz' saved
cb57bb2cbe6f:/# ls
=0.10.9.5               =3.7.4                  bin                     etc                     media                   root                    sys                     workspace
=0.11.0                 =5.9.1                  bullseye-toolchain.tgz  ext                     mnt                     run                     tmp
=2.8.2                  =9.5.0                  dev                     home                    opt                     sbin                    usr
=3.4.6                  assets                  entrypoint.sh           lib                     proc                    srv                     var
cb57bb2cbe6f:/# tar xfz bullseye-toolchain.tgz 

To get here, I did docker exec into the everest-ac-demo-manager-1, which was built using our one line demos in the latest branch of everest-demo.

@shankari
Copy link
Collaborator Author

shankari commented Jun 10, 2024

this is the same error that Faizan got. When you untar the toolchain, what directory does it create? Maybe you have to untar it in a different way or move it around or sth. you need to show that you have a /bullseye-toolchain/crosstool/bin directory.

@louisg1337
Copy link
Contributor

When I untar the toolchain it creates a directory called bullseye-toolchain the same place that I downloaded the bullseye-toolchain.tgz file. Going off of this error:

make[1]: /bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++: No such file or directory

It is weird because I do have a /bullseye-toolchain/crosstool/bin directory, and I have the armv8-rpi4-linux-gnueabihf-g++ file within it.

cb57bb2cbe6f:/# ls
=0.10.9.5               =3.7.4                  bin                     entrypoint.sh           lib                     proc                    srv                     var
=0.11.0                 =5.9.1                  bullseye-toolchain      etc                     media                   root                    sys                     workspace
=2.8.2                  =9.5.0                  bullseye-toolchain.tgz  ext                     mnt                     run                     tmp
=3.4.6                  assets                  dev                     home                    opt                     sbin                    usr
cb57bb2cbe6f:/# cd bullseye-toolchain/crosstool/bin/
cb57bb2cbe6f:/bullseye-toolchain/crosstool/bin# ls
armv8-rpi4-linux-gnueabihf-addr2line      armv8-rpi4-linux-gnueabihf-dwp            armv8-rpi4-linux-gnueabihf-gcov           armv8-rpi4-linux-gnueabihf-ld.gold        armv8-rpi4-linux-gnueabihf-readelf
armv8-rpi4-linux-gnueabihf-ar             armv8-rpi4-linux-gnueabihf-elfedit        armv8-rpi4-linux-gnueabihf-gcov-dump      armv8-rpi4-linux-gnueabihf-ldd            armv8-rpi4-linux-gnueabihf-size
armv8-rpi4-linux-gnueabihf-as             armv8-rpi4-linux-gnueabihf-g++            armv8-rpi4-linux-gnueabihf-gcov-tool      armv8-rpi4-linux-gnueabihf-lto-dump       armv8-rpi4-linux-gnueabihf-strings
armv8-rpi4-linux-gnueabihf-c++            armv8-rpi4-linux-gnueabihf-gcc            armv8-rpi4-linux-gnueabihf-gdb            armv8-rpi4-linux-gnueabihf-nm             armv8-rpi4-linux-gnueabihf-strip
armv8-rpi4-linux-gnueabihf-c++filt        armv8-rpi4-linux-gnueabihf-gcc-10.3.0     armv8-rpi4-linux-gnueabihf-gdb-add-index  armv8-rpi4-linux-gnueabihf-objcopy
armv8-rpi4-linux-gnueabihf-cc             armv8-rpi4-linux-gnueabihf-gcc-ar         armv8-rpi4-linux-gnueabihf-gprof          armv8-rpi4-linux-gnueabihf-objdump
armv8-rpi4-linux-gnueabihf-cpp            armv8-rpi4-linux-gnueabihf-gcc-nm         armv8-rpi4-linux-gnueabihf-ld             armv8-rpi4-linux-gnueabihf-populate
armv8-rpi4-linux-gnueabihf-ct-ng.config   armv8-rpi4-linux-gnueabihf-gcc-ranlib     armv8-rpi4-linux-gnueabihf-ld.bfd         armv8-rpi4-linux-gnueabihf-ranlib
cb57bb2cbe6f:/bullseye-toolchain/crosstool/bin# 

Any ideas on where I would move it or where I should untar it instead?

@louisg1337
Copy link
Contributor

I think that this may be our issue, I'm figuring out how to install this dependency right now to see if that fixes it.

cb57bb2cbe6f:/# ldd /bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++
	/lib64/ld-linux-x86-64.so.2 (0x7f44e57a8000)
	libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7f44e57a8000)
Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-g++)

@louisg1337
Copy link
Contributor

I don't know a lot about Linux, so please correct me if I'm wrong, but it seems like the issue we are experiencing is because I am trying to use Alpine Linux (via the docker container) instead of Ubuntu. Apparently Alpine uses musl as its C library, which doesn't have the package above, while Ubuntu uses glibc, which does have the package.

In the Alpine Linux wiki, they have a way to run glibc binaries on Alpine, which is by installing gcompat via apk add gcompat. After that, the cmake command seemed to work successfully.

@shankari
Copy link
Collaborator Author

Perfect! Unfortunately, we bricked the uMWC while experimenting today so I can't test this until I get some new firmware. But maybe you can zip up the directory that we are supposed to copy over using rsync and upload it to someplace internal to NREL? I can then pull it and install it once everything is working again.

We will need to repeat this once the docker image with the setChargingProfile is ready

@louisg1337
Copy link
Contributor

Unfortunately I've hit another roadblock, instead this time with the final make build commands. When I run the command, the build gets to about 60-65% done before it substantially slows down, and eventually stops making progress. I don't get any errors either, it kind of just freezes. I've ran this build command a few times, and it has always happened like this. Heres an example of what I've been stuck on for the past 10 ish minutes.

...
[ 64%] Building C object _deps/ext-mbedtls-build/library/CMakeFiles/mbedtls.dir/ssl_ticket.c.o
make[2]: Leaving directory '/ext/source/build-cross'
make[2]: Entering directory '/ext/source/build-cross'
[ 64%] Building CXX object modules/CMakeFiles/API.dir/API/API.cpp.o
[ 64%] Building C object _deps/ext-mbedtls-build/library/CMakeFiles/mbedtls.dir/ssl_tls.c.o
[ 64%] Building C object _deps/ext-mbedtls-build/library/CMakeFiles/mbedtls.dir/ssl_tls13_keys.c.o
[ 64%] Linking C static library libmbedtls.a
make[2]: Leaving directory '/ext/source/build-cross'
[ 64%] Built target mbedtls
make[2]: Entering directory '/ext/source/build-cross'
make[2]: Leaving directory '/ext/source/build-cross'
make[2]: Entering directory '/ext/source/build-cross'
[ 64%] Building CXX object modules/Auth/lib/CMakeFiles/auth_handler.dir/AuthHandler.cpp.o
[ 64%] Linking CXX executable controller
make[2]: Leaving directory '/ext/source/build-cross'
[ 64%] Built target controller
make[2]: Entering directory '/ext/source/build-cross'
make[2]: Leaving directory '/ext/source/build-cross'
make[2]: Entering directory '/ext/source/build-cross'
[ 64%] Building CXX object modules/CMakeFiles/EnergyManager.dir/EnergyManager/EnergyManager.cpp.o

I figured it may be because the docker container doesn't have enough resources, but I'm decently below my max utilization for everything. I'll keep working on this and see if I can get this to work.

@shankari
Copy link
Collaborator Author

shankari commented Jun 11, 2024 via email

@louisg1337
Copy link
Contributor

I tried bumping all the resources to 75% of what my computer can handle (12 gb RAM, 6 cores, 3 gb swap) and it still is stuck in the low 60s. I left it for about a hour and it doesn't seem to be making any progress past that. I did get an error now, however, although it seems a bit unhelpful. Here are the logs:

...
[ 53%] Built target slac_fsm_ev
[ 53%] Building CXX object _deps/everest-framework-build/lib/CMakeFiles/framework.dir/everest.cpp.o
[ 53%] Building CXX object _deps/everest-framework-build/lib/CMakeFiles/framework.dir/runtime.cpp.o
[ 53%] Built target slac_fsm_evse
[ 60%] Built target websockets_shared
[ 60%] Building CXX object _deps/everest-framework-build/lib/CMakeFiles/framework.dir/yaml_loader.cpp.o
make[2]: Entering directory '/ext/source/build-cross'
make[2]: Entering directory '/ext/source/build-cross'
make[2]: Leaving directory '/ext/source/build-cross'
[ 61%] Built target controller
Consolidate compiler generated dependencies of target ocpp
make[2]: Leaving directory '/ext/source/build-cross'
make[2]: Entering directory '/ext/source/build-cross'
[ 61%] Building CXX object _deps/libocpp-build/lib/CMakeFiles/ocpp.dir/ocpp/v16/charge_point.cpp.o
[ 61%] Building CXX object _deps/libocpp-build/lib/CMakeFiles/ocpp.dir/ocpp/v16/smart_charging.cpp.o
armv8-rpi4-linux-gnueabihf-g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make[2]: *** [_deps/everest-framework-build/lib/CMakeFiles/framework.dir/build.make:328: _deps/everest-framework-build/lib/CMakeFiles/framework.dir/runtime.cpp.o] Error 1
make[2]: *** Waiting for unfinished jobs....
[ 61%] Building CXX object _deps/libocpp-build/lib/CMakeFiles/ocpp.dir/ocpp/v16/charge_point_configuration.cpp.o
[ 61%] Building CXX object _deps/libocpp-build/lib/CMakeFiles/ocpp.dir/ocpp/v16/charge_point_state_machine.cpp.o

@louisg1337
Copy link
Contributor

I read somewhere online that the armv8-rpi4-linux-gnueabihf-g++: fatal error: Killed signal terminated program cc1plus error is usually do to lack of memory, so I dug a bit deeper into whether or not the container itself was getting all the resources I allocated. I then realized that in docker-compose.ocpp201.yml there was a 1 core limit on the CPU and 1 gb limit for the ram. I then bumped it up to 6 cores and 10 gb of ram and it finally built. The container consistently used around 5 cores and the ram fluctuated between 2-6 gb, so the boost was definitely needed. I'll write up a nice guide under this post so that in the future the steps to build using Docker are outlined.

@louisg1337
Copy link
Contributor

Building EVerest Using Docker

The cross compiling steps are outlined here, in the EVerest docs. You are required to have EVerest built natively, which means you need a Linux machine, and the rest of the commands are Linux based as well. We can circumvent this hardware requirement by doing this build process on a docker container which 1) has EVerest built on it, and 2) runs Linux Alpine. Thankfully, we have one line demos which builds EVerest to these containers for us. Below is a complete walkthrough of the steps I took to utilize Docker to build for hardware.

Steps

  1. Increase Docker resources through the settings.
    • When I was building my cpu usage was between 5-6 cores, and my memory fluctuated between 2-6gb. So, ideally set cores = 6 and memory = 6 gb.
  2. Change the CPU and memory limit in everest-demo/.env found here.
    • These limits prevent everest-ac-demo-manager-1, the docker container we will be working in, from utilizing all the resources we allocated.
    • Recommended Steps
      1. Clone everest-demo to your machine.
      2. Navigate to .env and change EVEREST_MANAGER_CPUS and EVEREST_MANAGER_MEMORYto 1 below the maximum values set in Step 1.
  3. Run one of the one line demos to build EVerest in Docker.
    • Recommended Steps Continued (assuming the steps above were followed)
      1. Navigate to demo-iso15118-2-ac-plus-ocpp.sh and change
        git clone --branch "${DEMO_BRANCH}" "${DEMO_REPO}" everest-demo to
        cp -r "${DEMO_REPO}" everest-demo
      2. Now run bash demo-iso15118-2-ac-plus-ocpp.sh -r $(pwd) -3
      3. Wait until Docker shows all containers in everest-ac-demo running.
  4. Enter the everest-ac-demo-manager-1 container via docker exec -it <container_id> /bin/bash.
  5. Run apk add gcompat
  6. The next steps follow the EVerest docs for cross compiling.
    • Here are the tailored steps for documentation purposes.
      1. cd .. to navigate to the / directory.
      2.  wget http://build.pionix.de:8888/release/toolchains/bullseye-toolchain.tgz
         tar xfz bullseye-toolchain.tgz
        
      3. cd ext/source
      4.   cmake \
          -DCMAKE_TOOLCHAIN_FILE=../../bullseye-toolchain/toolchain.cmake \
          -DCMAKE_INSTALL_PREFIX=/mnt/user_data/opt/everest \
          -S . -B build-cross
        
      5.   make -j$(nproc) -C build-cross
          make -j$(nproc) DESTDIR=./dist -C build-cross install
        
  7. Once built, get the cross compiled build out of docker
    • This is how I did it docker cp <container_id>:/ext/source/build-cross ~/Desktop

@shankari
Copy link
Collaborator Author

shankari commented Jun 13, 2024

Affer switching to the custom build that includes smart charging, ran into this error

CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the
  system variable OPENSSL_ROOT_DIR: Found unsuitable version "1.1.1n", but
  required is at least "3" (found
  /bullseye-toolchain/sysroot/usr/lib/arm-linux-gnueabihf/libcrypto.so, )
Call Stack (most recent call first):
  /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:592 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake/Modules/FindOpenSSL.cmake:599 (find_package_handle_standard_args)
  build-cross/_deps/libevse-security-src/CMakeLists.txt:46 (find_package)

This was caused by EVerest/libevse-security@8777cc2

Double-checked the uMWC, and found

/usr/lib/arm-linux-gnueabihf/libcrypt
libcrypt.a               libcryptsetup.so.12      libcrypt.so              libcrypt.so.1.1.0
libcrypto.so.1.1         libcryptsetup.so.12.6.0  libcrypt.so.1

So we will have to revert the libevse change. Let's try only removing that dependency and see if it works.

EDIT: Removed dependency in build-cross/_deps/libocpp-src/CMakeLists.txt and in /ext/source/build-cross/_deps/libevse-security-src/CMakeLists.txt (wait those are the same, so why didn't it work the first time?) It has moved forward now...

EDIT: cmake completed successfully, starting build...

@shankari
Copy link
Collaborator Author

Build failed with

/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:544:21: error: 'X509_ADD_FLAG_NO_DUP' was not declared in this scope; did you mean 'X509_FLAG_NO_AUX'?
  544 |         int flags = X509_ADD_FLAG_NO_DUP | X509_ADD_FLAG_NO_SS;
      |                     ^~~~~~~~~~~~~~~~~~~~
      |                     X509_FLAG_NO_AUX
make[2]: Leaving directory '/ext/source/build-cross'
make[2]: Entering directory '/ext/source/build-cross'
[ 61%] Building CXX object _deps/libmodbus-build/CMakeFiles/modbus.dir/src/modbus_client.cpp.o
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:544:44: error: 'X509_ADD_FLAG_NO_SS' was not declared in this scope; did you mean 'X509_FLAG_NO_IDS'?
  544 |         int flags = X509_ADD_FLAG_NO_DUP | X509_ADD_FLAG_NO_SS;
      |                                            ^~~~~~~~~~~~~~~~~~~
      |                                            X509_FLAG_NO_IDS
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:547:22: error: 'X509_add_cert' was not declared in this scope; did you mean 'X509_add_ext'?
  547 |             if (1 != X509_add_cert(untrusted.get(), get(untrusted_cert), flags)) {
      |                      ^~~~~~~~~~~~~
      |                      X509_add_ext

Replaced all the text in /ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp with the version before the latest changes:
https://github.com/EVerest/libevse-security/blob/34ced9f4452c2ffe3145f4ff200c42dc83278c47/lib/evse_security/crypto/openssl/openssl_supplier.cpp

Ran into issues with

/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:188:9: error: 'EC_KEY_ptr' was not declared in this scope; did you mean 'EVP_PKEY_ptr'?
  188 |         EC_KEY_ptr ec_key(EC_KEY_new_by_curve_name(nid));

/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:211:9: error: 'RSA_ptr' was not declared in this scope
  211 |         RSA_ptr rsa_key(RSA_generate_key(bits, RSA_PRIME, nullptr, nullptr));
      |         ^~~~~~~

/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp: In static member function 'static evse_security::CertificateValidationResult evse_security::OpenSSLSupplier::x509_verify_certificate_chain(evse_security::X509Handle*, const std::vector<evse_security::X509Handle*>&, const std::vector<evse_security::X509Handle*>&, bool, std::optional<std::filesystem::__cxx11::path>, std::optional<std::filesystem::__cxx11::path>)':
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:544:21: error: 'X509_ADD_FLAG_NO_DUP' was not declared in this scope; did you mean 'X509_FLAG_NO_AUX'?
  544 |         int flags = X509_ADD_FLAG_NO_DUP | X509_ADD_FLAG_NO_SS;
      |                     ^~~~~~~~~~~~~~~~~~~~
      |                     X509_FLAG_NO_AUX
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:544:44: error: 'X509_ADD_FLAG_NO_SS' was not declared in this scope; did you mean 'X509_FLAG_NO_IDS'?
  544 |         int flags = X509_ADD_FLAG_NO_DUP | X509_ADD_FLAG_NO_SS;
      |                                            ^~~~~~~~~~~~~~~~~~~
      |                                            X509_FLAG_NO_IDS
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:547:22: error: 'X509_add_cert' was not declared in this scope; did you mean 'X509_add_ext'?
  547 |             if (1 != X509_add_cert(untrusted.get(), get(untrusted_cert), flags)) {
      |                      ^~~~~~~~~~~~~
      |                      X509_add_ext

Copied over deleted code. There are still failures. Trying to just revert the dependency and rebuild. This was only changed 2 weeks ago, and computers are faster than me.

@shankari
Copy link
Collaborator Author

shankari commented Jun 13, 2024

Reverted back to version 0.6.0, OCPP compile is failing with

[ 84%] Building CXX object _deps/libocpp-build/lib/CMakeFiles/ocpp.dir/ocpp/common/types.cpp.o
In file included from /ext/source/lib/staging/evse_security/conversions.cpp:6:
/ext/source/lib/staging/evse_security/../evse_security/conversions.hpp:27:16: error: 'CertificateInfo' in namespace 'evse_security' does not name a type; did you mean 'CertificateType'?
   27 | evse_security::CertificateInfo from_everest(types::evse_security::CertificateInfo other);
      |                ^~~~~~~~~~~~~~~
      |                CertificateType

Trying to revert to the commit just before this change.

EDIT: Setting the git_tag to 4330ce2e28e25535dd01558edb2331891c146769 built!
Now, let's try to cross-compile again. Will check with SIL before pushing out to the uMWC...

@shankari
Copy link
Collaborator Author

Now it is required in libocpp

CMake Error at /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the
  system variable OPENSSL_ROOT_DIR: Found unsuitable version "1.1.1n", but
  required is at least "3" (found
  /bullseye-toolchain/sysroot/usr/lib/arm-linux-gnueabihf/libcrypto.so, )
Call Stack (most recent call first):
  /usr/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:592 (_FPHSA_FAILURE_MESSAGE)
  /usr/share/cmake/Modules/FindOpenSSL.cmake:599 (find_package_handle_standard_args)
  build-cross/_deps/libocpp-src/CMakeLists.txt:28 (find_package)

Not easy to roll this back, so let's just try to edit and assume that the dependencies are not too bad.

@shankari
Copy link
Collaborator Author

Broke in the same way

/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp: In static member function 'static evse_security::CertificateValidationResult evse_security::OpenSSLSupplier::x509_verify_certificate_chain(evse_security::X509Handle*, const std::vector<evse_security::X509Handle*>&, const std::vector<evse_security::X509Handle*>&, bool, std::optional<std::filesystem::__cxx11::path>, std::optional<std::filesystem::__cxx11::path>)':
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:599:21: error: 'X509_ADD_FLAG_NO_DUP' was not declared in this scope; did you mean 'X509_FLAG_NO_AUX'?
  599 |         int flags = X509_ADD_FLAG_NO_DUP | X509_ADD_FLAG_NO_SS;
      |                     ^~~~~~~~~~~~~~~~~~~~
      |                     X509_FLAG_NO_AUX
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:599:44: error: 'X509_ADD_FLAG_NO_SS' was not declared in this scope; did you mean 'X509_FLAG_NO_IDS'?
  599 |         int flags = X509_ADD_FLAG_NO_DUP | X509_ADD_FLAG_NO_SS;
      |                                            ^~~~~~~~~~~~~~~~~~~
      |                                            X509_FLAG_NO_IDS
/ext/source/build-cross/_deps/libevse-security-src/lib/evse_security/crypto/openssl/openssl_supplier.cpp:602:22: error: 'X509_add_cert' was not declared in this scope; did you mean 'X509_add_ext'?
  602 |             if (1 != X509_add_cert(untrusted.get(), get(untrusted_cert), flags)) {
      |                      ^~~~~~~~~~~~~
      |                      X509_add_ext

I guess the interim changes are not known to work. Off to check the openssl code...

@shankari
Copy link
Collaborator Author

shankari commented Jun 13, 2024

ok so verified that these are from SSL 3 https://man.archlinux.org/man/X509_add_cert.3ssl.en
But the OpenSSL docs have a slightly different method name
https://www.openssl.org/docs/man1.1.1/man3/X509_STORE_add_cert.html
which has been unchanged since version 1.

And grepping seems to find the old value as well. Did this ever work? When was it added?
/bullseye-toolchain/sysroot/var/lib/dpkg/info/libssl1.1:armhf.symbols: X509_STORE_add_cert@OPENSSL_1_1_0 1.1.0

There are no symbols for X509_ADD_FLAG_NO_DUP and X509_ADD_FLAG_NO_SS
It was added in EVerest/libevse-security@acc12fe
which is good for PKI but is not needed for our demo. Reverting to the version before that for just this location:
EVerest/libevse-security@acc12fe#diff-9b5e7ace83962c09e4fd78bac64e78b58816ec321a0ef6ecf90c97493d1ff6c6R593

Having said that, I do see a X509_STORE_add_cert(store_ptr.get(), get(parents[i])); but this is if (1 != X509_add_cert(untrusted.get(), get(untrusted_cert), flags)) {....

@shankari
Copy link
Collaborator Author

shankari commented Jun 13, 2024

Final issue: it looks like openssl cleaned up their code by adding consts and plumbing them through. So the calling code has consts but the function does not.

/ext/source/build-cross/_deps/libocpp-src/lib/ocpp/common/websocket/websocket_libwebsockets.cpp:175:61: error: invalid conversion from 'const X509_STORE_CTX*' {aka 'const x509_store_ctx_st*'} to 'X509_STORE_CTX*' {aka 'x509_store_ctx_st*'} [-fpermissive]
  175 |         X509* server_cert = X509_STORE_CTX_get_current_cert(ctx);
      |                                                             ^~~
      |                                                             |
      |                                                             const X509_STORE_CTX* {aka const x509_store_ctx_st*}

I am making the assumption that the underlying implementation has not changed significantly, so we just need to un constify it. I did consider removing the const, but it is passed in as an argument, so we would need to go through and remove all the call sites and potentially their call sites and so on, which will make this very complex.

I can create a new struct fairly easily, but copying over the fields is complicated. And some of them are functions.
https://docs.huihoo.com/doxygen/openssl/1.0.1c/structx509__store__ctx__st.html

So I just use memcpy to copy over all the pointers. If the underlying functions are not actually const, this will be bad, but see assumption above.

        X509_STORE_CTX* non_const_ctx = X509_STORE_CTX_new();
        memcpy(non_const_ctx, ctx, sizeof(ctx)); // args are (dest, src, size)
        X509* server_cert = X509_STORE_CTX_get_current_cert(non_const_ctx);

This code, copied three times, lets the compile succeed!

@shankari
Copy link
Collaborator Author

Verified that there is also a firmware file and only one firmware file

./dist/mnt/user_data/opt/everest/share/everest/modules/YetiDriver/firmware/yetiR1_2.1_firmware.bin

which is built here

./modules/YetiDriver/cmake_install.cmake:  file(INSTALL DESTINATION "${CMAKE_INSTALL_PREFIX}/share/everest/modules/YetiDriver/firmware" TYPE FILE FILES "/ext/source/modules/YetiDriver/yetiR1_2.1_firmware.bin")

Double-checked and the instructions from Pionix say that it should be in the "/libexec/everest/modules/YetiDriver", which is not the same, but at least it is in the YetiDriver directory. Just to be on the safe side, I will first backport the changes and check in the SIL before pushing it to EVerest, and will update EVerest without flashing the firmware first (learned my lesson last time!)

@shankari
Copy link
Collaborator Author

edited /ext/cache/cpm/libevse-security/f4c722882414e8cb77a2f572b45fde98e2647f8d/libevse-security/lib/evse_security/crypto/openssl/openssl_supplier.cpp and /ext/cache/cpm/libocpp/6502037f667273b3f55e917ec94a3fe0a2d27720/libocpp/lib/ocpp/common/websocket/websocket_libwebsockets.cpp

Recompiled. But I can't kill the existing process! Need to pull all the patches and recompile to restart the new version.

@shankari
Copy link
Collaborator Author

SIL fails

2024-06-14 01:05:31.810823 [INFO] ocpp:OCPP201     :: Loading CA csms bundle to verify server certificate: /ext/source/build/dist/etc/everest/certs/ca/v2g/V2G_ROOT_CA.pem
2024-06-14 01:05:31.851814 [INFO] ocpp:OCPP201     :: LWS connect with info port: [443] address: [host.docker.internal] path: [/ws/cp001] protocol: [ocpp2.0.1]
2024-06-14 01:05:31.851958 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Callback with unhandled reason: 85
2024-06-14 01:05:31.852062 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Client connecting...
2024-06-14 01:05:31.852196 [DEBG] ocpp:OCPP201    void ocpp::WebsocketTlsTPM::client_loop() :: Init client loop with ID: 7f35cdd71b38
2024-06-14 01:05:31.853606 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Callback with unhandled reason: 29
2024-06-14 01:05:31.858875 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Verifying server certs!
2024-06-14 01:05:31.900821 [CRIT] manager         int boot(const boost::program_options::variables_map&) :: Module ocpp (pid: 4010) exited with status: 11. Terminating all modules.

Trying SP2 and then SP1 as backup

@shankari
Copy link
Collaborator Author

SP2 got the same error
2024-06-14 01:09:16.578069 [DEBG] ocpp:OCPP201    Everest::Everest::subscribe_var(const Requirement&, const std::string&, const JsonCallback&)::<lambda(const Everest::json&)> :: Incoming evse_manager_1:EvseManager->evse:evse_manager->session_event
2024-06-14 01:09:16.619116 [INFO] ocpp:OCPP201     :: LWS connect with info port: [443] address: [host.docker.internal] path: [/ws/cp001] protocol: [ocpp2.0.1]
2024-06-14 01:09:16.619264 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Callback with unhandled reason: 85
2024-06-14 01:09:16.619409 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Client connecting...
2024-06-14 01:09:16.619647 [DEBG] ocpp:OCPP201    void ocpp::WebsocketTlsTPM::client_loop() :: Init client loop with ID: 7f441ccd3b38
2024-06-14 01:09:16.621047 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Callback with unhandled reason: 29
2024-06-14 01:09:16.624015 [DEBG] ocpp:OCPP201    int ocpp::WebsocketTlsTPM::process_callback(void*, int, void*, void*, size_t) :: Verifying server certs!
2024-06-14 01:09:16.647011 [CRIT] manager         int boot(const boost::program_options::variables_map&) :: Module ocpp (pid: 4280) exited with status: 11. Terminating all modules.
But SP1 worked
2024-06-14 01:33:01.959977 [INFO] ocpp:OCPP201     :: Received BootNotificationResponse: {
    "currentTime": "2024-06-14T01:33:01.000Z",
    "interval": 300,
    "status": "Accepted"
}

So it is almost certainly a cert issue. We are going with SP1 tomorrow and will have to figure out how to debug cert issues in the future.

@shankari
Copy link
Collaborator Author

shankari commented Jun 14, 2024

With all these changes, SP1 works with EIM. PnC still fails because of SP1 and because the CSMS can't validate the cert.
Set charging profile works, and returns the proper response

2024-06-14 02:05:33.588493 [DEBG] iso15118_car    pybind11_init_everestpy(pybind11::module_&)::<lambda(const std::string&)> :: Decoded message (ns=urn:iso:15118:2:2013:MsgDef): {"V2G_Message":{"Header":{"SessionID":"CADE7FF7DE3FFF4F"},"Body":{"ChargingStatusRes":{"ResponseCode":"OK","EVSEID":"DE*PNX*00001","SAScheduleTupleID":1,"EVSEMaxCurrent":{"Multiplier":-1,"Unit":"A","Value":66},"MeterInfo":{"MeterID":"YETI_POWERMETER","MeterReading":10},"AC_EVSEStatus":{"NotificationMaxDelay":0,"EVSENotification":"None","RCD":false}}}}}

Checking in these changes and re-cross-compiling...

EDIT: The demo is cursed. I finally got to the point where I had a stable build that included all the changes required for crosscompiling and was running (at least in a restricted version). And then the pionix release server went down, so no toolchain for me...

bd830b356e53:/# wget http://build.pionix.de:8888/release/toolchains/bullseye-toolchain.tgz
Connecting to build.pionix.de:8888 (35.242.226.179:8888)
wget: can't connect to remote host (35.242.226.179): Connection refused

EDIT: This is because I was on the VPN which blocks port 8888. getting off allows the download; phew!

shankari added a commit to US-JOET/everest-demo that referenced this issue Jun 14, 2024
- Dependencies.yaml: set the security module version
EVerest#51 (comment)
- libevse-security: remove the validation of sub-certs
EVerest#51 (comment)
- Make the context be unconst
EVerest#51 (comment)
shankari added a commit to US-JOET/everest-demo that referenced this issue Jun 14, 2024
We now need to:
- reconfigure (dependencies.yaml)
- rerun cmake
- apply code patches (these can't be applied earlier because the right dependency isn't there)
- rebuild
- apply non-code patches that would be overridden

Testing done:
- Ran with SP1, started everest, charged with EIM
EVerest#51 (comment)
@shankari
Copy link
Collaborator Author

Cross-compile was successful.

  • Create a release bundle
# cd /ext/source/build-cross/dist
# tar czvf charin-e2e-demo-smart-charging.tar.gz mnt/
  • Copy it to the laptop
$ docker cp everest-ac-demo-manager-1:/ext/source/build-cross/dist/charin-e2e-demo-smart-charging.tar.gz .
  • Boot up the uMWC and connect the ethernet port
  • Stop the existing service
$ systemctl stop everest.service (this didn't work immediately, so had to start and then stop)
$ ps -aef | grep manager
everest   1604  1069  0 04:15 pts/0    00:00:00 grep --color=auto manager
  • Copy it to the uMWC
$ scp ~/joet-everest/everest-demo/charin-e2e-demo-smart-charging.tar.gz everest@...:/tmp
  • Copy it to root and untar it
$ pushd /tmp/
$ tar xzvf charin-e2e-demo-smart-charging.tar.gz
$ mv mnt/user_data/opt /mnt/user_data/
  • Try to start it with a known working config
$ /mnt/user_data/opt/everest/bin/manager --conf /mnt/user_data/custom_configs/2_config-dc-iso2.yaml
2024-06-14 04:21:21.275342 [ERRO] manager         void Everest::Config::load_and_validate_manifest(const string&, const json&) :: Failed to load and parse manifest file /mnt/user_data/opt/everest/libexec/everest/modules/JsIMDSimulator/manifest.yaml: File 'manifest.(yaml|json)' does not exist
2024-06-14 04:21:21.279842 [ERRO] manager         int boot(const boost::program_options::variables_map&) :: Failed to load and validate config!
2024-06-14 04:21:21.280027 [CRIT] manager         int boot(const boost::program_options::variables_map&) :: Caught top level boost::exception:
/ext/source/build-cross/_deps/everest-framework-src/lib/config.cpp(262): Throw in function void Everest::Config::load_and_validate_manifest(const string&, const json&)
Dynamic exception type: boost::wrapexcept<Everest::EverestConfigError>
std::exception::what: Failed to load and parse manifest file /mnt/user_data/opt/everest/libexec/everest/modules/JsIMDSimulator/manifest.yaml: File 'manifest.(yaml|json)' does not exist
[boost::log::v2_mt_posix::current_scope_info_tag*] = Everest::Config::Config(std::shared_ptr<Everest::RuntimeSettings>, bool)

That won't actually work any more because the modules in the config have changed.

$ /opt/everest/libexec/everest/modules/JsIMDSimulator
$ /mnt/user_data/opt/everest/libexec/everest/modules/IMDSimulator

Let's copy over the config that we use with our SIL

$ scp ~/joet-everest/everest-demo/config-sil-ocpp201-pnc.yaml [email protected]:/tmp
$ /mnt/user_data/opt/everest/bin/manager --conf /tmp/config-sil-ocpp201-pnc.yaml
2024-06-14 04:27:19.688053 [ERRO] ocpp:OCPP201    void ocpp::v201::DeviceModel::check_integrity(const std::map<int, int>&) :: Integrity check in Device Model storage failed:Number of EVSE configured in device model is incompatible with number of configured EVSEs of the ChargePoint: 2: 1
terminate called after throwing an instance of 'ocpp::v201::DeviceModelStorageError'
  what():  Number of EVSE configured in device model is incompatible with number of configured EVSEs of the ChargePoint: 2: 1

It actually has 2 EVSEs and two connectors

$ sqlite3 opt/everest/share/everest/modules/OCPP201/device_model_storage.db
sqlite> select * from component;
20|InternalCtrlr|||
21|EVSE||1|
22|Connector||1|1
23|Connector||2|1
24|EVSE||2|

It now starts up, but tries to connect to the local OCPP server, which fails. Resetting the host...

sqlite> update variable_attribute set "value"='[{"configurationSlot": 1, "connectionData": {"messageTimeout": 30, "ocppCsmsUrl": "ws://169.254.223.121/ws/cp001", "ocppInterface": "Wired0", "ocppTransport": "JSON", "ocppVersion": "OCPP20", "securityProfile": 1}}]' where id=161

The OCPP connection is now set up correctly and we can receive a setChargingProfile!

2024-06-14 04:53:09.342111 [ERRO] ocpp:OCPP201    void ocpp::v201::ChargePoint::handle_set_charging_profile_req(ocpp::Call<ocpp::v201::SetChargingProfileRequest>) :: Received profile validity: ChargingScheduleChargingRateUnitUnsupportedsetting response to {
    "status": "Rejected"
}

But it is invalid because the rate is incorrect. Fixing that...

SetChargingProfiles now works
2024-06-14 04:55:58.071299 [ERRO] ocpp:OCPP201    void ocpp::v201::ChargePoint::handle_set_charging_profile_req(ocpp::Call<ocpp::v201::SetChargingProfileRequest>) :: Received SetChargingProfile: {
    "chargingProfile": {
        "chargingProfileKind": "Absolute",
        "chargingProfilePurpose": "TxDefaultProfile",
        "chargingSchedule": [
            {
                "chargingRateUnit": "A",
                "chargingSchedulePeriod": [
                    {
                        "limit": 20.0,
                        "numberPhases": 1,
                        "startPeriod": 0
                    }
                ],
                "duration": 86400,
                "id": 0,
                "minChargingRate": 0.0,
                "startSchedule": "2024-06-14T00:00:00.000Z"
            }
        ],
        "id": 1,
        "recurrencyKind": "Daily",
        "stackLevel": 1
    },
    "evseId": 1
}
with messageId: d0475541-502f-4687-9833-b8fb2857601e
2024-06-14 04:55:58.073914 [WARN] ocpp:OCPP201    module::OCPP201::ready()::<lambda()> :: Received a new Charging Schedules from the CSMS or another actor.
2024-06-14 04:55:58.074300 [INFO] ocpp:OCPP201     :: ProfileId #1 Kind: Absolute
2024-06-14 04:55:58.074453 [INFO] ocpp:OCPP201     :: #1 find_period_at> 2024-06-14T00:00:00.000Z
2024-06-14 04:55:58.074645 [INFO] ocpp:OCPP201     ::    find_period_at>        start_time> 2024-06-14T03:55:58.074Z
2024-06-14 04:55:58.074864 [INFO] ocpp:OCPP201     ::    find_period_at> period_start_time> 2024-06-14T00:00:00.000Z
2024-06-14 04:55:58.075066 [INFO] ocpp:OCPP201     ::    find_period_at>   period_end_time> 2024-06-15T00:00:00.000Z
2024-06-14 04:55:58.075487 [INFO] ocpp:OCPP201     :: PeriodDateTimePair>  period: {
    "limit": 20.0,
    "numberPhases": 1,
    "startPeriod": 0
} end_time: 2024-06-15T00:00:00.000Z
2024-06-14 04:55:58.075789 [INFO] ocpp:OCPP201     :: period.has_value() limit = 4600
2024-06-14 04:55:58.075989 [INFO] ocpp:OCPP201     :: period.has_value() stackLevel = 4600
2024-06-14 04:55:58.076380 [WARN] ocpp:OCPP201    void module::OCPP201::set_external_limits(const std::map<int, ocpp::v201::CompositeSchedule>&) :: OCPP sets the following external limits for EVSE 1 at index 0 :
{
    "schedule_import": [
        {
            "limits_to_leaves": {
                "total_power_W": 4600.0
            },
            "limits_to_root": {},
            "timestamp": "2024-06-14T03:55:58.076Z"
        }
    ]
}
2024-06-14 04:55:58.132150 [ERRO] ocpp:OCPP201    void ocpp::v201::ChargePoint::handle_set_charging_profile_req(ocpp::Call<ocpp::v201::SetChargingProfileRequest>) :: Received profile validity: Validsetting response to {
    "status": "Accepted"
}

Next step: I see a message saying:

2024-06-14 04:55:39.453787 [INFO] ocpp:OCPP201     :: Logging SecurityEvents to file
2024-06-14 04:55:39.560707 [INFO] ocpp:OCPP201     :: TxStartPoints from device model: PowerPathClosed
2024-06-14 04:55:39.561344 [INFO] ocpp:OCPP201     :: TxStopPoints from device model: EVConnected,Authorized
2024-06-14 04:55:39.590262 [INFO] evse_manager_1:  :: Ignoring BSP Event, BSP is not enabled yet.
2024-06-14 04:55:39.805773 [INFO] evse_manager_1:  :: Max AC hardware capabilities: 32A/3ph
2024-06-14 04:55:39.880845 [INFO] car_simulator_1  :: Unplug detected, restarting simulation.
2024-06-14 04:55:39.911712 [INFO] evse_manager_1:  :: AC HLC mode enabled.

Let's enable the uMWC BSP on this config after copying it to a safe place.

  • configured the BSP
  • configured slac to also be real and not a simulator
  • removed the car simulator

Now the error is

2024-06-14 05:09:26.842693 [DEBG] yeti_driver_1:M void Everest::MQTTAbstractionImpl::on_mqtt_message(std::shared_ptr<Everest::Message>) :: topic everest/yeti_driver_1/board_support/cmd starts with everest/
2024-06-14 05:09:26.843793 [DEBG] yeti_driver_1:M Everest::Everest::provide_cmd(std::string, std::string, JsonCommand)::<lambda(Everest::json)> :: Incoming yeti_driver_1:MicroMegaWattBSP->board_support:evse_board_support->get_hw_capabilities() for <handler>
terminate called after throwing an instance of 'std::out_of_range'
  what():  No known string conversion for provided enum of type Connector_type

I guess it is time to update the firmware but let's check the code and verify first.

@shankari
Copy link
Collaborator Author

Bingo!
The connector_type in https://github.com/EVerest/everest-core/blame/14aa49efe8a29d8e3e7f8485f522b4298fbe4250/types/evse_board_support.yaml#L3 was added in the commit that I identified at the beginning of this
EVerest/everest-core@e194737 (feeling pretty good...)

Let's update the firmware. Firmware updates are failing

$ ./umwc_fwupdate /dev/serial0 ../share/everest/modules/YetiDriver/firmware/yetiR1_2.1_firmware.bin
uMWC ROM Bootloader Firmware Updater
uMWC ROM Bootloader Firmware Updater
cobsDecode: Garbage detected
cobsDecode: Garbage detected
Current uMWC SW Version: fcb731e (Protocol 0.1)

Rebooting uMWC in ROM Bootloader mode...
Executing stm32flash -b 115200 /dev/serial0 -v -w ../share/everest/modules/YetiDriver/firmware/yetiR1_2.1_firmware.bin -R ...
stm32flash 0.7

http://stm32flash.sourceforge.net/

Using Parser : Raw BINARY
Size         : 114808
Interface serial_posix: 115200 8E1
Failed to init device, timeout.

Reverting back to the simulator temporarily. This should allow us to get the communication to work again and have a "new" everest starting up by default. Then, we can try to downgrade the BSP and see if it works.

@shankari
Copy link
Collaborator Author

This might actually be easier than I thought, and might be a bug in the code.

The simulator, phytec board, and BSP all set the connector type, but the uMWC does not

./simulation/JsYetiSimulator/index.js:    connector_type: 'IEC62196Type2Cable',
./PhyVersoBSP/connector_1/evse_board_supportImpl.cpp:            caps.connector_type = types::evse_board_support::Connector_type::IEC62196Type2Socket;
./PhyVersoBSP/connector_1/evse_board_supportImpl.cpp:            caps.connector_type = types::evse_board_support::Connector_type::IEC62196Type2Cable;
./PhyVersoBSP/connector_2/evse_board_supportImpl.cpp:            caps.connector_type = types::evse_board_support::Connector_type::IEC62196Type2Socket;
./PhyVersoBSP/connector_2/evse_board_supportImpl.cpp:            caps.connector_type = types::evse_board_support::Connector_type::IEC62196Type2Cable;
./YetiDriver/board_support/evse_board_supportImpl.cpp:        caps.connector_type = types::evse_board_support::Connector_type::IEC62196Type2Cable;

Let's try hardcoding it and rebuilding...

@the-bay-kay
Copy link

  1. Do you see the log statements are only the function name?
  • A: Only the function names. From what I understand, our cross-compilation doesn't run with the -g or -ggdb CMake flags -- since the debug info isn't included in the ELF file, we don't get the additional code. Looking at the file footprints...
/ext/source/build-cross/modules/API/API: ELF 32-bit LSB executable, ARM, EABI5 version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 3.2.0, not stripped
# ...
/ext/source/build/modules/API/API: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, with debug_info, not stripped  

... the native build includes debug info, but cross compile doesn't. We could probably set this flag somewhere in the source tree / make files, but I haven't dug too deeply.

  1. Yes! I just re-ran the test with /API/, and I'm seeing the function definition of the foo() I added. Checking the manager/ ELF next as a sanity check...

@Abby-Wheelis
Copy link
Contributor

Update: I've been able to successfully see changes made during the compilation process!

I was able to replicate these steps and also see a function that I added in the logs!

@the-bay-kay
Copy link

When I go to check the manager...

/bullseye-toolchain/crosstool/bin/armv8-rpi4-linux-gnueabihf-objdump -S /ext/source/build-cross/dist/mnt/user_data/opt/everest/bin/manager > /tmp/temp.log

...I'm not seeing the signatures of the functions I added. They still appear within the intermediate phases' ARM code (API & YetiDriver) -- am I checking the wrong file? We should expect to see these signatures within the final manager, no? I deleted and re-compiled to ensure that I'm not looking at an old version...

I suppose there's a non-zero chance that my dummy files got optimized out via gcc (I don't know what flags we use, but IRRC, higher levels of optimization may remove "useless" code during optimization). From what I know about GCC, this shouldn't be an issue since we're defining a whole function and writing to the log stream.

Anyway, I'm glad to hear that you can replicate my build process @Abby-Wheelis ! LMK if those functions show up on the hardware -- I'll spend a bit more time digging into the manager's ASM, then I'll try running on the VM.

@Abby-Wheelis
Copy link
Contributor

Abby-Wheelis commented Oct 14, 2024

LMK if those functions show up on the hardware

Yes, after copying over I can see my function when I objdump /mnt/user_data/oct14/everest/libexec/everest/modules/YetiDriver/YetiDriver

FWIW I can't find it in /mnt/user_data/oct14/everest/bin/manager

It seems like I am updating the code, and the updates are copied over, so I'm going to add LOTS of logs and see if I can get anything to show up on hardware since it seems like the changes are persisting so maybe it's just crashing too early for the logs I tried to add earlier.

Edit: added EVLOG_error calls to several of the modules, none showing on the hardware

@the-bay-kay
Copy link

Good to know that we're seeing changes to the individual ELFs after copying! That rules that out, at least.

... added EVLOG_error calls to several of the modules, none showing on the hardware

Since we're not seeing the changes in manager, my hunch is that something hinky is going on during the final linking phase... I'll do some digging!

@the-bay-kay
Copy link

Let's trace the workflow to see how we compile manager...

  • When we run make -j$(nproc) -C build-cross, the Makefile within build-cross builds the manager ELF starting on line 458, using Makefile2 found in build-cross/CMakeFiles.

    Makefile Excerpt

    image

    • Aside: each Makefile we begin with has at the header "DO NOT EDIT! \n ...Generated by "Unix Makefiles"... So, these are generated by our initial run of cmake? If we need to adjustthe Makefiles, I'll need to understand this relationship better.
  • Next, Makefile2 starts setting directory rules for modules/ around line 1921...

    • As before, I'll use "YetiDriver" as our somewhat arbitrary sample. We find the directory level rules for this module around line 2492.
  • We find the build rules for YetiDriver start all the way down on line 5920! This tells us the CMake files for YetiDriver are in modules/CMakeFiles (Which I should've noticed, in retrospect).

    YetiDriver Build in Makefile2

    image

    • We see in line 163 of modules/CMakeFiles/YetiDriver.dir/build.make that we use a set of object files linked together (198) to the ELF we've already proven is OK. So, I don't think we have to worry about this sub-routine.

The YetiDriver process defined in Makefile2 above seems to be the last mention of the module; likewise, nowhere in this section do I see a linking phase similar to YetiDriver.dir/build.make. I think this is all of the pieces, but I'm missing something -- let's read these closer and see if we can't find anything (If we can't, I may be barking up the wrong build tree...)

@shankari shankari added this to the CharIN demo prep milestone Oct 15, 2024
@Abby-Wheelis
Copy link
Contributor

Investigate possibility of path being hardcoded:

  1. running the manager in opt/ with the config that worked in SIL gives the same charger type error
  2. removed opt/ and tried to run the manager in oct14/ (most recent cross-compiled copy) - failed immediately
$ sudo /mnt/user_data/oct14/everest/bin/manager --conf /mnt/user_data/oct14/everest/etc/everest/config-sil-no-crash-1.yaml 
sudo: unable to resolve host umwcdbde: Name or service not known
/mnt/user_data/oct14/everest/bin/manager: error while loading shared libraries: libframework.so.0.14.0: cannot open shared object file: No such file or directory
  1. overwrite opt with contents of oct14

  2. run it - does not crash -- seeing logs! (INIT API module is from me/new)


 $ sudo /mnt/user_data/opt/everest/bin/manager --conf /mnt/user_data/opt/everest/etc/everest/config-sil-no-crash-1.yaml 
sudo: unable to resolve host umwcdbde: Name or service not known
2024-10-15 21:00:42.271515 [INFO] manager          ::   ________      __                _   
2024-10-15 21:00:42.272199 [INFO] manager          ::  |  ____\ \    / /               | |  
2024-10-15 21:00:42.272321 [INFO] manager          ::  | |__   \ \  / /__ _ __ ___  ___| |_ 
2024-10-15 21:00:42.272422 [INFO] manager          ::  |  __|   \ \/ / _ \ '__/ _ \/ __| __|
2024-10-15 21:00:42.272515 [INFO] manager          ::  | |____   \  /  __/ | |  __/\__ \ |_ 
2024-10-15 21:00:42.272605 [INFO] manager          ::  |______|   \/ \___|_|  \___||___/\__|
2024-10-15 21:00:42.272694 [INFO] manager          :: 
2024-10-15 21:00:42.272779 [INFO] manager          :: Using MQTT broker localhost:1883
2024-10-15 21:00:42.272877 [INFO] manager          :: Telemetry enabled
2024-10-15 21:00:42.300282 [INFO] everest_ctrl     :: Launching controller service on port 8849
2024-10-15 21:00:42.365175 [INFO] manager          :: Loading config file at: /mnt/user_data/opt/everest/etc/everest/config-sil-no-crash-1.yaml
2024-10-15 21:00:43.289640 [INFO] manager          :: Config loading completed in 1012ms
2024-10-15 21:00:46.990462 [INFO] auth:Auth        :: Module auth initialized [3590ms]
2024-10-15 21:00:47.027747 [INFO] api:API          :: INIT API module
2024-10-15 21:00:47.008461 [INFO] grid_connection  :: Module grid_connection_point initialized [3469ms]
2024-10-15 21:00:47.105437 [INFO] evse_security:E  :: Module evse_security initialized [3600ms]
2024-10-15 21:00:47.098469 [INFO] token_validator  :: Module token_validator initialized [3490ms]

2024-10-15 21:00:47.120089 [INFO] iso15118_charge  :: TCP server on eth1 is listening on port [fe80::122c:797c:de49:474a%3]:61341

2024-10-15 21:00:47.120978 [INFO] iso15118_charge  :: TLS server on eth1 is listening on port [fe80::122c:797c:de49:474a%3]:641092024-10-15 21:00:47.108623 [INFO] token_provider:  :: Module token_provider initialized [3500ms]


2024-10-15 21:00:47.121292 [INFO] iso15118_charge  :: SDP socket setup succeeded
2024-10-15 21:00:47.124899 [INFO] iso15118_charge  :: Module iso15118_charger initialized [3529ms]
2024-10-15 21:00:47.162707 [INFO] yeti_driver_1:M  :: Initializing MMWBSP
2024-10-15 21:00:47.163305 [INFO] yeti_driver_1:M  :: log message before the new connector hardcoding
2024-10-15 21:00:47.163458 [INFO] yeti_driver_1:M  :: FILE WITH HARDCODED CONNECTOR TYPE!
2024-10-15 21:00:47.164208 [INFO] yeti_driver_1:M  :: Module yeti_driver_1 initialized [3500ms]
2024-10-15 21:00:47.200673 [INFO] api:API          :: Module api initialized [3825ms]
2024-10-15 21:00:47.256126 [INFO] slac:EvseSlac    :: Module slac initialized [3648ms]
2024-10-15 21:00:47.584714 [INFO] connector_1:Evs  :: Module connector_1 initialized [4112ms]
2024-10-15 21:00:49.067936 [INFO] manager          :: 🚙🚙🚙 All modules are initialized. EVerest up and running [6838ms] 🚙🚙🚙
sh: 1: echo: echo: I/O error
cobsDecode: Garbage detected
2024-10-15 21:00:49.169164 [INFO] slac:EvseSlac    :: Starting the SLAC state machine
2024-10-15 21:00:49.170590 [INFO] slac:EvseSlac    :: Qualcomm PLC Device Attributes:
  HW Platform: QCA7000
  SW Platform: MAC
  Firmware: 
  Build date: 1.2.5-0
  ZC signal: Missing
  Line frequency: 50Hz
2024-10-15 21:00:49.270996 [INFO] slac:EvseSlac    :: Entered Reset state
2024-10-15 21:00:49.271259 [INFO] slac:EvseSlac    :: New NMK key: 43:37:4E:58:39:41:41:52:4D:4C:30:59:34:4E:47:50
2024-10-15 21:00:49.271811 [INFO] slac:EvseSlac    :: Received CM_SET_KEY_CNF
2024-10-15 21:00:49.272551 [INFO] slac:EvseSlac    :: Entered Idle state
2024-10-15 21:00:49.517694 [INFO] connector_1:Evs  :: Max AC hardware capabilities: 6A/3ph
2024-10-15 21:00:49.577299 [INFO] connector_1:Evs  :: AC HLC mode enabled.

2024-10-15 21:00:49.632972 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-15 21:00:49.683111 [INFO] connector_1:Evs  :: 🌀🌀🌀 Ready to start charging 🌀🌀🌀

2024-10-15 21:00:49.843083 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-15 21:00:59.603321 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:00:59.704037 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:00:59.804667 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:00:59.905352 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:01:00.006025 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:01:00.106656 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:01:00.207342 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:01:00.307978 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 21:01:00.408608 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.

@shankari
Copy link
Collaborator Author

Yay! I knew it had worked for me! Such a weird issue, and so annoying
Now it's time to go to the lab and plug it in?!!

@Abby-Wheelis
Copy link
Contributor

Abby-Wheelis commented Oct 15, 2024

Now it's time to go to the lab and plug it in?!!

I think so!

There were a few parts of the config that I commented out, so I want to make sure we're testing with the "right" config, and debug anything else that comes up, but it does run, and we now know that code must be in /mnt/user_data/opt to be called as expected.

@shankari
Copy link
Collaborator Author

I think we are tracking that as a separate issue, so once you have checked over the configs, and confirmed that it consistently runs, we can go ahead and close this one (finally!!)

@shankari
Copy link
Collaborator Author

I will also point out that this only works if we disable the energy manager, which results in a max current of 0.
Not sure if this will be a problem.

2024-10-15 23:09:56.883206 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-15 23:09:56.925627 [INFO] connector_1:Evs  :: 🌀🌀🌀 Ready to start charging 🌀🌀🌀

2024-10-15 23:09:57.009075 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-15 23:10:06.898035 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-15 23:10:06.998978 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.

In the SIL, if I enable the energy manager, it still crashes. @Abby-Wheelis can you confirm that this is true even on hardware?

Will spend a time-bounded effort tonight to see if I can get the energy manager to stop crashing so that we can try to actually send a real current value.

@shankari
Copy link
Collaborator Author

ok so the energy manager thing is weird. It looks like everything crashes ~ 10 seconds after the manager starts

I enabled debug logging in build/dist/etc/everest/default_logging.cfg and added several sleep statements to the main energy manager loop.
new_sleep_statements.patch

If none of the sleep statements are in place, the energy manager crashes after 999 runs
no_sleeps.log

$ grep "Run energy optimizer" /tmp/no_sleeps.log | wc -l
     999

If all the sleep statements are in place, the energy manager crashes after a single run
all_sleeps.log

$ grep "Run energy optimizer" /tmp/all_sleeps.log | wc -l
       1

In both cases, the crash occurred after around 10 seconds

$ head -n1 /tmp/no_sleeps.log 
2024-10-16 00:40:36.868416 [INFO] manager          ::   ________      __                _   
$ tail -n1 /tmp/no_sleeps.log 
2024-10-16 00:40:42.411856 [CRIT] manager         int boot(const boost::program_options::variables_map&) :: Exiting manager.

$ head -n1 /tmp/all_sleeps.log 
2024-10-16 00:52:21.105189 [INFO] manager          ::   ________      __                _   
$ tail -n1 /tmp/all_sleeps.log 
2024-10-16 00:52:33.012974 [CRIT] manager         int boot(const boost::program_options::variables_map&) :: Exiting manager.

So I suspect this is related to some kind of threading/timing issue and not a particular issue with the code.

Note also that the optimizer doesn't actually return any values; in the no_sleep case, it consistently returns a vector of length 0. Can we just hack this to set the energy limit once and disable the thread? that would be the most surgical change...

@shankari
Copy link
Collaborator Author

shankari commented Oct 16, 2024

Looking at logs from a prior successful SIL run (non-cross-compiled), I can see that the optimizer returns a vector with a single value

2024-07-17 22:26:45.186727 [INFO] energy_manager:  ::                                NO TRADE
2024-07-17 22:26:45.186918 [INFO] energy_manager:  :: Sending enfored limits (import) to :evse_manager_1 {
    "ac_max_current_A": 32.0
}
2024-07-17 22:26:45.186986 [INFO] energy_manager:  :: returning optimized values vector of length 1
2024-07-17 22:26:45.187077 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W

let's try to return that value and see how it goes.

@shankari
Copy link
Collaborator Author

After the changes, I see

2024-10-16 01:40:09.569465 [INFO] energy_manager:  :: about to init globals again
2024-10-16 01:40:09.570124 [INFO] energy_manager:  :: about to lock the energy mutex
2024-10-16 01:40:09.570222 [INFO] energy_manager:  :: launched the mutex
2024-10-16 01:40:09.570526 [INFO] energy_manager:  :: returning optimized values vector of length 1
2024-10-16 01:40:09.570684 [INFO] energy_manager:  :: ran optimizer in the ready loop, got values 1
2024-10-16 01:40:09.570825 [INFO] energy_manager:  :: my_fake_uuid Enforce limits 32A -9999W 

And then a crash

Will fix the UUID to evse_manager_1, but the issue is clearly not with the values returned but with the thread. Trying to set the values once and not launching the thread at all...

@shankari
Copy link
Collaborator Author

shankari commented Oct 16, 2024

Removing the infinite loop

    // start thread to update energy optimization                                                           
    std::thread([this] {                                                                      
            EVLOG_info << "running another round of energy optimization with update_interval " << config.upda
            EVLOG_info << "about to init globals again";                                                     
            globals.init(date::utc_clock::now(), config.schedule_interval_duration, config.schedule_total_dur
                         config.slice_ampere, config.slice_watt, config.debug, energy_flow_request);         
            auto optimized_values = run_optimizer(energy_flow_request);                             
            EVLOG_info << "ran optimizer in the ready loop, got values " << optimized_values.size();
            enforce_limits(optimized_values);                                                       
            EVLOG_info << "enforced limits with " << optimized_values.size();                                
    }).detach();                                                                                             
causes the process to not crash, but goes back to 0 power, similar to disabling the module completely
2024-10-16 13:53:27.276290 [INFO] energy_manager:  :: about to lock the energy mutex
2024-10-16 13:53:27.276441 [INFO] energy_manager:  :: launched the mutex
2024-10-16 13:53:27.276767 [INFO] energy_manager:  :: returning optimized values vector of length 1
2024-10-16 13:53:27.276932 [INFO] energy_manager:  :: ran optimizer in the ready loop, got values 1
2024-10-16 13:53:27.276997 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 

...
2024-10-16 13:53:27.694557 [DEBG] energy_manager: Everest::Everest::subscribe_var(const Requirement&, const std::string&, const JsonCallback&)::<lambda(const Everest::json&)> :: Incoming grid_connection_point:EnergyNode->energy_grid:energy->energy_flow_request
2024-10-16 13:53:27.812083 [INFO] connector_1:Evs  :: Ignoring BSP Event, BSP is not enabled yet.
2024-10-16 13:53:27.812661 [INFO] connector_1:Evs  :: AC HLC mode enabled.

2024-10-16 13:53:27.900739 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-16 13:53:27.918303 [INFO] ev_manager:JsEv  :: Unplug detected, restarting simulation.
2024-10-16 13:53:27.944001 [INFO] connector_1:Evs  :: 🌀🌀🌀 Ready to start charging 🌀🌀🌀

2024-10-16 13:53:28.035004 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-16 13:53:28.697083 [DEBG] energy_manager: void Everest::MQTTAbstractionImpl::on_mqtt_message(std::shared_ptr<Everest::Message>) :: topic everest/grid_connection_point/energy_grid/var starts with everest/
...
2024-10-16 13:52:21.063319 [DEBG] energy_manager: void Everest::MQTTAbstractionImpl::on_mqtt_message(std::shared_ptr<Everest::Message>) :: topic everest/grid_connection_point/energy_grid/var starts with everest/
2024-10-16 13:52:21.064637 [DEBG] energy_manager: Everest::Everest::subscribe_var(const Requirement&, const std::string&, const JsonCallback&)::<lambda(const Everest::json&)> :: Incoming grid_connection_point:EnergyNode->energy_grid:energy->energy_flow_request
2024-10-16 13:52:21.122293 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-16 13:52:21.223231 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.

@shankari
Copy link
Collaborator Author

shankari commented Oct 16, 2024

What is interesting is that with the loop and the forced limits, the 2024-10-16 13:53:28.035004 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000 log never shows up

$ grep Enforce /tmp/with_forced_limit.log 
2024-10-16 14:02:28.032095 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 14:02:28.110878 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 14:02:28.236020 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 14:02:28.323189 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 14:02:28.378272 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 

$ grep iso15118_charge /tmp/with_forced_limit.log 
2024-10-16 14:02:25.299565 [DEBG] energy_manager  void Everest::Config::load_and_validate_manifest(const std::string&, const Everest::json&) :: Found module iso15118_charger:EvseV2G, loading and verifying manifest...
2024-10-16 14:02:25.750956 [DEBG] energy_manager  void Everest::Config::resolve_all_requirements() :: Manifest of connector_1:EvseManager lists requirement 'hlc' which will be fulfilled by iso15118_charger:EvseV2G->charger:ISO15118_charger...
2024-10-16 14:02:25.754719 [DEBG] energy_manager  void Everest::Config::resolve_all_requirements() :: Manifest of iso15118_charger:EvseV2G lists requirement 'security' which will be fulfilled by evse_security:EvseSecurity->main:evse_security...
2024-10-16 14:02:27.377779 [INFO] iso15118_charge  :: TCP server on eth0 is listening on port [fe80::42:acff:fe12:4%27]:61341
2024-10-16 14:02:27.378673 [INFO] iso15118_charge  :: TLS server on eth0 is listening on port [fe80::42:acff:fe12:4%27]:64109
2024-10-16 14:02:27.378744 [INFO] iso15118_charge  :: SDP socket setup succeeded
2024-10-16 14:02:27.379200 [INFO] iso15118_charge  :: Module iso15118_charger initialized [2286ms]
2024-10-16 14:02:28.525104 [INFO] manager          :: SIGTERM of child: iso15118_charger (pid: 11109) succeeded.

That's weird, because I set the hardcoded value to be valid for 3600 seconds, which should be an hour.

Next tries:

  • Bump up the validity to 3600,000 (just in case it is milliseconds)
  • Set the limits in the loop without calling the optimizer (just in case it is the optimizer and not the enforce_limits that is the problem
  • Start digging into iso15118_charger

@Abby-Wheelis
Copy link
Contributor

In the SIL, if I enable the energy manager, it still crashes. @Abby-Wheelis can you confirm that this is true even on hardware?

Yes, I can confirm that the energy_manager also seems to crash shortly after startup, I restored

this
-  energy_manager:
-    connections:
-      energy_trunk:
-        - implementation_id: energy_grid
-          module_id: grid_connection_point
-    module: EnergyManager

And then ran, seeing lots of logs that I added, and a crash from energy_manager

logs:
$ sudo /mnt/user_data/opt/everest/bin/manager --conf /mnt/user_data/opt/everest/etc/everest/config-sil-no-crash.yaml
sudo: unable to resolve host umwcdbde: Name or service not known
2024-10-16 16:20:42.204488 [INFO] manager          ::   ________      __                _   
2024-10-16 16:20:42.204978 [INFO] manager          ::  |  ____\ \    / /               | |  
2024-10-16 16:20:42.205069 [INFO] manager          ::  | |__   \ \  / /__ _ __ ___  ___| |_ 
2024-10-16 16:20:42.205347 [INFO] manager          ::  |  __|   \ \/ / _ \ '__/ _ \/ __| __|
2024-10-16 16:20:42.205478 [INFO] manager          ::  | |____   \  /  __/ | |  __/\__ \ |_ 
2024-10-16 16:20:42.205605 [INFO] manager          ::  |______|   \/ \___|_|  \___||___/\__|
2024-10-16 16:20:42.205777 [INFO] manager          :: 
2024-10-16 16:20:42.205919 [INFO] manager          :: Using MQTT broker localhost:1883
2024-10-16 16:20:42.206048 [INFO] manager          :: Telemetry enabled
2024-10-16 16:20:42.225010 [INFO] everest_ctrl     :: Launching controller service on port 8849
2024-10-16 16:20:42.290588 [INFO] manager          :: Loading config file at: /mnt/user_data/opt/everest/etc/everest/config-sil-no-crash.yaml
2024-10-16 16:20:43.257814 [INFO] manager          :: Config loading completed in 1048ms
2024-10-16 16:20:46.946623 [INFO] energy_manager:  :: Module energy_manager initialized [3481ms]
2024-10-16 16:20:47.294016 [INFO] token_validator  :: Module token_validator initialized [3603ms]
2024-10-16 16:20:47.457844 [INFO] auth:Auth        :: Module auth initialized [4108ms]
2024-10-16 16:20:47.478470 [INFO] api:API          :: INIT API module
2024-10-16 16:20:47.508750 [INFO] token_provider:  :: Module token_provider initialized [3970ms]
2024-10-16 16:20:47.534440 [INFO] grid_connection  :: Module grid_connection_point initialized [4043ms]
2024-10-16 16:20:47.591294 [INFO] api:API          :: Module api initialized [4224ms]
2024-10-16 16:20:47.667752 [INFO] yeti_driver_1:M  :: Initializing MMWBSP

2024-10-16 16:20:47.676306 [INFO] yeti_driver_1:M  :: log message before the new connector hardcoding
2024-10-16 16:20:47.676629 [INFO] yeti_driver_1:M  :: FILE WITH HARDCODED CONNECTOR TYPE!
2024-10-16 16:20:47.676107 [INFO] iso15118_charge  :: TCP server on eth1 is listening on port [fe80::fbba:70d9:9849:a121%3]:61341

2024-10-16 16:20:47.677077 [INFO] iso15118_charge  :: TLS server on eth1 is listening on port [fe80::fbba:70d9:9849:a121%3]:64109

2024-10-16 16:20:47.677619 [INFO] iso15118_charge  :: SDP socket setup succeeded
2024-10-16 16:20:47.678083 [INFO] yeti_driver_1:M  :: Module yeti_driver_1 initialized [3925ms]
2024-10-16 16:20:47.679090 [INFO] iso15118_charge  :: Module iso15118_charger initialized [4048ms]
2024-10-16 16:20:47.743297 [INFO] evse_security:E  :: Module evse_security initialized [4162ms]
2024-10-16 16:20:47.835952 [INFO] slac:EvseSlac    :: Module slac initialized [4159ms]
2024-10-16 16:20:48.041326 [INFO] connector_1:Evs  :: Module connector_1 initialized [4598ms]
2024-10-16 16:20:49.157922 [INFO] manager          :: 🚙🚙🚙 All modules are initialized. EVerest up and running [6984ms] 🚙🚙🚙
2024-10-16 16:20:49.160513 [INFO] energy_manager:  :: NEW LOG STATEMENT
sh: 1: echo: echo: I/O error
2024-10-16 16:20:49.188471 [INFO] energy_manager:  :: returning optimized values vector of length 0
2024-10-16 16:20:49.204564 [INFO] slac:EvseSlac    :: Starting the SLAC state machine
2024-10-16 16:20:49.206176 [INFO] slac:EvseSlac    :: Qualcomm PLC Device Attributes:
  HW Platform: QCA7000
  SW Platform: MAC
  Firmware: 
  Build date: 1.2.5-0
  ZC signal: Missing
  Line frequency: 50Hz
2024-10-16 16:20:49.306508 [INFO] slac:EvseSlac    :: Entered Reset state
2024-10-16 16:20:49.306791 [INFO] slac:EvseSlac    :: New NMK key: 49:30:45:4E:59:58:41:44:46:54:53:41:39:5A:48:37
2024-10-16 16:20:49.307355 [INFO] slac:EvseSlac    :: Received CM_SET_KEY_CNF
2024-10-16 16:20:49.307785 [INFO] slac:EvseSlac    :: Entered Idle state
2024-10-16 16:20:49.592767 [INFO] connector_1:Evs  :: Max AC hardware capabilities: 6A/3ph
2024-10-16 16:20:49.755591 [INFO] connector_1:Evs  :: AC HLC mode enabled.

2024-10-16 16:20:49.808876 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-16 16:20:49.857819 [INFO] connector_1:Evs  :: 🌀🌀🌀 Ready to start charging 🌀🌀🌀

2024-10-16 16:20:50.011078 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-16 16:20:50.454433 [CRIT] manager         int boot(const boost::program_options::variables_map&) :: Module energy_manager (pid: 1593) exited with status: 139. Terminating all modules.
2024-10-16 16:20:50.455798 [INFO] manager          :: SIGTERM of child: api (pid: 1589) succeeded.
2024-10-16 16:20:50.456008 [INFO] manager          :: SIGTERM of child: auth (pid: 1590) succeeded.
2024-10-16 16:20:50.456276 [INFO] manager          :: SIGTERM of child: connector_1 (pid: 1591) succeeded.
2024-10-16 16:20:50.458290 [INFO] manager          :: SIGTERM of child: connector_1_powerpath (pid: 1592) succeeded.
2024-10-16 16:20:50.458561 [INFO] manager          :: SIGTERM of child: evse_security (pid: 1594) succeeded.
2024-10-16 16:20:50.458716 [INFO] manager          :: SIGTERM of child: grid_connection_point (pid: 1595) succeeded.
2024-10-16 16:20:50.458830 [INFO] manager          :: SIGTERM of child: iso15118_car (pid: 1597) succeeded.
2024-10-16 16:20:50.458937 [INFO] manager          :: SIGTERM of child: iso15118_charger (pid: 1602) succeeded.
2024-10-16 16:20:50.468762 [INFO] manager          :: SIGTERM of child: slac (pid: 1603) succeeded.
2024-10-16 16:20:50.469523 [INFO] manager          :: SIGTERM of child: token_provider (pid: 1604) succeeded.
2024-10-16 16:20:50.469902 [INFO] manager          :: SIGTERM of child: token_validator (pid: 1605) succeeded.
2024-10-16 16:20:50.470145 [INFO] manager          :: SIGTERM of child: yeti_driver_1 (pid: 1606) succeeded.
2024-10-16 16:20:50.470277 [CRIT] manager         int boot(const boost::program_options::variables_map&) :: Exiting manager.

@shankari
Copy link
Collaborator Author

Bump up the validity to 3600,000 (just in case it is milliseconds) -- did not get fixed
Set the limits in the loop without calling the optimizer - does not crash, but we are still getting the `Power budget expired, falling back to 0.` message. Note that this is more of an indication of how the problem is with the mutex
        while(true) {
            std::vector<types::energy::EnforcedLimits> optimized_values;
            optimized_values.reserve(1);
            EVLOG_info << "about to init globals again";
            globals.init(date::utc_clock::now(), config.schedule_interval_duration, config.schedule_total_dur
                         config.slice_ampere, config.slice_watt, config.debug, energy_flow_request);
            // auto optimized_values = run_optimizer(energy_flow_request);
            // EVLOG_info << "ran optimizer in the ready loop, got values " << optimized_values.size();
            if (optimized_values.size() == 0) {
                types::energy::EnforcedLimits l;                                              
                l.uuid = "evse_manager_1";         
                l.valid_until =             
                        Everest::Date::to_rfc3339(globals.start_time + std::chrono::seconds(3600000));
                types::energy::LimitsRes r;
                r.ac_max_current_A = 32;
                l.limits_root_side = r;
                optimized_values.push_back(l);  
            }
       
            enforce_limits(optimized_values);
            EVLOG_info << "enforced limits with " << optimized_values.size();
        }

generates

2024-10-16 18:15:51.891699 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 18:15:51.891839 [INFO] energy_manager:  :: about to init globals again
2024-10-16 18:15:51.892061 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 18:15:51.976614 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 18:15:51.976697 [INFO] energy_manager:  :: about to init globals again
2024-10-16 18:15:51.976861 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 18:15:51.982200 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-16 18:15:52.063449 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 18:15:52.063567 [INFO] energy_manager:  :: about to init globals again
2024-10-16 18:15:52.063797 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 18:15:52.083250 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.
2024-10-16 18:15:52.183896 [WARN] connector_1:Evs bool module::Charger::power_available() :: Power budget expired, falling back to 0.

So we do need to muck around with the iso15118_charger code to stop the power budget from expiring...

@shankari
Copy link
Collaborator Author

Found those logs in `modules/EvseManager/Charger.cpp`
// returns whether power is actually available from EnergyManager                                              
// i.e. max_current is in valid range                                                                          
bool Charger::power_available() {                
    if (shared_context.max_current_valid_until < date::utc_clock::now()) {                                     
        EVLOG_warning << "Power budget expired, falling back to 0.";   
        if (shared_context.max_current > 0.) {                                                               
            shared_context.max_current = 0.;                                                                  
            signal_max_current(shared_context.max_current);                                                   
        }                                                                                        
    }                                                                                               
    return (get_max_current_internal() > 5.9);                                                      
}                                                                          

max_current_valid_until is set in Charger::set_max_current(float c, std::chrono::time_point<date::utc_clock> validUntil), which is called from ../modules/EvseManager/EvseManager.cpp (charger->set_max_current(0.0F, date::utc_clock::now() + std::chrono::seconds(10));)

    //  start with a limit of 0 amps. We will get a budget from EnergyManager that is locally limited by hw    
    //  caps.                                                                                                  
    charger->set_max_current(0.0F, date::utc_clock::now() + std::chrono::seconds(10));                         

So that is where the original (expiring) limit comes from. Why aren't we getting any values from the EnergyManager?

Aha! It looks like handle_enforce_limits can be called from EnergyNode or EvseManager

# grep -r handle_enforce_limits ../modules
../modules/EvseManager/energy_grid/energyImpl.hpp:    virtual void handle_enforce_limits(types::energy::EnforcedLimits& value) override;
../modules/EvseManager/energy_grid/energyImpl.cpp:void energyImpl::handle_enforce_limits(types::energy::EnforcedLimits& value) {
../modules/EnergyNode/energy_grid/energyImpl.hpp:    virtual void handle_enforce_limits(types::energy::EnforcedLimits& value) override;
../modules/EnergyNode/energy_grid/energyImpl.cpp:void energyImpl::handle_enforce_limits(types::energy::EnforcedLimits& value) {

And both of them check to see if the response is "for them"

    // is it for me?
    if (value.uuid == energy_flow_request.uuid) {    

Let's see if that is why they are ignoring the limit...

@shankari
Copy link
Collaborator Author

shankari commented Oct 16, 2024

Bingo!
2024-10-16 19:14:59.070561 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 19:14:59.070646 [INFO] energy_manager:  :: about to init globals again
2024-10-16 19:14:59.070853 [INFO] energy_manager:  :: evse_manager_1 Enforce limits 32A -9999W 
2024-10-16 19:14:59.112087 [ERRO] grid_connection virtual void module::energy_grid::energyImpl::handle_enforce_limits(types::energy::EnforcedLimits&) :: In EnergyNode: for UUID of grid_connection_point, received limit of {
    "limits_root_side": {
        "ac_max_current_A": 32.0
    },
    "uuid": "evse_manager_1",
    "valid_until": "2024-11-27T11:14:59.070Z"
}
2024-10-16 19:14:59.154161 [ERRO] connector_1:Evs virtual void module::energy_grid::energyImpl::handle_enforce_limits(types::energy::EnforcedLimits&) :: EvseManager: for connector_1, received limit of {
    "limits_root_side": {
        "ac_max_current_A": 32.0
    },
    "uuid": "evse_manager_1",
    "valid_until": "2024-11-27T11:14:59.070Z"
}
Changed the UUID to match and everything seems to work now
2024-10-16 19:21:07.974365 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 19:21:07.974448 [INFO] energy_manager:  :: about to init globals again
2024-10-16 19:21:07.974649 [INFO] energy_manager:  :: connector_1 Enforce limits 32A -9999W 
2024-10-16 19:21:08.015554 [ERRO] grid_connection virtual void module::energy_grid::energyImpl::handle_enforce_limits(types::energy::EnforcedLimits&) :: In EnergyNode: for UUID of grid_connection_point, received limit of {
    "limits_root_side": {
        "ac_max_current_A": 32.0
    },
    "uuid": "connector_1",
    "valid_until": "2024-11-27T11:21:07.974Z"
}
2024-10-16 19:21:08.032377 [ERRO] connector_1:Evs virtual void module::energy_grid::energyImpl::handle_enforce_limits(types::energy::EnforcedLimits&) :: EvseManager: for connector_1, received limit of {
    "limits_root_side": {
        "ac_max_current_A": 32.0
    },
    "uuid": "connector_1",
    "valid_until": "2024-11-27T11:21:07.974Z"
}

2024-10-16 19:21:08.034156 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 32.000000, we get 32.000000

@Abby-Wheelis can you also apply this additional patch and re-enable the energy manager?
Change the while(true) loop in modules/EnergyManager/EnergyManager.cpp to

        while(true) {            
            std::vector<types::energy::EnforcedLimits> optimized_values;
            optimized_values.reserve(1);
            EVLOG_info << "new round of optimization launched";
            globals.init(date::utc_clock::now(), config.schedule_interval_duration, config.schedule_total_durat
                         config.slice_ampere, config.slice_watt, config.debug, energy_flow_request);
            // auto optimized_values = run_optimizer(energy_flow_request);
            // EVLOG_info << "ran optimizer in the ready loop, got values " << optimized_values.size();
            if (optimized_values.size() == 0) {
                types::energy::EnforcedLimits l;
                l.uuid = "connector_1";                                                       
                l.valid_until =                    
                        Everest::Date::to_rfc3339(globals.start_time + std::chrono::seconds(config.update_inter
                types::energy::LimitsRes r;
                r.ac_max_current_A = 32;
                l.limits_root_side = r;
                optimized_values.push_back(l);
            }                                   
         
            enforce_limits(optimized_values);
            EVLOG_info << "enforced limits with " << optimized_values.size();
            {            
                std::unique_lock<std::mutex> lock(mainloop_sleep_mutex);
                mainloop_sleep_condvar.wait_for(lock, std::chrono::seconds(config.update_interval));
            }                
        }                 

I have also attached the expected logs. Once you confirm that this works, we can close this issue and move to #75

EDIT: Actually attached the logs
expected_logs_basic_config_sil.log

@corneliusclaussen
Copy link

I see you are running into a lot of issues, but I do not understand what problem you are solving here? The umwc yocto image comes with an sdk that can be used to cross compile Everest and install it under /var/everest and it just works.

@corneliusclaussen
Copy link

Why do you want to hardcode the uuid? This is definitely wrong.

@shankari
Copy link
Collaborator Author

@corneliusclaussen our current plan for the CharIN milestone is to experiment with plugging in and yocto builds in parallel.
I encourage you to look at the milestone tasks (https://github.com/EVerest/everest-demo/milestone/1), this is just one of them 😄 Let us know if you have any feedback on our plans!

  • Concretely, for plugging in, on the uMWC, we are running the release from Mar/Apr 2024 + the hacks that AFS gave me for the OCPP managed charging in June + the hacks that I put in during CharIN week on raspbian.

    • This is already so hacky that a hardcoded uuid is the least of our problems. We are not going to submit this - it is throwaway code to allow us to make progress. I did think about changing it to energy_flow_request.uuid but felt like I had spent enough time on this already.
    • This will allow us to make progress in plugging the uMWC into an EV/EV simulator at NREL and debugging the electrical components. At CharIN, I got everything to build and run on the last day, but the charge session was not initiated (apparently because we didn't send the 5% PWM) and I wasn't sure how to debug in the 30 minutes that I had.
  • In parallel, we are working on rolling forward to the most recent release, which has the correct, non-hacky implementation of managed charging from AFS

    • We will indeed use Yocto after we roll forward
    • This will let move from a working system to a working system and help us isolate whether the issues that we see are due to hardware or software.

@corneliusclaussen
Copy link

Iirc all you want to do is cross compile Everest and run it on the umwc, that's why I don't understand why you are running into so many issues. I think it is a combination of a very old Debian image and a very old Everest version with lots of changes.
With the yocto SDK this is a matter of minutes, we do it all the time, especially on Charin Testivals.
So I think the quickest by far is to upgrade to the latest umwc yocto build ( has to be done once with the raspberry pi tooling as rauc update from Debian to yocto is not possible).

Once you are on yocto it can be updated with rauc again. And a custom build can be installed under /var/everest

In this build there is a symlink /etc/selected-everest which you can point to the /var/Everest install, and then you can also use this custom version from the ui.

So it's all prepared for that.

@shankari
Copy link
Collaborator Author

Iirc all you want to do is cross compile Everest and run it on the umwc, that's why I don't understand why you are running into so many issues. I think it is a combination of a very old Debian image and a very old Everest version with lots of changes.

Yup. Note that the first issue that I ran into was the SSL version - everest had rolled forward to SSL3, but debian (or at least the toolchain) had only SSL1 installed - 90% of my hacks were related to rolling back to the older version of SSL.
#51 (comment) and onwards.

I am still not sure what is causing this most recent error with the run_optimizer - I did not change it in June, and was able to start up properly. But now, it is crashing consistently, both in SIL and in hardware. We are not going to spend additional time investigating since it is so old and the code is so hacky.

I think that part of it was also a little bit of "too many cooks" - AFS made the smart charging changes; I did the initial cross-compile, and then tried to hand it off to @Abby-Wheelis and @faizanmir21 who are on-site and can access the EV lab, but don't have a lot of experience with EVerest.

But going forward, as we update to a more recent version, we will use yocto, and hopefully it will be simpler.
@catarial has already been able to compile the most recent version of everest on a pi zero, and is working on the yocto build next.

We will be tracking the work required to get the yocto build to work in #76 Fingers crossed that it will be smooth sailing!

@Abby-Wheelis
Copy link
Contributor

@Abby-Wheelis can you also apply this additional patch and re-enable the energy manager?

I made this patch (was there another patch I needed to make too? It looks like most of the other work was logging/troubleshooting) and restored the manager, and it no longer seems to be crashing.

I have also attached the expected logs.

Are these the logs you were talking about?

Changed the UUID to match and everything seems to work now

If so, then I believe that is what I am seeing, after initialization I am seeing a loop of messages:
2024-10-16 22:26:51.075596 [INFO] connector_1:Evs  :: 🌀🌀🌀 Ready to start charging 🌀🌀🌀

2024-10-16 22:26:51.224352 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 0.000000, we get 0.000000
2024-10-16 22:26:52.020574 [INFO] energy_manager:  :: new round of optimization launched

2024-10-16 22:26:52.087538 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 32.000000, we get 32.000000
2024-10-16 22:26:52.139143 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 22:26:53.139522 [INFO] energy_manager:  :: new round of optimization launched

2024-10-16 22:26:53.157806 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 32.000000, we get 32.000000
2024-10-16 22:26:53.210145 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 22:26:54.210580 [INFO] energy_manager:  :: new round of optimization launched

2024-10-16 22:26:54.228634 [ERRO] iso15118_charge void dlog_func(dloglevel_t, const char*, int, const char*, const char*, ...) :: In ISO15118 charger impl, after updating AC max current to 32.000000, we get 32.000000
2024-10-16 22:26:54.279238 [INFO] energy_manager:  :: enforced limits with 1
2024-10-16 22:26:55.279616 [INFO] energy_manager:  :: new round of optimization launched

@shankari
Copy link
Collaborator Author

No, only this patch.
All looks good. I am closing this.

Further updates on creating a custom version of EVerest using the new yocto method will be tracked in #76
Further updates on plugging this custom version into a hardware simulator/EV will be tracked in #75

Hopefully rolling forward will be smoother.
One task down, four to go!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants