fix(api): fixes for tesla 20.49 API changes #16

pridkett
Contributor

NOTE: This supersedes PR #15. The major difference between the two PRs is that this one lets people keep their existing data streams. It creates a custom Telegraf container with a proxy running inside it that passes the cookies through to the Powerwall. As far as I can tell, this is the only way to get Telegraf to pass cookies on a request.

With the 20.49 firmware update by Tesla, the endpoints that previously
were open now require authentication. Unfortunately, the mechanism
used for authentication is not HTTP basic; rather, it's cookie
based. This brings in a very simple cookie-aware proxy to run in the
telegraf container. In short, cron updates the cookies every two minutes
and then cookieproxy forwards the connection, with cookies, through to the
Powerwall.

Fixes #14

DCO 1.1 Signed-off-by: Patrick Wagstrom [email protected]

@nhallwood

@pridkett Thanks for putting this update together - I've made the changes to my existing configuration and when I run the docker-compose command the build is failing in step 5/8 when it's getting packages - the specific line is:

    Get:31 http://deb.debian.org/debian buster/main armhf psmisc armhf 23.2-1 [123 kB]
    debconf: delaying package configuration, since apt-utils is not installed

At the end of the process I get the following errors:

    Successfully built 6f1942eff8ae
    Successfully tagged powerwall_monitor_telegraf:latest
    WARNING: Image for service telegraf was built because it did not already exist. To rebuild this image you must use `docker-compose build` or `docker-compose up --build`.
    Creating influxdb ... done
    Creating telegraf ... error
    Creating grafana ... done

    ERROR: for telegraf Cannot start service telegraf: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: "/entrypoint.sh": permission denied": unknown

I'm not sure where the permission error is and I assumed that since this is in the docker container it should be consistent with your configuration? Any ideas?

Thanks. Nick.

@pridkett
Contributor Author

pridkett commented Mar 2, 2021

@nhallwood: The error is caused by entrypoint.sh not having execution permissions. I want to make sure that I apply the correct fix. I've got a few questions that will help me narrow down the cause of the problem:

  • What platform are you building the container on? Linux, Mac, or Windows?
  • Can you look at the permissions of the telegraf/entrypoint.sh file? On Mac and Linux you can see this with ls -al telegraf and the output should look something like this:
    total 24
    drwxr-xr-x   5 pwagstro  staff   160 Mar  2 06:58 ./
    drwxr-xr-x  11 pwagstro  staff   352 Mar  2 06:58 ../
    -rw-r--r--   1 pwagstro  staff   250 Mar  2 06:58 Dockerfile
    -rwxr-xr-x   1 pwagstro  staff  1823 Mar  2 06:58 cookieproxy.sh*
    -rwxr-xr-x   1 pwagstro  staff   756 Mar  2 06:58 entrypoint.sh*
    
    The big thing that I'm looking for there is the x in the lines for cookieproxy.sh and entrypoint.sh. If you're on Windows, this is moot and I'll probably need to apply the fix or start pushing a package to Docker Hub, which I can set up over the weekend.

The line about delaying package configuration can be safely ignored - that's just what happens when you run without apt-utils installed and it's not even for a relevant package.

I'm beginning to think that the best route is probably for me to just start pushing a prebuilt powerwall_monitor_telegraf container to Docker Hub, which would avoid these problems. I just need to figure out how to do that; I normally push to internal corporate Docker repositories.

@nhallwood

nhallwood commented Mar 2, 2021

@pridkett I'm building/running on a Raspberry Pi 4. I've checked and none of the 3 files in that folder had execution permissions. I've added executable permissions to both .sh files. I then built the containers again (which rebuilt telegraf only) and it looks like it's running now. After 5 minutes I'm seeing data again in the Grafana dashboard, so it is updating the influxdb.

One thing that wasn't clear in the instructions (and I don't yet fully understand all the inner workings of the influxdb) is whether I needed to re-run all of the instructions following the command to start the docker containers. Clearly from my experience above, if you had the previous incarnation running then all you need to do is get to the "start the docker containers" command and then everything should start again. Might be worth adding a line to the md file after the start docker containers command that lets updaters know they should stop there but new builders should continue.

Thanks for your help in solving this and for getting this working again!

@andrewfoster

After a day or so cookieproxy stops responding:

root 15 0.1 0.0 0 0 ? Z Mar04 3:36 [cookieproxy] <defunct>

Nothing obviously helpful in telegraf logs, other than connection refused errors:

[inputs.http] Error in plugin: [url=http://localhost:8675/p/?target=https://powerwall/api/system_status/soe]: Get "http://localhost:8675/p/?target=https://powerwall/api/system_status/soe": dial tcp 127.0.0.1:8675: connect: connection refused

@pridkett
Contributor Author

pridkett commented Mar 6, 2021

If you kick just the telegraf container that should work, but let me find a better fix. I'll probably add support for supervisord or something similar. Gimme a chance to tweak that.

How to kick just the telegraf container:

docker-compose -f powerwall.yml restart telegraf

Obviously, this is not a complete fix, but it should get you back on track while I either integrate supervisord or figure out what's breaking cookieproxy.

@andrewfoster

I ran cookieproxy manually for a while. The last thing logged to stdout was:

2021/03/09 13:18:25 open /tmp/cookies/powerwall.txt: too many open files

I assume this is caused by the curl cronjob creating a new cookies.txt and cookieproxy not handling that well. I've switched the cronjob to restart cookieproxy each time the cookie file changes.

@pridkett
Contributor Author

You're 100% correct on that. I used defer to close the cookiejar, but the goroutine reading it never exited, so the deferred close never ran. That led to an explosion in open files. Let me fix that.

@nwhobart

nwhobart commented Apr 4, 2021

This looks great. What's the delay @mihailescu2m ? @pridkett are you actively using your branch?

@pridkett
Contributor Author

pridkett commented Apr 4, 2021

@nwhobart - this is a great question. I guess I could use a little bit of help here with deciding on an architectural direction. Apologies for the tome that I've written below.

I've been using this branch for a little over a month - and since the bug with the open file handles was fixed, I haven't had any problems and haven't had to restart anything. I run my setup on a Raspberry Pi 4 with 4GB of RAM that also runs an overly complicated PiHole setup. Here's my dashboard for the month of March. (I don't get great performance from my setup; the gable on my house runs almost straight north/south, so my panels face almost straight east and west.)

[image: dashboard for the month of March]

However, the reason I'm not pushing hard for it is that I'm not sure it's the correct architectural solution. I don't really like having to pull so many other things into the telegraf container - including curl and cookieproxy. The way that the cookies are managed for the Powerwall Gateway as a text file just seems dirty and hacky. The way that cookieproxy can die but the container keeps running seems dirty and hacky. The increase in size of the telegraf container seems like a mess. In short, it's a hack.

From an architectural perspective I've been playing with a couple of different solutions:

  1. Enhancements to the telegraf container customizations to make cookieproxy more resilient. That still leaves a bunch of messy stuff on top of the telegraf container, which means more moving parts and a harder time debugging.
  2. Publishing a modified container that powerwall_monitor can use instead of telegraf. This seems cleaner because the end user doesn't need to go through the hassle of a two-stage build process with golang and a bunch of other stuff. But it leaves me or @mihailescu2m on the hook for maintaining that container and ensuring it gets updated. I don't really want to do that, and once again it's another dependency.
  3. Submitting patches to telegraf to allow it to execute a pre-flight request to get cookies. In all reality, this is probably the best solution, but it requires a lot of work on my end and I don't really want to go munging around in the telegraf code and then navigate the process of submitting a PR against it with all the tests. This will take me a long time for something that is a super niche use case, and it isn't certain that it will be accepted even if I generate a nice patch. In short, if this were something for work or that would take me in a good direction for my career, I might have the patience to walk through this process, but not for a small hobby.
  4. Moving cookieproxy into its own container and taking the responsibility to publish a cookieproxy container whenever a new release of cookieproxy is done. I recently added code to cookieproxy that removes the need for curl (see pridkett/cookieproxy@5b78434), so this can be done as a very lightweight container. In fact, because it's golang, it could probably be a FROM scratch container. In this setup docker-compose would be used to restart the cookieproxy container if it failed, and it could spew logs to stdout that you could grab easily.

I'm pretty sure that solution 4 is the best solution; I just need some time to put it together. In the meantime, this branch works fine for me and for other people (or so it seems). If people find bugs, particularly bugs with cookieproxy like the one @andrewfoster found earlier, they'll get patched right up.

@andrewfoster

> This looks great. What's the delay @mihailescu2m ? @pridkett are you actively using your branch?

I am using this branch, calling docker-compose restart telegraf periodically to avoid the issue I noted above.

@mihailescu2m
Owner

mihailescu2m commented Apr 8, 2021

@nwhobart @andrewfoster @pridkett
I have another solution already working nicely (another telegraf plugin), but I am writing my own frontend interface instead of using grafana.
So I will publish when I get that done ...

@mrisher
Contributor

mrisher commented Apr 8, 2021

Hi. I'm trying your branch for the first time, and it doesn't seem the cookie proxy is working. My docker logs telegraf output shows line after line of

2021-04-08T01:13:35Z E! [inputs.http] Error in plugin: [url=http://localhost:8675/p/?target=https://powerwall/api/meters/aggregates]: Get "http://localhost:8675/p/?target=https://powerwall/api/meters/aggregates": dial tcp 127.0.0.1:8675: connect: connection refused

This is on a Raspberry Pi 4. I did try docker cp telegraf:/tmp/cookies and there is a file with some libcurl output there, but the dashboard is still blank so I'm skeptical. Any suggestions on how to debug? Thanks!

@pridkett
Contributor Author

I'm going to close this in favor of #17 - which has a much cleaner implementation. In short, it adds in Cookieproxy as an independent container and requires no changes to your influxdb schema, no silliness with kicking containers because of Cookieproxy dying (docker-compose should do that), and no building of containers locally. For people with previously working setups, it should just be a matter of setting the IP address of your Powerwall and dropping your password into .env.cookieproxy.

@pridkett pridkett closed this Apr 11, 2021
Successfully merging this pull request may close these issues.

Version 20.49 broke this
6 participants