Best practice: obtaining complete state after server outage #246
Comments
Having an API command to trigger ALL configured fields would be nice indeed.
@jbanyer when a vehicle goes offline and comes back, it does backfill its data. Are you saying that when your server goes down and the vehicle reconnects, the same doesn't occur? I have proof that vehicles backfill when they go offline, but I never take all my load-balanced Fleet Telemetry servers down simultaneously, so I don't know if it works there too.
@Bre77 I'm referring to when the third-party backend system (e.g. my system) is down for a while. During the time that it's down, vehicles will send field changes, and the backend will miss them. There needs to be some way for a backend to acquire the missed updates when it comes back up. Having a zero-downtime deploy process helps avoid this situation, but all systems experience total outages occasionally, so there needs to be a method to get the missed updates.
Similar question: what happens, and what should we do, if the car is offline (like no connection in underground parking)?
The car sends these signals as soon as it reconnects, so I would assume so.
The vehicle stores a buffer of messages (up to 5k messages currently) to be sent once the vehicle comes back online. This behavior is needed as otherwise the backend will not be able to reconcile the vehicle's state. If the vehicle goes to sleep before reconnecting to the internet, buffered messages will not be sent and you won't be billed. Server outage is a great question. There is not currently a way to force all data values to be sent. Please don't update fleet-telemetry configurations for all vehicles to trigger this. I'm not guaranteeing any of these solutions but the ideas that come to my mind:
Thoughts or other ideas?
@patrickdemers6 thanks for your reply. I think it would be best if the application was in control, since it is best placed to know that a resync is required. An API request which prompted the vehicle to resend all telemetry fields should work. Although perhaps there are other solutions. The situation is probably fairly rare, especially if backends are using a persistent queue mechanism to hold telemetry records. Many developers may choose not to bother making use of a resync mechanism. If telemetry records included some kind of sequence number, the backend could detect that a message has been missed and request a resync. But that would require adding a new field just to help with a rare situation? Although it may also be useful to handle race conditions in distributed systems? Perhaps we'll have firmer ideas once we've all had more experience with using fleet telemetry at scale. Cheers.
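The sequence number idea above could be sketched roughly as follows. This is only an illustration: it assumes a hypothetical per-vehicle `seq` counter that increments by one on every record, which fleet telemetry records do not carry according to this thread. `GapDetector` and `observe` are made-up names.

```python
from dataclasses import dataclass, field


@dataclass
class GapDetector:
    """Tracks the highest sequence number seen per vehicle.

    Assumes a hypothetical `seq` field that increments by 1 per record.
    """

    last_seq: dict[str, int] = field(default_factory=dict)

    def observe(self, vin: str, seq: int) -> bool:
        """Record one message; return True if a resync looks necessary."""
        previous = self.last_seq.get(vin)
        if previous is None:
            # No stored state for this vehicle (first contact or data loss).
            self.last_seq[vin] = seq
            return True
        if seq <= previous:
            # Duplicate or late delivery: nothing new was missed.
            return False
        self.last_seq[vin] = seq
        # A jump of more than 1 means at least one record never arrived.
        return seq > previous + 1
```

A backend would call `observe` for every incoming record and request a full resend only for vehicles where it returns `True`, so vehicles that sent nothing during the outage are left alone.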
@patrickdemers6 I don't need all fields, just a couple of fields for our state machine like "Gear", "ChargeState" and maybe 2-3 others. So a configurable field list would be much more useful.
@patrickdemers6 Is this buffering behavior you describe for all fields on the vehicle (so Tesla can build their own state) or only the fields we're subscribed to? Also, what exactly does 'online' mean here in relation to buffering? Does the car buffer fields when the websocket connection is severed (perhaps due to an issue on our end) or only when it loses its internet connection?
I think a sequence id is a great idea. It will help us determine if a full sync is needed, but will also help detect partially stale states. For example, say we wanted to calculate voltage from amps and power: we could miss an update to amps but receive updated power data, and we'd use a stale amp value to calculate voltage incorrectly.
Doesn't the existing timestamp on every update serve that purpose, @morganofslo? If you requested this every 5 minutes, and your value is more than 5 minutes old... it's stale?
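The timestamp check suggested above might look like the sketch below. It assumes every subscribed field is re-sent at least once per interval. As the issue description notes, vehicles only send fields which have changed, so an old timestamp may simply mean the value has not changed. `is_stale` and its parameters are hypothetical names, not part of fleet telemetry.

```python
import time
from typing import Optional


def is_stale(
    last_update_ts: float,
    interval_s: float = 300.0,
    now: Optional[float] = None,
) -> bool:
    """Return True if the stored value is older than one reporting interval.

    last_update_ts: epoch seconds taken from the telemetry record's timestamp.
    interval_s: the configured interval (5 minutes in the comment above).
    now: injectable clock for testing; defaults to the current time.
    """
    current = time.time() if now is None else now
    return current - last_update_ts > interval_s
```

Under the send-on-change behavior this heuristic can flag values that are old but still correct, which is the case a sequence id would distinguish.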
Since vehicles only send fields which have changed, if our server is down for a while then we will miss updates. There is also the possibility of entirely losing stored vehicle state due to a problem on our server.
After a server outage or restart, how should the complete vehicle state be obtained?
I noticed that deleting and recreating the fleet telemetry config causes the vehicle to immediately send a telemetry record containing all configured fields. Is that an acceptable way to obtain the complete state?
If so, is it acceptable to do this to every vehicle connected to a service after a restart? Could be many thousands of vehicles.
Most of the time the issue will be a short outage, not a complete loss of server data, so the only issue is missing updates during the outage. It is unnecessary to request complete state for vehicles which sent no updates during the outage. Is it possible to detect for a given vehicle that updates have been missed, and then only trigger a full resend in that case?
Another idea would be to poll the vehicle using the polling API call. That would involve substantially higher costs, though probably not prohibitive (0.2 cents per vehicle after each restart).
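For a sense of scale, here is a small sketch of the polling cost using the 0.2 cents per vehicle figure quoted above. The fleet size of 10,000 is an assumed example and `resync_cost_usd` is a hypothetical helper.

```python
from decimal import Decimal


def resync_cost_usd(
    vehicle_count: int,
    cents_per_vehicle: Decimal = Decimal("0.2"),
) -> Decimal:
    """Cost in USD of polling every vehicle once after a restart."""
    # Decimal avoids binary floating-point rounding on fractional cents.
    return cents_per_vehicle * vehicle_count / Decimal(100)


# Assumed fleet of 10,000 vehicles: 10,000 x 0.2 cents = 2,000 cents.
cost = resync_cost_usd(10_000)
```

With those assumptions one full resync by polling comes to 20 USD per restart.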