After a few months in production, we've seen that the TFL API times out at least 10-15 times a week, even though we have a 50-second timeout configured. It would be good to handle these in an orderly, managed way, if only to avoid polluting the logs and Bugsnag.
Note: double-check which timeout it is. We're calling HTTPoison.get!(recv_timeout: 50000), but there are two kinds of timeouts (connect and receive): edgurgel/httpoison#211
Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | 22:44:31.197 [error] GenServer #PID<0.28997.0> terminating
Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | ** (HTTPoison.Error) :timeout
Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (httpoison) lib/httpoison.ex:128: HTTPoison.request!/5
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/tfl.ex:27: Londibot.TFL.status!/1
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:22: Londibot.StatusBroker.get_latest!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:36: Londibot.StatusBroker.get_changes!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:31: Londibot.StatusBroker.get_non_routinary_changes!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/disruptions/disruption_worker.ex:41: Londibot.DisruptionWorker.handle_info/2
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | Last message: :work
It might be worth developing a retry mechanism for timeouts. If a timeout occurs, the API won't be polled again for another 3 minutes, so if two happen in a row, Londibot potentially won't send change notifications for a period of ~10 minutes, even if changes have occurred.
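A minimal sketch of what such a retry could look like, not what Londibot actually does. `RetrySketch`, `with_retry`, and the attempt/backoff defaults are hypothetical names and values chosen for illustration. It assumes the fetch function returns `{:ok, _}` or `{:error, %{reason: reason}}` (the shape of the non-raising `HTTPoison.get`), and that `:timeout` and `:connect_timeout` are the reasons worth retrying:

```elixir
defmodule RetrySketch do
  @moduledoc """
  Hypothetical helper, not part of Londibot: re-runs a fetch function
  when it reports a timeout, doubling the wait between attempts.
  """

  # Assumption: these are the reasons reported for receive and connect timeouts.
  @retryable [:timeout, :connect_timeout]

  def with_retry(fetch, attempts \\ 3, backoff_ms \\ 1_000)

  def with_retry(fetch, attempts, backoff_ms) when attempts > 1 do
    case fetch.() do
      {:error, %{reason: reason}} when reason in @retryable ->
        # Transient failure: wait, then try again with a longer backoff.
        Process.sleep(backoff_ms)
        with_retry(fetch, attempts - 1, backoff_ms * 2)

      other ->
        # Success or a non-timeout error: return it unchanged.
        other
    end
  end

  # Last attempt: return whatever comes back, success or error.
  def with_retry(fetch, _attempts, _backoff_ms), do: fetch.()
end
```

Usage would be along the lines of `RetrySketch.with_retry(fn -> HTTPoison.get(url, [], timeout: 15_000, recv_timeout: 50_000) end)`, where `url` and the timeout values are placeholders. Because the final result is still an `{:error, _}` tuple rather than a raise, the caller can log it quietly and wait for the next poll instead of crashing the GenServer.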
Committed 431719d. It doesn't handle the timeout errors once raised, but it increases the time allowed for establishing a connection. Hopefully that stops most of them from happening.