Deal with timeouts normally instead of raising exceptions #22

manzanit0 · 2019-09-17T12:05:52Z

After a few months in production, we've seen that the TFL API times out at least 10-15 times a week, even though we have a receive timeout of 50 seconds configured. It would be worth handling these in an orderly, managed way, if only to avoid polluting the logs and Bugsnag.

Note: Double check which timeout it is – we're doing HTTPoison.get!(recv_timeout: 50000), but there are two kinds of timeouts: edgurgel/httpoison#211
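To make the two cases easier to tell apart, here is a minimal sketch of how the result of the non-raising `HTTPoison.get/3` could be normalised. `TimeoutSketch` and `classify/1` are hypothetical names, not part of Londibot. It assumes that a receive timeout surfaces as the reason `:timeout` (which matches the trace below) and a connection timeout as `:connect_timeout`; that second reason is my understanding of hackney's behaviour and should be verified. The clauses match on plain maps so the snippet runs without HTTPoison installed; the `%HTTPoison.Response{}` and `%HTTPoison.Error{}` structs match the same patterns.

```elixir
defmodule TimeoutSketch do
  # Hypothetical helper: turns the result of HTTPoison.get/3 (the non-bang
  # variant, which returns tuples instead of raising) into tagged tuples.

  # Successful response: hand back the body.
  def classify({:ok, %{status_code: 200, body: body}}), do: {:ok, body}

  # Any other HTTP status is reported, not raised.
  def classify({:ok, %{status_code: code}}), do: {:error, {:unexpected_status, code}}

  # Assumption: reason :timeout means the response was not received in time
  # (governed by the :recv_timeout option).
  def classify({:error, %{reason: :timeout}}), do: {:error, :recv_timeout}

  # Assumption: reason :connect_timeout means the connection could not be
  # established in time (governed by the :timeout option).
  def classify({:error, %{reason: :connect_timeout}}), do: {:error, :connect_timeout}

  # Everything else is passed through unchanged.
  def classify({:error, %{reason: reason}}), do: {:error, reason}
end
```

With HTTPoison available, the call site would look something like `url |> HTTPoison.get([], timeout: 10_000, recv_timeout: 50_000) |> TimeoutSketch.classify()`, where the 10 second connect timeout is only an illustrative value.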

Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | 22:44:31.197 [error] GenServer #PID<0.28997.0> terminating
Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | ** (HTTPoison.Error) :timeout
Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (httpoison) lib/httpoison.ex:128: HTTPoison.request!/5
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/tfl.ex:27: Londibot.TFL.status!/1
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:22: Londibot.StatusBroker.get_latest!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:36: Londibot.StatusBroker.get_changes!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:31: Londibot.StatusBroker.get_non_routinary_changes!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/disruptions/disruption_worker.ex:41: Londibot.DisruptionWorker.handle_info/2
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | Last message: :work

It might be worth developing a retry mechanism for timeouts. If a timeout occurs, the API won't be polled again for another 3 minutes, so if two happen in a row, Londibot potentially won't notify anyone of changes for a period of ~10 minutes, even if changes have occurred.
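A rough sketch of what such a retry could look like, independent of the HTTP client. `RetrySketch` and `with_retry/3` are hypothetical names, and the sketch assumes the wrapped function returns `{:error, :timeout}` on a timeout instead of raising, which is not how `status!/1` behaves today.

```elixir
defmodule RetrySketch do
  # Calls `fun` up to `attempts` times in total, sleeping `delay_ms`
  # milliseconds between tries. Only {:error, :timeout} triggers a retry;
  # any other result is returned immediately.
  def with_retry(fun, attempts \\ 3, delay_ms \\ 1_000)

  def with_retry(fun, attempts, delay_ms) when attempts > 1 do
    case fun.() do
      {:error, :timeout} ->
        Process.sleep(delay_ms)
        with_retry(fun, attempts - 1, delay_ms)

      other ->
        other
    end
  end

  # Last attempt: return whatever comes back, including a timeout.
  def with_retry(fun, _attempts, _delay_ms), do: fun.()
end
```

Retrying within the same poll would keep a single timeout from costing a full 3 minute cycle. The worker would still need to handle a final `{:error, :timeout}` without crashing, for example by logging it and waiting for the next `:work` message.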

manzanit0 · 2019-09-18T16:32:00Z

Committed 431719d. It doesn't deal with the timeout errors once raised, but it increases the window for establishing a connection. Hopefully that stops most of them from happening.

manzanit0 · 2019-09-21T18:58:03Z

As of today, there have been 8 timeouts in the last 3 days, so it's still making noise.

manzanit0 added the bug label on Sep 17, 2019
