Deal with timeouts normally instead of raising exceptions #22

Open
manzanit0 opened this issue Sep 17, 2019 · 2 comments
Labels
bug Something isn't working

Comments

@manzanit0
Owner

After a few months in production, we've seen the TFL API time out at least 10-15 times a week, even though we have a receive timeout of 50 seconds configured. It would be worth handling these in an orderly, managed way, if only to avoid polluting the logs and Bugsnag.

Note: Double check which timeout it is. We're calling HTTPoison.get! with recv_timeout: 50000, but HTTPoison has two kinds of timeouts, the connection timeout (timeout) and the response timeout (recv_timeout): edgurgel/httpoison#211
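To make the distinction concrete, here is a minimal sketch that sets both options and returns tagged tuples instead of raising. The module name `TimeoutSketch`, the injectable `http_get` argument, and the 15-second connect value are assumptions for illustration, not the project's code. As far as I know, hackney reports a connection timeout as `:connect_timeout` and a receive timeout as `:timeout`, but that mapping is worth verifying against the linked HTTPoison issue.

```elixir
defmodule TimeoutSketch do
  # Hypothetical wrapper, not Londibot's actual TFL module.
  # :timeout      -> how long to wait while establishing the connection
  # :recv_timeout -> how long to wait for the response once connected
  # The 15_000 value is an assumed example; 50_000 is the value from the issue.
  @options [timeout: 15_000, recv_timeout: 50_000]

  # http_get is injectable so the function can be exercised without a network.
  # Responses and errors are matched as plain maps, which also match the
  # %HTTPoison.Response{} and %HTTPoison.Error{} structs.
  def fetch(url, http_get \\ &HTTPoison.get/3) do
    case http_get.(url, [], @options) do
      {:ok, %{status_code: 200, body: body}} -> {:ok, body}
      {:ok, %{status_code: code}} -> {:error, {:unexpected_status, code}}
      {:error, %{reason: :connect_timeout}} -> {:error, :connect_timeout}
      {:error, %{reason: :timeout}} -> {:error, :recv_timeout}
      {:error, %{reason: reason}} -> {:error, reason}
    end
  end
end
```

Using the non-raising `HTTPoison.get/3` rather than `get!` means the caller decides what a timeout means, and the two error tags make it visible in the logs which timeout actually fired.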

Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | 22:44:31.197 [error] GenServer #PID<0.28997.0> terminating
Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | ** (HTTPoison.Error) :timeout
Sep 16 11:44:31.197pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (httpoison) lib/httpoison.ex:128: HTTPoison.request!/5
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/tfl.ex:27: Londibot.TFL.status!/1
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:22: Londibot.StatusBroker.get_latest!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:36: Londibot.StatusBroker.get_changes!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/tfl/status_broker.ex:31: Londibot.StatusBroker.get_non_routinary_changes!/0
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (londibot) lib/londibot/disruptions/disruption_worker.ex:41: Londibot.DisruptionWorker.handle_info/2
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (stdlib) gen_server.erl:637: :gen_server.try_dispatch/4
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | (stdlib) gen_server.erl:711: :gen_server.handle_msg/6
Sep 16 11:44:31.198pm info londibot-slack londibot-slack-6884c89d88-bqr7m web.1 | Last message: :work

It might be worth adding a retry mechanism for timeouts. When a timeout occurs, the API isn't polled again for another 3 minutes, so if two happen in a row, Londibot may fail to notify users of changes for a period of ~10 minutes, even if changes have occurred.
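One possible shape for such a retry, sketched under assumptions: `RetrySketch`, `with_retry/3`, and the default of 3 attempts with a doubling delay are hypothetical names and values, and the helper expects the non-raising `{:error, %HTTPoison.Error{}}` return shape rather than the exception raised by `get!`.

```elixir
defmodule RetrySketch do
  # Hypothetical helper; the name and defaults are assumptions.
  # Calls fun up to `attempts` times, sleeping between tries and doubling
  # the delay each time. Only timeout errors are retried; any other result
  # (success or a different error) is returned immediately.
  def with_retry(fun, attempts \\ 3, delay_ms \\ 1_000)

  def with_retry(fun, attempts, delay_ms) when attempts > 1 do
    case fun.() do
      {:error, %{reason: reason}} when reason in [:timeout, :connect_timeout] ->
        Process.sleep(delay_ms)
        with_retry(fun, attempts - 1, delay_ms * 2)

      result ->
        result
    end
  end

  # Last attempt: return whatever comes back, including a timeout error.
  def with_retry(fun, _attempts, _delay_ms), do: fun.()
end
```

It could be used as `RetrySketch.with_retry(fn -> HTTPoison.get(url, [], recv_timeout: 50_000) end)`. If every attempt times out, the worker can log the final `{:error, _}` and wait for the next poll instead of crashing the GenServer. Note that with a 50-second receive timeout, three attempts can take longer than a short polling interval, so the attempt count and timeouts would need tuning together.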

@manzanit0 manzanit0 added the bug Something isn't working label Sep 17, 2019
@manzanit0
Owner Author

Committed 431719d. It doesn't handle timeout errors once they are raised, but it increases the window allowed for establishing a connection. Hopefully that stops most of them from happening.

@manzanit0
Owner Author

As of today, there have been 8 timeouts in the last 3 days, so it's still making noise.
