[Feature] Improved Status Endpoint #4934

Open
ColbyTresness opened this issue Sep 16, 2019 · 10 comments

@ColbyTresness

What problem would the feature you're requesting solve? Please describe.

A single place to go to understand the health of my function app (and all functions within it). Potential scenarios include alerting customers when something about their system has become unhealthy, building automation to roll back a deployment if an app becomes unhealthy, and displaying the overall status of a function app in the Azure portal.

Describe the solution you'd like

Probably an ARM API? But that's up to dev :)

Describe alternatives you've considered

The host status endpoint isn't sufficient here - it gives basic information about the current status of the host, but we need more. Also, if the host is unavailable, sometimes this API is too, which gives us very little information.

Additional context

The portal will benefit from this massively, but there are other reasons to do it as well!

@jeffhollan

@kulkarnisonia16 FYI, since I know this endpoint was previously a big source of supportability issues.

@ColbyTresness it would be good to understand whether there is anything specific you are thinking should be surfaced that isn't today. You say it "needs more", but it would help to know what additional information we have, or could potentially have, that we could surface.

@jeffhollan added this to the Triaged milestone on Sep 16, 2019
@ColbyTresness
Author

@btardif has some opinions on this one. One example is when a user brings a custom container without the functions host in it - the only thing we surface is "host not found".

I'd say the biggest class of issues I want us to get better at is where the host itself can't be reached.

@fabiocav
Member

fabiocav commented Oct 2, 2019

Most of the capabilities you're asking for are already exposed with the different status APIs (app level and function level). Let's discuss this to better understand what (if any) work needs to be done.

@ColbyTresness
Author

I think the main piece I want is better knowledge of situations where the host can't be reached. I'm imagining displaying some sort of "last known good" state; I don't believe any existing APIs handle this. Having function information in the function app API would also be helpful: which functions are disabled, in error, or healthy, based on their latest invocation. So would aggregation across instances: if one instance of the host is healthy but others aren't, it would be great to be able to display that information to users. Those are the main things I'm looking to light up in the portal.
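
To make the aggregation and "last known good" ideas concrete, here is a minimal sketch of how per-instance reports might be rolled up into one app-level status. Everything in it is hypothetical: the `InstanceReport` and `AppStatus` shapes, the `aggregate` helper, and the assumption that a healthy host reports a state of "Running" are illustrations only, not an existing Functions or ARM API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Health(Enum):
    HEALTHY = "healthy"      # every instance is reachable and running
    DEGRADED = "degraded"    # some instances are healthy, some are not
    UNHEALTHY = "unhealthy"  # no instance is healthy
    UNKNOWN = "unknown"      # no instance reports were available


@dataclass
class InstanceReport:
    # Hypothetical per-instance result of a status check.
    instance_id: str
    reachable: bool          # False when the host could not be contacted
    state: Optional[str]     # assumed to be "Running" when healthy; None if unreachable


@dataclass
class AppStatus:
    # Hypothetical app-level rollup that a portal or alert could consume.
    health: Health
    healthy_instances: int
    total_instances: int
    last_known_good: Optional[str]  # timestamp of the last fully healthy check


def aggregate(reports: List[InstanceReport], checked_at: str,
              previous: Optional[AppStatus] = None) -> AppStatus:
    """Roll up instance reports, carrying forward the last known good time."""
    total = len(reports)
    healthy = sum(1 for r in reports if r.reachable and r.state == "Running")

    if total == 0:
        health = Health.UNKNOWN
    elif healthy == total:
        health = Health.HEALTHY
    elif healthy == 0:
        health = Health.UNHEALTHY
    else:
        health = Health.DEGRADED

    if health is Health.HEALTHY:
        last_good = checked_at
    else:
        # Keep the earlier timestamp so callers can show "last known good".
        last_good = previous.last_known_good if previous else None

    return AppStatus(health, healthy, total, last_good)
```

With this shape, a check where one of two instances is unreachable would come back as `DEGRADED` while still carrying the timestamp of the last fully healthy check, which is the piece the portal could display when the host can't be reached.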

@ColbyTresness
Author

If/when we do this, it would be great to include data on the number of successful and failed function executions.

@jeffhollan

I just got into a state where I was using a Service Bus connection string that wasn't fully valid. No functions would trigger, so I didn't see any logs. At first I didn't see anything in the portal showing me this error. After I restarted the app I saw a "runtime unable to start" message (which I believe was related), but the error wasn't very useful. I wonder if any of the stuff @fabiocav mentioned would enable a scenario where the UX could have pointed more precisely to something like "Unable to connect to Service Bus 'foo'. Connection string is invalid".

@jeffhollan

Related to #4705

@kulkarnisonia16 changed the title from "Improved Status Endpoint" to "[Feature] Improved Status Endpoint" on Mar 16, 2020
@btardif
Member

btardif commented Aug 6, 2020

Is this related to #6255? @apawast?

@jeffhollan

Appears that way

@jviau
Contributor

jviau commented Mar 21, 2024

Improved observability for functions will be covered by our OTel work in #9273. We will emit FaaS-compliant telemetry, and you will have your choice of dashboards (whatever telemetry sink you decide to use) rather than being locked into just the one we provide.
