- Centralized way of getting insights, from application to infrastructure
- You can diagnose, trace and debug issues
- Uses ML to detect anomalies and reveal hidden patterns
- Track how customers interact with the application
- Components
- • Alerts • Metrics • Action groups • Monitoring & reporting • Dashboard • Logs
- Collects data from
- • Application • Operating system • Resources • Subscription • Tenant
- Populates stores
- Metrics & logs
- Perform functions:
- Insights: • Application • Container • VM • Monitoring solutions
- Visualize: • Dashboards • Views • Power BI • Workbooks
- Analyze: • Metrics Explorer • Log Analytics
- Respond: • Alerts • Autoscale
- Integrate: • Event Hubs • Logic Apps • Ingest & Export APIs
- Notifies when important conditions are found in the monitoring data
- Flow of alerts
- Alert Rule
- Target Resource (Signal) → Criteria (Logic Test)
- Action Group (Actions to do)
- Monitor condition (Alert State)
- Each alert rule has exactly one of each of the following properties:
- Target resource
- Scope & signals for alerting.
- E.g. VM
- Signal
- Emitted by target resource
- Can be metrics, activity log, Application Insights, or logs.
- Criteria
- Combination of signal and logic applied on target resources.
- E.g. less than X CPU usage.
- Logic
- User-defined logic to verify that signal is within expected range/values.
- E.g. less than 30% CPU usage.
- Alert name
- Alert description
- Severity
- Severity of the alert once the criteria specified in the alert rule are met.
- Can range from 0 to 4.
- Action
- Specific action taken when the alert is fired.
- You can alert on:
- Metric values
- Log search queries
- Health of underlying Azure platform
- More..
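
The alert rule properties listed above (target resource, signal, criteria, severity, action) can be modeled in a few lines of Python. This is a minimal sketch for illustration, not the Azure SDK: `AlertRule`, `evaluate`, the `vm-demo` target and the `Percentage CPU` signal name are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class AlertRule:
    """Illustrative model of an alert rule; not an Azure SDK type."""
    name: str
    description: str
    target_resource: str             # scope of the alert, e.g. a VM (hypothetical name)
    signal: str                      # what the target emits, e.g. a metric name
    logic: Callable[[float], bool]   # user-defined test applied to the signal value
    severity: int                    # 0 to 4
    action: Callable[[str], None]    # what to do when the alert fires

    def __post_init__(self) -> None:
        if not 0 <= self.severity <= 4:
            raise ValueError("severity must be between 0 and 4")

    def evaluate(self, signal_value: float) -> bool:
        """Apply the criteria (signal + logic) and run the action if it fires."""
        fired = self.logic(signal_value)
        if fired:
            self.action(f"[Sev{self.severity}] {self.name} fired on {self.target_resource}")
        return fired


notifications = []
rule = AlertRule(
    name="low-cpu",
    description="CPU usage is below 30%",
    target_resource="vm-demo",
    signal="Percentage CPU",
    logic=lambda cpu: cpu < 30,
    severity=3,
    action=notifications.append,  # stand-in for an action group
)
rule.evaluate(12.5)  # criteria met, the action runs
rule.evaluate(75.0)  # criteria not met, nothing happens
```

The action is passed in as a plain callable so that the rule itself stays independent of how notifications are delivered.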
- State of alerts:
- New: Created or fired
- Acknowledged: Issue is reviewed.
- Closed: Issue has been resolved.
- Can be reopened by changing its state.
- User changes state from New.
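
The alert states above form a small, user-driven state machine. The following sketch is only an illustration of that idea; the `Alert` class and its `change_state` method are hypothetical and do not correspond to an Azure API.

```python
from enum import Enum


class AlertState(Enum):
    NEW = "New"                    # created or fired
    ACKNOWLEDGED = "Acknowledged"  # issue is being reviewed
    CLOSED = "Closed"              # issue has been resolved


class Alert:
    """Illustrative alert whose state is changed by the user."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = AlertState.NEW
        self.history = [AlertState.NEW]

    def change_state(self, new_state: AlertState) -> None:
        # Any state can follow any other, so a closed alert can be reopened.
        if new_state is self.state:
            return
        self.state = new_state
        self.history.append(new_state)


alert = Alert("low-cpu")
alert.change_state(AlertState.ACKNOWLEDGED)
alert.change_state(AlertState.CLOSED)
alert.change_state(AlertState.NEW)  # reopened by changing its state
```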
- Non-compute resources: Resource metrics
- Compute resources: Guest OS (e.g. syslog for Linux, event logs for Windows)
- Azure Monitoring Agents
- Azure Diagnostics Extension (cloud only)
- Windows Server and Linux
- useful for basic resource-level monitoring
- Deployed automatically to VM when you enable it.
- Boot diagnostics (serial console)
- Log Analytics Agent (hybrid solution)
- Can collect logs from Azure & on-prem systems into the same workspace.
- 🤗 Formerly known as diagnostic logs
- Trace event streams
- Programmed in application itself.
- Application Insights
- Instrumentation tool
- HTTP requests
- Dependency Calls (to e.g. SQL, external services, background services)
- Azure infrastructure logs
- E.g.
- Who created VM?
- Who configured this VNet?
- Traffic stream from NSG?
- Types include: • Administrative events • Service health events • Autoscale events • Recommendations • Security alerts • Alerts
- ❗ Stored for 90 days
- Through a diagnostic setting, they can be streamed into
- Storage account
- Log analytics workspace
- Azure marketplace partner
- Event Hub
- Flow logs are handled by NSGs.
- Plot using
- Built-in Azure plotting tool: Network Watcher
- Power BI
- In the portal it can be reached through the "Cost analysis" blade of the desired scope.
- In "Cost analysis" you can filter by "Tag"s.
- Cost Management shows organizational cost and usage patterns with advanced analytics
- Reports show your internal and external costs for usage and Azure Marketplace charges
- You can automate periodic export of your costs
- 💡 You can also see daily usage data in Portal: Azure Account Center → Billing history → Current period → Download usage
- Data is consumed by other Azure resources
- Predictive analytics are also available.
- Collected at one-minute frequency
- Uniquely identified in a namespace.
- 💡 Stored for 93 days
- Collected in Azure metrics database (time series database)
- 💡 Copy to Log Analytics for long term storage
- Holds value properties: Time, Type, Resource, Value, Multiple Dimensions
- Value:
- Health of application: can help to identify the root cause.
- Valuable when combined with other metrics.
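
To make the shape of a metric concrete, here is a minimal sketch of a metric sample with the properties listed above (time, type, resource, value, dimensions) and a filter that mimics the 93-day retention. The `MetricSample` class, the `retained` helper and the sample names are assumptions for illustration only, not part of any Azure library.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=93)  # retention period stated in these notes


@dataclass(frozen=True)
class MetricSample:
    """Illustrative metric record; field names follow the notes, not an SDK."""
    time: datetime
    metric_type: str
    resource: str
    value: float
    dimensions: dict = field(default_factory=dict)


def retained(samples, now):
    """Keep only the samples that are still inside the retention window."""
    cutoff = now - RETENTION
    return [s for s in samples if s.time >= cutoff]


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
samples = [
    MetricSample(now - timedelta(minutes=1), "Percentage CPU", "vm-demo", 42.0, {"core": "0"}),
    MetricSample(now - timedelta(days=100), "Percentage CPU", "vm-demo", 17.0, {"core": "0"}),
]
recent = retained(samples, now)  # only the one-minute-old sample remains
```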
- Sources of metrics:
- Platform metrics
- Each resource provides
- Visibility into health and performance
- Application metrics
- Generated by application insights
- Detect performance issues & track trends
- Custom metrics
- ❗ Must be created in same region as the resource that has the metrics
- Use-cases: • Metrics explorer • Metric Alert Rule • Auto Scale • Route & Stream • Archive • Access
- ITSM
- IT Service Management
- Helps to design, plan, deliver, operate, and control information technology (IT) services
- Azure ITSM Connector
- Bi-directional connection layer between Azure and your ITSM tool(s)
- Use cases:
- Create ITSM work items based on Azure alerts.
- Sync ITSM incident/change request data to Azure.
- SIEM
- Security information and event management
- E.g. Splunk (there's an open source add-on to send to Event Hubs)
- 💡 You could even use Azure Sentinel as a SIEM tool.
- Name: Unique identifier
- Action type
- Voice call or SMS
- ❗ Up to 10 SMS / voice call actions in an action group.
- ❗ No more than 1 SMS / Voice call every 5 minutes.
- Webhook
- ❗ Up to 10 webhook call actions in an action group.
- It retries 2 times: first after 10 seconds, then after 100 seconds.
- Logic App
- ❗ Up to 10 logic app actions in an action group.
- Automation runbook
- ❗ Up to 10 Runbook actions in an action group.
- Azure Function
- ITSM
- ❗ Up to 10 ITSM actions in an action group.
- Email
- ❗ Up to 1000 e-mail actions in an action group.
- ❗ No more than 100 emails in an hour.
- Push notification
- Azure App Push
- ❗ Up to 10 Azure app actions in an action group.
- Details: corresponding phone number, email address, webhooks URI, or ITSM connection details.
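
The per-type limits and the webhook retry schedule above can be expressed as a small validation sketch. Everything here is hypothetical illustration code rather than an Azure API: the action type keys, `validate_action_group` and `call_webhook` are made-up names, the SMS / voice limit is assumed to apply to each type separately, and Azure Function actions are left unlimited only because these notes list no limit for them.

```python
from collections import Counter

# Maximum number of actions per type in one action group, as listed above.
MAX_ACTIONS = {
    "sms": 10,
    "voice": 10,
    "webhook": 10,
    "logic_app": 10,
    "runbook": 10,
    "itsm": 10,
    "email": 1000,
    "azure_app_push": 10,
}
WEBHOOK_RETRY_DELAYS = (10, 100)  # seconds to wait before the 1st and 2nd retry


def validate_action_group(actions):
    """Count actions per type and reject groups that exceed a listed limit."""
    counts = Counter(action_type for action_type, _details in actions)
    for action_type, count in counts.items():
        limit = MAX_ACTIONS.get(action_type)  # None: no limit listed in these notes
        if limit is not None and count > limit:
            raise ValueError(f"{count} {action_type} actions exceed the limit of {limit}")
    return dict(counts)


def call_webhook(post, sleep):
    """Try once, then retry after each delay.

    Returns the number of the attempt that succeeded, or None if all failed.
    """
    if post():
        return 1
    for attempt, delay in enumerate(WEBHOOK_RETRY_DELAYS, start=2):
        sleep(delay)
        if post():
            return attempt
    return None


group = [("email", "ops@example.com"), ("webhook", "https://example.com/hook")]
counts = validate_action_group(group)

# Simulated webhook that fails twice and then succeeds; waits are recorded
# instead of actually sleeping.
responses = iter([False, False, True])
waits = []
attempt = call_webhook(lambda: next(responses), waits.append)
```

Passing `post` and `sleep` in as callables keeps the sketch free of network calls and real delays, which also makes the retry schedule easy to inspect.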
- Two ways to understand your Azure bill and compare usage with invoiced costs:
- Using usage file
- Detailed usage CSV file shows charges & daily usage in billing period
- Download:
- Sign in to the Azure Account Center as the Account Administrator
- Select the subscription for which you want the invoice and usage information
- Select billing history → Download usage
- Using Azure portal
- Subscription → Cost analysis → Filter by Timespan
- See estimated costs on Portal: Subscription → Usage and estimated costs
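
Once the detailed usage CSV is downloaded, daily totals can be computed with the standard library. The sketch below assumes a simplified file with `Date`, `MeterCategory` and `Cost` columns; these column names and the sample rows are made up for illustration, so adjust them to the headers of the real export.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Made-up sample standing in for a downloaded usage file.
SAMPLE_USAGE_CSV = """Date,MeterCategory,Cost
2024-01-01,Virtual Machines,2.50
2024-01-01,Storage,1.25
2024-01-02,Virtual Machines,2.50
"""


def daily_costs(csv_text):
    """Sum the Cost column per Date; Decimal avoids float rounding on money."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Date"]] += Decimal(row["Cost"])
    return dict(totals)


totals = daily_costs(SAMPLE_USAGE_CSV)
for day, cost in sorted(totals.items()):
    print(day, cost)
```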
- Old: OMS, new: Embedded in Azure Monitor as Logs.
- It's a data warehouse for telemetry
- It converts any schema to a table schema that allows you to query.
- Uses KQL (pipe-based) language to query.
- All monitoring roads lead to Azure Log Analytics
- There's always an integration from a logging Azure component to Log Analytics.
- You can download agents in Workspace → Connect
- Agents do not require VPN
- System Center Operations Manager
- Can send data to Log Analytics from cloud/on-prem servers.
- Azure Data Explorer
- Query language is used & viewed
- Alert rule
- Queries run at regular intervals, and the results of each run are evaluated to decide whether to trigger an alert.
- Target
- Specific Azure resource
- Criteria
- Specific logic to trigger an action
- Log Alerts
- Alerts where the signal is a custom query run against Log Analytics
- Action
- Call to send a notification
- Set-up in Log Analytics → Alerts
- Export
- • Excel • PowerBI
- Application Insights data is stored in a separate partition in Log Analytics.
- E.g. requests, traces, usages
- Allows you to run cross-application queries
- Function
- Queries can be saved as functions to be used within another query.
- Requires log analytics workspace
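
The ideas of a pipe-based query and of saving a query as a function for reuse inside another query can be illustrated with a small Python analogy. This is not KQL and not a Log Analytics client: `pipe`, `where`, `summarize_count`, `failed_requests` and the sample `requests` rows are all hypothetical names used only to show how stages compose.

```python
from functools import reduce


def pipe(rows, *stages):
    """Feed rows through each stage in order, like a query pipeline."""
    return reduce(lambda acc, stage: stage(acc), stages, rows)


def where(predicate):
    """Stage that keeps only the rows matching the predicate."""
    return lambda rows: [r for r in rows if predicate(r)]


def summarize_count(by):
    """Stage that counts rows per value of the given column."""
    def stage(rows):
        counts = {}
        for row in rows:
            counts[row[by]] = counts.get(row[by], 0) + 1
        return counts
    return stage


def failed_requests(rows):
    """A 'saved function': a reusable query that other queries can build on."""
    return pipe(rows, where(lambda r: not r["success"]))


# Made-up rows standing in for a "requests" table.
requests = [
    {"app": "shop", "success": True},
    {"app": "shop", "success": False},
    {"app": "blog", "success": False},
    {"app": "shop", "success": False},
]

# A second query reuses the saved function and adds one more stage.
failures_by_app = pipe(failed_requests(requests), summarize_count("app"))
```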
- Baseline
- Configuration management term
- Signifies an agreed-upon description of product attributes, at a point in time, which serves as a basis for defining change.
- 💡 It's not only recommended but mandatory for the team to develop a baseline.
- Gather diagnostics for a long enough time.
- Capture all peaks and valleys over ordinary usage.
- Enable streams and create baseline
- Analyze them and agree on which performance ranges are acceptable in order to define SLAs.
- Helps to isolate problem
- Baselining in Azure
- Continuous monitoring
- Normal operational parameters
- Alerts on deviations
- Take proactive corrective actions
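
The loop above (learn normal operational parameters, then alert on deviations) can be sketched with basic statistics. This is one simple, assumed approach, a band of mean ± k standard deviations, and not how Azure computes baselines; the `Baseline` class, `build_baseline` and the sample CPU values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Baseline:
    """Normal operating range learned from historical metric values."""
    center: float
    spread: float
    k: float = 3.0  # width of the accepted band, in standard deviations

    @property
    def low(self) -> float:
        return self.center - self.k * self.spread

    @property
    def high(self) -> float:
        return self.center + self.k * self.spread

    def is_deviation(self, value: float) -> bool:
        return not (self.low <= value <= self.high)


def build_baseline(history, k=3.0):
    """Derive the baseline from diagnostics gathered over a long enough time."""
    return Baseline(center=mean(history), spread=pstdev(history), k=k)


# Made-up CPU percentages collected during ordinary usage.
history = [40.0, 42.0, 38.0, 41.0, 39.0, 40.0, 43.0, 37.0]
baseline = build_baseline(history)

# New samples from continuous monitoring; values outside the band are flagged.
incoming = [41.0, 44.0, 72.0, 36.0]
deviations = [v for v in incoming if baseline.is_deviation(v)]
```

The multiplier `k` is the knob the team would agree on when deciding which performance ranges are acceptable.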
- Baseline actions
- Enable diagnostics monitoring and telemetry, e.g.:
- Azure IaaS resources
- Azure App Service apps
- Creating performance baselines
- Analyze diagnostics output
- Plot metrics