Commit

v1.3.84
joeyorlando authored Jan 10, 2024
2 parents 2fd150a + 39421f2 commit 2f4cce0
Showing 106 changed files with 1,637 additions and 1,458 deletions.
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## Unreleased

## v1.3.84 (2024-01-10)

### Added

- Add endpoint for alert group escalation snapshot by @Ferril ([#3615](https://github.com/grafana/oncall/pull/3615))

### Changed

- Do not retry `firebase.messaging.UnregisteredError` exceptions for FCM relay tasks by @joeyorlando ([#3637](https://github.com/grafana/oncall/pull/3637))
- Decrease outgoing webhook timeouts from 10secs to 4secs by @joeyorlando ([#3639](https://github.com/grafana/oncall/pull/3639))
- Add stack slug to `/organization` endpoint response by @Ferril ([#3644](https://github.com/grafana/oncall/pull/3644))
- Moved Mobile Connection Tab to separate user profile in Grafana ([#3296](https://github.com/grafana/oncall/pull/3296))

### Fixed

- Address HTTP 500s occurring when receiving messages from Telegram user in a discussion group by @joeyorlando ([#3622](https://github.com/grafana/oncall/pull/3622))
- Fix `module 'apps.schedules.tasks.notify_about_empty_shifts_in_schedule' has no attribute 'apply_async'`
`AttributeError` by @joeyorlando ([#3640](https://github.com/grafana/oncall/pull/3640))

## v1.3.83 (2024-01-08)

### Changed
4 changes: 3 additions & 1 deletion dev/helm-local.yml
@@ -3,6 +3,8 @@ base_url_protocol: http
env:
  - name: GRAFANA_CLOUD_NOTIFICATIONS_ENABLED
    value: "False"
  - name: FEATURE_PROMETHEUS_EXPORTER_ENABLED
    value: "True"
image:
  repository: localhost:63628/oncall/engine
  tag: dev
@@ -134,7 +136,7 @@ service:
  port: 8080
  nodePort: 30001
prometheus:
-  enabled: false
+  enabled: true
  extraScrapeConfigs: |
    - job_name: 'oncall-exporter'
      metrics_path: /metrics/
17 changes: 10 additions & 7 deletions docs/sources/outgoing-webhooks/_index.md
@@ -13,13 +13,13 @@ weight: 500

# Outgoing Webhooks

> ⚠️ A note about **(Legacy)** webhooks: Webhooks that were created before version **v1.3.11** are marked as
> **(Legacy)**. Do not worry! They are still connected to their respective escalation chains and will continue to
> execute as they always have.
> <br/><br/>
> The **(Legacy)** webhook is no longer editable due to changes to the internal representation. If you need to edit it,
> you must use the `Make a copy` action in the menu and make your changes there. This will create the webhook in the
> new format. Be sure to change your escalation chains to point to the new copy; otherwise, it will not be active. The
> **(Legacy)** webhook can then be deleted.
Outgoing webhooks are used by Grafana OnCall to send data to a URL in a flexible way. These webhooks can be
@@ -33,7 +33,7 @@ To create an outgoing webhook navigate to **Outgoing Webhooks** and click **+ Cr
webhooks can be viewed, edited and deleted. To create the outgoing webhook click **New Outgoing Webhook** and then
select a preset based on what you want to do. A simple webhook will POST alert group data as a selectable escalation
step to the specified URL. If you require more customization, use the advanced webhook, which provides all of the
fields described below.

### Outgoing webhook fields

@@ -63,7 +63,7 @@ Controls whether the outgoing webhook will trigger or is ignored.

#### Assign to Team

Sets which team owns the outgoing webhook for filtering and visibility.
This setting does not restrict outgoing webhook execution to events from the selected team.

| Required | [Template Accepted](#outgoing-webhook-templates) | Default Value |
@@ -111,6 +111,9 @@ If no integrations are selected the outgoing webhook will trigger for any integr

The destination URL the outgoing webhook will make a request to. This must be a FQDN.

> ⚠️ **Note:** the destination server must respond within 4 seconds; otherwise the request times out
> (the timeout is shown in the "Response Body" under the "Last Run" section).

| Required | [Template Accepted](#outgoing-webhook-templates) | Default Value |
| :------: | :----------------------------------------------: | :-----------: |
| ✔️ | ✔️ | _Empty_ |
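
Because the sender gives up after 4 seconds, a receiving service that needs to do slow work may want to acknowledge the request first and process it afterwards. The sketch below shows one way to do that; it is not something OnCall prescribes. `handle_webhook`, `work_queue`, and the `202` status are hypothetical choices, and the HTTP layer is left out entirely.

```python
import queue
import threading
import time

WEBHOOK_TIMEOUT_SECONDS = 4  # limit stated in the note above

work_queue: queue.Queue = queue.Queue()
processed: list = []


def handle_webhook(payload: dict) -> int:
    """Hypothetical receiver: enqueue the payload and acknowledge immediately."""
    work_queue.put(payload)
    return 202  # assumption: "accepted" is an acceptable reply for the sender


def worker() -> None:
    """Drain the queue in the background so slow work never delays the reply."""
    while True:
        payload = work_queue.get()
        time.sleep(0.05)  # stand-in for slow processing
        processed.append(payload)
        work_queue.task_done()


threading.Thread(target=worker, daemon=True).start()

start = time.monotonic()
status_code = handle_webhook({"alert_group_id": "EXAMPLE"})
elapsed = time.monotonic() - start
print(status_code, elapsed < WEBHOOK_TIMEOUT_SECONDS)
work_queue.join()  # wait for background processing before exiting
```

The reply time here depends only on the cost of `queue.put`, so it stays well under the limit regardless of how long the background work takes.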
@@ -467,7 +470,7 @@ otherwise it will only display the value. Fields which are not used are not show

### Using trigger template field

The [trigger template field](#trigger-type) can be used to control whether a webhook will execute.
This is useful in situations where many different kinds of alerts are going to the same integration but only some of
them should call the webhook. To accomplish this the trigger template field can contain a template that will process
data from the alert group and evaluate to empty, True or 1 if the webhook should execute, any other values will result
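
The rule stated in this hunk (a trigger template that renders to empty, `True`, or `1` lets the webhook run) can be sketched in a few lines. This is an illustration only: `should_execute` is a hypothetical helper, not OnCall's implementation, and ignoring surrounding whitespace is an assumption.

```python
def should_execute(rendered: str) -> bool:
    """Return True when a rendered trigger template allows the webhook to run."""
    # Assumption: surrounding whitespace is ignored before comparing.
    return rendered.strip() in ("", "True", "1")


for rendered in ("", "True", "1", "False", "0"):
    print(repr(rendered), should_execute(rendered))
```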
52 changes: 50 additions & 2 deletions engine/apps/alerts/tasks/check_escalation_finished.py
@@ -82,13 +82,13 @@ def audit_alert_group_escalation(alert_group: "AlertGroup") -> None:
f"{base_msg}'s escalation snapshot has {num_of_executed_escalation_policy_snapshots} executed escalation policies"
)

-    check_personal_notifications_task.apply_async((alert_group_id,))
+    check_alert_group_personal_notifications_task.apply_async((alert_group_id,))

task_logger.info(f"{base_msg} passed the audit checks")


@shared_task
-def check_personal_notifications_task(alert_group_id) -> None:
+def check_alert_group_personal_notifications_task(alert_group_id) -> None:
# Check personal notifications are completed
# triggered (< 5min ago) == failed + success
from apps.base.models import UserNotificationPolicy, UserNotificationPolicyLogRecord
@@ -115,6 +115,54 @@ def check_personal_notifications_task(alert_group_id) -> None:
task_logger.info(f"{base_msg} personal notifications check passed")


@shared_task
def check_personal_notifications_task() -> None:
    """
    This task checks that triggered personal notifications are completed.
    It will log the triggered/completed values to be used as metrics.
    Attention: don't retry this task, the idea is to be alerted of failures
    """
    from apps.alerts.models import AlertGroup
    from apps.base.models import UserNotificationPolicy, UserNotificationPolicyLogRecord

    # use readonly database if available
    readonly_db = get_random_readonly_database_key_if_present_otherwise_default()

    now = timezone.now()

    # consider alert groups from the last 2 days
    alert_groups = AlertGroup.objects.using(readonly_db).filter(
        started_at__range=(now - timezone.timedelta(days=2), now),
    )

    # review notifications triggered in the last 20-minute window
    # (task should run periodically about every 15 minutes)
    since = now - timezone.timedelta(minutes=20)

    log_records_qs = UserNotificationPolicyLogRecord.objects.using(readonly_db)
    # personal notifications triggered in the given window for those alert groups
    triggered = log_records_qs.filter(
        type=UserNotificationPolicyLogRecord.TYPE_PERSONAL_NOTIFICATION_TRIGGERED,
        notification_step=UserNotificationPolicy.Step.NOTIFY,
        created_at__gte=since,
        created_at__lte=now,
        alert_group__in=alert_groups,
    ).count()

    # personal notifications completed in the given window for those alert groups
    completed = log_records_qs.filter(
        Q(type=UserNotificationPolicyLogRecord.TYPE_PERSONAL_NOTIFICATION_FAILED)
        | Q(type=UserNotificationPolicyLogRecord.TYPE_PERSONAL_NOTIFICATION_SUCCESS),
        notification_step=UserNotificationPolicy.Step.NOTIFY,
        created_at__gt=since,
        created_at__lte=now,
        alert_group__in=alert_groups,
    ).count()

    task_logger.info(f"personal_notifications_triggered={triggered} personal_notifications_completed={completed}")


@shared_task
def check_escalation_finished_task() -> None:
"""
12 changes: 10 additions & 2 deletions engine/apps/alerts/tests/test_check_escalation_finished_task.py
@@ -9,6 +9,7 @@
from apps.alerts.tasks.check_escalation_finished import (
AlertGroupEscalationPolicyExecutionAuditException,
audit_alert_group_escalation,
check_alert_group_personal_notifications_task,
check_escalation_finished_task,
check_personal_notifications_task,
send_alert_group_escalation_auditor_task_heartbeat,
@@ -502,15 +503,22 @@ def test_check_escalation_finished_task_calls_audit_alert_group_personal_notific
alert_group4.personal_log_records.update(created_at=now - timezone.timedelta(minutes=2))

# trigger task
-    with patch("apps.alerts.tasks.check_escalation_finished.check_personal_notifications_task") as mock_check_notif:
+    with patch(
+        "apps.alerts.tasks.check_escalation_finished.check_alert_group_personal_notifications_task"
+    ) as mock_check_notif:
check_escalation_finished_task()

for alert_group in alert_groups:
mock_check_notif.apply_async.assert_any_call((alert_group.id,))
-        check_personal_notifications_task(alert_group.id)
+        check_alert_group_personal_notifications_task(alert_group.id)
if alert_group == alert_group3:
assert f"Alert group {alert_group3.id} has (1) uncompleted personal notifications" in caplog.text
else:
assert f"Alert group {alert_group.id} personal notifications check passed" in caplog.text

mocked_send_alert_group_escalation_auditor_task_heartbeat.assert_called()

# also trigger the general personal notification checker
check_personal_notifications_task()

assert "personal_notifications_triggered=4 personal_notifications_completed=2" in caplog.text
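
The task's docstring says the logged values are meant to be used as metrics, but the commit does not show how the log line is consumed. As a rough sketch under that caveat, a consumer could extract the two counters with a regular expression. `parse_notification_counts` and `LOG_PATTERN` are hypothetical names; only the log format itself comes from the diff.

```python
import re
from typing import Optional, Tuple

# Matches the line emitted by check_personal_notifications_task in this commit.
LOG_PATTERN = re.compile(
    r"personal_notifications_triggered=(?P<triggered>\d+) "
    r"personal_notifications_completed=(?P<completed>\d+)"
)


def parse_notification_counts(line: str) -> Optional[Tuple[int, int]]:
    """Return (triggered, completed) if the line carries both counters, else None."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return int(match["triggered"]), int(match["completed"])


line = "personal_notifications_triggered=4 personal_notifications_completed=2"
counts = parse_notification_counts(line)
if counts is not None:
    triggered, completed = counts
    print(f"uncompleted={triggered - completed}")  # prints "uncompleted=2"
```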
55 changes: 55 additions & 0 deletions engine/apps/api/serializers/alert_group_escalation_snapshot.py
@@ -0,0 +1,55 @@
from rest_framework import serializers

from apps.api.serializers.custom_button import CustomButtonFastSerializer
from apps.api.serializers.escalation_policy import EscalationPolicySerializer
from apps.api.serializers.schedule_base import ScheduleFastSerializer
from apps.api.serializers.user import FastUserSerializer
from apps.api.serializers.user_group import UserGroupSerializer
from apps.api.serializers.webhook import WebhookFastSerializer


class EscalationPolicySnapshotAPISerializer(EscalationPolicySerializer):
    """Serializes AlertGroup escalation policies snapshots for API endpoint"""

    notify_to_users_queue = FastUserSerializer(many=True, read_only=True)
    notify_schedule = ScheduleFastSerializer(read_only=True)
    notify_to_group = UserGroupSerializer(read_only=True)
    custom_button_trigger = CustomButtonFastSerializer(read_only=True)
    custom_webhook = WebhookFastSerializer(read_only=True)

    class Meta(EscalationPolicySerializer.Meta):
        fields = [
            "step",
            "wait_delay",
            "notify_to_users_queue",
            "from_time",
            "to_time",
            "num_alerts_in_window",
            "num_minutes_in_window",
            "slack_integration_required",
            "custom_button_trigger",
            "custom_webhook",
            "notify_schedule",
            "notify_to_group",
            "important",
        ]
        read_only_fields = fields


class AlertGroupEscalationSnapshotAPISerializer(serializers.Serializer):
    """Serializes AlertGroup escalation snapshot for API endpoint"""

    escalation_chain = serializers.SerializerMethodField()
    channel_filter = serializers.SerializerMethodField()
    escalation_policies = EscalationPolicySnapshotAPISerializer(
        source="escalation_policies_snapshots", many=True, read_only=True
    )

    class Meta:
        fields = ["escalation_chain", "channel_filter", "escalation_policies"]

    def get_escalation_chain(self, obj):
        return {"name": obj.escalation_chain_snapshot.name}

    def get_channel_filter(self, obj):
        return {"name": obj.channel_filter_snapshot.str_for_clients}
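
To make the top-level response shape easier to picture, here is a framework-free sketch that mirrors the two method fields and the nested list. It is only an approximation: `snapshot_to_dict` is a hypothetical helper, the sample names and values are invented, and each policy is reduced to its `step` field rather than the full field list the real serializer emits.

```python
from types import SimpleNamespace


def snapshot_to_dict(snapshot) -> dict:
    """Mimic the top-level keys produced by AlertGroupEscalationSnapshotAPISerializer."""
    return {
        "escalation_chain": {"name": snapshot.escalation_chain_snapshot.name},
        "channel_filter": {"name": snapshot.channel_filter_snapshot.str_for_clients},
        # The real serializer nests a per-policy serializer here;
        # only "step" is kept to keep the sketch short.
        "escalation_policies": [
            {"step": policy.step} for policy in snapshot.escalation_policies_snapshots
        ],
    }


# Hypothetical stand-in data; every value below is invented for illustration.
snapshot = SimpleNamespace(
    escalation_chain_snapshot=SimpleNamespace(name="Example chain"),
    channel_filter_snapshot=SimpleNamespace(str_for_clients="default"),
    escalation_policies_snapshots=[SimpleNamespace(step=0), SimpleNamespace(step=1)],
)
print(snapshot_to_dict(snapshot))
```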
8 changes: 8 additions & 0 deletions engine/apps/api/serializers/custom_button.py
@@ -69,3 +69,11 @@ def validate_forward_whole_payload(self, data):
if data is None:
return False
return data


class CustomButtonFastSerializer(serializers.ModelSerializer):
    id = serializers.CharField(read_only=True, source="public_primary_key")

    class Meta:
        model = CustomButton
        fields = ["id", "name"]
2 changes: 2 additions & 0 deletions engine/apps/api/serializers/organization.py
@@ -31,11 +31,13 @@ class Meta:
        fields = [
            "pk",
            "name",
            "stack_slug",
            "slack_team_identity",
            "slack_channel",
            "rbac_enabled",
        ]
        read_only_fields = [
            "stack_slug",
            "slack_team_identity",
            "rbac_enabled",
        ]
8 changes: 8 additions & 0 deletions engine/apps/api/serializers/webhook.py
@@ -220,3 +220,11 @@ def is_field_controlled(self, field_name):
if field_name not in preset.metadata.controlled_fields:
return False
return True


class WebhookFastSerializer(serializers.ModelSerializer):
    id = serializers.CharField(read_only=True, source="public_primary_key")

    class Meta:
        model = Webhook
        fields = ["id", "name"]
35 changes: 35 additions & 0 deletions engine/apps/api/tests/test_alert_group.py
@@ -1423,6 +1423,41 @@ def test_alert_group_detail_permissions(
assert response.status_code == expected_status


@pytest.mark.django_db
@pytest.mark.parametrize(
    "role,expected_status",
    [
        (LegacyAccessControlRole.ADMIN, status.HTTP_200_OK),
        (LegacyAccessControlRole.EDITOR, status.HTTP_200_OK),
        (LegacyAccessControlRole.VIEWER, status.HTTP_200_OK),
        (LegacyAccessControlRole.NONE, status.HTTP_403_FORBIDDEN),
    ],
)
def test_alert_group_escalation_snapshot_permissions(
    alert_group_internal_api_setup,
    make_user_for_organization,
    make_user_auth_headers,
    role,
    expected_status,
):
    _, token, alert_groups = alert_group_internal_api_setup
    _, _, new_alert_group, _ = alert_groups
    organization = new_alert_group.channel.organization
    user = make_user_for_organization(organization, role)

    client = APIClient()
    url = reverse("api-internal:alertgroup-escalation-snapshot", kwargs={"pk": new_alert_group.public_primary_key})

    with patch(
        "apps.api.views.alert_group.AlertGroupView.escalation_snapshot",
        return_value=Response(
            status=status.HTTP_200_OK,
        ),
    ):
        response = client.get(url, format="json", **make_user_auth_headers(user, token))
        assert response.status_code == expected_status


@pytest.mark.django_db
def test_silence(alert_group_internal_api_setup, make_user_auth_headers):
client = APIClient()