v1.3.84

grafana · Jan 10, 2024 · 2f4cce0
2 parents 2fd150a + 39421f2
commit 2f4cce0
Showing 106 changed files with 1,637 additions and 1,458 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,6 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## Unreleased
 
+## v1.3.84 (2024-01-10)
+
+### Added
+
+- Add endpoint for alert group escalation snapshot by @Ferril ([#3615](https://github.com/grafana/oncall/pull/3615))
+
+### Changed
+
+- Do not retry `firebase.messaging.UnregisteredError` exceptions for FCM relay tasks by @joeyorlando ([#3637](https://github.com/grafana/oncall/pull/3637))
+- Decrease outgoing webhook timeouts from 10secs to 4secs by @joeyorlando ([#3639](https://github.com/grafana/oncall/pull/3639))
+- Add stack slug to `/organization` endpoint response by @Ferril ([#3644](https://github.com/grafana/oncall/pull/3644))
+- Moved Mobile Connection Tab to separate user profile in Grafana ([#3296](https://github.com/grafana/oncall/pull/3296))
+
+### Fixed
+
+- Address HTTP 500s occurring when receiving messages from Telegram user in a discussion group by @joeyorlando ([#3622](https://github.com/grafana/oncall/pull/3622))
+- Fix `module 'apps.schedules.tasks.notify_about_empty_shifts_in_schedule' has no attribute 'apply_async'`
+  `AttributeError` by @joeyorlando ([#3640](https://github.com/grafana/oncall/pull/3640))
+
 ## v1.3.83 (2024-01-08)
 
 ### Changed

diff --git a/dev/helm-local.yml b/dev/helm-local.yml
@@ -3,6 +3,8 @@ base_url_protocol: http
 env:
   - name: GRAFANA_CLOUD_NOTIFICATIONS_ENABLED
     value: "False"
+  - name: FEATURE_PROMETHEUS_EXPORTER_ENABLED
+    value: "True"
 image:
   repository: localhost:63628/oncall/engine
   tag: dev
@@ -134,7 +136,7 @@ service:
   port: 8080
   nodePort: 30001
 prometheus:
-  enabled: false
+  enabled: true
   extraScrapeConfigs: |
     - job_name: 'oncall-exporter'
       metrics_path: /metrics/

diff --git a/docs/sources/outgoing-webhooks/_index.md b/docs/sources/outgoing-webhooks/_index.md
@@ -13,13 +13,13 @@ weight: 500
 
 # Outgoing Webhooks
 
-> ⚠️ A note about **(Legacy)** webhooks:  Webhooks that were created before version **v1.3.11** are marked as
-> **(Legacy)**.  Do not worry! They are still connected to their respective escalation chains and will continue to to
+> ⚠️ A note about **(Legacy)** webhooks: Webhooks that were created before version **v1.3.11** are marked as
+> **(Legacy)**. Do not worry! They are still connected to their respective escalation chains and will continue to
 > execute as they always have.
 > <br/><br/>
 > The **(Legacy)** webhook is no longer editable due to changes to the internal representation. If you need to edit it
-> you must use the `Make a copy` action in the menu and make your changes there.  This will create the webhook in the
-> new format.  Be sure to change your escalation chains to point to the new copy otherwise it will not be active. The
+> you must use the `Make a copy` action in the menu and make your changes there. This will create the webhook in the
+> new format. Be sure to change your escalation chains to point to the new copy otherwise it will not be active. The
 > **(Legacy)** webhook can then be deleted.
 
 Outgoing webhooks are used by Grafana OnCall to send data to a URL in a flexible way. These webhooks can be
@@ -33,7 +33,7 @@ To create an outgoing webhook navigate to **Outgoing Webhooks** and click **+ Cr
 webhooks can be viewed, edited and deleted. To create the outgoing webhook click **New Outgoing Webhook** and then
 select a preset based on what you want to do. A simple webhook will POST alert group data as a selectable escalation
 step to the specified url. If you require more customization use the advanced webhook which provides all of the
-fields described below.  
+fields described below.
 
 ### Outgoing webhook fields
 
@@ -63,7 +63,7 @@ Controls whether the outgoing webhook will trigger or is ignored.
 
 #### Assign to Team
 
-Sets which team owns the outgoing webhook for filtering and visibility.  
+Sets which team owns the outgoing webhook for filtering and visibility.
 This setting does not restrict outgoing webhook execution to events from the selected team.
 
 | Required | [Template Accepted](#outgoing-webhook-templates) | Default Value |
@@ -111,6 +111,9 @@ If no integrations are selected the outgoing webhook will trigger for any integr
 
 The destination URL the outgoing webhook will make a request to. This must be a FQDN.
 
+> ⚠️ **Note**: The destination server must respond within 4 seconds, otherwise the request times out
+> (the timeout is shown in the "Response Body" under the "Last Run" section).
+
 | Required | [Template Accepted](#outgoing-webhook-templates) | Default Value |
 | :------: | :----------------------------------------------: | :-----------: |
 |    ✔️    |                        ✔️                        |    _Empty_    |
@@ -467,7 +470,7 @@ otherwise it will only display the value. Fields which are not used are not show
 
 ### Using trigger template field
 
-The [trigger template field](#trigger-type) can be used to provide control over whether a webhook will execute.  
+The [trigger template field](#trigger-type) can be used to provide control over whether a webhook will execute.
 This is useful in situations where many different kinds of alerts are going to the same integration but only some of
 them should call the webhook. To accomplish this the trigger template field can contain a template that will process
 data from the alert group and evaluate to empty, True or 1 if the webhook should execute, any other values will result

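The trigger template rule in the documentation hunk above (the webhook executes when the rendered template is empty, `True`, or `1`) can be sketched as a small predicate. This is an illustration only, not OnCall's implementation: the function name `should_execute_webhook` is hypothetical, and both the whitespace stripping and the exact, case-sensitive matching are assumptions.

```python
def should_execute_webhook(rendered_template: str) -> bool:
    """Return True when the rendered trigger template allows the webhook to run."""
    # Assumption: surrounding whitespace is ignored and matching is case-sensitive.
    return rendered_template.strip() in {"", "True", "1"}


# An empty result, "True", or "1" lets the webhook run; anything else skips it.
for rendered in ["", "True", " 1 ", "False", "0"]:
    print(repr(rendered), should_execute_webhook(rendered))
```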
diff --git a/engine/apps/alerts/tasks/check_escalation_finished.py b/engine/apps/alerts/tasks/check_escalation_finished.py
@@ -82,13 +82,13 @@ def audit_alert_group_escalation(alert_group: "AlertGroup") -> None:
             f"{base_msg}'s escalation snapshot has {num_of_executed_escalation_policy_snapshots} executed escalation policies"
         )
 
-    check_personal_notifications_task.apply_async((alert_group_id,))
+    check_alert_group_personal_notifications_task.apply_async((alert_group_id,))
 
     task_logger.info(f"{base_msg} passed the audit checks")
 
 
 @shared_task
-def check_personal_notifications_task(alert_group_id) -> None:
+def check_alert_group_personal_notifications_task(alert_group_id) -> None:
     # Check personal notifications are completed
     # triggered (< 5min ago) == failed + success
     from apps.base.models import UserNotificationPolicy, UserNotificationPolicyLogRecord
@@ -115,6 +115,54 @@ def check_personal_notifications_task(alert_group_id) -> None:
         task_logger.info(f"{base_msg} personal notifications check passed")
 
 
+@shared_task
+def check_personal_notifications_task() -> None:
+    """
+    This task checks that triggered personal notifications are completed.
+    It will log the triggered/completed values to be used as metrics.
+
+    Attention: don't retry this task, the idea is to be alerted of failures
+    """
+    from apps.alerts.models import AlertGroup
+    from apps.base.models import UserNotificationPolicy, UserNotificationPolicyLogRecord
+
+    # use readonly database if available
+    readonly_db = get_random_readonly_database_key_if_present_otherwise_default()
+
+    now = timezone.now()
+
+    # consider alert groups from the last 2 days
+    alert_groups = AlertGroup.objects.using(readonly_db).filter(
+        started_at__range=(now - timezone.timedelta(days=2), now),
+    )
+
+    # review notifications triggered in the last 20-minute window
+    # (task should run periodically about every 15 minutes)
+    since = now - timezone.timedelta(minutes=20)
+
+    log_records_qs = UserNotificationPolicyLogRecord.objects.using(readonly_db)
+    # personal notifications triggered in the given window for those alert groups
+    triggered = log_records_qs.filter(
+        type=UserNotificationPolicyLogRecord.TYPE_PERSONAL_NOTIFICATION_TRIGGERED,
+        notification_step=UserNotificationPolicy.Step.NOTIFY,
+        created_at__gte=since,
+        created_at__lte=now,
+        alert_group__in=alert_groups,
+    ).count()
+
+    # personal notifications completed in the given window for those alert groups
+    completed = log_records_qs.filter(
+        Q(type=UserNotificationPolicyLogRecord.TYPE_PERSONAL_NOTIFICATION_FAILED)
+        | Q(type=UserNotificationPolicyLogRecord.TYPE_PERSONAL_NOTIFICATION_SUCCESS),
+        notification_step=UserNotificationPolicy.Step.NOTIFY,
+        created_at__gt=since,
+        created_at__lte=now,
+        alert_group__in=alert_groups,
+    ).count()
+
+    task_logger.info(f"personal_notifications_triggered={triggered} personal_notifications_completed={completed}")
+
+
 @shared_task
 def check_escalation_finished_task() -> None:
     """

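The window logic of the new `check_personal_notifications_task` can be sketched without Django as follows. This is a simplified model under stated assumptions: `LogRecord`, `count_notifications`, and the string type constants are hypothetical stand-ins, both counts use an inclusive lower bound (the task itself uses `created_at__gte` for triggered and `created_at__gt` for completed), and the alert group and `notification_step` filters are omitted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical stand-ins for the UserNotificationPolicyLogRecord type constants.
TRIGGERED, FAILED, SUCCESS = "triggered", "failed", "success"


@dataclass(frozen=True)
class LogRecord:
    type: str
    created_at: datetime


def count_notifications(records, now, window=timedelta(minutes=20)):
    """Return (triggered, completed) counts for records created in [now - window, now]."""
    since = now - window
    in_window = [r for r in records if since <= r.created_at <= now]
    triggered = sum(r.type == TRIGGERED for r in in_window)
    # A notification counts as completed whether it failed or succeeded.
    completed = sum(r.type in (FAILED, SUCCESS) for r in in_window)
    return triggered, completed


now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
records = [
    LogRecord(TRIGGERED, now - timedelta(minutes=5)),
    LogRecord(SUCCESS, now - timedelta(minutes=4)),
    LogRecord(TRIGGERED, now - timedelta(minutes=2)),
    LogRecord(TRIGGERED, now - timedelta(minutes=30)),  # outside the window, ignored
]
triggered, completed = count_notifications(records, now)
print(f"personal_notifications_triggered={triggered} personal_notifications_completed={completed}")
```

A persistent gap between the two logged values is the signal of interest, which is presumably why the task's docstring asks that it not be retried.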
diff --git a/engine/apps/alerts/tests/test_check_escalation_finished_task.py b/engine/apps/alerts/tests/test_check_escalation_finished_task.py
@@ -9,6 +9,7 @@
 from apps.alerts.tasks.check_escalation_finished import (
     AlertGroupEscalationPolicyExecutionAuditException,
     audit_alert_group_escalation,
+    check_alert_group_personal_notifications_task,
     check_escalation_finished_task,
     check_personal_notifications_task,
     send_alert_group_escalation_auditor_task_heartbeat,
@@ -502,15 +503,22 @@ def test_check_escalation_finished_task_calls_audit_alert_group_personal_notific
     alert_group4.personal_log_records.update(created_at=now - timezone.timedelta(minutes=2))
 
     # trigger task
-    with patch("apps.alerts.tasks.check_escalation_finished.check_personal_notifications_task") as mock_check_notif:
+    with patch(
+        "apps.alerts.tasks.check_escalation_finished.check_alert_group_personal_notifications_task"
+    ) as mock_check_notif:
         check_escalation_finished_task()
 
     for alert_group in alert_groups:
         mock_check_notif.apply_async.assert_any_call((alert_group.id,))
-        check_personal_notifications_task(alert_group.id)
+        check_alert_group_personal_notifications_task(alert_group.id)
         if alert_group == alert_group3:
             assert f"Alert group {alert_group3.id} has (1) uncompleted personal notifications" in caplog.text
         else:
             assert f"Alert group {alert_group.id} personal notifications check passed" in caplog.text
 
     mocked_send_alert_group_escalation_auditor_task_heartbeat.assert_called()
+
+    # also trigger the general personal notification checker
+    check_personal_notifications_task()
+
+    assert "personal_notifications_triggered=4 personal_notifications_completed=2" in caplog.text
diff --git a/engine/apps/api/serializers/alert_group_escalation_snapshot.py b/engine/apps/api/serializers/alert_group_escalation_snapshot.py
@@ -0,0 +1,55 @@
+from rest_framework import serializers
+
+from apps.api.serializers.custom_button import CustomButtonFastSerializer
+from apps.api.serializers.escalation_policy import EscalationPolicySerializer
+from apps.api.serializers.schedule_base import ScheduleFastSerializer
+from apps.api.serializers.user import FastUserSerializer
+from apps.api.serializers.user_group import UserGroupSerializer
+from apps.api.serializers.webhook import WebhookFastSerializer
+
+
+class EscalationPolicySnapshotAPISerializer(EscalationPolicySerializer):
+    """Serializes AlertGroup escalation policies snapshots for API endpoint"""
+
+    notify_to_users_queue = FastUserSerializer(many=True, read_only=True)
+    notify_schedule = ScheduleFastSerializer(read_only=True)
+    notify_to_group = UserGroupSerializer(read_only=True)
+    custom_button_trigger = CustomButtonFastSerializer(read_only=True)
+    custom_webhook = WebhookFastSerializer(read_only=True)
+
+    class Meta(EscalationPolicySerializer.Meta):
+        fields = [
+            "step",
+            "wait_delay",
+            "notify_to_users_queue",
+            "from_time",
+            "to_time",
+            "num_alerts_in_window",
+            "num_minutes_in_window",
+            "slack_integration_required",
+            "custom_button_trigger",
+            "custom_webhook",
+            "notify_schedule",
+            "notify_to_group",
+            "important",
+        ]
+        read_only_fields = fields
+
+
+class AlertGroupEscalationSnapshotAPISerializer(serializers.Serializer):
+    """Serializes AlertGroup escalation snapshot for API endpoint"""
+
+    escalation_chain = serializers.SerializerMethodField()
+    channel_filter = serializers.SerializerMethodField()
+    escalation_policies = EscalationPolicySnapshotAPISerializer(
+        source="escalation_policies_snapshots", many=True, read_only=True
+    )
+
+    class Meta:
+        fields = ["escalation_chain", "channel_filter", "escalation_policies"]
+
+    def get_escalation_chain(self, obj):
+        return {"name": obj.escalation_chain_snapshot.name}
+
+    def get_channel_filter(self, obj):
+        return {"name": obj.channel_filter_snapshot.str_for_clients}
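The top-level shape this serializer produces can be sketched in plain Python, without Django REST Framework. The dataclasses, the `serialize_snapshot` helper, and all sample values below are hypothetical. Only the three top-level keys and the nested `name` fields come from the serializer above, and only a subset of the policy fields is shown.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PolicySnapshot:  # hypothetical stand-in for an escalation policy snapshot
    step: int
    wait_delay: Optional[str] = None
    important: bool = False


@dataclass
class EscalationSnapshot:  # hypothetical stand-in for the alert group snapshot
    escalation_chain_name: str
    channel_filter_name: str
    escalation_policies: List[PolicySnapshot] = field(default_factory=list)


def serialize_snapshot(snapshot: EscalationSnapshot) -> dict:
    """Build a dict with the same top-level keys as the API serializer."""
    return {
        "escalation_chain": {"name": snapshot.escalation_chain_name},
        "channel_filter": {"name": snapshot.channel_filter_name},
        "escalation_policies": [
            {"step": p.step, "wait_delay": p.wait_delay, "important": p.important}
            for p in snapshot.escalation_policies
        ],
    }


# Illustrative values only.
example = EscalationSnapshot("Default chain", "default", [PolicySnapshot(step=0, wait_delay="300.0")])
payload = serialize_snapshot(example)
print(sorted(payload))
```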
diff --git a/engine/apps/api/serializers/custom_button.py b/engine/apps/api/serializers/custom_button.py
@@ -69,3 +69,11 @@ def validate_forward_whole_payload(self, data):
         if data is None:
             return False
         return data
+
+
+class CustomButtonFastSerializer(serializers.ModelSerializer):
+    id = serializers.CharField(read_only=True, source="public_primary_key")
+
+    class Meta:
+        model = CustomButton
+        fields = ["id", "name"]
diff --git a/engine/apps/api/serializers/organization.py b/engine/apps/api/serializers/organization.py
@@ -31,11 +31,13 @@ class Meta:
         fields = [
             "pk",
             "name",
+            "stack_slug",
             "slack_team_identity",
             "slack_channel",
             "rbac_enabled",
         ]
         read_only_fields = [
+            "stack_slug",
             "slack_team_identity",
             "rbac_enabled",
         ]

diff --git a/engine/apps/api/serializers/webhook.py b/engine/apps/api/serializers/webhook.py
@@ -220,3 +220,11 @@ def is_field_controlled(self, field_name):
             if field_name not in preset.metadata.controlled_fields:
                 return False
         return True
+
+
+class WebhookFastSerializer(serializers.ModelSerializer):
+    id = serializers.CharField(read_only=True, source="public_primary_key")
+
+    class Meta:
+        model = Webhook
+        fields = ["id", "name"]
diff --git a/engine/apps/api/tests/test_alert_group.py b/engine/apps/api/tests/test_alert_group.py
@@ -1423,6 +1423,41 @@ def test_alert_group_detail_permissions(
     assert response.status_code == expected_status
 
 
+@pytest.mark.django_db
+@pytest.mark.parametrize(
+    "role,expected_status",
+    [
+        (LegacyAccessControlRole.ADMIN, status.HTTP_200_OK),
+        (LegacyAccessControlRole.EDITOR, status.HTTP_200_OK),
+        (LegacyAccessControlRole.VIEWER, status.HTTP_200_OK),
+        (LegacyAccessControlRole.NONE, status.HTTP_403_FORBIDDEN),
+    ],
+)
+def test_alert_group_escalation_snapshot_permissions(
+    alert_group_internal_api_setup,
+    make_user_for_organization,
+    make_user_auth_headers,
+    role,
+    expected_status,
+):
+    _, token, alert_groups = alert_group_internal_api_setup
+    _, _, new_alert_group, _ = alert_groups
+    organization = new_alert_group.channel.organization
+    user = make_user_for_organization(organization, role)
+
+    client = APIClient()
+    url = reverse("api-internal:alertgroup-escalation-snapshot", kwargs={"pk": new_alert_group.public_primary_key})
+
+    with patch(
+        "apps.api.views.alert_group.AlertGroupView.escalation_snapshot",
+        return_value=Response(
+            status=status.HTTP_200_OK,
+        ),
+    ):
+        response = client.get(url, format="json", **make_user_auth_headers(user, token))
+    assert response.status_code == expected_status
+
+
 @pytest.mark.django_db
 def test_silence(alert_group_internal_api_setup, make_user_auth_headers):
     client = APIClient()