From fc0c06e1772863fedcda9111e491d6773460a467 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jos=C3=A9=20Luis=20Segura=20Lucas?=
Date: Thu, 2 Nov 2023 18:24:40 +0100
Subject: [PATCH 1/2] Remove log checks, rely on process already finished

---
 features/parquet-factory/kafka_messages.feature | 10 ----------
 parquet_factory_tests.sh                        | 10 ++++++++++
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/features/parquet-factory/kafka_messages.feature b/features/parquet-factory/kafka_messages.feature
index 796f244c..405cdcbd 100644
--- a/features/parquet-factory/kafka_messages.feature
+++ b/features/parquet-factory/kafka_messages.feature
@@ -156,16 +156,6 @@ Feature: Ability to process the Kafka messages correctly
     And I set the environment variable "PARQUET_FACTORY__KAFKA_FEATURES__MAX_CONSUMED_RECORDS" to "1"
     And I run Parquet Factory with a timeout of "10" seconds
     Then Parquet Factory should have finish
-    And The logs should contain
-      | topic                   | partition | offset | message           |
-      | incoming_features_topic | 0         | 0      | message processed |
-      | incoming_features_topic | 1         | 0      | message processed |
-      | incoming_rules_topic    | 0         | 0      | message processed |
-      | incoming_rules_topic    | 1         | 0      | message processed |
-      | incoming_features_topic | 0         | 1      | FINISH            |
-      | incoming_features_topic | 1         | 1      | FINISH            |
-      | incoming_rules_topic    | 0         | 1      | FINISH            |
-      | incoming_rules_topic    | 0         | 1      | FINISH            |
     Then The S3 bucket is not empty
 
   Scenario: Parquet Factory should not commit the messages from current hour if there are no prior messages
diff --git a/parquet_factory_tests.sh b/parquet_factory_tests.sh
index eb26f57e..47d27988 100755
--- a/parquet_factory_tests.sh
+++ b/parquet_factory_tests.sh
@@ -75,6 +75,16 @@ function code_coverage_report() {
 EOF
 }
 
+function add_exit_trap {
+    local to_add=$1
+    if [[ -z "$exit_trap_command" ]]
+    then
+        exit_trap_command="$to_add"
+    else
+        exit_trap_command="$exit_trap_command; $to_add"
+    fi
+}
+
 flag=${1:-""}
 
 if [[ "${flag}" = "coverage" ]]

From e715d39cd643451ec7dcc7e4a8560614ec686c03 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jos=C3=A9=20Luis=20Segura=20Lucas?=
Date: Thu, 2 Nov 2023 18:29:02 +0100
Subject: [PATCH 2/2] Update scenarios

---
 docs/scenarios_list.md     | 36 ++++++++++++++++++++++++++++++++++++
 tools/gen_scenario_list.py |  1 +
 2 files changed, 37 insertions(+)

diff --git a/docs/scenarios_list.md b/docs/scenarios_list.md
index 369a2d63..3e8fcf36 100644
--- a/docs/scenarios_list.md
+++ b/docs/scenarios_list.md
@@ -1004,3 +1004,39 @@ nav_order: 3
 * Check if CCX Upgrade Risk Data Engineering Service application is available
 * Check if CCX Upgrade Risk Data Engineering Service can be run
 
+## [`parquet-factory/indexes.feature`](https://github.com/RedHatInsights/insights-behavioral-spec/blob/main/features/parquet-factory/indexes.feature)
+
+* If Parquet file already exists, the index of the new one should be 1
+
+## [`parquet-factory/kafka_messages.feature`](https://github.com/RedHatInsights/insights-behavioral-spec/blob/main/features/parquet-factory/kafka_messages.feature)
+
+* Parquet Factory should fail if it cannot read from Kafka
+* Parquet Factory shouldn't finish if only messages from the previous hour arrived
+* Parquet Factory shouldn't finish if not all the topics and partitions are filled with current hour messages
+* Parquet Factory should finish if all the topics and partitions are filled with current hour messages
+* After aggregating messages from previous hour, the first messages from current hour has to be processed first
+* Parquet Factory should finish if the limit of kafka messages is exceeded even if no messages from current hour arrived
+* Parquet Factory should not commit the messages from current hour if there are no prior messages
+* Parquet Factory shouldn't send duplicate rows
+
+## [`parquet-factory/metrics.feature`](https://github.com/RedHatInsights/insights-behavioral-spec/blob/main/features/parquet-factory/metrics.feature)
+
+* If the Pushgateway is not accessible, Parquet Factory should run successfully
+* If the Pushgateway is accessible, Parquet Factory should run successfully and send the metrics to the Pushgateway
+* If the Pushgateway is accessible and I run Parquet Factory with messages from the previous hour, the "files_generated" and "inserted_rows" metrics should be 1 for all the tables
+* If the Pushgateway is accessible and Parquet Factory errors, the "error_count" metric should increase
+
+## [`parquet-factory/parquet_files.feature`](https://github.com/RedHatInsights/insights-behavioral-spec/blob/main/features/parquet-factory/parquet_files.feature)
+
+* Table generation: cluster_info
+* Table generation: available_updates
+* Table generation: conditional_update_conditions
+* Table generation: conditional_update_risks
+* Table generation: cluster_thanos_info
+
+## [`parquet-factory/s3.feature`](https://github.com/RedHatInsights/insights-behavioral-spec/blob/main/features/parquet-factory/s3.feature)
+
+* Parquet Factory should fail if it cannot connect with S3. When I rerun it, it should re-process the messages from the beginning
+* Parquet Factory should fail if it cannot find the bucket
+* Parquet Factory shouldn't fail if it cannot find the folder/prefix where the files are stored
+
diff --git a/tools/gen_scenario_list.py b/tools/gen_scenario_list.py
index f2bc73dd..98a75d61 100644
--- a/tools/gen_scenario_list.py
+++ b/tools/gen_scenario_list.py
@@ -50,6 +50,7 @@
     "ccx-notification-writer",
     "ccx-upgrades-inference",
     "ccx-upgrades-data-eng",
+    "parquet-factory",
 )
 
 # generate page header