Resolve some falky tests and improve CI times #2401

AurelienFT · 2024-10-28T15:13:46Z

Linked Issues/PRs

Description

This PR fixes an issue in the P2P heartbeat. The heartbeat was updated only when new blocks were received or produced. As a result, a node started from an existing database that neither produces blocks nor connects to anyone reports block height 0 to the peers that connect to it. We believe this fix resolves #2408, #2407, #2406, and #2351.
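To make the heartbeat problem concrete, here is a minimal sketch of the idea. The `Database` and `Heartbeat` types and their methods are hypothetical stand-ins invented for illustration; they are not the actual fuel-core types or the code changed in this PR.

```rust
/// Hypothetical stand-in for the node's existing database.
struct Database {
    /// Height of the latest block already stored, if any.
    latest_height: Option<u32>,
}

/// Hypothetical, simplified heartbeat state: the block height the node
/// reports to peers that connect to it.
struct Heartbeat {
    reported_height: u32,
}

impl Heartbeat {
    /// The problematic behavior described above: the height starts at 0
    /// and only changes once a block is received or produced.
    fn starting_at_zero() -> Self {
        Self { reported_height: 0 }
    }

    /// The idea behind the fix (an assumption about its shape): seed the
    /// reported height from what the database already contains.
    fn seeded_from(db: &Database) -> Self {
        Self {
            reported_height: db.latest_height.unwrap_or(0),
        }
    }

    /// Update the reported height when a block is received or produced.
    fn on_block(&mut self, height: u32) {
        self.reported_height = height;
    }
}
```

With a database already at height 42, `Heartbeat::starting_at_zero()` would advertise height 0 until the next block arrives, while `Heartbeat::seeded_from(&db)` advertises 42 immediately.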

For #2394, we just increased the timeouts.
For #2393, we removed the panic in the test and let P2P reconnect instead.
For #2395, we now run this test in Tokio's multi-threaded mode, following the convention of all the other tests that launch a node using FuelCoreDriver. We also added an explicit kill of the driver so that the node is shut down more gracefully in all of these tests; this should fix a lot of the flakiness in them.
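The "explicit kill" pattern mentioned for #2395 can be sketched as follows. This is only an illustration using the standard library: the `Driver` type, its `spawn` and `kill` methods, and the thread-based "node" are hypothetical and do not reflect the real FuelCoreDriver API or Tokio.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for a test driver that owns a running node.
struct Driver {
    shutdown: Sender<()>,
    node: JoinHandle<u32>,
}

impl Driver {
    /// Start a background "node" that keeps working until told to stop.
    fn spawn() -> Self {
        let (shutdown, signal) = channel::<()>();
        let node = thread::spawn(move || {
            let mut ticks = 0u32;
            loop {
                match signal.recv_timeout(Duration::from_millis(1)) {
                    // Stop on an explicit signal or if the driver went away.
                    Ok(()) | Err(RecvTimeoutError::Disconnected) => break,
                    // No signal yet: do one more unit of "work".
                    Err(RecvTimeoutError::Timeout) => ticks += 1,
                }
            }
            ticks
        });
        Self { shutdown, node }
    }

    /// Explicit kill: signal the node and wait for it to finish, so its
    /// resources are released before the test returns.
    /// Returns `true` if the node thread exited without panicking.
    fn kill(self) -> bool {
        let _ = self.shutdown.send(());
        self.node.join().is_ok()
    }
}
```

A test written this way ends with `assert!(driver.kill());` rather than letting the driver go out of scope, so the node is known to have stopped before the next test starts.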

This PR also changes the CI workflow by removing all Docker-related jobs and the codecov job. These two sets of jobs have been moved to separate workflows that are not triggered automatically but can be triggered manually from the "Actions" tab of this repository (after this PR is merged).

The tests launched by the CI job now use nextest, which allows us to set a timeout for each test and provides more detailed output. The timeout is currently 5 minutes (8 for two really big tests) because some of our tests take a long time, but we should lower it in the future.
The steps in the matrix are no longer cancelled when one of them fails. This lets the other steps succeed and cache their results, so they don't need to run again when the tests are relaunched.

There is still more to improve in our tests, especially around timeouts and execution speed, but this should improve our workflow a lot.

Checklist

Breaking changes are clearly marked as such in the PR description and changelog
New behavior is reflected in tests
The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

I have reviewed the code myself
I have created follow-up issues caused by this PR and linked them here

xgreenx

CI failed, and the time required for the dry run is enormous. It shouldn't be so slow. We have a problem that we need to investigate; increasing the timeout for it will not help.

AurelienFT · 2024-11-16T22:11:48Z

@xgreenx do we want to block all the improvements provided by this PR because of this test, or can we leave the issue related to this test open and still merge this? Even if there is still a problem with this test, I think we improve a lot of things here.

…xecutor(it uses 1024). Speed up state rewind test by using fewer blocks to re-run. Speed up gas price test by using a lower maximum block height. Speed up all tests by not creating all column families in the RocksDB database. Related issue: facebook/rocksdb#5117. Fixed the deploy large contract e2e test (before, it was not tested). Buffered dry runs in e2e tests to avoid resource exhaustion. Use interval in e2e tests to avoid issues with block broadcasting.

Also run p2p only once.

AurelienFT · 2024-11-17T10:39:32Z

Ok boss 😂😂

## Version v0.41.0

### Added

- [2547](#2547): Replace the old Graphql gas price provider adapter with the ArcGasPriceEstimate.
- [2445](#2445): Added GQL endpoint for querying asset details.
- [2442](#2442): Add uninitialized task for V1 gas price service
- [2154](#2154): Added `Unknown` variant to `ConsensusParameters` graphql queries
- [2154](#2154): Added `Unknown` variant to `Block` graphql queries
- [2154](#2154): Added `TransactionType` type in `fuel-client`
- [2321](#2321): New metrics for the TxPool:
  - The size of transactions in the txpool (`txpool_tx_size`)
  - The time spent by a transaction in the txpool in seconds (`txpool_tx_time_in_txpool_seconds`)
  - The number of transactions in the txpool (`txpool_number_of_transactions`)
  - The number of transactions pending verification before entering the txpool (`txpool_number_of_transactions_pending_verification`)
  - The number of executable transactions in the txpool (`txpool_number_of_executable_transactions`)
  - The time it took to select transactions for inclusion in a block in microseconds (`txpool_select_transactions_time_microseconds`)
  - The time it took to insert a transaction in the txpool in microseconds (`transaction_insertion_time_in_thread_pool_microseconds`)
- [2385](#2385): Added new histogram buckets for some of the TxPool metrics, optimize the way they are collected.
- [2347](#2364): Add activity concept in order to protect against infinitely increasing DA gas price scenarios
- [2362](#2362): Added a new request_response protocol version `/fuel/req_res/0.0.2`. In comparison with `/fuel/req/0.0.1`, which returns an empty response when a request cannot be fulfilled, this version returns more meaningful error codes. Nodes still support the version `0.0.1` of the protocol to guarantee backward compatibility with fuel-core nodes. Empty responses received from nodes using the old protocol `/fuel/req/0.0.1` are automatically converted into an error `ProtocolV1EmptyResponse` with error code 0, which is also the only error code implemented. More specific error codes will be added in the future.
- [2386](#2386): Add a flag to define the maximum number of file descriptors that RocksDB can use. By default it's half of the OS limit.
- [2376](#2376): Add a way to fetch transactions in P2P without specifying a peer.
- [2361](#2361): Add caches to the sync service to not reask for data it already fetched from the network.
- [2327](#2327): Add more services tests and more checks of the pool. Also add an high level documentation for users of the pool and contributors.
- [2416](#2416): Define the `GasPriceServiceV1` task.
- [2447](#2447): Use new `expiration` policy in the transaction pool. Add a mechanism to prune the transactions when they expired.
- [1922](#1922): Added support for posting blocks to the shared sequencer.
- [2033](#2033): Remove `Option<BlockHeight>` in favor of `BlockHeightQuery` where applicable.
- [2490](#2490): Added pagination support for the `balances` GraphQL query, available only when 'balances indexation' is enabled.
- [2439](#2439): Add gas costs for the two new zk opcodes `ecop` and `eadd` and the benches that allow to calibrate them.
- [2472](#2472): Added the `amountU128` field to the `Balance` GraphQL schema, providing the total balance as a `U128`. The existing `amount` field clamps any balance exceeding `U64` to `u64::MAX`.
- [2526](#2526): Add possibility to not have any cache set for RocksDB. Add an option to either load the RocksDB columns families on creation of the database or when the column is used.
- [2532](#2532): Getters for inner rocksdb database handles.
- [2524](#2524): Adds a new lock type which is optimized for certain workloads to the txpool and p2p services.
- [2535](#2535): Expose `backup` and `restore` APIs on the `CombinedDatabase` struct to create portable backups and restore from them.
- [2550](#2550): Add statistics and more limits infos about txpool on the node_info endpoint

### Fixed

- [2560](#2560): Fix flaky test by increasing timeout
- [2558](#2558): Rename `cost` and `reward` to remove `excess` wording
- [2469](#2469): Improved the logic for syncing the gas price database with on_chain database
- [2365](#2365): Fixed the error during dry run in the case of race condition.
- [2366](#2366): The `importer_gas_price_for_block` metric is properly collected.
- [2369](#2369): The `transaction_insertion_time_in_thread_pool_milliseconds` metric is properly collected.
- [2413](#2413): block production immediately errors if unable to lock the mutex.
- [2389](#2389): Fix construction of reverse iterator in RocksDB.
- [2479](#2479): Fix an error on the last iteration of the read and write sequential opcodes on contract storage.
- [2478](#2478): Fix proof created by `message_receipts_proof` function by ignoring the receipts from failed transactions to match `message_outbox_root`.
- [2485](#2485): Hardcode the timestamp of the genesis block and version of `tai64` to avoid breaking changes for us.
- [2511](#2511): Fix backward compatibility of V0Metadata in gas price db.

### Changed

- [2469](#2469): Updated adapter for querying costs from DA Block committer API
- [2469](#2469): Use the gas price from the latest block to estimate future gas prices
- [2501](#2501): Use gas price from block for estimating future gas prices
- [2468](#2468): Abstract unrecorded blocks concept for V1 algorithm, create new storage impl. Introduce `TransactionableStorage` trait to allow atomic changes to the storage.
- [2295](#2295): `CombinedDb::from_config` now respects `state_rewind_policy` with tmp RocksDB.
- [2378](#2378): Use cached hash of the topic instead of calculating it on each publishing gossip message.
- [2438](#2438): Refactored service to use new implementation of `StorageRead::read` that takes an offset in input.
- [2429](#2429): Introduce custom enum for representing result of running service tasks
- [2377](#2377): Add more errors that can be returned as responses when using protocol `/fuel/req_res/0.0.2`. The errors supported are `ProtocolV1EmptyResponse` (status code `0`) for converting empty responses sent via protocol `/fuel/req_res/0.0.1`, `RequestedRangeTooLarge`(status code `1`) if the client requests a range of objects such as sealed block headers or transactions too large, `Timeout` (status code `2`) if the remote peer takes too long to fulfill a request, or `SyncProcessorOutOfCapacity` if the remote peer is fulfilling too many requests concurrently.
- [2233](#2233): Introduce a new column `modification_history_v2` for storing the modification history in the historical rocksDB. Keys in this column are stored in big endian order. Changed the behaviour of the historical rocksDB to write changes for new block heights to the new column, and to perform lookup of values from the `modification_history_v2` table first, and then from the `modification_history` table, performing a migration upon access if necessary.
- [2383](#2383): The `balance` and `balances` GraphQL query handlers now use index to provide the response in a more performant way. As the index is not created retroactively, the client must be initialized with an empty database and synced from the genesis block to utilize it. Otherwise, the legacy way of retrieving data will be used.
- [2463](#2463): The `coinsToSpend` GraphQL query handler now uses index to provide the response in a more performant way. As the index is not created retroactively, the client must be initialized with an empty database and synced from the genesis block to utilize it. Otherwise, the legacy way of retrieving data will be used.
- [2556](#2556): Ensure that the `last_recorded_height` is set for the DA gas price source.

#### Breaking

- [2469](#2469): Move from `GasPriceServicev0` to `GasPriceServiceV1`. Include new config values.
- [2438](#2438): The `fuel-core-client` can only work with new version of the `fuel-core`. The `0.40` and all older versions are not supported.
- [2438](#2438): Updated `fuel-vm` to `0.59.1` release. Check [release notes](https://github.com/FuelLabs/fuel-vm/releases/tag/v0.59.0) for more details.
- [2389](#2258): Updated the `messageProof` GraphQL schema to return a non-nullable `MessageProof`.
- [2154](#2154): Transaction graphql endpoints use `TransactionType` instead of `fuel_tx::Transaction`.
- [2446](#2446): Use graphiql instead of graphql-playground due to known vulnerability and stale development.
- [2379](#2379): Change `kv_store::Value` to be `Arc<[u8]>` instead of `Arc<Vec<u8>>`.
- [2490](#2490): Updated GraphQL complexity calculation for `balances` query to account for pagination (`first`/`last`) and nested field complexity (`child_complexity`). Queries with large pagination values or deeply nested fields may have higher complexity costs.
- [2463](#2463): `CoinsQueryError::MaxCoinsReached` variant has been removed. The `InsufficientCoins` variant has been renamed to `InsufficientCoinsForTheMax` and it now contains the additional `max` field
- [2463](#2463): The number of excluded ids in the `coinsToSpend` GraphQL query is now limited to the maximum number of inputs allowed in transaction.
- [2463](#2463): The `coinsToSpend` GraphQL query may now return different coins, depending whether the indexation is enabled or not. However, regardless of the differences, the returned coins will accurately reflect the current state of the database within the context of the query.
- [2526](#2526): By default the cache of RocksDB is now disabled instead of being `1024 * 1024 * 1024`.

## What's Changed

* Add metrics to TxPool by @acerone85 in #2321
* Fix collection of gas price metric by @rafal-ch in #2366
* Add documentation to run a ignition node in readme by @AurelienFT in #2363
* Fix collection of tx pool insertion time metric by @rafal-ch in #2369
* Add versioning to request response protocols by @acerone85 in #2362
* Return reason of why proof cant be generated by @rafal-ch in #2258
* p2p: use precalculated topic hash by @yaziciahmet in #2378
* Remove ignore RUSTSEC-2024-0336 by @AurelienFT in #2384
* Deal with negative feed back loop in DA gas price by @MitchTurner in #2364
* Add new flag for maximum file descriptors in rocksdb. by @AurelienFT in #2386
* Add codeowners for gas price algorithm crate by @rafal-ch in #2404
* Weekly `cargo update` by @github-actions in #2373
* chore(gas_price_service): initialize v1 metadata by @rymnc in #2288
* chore(gas_price_service_v0): remove unused trait impl by @rymnc in #2410
* Update tai64 to fix the wrong time offset by @AurelienFT in #2409
* fix(block_producer): immediately return error if lock cannot be acquired during production by @rymnc in #2413
* Add a way to fetch transactions in P2P without specifying a peer by @AurelienFT in #2376
* Add a new code owner for tx pool by @AurelienFT in #2417
* Satisfy clippy in `gas-price-analysis` by @rafal-ch in #2418
* Txpool metrics update by @rafal-ch in #2385
* Improve TxPool tests and documentation by @AurelienFT in #2327
* feat(gas_price_service_v1): define RunnableTask for GasPriceServiceV1 by @rymnc in #2416
* Return reason of why proof cant be generated (api change) by @rafal-ch in #2389
* Fuel/Request_Response v0.0.2: More meaningful error messages by @acerone85 in #2377
* Fix reverse iterator in RocksDB by @AurelienFT in #2398
* Add test node herself in reserved nodes. by @AurelienFT in #2390
* Weekly `cargo update` by @github-actions in #2424
* Weekly `cargo update` by @github-actions in #2440
* Resolve some falky tests and improve CI times by @AurelienFT in #2401
* feat: handle `Unknown` transactions, blocks and consensus parameters by @hal3e in #2154
* fix(p2p): cache responses to serve without roundtrip to db by @rymnc in #2352
* Replace task `run()` return result with custom enum by @MitchTurner in #2429
* Fix codeowners by @AurelienFT in #2444
* fix(graphql_playground): use graphiql instead by @rymnc in #2446
* Weekly `cargo update` by @github-actions in #2453
* refactor: remove `Option<BlockHeight>` and use new enum where applicable by @matt-user in #2033
* Fixed the error during dry run by @xgreenx in #2365
* Add decompression traits and a test case by @Dentosal in #2295
* Versioned Storage for Modifications History by @acerone85 in #2233
* Allow DA recorded blocks to come out-of-order by @MitchTurner in #2415
* feat: Change `kv_store::Value` to be Arc<[u8]> instead of Arc<Vec<u8>> by @netrome in #2411
* Optimize balance-related queries with a cache by @rafal-ch in #2383
* fix: Add missing features to `fuel-core-tests` by @netrome in #2467
* Keep data in fails cases in sync service by @AurelienFT in #2361
* Weekly `cargo update` by @github-actions in #2470
* Revert balances amount to `U64` and introduce new `amountU128` getter by @rafal-ch in #2472
* Create uninitialized task for v1 gas price service by @MitchTurner in #2442
* Port the 0.40.2 fix of TAI on master by @AurelienFT in #2485
* Ignore RUSTSEC-2024-0421 by @AurelienFT in #2489
* Ignore receipts from failed transactions in `message_receipts_proof` by @AurelienFT in #2478
* Add unrecorded blocks abstraction to gas price algo by @MitchTurner in #2468
* Fix last iteration in sequential opcode by @AurelienFT in #2479
* fix(gas_price_service_v0): bring back removed fields, causing UB when trying to access by @rymnc in #2511
* Refactor fuel-core to use version of StorageRead::read with offset (Full update to 0.59.1) by @acerone85 in #2438
* Sync the version of the `fuel-core` with minor hot fixes by @xgreenx in #2516
* fix(docs): typo preventing ci checks from passing by @rymnc in #2525
* Integration test for balances and (non)retryable messages by @rafal-ch in #2505
* Add document for launching Ignition node from source and Local network from source by @AurelienFT in #2502
* Make the rocksdb cache optional in config and add policy for column opening by @AurelienFT in #2526
* Weekly `cargo update` by @github-actions in #2530
* chore(rocksdb): getter for inner database handle by @rymnc in #2532
* Use gas prices from actual blocks to calculate estimate gas prices by @MitchTurner in #2501
* chore(codeowners): gas price service codeowners by @rymnc in #2534
* Add zk opcodes by @AurelienFT in #2439
* Gas price simulation data retriever by @acerone85 in #2533
* Shared sequencer integration by @Dentosal in #1922
* Use expiration policy by @AurelienFT in #2447
* Fixed TPS benchmark to work with latest changes by @xgreenx in #2515
* Use indexation cache to satisfy "coins to spend" queries by @rafal-ch in #2463
* feat(txpool|p2p): use seqlock instead of small copy-able RwLocks by @rymnc in #2524
* Create new index for tracking Asset metadata by @maschad in #2445
* feat(rocksdb): remove getters for internal rocksdb handles, expose `backup` instead by @rymnc in #2535
* Integrate with V1 algo for tests by @MitchTurner in #2469
* Lock-free `latest_l2_height` in gas price service by @rafal-ch in #2546
* chore(gas_price_service_v1): strictly ensure last_recorded_height is set, to avoid initial poll of da source by @rymnc in #2556
* Replace old Graphql Gas Price adapter with new latest gas price struct by @MitchTurner in #2547
* Rename cost and rewards without 'excess' by @MitchTurner in #2558
* Add current pool gas to the node info endpoint by @AurelienFT in #2550
* Pagination queries for `balances` endpoint by @rafal-ch in #2490
* 2559 Increase timeout for test by @MitchTurner in #2560
* Add test expiration policy in executor by @AurelienFT in #2563

## New Contributors

* @yaziciahmet made their first contribution in #2378

**Full Changelog**: v0.40.0...v0.41.0

Test adding nextest

17adede

AurelienFT added the "no changelog" label (Skip the CI check of the changelog modification) Oct 28, 2024

AurelienFT self-assigned this Oct 28, 2024

Remove nextest addition

42f5439

AurelienFT changed the title ~~Test adding nextest~~ Try resolve some falky tests and reduce timeout Oct 28, 2024

AurelienFT and others added 20 commits October 28, 2024 16:48

Readd nextest without retried to add timeout

0b63735

Improve robustness backpressure tests

f9bb547

change timing back pressure

e96b25e

Remove timeout in poa test

8ded514

Readd timeout and fix ci

e3ad9c6

Merge branch 'master' into resolve_flaky_tests

3ff4769

Merge branch 'master' into resolve_flaky_tests

95d02ca

try to debug

0e69bfd

fmt, spellcheck

e8137a4

use nocapture

a97ad11

Add base last height to p2p

4dec2aa

allow clippy

fd4bc3b

Clean up branch and split docker production to remove it and only allow manual in future commit

40ada1a

fmt

6bb56a4

Fix flaky gas price test

3f30d90

remove launch of docker builds

d2fce0f

Merge branch 'master' into resolve_flaky_tests

e4fa26c

remove cancel of other jobs when one fails

dc9c226

fix disable of cancel for jobs

efb84ba

Fix test gas price

5311409

AurelienFT changed the title ~~Try resolve some falky tests and reduce timeout~~ Resolve some falky tests and improve CI times Nov 12, 2024

AurelienFT added 4 commits November 12, 2024 12:36

Split codecov out of CI and increase timing before timeout

d376350

Try to fix gas price test

fc8b8f7

Fix all gas price tests

e86c0b4

remove unused tracing

3179ead

AurelienFT and others added 4 commits November 14, 2024 09:11

Merge branch 'resolve_flaky_tests' of github.com:FuelLabs/fuel-core into resolve_flaky_tests

6e6611c

Merge branch 'master' into resolve_flaky_tests

95e126e

Merge branch 'master' into resolve_flaky_tests

10eafea

Merge branch 'master' into resolve_flaky_tests

e81e6ca

xgreenx reviewed Nov 16, 2024

xgreenx added 7 commits November 16, 2024 21:52

Fixed race condition

4e7e8da

Fix tests in CI.

d0fcad9

Also run p2p only once.

Make CI happy and update ci checks.

b3c9672

Increase channel size for da source

b1dbbe7

Make executor tests happy

ab65d20

Make CI happy

e5829d2

xgreenx previously approved these changes Nov 17, 2024

Clean up small things

0c93efe

xgreenx dismissed their stale review via 0c93efe November 17, 2024 04:38

xgreenx approved these changes Nov 17, 2024

AurelienFT requested a review from MitchTurner November 17, 2024 10:39

Merge branch 'master' into resolve_flaky_tests

35f7b8e

rymnc approved these changes Nov 18, 2024

AurelienFT requested a review from xgreenx November 18, 2024 09:12

xgreenx approved these changes Nov 18, 2024

xgreenx merged commit f9b1fe2 into master Nov 18, 2024
31 checks passed

xgreenx deleted the resolve_flaky_tests branch November 18, 2024 12:09

xgreenx mentioned this pull request Nov 18, 2024

Attempt to dodge flakiness in heavy_tasks_doesnt_block_graphql test #2437

Closed

1 task

This was referenced Dec 11, 2024

Flaky test: gas_price::latest_gas_price__if_node_restarts_gets_latest_value #2395

Closed

Flaky startup__can_override_gas_price_values_by_changing_config test #2493

Closed

xgreenx mentioned this pull request Jan 15, 2025

Release v0.41.0 #2565

Merged
