Use `configure-aws-credentials` workflow instead of passing `secret_access_key` #1361

vudiep411 · 2024-11-26T22:39:25Z

Summary

This PR fixes #1346 by replacing the long-term AWS credentials with OpenID Connect (OIDC). OIDC allows GitHub Actions workflows to access resources in Amazon Web Services (AWS) without storing AWS credentials as long-lived GitHub secrets.

Changes

We can remove these secrets that were passed in previously:

token: ${{ secrets.GITHUB_TOKEN }}
bucket: ${{ secrets.AWS_S3_BUCKET }}
access_key_id: ${{ secrets.AWS_S3_ACCESS_KEY_ID }}
secret_access_key: ${{ secrets.AWS_S3_ACCESS_KEY }}

Instead, we only need the `role-to-assume` ARN. For more information, see the OIDC documentation.
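
As a rough illustration of what the OIDC-based setup can look like, here is a minimal workflow sketch. It is not the workflow from this PR. The workflow name, the job name, the `AWS_ROLE_TO_ASSUME` secret name, the action versions, and the `us-east-1` region are all assumptions made for the example. Only `role-to-assume`, the `configure-aws-credentials` action, and the branch and tag triggers (taken from the trust policy in this description) come from the PR.

```yaml
# Hypothetical sketch: assumed names are marked below and are not from this PR.
name: oidc-credentials-example        # assumed workflow name

on:
  push:
    branches: [unstable]              # matches the refs allowed by the trust policy
    tags: ['*']

permissions:
  id-token: write                     # required so the job can request an OIDC token
  contents: read

jobs:
  verify-aws-access:                  # assumed job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}   # assumed secret name holding the role ARN
          aws-region: us-east-1                               # assumed region

      - name: Confirm the assumed identity
        run: aws sts get-caller-identity
```

The `id-token: write` permission is what allows the job to obtain the short-lived token that AWS exchanges for temporary credentials. Without it, the credentials step cannot assume the role.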

Prerequisites

Before merging this PR, we need to set up the proper identity provider on the production AWS account. Follow this guide.
Quick guide:

Create an identity provider with OpenID Connect.
Provider URL: https://token.actions.githubusercontent.com

Audience: sts.amazonaws.com

Assign a role to this new provider with a trust policy like the one below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::<aws_account_id>:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringLike": {
                    "token.actions.githubusercontent.com:sub": [
                        "repo:valkey-io/valkey:ref:refs/heads/unstable",
                        "repo:valkey-io/valkey:ref:refs/tags/*"
                    ],
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}

In GitHub secrets, add these secrets:

Results

GitHub Actions run:

Files in the S3 dev environment:

Note: the failed workflow is tracked in issue #1343 and will be addressed in a separate PR.

Build release workflow

…#1340) Signed-off-by: Binbin <[email protected]> Signed-off-by: vudiep411 <[email protected]>

When a node's role changes, we should broadcast the change to notify other nodes. For example, with one primary and one replica, after a failover the replica becomes the new primary and the primary becomes the new replica. If we then trigger a second cluster failover on the new replica, the new replica sends an MFSTART to its primary, i.e., the new primary. But the new primary may reject the MFSTART due to this logic:

```
} else if (type == CLUSTERMSG_TYPE_MFSTART) {
    if (!sender || sender->replicaof != myself) return 1;
```

In the new primary's view, sender is still a primary and sender->replicaof is NULL, so we return. The manual failover then times out.

Another possibility is that other primaries refuse to vote after receiving the FAILOVER_AUTH_REQUEST, since in their view sender is still a primary. The manual failover then times out as well.

```
void clusterSendFailoverAuthIfNeeded(clusterNode *node, clusterMsg *request) {
    ...
    if (clusterNodeIsPrimary(node)) {
        serverLog(LL_WARNING, "Failover auth denied to...
```

The reason is that we currently update the node->replicaof information only when we receive a PING/PONG from the sender. For details, see clusterProcessPacket. Therefore, in some scenarios, such as clusters with many nodes and a large cluster-ping-interval (that is, cluster-node-timeout), the node's role change is propagated with a long delay.

Added a DEBUG DISABLE-CLUSTER-RANDOM-PING command to disable the cluster ping that is sent to a random node every second (see clusterCron).

Signed-off-by: Binbin <[email protected]>
Signed-off-by: vudiep411 <[email protected]>

…alkey-io#1249) These commands are all administrator commands. If they are operated incorrectly, serious consequences may occur. Print the full client info by using catClientInfoString; the info is useful when we want to identify the source of a request.

Since the original client info is very large and might complicate the output, we added a catClientInfoShortString function that prints only the basic fields useful for identifying the client. These fields are:

- id
- addr
- laddr
- connection info
- name
- user
- lib-name
- lib-ver

We also used it to replace the original client info where it serves the same purpose. Some logging is changed from full client info to short client info:

- CLUSTER FAILOVER
- FAILOVER / PSYNC
- REPLICAOF NO ONE
- SHUTDOWN

Signed-off-by: Binbin <[email protected]>
Signed-off-by: vudiep411 <[email protected]>

Signed-off-by: vudiep411 <[email protected]>

valkey-io#1334) We have a replicationEmptyDbCallback, a callback used by emptyData while flushing away old data. Previously, this callback logic was not applied to functions. Since there may be a lot of functions in case of abuse, and to keep the code consistent, we add the same callback logic for functions.

Changes around this commit:

1. Extend emptyData / functionsLibCtxClear to support passing a callback when flushing functions.
2. Added disklessLoad helper functions to create and discard the temp functions (just like disklessLoadInitTempDb and disklessLoadDiscardTempDb); we will always flush the temp functions asynchronously to avoid any blocking.
3. Cleanup around discardTempDb: remove the callback pointer, since in the async path we don't need the callback.
4. Remove the functionsLibCtxClear call in readSyncBulkPayload, because we called emptyData in the previous lines, which also empties functions.

We use this callback in replication because, during the flush, the replica may block for a while if the flush is done synchronously. To prevent the primary from detecting the replica as timed out, the replica uses this callback to notify the primary (we also use this callback when loading an RDB). In the async path, we empty the data in the bio and there is no slow operation, so the callback is ignored.

Signed-off-by: Binbin <[email protected]>
Signed-off-by: vudiep411 <[email protected]>

Fast_float is a C++ header-only library to parse doubles using SIMD instructions. The purpose is to speed up sorted sets and other commands that use doubles. A single-file copy of fast_float is included in this repo. This introduces an optional dependency on a C++ compiler. The use of fast_float is enabled at compile time using the make variable `USE_FAST_FLOAT=yes`. It is disabled by default. Fixes valkey-io#1069. --------- Signed-off-by: Parth Patel <[email protected]> Signed-off-by: Parth <[email protected]> Signed-off-by: Madelyn Olson <[email protected]> Signed-off-by: Viktor Söderqvist <[email protected]> Co-authored-by: Roshan Swain <[email protected]> Co-authored-by: Madelyn Olson <[email protected]> Co-authored-by: Viktor Söderqvist <[email protected]> Signed-off-by: vudiep411 <[email protected]>

…io#1342) Optimize sdscatrepr by reducing realloc calls, furthermore, we can reduce memcpy calls by batch processing of consecutive printable characters. Signed-off-by: Ray Cao <[email protected]> Co-authored-by: Ray Cao <[email protected]> Signed-off-by: vudiep411 <[email protected]>

…g modified (valkey-io#1347) Apparently on Mac, sleep will modify errno to ETIMEDOUT, and then it prints the misleading message: Operation timed out. Signed-off-by: Binbin <[email protected]> Signed-off-by: vudiep411 <[email protected]>

The code was OK before 2de544c, but now we set server.repl_transfer_fd right after dfd is initialized, so we get a double-close error here since dfd and server.repl_transfer_fd are the same fd. Also move the declarations of dfd/maxtries to a smaller scope to avoid confusion, since they are only used in this code. Signed-off-by: Binbin <[email protected]> Signed-off-by: vudiep411 <[email protected]>

This PR introduces a consistent tagging system for dual-channel logs. The goal is to improve log readability and filterability, making it easier for operators to manage and analyze log entries. Resolves valkey-io#986 --------- Signed-off-by: naglera <[email protected]> Signed-off-by: vudiep411 <[email protected]>

Signed-off-by: vudiep411 <[email protected]>

vudiep411 and others added 5 commits November 24, 2024 14:02

Improved workflow code via AWS OIDC

6928770

test

d484f5c

improve workflow pattern, bumb aws-cred-conf ver

eda6877

Merge branch 'valkey-io:unstable' into build-release-workflow

2db5c14

Merge pull request #1 from vudiep411/build-release-workflow

295a03b

Build release workflow

vudiep411 force-pushed the aws-creds-workflow branch from 295a03b to 02adab3 Compare November 26, 2024 22:42

enjoy-binbin and others added 13 commits November 26, 2024 15:03

Add cmake-build-debug and cmake-build-release to gitignore (valkey-io…

174c376

…#1340) Signed-off-by: Binbin <[email protected]> Signed-off-by: vudiep411 <[email protected]>

CMake fixes + README update (valkey-io#1276)

2a5b4fc

Signed-off-by: vudiep411 <[email protected]>

Improved workflow code via AWS OIDC

51b48ee

Signed-off-by: vudiep411 <[email protected]>

test

d76d81a

Signed-off-by: vudiep411 <[email protected]>

improve workflow pattern, bumb aws-cred-conf ver

6914dac

Signed-off-by: vudiep411 <[email protected]>

vudiep411 force-pushed the aws-creds-workflow branch from 02adab3 to 6914dac Compare November 26, 2024 23:03

Merge branch 'unstable' into aws-creds-workflow

e2a8b4f

vudiep411 closed this Nov 26, 2024

vudiep411 deleted the aws-creds-workflow branch November 26, 2024 23:08
