Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add integration tests for NAB #73

Conversation

mateusoliveira43
Copy link
Contributor

@mateusoliveira43 mateusoliveira43 commented Aug 28, 2024

Why the changes were made

Add Integration tests for NAB, so we can trust our code base more

Fixes #61

Also

  • removed duplication in logs
  • removed unnecessary validations
  • made each reconcile only execute one update call per object, to avoid race conditions
  • removed terminal error from wrong places

Can help on refactor of #62

How to test the changes made

To run integration tests (and also unit tests), run make simulation-test

Check Test logs to see what is being tested

To check code coverage, run go tool cover -html=cover.out

Also check #73 (comment)

Copy link

openshift-ci bot commented Aug 28, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@mateusoliveira43 mateusoliveira43 force-pushed the fix/add-integration-tests-nab branch from b8aa9c2 to 8082cb3 Compare August 28, 2024 20:57
@mateusoliveira43 mateusoliveira43 force-pushed the fix/add-integration-tests-nab branch from ab565eb to 698144a Compare September 3, 2024 12:47
go.mod Outdated Show resolved Hide resolved
@mateusoliveira43 mateusoliveira43 marked this pull request as ready for review September 6, 2024 15:52
Copy link
Contributor Author

@mateusoliveira43 mateusoliveira43 Sep 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest running these test against master branch to see the current problems

To do so, in master branch run

git restore -s fix/add-integration-tests-nab hack/extra-crds/velero.io_backups.yaml
git restore -s fix/add-integration-tests-nab internal/controller/suite_test.go
git restore -s fix/add-integration-tests-nab internal/controller/nonadminbackup_controller_test.go
git restore -s fix/add-integration-tests-nab cmd/main.go
git restore -s fix/add-integration-tests-nab Makefile 
git apply test-master.diff
make simulation-test

where test-master.diff is

diff --git a/internal/common/function/function_test.go b/internal/common/function/function_test.go
index dcec107..840c4ce 100644
--- a/internal/common/function/function_test.go
+++ b/internal/common/function/function_test.go
@@ -22,6 +22,7 @@ import (
 	"reflect"
 	"testing"
 
+	"github.com/onsi/ginkgo/v2"
 	"github.com/stretchr/testify/assert"
 	velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -35,6 +36,8 @@ import (
 	"github.com/migtools/oadp-non-admin/internal/common/constant"
 )
 
+var _ = ginkgo.Describe("PLACEHOLDER", func() {})
+
 func TestMergeMaps(t *testing.T) {
 	const (
 		d     = "d"
diff --git a/internal/controller/nonadminbackup_controller.go b/internal/controller/nonadminbackup_controller.go
index 85456c2..d05590b 100644
--- a/internal/controller/nonadminbackup_controller.go
+++ b/internal/controller/nonadminbackup_controller.go
@@ -20,7 +20,6 @@ package controller
 import (
 	"context"
 	"errors"
-	"time"
 
 	"github.com/go-logr/logr"
 	velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
@@ -43,8 +42,9 @@ import (
 // NonAdminBackupReconciler reconciles a NonAdminBackup object
 type NonAdminBackupReconciler struct {
 	client.Client
-	Scheme  *runtime.Scheme
-	Context context.Context
+	Scheme        *runtime.Scheme
+	Context       context.Context
+	OADPNamespace string
 }
 
 const (
@@ -89,7 +89,7 @@ func (r *NonAdminBackupReconciler) Reconcile(ctx context.Context, req ctrl.Reque
 
 	reconcileExit, reconcileRequeue, reconcileErr := r.InitNonAdminBackup(ctx, rLog, &nab)
 	if reconcileRequeue {
-		return ctrl.Result{Requeue: true, RequeueAfter: requeueTimeSeconds * time.Second}, reconcileErr
+		return ctrl.Result{Requeue: true}, reconcileErr
 	} else if reconcileExit && reconcileErr != nil {
 		return ctrl.Result{}, reconcile.TerminalError(reconcileErr)
 	} else if reconcileExit {
@@ -98,7 +98,7 @@ func (r *NonAdminBackupReconciler) Reconcile(ctx context.Context, req ctrl.Reque
 
 	reconcileExit, reconcileRequeue, reconcileErr = r.ValidateVeleroBackupSpec(ctx, rLog, &nab)
 	if reconcileRequeue {
-		return ctrl.Result{Requeue: true, RequeueAfter: requeueTimeSeconds * time.Second}, reconcileErr
+		return ctrl.Result{Requeue: true}, reconcileErr
 	} else if reconcileExit && reconcileErr != nil {
 		return ctrl.Result{}, reconcile.TerminalError(reconcileErr)
 	} else if reconcileExit {
@@ -107,7 +107,7 @@ func (r *NonAdminBackupReconciler) Reconcile(ctx context.Context, req ctrl.Reque
 
 	reconcileExit, reconcileRequeue, reconcileErr = r.CreateVeleroBackupSpec(ctx, rLog, &nab)
 	if reconcileRequeue {
-		return ctrl.Result{Requeue: true, RequeueAfter: requeueTimeSeconds * time.Second}, reconcileErr
+		return ctrl.Result{Requeue: true}, reconcileErr
 	} else if reconcileExit && reconcileErr != nil {
 		return ctrl.Result{}, reconcile.TerminalError(reconcileErr)
 	} else if reconcileExit {
@@ -235,7 +235,7 @@ func (r *NonAdminBackupReconciler) CreateVeleroBackupSpec(ctx context.Context, l
 	}
 
 	veleroBackup := velerov1api.Backup{}
-	err := r.Get(ctx, client.ObjectKey{Namespace: constant.OadpNamespace, Name: veleroBackupName}, &veleroBackup)
+	err := r.Get(ctx, client.ObjectKey{Namespace: r.OADPNamespace, Name: veleroBackupName}, &veleroBackup)
 
 	if err != nil && apierrors.IsNotFound(err) {
 		// Create VeleroBackup
@@ -255,7 +255,7 @@ func (r *NonAdminBackupReconciler) CreateVeleroBackupSpec(ctx context.Context, l
 		veleroBackup = velerov1api.Backup{
 			ObjectMeta: metav1.ObjectMeta{
 				Name:      veleroBackupName,
-				Namespace: constant.OadpNamespace,
+				Namespace: r.OADPNamespace,
 			},
 			Spec: *backupSpec,
 		}
@@ -325,7 +325,7 @@ func (r *NonAdminBackupReconciler) SetupWithManager(mgr ctrl.Manager) error {
 		WithEventFilter(predicate.CompositePredicate{
 			NonAdminBackupPredicate: predicate.NonAdminBackupPredicate{},
 			VeleroBackupPredicate: predicate.VeleroBackupPredicate{
-				OadpVeleroNamespace: constant.OadpNamespace,
+				OadpVeleroNamespace: r.OADPNamespace,
 			},
 			Context: r.Context,
 		}).

internal/common/function/function.go Show resolved Hide resolved
internal/controller/nonadminbackup_controller.go Outdated Show resolved Hide resolved
// It then checks if a Velero Backup object with that name already exists. If it does not exist, it creates a new one.
// The function generates the name for the Velero Backup object based on the provided namespace and name.
// It then checks if a Velero Backup object with that name already exists. If it does not exist, it creates a new one
// and updates NonAdminBackup Status. Otherwise, updates NonAdminBackup VeleroBackup Spec and Status based on Velero Backup object Spec and Status.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not update the NAB Velerospec from Velero backup object spec here, its fine if Velero adds some default values, its not the responsibility of the controller to update spec, specs are user defined or by external entity, controller's job is to get the object to the desired state and provide updates on status in doing so. The spec update to NAB object will also generate an event to trigger reconcile which we don't want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discuss in meeting

// The function returns boolean values indicating whether the reconciliation loop should exit or requeue
func (r *NonAdminBackupReconciler) CreateVeleroBackupSpec(ctx context.Context, logrLogger logr.Logger, nab *nacv1alpha1.NonAdminBackup) (exitReconcile bool, requeueReconcile bool, errorReconcile error) {
logger := logrLogger.WithValues("CreateVeleroBackupSpec", nab.Namespace)
func (r *NonAdminBackupReconciler) UpdateSpecStatus(ctx context.Context, logrLogger logr.Logger, nab *nacv1alpha1.NonAdminBackup) (exitReconcile bool, requeueReconcile bool, errorReconcile error) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets changes the function name to some thing like ProcessNABSpec or something similar

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original function name was CreateVeleroBackupSpec which I think was correct. It reflected we are creating it not updating nor Processing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It did update NAB when VeleroBackup updates

} else {
// We should not update already created VeleroBackup object.
// The VeleroBackup within NonAdminBackup will
// be reverted back to the previous state - the state which created VeleroBackup
// in a first place, so they will be in sync.
logger.Info("Backup already exists, updating NonAdminBackup status", nameField, veleroBackupName)
updatedNab, errBackupUpdate := function.UpdateNonAdminBackupFromVeleroBackup(ctx, r.Client, logger, nab, &veleroBackup)
// Regardless if the status was updated or not, we should not
// requeue here as it was only status update.
if errBackupUpdate != nil {
return true, false, errBackupUpdate
} else if updatedNab {
logger.V(1).Info("NonAdminBackup CR - Rqueue after Status Update")
return false, true, nil
}
return true, false, nil
}

we should break this function into smaller ones, it does too many things. But I tried to changed as little as possible in this PR

I suggest follow up with break up of function. Do you agree?

internal/predicate/velerobackup_predicate.go Outdated Show resolved Hide resolved
@shubham-pampattiwar
Copy link
Member

OK at the high level this is starting to look like:

  • We have a main Reconcile function for NAB controller
  • This controller has a bunch of tasks: Init, ValidateSpec and so on...
  • Now each task returns exitReconcile bool, requeueReconcile bool, errorReconcile error
    IMO as I had said before too we could structure this similar to the OADP DPA reconciler
    It wont be the exact same but refactoring would be similar.
    In our NAB controller case, in addition to what the controller tasks currently return exitReconcile bool, requeueReconcile bool, errorReconcile error, we could add one more struct called NABUpdate This struct would consist of the phase and condition that needs to be updated for NAB object in that particular reconcile loop and the status update call could be executed at the end of the reconcile loop in a consolidated manner.
    All the above changes would simplify the code workflow a lot, would be easier to understand/debug and make the code maintainable as well.

internal/controller/nonadminbackup_controller_test.go Outdated Show resolved Hide resolved
gomega.Expect(deleteTestNamespaces(ctx, nonAdminNamespaceName, oadpNamespaceName)).To(gomega.Succeed())
})

ginkgo.DescribeTable("Reconcile called by NonAdminBackup Delete event",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets change this test describe, we can say some thing like "NAB object does not exist case"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not better to keep as it is for consistency?

the other test describe is Reconcile triggered by NonAdminBackup Create/Update events and by Requeue

internal/controller/nonadminbackup_controller_test.go Outdated Show resolved Hide resolved
internal/controller/nonadminbackup_controller_test.go Outdated Show resolved Hide resolved
internal/controller/nonadminbackup_controller_test.go Outdated Show resolved Hide resolved
},
result: reconcile.Result{Requeue: true},
}),
ginkgo.Entry("When called by Requeue(update NonAdminBackup Condition to Queued True), should update NonAdminBackup VeleroBackupStatus and Requeue", nonAdminBackupSingleReconcileScenario{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is failing when executed independently/focused. (For Velero Backup name counter seems incorrect, expects a 6th reconcile instead of first) I am assuming these tests should always be executed together ?
image
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I fixed the problem in next commit, please test when possible

internal/controller/nonadminbackup_controller_test.go Outdated Show resolved Hide resolved
},
result: reconcile.Result{Requeue: true},
}),
ginkgo.Entry("When called by Requeue(update NonAdminBackup VeleroBackupStatus), should update NonAdminBackup spec BackupSpec and Requeue", nonAdminBackupSingleReconcileScenario{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets hold off on adding this test as we are gonna remove this task from NAB reconcile loop.

internal/controller/nonadminbackup_controller_test.go Outdated Show resolved Hide resolved
gomega.Expect(checkTestNonAdminBackupStatus(nonAdminBackup, scenario.status)).To(gomega.Succeed())
if scenario.priorStatus != nil {
if len(scenario.priorStatus.VeleroBackupName) > 0 {
gomega.Expect(reflect.DeepEqual(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something incomplete or residual/debug code ?
We are comparing the NAB.spec.backupSpec with VeleroBackupSpec with just IncludedNS. Let me know if am missing something here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to confirm NAB spec was updated by reconcile loop

mpryc
mpryc previously approved these changes Sep 23, 2024
Copy link
Collaborator

@mpryc mpryc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on a cluster, looks ok, there is one gotcha which is different then previous workflow:

Create NAB 'mynab' -> Backup is created -> Delete NAB -> Create NAB 'mynab'

With this PR the NAB get's the accepted / created state with an old Backup being in place.

Previously those were final on the Backup object where UUID was pointing to the deleted NAB:

  annotations:
    openshift.io/oadp-nab-origin-name: example
    openshift.io/oadp-nab-origin-namespace: default
    openshift.io/oadp-nab-origin-uuid: 463057c2-6d57-4be5-9a71-fc12723f7cb4

Currently it's the same, but the NAB object gets updated to point to Backup which now mismatches the uuid. Previously the NAB was not updated to get into accepted state and was left on New state.

Copy link
Collaborator

@mpryc mpryc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@mateusoliveira43
Copy link
Contributor Author

/hold

@shubham-pampattiwar
Copy link
Member

Can follow up rest of the updates

Copy link

openshift-ci bot commented Sep 24, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mateusoliveira43, mpryc, shubham-pampattiwar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [mateusoliveira43,mpryc,shubham-pampattiwar]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shubham-pampattiwar
Copy link
Member

/unhold

@openshift-merge-bot openshift-merge-bot bot merged commit f50ab81 into migtools:master Sep 24, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Merged / Ready for build
Development

Successfully merging this pull request may close these issues.

Create tests for NAB controller.
4 participants