-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
initial take on ADR reasoning the move from Azure managed db to in-cl… #583
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the feelings going into this but to me this is too biased. The switch away from Azure Database for MariaDB is the initiator of this change. Recent experiences pushed the urgency of this change. The ADR should reflect this.
I also have some more comments but as this point I want to signal that I do not think that this can be accepted as is.
@kasperg please provide your thoughts. This needs to get done by today as agreed on the last meeting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the context section is way too long. It gets rambly and doesn't stick to the relevant background for the concrete decision.
It needs to be useful for future readers looking back at what the factors were, when we made this decision. Right now it reads as if the primary reason was frustration. I don't think that's accurate: there are significant objective factors that disappear in the impassioned appeal.
The pro/con list and final decision are fine imo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks good, and is much more to the point!
I added some tiny notes on language and request for a few clarifications. With those, in my opinion, we're golden! ⭐
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have read this through a couple of times now and provided some suggestions for you to consider.
My intention is to add documentation, improve structure and reduce emotional appeal.
The project has experienced project wide down time due to database crashes on | ||
multiple occasions: | ||
|
||
- 17.01.2025: Database went down due to a restart initiated by the Azure Product | ||
Group team taking unsanctioned action | ||
- 15.01.2025: Database went down due to a restart initiated by the Azure Product | ||
Group team taking unsanctioned action | ||
- 9.01.2025: Database went down as a result of to many login requests | ||
- 28.11.2024: Database went down due to too many active connection attempts | ||
- 7.11.2024: Database went down due to too many connections | ||
- Summer 2024: Database was restarted by Azure for unscheduled and unnotified | ||
maintenance | ||
|
||
Microsoft's database server logs are prohibitively expensive. | ||
Microsoft misconfigured the database provision-time. | ||
The current Microsoft support tier is useless and useful support is | ||
prohibitively expensive. | ||
Microsoft support has made unsanctioned changes on the databse, such as | ||
maintenance work during business hours and restarts in attempts to fix support | ||
tickets. | ||
It is not possible to access the database server. | ||
|
||
Installing a MariaDB Sql operator in the cluster provides the following | ||
benefits: | ||
|
||
- access to the database server logs | ||
- access to the server itself | ||
- a guarantee that no one but the platform team can take action on the server | ||
- ability to right-size the database | ||
- Running clusters of databases | ||
- Database recovery | ||
- The ability to start recovering from a crash immediately |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I see it these are pros and cons are partly duplicated below. I appreciate the pros & cons structure so I suggest restructuring this to fit into the sections below.
If we do this I think the current content should be replaced with something like:
"The database is a critical part of the sites managed by the platform. Downtime or degradation of service for the database is usually reflected to all applications running on the platform.
Our experience with using Azure MariaDB up until now requires us to consider alternative solutions instead of just going with the recommended solution. The two solutions we have considered are Azure Database for MySQL and moving to running the database ourselves within our Kubernetes cluster."
Can we provide any reasoning for why we do not seriously consider other alternatives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also: I think it is worth adding a bit more context into what we mean by "In-cluster database".
I am not an expert here but as I see it there might be multiple ways to run a database in a cluster. I think it would be good if we could provide more info regarding what we actually have in mind, preferably combined with a link or two if we are referring to specific solutions/products/architectures e.g. Kubernetes Operator for MariaDB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The database is a critical part of the sites managed by the platform. Downtime or degradation of service for the database is usually reflected to all applications running on the platform.
This is already in the document. Do you find it unclear?
Can we provide any reasoning for why we do not seriously consider other alternatives?
I don't think I know of any architecturally sound alternatives to the two options we have at hand.
Also: I think it is worth adding a bit more context into what we mean by "In-cluster database".
Please elaborate on what you think should be mentioned
I think it would be good if we could provide more info regarding what we actually have in mind,
I agree, but we haven't decided on an architecture yet. We should circle back to this when we have made a decision.
… sunset summer 2025
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
Co-authored-by: Kasper Garnæs <[email protected]>
@kasperg a lot of good comments, suggestions and revisions you provided here. I appreciate the effort put into this. |
…uster db
What does this PR do?
Add an ADR describing why we are moving the database into the cluster instead of moving to another managed DB.
Should this be tested by the reviewer and how?
It should be read. Please request some changes if you think something is missing.
Any specific requests for how the PR should be reviewed?
What are the relevant tickets?
https://reload.atlassian.net/jira/software/c/projects/DDFDRIFT/boards/464?selectedIssue=DDFDRIFT-285