k3s database size grows due to slow compact process #10626
-
This indicates that some other server node is compacting and has moved the compact revision forward since this node last checked. Examine the logs on your other nodes for log messages about compaction.
That is not what K3s does. Kine follows the same behavior as the Kubernetes apiserver itself and attempts to compact up to the revision from 5 minutes in the past. Every 5 minutes, it compacts (in chunks of 1000 rows) up to the max revision (the newest row) from 5 minutes ago. The target compact revision is also capped to be at least 1000 revisions back from the current revision. If the compact transaction times out, it is definitely possible for things to spiral. Check the logs on your nodes to see how long compaction is taking. You may want to temporarily shut down all of your cluster members except for one server and allow it to catch up with compaction. Once that is done, add additional nodes and workloads, and monitor compaction to ensure that it is keeping up. If it is not, you should consider adding additional capacity to your SQL server.
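For illustration, here is a rough sketch in Python of the targeting logic described above. It is not kine's actual (Go) implementation; the constant and function names are mine, and the only behavior assumed is what is stated in this reply: compact up to the revision from 5 minutes ago, never closer than 1000 revisions behind the head, in chunks of 1000 rows.

```python
# Rough sketch of the compaction targeting described above -- illustration only,
# not kine's actual code. Names and structure are assumptions.

COMPACT_INTERVAL_SECONDS = 300   # compaction runs every 5 minutes
MIN_RETAINED_REVISIONS = 1000    # never compact closer than 1000 revisions behind head
BATCH_SIZE = 1000                # rows deleted per compaction transaction

def target_compact_revision(current_rev: int, rev_five_minutes_ago: int) -> int:
    """Compact up to the newest revision from ~5 minutes ago,
    capped at (current revision - 1000)."""
    return min(rev_five_minutes_ago, current_rev - MIN_RETAINED_REVISIONS)

def batches_for_pass(last_compacted_rev: int, target_rev: int) -> int:
    """Number of 1000-row chunks a single compaction pass has to work through."""
    pending = max(0, target_rev - last_compacted_rev)
    return -(-pending // BATCH_SIZE)  # ceiling division
```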
Note that we do not support multi-master databases that use an […]
-
Environmental Info:
K3s Version:
k3s version v1.28.11+k3s2 (d076d9a)
go version go1.21.11
Node(s) CPU architecture, OS, and Version:
Linux master01 5.15.0-113-generic #123-Ubuntu SMP Mon Jun 10 08:16:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
6 master nodes across 2 DCs, with a 4-node MariaDB Galera cluster as the datastore backend
Describe the bug:
When k3s is installed in a configuration with a database backend, the database size grows quickly even on a relatively "calm" cluster.
Steps To Reproduce:
/usr/local/bin/k3s server --kube-apiserver-arg oidc-username-claim=name --kube-apiserver-arg oidc-groups-claim=groups --kube-apiserver-arg oidc-client-id=kubernetes --kube-apiserver-arg oidc-issuer-url=https://auth.domain.com/auth/realms/test --disable local-storage --disable coredns --disable traefik --kubelet-arg kube-reserved=cpu=250m,memory=512Mi --kubelet-arg kube-reserved=cpu=250m,memory=512Mi --kubelet-arg system-reserved=cpu=250m,memory=512Mi --tls-san cluster-test.domain.com --datastore-endpoint=mysql://user:password@tcp(database.hostname:3306)/test_db
Expected behavior:
Database size stays proportional to the running workloads.
Actual behavior:
Database size grows to 100+ GB.
Additional context / logs:
I've taken a look at similar reports in other issues and done some research. It looks like the compaction process cannot handle the load the cluster creates while running, even under light load, and is constantly falling behind.
Test cluster, no actual workloads, only infra stuff. Test query:
Result after just 1 day of running:
Out of these, 226 884 rows have "deleted = 0" and 349 493 have "deleted = 1".
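(The query itself isn't captured above; for reference, a query along these lines would produce the same breakdown by the deleted flag, assuming the default kine table name. The connection details and the pymysql dependency are assumptions on my part, not part of the original report.)

```python
# Hypothetical reproduction of the row breakdown above. Connection details and
# the pymysql dependency are assumptions, not taken from the original report.
import pymysql

conn = pymysql.connect(host="database.hostname", user="user",
                       password="password", database="test_db")
try:
    with conn.cursor() as cur:
        # Count rows in the kine table grouped by the "deleted" flag.
        cur.execute("SELECT deleted, COUNT(*) FROM kine GROUP BY deleted")
        for deleted, count in cur.fetchall():
            print(f"deleted = {deleted}: {count} rows")
finally:
    conn.close()
```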
Compact log message:
Jul 31 20:36:11 multi-01 k3s[31503]: time="2024-07-31T20:36:11Z" level=info msg="COMPACT compact revision changed since last iteration: 1143674 => 1145674"
It seems that it is deleting just 2000 records every 5 minutes, so just to catch up it would need (349 493 / 2000) × 5 ≈ 874 minutes, which doesn't seem realistic.
I assume the number of compacted records should be in proportion to the number of "deleted = 1" records.
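As a back-of-the-envelope check of that estimate (purely illustrative; the 2000-rows-per-interval figure comes from the observed compaction progress in the log message above):

```python
# Back-of-the-envelope check of the catch-up estimate above.
rows_pending = 349_493       # rows with deleted = 1 after one day
rows_per_interval = 2_000    # observed compaction progress per 5-minute run
interval_minutes = 5

catch_up = rows_pending / rows_per_interval * interval_minutes
print(f"~{catch_up:.0f} minutes (~{catch_up / 60:.1f} hours) just to catch up")
# -> ~874 minutes (~14.6 hours)
```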
Records breakdown query:
Result:
Gap count query:
Result: 353953
And this is just a test cluster with no application load. On the production cluster, here are the stats from a 20 GB database for the queries above:
Additional query:
Result:
And the k3s service log is full of messages like:
Thus it is clear that the compaction / old-data deletion process is not working properly and should be improved. Even if I were to temporarily shut down all nodes and do a manual kine table cleanup, the database would quickly fill up again.