The "balance" subcommand does not balance if there was a recent "upmap-remapped.py" run #47

Open · patrakov opened this issue on Aug 19, 2024
A customer has a badly balanced cluster where TheJJ balancer refuses to do anything unless --ignore-ideal-pgcounts=all is given. The offline osdmap optimization as described here produces more than 1500 PG moves.
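For context, such an offline optimization is typically done with osdmaptool's upmap mode, roughly like this (a sketch only; the pool name matches the data pool shown further below, while the move limit and deviation values are illustrative):

$ ceph osd getmap -o osdmap.bin
$ osdmaptool osdmap.bin --upmap upmap-moves.sh --upmap-pool mainfs.data-hdd --upmap-max 1500 --upmap-deviation 1
$ # review the generated ceph osd pg-upmap-items commands, then apply:
$ bash upmap-moves.sh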

History:

  1. Start with a 14-node cluster, 13-14 OSDs per node, ~80% full.
  2. Add two more nodes.
  3. Observe that some PGs became backfill_toofull (i.e., their target OSDs would cross the backfillfull threshold if the backfills were left to run).
  4. Cancel all backfills using the upmap-remapped.py script (see the sketch after this list).
  5. Note that the cluster still needs to be balanced (to fill the OSDs in the two new nodes) and run TheJJ balancer.
  6. Wait a week.
  7. Notice that the balance is still bad and that TheJJ balancer does not seem to be able to improve it.
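The backfill cancellation in step 4 used upmap-remapped.py; a rough sketch of its usual invocation (the script emits ceph osd pg-upmap-items commands that pin remapped PGs back to their currently acting OSDs):

$ ./upmap-remapped.py         # print the pg-upmap-items commands for review
$ ./upmap-remapped.py | sh    # apply them, cancelling the pending backfills

The balancer run from step 7: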
$ ./placementoptimizer.py -v balance --osdsize device --osdused delta --max-pg-moves 200 --osdfrom fullest --max-move-attempts=30 
[2024-08-19 07:03:03,797] gathering cluster state via ceph api...
[2024-08-19 07:03:10,135] running pg balancer
[2024-08-19 07:03:10,172] cluster variance for crushclasses:
[2024-08-19 07:03:10,172]               ssd: 0.041
[2024-08-19 07:03:10,172]               hdd: 323.013
[2024-08-19 07:03:10,172] OSD fill rate by crushclass:
[2024-08-19 07:03:10,172]   ssd: average=0.71%, unconstrained=2.27%
[2024-08-19 07:03:10,172]       min osd.186   0.156%
[2024-08-19 07:03:10,172]    median osd.66    0.757%
[2024-08-19 07:03:10,172]       max osd.160   1.018%
[2024-08-19 07:03:10,172]   hdd: average=69.28%, unconstrained=68.34%
[2024-08-19 07:03:10,173]       min osd.203   20.067%
[2024-08-19 07:03:10,173]    median osd.181   74.947%
[2024-08-19 07:03:10,173]       max osd.30    82.935%
[2024-08-19 07:03:10,196] couldn't empty osd.70, so we're done. if you want to try more often, set --max-move-attempts=$nr, this may unlock more balancing possibilities. setting --ignore-ideal-pgcounts also unlocks more, but will then we will fight with ceph's default balancer.
[2024-08-19 07:03:10,196] --------------------------------------------------------------------------------
[2024-08-19 07:03:10,196] generated 0 remaps in 0 steps.
[2024-08-19 07:03:10,196] total movement size: 0.0B
[2024-08-19 07:03:10,196] --------------------------------------------------------------------------------
[2024-08-19 07:03:10,196] OSD fill rate by crushclass:
[2024-08-19 07:03:10,196]                       OLD                         NEW
[2024-08-19 07:03:10,196]   ssd:
[2024-08-19 07:03:10,196]       raw             2.269%                      2.269%
[2024-08-19 07:03:10,196]       avg             0.715%                      0.715%
[2024-08-19 07:03:10,196]  variance             0.041                       0.041
[2024-08-19 07:03:10,196]       min osd.186     0.156%          osd.186     0.156%
[2024-08-19 07:03:10,196]    median osd.66      0.757%          osd.66      0.757%
[2024-08-19 07:03:10,196]       max osd.160     1.018%          osd.160     1.018%
[2024-08-19 07:03:10,196]   hdd:
[2024-08-19 07:03:10,196]       raw            68.337%                     68.337%
[2024-08-19 07:03:10,197]       avg            69.277%                     69.277%
[2024-08-19 07:03:10,197]  variance           323.013                     323.013
[2024-08-19 07:03:10,197]       min osd.203    20.067%          osd.203    20.067%
[2024-08-19 07:03:10,197]    median osd.181    74.947%          osd.181    74.947%
[2024-08-19 07:03:10,197]       max osd.30     82.935%          osd.30     82.935%
[2024-08-19 07:03:10,197] 
[2024-08-19 07:03:10,197] new usable space:
[2024-08-19 07:03:10,197]   pool name         id previous         new      change
[2024-08-19 07:03:10,197]   .mgr               1    1.65T ->    1.65T =>     0.0B
[2024-08-19 07:03:10,197]   mainfs.meta        2    7.02T ->    7.02T =>     0.0B
[2024-08-19 07:03:10,197]   mainfs.data-hdd    3  147.32T ->  147.32T =>     0.0B
[2024-08-19 07:03:10,197]   .nfs               4    4.79T ->    4.79T =>     0.0B
[2024-08-19 07:03:10,197]                                           sum:     0.0B
[2024-08-19 07:03:10,197] --------------------------------------------------------------------------------


The gather subcommand output will be sent by email.
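For offline reproduction, the state dump was produced with something along these lines (the file name is illustrative, and feeding the dump back via --state is assumed from the tool's usual interface rather than taken from this report):

$ ./placementoptimizer.py gather cluster-state.xz
$ # --state is assumed to replay the gathered dump instead of querying the live cluster:
$ ./placementoptimizer.py --state cluster-state.xz -v balance --osdsize device --osdused delta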
