The "balance" subcommand does not balance if there was a recent "upmap-remapped.py" run #47

Open · patrakov opened this issue on Aug 19, 2024
A customer has a badly balanced cluster where TheJJ balancer refuses to do anything unless --ignore-ideal-pgcounts=all is given. The offline osdmap optimization as described here produces more than 1500 PG moves.
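For context, such an offline optimization is typically done with osdmaptool's upmap mode, roughly like this (a sketch only; the pool name matches the data pool shown further below, while the move limit and deviation values are illustrative):

$ ceph osd getmap -o osdmap.bin
$ osdmaptool osdmap.bin --upmap upmap-moves.sh --upmap-pool mainfs.data-hdd --upmap-max 1500 --upmap-deviation 1
$ # review the generated ceph osd pg-upmap-items commands, then apply:
$ bash upmap-moves.sh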

History:

  1. Start with a 14-node cluster, 13-14 OSDs per node, ~80% full.
  2. Add two more nodes.
  3. Observe that some PGs became backfill_toofull (i.e., their target OSDs would cross the backfillfull threshold if the backfills were left to run).
  4. Cancel all backfills using the upmap-remapped.py script (see the sketch after this list).
  5. Note that the cluster still needs to be balanced (to fill the OSDs in the two new nodes) and run TheJJ balancer.
  6. Wait a week.
  7. Notice that the balance is still bad and that TheJJ balancer does not seem to be able to improve it.
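The backfill cancellation in step 4 used upmap-remapped.py; a rough sketch of its usual invocation (the script emits ceph osd pg-upmap-items commands that pin remapped PGs back to their currently acting OSDs):

$ ./upmap-remapped.py         # print the pg-upmap-items commands for review
$ ./upmap-remapped.py | sh    # apply them, cancelling the pending backfills

The balancer run from step 7: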
$ ./placementoptimizer.py -v balance --osdsize device --osdused delta --max-pg-moves 200 --osdfrom fullest --max-move-attempts=30 
[2024-08-19 07:03:03,797] gathering cluster state via ceph api...
[2024-08-19 07:03:10,135] running pg balancer
[2024-08-19 07:03:10,172] cluster variance for crushclasses:
[2024-08-19 07:03:10,172]               ssd: 0.041
[2024-08-19 07:03:10,172]               hdd: 323.013
[2024-08-19 07:03:10,172] OSD fill rate by crushclass:
[2024-08-19 07:03:10,172]   ssd: average=0.71%, unconstrained=2.27%
[2024-08-19 07:03:10,172]       min osd.186   0.156%
[2024-08-19 07:03:10,172]    median osd.66    0.757%
[2024-08-19 07:03:10,172]       max osd.160   1.018%
[2024-08-19 07:03:10,172]   hdd: average=69.28%, unconstrained=68.34%
[2024-08-19 07:03:10,173]       min osd.203   20.067%
[2024-08-19 07:03:10,173]    median osd.181   74.947%
[2024-08-19 07:03:10,173]       max osd.30    82.935%
[2024-08-19 07:03:10,196] couldn't empty osd.70, so we're done. if you want to try more often, set --max-move-attempts=$nr, this may unlock more balancing possibilities. setting --ignore-ideal-pgcounts also unlocks more, but will then we will fight with ceph's default balancer.
[2024-08-19 07:03:10,196] --------------------------------------------------------------------------------
[2024-08-19 07:03:10,196] generated 0 remaps in 0 steps.
[2024-08-19 07:03:10,196] total movement size: 0.0B
[2024-08-19 07:03:10,196] --------------------------------------------------------------------------------
[2024-08-19 07:03:10,196] OSD fill rate by crushclass:
[2024-08-19 07:03:10,196]                       OLD                         NEW
[2024-08-19 07:03:10,196]   ssd:
[2024-08-19 07:03:10,196]       raw             2.269%                      2.269%
[2024-08-19 07:03:10,196]       avg             0.715%                      0.715%
[2024-08-19 07:03:10,196]  variance             0.041                       0.041
[2024-08-19 07:03:10,196]       min osd.186     0.156%          osd.186     0.156%
[2024-08-19 07:03:10,196]    median osd.66      0.757%          osd.66      0.757%
[2024-08-19 07:03:10,196]       max osd.160     1.018%          osd.160     1.018%
[2024-08-19 07:03:10,196]   hdd:
[2024-08-19 07:03:10,196]       raw            68.337%                     68.337%
[2024-08-19 07:03:10,197]       avg            69.277%                     69.277%
[2024-08-19 07:03:10,197]  variance           323.013                     323.013
[2024-08-19 07:03:10,197]       min osd.203    20.067%          osd.203    20.067%
[2024-08-19 07:03:10,197]    median osd.181    74.947%          osd.181    74.947%
[2024-08-19 07:03:10,197]       max osd.30     82.935%          osd.30     82.935%
[2024-08-19 07:03:10,197] 
[2024-08-19 07:03:10,197] new usable space:
[2024-08-19 07:03:10,197]   pool name         id previous         new      change
[2024-08-19 07:03:10,197]   .mgr               1    1.65T ->    1.65T =>     0.0B
[2024-08-19 07:03:10,197]   mainfs.meta        2    7.02T ->    7.02T =>     0.0B
[2024-08-19 07:03:10,197]   mainfs.data-hdd    3  147.32T ->  147.32T =>     0.0B
[2024-08-19 07:03:10,197]   .nfs               4    4.79T ->    4.79T =>     0.0B
[2024-08-19 07:03:10,197]                                           sum:     0.0B
[2024-08-19 07:03:10,197] --------------------------------------------------------------------------------


The gather subcommand output will be sent by email.
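For offline reproduction, the state dump was produced with something along these lines (the file name is illustrative, and feeding the dump back via --state is assumed from the tool's usual interface rather than taken from this report):

$ ./placementoptimizer.py gather cluster-state.xz
$ # --state is assumed to replay the gathered dump instead of querying the live cluster:
$ ./placementoptimizer.py --state cluster-state.xz -v balance --osdsize device --osdused delta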
