
Performance Diagnostics and Tuning

Jonathan Beakley edited this page Jan 16, 2019 · 1 revision

Performance Diagnostics and Tuning on Kubernetes and OpenShift

This page describes how to tune Black Duck to optimize performance.

This is an advanced topic. If you're not comfortable performance-tuning relational databases, consider contacting Black Duck support for direct assistance.

What is "large scale" for Black Duck?

If properly tuned, a single instance of Black Duck can manage and store thousands of projects. Proper tuning means allocating the right number of CPU cores, amount of memory, jobrunner and hub-scan replicas, and Postgres disk space.

Black Duck is heavily reliant on Postgres

If performance issues arise, a good first step is to examine Black Duck's relational database, especially its configuration parameters and CPU allocation.

How should I tune postgres for high performance?

The simplest initial change is to set shared_buffers to 25% of your total system RAM; its default is generally much lower. You may also want to limit or raise max_connections, the total number of connections Postgres allows. If you run many jobrunners, you can end up with hundreds of connections to your database, which your Postgres instance may not be able to handle well.
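As a rough sketch, the 25% figure can be derived on a Linux host like this (the ALTER SYSTEM lines and the example values are assumptions about how you apply settings; adjust them for your own deployment):

```shell
# Sketch: size shared_buffers at 25% of total RAM on a Linux host.
# Read total memory in MB from /proc/meminfo.
total_mb=$(awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo)
shared_buffers_mb=$((total_mb / 4))
echo "suggested shared_buffers = ${shared_buffers_mb}MB"

# Applying it in Postgres (hypothetical values; shared_buffers
# requires a restart to take effect):
#   ALTER SYSTEM SET shared_buffers = '2048MB';
#   ALTER SYSTEM SET max_connections = 300;
```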

Even on a Postgres instance with several cores, reducing the query load is a better performance optimization than increasing the hardware specification.

Where should I look to tell how Black Duck is doing?

  • The dashboard routinely takes more than 45 seconds to load.
  • Several or all running jobs do not seem to be making progress, based on log data or the progress meters in the Hub UI.

How can I know if my database is oversubscribed?

Using just a terminal

An easy place to start is by exec'ing into your Postgres host (if it runs in a container, use kubectl exec to open a /bin/sh shell). From there, run top and check a few indicators.

  • Is your load average consistently above 10?
  • Is your CPU or memory usage hovering in the high 90s (percent)?
  • Is disk space running out (check with df -h)?

If any of these are true, you can assume your database needs more resources in one way or another. To dive deeper, read on.
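Assuming a Linux shell inside the Postgres container, the checks above can be scripted roughly as follows (the data-directory path is a guess; substitute your own mount point):

```shell
# Sketch of the three checks above, run inside the Postgres container/host.

# 1. Load average (first field of /proc/loadavg is the 1-minute average).
read load1 rest < /proc/loadavg
cores=$(nproc)
echo "1-minute load: ${load1} across ${cores} core(s)"

# 2. Compare load to core count; sustained load above the core count
#    (or above 10, per the rule of thumb above) suggests CPU pressure.
awk -v l="$load1" -v c="$cores" \
  'BEGIN { print (l > c) ? "WARNING: load exceeds core count" : "load OK" }'

# 3. Disk usage; watch for Use% in the high 90s on the data volume.
df -h /var/lib/postgresql/data 2>/dev/null || df -h /
```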

Using raw SQL

You can quickly get a sense of what queries are taking a long time by running:

SELECT pid,
       age(clock_timestamp(), query_start) AS runtime,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND query NOT ILIKE '%pg_stat_activity%'
ORDER BY query_start DESC;

If you see any queries taking more than 20 minutes to run, you can send them to Black Duck Support. For finer-grained tuning information, use a tool like PGHero, as described below.

Using PGHero

PGHero is one of the easiest tools to use to get a quick view of what might be slow.


If you have external (password-based) database access, you can run a tool such as PGHero, a container that connects directly to your Postgres instance, surfaces slow-running queries, and lets you easily view your tuning parameters.

kind: List
apiVersion: v1
items:
- apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: pghero
  spec:
    selector:
      matchLabels:
        app: pghero
        tier: pghero
    replicas: 1
    template:
      metadata:
        name: pghero
        labels:
          app: pghero
          tier: pghero
      spec:
        containers:
        - image: ankane/pghero
          imagePullPolicy: Always
          name: pghero
          env:
          - name: DATABASE_URL
            value: "postgres://myuser:[email protected]:5432/bds_hub"
          ports:
          - containerPort: 8080
            protocol: TCP
- apiVersion: v1
  kind: Service
  metadata:
    name: pghero
  spec:
    ports:
    - name: 8080-tcp
      protocol: TCP
      port: 8080
      targetPort: 8080
    selector:
      app: pghero

Once PGHero is running, go to its "Overview" page to see how many long-running queries are active. Note that if you don't want to run PGHero in your cluster, you can just as easily run it as a standard Docker command:

docker run -ti -e DATABASE_URL=postgres://myusername:[email protected]:5432/bds_hub -p 8080:8080 ankane/pghero