Skip to content

niedbalski/repeat

Repository files navigation

Repeat CircleCI

repeat is a data collection tool for linux, with the following features:

  • Collections (command runs) can be run on a given time periodicity or a single shot.
  • Command output can be processed and stored on files or on a local database for further analysis.
  • Collections can be shared through imports (local or via http[s]+)
  • Resulting data can be explored as pandas dataframes.
  • A compressed tarball report will be generated.

Installation

Release artifacts can be found in the releases section Github Releases

wget -c https://github.com/niedbalski/repeat/releases/download/v0.0.4/repeat-0.0.4.linux-amd64.tar.gz -O - | tar -xz -C . --strip=1
./repeat --help

For installing the latest master build snap (edge for master or stable channel for releases):

snap install --channel edge repeat --classic

Also docker images are available:

  • Amd64 docker container Docker Repository on Quay

Command line

usage: repeat --config=CONFIG [<flags>]

Flags:
  -h, --help             Show context-sensitive help (also try --help-long and --help-man).
  -l, --loglevel="info"  Log level: [debug, info, warn, error, fatal]
  -t, --timeout=0s       Timeout: overall timeout for all collectors
  -c, --config=CONFIG    Path to collectors configuration file
  -b, --basedir="/tmp"   Temporary base directory to create the resulting collection tarball
  -r, --results-dir="."  Directory to store the resulting collection tarball
      --db-dir="."       Path to store the local results database 

Running with configuration

An example of running the collection for 5s (could be expressed in s,m,hours)

repeat --config metrics.yaml --timeout=5s --results-dir=.

Example configuration

  • Note : Imports are allowed as http[s]/files, local collection names have precedence over imported ones.
  • Note2 : database storage and fields configuration are totally user-defined.
import:
  - https://raw.githubusercontent.com/niedbalski/repeat/master/example_metrics.yaml#md5sum=6c5b5d8fafd343d5cf452a7660ad9dd1

collections:
  tcp_mem:
    command: cat /proc/sys/net/ipv4/tcp*mem
    run-every: 2s
    exit-codes: 0

  # scripts can be defined inline
  sar:
    run-once: true
    exit-codes: 0 127 126
    script: |
      #!/bin/bash

      echo "testing"

  process_list:
    command: ps aux --no-headers
    run-every: 1s
    exit-codes: any
    # store type database, will create a table in the collections database
    # and use the map-values definition to populate each column for th given
    # command output
    store: database
    database:
      map-values:
        field-separator: " "
        fields:
          - name: rss
            type: int
            field-index: 5
          - name: vsz
            type: int
            field-index: 4
          - name: pid
            type: string
            field-index: 1

  sockstat_tcp:
    command: grep -i tcp /proc/net/sockstat
    run-every: 1s
    exit-codes: any
    store: database
    database:
      map-values:
        field-separator: " "
        fields:
          - name: inuse
            type: int
            field-index: 2
          - name: alloc
            type: int
            field-index: 8

This command will generate the following report structure:

$ tar -xvf repeat-report-2020-07-04-00-05.tar.gz 
repeat-077356600/collections.db
repeat-077356600/collections.db-journal
repeat-077356600/run-script-557986359
repeat-077356600/sar-2020-07-04-00:05:04
repeat-077356600/tcp_mem-2020-07-04-00:05:04
repeat-077356600/tcp_mem-2020-07-04-00:05:06
[...]
repeat-077356600/tcp_mem-2020-07-04-00:05:12
repeat-077356600/tcp_mem-2020-07-04-00:05:14

Visualizing stored data with pandas.

There is an example file called Example pandas notebook that can be used with Jupyter.

Use this helper to generate dataframes from the report's tarball pandas helper. Note: python3-sqlalchemy and pandas are required. (Ubuntu: apt install python3-sqlalchemy python3-pandas)

Getting data from a repeat report

Curated list of collections

Repeat also maintains a curated list of collections that can be found collections

Contributing

Feel free to send PR(s) or reach niedbalski on #freenode or Telegram.