Snowflake

This is a Fork of: http://github.com/dwayn/snowflaked

Introduction

This is a C implementation of Twitter's snowflake algorithm as a standalone daemon and client

Snowflake is an uncoordinated, distributed unique id generator that generates roughly ordered 64bit integer IDs.

The algorithm generates a time based ID that gains its entropy from the region id, worker id, and sequence bits. Region id and worker id must be a unique combination for each snowflaked process that is generating IDs for a single ID space. The sequence number is reset to 0 at the beginning of each millisecond and the first ID generated by a snowflaked daemon will use 0 as its sequence and then increment the sequence number for the next request within that millisecond. If clocks on all servers running snowflaked process are in sync, the IDs are ordered down to the millisecond, but IDs generated within a single millisecond are not ordered. In practice, due to clock skew, the IDs generated could be unordered at a higher granularity. NTP clock adjustments are handled by the daemon to ensure that IDs are not regenerated when NTP adjusts the system clock backwards. Note: NTP should be configured and running on all hosts running snowflaked.

Parts of a snowflake ID:

sign bit: will always be 0
this is because some languages don't support unsigned integers properly (Java and PHP are notable cases)
next 41 bits: milliseconds since custom epoch
next 4 bits: region id
next 10 bits: worker id

lowest 8 bits: sequence number

   sign           ms since custom epoch            region    worker     sequence  
  | 0 |                               16917051000 |    2 |         26 |       37 |
  | 0 | ----------------------------------------- | ---- | ---------- | -------- |
  | 0 | 00000001111110000010101011011011001111000 | 0010 | 0000011010 | 00100101 |
  
  ID: 70955254678034981

This has been implemented to always provide positive signed 64bit integers rather than using unsigned 64bit integers so that Java (prior to version 8) can use these IDs without painful workarounds.

The number of bits assigned to each of the parts as well as the custom epoch start time can be adjusted in src/snowflake.h by modifying the SNOWFLAKE_* constants and the project recompiled. Be sure that the sum of the 4 *_BITS constants does not exceed 63 bits as that will cause overflow on the ID.

Limits

The theoretical limit to the algorithm is approximately 2^4 * 2^10 * 2^8 * 1000 or ~4.2B IDs per second by 2^4 * 2^10 worker processes. Due to the single threaded nature of the daemon, CPU clock speed is the primary limiter to the rate at which IDs can be generated. On a 2.8ghz Intel Core i7, I have benchmarked to about 130k IDs per second which is why I chose 8 bits for the sequence (would take approx 5ghz CPU to overrun the sequence). Unfortunately, I have been unable to squeeze anymore performance out of the daemon due to the fact that the majority of the CPU is spent in libevent library handling the connections.

Components

Core

`snowflaked`

This is the main server daemon

Required Arguments:

    -r REGIONID, --region REGIONID
                        Region ID for the process

    -w WORKERID, --worker WORKERID
                        Worker ID for the process

Optional Arguments:

    -p PORT, --port PORT
                        Port number for the server to run on. 
                        Default: 8008

`snowflake`

Command line client for snowflaked

Optional Arguments

    -h HOST, --host HOST
                        Hostname of the server to query
                        Default: localhost

    -p PORT, --port PORT
                        Port number on the server to query
                        Default: 8008

`sfbench`

Command line client for generating load on snowflaked daemon

    -h HOST, --host HOST
                        Hostname of the server to query
                        Default: localhost

    -p PORT, --port PORT
                        Port number on the server to query
                        Default: 8008

    -n ITERATIONS, --iterations ITERATIONS
                        Number of ID GET calls to make to the server
                        Default: 1000

    -m MODVALUE, --output-every MODVALUE
                        Modulus value to set how often to write to stdout
                        Default: 1 (write line to stdout for every GET call)

Clients

PHP

There is an extension for PHP in clients/php that can be compiled. See clients/php/README.md

Building

Dependencies

Library dependencies:

libevent 2.x
check 0.9.8 (need to test if check lib detection is working properly)

Build dependencies:

libtoolize/glibtoolize
autotools (aclocal, autoheader, autoconf, automake)

Debug build dependencies:

gperftools

Installing dependencies

Mac OSX

If you have homebrew installed, you can run the following to install all needed build dependencies:

    ./homebrew_install_dependencies.sh

Debian/Ubuntu

    sudo apt-get install libevent-dev pkg-config check libtool automake autoconf gperftools

you can leave out gperftools if you are not planning on compiling the debug version

RHEL/CentOS

    sudo yum install libevent-devel pkgconfig check libtool automake autoconf gperftools

you can leave out gperftools if you are not planning on compiling the debug version

Compiling/Installing:

./bootstrap.sh
./configure [--enable-debug]
make
sudo make install

Protocol

The daemon implements a simple text protocol:

Get ID request:

GET

Server ID response:

64bit integer preceeded by "+"

    +70955254678034981

Server error response:

If there is an error, the response will be preceeded by "-"

    -ERROR error message

Get server stats:

    INFO

Server info response

The following is actually a contiguous string, but it is broken up into multiple lines for the sake of readability.

    +uptime:16761
    version:00.02.00
    region:0
    worker:39
    seq_cap:255
    seq_max:84
    ids:334241
    waits:0

uptime - Seconds since server started
version - Compiled version of the server
region - Region ID for the server
worker - Worker ID for the server
seq_cap - Max sequence ID that can be generated
seq_max - The highest sequence number that the server has generated since it was started
ids - The number of IDs that the server has generated since it was started
waits - The number of times the server has had to enter a wait loop to generate an ID
This can be caused by hitting the sequence number cap or by waiting due to an NTP clock adjustment that rolled time backwards on the host

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
clients/php		clients/php
src		src
tests		tests
.gitignore		.gitignore
Makefile.am		Makefile.am
README.md		README.md
bootstrap.sh		bootstrap.sh
configure.ac		configure.ac
homebrew_install_dependencies.sh		homebrew_install_dependencies.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Snowflake

Introduction

Limits

Components

Core

`snowflaked`

`snowflake`

`sfbench`

Clients

PHP

Building

Dependencies

Library dependencies:

Build dependencies:

Debug build dependencies:

Installing dependencies

Mac OSX

Debian/Ubuntu

RHEL/CentOS

Compiling/Installing:

Protocol

Get ID request:

Server ID response:

Server error response:

Get server stats:

Server info response

About

Releases

Packages

Languages

cineman/snowflake

Folders and files

Latest commit

History

Repository files navigation

Snowflake

Introduction

Limits

Components

Core

snowflaked

snowflake

sfbench

Clients

PHP

Building

Dependencies

Library dependencies:

Build dependencies:

Debug build dependencies:

Installing dependencies

Mac OSX

Debian/Ubuntu

RHEL/CentOS

Compiling/Installing:

Protocol

Get ID request:

Server ID response:

Server error response:

Get server stats:

Server info response

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

`snowflaked`

`snowflake`

`sfbench`

Packages