vahid-sohrabloo edited this page Dec 2, 2022 · 6 revisions

The goal of this library is to provide a low-overhead connection and high performance for reading from and writing to the ClickHouse database.

Connect to database

You can connect to the database with either a single connection or a connection pool. A pool is recommended in most cases. To use a pool, import github.com/vahid-sohrabloo/chconn/v2/chpool and call New with the database connection string:

package main

import (
	"context"
	"log"

	"github.com/vahid-sohrabloo/chconn/v2/chpool"
)

func main() {
	db, err := chpool.New("password=something")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	...
}

The database connection string is very similar to PostgreSQL's libpq format, described at https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING, except that in the URI form you should use the clickhouse:// scheme. You can also set these environment variables:

CHHOST
CHPORT
CHDATABASE
CHUSER
CHPASSWORD
CHCLIENTNAME
CHCONNECT_TIMEOUT
CHSSLMODE
CHSSLKEY
CHSSLCERT
CHSSLROOTCERT
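As a sketch of how such a configuration might be resolved, the helper below prefers an explicit URI over the CHHOST environment variable. The clickhouse:// scheme and the CHHOST variable name come from this page; the URL contents and the helper's name are made up for illustration:

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// hostFromConfig resolves the ClickHouse host: an explicit URI wins,
// then the CHHOST environment variable, then a localhost fallback.
func hostFromConfig(rawURL string) (string, error) {
	if rawURL != "" {
		u, err := url.Parse(rawURL)
		if err != nil {
			return "", err
		}
		return u.Hostname(), nil
	}
	if h := os.Getenv("CHHOST"); h != "" {
		return h, nil
	}
	return "localhost", nil
}

func main() {
	h, err := hostFromConfig("clickhouse://user:secret@ch.example.com:9000/mydb")
	if err != nil {
		panic(err)
	}
	fmt.Println(h) // ch.example.com
}
```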

Exec

If you want to run a query that is neither a select nor an insert (like creating a table), you can use the Exec method. For example:

package main

import (
	"context"
	"log"

	"github.com/vahid-sohrabloo/chconn/v2/chpool"
)

func main() {
	db, err := chpool.New("password=something")
	if err != nil {
		log.Fatal(err)
	}

	err = db.Exec(context.Background(), `DROP TABLE IF EXISTS example_table`)
	if err != nil {
		panic(err)
	}

	err = db.Exec(context.Background(), `CREATE TABLE example_table (
		uint64 UInt64,
		uint64_nullable Nullable(UInt64)
	) Engine=Memory`)
	if err != nil {
		panic(err)
	}
}

Write

You can write data to ClickHouse using Insert. ClickHouse recommends inserting data in batches of at least 1,000 rows, or no more than one request per second. With prepared columns you can do this easily. We provide a simple generic way to use ClickHouse data types. For example:

package main

import (
	"context"
	"log"
	"time"

	"github.com/vahid-sohrabloo/chconn/v2/chpool"
	"github.com/vahid-sohrabloo/chconn/v2/column"
)

func main() {
	db, err := chpool.New("password=something")
	if err != nil {
		log.Fatal(err)
	}

	err = db.Exec(context.Background(), `DROP TABLE IF EXISTS example_table`)
	if err != nil {
		panic(err)
	}

	err = db.Exec(context.Background(), `CREATE TABLE example_table (
		uint64 UInt64,
		uint64_nullable Nullable(UInt64)
	) Engine=Memory`)
	if err != nil {
		panic(err)
	}

	uint64Col := column.New[uint64]()
	nullUint64 := column.New[uint64]().Nullable()
	rows := 1_000_000 // one million rows per batch - inserted 10 times (10 million rows total)
	numInsert := 10
	for i := 0; i < numInsert; i++ {
		for y := 0; y < rows; y++ {
			uint64Col.Append(uint64(i))
			// or you can use the nullUint64.AppendP function for a pointer variable
			if i%2 == 0 {
				nullUint64.Append(uint64(i))
			} else {
				nullUint64.AppendNil()
			}
		}

		ctxInsert, cancelInsert := context.WithTimeout(context.Background(), time.Second*30)
		// insert data
		err = db.Insert(ctxInsert, "INSERT INTO example_table (uint64,uint64_nullable) VALUES", uint64Col, nullUint64)
		if err != nil {
			cancelInsert()
			panic(err)
		}
		cancelInsert()
	}
}

NOTE: chconn resets the column data after each successful insert and does not reset it if the insert returns an error. So it is safe to reuse the same columns to retry the insert.

Select

If you want to run a select query, you can use Select.

chconn reads data block by block (one reason why it is fast).

	// select data
	col1Read := column.New[uint64]()
	col2Read := column.New[uint64]().Nullable()

	ctxSelect, cancelSelect := context.WithTimeout(context.Background(), time.Second*30)
	defer cancelSelect()

	startSelect := time.Now()
	selectStmt, err := db.Select(ctxSelect, "SELECT uint64,uint64_nullable FROM  example_table", col1Read, col2Read)
	if err != nil {
		panic(err)
	}

	// make sure the statement is closed after the select; it's not strictly necessary, but it's good practice
	defer selectStmt.Close()

	var col1Data []uint64
	var col2DataNil []bool
	var col2Data []uint64
	// read data block by block
	// for more information about blocks, see: https://clickhouse.com/docs/en/development/architecture/#block
	for selectStmt.Next() {
		col1Data = col1Data[:0]
		col1Data = col1Read.Read(col1Data)

		col2DataNil = col2DataNil[:0]
		col2DataNil = col2Read.ReadNil(col2DataNil)

		col2Data = col2Data[:0]
		col2Data = col2Read.Read(col2Data)
	}

	// check errors
	if selectStmt.Err() != nil {
		panic(selectStmt.Err())
	}

Please note that the Next method advances to the next block of data (not the next row).

NOTE: if you use chpool, after Next() returns false the connection is returned to the pool, and you should not use the statement again (except to call Err()).
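The col1Data = col1Data[:0] lines above are a buffer-reuse pattern: truncating a slice keeps its backing array so each block is read into the same allocation. A minimal stdlib sketch of that pattern, with made-up block data standing in for col1Read.Read:

```go
package main

import "fmt"

// sumBlocks simulates block-by-block reads: each "block" is copied into
// the same slice after truncating it with buf[:0], so one backing array
// is reused across blocks instead of reallocating for every block.
func sumBlocks(blocks [][]uint64) uint64 {
	var buf []uint64
	var sum uint64
	for _, block := range blocks {
		buf = buf[:0]               // keep capacity, drop old contents
		buf = append(buf, block...) // stands in for col1Read.Read(buf)
		for _, v := range buf {
			sum += v
		}
	}
	return sum
}

func main() {
	fmt.Println(sumBlocks([][]uint64{{1, 2, 3}, {4, 5}, {6}})) // 21
}
```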

Data types

UIntx and Intx

As you can see, we use column.New[uint64]() for the UInt64 ClickHouse data type.

In the same way, you can use intx for Intx and uintx for UIntx ClickHouse data types.

chconn doesn't care about the Golang data type itself. It only checks that the size of the Golang type matches the ClickHouse data type.

So you can use your own data types. For example:

type MyUInt64 uint64

you can use column.New[MyUInt64]() for the UInt64 ClickHouse data type (or any other 8-byte data type).

Then you can use Append to append values and AppendSlice to append slices of your data type.

Also, if you know the size of your insert data, you can use SetWriteBufferSize to set the buffer size.

Bool

ClickHouse doesn't natively support a Bool data type and uses UInt8 for it. But as mentioned before, chconn doesn't care about the Golang type itself; it only checks sizes, and bool and uint8 have the same size. So you can use column.New[bool]() for the UInt8 ClickHouse data type.

UUID, Decimal, IPv4 , IPv6, UInt128, Int128, UInt256, Int256

For these data types that are not supported in Golang or behave differently (UUID and IPv4 have different behavior), you can use the github.com/vahid-sohrabloo/chconn/v2/column/types package. For example, you can use column.New[types.UUID]() for the UUID ClickHouse data type (or any other 16-byte data type).

String

For the String ClickHouse data type you can use column.NewString().

FixedString

For the FixedString data type (for example FixedString(4)) you can use column.New[[4]byte]().
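A quick stdlib sketch of the value shape a FixedString(4) column works with: a [4]byte array, zero-padded when the input is shorter, matching how ClickHouse pads short FixedString values. The helper name is ours:

```go
package main

import "fmt"

// toFixed4 converts a string into the fixed four-byte form a
// FixedString(4) column stores, zero-padding short inputs.
func toFixed4(s string) [4]byte {
	var b [4]byte
	copy(b[:], s) // copy stops at the shorter of the two
	return b
}

func main() {
	fmt.Println(toFixed4("ab")) // [97 98 0 0]
}
```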

Nullable data

If your column is Nullable, for example Nullable(UInt64), you can use column.New[uint64]().Nullable() or column.NewString().Nullable().

You can call Append and AppendSlice just like on the original column. In addition, it has AppendNil to append a NULL value.

It also has AppendP and AppendSliceP for pointer data, but with lower performance.
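To see why the pointer path costs more, note that a nullable column is essentially values plus a nil mask; feeding it pointers (as AppendP does) adds a nil check and a dereference per row. This stdlib sketch converts the pointer form into the value-plus-mask form; the helper name is ours:

```go
package main

import "fmt"

// splitPointers converts a pointer slice into the value + nil-mask
// representation a nullable column keeps internally. A nil pointer
// becomes a placeholder value with the mask set.
func splitPointers(in []*uint64) (vals []uint64, nils []bool) {
	for _, p := range in {
		if p == nil {
			vals = append(vals, 0) // placeholder value for NULL
			nils = append(nils, true)
			continue
		}
		vals = append(vals, *p) // extra dereference per row
		nils = append(nils, false)
	}
	return vals, nils
}

func main() {
	v := uint64(7)
	vals, nils := splitPointers([]*uint64{&v, nil})
	fmt.Println(vals, nils) // [7 0] [false true]
}
```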

Array Data

If your column is an Array, for example Array(UInt64), you can use column.New[uint64]().Array() or column.NewString().Array(), or for nullable data column.New[uint64]().Nullable().Array() or column.NewString().Nullable().Array().

For Array of Array you can use column.New[uint64]().Array().Array() or column.NewString().Array().Array() (and so on for more levels). It has the same methods as the original column but accepts a slice of data.

NOTE: For now chconn only supports three levels of array nesting. That should cover most projects; if you need more levels, just create an issue.
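Under the hood, ClickHouse's native format lays an array column out as one flat stream of values plus a running offset per row, which is why an array column wraps a plain column. A stdlib sketch of that layout (the helper is illustrative, not chconn API):

```go
package main

import "fmt"

// flatten lays out Array(UInt64) rows the way the native format does:
// a flat slice of all values plus a cumulative offset per row.
func flatten(rows [][]uint64) (values []uint64, offsets []uint64) {
	var off uint64
	for _, row := range rows {
		values = append(values, row...)
		off += uint64(len(row))
		offsets = append(offsets, off)
	}
	return values, offsets
}

func main() {
	v, o := flatten([][]uint64{{1, 2}, {}, {3}})
	fmt.Println(v, o) // [1 2 3] [2 2 3]
}
```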

Low Cardinality

If your column is LowCardinality, for example LowCardinality(UInt64), you can use column.New[uint64]().LowCardinality() or column.NewString().LC() (a shorthand), or for nullable data column.New[uint64]().LowCardinality().Nullable() or column.NewString().LowCardinality().Nullable() (and so on, e.g. LC().Nullable().Array() for an array of nullable low-cardinality values).

It has the same methods as the original column.

For more information please read https://clickhouse.com/docs/en/sql-reference/data-types/lowcardinality/
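The idea behind LowCardinality is dictionary encoding: each distinct value is stored once and rows hold small indexes into that dictionary. chconn handles this internally; the sketch below only illustrates the encoding itself:

```go
package main

import "fmt"

// dictEncode dictionary-encodes a column: distinct values go into dict
// once, and each row becomes an index into dict.
func dictEncode(rows []string) (dict []string, keys []int) {
	index := map[string]int{}
	for _, r := range rows {
		k, ok := index[r]
		if !ok {
			k = len(dict)
			index[r] = k
			dict = append(dict, r)
		}
		keys = append(keys, k)
	}
	return dict, keys
}

func main() {
	dict, keys := dictEncode([]string{"a", "b", "a", "a"})
	fmt.Println(dict, keys) // [a b] [0 1 0 0]
}
```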

Tuple Data

The Tuple data type stores a group of columns. There are two ways to define a tuple column:

  1. column.NewTuple: with this approach you pass the tuple to the Insert and Select methods, but read and write data through the underlying column.New[T]() columns:
col1 := column.New[T1]()
col2 := column.New[T2]()
.
.
.
colN := column.New[TN]()
tuple := column.NewTuple(col1, col2, ..., colN)

col1.Append(...)
col2.Append(...)
.
.
.
colN.Append(...)
db.Insert(ctx, "INSERT INTO example_table (tuple) VALUES", tuple)

// or for select
stmt, err := db.Select(ctx, "SELECT tuple FROM example_table", tuple)
for stmt.Next() {
	col1.Read(...)
	col2.Read(...)
	.
	.
	.
}
  2. Define a struct that implements this interface:
type TupleType[T any] interface {
	Append([]column.ColumnBasic)
	Get([]column.ColumnBasic, int) T
	Column() []column.ColumnBasic
}

For example for Tuple(float64,float64)

type Point struct {
	X float64
	Y float64
}

func (t Point) Append(columns []column.ColumnBasic) {
	columns[0].(*column.Base[float64]).Append(t.X)
	columns[1].(*column.Base[float64]).Append(t.Y)
}

func (t Point) Get(columns []column.ColumnBasic, row int) Point {
	return Point{
		X: columns[0].(*column.Base[float64]).Row(row),
		Y: columns[1].(*column.Base[float64]).Row(row),
	}
}

func (t Point) Column() []column.ColumnBasic {
	return []column.ColumnBasic{
		column.New[float64](),
		column.New[float64](),
	}
}

Then you can use column.NewTupleOf[Point]() to read and append data.
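The Append/Get pair above is a struct-of-arrays pattern: appending a tuple scatters its fields into per-field columns, and reading a row gathers them back. This stdlib analogue mirrors that pattern without chconn (pointColumns is our stand-in for the column.ColumnBasic slice):

```go
package main

import "fmt"

// Point is the tuple value type, as in the example above.
type Point struct{ X, Y float64 }

// pointColumns is a struct-of-arrays stand-in for the per-field columns
// that Point.Append and Point.Get operate on.
type pointColumns struct{ xs, ys []float64 }

// Append scatters a Point into the per-field columns.
func (c *pointColumns) Append(p Point) {
	c.xs = append(c.xs, p.X)
	c.ys = append(c.ys, p.Y)
}

// Get gathers one row back into a Point.
func (c *pointColumns) Get(row int) Point {
	return Point{X: c.xs[row], Y: c.ys[row]}
}

func main() {
	var cols pointColumns
	cols.Append(Point{X: 1, Y: 2})
	cols.Append(Point{X: 3, Y: 4})
	fmt.Println(cols.Get(1)) // {3 4}
}
```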

Map Data

If your column is a Map, for example Map(String, Int64), you can use column.NewMap(column.NewString(), column.New[int64]()). The key and value columns can be Nullable and LowCardinality. The methods (like Append) use the map[K]V type; if you want to use a pointer for V (map[K]*V) for Map(K, Nullable(V)), you can use column.NewMapNullable.
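For the nullable-value case, the Go-side value is map[K]*V and a nil pointer marks a NULL map value. This stdlib sketch flattens such a map into key, value, and NULL-mask columns; the helper and its explicit key order (for determinism, since Go map iteration is unordered) are ours, not chconn API:

```go
package main

import "fmt"

// flattenMap splits a map[string]*int64 (the Go shape of
// Map(String, Nullable(Int64))) into parallel key/value/mask columns,
// visiting keys in the caller-supplied order.
func flattenMap(m map[string]*int64, keyOrder []string) (keys []string, vals []int64, nulls []bool) {
	for _, k := range keyOrder {
		keys = append(keys, k)
		if v := m[k]; v == nil {
			vals = append(vals, 0) // placeholder for NULL
			nulls = append(nulls, true)
		} else {
			vals = append(vals, *v)
			nulls = append(nulls, false)
		}
	}
	return keys, vals, nulls
}

func main() {
	n := int64(42)
	m := map[string]*int64{"a": &n, "b": nil}
	keys, vals, nulls := flattenMap(m, []string{"a", "b"})
	fmt.Println(keys, vals, nulls) // [a b] [42 0] [false true]
}
```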