Finish collection builder #109

Merged 2 commits on May 3, 2024
11 changes: 4 additions & 7 deletions examples/search.rs
@@ -1,8 +1,8 @@
use anyhow::Result;
use qdrant_client::prelude::*;
use qdrant_client::qdrant::Distance;
use qdrant_client::qdrant::{
Condition, CreateCollectionBuilder, Filter, SearchPoints, VectorParamsBuilder,
Condition, CreateCollectionBuilder, Distance, Filter, QuantizationType,
ScalarQuantizationBuilder, SearchPoints, VectorParamsBuilder,
};
use serde_json::json;

@@ -31,11 +31,8 @@ async fn main() -> Result<()> {
.create_collection(
&CreateCollectionBuilder::default()
.collection_name(collection_name)
.vectors_config(
VectorParamsBuilder::default()
.distance(Distance::Cosine)
.size(10),
)
.vectors_config(VectorParamsBuilder::new(300, Distance::Cosine))
.quantization_config(ScalarQuantizationBuilder::new(QuantizationType::Int8))
Comment on lines +34 to +35

@generall what is your opinion on this; having some required parameters in new()?

I think it is fine. But we'd have to make sure all parameters we add in the future are optional, otherwise it would fail to compile again.

And this is better than a runtime panic for missing fields.

Yeah, we discussed it briefly in Slack. I think having ::new with required params is better.

.build(),
)
.await?;
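
The trade-off discussed in the review comments above can be sketched with two hand-written builders. This is only an illustration: `LooseBuilder`, `StrictBuilder`, and the simplified `VectorParams` and `Distance` types are hypothetical stand-ins, not the code generated for this crate, and the loose variant returns an `Err` where the comment speaks of a runtime panic.

```rust
/// Hypothetical stand-in for the crate's `Distance` enum.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Distance {
    Cosine,
    Dot,
}

/// Hypothetical, simplified stand-in for the generated `VectorParams`.
#[derive(Debug, PartialEq)]
pub struct VectorParams {
    pub size: u64,
    pub distance: Distance,
    pub on_disk: Option<bool>,
}

/// Every field starts unset, so a missing required field is only
/// noticed when `build` runs.
#[derive(Default)]
pub struct LooseBuilder {
    size: Option<u64>,
    distance: Option<Distance>,
    on_disk: Option<bool>,
}

impl LooseBuilder {
    pub fn size(mut self, size: u64) -> Self {
        self.size = Some(size);
        self
    }

    pub fn distance(mut self, distance: Distance) -> Self {
        self.distance = Some(distance);
        self
    }

    /// Fails at runtime if a required field was never set.
    pub fn build(self) -> Result<VectorParams, String> {
        Ok(VectorParams {
            size: self.size.ok_or_else(|| "`size` must be set".to_string())?,
            distance: self
                .distance
                .ok_or_else(|| "`distance` must be set".to_string())?,
            on_disk: self.on_disk,
        })
    }
}

/// Required fields are arguments of `new`, so forgetting one is a
/// compile error and `build` cannot fail.
pub struct StrictBuilder {
    size: u64,
    distance: Distance,
    on_disk: Option<bool>,
}

impl StrictBuilder {
    pub fn new(size: u64, distance: Distance) -> Self {
        Self { size, distance, on_disk: None }
    }

    /// Optional fields keep ordinary setters.
    pub fn on_disk(mut self, on_disk: bool) -> Self {
        self.on_disk = Some(on_disk);
        self
    }

    pub fn build(self) -> VectorParams {
        VectorParams {
            size: self.size,
            distance: self.distance,
            on_disk: self.on_disk,
        }
    }
}
```

With the strict variant, `StrictBuilder::new(300, Distance::Cosine).on_disk(true).build()` always yields a value. The cost is the one raised in the review: adding another required argument to `new` later would break existing callers at compile time, so new fields would need to be optional.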
63 changes: 53 additions & 10 deletions src/grpc_ext.rs
@@ -8,17 +8,18 @@ use crate::qdrant::{
alias_operations, condition, group_id, points_update_operation, quantization_config,
quantization_config_diff, r#match, read_consistency, shard_key, start_from, target_vector,
vector_example, vectors_config, vectors_config_diff, with_payload_selector,
with_vectors_selector, AliasOperations, BinaryQuantization, Condition, CreateAlias,
DeleteAlias, Disabled, FieldCondition, Filter, GeoLineString, GeoPoint, GroupId,
HasIdCondition, IntegerIndexParams, IsEmptyCondition, IsNullCondition, ListValue, Match,
NamedVectors, NestedCondition, PayloadExcludeSelector, PayloadIncludeSelector,
with_vectors_selector, AliasOperations, BinaryQuantization, BinaryQuantizationBuilder,
Condition, CreateAlias, DeleteAlias, Disabled, FieldCondition, Filter, GeoLineString, GeoPoint,
GroupId, HasIdCondition, IntegerIndexParams, IsEmptyCondition, IsNullCondition, ListValue,
Match, NamedVectors, NestedCondition, PayloadExcludeSelector, PayloadIncludeSelector,
PayloadIndexParams, PointId, PointStruct, PointsIdsList, PointsSelector, PointsUpdateOperation,
ProductQuantization, QuantizationConfig, QuantizationConfigDiff, ReadConsistency, RenameAlias,
RepeatedIntegers, RepeatedStrings, ScalarQuantization, ShardKey, ShardKeySelector,
SparseIndexConfig, SparseIndices, SparseVectorConfig, SparseVectorParams, StartFrom, Struct,
TargetVector, TextIndexParams, Value, Vector, VectorExample, VectorParams, VectorParamsBuilder,
VectorParamsDiff, VectorParamsDiffMap, VectorParamsMap, Vectors, VectorsConfig,
VectorsConfigDiff, VectorsSelector, WithPayloadSelector, WithVectorsSelector,
ProductQuantization, ProductQuantizationBuilder, QuantizationConfig, QuantizationConfigDiff,
ReadConsistency, RenameAlias, RepeatedIntegers, RepeatedStrings, ScalarQuantization,
ScalarQuantizationBuilder, ShardKey, ShardKeySelector, SparseIndexConfig, SparseIndices,
SparseVectorConfig, SparseVectorParams, StartFrom, Struct, TargetVector, TextIndexParams,
Value, Vector, VectorExample, VectorParams, VectorParamsBuilder, VectorParamsDiff,
VectorParamsDiffMap, VectorParamsMap, Vectors, VectorsConfig, VectorsConfigDiff,
VectorsSelector, WithPayloadSelector, WithVectorsSelector,
};
use std::collections::HashMap;

@@ -772,8 +773,50 @@ impl From<Vec<PointId>> for PointsIdsList {
}
}

impl From<VectorParamsBuilder> for vectors_config::Config {
fn from(value: VectorParamsBuilder) -> Self {
value.build().into()
}
}

impl From<&mut VectorParamsBuilder> for vectors_config::Config {
fn from(value: &mut VectorParamsBuilder) -> Self {
value.build().into()
}
}

impl From<ScalarQuantizationBuilder> for quantization_config::Quantization {
fn from(value: ScalarQuantizationBuilder) -> Self {
Self::Scalar(value.build())
}
}

impl From<&mut ScalarQuantizationBuilder> for quantization_config::Quantization {
fn from(value: &mut ScalarQuantizationBuilder) -> Self {
Self::Scalar(value.build())
}
}

impl From<ProductQuantizationBuilder> for quantization_config::Quantization {
fn from(value: ProductQuantizationBuilder) -> Self {
Self::Product(value.build())
}
}

impl From<&mut ProductQuantizationBuilder> for quantization_config::Quantization {
fn from(value: &mut ProductQuantizationBuilder) -> Self {
Self::Product(value.build())
}
}

impl From<BinaryQuantizationBuilder> for quantization_config::Quantization {
fn from(value: BinaryQuantizationBuilder) -> Self {
Self::Binary(value.build())
}
}

impl From<&mut BinaryQuantizationBuilder> for quantization_config::Quantization {
fn from(value: &mut BinaryQuantizationBuilder) -> Self {
Self::Binary(value.build())
}
}
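
A minimal sketch of why the conversions above come in pairs, one for the builder by value and one for `&mut` builder. It assumes setters that take `&mut self` and return `&mut Self`, so a chained call ends in a mutable reference rather than an owned builder. All types here are simplified, hypothetical stand-ins for the generated ones.

```rust
/// Hypothetical, simplified stand-in for the generated message type.
#[derive(Debug, Clone, PartialEq)]
pub struct ScalarQuantization {
    pub always_ram: Option<bool>,
}

/// Hypothetical stand-in for `quantization_config::Quantization`.
#[derive(Debug, Clone, PartialEq)]
pub enum Quantization {
    Scalar(ScalarQuantization),
}

#[derive(Default)]
pub struct ScalarQuantizationBuilder {
    always_ram: Option<bool>,
}

impl ScalarQuantizationBuilder {
    /// Setter in the `&mut self -> &mut Self` style.
    pub fn always_ram(&mut self, value: bool) -> &mut Self {
        self.always_ram = Some(value);
        self
    }

    pub fn build(&self) -> ScalarQuantization {
        ScalarQuantization { always_ram: self.always_ram }
    }
}

// Conversion for a builder passed by value, e.g. straight from a constructor.
impl From<ScalarQuantizationBuilder> for Quantization {
    fn from(value: ScalarQuantizationBuilder) -> Self {
        Self::Scalar(value.build())
    }
}

// Conversion for the `&mut` builder that a chain of setters ends in.
impl From<&mut ScalarQuantizationBuilder> for Quantization {
    fn from(value: &mut ScalarQuantizationBuilder) -> Self {
        Self::Scalar(value.build())
    }
}

#[derive(Default)]
pub struct CreateCollectionBuilder {
    quantization_config: Option<Quantization>,
}

impl CreateCollectionBuilder {
    /// Accepts anything convertible into `Quantization`, so callers can
    /// hand over a builder without calling `build()` themselves.
    pub fn quantization_config<Q: Into<Quantization>>(&mut self, value: Q) -> &mut Self {
        self.quantization_config = Some(value.into());
        self
    }

    pub fn quantization(&self) -> Option<&Quantization> {
        self.quantization_config.as_ref()
    }
}
```

Both `quantization_config(ScalarQuantizationBuilder::default())` and `quantization_config(ScalarQuantizationBuilder::default().always_ram(true))` compile in this sketch; the second one relies on the `&mut` conversion.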
80 changes: 77 additions & 3 deletions src/qdrant.rs
@@ -1,6 +1,6 @@
// This file is @generated by prost-build.
#[derive(derive_builder::Builder)]
#[builder(build_fn(private, error = "std::convert::Infallible", name = "build_inner"))]
#[builder(build_fn(private, name = "build_inner"), custom_constructor)]
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct VectorParams {
@@ -18,7 +18,13 @@ pub struct VectorParams {
pub hnsw_config: ::core::option::Option<HnswConfigDiff>,
/// Configuration of vector quantization config. If omitted - the collection configuration will be used
#[prost(message, optional, tag = "4")]
#[builder(default, setter(into, strip_option))]
#[builder(
setter(into, strip_option),
field(
ty = "Option<quantization_config::Quantization>",
build = "convert_option(&self.quantization_config)"
)
)]
pub quantization_config: ::core::option::Option<QuantizationConfig>,
/// If true - serve vectors from disk. If set to false, the vectors will be loaded in RAM.
#[prost(bool, optional, tag = "5")]
@@ -226,26 +232,34 @@ pub struct SparseIndexConfig {
#[prost(bool, optional, tag = "2")]
pub on_disk: ::core::option::Option<bool>,
}
#[derive(derive_builder::Builder)]
#[builder(build_fn(private, error = "std::convert::Infallible", name = "build_inner"))]
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct WalConfigDiff {
/// Size of a single WAL block file
#[prost(uint64, optional, tag = "1")]
#[builder(default, setter(strip_option))]
pub wal_capacity_mb: ::core::option::Option<u64>,
/// Number of segments to create in advance
#[prost(uint64, optional, tag = "2")]
#[builder(default, setter(strip_option))]
pub wal_segments_ahead: ::core::option::Option<u64>,
}
#[derive(derive_builder::Builder)]
#[builder(build_fn(private, error = "std::convert::Infallible", name = "build_inner"))]
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct OptimizersConfigDiff {
///
/// The minimal fraction of deleted vectors in a segment, required to perform segment optimization
#[prost(double, optional, tag = "1")]
#[builder(default, setter(strip_option))]
pub deleted_threshold: ::core::option::Option<f64>,
///
/// The minimal number of vectors in a segment, required to perform segment optimization
#[prost(uint64, optional, tag = "2")]
#[builder(default, setter(strip_option))]
pub vacuum_min_vector_number: ::core::option::Option<u64>,
///
/// Target amount of segments the optimizer will try to keep.
@@ -257,6 +271,7 @@ pub struct OptimizersConfigDiff {
/// It is recommended to select the default number of segments as a factor of the number of search threads,
/// so that each segment would be handled evenly by one of the threads.
#[prost(uint64, optional, tag = "3")]
#[builder(default, setter(strip_option))]
pub default_segment_number: ::core::option::Option<u64>,
///
/// Do not create segments larger this size (in kilobytes).
@@ -268,6 +283,7 @@ pub struct OptimizersConfigDiff {
/// Note: 1Kb = 1 vector of size 256
/// If not set, will be automatically selected considering the number of available CPUs.
#[prost(uint64, optional, tag = "4")]
#[builder(default, setter(strip_option))]
pub max_segment_size: ::core::option::Option<u64>,
///
/// Maximum size (in kilobytes) of vectors to store in-memory per segment.
@@ -279,6 +295,7 @@ pub struct OptimizersConfigDiff {
///
/// Note: 1Kb = 1 vector of size 256
#[prost(uint64, optional, tag = "5")]
#[builder(default, setter(strip_option))]
pub memmap_threshold: ::core::option::Option<u64>,
///
/// Maximum size (in kilobytes) of vectors allowed for plain index, exceeding this threshold will enable vector indexing
@@ -289,19 +306,24 @@ pub struct OptimizersConfigDiff {
///
/// Note: 1kB = 1 vector of size 256.
#[prost(uint64, optional, tag = "6")]
#[builder(default, setter(strip_option))]
pub indexing_threshold: ::core::option::Option<u64>,
///
/// Interval between forced flushes.
#[prost(uint64, optional, tag = "7")]
#[builder(default, setter(strip_option))]
pub flush_interval_sec: ::core::option::Option<u64>,
///
/// Max number of threads (jobs) for running optimizations per shard.
/// Note: each optimization job will also use `max_indexing_threads` threads by itself for index building.
/// If null - have no limit and choose dynamically to saturate CPU.
/// If 0 - no optimization threads, optimizations will be disabled.
#[prost(uint64, optional, tag = "8")]
#[builder(default, setter(strip_option))]
pub max_optimization_threads: ::core::option::Option<u64>,
}
#[derive(derive_builder::Builder)]
#[builder(build_fn(private, name = "build_inner"), custom_constructor)]
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct ScalarQuantization {
@@ -310,11 +332,15 @@ pub struct ScalarQuantization {
pub r#type: i32,
/// Number of bits to use for quantization
#[prost(float, optional, tag = "2")]
#[builder(default, setter(strip_option))]
pub quantile: ::core::option::Option<f32>,
/// If true - quantized vectors always will be stored in RAM, ignoring the config of main storage
#[prost(bool, optional, tag = "3")]
#[builder(default, setter(strip_option))]
pub always_ram: ::core::option::Option<bool>,
}
#[derive(derive_builder::Builder)]
#[builder(build_fn(private, name = "build_inner"), custom_constructor)]
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct ProductQuantization {
@@ -323,13 +349,17 @@ pub struct ProductQuantization {
pub compression: i32,
/// If true - quantized vectors always will be stored in RAM, ignoring the config of main storage
#[prost(bool, optional, tag = "2")]
#[builder(default, setter(strip_option))]
pub always_ram: ::core::option::Option<bool>,
}
#[derive(derive_builder::Builder)]
#[builder(build_fn(private, name = "build_inner"), custom_constructor)]
#[allow(clippy::derive_partial_eq_without_eq)]
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct BinaryQuantization {
/// If true - quantized vectors always will be stored in RAM, ignoring the config of main storage
#[prost(bool, optional, tag = "1")]
#[builder(default, setter(strip_option))]
pub always_ram: ::core::option::Option<bool>,
}
#[allow(clippy::derive_partial_eq_without_eq)]
@@ -432,7 +462,13 @@ pub struct CreateCollection {
pub init_from_collection: ::core::option::Option<::prost::alloc::string::String>,
/// Quantization configuration of vector
#[prost(message, optional, tag = "14")]
#[builder(default, setter(into, strip_option))]
#[builder(
setter(into, strip_option),
field(
ty = "Option<quantization_config::Quantization>",
build = "convert_option(&self.quantization_config)"
)
)]
pub quantization_config: ::core::option::Option<QuantizationConfig>,
/// Sharding method
#[prost(enumeration = "ShardingMethod", optional, tag = "15")]
@@ -7480,4 +7516,42 @@ where
builder_type_conversions!(CreateCollection, CreateCollectionBuilder);
builder_type_conversions!(VectorParams, VectorParamsBuilder);
builder_type_conversions!(HnswConfigDiff, HnswConfigDiffBuilder);
builder_type_conversions!(ScalarQuantization, ScalarQuantizationBuilder);
builder_type_conversions!(ProductQuantization, ProductQuantizationBuilder);
builder_type_conversions!(BinaryQuantization, BinaryQuantizationBuilder);
builder_type_conversions!(OptimizersConfigDiff, OptimizersConfigDiffBuilder);
builder_type_conversions!(WalConfigDiff, WalConfigDiffBuilder);

impl VectorParamsBuilder {
pub fn new(size: u64, distance: Distance) -> Self {
let mut builder = Self::create_empty();
builder.size = Some(size);
builder.distance = Some(distance.into());
builder
}
}

impl ScalarQuantizationBuilder {
pub fn new(r#type: QuantizationType) -> Self {
let mut builder = Self::create_empty();
builder.r#type = Some(r#type.into());
builder
}
}

impl ProductQuantizationBuilder {
pub fn new(compression: i32) -> Self {
let mut builder = Self::create_empty();
builder.compression = Some(compression);
builder
}
}

impl BinaryQuantizationBuilder {
pub fn new(always_ram: bool) -> Self {
let mut builder = Self::create_empty();
builder.always_ram = Some(Some(always_ram));
builder
}
}
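
The `Some(Some(always_ram))` in the last constructor can look odd at first. A minimal sketch of where the double wrapping comes from, assuming the builder stores each target field inside an extra `Option` that records whether it was set. The types below are hand-written, hypothetical stand-ins, and `create_empty` is written out by hand rather than generated.

```rust
/// Hypothetical, simplified stand-in for the generated message type.
#[derive(Debug, PartialEq)]
pub struct BinaryQuantization {
    pub always_ram: Option<bool>,
}

pub struct BinaryQuantizationBuilder {
    // Outer Option: has the field been set on the builder?
    // Inner Option: the field's own type on the target struct.
    always_ram: Option<Option<bool>>,
}

impl BinaryQuantizationBuilder {
    /// Hand-written stand-in for the empty constructor used in the diff.
    fn create_empty() -> Self {
        Self { always_ram: None }
    }

    /// Mirrors the constructor in the diff: the value is wrapped twice,
    /// once for "set on the builder" and once for the optional field.
    pub fn new(always_ram: bool) -> Self {
        let mut builder = Self::create_empty();
        builder.always_ram = Some(Some(always_ram));
        builder
    }

    /// An unset builder field falls back to `None` on the target struct.
    pub fn build(&self) -> BinaryQuantization {
        BinaryQuantization { always_ram: self.always_ram.flatten() }
    }
}
```

For `ScalarQuantizationBuilder::new` and `ProductQuantizationBuilder::new` a single `Some(...)` is enough, because `r#type` and `compression` are plain `i32` fields on the target struct rather than optional ones.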
timvisee marked this conversation as resolved.