Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CQL Vector support #1165

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions scylla-cql/src/frame/frame_errors.rs
Original file line number Diff line number Diff line change
Expand Up @@ -454,6 +454,8 @@ pub enum CqlTypeParseError {
TupleLengthParseError(LowLevelDeserializationError),
#[error("CQL Type not yet implemented, id: {0}")]
Comment on lines 454 to 455
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 Commit frame: added a parser for Java-class type names:
There must be an empty line after the commit name. Otherwise, git considers the subsequent message continuation of the commit name and displays it in a strange way.

TypeNotImplemented(u16),
#[error("Failed to parse abstract type")]
AbstractTypeParseError(),
Comment on lines +457 to +458
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. What is an abstract type? I thought that what TypeParser does is it parses some specific Custom CQL types. Maybe it should be called CustomTypeParseError.
  2. Need more context - what exactly failed during parsing of custom type? I propose to create a new error type called CustomTypeParseError. It should be an enum with variants corresponding to the possible cause of failures. Then CqlTypeParseError could have a variant like:
#[error("Failed to parse custom CQL type: {0}")]
CustomTypeParseError(#[from] CustomTypeParseError)

cc: @wprzytula

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: it's idiomatic to avoid the parentheses if the list of arguments for the variant is empty

}

/// A low level deserialization error.
Expand Down Expand Up @@ -485,6 +487,8 @@ pub enum LowLevelDeserializationError {
InvalidInetLength(u8),
#[error("UTF8 deserialization failed: {0}")]
UTF8DeserializationError(#[from] std::str::Utf8Error),
#[error(transparent)]
ParseIntError(#[from] std::num::ParseIntError),
Comment on lines 487 to +491
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is some remnant from one of your initial implementations. It's not needed anymore AFAIU (I deleted it locally and everything compiled).

}

impl From<std::io::Error> for LowLevelDeserializationError {
Expand Down
1 change: 1 addition & 0 deletions scylla-cql/src/frame/response/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ pub mod error;
pub mod event;
pub mod result;
pub mod supported;
pub mod type_parser;

use std::sync::Arc;

Expand Down
296 changes: 296 additions & 0 deletions scylla-cql/src/frame/response/type_parser.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,296 @@
use crate::frame::frame_errors::CqlTypeParseError;
use std::{borrow::Cow, char, str::from_utf8};

use super::result::ColumnType;

type UDTParameters<'result> = (
Cow<'result, str>,
Cow<'result, str>,
Vec<(Cow<'result, str>, ColumnType<'result>)>,
);
Comment on lines +6 to +10
Copy link
Contributor

@muzarski muzarski Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd prefer to have it as a struct instead of a type alias. Cow<'result, str> type appears twice and it's hard to reason about it without explicit field names.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd prefer to have it as a struct instead of a newtype.

Actually, a struct is called a newtype. type is a type alias, which is not a new type, yet just a new name for an existing type.


pub(crate) struct TypeParser<'result> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit sad that you had to introduce this type from scratch, we already have very similar parsing utilities in the scylla crate (scylla::utils::parse::ParserState).

I learned from @wprzytula that he suggested moving the scylla::utils::parse module to scylla-cql and then rework your TypeParser to reuse the existing code. I highly suggest that you do that, we would rather avoid maintaining two separate parsers.

pos: usize,
str: Cow<'result, str>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Let's not use str as a name for a field - it's a name of a type.

}
Comment on lines +12 to +15
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Probably need to rename it to CustomTypeParser (or AbstractTypeParser if we decide to stick to abstract naming convention). Same goes for the name of the module - type_parser.rs is not specific enough IMO.


impl<'result> TypeParser<'result> {
fn new(str: Cow<'result, str>) -> TypeParser<'result> {
TypeParser { pos: 0, str }
}

pub(crate) fn parse(str: Cow<'result, str>) -> Result<ColumnType<'result>, CqlTypeParseError> {
let mut parser = TypeParser::new(str);
parser.do_parse()
}
Comment on lines +22 to +25
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the functions/methods in this module unnecessarily return such broad error type as CqlTypeParseError. We could narrow it - see my other comment about introducing separate error type for custom type parsing failures.

Comment on lines +18 to +25
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Let's not use str as a name for a variable - it's a name of a type.


fn is_eos(&self) -> bool {
self.pos >= self.str.len()
}

fn is_blank(c: char) -> bool {
c == ' ' || c == '\t' || c == '\n'
}

fn is_identifier_char(c: char) -> bool {
c.is_alphanumeric() || c == '+' || c == '-' || c == '_' || c == '.' || c == '&'
}

fn char_at_pos(&self) -> char {
self.str.as_bytes()[self.pos] as char
}
Comment on lines +39 to +41
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 This may panic. Wouldn't it be better to use the checked get(self.pos) method?


fn read_next_identifier(&mut self) -> Cow<'result, str> {
let start = self.pos;
while !self.is_eos() && TypeParser::is_identifier_char(self.char_at_pos()) {
self.pos += 1;
}
match &self.str {
Cow::Borrowed(s) => Cow::Borrowed(&s[start..self.pos]),
Cow::Owned(s) => Cow::Owned(s[start..self.pos].to_owned()),
}
Comment on lines +43 to +51
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 This logic requires comments.

}

fn skip_blank(&mut self) -> usize {
while !self.is_eos() && TypeParser::is_blank(self.char_at_pos()) {
self.pos += 1;
}
self.pos
}

fn skip_blank_and_comma(&mut self) -> bool {
let mut comma_found = false;
while !self.is_eos() {
let c = self.char_at_pos();
if c == ',' {
if comma_found {
return true;
} else {
comma_found = true;
}
} else if !TypeParser::is_blank(c) {
return true;
}
self.pos += 1;
}
false
}
Comment on lines +61 to +77
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 This logic needs comments.


fn get_simple_abstract_type(
name: Cow<'result, str>,
) -> Result<ColumnType<'result>, CqlTypeParseError> {
let string_class_name: String;
let class_name: Cow<'result, str>;
if name.contains("org.apache.cassandra.db.marshal.") {
class_name = name
} else {
string_class_name = "org.apache.cassandra.db.marshal.".to_owned() + &name;
class_name = Cow::Owned(string_class_name);
}
Comment on lines +82 to +89
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 This can be made more FP:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let string_class_name: String;
let class_name: Cow<'result, str>;
if name.contains("org.apache.cassandra.db.marshal.") {
class_name = name
} else {
string_class_name = "org.apache.cassandra.db.marshal.".to_owned() + &name;
class_name = Cow::Owned(string_class_name);
}
let class_name: Cow<'result, str> = if name.contains("org.apache.cassandra.db.marshal.") {
name
} else {
let string_class_name: String = "org.apache.cassandra.db.marshal.".to_owned() + &name;
Cow::Owned(string_class_name);
}


match class_name.as_ref() {
"org.apache.cassandra.db.marshal.AsciiType" => Ok(ColumnType::Ascii),
"org.apache.cassandra.db.marshal.BooleanType" => Ok(ColumnType::Boolean),
"org.apache.cassandra.db.marshal.BytesType" => Ok(ColumnType::Blob),
"org.apache.cassandra.db.marshal.CounterColumnType" => Ok(ColumnType::Counter),
"org.apache.cassandra.db.marshal.DateType" => Ok(ColumnType::Date),
"org.apache.cassandra.db.marshal.DecimalType" => Ok(ColumnType::Decimal),
"org.apache.cassandra.db.marshal.DoubleType" => Ok(ColumnType::Double),
"org.apache.cassandra.db.marshal.DurationType" => Ok(ColumnType::Duration),
"org.apache.cassandra.db.marshal.FloatType" => Ok(ColumnType::Float),
"org.apache.cassandra.db.marshal.InetAddressType" => Ok(ColumnType::Inet),
"org.apache.cassandra.db.marshal.Int32Type" => Ok(ColumnType::Int),
"org.apache.cassandra.db.marshal.IntegerType" => Ok(ColumnType::Varint),
"org.apache.cassandra.db.marshal.LongType" => Ok(ColumnType::BigInt),
"org.apache.cassandra.db.marshal.SimpleDateType" => Ok(ColumnType::Date),
"org.apache.cassandra.db.marshal.ShortType" => Ok(ColumnType::SmallInt),
"org.apache.cassandra.db.marshal.UTF8Type" => Ok(ColumnType::Text),
"org.apache.cassandra.db.marshal.ByteType" => Ok(ColumnType::TinyInt),
"org.apache.cassandra.db.marshal.UUIDType" => Ok(ColumnType::Uuid),
"org.apache.cassandra.db.marshal.TimeUUIDType" => Ok(ColumnType::Timeuuid),
"org.apache.cassandra.db.marshal.SmallIntType" => Ok(ColumnType::SmallInt),
"org.apache.cassandra.db.marshal.TinyIntType" => Ok(ColumnType::TinyInt),
"org.apache.cassandra.db.marshal.TimeType" => Ok(ColumnType::Time),
"org.apache.cassandra.db.marshal.TimestampType" => Ok(ColumnType::Timestamp),
_ => Err(CqlTypeParseError::AbstractTypeParseError()),
}
Comment on lines +82 to +116
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ IIUC, name may contain or not contain the org.apache.cassandra.db.marshal. prefix, and you want to handle both scenarios. For it, you add the prefix if it's not present, which unfortunately allocates.

🔧 If that's the case, I suggest the opposite direction: if the prefix is present, let's skip it, because slicing a &str is a cheap operation without allocations incurred.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let string_class_name: String;
let class_name: Cow<'result, str>;
if name.contains("org.apache.cassandra.db.marshal.") {
class_name = name
} else {
string_class_name = "org.apache.cassandra.db.marshal.".to_owned() + &name;
class_name = Cow::Owned(string_class_name);
}
match class_name.as_ref() {
"org.apache.cassandra.db.marshal.AsciiType" => Ok(ColumnType::Ascii),
"org.apache.cassandra.db.marshal.BooleanType" => Ok(ColumnType::Boolean),
"org.apache.cassandra.db.marshal.BytesType" => Ok(ColumnType::Blob),
"org.apache.cassandra.db.marshal.CounterColumnType" => Ok(ColumnType::Counter),
"org.apache.cassandra.db.marshal.DateType" => Ok(ColumnType::Date),
"org.apache.cassandra.db.marshal.DecimalType" => Ok(ColumnType::Decimal),
"org.apache.cassandra.db.marshal.DoubleType" => Ok(ColumnType::Double),
"org.apache.cassandra.db.marshal.DurationType" => Ok(ColumnType::Duration),
"org.apache.cassandra.db.marshal.FloatType" => Ok(ColumnType::Float),
"org.apache.cassandra.db.marshal.InetAddressType" => Ok(ColumnType::Inet),
"org.apache.cassandra.db.marshal.Int32Type" => Ok(ColumnType::Int),
"org.apache.cassandra.db.marshal.IntegerType" => Ok(ColumnType::Varint),
"org.apache.cassandra.db.marshal.LongType" => Ok(ColumnType::BigInt),
"org.apache.cassandra.db.marshal.SimpleDateType" => Ok(ColumnType::Date),
"org.apache.cassandra.db.marshal.ShortType" => Ok(ColumnType::SmallInt),
"org.apache.cassandra.db.marshal.UTF8Type" => Ok(ColumnType::Text),
"org.apache.cassandra.db.marshal.ByteType" => Ok(ColumnType::TinyInt),
"org.apache.cassandra.db.marshal.UUIDType" => Ok(ColumnType::Uuid),
"org.apache.cassandra.db.marshal.TimeUUIDType" => Ok(ColumnType::Timeuuid),
"org.apache.cassandra.db.marshal.SmallIntType" => Ok(ColumnType::SmallInt),
"org.apache.cassandra.db.marshal.TinyIntType" => Ok(ColumnType::TinyInt),
"org.apache.cassandra.db.marshal.TimeType" => Ok(ColumnType::Time),
"org.apache.cassandra.db.marshal.TimestampType" => Ok(ColumnType::Timestamp),
_ => Err(CqlTypeParseError::AbstractTypeParseError()),
}
let class_name: &str = if let Some(suffix) = name.strip_prefix("org.apache.cassandra.db.marshal.") {
&*suffix
} else {
name
}
match class_name {
"AsciiType" => Ok(ColumnType::Ascii),
"BooleanType" => Ok(ColumnType::Boolean),
"BytesType" => Ok(ColumnType::Blob),
"CounterColumnType" => Ok(ColumnType::Counter),
"DateType" => Ok(ColumnType::Date),
"DecimalType" => Ok(ColumnType::Decimal),
"DoubleType" => Ok(ColumnType::Double),
"DurationType" => Ok(ColumnType::Duration),
"FloatType" => Ok(ColumnType::Float),
"InetAddressType" => Ok(ColumnType::Inet),
"Int32Type" => Ok(ColumnType::Int),
"IntegerType" => Ok(ColumnType::Varint),
"LongType" => Ok(ColumnType::BigInt),
"SimpleDateType" => Ok(ColumnType::Date),
"ShortType" => Ok(ColumnType::SmallInt),
"UTF8Type" => Ok(ColumnType::Text),
"ByteType" => Ok(ColumnType::TinyInt),
"UUIDType" => Ok(ColumnType::Uuid),
"TimeUUIDType" => Ok(ColumnType::Timeuuid),
"SmallIntType" => Ok(ColumnType::SmallInt),
"TinyIntType" => Ok(ColumnType::TinyInt),
"TimeType" => Ok(ColumnType::Time),
"TimestampType" => Ok(ColumnType::Timestamp),
_ => Err(CqlTypeParseError::AbstractTypeParseError()),
}

}

fn get_type_parameters(&mut self) -> Result<Vec<ColumnType<'result>>, CqlTypeParseError> {
let mut parameters = Vec::new();
if self.is_eos() {
return Ok(parameters);
}
if self.str.as_bytes()[self.pos] != b'(' {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
self.pos += 1;
loop {
if !self.skip_blank_and_comma() {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
if self.str.as_bytes()[self.pos] == b')' {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 Please avoid indexing in favour of checked get(). We don't want to panic in case of a rare bug / unexpected data received from the DB.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using is_eos() and then indexing, let's just pattern match on get() result. If it's None, then it's obviously EOS; otherwise, it's not, and we get this guarantee syntactically.

self.pos += 1;
return Ok(parameters);
}
let typ = self.do_parse()?;
parameters.push(typ);
}
}
Comment on lines +119 to +139
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 This could be rewritten in a FP way, returning an iterator:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn get_type_parameters(&mut self) -> Result<Vec<ColumnType<'result>>, CqlTypeParseError> {
let mut parameters = Vec::new();
if self.is_eos() {
return Ok(parameters);
}
if self.str.as_bytes()[self.pos] != b'(' {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
self.pos += 1;
loop {
if !self.skip_blank_and_comma() {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
if self.str.as_bytes()[self.pos] == b')' {
self.pos += 1;
return Ok(parameters);
}
let typ = self.do_parse()?;
parameters.push(typ);
}
}
fn get_type_parameters(
&mut self,
) -> Result<
impl Iterator<Item = Result<ColumnType<'result>, CqlTypeParseError>> + '_,
CqlTypeParseError,
> {
match self.str.as_bytes().get(self.pos) {
None => return Ok(either::Left(std::iter::empty())),
Some(b'(') => self.pos += 1,
Some(_) => return Err(CqlTypeParseError::AbstractTypeParseError()),
};
Ok(either::Right(std::iter::from_fn(|| {
if !self.skip_blank_and_comma() {
return Some(Err(CqlTypeParseError::AbstractTypeParseError()));
}
if self.str.as_bytes()[self.pos] == b')' {
self.pos += 1;
return None;
}
Some(self.do_parse())
})))
}


fn get_vector_parameters(&mut self) -> Result<(ColumnType<'result>, u32), CqlTypeParseError> {
if self.is_eos() || self.str.as_bytes()[self.pos] != b'(' {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
self.pos += 1;
self.skip_blank_and_comma();
if self.str.as_bytes()[self.pos] == b')' {
return Err(CqlTypeParseError::AbstractTypeParseError());
}

let typ = self.do_parse()?;
self.skip_blank_and_comma();
let start = self.pos;
while !self.is_eos() && char::is_numeric(self.char_at_pos()) {
self.pos += 1;
}
let len = self.str[start..self.pos]
.parse::<u32>()
.map_err(|_| CqlTypeParseError::AbstractTypeParseError())?;
if self.is_eos() || self.str.as_bytes()[self.pos] != b')' {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
self.pos += 1;
Ok((typ, len))
}

fn from_hex(s: Cow<'result, str>) -> Result<Vec<u8>, CqlTypeParseError> {
if s.len() % 2 != 0 {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
for c in s.chars() {
if !c.is_ascii_hexdigit() {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
}
let mut bytes = Vec::new();
for i in 0..s.len() / 2 {
let byte = u8::from_str_radix(&s[i * 2..i * 2 + 2], 16)
.map_err(|_| CqlTypeParseError::AbstractTypeParseError())?;
bytes.push(byte);
}
Ok(bytes)
}

fn get_udt_parameters(&mut self) -> Result<UDTParameters<'result>, CqlTypeParseError> {
if self.is_eos() || self.str.as_bytes()[self.pos] != b'(' {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
self.pos += 1;

self.skip_blank_and_comma();
let keyspace = self.read_next_identifier();
self.skip_blank_and_comma();
let hex_name = &TypeParser::from_hex(self.read_next_identifier())?;
let name = from_utf8(hex_name).map_err(|_| CqlTypeParseError::AbstractTypeParseError())?;
let mut fields = Vec::new();
loop {
if !self.skip_blank_and_comma() {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
if self.str.as_bytes()[self.pos] == b')' {
self.pos += 1;
return Ok((keyspace, Cow::Owned(name.to_owned()), fields));
}
let field_name = self.read_next_identifier();
self.skip_blank_and_comma();
let field_type = self.do_parse()?;
fields.push((field_name, field_type));
}
}

fn get_complex_abstract_type(
&mut self,
name: Cow<'result, str>,
) -> Result<ColumnType<'result>, CqlTypeParseError> {
let string_class_name: String;
let class_name: Cow<'result, str>;
if name.contains("org.apache.cassandra.db.marshal.") {
class_name = name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: semicolon missing at the end of line

} else {
string_class_name = "org.apache.cassandra.db.marshal.".to_owned() + &name;
class_name = Cow::Owned(string_class_name);
}
match class_name.as_ref() {
"org.apache.cassandra.db.marshal.ListType" => {
let mut params = self.get_type_parameters()?;
if params.len() != 1 {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
Ok(ColumnType::List(Box::new(params.remove(0))))
}
Comment on lines +225 to +231
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 If you refactor get_type_parameters to return an Iterator as I suggested above, then usage can be adjusted this way to avoid allocations altogether:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"org.apache.cassandra.db.marshal.ListType" => {
let mut params = self.get_type_parameters()?;
if params.len() != 1 {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
Ok(ColumnType::List(Box::new(params.remove(0))))
}
"org.apache.cassandra.db.marshal.ListType" => {
let mut params = self.get_type_parameters()?;
let param1 = params
.next()
.ok_or(CqlTypeParseError::AbstractTypeParseError())??;
let param2 = params.next();
if param2.is_some() {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
Ok(ColumnType::List(Box::new(param1)))
}

"org.apache.cassandra.db.marshal.SetType" => {
let mut params = self.get_type_parameters()?;
if params.len() != 1 {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
Ok(ColumnType::Set(Box::new(params.remove(0))))
}
"org.apache.cassandra.db.marshal.MapType" => {
let mut params = self.get_type_parameters()?;
if params.len() != 2 {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
Ok(ColumnType::Map(
Box::new(params.remove(0)),
Box::new(params.remove(0)),
))
}
"org.apache.cassandra.db.marshal.TupleType" => {
let params = self.get_type_parameters()?;
if params.is_empty() {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
Ok(ColumnType::Tuple(params))
}
"org.apache.cassandra.db.marshal.VectorType" => {
let (typ, len) = self.get_vector_parameters()?;
Ok(ColumnType::Vector(Box::new(typ), len))
}
"org.apache.cassandra.db.marshal.UserType" => {
let (keyspace, name, fields) = self.get_udt_parameters()?;
Ok(ColumnType::UserDefinedType {
type_name: name,
keyspace,
field_types: fields,
})
}
_ => Err(CqlTypeParseError::AbstractTypeParseError()),
}
}

fn do_parse(&mut self) -> Result<ColumnType<'result>, CqlTypeParseError> {
self.skip_blank();

let mut name = self.read_next_identifier();
if name.is_empty() {
if !self.is_eos() {
return Err(CqlTypeParseError::AbstractTypeParseError());
}
return Ok(ColumnType::Blob);
}

if !self.is_eos() && self.str.as_bytes()[self.pos] == b':' {
self.pos += 1;
let _ = usize::from_str_radix(&name, 16)
.map_err(|_| CqlTypeParseError::AbstractTypeParseError());
name = self.read_next_identifier();
}
Comment on lines +283 to +288
Copy link
Contributor

@muzarski muzarski Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this part do? Is it tested somewhere?

self.skip_blank();
if !self.is_eos() && self.str.as_bytes()[self.pos] == b'(' {
self.get_complex_abstract_type(name)
} else {
TypeParser::get_simple_abstract_type(name)
}
}
}
Comment on lines +295 to +296
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 To sum up, general comments about this module:

  • please avoid operations that may panic (especially indexing). Use pattern matching and this way leverage the type system to prevent bugs,
  • more comments are needed for the parsing logic,
  • there are a few inefficiencies around allocations that can be avoided altogether.