Add typmod string tests
robozmey committed Dec 4, 2024
1 parent 908af43 commit 9440244
Showing 5 changed files with 6,751 additions and 595 deletions.
70 changes: 70 additions & 0 deletions contrib/try_convert/data/tt_bpchar.data
@@ -0,0 +1,70 @@
All the world's a stage,
And all the men and women merely players;
They have their exits and their entrances,
And one man in his time plays many parts,
His acts being seven ages. At first, the infant,
Mewling and puking in the nurse's arms.
Then the whining schoolboy, with his satchel
And shining morning face, creeping like snail
Unwillingly to school. And then the lover,
Sighing like furnace, with a woeful ballad
Made to his mistress' eyebrow. Then a soldier,
Full of strange oaths and bearded like the pard,
Jealous in honor, sudden and quick in quarrel,
Seeking the bubble reputation
Even in the cannon's mouth. And then the justice,
In fair round belly with good capon lined,
With eyes severe and beard of formal cut,
Full of wise saws and modern instances;
And so he plays his part. The sixth age shifts
Into the lean and slippered pantaloon,
With spectacles on nose and pouch on side;
His youthful hose, well saved, a world too wide
For his shrunk shank, and his big manly voice,
Turning again toward childish treble, pipes
And whistles in his sound. Last scene of all,
That ends this strange eventful history,
Is second childishness and mere oblivion,
Sans teeth, sans eyes, sans taste, sans everything.

std::to_string
C++ Strings library std::basic_string
Defined in header <string>
std::string to_string( int value );
std::string to_string( long value );
std::string to_string( long long value );
std::string to_string( unsigned value );
std::string to_string( unsigned long value );
std::string to_string( unsigned long long value );
std::string to_string( float value );
std::string to_string( double value );
std::string to_string( long double value );
Converts a numeric value to std::string.

Let buf be an internal to the conversion functions buffer, sufficiently large to contain the result of conversion.

1) Converts a signed integer to a string as if by std::sprintf(buf, "%d", value).
2) Converts a signed integer to a string as if by std::sprintf(buf, "%ld", value).
3) Converts a signed integer to a string as if by std::sprintf(buf, "%lld", value).
4) Converts an unsigned integer to a string as if by std::sprintf(buf, "%u", value).
5) Converts an unsigned integer to a string as if by std::sprintf(buf, "%lu", value).
6) Converts an unsigned integer to a string as if by std::sprintf(buf, "%llu", value).
7,8) Converts a floating point value to a string as if by std::sprintf(buf, "%f", value).
9) Converts a floating point value to a string as if by std::sprintf(buf, "%Lf", value).
(until C++26)
1-9) Converts a numeric value to a string as if by std::format("{}", value).
(since C++26)
Parameters
Return value
A string holding the converted value.

Exceptions
May throw std::bad_alloc from the std::string constructor.

Notes
With floating point types std::to_string may yield unexpected results as the number of significant digits in the returned string can be zero, see the example.
The return value may differ significantly from what std::cout prints by default, see the example.
std::to_string relies on the current C locale for formatting purposes, and therefore concurrent calls to std::to_string from multiple threads may result in partial serialization of calls.
The results of overloads for integer types do not rely on the current C locale, and thus implementations generally avoid access to the current C locale in these overloads for both correctness and performance. However, such avoidance is not guaranteed by the standard.
(until C++26)
C++17 provides std::to_chars as a higher-performance locale-independent alternative.
185 changes: 144 additions & 41 deletions contrib/try_convert/generate_test.py
@@ -29,8 +29,9 @@
'json', # JSON
'jsonb',
'xml',
# 'bytea', # STRINGS
'char',
# 'bytea',
'char', # STRINGS
# 'bpchar',
'varchar',
'text',
'money',
@@ -43,15 +44,24 @@
'citext',
]

string_types = [
'text',
'citext',
'char',
# 'bpchar',
'varchar',
]

typmod_types = [
'bit',
'varbit',
'char',
'varchar',
# 'bpchar',
]

typmod_lens = [
None, 1, 5, 10
None, 1, 5, 10, 20
]

def get_typemod_type(t, l):
@@ -60,6 +70,12 @@ def get_typemod_type(t, l):
else:
return f'{t}({l})'

def get_typemod_table(t, l):
if l is None:
return f'tt_{t}'
else:
return f'tt_{t}_{l}'

uncomparable_types = [
'json',
'xml',
@@ -181,10 +197,52 @@ def remove_empty_lines(t):

test_funcs = ''

func_text = \
f'CREATE FUNCTION try_convert_by_sql_text(_in text, INOUT _out ANYELEMENT, source_type text)\n' \
f' LANGUAGE plpgsql AS\n' \
f'$func$\n' \
f' BEGIN\n' \
f' EXECUTE format(\'SELECT %L::%s::%s\', $1, source_type, pg_typeof(_out))\n' \
f' INTO _out;\n' \
f' EXCEPTION WHEN others THEN\n' \
f' -- do nothing: _out already carries default\n' \
f' END\n' \
f'$func$;\n'

test_funcs += func_text

func_text = \
f'CREATE FUNCTION try_convert_by_sql_text_with_len_out(_in text, INOUT _out ANYELEMENT, source_type text, len_out int)\n' \
f' LANGUAGE plpgsql AS\n' \
f'$func$\n' \
f' BEGIN\n' \
f' EXECUTE format(\'SELECT %L::%s::%s(%s)\', $1, source_type, pg_typeof(_out), len_out)\n' \
f' INTO _out;\n' \
f' EXCEPTION WHEN others THEN\n' \
f' -- do nothing: _out already carries default\n' \
f' END\n' \
f'$func$;\n'

test_funcs += func_text

for type_name in supported_types:

func_text = \
f'CREATE OR REPLACE FUNCTION try_convert_by_sql(_in {type_name}, INOUT _out ANYELEMENT)\n' \
f'CREATE FUNCTION try_convert_by_sql_with_len_out(_in {type_name}, INOUT _out ANYELEMENT, len_out int)\n' \
f' LANGUAGE plpgsql AS\n' \
f'$func$\n' \
f' BEGIN\n' \
f' EXECUTE format(\'SELECT %L::{type_name}::%s(%s)\', $1, pg_typeof(_out), len_out::text)\n' \
f' INTO _out;\n' \
f' EXCEPTION WHEN others THEN\n' \
f' -- do nothing: _out already carries default\n' \
f' END\n' \
f'$func$;\n'

test_funcs += func_text

func_text = \
f'CREATE FUNCTION try_convert_by_sql(_in {type_name}, INOUT _out ANYELEMENT)\n' \
f' LANGUAGE plpgsql AS\n' \
f'$func$\n' \
f' BEGIN\n' \
@@ -215,43 +273,67 @@ def remove_empty_lines(t):
test_load_data = '-- LOAD DATA\n'

test_load_data += f'CREATE TABLE tt_temp (v text) DISTRIBUTED BY (v);\n'
test_load_data += f'CREATE TABLE tt_temp_citext (v citext) DISTRIBUTED BY (v);\n'

def copy_data(table_name, filename, type_name):
return f'DELETE FROM tt_temp;' \
return f'DELETE FROM tt_temp;\n' \
f'COPY tt_temp from \'@abs_srcdir@/{filename}\';\n' \
f'INSERT INTO {table_name}(id, v) SELECT row_number() OVER(), v::{type_name} from tt_temp;'

type_tables = {}

def create_table(type_name, varlen=None):
table_name = f'tt_{type_name}'
field_type = type_name

if varlen is not None:
table_name = f'tt_{type_name}_{varlen}'
field_type = f'{type_name}({varlen})'
table_name = get_typemod_table(type_name, varlen)
field_type = get_typemod_type(type_name, varlen)

type_tables[type_name] = table_name

load_data = f'CREATE TABLE {table_name} (id serial, v {field_type}) DISTRIBUTED BY (id);\n'

filename = f'data/tt_{type_name}.data'

load_data += copy_data(table_name, filename, type_name) + '\n'
load_data += copy_data(table_name, filename, field_type) + '\n'

# load_data += f'SELECT * FROM {table_name};'

return load_data

def get_string_table(type_name, string_type, type_varlen=None, string_varlen=None):

if type_varlen is not None and string_varlen is not None:
return f'tt_{string_type}_{string_varlen}_of_{type_name}_{type_varlen}'
elif type_varlen is not None:
return f'tt_{string_type}_of_{type_name}_{type_varlen}'
elif string_varlen is not None:
return f'tt_{string_type}_{string_varlen}_of_{type_name}'

return f'tt_{string_type}_of_{type_name}'

for type_name in supported_types:

if type_name in typmod_types:
for varlen in typmod_lens:
test_load_data += create_table(type_name, varlen)
for type_varlen in typmod_lens:
if type_varlen is not None and type_name not in typmod_types:
continue

else:
test_load_data += create_table(type_name)
test_load_data += create_table(type_name, type_varlen)

for string_type in string_types:
for string_varlen in typmod_lens:
if string_varlen is not None and string_type not in typmod_types:
continue

field_type = get_typemod_type(type_name, type_varlen)
string_field_type = get_typemod_type(string_type, string_varlen)

table_name = get_string_table(type_name, string_type, type_varlen, string_varlen)

load_data = f'CREATE TABLE {table_name} (id serial, v {string_field_type}) DISTRIBUTED BY (id);\n'

cut = f'::{field_type}' if type_varlen is not None else ''

load_data += f'INSERT INTO {table_name}(id, v) SELECT row_number() OVER(), v{cut}::{string_field_type} from tt_temp;\n'

test_load_data += load_data



## GET DATA
@@ -272,15 +354,31 @@ def get_from_data(type_name, i = None):

## TEST

def create_test(source_name, target_name, test_data, default='NULL'):
def create_test(source_name, target_name, test_data, default='NULL', source_varlen=None, target_varlen=None):

test_filter = 'v1 is distinct from v2' if target_name not in uncomparable_types else 'v1::text is distinct from v2::text'

try_convert_sql = f'try_convert_by_sql(v, {default}::{target_name})'

if target_varlen is not None:
try_convert_sql = f'try_convert_by_sql_with_len_out(v, {default}::{target_name}, {target_varlen})'

if source_varlen is not None or source_name in ['bpchar']:

source_name_1 = get_typemod_type(source_name, source_varlen)

try_convert_sql = f'try_convert_by_sql_text(v::text, {default}::{target_name}, \'{source_name_1}\'::text)'

if target_varlen is not None:
try_convert_sql = f'try_convert_by_sql_text_with_len_out(v::text, {default}::{target_name}, \'{source_name_1}\'::text, {target_varlen})'

target_name_1 = get_typemod_type(target_name, target_varlen)

query = \
f'select * from (' \
f'select ' \
f'try_convert(v, {default}::{target_name}) as v1, ' \
f'try_convert_by_sql(v, {default}::{target_name}) as v2' \
f'try_convert(v, {default}::{target_name_1}) as v1, ' \
f'{try_convert_sql} as v2' \
f' from {test_data}' \
f') as t(v1, v2) where {test_filter};'
result = \
@@ -299,24 +397,30 @@ def create_test(source_name, target_name, test_data, default='NULL'):
text_tests_in = []
text_tests_out = []

# text_types = [('text', 'tt_temp'), ('citext', 'tt_temp_citext')]
default_value = 'NULL'

# for text_type, text_type_table in text_types:
for string_type in string_types:
for string_varlen in typmod_lens:
if string_varlen is not None and string_type not in typmod_types:
continue

# for type_name in supported_types:
for type_name in supported_types:
for type_varlen in typmod_lens:
if type_varlen is not None and type_name not in typmod_types:
continue

# test_type_data = get_data(type_name)
test_type_table = get_typemod_table(type_name, type_varlen)

# load_text_data_text = f'DELETE FROM {text_type_table}; COPY {text_type_table} from \'@abs_srcdir@/data/tt_{type_name}.data\';'
text_type_table = get_string_table(type_name, string_type, type_varlen, string_varlen)

# test_corrupted_text_data = f'(select (\'!@#%^&*\' || v || \'!@#%^&*\') from {text_type_table}) as t(v)'
test_corrupted_text_data = f'(select (\'!@#%^&*\' || v || \'!@#%^&*\') from {text_type_table}) as t(v)'

# to_text_in, to_text_out = create_test(type_name, text_type, test_type_data)
# from_text_in, from_text_out = create_test(text_type, type_name, text_type_table)
# from_corrupted_text_in, from_corrupted_text_out = create_test(text_type, type_name, test_corrupted_text_data)
to_text_in, to_text_out = create_test(type_name, string_type, test_type_table, default_value, type_varlen, string_varlen)
from_text_in, from_text_out = create_test(string_type, type_name, text_type_table, default_value, string_varlen, type_varlen)
from_corrupted_text_in, from_corrupted_text_out = create_test(string_type, type_name, test_corrupted_text_data, default_value, string_varlen, type_varlen)

# text_tests_in += [to_text_in, load_text_data_text, from_text_in, from_corrupted_text_in]
# text_tests_out += [to_text_out, load_text_data_text, from_text_out, from_corrupted_text_out]
text_tests_in += [to_text_in, from_text_in]
text_tests_out += [to_text_out, from_text_out]

# print(text_tests_in[0])
# print(text_tests_in[1])
@@ -331,24 +435,23 @@ def create_test(source_name, target_name, test_data, default='NULL'):
+ extension_casts

for source_name, target_name in type_casts:

if (source_name not in supported_types or target_name not in supported_types):
continue

d = f'\'{get_from_data(target_name, 0)}\''

for default in ['NULL']:

test_data = get_data(source_name)
for source_varlen in typmod_lens:
if source_varlen is not None and source_name not in typmod_types:
continue

for varlen1 in typmod_lens:
for varlen2 in typmod_lens:
if varlen1 is not None and source_name not in typmod_types:
continue
if varlen2 is not None and target_name not in typmod_types:
test_table = get_typemod_table(source_name, source_varlen)

for target_varlen in typmod_lens:
if target_varlen is not None and target_name not in typmod_types:
continue

test_in, test_out = create_test(get_typemod_type(source_name, varlen1), get_typemod_type(target_name, varlen2), test_data, default)
test_in, test_out = create_test(source_name, target_name, test_table, default, source_varlen, target_varlen)

function_tests_in += [test_in]
function_tests_out += [test_out]
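
The naming helpers this commit introduces can be exercised on their own to see which tables the generator would create. The sketch below copies `get_typemod_table` and `get_string_table` from the diff; the `if l is None` branch of `get_typemod_type` is not visible in the hunk and is assumed to return the bare type name. `applicable_lens` is a hypothetical helper, not part of the commit, that mirrors the `continue` checks in the loops.

```python
# Lists as they appear in the diff after this commit ('bpchar' stays commented out).
typmod_types = ['bit', 'varbit', 'char', 'varchar']
typmod_lens = [None, 1, 5, 10, 20]


def get_typemod_type(t, l):
    # Assumption: the hidden branch returns the type name unchanged.
    if l is None:
        return t
    return f'{t}({l})'


def get_typemod_table(t, l):
    if l is None:
        return f'tt_{t}'
    return f'tt_{t}_{l}'


def get_string_table(type_name, string_type, type_varlen=None, string_varlen=None):
    if type_varlen is not None and string_varlen is not None:
        return f'tt_{string_type}_{string_varlen}_of_{type_name}_{type_varlen}'
    elif type_varlen is not None:
        return f'tt_{string_type}_of_{type_name}_{type_varlen}'
    elif string_varlen is not None:
        return f'tt_{string_type}_{string_varlen}_of_{type_name}'
    return f'tt_{string_type}_of_{type_name}'


def applicable_lens(type_name):
    # Hypothetical helper: the lengths the loops do not skip for this type.
    return [l for l in typmod_lens if l is None or type_name in typmod_types]


# Every string table generated for the pair (bit, varchar).
names = [
    get_string_table('bit', 'varchar', type_len, string_len)
    for type_len in applicable_lens('bit')
    for string_len in applicable_lens('varchar')
]

if __name__ == '__main__':
    for name in names:
        print(name)
```

Types without a type modifier, such as `text`, only get the `None` length, so they contribute a single table per pairing, while a pair of typmod types contributes one table per combination of lengths.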