Fix: Support for e notation using existing parse_decimal in string to decimal conversion #6905
base: main
Changes from 7 commits
8ce814d
4b19083
45ec17e
c69b938
819f0d6
dd3874d
460d323
a4f0667
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,6 +16,7 @@ | |
// under the License. | ||
|
||
use crate::cast::*; | ||
use crate::parse::*; | ||
|
||
/// A utility trait that provides checked conversions between | ||
/// decimal types inspired by [`NumCast`] | ||
|
@@ -230,6 +231,7 @@ where | |
)?)) | ||
} | ||
|
||
#[allow(dead_code)] | ||
/// Parses given string to specified decimal native (i128/i256) based on given | ||
/// scale. Returns an `Err` if it cannot parse given string. | ||
pub(crate) fn parse_string_to_decimal_native<T: DecimalType>( | ||
|
@@ -342,10 +344,9 @@ where | |
&'a S: StringArrayType<'a>, | ||
{ | ||
if cast_options.safe { | ||
let iter = from.iter().map(|v| { | ||
v.and_then(|v| parse_string_to_decimal_native::<T>(v, scale as usize).ok()) | ||
.and_then(|v| T::is_valid_decimal_precision(v, precision).then_some(v)) | ||
}); | ||
let iter = from | ||
.iter() | ||
.map(|v| v.and_then(|v| parse_decimal::<T>(v, precision, scale).ok())); | ||
// Benefit: | ||
// 20% performance improvement | ||
// Soundness: | ||
|
@@ -359,15 +360,12 @@ where | |
.iter() | ||
.map(|v| { | ||
v.map(|v| { | ||
parse_string_to_decimal_native::<T>(v, scale as usize) | ||
.map_err(|_| { | ||
ArrowError::CastError(format!( | ||
"Cannot cast string '{}' to value of {:?} type", | ||
v, | ||
T::DATA_TYPE, | ||
)) | ||
}) | ||
.and_then(|v| T::validate_decimal_precision(v, precision).map(|_| v)) | ||
parse_decimal::<T>(v, precision, scale).map_err(|_| { | ||
ArrowError::CastError(format!( | ||
"Cannot cast string '{}' to decimal type of precision {} and scale {}", | ||
`T::DATA_TYPE` shows the default Decimal(38,10) or Decimal256(76,..) in the error message, hiding the precision and scale provided for the cast.
||
v, precision, scale | ||
)) | ||
}) | ||
}) | ||
.transpose() | ||
}) | ||
|
@@ -629,15 +627,6 @@ mod tests { | |
|
||
#[test] | ||
fn test_parse_string_to_decimal_native() -> Result<(), ArrowError> { | ||
assert_eq!( | ||
parse_string_to_decimal_native::<Decimal128Type>("0", 0)?, | ||
0_i128 | ||
); | ||
assert_eq!( | ||
parse_string_to_decimal_native::<Decimal128Type>("0", 5)?, | ||
0_i128 | ||
); | ||
|
||
assert_eq!( | ||
parse_string_to_decimal_native::<Decimal128Type>("123", 0)?, | ||
123_i128 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1284,7 +1284,7 @@ mod tests { | |
assert_eq!("53.002666", lat.value_as_string(1)); | ||
assert_eq!("52.412811", lat.value_as_string(2)); | ||
assert_eq!("51.481583", lat.value_as_string(3)); | ||
assert_eq!("12.123456", lat.value_as_string(4)); | ||
assert_eq!("12.123457", lat.value_as_string(4)); | ||
Here we can see this is a breaking change to the rounding behaviour.

Also to note, the previous behavior was not correct. 12.12345678 cast to `Decimal128(38, 6)` = 12.123457

It truncated rather than rounded; both are valid behaviours, and changing this is a breaking change.

There is an argument for accepting the breaking change to use rounding, since it would be consistent with how we cast floating point to decimal. However, do we want to consider adding a parameter to choose between truncation and rounding?

I personally wouldn't characterize this as a breaking change, though I can see how others might. In my opinion, adding a parameter to choose between the behaviors would be the safest thing (aka a field to

Maybe @liukun4515, who added much of the initial decimal support in arrow-rs, has time to offer historical perspective on rounding vs truncation during casting?
||
assert_eq!("50.760000", lat.value_as_string(5)); | ||
assert_eq!("0.123000", lat.value_as_string(6)); | ||
assert_eq!("123.000000", lat.value_as_string(7)); | ||
|
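To make the truncation-versus-rounding discussion on this hunk concrete, here is a minimal, self-contained sketch. `scale_decimal_str` is a hypothetical helper written only for illustration; it is not the arrow-rs `parse_decimal` or `parse_string_to_decimal_native` implementation, and it handles only plain decimal strings with no precision check. It assumes "rounding" means half away from zero.

```rust
/// Hypothetical helper (not part of arrow-rs): converts a plain decimal
/// string such as "12.12345678" into an i128 scaled by 10^scale.
/// With `round == false` the extra fractional digits are dropped (truncation);
/// with `round == true` the result is rounded half away from zero.
fn scale_decimal_str(s: &str, scale: usize, round: bool) -> Option<i128> {
    let (negative, body) = match s.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, s),
    };
    let (int_part, frac_part) = body.split_once('.').unwrap_or((body, ""));
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }
    if !int_part
        .bytes()
        .chain(frac_part.bytes())
        .all(|b| b.is_ascii_digit())
    {
        return None;
    }

    // Keep exactly `scale` fractional digits, padding with zeros if needed.
    let kept: String = frac_part
        .chars()
        .chain(std::iter::repeat('0'))
        .take(scale)
        .collect();
    let digits = format!("{}{}", int_part, kept);
    let mut magnitude: i128 = if digits.is_empty() {
        0
    } else {
        digits.parse().ok()?
    };

    // The first dropped digit decides whether the magnitude is bumped by one.
    if round {
        if let Some(first_dropped) = frac_part.as_bytes().get(scale) {
            if *first_dropped >= b'5' {
                magnitude = magnitude.checked_add(1)?;
            }
        }
    }
    Some(if negative { -magnitude } else { magnitude })
}
```

Under these assumptions, `scale_decimal_str("12.12345678", 6, false)` yields `12123456` (the old expected string "12.123456"), while passing `true` yields `12123457` (the new expected string "12.123457").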
This fails in clippy, hence I added `#[allow(dead_code)]`. The function is no longer used; if required, we can remove it and cover the existing tests with parse_decimal.

We should remove this and port the tests, to ensure we aren't losing test coverage / accidentally changing behaviour

Done
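
For readers unfamiliar with the e-notation inputs named in the PR title, the sketch below shows one way a string like "1.23e2" can be mapped to a scaled integer. `parse_e_notation` is a hypothetical function written for illustration only; it is not the arrow-rs `parse_decimal` code, it does no precision validation, and it truncates extra digits rather than rounding to keep the example short.

```rust
use std::convert::TryFrom;

/// Hypothetical helper (not part of arrow-rs): parses strings such as
/// "1.23e2" or "-1.5E-1" into an i128 scaled by 10^scale.
/// Digits beyond `scale` are truncated, not rounded.
fn parse_e_notation(s: &str, scale: u32) -> Option<i128> {
    // Split the mantissa from the optional exponent.
    let (mantissa, exponent) = match s.split_once(&['e', 'E'][..]) {
        Some((m, e)) => (m, e.parse::<i64>().ok()?),
        None => (s, 0),
    };
    let (negative, body) = match mantissa.strip_prefix('-') {
        Some(rest) => (true, rest),
        None => (false, mantissa),
    };
    let (int_part, frac_part) = body.split_once('.').unwrap_or((body, ""));
    let digits = format!("{}{}", int_part, frac_part);
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let magnitude: i128 = digits.parse().ok()?;

    // The input equals digits * 10^(exponent - frac_len); the stored value
    // is that number times 10^scale, so the net power of ten is `shift`.
    let shift = exponent
        .checked_add(i64::from(scale))?
        .checked_sub(frac_part.len() as i64)?;

    let scaled = if shift >= 0 {
        let factor = 10_i128.checked_pow(u32::try_from(shift).ok()?)?;
        magnitude.checked_mul(factor)?
    } else {
        // Negative shift: drop the low-order digits (truncation).
        match u32::try_from(shift.unsigned_abs())
            .ok()
            .and_then(|p| 10_i128.checked_pow(p))
        {
            Some(divisor) => magnitude / divisor,
            None => 0,
        }
    };
    Some(if negative { -scaled } else { scaled })
}
```

With this sketch, `parse_e_notation("1.23e2", 2)` gives `12300` (that is, 123.00 at scale 2) and `parse_e_notation("1.5E-1", 3)` gives `150` (0.150 at scale 3).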