Transformation
Now that we have learned how to extract data from Excel spreadsheets, let's look at how we can consume and manipulate this data further.
In the extraction examples, all the parsed records were instances of GenericRecord, which holds the data in a denormalized way as a Map<String, Any>. What we need, though, is a strict model with checks along the way.
For example, let's look at the broken stats spreadsheet in the basic_examples file.
You can see that the Club Brugge row is missing its total points.
This is critical information for us, so we want to check that the data is non-null during extraction and capture which rows do not comply with our model validation.
Apart from this, we want to store the data in a normalised way, as a data class with statically typed fields.
E.g. we know that points can only be integers; if a cell holds anything else, it should not pass the validation checks!
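To make these two requirements concrete, here is a minimal sketch in plain Kotlin of what "non-null and statically typed" means for a single row. It deliberately does not use refinery's API: TeamStats, RowResult, validateRow and the map keys "team" and "points" are hypothetical names chosen for illustration, and the sample rows are made up rather than copied from the spreadsheet.

```kotlin
// Hypothetical typed model; the field names are illustrative assumptions.
data class TeamStats(val team: String, val points: Int)

// Outcome of validating one raw row: a typed record, or the list of problems found.
sealed class RowResult {
    data class Valid(val stats: TeamStats) : RowResult()
    data class Invalid(val rowIndex: Int, val errors: List<String>) : RowResult()
}

// Checks a loosely typed row (as a generic map-based record would hold it)
// against the strict model. Safe casts (as?) yield null on a missing or mistyped value.
fun validateRow(rowIndex: Int, raw: Map<String, Any?>): RowResult {
    val errors = mutableListOf<String>()

    val team = raw["team"] as? String
    if (team == null) errors += "team is missing or not a string"

    val points = raw["points"] as? Int
    if (points == null) errors += "points is missing or not an integer"

    return if (team != null && points != null) {
        RowResult.Valid(TeamStats(team, points))
    } else {
        RowResult.Invalid(rowIndex, errors)
    }
}

fun main() {
    // Made-up sample rows: one valid, one with missing points, one with a non-integer value.
    val rows: List<Map<String, Any?>> = listOf(
        mapOf("team" to "Team A", "points" to 12),
        mapOf("team" to "Club Brugge", "points" to null),
        mapOf("team" to "Team C", "points" to "n/a"),
    )
    rows.forEachIndexed { index, row -> println(validateRow(index, row)) }
}
```

Returning an Invalid result instead of throwing keeps the extraction going and records which rows failed and why, which is the behaviour we are after here.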
To support both of these features, refinery allows you to define a data class together with a custom row parser that maps the extracted values to the fields of the data class