Releases: Nike-Inc/koheesio
koheesio-v0.9.0
What's Changed
v0.9 brings many changes to the spark module, adding support for pyspark connect along with a number of bug fixes and some new features. Additionally, the snowflake implementation has been significantly reworked and now relies on a pure python implementation for interacting with Snowflake outside of spark.
New features / Refactors
The following new features are included with 0.9:
- [feature] Box - Add overwrite functionality to the BoxFileWriterClass by @ToneVDB in #103
- [feature] Box - allow setting file encoding by @louis-paulvlx in #96
- [refactor] Core - change private attr and step getter by @mikita-sakalouski in #82
- [feature] DataBricks - DataBricksSecret for getting secrets from DataBricks scope by @mikita-sakalouski in #133
- [feature] Delta - Enable adding options to DeltaReader both streaming and batch by @mikita-sakalouski in #111
- [feature] SE - SparkExpectations bump version to 2.2.0 by @dannymeijer in #99
- [feature] Snowflake - Populate account from url if not provided in SnowflakeBaseModel by @mikita-sakalouski in #117
- [feature] Spark - add support for Spark Connect by @mikita-sakalouski in #63
- [feature] Spark - Make Transformations callable by @dannymeijer in #126
- [feature] Tableau - Add support for HyperProcess parameters by @maxim-mityutko in #112
Bug fixes
The following bugfixes are included with 0.9:
- [bugfix] Core - Accidental duplication of logs by @dannymeijer in #105
- [bugfix] Core - Adjust branch fetching logic for forked repo for Github Actions by @mikita-sakalouski in #106 and #109
- [bugfix] Delta - DeltaMergeBuilder instance type didn't check out by @dannymeijer in #100
- [bugfix] Delta - fix merge builder instance check for connect + util fix by @dannymeijer in #130
- [bugfix] Docs - broken import statements and updated hello-world.md by @dannymeijer in #107
- [bugfix] Snowflake - python connector default config dir by @mikita-sakalouski in #125
- [bugfix] Snowflake - Remove duplicated implementation by @mikita-sakalouski in #116
- [bugfix] Spark - unused SparkSession being imported from pyspark.sql in several tests by @dannymeijer in #140
- [bugfix] Spark/Docs - Remove mention of non-existent class type in docs by @dannymeijer in #138
- [bugfix] Tableau - Decimals conversion in HyperFileDataFrameWriter by @maxim-mityutko in #77
- [bugfix] Tableau - small fix for Tableau Server path checking by @dannymeijer in #134
- [bugfix] Snowflake - replace RunQuery with SnowflakeRunQueryPython by @mikita-sakalouski in #121
New Contributors
Big shout out to all contributors and a heartfelt welcome to our new contributors:
- @louis-paulvlx made their first contribution in #96
- @ToneVDB made their first contribution in #103
Migrating from v0.8
For users currently using v0.8, consider the following:
- Spark connect is now fully supported. For this to work we've had to introduce several replacement types for pyspark such as DataFrame (i.e. pyspark.sql.DataFrame vs pyspark.sql.connect.DataFrame) as well as the SparkSession. If you are using custom Step logic in which you reference spark types, take these types from the koheesio.spark module instead. This will allow you to use pyspark connect with your custom code also.
- Snowflake was extensively reworked.
  - To be able to use snowflake, a new extra/feature was added to the pyproject.toml - install this using koheesio[snowflake] in order to have access to snowflake python
  - Code for snowflake support was moved to new primary modules:
    - koheesio.integrations.spark.snowflake hosts all spark related snowflake code
    - koheesio.integrations.snowflake hosts the non-spark / pure-python implementations
  - The original API was kept in place through pass-through imports; no immediate code changes should be needed
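As a rough illustration of the migration notes above, the sketch below shows one way custom code might take its DataFrame type from koheesio.spark rather than from pyspark directly, and how to check which of the relocated snowflake modules can be found in the current environment. The helper names `preview` and `snowflake_modules_available` are hypothetical and not part of koheesio; the fallback to `Any` is only there so the sketch can be loaded without koheesio or pyspark installed.

```python
from importlib.util import find_spec
from typing import Any, Dict

try:
    # v0.9: take Spark types from koheesio.spark so the same annotation can
    # cover both classic and Spark Connect DataFrames (assumes koheesio is installed).
    from koheesio.spark import DataFrame
except ImportError:
    # Fallback for environments without koheesio/pyspark; illustration only.
    DataFrame = Any


def preview(df: DataFrame, n: int = 5) -> DataFrame:
    """Hypothetical helper: return at most the first n rows of df."""
    return df.limit(n)


# Module names as listed in the migration notes.
SNOWFLAKE_MODULES = (
    "koheesio.integrations.snowflake",        # non-spark / pure-python code
    "koheesio.integrations.spark.snowflake",  # spark related snowflake code
)


def snowflake_modules_available() -> Dict[str, bool]:
    """Hypothetical helper: report which relocated snowflake modules can be located."""
    found = {}
    for name in SNOWFLAKE_MODULES:
        try:
            found[name] = find_spec(name) is not None
        except ImportError:
            # Raised when a parent package (for example koheesio itself) is
            # missing or cannot be imported.
            found[name] = False
    return found
```

Calling `snowflake_modules_available()` after installing the koheesio[snowflake] extra gives a quick indication of whether the new module layout is reachable before any import paths are changed.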
Full Changelog: koheesio-v0.8.1...koheesio-v0.9.0
koheesio-v0.9.0rc7
What's Changed
- Release/0.9 - final version bump and docs by @dannymeijer in #132
- [FEATURE] Make Transformations callable by @dannymeijer in #126
- [BUG] small fix for Tableau Server path checking by @dannymeijer in #134
- [FEATURE] DataBricksSecret for getting secrets from DataBricks scope by @mikita-sakalouski in #133
Full Changelog: koheesio-v0.9.0rc6...koheesio-v0.9.0rc7
koheesio-v0.9.0rc6
What's Changed
- refactor: replace RunQuery with SnowflakeRunQueryPython by @mikita-sakalouski in #121
- hotfix: snowflake python connector default config dir by @mikita-sakalouski in #125
- hotfix: delta merge builder instance check for connect + util fix by @dannymeijer in #130
Full Changelog: koheesio-v0.9.0rc5...koheesio-v0.9.0rc6
koheesio-v0.9.0rc5
Adjust logic for getting account from url/sfURL
koheesio-v0.9.0rc4
What's Changed
- fix: test github by @mikita-sakalouski in #109
- [Fix] Add overwrite functionality to the BoxFileWriterClass by @ToneVDB in #103
- [FEATURE] Enable adding options to DeltaReader both streaming and writing by @mikita-sakalouski in #111
- Add support for HyperProcess parameters by @maxim-mityutko in #112
- [HOTFIX] Remove duplicated implementation by @mikita-sakalouski in #116
- [FEATURE] Populate account from url if not provided in SnowflakeBaseModel by @mikita-sakalouski in #117
New Contributors
- @ToneVDB made their first contribution in #103
Full Changelog: koheesio-v0.9.0rc3...koheesio-v0.9.0rc4
koheesio-v0.9.0rc3
What's Changed
- [FIX] Accidental duplication of logs by @dannymeijer in #105
- fix: adjust branch fetching by @mikita-sakalouski in #106
- [FIX] broken import statements and updated hello-world.md by @dannymeijer in #107
Full Changelog: koheesio-v0.9.0rc2...koheesio-v0.9.0rc3
koheesio-v0.9.0rc2
Several bugfixes
Full Changelog: koheesio-v0.9.0rc1...koheesio-v0.9.0rc2
koheesio-v0.9.0rc1
What's Changed
- 90-Bug-fix-file-encoding-box-integration by @louis-paulvlx in #96
- fix for DeltaMergeBuilder, when the instance doesn't check out by @dannymeijer in #100
- Fix/98 sparkexpectations bump version to 220 by @dannymeijer in #99
New Contributors
- @louis-paulvlx made their first contribution in #96
Full Changelog: koheesio-v0.9.0rc0...koheesio-v0.9.0rc1
koheesio-v0.9.0rc0
What's Changed
- [bugfix] Decimals conversion in HyperFileDataFrameWriter by @maxim-mityutko in #77
- feature: add support for Spark Connect by @mikita-sakalouski in #63
- refactor: change private attr and step getter by @mikita-sakalouski in #82
Full Changelog: koheesio-v0.8.1...koheesio-v0.9.0rc0
v0.8.1
What's Changed
- [FIX] - Snowflake reader and writer have different parameter names for table name by @riccamini in #68
- [FIX] update sparkstep to be able to manage the sparksession more effectively by @dannymeijer in #69
- [FIX] 53 - Update CONTRIBUTING.md by @riccamini in #71
- [BUGFIX] Tableau Hyper generation on DBFS is broken by @maxim-mityutko in #73
- [FEATURE] Exists function in DeltaTableStep should not log error by @femilian-6582 in #70
New Contributors
- @femilian-6582 made their first contribution in #70
Full Changelog: koheesio-v0.8.0...koheesio-v0.8.1