forked from xiaoyongzhu/feathr
Merge main02 #13
Open
aabbasi-hbo wants to merge 78 commits into snowflake-source-scala-update from merge-main02
Conversation
Signed-off-by: Jun Ki Min <[email protected]>
* Meaningful error * Copy/paste typo
* Enhance error messages of synapse jobs
…ai#808) * Support timePartitionPattern in paths of data sources.
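The timePartitionPattern support above is easiest to see in a source definition. A minimal sketch, assuming the Python client's HdfsSource accepts a time_partition_pattern argument as described in this commit; the storage path and column names are hypothetical:

```python
# Hypothetical sketch: declare an HDFS source whose folders are laid out by date.
# time_partition_pattern is assumed to be the argument added by this change.
from feathr import HdfsSource

batch_source = HdfsSource(
    name="nycTaxiBatchSource",
    path="abfss://container@storage.dfs.core.windows.net/nyc_taxi/",  # hypothetical path
    time_partition_pattern="yyyy/MM/dd",  # resolves daily sub-folders such as 2022/11/30
    event_timestamp_column="lpep_dropoff_datetime",
    timestamp_format="yyyy-MM-dd HH:mm:ss",
)
```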
Enhance sample notebook to solve some issues in synapse
Bumps [loader-utils](https://github.com/webpack/loader-utils) from 2.0.3 to 2.0.4.
- [Release notes](https://github.com/webpack/loader-utils/releases)
- [Changelog](https://github.com/webpack/loader-utils/blob/v2.0.4/CHANGELOG.md)
- [Commits](webpack/loader-utils@v2.0.3...v2.0.4)
---
updated-dependencies:
- dependency-name: loader-utils
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Fix unexpected 500 * Handle AtlasException
* Include noop-1.0.jar into the wheel * No need for this
* Fix unexpected 500 * Handle AtlasException * Missing typeName * Specify purview client version * Remove debug code
Signed-off-by: Yuqing Wei <[email protected]>
* Bump version to 0.9.0 * Include a python client setup fix from Jay * Add a fallback
* update links * fix typos
…athr-ai#862) * Insert test coverage check for python client into github pipeline
…ass credentials (feathr-ai#818)
* Create feathr-registry-client-update.md
* add docs on additional user
* Update local-spark-provider.md
* Create feathr-advanced-topic.md
* Create feathr-credential-passthru.md
* Update feathr-credential-passthru.md
* update docs for feathr upgrade
* update docs
* fix content
* Fix local spark output file-format bug
* Add dev dependencies. Add unit-test for local spark job launcher
* Fix local spark submission unused param error
* Refactor nyc_taxi example. TODO: update refs to the notebook
* Add dataset utilities and notebook path refactor. TODO: update reference links
* Add __init__.py to datasets module. Modify maybe_download to accept dir as dst_path
* Add notebook test
* change notebook to use scrap flag and is_databricks
* Fix databricks path
* Fix unittest
* Modify databricks notebook. Fix dbfs path errors in utils.
* Address review comments
* put the user_workspace feature python files back
* Revive feathr_config.yaml
* Add custom marker to pyproject.toml
Signed-off-by: Jun Ki Min <[email protected]>
* Add more secret manager support
* Add abstract class
* Update feathr-configuration-and-env.md
* Update _envvariableutil.py
* add tests for aws secrets manager
* Update test_secrets_read.py
* fix tests
* Update test_secrets_read.py
* fix test
* Update pull_request_push_test.yml
* get_secrets_update
* move import statement
* update spelling
* update raise exception
* revert
* feature registry hack
* query for uppercase
* add snowflake source
* remove snowflake type
* enableDebugLogger
* add logging
* simple path snowflake fix
* snowflake-update
* fix bugs/log
* get_snowflake_path
* update get_snowflake_path
* remove log
* log
* add logs
* test with path
* update snowflake registry handling
* update source
* remove logs
* update error handling and test
* make lowercase
* remove logging
* Revert "Merge pull request #5 from aabbasi-hbo/secrets-key-test" This reverts commit 41554b4, reversing changes made to 6b401de.
* Revert "remove logging" This reverts commit e01635d.
* Revert "update error handling and test" This reverts commit e5c200f.
* Revert "query for uppercase" This reverts commit 0531788.
* Revert "revert" This reverts commit 87cd083.
* Revert "update raise exception" This reverts commit 44a3ce0.
* Revert "update spelling" This reverts commit 07a8cf0.
* Revert "move import statement" This reverts commit 218123f.
* Revert "get_secrets_update" This reverts commit 9cb332c.
* Revert "Update pull_request_push_test.yml" This reverts commit e617b99.
* Revert "fix test" This reverts commit 8be6a42.
* Revert "Update test_secrets_read.py" This reverts commit 997a2b1.
* Revert "fix tests" This reverts commit a6870d9.
* Revert "Update test_secrets_read.py" This reverts commit aa5fdda.
* Revert "add tests for aws secrets manager" This reverts commit cdcd612.
* Revert "Update _envvariableutil.py" This reverts commit f616522.
* Revert "Update feathr-configuration-and-env.md" This reverts commit 2d6c135.
* Revert "Add abstract class" This reverts commit e96459a.
* Revert "Add more secret manager support" This reverts commit c31906c.
* remove extra line
* fix formatting
* Update setup.py
* update python tests
* update scala test
* update tests
* update test
* add test
* update docs
* fix test
* add snowflake guide
* add to NonTimeBasedDataSourceAccessor
* remove registry fixes
* Update source.py
* Update source.py
* Update source.py
* remove print
* Update feathr-snowflake-guide.md
Co-authored-by: Xiaoyong Zhu <[email protected]>
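Several Snowflake items in the commit list above (add snowflake source, get_snowflake_path, add snowflake guide) describe new client surface area. A minimal sketch of how those pieces might fit together; the class name, helper name, and all argument values here are assumptions taken from the commit titles, not a verified API:

```python
# Hypothetical sketch of the Snowflake additions referenced above; argument names
# and values are assumptions, not a confirmed API surface.
from feathr import FeathrClient, SnowflakeSource

client = FeathrClient(config_path="./feathr_config.yaml")  # assumes a local feathr config

# A Snowflake-backed feature source, addressed by database/schema/table.
snowflake_source = SnowflakeSource(
    name="snowflakeSampleSource",
    database="SNOWFLAKE_SAMPLE_DATA",  # hypothetical database
    schema="TPCH_SF10",                # hypothetical schema
    dbtable="CUSTOMER",                # hypothetical table
)

# get_snowflake_path is assumed to build a snowflake:// style URL, e.g. for observation data.
observation_path = client.get_snowflake_path(
    database="SNOWFLAKE_SAMPLE_DATA", schema="TPCH_SF10", dbtable="ORDERS"
)
print(observation_path)
```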
* Restructure code to encapsulate the `save_to_feature_config_from_context` * Update client.py * fix merge issues * Update config_helper.py * fix comments
)
* Fix xdist test error. Also make a small cleanup of some code
* Revert "Revert 756 (feathr-ai#798)" This reverts commit ff438f5.
* revert 798 (revert756 - example notebook refactor). Also add job_utils unit tests
* Update test_azure_spark_e2e.py
* Fix doc dead links (feathr-ai#805). This PR fixes dead links detected in the latest ci run. The doc scan ci action has been updated to run on main only, as running this in PR frequently reports false alarms due to changes in CI not deployed.
* Improve UI experience and clean up ui code warnings (feathr-ai#801)
  * Add DataSourcesSelect and FlowGraph and ResizeTable components. Fix all warning and lint issues.
  * Add CardDescriptions component and fix ESLint warning.
  * Update FeatureDetails page title.
  * Rename ProjectSelect
* Add release instructions for Release Candidate (feathr-ai#809)
  * Add release instructions for Release Candidate
  * Add a section for release versioning
  * Add a section for the overall process triggered by the release manager
* Bump version to 0.9.0-rc1 (feathr-ai#810)
* Fix tests to use mocks and fix get_result_df's databricks behavior
* fix temp file to dir
* checkout the feature_derivations.py from main (it was temporarily changed to go around previous issues)
* Remove old databricks sample notebook. Change pip install feathr from the github main branch to pick up the latest changes always
* Fix config and get_result_df for synapse
* Fix generate_config to accept all the feathr env var config names
* Add more pytests
* Use None as the default data format in job_utils. Instead, set 'avro' as the default output format in the job tags from the client
* Change feathr client to mocked object
* Change timeout to 1000s in the notebook
Signed-off-by: Jun Ki Min <[email protected]>
Signed-off-by: Boli Guan <[email protected]>
Co-authored-by: Blair Chen <[email protected]>
Co-authored-by: Boli Guan <[email protected]>
* Add 'format' arg to get_result_df
* Add unittest for arg alias of get_result_df
* Update explicit functions to use kwargs and update unit-tests accordingly
Signed-off-by: Jun Ki Min <[email protected]>
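A minimal usage sketch of the 'format' argument added above; the module path, keyword name, and result URL shown here are assumptions for illustration:

```python
# Hypothetical usage of the 'format' argument described above; module path,
# keyword name, and result URL are assumptions, not a verified API surface.
from feathr import FeathrClient
from feathr.utils.job_utils import get_result_df

client = FeathrClient(config_path="./feathr_config.yaml")  # assumes a local feathr config

df = get_result_df(
    client,
    format="parquet",  # explicitly request the output format instead of relying on job tags
    res_url="abfss://container@storage.dfs.core.windows.net/output/",  # hypothetical output path
)
print(df.head())
```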
* Add working gradle build
* Set up pdl support
* Working PDL java code gen
* With pdl files from metadata models
* With pdl files from compute model
* Fix compile for all pdl files
* Add working gradle build
* Migrate frame-config module into feathr
* Migrate fcm graph module to feathr
* Add FCM offline execution code, includes FDS metadata code
* Add needed jars for feathr-config tests
* Switch client to FeathrClient2 for local tests and fix config errors
* Fix SWA test
* Add gradle wrapper jar
* Change name of git PR test from sbt to gradle
* Switch python client to use FCM client
* Exclude json from dependency
* Add hacky solution to handle json dependency conflict in cloud
* Add json to local dependency
* Add log to debug cloud jar
* Add json as dependency
* Another attempt at resolving json dependency
* Resolve json via shading
* Fix json shading
* Remove log
* Shade typesafe config for cloud jar
* Add maven publish code to build.gradle
* Add working local maven build and rename frame-config to feathr-config to avoid namespace conflict
* Modify sonatype creds
* Change so no need to sign if releasing snapshot version
* Update build.gradle to allow publishing of all modules
* Removed FDS handling from Feathr
* All tests working
* Deleted FR stuff
* Remove dimension and other tensor related stuff
* Remove mlfeatureversionurn from defaultvalueresolver
* Remove mlfeatureversionurn and featureref
* Remove featuredefinition files
* Remove featureRef and typedRef
* final cleanup
* Fix merge conflict bugs
* Fix guava error
* udf plugin for swa features
* row-transformations optimization
* fix bug
* fix another bug
* always execute agg nodes first
* Add SWA log
* reverse order of execution
* group by datasource
* Fix bug
* Merge main into fcm branch
* Remove insecure URLs
* Add back removed files
* Add back removed files
* Add back removed files
* Change PR build system to gradle
* Change sbt job to gradle job
* Change sbt workflow
* Update maven github workflow to use gradle
* fix failing test
* remove sbt project module
* Remove sbt related files
* Change docs to reflect gradle
* Remove keywords
* Create a single jar
* 1. Fix jar not getting populated 2. Fix documentation bugs
* publishToMavenLocal working
* With FFE integrated
* maven upload working
* Update docs and code clean up
* add gradle-wrapper file
* Push all dependency jars
* Update docs
* Docs cleanup
* Update github workflow commands
* Update github workflow
* Update workflow syntax
* Update version
* Add gradle version to github workflow
* Update gradle version w/o quotes
* Remove github gradle version
* Github workflow fix
* Github workflow fix-2
* Github workflow fix-4
Co-authored-by: Bozhong Hu <[email protected]>
Co-authored-by: rkashyap <[email protected]>
… arg order to make it work with old codes (feathr-ai#890) Signed-off-by: Jun Ki Min <[email protected]>
Bumps [minimatch](https://github.com/isaacs/minimatch) and [recursive-readdir](https://github.com/jergason/recursive-readdir). These dependencies needed to be updated together.
Updates `minimatch` from 3.0.4 to 3.1.2
- [Release notes](https://github.com/isaacs/minimatch/releases)
- [Changelog](https://github.com/isaacs/minimatch/blob/main/changelog.md)
- [Commits](isaacs/minimatch@v3.0.4...v3.1.2)
Updates `recursive-readdir` from 2.2.2 to 2.2.3
- [Release notes](https://github.com/jergason/recursive-readdir/releases)
- [Changelog](https://github.com/jergason/recursive-readdir/blob/master/CHANGELOG.md)
- [Commits](https://github.com/jergason/recursive-readdir/commits/v2.2.3)
---
updated-dependencies:
- dependency-name: minimatch
  dependency-type: indirect
- dependency-name: recursive-readdir
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Add docs for checking/improving test coverage
Co-authored-by: Bozhong Hu <[email protected]>
…om registry (feathr-ai#886) Support printing features and returning keys dict when getting features from registry
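The registry change above (feathr-ai#886) alters what the client call returns. A minimal sketch of what that might look like; the return_keys and verbose keyword names are assumptions inferred from the description, not a confirmed signature:

```python
# Hypothetical illustration of feathr-ai#886; keyword names are assumptions.
from feathr import FeathrClient

client = FeathrClient(config_path="./feathr_config.yaml")  # assumes a local feathr config

# Assumed to print the retrieved feature definitions and also return a dict of
# feature keys alongside them when getting features from the registry.
features = client.get_features_from_registry(
    project_name="feathr_getting_started",  # hypothetical project name
    return_keys=True,
    verbose=True,
)
```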
Introduce an option to select between environment variables and the config yaml file. The Feathr client uses the explicitly configured yaml file over environment variables if the use_env_var flag is set to False. The flag is added as the last argument of the existing functions and defaults to True (use environment variables) so that existing code doesn't break. Resolves feathr-ai#922
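A minimal sketch of this option from the client side; the keyword name and its position are taken from the description above, and the exact spelling in the client may differ:

```python
# Hypothetical sketch of the use_env_var option described above; the keyword name
# is written as given in the PR description and is an assumption.
from feathr import FeathrClient

# Prefer values from the yaml file over any environment variables.
client = FeathrClient(config_path="./feathr_config.yaml", use_env_var=False)

# Default behavior (True) keeps existing code working unchanged.
client_default = FeathrClient(config_path="./feathr_config.yaml")
```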
* Update registry-access-control.md * Update README.md * add logo * Update README.md
* Add is_synapse
* fix is_synapse and add tests
* Describe the reason we pin aiohttp package
* Change is_databricks to use os environ
Signed-off-by: Jun Ki Min <[email protected]>
Signed-off-by: Yuqing Wei <[email protected]>
* Support get online features by composite keys
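A minimal sketch of an online lookup with a composite key, as described above; the table name, the way the key parts are passed as a list, and the feature names are all assumptions for illustration:

```python
# Hypothetical composite-key lookup; table name, key encoding, and feature names
# are assumptions for illustration only.
from feathr import FeathrClient

client = FeathrClient(config_path="./feathr_config.yaml")  # assumes a local feathr config

res = client.get_online_features(
    feature_table="nycTaxiDemoFeature",  # hypothetical online table
    key=["239", "18"],                   # composite key made of several key values (assumed encoding)
    feature_names=["f_location_avg_fare", "f_location_max_fare"],
)
print(res)
```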
* React best practice implementations in ui code * Fix CI code defects * Update package-lock.json * Update package-lock.json
Latest numpy deprecated np.bool, which conflicts with pyspark and pandas. This PR pins the numpy version to an older one to resolve that. We can unpin later once pyspark resolves the issue.
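A minimal sketch of what such a pin looks like in a package definition; the package name and the exact version bound are assumptions (np.bool was removed in numpy 1.24):

```python
# Hypothetical setup.py fragment illustrating the numpy pin described above;
# the package name and exact bound are assumptions, not feathr's real setup.py.
from setuptools import setup

setup(
    name="example-package",
    install_requires=[
        "numpy>=1.21.0,<1.24.0",  # np.bool still available below 1.24, avoiding the pyspark/pandas conflict
        "pandas",
        "pyspark>=3.1",
    ],
)
```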
* Create Feathr – An Enterprise-Grade High Performance Feature Store.pdf * change name * Update README.md
Signed-off-by: Boli Guan <[email protected]>
… to use latest feathr api (feathr-ai#921)
* Update notebooks to use latest codes with extra notebook dependencies
* wip. Remove azure cli package from extra dependencies
* Update fraud detection demo notebook and add test
* WIP debugging
* Update notebooks
* modify notebook test to go around materialization issue
* Change notebook parameter name to align with client argument
* Update recommendation notebook
* Update synapse example. Add azure-cli dependency to notebook dependencies
* Update data url constants to point to the source github repo's raw files
* add dataset url constants to __init__.py
* Update feature embedding notebook to use the original dataset from the azure example github
* Add recommendation sample notebook test
* Fix numpy.bool deprecation error
* Change databricks cluster node size from Dv2 to DSv2
* Use Dv4 for databricks notebook test due to the limit of Dv2 quota at US East 2
* Fix to use the supported vm size
* pin numpy to resolve conflict with pyspark
* Add document intelligence sample notebook
* Update fraud detection sample
Signed-off-by: Jun Ki Min <[email protected]>
* Spark SQL source
* Add unit test
* Clean up after test
* Typo
* Fix conflict
* Add doc comments
* add spark sql source test cases
* Ignore sql if it is None
Signed-off-by: Yuqing Wei <[email protected]>
Co-authored-by: Yuqing Wei <[email protected]>
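A minimal sketch of how the Spark SQL source above might be declared; the class name follows the commit titles, and the constructor arguments (sql vs. table) and the query text are assumptions:

```python
# Hypothetical sketch of the Spark SQL source described above; constructor argument
# names (sql / table) and the query text are assumptions based on the commit titles.
from feathr import SparkSqlSource

# A source defined by an ad-hoc Spark SQL query...
sql_source = SparkSqlSource(
    name="sparkSqlQuerySource",
    sql="SELECT trip_distance, fare_amount FROM nyc_taxi",  # hypothetical query
)

# ...or by an existing Spark table; per the commits, sql is ignored if it is None.
table_source = SparkSqlSource(name="sparkSqlTableSource", table="nyc_taxi")
```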
This PR locks all Python dependency versions for the registry projects.
feathr-ai#937) * Add pytest cases and check test coverage for sql-registry and purview-registry * modify types annotation
* Remove hadoop dependency * Bump RC version * Experiment test failures * Exclude hadoop file Co-authored-by: rkashyap <[email protected]>
* set purview name environment variable in workflow
I found a bug in the model result graph in the Fraud Detection sample notebook. The fixes are:
* use the ML model's output probability instead of the predicted class label for the precision/recall graph
* change the chart label to the correct ML model name
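A minimal, self-contained sketch of the first fix, using scikit-learn with synthetic data as a stand-in for the notebook's fraud model: precision/recall curves need the positive-class probability, not the hard prediction.

```python
# Illustration of using predict_proba (not predict) for a precision/recall curve.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data for the fraud-detection labels.
X, y = make_classification(n_samples=1000, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Wrong: passing model.predict(X_test) collapses the curve to a single threshold.
# Right: use the probability of the positive class.
y_score = model.predict_proba(X_test)[:, 1]
precision, recall, thresholds = precision_recall_curve(y_test, y_score)
```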
Description
Resolves #XXX
How was this PR tested?
Does this PR introduce any user-facing changes?