Commit
docs: add information about pii classification feature (#1517)
* Update duplicates_pandas.py (#1427) Fixing Bug Report #1384 Dataset with categorical features causes memory error even on tiny dataset.
* chore(actions): update sonarsource/sonarqube-scan-action action to v2.0.1
* chore(actions): update actions/checkout action to v4
* docs: setup new docs with mkdocs (#1418)
* chore(actions): update actions/checkout action to v4
* fix: remove the duplicated cardinality threshold under categorical and text settings
* fix: fixate matplotlib upper version
* docs: change from `zap` to `sparkles` (#1447)
Co-authored-by: Fabiana <[email protected]>
* fix: template {{ file_name }} error in HTML wrapper (#1380)
* Update javascript.html
* Update style.html
* feat: add density histogram (#1458)
* feat: add histogram density option
* test: add unit test
* fix: discard weights if exceed max_bins
* docs: update README.html (#1461) Update url of use cases, main integrations, and common issues.
* fix: bug when creating a new report (#1440)
* fix: gen wordcloud only for non-empty cols (#1459)
* fix: table template ignoring text format (#1462)
* fix: table template ignoring text format
* fix: timeseries unit test
* fix(linting): code formatting
---------
Co-authored-by: Azory YData Bot <[email protected]>
* fix: to_category misshandling pd.NA (#1464)
* docs: add 📊 for Key features (#1451) See also #1445 (comment)
* docs: fix hyperlink - related to package name change (#1457)
Co-authored-by: Martin Mokry <[email protected]>
* chore(deps): increase numpy upper limit (#1467)
* chore(deps): increase numpy upper limit
* chore(deps): fixate numpy version for spark
* chore(deps): fix numba package version, and filter warns (#1468)
* chore: fix numba package version, and filter warns
* fix: skip isort linter on init
* chore(deps): update dependency typeguard to v4 (#1324)
* chore(deps): update dependency typeguard to v4
---------
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Maciej Bukczynski <[email protected]>
* docs: update docs with advent of code
* docs: update links for fabric
* chore(actions): update actions/setup-python action to v5
* docs: add information about PII classification & management.
---------
Co-authored-by: boris-kogan <[email protected]>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Vasco Ramos <[email protected]>
Co-authored-by: ricardodcpereira <[email protected]>
Co-authored-by: Anselm Hahn <[email protected]>
Co-authored-by: Joge <[email protected]>
Co-authored-by: Alex Barros <[email protected]>
Co-authored-by: Miriam Seoane Santos <[email protected]>
Co-authored-by: Chris Mahoney <[email protected]>
Co-authored-by: Azory YData Bot <[email protected]>
Co-authored-by: martin-kokos <[email protected]>
Co-authored-by: Martin Mokry <[email protected]>
Co-authored-by: Maciej Bukczynski <[email protected]>
Co-authored-by: Fabiana Clemente <[email protected]>
1 parent 8d4d347, commit a9c4114
Showing 5 changed files with 85 additions and 16 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# Personally identifiable information (PII) identification & management ** | ||
|
||
!!! info "** YData's Enterprise feature" | ||
|
||
This feature is only available for users of [YData Fabric](https://ydata.ai). | ||
|
||
[Sign up for the Fabric community](http://ydata.ai/register?utm_source=ydata-profiling&utm_medium=documentation&utm_campaign=YData%20Fabric%20Community) and | ||
start your journey into **data management** with automated PII identification. | ||
|
||
Personally identifiable information **(PII)** refers to any information that can be used to identify an individual. | ||
This includes, but is not limited to, names, addresses, phone numbers, social security numbers, email addresses, | ||
and financial information. PII is crucial in today's digital age, where data is extensively collected, stored, | ||
and processed. | ||
|
||
[YData Fabric Data Catalog](https://ydata.ai/products/data_catalog), a scalable and interactive version of ydata-profiling, | ||
integrates into the data profiling experience an advanced machine learning solution based on a Named Entity Recognition (NER) model | ||
combined with traditional rule-based pattern identification, allowing it to detect PII efficiently. | ||
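
The rule-based half of such an approach can be illustrated with a minimal sketch. This is not Fabric's implementation: the pattern set, the `classify_column` helper, and the `min_match_ratio` threshold are all hypothetical, chosen only to show how regular-expression rules might label a column as PII. A production detector would pair rules like these with an NER model and much stricter validation.

```python
import re

# Hypothetical, deliberately simple patterns (assumptions for illustration only).
# Order matters: more specific patterns are listed before looser ones.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def classify_column(values, min_match_ratio=0.8):
    """Return the first PII label whose pattern fully matches at least
    `min_match_ratio` of the non-empty values, or None if no label qualifies."""
    cleaned = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if not cleaned:
        return None
    for label, pattern in PII_PATTERNS.items():
        matches = sum(1 for v in cleaned if pattern.fullmatch(v))
        if matches / len(cleaned) >= min_match_ratio:
            return label
    return None


if __name__ == "__main__":
    print(classify_column(["ana@example.com", "joe@example.org"]))
    print(classify_column(["red", "green", "blue"]))
```

Requiring a ratio of matching values, rather than a single hit, is one way to keep a stray email address in a free-text column from flagging the whole column.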
|
||
:fontawesome-brands-youtube:{ .youtube } | ||
<a href="https://www.youtube.com/clip/UgkxBntXvAvCQ6I39Cp2KZRD4Ug9-NPzG1o1"><u>See Fabric's Data Catalog PII identification in action</u></a>. | ||
|
||
## Why Fabric Catalog automated PII identification? | ||
|
||
The relevance of automating the identification of PII lies in the need to protect individuals' privacy and comply | ||
with various data protection regulations. Mishandling or unauthorized access to PII can lead to severe consequences | ||
such as identity theft, financial fraud, and breaches of privacy. With the increasing volume of data generated, manual | ||
identification of PII becomes impractical and error-prone. | ||
|
||
Additionally, having a robust PII management solution is essential for organizations to establish and maintain | ||
a secure approach to handling sensitive information, fostering trust and adhering to legal requirements. | ||
|
||
## Why Fabric to manage dataset PII identification? | ||
|
||
Besides automated PII identification, *Fabric Catalog* offers several key benefits in the context of data governance, | ||
privacy compliance and overall data management, through automated data profiling and metadata management: | ||
|
||
### Compliance with Privacy Regulations: | ||
Many countries and regions have stringent data protection regulations (such as GDPR, CCPA, or HIPAA) | ||
that require organizations to handle PII responsibly. A dedicated platform ensures that PII is correctly classified, | ||
helping organizations comply with legal requirements and avoid potential penalties. | ||
|
||
### Data Profiling for Accuracy: | ||
|
||
Data profiling involves analyzing and understanding the structure and content of data. By incorporating data profiling | ||
capabilities into the platform, organizations can ensure accurate identification and classification of PII. | ||
This helps in maintaining the integrity of data and reduces the risk of misclassifications. | ||
|
||
### Efficient Management of PII: | ||
As the volume of data continues to grow, manually managing and editing PII classifications becomes impractical. | ||
A platform streamlines this process, making it more efficient and reducing the likelihood of errors. | ||
It allows organizations to keep track of PII across various datasets and systems. | ||
|
||
### Facilitating Data Governance: | ||
|
||
Data governance involves establishing policies and processes to ensure high data quality, security, and compliance. | ||
A PII management solution enhances data governance efforts by providing a centralized hub for overseeing PII classifications, | ||
metadata, and related policies. | ||
|
||
|