diff --git a/space.yml b/space.yml
index 94da956..9626052 100644
--- a/space.yml
+++ b/space.yml
@@ -1,4 +1,3 @@
----
 title: ISCC-LAB - Semantic-Code Text
 emoji: ▶️
 colorFrom: red
@@ -8,37 +7,38 @@ sdk_version: 4.41.0
 app_file: iscc_sct/demo.py
 pinned: true
 license: CC-BY-NC-SA-4.0
+python_version: 3.12
 short_description: Cross Lingual Similarity Preserving Text Simprints
----
 
-# ISCC-LAB - Semantic-Code Text
+description: >
+  # ISCC-LAB - Semantic-Code Text
 
-`iscc-sct` is a **proof of concept implementation** of a semantic Text-Code for the
-[ISCC](https://core.iscc.codes) (*International Standard Content Code*). Semantic Text-Codes are
-short identifiers created from text documents that preserve similarity (in hamming distance)
-for semantically similar cross-lingual text inputs.
+  `iscc-sct` is a **proof of concept implementation** of a semantic Text-Code for the
+  [ISCC](https://core.iscc.codes) (*International Standard Content Code*). Semantic Text-Codes are
+  short identifiers created from text documents that preserve similarity (in hamming distance)
+  for semantically similar cross-lingual text inputs.
 
-## What is the ISCC
+  ## What is the ISCC
 
-The ISCC is a combination of various similarity preserving fingerprints and an identifier for
-digital media content.
+  The ISCC is a combination of various similarity preserving fingerprints and an identifier for
+  digital media content.
 
-ISCCs are generated algorithmically from digital content, just like cryptographic hashes. However,
-instead of using a single cryptographic hash function to identify data only, the ISCC uses various
-algorithms to create a composite identifier that exhibits similarity-preserving properties (soft
-hash or Simprint).
+  ISCCs are generated algorithmically from digital content, just like cryptographic hashes. However,
+  instead of using a single cryptographic hash function to identify data only, the ISCC uses various
+  algorithms to create a composite identifier that exhibits similarity-preserving properties (soft
+  hash or Simprint).
 
-The component-based structure of the ISCC identifies content at multiple levels of abstraction. Each
-component is self-describing, modular, and can be used separately or with others to aid in various
-content identification tasks. The algorithmic design supports content deduplication, database
-synchronization, indexing, integrity verification, timestamping, versioning, data provenance,
-similarity clustering, anomaly detection, usage tracking, allocation of royalties, fact-checking and
-general digital asset management use-cases.
+  The component-based structure of the ISCC identifies content at multiple levels of abstraction. Each
+  component is self-describing, modular, and can be used separately or with others to aid in various
+  content identification tasks. The algorithmic design supports content deduplication, database
+  synchronization, indexing, integrity verification, timestamping, versioning, data provenance,
+  similarity clustering, anomaly detection, usage tracking, allocation of royalties, fact-checking and
+  general digital asset management use-cases.
 
-## ISCC Status
+  ## ISCC Status
 
-The [ISCC](https://iscc.codes) is an ISO Standrad published under
-[ISO 24138:2024](https://www.iso.org/standard/77899.html) - International Standard Content Code
-within [ISO/TC 46/SC 9/WG 18](https://www.iso.org/committee/48836.html).
+  The [ISCC](https://iscc.codes) is an ISO Standrad published under
+  [ISO 24138:2024](https://www.iso.org/standard/77899.html) - International Standard Content Code
+  within [ISO/TC 46/SC 9/WG 18](https://www.iso.org/committee/48836.html).
 
-The algorithms of this `iscc-sct` are experimental and not (yet) part of the official standard.
+  The algorithms of this `iscc-sct` are experimental and not (yet) part of the official standard.