Ability to modify laws #16

Open
ghxm opened this issue Aug 17, 2023 · 0 comments
Labels
enhancement New feature or request

ghxm (Owner) commented Aug 17, 2023

Objective

Individual parts of laws (e.g. spans in the .spans SpanGroups) should be editable.

Aims

  • Obtain a spaCy document with updated text while keeping all annotations/elements

Solutions

A. Markup text and export marked text

  • Store replacement text in ._.replacement_text attribute
  • Import into spaCy/euCy with special markup reader for element detection

Problem: When the marked text is re-read into spaCy, the markup becomes part of the document text
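
A minimal sketch of solution A in plain Python, to make the round trip concrete. It assumes spans are available as non-overlapping character offsets with an optional replacement string (standing in for `._.replacement_text`), and it uses a made-up `<label>…</label>` tag syntax without nesting. The function names `export_marked` and `read_marked` are hypothetical, not part of spaCy or euCy. The reader strips the tags before any text would be handed to spaCy, which is one way around the problem noted above.

```python
import re

# Hypothetical tag syntax for this sketch: <label>text</label>, no nesting.
TAG = re.compile(r"<(?P<label>\w+)>(?P<body>.*?)</(?P=label)>", re.DOTALL)


def export_marked(text, spans):
    """Wrap each span in tags, using its replacement text if one is set.

    spans: non-overlapping (start, end, label, replacement_or_None) tuples
    with character offsets into `text`.
    """
    out, cursor = [], 0
    for start, end, label, replacement in sorted(spans, key=lambda s: s[0]):
        body = text[start:end] if replacement is None else replacement
        out.append(text[cursor:start])           # untouched text before the span
        out.append(f"<{label}>{body}</{label}>")
        cursor = end
    out.append(text[cursor:])                    # trailing text after the last span
    return "".join(out)


def read_marked(marked):
    """Strip the tags and return (clean_text, [(start, end, label), ...])."""
    clean, spans, cursor, length = [], [], 0, 0
    for match in TAG.finditer(marked):
        before = marked[cursor:match.start()]
        clean.append(before)
        length += len(before)
        body = match.group("body")
        # Offsets are measured in the tag-free text, not in the marked text.
        spans.append((length, length + len(body), match.group("label")))
        clean.append(body)
        length += len(body)
        cursor = match.end()
    clean.append(marked[cursor:])
    return "".join(clean), spans


text = "Article 1 shall apply from 2020."
spans = [(0, 9, "article", None), (27, 31, "date", "2024")]
marked = export_marked(text, spans)
clean_text, clean_spans = read_marked(marked)
```

The recovered offsets could then be mapped back onto a new doc, for example with `Doc.char_span`. The sketch would break on law text that itself contains `<`, so a real reader would need escaping.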

B. Export text and annotation separately

  • Store replacement text in ._.replacement_text attribute
  • Export annotation in standard spaCy JSON

Problems:

  • How to export the text
  • How to update the spans (when an earlier span changes length, the start offsets of all following spans shift too)
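
The offset problem can be handled by walking the spans in order and carrying a running shift. The sketch below assumes non-overlapping spans given as plain dicts with character offsets and an optional `replacement` key (standing in for `._.replacement_text`). `apply_replacements` is a hypothetical helper, and the exported dict is an ad hoc layout for illustration, not the spaCy JSON format.

```python
import json


def apply_replacements(text, spans):
    """Return (new_text, new_spans) with all replacements applied.

    spans: non-overlapping dicts with "start", "end", "label" and an
    optional "replacement" string; offsets are character offsets.
    """
    pieces, new_spans, cursor, shift = [], [], 0, 0
    for span in sorted(spans, key=lambda s: s["start"]):
        start, end = span["start"], span["end"]
        body = span.get("replacement")
        if body is None:
            body = text[start:end]               # unchanged span keeps its text
        pieces.append(text[cursor:start])
        pieces.append(body)
        new_start = start + shift                # shift caused by earlier spans
        new_spans.append(
            {"start": new_start, "end": new_start + len(body), "label": span["label"]}
        )
        shift += len(body) - (end - start)       # length change of this span
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces), new_spans


text = "Member States shall notify the Commission by 1 May."
spans = [
    {"start": 0, "end": 13, "label": "actor", "replacement": "The Member States"},
    {"start": 31, "end": 41, "label": "actor"},
]
new_text, new_spans = apply_replacements(text, spans)
exported = json.dumps({"text": new_text, "spans": new_spans})
```

Text and annotation stay separate here, so no markup ends up in the document text. Overlapping or nested spans are not handled and would need an extra rule.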

C. Split text and recreate doc at every changed span to obtain text + annotation (can then continue with e.g. B.)

https://stackoverflow.com/a/75300856/5565500

Works at token level?

Problem: Might be computationally heavy
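
A rough token-level sketch of the idea in C, not taken from the linked answer. It assumes the doc is represented as parallel `words` and `spaces` lists (the same shape that spaCy's `Doc(vocab, words=..., spaces=...)` constructor accepts) and that spans are non-overlapping token-index ranges. `replace_tokens` and `to_text` are hypothetical helpers.

```python
def replace_tokens(words, spaces, spans, target, new_words, new_spaces):
    """Replace the tokens of spans[target] and shift later spans.

    spans: non-overlapping (start_token, end_token, label), end exclusive.
    Returns new (words, spaces, spans); the inputs are left unmodified.
    """
    start, end, _ = spans[target]
    delta = len(new_words) - (end - start)       # change in token count
    words = words[:start] + new_words + words[end:]
    spaces = spaces[:start] + new_spaces + spaces[end:]
    shifted = []
    for i, (s, e, label) in enumerate(spans):
        if i == target:
            shifted.append((s, s + len(new_words), label))
        elif s >= end:
            shifted.append((s + delta, e + delta, label))  # span after the edit
        else:
            shifted.append((s, e, label))                   # span before the edit
    return words, spaces, shifted


def to_text(words, spaces):
    """Join tokens, adding a space after each token flagged in `spaces`."""
    return "".join(w + (" " if sp else "") for w, sp in zip(words, spaces))


words = ["Article", "5", "applies", "from", "2020", "."]
spaces = [True, True, True, True, False, False]
spans = [(0, 2, "article"), (4, 5, "date")]
new_words, new_spaces, new_spans = replace_tokens(
    words, spaces, spans, 0,
    ["Article", "5", "(", "2", ")"], [True, False, False, False, True],
)
new_text = to_text(new_words, new_spaces)
```

Each call copies the full token lists, so repeating it for every changed span costs roughly tokens times changes, which fits the concern about computational cost.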

ghxm added the enhancement label Aug 17, 2023