fix committee protocol scraping to adjust for new protocols format #170

Closed
OriHoch opened this issue Mar 18, 2019 · 1 comment

OriHoch commented Mar 18, 2019

In recent months, the Knesset improved the committee protocol format to include rich metadata. This change affected some of our existing protocol parsing code.

The fix can be made by copying and modifying one of the Jupyter notebooks; see the README for how to use the Jupyter Lab server.

To investigate and visualize the problem you can use the following meeting:

https://oknesset.org/meetings/2/0/2078315.html

Speaker parts are not identified when they are wrapped in the formatting tags דובר ("speaker") / יור ("chair").

The fix should be made as early in the pipelines as possible, and child pipelines should be tested to make sure they are not affected.
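
A rough sketch of the kind of normalization this might need, for experimenting in a notebook. It is not the project's actual parsing code. It assumes the דובר / יור tags appear in the protocol text as `<< tag >>` markers around the speaker header line, which should be checked against the meeting linked above. The function names and the "short line ending with a colon" heuristic are hypothetical.

```python
import re

# ASSUMPTION: the formatting tags show up in the text as "<< דובר >>" / "<< יור >>".
# Verify the real delimiter in the protocol before relying on this pattern.
TAG_RE = re.compile(r"<<\s*(?:דובר|יור)\s*>>")


def strip_speaker_tags(line):
    """Remove the assumed tag markers and collapse leftover whitespace."""
    return re.sub(r"\s+", " ", TAG_RE.sub("", line)).strip()


def split_speaker_parts(text):
    """Group protocol lines into (speaker, body) tuples.

    Hypothetical heuristic: after tag removal, a short line ending with
    a colon is treated as a speaker header.
    """
    parts = []
    speaker, body = None, []
    for raw in text.splitlines():
        line = strip_speaker_tags(raw)
        if not line:
            continue
        if line.endswith(":") and len(line) < 80:
            # A new speaker starts; flush the previous part, if any.
            if speaker is not None or body:
                parts.append((speaker, " ".join(body)))
            speaker, body = line[:-1].strip(), []
        else:
            body.append(line)
    if speaker is not None or body:
        parts.append((speaker, " ".join(body)))
    return parts
```

Stripping the markers before speaker detection keeps the change at the earliest step, so the rest of the parsing sees the same kind of input as with the old format.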

OriHoch commented Mar 15, 2023

This is fixed, but there is a new problem. I opened a new issue for it: #201

OriHoch closed this as completed Mar 15, 2023