Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Architecture documentation #226

Merged
merged 1 commit into from
Apr 12, 2024
Merged

Architecture documentation #226

merged 1 commit into from
Apr 12, 2024

Conversation

benoit74
Copy link
Collaborator

@benoit74 benoit74 commented Apr 5, 2024

Changes:

  • create a high level documentation of the scraper

I've left multiple **TODO** where I need your help if possible (especially @mgautierfr) to resolve the situation.

The exercise was (as expected by all of us) very interesting to discover areas for improvements:

  • probably unused Python libraries
  • potentially useless transformations (JSON, JSONP and static JS url rewriting, all these are done dynamically by wombat as far as I've understood)

I'm waiting for your feedback on these last points and will add more test cases to the test website around these topics in the mean time.

@benoit74 benoit74 self-assigned this Apr 5, 2024
@benoit74 benoit74 changed the base branch from main to warc2zim2 April 5, 2024 16:07
Copy link

codecov bot commented Apr 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 85.98%. Comparing base (437ed7a) to head (2f9b95c).
Report is 1 commits behind head on warc2zim2.

❗ Current head 2f9b95c differs from pull request most recent head 3de8ab8. Consider uploading reports for the commit 3de8ab8 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           warc2zim2     #226   +/-   ##
==========================================
  Coverage      85.98%   85.98%           
==========================================
  Files             13       13           
  Lines           1049     1049           
  Branches         195      195           
==========================================
  Hits             902      902           
  Misses           116      116           
  Partials          31       31           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@mgautierfr mgautierfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems mostly good.
See my comments.

docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/software_architecture.md Outdated Show resolved Hide resolved
docs/software_architecture.md Outdated Show resolved Hide resolved
docs/software_architecture.md Outdated Show resolved Hide resolved
docs/software_architecture.md Outdated Show resolved Hide resolved
docs/software_architecture.md Outdated Show resolved Hide resolved
@benoit74
Copy link
Collaborator Author

benoit74 commented Apr 8, 2024

Great, thank you @mgautierfr !

@benoit74
Copy link
Collaborator Author

benoit74 commented Apr 8, 2024

I've fixed the documentation. Thank you @mgautierfr for these valuables feedbacks, please confirm it is now OK for you.

@kelson42 @rgaudin do you consider this is a sufficient first step towards more documentation, or did you expected something more detailed? I tried to find the subtle limit between too many details which would get out of sync with the codebase very quickly and too few which would prevent a good understanding of warc2zim behavior, but I might have failed doing so ^^

@benoit74 benoit74 requested a review from mgautierfr April 8, 2024 11:33
Copy link
Contributor

@mgautierfr mgautierfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ok for me.

Copy link
Member

@rgaudin rgaudin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you ; I think it's a good start indeed ; please check my inline suggestions

docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/functional_architecture.md Show resolved Hide resolved
docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/functional_architecture.md Outdated Show resolved Hide resolved
docs/technical_architecture.md Outdated Show resolved Hide resolved
docs/technical_architecture.md Outdated Show resolved Hide resolved
docs/technical_architecture.md Outdated Show resolved Hide resolved
@benoit74 benoit74 force-pushed the documentation branch 2 times, most recently from d12a21d to 2f9b95c Compare April 11, 2024 09:19
@benoit74
Copy link
Collaborator Author

@kelson42 do you wanna review it as well or may I merge like it is, and we will always be able to open an issue to fix what needs to be later on?

@benoit74 benoit74 merged commit b5e1a91 into warc2zim2 Apr 12, 2024
4 checks passed
@benoit74 benoit74 deleted the documentation branch April 12, 2024 06:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants