2022-Q2 Work Package: UKWA PyWB Support for Document Access #74

Closed
9 tasks done
anjackson opened this issue Jan 12, 2022 · 10 comments

@anjackson
Contributor

anjackson commented Jan 12, 2022

The design re-uses the existing PyWB system to provide single-concurrent-use locks on ‘live’ content drawn from a downstream HTTP document delivery server. https://github.com/ukwa/ukwa-pywb/milestone/2

This is under development on the custom-viewers branch. Deployment configuration is in ukwa-services/access/rrwb.

To complete the design, we require the ukwa-pywb service to be modified to:

  • Support PDF and ePub 2/3 rendering as HTML #68
  • Block Content-Disposition: attachment downloads so the rendering is controlled by PyWB. 
  • Intercept responses that are not PDF/ePub or web formats, and block them, replacing them with a help page.
  • Add appropriate redirects so all requests go to a fixed timestamp+ARK combination (as the locking mechanism requires both a timestamp and an ARK identifier).  (@anjackson to do this in ukwa-services, I think?)
  • Remove UKWA branding and calendar navigation from the document access collection.
  • Provide an interface that meets or exceeds the WCAG 2.1 AA accessibility standard.
  • Display the NPLD terms of use at the start of a new session, requiring the reader to accept terms.
  • Can't see iframe contents when printing #93
  • Lock page should not say "Archived Web Page Locked" for documents.
@ikreymer
Contributor

It might be good to clarify which parts of this should happen here in pywb and which belong only in the Electron NPLD Viewer. Assuming that this will only be accessed via the Electron app (outside testing), some things might make sense to do there.

Support pdf.js and ePub.js wrapping on document access collection. 
This will be part of pywb, part of #85

Block Content-Disposition: attachment downloads so the rendering is controlled by PyWB. 
If the goal is just to block downloads, Electron provides an elegant way to do this. Removing Content-Disposition alone doesn't prevent a browser from ever downloading, so it's a bit trickier if this also needs to be supported in pywb itself. If not, this can be handled in the NPLD Viewer alone.
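
For illustration only, here is a minimal sketch of what the server-side half of this might look like: rewriting an `attachment` disposition to `inline` on a list of response headers. The function name `force_inline_disposition` is hypothetical and is not part of pywb. As noted above, this alone would not stop a browser from downloading content it cannot render.

```python
from typing import Iterable, List, Tuple

Header = Tuple[str, str]


def force_inline_disposition(headers: Iterable[Header]) -> List[Header]:
    """Return a copy of the headers with 'attachment' dispositions set to 'inline'.

    Hypothetical helper for illustration; not an existing pywb function.
    """
    result: List[Header] = []
    for name, value in headers:
        if name.lower() == "content-disposition":
            # Split "attachment; filename=..." into the disposition type and its parameters.
            disposition, sep, params = value.partition(";")
            if disposition.strip().lower() == "attachment":
                # Keep any parameters (e.g. filename) but change the type to inline.
                value = "inline" + sep + params
        result.append((name, value))
    return result
```

Something like this could be applied to the header list before the response is sent, leaving all other headers untouched.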

Intercept responses that are not PDF/ePub or web formats, and block them, replacing them with a help page.
This will be implemented in pywb. There are already placeholders for this for the reading-room collection.
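
As a rough sketch of the kind of check this implies, the function below decides whether a response's Content-Type is one the viewer would render. The allow-list and the name `is_viewable` are assumptions made for illustration; the actual set of "web formats" and the pywb implementation are not specified in this thread.

```python
# Assumed allow-list: PDF, ePub, and a few common web formats (illustrative only).
ALLOWED_TYPES = {
    "application/pdf",
    "application/epub+zip",
    "text/html",
    "text/css",
    "application/javascript",
    "text/javascript",
}
ALLOWED_PREFIXES = ("image/", "font/")


def is_viewable(content_type: str) -> bool:
    """Return True if the Content-Type is on the assumed allow-list."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime in ALLOWED_TYPES or mime.startswith(ALLOWED_PREFIXES)
```

A response for which `is_viewable` returns False would be the one replaced with the help page.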

Add appropriate redirects so all requests go to a fixed timestamp+ARK combination (as the locking mechanism requires both a timestamp and an ARK identifier). 

The live proxying will be in pywb.

Remove UKWA branding and calendar navigation from the document access collection.

pywb can just display a frame with no banner for this. Should the NPLD viewer have UKWA branding at all, as part of the nav location bar, but no access to calendar?

Provide an interface that meets or exceeds the WCAG 2.1 AA accessibility standard.

This would be in the NPLD Viewer

Display the NPLD terms of use at the start of a new session, requiring the reader to accept terms.

This will also be part of the NPLD viewer and not in pywb.

@anjackson
Contributor Author

Unfortunately, I think everything on this ticket needs to be part of PyWB (or the NGINX it's deployed with). The terms of use may need to be displayed by PyWB in some contexts, so it's probably better to add it to PyWB/NGINX. The same applies to blocking Content-Disposition unfortunately.

Accessibility does cover the NPLD Player UI, but also covers the PyWB navigation, calendar pages, and PDF/ePub viewers. Of course, if we find problems with pdf.js/ePub.js all we can do is raise them upstream, but both projects take accessibility seriously and so this should work out over time. I believe the PyWB templates/pages are already okay in terms of accessibility.

BTW, I've been working on integrating basic accessibility testing into our routine tests. See ukwa/docker-robot-framework#5

@anjackson
Contributor Author

As discussed in Slack, the download block should probably be done in the browser. If this causes problems, we'll revisit the issue.

@anjackson
Contributor Author

This is looking pretty good now, although printing is not quite right.

@anjackson
Contributor Author

So, apparently, the British Library fork of Universal Viewer (https://github.com/britishlibrary/universalviewer) supports PDF and ePub, but also has enhanced features like print layouts, citation support, etc. This was developed during an earlier iteration of this project, but I'm struggling to find documentation on how it should be used.

As I understand it, we could bundle this as the viewer, but it would mean setting up support for IIIF manifests, and it's not clear if ePub streaming would work.

@anjackson
Contributor Author

Noting that the appearance of any collection can be modified by adding alternative template files according to the standard directory structure.

@anjackson
Contributor Author

Implemented branding changes over in https://github.com/ukwa/npld-access-stack/tree/main/pywb which is part of the repository that controls how PyWB is deployed for the NPLD service.

@anjackson
Contributor Author

I've moved some follow-on support issues to #100

@anjackson
Contributor Author

I've moved #77 and #78 to follow-on work tickets and marked the accessibility requirement as done, since you've done all you can at present and have indicated you are happy to fold in any issues we find during formal evaluation.

@anjackson
Contributor Author

Thanks for all that.
