Some collections not visible in category view #353

Open
anjackson opened this issue Jun 9, 2022 · 7 comments

Comments

@anjackson
Contributor

anjackson commented Jun 9, 2022

In response to ukwa/w3act#676

Looking at:

public NamedList<List<PivotField>> pivotCategoriesRequest() throws Exception{

This seems to map to this query, where the UK General Election 2015 collection can be seen: http://prod1.n45.wa.bl.uk:9021/solr/collections/select?indent=on&facet=true&facet.pivot=collectionAreaId,id,description,name&q=*:*&facet.limit=-1&facet.pivot.mincount=1&wt=json&rows=1

The code that consumes this list is not obvious to me, so I'm not sure what's happening there:

for (Map.Entry<String, List<PivotField>> pivotEntry : pivotEntryList) {

The only thing I spotted is that the UK General Election 2015 collection is preceded by an entry that has no description:

            {
              "field":"id",
              "value":"4148",
              "count":1},
            {
              "field":"id",
              "value":"60",
              "count":1,
              "pivot":[{
                  "field":"description",
                  "value":"Collection of websites, curated by staff at the Legal Deposit Libraries, focussing on the 2015 UK General Election which was held on 7 May 2015 to elect 650 members to the House of Commons. It was the first general election at the end of a fixed-term Parliament. \n",
                  "count":1,
                  "pivot":[{
                      "field":"name",
                      "value":"UK General Election 2015",
                      "count":1}]}]},

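To make the shape of the problem concrete, here is a minimal sketch of how a consumer could flag `id` entries that arrive without a nested `description` pivot, like id 4148 above. `PivotNode` is a hypothetical stand-in for SolrJ's `PivotField`, and none of this is taken from the ukwa-ui code, so treat it as an illustration of the check rather than the actual consumer logic.

```java
import java.util.ArrayList;
import java.util.List;

class PivotShapeCheck {

    /** Hypothetical stand-in for one node of a Solr facet.pivot response. */
    static final class PivotNode {
        final String field;
        final String value;
        final int count;
        final List<PivotNode> pivot; // null when Solr returned no nested pivot

        PivotNode(String field, String value, int count, List<PivotNode> pivot) {
            this.field = field;
            this.value = value;
            this.count = count;
            this.pivot = pivot;
        }
    }

    /** Returns the values of "id" nodes that carry no nested "description" pivot. */
    static List<String> idsWithoutDescription(List<PivotNode> idNodes) {
        List<String> missing = new ArrayList<>();
        for (PivotNode node : idNodes) {
            // Guard against a null pivot list before looking for a description child.
            boolean hasDescription = node.pivot != null
                    && node.pivot.stream().anyMatch(p -> "description".equals(p.field));
            if (!hasDescription) {
                missing.add(node.value);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Mirrors the two entries quoted above; the description text is abbreviated here.
        PivotNode name = new PivotNode("name", "UK General Election 2015", 1, null);
        PivotNode description = new PivotNode("description", "Collection of websites ...", 1, List.of(name));
        List<PivotNode> idNodes = List.of(
                new PivotNode("id", "4148", 1, null),
                new PivotNode("id", "60", 1, List.of(description)));
        System.out.println(idsWithoutDescription(idNodes)); // prints [4148]
    }
}
```

If the real consumer assumes every `id` node has a `description` child, an entry like 4148 is the kind of input that could trip it up.
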
Is it possible that that's breaking things? You could try this by filtering out items with no description or title (name:[* TO *] AND description:[* TO *]), e.g.

http://prod1.n45.wa.bl.uk:9021/solr/collections/select?indent=on&facet=true&facet.pivot=collectionAreaId,id,description,name&q=name:[*%20TO%20%20*]%20AND%20description:[*%20TO%20*]&facet.limit=-1&facet.pivot.mincount=1&wt=json&rows=0
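
For anyone who wants to try the filtered query from code rather than by hand, a rough sketch of building that URL follows. The class and method names are made up for illustration, and the encoding is stricter than the hand-written URL above (colons and brackets are percent-encoded), which Solr should accept in the same way.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class FilteredPivotQuery {

    /** Builds the select URL with q restricted to documents that have both a name and a description. */
    static String build(String solrBase) {
        String q = "name:[* TO *] AND description:[* TO *]";
        // URLEncoder emits '+' for spaces (form encoding); use %20 to match the URLs in this thread.
        String encodedQ = URLEncoder.encode(q, StandardCharsets.UTF_8).replace("+", "%20");
        return solrBase + "/select?indent=on&facet=true"
                + "&facet.pivot=collectionAreaId,id,description,name"
                + "&q=" + encodedQ
                + "&facet.limit=-1&facet.pivot.mincount=1&wt=json&rows=0";
    }

    public static void main(String[] args) {
        // Base URL taken from the query quoted in this issue.
        System.out.println(build("http://prod1.n45.wa.bl.uk:9021/solr/collections"));
    }
}
```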

@min2ha
Contributor

min2ha commented Jun 17, 2022

Actually the SOLR query related to AREAS is there:

public NamedList<List<PivotField>> pivotCategoriesRequest() throws Exception{

min2ha added a commit to min2ha/ukwa-ui that referenced this issue Jun 21, 2022
min2ha added a commit to min2ha/ukwa-ui that referenced this issue Jun 21, 2022
@anjackson anjackson added the bug label Jun 24, 2022
@crarugal
Collaborator

Top level collections (and sub-collections) that are visible, searchable, published, and/or not listed
https://docs.google.com/spreadsheets/d/1i77oxEa4sPfUk4wAdQRp3xo-KAU-cR9pIEObWGZaLUg/edit#gid=2019239782

min2ha added a commit to min2ha/ukwa-ui that referenced this issue Jul 26, 2022
@crarugal
Collaborator

crarugal commented Aug 8, 2022

Collections that can't be searched for in dev:

| W3ACT link | Collection ID | Collection name | anomaly? | Topics and Themes page link | Viewable on site | ttype | id | url | created_at | Name length | Description length | publish |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| https://www.webarchive.org.uk/act/collections/60 | 60 | UK Gen election 2015 | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/60 | Yes | collections | 60 | act-300 | 2015-02-09 14:07:06 | 24 | 264 | TRUE |
| https://www.webarchive.org.uk/act/collections/689 | 689 | Scottish elections 2016 | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/689 | Yes | collections | 689 | act-689 | 2016-01-18 12:03:39 | 36 | 220 | TRUE |
| https://www.webarchive.org.uk/act/collections/851 | 851 | Queens birthday 2016 | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/851 | Yes | collections | 851 | act-851 | 2016-05-26 11:37:23 | 34 | 115 | TRUE |
| https://www.webarchive.org.uk/act/collections/2778 | 2778 | Unfinished business | Missing description | https://www.webarchive.org.uk/en/ukwa/collection/2778 | Yes | collections | 2778 | act-2778 | 2019-09-30 10:15:45 | 49 | 0 | TRUE |
| https://www.webarchive.org.uk/act/collections/3064 | 3064 | Startup | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/3064 | Yes | collections | 3064 | act-3064 | 2020-04-28 11:11:13 | 19 | 332 | TRUE |
| https://www.webarchive.org.uk/act/collections/3098 | 3098 | UK Retail | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/3098 | Yes | collections | 3098 | act-3098 | 2020-07-17 13:25:10 | 46 | 590 | TRUE |
| https://www.webarchive.org.uk/act/collections/3866 | 3866 | Duke of edinburgh | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/3866 | Yes | collections | 3866 | act-3866 | 2021-04-19 08:42:07 | 17 | 668 | TRUE |
| https://www.webarchive.org.uk/act/collections/4148 | 4148 | NHS Patient Surveys \| UKWA Topics and Themes | Missing Description | https://www.webarchive.org.uk/en/ukwa/collection/4148 | Yes | collections | 4148 | act-4148 | 2022-01-13 13:39:10 | 19 | 0 | TRUE |
| https://www.webarchive.org.uk/act/collections/4214 | 4214 | Ukraine 2022 | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/4214 | Yes | | 1163 | act-4214 | Wednesday, March 02, 2022 | 12 | 739 | TRUE |
| https://www.webarchive.org.uk/act/collections/4088 | 4088 | The Queen's Platinum Jubilee 2022 | Collection description was present, but still not searchable | https://www.webarchive.org.uk/en/ukwa/collection/4088 | Yes | collections | 4088 | | | 33 | 569 | TRUE |

@min2ha
Contributor

min2ha commented Aug 22, 2022

The scope of this ticket is Collection Visibility in Category view only.
(searchable or not is out of scope)

The source of JSON of Top Collections (https://www.webarchive.org.uk/act/collections/allCollectionAreasAsJson/7)

Take the case of Collection 3098 (aka UK Retail), which is listed in Working On(!).
From the JSON of Top Collections we know that it exists in 3 Collection Areas: Places, Society & Communities and Working On(!).

BTW, a long time ago we agreed that by default we do not expose collections listed in Working On.

The count of collections in 'Working On' is 7 in the JSON but 5 in the SOLR instance (2 fewer, probably due to the field Publish:NO):
http://prod1.n45.wa.bl.uk:9021/solr/collections/select?q=collectionAreaId:2945&indent=on&wt=json&rows=100
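
To pin down which two collections account for the gap, the two id lists can be diffed. The sketch below assumes the ids have already been pulled out of the JSON and the SOLR response by hand. The class name and the sample ids are placeholders, not the real 'Working On' contents.

```java
import java.util.Set;
import java.util.TreeSet;

class WorkingOnDiff {

    /** Returns ids present in the W3ACT JSON but absent from the Solr result, in sorted order. */
    static Set<Integer> missingFromSolr(Set<Integer> jsonIds, Set<Integer> solrIds) {
        Set<Integer> missing = new TreeSet<>(jsonIds);
        missing.removeAll(solrIds);
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder ids for illustration only; the real lists are not given in this thread.
        Set<Integer> jsonIds = Set.of(1, 2, 3, 4, 5, 6, 7);
        Set<Integer> solrIds = Set.of(1, 2, 3, 4, 5);
        System.out.println(missingFromSolr(jsonIds, solrIds)); // prints [6, 7]
    }
}
```

Checking the publish flag of whatever ids come out would confirm or rule out the Publish:NO guess.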

@crarugal
Collaborator

Thanks, Mindy, and apologies, I wasn't aware (or maybe I forgot) that Collections tagged into "Working On" would not be exposed.

Should Collections that are both published and tagged into "Working On" still be searchable? Perhaps this is more of a curatorial question, as I think all published Collections should still be searchable.

@nicolabingham

Please can we remove the "Working On" collection so that it is not available to ACT users or end users?
We will make sure anything currently tagged in this collection is removed from it and tagged into other collections. Thanks.

@nicolabingham

I have added descriptions for two collections that were missing them (https://www.webarchive.org.uk/act/collections/4148 NHS Patient Surveys and https://www.webarchive.org.uk/act/collections/2778 Unfinished business)
And have untagged three collections from the 'working on' category
https://www.webarchive.org.uk/act/collections/3064 3064 Startup
https://www.webarchive.org.uk/act/collections/3098 3098 UK Retail
https://www.webarchive.org.uk/act/collections/3866 3866 Duke of edinburgh
Will check back tomorrow to see if they are visible.


5 participants