User story: Provide updates to users when new reporting years have been added at country level #139

Closed
anderspeders opened this issue Jul 31, 2017 · 20 comments

Comments

@anderspeders

User story:

As a data user for Guinea and Ghana, I would like to know when new data has been added for these specific countries so that I can rerun my analysis or conduct new analysis with the more recent figures.

An alert should follow once a new [year] has been added for a specific [country]. The alert could be sent by email, RSS feed or other means.

Now that we have metadata for row additions (e.g. year 2014 added for Afghanistan), it should be possible to generate some type of alert for this.

Maybe worth pinging the CKAN developer list for this?

What

Notes

@mattfullerton
Contributor

To summarize, this would be our plan for implementation:

  • A list of email addresses, to be provided by NRGI, gets a summary of what has changed at the "new years" level every time a country resource has changed during an import (i.e. at most one alert per week).
  • To begin with, any new year for any country triggers an email
  • In a second stage we can have lists per country
  • In a later stage we can open up these mailing lists for subscription and/or integrate this into CKAN
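
A minimal sketch of the first stage of this plan, assuming the import step can hand over the rows before and after an import as dictionaries with `country` and `year` keys. Those key names and both function names are hypothetical, not part of the existing importer:

```python
from collections import defaultdict


def new_years_by_country(old_rows, new_rows):
    """Return {country: sorted years} for years present after the import but not before."""
    seen = {(row["country"], row["year"]) for row in old_rows}
    added = defaultdict(set)
    for row in new_rows:
        if (row["country"], row["year"]) not in seen:
            added[row["country"]].add(row["year"])
    return {country: sorted(years) for country, years in added.items()}


def summary_lines(added):
    """Render one line per country, suitable for the body of a summary email."""
    return [
        f"{country}: new reporting year(s) {', '.join(str(y) for y in years)}"
        for country, years in sorted(added.items())
    ]
```

Per-country lists (the second stage) would then only need to filter the returned dictionary by the countries a recipient is interested in.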

@mattfullerton
Contributor

De-milestoning while @anderspeders clarifies demand for this feature

@mattfullerton
Contributor

Comment from Anders on Slack:

Thinking about a country-focused monthly email: existing and new extractives data for, say, Colombia

As a user I would like to know:

  • all contracts, projects, EITI payments and RGI data
  • all newly released data, for example within the last month

@t-morrison
Member

I would like to explore the most simple implementation of this request. I am thinking that would be:

  • Check timestamps in the last week/month
  • If there is new data in the last month (based on creation date timestamp) or updated data (based on update timestamp), generate some text w/ links saying what data was changed/added (country, year, etc.)
  • Email text to a list of email addresses

Is this the easiest way to start this? I can envision a fairly simple R solution. What would be involved for you to implement as part of the server @mattfullerton (in your own way, not something I did in R)? Or do you have a better idea?
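
The steps above could look roughly like this in Python. This is only a sketch: it assumes each row carries `created` and `updated` datetimes plus `country` and `year`, and all of those field names are hypothetical. Sending the text is left out; any mailer would do.

```python
from datetime import datetime, timedelta


def recent_changes(rows, now, window_days=30):
    """Split rows into those created and those only updated within the window."""
    cutoff = now - timedelta(days=window_days)
    added, updated = [], []
    for row in rows:
        if row["created"] >= cutoff:
            added.append(row)
        elif row["updated"] >= cutoff:
            updated.append(row)
    return added, updated


def render_text(added, updated):
    """Build the plain-text body listing what was added or updated."""
    lines = []
    for label, group in (("Added", added), ("Updated", updated)):
        for row in group:
            lines.append(f"{label}: {row['country']} {row['year']}")
    return "\n".join(lines)
```

Because it only queries timestamps, nothing has to be stored between runs apart from the list of recipients.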

@mattfullerton
Contributor

If we go at it this way (which is also what we were originally thinking), the service can be any script, even R (there's Rscript for running things like this). Your idea with the timestamps has the big advantage that we don't need to track changes, but just ask the data.

However, if we stay within CKAN, what I'm wondering is do we achieve exactly the same functionality if we let people know any time the resource has been updated; given we only update it when rows have actually changed? I'm not sure that functionality is in there, but it is conformant with the idea of CKAN notifications - when something changes, subscribers get notified, and it would save us a pile of overhead running scripts, managing email subscriptions etc. We can pair this with the idea of datastore views, where we could even create "virtual" resources for countries and years that don't exist yet.

The middle road is to use ckanext-hooks which will trigger some external service when something happens; i.e. resource gets updated, external service (again, could be almost anything but preferably something that is good for writing APIs/Web services) checks whether what happened is relevant for a subscriber, logs it and includes it in e.g. a weekly or monthly update email for that person.

@t-morrison
Member

I don't quite follow the second option there. To clarify, I'm not suggesting we have an option for someone to get updates only for, say, the Nigeria dataset; it would be all or nothing (essentially tracking only the Complete dataset). We could script, for example, "XXX new rows for Nigeria since MM/DD/YY". This would reduce the overhead, yes? (Or were you thinking the same already?)

One concern on notifying on any changes: looking at the data, there were changes March 3rd, 6th, 9th, 11th and 13th this year, as one example. It would be too much to send a notification each time.

Can you briefly describe the process for managing subscribers for my suggestion and yours above? What is easier with your option ("would save us a pile of overhead running scripts, managing email subscriptions etc")?

@mattfullerton
Contributor

@moman822 To (finally) answer your question, the work/maintenance saved would be through using CKAN's existing "follow" functionality: CKAN manages subscriptions and sends the email.

I will look at how far we get with notifications on resource updates in the EITI data

mattfullerton added a commit that referenced this issue Nov 21, 2017
… (trying to achieve meaningful "last modified" dates to help with #139)
@mattfullerton
Contributor

Quick update: I've tested the standard CKAN alerting with EITI resources - the result was that I needed to add an explicit field for the resources to mark when they've been updated (in my opinion, CKAN should be doing this, but it neither updates the right field (ckan/ckan#3907) nor triggers an activity when the file is changed (which may well be because no field is updated)).

We have the flexibility to set up how often alerts are sent by email (e.g. once a week) and for how long back the alert email looks for events of interest to the user. Obviously these two should be coordinated.

The alert email should(?) also be extended to show what has changed. Otherwise the user first has to go to their list of activities, which may also be a little overwhelming. What I would consider doing is parsing the list of things that have changed (i.e. 6 EITI datasets changed 3 times) and consolidating them to say what has changed during the entire period.
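
As a rough illustration of that consolidation, assuming the activity list can be reduced to dictionaries with a `dataset` name and an ISO-formatted `timestamp` string (both hypothetical field names, not CKAN's activity schema):

```python
from collections import Counter


def consolidate(events):
    """Collapse repeated change events into one summary line per dataset."""
    counts = Counter(event["dataset"] for event in events)
    latest = {}
    for event in events:
        name = event["dataset"]
        # ISO-formatted timestamps of the same shape compare correctly as strings.
        if name not in latest or event["timestamp"] > latest[name]:
            latest[name] = event["timestamp"]
    return [
        f"{name}: changed {counts[name]} time(s), last on {latest[name]}"
        for name in sorted(counts)
    ]
```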

We could test this for a while with just subscribing to the complete dataset. There we ought to get fairly frequent changes.

A next step would then be to extend CKAN to allow "following" searches - i.e. anything and everything about Nigeria.

Thoughts welcome

@t-morrison
Member

Implementing on the complete dataset to start would be great. For timeframe one week alerts would be fine. I agree that it should show what has changed.

From your example "(i.e. 6 EITI datasets changed 3 times)" - this would be more than following a single dataset? So more of an entire site follow for any and all changes?

mattfullerton added a commit to derilinx/ckanext-nrgi-published that referenced this issue Nov 28, 2017
@mattfullerton
Contributor

The example was based on the idea that anyone could follow any dataset (in fact, that is standard). What is coming in the email alert by default is just a message saying that something changed and the user can look at the activity list - I think including some content on what changed is important for this use case.

@t-morrison
Member

t-morrison commented Jan 12, 2018

@mattfullerton can you give some detail on how this alert will be implemented from a user perspective?

Will they need to have registered an account or can they just be prompted to enter an email address? Will this happen using the "follow" button on the resource page (e.g. here https://www.resourcedata.org/dataset/eiti-complete-summary-table)?

@mattfullerton
Contributor

@moman822 Users need to register and enable the sending of email alerts in their profile. They need to follow a dataset or organisation. What's particularly new is that they can also click to follow a [faceted] search.

@mattfullerton
Contributor

mattfullerton commented Jan 19, 2018

The major work on this is now complete.

It works as follows and can be tested on staging:

  • A logged in user makes any search, for example by entering a keyword, or selecting a facet or facets
  • A "follow" button in green appears at the top.
  • Once per [insert interval here, we were thinking a week], the system performs the search and checks a) whether the set of results has changed and b) if not, whether any of the results have been modified since the last time the search was performed. If either is true, the system emails the user.
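
The periodic check in the last step can be sketched as follows. This is an illustration rather than the code in the extension: it assumes each search result is a dictionary with an `id` and a `metadata_modified` timestamp, and that timestamps are ISO strings in one consistent format so they can be compared directly. The function name is hypothetical.

```python
def search_changed(previous_ids, current_results, last_checked):
    """Return True if a saved search should trigger an email.

    a) the set of result ids differs from the last run, or
    b) the set is the same but a result was modified since last_checked.
    """
    current_ids = {result["id"] for result in current_results}
    if current_ids != set(previous_ids):
        return True
    return any(
        result["metadata_modified"] > last_checked for result in current_results
    )
```

After each run the caller would store the current ids and the check time so the next run has something to compare against.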

Note: users need to be real CKAN users, and need to allow emails to be sent under the "Manage" button under their profile (https://staging.resourcedata.org/user/edit)

Remaining TODOs:

  • Try and persuade the mail server in the CKAN image to know its domain name (so that emails don't land in SPAM) --> job for Vitamin as this needs to come from the Docker config
  • Someone at NRGI or Vitamin should add an SPF DNS entry (so that emails don't land in SPAM)
  • Unfollow button: probably make this not appear at all, but when someone follows a search, we check to see if it already exists and prevent them creating a new saved search
  • Saved search management: it would be nice to be able to manage the searches and delete ones that aren't required. A simpler option (reduction of UI work) would be to provide a direct link in the email that allows users to remove the saved search.
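
For the duplicate check mentioned under the unfollow item, one possible approach is to reduce each search URL to a canonical key, so that the same facets in a different order count as the same saved search. This is a sketch with hypothetical function names and an in-memory set standing in for wherever saved searches are actually stored:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def search_key(url):
    """Canonical form of a search URL's query: parameters sorted by name and value."""
    params = sorted(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    return urlencode(params)


def follow_search(saved_keys, url):
    """Save the search unless an equivalent one exists; return True if it was added."""
    key = search_key(url)
    if key in saved_keys:
        return False
    saved_keys.add(key)
    return True
```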

Minor TODOs:

  • Translatable template
  • Revise init method (doesn't need to take all data?)
  • Warn about limit to 1000 rows, or overcome [note: was solved by limiting search additions on the UI to those with 500 results or less]
  • a brief explainer popup/hover for the follow button

@t-morrison
Member

t-morrison commented Jan 23, 2018

@mattfullerton @deirdrelee I'm following up on Deirdre's email and the above here with some questions/comments:

  • If I follow multiple datasets, and they both change, will I receive separate emails?
  • I have followed the EITI complete dataset and have received emails today and yesterday - are these legitimate changes to the data or just some test process that is set up?
  • What is the possibility for alerting to specific changes in the EITI complete dataset, e.g. new country-year added?
  • Can we provide an initial email when following, something like: "You have subscribed to receive updates on the following [dataset/facet search]: XXXXX..."
  • Can we add a brief explainer popup/hover for the follow button?

@mattfullerton
Contributor

@moman822

If I follow multiple datasets, and they both change, will I receive separate emails?

No. All dataset/group/organization activity is summarized in one email, and the email only provides a link to the site where the activity is listed. You also won't get emailed if you've looked at the activity list. The saved search email is a separate email.

I have followed the EITI complete dataset and have received emails today and yesterday - are these legitimate changes to the data or just some test process that is set up?

They are legitimate in that the EITI harvester runs every day, and for testing purposes we also have search-checking/email-sending running every hour (the idea would be every week).
We may need to take a closer look, though, at the contents of the file from one harvest to the next, to check that it's not just a line-ordering change or similar.
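
One way to rule out pure line-ordering changes would be to compare an order-insensitive fingerprint of the harvested file between runs. The sketch below assumes the file is a CSV with a header row; the function name is hypothetical and this is not what the harvester currently does:

```python
import csv
import hashlib
import io


def content_fingerprint(csv_text):
    """Hash the header plus the data rows in sorted order, so row order is ignored."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = sorted(tuple(row) for row in reader)
    digest = hashlib.sha256()
    digest.update(repr(tuple(header)).encode("utf-8"))
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()
```

If two consecutive harvests give the same fingerprint, the resource could be left untouched so that no activity or email is triggered.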

What is the possibility for alerting to specific changes in the EITI complete dataset, e.g. new country-year added?

You can either follow the individual dataset (e.g. https://staging.resourcedata.org/dataset/eiti-summary-data-table-for-norway) or create a search and follow that (e.g. https://staging.resourcedata.org/dataset?_country_limit=0&country=Norway) or, if you're a pro-user, you could create a saved search that expresses the data you want to be there but isn't yet (e.g. https://staging.resourcedata.org/dataset?_country_limit=0&country=Norway&year=2016)

Can we provide an initial email when following, something like: "You have subscribed to receive updates on the following [dataset/facet search]: XXXXX..."

Definitely possible, but I'm a bit unsure of the value in the case that we provide a listing of what searches have been saved (which we should do I think, along with a delete button). And then there is the question of when to send it. If it's triggered by every saved search, a user might end up with 3 or 4 emails from one browsing session. But maybe that's OK?

Can we add a brief explainer popup/hover for the follow button?

Yes, have added to TODO list above

@mattfullerton
Contributor

This is more or less done from our side; new version including the UI (on the user page beside Activity Stream) is being pushed to staging now.

@anderspeders
Author

@moman822 Are we at a stage where we could send a test round to a few colleagues from the staging environment - or does it need a full deployment in order to be tested?

@t-morrison
Member

Following the discussion today:

@anderspeders
Author

@EricSoroos Could you please provide an update here

@EricSoroos
Collaborator

Closing based on the merge of #216
