Send an alert when data is stale #303
Comments
Not sure where to put it (here vs transposer), but I did have a rough script that reports how stale the YAU data is as a value such as "-6d" (6 days old): https://github.com/pdehaan/ensemble-data-test Per https://github.com/mozilla-services/Dockerflow, services should have a
So, maybe we just check a couple of choice endpoints to see what the latest date in the dataset is, and return a 500 error if the data is more than 7 days old. Then we'd need to make sure Ops is monitoring that heartbeat endpoint, and then maybe they ping us if the data is stale. Not sure how it'd work w/ their monitoring tools. I would hate to think that somebody on PagerDuty gets paged at 3am on a Sunday because the data is 8 days old. |
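A rough sketch of that heartbeat idea, assuming the service can already look up the latest date in a dataset. The names `checkFreshness`, `makeHeartbeatHandler`, and `getLatestDate` are made up for illustration and are not part of Dockerflow or any existing code in these repos; the 7-day threshold is the one suggested above.

```javascript
// Hypothetical helper names; nothing here comes from an existing codebase.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Compare the newest date in a dataset against "now" and pick a status code.
function checkFreshness(latestDate, now = new Date(), maxAgeDays = 7) {
  const ageMs = now.getTime() - new Date(latestDate).getTime();
  if (Number.isNaN(ageMs)) {
    // An unparseable date is treated as a failure rather than as fresh data.
    return { ageDays: null, stale: true, statusCode: 500 };
  }
  const ageDays = Math.floor(ageMs / MS_PER_DAY);
  const stale = ageMs > maxAgeDays * MS_PER_DAY;
  return { ageDays, stale, statusCode: stale ? 500 : 200 };
}

// Build a handler with Node's (req, res) signature.
// getLatestDate is an assumed async callback returning the dataset's newest date.
function makeHeartbeatHandler(getLatestDate, maxAgeDays = 7) {
  return async function heartbeat(req, res) {
    const result = checkFreshness(await getLatestDate(), new Date(), maxAgeDays);
    res.writeHead(result.statusCode, { "Content-Type": "application/json" });
    res.end(JSON.stringify(result));
  };
}
```

The handler could be passed to Node's `http.createServer` or mounted on whatever route Ops ends up monitoring. Whether a 500 should page anyone is a separate question; the monitoring side could just as well treat it as a low-priority notification.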
Here's an example of scraping the https://data.firefox.com/dashboard/hardware dashboard and grabbing the

    const ms = require("ms");
    const puppeteer = require("puppeteer");

    // Exit with a failure code when the data is more than 7 days old.
    const MAX_AGE_MS = ms("7d");

    async function main() {
      const sel = "select#date-selector option";
      const browser = await puppeteer.launch();
      try {
        const page = await browser.newPage();
        await page.goto("https://data.firefox.com/dashboard/hardware", {waitUntil: "networkidle2"});
        await page.waitForSelector(sel);
        // Read the text of the first matching <option> (treated here as the latest date).
        const lastModified = await page.$eval(sel, el => el.textContent);
        // Negative value: the date is in the past.
        const diff = new Date(lastModified) - Date.now();
        console.log(`[${ms(diff)}] ${lastModified}`);
        // Compare in milliseconds; parsing the formatted string would misread "-20h" as -20 days.
        if (diff < -MAX_AGE_MS) {
          process.exitCode = 1;
        }
      } finally {
        // Close the browser even if the page fails to load.
        await browser.close();
      }
    }

    main().catch(err => {
      console.error(err);
      process.exitCode = 1;
    });

It isn't especially speedy, though, since it takes about 5s to launch a headless browser and wait for the page to load/render:

    $ time node check-hardware-dashboard.js
    [-12d] February 3, 2019
    node check-hardware-dashboard.js 0.44s user 0.19s system 12% cpu 5.020 total |
Excellent. Thank you, @pdehaan! I'll look into this. |
Note to self: the code in #297 is also worth looking at. |
@pdehaan launched a site which reports on the freshness of all data. This could be a great thing for us to leverage. 😃 |
https://ensemble-last-modified.now.sh/ is currently saying the dashboard data is 9-10 days old:

    {
      "source": "https://github.com/mozilla/ensemble",
      "version": "1.0.0",
      "commit": "5753d4021c792b3af31174a8cb473c10549f82ae",
      "dashboads": {
        "/datasets/desktop/user-activity": "-10d",
        "/datasets/desktop/usage-behavior": "-10d",
        "/datasets/desktop/hardware": "-9d"
      },
      "homepage": "https://github.com/pdehaan/ensemble-last-modified"
    } |
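If we did want to leverage that report for alerting, something like the following could pick out the stale dashboards. This is only a sketch: `parseAgeDays` and `findStaleDashboards` are hypothetical names, the age strings are assumed to always look like the ones above (a number followed by `ms`, `s`, `m`, `h`, or `d`), and the `dashboads` key is spelled the way the service currently returns it.

```javascript
// Days represented by one unit of each suffix we expect to see.
const DAYS_PER_UNIT = { ms: 1 / 86400000, s: 1 / 86400, m: 1 / 1440, h: 1 / 24, d: 1 };

// Turn an age string such as "-10d" into a positive number of days (10).
// Returns null for anything that does not match the assumed format.
function parseAgeDays(age) {
  const match = /^-?(\d+(?:\.\d+)?)(ms|s|m|h|d)$/.exec(String(age).trim());
  return match ? Number(match[1]) * DAYS_PER_UNIT[match[2]] : null;
}

// Split the report's entries into stale dashboards and ones we could not parse.
function findStaleDashboards(report, maxAgeDays = 7) {
  // "dashboads" matches the key in the JSON shown above.
  const entries = Object.entries(report.dashboads || {}).map(([path, age]) => ({
    path,
    age,
    ageDays: parseAgeDays(age),
  }));
  return {
    stale: entries.filter(e => e.ageDays !== null && e.ageDays > maxAgeDays),
    unknown: entries.filter(e => e.ageDays === null),
  };
}
```

With the report above and the default 7-day threshold, all three dashboards would land in `stale`. Entries that fail to parse are kept separate in `unknown` so that a format change in the report does not silently look like fresh data.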
Ah! Thanks for the heads up! |
An alert should be sent when the site is showing stale data. See #297 as an example of when this has happened.
It's unclear to me if this should be done here, in ensemble-transposer, or in Fx_Usage_Report. Perhaps more than one.