Contributors guide for modifying the data file #132

Open
kingo55 opened this issue Jul 17, 2016 · 7 comments

Comments

@kingo55
Contributor

kingo55 commented Jul 17, 2016

For new additions to the referrer YAML, it would be helpful if there were some guidelines on how to name / group sites.

Based on what I see in the YAML now, I don't even agree with past contributions I've made. E.g.:

  1. Given email is just one of the services Naver provides, Naver Mail should just be Naver... just like how Google is represented
  2. It's also not clear how we handle different sites of the same company, or the same brand in multiple countries.
  3. Sometimes the medium email makes sense because we can see traffic arriving through Cheetah Mail or Responsys servers but we wouldn't call them an email provider.

Thoughts?

@alexanderdean
Contributor

Agree, we need an initiative to write a contributors guide for the data file...

@alexanderdean
Contributor

More than just naming conventions - also rules like:

Any given referrer URI should only be found in the database once. If the same URI is used for two different mediums, like search and paid, then we should give the traffic the benefit of the doubt and make it search (i.e. don't assume paid).

(#130 (comment))
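
A minimal sketch of how this rule could be checked before a pull request, assuming the data file is shaped as medium, then source, then a list of domains. The sample entries and the `find_duplicates` and `resolve` helpers are hypothetical illustrations, not part of the project:

```python
from collections import defaultdict

# Hypothetical excerpt; assumes a medium -> source -> {"domains": [...]} layout.
REFERERS = {
    "search": {"Example Search": {"domains": ["search.example.com", "example.com"]}},
    "paid": {"Example Ads": {"domains": ["example.com", "ads.example.com"]}},
    "email": {"Example Mail": {"domains": ["mail.example.com"]}},
}


def find_duplicates(referers):
    """Map each domain listed more than once to its (medium, source) entries."""
    seen = defaultdict(list)
    for medium, sources in referers.items():
        for source, config in sources.items():
            for domain in config.get("domains", []):
                seen[domain].append((medium, source))
    return {domain: entries for domain, entries in seen.items() if len(entries) > 1}


def resolve(entries):
    """Keep the search entry when one exists; otherwise return None.

    None means the rule quoted above does not cover the conflict,
    so a maintainer has to decide by hand.
    """
    for medium, source in entries:
        if medium == "search":
            return (medium, source)
    return None


duplicates = find_duplicates(REFERERS)
kept = {domain: resolve(entries) for domain, entries in duplicates.items()}
```

Here `example.com` appears under both search and paid, so the check flags it and `resolve` keeps the search entry. Conflicts between other mediums are left unresolved on purpose, since the rule only covers search versus paid.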

@kingo55 changed the title from "Define naming conventions for Source / Medium" to "Contributors guide for modifying the data file" on Jul 21, 2016
@kingo55
Contributor Author

kingo55 commented Jul 21, 2016

Also useful:

  • Suggestions / explanation of how it works
  • Setup of a local environment to test with
  • Identifying missing sources, misclassifications etc

@alexanderdean - happy to work on something like this too. Can a pull request be made for GH wikis?

@christoph-buente

We also assembled more ESP domain names and would like to see them mentioned in the referer.yml file. Is there a process by now?

@alexanderdean
Contributor

The process is not yet finished, but we are working on it. Please open a PR and we will get a new version of the referer.yml database published.

The unfinished work is around making the Java client read an external file, and updating the Snowplow enrichment to support that external file.

@christoph-buente

Well, I don't have a proper referer.yml file, but I do have a list of 3600+ ESP domain names. Would that be of any interest to you?

@alexanderdean
Contributor

Wow - that does sound interesting!


3 participants