This repository has been archived by the owner on Jul 15, 2021. It is now read-only.

[POLL] HTTP tracking (and caching) #176

Open
timwhite opened this issue Jul 28, 2018 · 5 comments

@timwhite
Contributor

timwhite commented Jul 28, 2018

With the push towards a "secure" web (HTTPS everywhere), more and more sites are now only accessible over HTTPS. This means more and more sites are not showing in the Squid web logs.

It isn't possible to track which sites Hotspot users visit when those sites use HTTPS. Reverse lookup of the IP address works in some cases and is wildly inaccurate in others (think of a website behind a shared Cloudflare IP address). At no point is a Hotspot user going to modify their proxy settings so that we can monitor their HTTPS usage; that's just not good for user experience.

HTTPS accounts for more than 66% of all page loads in Chrome across all platforms, and most platforms have more than 75% of page loads over HTTPS. https://transparencyreport.google.com/https/overview?hl=en With more than 50% of all pages loaded via Chrome being HTTPS, the encrypted web really is here, and here to stay. https://security.googleblog.com/2018/02/a-secure-web-is-here-to-stay.html

It's time to decide whether it's worth the maintenance effort, and the CPU cycles, of running the Squid transparent proxy and attempting to track users' browsing history. The report is becoming less and less useful as more and more sites move to HTTPS.

If you vote to keep HTTP tracking, please leave a comment below explaining why you want it kept.

@timwhite timwhite added the Polls label Jul 28, 2018
@timwhite timwhite changed the title POLL: HTTP tracking (and caching) [POLL] HTTP tracking (and caching) Jul 28, 2018
@tomas213
Contributor

tomas213 commented Jul 28, 2018

Two questions:

  1. What about stats on which pages users have visited, for legal matters? Removing Squid will lose all that data. Maybe we can use AWStats.
  2. Will removing Squid and the cache have any effect on browsing speed?

@timwhite
Contributor Author

@tomas213

  1. Some countries require tracking, others require that we don't track. For countries that require tracking, it may be best to have a HowTo on setting up something like softflowd, which will log IP connections (IP, port, duration, bytes). It will then be up to the operator to cross-match the hotspot IPs against that data if they need to retrieve the logs for legal purposes.
  2. The cache only helps HTTP sites already. As the number of HTTP sites falls, the cache's usefulness will fall significantly with it. If you have a site currently running, run a Squid log analyser over the logs and see how many cache hits you actually get. A cache miss means the file couldn't be served from the cache.
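
For anyone without a log analyser to hand, the cache-hit check suggested above can be roughly approximated with a few lines of Python. This is only a sketch under assumptions not stated in this thread: that the log is in Squid's default native `access.log` format, where the fourth whitespace-separated field is the result code (for example `TCP_MISS/200`), and that any result code containing `HIT` counts as served from cache. The function name `cache_hit_ratio` is made up for illustration.

```python
from collections import Counter
from typing import Iterable, Tuple


def cache_hit_ratio(lines: Iterable[str]) -> Tuple[float, Counter]:
    """Return (hit ratio, per-result-code counts) for Squid native-format log lines."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        # Assumption: field 4 looks like "TCP_MISS/200"; keep the part before the slash.
        result_code = fields[3].split("/", 1)[0]
        counts[result_code] += 1
    total = sum(counts.values())
    # Simplification: any result code containing "HIT" is treated as a cache hit.
    hits = sum(n for code, n in counts.items() if "HIT" in code)
    return (hits / total if total else 0.0), counts
```

It can be fed an open log file directly, e.g. `ratio, counts = cache_hit_ratio(open("/var/log/squid/access.log"))`, where the path is an assumption and depends on the install. Looking at the per-code counts alongside the ratio shows how much of the traffic the cache never touches at all.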

@louis222

I think it's a useful feature for countries that require it. However, I think there should be a way to easily turn it on and off, if that's possible.

@tomas213
Contributor

Tim, if that's the case, then there can be a guide on tracking for the countries that need it.
I voted to remove squid!

@joseborges

joseborges commented Oct 8, 2018

In my opinion, and as a user, I would always rather have the option to do it or not. So if it's possible, Tim, consider adding this as a setting in the backoffice.

Control access (Squid)?
[ ] Yes [ X ] No

And you could have it default to No.

(This would be the perfect solution for less tech-savvy users.)
