A Ruby Gem to sniff information about a domain's technology and capabilities.
site-inspector.herokuapp.com (source)
Site Inspector involves three primary concepts:
- Domain - A domain has a host defined by its TLD + SLD. A domain might be example.com. Domains have certain domain-wide properties, like whether they support non-www requests or enforce HTTPS.
- Endpoint - Each domain has four endpoints based on whether you make your request with HTTPS or not, and whether you prefix the host with www. or not. So the domain example.com may have endpoints at https://example.com, https://www.example.com, http://example.com, and http://www.example.com. There may theoretically be a different server responding to each endpoint, so endpoints have certain endpoint-specific properties, like whether they respond or whether they redirect. Each domain has one canonical (primary) endpoint.
- Checks - A check is a set of tests performed on an endpoint. A check might look at what headers are returned, what CMS is used, or whether there is a valid HTTPS certificate. There are some built-in checks, listed below, or you can define your own. While they're endpoint-specific, checks often filter up and inform some of the domain-wide logic (such as whether the domain supports HTTPS).
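The four endpoints of a domain can be enumerated mechanically from the two schemes and the optional www. prefix. The following is a minimal sketch in plain Ruby, not the gem's own code; `endpoint_urls` is a hypothetical helper introduced here only for illustration.

```ruby
# Hypothetical helper (not part of SiteInspector): builds the four endpoint
# URLs for a host by combining both schemes with and without the "www." prefix.
def endpoint_urls(host)
  schemes  = %w[https http]
  prefixes = ["", "www."]
  schemes.product(prefixes).map { |scheme, prefix| "#{scheme}://#{prefix}#{host}" }
end

puts endpoint_urls("example.com")
```

Each of these URLs may be answered by a different server, which is why the properties and checks described here are tracked per endpoint.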
domain = SiteInspector.inspect "whitehouse.gov"
domain.https?
# => true
domain.www?
# => true
domain.canonical_endpoint.to_s
# => "https://www.whitehouse.gov"
domain.canonical_endpoint.sniffer.cms
# => { :drupal => {}}
site-inspector inspect -- inspects a domain
Usage:
site-inspector inspect <domain> [options]
Options:
-j, --json JSON encode the output
-a, --all return results for all endpoints (defaults to only the canonical endpoint)
--sniffer return results for the sniffer check (defaults to all checks unless one or more checks are specified)
--https return results for the https check (defaults to all checks unless one or more checks are specified)
--hsts return results for the hsts check (defaults to all checks unless one or more checks are specified)
--headers return results for the headers check (defaults to all checks unless one or more checks are specified)
--dns return results for the dns check (defaults to all checks unless one or more checks are specified)
--content return results for the content check (defaults to all checks unless one or more checks are specified)
-h, --help Show this message
-v, --version Print the name and version
-t, --trace Show the full backtrace when an error occurs
A domain exposes the following properties:
- canonical_endpoint - the domain's primary endpoint
- government - whether the domain is a government domain
- up - whether any endpoint responds
- www - whether either www endpoint responds
- root - whether you can access the domain without www.
- https - whether HTTPS is supported
- enforces_https - whether non-HTTPS endpoints are either down or redirect to HTTPS
- downgrades_https - whether the canonical endpoint redirects to an HTTP endpoint
- canonically_www - whether non-www requests are redirected to www (or all non-www endpoints are down)
- canonically_https - whether non-HTTPS requests are redirected to HTTPS (or all HTTP endpoints are down)
- redirect - whether the domain redirects to an external domain
- hsts - whether the canonical endpoint has HSTS enabled
- hsts_subdomains - whether subdomains are included in the HSTS policy
- hsts_preload_ready - whether this domain can be added to the HSTS preload list
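Several domain-wide properties are aggregates over the four endpoints. The sketch below shows one way such roll-ups could be computed. It is a toy model: `EndpointStub`, its fields, and the `domain_*` helpers are assumptions made for this example and do not mirror SiteInspector's internals. In particular, requiring a working HTTPS endpoint before reporting that HTTPS is enforced is a choice made here, not something the gem is known to do.

```ruby
# Toy stand-in for an endpoint; the field names are assumptions of this sketch.
EndpointStub = Struct.new(:scheme, :up, :redirects_to_https, keyword_init: true)

# True when any endpoint responds.
def domain_up?(endpoints)
  endpoints.any?(&:up)
end

# True when at least one HTTPS endpoint responds.
def domain_https?(endpoints)
  endpoints.any? { |e| e.scheme == "https" && e.up }
end

# True when HTTPS works and every plain-HTTP endpoint is either down or
# redirects to HTTPS.
def domain_enforces_https?(endpoints)
  return false unless domain_https?(endpoints)

  endpoints
    .select { |e| e.scheme == "http" }
    .all? { |e| !e.up || e.redirects_to_https }
end

endpoints = [
  EndpointStub.new(scheme: "https", up: true,  redirects_to_https: false),
  EndpointStub.new(scheme: "https", up: true,  redirects_to_https: false),
  EndpointStub.new(scheme: "http",  up: true,  redirects_to_https: true),
  EndpointStub.new(scheme: "http",  up: false, redirects_to_https: false)
]

puts domain_enforces_https?(endpoints)
```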
Each endpoint exposes the following properties:
- up - whether the endpoint responds or not
- timed_out - whether the endpoint times out
- redirect - whether the endpoint redirects
- external_redirect - whether the endpoint redirects to another domain
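The difference between a redirect and an external redirect comes down to comparing the host of the requested URL with the host of the redirect target. The following sketch uses only Ruby's standard `uri` library. `bare_host` and `redirect_info` are hypothetical helpers, and treating www.example.com and example.com as the same domain is an assumption of this example rather than a statement about how the gem compares domains.

```ruby
require "uri"

# Hypothetical helper: normalizes a URL's host for comparison by lowercasing
# it and dropping a leading "www." (an assumption of this sketch).
def bare_host(url)
  URI(url).host.to_s.downcase.sub(/\Awww\./, "")
end

# Classifies a response from the requested URL and its Location header
# (pass nil when the response is not a redirect).
def redirect_info(requested_url, location)
  return { redirect: false, external_redirect: false } if location.nil?

  # Resolve relative Location values such as "/login" against the request URL.
  target = URI.join(requested_url, location).to_s
  { redirect: true, external_redirect: bare_host(requested_url) != bare_host(target) }
end

p redirect_info("http://example.com", "https://www.example.com/")
p redirect_info("http://example.com", "https://other.example.org/")
```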
Each endpoint also returns the following checks:
Uses the pa11y CLI to run automated accessibility tests. Requires node. To install pa11y: [sudo] npm install -g pa11y.
- section508 - tests against the Section508 standard
- wcag2a - tests against the WCAG2A standard
- wcag2aa - tests against the WCAG2AA standard
- wcag2aaa - tests against the WCAG2AAA standard
The content check returns:
- doctype - the HTML doctype returned
- sitemap_xml - whether the endpoint has a sitemap
- robots_txt - whether the endpoint has a robots.txt file
The dns check returns:
- dnssec - whether DNSSEC is supported
- ipv6 - whether IPv6 is supported
- cdn - the endpoint's CDN, if any
- cloud_provider - the endpoint's cloud provider, if any
- google_apps - whether the domain is using Google Apps
- hostname - the server hostname
- ip - the server IP
The headers check returns:
- cookies - whether the domain uses cookies
- strict_transport_security - whether Strict Transport Security is enabled
- content_security_policy - the endpoint's CSP
- click_jacking_protection - whether an x-frame-options header is sent
- xss_protection - whether an x-xss-protection header is sent
- server - the server header
- secure_cookies - whether the cookies are secure or not
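Most of the header properties reduce to checking whether a given response header is present. The sketch below works on a plain Hash of headers and is not the gem's implementation. `header_summary` is a hypothetical helper, and two details are simplifying assumptions: cookies count as secure only when every Set-Cookie value carries the Secure attribute, and Strict Transport Security is reported from the header's presence alone.

```ruby
# Hypothetical helper: summarizes security-related response headers.
# `headers` is a plain Hash; header names are matched case-insensitively.
def header_summary(headers)
  h = headers.transform_keys { |k| k.to_s.downcase }
  set_cookie = Array(h["set-cookie"])

  {
    cookies: !set_cookie.empty?,
    # Simplification: presence of the header, without validating its value.
    strict_transport_security: h.key?("strict-transport-security"),
    content_security_policy: h["content-security-policy"],
    click_jacking_protection: h.key?("x-frame-options"),
    xss_protection: h.key?("x-xss-protection"),
    server: h["server"],
    # Assumption: "secure" means every cookie carries the Secure attribute.
    secure_cookies: !set_cookie.empty? && set_cookie.all? { |c| c.match?(/;\s*secure\b/i) }
  }
end

p header_summary(
  "Server" => "nginx",
  "X-Frame-Options" => "DENY",
  "Set-Cookie" => ["session=abc; Secure; HttpOnly"]
)
```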
The hsts check returns:
- valid - whether the HSTS header is valid
- max_age - the HSTS max age
- include_subdomains - whether subdomains are included
- preload - whether it's preloaded
- enabled - whether HSTS is enabled
- preload_ready - whether HSTS could be preloaded
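The HSTS properties can all be derived from the directives of the Strict-Transport-Security response header. The following sketch parses a header value into those fields. It is an illustration rather than the gem's code: `parse_hsts` and its `preload_min_age` parameter are hypothetical, and the one-year default threshold is an assumption about preload-list requirements that may not match what the gem applies.

```ruby
# Hypothetical helper: parses a Strict-Transport-Security header value.
# `preload_min_age` (seconds) is an assumed threshold, not taken from the gem.
def parse_hsts(header, preload_min_age: 31_536_000)
  directives = header.to_s.split(";").map { |d| d.strip.downcase }.reject(&:empty?)

  # Extract the digits from a max-age directive, with or without quotes.
  raw_age = directives.map { |d| d[/\Amax-age="?(\d+)"?\z/, 1] }.compact.first
  max_age = raw_age && raw_age.to_i

  include_subdomains = directives.include?("includesubdomains")
  preload = directives.include?("preload")
  valid = !max_age.nil?            # a max-age directive is required
  enabled = valid && max_age > 0   # max-age=0 switches HSTS off

  {
    valid: valid,
    max_age: max_age,
    include_subdomains: include_subdomains,
    preload: preload,
    enabled: enabled,
    preload_ready: enabled && include_subdomains && preload && max_age >= preload_min_age
  }
end

p parse_hsts("max-age=31536000; includeSubDomains; preload")
```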
The https check returns:
- valid - whether the HTTPS response is valid
- return_code - the HTTPS error, if any
The sniffer check returns:
- cms - the CMS used, if any
- analytics - the analytics providers used, if any
- javascript - the JavaScript libraries used, if any
- advertising - the advertising providers used, if any
Checks are special classes that are children of SiteInspector::Endpoint::Check. You can implement your own check like this:
class SiteInspector
  class Endpoint
    class Mention < Check
      # Truthy when the endpoint's response body mentions "ben" (case-insensitive)
      def mentions_ben?
        endpoint.content.body =~ /ben/i
      end
    end
  end
end
This check can then be used as follows:
domain.canonical_endpoint.mention.mentions_ben?
Checks can call the endpoint object, which contains the request, response, and other checks. Custom checks are automatically exposed as endpoint methods.
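To make the idea of automatic exposure concrete, here is a self-contained toy model of how a base class could register its subclasses and surface them as methods on an endpoint. This is not SiteInspector's actual implementation: `ToyCheck`, `ToyEndpoint`, and the registry are hypothetical names, and the `inherited` hook plus `method_missing` approach is simply one common Ruby technique for this kind of plugin discovery.

```ruby
# Toy model only; every class and method name here is hypothetical.
class ToyCheck
  attr_reader :endpoint

  # Maps a check name (e.g. :mention) to its class.
  def self.registry
    @registry ||= {}
  end

  # Ruby calls this hook whenever a subclass of ToyCheck is defined.
  def self.inherited(subclass)
    super
    key = subclass.name.split("::").last.downcase.to_sym
    ToyCheck.registry[key] = subclass
  end

  def initialize(endpoint)
    @endpoint = endpoint
  end
end

class ToyEndpoint
  attr_reader :body

  def initialize(body)
    @body = body
  end

  # Any registered check name becomes a method returning a check instance.
  def method_missing(name, *args, &block)
    check = ToyCheck.registry[name]
    check ? check.new(self) : super
  end

  def respond_to_missing?(name, include_private = false)
    ToyCheck.registry.key?(name) || super
  end
end

# Defining the subclass is enough to register it; no extra wiring is needed.
class Mention < ToyCheck
  def mentions_ben?
    endpoint.body.match?(/ben/i)
  end
end

puts ToyEndpoint.new("Hello from Ben").mention.mentions_ben?
```

In this model the check reaches back to its endpoint through the `endpoint` reader, which parallels how a check is described as being able to call the endpoint object.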
- Clone down the repo
script/bootstrap
script/cibuild
script/console
- Fork the project
- Create a new, descriptively named feature branch
- Make your changes
- Submit a pull request