Site Inspector
A Ruby Gem to sniff information about a domain's technology and capabilities.
Demo
site-inspector.herokuapp.com (source)
Concepts
Site Inspector involves three primary concepts:
- Domain - A domain has a host defined by its TLD + SLD. A domain might be example.com. Domains have certain domain-wide properties, like whether they support non-www requests or enforce HTTPS.
- Endpoint - Each domain has four endpoints based on whether you make your request with HTTPS or not, and whether you prefix the host with www. or not. So the domain example.com may have endpoints at https://example.com, https://www.example.com, http://example.com, and http://www.example.com. There may theoretically be a different server responding to each endpoint, so endpoints have certain endpoint-specific properties, like whether they respond or whether they redirect. Each domain has one canonical (primary) endpoint.
- Checks - A check is a set of tests performed on an endpoint. A check might look at what headers are returned, what CMS is used, or whether there is a valid HTTPS certificate. There are some built-in checks, listed below, or you can define your own. While they're endpoint-specific, checks often filter up and inform some of the domain-wide logic (such as whether the domain supports HTTPS).
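To make the four-endpoint idea concrete, here is a minimal sketch in plain Ruby. The endpoint_urls helper is hypothetical and not part of the gem's API; it only illustrates how the scheme and www. prefix combine for a single host.

```ruby
# Hypothetical helper (not part of the gem): lists the four endpoint
# URLs that correspond to a single domain.
def endpoint_urls(host)
  root = host.sub(/\Awww\./, "") # normalize to the bare host
  # Combine each scheme with each prefix: 2 x 2 = 4 endpoints
  %w[https http].product(["", "www."]).map do |scheme, prefix|
    "#{scheme}://#{prefix}#{root}"
  end
end

puts endpoint_urls("example.com")
# https://example.com
# https://www.example.com
# http://example.com
# http://www.example.com
```

Each of these URLs may be answered by a different server, which is why properties such as up or redirect are tracked per endpoint rather than per domain.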
Usage
Ruby
domain = SiteInspector.inspect "whitehouse.gov"
domain.https?
# => true
domain.www?
# => true
domain.canonical_endpoint.to_s
# => "https://www.whitehouse.gov"
domain.canonical_endpoint.sniffer.cms
# => { :drupal => {}}
Command line usage
site-inspector inspect -- inspects a domain
Usage:
site-inspector inspect <domain> [options]
Options:
-j, --json JSON encode the output
-a, --all return results for all endpoints (defaults to only the canonical endpoint)
--sniffer return results for the sniffer check (defaults to all checks unless one or more checks are specified)
--https return results for the https check (defaults to all checks unless one or more checks are specified)
--hsts return results for the hsts check (defaults to all checks unless one or more checks are specified)
--headers return results for the headers check (defaults to all checks unless one or more checks are specified)
--dns return results for the dns check (defaults to all checks unless one or more checks are specified)
--content return results for the content check (defaults to all checks unless one or more checks are specified)
-h, --help Show this message
-v, --version Print the name and version
-t, --trace Show the full backtrace when an error occurs
What's checked
Domain
- canonical_endpoint - the domain's primary endpoint
- government - whether the domain is a government domain
- up - whether any endpoint responds
- www - whether either www endpoint responds
- root - whether you can access the domain without www.
- https - whether HTTPS is supported
- enforces_https - whether non-HTTPS endpoints are either down or redirect to HTTPS
- downgrades_https - whether the canonical endpoint redirects to an HTTP endpoint
- canonically_www - whether non-www requests are redirected to www (or all non-www endpoints are down)
- canonically_https - whether non-HTTPS requests are redirected to HTTPS (or all HTTP endpoints are down)
- redirect - whether the domain redirects to an external domain
- hsts - whether the canonical endpoint has HSTS enabled
- hsts_subdomains - whether subdomains are included in the HSTS policy
- hsts_preload_ready - whether this domain can be added to the HSTS preload list
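The domain-wide properties above are rolled up from the state of the individual endpoints. The following is a rough sketch of that idea for enforces_https, not the gem's actual implementation. StubEndpoint and both helper methods are hypothetical, and requiring HTTPS support before reporting enforcement is an added assumption.

```ruby
# Hypothetical stand-in for an endpoint; the gem's real class is richer.
StubEndpoint = Struct.new(:scheme, :up, :redirects_to_https, keyword_init: true) do
  def https?
    scheme == "https"
  end
end

# HTTPS is supported if any HTTPS endpoint responds.
def supports_https?(endpoints)
  endpoints.any? { |e| e.https? && e.up }
end

# HTTPS is enforced if every non-HTTPS endpoint is either down or
# redirects to HTTPS. Assumption: HTTPS must also be supported at all.
def enforces_https?(endpoints)
  return false unless supports_https?(endpoints)

  endpoints.reject(&:https?).all? { |e| !e.up || e.redirects_to_https }
end

endpoints = [
  StubEndpoint.new(scheme: "https", up: true,  redirects_to_https: false),
  StubEndpoint.new(scheme: "https", up: true,  redirects_to_https: false),
  StubEndpoint.new(scheme: "http",  up: true,  redirects_to_https: true),
  StubEndpoint.new(scheme: "http",  up: false, redirects_to_https: false)
]

puts enforces_https?(endpoints) # => true
```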
Endpoint
- up - whether the endpoint responds or not
- timed_out - whether the endpoint times out
- redirect - whether the endpoint redirects
- external_redirect - whether the endpoint redirects to another domain
Checks
Each endpoint also returns the following checks:
Accessibility
Uses the pa11y CLI to run automated accessibility tests. Requires node. To install pa11y: [sudo] npm install -g pa11y.
- section508 - tests against the Section 508 standard
- wcag2a - tests against the WCAG2A standard
- wcag2aa - tests against the WCAG2AA standard
- wcag2aaa - tests against the WCAG2AAA standard
Content
- doctype - the HTML doctype returned
- sitemap_xml - whether the endpoint has a sitemap
- robots_txt - whether the endpoint has a robots.txt file
DNS
- dnssec - whether DNSSEC is supported
- ipv6 - whether IPv6 is supported
- cdn - the endpoint's CDN, if any
- cloud_provider - the endpoint's cloud provider, if any
- google_apps - whether the domain is using Google Apps
- hostname - the server hostname
- ip - the server IP
Headers
- cookies - whether the domain uses cookies
- strict_transport_security - whether STS is enabled
- content_security_policy - the endpoint's CSP
- click_jacking_protection - whether an x-frame-options header is sent
- xss_protection - whether an x-xss-protection header is sent
- server - the server header
- secure_cookies - whether the cookies are secure
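As an illustration of what a headers check boils down to, the sketch below summarizes a plain hash of response headers. It is not the gem's code: header_summary is a hypothetical helper, and treating cookies as secure only when every Set-Cookie value carries the Secure attribute is an assumption.

```ruby
# Hypothetical helper (not the gem's implementation): derives a few of
# the header properties listed above from a hash of response headers.
def header_summary(headers)
  # Header names are case-insensitive, so normalize the keys first
  h = headers.transform_keys { |k| k.to_s.downcase }
  cookies = Array(h["set-cookie"])

  {
    cookies: !cookies.empty?,
    strict_transport_security: h.key?("strict-transport-security"),
    content_security_policy: h["content-security-policy"],
    click_jacking_protection: h.key?("x-frame-options"),
    xss_protection: h.key?("x-xss-protection"),
    server: h["server"],
    # Assumption: "secure" means every cookie sets the Secure attribute
    secure_cookies: !cookies.empty? && cookies.all? { |c| c.match?(/;\s*secure\b/i) }
  }
end

summary = header_summary(
  "Set-Cookie" => ["session=abc; Secure; HttpOnly"],
  "X-Frame-Options" => "DENY",
  "Server" => "nginx"
)
p summary
```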
HSTS
- valid - whether the HSTS header is valid
- max_age - the HSTS max age
- include_subdomains - whether subdomains are included
- preload - whether it's preloaded
- enabled - whether HSTS is enabled
- preload_ready - whether HSTS could be preloaded
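The HSTS properties above can all be read off the Strict-Transport-Security header. The sketch below is a simplified, hypothetical parser and not the gem's implementation. The PRELOAD_MIN_AGE threshold of one year is an assumption about the preload list's requirements, so adjust it to whatever the list currently demands.

```ruby
# Assumption: minimum max-age (in seconds) required for preloading.
PRELOAD_MIN_AGE = 31_536_000

# Hypothetical parser for a Strict-Transport-Security header value.
def parse_hsts(header)
  directives = header.to_s.split(";").map { |d| d.strip.downcase }.reject(&:empty?)
  # max-age may optionally be quoted, e.g. max-age="31536000"
  raw_age = directives.map { |d| d[/\Amax-age="?(\d+)"?\z/, 1] }.compact.first

  result = {
    max_age: raw_age && raw_age.to_i,
    include_subdomains: directives.include?("includesubdomains"),
    preload: directives.include?("preload")
  }
  # A max-age of zero tells browsers to drop the policy
  result[:enabled] = !result[:max_age].nil? && result[:max_age] > 0
  result[:preload_ready] = result[:enabled] &&
                           result[:include_subdomains] &&
                           result[:preload] &&
                           result[:max_age] >= PRELOAD_MIN_AGE
  result
end

p parse_hsts("max-age=31536000; includeSubDomains; preload")
```

A header such as max-age=0 yields enabled: false, and a missing header (nil) yields max_age: nil with every flag false.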
HTTPS
- valid - whether the HTTPS response is valid
- return_code - the HTTPS error, if any
Sniffer
- cms - the CMS used, if any
- analytics - the analytics providers used, if any
- javascript - the JavaScript libraries used, if any
- advertising - the advertising providers used, if any
Adding your own check
Checks are special classes that are children of SiteInspector::Endpoint::Check. You can implement your own check like this:
class SiteInspector
  class Endpoint
    class Mention < Check
      def mentions_ben?
        endpoint.content.body =~ /ben/i
      end
    end
  end
end
This check can then be used as follows:
domain.canonical_endpoint.mention.mentions_ben?
Checks can call the endpoint object, which contains the request, response, and other checks. Custom checks are automatically exposed as endpoint methods.
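To try a custom check in isolation, for example in a unit test, you can hand it a stub endpoint. The sketch below runs without the gem installed. The stand-in Check base class, its constructor taking the endpoint, and the StubContent and StubPage structs are all assumptions made for illustration. This variant also returns a strict true or false instead of a match index or nil.

```ruby
# Stand-in base class so the sketch runs without the gem. With the real
# gem loaded, drop this section and keep only the Mention class.
# Assumption: a check is constructed with the endpoint it inspects.
class SiteInspector
  class Endpoint
    class Check
      attr_reader :endpoint

      def initialize(endpoint)
        @endpoint = endpoint
      end
    end
  end
end

class SiteInspector
  class Endpoint
    class Mention < Check
      # Strict boolean: true if the body mentions "ben", case-insensitively
      def mentions_ben?
        endpoint.content.body.to_s.match?(/ben/i)
      end
    end
  end
end

# Hypothetical stubs exposing only what the check needs: content.body
StubContent = Struct.new(:body)
StubPage = Struct.new(:content)

page = StubPage.new(StubContent.new("Hello from Ben"))
check = SiteInspector::Endpoint::Mention.new(page)
puts check.mentions_ben? # => true
```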
Contributing
Bootstrapping locally
- Clone down the repo
- script/bootstrap
Running tests
script/cibuild
Development console
script/console
How to contribute
- Fork the project
- Create a new, descriptively named feature branch
- Make your changes
- Submit a pull request