:white_check_mark: htmltest

If you generate HTML files, html-proofer might be the tool for you. If you can't be bothered with a Ruby environment or fancy something a bit faster, htmltest may be a better option.

:mag: htmltest runs your HTML output through a series of checks to ensure all your links, images, and script references work, your alt tags are filled in, et cetera.

:horse_racing: Faster? Yep, quite a bit actually. On a site with over 2000 files, htmlproofer took over three minutes while htmltest took 8.6 seconds. Both tools had full, valid caches.

:confused: Why make another tool? A mix of frustration with using htmlproofer/Ruby on large sites and needing a good project to get to grips with Go.

:floppy_disk: Installation

:penguin: Linux / :green_apple: macOS

This godownloader script will query GitHub for the latest release and download the correct binary for your platform into the directory set with the -b flag.

System-wide Install

curl https://htmltest.wjdp.uk | sudo bash -s -- -b /usr/local/bin

You'll be prompted for your password. Afterwards, simply run htmltest.

Into Current Directory

curl https://htmltest.wjdp.uk | bash

By default this installs htmltest into ./bin in your current directory; run it with bin/htmltest. This is well suited to CI environments.

More options

Run curl https://htmltest.wjdp.uk | bash -s -- -h for help text.

📦 Package Repos

We're available in some package repositories such as Homebrew and repos for certain distros. Do note the version available may not be the latest.

Windows

:arrow_down: Download the latest binary release and put it somewhere on your PATH.

:whale: Docker

Mount your directory of HTML files into the container and test them:

docker run -v $(pwd):/test --rm wjdp/htmltest

If you need to pass more arguments to the test, run it like this:

docker run -v $(pwd):/test --rm wjdp/htmltest -l 3 -s

🧾 Temporary Files

We store temporary files in tmp/.htmltest by default. This contains a log of output and a cache of external links, speeding up subsequent runs. You probably want to ignore it in version control, and perhaps cache it in your CI system.

:computer: Usage

htmltest - Test generated HTML for problems
           https://github.com/wjdp/htmltest

Usage:
  htmltest [options] [<path>]
  htmltest -v --version
  htmltest -h --help

Options:
  <path>                       Path to directory or file to test, if omitted we
                               attempt to read from .htmltest.yml.
  -c FILE, --conf FILE         Custom path to config file.
  -h, --help                   Show this text.
  -l LEVEL, --log-level LEVEL  Logging level, 0-3: debug, info, warning, error.
  -s, --skip-external          Skip external link checks, may shorten execution
                               time considerably.
  -v, --version                Show version and build time.

:microscope: What's Tested?

Many options of the following tests can be customised. Items marked :soon: are not checked yet, but will be soon.

What's Not

I'd like to test the following but won't be for a while.

:see_no_evil: Ignoring content

Add the data-proofer-ignore attribute to any tag, or add it as a class on the tag, to exclude that tag from every check. The name of this attribute can be customised.

<a href="http://notareallink" data-proofer-ignore>Not checked.</a>
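The ignore rule described above can be sketched in Go. This is only an illustration, not htmltest's actual implementation: the `shouldIgnore` helper and the attribute map are hypothetical, and a real checker would work on parsed HTML nodes rather than a plain map.

```go
package main

import (
	"fmt"
	"strings"
)

// shouldIgnore reports whether a tag should be skipped by every check.
// Hypothetical helper: attrs maps attribute names to values, and ignoreName
// is the configured ignore attribute (data-proofer-ignore by default).
func shouldIgnore(attrs map[string]string, ignoreName string) bool {
	// Case 1: the attribute is present on the tag, whatever its value.
	if _, ok := attrs[ignoreName]; ok {
		return true
	}
	// Case 2: the name appears as one of the tag's space-separated classes.
	for _, class := range strings.Fields(attrs["class"]) {
		if class == ignoreName {
			return true
		}
	}
	return false
}

func main() {
	link := map[string]string{
		"href":                "http://notareallink",
		"data-proofer-ignore": "",
	}
	fmt.Println(shouldIgnore(link, "data-proofer-ignore"))
}
```

Passing the attribute name as a parameter mirrors the fact that it is configurable.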

:bookmark_tabs: Caching

Checking external URLs can slow tests down and potentially annoy the URL's host. htmltest caches the status code of checked external URLs and stores this cache between runs. We write the cache to tmp/.htmltest/refcache.json and expire items after two weeks by default.
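As a rough sketch of how expiry might work, the Go snippet below treats a cached entry as usable while it is younger than the validity period. The `cachedRef` type and `isFresh` function are assumptions made for illustration; the real layout of refcache.json and htmltest's internal logic may differ.

```go
package main

import (
	"fmt"
	"time"
)

// cachedRef is a hypothetical cache entry: the status code of an external
// URL and the time it was last checked.
type cachedRef struct {
	StatusCode int
	LastSeen   time.Time
}

// isFresh reports whether an entry is still inside the validity period,
// meaning the URL would not need to be requested again.
func isFresh(ref cachedRef, now time.Time, expires time.Duration) bool {
	return now.Sub(ref.LastSeen) < expires
}

func main() {
	twoWeeks := 336 * time.Hour // the default expiry described above
	now := time.Now()

	recent := cachedRef{StatusCode: 200, LastSeen: now.Add(-24 * time.Hour)}
	stale := cachedRef{StatusCode: 200, LastSeen: now.Add(-400 * time.Hour)}

	fmt.Println(isFresh(recent, now, twoWeeks)) // true
	fmt.Println(isFresh(stale, now, twoWeeks))  // false
}
```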

:rainbow: Colour Output

By default, output in the TTY uses colours to indicate warnings, errors, and success. To turn off colourization, set an environment variable named NO_COLOR. If it is present, no colour will be used. The value is ignored. (See no-color.org.)
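For readers writing their own tooling around this convention, here is a minimal Go sketch of the check as described above: presence of the variable is what matters, not its value. The `colourEnabled` function is a hypothetical name, and this is not taken from htmltest's source.

```go
package main

import (
	"fmt"
	"os"
)

// colourEnabled follows the rule described above: if NO_COLOR is present in
// the environment, colour is turned off, whatever its value.
// The lookup function is injected so the rule can be exercised without
// touching the real environment; pass os.LookupEnv in normal use.
func colourEnabled(lookup func(string) (string, bool)) bool {
	_, present := lookup("NO_COLOR")
	return !present
}

func main() {
	if colourEnabled(os.LookupEnv) {
		// ANSI escape codes for green text, then reset.
		fmt.Println("\x1b[32mcolour on\x1b[0m")
	} else {
		fmt.Println("colour off")
	}
}
```

`os.LookupEnv` is used rather than `os.Getenv` because it distinguishes an unset variable from one set to an empty string.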

:fax: Logging

If you've got a lot of errors, reading them off a TTY may be difficult. We write errors to tmp/.htmltest/htmltest.log by default. The log level is set in the config file.

:wrench: Configuration

htmltest uses a YAML configuration file. Put .htmltest.yml in the same directory that you're running the tool from and you can just say htmltest to run your tests. You'll probably also want to cache the tmp/.htmltest directory.

Basic Options

| Option | Description | Default |
| :----- | :---------- | :------ |
| `DirectoryPath` | Directory to scan for HTML files. | |
| `DirectoryIndex` | The file to look for when linking to a directory. | `index.html` |
| `FilePath` | Single file to test within `DirectoryPath`, omit to test all. | |
| `FileExtension` | Extension of your HTML documents, includes the dot. If `FilePath` is set we use the extension from that. | `.html` |
| `CheckDoctype` | Enables checking the document type declaration. | `true` |
| `CheckAnchors` | Enables checking `<a…` tags. | `true` |
| `CheckLinks` | Enables checking `<link…` tags. | `true` |
| `CheckImages` | Enables checking `<img…` tags. | `true` |
| `CheckScripts` | Enables checking `<script…` tags. | `true` |
| `CheckMeta` | Enables checking `<meta…` tags. | `true` |
| `CheckGeneric` | Enables checking of other tags; see items marked with `checkGeneric` on the tags wiki page. | `true` |
| `CheckExternal` | Enables external reference checking; all tag types. | `true` |
| `CheckInternal` | Enables internal reference checking; all tag types. When disabled, this also prevents internal hash checking unless the reference contains only a hash fragment (`#heading`) and therefore refers to the current page. | `true` |
| `CheckInternalHash` | Enables internal hash/fragment checking. | `true` |
| `CheckMailto` | Enables (albeit quite basic) `mailto:` link checking. | `true` |
| `CheckTel` | Enables (albeit quite basic) `tel:` link checking. | `true` |
| `CheckFavicon` | Enables favicon checking, ensures every page has a favicon set. | `false` |
| `CheckMetaRefresh` | Enables checking meta refresh tags. | `true` |
| `EnforceHTML5` | Fails when the doctype isn't `<!DOCTYPE html>`. | `false` |
| `EnforceHTTPS` | Fails when encountering an `http://` link. Useful to prevent mixed content errors when serving over HTTPS. | `false` |
| `IgnoreURLs` | Array of regexes of URLs to ignore. | empty |
| `IgnoreInternalURLs` | Array of strings of internal URLs to ignore. Exact matches only. ⚠ Likely to be deprecated, use `IgnoreURLs` instead. | empty |
| `IgnoreHTTPS` | Array of regexes of URLs to ignore for `EnforceHTTPS`. These URLs are still tested, unless also present in `IgnoreURLs`. | empty |
| `IgnoreDirs` | Array of regexes of directories to ignore when scanning for HTML files. | empty |
| `IgnoreInternalEmptyHash` | When true prevents raising an error for links with `href="#"`. | `false` |
| `IgnoreEmptyHref` | When true prevents raising an error for links with `href=""`. | `false` |
| `IgnoreCanonicalBrokenLinks` | When true produces a warning, rather than an error, for broken canonical links. Canonical links will fail when testing a site which isn't live yet or before publishing a new page. | `true` |
| `IgnoreExternalBrokenLinks` | When true produces a warning, rather than an error, for broken external links. Useful when testing a site with hundreds of external links. | `false` |
| `IgnoreAltMissing` | Turns off image alt attribute checking. | `false` |
| `IgnoreAltEmpty` | Allows `alt=""` for decorative images. | `false` |
| `IgnoreDirectoryMissingTrailingSlash` | Turns off errors for links to directories without a trailing slash. | `false` |
| `IgnoreSSLVerify` | Turns off x509 errors for self-signed certificates. | `false` |
| `IgnoreTagAttribute` | Specify the ignore attribute. All tags with this attribute or with this class will be excluded from every check. | `"data-proofer-ignore"` |
| `HTTPHeaders` | Dictionary of headers to include in external requests. | `{"Range": "bytes=0-0", "Accept": "*/*"}` |
| `TestFilesConcurrently` | :warning: :construction: EXPERIMENTAL Turns on concurrent checking of files. | `false` |
| `DocumentConcurrencyLimit` | Maximum number of documents to process at once. | `128` |
| `HTTPConcurrencyLimit` | Maximum number of open HTTP connections. If you raise this number, ensure `ExternalTimeout` is suitably raised. | `16` |
| `LogLevel` | Logging level, 0-3: debug, info, warning, error. | `2` |
| `LogSort` | How to sort/present issues. Can be `seq` for sequential output or `document` to group by document. | `document` |
| `ExternalTimeout` | Number of seconds to wait on an HTTP connection before failing. | `15` |
| `RedirectLimit` | Allowed number of redirects. Use built-in behavior with negative values. | `-1` |
| `StripQueryString` | Enables stripping of query strings from external checks. | `true` |
| `StripQueryExcludes` | List of URLs to disable query stripping on. | `["fonts.googleapis.com"]` |
| `OutputDir` | Directory to store cache and log files in. Relative to executing directory. | `tmp/.htmltest` |
| `OutputCacheFile` | File within `OutputDir` to store reference cache. | `refcache.json` |
| `OutputLogFile` | File within `OutputDir` to store errors from the last test run. | `htmltest.log` |
| `CacheExpires` | Cache validity period, accepts go.time duration strings (…"m", "h"). | `336h` (two weeks) |
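To get a feel for which CacheExpires values are valid, the sketch below runs a few candidates through Go's standard `time.ParseDuration`. It assumes that htmltest parses the option this way, which the table implies but does not state outright; `parseExpiry` is a hypothetical wrapper name.

```go
package main

import (
	"fmt"
	"time"
)

// parseExpiry converts a CacheExpires-style string into a time.Duration
// using the standard library parser.
func parseExpiry(s string) (time.Duration, error) {
	return time.ParseDuration(s)
}

func main() {
	// "2w" is included on purpose: Go duration strings have no day or week
	// unit, so two weeks has to be written in hours ("336h").
	for _, s := range []string{"336h", "6h", "90m", "2w"} {
		d, err := parseExpiry(s)
		if err != nil {
			fmt.Printf("%q: invalid (%v)\n", s, err)
			continue
		}
		fmt.Printf("%q parses to %v\n", s, d)
	}
}
```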

Example

DirectoryPath: "_site"
EnforceHTTPS: true
IgnoreURLs:
- "example.com"
- "^/misc/js/script.js$"
IgnoreDirs:
- "lib"
CacheExpires: "6h"
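Because IgnoreURLs entries are regular expressions, the two patterns in the example behave differently: one is unanchored and the other is pinned with ^ and $. The Go sketch below shows the difference using the standard `regexp` package. It assumes patterns are matched unanchored against the whole reference, which may not reflect htmltest's exact behaviour, and `matchesAny` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesAny reports whether ref matches at least one of the patterns.
// Patterns match anywhere in ref unless they anchor themselves with ^ or $.
// MustCompile panics on an invalid pattern, which is acceptable in a sketch.
func matchesAny(patterns []string, ref string) bool {
	for _, p := range patterns {
		if regexp.MustCompile(p).MatchString(ref) {
			return true
		}
	}
	return false
}

func main() {
	patterns := []string{"example.com", "^/misc/js/script.js$"}

	refs := []string{
		"https://example.com/page", // contains the first pattern
		"/misc/js/script.js",       // exact match for the second pattern
		"/misc/js/script.js.map",   // rejected by the $ anchor
	}
	for _, ref := range refs {
		fmt.Printf("%-26s ignored=%v\n", ref, matchesAny(patterns, ref))
	}
}
```

Note that an unescaped dot matches any character, so "example.com" would also match "exampleXcom"; write "example\.com" if you need a literal dot.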

:loudspeaker: Issues? Suggestions?

Submit an issue.