CSV Validator

A Validation Tool and APIs for validating CSV (Comma Separated Values) files against a CSV Schema.

Released under the Mozilla Public Licence version 2.0.

A comprehensive user guide is available on GitHub Pages, along with a more complete specification of the CSV Schema language.

Technology

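The Validation Tool and APIs are written in Scala 2.13 and may be used as a command line tool, as a desktop application with a Swing GUI, or as a library in your own Java or Scala project.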

The Validation Tool and APIs can be used on any Java Virtual Machine that supports Java 8 or later (NB: Java 6 support was removed in version 1.1). The source code is built using the Apache Maven build tool:

  1. For use in other Java/Scala Applications, build by executing mvn clean install.
  2. For the Command Line Interface or Swing GUI, build by executing mvn clean package.

Maven Artifacts

Released Maven Artifacts can be found in Maven Central under the groupId uk.gov.nationalarchives.

Java API

If you wish to use the CSV Validator from your own Java project, we provide a native Java API. The dependency details are:

<dependency>
    <groupId>uk.gov.nationalarchives</groupId>
    <artifactId>csv-validator-java-api</artifactId>
    <version>1.3.0</version>
</dependency>

The Javadoc can be found in Maven Central, or you can build it locally by executing mvn javadoc:javadoc.

Example of using the CSV Validator through the Java API:

import java.util.ArrayList;
import java.util.List;

// CsvValidator, FailMessage, WarningMessage and Substitution
// are provided by the CSV Validator Java API.

Boolean failFast = false;
List<Substitution> pathSubstitutions = new ArrayList<Substitution>();

List<FailMessage> messages = CsvValidator.validate(
    "/data/csv/data.csv",
    "/data/csv/data-schema.csvs",
    failFast,
    pathSubstitutions,
    true,
    false);

if (messages.isEmpty()) {
    System.out.println("Completed validation OK");
} else {
    // Warnings and errors are returned in the same list; tell them apart by type
    for (FailMessage message : messages) {
        if (message instanceof WarningMessage) {
            System.out.println("[WARN] " + message.getMessage());
        } else {
            System.out.println("[ERROR] " + message.getMessage());
        }
    }
}
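
If you call the validator from a build step or batch job, you may want to turn the returned messages into a pass/fail result. The following is a minimal sketch of one way to do that, not part of the CSV Validator API. The classes Message and Warning are hypothetical stand-ins for FailMessage and WarningMessage, defined locally so that the example compiles without the csv-validator jars, and the rule that warnings alone do not fail the run is an assumption you may wish to change.

```java
import java.util.Arrays;
import java.util.List;

public class ValidationSummary {

    // Hypothetical stand-in for the library's FailMessage type.
    static class Message {
        private final String text;

        Message(String text) {
            this.text = text;
        }

        String getMessage() {
            return text;
        }
    }

    // Hypothetical stand-in for the library's WarningMessage type,
    // modelled here as a subtype of Message.
    static class Warning extends Message {
        Warning(String text) {
            super(text);
        }
    }

    // Counts the messages that are not warnings, i.e. the errors.
    static long countErrors(List<? extends Message> messages) {
        return messages.stream()
                .filter(m -> !(m instanceof Warning))
                .count();
    }

    // Returns 0 when there are no errors and 1 otherwise.
    // Assumption: warnings on their own do not fail the run.
    static int exitCode(List<? extends Message> messages) {
        return countErrors(messages) == 0 ? 0 : 1;
    }

    public static void main(String[] args) {
        // Made-up messages for illustration only.
        List<Message> messages = Arrays.asList(
                new Warning("example warning"),
                new Message("example error"));

        System.out.println("errors: " + countErrors(messages));
        System.out.println("exit code: " + exitCode(messages));
    }
}
```

In a real project you would pass the list returned by CsvValidator.validate to equivalent helpers written against FailMessage and WarningMessage, and hand the result to System.exit.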

Scala API

Likewise, if you wish to use the CSV Validator from your own Scala project, the Scala API is part of the core. The dependency details are:

<dependency>
    <groupId>uk.gov.nationalarchives</groupId>
    <artifactId>csv-validator-core</artifactId>
    <version>1.3.0</version>
</dependency>

The Scaladoc can be found in Maven Central, or you can build it locally by executing mvn scala:doc.

An example of using the Scala API can be found in the class uk.gov.nationalarchives.csv.validator.api.java.CsvValidatorJavaBridge from the csv-validator-java-api module. The Scala API at present gives much more control over the individual Schema Parsing and Validation Processor than the Java API.

Schema Examples

Examples of CSV Schema can be found in the test cases of the csv-validator-core module. See the *.csvs files in acceptance/. Schemas used by the Digital Preservation department at The National Archives are also available in the example-schemas folder of the csv-schema repository.

Current Limitations of the CSV Validator Tool

The CSV Validator implements almost all of the CSV Schema 1.2 (Draft) language. Current limitations and missing functionality are: