<p align="center"> <img src="docs/assets/CodableCSV.svg" alt="Codable CSV"/> </p> <p align="center"> <a href="https://swift.org/about/#swiftorg-and-open-source"><img src="docs/assets/badges/Swift.svg" alt="Swift 5.x"></a> <a href="https://github.com/dehesa/CodableCSV/wiki/Implicit-dependencies"><img src="docs/assets/badges/Apple.svg" alt="macOS 10.10+ - iOS 8+ - tvOS 9+ - watchOS 2+"></a> <a href="https://ubuntu.com"><img src="docs/assets/badges/Ubuntu.svg" alt="Ubuntu 18.04"></a> <a href="http://doge.mit-license.org"><img src="docs/assets/badges/License.svg" alt="MIT License"></a> </p>

CodableCSV provides:

Usage

To use this library, you need to:

<ul> <details><summary>Add <code>CodableCSV</code> to your project.</summary><p>

You can choose to add the library through SPM or Cocoapods:

</p></details> <details><summary>Import <code>CodableCSV</code> in the file that needs it.</summary><p>
import CodableCSV
</p></details> </ul>

There are two ways to use this library:

  1. imperatively, as a row-by-row and field-by-field reader/writer.
  2. declaratively, through Swift's Codable interface.

Imperative Reader/Writer

The following types provide imperative control on how to read/write CSV data.

<ul> <details><summary><code>CSVReader</code></summary><p>

A CSVReader parses CSV data from a given input (String, Data, URL, or InputStream) and returns each CSV row as an array of strings. CSVReader can be used at a high level, in which case it parses the input completely, or at a low level, in which case each row is decoded when requested.

Reader Configuration

CSVReader accepts the following configuration properties:

The configuration values are set during initialization and can be passed to the CSVReader instance through a structure or with a convenience closure syntax:

let reader = CSVReader(input: ...) {
    $0.encoding = .utf8
    $0.delimiters.row = "\r\n"
    $0.headerStrategy = .firstLine
    $0.trimStrategy = .whitespaces
}
</p></details> <details><summary><code>CSVWriter</code></summary><p>

A CSVWriter encodes CSV information into a specified target (i.e. a String, Data, or a file). It can be used at a high level, by encoding a prepared set of information in full, or at a low level, in which case rows or fields are written individually.

Writer Configuration

CSVWriter accepts the following configuration properties:

The configuration values are set during initialization and can be passed to the CSVWriter instance through a structure or with a convenience closure syntax:

let writer = CSVWriter(fileURL: ...) {
    $0.delimiters.row = "\r\n"
    $0.headers = ["Name", "Age", "Pet"]
    $0.encoding = .utf8
    $0.bomStrategy = .never
}
</p></details> <details><summary><code>CSVError</code></summary><p>

Many of CodableCSV's imperative functions may throw errors due to invalid configuration values, invalid CSV input, file stream failures, etc. All these throwing operations exclusively throw CSVErrors, which can be easily caught with a do-catch clause.

do {
    let writer = try CSVWriter()
    for row in customData {
        try writer.write(row: row)
    }
} catch let error {
    print(error)
}

CSVError adopts Swift Evolution's SE-0112 protocols and CustomDebugStringConvertible. The error's properties provide rich commentary explaining what went wrong and indicating how to fix the problem.

<br>You can get all the information by simply printing the error or reading the localizedDescription property on a properly cast CSVError<CSVReader> or CSVError<CSVWriter>.

</p></details> </ul>

Declarative Decoder/Encoder

The encoders/decoders provided by this library let you use Swift's Codable declarative approach to encode/decode CSV data.

<ul> <details><summary><code>CSVDecoder</code></summary><p>

CSVDecoder transforms CSV data into a Swift type conforming to Decodable. The decoding process is simple: create a decoder instance and call its decode function, passing the Decodable type and the input data.

let decoder = CSVDecoder()
let result = try decoder.decode(CustomType.self, from: data)

CSVDecoder can decode CSVs represented as a Data blob, a String, an actual file in the file system, or an InputStream (e.g. stdin).

let decoder = CSVDecoder { $0.bufferingStrategy = .sequential }
let content = try decoder.decode([Student].self, from: URL(fileURLWithPath: ("~/Desktop/Student.csv" as NSString).expandingTildeInPath))

If you are dealing with a big CSV file, prefer direct file decoding with a .sequential or .unrequested buffering strategy, and set presampling to false. This drastically reduces memory usage.

Decoder Configuration

The decoding process can be tweaked by specifying configuration values at initialization time. CSVDecoder accepts the same configuration values as CSVReader plus the following ones:

The configuration values can be set during CSVDecoder initialization or at any point before the decode function is called.

let decoder = CSVDecoder {
    $0.encoding = .utf8
    $0.delimiters.field = "\t"
    $0.headerStrategy = .firstLine
    $0.bufferingStrategy = .keepAll
    $0.decimalStrategy = .custom({ (decoder) in
        let value = try Float(from: decoder)
        return Decimal(value)
    })
}
</p></details> <details><summary><code>CSVDecoder.Lazy</code></summary><p>

A CSV input can be decoded on demand (i.e. row-by-row) with the decoder's lazy(from:) function.

let decoder = CSVDecoder(configuration: config).lazy(from: fileURL)
let student1 = try decoder.decodeRow(Student.self)
let student2 = try decoder.decodeRow(Student.self)

CSVDecoder.Lazy conforms to Swift's Sequence protocol, letting you use functionality such as map(), allSatisfy(), etc. Please note that CSVDecoder.Lazy cannot be used for repeated access; it consumes the input CSV.

let decoder = CSVDecoder().lazy(from: fileData)
let students = try decoder.map { try $0.decode(Student.self) }
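
The "consumes the input" caveat is ordinary single-pass Sequence behavior. The following is a minimal sketch using only the standard library; `RowStream` is a hypothetical stand-in, not a CodableCSV type, and it only illustrates why a second pass over a consuming sequence yields nothing.

```swift
// Hypothetical single-pass row source (not part of CodableCSV).
// Being both Sequence and IteratorProtocol, it acts as its own iterator.
final class RowStream: Sequence, IteratorProtocol {
    private var remaining: ArraySlice<String>

    init(_ rows: [String]) {
        self.remaining = rows[...]
    }

    // Hands out the next row and drops it from the pending rows.
    func next() -> String? {
        remaining.popFirst()
    }
}

let stream = RowStream(["John,22,true", "Marine,23,false"])
// The first traversal sees every row.
let firstPass = stream.map { $0.uppercased() }
// The rows are gone by now, so a second traversal is empty.
let secondPass = Array(stream)
```

If the decoded rows are needed more than once, collect them into an array during the first pass.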

A nice benefit of the lazy operation is that it lets you switch how a row is decoded at any point. For example:

let decoder = CSVDecoder().lazy(from: fileString)
// The first 100 rows are students.
let students = try (  0..<100).map { _ in try decoder.decodeRow(Student.self) }
// The next 10 rows are teachers.
let teachers = try (100..<110).map { _ in try decoder.decodeRow(Teacher.self) }

Since CSVDecoder.Lazy exclusively provides sequential access, setting the buffering strategy to .sequential will reduce the decoder's memory usage.

let decoder = CSVDecoder {
    $0.headerStrategy = .firstLine
    $0.bufferingStrategy = .sequential
}.lazy(from: fileURL)
</p></details> <details><summary><code>CSVEncoder</code></summary><p>

CSVEncoder transforms Swift types conforming to Encodable into CSV data. The encoding process is simple: create an encoder instance and call its encode function, passing the Encodable value.

let encoder = CSVEncoder()
let data = try encoder.encode(value, into: Data.self)

The Encoder's encode() function creates a CSV file as a Data blob, a String, or an actual file in the file system.

let encoder = CSVEncoder { $0.headers = ["name", "age", "hasPet"] }
try encoder.encode(value, into: URL(fileURLWithPath: ("~/Desktop/Students.csv" as NSString).expandingTildeInPath))

If you are dealing with a large amount of CSV content, prefer direct file encoding with a .sequential or .assembled buffering strategy. This drastically reduces memory usage.

Encoder Configuration

The encoding process can be tweaked by specifying configuration values. CSVEncoder accepts the same configuration values as CSVWriter plus the following ones:

The configuration values can be set during CSVEncoder initialization or at any point before the encode function is called.

let encoder = CSVEncoder {
    $0.headers = ["name", "age", "hasPet"]
    $0.delimiters = (field: ";", row: "\r\n")
    $0.dateStrategy = .iso8601
    $0.bufferingStrategy = .sequential
    $0.floatStrategy = .convert(positiveInfinity: "∞", negativeInfinity: "-∞", nan: "≁")
    $0.dataStrategy = .custom({ (data, encoder) in
        let string = customTransformation(data)
        var container = try encoder.singleValueContainer()
        try container.encode(string)
    })
}

The .headers configuration is required if you are using a keyed encoding container.

</p></details> <details><summary><code>CSVEncoder.Lazy</code></summary><p>

A series of codable types (representing CSV rows) can be encoded on demand with the encoder's lazy(into:) function.

let encoder = CSVEncoder().lazy(into: Data.self)
for student in students {
    try encoder.encodeRow(student)
}
let data = try encoder.endEncoding()

Call endEncoding() once there are no more values to be encoded. The function will return the encoded CSV.

let encoder = CSVEncoder().lazy(into: String.self)
try students.forEach {
    try encoder.encodeRow($0)
}
let string = try encoder.endEncoding()

A nice benefit of the lazy operation is that it lets you switch how a row is encoded at any point. For example:

let encoder = CSVEncoder(configuration: config).lazy(into: fileURL)
try students.forEach { try encoder.encodeRow($0) }
try teachers.forEach { try encoder.encodeRow($0) }
try encoder.endEncoding()

Since CSVEncoder.Lazy exclusively provides sequential encoding, setting the buffering strategy to .sequential will reduce the encoder's memory usage.

let encoder = CSVEncoder {
    $0.bufferingStrategy = .sequential
}.lazy(into: String.self)
</p></details> </ul>

Tips using Codable

Codable is fairly easy to use and most Swift standard library types already conform to it. However, it is sometimes tricky to make custom types conform to Codable for specific functionality.

<ul> <details><summary>Basic adoption.</summary><p>

When a custom type conforms to Codable, the type is stating that it has the ability to decode itself from and encode itself to an external representation. Which representation depends on the decoder or encoder chosen. Foundation provides support for JSON and Property Lists, and the community provides many other formats, such as YAML, XML, BSON, and CSV (through this library).

Usually a CSV represents a long list of entities. The following is a simple example representing a list of students.

let string = """
    name,age,hasPet
    John,22,true
    Marine,23,false
    Alta,24,true
    """

A student can be represented as a structure:

struct Student: Codable {
    var name: String
    var age: Int
    var hasPet: Bool
}

To decode the list of students, create a decoder and call decode on it passing the CSV sample.

let decoder = CSVDecoder { $0.headerStrategy = .firstLine }
let students = try decoder.decode([Student].self, from: string)

The inverse process (from Swift to CSV) is very similar (and simple).

let encoder = CSVEncoder { $0.headers = ["name", "age", "hasPet"] }
let newData = try encoder.encode(students)
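
To illustrate that the representation depends only on the encoder/decoder chosen, here is a small sketch that round-trips the same `Student` type through Foundation's `JSONEncoder` and `JSONDecoder`. It does not use CodableCSV, and the `Equatable` conformance is added only so the round trip can be checked.

```swift
import Foundation

struct Student: Codable, Equatable {
    var name: String
    var age: Int
    var hasPet: Bool
}

let original = [
    Student(name: "John", age: 22, hasPet: true),
    Student(name: "Marine", age: 23, hasPet: false)
]

// The type's Codable conformance is format-agnostic:
// swapping the encoder/decoder pair changes the output format.
let json = try JSONEncoder().encode(original)
let restored = try JSONDecoder().decode([Student].self, from: json)
```

The `Student` declaration is untouched by the choice of format; only the encoder and decoder instances differ.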
</p></details> <details><summary>Specific behavior for CSV data.</summary><p>

When encoding/decoding CSV data, it is important to keep several points in mind:

</p> <ul> <details><summary><code>Codable</code>'s automatic synthesis requires CSV files with a headers row.</summary><p>

Codable is able to synthesize init(from:) and encode(to:) for your custom types when all their members/properties conform to Codable. This automatic synthesis creates a hidden CodingKeys enumeration containing all your property names.

During decoding, CSVDecoder tries to match the enumeration string values with a field position within a row. For this to work the CSV data must contain a headers row with the property names. If your CSV doesn't contain a headers row, you can specify coding keys with integer values representing the field index.

struct Student: Codable {
    var name: String
    var age: Int
    var hasPet: Bool

    private enum CodingKeys: Int, CodingKey {
        case name = 0
        case age = 1
        case hasPet = 2
    }
}

Using integer coding keys has the added benefit of better encoder/decoder performance. By explicitly indicating the field index, you let the decoder skip the functionality of matching coding keys string values to headers.
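
As a rough illustration of what an integer-backed key exposes, the sketch below uses only the standard library. The `CodingKeys` enum is left internal (rather than private) purely so it can be inspected here. How CodableCSV consumes these values internally is not shown.

```swift
struct Student: Codable {
    var name: String
    var age: Int
    var hasPet: Bool

    // Internal rather than private so the keys can be inspected below.
    enum CodingKeys: Int, CodingKey {
        case name = 0
        case age = 1
        case hasPet = 2
    }
}

// An Int-backed CodingKey reports its raw value through `intValue`,
// so a decoder can read the field index without a header lookup.
let index = Student.CodingKeys.hasPet.intValue
// The reverse mapping, from field index to key, is also available.
let key = Student.CodingKeys(intValue: 1)
```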

</p></details> <details><summary>A CSV is a long list of rows/records.</summary><p>

CSV formatted data is commonly used with flat hierarchies (e.g. a list of students, a list of car models, etc.). Nested structures, such as the ones found in JSON files, are not supported by default in CSV implementations (e.g. a list of users, where each user has a list of services she uses, and each service has a list of the user's configuration values).

You can support complex structures in CSV, but you would have to flatten the hierarchy in a single model or build a custom encoding/decoding process. This process would make sure there is always a maximum of two keyed/unkeyed containers.

As an example, we can create a nested structure for a school with students who own pets.

struct School: Codable {
    let students: [Student]
}

struct Student: Codable {
    var name: String
    var age: Int
    var pet: Pet
}

struct Pet: Codable {
    var nickname: String
    var gender: Gender

    enum Gender: Codable {
        case male, female
    }
}

By default the previous example wouldn't work. If you want to keep the nested structure, you need to replace the synthesized init(from:) with a custom implementation (to support Decodable).

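extension School {
    init(from decoder: Decoder) throws {
        var container = try decoder.unkeyedContainer()
        // Collect the rows locally, since `students` is a constant property.
        var students: [Student] = []
        while !container.isAtEnd {
            students.append(try container.decode(Student.self))
        }
        self.students = students
    }
}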

extension Student {
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CustomKeys.self)
        self.name = try container.decode(String.self, forKey: .name)
        self.age = try container.decode(Int.self, forKey: .age)
        // The pet's fields live in the same row, so decode it from the same decoder.
        self.pet = try decoder.singleValueContainer().decode(Pet.self)
    }
}

extension Pet {
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CustomKeys.self)
        self.nickname = try container.decode(String.self, forKey: .nickname)
        self.gender = try container.decode(Gender.self, forKey: .gender)
    }
}

extension Pet.Gender {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        self = try container.decode(Int.self) == 1 ? .male : .female
    }
}

private enum CustomKeys: Int, CodingKey {
    case name = 0
    case age = 1
    case nickname = 2
    case gender = 3
}

You could have avoided the overhead of writing these initializers by defining a flat structure such as:

struct Student: Codable {
    var name: String
    var age: Int
    var nickname: String
    var gender: Gender

    enum Gender: Int, Codable {
        case male = 1
        case female = 2
    }
}
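
If the rest of the app still prefers the nested model, one option is to keep a flat type just for CSV and convert at the boundary. The sketch below uses only the standard library. `NestedStudent`, `FlatStudent`, and the `nested` property are hypothetical names introduced for illustration.

```swift
struct Pet {
    var nickname: String
    var gender: Gender

    enum Gender: Int, Codable {
        case male = 1
        case female = 2
    }
}

// Hypothetical nested model used by the rest of the app.
struct NestedStudent {
    var name: String
    var age: Int
    var pet: Pet
}

// Hypothetical flat model: one property per CSV field.
struct FlatStudent: Codable, Equatable {
    var name: String
    var age: Int
    var nickname: String
    var gender: Pet.Gender
}

extension FlatStudent {
    // Flattens the nested model into a single row-shaped value.
    init(_ student: NestedStudent) {
        self.init(name: student.name, age: student.age,
                  nickname: student.pet.nickname, gender: student.pet.gender)
    }

    // Rebuilds the nested model from the flat row.
    var nested: NestedStudent {
        NestedStudent(name: name, age: age,
                      pet: Pet(nickname: nickname, gender: gender))
    }
}

let john = NestedStudent(name: "John", age: 22,
                         pet: Pet(nickname: "Rex", gender: .male))
let row = FlatStudent(john)
let roundTripped = row.nested
```

This keeps the synthesized Codable conformance on the flat type, so no custom initializers are needed.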
</p></details> </ul> </details> <details><summary>Encoding/decoding strategies.</summary><p>

Swift Evolution proposal SE-0167 introduced JSON and property list encoders/decoders to Foundation. The proposal also featured encoding/decoding strategies as a new way to configure the encoding/decoding process. CodableCSV continues this tradition and mirrors those strategies, adding some new ones specific to the CSV file format.

To configure the encoding/decoding process, you need to set the configuration values of the CSVEncoder/CSVDecoder before calling the encode()/decode() functions. There are two ways to set configuration values:

The strategies labeled with .custom let you insert behavior into the encoding/decoding process without forcing you to manually implement init(from:) and encode(to:). When set, they apply to the targeted type for the whole process. For example, if you want to encode a CSV file where empty fields are marked with the word null (for some reason), you could do the following:

let encoder = CSVEncoder()
encoder.nilStrategy = .custom({ (encoder) in
    var container = encoder.singleValueContainer()
    try container.encode("null")
})
</p></details> <details><summary>Type-safe headers row.</summary><p>

You can generate type-safe header names using Swift introspection tools (i.e. Mirror) or by explicitly defining a CodingKeys enum with String raw values that conforms to CaseIterable.

struct Student: Codable {
    var name: String
    var age: Int
    var hasPet: Bool

    enum CodingKeys: String, CodingKey, CaseIterable {
        case name, age, hasPet
    }
}

Then configure your encoder with explicit headers.

let encoder = CSVEncoder {
    $0.headers = Student.CodingKeys.allCases.map { $0.rawValue }
}
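
The Mirror route can be sketched with the standard library alone. This assumes the property names are exactly the header names you want. Mirror reflects an instance, so a throwaway value is needed.

```swift
struct Student {
    var name: String
    var age: Int
    var hasPet: Bool
}

// Mirror needs an instance; the values themselves are irrelevant here.
let sample = Student(name: "", age: 0, hasPet: false)

// Stored properties are reported in declaration order.
let headers = Mirror(reflecting: sample).children.compactMap { $0.label }
```

The resulting array can then be assigned to the encoder's headers configuration. The CaseIterable approach avoids the placeholder instance and stays correct when coding keys are renamed.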
</p></details> <details><summary>Performance advice.</summary><p>

#warning("TODO:")

</p></details> </ul>

Roadmap

<p align="center"> <img src="docs/assets/Roadmap.svg" alt="Roadmap"/> </p>

The library has been heavily documented and any contribution is welcome. Check the short How to contribute document or take a look at the GitHub projects for a more in-depth roadmap.

Community

If CodableCSV is not to your liking, the Swift community offers other CSV solutions:

There are many good tools outside the Swift community. Since listing them all would be a hard task, I will just point you to the great AwesomeCSV GitHub repo. There are a lot of treasures to be found there.