literate-programming-lib

Write your code anywhere and in any order with as much explanation as you like. literate-programming will weave it all together to produce your project.

This is a modification and an implementation of Knuth's Literate Programming technique. It is perhaps most in line with noweb.

It uses markdown as the basic document format, with the code to be woven together being markdown code blocks. GitHub flavored code fences can also be used to demarcate code blocks. In particular, commonmark is the spec used for parsing the markdown. Anything considered code by it will be considered code by literate programming.

This processing does not care what language(s) you are programming in, though it may skew towards being more useful for the web stack.

This is the core library that is used as a module. See -cli for the command line client. The full version has a variety of useful standard plugins ("batteries included").

Installation

This requires node.js and npm to be installed. See nvm for a recommended installation of node; it allows one to toggle between different versions. This has been tested on node.js 0.10, 0.12, and io.js. It is basic javascript and should work on pretty much any javascript engine.

Then issue the command:

npm install literate-programming-lib

Since this is the library module, typically you install the command line client and do not install the lib directly. If you are hacking with modules, then you already know that you will want this in the package.json file.

Using as a module

You can use Folder = require('literate-programming-lib'); to get a constructor that will create what I think of as a folder. The folder will handle all the documents, scopes, etc.

To actually use this library (as opposed to the command line client), you need to establish how it fetches documents and tell it how to save documents. An example is below. If you just want to compile some documents, use the command line client and ignore this. Just saying the following is not pretty. At least, not yet!

The thing to keep in mind is that this library is structured around events using my event-when library. The variable gcd is the event emitter (dispatcher if you will).

var fs = require('fs');
var Folder = require('literate-programming-lib');
var folder = new Folder();
var gcd = folder.gcd;
var colon = folder.colon;

gcd.on("need document", function (rawname) {
    var safename = colon.escape(rawname);
    fs.readFile(rawname, {encoding:'utf8'},  function (err, text) {
        if (err) {
            gcd.emit("error:file not found:" + safename);
        } else {
            folder.newdoc(safename, text);
        }
    });
});

gcd.on("file ready", function(text, evObj) {
    var filename = evObj.pieces[0]; 
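    // fs.writeFile is asynchronous and expects a callback
    fs.writeFile(filename, text, function (err) {
        if (err) {
            console.error(err);
        }
    });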
});

gcd.emit("need document:first.md");

This last line should start the whole chain of compilation with first.md being read in and then any of its files being called, etc., and then any files to save will get saved.

The reason the lib does not have this natively is that I separated it out specifically to avoid requiring file system access. Instead you can use any kind of function that provides text, or whatever. It should be fine to also use folder.newdoc directly on each bit of text as needed; everything will patiently wait until the right stuff is ready. I think.

Note that live code can be run from a literate program as well. So be careful!

Example

Let's give a quick example of what a sample text might look like.

# Welcome

So you want to make a literate program? Let's have a program that outputs
all numbers from 1 to 10.

Let's save it in file count.js

[count.js](#Structure "save:")

## Structure 

We have some initial setup. Then we will generate the array of numbers. We
end with outputting the numbers. 

    var numarr = [], start=1, end = 11, step = 1;

    _"Loop"

    _"Output"

## Output 

At this point, we have the array of numbers. Now we can join them with a
comma and output that to the console.

    console.log("The numbers are: ", numarr.join(", ") );

## Loop

Set the loop up and push the numbers onto it. 

    var i;
    for (i = start; i < end; i += step) {
        numarr.push(i);
    }

A full example of a literate program is lp.md in this repository. It compiles to this library.
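
For orientation, the count.js saved by the sample above should look roughly like the following once the Loop and Output blocks are substituted into Structure. This is a hand-woven sketch: the exact whitespace the compiler emits is an assumption, and the comments are added here only to show where each piece came from.

```javascript
// Hand-woven sketch of count.js: the Structure block with _"Loop" and
// _"Output" replaced by the code from those sections.
var numarr = [], start = 1, end = 11, step = 1;

// from the Loop section
var i;
for (i = start; i < end; i += step) {
    numarr.push(i);
}

// from the Output section
console.log("The numbers are: ", numarr.join(", ") );
```

Note that the Output section appears before Loop in the document but after it in the woven file; the order of the substitutions in Structure decides.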

Document syntax

A literate program is a markdown document with some special conventions.

The basic idea is that each header line (regardless of level, either atx # or setext underline) demarcates a full block. Code blocks within a full block are the bits that are woven together.

Code Block

Each code block can contain whatever kind of code, but there is a primary special syntax.

_"Block name" This tells the compiler to compile the block with "Block name" and then replace the _"Block name" with that code.

Note that the allowed quotes are double, single, and backtick. Matching types are expected. And yes, it is useful to have three different types.

The full syntax is of the form _"scope name::block name:minor block name | cmd arg 1, arg 2 | cmd2 |cmd3 ..." where the scope name allows us to refer to other documents (or artificial common scopes) and the commands pipe the output of one into the input of the next, also taking in arguments which could themselves be block substitutions.
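
To make the pieces of that syntax concrete, here is a small sketch that splits the inside of a substitution into scope, block, minor block, and commands. The helper parseReference and the scope name utils are hypothetical and not part of the library; the sketch ignores backslash escapes and quoted arguments, which the real parser has to handle.

```javascript
// Hypothetical helper, not part of the library: split the inside of a
// substitution such as  scope::block:minor | cmd a, b | cmd2  into parts.
// Backslash escapes and quoted block arguments are ignored for brevity.
function parseReference(text) {
    const [target, ...pipes] = text.split("|").map((s) => s.trim());
    const scopeSplit = target.split("::");
    const scope = scopeSplit.length > 1 ? scopeSplit[0].trim() : null;
    const rest = scopeSplit[scopeSplit.length - 1];
    const colon = rest.indexOf(":");
    const block = (colon === -1 ? rest : rest.slice(0, colon)).trim();
    const minor = colon === -1 ? null : rest.slice(colon + 1).trim();
    const commands = pipes.filter((p) => p.length > 0).map((p) => {
        const space = p.search(/\s/);
        const name = space === -1 ? p : p.slice(0, space);
        const args = space === -1 ? [] :
            p.slice(space + 1).split(",").map((a) => a.trim());
        return { name, args };
    });
    return { scope, block, minor, commands };
}

const parsed = parseReference("utils::Loopy:outer loop | sub A, B | jshint");
console.log(parsed);
```

With an input such as ":first" the same helper yields an empty block name and the minor block first, which matches the idea of a minor block reference within the current block.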

Note that one can also backslash escape the underscore. To have multiple escapes (to allow for multiple compiling), one can use \#_" where the number gets decremented by one on each compile and, when it is compiled with a 0 there, the sub finally gets run.
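
The numbered escapes can be pictured with a short sketch of what one compile pass does to them. This is an assumed model for illustration, not the library's code, and the function name decrementEscapes is made up. It assumes a plain backslash escape simply loses its backslash after one pass.

```javascript
// Sketch of the escape bookkeeping for one compile pass (an assumed model,
// not the library's code): \2_" becomes \1_", \1_" becomes \0_", and a
// plain \_" loses its backslash. A \0_" is left alone here because on
// that pass the real compiler would run the substitution instead.
function decrementEscapes(text) {
    return text.replace(/\\(\d*)_(["'`])/g, function (match, digits, quote) {
        if (digits === "") {
            return "_" + quote;        // plain escape: unescape once
        }
        const n = Number(digits);
        if (n === 0) {
            return match;              // ready to be substituted this pass
        }
        return "\\" + (n - 1) + "_" + quote;
    });
}

const once = decrementEscapes('\\2_":second" and \\1_":first" and \\_":final"');
console.log(once);
```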

A block of the form _":first" would look for a minor block, i.e., a block that has been created by a switch directive. See next section.

One can also visually hide parts of the document, without it being hidden to the compiler, by using html comments. If the start of a line is <!--+ then it will strip that and the next occurrence of --> before doing the markdown compiling.
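
A rough sketch of that stripping step, assuming it can be modeled with a single regular expression (the function name revealHidden is hypothetical and the library may do this differently):

```javascript
// Assumed model of the comment stripping: when a line starts with <!--+,
// drop that marker and the next --> so the enclosed text reaches the
// markdown parser. Ordinary <!-- ... --> comments are left untouched.
function revealHidden(markdown) {
    return markdown.replace(/^<!--\+([\s\S]*?)-->/gm, "$1");
}

const source = "Intro\n<!--+\n    var hidden = 1;\n-->\nOutro";
console.log(revealHidden(source));
```

The indented line inside the comment is then seen by the markdown parser as an ordinary code block, even though a rendered view of the document hides it.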

Directive

A directive is a command that interacts with external input/output. Just about every literate program has at least one save directive that will save some compiled block to a file.

The syntax for the save directive is

[file.ext](#name-the-heading "save: encoding | pipe commands")  

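where file.ext is the name of the file to save to, name-the-heading is the heading of the block whose compiled text is saved, encoding is the encoding used to write the file, and pipe commands are applied to the text before saving.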

For other directives, what the various parts mean depends, but it is always

[some](#stuff "dir: whatever")  

where the dir should be replaced with a directive name. If dir is absent, but the colon is there, then this demarcates a minor block start.

Pipes

One can also use pipes to pipe the compiled text through a command to do something to it. For example, _"Some JS code | jshint" will take the code in block some JS code and pipe it into the jshint command, which can be a thin wrapper for the jshint module that reports errors to the console. That command would then return the text untouched. We can also use pipe commands to modify the text.
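
The piping idea can be sketched as a left-to-right reduction over a table of commands. Everything below is a toy: the command names upper, wrap, and count, and the helper runPipes, are made up for illustration, and the real library runs commands through its event system rather than a plain loop.

```javascript
// Toy command table; the names are hypothetical.
const commands = {
    upper: (input) => input.toUpperCase(),
    wrap: (input, args) => args[0] + input + args[1],
    // A pass-through command in the spirit of the jshint wrapper:
    // inspect the text, then hand it on untouched.
    count: (input) => { console.log("length:", input.length); return input; }
};

// Feed the text through each command in turn, left to right.
function runPipes(text, pipeString) {
    return pipeString.split("|")
        .map((piece) => piece.trim())
        .filter((piece) => piece.length > 0)
        .reduce(function (current, piece) {
            const [name, ...rest] = piece.split(/\s+/);
            const args = rest.join(" ").split(",").map((a) => a.trim());
            return commands[name](current, args);
        }, text);
}

const result = runPipes("var a = 1;", "count | upper | wrap [, ]");
console.log(result);
```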

Commands can be used in block substitutions, minor block directive switches, and other directives that are set up to use them, such as the save and out directives:
[code.js](#some-js-code "save: | jstidy") will tidy up the code before storing it in the file code.js.

If you want your own directive to process pipes, see the save directive in lp.md. Pay particular attention to the "process" and "deal with start" minor blocks. The functionality of pipe parsing is in the doc.pipeParsing command, but there are events that need to be respected in the setup.

Commands take arguments separated by commas, and commands end with pipes or the block naming quote. One can also use a named code block as an argument, using any of the quote marks (the same as or different from those surrounding the block name). To escape commas, quotes, pipes, underscores, spaces (spaces get trimmed from the beginning and ending of an argument), or newlines, one can use a backslash, which also escapes itself. Note that the commonmark parser will escape all backslash-punctuation combinations outside of code blocks. So you may need a double backslash in directive command pipings.

You can also use \n to put a newline in line or \u... where the ... is a unicode codepoint per the JavaScript spec, as implemented by String.fromCodePoint.
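
As a reminder of what String.fromCodePoint does in plain JavaScript (this shows only the standard function, not how the library parses the \u... form):

```javascript
// String.fromCodePoint builds a string from a numeric Unicode code point.
const smiley = String.fromCodePoint(0x263A);   // WHITE SMILING FACE
const astral = String.fromCodePoint(0x1F600);  // beyond the 16-bit range
// Code points above 0xFFFF occupy two UTF-16 code units.
console.log(smiley, astral, astral.length);
```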

Minor Block

Finally, you can use distinct code blocks within a full block. If you simply have multiple code blocks with none of the switching syntax below, then they will get concatenated into a single code block.

You can also switch to have what I call minor blocks within a main heading. This is mainly used for small bits that are just pushed out of the way for convenience. A full heading change is more appropriate for something that merits separate attention.

To create a minor block, one can use a link of the form [code name]() or [code name](#whatever ":|cmd ..."). Note this is a bit of a break from earlier versions in which a link on its own line would create a minor block. Now it depends purely on the form of the link and not on its placement.

Example: Let's say in heading block ### Loopy we have [outer loop]() Then it will create a code block that can be referenced by _"Loopy:outer loop".

Note: If the switch syntax is [](#... ":|...") then this just transforms whatever is pointed to in the href using the pipe commands. That is, it is not a switch, but fills in a gap for main blocks not having pipe switch syntax. The key is the empty link text.

Templating

One use of minor blocks is as a templating mechanism.

## Top

After the first compile, the numbers will be decremented, but the blocks
will not be evaluated.

    \1_":first"

    \2_":second"
    
    \1_":final"


This is now a template. We could use it as

[jack](# "store:| compile basic ")

[happy.txt](#jack "save:| compile great")
[sad.txt](# "save:| compile basic | compile grumpy")


# Basic

[first]()
    
    Greetings and Salutations

[final]()

    Sincerely,
    Jack

# Great

[second]()

    You are great.

# Grumpy

[second]()

    You are grumpy.

# Middle

[second]()

    You are okay.

## Another

    \_":first"

    \_"$2:second"
    
    \_":final"

[middle.txt](# "save:| sub $2, middle | compile basic")

This would produce the files:

happy.txt

Greetings and Salutations

You are great.

Sincerely,
Jack

sad.txt

Greetings and Salutations

You are grumpy.

Sincerely,
Jack

middle.txt

Greetings and Salutations

You are okay.

Sincerely,
Jack

Note that you need to be careful about feeding in the escaped commands into other parsers. For example, I was using Pugs to generate HTML structure and then using this templating to inject content (using markdown). Well, Pugs escapes quotes and this was causing troubles. So I used backticks to delimit the block name instead of quotes and it worked fine. Be flexible.

Nifty parts of writing literate programming

I also like to use it to compile an entire project from a single file, pulling in other literate program files as needed. That is, one can have a command-and-control literate program file and a bunch of separate files for separate concerns. But note that you need not split the project in any pre-defined way. For example, if designing a web interface, you can organize the files by widgets, mixing HTML, CSS, and JS in a single file whose purpose is clear. Then the central file can pull it all into a single web page (or many) as well as save the CSS and JS to their own files as per the recommendation, lessing the CSS, transpiling ES6, linting, and minifying all as desired. Or you could just write each output file separately in its own litpro document.

It's all good. You decide the order and grouping. The structure of your litpro documents is up to you and is independent of the needed structures of the output.

Directives vs commands vs subcommand

Directives affect the flow of the literate program itself, such as defining commands, saving output, or directly storing values. Commands transform incoming text or other input. Subcommands create useful arguments to commands.

Directives can be thought of as procedures, commands as methods on the input, and subcommands as functions. And indeed, directives do not compose in the sense of returning a value. Commands are written like the chain syntax, with what is on the left being evaluated first. Subcommands are written with typical function syntax, with what is on the right being evaluated first.
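
The difference in evaluation order has a direct analogy in plain JavaScript. The snippet below is only that analogy, using standard string and Math functions rather than any litpro command or subcommand.

```javascript
// Commands read like a method chain: the leftmost step runs first.
// Here: trim, then uppercase, then split.
const piped = "  some text ".trim().toUpperCase().split(" ");

// Subcommands read like nested function calls: the innermost runs first.
// Here: min(5, 9) is evaluated before max.
const nested = Math.max(1, Math.min(5, 9));

console.log(piped, nested);
```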

Built in directives

There are a variety of directives that come built in.

Built in commands

Note commands need to be one word and are case-sensitive. They can be symbols as long as that does not conflict with anything (avoid pipes, commas, colons, quotes).

Built-in Subcommands

With command arguments, one can run subcommands on arguments to get them into some appropriate form or use, including passing in objects or arrays. For example, cmd a, subcmd(arg1, arg2, arg3) would have subcmd acting on the args, and the result of that would take the place of that argument. The a would be passed into cmd as the first argument, but anything might get passed into cmd by subcmd's return value. It could also store an object into a state for configuration.

There are several built-in subcommands. Note that these are case insensitive.

To build one's own subcommand, you can attach a function whose arguments will be the arguments passed in. The this is the doc object. The current name (say for scope storing) is in doc.cmdName. This will point to within a whole pipe chunk. Pop off the last part (delimited by triple colon) to get to the whole command scope. The return value will be used as an argument into the command or another subcommand. If it is an array and the flag args is set to true, then each entry in the array will be expanded into a set of arguments. So instead of 1 argument, several could be returned. If nothing is returned, then no arguments are passed on and it is as if it wasn't there.

h5 and h6

So this design treats h5 and h6 headings differently. They become subheadings of h1-4 headings. So for example, if we have # top and then ##### doc and ###### something then the sections would be recorded as top, top/doc, top/doc/something and we have a path syntax such as ../ which would yield top/doc if placed in top/doc/something. Ideally, this should work as you imagine. See tests/h5.md for the test examples.
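
The naming scheme can be sketched as follows. This is an assumed model of the bookkeeping, not the library's parser; the function names sectionNames and parentOf are hypothetical.

```javascript
// Sketch (assumed model): record section names so that an h5 hangs under
// the last h1-h4 heading and an h6 hangs under the last h5.
function sectionNames(headings) {
    let main = "", sub = "";
    return headings.map(function ({ level, text }) {
        if (level <= 4) {
            main = text;
            sub = "";
            return main;
        }
        if (level === 5) {
            sub = main + "/" + text;
            return sub;
        }
        return (sub || main) + "/" + text;   // h6
    });
}

// Resolve ../ against a section name by dropping its last path segment.
function parentOf(name) {
    const cut = name.lastIndexOf("/");
    return cut === -1 ? name : name.slice(0, cut);
}

const names = sectionNames([
    { level: 1, text: "top" },
    { level: 5, text: "doc" },
    { level: 6, text: "something" }
]);
console.log(names);
console.log(parentOf(names[2]));
```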

Plugins

This is a big topic which I will only touch on here. You can define commands in the text of a literate program, and we will discuss this a bit here, but mostly, both commands and directives get defined in module plugins or the lprc.js file if need be.

Defining Commands

The define directive allows one to create commands within a document. This is a good place to start getting used to how things work.

A command has the function signature function (input, args, name)-> void where the input is the incoming text (we are piping along when evaluating commands), args are the arguments that are comma separated after the command name, and the name is the name of the event that needs to be emitted with the outgoing text. The function context is the doc instance.

A minimal example is

function ( input, args, name) {
    this.gcd.emit(name, input);
}

We simply emit the name with the incoming text as data. We usually use doc for the this variable. This is the raw option in the define directive.

The default is sync and is very easy.

function ( input, args, name) {
    return input;
}

That is, we just return the text we want to return. In general, the name is not needed though it may provide context clues.

The third option is an async command. For those familiar with node conventions, this is easy and natural.

function (input, args, callback, name) {
    callback(null, input);
}

The callback takes in an error as first argument and, if no error, the text to output. One should be able to use this as a function callback to pass into other callback async setups in node.

So that's the flow. Obviously, you are free to do what you like with the text inside. You can access the document as this and from there get to the event emitter gcd and the parent, folder, leading to other docs. The scopes are available as well. Synchronous is the easiest, but asynchronous control flow is just as good and is needed for reading files, network requests, external process executions, etc.
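
To see the three styles side by side, here is a sketch that runs one command of each kind against a stand-in doc. A plain Node EventEmitter is used in place of the real event-when dispatcher, and the adapters fromSync and fromAsync are hypothetical illustrations of how the sync and async styles could be reduced to the raw signature; the library's own wrappers may well differ.

```javascript
const EventEmitter = require("events");

// Stand-in for a doc: the real gcd is an event-when dispatcher, but a
// plain Node EventEmitter is enough to show the shape of the three styles.
const doc = { gcd: new EventEmitter() };

// raw: the command emits the result itself
function rawEcho(input, args, name) {
    this.gcd.emit(name, input);
}

// sync: the command returns the result
function syncUpper(input, args) {
    return input.toUpperCase();
}

// async: the command hands the result to a node-style callback
function asyncReverse(input, args, callback) {
    setImmediate(function () {
        callback(null, input.split("").reverse().join(""));
    });
}

// Hypothetical adapters reducing sync and async commands to the raw form.
function fromSync(fn) {
    return function (input, args, name) {
        this.gcd.emit(name, fn.call(this, input, args, name));
    };
}
function fromAsync(fn) {
    return function (input, args, name) {
        const self = this;
        fn.call(self, input, args, function (err, text) {
            if (err) {
                console.error(err);          // error handling is an assumption
            } else {
                self.gcd.emit(name, text);
            }
        }, name);
    };
}

// Collect whatever each command emits; the event names are made up.
const results = {};
["raw", "sync", "async"].forEach(function (key) {
    doc.gcd.on("done:" + key, function (text) { results[key] = text; });
});

rawEcho.call(doc, "abc", [], "done:raw");
fromSync(syncUpper).call(doc, "abc", [], "done:sync");
fromAsync(asyncReverse).call(doc, "abc", [], "done:async");
```

The raw and sync results are available immediately, while the async result only arrives on a later turn of the event loop, which is why downstream steps have to wait on the emitted event rather than on a return value.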

Plugin convention.

I recommend the following npm module conventions for plugins for literate-programming.

  1. litpro-... is the name. So all plugins would be namespaced to litpro. Clear, but short.

  2. Set module.exports = function(Folder, other) The first argument is the Folder object which constructs folders which construct documents. By accessing Folder, one can add a lot of functionality. This access is granted in the command line client before any folder is created.

    The other argument depends on context, but for the command line client it is the parsed in arguments object. It can be useful for a number of purposes, but one should limit its use as it narrows the context of the use.

  3. Define commands and, less often, directives. Commands are for transforming text; directives are for document flow manipulations. Other hacks on Folder should be even rarer than adding directives.

  4. Commands and directives are globally defined.

  5. Folder.commands[cmd name] = function (input, args, name)... is how to add a command function. You can use Folder.sync(cmdname, cmdfun) and Folder.async to install sync and async functions directly in the same fashion as used by the define directive.

  6. Folder.directives[directive name] = function (args) is how to install a directive. There are no helper functions for directives. These are more for controlling the flow of the compiling in the large. The arg keys are read off from [link](href "directive:input"). Also provided is the current block name which is given by the key cur.

  7. If you want to do stuff after folder and its event emitter, gcd, is created, then you can modify Folder.postInit to be a function that does whatever you want on a folder instance. Think of it as a secondary constructor function.

  8. The Folder has a plugins object where one can stash whatever under the plugin's name. This is largely for options and alternatives. The folder and doc object map to the same object.
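
Put together, a plugin following these conventions might look like the sketch below. The plugin name litpro-shout, the shout command, and its option object are all hypothetical. The stub Folder imitates only the pieces used here so the sketch can run without the library; in particular, the real Folder.sync presumably wraps the function before storing it, whereas the stub stores it as is.

```javascript
// litpro-shout: a hypothetical plugin following the conventions above.
function litproShout(Folder, other) {
    // sync command: returns the transformed text
    Folder.sync("shout", function (input, args) {
        const suffix = args[0] || "!";
        return input.toUpperCase() + suffix;
    });
    // stash plugin options under the plugin's name
    Folder.plugins.shout = { loud: true };
}
module.exports = litproShout;

// Minimal stub of Folder for trying the plugin out; an assumption, not
// the library's implementation.
const StubFolder = {
    commands: {},
    plugins: {},
    sync: function (name, fn) { this.commands[name] = fn; }
};

litproShout(StubFolder, {});
console.log(StubFolder.commands.shout("hello", []));
```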

Structure of Doc and Folder

To really hack the doc compiling, one should inspect the structure of Folder, folder, and doc. The Folder is a constructor and it has a variety of properties on it that are global to all folders. But it also has several prototype properties that get inherited by the folder instances. Some of those get inherited by the docs as well. For each folder, there is also a gcd object which is the event emitter, which comes from the, ahem, magnificent event-when library (I wrote it with this use in mind). In many ways, hacking on gcd will manipulate the flow of the compiling.

I wrote the folder instance to maintain flexibility, but so far one folder instance per run is typical. Still, there might be a use for several, say, to have development and production compiles that are separate but run simultaneously.

Folder

These are the properties of Folder that may be of interest.

folder

Each instance of folder comes with its own instances of:

and shares via the prototype

and direct copying from

doc

Each file leads to a doc which is stored in the folder. Each doc has a variety of stuff going on.

Unique to each instance

Inherited from folder

Prototyped on Doc. Almost all are internal and are of little to no interest.

Reporting

A key feature of any programming environment is debugging. It is my hope that this version has some better debugging information. The key to this is the reporting function of what is waiting around.

The way it works is that when an event of the form waiting for:type:... is emitted with data [evt, reportname, ...] then reporters gets a key of the event string without the waiting for:, and when the evt is emitted, it is removed.

If it is still waiting around when all is done, then it gets reported. The reportname is used to look up which reporter is used. Then that reporter takes in the remaining arguments and produces a string that will be part of the final report that gets printed out.

Some of the waiting is not done by the emitting, but rather by presence in .when and .onces.
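
The bookkeeping described above can be modeled in a few lines. This is a toy under stated assumptions: a plain Node EventEmitter stands in for gcd, the helper names waitFor and report, the reporter named save, and the event names are all made up, and the real library's data layout may differ.

```javascript
const EventEmitter = require("events");

// Toy model of the waiting registry (assumed shape, not the library's code).
const gcd = new EventEmitter();
const reporters = {};   // key -> [reportname, ...args]
const reportFormats = {
    // hypothetical reporter: turns the stored arguments into a message
    save: (args) => "NOT SAVED: " + args[0]
};

// Register something as waiting until evt fires, then forget it.
function waitFor(key, evt, reportname, ...args) {
    reporters[key] = [reportname, ...args];
    gcd.once(evt, function () { delete reporters[key]; });
}

// Whatever is still registered at the end gets reported.
function report() {
    return Object.keys(reporters).map(function (key) {
        const [reportname, ...args] = reporters[key];
        return reportFormats[reportname](args);
    });
}

waitFor("save:a.js", "text ready:a", "save", "a.js");
waitFor("save:b.js", "text ready:b", "save", "b.js");
gcd.emit("text ready:a");        // a.js is no longer waiting

const leftover = report();
console.log(leftover);
```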

LICENSE

MIT-LICENSE