Awesome Language Engineering

A curated list of useful resources for computer language engineering and theory

Whether you want to create a text processor, a parser, a language application, a DSL (Domain-Specific Language), or a full-fledged programming language with a compiler and tooling, this page serves as a map to point you in the right direction.

Better yet, help others find their way by contributing the resources that you find useful.

Tools

As in other domains, knowing the tried-and-true tools will save you a lot of time and effort. You will also learn the emerging techniques that different tools adopt, which makes your skills more transferable.

ANTLR (ANother Tool for Language Recognition)

A powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees.

Describe the language's lexical and grammar rules in a declarative .g4 file (similar to the Lex/Yacc format), and the generator can create a parser for the following target languages: Java, C#, Python, JavaScript, Go, C++, Swift (see update)
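To make "build and walk parse trees" concrete, here is a minimal hand-written sketch in Python. It does not use ANTLR or its generated code. The toy grammar, the `Node` class, and the function names are all assumptions made up for illustration. It only shows the general shape: grammar rules become parse functions, the parser produces a tree, and a separate walk computes a result from that tree.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy grammar, written in an ANTLR-like notation for reference:
#   expr : term (('+' | '-') term)* ;
#   term : INT ;

@dataclass
class Node:
    rule: str                      # grammar rule or token type behind this node
    text: str = ""                 # matched text, used by leaf (token) nodes only
    children: list = field(default_factory=list)

def tokenize(source):
    # Split the input into INT and operator tokens.
    # Simplification: characters that match nothing are silently skipped.
    return re.findall(r"\d+|[+-]", source)

def parse_term(tokens):
    # term : INT ;
    if not tokens or not tokens[0].isdigit():
        raise SyntaxError("expected INT")
    return Node("term", children=[Node("INT", tokens.pop(0))])

def parse_expr(tokens):
    # expr : term (('+' | '-') term)* ;
    node = Node("expr", children=[parse_term(tokens)])
    while tokens and tokens[0] in ("+", "-"):
        node.children.append(Node("OP", tokens.pop(0)))
        node.children.append(parse_term(tokens))
    return node

def evaluate(node):
    # Walk the parse tree and compute a value for each node.
    if node.rule == "INT":
        return int(node.text)
    if node.rule == "term":
        return evaluate(node.children[0])
    # expr: fold the operands from left to right.
    result = evaluate(node.children[0])
    for op, operand in zip(node.children[1::2], node.children[2::2]):
        value = evaluate(operand)
        result = result + value if op.text == "+" else result - value
    return result

tree = parse_expr(tokenize("10 + 5 - 3"))
print(evaluate(tree))  # prints 12
```

With a parser generator, the parsing functions would be generated from the grammar file rather than written by hand, leaving only the tree-walking logic to write.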

Learning materials:

MPS (Meta Programming System)

With JetBrains MPS, you can define custom editors for any new language and make using these DSLs simpler. Even domain experts, who are not familiar with traditional programming, can easily work in MPS with domain-specific languages designed around their domain-specific terminology.

Learning materials:

Xtext

Xtext is an Eclipse framework for developing programming languages and domain-specific languages. With Xtext, you define your language using a powerful grammar language. As a result, you get a full infrastructure, including a parser, linker, type checker, and compiler, as well as editing support for Eclipse, IntelliJ IDEA, and your favorite web browser.

Learning materials:

Sirius

Sirius is an Eclipse project which allows you to easily create your own graphical modeling workbench by leveraging the Eclipse Modeling technologies, including EMF and GMF.

Learning materials:

Flex and Bison

Flex and Bison are aging Unix utilities that help you write very fast parsers for almost arbitrary file formats. Lex and Yacc are the original tools; Flex and Bison are their newer, almost completely compatible successors.

Learning materials:

Kaitai Struct

A parser generator for reading binary data. It provides a declarative language for specifying the structure of binary data, from which it generates parsers (in multiple target languages) that read binary file formats, network stream packet formats, etc. It comes with a compiler, an IDE, a visualizer, and a library of format specs.

Describe the binary structure in a declarative .ksy file (YAML-like), and the generator can create a parser for the following target languages: C++/STL, C#, Java, JavaScript, Perl, PHP, Python, Ruby (see update)
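For a sense of what such a parser does, here is a hand-written Python sketch that reads a small binary structure with the standard `struct` module. It does not use Kaitai Struct, a .ksy file, or the Kaitai runtime. The `DEMO` format, its field layout, and the names `DemoFile` and `parse_demo` are invented for this example. A generator's value is that code of this kind is derived from the declarative spec instead of being written and maintained by hand.

```python
import struct
from dataclasses import dataclass

# Hypothetical format, invented for this example:
#   magic    4 bytes, must be b"DEMO"
#   version  unsigned 16-bit little-endian integer
#   count    unsigned 16-bit little-endian integer
#   values   `count` unsigned 32-bit little-endian integers

HEADER = struct.Struct("<4sHH")   # "<" means little-endian with no padding

@dataclass
class DemoFile:
    version: int
    values: list

def parse_demo(data: bytes) -> DemoFile:
    # Read the fixed-size header first, then the variable-length body.
    magic, version, count = HEADER.unpack_from(data, 0)
    if magic != b"DEMO":
        raise ValueError("bad magic")
    values = list(struct.unpack_from(f"<{count}I", data, HEADER.size))
    return DemoFile(version, values)

# Build a sample file in memory: version 1, two values.
sample = b"DEMO" + struct.pack("<HH", 1, 2) + struct.pack("<2I", 100, 200)
print(parse_demo(sample))  # prints DemoFile(version=1, values=[100, 200])
```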

Sed and Awk

Sed and Awk are two text processing programs that are mainstays of the UNIX programmer's toolbox.

Both are command-line programs that can be used independently or combined for many text processing purposes. They are great for recognizing and extracting information from text input. For simple language recognition tasks, they may well be the least-effort tools for the job, thanks to their simplicity and targeted use cases. Sed and Awk ship with most, if not all, Linux/Unix/macOS distributions, and they can be downloaded for Windows as well.
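The two ideas behind these tools, line-by-line substitution and pattern-plus-field extraction, can be sketched in Python as follows. This is an illustration of the processing model only, not a replacement for either tool. The log lines and the helper names `substitute` and `select_fields` are made up for this example, and Python's regular expression dialect differs in places from those used by sed and awk.

```python
import re

# Hypothetical log lines, invented for this example.
LOG = """\
2024-01-05 ERROR disk full
2024-01-05 INFO backup started
2024-01-06 ERROR network down
"""

def substitute(text, pattern, replacement):
    # Roughly what a sed command like s/pattern/replacement/g does:
    # apply the substitution to every line of the input.
    return "\n".join(re.sub(pattern, replacement, line) for line in text.splitlines())

def select_fields(text, pattern, field_numbers):
    # Roughly what an awk program like /pattern/ { print $1, $3 } does:
    # keep the lines that match, split them on whitespace, and emit
    # the requested fields (numbered from 1, as in awk).
    rows = []
    for line in text.splitlines():
        if re.search(pattern, line):
            fields = line.split()
            rows.append(" ".join(fields[n - 1] for n in field_numbers))
    return rows

print(select_fields(LOG, r"ERROR", [1, 3]))
print(substitute(LOG, r"ERROR", "E"))
```

In practice, the equivalent sed or awk one-liner is usually much shorter than this, which is exactly why the tools remain popular for such tasks.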

Learning materials:

Fundamentals

Books

DSL Engineering

Designing, Implementing and Using Domain-Specific Languages

The definitive resource on domain-specific languages: based on years of real-world experience, relying on modern language workbenches and full of examples. Domain-Specific Languages are programming languages specialized for a particular application domain.

Language Implementation Patterns

Create Your Own Domain-Specific and General Programming Languages

Written by the author of ANTLR, which is also the tool used throughout the book, but the general concepts apply regardless of the tool you use.

Compilers: Principles, Techniques, and Tools

A classic compiler book that is known to professors, students, and developers worldwide as the "Dragon Book".

Writing An Interpreter In Go

Learn how to use a C-like language such as Go to create a complete programming language by applying the fundamental concepts of a lexer, a parser, an AST (Abstract Syntax Tree), Pratt parsing, and recursive descent parsing. The book also shows you how to implement a REPL (interactive language shell).
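As a rough illustration of how these concepts fit together, here is a compact Pratt-style expression parser in Python. It is a sketch written for this list, not code from the book, which uses Go and a much richer language. The token set, the binding-power table, and the tuple-based AST are assumptions chosen to keep the example short.

```python
import re

# Binding power (precedence) of each infix operator; higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}

def lex(source):
    # Lexer: turn the source string into a flat list of tokens.
    # Simplification: characters that match nothing are silently skipped.
    return re.findall(r"\d+|[-+*/()]", source)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def parse_expression(self, min_power=0):
        # Pratt loop: parse a prefix expression, then keep absorbing infix
        # operators whose binding power exceeds the caller's minimum.
        left = self.parse_prefix()
        while self.peek() in BINDING_POWER and BINDING_POWER[self.peek()] > min_power:
            op = self.advance()
            right = self.parse_expression(BINDING_POWER[op])
            left = (op, left, right)      # AST node: (operator, lhs, rhs)
        return left

    def parse_prefix(self):
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token.isdigit():
            return ("int", int(token))
        if token == "-":
            # Unary minus binds tighter than any infix operator here.
            return ("neg", self.parse_expression(30))
        if token == "(":
            inner = self.parse_expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        raise SyntaxError(f"unexpected token {token!r}")

def parse(source):
    parser = Parser(lex(source))
    tree = parser.parse_expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree

def evaluate(node):
    # Tree-walking evaluator over the tuple-based AST.
    kind = node[0]
    if kind == "int":
        return node[1]
    if kind == "neg":
        return -evaluate(node[1])
    left, right = evaluate(node[1]), evaluate(node[2])
    if kind == "+":
        return left + right
    if kind == "-":
        return left - right
    if kind == "*":
        return left * right
    return left / right

print(parse("1 + 2 * 3"))                # ('+', ('int', 1), ('*', ('int', 2), ('int', 3)))
print(evaluate(parse("1 + 2 * 3")))      # 7
print(evaluate(parse("-(4 - 1) * 2")))   # -6
```

A REPL would wrap `parse` and `evaluate` in a loop that reads one line at a time and prints each result.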

Articles

General:

Paradigms:

Type Systems

License

CC0

To the extent possible under law, Nikyle Nguyen has waived all copyright and related or neighboring rights to this work.