Home

Awesome

LIBUNIBREAK

Overview

This is the README file for libunibreak, an implementation of the line breaking and word/grapheme breaking algorithms as described in Unicode Standard Annex 14 (UAX #14) and Unicode Standard Annex 29 (UAX #29). Check the project's home page for up-to-date information.

As of February 2024, Unicode 15.0 support for line breaking, as well as full Unicode 15.1 support for word/grapheme breaking, is provided. There is currently no plan to implement full Unicode 15.1 support for line breaking, mostly because tailoring for Brahmic scripts, as described in LB28a of UAX #14-51, is problematic within the current framework.

Licence

This library is released under an open-source licence, the zlib/libpng licence. Please check the file LICENCE for details.

Apart from using the algorithm, part of the code is derived from the Unicode Public Data, and the Unicode Terms of Use may apply.

Installation

There are three ways to build the library:

  1. On *NIX systems supported by the autoconfiscation tools, do the normal

     ./configure
     make
     sudo make install
    

    to build and install both the dynamic and static libraries. In addition, one may

    • type make doc to generate the doxygen documentation; or
    • type make check to run the self-check tests.
  2. On systems where GCC and Binutils are supported, one can type

     cd src
     cp -p Makefile.gcc Makefile
     make
    

    to build the static library. In addition, one may

    • type make debug or make release to explicitly generate the debug or release build; or
    • type make doc to generate the doxygen documentation.
  3. On Windows, apart from using method 1 (Cygwin/MSYS) and method 2 (MinGW), MSVC can also be used. Type

     cd src
     nmake -f Makefile.msvc
    

    to build the static library. By default the debug version is built. To build the release version

     nmake -f Makefile.msvc CFG="libunibreak - Win32 Release"
    

Documentation

Check the generated document doc/html/linebreak_8h.html, doc/html/wordbreak_8h.html, and doc/html/graphemebreak_8h.html in the downloaded file for the public interfaces exposed to applications.

Examples

See the tools directory for basic examples. A more real utility is breaktext.

<!-- vim:autoindent:expandtab:formatoptions=tcqlmn:textwidth=72: -->