Awesome
documentation (generated by doing ./make_test.sh 8): HTML
libsrt has been included into Paul Hsieh's String Library Comparisons table. What a great honor! :-)
libsrt: Safe Real-Time library for the C programming language
libsrt is a C library that provides string, vector, bit set, set, map, hash set, and hash map handling. It's been designed for avoiding explicit memory management when using dynamic-size data structures, allowing safe and expressive code, while keeping high performance. It is also suitable for low level and hard real-time applications, because functions are predictable in both space and time (assuming OS and underlying C library is also real-time suitable).
Key points:
- Safe: write high-level-like C code. Write code faster and safer.
- Fast: using O(n)/O(log n)/O(1) state of the art algorithms.
- Rich: strings supporting raw data handling (per-byte), vector, bit set, set, map, hash set, and hash map data structures.
- Efficient: space optimized, minimum allocation calls (heap and stack support for ALL data structures).
- Compatible: OS-independent (e.g. built-in space-optimized UTF-8 support).
- Predictable: intended for both general purpose and for hard real-time applications.
- Unicode: although string internal representation is raw data (bytes), functions for handling Unicode interpretation/generation/transformation are provided, so when Unicode-specific functions are used, the result of these functions is stored internally as UTF-8 (also, caching some operations, like Unicode string length -e.g. ss_len()/ss_size() give length in bytes, and ss_len_u() the length in Unicode characters-).
How to build
-
In most POSIX systems (e.g. Linux, Unix, and other Unix-like) "make -f Makefile.posix" will be enough, as the Makefile.posix includes platform detection for corner cases. However, you can use multiple flags for tuning the build (you can also mix them):
- make -f Makefile.posix # Defaults (equivalent to make ADD_CFLAGS="-DS_CRC32_SLC=12")
- make -f Makefile.posix -j 8 # Make, spawning up to 8 concurrent jobs (faster on multiprocessor systems)
- make -f Makefile.posix DEBUG=1 # Disable optimizations (-O0 -ggdb)
- make -f Makefile.posix PROFILING=1 # Profiling and coverage flags
- make -f Makefile.posix CC=gcc PEDANTIC=1 C99=1 # Pedantic build for GCC in C99 mode (this should give no warnings)
- make -f Makefile.posix CC=tcc # Use the Tiny C Compiler
- make -f Makefile.posix CC=gcc CXX=g++ # Use gcc for the library and examples and g++ for the benchmark
- make -f Makefile.posix CC=clang CXX=clang++ # Use CLang for the library and examples and CLang C++ for the benchmark
- make -f Makefile.posix FORCE32=1 # Force 32 bit build on 64 bit system
- make -f Makefile.posix MINIMAL=1 # Microcontroller-suitable build (optimize for size and low memory usage)
- make -f Makefile.posix CC=gcc PROFILING=1 # Build with gcc and profiling
- make -f Makefile.posix CC=gcc C90=1 # Build with gcc using C89/90 standard
- make -f Makefile.posix CC=gcc C99=1 # Build with gcc using C99 standard
- make -f Makefile.posix CC=gcc C11=1 # Build with gcc using C11 standard
- make -f Makefile.posix CC=g++ # Build with g++ (C++ instead of C)
- make -f Makefile.posix CC=clang++ CPP11=1 # Build with clang++ using C++11 standard (instead of C)
- make -f Makefile.posix CC=clang CXX=clang++ C99=1 CPP11=1 # Build with clang using C99 mode and build the benchmark using C++11
- make -f Makefile.posix CC=tcc DEBUG=1 # Build with TinyCC with debug symbols
- make -f Makefile.posix CPP11=1 bench # Build the benchmark including C++11 support (for comparing libsrt vs C++/STL hash map/set)
- make -f Makefile.posix CC=powerpc-linux-gnu-gcc # Build with gcc cross compiler (PPC)
- make -f Makefile.posix CC=arm-linux-gnueabi-gcc # Build with gcc cross compiler (ARM)
- make -f Makefile.posix ADD_CFLAGS="-DS_CRC32_SLC=0" # Build without CRC32 hash tables, 1 bit/loop (100MB/s on i5@3GHz)
- make -f Makefile.posix ADD_CFLAGS="-DS_CRC32_SLC=1" # Build with CRC32 1024 byte hash table, 1 byte/loop (400MB/s on i5@3GHz)
- make -f Makefile.posix ADD_CFLAGS="-DS_CRC32_SLC=4" # Build with CRC32 4096 byte hash table, 4 bytes/loop (1000MB/s on i5@3GHz)
- make -f Makefile.posix ADD_CFLAGS="-DS_CRC32_SLC=8" # Build with CRC32 8192 byte hash table, 8 bytes/loop (2000MB/s on i5@3GHz)
- make -f Makefile.posix ADD_CFLAGS="-DS_CRC32_SLC=12" # Build with CRC32 12288 byte hash table, 12 bytes/loop (2500MB/s on i5@3GHz) -this is the default CRC32 mode-
- make -f Makefile.posix ADD_CFLAGS="-DS_CRC32_SLC=16" # Build with CRC32 16384 byte hash table, 16 bytes/loop (2700MB/s on i5@3GHz):
- make -f Makefile.posix ADD_FLAGS=-DSD_DISABLE_HEURISTIC_GROWTH # Build with growth heuristics disabled (not recommended)
- make -f Makefile.posix ADD_FLAGS=-DS_DISABLE_SM_STRING_OPTIMIZATION # Build without map string optimizations (not recommended, except for benchmarking)
- make -f Makefile.posix HAS_PNG=1 HAS_JPG=1 # Build enabling PNG and JPG usage so the 'imgc' example can convert import/export those formats (libpng and jpeg 6b -e.g. libjpegturbo- compatible dev libs and headers must be installed in the system)
-
Observations
- Every make call, in addition to building the targets, it does a full test for that build (unit tests covering all the API function calls -'stest' executable-)
- Between make calls with different parameters, please use "make clean"
- It should build with any GCC version >= 2.95.2 (1999), in any hardware platform (x86, x86-64, MIPS32/64, MIPSLE, PPC/PPC64, ARM, etc.)
- CLang should work in all versions (except in cases of reporting wrong/missing compiler or C standard version)
- For Windows it is provided a .sln example just for building the test (used for the CI check)
- In BSD systems not using GNU Make as default, use gmake instead of make
-
For launching the extensive tests:
- ./make_test.sh # All tests (used in the CI validation): all builds (23 * 4 tests), Valgrind memcheck, CLang static analyzer, documentation generation and validation, coding style check
- ./make_test.sh 1 # Validate all available C/C++ builds
- ./make_test.sh 2 # Valgrind memcheck
- ./make_test.sh 4 # Clang static analyzer
- ./make_test.sh 8 # Generate documentation
- ./make_test.sh 16 # Check coding style
- ./make_test.sh 24 # Like 8 plus 16 (3 would be like 1 plus 2, etc.)
-
A preliminary Autoconf/Automake build is also provided:
- ./bootstrap.sh
- ./configure
- make
- make check
-
A Visual Studio project is provided in win/vs2013/ for running the test through AppVeyor's CI.
Generic advantages
-
Ease of use
- Use strings, vectors, bit sets, sets, maps, hash sets, and hash maps in a similar way to higher level languages.
-
Space-optimized
- Dynamic one-block linear addressing space.
- Internal structures use indexes instead of pointers (i.e. similar memory usage in both 32 and 64 bit mode).
- Details: doc/benchmarks.md
-
Time-optimized
- Buffer direct access
- Preallocation hints (reducing memory allocation calls)
- Heap and stack memory allocation support
- Details: doc/benchmarks.md
-
Predictable (suitable for hard and soft real-time)
- Predictable execution speed: all API calls have documented time complexity. Also space complexity, when extra space involving dynamic memory is required.
- Hard real-time: allocating maximum size (strings, vector, bit set, set, map, hash set, hash map) will imply calling 0 or 1 times to the malloc() function (C library). Zero times if using the stack as memory or externally allocated memory pool, and one if using the heap. This is important because malloc() implementation has both memory and time overhead, as internally manages a pool of free memory (which could have O(1), O(log(n)), or even O(n), depending on the compiler provider C library implementation).
- Soft real-time: logarithmic time memory allocation (when not doing preallocation): for 10^n new elements just log(n) calls to the malloc() function. This is not a guarantee for hard real time, but for soft real time (being careful could provide almost same results, except in the case of very poor malloc() implementation in the C library being used -not a problem with modern compilers-).
-
RAM, ROM, and disk operation
- Data structures can be stored in ROM memory.
- Data structures are suitable for memory mapped operation, and disk store/restore. This is true for strings, vectors, and bit sets, and for sets/maps/hash sets/hash maps when using integer data and when using small strings (<= 19 bytes for S/SI/IS data types, and <= 54 bytes for SS).
-
Known edge case behavior
- Allowing both "carefree code" and per-operation error check. I.e. memory errors and UTF8 format error can be checked after every operation.
-
Implementation
- Simple internal structures, with linear addressing. It allows to reduce memory allocation calls to the minimum (using the stack it is possible to avoid heap usage).
- Implementation kept as simple as possible/known.
- Focus in avoiding code-bloat, avoiding expensive code duplication/inlining (inlining is used when being "free" or because of big speed impact)
- Small code, suitable for static linking. In fact, currently no dynamic linking is provided (will be added, but it is not a priority).
- Coverage, profiling, and memory leak/overflow/uninitialized checks (Valgrind, CLang static analysis, Coverity, MS VS).
-
Compatibility
- C99 (and later) compatible (requiring alloca() function)
- C++11 (and later) compatible (requiring alloca() function)
- Endian-safe: the library works with any CPU endianess (tested with little and big endian, but it should work with other modes, e.g. with PDP endianess -not tested-)
- POSIX builds (Linux, BSD's) require GNU Make (e.g. on FreeBSD use 'gmake' instead of 'make')
- Compatibility with old C compilers:
- Oldest GNU C compiler being tested is GCC 2.95.2 (1999)
- Other old compilers may work in C89 mode only if following language extensions are available:
- __VA_ARGS__ macro
- alloca()
- type of bit-field in 'struct'
- %S printf extension (only for unit testing)
- Compatibility with old C++ compilers:
- It may work in C++98 mode only if following language extensions are available:
- Anonymous variadic macros
- Long long integer constant support (LL)
- It may work in C++98 mode only if following language extensions are available:
Generic disadvantages/limitations
- Double pointer usage: because of using just one allocation, write operations require to address a double pointer, so in the case of reallocation the source pointer could be changed.
- Concurrent read-only operations are safe, but concurrent read/write must be protected by the user (e.g. using mutexes or spinlocks). That can be seen as a disadvantage or as a "feature" (it is faster).
String-specific advantages (srt_string)
- Binary data support
- I.e. strings can have zeros in the middle
- Search and replace into binary data is supported
- Unicode support
- Although strings internal storage is binary, Unicode-aware functions store data in UTF-8.
- Search and replace into UTF-8 data is supported
- Full and fast Unicode lowercase/uppercase support without requiring "setlocale" nor hash tables.
- Efficient raw and Unicode (UTF-8) handling. Unicode size is tracked, so resulting operations with cached Unicode size, will keep that, keeping the O(1) for getting that information afterwards.
- Find/search: O(n), one pass.
- Replace: O(n), one pass. Worst case overhead is limited to a realloc and a copy of the part already processed.
- Concatenation: O(n), one pass for multiple concatenation. I.e. Optimal concatenation of multiple elements require just one allocation, which is computed before the concatenation. When concatenating ss_t strings the allocation size compute time is O(1).
- Resize: O(n) for worst case (when requiring reallocation for extra space. O(n) for resize giving as cut indicator the number of Unicode characters. O(1) for cutting raw data (bytes).
- Case conversion: O(n), one pass, using the same input if case conversion requires no allocation over current string capacity. If resize is required, in order to keep O(n) complexity, the string is scanned for computing required size. After that, the conversion outputs to the secondary string. Before returning, the output string replaces the input, and the input becomes freed.
- Avoid double copies for I/O (read access, write reserve)
- Avoid re-scan (e.g. commands with offset for random access)
- Transformation operations are supported in all dup/cpy/cat functions, in order to both increase expressiveness and avoid unnecessary copies (e.g. tolower, erase, replace, etc.). E.g. you can both convert to lower a string in the same container, or copy/concatenate to another container.
- Space-optimized
- Using just 4 byte overhead for strings with size <= 255 bytes
- Using sizeof(size_t) * 5 byte overhead for strings with size >= 256 bytes (e.g. 20 bytes for a 32-bit CPU, 40 for 64-bit)
- Data structure has no pointers, i.e. just one allocation is required for building a string. Or zero, if using the stack.
- No additional memory allocation for search.
- Extra memory allocation may be required for: UTF-8 uppercase/lowercase and replace.
- Strings can grow from 0 bytes to ((size_t)~0 - metainfo_size)
- String operations
- Copy, cat, tolower/toupper, find, split, printf, cmp, base64, data compression, crc32 on buffers, etc.
- All string operations allow C strings and raw buffers as input, without extra copies (ss_[c]refa functions)
- Allocation, buffer pre-reserve,
- Raw binary content is allowed, including 0's.
- "Wide char" and "C style" strings R/W interoperability support.
- I/O helpers: buffer read, reserve space for async write
- Aliasing suport, e.g. ss_cat(&a, a) is valid
- Misc string/buffer operations:
- Real-time O(n) data compression (stateless, unlimited buffer size, and hash table resource usage proportional to the input size, i.e. efficient also for small inputs)
- State of the art encoding: base64, hexadecimal, etc. (at GB/s speeds)
- State of the art CRC32 and Adler32 hashes on strings (at >2 GB/s speeds)
- Focus on reducing verbosity:
- ss_cat(&t, s1, ..., sN);
- ss_cat(&t, s1, s2, ss_printf(&s3, "%i", cnt), ..., sN);
- ss_free(&s1, &s2, ..., &sN);
- Expressive code without explicit memory handling
- Focus on reducing errors, e.g.
- If a string operation fails, the string is kept in the last successful state (e.g. ss_cat(&a, b, huge_string, other))
- String operations always return valid strings, e.g. This is OK: srt_string *s = NULL; ss_cpy_c(&s, "a"); Same behavior as: srt_string *s = ss_dup_c("a");
- ss_free(&s1, ..., &sN); (no manual set to NULL is required)
String-specific disadvantages/limitations
- No reference counting support. Rationale: simplicity.
Vector-specific advantages (srt_vector)
- Variable-length concatenation and push functions.
- Allow explicit size for allocation (8, 16, 32, 64 bits) with size-agnostic generic signed/unsigned functions (easier coding).
- Allow variable-size generic element.
- Sorting
- O(n) for 8-bit elements (counting sort algorithm), much faster than GNU/Clang qsort() (C), and up to 5x faster than GNU/Clang std::vector sort (C++)
- O(n log n) -pseudo O(n)- for 16/32/64-bit elements (in-place MSD binary radix sort algorithm), 2x-3x faster than GNU/Clang qsort() (C), performing similar to GNU/Clang std::vector sort (C++)
- O(n log n) for generic elements using the C library (qsort())
Vector-specific disadvantages/limitations
- No insert function. Rationale: insert is slow (O(n)). Could be added, if someone asks for it.
Set and map advantages (srt_set and srt_map)
- Abstraction over Red-Black tree implementation using linear memory pool with just 8 byte per node overhead, allowing up to (2^32)-1 nodes (for both 32 an 64 bit compilers). E.g. for a key-value map, one million 32 bit key, 32 bit value map will take just 16MB of memory (16 bytes per element -8 byte metadata, 4 + 4 byte data-).
- Keys: integer (8, 16, 32, 64 bits) and string (ss_t)
- Values: integer (8, 16, 32, 64 bits), string (ss_t), and pointer
- O(1) for allocation
- O(1) for deleting without strings (one or zero calls to 'free' C function)
- O(n) for deleting with strings (n + one or zero calls to 'free' C function)
- O(n) for copy (in case of without strings, would be as fast as a memcpy())
- O(log n) insert, search, delete
- O(n) sorted enumeration (amortized O(n log n))
- O(n) unsorted enumeration (faster than the sorted case)
- O(n) copy: tree structure is copied as fast as a memcpy(). For types involving strings, additional allocation is used for duplicating strings.
- Short string optimization so strings up to 18 bytes can fit in the node for (SI, IS, SP maps, and S sets), and up to 54 bytes combined for string-string maps (SS type). Short strings require no extra allocation/de-allocation calls.
Set and map disadvantages/limitations (srt_set and srt_map)
- Because of being implemented as a tree, it can be slower than a hash-map on average, specially on integer data. However, total execution time is not that bad, as because of allocation heuristics a lot of calls to the allocator are avoided.
- There is room for node deletion speed-up (currently deletion is a bit slower than insertion, because of an additional tree search used for avoiding having memory fragmentation, as implementation guarantees linear/compacted memory usage, it could be optimized with a small LRU for cases of multiple delete/insert operation mix).
Hash set and hash map advantages (srt_hset and srt_hmap)
- Implemented using open-addressing hash table, using linear memory pool with 12 byte per bucket overhead, allowing up to (2^32)-1 nodes (for both 32 an 64 bit compilers). E.g. for a key-value hash map, one million 32 bit key, 32 bit value map will take just 20MB of memory (20 bytes per insertion -12 byte for the hash table bucket, 4 + 4 byte data-).
- Keys: integer (8, 16, 32, 64 bits) and string (ss_t)
- Values: integer (8, 16, 32, 64 bits), string (ss_t), and pointer
- O(1) for allocation
- O(1) for deleting without strings (one or zero calls to 'free' C function)
- O(n) for deleting with strings (n + one or zero calls to 'free' C function)
- O(n) for map copy (in case of maps without strings, would be as fast as a memcpy())
- O(n) -O(1) amortized- insert, search, delete
- O(n) unsorted enumeration
- O(n) copy: hash table structure and data elements are copied as fast as a memcpy(). For map types involving strings, additional allocation is used for duplicating strings.
- Short string optimization so strings up to 18 bytes can fit in the node for (SI, IS, SP maps, and S sets), and up to 54 bytes combined for string-string maps (SS type). Short strings require no extra allocation/de-allocation calls.
Hash set and hash map disadvantages/limitations (srt_hset and srt_hmap)
- Because of being implemented as a hash table, if not pre-reserved, rehash adds latency.
Test-covered platforms
ISA | Word size | Endianess | Unaligned memory access HW support | OS | Compilers | Code analysis | Test coverage |
---|---|---|---|---|---|---|---|
x86, x86-64 (Core i5) | 32, 64 | little | yes | Linux Ubuntu 12.04/14.04/17.10 | gcc, g++, tcc, clang, clang++ | Valgrind, clang, Coverity | Travis CI (automatic, every public commit) |
x86, x86-64 (Core i5) | 32, 64 | little | yes | Windows | Visual Studio Express 2013, AppVeyor's VS | VS | AppVeyor (automatic, every public commit) |
x86, x86-64 (Core i5) | 32, 64 | little | yes | FreeBSD 10.2 | gcc, g++, clang, clang++ | Valgrind clang | manual |
x86, x86-64 (Core2Duo) | 32, 64 | little | yes | Darwin 11.4.2 | gcc, g++, clang, clang++ | none | manual |
ARMv5 (ARM926EJ-S) | 32 | little | no | Arch Linux | gcc, g++, clang, clang++ | none | manual |
ARMv5 (ARM926EJ-S) | 32 | little | no | Linux Debian 7.0 "Wheezy" | gcc, g++, clang, clang++ | none | manual |
ARMv5 (Feroceon) | 32 | little | no | Linux Debian 7.0 "Wheezy" | gcc, g++ | none | manual |
ARMv6 (ARM1176JZF-S) | 32 | little | yes | Linux Raspbian | gcc, g++, clang, clang++ | Valgrind, clang | manual |
ARMv7-A (Krait 400) | 32 | little | yes | Linux Android 5.1.1 + BusyBox | gcc, g++ | none | manual |
ARMv8-A (Cortex A53) | 64 | little | yes | Debian 8.5 "Jessie" | gcc, g++, clang, clang++ | Valgrind, clang | manual |
ARMv8-A (MSM8996) | 64 | little | yes | Linux Android 7.1.1 | clang, clang++ | clang | manual |
MIPS, MIPS64 (Octeon) | 32, 64 | big | yes | EdgeOS v1.6.0 (Linux Vyatta-based using Debian 7 "Wheezy" packages) | gcc, g++, clang, clang++ | Valgrind, clang | manual |
MIPS (EE R5900) | 32 | little | no | Playstation 2 Linux (Red Hat based) | gcc, g++ 2.95.2 | none | manual |
PowerPC (G4) | 32 | big | yes | Linux Ubuntu 12.04 | gcc, g++ | none | manual |
PowerPC, PPC64 (Cell) | 32, 64 | big | yes | Yellow Dog Linux 6.2 for Playstation 3 (Red Hat based) | gcc, g++ 4.1.2 | none | manual |
License
Copyright (c) 2015-2019, F. Aragon. All rights reserved. Released under the BSD 3-Clause License (see the LICENSE file included).
Contact
email: faragon.github (GMail account, add @gmail.com)
Other
Status
Beta. API still can change: suggestions are welcome.
Acknowledgements and references
Check doc/references.md