PCRE.NET

Perl Compatible Regular Expressions for .NET

PCRE.NET is a .NET wrapper for the PCRE2 library.

The following systems are supported:

API Types

The classic API

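This is a friendly API that is very similar to .NET's System.Text.RegularExpressions. It works on string objects and supports operations such as matching (Matches, Match, IsMatch) and replacement (Replace, Substitute), all of which appear in the example usage section.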
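As a rough illustration of the classic API, the sketch below compiles a pattern once and reuses the instance. The key=value pattern, the subject strings and the variable names are assumptions made up for this example. The group indexers reflect my understanding of PcreMatch, not a statement from this page.

```csharp
using System;
using System.Linq;
using PCRE;

// Hypothetical pattern: a word, an equals sign, then a run of digits.
var regex = new PcreRegex(@"(?<key>\w+)=(?<value>\d+)");

// Match returns the first match; Success tells whether one was found.
var match = regex.Match("width=640, height=480");
Console.WriteLine(match.Success);       // True
Console.WriteLine(match.Value);         // width=640
Console.WriteLine(match["key"].Value);  // width (named group)
Console.WriteLine(match[2].Value);      // 640 (numbered group)

// Matches enumerates every match in the subject.
var keys = regex.Matches("width=640, height=480")
                .Select(m => m["key"].Value)
                .ToList();
Console.WriteLine(string.Join(", ", keys)); // width, height

// IsMatch only reports whether the pattern matches at all.
var found = regex.IsMatch("no pairs here");
Console.WriteLine(found);               // False
```

Reusing one PcreRegex instance avoids recompiling the pattern on every call. The static overloads shown in the example usage section are convenient for one-off matches.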

The Span API

PcreRegex objects provide overloads which take a ReadOnlySpan<char> parameter for the following methods:

These methods return a ref struct type when possible, but are otherwise similar to the classic API.
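A minimal sketch of the Span API follows, assuming the ReadOnlySpan<char> overload of Match returns a ref struct whose Value is itself a span (PcreRefMatch, as far as I know). The pattern and subject are invented for the example.

```csharp
using System;
using PCRE;

var regex = new PcreRegex(@"\d+");

// The subject is a span, so a slice of a larger buffer could be passed without copying.
ReadOnlySpan<char> subject = "order 1234 shipped".AsSpan();

// The result is a ref struct: it lives on the stack and cannot be stored in a field,
// captured by a lambda, or used across an await.
var match = regex.Match(subject);

if (match.Success)
{
    ReadOnlySpan<char> digits = match.Value;  // a slice of the subject, not a new string
    Console.WriteLine(digits.ToString());     // 1234
    Console.WriteLine(match.Index);           // 6
}
```

Calling ToString() on the value allocates a string, so keep working with the span if avoiding allocations is the goal.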

The zero-allocation API

This is the fastest matching API the library provides.

Call the CreateMatchBuffer method on a PcreRegex instance to create the necessary data structures up-front, then use the returned match buffer for subsequent match operations. Matching through this buffer allocates no further memory, which reduces GC pressure.

The downside of this approach is that the returned match buffer is neither thread-safe nor reentrant: you cannot start a match operation on a buffer that is already in use, so match operations on a given buffer must be sequential.

It is also counter-productive to allocate a match buffer to perform a single match operation. Use this API if you need to match a pattern against many subject strings.

PcreMatchBuffer objects are disposable (and finalizable in case they're not disposed). They provide an API for matching against ReadOnlySpan<char> subjects.
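The workflow described above could look roughly like the following sketch, which creates one buffer and reuses it for several subjects. The date pattern and the sample lines are assumptions. I am also assuming the buffer exposes a Match method that takes a ReadOnlySpan<char> and returns a result with a Success property.

```csharp
using System;
using PCRE;

// Hypothetical pattern: an ISO-like date filling the whole subject.
var regex = new PcreRegex(@"^\d{4}-\d{2}-\d{2}$");
string[] lines = { "2024-01-31", "not a date", "1999-12-31" };

// Allocate the match data once; "using" disposes the buffer at the end of the scope.
using var buffer = regex.CreateMatchBuffer();

var count = 0;
foreach (var line in lines)
{
    // Matches run one after another on the same buffer, never concurrently.
    if (buffer.Match(line.AsSpan()).Success)
        count++;
}

Console.WriteLine(count); // 2
```

Since the buffer is not thread-safe, a multi-threaded program would need one buffer per thread instead of a shared instance.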

If you're looking for maximum speed, consider using the following options:

The DFA matching API

This API provides regex matching in O(subject length) time. It is accessible through the Dfa property on a PcreRegex instance:

You can read more about its features in the PCRE2 documentation, where it's described as the alternative matching algorithm.
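For illustration, here is a hedged sketch of a DFA match. The pattern and subject are borrowed from the kind of example used to explain the alternative algorithm, which can report several matches starting at the same position. The Success, Count and LongestMatch members are my recollection of the result type and should be checked against the library.

```csharp
using System;
using PCRE;

var regex = new PcreRegex(@"<.*>");

// The DFA algorithm explores all alternatives in parallel, so a single call
// can yield several matches that start at the same position.
var result = regex.Dfa.Match("This is <something> <something else> <something further> no more");

Console.WriteLine(result.Success);
Console.WriteLine(result.Count);               // number of matches reported
Console.WriteLine(result.LongestMatch?.Value); // the longest of those matches
```

The DFA algorithm does not support capturing groups or backreferences, so this API suits cases where only the overall match matters.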

Library highlights

Example usage

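// Extract all words except those within parentheses
var matches = PcreRegex.Matches("(foo) bar (baz) 42", @"\(\w+\)(*SKIP)(*FAIL)|\w+")
                       .Select(m => m.Value)
                       .ToList();
// result: "bar", "42"

// Enclose each run of punctuation characters in angle brackets (.NET-style replacement syntax)
var result = PcreRegex.Replace("hello, world!!!", @"\p{P}+", "<$&>");
// result: "hello<,> world<!!!>"

// The same replacement through the PCRE2 substitution API
var result = PcreRegex.Substitute("hello, world!!!", @"\p{P}+", "<$0>", PcreOptions.None, PcreSubstituteOptions.SubstituteGlobal);
// result: "hello<,> world<!!!>"

// Partial matching: the subject ends before the pattern can match completely
var regex = new PcreRegex(@"(?<=abc)123");
var match = regex.Match("xyzabc12", PcreMatchOptions.PartialSoft);
// result: match.IsPartialMatch == true

// Validate a JSON string using named subpatterns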
const string jsonPattern = """
    (?(DEFINE)
        # An object is an unordered set of name/value pairs.
        (?<object> \{
            (?: (?&keyvalue) (?: , (?&keyvalue) )* )?
        (?&ws) \} )
        (?<keyvalue>
            (?&ws) (?&string) (?&ws) : (?&value)
        )

        # An array is an ordered collection of values.
        (?<array> \[
            (?: (?&value) (?: , (?&value) )* )?
        (?&ws) \] )

        # A value can be a string in double quotes, or a number,
        # or true or false or null, or an object or an array.
        (?<value> (?&ws)
            (?: (?&string) | (?&number) | (?&object) | (?&array) | true | false | null )
        )

        # A string is a sequence of zero or more Unicode characters,
        # wrapped in double quotes, using backslash escapes.
        (?<string>
            " (?: [^"\\\p{Cc}]++ | \\u[0-9A-Fa-f]{4} | \\ ["\\/bfnrt] )* "
            # \p{Cc} matches control characters
        )

        # A number is very much like a C or Java number, except that the octal
        # and hexadecimal formats are not used.
        (?<number>
            -? (?: 0 | [1-9][0-9]* ) (?: \. [0-9]+ )? (?: [Ee] [-+]? [0-9]+ )?
        )

        # Whitespace
        (?<ws> \s*+ )
    )

    \A (?&ws) (?&object) (?&ws) \z
    """;

var regex = new PcreRegex(jsonPattern, PcreOptions.IgnorePatternWhitespace);

const string subject = """
    {
        "hello": "world",
        "numbers": [4, 8, 15, 16, 23, 42],
        "foo": null,
        "bar": -2.42e+17,
        "baz": true
    }
    """;

var isValidJson = regex.IsMatch(subject);
// result: true