Sofia Regex is a lightweight Java API designed to help developers add regular expression search capabilities to their applications.
Sofia Regex enables you to implement regular expression pattern matching using DFA and NFA conversion.
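
To make the NFA-to-DFA idea concrete, here is a minimal sketch of the classic subset construction for the pattern (a|b)*ab. It is not Sofia Regex code: the class name, the hand-built NFA, and every method below are hypothetical and only illustrate the general technique. The sample NFA has no epsilon transitions, so the epsilon-closure step is omitted.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; none of these names come from Sofia Regex.
class SubsetConstructionSketch {
    static final char[] ALPHABET = {'a', 'b'};
    static final int START = 0;
    static final int ACCEPT = 2;

    // NFA for (a|b)*ab: state -> input symbol -> set of possible next states.
    static final Map<Integer, Map<Character, Set<Integer>>> NFA = new HashMap<>();
    static {
        addTransition(0, 'a', 0);
        addTransition(0, 'a', 1); // nondeterministic choice on 'a'
        addTransition(0, 'b', 0);
        addTransition(1, 'b', 2);
    }

    static void addTransition(int from, char symbol, int to) {
        NFA.computeIfAbsent(from, k -> new HashMap<>())
           .computeIfAbsent(symbol, k -> new TreeSet<>())
           .add(to);
    }

    // All NFA states reachable from any state in `states` on `symbol`.
    static Set<Integer> move(Set<Integer> states, char symbol) {
        Set<Integer> next = new TreeSet<>();
        for (int s : states) {
            Map<Character, Set<Integer>> row = NFA.get(s);
            if (row != null && row.containsKey(symbol)) {
                next.addAll(row.get(symbol));
            }
        }
        return next;
    }

    // Each DFA state is a set of NFA states; explore them breadth-first.
    static Map<Set<Integer>, Map<Character, Set<Integer>>> toDfa() {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new LinkedHashMap<>();
        Deque<Set<Integer>> pending = new ArrayDeque<>();
        Set<Integer> start = new TreeSet<>();
        start.add(START);
        pending.add(start);
        while (!pending.isEmpty()) {
            Set<Integer> current = pending.poll();
            if (dfa.containsKey(current)) {
                continue; // already expanded
            }
            Map<Character, Set<Integer>> row = new HashMap<>();
            dfa.put(current, row);
            for (char symbol : ALPHABET) {
                Set<Integer> target = move(current, symbol);
                row.put(symbol, target);
                pending.add(target);
            }
        }
        return dfa;
    }

    // Run the DFA: one current state per input character, no backtracking.
    static boolean accepts(Map<Set<Integer>, Map<Character, Set<Integer>>> dfa, String input) {
        Set<Integer> state = new TreeSet<>();
        state.add(START);
        for (char c : input.toCharArray()) {
            Map<Character, Set<Integer>> row = dfa.get(state);
            if (row == null || !row.containsKey(c)) {
                return false; // symbol outside the alphabet
            }
            state = row.get(c);
        }
        return state.contains(ACCEPT);
    }

    public static void main(String[] args) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = toDfa();
        System.out.println("DFA states: " + dfa.keySet());
        System.out.println("abab accepted: " + accepts(dfa, "abab"));
        System.out.println("aba accepted: " + accepts(dfa, "aba"));
    }
}
```

Once the DFA is built, matching is a single pass over the input, which is the usual reason for converting an NFA in the first place. The trade-off is that the number of DFA states can grow exponentially for some patterns.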







Based on libregex++

Automatically removes comment lines, comments and blank lines

Automatically strips the prefixes, infixes and suffixes of escape sequences

Optionally adds whitespace around matched groups (but not escaped meta characters)

Reassigns matches to the correct group when the original matches were placed in the wrong group

Automatically removes unbalanced (non-escaped) meta characters

Doesn’t require the matched text to be UTF-8 encoded, but you can optionally tell Sofia to ensure that it is

Strips the source text of carriage returns (CR), newlines (NL), tabs (TAB), backslashes (BACKSLASH) and line breaks (LF)

Works on any Java-compatible text

Exposes the matches, the entire input, the groups, and optionally the offsets of the matched patterns (right after the matches)

Supports backreferences, first-match-only replacement, multi-line and single-line modes, and both anchored and non-anchored searches

Supports the use of non-capturing groups

Supports grouping, unbounded repetition and backreferences

Supports unescaped meta characters and ignoring case

Supports escape sequences like \r \t etc.

Supports lookaheads, non-capturing groups, lookbehinds, negative lookahead and negative lookbehind

Supports unescaped asterisk * (wildcard) characters

Supports recursion (reduction) and iteration

Supports Unicode, and its standard block, category, letter, and case properties

Supports UTF-8 and UTF-16 string encodings

Supports Unicode in both the source text and the pattern

Supports ignoring comments

Supports the JVM’s locale and a specific language

Supports the JDK’s properties

Supports Unicode tokens in the pattern

Supports Unicode literals

Supports Java 6 and Java 7

Supports Unicode symbolic properties

Supports case-insensitive matching

Supports conversion to a DFA

Supports keeping backtracking enabled

Supports backreferences

Supports groups

Supports UTF-8
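
Several of the features listed above (backreferences, non-capturing groups, lookaheads, Unicode properties, case-insensitive matching, match offsets) can be illustrated with a short sketch. Because this text does not show Sofia Regex's own calls for them, the sketch uses the JDK's standard java.util.regex package as a stand-in. The class and method names are hypothetical, and Sofia's actual syntax and API may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, not the Sofia Regex API.
class RegexFeatureSketch {

    // Backreference: \1 must repeat whatever group 1 captured.
    static boolean hasRepeatedWord(String text) {
        return Pattern.compile("\\b(\\w+)\\s+\\1\\b").matcher(text).find();
    }

    // Positive lookahead with a non-capturing group: matches the digits
    // only when a unit follows, without consuming the unit itself.
    static String amountBeforeUnit(String text) {
        Matcher m = Pattern.compile("\\d+(?=\\s?(?:kg|g)\\b)").matcher(text);
        return m.find() ? m.group() : null;
    }

    // Unicode category property: \p{Lu} matches any uppercase letter.
    static boolean startsWithUppercase(String text) {
        return Pattern.compile("^\\p{Lu}").matcher(text).find();
    }

    // Case-insensitive matching with Unicode-aware case folding.
    static boolean matchesIgnoringCase(String regex, String text) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        return Pattern.compile(regex, flags).matcher(text).matches();
    }

    // First match together with its start and end offsets.
    static String describeFirstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.find()) {
            return "no match";
        }
        return m.group() + "@" + m.start() + "-" + m.end();
    }

    public static void main(String[] args) {
        System.out.println(hasRepeatedWord("the the cat"));
        System.out.println(amountBeforeUnit("weighs 25 kg"));
        System.out.println(startsWithUppercase("\u00C9galit\u00E9"));
        System.out.println(matchesIgnoringCase("sofia", "SOFIA"));
        System.out.println(describeFirstMatch("a+", "xxaay"));
    }
}
```

Note that backreferences and lookarounds generally require a backtracking matcher; a pure DFA cannot express them, which is presumably why the list mentions both DFA conversion and backtracking.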

What’s New In Sofia Regex?

The Sofia Regex library contains several utility classes for the Java programming language.

How To Use:

You can use this API by providing a string or a char array.

1. Convert a string to a Regex object using Sofia.
You can then perform the following operations:
1. Search: Search a pattern in a string.
2. Replace: Replace a pattern in a string.
3. Split: Split a string into a collection of strings.
4. Matches: Checks if a string matches a given pattern.
5. UPPER: converts the given string to upper case.
6. LOWER: converts the given string to lower case.
7. MATCHES: checks if a string contains a pattern.
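
For comparison, the operations above can be sketched with the JDK's built-in java.util.regex package. This is not the Sofia Regex API: the class and method names below are hypothetical and only show what each operation is expected to do, including the difference between a full match (item 4) and a "contains" check (item 7).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Stand-in sketch using java.util.regex, not the Sofia Regex API.
class RegexOperationsSketch {

    // Search: index of the first match, or -1 if the pattern is not found.
    static int search(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.start() : -1;
    }

    // Replace: substitutes every match. In the replacement string,
    // '$' and '\' have special meaning to the JDK's replaceAll.
    static String replace(String text, String regex, String replacement) {
        return Pattern.compile(regex).matcher(text).replaceAll(replacement);
    }

    // Split: cuts the text at every match of the pattern.
    static List<String> split(String text, String regex) {
        return Arrays.asList(Pattern.compile(regex).split(text));
    }

    // Full match: the whole string must match the pattern.
    static boolean matchesFully(String text, String regex) {
        return Pattern.matches(regex, text);
    }

    // Contains: some substring must match the pattern.
    static boolean contains(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(search("xxaay", "a+"));
        System.out.println(replace("aXaa", "a+", "-"));
        System.out.println(split("a,b;c", "[,;]"));
        System.out.println(matchesFully("aA", "a+"));
        System.out.println(contains("aA", "a+"));
        // Upper and lower case conversion need no regex at all.
        System.out.println("Sofia".toUpperCase(Locale.ROOT));
        System.out.println("Sofia".toLowerCase(Locale.ROOT));
    }
}
```

Passing an explicit Locale to the case conversions avoids surprises on JVMs whose default locale has special casing rules.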

Sample Code:

The following sample code searches for a pattern in the given string.

public class Main {

    public static void main(String[] args) {

        // Create a regular expression
        String pattern = "a+";

        // Create a Sofia object
        Sofia.Sofia sofia = new Sofia.Sofia();

        // Search for the pattern in a string
        sofia.Search("aA", pattern);
    }
}



java Main


Searching for “a+” in a

These results are very similar to those of Perl.



You are free to use, modify and redistribute this library under the terms of the LGPL.


Copyright © 2008-2010, Giorgio Antonioli

The Sofia Regex library is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
GNU Less

