
ICU Unicode String Comparison

This demo application illustrates several of the string comparison functions available in the ICU Unicode support library.

Enter two strings to be compared, then click the "Submit" button. Each result column is explained under Key to Results. For more information, see Background.

*Strings to be Compared*
	String 1	String 2
Enter strings, then submit	αλφα	αλφα
Strings after unescaping	αλφα	αλφα
Strings in hex format	\u03b1 \u03bb \u03c6 \u03b1	\u03b1 \u03bb \u03c6 \u03b1

*Comparison Result*
Binary	Caseless	Equiv	Equiv-Caseless
Y	Y	Y	Y

*Key to Results*
Result	Meaning
Binary	Strings consist of exactly the same sequence of Unicode code points
Caseless	Strings are equal when case differences are ignored. Example: αλφα and Αλφα
Equiv	Strings are canonically equivalent; thus equal after normalization. Examples: Åland and Åland, or \u062f\u0650\u0651 and \u062f\u0651\u0650. Note that Åland and Åland are also equal in a caseless match because they both case-fold to the same string.
Equiv-Caseless	Strings are canonically equivalent, case insensitive. Examples: åland and Åland
Hex	Display all input characters as hex values.
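
The four result columns can be approximated outside ICU. The following is a minimal sketch using only Python's standard library (`str.casefold` and `unicodedata.normalize`), not the ICU functions this demo is built on. The function names are hypothetical, and the escaped test strings assume the Åland example pairs a precomposed U+00C5 with A followed by U+030A COMBINING RING ABOVE.

```python
import unicodedata


def binary_equal(a: str, b: str) -> bool:
    # Binary: identical sequences of code points.
    return a == b


def caseless_equal(a: str, b: str) -> bool:
    # Caseless: equal after full Unicode case folding, no normalization.
    return a.casefold() == b.casefold()


def equiv_equal(a: str, b: str) -> bool:
    # Equiv: canonically equivalent, i.e. equal after NFD normalization.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def _canonical_caseless_key(s: str) -> str:
    # NFD, then case fold, then NFD again, because folding can
    # produce text that is no longer in normalized form.
    folded = unicodedata.normalize("NFD", s).casefold()
    return unicodedata.normalize("NFD", folded)


def equiv_caseless_equal(a: str, b: str) -> bool:
    # Equiv-Caseless: canonically equivalent when case is ignored.
    return _canonical_caseless_key(a) == _canonical_caseless_key(b)


def compare(a: str, b: str) -> dict:
    # Mirrors the columns of the Comparison Result table.
    return {
        "Binary": binary_equal(a, b),
        "Caseless": caseless_equal(a, b),
        "Equiv": equiv_equal(a, b),
        "Equiv-Caseless": equiv_caseless_equal(a, b),
    }


if __name__ == "__main__":
    for name, result in compare("\u00c5land", "A\u030aland").items():
        print(name, "Y" if result else "N")
```

Under these assumptions the precomposed and decomposed spellings differ in the Binary and Caseless columns but match in Equiv and Equiv-Caseless, since only the last two normalize before comparing. Results from ICU itself may differ in edge cases, for example if the two libraries implement different Unicode versions.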

Background

The above comparisons use only Unicode properties, and are invariant across locales. To compare two strings according to locale settings, see ICU Collation Demo.
The strings may contain \uhhhh and \Uhhhhhhhh hex escapes for characters that cannot be entered directly from the keyboard. For descriptions of additional escape sequences, see UnicodeString::unescape() in the ICU API reference.
This tool is built using ICU's string comparison functions and shows the effects of the options for Unicode normalization (canonical equivalence) and case-insensitive comparison.
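
To illustrate the \uhhhh and \Uhhhhhhhh escapes and the "Strings in hex format" row, here is a rough Python sketch. It is not UnicodeString::unescape(): it handles only these two escape forms, and the function names are hypothetical. Unlike UTF-16 based code, it leaves a pair of \u surrogate escapes as two separate code points rather than joining them.

```python
import re

# Matches \uhhhh (4 hex digits) or \Uhhhhhhhh (8 hex digits).
_HEX_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_hex(text: str) -> str:
    """Replace \\uhhhh and \\Uhhhhhhhh escapes with the characters they name."""
    def replace(match: re.Match) -> str:
        digits = match.group(1) or match.group(2)
        # chr() raises ValueError for values above U+10FFFF.
        return chr(int(digits, 16))
    return _HEX_ESCAPE.sub(replace, text)


def to_hex(text: str) -> str:
    """Show each code point as a space-separated \\uhhhh or \\Uhhhhhhhh escape."""
    parts = []
    for ch in text:
        cp = ord(ch)
        parts.append(f"\\u{cp:04x}" if cp <= 0xFFFF else f"\\U{cp:08x}")
    return " ".join(parts)


if __name__ == "__main__":
    s = unescape_hex(r"\u03b1\u03bb\u03c6\u03b1")
    print(s)
    print(to_hex(s))
```

Text without escapes passes through `unescape_hex` unchanged, so the same function can be applied to every input before comparison.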
