International Components for Unicode


IDNA Demo


Results of Operation
| Mode | Text | Code Points |
|---|---|---|
| Input | തോട്ടിങ്ങല്‍ | 0D24 0D4B 0D1F 0D4D 0D1F 0D3F 0D19 0D4D 0D19 0D32 0D4D 200D |
| ToASCII(input) | xn--fwcaqax2g2d7dtadc | 0078 006E 002D 002D 0066 0077 0063 0061 0071 0061 0078 0032 0067 0032 0064 0037 0064 0074 0061 0064 0063 |
| ToUnicode(ToASCII(input)) | തോട്ടിങ്ങല് | 0D24 0D4B 0D1F 0D4D 0D1F 0D3F 0D19 0D4D 0D19 0D32 0D4D |
| ToUnicode(input) | തോട്ടിങ്ങല്‍ | 0D24 0D4B 0D1F 0D4D 0D1F 0D3F 0D19 0D4D 0D19 0D32 0D4D 200D |
| ToASCII(ToUnicode(input)) | xn--fwcaqax2g2d7dtadc | 0078 006E 002D 002D 0066 0077 0063 0061 0071 0061 0078 0032 0067 0032 0064 0037 0064 0074 0061 0064 0063 |
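
The five rows can be approximated outside the demo. The sketch below is not the demo's ICU code. It uses the IDNA2003 helpers in Python's standard-library `encodings.idna` module, on the assumption that they behave like the ICU implementation for this input. The helper names `to_ascii`, `to_unicode`, `code_points`, and `results_of_operation` are hypothetical. Python's `ToUnicode` raises an error where the demo returns the input unchanged, so the wrapper falls back to the input to match the ToUnicode(input) row.

```python
import encodings.idna as idna2003  # standard-library IDNA2003 helpers


def to_ascii(label: str) -> str:
    """ToASCII for a single label; raises UnicodeError if the label is invalid."""
    return idna2003.ToASCII(label).decode("ascii")


def to_unicode(label: str) -> str:
    """ToUnicode for a single label; on failure, return the input unchanged."""
    try:
        return idna2003.ToUnicode(label)
    except UnicodeError:
        return label


def code_points(text: str) -> str:
    """Space-separated, four-digit hex code points, as in the table."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def results_of_operation(label: str) -> dict:
    """Apply the same five combinations that the table lists."""
    return {
        "Input": label,
        "ToASCII(input)": to_ascii(label),
        "ToUnicode(ToASCII(input))": to_unicode(to_ascii(label)),
        "ToUnicode(input)": to_unicode(label),
        "ToASCII(ToUnicode(input))": to_ascii(to_unicode(label)),
    }


# The input from the table, ending in U+200D ZERO WIDTH JOINER.
label = "\u0D24\u0D4B\u0D1F\u0D4D\u0D1F\u0D3F\u0D19\u0D4D\u0D19\u0D32\u0D4D\u200D"
rows = results_of_operation(label)
for mode, text in rows.items():
    print(f"{mode}: {code_points(text)}")
```

The trailing U+200D disappears in the ToUnicode(ToASCII(input)) row because Nameprep maps it to nothing before Punycode encoding, so the round trip does not restore the original string.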

About this demo

This CGI program demonstrates the IDNA implementation. The IDNA RFC (RFC 3490) defines two operations: ToASCII and ToUnicode. Domain labels containing non-ASCII code points must be processed by the ToASCII operation before they are passed to resolver libraries. Domain names obtained from resolver libraries must be processed by the ToUnicode operation before they are displayed to the user. IDNA requires that implementations process input strings with Nameprep, which is a profile of Stringprep, and then with Punycode.
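
The order of the two processing steps can be made explicit with a minimal sketch. It uses `nameprep` from Python's standard-library `encodings.idna` module and the built-in `punycode` codec. It is an illustration under those assumptions, not the ICU implementation. The function name `to_ascii_steps` is hypothetical, and the sketch omits the label-length and existing-prefix checks that a complete ToASCII performs.

```python
from encodings.idna import nameprep  # Nameprep profile of Stringprep (stdlib)

ACE_PREFIX = "xn--"  # prefix that marks a Punycode-encoded label


def to_ascii_steps(label: str) -> str:
    """Simplified ToASCII for one label: Nameprep first, then Punycode.

    Omits the length check and the check for an existing ACE prefix.
    """
    if label.isascii():
        return label  # all-ASCII labels skip Nameprep and Punycode
    prepped = nameprep(label)  # map, NFKC-normalize, reject prohibited code points
    if prepped.isascii():
        return prepped  # Nameprep alone produced an ASCII label
    return ACE_PREFIX + prepped.encode("punycode").decode("ascii")


# The capital B is case-folded by Nameprep before Punycode encoding.
print(to_ascii_steps("B\u00FCcher"))
```

For this label the result agrees with Python's built-in `idna` codec, which applies the same two steps to each label of a domain name.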

In the above demo, different combinations of ToASCII and ToUnicode are applied to the input. It also provides a simple illustration of how a GUI can visually indicate boundaries between different scripts, to help avoid spoofing. The code is rough, and only meant for illustration. One could certainly refine this to call out more characters that are visually confusable. For example, many CJK Radicals are identical in appearance to CJK Ideographs. Mixtures of simplified and traditional characters can also be visually highlighted, to help signal possible user errors.
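
Script-boundary marking of this kind can be sketched as follows. This is not the demo's code. Python's standard library does not expose the Unicode Script property, so the sketch assumes that the first word of a character's Unicode name (for example LATIN or CYRILLIC) is a good enough stand-in for letters. The names `rough_script`, `script_runs`, and `mark_boundaries` are hypothetical.

```python
import unicodedata


def rough_script(ch: str):
    """Approximate a letter's script by the first word of its Unicode name.

    Non-letters (digits, marks, punctuation) return None so that they
    join whichever run they appear in.
    """
    if not unicodedata.category(ch).startswith("L"):
        return None
    return unicodedata.name(ch, "UNKNOWN").split()[0]


def script_runs(text: str):
    """Split text into (script, run) pairs wherever the rough script changes."""
    runs = []
    for ch in text:
        script = rough_script(ch)
        if runs and (script is None or script == runs[-1][0]):
            runs[-1] = (runs[-1][0], runs[-1][1] + ch)  # extend the current run
        else:
            runs.append((script, ch))  # a new script starts here
    return runs


def mark_boundaries(text: str, sep: str = "|") -> str:
    """Insert a visible separator at every script boundary."""
    return sep.join(run for _, run in script_runs(text))


# "paypal" with a Cyrillic letter (U+0430) in place of the first Latin "a".
suspicious = "p\u0430ypal"
print([script for script, _ in script_runs(suspicious)])
```

A label that yields more than one run is a candidate for visual highlighting. A real implementation would use the Script property and a confusables list rather than character names.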


Examples
You can either paste Unicode text into the box above or use Unicode escapes. For example, you can enter "ä" directly, write it as "\u00E4", or use the decomposition "a\u0308". You can also copy some interesting Unicode text samples from the following pages:
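
The escape notation and the equivalence of the two spellings of "ä" can be illustrated with a short sketch. The `unescape` helper is hypothetical and handles only four-digit `\uXXXX` escapes. How the demo itself parses escapes is not stated here. The ToASCII step uses Python's built-in `idna` codec (IDNA2003) as a stand-in for the demo.

```python
import re
import unicodedata

_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def unescape(text: str) -> str:
    """Replace each \\uXXXX escape with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


precomposed = unescape(r"\u00E4")  # one code point: U+00E4
decomposed = unescape(r"a\u0308")  # two code points: U+0061 U+0308

# The raw strings differ, but they are canonically equivalent.
same_after_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)

# Nameprep normalizes with NFKC, so both spellings give the same ASCII label.
same_ascii_label = precomposed.encode("idna") == decomposed.encode("idna")

print(same_after_nfc, same_ascii_label)
```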

Unicode version used by IDNA: 3.2. Powered by ICU 74.1.