Unicode in Embedded Wizard

Question

We have to convert Unicode text entered via the keyboard to other character sets such as Shift JIS. For that, I would like to know which Unicode encodings are supported by Embedded Wizard:

UCS-2
UTF-16BE BOM
UTF-16LE BOM

This is the quote from the Embedded Wizard documentation:

Embedded Wizard supports character codes (code points) from UNICODE plane 0. This includes Arabic and Hebrew scripts. Also bidirectional text (BIDI) output with left-to-right (LTR) and right-to-left (RTL) text writing direction, as well as combinations of them according to the Unicode Standard Annex #9 Unicode version 10.0.0 are supported.

Paul Banach · Answer 1 · 2022-12-21T13:05:01+0000

Hello anas.n,

EW limits its support to Unicode plane 0, that is, code points 0x0000 ... 0xFFFF. All code points (character codes) are stored in EW with a fixed width of 16 bits (2 bytes) per character (see also the char and string type descriptions). In other words, each code point is stored as an individual 16-bit entity.

Concerning UCS-2: as far as I can tell from the related specification, the format uses 2 bytes per code point, which corresponds to the EW coding. UCS-2 code points also map to the Unicode Basic Multilingual Plane, which is Unicode plane 0. I would therefore say that UCS-2 encoded characters and strings could be exchanged directly with EW without any conversion. However, since we have not tested this, we cannot guarantee it.
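As a rough illustration of the UCS-2 case, the following C sketch unpacks a UCS-2 byte buffer into independent 16-bit code units, which is the storage layout described above. The function name `ucs2_bytes_to_units` is hypothetical and not part of any Embedded Wizard API. The BOM handling, and the fallback to big-endian when no BOM is present, are assumptions of this sketch. How the resulting units are handed over to EW is not shown here and would have to be verified on the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an Embedded Wizard function.
   Decodes a UCS-2 byte buffer into 16-bit code units.
   A leading BOM (FE FF = big-endian, FF FE = little-endian) selects the
   byte order and is dropped. Without a BOM, big-endian is assumed
   (an assumption of this sketch).
   Returns the number of code units written to out. */
size_t ucs2_bytes_to_units(const uint8_t *in, size_t in_len,
                           uint16_t *out, size_t out_cap)
{
    int little_endian = 0;
    size_t i = 0;
    size_t n = 0;

    if (in_len >= 2) {
        if (in[0] == 0xFE && in[1] == 0xFF) {
            i = 2;                      /* big-endian BOM */
        } else if (in[0] == 0xFF && in[1] == 0xFE) {
            little_endian = 1;          /* little-endian BOM */
            i = 2;
        }
    }

    /* Every code point occupies exactly two bytes.
       A trailing odd byte is ignored. */
    for (; i + 1 < in_len && n < out_cap; i += 2) {
        out[n++] = little_endian
            ? (uint16_t)(in[i] | (in[i + 1] << 8))
            : (uint16_t)((in[i] << 8) | in[i + 1]);
    }
    return n;
}
```

For example, the little-endian bytes FF FE 42 30 41 00 would yield the two code units 0x3042 and 0x0041.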

Concerning UTF-16: it is not supported. As explained above, EW supports only the lower 65536 Unicode code points, stored as independent 16-bit entities. If you plan to exchange strings encoded in UTF-16, you would need to decode/encode them in advance. Please note that UTF-16 can address code points beyond the first 65536 characters; such code points are not supported by EW. Therefore, from the Embedded Wizard point of view, there is no practical use for UTF-16 encoded strings and no support for them.
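One possible way to do that decoding step in advance is sketched below in C. UTF-16 encodes code points beyond plane 0 as surrogate pairs (a unit in 0xD800..0xDBFF followed by a unit in 0xDC00..0xDFFF), while all other units are already plane 0 code points. The sketch copies plane 0 units unchanged and substitutes U+FFFD for anything else. The function name `utf16_to_bmp` and the choice of U+FFFD as the substitute are assumptions made for illustration, not Embedded Wizard behavior.

```c
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu   /* substitute chosen for this sketch */

/* Hypothetical helper, not an Embedded Wizard function.
   Copies UTF-16 code units to out, keeping only plane 0 code points.
   Each surrogate pair (a code point beyond plane 0) and each unpaired
   surrogate is replaced by a single U+FFFD.
   Returns the number of units written; *replaced (if not NULL) receives
   the number of substitutions made. */
size_t utf16_to_bmp(const uint16_t *in, size_t in_len,
                    uint16_t *out, size_t out_cap, size_t *replaced)
{
    size_t n = 0;
    size_t count = 0;

    for (size_t i = 0; i < in_len && n < out_cap; i++) {
        uint16_t u = in[i];

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High surrogate: also consume the low surrogate if present. */
            if (i + 1 < in_len && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                i++;
            }
            out[n++] = REPLACEMENT_CHAR;
            count++;
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            /* Unpaired low surrogate. */
            out[n++] = REPLACEMENT_CHAR;
            count++;
        } else {
            out[n++] = u;               /* already a plane 0 code point */
        }
    }

    if (replaced != NULL) {
        *replaced = count;
    }
    return n;
}
```

A non-zero substitution count tells the caller that the input contained characters outside plane 0, so the application can decide whether to reject the string instead of displaying substitutes.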

Also important: support for the above-mentioned character codes from Unicode plane 0 does not mean that text rendering will work for all of the included languages. Specifically, scripts that require so-called complex text layout (e.g. Devanagari or Thai) are not supported by Embedded Wizard, or are supported only to a limited extent.

I hope it helps you further.

Best regards

Paul Banach
