ListOfLists.com Internet Directory

Character Encoding

Surf these sites:
Basis Technology - Encoding and Language Identifier -- Basis Technology's Encoding and Language Identifier is a high-performance engine for determining the encoding and language of unspecified text. It can identify over 80 language/encoding pairs, including European and Asian languages/encodings.
Character Sets -- These are the official names for character sets that may be used in the Internet and may be referred to in Internet documentation.
Character code issues -- A tutorial on character code issues in digital processing and transfer of text data (on the Internet or otherwise).
Character sets and UNICODE -- The author explains what character encoding is. With the growth of text-based computing, users now need computers to recognise and manipulate text in different languages, and the code space provided by the 8-bit byte is no longer sufficient. The author presents UNICODE as a unified 16-bit encoding scheme that lets systems exchange information unambiguously.
Codepage & Co. -- An overview of the history and current status of the various code-pages being used in the PC environment.
Cyrillic Charset Soup -- Detailed overview of all Cyrillic character-set encodings, ranging from GOST-13052 to Unicode.
HTML Document Representation -- Chapter covering document character sets and encodings in HTML from the World Wide Web Consortium's HTML 4.0 Specification.
HTML Validation: Using Character Encodings -- How to validate HTML documents in various character encodings.
ISO 646 (Good old ASCII) -- An overview of the 7-bit ASCII (ISO-646) standard, a comparison with EBCDIC, international additions, and how it ultimately led to the development of the 8-bit ISO-8859-x and 16-bit Unicode (ISO-10646) standards.
ISO 8859 Alphabet Soup -- A commented graphical overview of the ISO 8859 character sets by Roman Czyborra.
MS Windows characters in HTML -- A review of the HTML authoring problems caused by some special characters that belong to the MS Windows character set but not to ISO Latin 1.
OII - Character Set Standards -- This section of the OII Standards and Specifications List provides information on character sets that can be used for data interchange.
Russian encoding options -- Brief tutorial on Russian encodings, code pages, fonts, and the alphabet, with tables.
Standard ECMA-35 -- Character Code Structure and Extension Techniques
Tutorial: Shady Characters -- A tutorial that explains HTML character sets, character encodings and character references.
UTF-8 Sampler -- UTF-8 is an ASCII-preserving encoding method for Unicode (ISO 10646), the Universal Character Set (UCS). The UCS encodes most of the world's writing systems in a single character set, allowing you to mix languages within a document without needing any tricks for switching character sets. This web page is encoded directly in UTF-8.
Uniconv (Free Demonstration Copy) -- Converts text documents to any supported encoding. It can also transform text within a document; for example, it can convert all uppercase letters to lowercase.
World Wide Web Consortium -- Internationalization / Localization: Character sets
