Chinese characters

From Citizendium
This article is developing and not approved.
'Citizendium' in Chinese characters. This writing system allows characters to be written top to bottom as well as left to right.

Chinese characters (simplified Chinese: 汉字; traditional Chinese: 漢字; hànzì in Mandarin) are symbols used to write varieties of Chinese and - in modified form - other languages, the writing system having spread to Korea, Japan and Vietnam. They are the world's oldest writing system in the sense that they have the longest record of continuous use, dating back thousands of years. Today, they are mostly used in mainland China, Hong Kong, Macau, Taiwan, Singapore, other Chinese communities globally, and in Japan and South Korea. Vietnamese is no longer written in characters, and their use has been abolished in North Korea. Characters in mainland China and Singapore are written in a 'simplified' form, whereas 'traditional' characters are maintained in Taiwan, Hong Kong and Macau. A full list of characters would run to over 47,000 entries, but most of these are variants or obsolete; standardisation took centuries, and most literate users today know up to about 4,000. Characters can be written vertically, in columns from right to left, but it is increasingly common to see them written horizontally, left to right (newspapers take advantage of this to display articles both vertically and horizontally on the same page).


How best to describe the system of Chinese characters has been much debated among scholars. It is generally agreed that they are not pictograms, which represent something directly, as a drawing of the object itself. Like most if not all writing systems, characters first developed in this way, but the system subsequently became far more abstract as its use extended beyond immediate referents such as the sun, hunted animals and so on. A few characters, such as 山 'mountain', do somewhat resemble what they represent, but only a small number are like this, and it is usually difficult to recognise the meaning from the shape alone. Most characters are completely abstract, or what they mean is only obvious with hindsight: 魚,[1] for example.

Four characters with Mandarin pronunciations. The character for 'horse' is used as a component of the other three to show their exact or approximate pronunciations; they have nothing to do with horses. 媽, for instance, with a left-hand 'radical' 女 which alone means 'female', does not literally mean 'female horse'; the right-hand 'phonetic' simply indicates that the pronunciation is the same as 馬.

Likewise, Chinese characters are not ideograms, which are symbols that represent ideas directly. This myth partly arises from the observation that certain characters, particularly some of the more frequent ones, represent words without indicating pronunciation at all: for example, 日 means 'sun', and is pronounced differently depending on the language or variety of Chinese - rì in Mandarin, jat6[2] in Cantonese, hi in Japanese. However, the vast majority of characters include a pronunciation element, which gives an idea of the 'reading' (pronunciation) of the character, and each character represents one syllable.[3] For example, Mandarin 机 'machine', pronounced jī, incorporates a 'radical' 木 ('tree'; 'wood') which may give an idea of the meaning, and a 'phonetic' 几 which indicates the pronunciation. The phonetic represents the pronunciation of another character, whose own meaning is irrelevant (in this case, 几 alone means '[small] table').[4] In the same way, a children's code might use a picture of an eye to represent 'I'. The meaning of the character is not the sum of its components; 机 does not mean 'wood[en] table' in Mandarin, for example, which could be written 木制桌子 and pronounced mùzhì zhuōzi. It does mean 'table' or 'desk' in Japanese kanji, though (pronounced tsukue), as characters often change their meanings over time and in transitions between cultures. Similarly, pronunciation has changed over the centuries, so that perhaps 30-40% of characters have a 'phonetic' that no longer provides a good approximation of the reading: for example, 王 wáng (Mandarin) 'king' only very roughly indicates the pronunciation of 聖 shèng 'sacred', and is no help at all for 玉 yù 'jade'.[5] Nevertheless, the heavy use of 'phonetic' elements in Chinese writing has allowed scholars to glean some idea of how Chinese varieties were spoken as long ago as the seventh century.[6]
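The radical-plus-phonetic structure described above can be modelled as a small data structure. The following is only an illustrative sketch in Python: the class name `Compound`, the helper names and the tone-stripping approach are assumptions made for this example, and the readings mā, mǎ and jī are standard Mandarin readings supplied here rather than taken from a cited source.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """A character analysed as a meaning hint plus a sound hint (hypothetical model)."""
    char: str              # the whole character
    radical: str           # component hinting at the meaning
    phonetic: str          # component hinting at the pronunciation
    reading: str           # Mandarin reading of the whole character (pinyin)
    phonetic_reading: str  # Mandarin reading of the phonetic on its own


def strip_tone(pinyin: str) -> str:
    """Remove tone marks by dropping combining accents.

    Caveat: this also turns 'ü' into 'u', which is acceptable for this
    sketch but not for real pinyin processing.
    """
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def phonetic_still_helps(c: Compound) -> bool:
    """True if the phonetic gives the same syllable as the character, ignoring tone."""
    return strip_tone(c.reading) == strip_tone(c.phonetic_reading)


MOTHER = Compound("媽", radical="女", phonetic="馬", reading="mā", phonetic_reading="mǎ")
MACHINE = Compound("机", radical="木", phonetic="几", reading="jī", phonetic_reading="jī")
SACRED = Compound("聖", radical="", phonetic="王", reading="shèng", phonetic_reading="wáng")

for c in (MOTHER, MACHINE, SACRED):
    print(c.char, phonetic_still_helps(c))
```

Under these assumptions the first two characters report a helpful phonetic, while 聖 does not, mirroring the point that sound change has eroded many phonetic hints. The radical of 聖 is left empty because the text does not name it.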


One way of referring to the Chinese writing system is to argue that it is 'logographic' - i.e. one that uses pictorial symbols to represent words. This is probably true for the 3-10% of characters that have no pronunciation component at all, such as the numbers 一 'one', 二 'two', 三 'three' and so on, or whose phonetic components are obscure. Having said that, the same is true for the Arabic numerals 1, 2, 3... employed worldwide.[7]

The idea that Chinese is 'logographic' is misleading, both because of the pronunciation components of most characters and because of the traditional view that in Chinese the difference between a 'word' and smaller units of linguistic meaning, 'morphemes',[8] is not clear. Traditional Chinese dictionaries list only single characters, giving the false impression that Chinese varieties are 'monosyllabic' (i.e. words consist of one syllable); furthermore, the notion of a 'word' was not widely employed in Chinese linguistics until the twentieth century, and English 'word' is not easily translatable.[9] In fact, the majority of words in, for example, Mandarin are polysyllabic,[10] and what appear to be many homophonous words, such as the dozens of characters that share a single reading in Mandarin, are in fact separately listed morphemes of polysyllabic words, which in most cases have no independent existence as words.[11]

An example of the problem of what constitutes a written word in Chinese, extensively discussed in the literature,[12] is 蝴蝶 húdié[13] (Mandarin) 'butterfly'. This word is written with two characters, both of which when presented as isolated dictionary entries are translated as 'butterfly'.[14] The leftmost component of each character is the radical (虫, indicating that the character is probably something to do with insects or other creepy-crawlies), and with a little imagination it is possible to discern the original pictograph. However, evidence that 'butterfly' was ever consistently rendered as 蝴 or 蝶 alone is lacking,[15] and nowadays the two are effectively meaningless when apart; húdié is arguably a single morpheme and a single word in Mandarin,[16] though the rules are different in other languages: 蝶 choo means 'butterfly' in Japanese, a word comprising a single syllable and therefore requiring only one character.

Characters as a 'morphemic syllabary'

A more accurate way of describing the nature of characters would refer to both the meaning and pronunciation elements found in most characters. A linguistic approach might identify most characters as 'morphosyllabic' - morphemic, in that they represent basic units of meaning (morphemes), and syllabic, in that in Chinese most characters represent a single syllable, an abstract unit of phonology;[17] Chinese characters effectively constitute a morphemic syllabary, or script based on syllables - albeit one that contains around 850 'phonetics' to represent the 1,277 possible syllables of e.g. modern Mandarin, and thus requires an extra meaning component to distinguish them.[18]

Characters in other languages

Chinese characters do not represent 'thoughts on paper' divorced from language, and cannot easily be co-opted to write any language, because they developed to represent the syllables or words of Chinese, a set of 'isolating' languages in which each word has few affixes (such as word endings). Japanese, which is an 'agglutinating' language with many affixes, requires extra symbols to write grammatical particles which have no counterpart in varieties of Chinese - languages to which Japanese, Korean and Vietnamese are all unrelated. These languages also developed many of their own characters to express ideas differently from Chinese culture or to indicate local pronunciations.

Japanese kanji

For more information, see: Kanji.

Kanji (漢字, literally 'Chinese characters') are Chinese-derived characters used to write some elements of the Japanese language; some of them were invented in Japan or Korea, so are not Chinese in origin. Kanji are also not used in exactly the same way as traditional or simplified Chinese characters used to write modern Mandarin or other varieties of Chinese, though many characters do have similar or the same meanings. Japanese also makes use of fewer characters than Chinese, so many characters have multiple readings.

Kanji have a long history in Japan, emerging perhaps by the fifth century AD, but initially their use was restricted to the work of highly literate elites who brought the characters from China, often via Korea. Today, there are 1,945 'official' kanji (常用漢字 jooyoo kanji) sanctioned by the Japanese government for learning in schools, and another 983 official characters mainly used in people's names (人名用漢字 jinmeeyoo kanji), but there are also many others that are outside these lists.

Early kanji were borrowed alongside large numbers of Chinese words, so most kanji have at least two readings. One is derived from the Chinese lexicon (音読み on'yomi) - often from up to about 1,500 years ago, and filtered through Japanese phonology - while the other is a native Japanese reading (訓読み kun'yomi). As Chinese and Japanese are unrelated in syntactic, phonological and other grammatical terms, these two readings are very different. For example, the character 口 'mouth' can be read as KOO[19] in the Chinese reading and as kuchi in the Japanese reading. Often, the Chinese reading is used in compounds such as 人口 (jinkoo 'population') while the Japanese form is used when the character stands alone.
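The tendency just described (Chinese-derived reading in compounds, native reading when the character stands alone) can be sketched as a simple lookup. This is a rough heuristic for illustration only, not a rule of Japanese: many words break it. The table `READINGS` and the function `guess_reading` are hypothetical names, and the native reading hito for 人 is supplied here rather than taken from the text.

```python
# Chinese-derived (on) readings are capitalised, following the
# roomaji convention used in this article; native (kun) readings are not.
READINGS = {
    "口": {"on": "KOO", "kun": "kuchi"},
    "人": {"on": "JIN", "kun": "hito"},  # 'hito' is an addition for this example
}


def guess_reading(word: str) -> str:
    """Guess a reading: kun for a lone character, on readings joined for a compound.

    This encodes only the 'often' tendency described in the text; real words
    frequently deviate from it.
    """
    if len(word) == 1:
        return READINGS[word]["kun"]
    return "".join(READINGS[ch]["on"].lower() for ch in word)


print(guess_reading("口"))    # kuchi
print(guess_reading("人口"))  # jinkoo
```

A character missing from the table raises `KeyError`, which seems preferable in a sketch to silently guessing a reading.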

Korean hanja

Vietnamese han tu


  1. 'Fish'.
  2. The number indicates one of several tones; in this case, low-level.
  3. DeFrancis (1984: 181, 187).
  4. DeFrancis (1984: 96, 128-129).
  5. DeFrancis (1984: 102, 104, 110); Chao (1976: 92).
  6. DeFrancis (1984: 105).
  7. DeFrancis (1984: 86, 96, 129, 186).
  8. A morpheme may be a word or part of one; e.g. English cats has two morphemes, cat and plural -s.
  9. Starosta et al. (1998: 350).
  10. Duanmu (2000: 146); DeFrancis (1984: 177-188).
  11. Kennedy (1964a: 116-117); DeFrancis (1984: 183-184).
  12. Kennedy (1964b).
  13. This is the formal, standard Mandarin reading. In colloquial spoken Mandarin, hùdiǎr is more likely. See DeFrancis (1984: 180).
  14. e.g. Mathews (1945: nos. 2174, 6321).
  15. Kennedy (1964b) and DeFrancis (1984: 180-181) both refer to the famous fourth century BC 'butterfly dream' poem by Zhuangzi (莊子), pointing to 'butterfly' written with two characters, 胡蝶. 胡 is read hú as well; the character 蝴 incorporates the 'insect' radical 虫, showing that the phonetic element is the earlier form.
  16. DeFrancis (1984: 180-184).
  17. DeFrancis (1984: 187).
  18. DeFrancis (1984: 97-104, 111).
  19. Chinese readings are capitalised in roomaji for ease of distinction.