Posted by nabla9 10/27/2024
In short, it's the same nominal sound with varying tones ("shi", which is closer in pronunciation to "shirr" than "she"), repeated about a hundred times, which is of course meaningless in spoken form (since there's not enough context to differentiate between the various forms), but actually conveys a story in written form.
With the shift toward typing and (especially mobile) computerization in the recent era, it's really not surprising (to me, at least) that Chinese society is moving in a direction where literacy no longer extends to recall of individual characters, and only encompasses recognition, since recall is no longer as necessary a skill in day-to-day life.
There's a close relative of Mandarin (Dungan) which is written in the Cyrillic alphabet. The spoken language is tonal, but tones aren't used in the written language because written words are polysyllabic, and if you know how to speak Dungan, you can reliably infer the tones.
In normal texts written in modern Chinese, this is not a problem. Nobody writes real texts like the "shi" poem. In cases where something can only be understood in written form, you can rephrase it to avoid homophones.
0. https://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffal...
You may think it’s not needed, because that information isn’t available in spoken Chinese. The same is true for written English - putting spaces between words, dividing texts into paragraphs, capitalizing them, differentiating between different pauses (a comma, period, semicolon, etc. all signifying what kind of pause something is), quotation marks, parentheses, etc. - none of this is available in our spoken language, and we’re still able to understand it. In theory, we could get rid of them all and understand what’s being written. In practice, most people would find the result to be an incomprehensible mess.
The same goes for Chinese. Written languages, for the most part, are more than a simple transcription of spoken sounds.
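To make the English thought experiment concrete, here is a minimal Python sketch that strips the non-phonetic elements mentioned above (spaces, punctuation, capitalization) from a sentence. The function name `strip_non_phonetic` and the sample sentence are made up for illustration, and it only handles ASCII punctuation.

```python
import string

def strip_non_phonetic(text: str) -> str:
    """Drop whitespace and ASCII punctuation, and lowercase everything.

    Hypothetical helper: it only illustrates what written English looks
    like without elements that have no direct spoken counterpart.
    """
    drop = set(string.punctuation) | set(string.whitespace)
    return "".join(ch.lower() for ch in text if ch not in drop)

sample = "In theory, we could get rid of them all. In practice? A mess."
print(strip_non_phonetic(sample))
# prints: intheorywecouldgetridofthemallinpracticeamess
```

The result is still decodable with effort, which is roughly the "in theory" versus "in practice" distinction being made.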
Unless Chinese is somehow unique among all human languages, this isn't true. Chinese would be just as intelligible if written in a phonetic script (like Pinyin) as it is when written using the characters.
Now, it would be an incredibly shocking transition for Chinese people who have already spent their entire lives writing with characters. However, after the transition to Pinyin, especially for young people who wouldn't ever learn the characters, written Chinese would still be perfectly understandable.
That being said, I don't favor replacing the characters, because the transition would be extremely difficult and because the characters are very culturally important to China. They've been in use for a good 3000 years, and people are very attached to them. Phonetic scripts are technically superior, but the cultural and practical arguments for sticking with the characters are still stronger.
I was talking about English in that paragraph:
> The same is true for written English - putting spaces between words, dividing texts into paragraphs, capitalizing them, differentiating between different pauses (a comma, period, semicolon, etc. all signifying what kind of pause something is), quotation marks, parentheses, etc. - none of this is available in our spoken language, and we’re still able to understand it. In theory, we could get rid of them all and understand what’s being written. In practice, most people would find the result to be an incomprehensible mess.
The very next sentence you wrote was
> The same goes for Chinese.
So you were talking about both English and Chinese in that sentence.
I was talking about English in the sentence you quoted. In the next paragraph, I said that Chinese was the same as English in this regard. That's why I couldn't (and still can't) understand your comment.
You're saying it isn't true that removing those parts of English would mean "most people would find the result to be an incomprehensible mess" unless Chinese is unique? Chinese has absolutely no connection to written English becoming a mess after removing those elements of written English.
Or are you objecting to the paragraph after the one you quoted, where I say the same thing that happens in English is true for Chinese? "Unless Chinese is somehow unique among all human languages, this isn't true" that Chinese would be like English? That doesn't make any sense to me unless you misread my initial comment to mean the complete opposite of what it was saying.
You very clearly wrote that Chinese would become an incomprehensible mess if written in Pinyin.
You first stated that there would be a severe loss in fidelity in switching to Pinyin. Then you gave an analogy showing how removing various non-phonetic elements of written English would make it an incomprehensible mess. Immediately after that, you said that the same applies for Chinese.
I'm objecting to your argument that Chinese would be an incomprehensible mess if written alphabetically.
> I'm objecting to your argument that Chinese would be an incomprehensible mess if written alphabetically.
That's fine, but it runs directly counter to your initial comment. If a phonetic transcription would make Chinese just as easy to understand as it is written now, it would be quite different from English, and almost every other written language, all of which include non-phonetic elements in order to facilitate reading.
Now, you're obsessing over some pretty obvious misinterpretations of what I've written, and you're ignoring the argument you yourself initially made.
> If a phonetic transcription would make Chinese just as easy to understand as it is written now, it would be quite different from English, and almost every other written language, all of which include non-phonetic elements in order to facilitate reading
Pinyin, the phonetic transcription of Standard Chinese, is written with spaces and punctuation. You're going on about something that doesn't exist.
It still isn't a very good argument, though. Most English speakers get by without any knowledge of classical languages, and accept having to look up words in a dictionary.
The Chinese characters do indeed contain semantic information that Pinyin (the standard Romanization) does not, but in practice, you don't need that extra semantic information. If you write down a single word in Pinyin, it may have a few homophones, whereas the same word, written in Chinese characters, would be unambiguous. However, in written Pinyin texts, you would almost always be able to figure out which word is meant from context. In the few cases in which that would not be possible, the author could slightly rephrase the text to make it unambiguous.
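As a rough illustration of the homophone point, here is a small Python sketch. The lexicon is a tiny hand-picked list, not a real dictionary, and `LEXICON` and `group_by` are hypothetical names. It groups words by their Pinyin spelling, with and without tone marks, and reports which spellings map to more than one word.

```python
from collections import defaultdict

# Hand-picked toy lexicon (an assumption for illustration only):
# (characters, Pinyin with tone marks, Pinyin without tone marks)
LEXICON = [
    ("是", "shì", "shi"),
    ("事", "shì", "shi"),
    ("十", "shí", "shi"),
    ("时间", "shíjiān", "shijian"),
    ("事件", "shìjiàn", "shijian"),
    ("狮子", "shīzi", "shizi"),
    ("石头", "shítou", "shitou"),
]

def group_by(lexicon, key_index):
    """Map each spelling (column key_index) to the words written that way."""
    groups = defaultdict(list)
    for entry in lexicon:
        groups[entry[key_index]].append(entry[0])
    return dict(groups)

def ambiguous(groups):
    """Keep only spellings shared by more than one word."""
    return {k: v for k, v in groups.items() if len(v) > 1}

ambiguous_toned = ambiguous(group_by(LEXICON, 1))
ambiguous_toneless = ambiguous(group_by(LEXICON, 2))

print("ambiguous with tone marks:", ambiguous_toned)
print("ambiguous without tone marks:", ambiguous_toneless)
```

In this toy data, the only collision that survives tone marks is between single-syllable words; the two-syllable words are all distinct once tones are written. A real lexicon would have more collisions, which is where context or rephrasing comes in.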
Most languages on Earth (that have a writing system) are written using alphabets. Chinese is not so special that it could not be written using an alphabet as well. The reason China hasn't switched to an alphabetic script is cultural attachment to the characters, not that Pinyin doesn't work just as well in a practical sense.
In what I wrote, I was assuming there would be no unfamiliar characters, but there would be one or more unfamiliar words composed of two or more characters.
I was trying to put forward the best argument I could think of for retaining the characters, but, like you, I have decided it isn't worth the additional effort of learning thousands of characters up front to become literate when you can use a phonetic script and look up any unfamiliar words in a dictionary instead.
And yes, this is also 100% applicable to English.
This just proves that phonetic writing alone is not sufficient, but it does not mean that phonetic writing must be replaced with traditional writing.
To resolve the ambiguity of the phonetic writing, both in Chinese and in Japanese, where the ambiguity is much worse, it is enough to retain at most a couple hundred symbols to be used as semantic classifiers. It is likely that a great part of the traditional radicals would be suitable to be retained as classifiers, with perhaps a part of them omitted if redundant and a few other symbols added, if necessary.
Then the writing could be phonetic, but with classifier symbols attached to words, wherever the ambiguity makes them necessary.
This is not a new method. The oldest writing systems, like those of Egypt or Mesopotamia, also used classifier symbols (with meanings like: "a kind of human", "a kind of god", "a kind of animal", "a kind of stone", "a kind of wood", "a body part", "a kind of tool" and so on) attached to the words written phonetically, to avoid ambiguities.
If one had to learn only 200 classifier symbols, with lower stroke counts than most symbols used now, that would be a great simplification.
Many of the Chinese characters are actually intended to be composed of two parts, a semantic classifier and a phonetic symbol, but this principle is applied too inconsistently and with too many variants, so the system can be greatly simplified by using a simple phonetic writing like Pinyin together with semantic classifiers inserted in the text only if they are necessary.
That is not entirely true in the case of Mandarin, but it is more true in the case of Cantonese (and a few other Chinese languages).
Owing to the historical loss of sounds (especially finals) over the course of Mandarin's development, many Mandarin words tend to be longer (3-4 syllables are common) than their counterparts in, say, Cantonese, where they are most of the time (but not always) two syllables long, because Cantonese has retained more sounds from Middle Chinese (plus the intermingling with the Bat Yue) over the course of its development.
Which is why the «Lion-Eating Poet in the Stone Den» still makes some sense when read out loud in Cantonese (also in Wu, Min) and makes no sense in Mandarin.
https://pinyin.info/readings/zyg/what_pinyin_is_not.html
Correctly, however, the text was not meant as an argument against romanization but as a playful example of how Pinyin is unfit for classical rather than modern vernacular Chinese.
Sounds like Buffalo buffalo, but it's more like someone being clever than pointing out an actual problem with the language.
Anecdotally, I can usually remember a few parts of the character but draw a blank on the rest. You can see in the picture of the grocery list that for some characters he got basically half the character right but gave up on the other half (shrimp is the combination of 虫 and 下; you can see he remembered the first half).
I guess this is analogous to only remembering the main themes of a piece and forgetting how the rest of it goes. I'll recognise it when I hear it, but can't recall it off the top of my head.
English, being the composite/mongrel language that it is, has really complicated patterns for how you put letters together. For example, the "i before e except after c, or when sounded as 'a' as in neighbor and weigh" sort of thing (which does not cover all of the exceptions, of course). This sort of thing has led to the existence of spelling competitions in the English-speaking world (spelling bees). My Hungarian wife was surprised that such a thing existed. Hungarian is much closer to see-what-you-say, with only a few exceptions (not that the rules are kind to English-speaking Hungarian learners like myself).
No, they're really not. First, they have 46 characters (each), not 56, though there are another 36 combination characters like ちゃ. Regardless, the problem here is that the number comes from the total number of allowed sounds in the entire language. Japanese has an extremely small number of total possible sounds compared to most other languages, particularly Western ones, and almost all syllables are of the form consonant+vowel: there's basically no way to write, for instance, a word ending in a hard "t" sound, so when Japanese adopts such words, it adds a vowel ending like "to", and does this for every syllable with a hard consonant without a following vowel. Because of this, loanwords can be really hard to recognize even if you're a speaker of the language that word was adopted from (usually English these days), because the sounds don't map over very well.
And because there are so few total possible syllables, there's a huge number of homophones. The main reason kanji is still around is that it resolves ambiguity and makes it much, much easier to read Japanese text: trying to read text that's all in hiragana (or katakana) is cumbersome, even if spaces are added (Japanese text doesn't normally have spaces).
The other thing that makes alphabets more popular in the long run is that they spread easier because they're easier to adapt to different languages compared to syllabaries (indeed, it's not uncommon for a syllabary to become semi-alphabetic as part of such adoption).
On the other hand - Western scholars can understand what the spoken word sounded like - but Eastern readers have a much harder time knowing what ancient words sounded like.
https://en.wikipedia.org/wiki/Rime_dictionary
Western writing systems "decay" faster. Look at French writing - the spellings were phonetic at the time they were first put to paper - but they sound nothing like the current pronunciations.
That's simply not true.
Ancient Chinese calligraphy and language are so different that you have entire PhD fields about them.
By contrast, as someone who has studied basic Latin in high school, I can read stuff from the walls of Pompeii without issue. I can directly read Latin texts from 700 AD or so with the standard difficulty of reading handwriting.
See: http://www.edr-edr.it/edr_programmi/view_img.php?lang=en&id_...
Now, perhaps if I were Chinese, I could read ancient graffiti on the Great Wall, but nobody seems to have ever mentioned that.
Modern Korean people can't even read stuff older than a century or so because the language changed from using Chinese characters to the home-grown Hangul character set, and that was only completed a few decades ago.
By contrast, English speakers can read Shakespeare just fine mostly, with a little difficulty understanding some words that are no longer used.
Once calligraphy/handwriting is involved, the situation on the Western side is not much better either. Modern Anglosphere children probably would struggle with 19th-century cursive like https://www.pinterest.com/pin/375276581427478862/ ; in Germanic countries, the handwriting system underwent deeper changes, so nobody apart from selected nerds and antiquarians can read Kurrent as in Goethe's letters - https://commons.m.wikimedia.org/wiki/File:Goethe_Brief_(nich... - or even the newer Sütterlin. Contra what some posters here claim, Roman cursive (https://en.m.wikipedia.org/wiki/Roman_cursive) is right out. I don't think this should be conflated with the question of whether the writing system is understood by future readers - as an imperfect computer analogy, an ASCII text document is in some meaningful sense more futureproof than an Autodesk Animator .FLI, even if the former is on a five-inch floppy and the latter is on a USB thumb drive.
(As for the effects of Japan's Chinese character simplification, I think they are a bit overstated. I accidentally bought a 旧字体 copy of Mishima's Haru no Yuki at a book sale once, and at least as an L2 speaker I didn't find it particularly more painful to read than I find unmodernised Shakespeare as an L2 speaker of English.)
The Proto-Sinaitic alphabetic script is the oldest (1800–1500 BCE) and evolved from Egyptian hieroglyphic symbols. It contained simplified characters representing consonants. The Phoenician alphabet came later, around 1050 BCE, evolving from Proto-Sinaitic. It became a widely used script with 22 consonantal characters and was highly influential, serving as a foundation for both the Paleo-Hebrew and Greek alphabets. The Etruscan alphabet was adapted from the Greek alphabet in the 8th century BCE, and the Roman alphabet was adapted from the Etruscan alphabet in the 7th century BCE.
Alphabets with 20–30 letters seem to be close to a neurolinguistic optimum for balancing simplicity with expressiveness. The Armenian script was designed by the monk and linguist Mesrop Mashtots in the 5th century CE to enable the translation of the Bible into Armenian. With 39 letters, it represents Armenian phonetics. The Khmer alphabet, with 74 characters, evolved from the ancient Pallava script, which was developed in Southern India around the 4th century CE. By the 7th century CE, the Khmer people had adapted the Pallava script, creating an early form of the Khmer script. This script was initially used to write Sanskrit and Pali, the languages of Hindu and Buddhist texts.
Meanwhile, if you remember how the character is pronounced and can identify it in a lineup, it's far easier to use the phonetic approaches. (Even if your input method doesn't auto-correct the word based on context, experienced typists will also memorize the position of common words, so even they don't need to stop and look at the individual candidates in most situations.)
https://www.youtube.com/watch?v=f3seWGtZ3DQ&t=3035s
The whole series is worth a watch if you're into writing.
I'm not sure if the author has studied Vietnamese. I'm a native Vietnamese speaker, and I believe the language is perfectly phonetic.
If I hear a word, I can write it. If I see a word, I can pronounce it, regardless of whether I understand the meaning.
It's interesting that among the 4 countries (China/Japan/Korea/Vietnam), it's the only one that completely reinvented its written language as a Latin-based one. I think that refactor addressed the phonetic issue well enough. When I was there, there was also no TV program for "spelling bees" or anything like that. Even a third grader could read/write almost any word (even when they don't understand the text yet).
Edit: adding to this original post to reply to a common theme people brought up in multiple posts.
I think bringing up dialects and provincial accents is not convincing. There is one official way "gia đình" should be pronounced. It's taught in school, even in the South. Pronouncing it as "da đình" can still be understood, and it doesn't detract from the point that the language is phonetic.
In other words, assuming I know nothing about the meaning of the word, if I hear "da đình" I can correctly write it down as such. I wouldn't know that in Saigon that also means "gia đình". But I definitely can write it down exactly.
I don't think using provincial speaking accents is a good line of argument here. Otherwise, no language in the world can satisfy the phonetic requirements. Any group of people can have different accents, different tones, different sound lengths and pauses.
Not if you account for variations of pronunciation in dialects. Not even the most phonetically accurate accent, the Hanoian Northern accent which I am a native speaker of, is perfect.
For example, you could hear Northern Vietnamese people say "dổ", "dá" instead of "rổ", "rá". Morning dew is pronounced "xương" but is written as "sương". These characters are pronounced with greater clarity in the Central and Southern regions, but they have their own peculiarities too. To this day I still find it iffy that they call someone named "Diễm" "Yỉm". Unless you have seen the correct way to spell those words before, you can't say for sure. Even now as a working adult I find myself referring to the dictionary to make sure my accent doesn't embarrass me in official emails.
In a perfect world, we could have one single Vietnamese accent that aims to pronounce all these words true to the intended way of the alphabet, but it isn't practical. That being said, one can get pretty far in Vietnamese when encountering new words.
There is nothing wrong with being sentimental; I lift heavy weights, collect vinyl and do film photography because I like the aesthetic of these activities. But let me force my own kids to learn whatever I think they should learn just like me at home rather than everyone forcing everyone else's kids in school.
I somehow kept the habit of handwriting for years. But as a guy in my early 30s, I do notice characters fade away from my brain from time to time, which wasn't a thing at all in my 20s. And to my surprise, some of the characters are fairly frequently used - I was just completely stuck when I was trying to recall them.
Probably that's how brains and organs peak and then slowly break down over the following decades, just like hard drives.
It always struck me that a phonetic alphabet was much simpler and easier to learn than a writing system based on pictograms. So much so that a society could achieve the same level of literacy at much lower cost if it adopted a phonetic system.
But I wonder if that is actually true? Have there been comparative studies of what mainland China did compared to Taiwan (which kept the traditional system) or Vietnam (which adopted Latin letters) and the effect on literacy? Obviously hard to do ...
That said, you can look at Korean for a historical example of how a well-designed alphabet can fare when replacing a historical Chinese-based system. It actually spawned whole new literary genres by making writing more accessible to large segments of the populace that were effectively excluded before.