
Learning Cantonese



Posted
Originally posted by atitarev

HK government added a few thousand Cantonese specific characters!

I've seen this bandied around in these forums before. Where do you get that figure from? I think you mean to say a few hundred, right?

all educated Cantonese speakers need to know both ways to write.

Actually, I doubt that anyone "needs" to be able to write colloquial Cantonese. One "needs" only to be able to write in standard Chinese. Other dialect speakers get by fine by only being able to write standard Chinese. Being able to read colloquial Cantonese characters is obviously a bonus if you live in HK, but a lot of the time the characters are intuitive for native Cantonese speakers so it doesn't require much active learning.

Posted

Thanks Mugi,

I downloaded a PDF file from a HK site, there were about 3 thousand characters.

What is intuitive for Chinese people, especially Cantonese speakers (because it's based on Cantonese pronunciation), is not so intuitive for learners and requires extra effort. Learning Chinese characters is neither easy nor straightforward for foreigners: writing, pronunciation and meaning all have to go together.

If learning colloquial Cantonese writing is not necessary, then what is? Your comment means: just learn Mandarin or standard written Chinese if you wish (and only read it using Cantonese pronunciation). Then you will have to remember all the different words used when speaking and when writing. It seems Hong Kong has created some kind of standard for written Cantonese, and not only written, which is spreading outside and being picked up by Cantonese speakers in Guangdong, etc. There is no other maintained standard anyway.

If you need to learn spoken Cantonese, then you need to somehow connect it to the written form, otherwise you will have to use romanisation or search for homophones.

I noticed that Cantonese speakers feel differently about what is called Cantonese topolect and how and what to learn depending on where they live. Cantonese textbooks introduce the spoken Cantonese using either specific Cantonese characters or they are just romanised (usually Yale).

Which version of Cantonese, in your opinion, do you need to learn if you want to understand, e.g.: 1) Hong Kong movies, both what is said and what is written in the subtitles? I believe there will be some discrepancy between some words and characters (these subtitles are supposed to be usable by all Chinese readers), but I'm not sure how much. 2) CantoPop songs from Hong Kong: should the characters reflect what is really sung? 3) News from Hong Kong, both speech and subtitles? (I know the subtitles are in standard Mandarin, somewhat different from mainland Mandarin, using traditional characters, but certain words may not match.)

Posted

A few thousand? I thought only a few tens... mostly just adding a mouth radical. Also, if you insist, you can write most of them in proper characters (see the list I posted in 2004). For example, like you said, 嚟 can be written as 来 and given a second pronunciation, since they are essentially the same word whose pronunciation diverged at some point in time due to some compound words being used less frequently than others (therefore retaining the character's archaic pronunciation). Also, the character 是 can be treated the same way and given the "hai" pronunciation, since as a component in 提 and 题 it gives the "ai" sound. Then there's 番 for 返, etc. How many Cantonese speakers know how to write car+stand for "lip"? I don't think Cantonese-specific characters have been standardized to such a degree that you must learn them to survive or understand texts in Hong Kong. If you know Mandarin well, it's not hard to pick up Cantonese, at least the listening part. Native Chinese speakers can usually pick up Cantonese in a matter of months just by passive immersion. Understanding news broadcasts, however, will take longer, because it's a different language...

Posted

Sorry, I was wrong, it's not 3,000, it's 4,384 Cantonese specific characters included in HKSCS-2001.

On this site:

http://www.info.gov.hk/digital21/eng/structure/jyutping.html

There is a link to a complete PDF file with 29,000 Chinese characters, including the 4,384 Cantonese-specific ones (it's 14 MB!):

http://www.iso10646hk.net/download/jp/doc/JPTableFull.pdf

They are probably not used heavily, but it is quite a complete list of all characters.

On the bright side, here's the description of what's included:

The HKSCS-2004 contains 4,941 characters as follows:

Symbols :

There are 441 symbol characters including Han character shapes, Hanyu Pinyin alphabets, international phonetic alphabets, Han radicals, symbols for drawing table, Japanese Katakana and Hiragana, etc.

Han characters :

There are 4,500 Han characters, 3,353 out of which can be found in the major dictionaries (Kangxi Zi Dictionary, Han Yu Da Zi Dian, Han Yu Da Ci Dian and Zhong-hua Zi Hai).

The remaining 1,147 characters that cannot be found in the major dictionaries are :

Cantonese dialect characters : 109 characters. Mainly provided by the Judiciary, the Hong Kong Police, the Department of Justice, the Hong Kong Polytechnic University and the Linguistic Society of Hong Kong. Some of them can be found in the dictionaries of Cantonese dialect or academic articles.

Radicals and shapes : 30 characters

Scientific terms : 13 characters

Names of persons, companies and locations : 892 characters. They come from the databases of the Immigration Department, the Companies Registry, the Inland Revenue Department and the Lands Department. It has been confirmed that these characters are still being used in names of persons, companies and locations.

Others : 103 characters. (They can be found in commonly used Chinese font products in Hong Kong.)

Any comments on the breakdown? Cantonese learners who know Mandarin probably need to learn at most 1,147 completely NEW characters; they will be familiar with the rest but need to give them a new meaning/reading and get used to how often they occur. Luckily, if we don't want to learn personal names and ignore company names, we end up learning only 109 + 30 (radicals) + 13, not too bad! So Quest is right too: only a few tens. The list we all definitely want is the 109 characters (as of today). Of course, these will not include Cantonese characters that are already in Chinese dictionaries, e.g. 係唔係佢哋嘅?, because these characters ARE part of standard Chinese but are used differently or considered rare.
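For anyone who wants to double-check the arithmetic in the quoted description, here is a minimal Python sketch. The numbers are copied from the quote above; the variable and function names are made up for illustration and are not part of any HKSCS tool or data file.

```python
# Figures as quoted above from the HKSCS-2004 description.
SYMBOLS = 441
HAN_TOTAL = 4500
HAN_IN_MAJOR_DICTIONARIES = 3353

# Han characters NOT found in the major dictionaries, by category.
NOT_IN_DICTIONARIES = {
    "Cantonese dialect characters": 109,
    "Radicals and shapes": 30,
    "Scientific terms": 13,
    "Names of persons, companies and locations": 892,
    "Others": 103,
}


def check_breakdown():
    """Derive the totals from the category counts (hypothetical helper)."""
    remaining = sum(NOT_IN_DICTIONARIES.values())
    return {
        "not_in_dictionaries": remaining,
        "han_total": HAN_IN_MAJOR_DICTIONARIES + remaining,
        "grand_total": SYMBOLS + HAN_IN_MAJOR_DICTIONARIES + remaining,
        # Dialect characters + radicals/shapes + scientific terms,
        # i.e. everything except names and "Others".
        "learner_subset": (
            NOT_IN_DICTIONARIES["Cantonese dialect characters"]
            + NOT_IN_DICTIONARIES["Radicals and shapes"]
            + NOT_IN_DICTIONARIES["Scientific terms"]
        ),
    }


if __name__ == "__main__":
    for name, value in check_breakdown().items():
        print(f"{name}: {value}")
```

The derived totals can then be compared with the quoted 1,147, 4,500 and 4,941; the last entry is the sum of the dialect, radical and scientific categories only.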

Sorry for being pedantic, that's my nature :mrgreen:

Posted

I too am pedantic! :mrgreen: (That's part of the reason I enjoy your posts, and part of the reason I feel the need to comment when I think something is awry :) )

I think whittling down the number to that 109 + some of the others that you mention that do not appear in dictionaries is what should be termed "Cantonese-specific characters". That would probably give you a total of 200-300 tops. However, reading the quote, my guess is that the likes of 哋 and 嘅 are included in the 109. But to paraphrase Quest, you only need to learn several dozen Cantonese-specific characters to read most colloquial Cantonese.

Another (small) point to remember is that there is a lot of doubling up of dialect characters, often a case of one or more prospective 本字 competing with one or more 俗字 (新造字 or 代用字). If you're not sure what I'm referring to, I will post a couple of examples this evening when I get back to a more Cantonese-input-friendly environment.

What is intuitive for Chinese people, especially Cantonese speakers (because it's based on Cantonese pronunciation), is not so intuitive for learners and requires extra effort.

My mistake - when you wrote "educated Cantonese speakers" I thought you were referring to native Cantonese speakers.

If learning colloquial Cantonese writing is not necessary, then what is? Your comment means: just learn Mandarin or standard written Chinese if you wish (and only read it using Cantonese pronunciation).

Yes, that's exactly what I mean. As I mentioned in a different way before, you don't need to be able to write all the colloquial words - Shanghainese don't, Taiwanese don't, Hakka don't, even Pekinese don't for many 土話 words. The only time your argument really has any relevance is with respect to foreigners learning Cantonese - is this what you're talking about?

If you need to learn spoken Cantonese, then you need to somehow connect it to the written form, otherwise you will have to use romanisation or search for homophones.

Why? If you can read Chinese already and you are learning spoken Cantonese there is no need to tie it to any kind of written form at all. It's potentially a different story if Cantonese is your first Chinese language. And of course definitely a different story if Cantonese is your first Chinese language and you are wanting to learn to read and write Chinese.

If you only want to learn to speak the language though, characters are only a hindrance! Much more effective and efficient to stay with a romanisation system.

Which version of Cantonese, in your opinion, do you need to learn if you want to understand, e.g.: 1) Hong Kong movies, both what is said and what is written in the subtitles? I believe there will be some discrepancy between some words and characters (these subtitles are supposed to be usable by all Chinese readers), but I'm not sure how much. 2) CantoPop songs from Hong Kong: should the characters reflect what is really sung? 3) News from Hong Kong, both speech and subtitles? (I know the subtitles are in standard Mandarin, somewhat different from mainland Mandarin, using traditional characters, but certain words may not match.)

1) Obviously vernacular Cantonese to listen to and understand the movie, but for the most part standard Chinese if you're reading the subtitles. People often mistakenly assume that subtitles are supposed to be a verbatim representation of what is spoken. They're not! They are a translation. So it's only natural that there will be discrepancies between what is spoken and what is written; this is true even of spoken English with English subtitles.

2) Most CantoPop I've heard is basically standard Chinese with Cantonese pronunciation - but maybe I'm wrong in this. I've always found it an anomaly. (Taiwanese and Hakka songs on the other hand are sung using their respective grammar and vocab). Either way, in this case I think the written lyrics should correspond directly with what is sung.

3) Same as 1)

Originally posted by Quest

For example, like you said 嚟 can be written as 来 and given a second pronunciation, since they are essentially the same word whose pronunciation diverged at some point in time due to some compound words being used less frequently than others (therefore retaining the character's archaic pronunciation).

Actually, it's usually the other way around. The colloquial pronunciation 白讀 usually has a continuous link with the earliest pronunciation. The formal pronunciation 文讀 is usually a reflection of the character being influenced by "standard Chinese" at some point in history. Thus, in the case of 來, "lai4" is older than "loi4". The reason you end up with two different characters is the penchant for maintaining only one pronunciation per character. And given the close relationship between written Chinese and the "national standard", it's usually the colloquial pronunciation that is assigned a new character.

Posted

Thanks, Mugi. You explained it very well and covered some of the gaps I had in understanding of modern Cantonese.

Would be good to know what is actually included in the 109 characters.

However, reading the quote, my guess is that the likes of 哋 and 嘅 are included in the 109.

It's possible but we are not 100% sure, because these characters have entries in the ABC dictionary.

E.g. see Wenlin:

嘅 [kǎi] [kài] [xì] (Unihan) sound of sighing

Here's my favourite Cantonese song with some romanization (Yale) produced using HanConv (I joined the words afterwards):

小城大事 - 楊千嬅

曲︰雷頌德

詞︰林夕

編︰雷頌德

青春彷彿因我愛你開始

cheng1cheun1 fong2fat1 yan1 ngo5 ngoi3 nei5 hoi1chi2

但卻令我看破愛這個字

daan6 keuk3 lim1 ngo5 hon1po3 ngoi3 je5go3 ji6

自你患上失憶 便是我扭轉命數的事

ji6 nei5 waan6seung5 sat1 yik1 bin6si6 ngo5 nau2jyun2 meng6sou2 di1 si6

只因當失憶症發作加深

ji2yan1 dong1 sat1 yik1 jing1 faat3jok3 ga1sam1

沒記住我但卻另有更新蜜運

mut6 gei3jyu6 ngo5 daan6 keuk3 ling6yau5 ang1san1 mat6 wan6

像狐狸精般 並未允許我步近

jeung6 wu4lei4 jing1 bun1 bing6mei6 wan5heui2 ngo5 bou6 gan6

無回憶的餘生 忘掉往日情人

mou4 wui4yik1 di1 yu4saang1 mong4diu6 wong5yat6 ching4yan4

卻又記住移情別愛的命運

keuk3yau6 gei3jyu6 yi4ching4 bit6 ngoi3 di1 meng6wan6

無回憶的男人 就當偷厄與瞞騙

mou4 wui4yik1 di1 naam4yan4 jau6 dong1 tau1 ak1 yu5 mun4pin3

抱抱我不過份

pou5pou5 ngo5 bat1gwo3 fan6

吻下來 豁出去 這吻別似覆水

man5 ha6loi4 kut3cheut1heui3 je5 man5bit6 chi5 fau6 seui2

再來也許要天上團聚

joi3loi4 ya5heui2 yiu1 tin1seung5 tyun4jeui6

再回頭 你不許 如曾經不登對

joi3 wui4tau4 nei5 bat1heui2 yu4 chang4ging1 bat1 dang1 deui3

你何以雙眼好像流淚

nei5 ho4yi5 seung1ngaan5 hou2jeung6 lau4leui6

彼此追憶不怕愛要終止

bei2chi2 jeui1yik1 bat1 pa3 ngoi3 yiu1 jung1ji2

但我大概上世做過太多壞事

daan6 ngo5 daai6koi2 seung5sai3 jou6 gwo3 taai3 do1 waai6si6

能從頭開始 跪在教堂說願意

nang4 chung4tau4 hoi1chi2 gwai6 joi6 gaau3tong4 syut3 yun6yi3

娛樂行的人影 還在繼續繁榮

yu4lok6 haang4 di1 yan4ying2 waan4 joi6 gai3juk6 faan4wing4

我在算著甜言蜜語的壽命

ngo5 joi6 syun3jeuk3 tim4yin4mat6yu5 di1 sau6meng6

人造的蠢衛星 沒探測出我們已

yan4jou6 di1 cheun2 wai6sing1 mut6 taam3chak1 cheut1 ngo5mun4 yi5

已再見不再認

yi5 joi3gin3 bat1joi3 ying6

吻下來 豁出去 這吻別似覆水

man5 ha6loi4 kut3cheut1heui3 je5 man5bit6 chi5 fau6 seui2

再來也許要天上團聚

joi3loi4 ya5heui2 yiu1 tin1seung5 tyun4jeui6

我下來 你出去 講再會也心虛

ngo5 ha6loi4 nei5 cheut1heui3 gong2 joi3wui2 ya5 sam1heui1

我還記得到天上團聚

ngo5 waan4 gei3dak1 dou3 tin1seung5 tyun4jeui6

吻下來 豁出去 從前多麼登對

man5 ha6loi4 kut3cheut1heui3 chung4chin4 do1mo1 dang1 deui3

何以雙眼好像流淚

ho4yi5 seung1ngaan5 hou2jeung6 lau4leui6

每年這天記得再流淚

mui5 nin4 je5tin1 gei3dak1 joi3 lau4leui6

Posted
Originally posted by Quest

For example, like you said 嚟 can be written as 来 and given a second pronunciation, since they are essentially the same word whose pronunciation diverged at some point in time due to some compound words being used less frequently than others (therefore retaining the character's archaic pronunciation).

Actually, it's usually the other way around. The colloquial pronunciation 白讀 usually has a continuous link with the earliest pronunciation. The formal pronunciation 文讀 is usually a reflection of the character being influenced by "standard Chinese" at some point in history. Thus, in the case of 來, "lai4" is older than "loi4". The reason you end up with two different characters is the penchant for maintaining only one pronunciation per character. And given the close relationship between written Chinese and the "national standard", it's usually the colloquial pronunciation that is assigned a new character.

I would say it goes both ways. In some Cantonese subdialects 來 is pronounced "loi" only, as in the famous chant to recall lost souls, "番来啰~~~".

Posted
I would say it goes both ways. In some Cantonese subdialects 來 is pronounced "loi" only

Not debating this point at all. Just saying that when there is a 文白異讀 in any dialect/language, the 白讀 usually reflects an older pronunciation. When only one pronunciation exists you would need to look at the other characters in the same rhyme/rime group to determine whether the character has possibly been influenced by a "standard" pronunciation or not.

  • 1 month later...
Posted
Quest wrote on 11th May 2006, 06:31 AM

Also, the character 是 can be treated the same and given the "hai" pronunciation, since as a radical in 提 and 题, it gives the "ai" sound.

This is an interesting approach. However, we don’t need to look for the right character for hai IMO, since 係 (to be) is a standard character, not specific to Cantonese.

For example, in the novel 喻世明言, Ming dynasty, there is the following sentence: 這畜生只除天上有,果係世間無。

Also, if we search for 作者係 (“the author is”) using Google, we can find about one million links (some of them may not be relevant), mostly non-Cantonese.
