Chinese-Forums

Concept of word in Japanese and other Asian languages



Posted (edited)

NOTE: split off from here

The concept of word is somewhat blurred or different in Japanese as well (some scholars even treat whole clauses as one word in Japanese)

At first glance, this sounds highly dubious. Do you have any sources for that? Thank you.

Edited by chrix
Posted (edited)
At first glance, this sounds highly dubious. Do you have any sources for that? Thank you.

I got interested and searched for it. I read an article yesterday when I replied in the thread and almost quoted the opinion; I will search for it again later. The agglutinative nature of Japanese makes the concept of a word rather different from that in English. It doesn't mean that Japanese is not made of words, but word boundaries are not always understood and agreed on: people may not agree on how many words are in a sentence.

I don't personally support this point of view but here are a few things that would make sense:

How many words here?

ではない - de wa nai, dewa nai, dewanai - 3, 2 or 1 word(s)?

知っている - shitte iru or shitteiru - 2 or 1 word?

わたしは - when breaking up words with spaces for schoolchildren, there is never a space between a word and a particle. Is it one word or two?
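
To make the counting question concrete, here is a minimal Python sketch that simply counts the space-separated units in each romanization listed above. The helper name `count_spaced_units` is made up for illustration, and splitting on whitespace is only a spelling convention, not a linguistic test for wordhood, so the numbers show how the "word count" follows the chosen spacing rather than settling which spacing is right.

```python
# Illustrative only: whitespace units in a romanization reflect a spelling
# convention, not a linguistic definition of "word".
def count_spaced_units(romanized: str) -> int:
    """Count the whitespace-separated units in a romanized string."""
    return len(romanized.split())


# Romanization variants taken from the examples in the post.
spellings = {
    "ではない": ["de wa nai", "dewa nai", "dewanai"],
    "知っている": ["shitte iru", "shitteiru"],
}

# For each expression, the unit count of every variant, in the same order.
counts = {
    kana: [count_spaced_units(variant) for variant in variants]
    for kana, variants in spellings.items()
}

for kana, variants in spellings.items():
    for variant, n in zip(variants, counts[kana]):
        print(f"{kana}  {variant!r}: {n}")
```

The same expression comes out as three, two or one unit depending purely on where the romanizer put the spaces.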

Colloquially, the Japanese don't use 単語 "(a single) word" but 言葉 - words, word, speech, what someone says. I have yet to look into how Japanese speakers talk about how many words they said; it may be even more complicated than in Chinese, where one can simply count the number of 字 in a sentence.

On Japanese grammar

Some scholars romanize Japanese sentences by inserting spaces only at phrase boundaries (i.e., "taiyō-ga higashi-no sora-ni noboru"), treating an entire phrase as a single word. This represents an almost purely phonological conception of where one word ends and the next begins. There is some validity in taking this approach: phonologically, the postpositional particles merge with the structural word that precedes them, and within a phonological phrase, the pitch can have at most one fall. Usually, however, grammarians adopt a more conventional concept of word (単語 tango), one which invokes meaning and sentence structure.

The challenge for learners is also that the pitch accent marked in some dictionaries concerns not just the dictionary word but what follows it - for some words the high pitch ends on the last syllable, and for others the high pitch continues over the rest of the phrase.

Korean is agglutinative as well; I wouldn't be surprised if it has problems with the definition of a word too. Vietnamese puts spaces between all syllables (many used to be written in Chinese characters); are they all individual words?

Thai and other South-East Asian languages don't put spaces between words and also used to be monosyllabic, so the issue would be very similar to Chinese and Vietnamese, despite the difference in the writing systems. Japanese/Korean and Chinese/Vietnamese/Thai have different reasons, but the concept of word is very different from English and other European languages, so the usage of the word "word" (no pun intended) is also different in natural, not strictly linguistic, talk.

Edited by atitarev
Posted (edited)

If we take this any further, we should probably open a new thread...

I got interested and searched for it. I read an article yesterday when I replied in the thread and almost quoted the opinion; I will search for it again later. The agglutinative nature of Japanese makes the concept of a word rather different from that in English. It doesn't mean that Japanese is not made of words, but word boundaries are not always understood and agreed on: people may not agree on how many words are in a sentence.

I don't personally support this point of view but here are a few things that would make sense:

How many words here?

ではない - de wa nai, dewa nai, dewanai - 3, 2 or 1 word(s)?

知っている - shitte iru or shitteiru - 2 or 1 word?

わたしは - when breaking up words with spaces for schoolchildren, there is never a space between a word and a particle. Is it one word or two?

Colloquially, the Japanese don't use 単語 "(a single) word" but 言葉 - words, word, speech, what someone says. I have yet to look into how Japanese speakers talk about how many words they said; it may be even more complicated than in Chinese, where one can simply count the number of 字 in a sentence.

OK, your examples aren't whole clauses, and they're only mildly problematic: ではない and 知っている are predicates that can be broken down into words. dewa is indeed problematic, with some people claiming that it has acquired wordhood and others still insisting on keeping de and wa apart.

shitte iru is a complex predicate, consisting of te-converb and the auxiliary verb iru. But interestingly, it can be contracted to shitteru, where it then would be one phonological word (but probably still two words morphologically).

watashi wa - watashi-wa: this is again a difference between morphological and phonological word. The Japanese particles are clitics, so they "lean" onto the preceding word to form a phonological word, but from a morphological point of view they're not part of the word (I know some scholars view this differently, but this is the majority view).

The challenge for learners is also that the pitch accent marked in some dictionaries concerns not just the dictionary word but what follows it - for some words the high pitch ends on the last syllable, and for others the high pitch continues over the rest of the phrase.

If your dictionary has marked the accent at all. But as I said, this is related to the notion of these particles as clitics: only then will they form part of a phonological word. If no clitic follows the word, a word that's accented on the last mora (Japanese is a mora language, not a syllable language) will have the same pitch contour as an accent-less word, namely low on the first mora and high all the way after.
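
As a rough illustration of that last point, here is a small Python sketch of the simplified textbook model of Tokyo-type pitch accent. The function name `pitch_contour`, the L/H notation and the accent-number convention (0 = accent-less, k = pitch falls after mora k) are assumptions made for this example; actual speech is more nuanced than a string of L's and H's.

```python
def pitch_contour(morae: int, accent: int, particle: bool = False) -> str:
    """Return one 'L' or 'H' per mora, plus one for a following particle.

    Simplified model (an assumption of this sketch):
      accent == 0  -> accent-less: low first mora, high afterwards
      accent == 1  -> high first mora, low afterwards
      accent == k  -> low first mora, high up to mora k, low afterwards
    """
    if morae < 1 or not 0 <= accent <= morae:
        raise ValueError("accent must lie between 0 and the number of morae")

    total = morae + (1 if particle else 0)  # a clitic particle adds one mora
    pattern = []
    for i in range(1, total + 1):
        if accent == 0:
            high = i > 1
        elif accent == 1:
            high = i == 1
        else:
            high = 1 < i <= accent
        pattern.append("H" if high else "L")
    return "".join(pattern)


# A three-mora word, accent-less versus accented on the last mora.
print(pitch_contour(3, 0), pitch_contour(3, 3))              # in isolation
print(pitch_contour(3, 0, True), pitch_contour(3, 3, True))  # with a particle
```

In isolation the two words get the same contour; the difference only surfaces on the particle, which stays high after the accent-less word and drops after the final-accented one.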

Korean is agglutinative as well; I wouldn't be surprised if it has problems with the definition of a word too.

Every language has problems with the definition of a word. But I fail to see how agglutination makes this complicated here. Where agglutination involves affixes (or suffixes in the case of Korean and Japanese), they just form one word with the root, no problem with that...

Vietnamese puts spaces between all syllables (many used to be written in Chinese characters); are they all individual words?

According to Hannas' book "Asia's orthographic dilemma", this is a holdover from the Chinese script, and makes Vietnamese harder to read than necessary, and leads some people to believe that Vietnamese is indeed a monosyllabic language.

Thai and other South-East Asian languages don't put spaces between words and also used to be monosyllabic, so the issue would be very similar to Chinese and Vietnamese, despite the difference in the writing systems.

Don't paint with too broad a brush, "South-East Asian languages" include many polysyllabic languages, and even within the Austroasiatic family, Vietnamese is actually the odd man out here (possibly under Chinese influence): most of the other languages, including Khmer, are sesquisyllabic (1.5 syllables), not monosyllabic.

I don't have my Thai grammar on me right now, but as far as I remember, even though Thai doesn't have spaces, there are plenty of indicators of where a word ends, like letters being pronounced differently depending on position and IIRC also some morphophonological processes that only occur at the word boundary etc.

Unfortunately I don't recall right now how compounds work in Thai and how problematic they would be for wordhood, but other than that I also don't see any other major problems, what with nouns and verbs (almost?) never being accompanied by particles of any kind...

Edited by chrix
Posted (edited)
Don't paint with too broad a brush, "South-East Asian languages" include many polysyllabic languages

OK, I won't, as I am not an expert. Of course, Malay and Indonesian are not in that group, but I reckon that at least Burmese and Lao have a similar structure to Thai. I assumed Khmer would as well, but thanks for correcting me.

Unfortunately I don't recall right now how compounds work in Thai and how problematic they would be for wordhood, but other than that I also don't see any other major problems, what with nouns and verbs (almost?) never being accompanied by particles of any kind...

They do use particles or tense prefixes, not dissimilar to Chinese, which may come before or after the verb (usually before in Vietnamese). Optional plural markers also precede the word in Vietnamese (các, chúng) - các tôi, chúng tôi - "we", plural for "tôi".

As an example, take Thai คนรัสเซีย (kon rátsia) - "Russian (person)" or Vietnamese người Nga: I'm not sure I can definitely say whether they are two words or one in either case. They are spelled together in Thai and separately in Vietnamese. Don't get me wrong, it doesn't cause any problem with understanding or learning, as with 俄罗斯人 (Chinese) or ロシア人 (Japanese). But in Chinese and Japanese dictionaries, at least, these would be listed under Russia (俄罗斯 / ロシア). Vietnamese and Thai dictionaries may prefer to give รัสเซีย and Nga as anything to do with Russia (country, language, people), but you would need to add ภาษา (paasăa) (Th.) / tiếng (Vi) for language, คน (kon) (Th.) / người for person and ประเทศ (bpràtêt) (Th.) / nước (Vi) for country to the beginning.

In the case of Thai, the combination would not have spaces, like anything else (a space is used as a full stop or for learners). In the case of Vietnamese, it's written separately, and you may decide according to your preference whether to treat the resulting tiếng Nga, người Nga and nước Nga as separate or single words.

These languages have a large layer of borrowed multi-syllable words. In the case of Thai, they are definitely single words, e.g. ภาษา - language; with Vietnamese, I can only say for sure about the words that are spelled together - a rather recent trend, for borrowings only.

Disclaimer: I am not actively learning Thai or Vietnamese, but I expose myself to them when I have time, for language comparison and fun.

Edited by atitarev
disclaimer
Posted
They do use particles or tense prefixes, not dissimilar to Chinese, which may come before or after the verb (usually before in Vietnamese). Optional plural markers also precede the word in Vietnamese (các, chúng) - các tôi, chúng tôi - "we", plural for "tôi".

As an example, Thai คนรัสเซีย (kon rátsia) - Russian (person) or Vietnamese người Nga, not sure I can definitely say they are two words or one in both cases. Don't get me wrong, it doesn't cause any problem with understanding or learning, like in the case of 俄罗斯人 (Chinese) or ロシア人 (Japanese) but in Chinese and Japanese dictionaries, at least, these would be listed under Russia (俄罗斯 / ロシア), Vietnamese and Thai dictionaries may prefer to give รัสเซีย and Nga as anything to do with Russia (country, language, people) but you would need to add ภาษา (paasăa) (Th.) / tiếng (Vi) for language, คน (kon) (Th.) / người for person and ประเทศ (bpràtêt) (Th.) / nước (Vi) for country.

verbal particles: are usually not considered part of the verb unless they can be shown to be some kind of inflection. Also things to consider: is there really no syntactic flexibility at all, is the order of such elements totally fixed and rigid, can nothing intervene between particle and verb? If that's the case, the case for separating them becomes weaker...

nominal compounds: this is indeed a problematic area for determining wordhood, where the line is often blurry. But as you say, this doesn't give learners any particular trouble. The thing about language is, not everything can be neatly categorised, there are a lot of continua around...

Posted
verbal particles: are usually not considered part of the verb unless they can be shown to be some kind of inflection. Also things to consider: is there really no syntactic flexibility at all, is the order of such elements totally fixed and rigid, can nothing intervene between particle and verb? If that's the case, the case for separating them becomes weaker...

Sorry, I don't quite understand you. Do you mean, if something CAN intervene, they are definitely separate words?

Posted

Not necessarily. If something can intervene, but it is all in the same order, this could also be a case of an agglutinating language, where you have a lot of affixes after one another. But usually these would need to be shown to be part of a larger paradigm. But if there is no such paradigm, then yes, an intervening element would speak for it being a separate word.

It's usually only elements that can't occur on their own that are tricky. If some word is used like an auxiliary but can be used on its own, this would also count in favour of its wordhood...

Posted

Verbal particles (pre- and postpositions) are more like words in both Thai and Vietnamese. In Japanese and Korean the verb endings are not separate words, but as discussed, there are many arguable cases - Jap. する いる になる ておく てくる, Kor. - 합니다, etc.

Posted

In Japanese and Korean, you have two groups:

- verbal endings that are part of the verb.

- auxiliaries that form a complex predicate with the lexical verb. Your arguable cases all fall under this. No problem with wordhood here.

Posted
- auxiliaries that form a complex predicate with the lexical verb. Your arguable cases all fall under this. No problem with wordhood here.

Are 勉強する and 감사합니다 words?

Posted

this is a bit trickier. Phonologically generally they are words, but morphologically speaking the opinions do differ. The connection is much more rigid than for the other auxiliary constructions. I'd say this is one of them grey area cases...

Posted

Going back to the original thread and the original question (Concept of word in Chinese): unlike in English, in some languages it's hard to say with 100% certainty how many words there are in a sentence. In English, only 's and 're (as in he's, you're) may cause a non-linguist some confusion. I want to check with my Japanese friends how they use the popular (non-scientific) word count.

Posted (edited)

Sorry for being off-topic, but where did you get the sources from?!

nước (Vi) for country

Wrong! That's Vietnamese for "water, liquid, fluid, etc..."

Vietnamese for country = quốc which is based on Cantonese "gwok"/"kwok", which is written as "国 / 國" in Chinese.

tiếng (Vi) for language

Wrong! tiếng doesn't mean "language(s)" in Vietnamese. It means either "talk" or "speak".

It refers to the language if and only if tiếng is in front of the name of the language.

Edited by trien27
Posted (edited)

These are perfect synonyms, which also have other meanings:

English wiktionary

nước

tiếng

They are linked to the Vietnamese wiktionary, which confirms the meaning.

It refers to the language if and only if tiếng is in front of the name of the language.

That's what I did, didn't I? Or "tiếng mẹ đẻ" - mother tongue.

Or more reliable, both words are there:

http://vdict.com/

Edited by atitarev
Posted
quốc which is based on Cantonese "gwok"/"kwok"

Make that Middle Chinese 國. Cantonese is too new.

Posted
Quote:

nước (Vi) for country

Wrong! That's Vietnamese for "water, liquid, fluid, etc..."

I think trien27 should revise his Vietnamese: nước tôi = my country

Quote:

tiếng (Vi) for language

Wrong! tiếng doesn't mean "language(s)" in Vietnamese. It means either "talk" or "speak".

Trien27 is wrong again: tiếng nước tôi = the language of my country
