Chinese-Forums

Learning vocabulary by context mining


Jan Finster


I just wanted to share something I have been doing for the last 2 weeks: 

I have recently begun focussing more on improving my active vocabulary. While doing so, I started googling, baiduing or zhihu-ing certain expressions to find example contexts. I feel that reviewing those examples may be a good way to memorise the words or grammar patterns, and it gives the brain examples of how to use them. Rather than hoping to come across the same word in different contexts in books or the like, this approach proactively searches for those contexts.

 

Here are some recent examples:

 

I wanted to better understand how to use "所需的" (needed; required; necessary for). Googling it gave the following results:

 

I added Google auto-translations, which sometimes do not make (perfect) sense, partly because I have pulled the expression out of the context of the whole sentence.

 

所需的日常飲食 Required daily diet
参会者所需的基本信息 Basic information required for participants
使用前所需的准备工作 The preparations required before use
是人体所需的六大营养 Is the six nutrients needed by the body
人体所需的营养素 nutrients the human body needs
人体所需的 required by the human body
人体所需的物质 A substance needed by the body
日常所需的物品 Daily necessary items
生活所需的 Required for life
各取所需 What they need
你所需要的 what you need
什么是人们生存所必需的 What is it necessary for survival
有您所需的服务 There are services you need
网站建设所需的资料 Information required for website construction
所需的资料 Necessary information
机能所需的能源 The required energy to function
创业所需的社会资 Social capital needed for business
所需的技能 Skills required
办公所需的办公用品 office supplies required for the office
涂鸦所需的工具 Graffiti tools required
可视化所需要的一切 Everything needed for visualization
所需要的一切 Everything you need
人体所必需的氨基酸 Essential amino acids the human body
所需要的经验是多少  how much experience is required...

 

 

Here is 主要靠 (mainly by...; mainly rely on...):

 

主要靠勤奋 Mainly by hard work
公司盈利主要靠政府补助 Corporate profits mainly by government subsidies
主要靠想象力 Mainly by imagination
学习主要靠自己 The main learning on their own
收入主要靠汽车 Income mainly by car
主要靠气质 Mainly by temperament
主要靠中等收入者来实现 Mainly by middle-income people to achieve
经济增长主要靠房地产 Economic growth depends mainly on real estate
主要靠想象力 Mainly by imagination
主要靠什么发现目标 Mainly rely on finding the target
主要靠政府 Mainly by government
原材料主要靠进口 Mainly rely on imports of raw materials
收入应主要靠工资 It should mainly rely on wage income
主要靠这三种技术 Mainly by three techniques
需求主要靠清洁能源满足 Mainly rely on clean energy to meet demand

 

 

I know there are dictionaries and sites such as http://www.jukuu.com/ that provide example sentences, but these are often not relevant to me, too long, or full of too many new words. My search is really focussed on getting a list of "sentence snippets" that I can then use as inspiration or as building blocks to make my own sentences.
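For anyone who wants to semi-automate the snippet collection, here is a minimal Python sketch. It assumes you have already pasted some Chinese text (search results, an article) into a string; the function name `mine_snippets` and the `max_len` cut-off are my own inventions for illustration, not part of any tool mentioned in this thread. It simply splits the text at punctuation and keeps the short clauses that contain the target expression.

```python
import re

# Punctuation (Chinese and Western) and whitespace that usually end a clause.
CLAUSE_BREAKS = r"[，。！？；：、,.!?;:\s]+"

def mine_snippets(text, target, max_len=15):
    """Return the unique clauses of `text` that contain `target`.

    Clauses longer than `max_len` characters are skipped, since the goal
    is short snippets rather than whole sentences. Shortest clauses first.
    """
    clauses = re.split(CLAUSE_BREAKS, text)
    hits = {c for c in clauses if target in c and len(c) <= max_len}
    return sorted(hits, key=lambda c: (len(c), c))

# Sample text assembled from snippets quoted in this thread.
sample = "使用前所需的准备工作。网站建设所需的资料，所需的技能！今天天气很好。"
for snippet in mine_snippets(sample, "所需的"):
    print(snippet)
```

Splitting at punctuation is a crude stand-in for real sentence segmentation, so the results still need a human eye before they go into a deck.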

 

By reviewing 10-30 examples of how a word is used, I also hope it will remain in my memory better than if I simply memorised it with Anki.

 

I wonder if any of you have used a similar method and/or can think of ways to improve my system.

 

(Yes, it is time-consuming. But if you listen to some random 3-hour English podcasts in the background, it is doable...)

 

(Sometimes I get carried away during my search and then add more and more new words.)


I think it's a great study method. It would be great if you'd be willing to share the deck here at some point in the future. It takes a lot of time and work for sure, and a human-curated mass sentence deck is sorely lacking among the Anki shared decks from what I can see.

 

4 hours ago, Jan Finster said:

Rather than hoping that you will read the same word in different contexts in books or the like, this approach of mine is proactively searching for those contexts.

 

On this point, I thought it might be worth clarifying: sentence mining could and should be done 'actively' in everything you do. You should be reading and looking for new words and good sentences wherever you go, and seeing every conversation as an opportunity to pick up new sentences (obviously this works better if you're in China, but it is still doable via the internet when abroad). I suppose what I'm saying is that it is more 'proactive' to use and interact with the language yourself and then get your sentences from these experiences, rather than searching online, which seems to me the more 'passive' approach. Not a criticism, just thought it was worth noting.


5 hours ago, Tomsima said:

I suppose what I'm saying is, I would argue it is more 'proactive' to use and interact with the language yourself and then get your sentence from these experiences, rather than searching online,

 

Thanks for the feedback :)

I totally agree, but being in Europe and not having a chance to go to China in the foreseeable future, this is what I (have to) do. Of course I also mine sentence snippets when I read other texts.

 

5 hours ago, Tomsima said:

I think it's a great study method. It would be great if you'd be willing to share the deck here at some point in the future. It takes a lot of time and work for sure, and a human-curated mass sentence deck is sorely lacking among the Anki shared decks from what I can see.

 

I would not mind at all. I am not sure how useful it would be though, because it may look quite random: there is really no theme or HSK level I am following. Yesterday, for instance, I was mining "尤其", "仍面临", "被判了", "我能否", "价值观的改变" plus a few more, and I ended up with close to 1000 snippets. Also, the auto-translations are often quite weird, and unless you have some grasp of the meaning you may get annoyed. Anyhow, I may share some useful snippets later on.


This sounds like a great method, especially if your goal is to develop the ability to write in a specific style (e.g. scientific papers or technical documentation). If that isn't your goal then you will still benefit from reading all these usage examples you find in the wild.

 

I don't know what your Chinese level is. If you aren't advanced then I'd suggest using the HSK word list or a word frequency listing to focus on the most common words first. Here's a graph I made with the SUBTLEX-CH word frequency listing. The first 5k most common words constitute >93% of the corpus. Don't go chasing after diminishing returns until you've picked all the low-hanging fruit.

 

[Image: graph made with the SUBTLEX-CH word frequency listing]
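If you want to check a coverage figure like this against a frequency list yourself, the calculation is short. Below is a rough Python sketch; the function name `coverage_at` is hypothetical and the toy counts are invented for illustration, they are not SUBTLEX-CH data. With a real list you would load the word counts into the dictionary first.

```python
from collections import Counter

def coverage_at(freqs, top_n):
    """Share of all tokens accounted for by the `top_n` most frequent words.

    `freqs` maps each word to its count in the corpus.
    """
    counts = sorted(freqs.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    return sum(counts[:top_n]) / total

# Toy counts, invented for illustration only (100 tokens in total).
toy = Counter({"的": 50, "是": 20, "我": 15, "所需": 10, "主义": 5})
print(coverage_at(toy, 2))  # 0.7, i.e. the top two words cover 70% of tokens
```

Running it for increasing values of `top_n` gives the cumulative curve, which is where the diminishing returns show up.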

 

In addition to internet searches, you could also look through some of the corpora out there like the LCMC to find usage examples. 


17 hours ago, 大块头 said:

if your goal is to develop the ability to write in a specific style (e.g. scientific papers or technical documentation). If that isn't your goal then you will still benefit from reading all these usage examples you find in the wild.

 

Not so much. I am not following HSK, but I am still at an intermediate level.

 

17 hours ago, 大块头 said:

If you aren't advanced then I'd suggest using the HSK word list or a word frequency listing to focus on the most common words first.

 

Thanks. This is good advice. As I am not following HSK, I do not follow their vocabulary, but rather vocabulary I read in texts, e.g. at TheChairMansBao, or whatever text my tutor uses in our online class. 

 

 

17 hours ago, 大块头 said:

Don't go chasing after diminishing returns until you've picked all the low-hanging fruit.

Currently, I still feel my approach helps me learn words better. So, it basically helps me memorise the usage of those low-hanging fruit words. And, yes, I am sure some of those words are also on the HSK lists.

 

 

 


Here are some more examples: "一个 天生 的" ("a born...", "a natural..."?)

 

他就是一个天生的技术人员 He is a born technician
他是一个天生的内向者 He is a born introvert
孩子的母親也是天生的外向者 The child's mother is also a born extrovert
內向者和外向者都是天生的 Introverts and extroverts are natural
他是天生的外向者还是内向者 He is a born extrovert or introvert
鸟儿是天生的音乐家 The birds are born musicians
孩子都是天生的演说家 Children are born orators
一个天生的歌手 A born singer
一个天生的全能艺人 A born entertainer
我不是一个天生的领袖 I am not a born leader
一个天生的运动员 A natural athlete

Here are some "-isms" (主义):

 

自由主义 Liberalism
社会主义 socialism
资本主义 Capitalism
达尔文主义 Darwinism
马克思主义 Marxism
种族主义 racism
女权主义 Feminism
共产主义 communism
爱国主义 patriotism
民族主义 Nationalism
人道主义 Humanitarianism
素食主义 Vegetarianism

"The reason I ..." (我......原因):

 

我熬夜的原因 The reason I stay up all night
我喝酒的原因 The reason I drink
我心裡的原因 my heart's reason
我单身的原因 The reason I'm single
我疯的原因 The reason I was mad
我早起的原因 The reason I get up early
我累的原因 The reason I'm tired of
我快乐的原因是什么 What is my reason for my joy
因为这是我选择的原因 Because this is the reason I chose
我后悔的原因 The reason I regret it
是促成我出家的原因 My reason I became a monk
你就是我存在的原因 You're the reason I exist
我 吃素 的 原因 The reason I am vegetarian

Jan,

Thanks for these posts.

I was thinking about something similar recently and looking for some tools that relate. I am at level HSK 5, approximately, and currently I am noticing that at least half the time when I encounter a word I don't know, I know at least one of the characters and therefore can sometimes guess the meaning.

With this thought, I was wondering if there were resources online that focus on Chinese "building blocks" for vocabulary building. This approach would make perfect sense for someone studying English or some Romance language, since if you studied prefixes and suffixes, as well as Latin roots, you would be able to guess the meaning much of the time for something that included, say, "-ization" or "produc-".

Even as a native English speaker, I sometimes consult dictionaries to find a word's etymology (derivation), and this helps me understand a new word.

I think seeing various related examples of usage of a character, along with translations of those usages, would be a great help to me in learning to be even better at guessing the meaning of new Chinese words.
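As a rough illustration of what I mean, here is a small Python sketch that groups a word list by the characters the words share. The function name `index_by_character` is made up for this post, and the word list is just a few words taken from this thread; a real version would start from a full dictionary or your own vocabulary list.

```python
from collections import defaultdict

def index_by_character(words):
    """Map each character to the words that contain it, in input order."""
    index = defaultdict(list)
    for word in words:
        # dict.fromkeys keeps each distinct character once, in order,
        # so a word with a repeated character is only listed once.
        for ch in dict.fromkeys(word):
            index[ch].append(word)
    return index

words = ["所需", "必需", "需求", "主义", "主要"]
index = index_by_character(words)
print(index["需"])  # ['所需', '必需', '需求']
print(index["主"])  # ['主义', '主要']
```

Looking up an unfamiliar character in such an index shows the words you already know that contain it, which is the guessing aid I am after.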

TofuLearn has a feature like this when you are going through their word lists, but it's not really designed for someone looking up a new word. And Chinese-Chinese dictionaries aren't useful to me yet because all their examples are a big blur to me when the list isn't designed for language learners.

Any suggestions from the peanut gallery? (I.e. lurkers who are following the discussion.)


2 hours ago, Moshen said:

TofuLearn has a feature like this when you are going through their word lists, but it's not really designed for someone looking up a new word.

 

I always wondered if there was a way to access this feature of TofuLearn outside of studying on it. That would be so cool. I believe they are accessing some (free?) online resources to provide these lists. So, in theory, it should be accessible (?)


The suggestions to use dictionary-provided examples may be convenient, but I think they actually work against some of the benefits of @Jan Finster doing the legwork and finding the sentences through Google:

1) Phrases in the wild will naturally be more authentic;

2) The process of searching will help her (him?) get used to scanning Chinese for what she or he is looking for. Incidentally, that skill is also important for HSK 5 and 6. Additionally, research by Laufer and others has shown that vocabulary acquisition can be strengthened when the learner needs to search for the vocab in some way;

3) Greater context provided from the webpage on which the phrase is found. Isolated dictionary sentences are still divorced from context and can make it difficult to determine the level of formality being used.

