骏马的丕沿? Posted November 27, 2021 at 07:23 PM

A CSV file is attached. The encoding should be utf-8-sig.

I compiled this list a while ago, using a web scraper (a script that collects data from the internet) that I built. Most of the entries came from https://chengyu.911cha.com/

Take a glance at https://www.zhihu.com/question/399852044 to get an idea of approximately how many chengyu there are. Obviously I used a pretty liberal definition of "chengyu" here, since there are more than 40,000 in this list.

Feel free to use this as a study tool.

Attachment: 成语清单.csv
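For anyone who wants to load the attachment in a script, here is a minimal sketch. It assumes each row holds one chengyu in its first column; the actual column layout of 成语清单.csv is not described in the post, and the function name `load_chengyu` is only illustrative. Opening the file with `utf-8-sig` strips the byte-order mark that would otherwise stick to the first entry.

```python
import csv


def load_chengyu(path):
    """Return the set of non-empty first-column values from a CSV file.

    Assumption: one chengyu per row, in the first column.
    """
    # utf-8-sig removes the leading BOM if the file has one.
    with open(path, encoding="utf-8-sig", newline="") as f:
        return {
            row[0].strip()
            for row in csv.reader(f)
            if row and row[0].strip()
        }
```

Usage would look something like `chengyu = load_chengyu("成语清单.csv")`. If the file turns out to have a header row, that header would end up in the set and would need to be discarded.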
骏马的丕沿? (Author) Posted November 27, 2021 at 07:28 PM

Related note: if anyone wants a recommendation for a 成语 dictionary, I would highly recommend 新华成语词典(第二版), by 商务印书馆辞书研究中心, published by 商务印书馆. Awesome dictionary! I'd say it's the most reliable resource I've used for learning chengyu.
mungouk Posted November 28, 2021 at 03:09 PM

On 11/28/2021 at 3:23 AM, 骏马的丕沿? said: "Feel free to use this as a study tool"

So how do you imagine we might do that, any ideas? 40,000 is more of a dictionary than anything else... how are you using it yourself?

There doesn't seem to be any data in there apart from the text itself. Definitions and particularly frequency data (how commonly they are used) would be very useful... are you able to add that?

Thanks!
骏马的丕沿? (Author) Posted November 28, 2021 at 04:08 PM

On 11/28/2021 at 9:09 AM, mungouk said: "So how do you imagine we might do that, any ideas? 40,000 is more of a dictionary than anything else... how are you using it yourself? There doesn't seem to be any data in there apart from the text itself. Definitions and particularly frequency data (how commonly they are used) would be very useful... are you able to add that? Thanks!"

Sorry, I'm not exactly sure! I'm sharing this chengyu list only because I think the data could be important to others. I created it because I wasn't able to find a list of chengyu like this on the internet in the first place. So here I am, providing it in case anyone wants to use the data in some way.

Personally, I find this list useful, but that's only because I have a script that returns all the chengyu that appear in a given chunk of text. I'm currently reading 红楼梦. Before every chapter, I sit down with my 成语 dictionary and study all of the 成语 that appear in the chapter I'm about to read (I do this because I'm interested in learning more chengyu). This is only possible because I have a giant list that tells me all of the chengyu that appear in each chapter of 红楼梦, and I have this giant list because 1) I scraped a ton of chengyu from the internet and put them in a CSV file, and 2) I used this CSV file in my other script to find out which chengyu appear in which chapters. Efficiently studying chengyu in this way would be impossible without this programming component.

I would share this script with you, but it would have to be via GitHub. I'm not well-versed in GitHub in the first place, and I'm also not very experienced in "exporting" programs/scripts to other users, so I'm not sure how I would go about doing that.
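The author's script was not shared, so the following is only a sketch of how "return all the chengyu that appear in a given chunk of text" could work, not the author's actual method. It slides over the text and checks every substring whose length matches some entry in the list; the name `find_chengyu` and the sample data are illustrative assumptions.

```python
def find_chengyu(text, chengyu_set):
    """Return the chengyu from chengyu_set that occur in text,
    in order of first appearance and without duplicates."""
    # Only substring lengths that actually occur in the list need checking.
    lengths = sorted({len(c) for c in chengyu_set}, reverse=True)
    found = []
    seen = set()
    for start in range(len(text)):
        for n in lengths:
            candidate = text[start:start + n]
            if len(candidate) == n and candidate in chengyu_set and candidate not in seen:
                seen.add(candidate)
                found.append(candidate)
    return found


sample_list = {"一心一意", "水到渠成", "画蛇添足"}
sample_text = "他一心一意地读书,终于水到渠成。"
print(find_chengyu(sample_text, sample_list))
```

Because the lookup is a set membership test, the cost per chapter depends on the length of the text and the number of distinct entry lengths, not on the 40,000 entries themselves. Running it once per chapter would give the per-chapter lists described above.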
phills Posted November 28, 2021 at 04:39 PM

On 11/29/2021 at 12:08 AM, 骏马的丕沿? said: "This is only possible because I have a giant list that tells me all of the chengyu that appear in each chapter in 红楼梦"

A giant list like this is pretty handy if you're into scripting, so I would thank you for that alone!

To make it more useful as a study aid for myself, I ran your list of Chapter 1 chengyu from 红楼梦 and found about two-thirds of them in CE-dict. https://www.chinese-forums.com/forums/topic/61737-list-of-all-chengyu-in-红楼梦/#comment-483571

One thing I noticed while doing that is how many chengyu variants there are. Another thing you could consider doing to organize the list is grouping all the variants together.

Finally, I have a list of 7000 chengyu that's supposed to be reverse sorted by order of frequency. I got it from the internet; you can probably find an even bigger frequency list out there somewhere. But 7000 plus variants probably gets you to around 10k out of your 40k total. All of those, filtered against a particular book, would probably make a handy vocab starter list for tackling a complicated work.
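The filtering idea in the last paragraph could be sketched roughly as follows. This assumes the frequency list is an ordinary Python list with the most common chengyu first (the post does not say exactly how the 7000-entry list is ordered or formatted), and `rank_by_frequency` is a hypothetical helper name.

```python
def rank_by_frequency(found, frequency_list):
    """Split the chengyu found in a book into those covered by the
    frequency list (most common first) and those it does not cover."""
    # Assumption: frequency_list is ordered from most to least common.
    rank = {c: i for i, c in enumerate(frequency_list)}
    ranked = sorted((c for c in found if c in rank), key=rank.__getitem__)
    unranked = [c for c in found if c not in rank]
    return ranked, unranked


found_in_book = ["水到渠成", "一心一意", "罕见成语"]  # placeholder data
frequency_list = ["一心一意", "画蛇添足", "水到渠成"]  # placeholder data
starter_list, leftovers = rank_by_frequency(found_in_book, frequency_list)
print(starter_list, leftovers)
```

The first list would serve as the vocab starter list, and the second shows which entries the frequency data does not cover, for example variants that would first need to be grouped with their common form.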
shawky.nasr Posted January 29, 2022 at 11:51 AM

There are good dictionaries for 红楼梦:

First one: 新编红楼梦辞典 (recommended)
Second one: 红楼梦大辞典(增订本)
Third one: 红楼梦汉英文化大辞典 A Chinese-English Dictionary of Idioms from A Dream of Red Mansions
Fourth one: 《红楼梦成语辞典》

Some of them are quite old, so you can find them online.
sciuro (New Members) Posted January 8, 2023 at 01:43 AM

Thank you!