Flickserve Posted September 11, 2016 at 05:15 AM
I have some PDFs from the CCTV series Growing Up with Chinese. In the PDFs, the pinyin sits above each character in a sentence. If I copy and paste the text into a Word file, the pinyin appears to the right of each word across the whole sentence. At the moment, I have to manually delete each piece of pinyin from the sentence, which is quite a laborious process. How can I easily convert such a sentence into Chinese characters only?
edelweis Posted September 11, 2016 at 01:15 PM
Linux (or cygwin etc.): sed 's/[a-zA-ZüÜāēīōūǖĀĒĪŌŪǕáéíóúǘÁÉÍÓÚǗǎěǐǒǔǚǍĚǏǑǓǙàèìòùǜÀÈÌÒÙǛ]//g' inputfile.txt > outputfile.txt
Or, if you have a text editor that supports search/replace with regular expressions, try replacing [a-zA-ZüÜāēīōūǖĀĒĪŌŪǕáéíóúǘÁÉÍÓÚǗǎěǐǒǔǚǍĚǏǑǓǙàèìòùǜÀÈÌÒÙǛ] with an empty string. Or, if you're on Windows, you could install ActivePerl and do the same as the sed command above with a tiny Perl script.
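For anyone without sed or Perl to hand, here is a rough Python sketch of the same substitution. It is only an illustration, not something posted in the thread: the names strip_pinyin and convert_file are made up for this example, the file names are the ones from the sed command, and it assumes the text is saved as UTF-8. It also normalizes the text to NFC first, on the assumption that some PDFs paste tone marks as separate combining characters.

```python
import re
import unicodedata

# ASCII letters plus the tone-marked vowels from the character class in the post above.
PINYIN_CHARS = "a-zA-ZüÜāēīōūǖĀĒĪŌŪǕáéíóúǘÁÉÍÓÚǗǎěǐǒǔǚǍĚǏǑǓǙàèìòùǜÀÈÌÒÙǛ"
PINYIN_RE = re.compile(f"[{PINYIN_CHARS}]+")


def strip_pinyin(text: str) -> str:
    """Delete every run of pinyin letters; everything else is left untouched."""
    # NFC merges a base letter plus a combining tone mark into one
    # precomposed character, so the character class above can match it.
    text = unicodedata.normalize("NFC", text)
    return PINYIN_RE.sub("", text)


def convert_file(src: str, dst: str) -> None:
    """Read src as UTF-8, strip the pinyin, and write the result to dst."""
    with open(src, encoding="utf-8") as infile:
        cleaned = strip_pinyin(infile.read())
    with open(dst, "w", encoding="utf-8") as outfile:
        outfile.write(cleaned)
```

Calling convert_file("inputfile.txt", "outputfile.txt") would mirror the sed command. Like the sed version, this removes any Latin letters, so English words in the text disappear too, and spaces between syllables are left in place.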
Flickserve (Author) Posted September 13, 2016 at 12:32 AM
@edelweis I couldn't get that to work conveniently in Word 2010, but your post led me to another method by complete accident. I noticed that, after copying the text over, the font for the pinyin was different. I selected one piece of pinyin, then went to where "Find" and "Replace" are located; next to them is another menu called "Select". In that menu there is an option to "Select text with similar formatting". Click on that and all the pinyin with similar formatting is selected. Then press Delete, and all the pinyin in that format is removed at once. The pinyin may appear in two or three different fonts through the document, so repeat as necessary. The same can be done separately for English text and Chinese text if you wish.