Chinese-Forums
Chinese filenames in UNIX/Linux


Posted

A long shot, but we have some gurus here who might know the solution.

Problem: UTF-8 locale in the console, but GB2312-encoded filenames.

It's a complete mess and impossible to tell what is what; all I get are question marks.

Is there a way to rename these to their UTF-8 equivalents without too much effort?

I'm not looking for a handout; a push in the right direction would be appreciated. Perhaps a Perl or Python script can do such a thing?
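For reference, the kind of script being asked about could look roughly like the sketch below. It assumes the names on disk really are GB2312 bytes (not literal question marks), and the function names `gb_to_utf8_name` and `rename_directory` are made up for illustration. It works on byte paths so Python does not try to decode the names with the UTF-8 locale, and it defaults to a dry run.

```python
import os


def gb_to_utf8_name(raw: bytes, source_encoding: str = "gb2312") -> bytes:
    """Re-encode a single filename from source_encoding to UTF-8 bytes."""
    return raw.decode(source_encoding).encode("utf-8")


def rename_directory(directory: bytes, source_encoding: str = "gb2312",
                     dry_run: bool = True):
    """Return (old, new) byte-name pairs; rename on disk only if dry_run is False."""
    planned = []
    for raw in os.listdir(directory):  # bytes path in, bytes names out
        try:
            raw.decode("utf-8")
            continue  # already valid UTF-8 (this includes plain ASCII names)
        except UnicodeDecodeError:
            pass
        try:
            new = gb_to_utf8_name(raw, source_encoding)
        except UnicodeDecodeError:
            continue  # not valid in the assumed source encoding either; skip
        planned.append((raw, new))
        if not dry_run:
            os.rename(os.path.join(directory, raw),
                      os.path.join(directory, new))
    return planned
```

Calling `rename_directory(b".")` only reports what would change; passing `dry_run=False` performs the renames. If some names use characters outside GB2312, `"gbk"` or `"gb18030"` may be a better guess for `source_encoding`.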

Posted

I wrote some Java code a while back to do this. There's probably a simpler way, but if you can compile Java, here's the source: http://people.csail.mit.edu/imcgraw/Converter.txt

Oh... but I just realized you're talking about the file names, not the contents of the files... that I'm not sure about... maybe the Java code could be adapted, but again, there's probably a simpler way. I just use Java because the rest of my work requires it.

Posted

Thanks for the helping hand, but you're right, I'd like to convert filenames.

I'm afraid that they have already lost all the information at the moment they were saved to the filesystem, though. I have a feeling that the ?????-01.rmvb are in fact just question marks, at least that's what my attempts so far seem to suggest.....

EDIT: I found this, but it doesn't work on my files. I guess the information I need is lost already.

Posted
"I have a feeling that the ?????-01.rmvb are in fact just question marks"

The easy way to check would be to set the encoding of the shell to GB2312 and see if you can see the files.
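Another way to check, independent of any terminal setting, is to look at the raw bytes of the names. The sketch below is only an illustration (the helper names `classify_name` and `inspect_directory` are invented here): if a name comes back as "ascii" and its bytes contain `?` (0x3F), the question marks are really stored on disk and the original characters are gone.

```python
import os


def classify_name(raw: bytes) -> str:
    """Classify a filename's bytes as 'ascii', 'utf-8' or 'non-utf8'."""
    if all(b < 0x80 for b in raw):
        return "ascii"  # a literal '?' (0x3F) would land here
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "non-utf8"  # possibly GB2312 or another legacy encoding


def inspect_directory(directory: bytes = b".") -> None:
    """Print each entry's classification next to its raw bytes."""
    for raw in sorted(os.listdir(directory)):
        print(classify_name(raw), repr(raw))
```

Running `inspect_directory(b"/path/to/downloads")` (the path is a placeholder) prints one line per file; names shown as "non-utf8" still carry their original bytes and are candidates for conversion.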
Posted

No, the filenames are completely unimpressed by my encoding settings. ??? galore.

I'm guessing that MLDonkey failed to save them using the original encoding, or that the filesystem (ext3) didn't like them.

Posted

Chances are good it's an MLDonkey problem. I've got some filenames in GB2312, and they display as random characters in the file browser (but as ?'s in the console, I think), but I am able to copy/paste them into an editor and convert them. Haven't bothered to yet though, as it's a fair amount of effort...
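The copy/paste-and-convert step can also be done on the pasted text itself. This is a sketch under an assumption: that the "random characters" are GB2312 bytes that were displayed as Latin-1. If the file browser used a different fallback encoding, the `wrong` argument would need to change. `repair_mojibake` is a made-up helper name.

```python
def repair_mojibake(garbled: str, wrong: str = "latin-1",
                    actual: str = "gb2312") -> str:
    """Undo a wrong decode: turn the text back into bytes, then decode properly."""
    return garbled.encode(wrong).decode(actual)


if __name__ == "__main__":
    original = "中文"
    # Simulate what a Latin-1 display would show for GB2312 bytes.
    garbled = original.encode("gb2312").decode("latin-1")
    print(garbled)
    print(repair_mojibake(garbled))
```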

Posted

The mlconv utility I linked to will do that for you automatically.

Unfortunately, that doesn't help me, so I'm stuck with a bunch of ?????-01.rmvb files, which is great for watching things from the command line....

Posted

Why not rename them all to something in pinyin? Surely you could do a bulk rename?

Posted

I get the same kind of thing when using aMule, although I can input and read most anything else in Chinese just fine.

  • 3 weeks later...
Posted

"Why not rename them all to something in pinyin? Surely you could do a bulk rename?"

I'll probably end up doing this, but since they all land in the same download directory, which is full of different files from our episode project, it will take some script-fu.
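A rough sketch of that script-fu, assuming the damaged names start with a run of literal `?` characters and everything else in the directory should be left alone. The `episode` prefix and the function names are placeholders; the real pinyin title would have to be supplied by hand, since it cannot be recovered from the question marks.

```python
import os
import re

QUESTION_RUN = re.compile(r"^\?+")


def plan_renames(names, prefix="episode"):
    """Pair each name starting with '?' with a numbered replacement name."""
    used = set(names)
    plan = []
    counter = 1
    for name in sorted(names):
        if not QUESTION_RUN.match(name):
            continue  # untouched: not one of the damaged names
        rest = QUESTION_RUN.sub("", name, count=1)
        while True:
            candidate = f"{prefix}{counter:03d}{rest}"
            counter += 1
            if candidate not in used:
                break  # avoid clobbering an existing file
        used.add(candidate)
        plan.append((name, candidate))
    return plan


def apply_plan(directory, plan, dry_run=True):
    """Print the plan; rename on disk only when dry_run is False."""
    for old, new in plan:
        print(f"{old!r} -> {new!r}")
        if not dry_run:
            os.rename(os.path.join(directory, old),
                      os.path.join(directory, new))
```

Something like `apply_plan(d, plan_renames(os.listdir(d)))` shows the proposed names first; nothing is renamed until `dry_run=False` is passed.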

Posted

Not yet. I've been really busy recently, and I'm still catching up with some of the shows we've started.

Also, I'm trying to savour it while it lasts. Don't want the joy to end :mrgreen:

  • 7 months later...
Posted

Just an update.

It seems to be a KMLDonkey problem. If I download directly from MLDonkey's browser interface, I get the Chinese names just fine (in UTF-8).

I'm hoping that this will be fixed in the new version of KMLDonkey; until then, I'll use the browser interface. Now I need an efficient way to use the tab-completion together with Chinese input in Konsole :mrgreen:

  • 1 month later...
Posted

I've fixed the problem by setting KMLDonkey's default encoding to UTF-8.

Now everything is A-OK.
