Chinese-Forums

Status of Traditional


(JR)


Have been ironing out the details for the last week or so, so things should be pretty much taken care of by now. The two real issues I know of that remain are: (1) Some GB2312 characters don't have traditional UTF equivalents, so if anyone copies and pastes GB2312 text into the textbox, some of the odder punctuation marks (such as the ellipsis) will not be identified correctly. (2) Any additions people make to the database will not be added in traditional form until the next review process, so they will appear missing in traditional annotations until then.

Those aside, if people can let me know of any problems they run into, that is probably the best way to help at this stage.


  • 4 months later...

Hi,

Just stumbled onto this site in the last few days (love it). I also did some searching around for answers to my questions, but couldn't find much -- with the exception of this older thread. So, my question is, what's the status of a traditional version of the website www.newsinchinese.com? Is there an equivalent out there that I'm unaware of? If so, can you point me in the right direction?

Thank you very much for any help you can provide! BTW -- great site, and forum!


NewsinChinese feeds off the Xinhua RSS, so everything on the site is simplified. Am planning to update the site in the near future, though, to integrate it with something that will allow people to manually annotate/correct errors and store texts long term. If anyone knows any good sources for syndicated RSS feeds with traditional news, we can add traditional articles too.

The Adso engine can handle traditional Chinese that's encoded in UTF-8. So in the meantime you can always try either submitting the content to the annotation engine or using the Firefox plugin while viewing online traditional texts:

http://www.chinese-forums.com/index.php?/topic/7087-firefox-plugin-chinese-text-annotation
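Since the engine wants UTF-8, traditional text that arrives in Big5 would need re-encoding before it is submitted. A minimal sketch of that step, assuming Python's built-in `big5` codec covers the source text; the helper name `big5_to_utf8` is made up for illustration and is not part of Adso:

```python
def big5_to_utf8(raw: bytes) -> bytes:
    """Re-encode Big5 bytes as UTF-8 (hypothetical helper, not part of Adso)."""
    # errors="replace" swaps undecodable bytes for U+FFFD instead of raising,
    # so odd characters show up as a visible marker rather than a crash.
    text = raw.decode("big5", errors="replace")
    return text.encode("utf-8")


# Build a small Big5 sample in place of a real page.
sample = "繁體中文".encode("big5")
print(big5_to_utf8(sample).decode("utf-8"))
```

Pages that declare Big5 but use vendor extensions may still produce replacement characters, so it is worth checking the output for U+FFFD before relying on it.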


  • 2 weeks later...

Xinhua has started offering their Big5 newsfeeds in RSS as well:

http://big5.xinhuanet.com/gate/big5/www.xinhuanet.com/rss.htm

Also, how did you set up newsinchinese to grab entire articles? RSS feeds normally only contain some basic descriptive information, and then a link to the full article. If I could get hold of the full articles for other RSS feeds, similar to how you have it at newsinchinese, then it would not be difficult to set up a server to mirror other banned servers' news feeds.
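To illustrate what a feed typically carries (a title, a link, and a short description per item, not the full article), here is a rough sketch using only the Python standard library. The feed text is made up for the example; a real feed would be downloaded first, and the function name `list_items` is hypothetical:

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 feed for illustration only.
SAMPLE_RSS = """<rss version="2.0"><channel>
<title>Example feed</title>
<item><title>Story one</title><link>http://example.com/1</link>
<description>Short summary only.</description></item>
<item><title>Story two</title><link>http://example.com/2</link>
<description>Another summary.</description></item>
</channel></rss>"""


def list_items(rss_text):
    """Return title, link and summary for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


for entry in list_items(SAMPLE_RSS):
    print(entry["title"], "->", entry["link"])
```

Getting the full article text would then mean fetching each `link` separately.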

Keith


The full articles don't come via RSS - the whole page is fetched, and then the non-article bits are discarded (can't remember exactly how, but it's something like spotting a start marker and an end marker in the page and keeping the bit between). Xinhua doesn't make it THAT easy, but you get the drift. It fails sometimes, and then you see the newsinchinese page get messed up royally.
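The keep-the-bit-between idea could look roughly like the sketch below. This is not the actual newsinchinese code; the marker strings and the function name `extract_between` are hypothetical, and a real page would need whatever markers it actually uses:

```python
def extract_between(page, start_marker, end_marker):
    """Return the text between two markers, or None if either is missing."""
    start = page.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    # Search for the end marker only after the start marker.
    end = page.find(end_marker, start)
    if end == -1:
        return None
    return page[start:end].strip()


# Hypothetical markers; real sites use their own, if any.
START = "<!-- article-start -->"
END = "<!-- article-end -->"
page = ("<html><body>nav " + START + "<p>Article text.</p>" + END
        + " footer</body></html>")
print(extract_between(page, START, END))
```

Returning `None` when a marker is missing lets the caller fall back to something sensible instead of rendering a broken page.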

If you wanted, you could do something similar for banned websites, but running a simple proxy (disguise the URLs, btw, that's what I found important when setting up a BBC mirror) is probably easier.

Full article calls only happen on user request, so it doesn't happen for every article.


"If you wanted, you could do something similar for banned websites, but running a simple proxy (...) is probably easier."

Do you mean running a mirror server?

I'm in the States currently, so accessing banned sites is not a problem. I was just thinking of ways to provide the news feeds so that they could potentially be used on newsinchinese (although if you were to do that, you would have to be careful not to get your own server banned).


On a side note, it probably wouldn't be too terrible a task to write up a script to grab news from other servers. Just taking a glance at 聯合報, they have unique "FONT" tags in front of the title of the article, and a second in front of the body of the article.
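A rough sketch of what such a script might do, using Python's standard `html.parser`. The class names `story_title` and `story` are invented for the example (the actual attributes on the 聯合報 pages are not given here), and it assumes the text sits inside the FONT element rather than after it:

```python
from html.parser import HTMLParser


class FontScraper(HTMLParser):
    """Collect text inside FONT tags that carry specific class names."""

    # Hypothetical class names; replace with whatever the real page uses.
    TITLE_CLASS = "story_title"
    BODY_CLASS = "story"

    def __init__(self):
        super().__init__()
        self.current = None  # which field we are collecting, if any
        self.parts = {"title": [], "body": []}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <FONT> arrives as "font".
        if tag != "font":
            return
        css_class = dict(attrs).get("class")
        if css_class == self.TITLE_CLASS:
            self.current = "title"
        elif css_class == self.BODY_CLASS:
            self.current = "body"

    def handle_endtag(self, tag):
        if tag == "font":
            self.current = None

    def handle_data(self, data):
        if self.current:
            self.parts[self.current].append(data)


def scrape(html):
    """Return (title, body) text found in the marked FONT tags."""
    parser = FontScraper()
    parser.feed(html)
    parser.close()
    title = "".join(parser.parts["title"]).strip()
    body = "".join(parser.parts["body"]).strip()
    return title, body


sample = ('<FONT class="story_title">標題</FONT><p>skip</p>'
          '<FONT class="story">內文第一段</FONT>')
print(scrape(sample))
```

Nested FONT tags inside the body would end collection early in this sketch, so a real page may need a depth counter.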

