CEDICT

UPDATE (Thursday 10th May 2018): Unfortunately, Neocities does not allow zip files, so the links below to zip files (here and elsewhere) go via an Internet Archive copy.

The download for CEDICT can be found on Erik Peterson's website, www.mandarintools.com. This version of the file has 26404 entries (my download indicates a date of 30th May 2003). Each entry has three main parts, and the file is sorted by pronunciation in pinyin.

[1] Big5 encoded Chinese characters
[2] Pronunciation in pinyin
[3] English meaning
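
As a rough illustration of the three parts, here is a minimal Python sketch that splits one entry line. It assumes the usual CEDICT line layout of characters, then pinyin in square brackets, then meanings between slashes; the function name parse_entry and the sample line are my own, for illustration only.

```python
import re

# Assumed line layout: "<characters> [<pinyin>] /<meaning 1>/<meaning 2>/"
ENTRY_RE = re.compile(r"^(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


def parse_entry(line):
    """Split one CEDICT-style line into (characters, pinyin, meanings)."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None  # comment line, blank line, or an unexpected layout
    characters, pinyin, meanings = match.groups()
    return characters, pinyin, meanings.split("/")


# When reading the real file, open it with the Big5 codec, e.g.
#   open("cedict.b5", encoding="big5")   # file name is a placeholder
print(parse_entry("中國 [Zhong1 guo2] /China/Middle Kingdom/"))
```

Lines that do not match the assumed layout come back as None, so comment lines can simply be skipped.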

Accompanying the download is the readme file, explaining the project and copyright issues.

I have sorted the entries in a different order from that of the download on EP's site, since I wanted to have a character index of entries instead. So, the version on this website has been modified in two important ways:

[1] the entries are listed in the order their characters appear in Unicode, which for Big5 characters follows the Kangxi radical order, except for one entry
[2] the decimal form of the Unicode code point of each entry's first character is given

Other than the inclusion of the decimal numbers, the structure of the entries in the file "cedict-kx.txt" is the same as the original format (Big5, Pinyin, English).
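
The re-sorting can be sketched in a few lines of Python. This is not the code used to build cedict-kx.txt, only an approximation of the idea: the function names and the output layout (decimal number, a space, then the untouched entry) are assumptions, and the sample lines are made up.

```python
def first_char_code(entry_line):
    """Decimal Unicode code point of the first character of a non-empty entry."""
    return ord(entry_line[0])


def sort_and_number(entry_lines):
    # sorted() is stable, so entries that share a first character
    # keep their original (pinyin) order relative to each other.
    ordered = sorted(entry_lines, key=first_char_code)
    # Assumed output layout: decimal number, a space, then the entry as it was.
    return [f"{first_char_code(line)} {line}" for line in ordered]


sample = [
    "中國 [Zhong1 guo2] /China/",
    "一 [yi1] /one/",
    "人 [ren2] /person/",
]
for line in sort_and_number(sample):
    print(line)
# 19968 一 [yi1] /one/
# 20013 中國 [Zhong1 guo2] /China/
# 20154 人 [ren2] /person/
```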

The data is sorted this way for an important reason. I wanted to have an HTML based tool where I could locate character entries at will, using the powerful search tool in a web browser. This led to the writing of a program to split the cedict-kx.txt file into many smaller files. The inspiration comes from Muller's online CJK dictionary. As the total number of files created is huge, a download of them all would also be huge. So a file close to the size of EP's download version, plus a small program to facilitate the creation of the HTML dictionary, seemed the best option. Here is the result.

The program in this download will convert the sorted version of CEDICT (cedict-kx.txt 1200 kb) into individual files according to the initial character.
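
For anyone without a Windows machine, the splitting step might be approximated along these lines. This is a sketch of the idea, not the logic of cedict-kx.exe: it assumes each page is named after the five-digit decimal code point of the initial character, writes UTF-8 rather than Big5, and uses a page layout I made up. Only the names "cindex.html" and the ".htm" extension come from the real output.

```python
import html
from collections import defaultdict
from pathlib import Path


def split_into_pages(entry_lines, out_dir):
    """Write one small HTML page per initial character, plus an index page."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the entries by the code point of their first character.
    groups = defaultdict(list)
    for line in entry_lines:
        line = line.strip()
        if line:
            groups[ord(line[0])].append(line)

    head = '<html><head><meta charset="utf-8"></head><body>\n'
    tail = "\n</body></html>\n"
    remaining = sum(len(lines) for lines in groups.values())
    index_links = []
    for code in sorted(groups):
        name = f"{code:05d}.htm"  # assumed naming: five-digit decimal code point
        body = "<br>\n".join(html.escape(entry) for entry in groups[code])
        (out_dir / name).write_text(head + body + tail, encoding="utf-8")
        index_links.append(f'<a href="{name}">{chr(code)}</a>')
        remaining -= len(groups[code])
        print(f"{remaining} entries left to write")  # countdown ends at 0

    index_page = head + "\n".join(index_links) + tail
    (out_dir / "cindex.html").write_text(index_page, encoding="utf-8")
    return len(groups)  # number of character pages written
```

Calling split_into_pages(lines, "cedict-html") with the entry lines of the sorted file would fill the directory "cedict-html" (a placeholder name) with one page per initial character and an index linking to them.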

You will need at least 8MB RAM, a 32 bit Windows environment, and some spare time on your hands while nearly 8300 files are created. As for how much space they take up on the disk drive, WinXP "Properties" gives two sizes for the same directory:

Size            9.97 MB ( 10,464,289 bytes)
Size on disk  131.0  MB (137,534,680 bytes)
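
The gap between the two figures comes from the file system allocating space in whole clusters, so thousands of tiny files each waste most of a cluster. A small Python sketch of the arithmetic follows; the 16 KiB cluster size and the uniform 1260 byte file size are assumptions picked to be in the same ballpark as the figures above, not measurements from my disk.

```python
def size_on_disk(file_sizes, cluster_size):
    """Total bytes allocated when every file occupies whole clusters."""
    total = 0
    for size in file_sizes:
        clusters = (size + cluster_size - 1) // cluster_size  # round up
        total += clusters * cluster_size
    return total


# Hypothetical figures: 8300 files of 1260 bytes each on 16 KiB clusters.
sizes = [1260] * 8300
print(sum(sizes))                       # 10458000  (the "Size")
print(size_on_disk(sizes, 16 * 1024))   # 135987200 (the "Size on disk")
```

On a drive with smaller clusters the same files would take up far less room.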

So, if you don't have at least 150 MB spare, don't run the program, but use the sorted cedict-kx.txt file instead. The program does not ask you for any input after you double click it, and it can take some time. To ease the long wait, I've made the program tell you how many entries are left to write (starting from 26404 entries, the number should end at 0).

You should close all other programs before you start. After downloading the zipped file "cedict-kx.zip" (484 kb), unzip it, find the file called cedict-kx.exe (66 kb) and double click on it to run the conversion.

The index file will be called "cindex.html", whilst all the other files are "[five digit numbers].htm". A copy of "cedict-readme.htm" is also zipped up, so the zip has three files in all.

cedict-readme.htm (11 kb)
cedict-kx.txt (1200 kb)
cedict-kx.exe (66 kb)

The zip file size is

cedict-kx.zip (488 kb)

Cheers,
Dyl.
