The Dialects of China

UPDATE (Thursday 10th May 2018): Unfortunately, Neocities does not allow zip files, so the links below to the various zip files (here and elsewhere) will not work.

The Dialects on Computer (DOC) project file can be found at http://www.lang.cityu.edu.hk/chinese/DOCMAS9.TXT and I have copied it to "docmas9[1].txt". The copy differs in only one respect: I've added a blank line right at the start. Otherwise, the file has not been changed.

Click here to find out more about the Dialects of China listing.

The file docmas9.txt does not display correctly as is. Rather than editing each pronunciation by hand so that the file displays correctly with the project's font docipa.ttf, I decided to convert the pronunciation section into IPA, to include tone contours as well, and to take advantage of the ability of web browsers to display Chinese and IPA characters together, in colour, in an HTML file. This requires some data files to facilitate the conversion.
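As a rough illustration of the idea of showing a Chinese character and its IPA readings in different colours, here is a minimal sketch in Python (the actual program was written in Fortran 77). The function name entry_to_html, the colours, the markup and the sample reading are all assumptions made for this example; they are not taken from the program or from the DOC file.

```python
import html

def entry_to_html(character, readings):
    """Render one character and its (dialect, ipa) readings as an HTML paragraph.

    Hypothetical helper: the layout and colours are illustrative only.
    """
    # The Chinese character in one colour.
    parts = ['<span style="color:navy">%s</span>' % html.escape(character)]
    # Each IPA reading in another colour, labelled with its dialect name.
    for dialect, ipa in readings:
        parts.append(
            '%s: <span style="color:maroon">%s</span>'
            % (html.escape(dialect), html.escape(ipa))
        )
    return "<p>" + " ".join(parts) + "</p>"

# Illustrative sample entry, not copied from the DOC data.
row = entry_to_html("中", [("Beijing", "tʂuŋ˥")])
print(row)
```

Escaping every piece of text with html.escape keeps stray "<" or "&" characters in the data from being read as markup by the browser.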

To convert the docmas9[1].txt file into HTML form, I've created "ch-doc-ipa1.txt", which lists the character codes needed to convert the readings into International Phonetic Alphabet characters in Unicode. It covers all the characters found in the font "docipa.ttf". Dialects.txt is a list of the dialects found in this version of the DOC project, and doc-tones.txt is a list of tone contours for each dialect in DOC.
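The kind of table lookup described above can be sketched in Python as follows. This is only a sketch under stated assumptions: the table layout (one decimal font code and one hexadecimal Unicode code point per line, separated by a tab), the two sample mappings and the function names are hypothetical, and the real layout of ch-doc-ipa1.txt may well be different.

```python
# Hypothetical table: "decimal font code <TAB> hex Unicode code point".
# These two mappings are invented for illustration and are not taken
# from ch-doc-ipa1.txt or docipa.ttf.
SAMPLE_TABLE = "97\t0251\n78\t014B\n"

def load_table(text):
    """Parse the assumed two-column table into a {font code: IPA character} dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        code, code_point = line.split("\t")
        table[int(code)] = chr(int(code_point, 16))
    return table

def to_ipa(reading, table):
    """Replace every character whose code is in the table; keep the rest as is."""
    return "".join(table.get(ord(ch), ch) for ch in reading)

table = load_table(SAMPLE_TABLE)
converted = to_ipa("kaN", table)
print(converted)
```

Characters with no entry in the table pass through unchanged, so plain letters that already match their IPA value need no mapping.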

To create the HTML file, all these files, including "doc2html.exe", need to be present in the same folder. I've zipped up the program and the four data files as doc.zip; the link for the download is at the bottom of this page. Once it is unzipped, click on doc2html.exe and the HTML file will be created. It is 9218 kilobytes in size. The Chinese characters still require Big5 fonts.

The size of the file is due in part to all the blank space in the coding, and also to the repetitive use of tags, Fortran 77's rather archaic character data manipulation features and my not so good programming... Anyway, I hope you like what I've done.

Since the original file docmas9.txt is freely available, and all I've done is a bit of conversion, I release the small program doc-htm3.exe as freeware; it should be used only on 32-bit Windows machines. The same goes for the data file ch-doc-ipa1.txt, as anyone could have made it. I do not anticipate any problems with the program; just remember not to change the two txt files. If you have changed them, delete them, unzip again and start over.

Dylan W.H. Sung 17th January 2004

http://www.dylanwhs.ukgateway.net/download/doc.zip (292 kb) Requirements: 32-bit Windows.