CJK (Chinese, Japanese, Korean) Wordfast Classic
The following discussion concerns Wordfast-generated data (such as translation memories and glossaries). It does not concern documents. Ms-Word documents always support Unicode, and do not lose encoding. If problems appear in a document, they are font (rendering) issues, or they come from foreign material copy-pasted into Ms-Word.
Unicode translation memories and glossaries should be used for translation where one of the two languages (source or target) is CJK. All versions of Wordfast after the year 2007 use only Unicode TMs and glossaries, so that should not be a worry.
Use path names and file names with Latin, non-accented (English) letters only for TMs and glossaries. Try to keep file names (including document names if possible) under 32 letters, using English non-accented letters, and without spaces. Wordfast may not support folder and file names with Unicode characters. If Wordfast malfunctions, this could be due to the Ms-Word Startup path containing Unicode characters. If this is the case, create a folder, for example C:\Startup, and copy wordfast.dot there. Start Ms-Word, then use the Tools/Options/Default folders dialog box to change Ms-Word's Startup folder to the one you just created. Close and restart Ms-Word.
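The naming advice above can be checked mechanically before a TM or glossary is created. The following Python sketch is not part of Wordfast. The function name path_problems and the exact set of allowed characters (ASCII letters, digits, dot, underscore, hyphen) are assumptions made for illustration, and the 32-character limit is applied to the whole file name, extension included.

```python
import re
from pathlib import PureWindowsPath

# Assumption: "safe" means ASCII letters, digits, dot, underscore and hyphen only.
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")
MAX_NAME_LENGTH = 32  # the text recommends file names under 32 characters


def path_problems(path: str) -> list[str]:
    """Return a list of problems found in a Windows path; empty means it looks safe."""
    parsed = PureWindowsPath(path)
    problems = []
    for part in parsed.parts:
        if part == parsed.anchor:
            continue  # skip the drive root (for example the C: drive)
        if not SAFE_NAME.fullmatch(part):
            problems.append(f"unsafe characters in {part!r}")
    if len(parsed.name) >= MAX_NAME_LENGTH:
        problems.append(f"file name {parsed.name!r} is not under {MAX_NAME_LENGTH} characters")
    return problems


if __name__ == "__main__":
    for candidate in (r"C:\Startup\wordfast.dot", r"C:\翻訳\my tm.txt"):
        print(candidate, path_problems(candidate))
```

An empty list means the folder names and the file name all follow the recommendation. Anything else names the offending part of the path.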
If given the choice of Unicode flavour when you save a TM or glossary, select the simple "Unicode" (this can be just Unicode, or UTF-16) setting, not a language-specific encoding.
If you use Ms-Word XP (Ms-Word 2002), note that a notorious Ms-Word 2002 glitch prevents it from saving documents as Unicode (unless you specifically added that feature at installation time). In this case, export the TM to Unicode. To do so, start the TM/Glossary editor, click "Tools", and run the "Rewrite as Unicode" special filter. Another workaround is to open an existing Unicode document, delete all its contents, paste your data into it, save it, then rename it directly on disk.
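As a further alternative outside Ms-Word and Wordfast, a text file whose current encoding is known can be rewritten as UTF-16 with a few lines of Python. This is only a sketch: the function name rewrite_as_utf16 is hypothetical, the source encoding must be supplied by the user, and whether Wordfast accepts the resulting file should be verified on a copy before relying on it.

```python
def rewrite_as_utf16(src: str, dst: str, src_encoding: str) -> None:
    """Rewrite a text file as UTF-16 (with a byte order mark), given its known encoding."""
    # errors="strict" makes a wrong src_encoding fail loudly instead of
    # silently replacing characters with question marks.
    with open(src, "r", encoding=src_encoding, errors="strict", newline="") as f:
        text = f.read()
    # newline="" on both sides keeps the original line endings untouched.
    with open(dst, "w", encoding="utf-16", newline="") as f:
        f.write(text)


if __name__ == "__main__":
    # Hypothetical file names and source encoding, for illustration only.
    rewrite_as_utf16("my_tm_legacy.txt", "my_tm_unicode.txt", "shift_jis")
```

Writing to a new file rather than overwriting the original keeps the source intact if the assumed encoding turns out to be wrong.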
In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. This mention appears if the source language code begins with either ZH-, JA-, or KO-. This mention is essential for Wordfast to switch to a mode compatible with Chinese, Japanese, or Korean. Notes:
- For Japanese and Chinese, make sure the full stop is visible in the Wordfast/Setup/General "End-of-segment punctuation" setting. It should be automatically added there when you create a translation memory with JA, KO, or ZH in the source language (for example, JA-JP, JA-01, ZH-CN, etc.). If you do not see the Japanese or Chinese full stop, select your language's full stop in a document. Copy it (Ctrl+C). Open Wordfast. In the Wordfast/Setup/General "End-of-segment punctuation" setting, press Enter to edit the value, then paste your full stop before the existing punctuation marks there (I advise against deleting the existing Latin punctuation).
- For Japanese and Chinese, check at least the "An ESP without a trailing space ends a segment" rule in Wordfast/Setup/Seg, so that end-of-sentence punctuation marks that are not followed by a space may still be recognised as ending a sentence. This too is normally done automatically by Wordfast when the TM is CJK.
- To have all target segments receive a specific font (a font that can display CJK characters), use the Wordfast/Setup/General "Target font" setting to specify the target font. But this is not necessary if your platform automatically adapts fonts to languages.
- To have both Concordance search and glossaries displayed using a specific font, go to Wordfast/Setup/Pandora's box. Add the parameter TermFont="MyFont" with the required font instead of MyFont.
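The language-code test and the "ESP without a trailing space" rule described above can be illustrated with a short Python sketch. It does not reproduce Wordfast's actual implementation. The function names are hypothetical, the case-insensitive comparison is an assumption, and the punctuation string used in the example is illustrative rather than Wordfast's default list.

```python
CJK_PREFIXES = ("ZH-", "JA-", "KO-")


def is_cjk_source(lang_code: str) -> bool:
    """True if a source language code such as JA-JP or ZH-CN starts with a CJK prefix."""
    return lang_code.upper().startswith(CJK_PREFIXES)


def split_segments(text: str, esp: str, allow_no_space: bool = True) -> list[str]:
    """Split text after end-of-segment punctuation (ESP) characters.

    With allow_no_space=True, an ESP ends a segment even when the next
    character is not a space, which is what CJK text needs.
    """
    segments = []
    start = 0
    for i, ch in enumerate(text):
        if ch not in esp:
            continue
        at_end = i + 1 == len(text)
        followed_by_space = not at_end and text[i + 1].isspace()
        if at_end or followed_by_space or allow_no_space:
            segments.append(text[start:i + 1].strip())
            start = i + 1
    tail = text[start:].strip()
    if tail:
        segments.append(tail)  # trailing text without a closing ESP
    return segments


if __name__ == "__main__":
    sample = "これはペンです。あれは本です。"
    print(is_cjk_source("JA-JP"))
    print(split_segments(sample, esp="。.!?"))
    print(split_segments(sample, esp="。.!?", allow_no_space=False))
```

With allow_no_space set to False, the sample stays in one piece because no space follows the first full stop. This is the behaviour the Setup/Seg rule is meant to avoid for Japanese and Chinese.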
If you open a glossary or a translation memory with Ms-Word and cannot read the text: select all text, then apply a font that can display your language (a specific font, or a generic Unicode font). If you still cannot see the text properly displayed, and all you see are question marks (????), then perhaps, at some stage, the file was saved (rewritten) using a text-only format rather than Unicode. There is no way back. Make sure Unicode files remain Unicode at all times. This concerns the Text format used for translation memories and glossaries, not Ms-Word documents. Unicode is not relevant to the DOC file format.
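One quick way to check whether a TM or glossary file is still stored as UTF-16 is to look for a byte order mark at the start of the file. The Python sketch below is not a Wordfast feature, and the function name has_utf16_bom is hypothetical. It also has limits: a UTF-16 file saved without a byte order mark would be reported as False, and a file that passes the check can still contain literal question marks from earlier damage.

```python
import codecs


def has_utf16_bom(path: str) -> bool:
    """True if the file starts with a UTF-16 byte order mark (either byte order)."""
    with open(path, "rb") as f:
        head = f.read(2)
    return head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


if __name__ == "__main__":
    import sys

    # Usage: python check_bom.py my_tm.txt my_glossary.txt
    for name in sys.argv[1:]:
        status = "UTF-16 BOM found" if has_utf16_bom(name) else "no UTF-16 BOM"
        print(f"{name}: {status}")
```

Running the check before and after editing a file in another program shows whether that program rewrote it in a non-Unicode format.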
If an Ms-Word document does not display your language properly, it's a font problem. Target segments must receive the proper font; see above for automatically applying a certain font to target segments.