The TM/Glossary editor Wordfast Classic

From Wordfast Wiki

Click the "TM/Glossary editor" icon in Wordfast's main toolbar, or the last icon in any of the glossary toolbars, to start the TM/Glossary editor. Outside a translation session, glossary toolbars can be opened using the Ctrl+Alt+Right shortcut (and closed using the Ctrl+Alt+Left shortcut). During a translation session, the Ctrl+Alt+G shortcut pressed on a word or selection will open the glossary toolbar(s) of the glossary(ies) where the term was found. Glossary toolbars open only on glossaries that were specified in Wordfast/Terminology.

Wordfast's TM/Glossary editor is intended to make maintenance easy and intuitive, and offers practically identical methods for TMs and glossaries. Once the editor is open, you can scroll up and down through the data and edit, delete, or add entries.

Shortcut Effect
Space bar mark/unmark entries
Ctrl+A mark/unmark all entries
Shift+Ctrl+A reverse the current marking
Ctrl+X cut all marked entries
Ctrl+Y undo the previous cut operation
Ctrl+C copy all marked entries to Wordfast's own clipboard
Ctrl+V paste Wordfast's clipboard's contents to the end of the file
Ctrl+D (or right-click) toggle the display of all, or only marked, TUs
Ctrl+O open another file (another glossary or another TM)

Note: cutting (deleting) a single line (or entry, or TU) is a soft operation, meaning it can be reversed or undone (press Delete twice on an entry to see the toggling effect). When an entry is cut (or soft-deleted), it appears as a blank line, but when it is selected, the source and target data appear in the editor's bottom blue/green display. Ctrl+Delete will permanently erase cut entries by "packing", i.e. rewriting, the entire TM or glossary.
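
The difference between a soft cut and packing can be pictured with a small Python sketch. It is only an illustration: the Entry class and its cut flag are hypothetical stand-ins, not WFC's actual file format or internals.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    source: str
    target: str
    cut: bool = False  # hypothetical flag standing in for the soft-delete state


def toggle_cut(entry: Entry) -> None:
    """Soft-delete or restore one entry; nothing is removed from the list."""
    entry.cut = not entry.cut


def pack(entries: list[Entry]) -> list[Entry]:
    """Rewrite the whole list without the cut entries (the irreversible step)."""
    return [e for e in entries if not e.cut]


tm = [Entry("Hello", "Bonjour"), Entry("Goodbye", "Au revoir")]
toggle_cut(tm[0])  # soft-deleted, still recoverable
toggle_cut(tm[0])  # restored by toggling again
toggle_cut(tm[1])  # soft-deleted
tm = pack(tm)      # the cut entry is dropped when the list is rewritten
```

Until pack runs, every cut is just a flag that can be flipped back, which is why only the packing step is permanent.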

The editor's Filter or Sort dialog box (press F7 or click the column header area) gives access to three types of operation on data: Filter, Sort and Special filters.

Filter

Filtering means you define a condition with a Field Condition Argument format.

For example:

SourceText & "MyText"

where & means "contains", or

Counter = 0

See more examples in the Filter or Sort dialog box's Help.

When Argument is made of text, it must be enclosed in straight quotes like this: "MyText".

The effect of a filter is that only the entries that conform to the filter's condition(s) will be made visible in the glossary editor. When a filter has been set, using the Mark methods (mark, unmark, copy, paste, cut) will operate only on visible entries.

In the Data Editor, use the F8 shortcut to cancel a filter.
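
The Field Condition Argument idea can be sketched in Python as a lookup from a condition symbol to a test. This is a minimal sketch, not WFC's filter engine: the dictionary keys used as field names and the CONDITIONS table are assumptions made for illustration, covering only the symbols shown on this page.

```python
import operator

# Hypothetical mapping from condition symbols to Python checks.
# "&" means "contains", as in the text; "=" and "<" are plain comparisons.
CONDITIONS = {
    "&": lambda value, argument: argument in value,
    "=": operator.eq,
    "<": operator.lt,
}


def matches(entry: dict, field: str, condition: str, argument) -> bool:
    """Evaluate one Field Condition Argument triple against one entry."""
    return CONDITIONS[condition](entry[field], argument)


def visible(entries: list, field: str, condition: str, argument) -> list:
    """Return only the entries that satisfy the filter."""
    return [e for e in entries if matches(e, field, condition, argument)]


entries = [
    {"SourceText": "Open MyText here", "Counter": 0},
    {"SourceText": "Something else", "Counter": 3},
]
contains_hits = visible(entries, "SourceText", "&", "MyText")
unused_hits = visible(entries, "Counter", "=", 0)
```

Entries that fail the test are not removed, they are simply left out of the visible list, which mirrors how cancelling the filter brings them back.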


Sort

Sort files only when necessary. Sorting can take some time, because the entire file is actually (physically) sorted, not just the display of the file. WFC adds the convenience of being able to sort source or target text on the number of words or characters in segments. This can be useful for terminology extraction.
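
Sorting on the number of words can be pictured as sorting with a word-count key. The sketch below assumes that words are separated by whitespace, which may differ from WFC's own count, and uses hypothetical dictionary entries.

```python
def source_words(entry: dict) -> int:
    # Whitespace splitting is an assumption; WFC's own word count may differ.
    return len(entry["source"].split())


entries = [
    {"source": "Press the red button to stop the engine",
     "target": "Appuyez sur le bouton rouge pour arrêter le moteur"},
    {"source": "engine", "target": "moteur"},
    {"source": "red button", "target": "bouton rouge"},
]

# The list itself is reordered in place, much as the file is physically sorted.
entries.sort(key=source_words)
shortest_first = [e["source"] for e in entries]
```

With the shortest segments grouped at the top, one- and two-word entries that are likely terminology candidates are easy to review together.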

Replace

This is a standard Find-replace operation. The operation is done on one specific field at a time.

With this feature, you can replace a word like "Smith" with "Jones" in the TM's target segments. Check that find-replace operations do not produce unwanted results.

Example: Changing language codes in your TM

Suppose your source language code is EN and you want to change it to EN-US: select "SourceLanguageCode" in the list of fields, enter EN as the search term and EN-US as the replacement, then click OK.
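
The kind of unwanted result to watch for can be shown with a short Python sketch. How WFC matches the search text is not stated here, so both behaviours below are assumptions: a plain substring replacement and a stricter whole-field comparison. The replace_in_field helper and the sample entries are hypothetical.

```python
def replace_in_field(entries, field, search, replacement, whole_field=False):
    """Replace text in one field of every entry; return how many entries changed."""
    changed = 0
    for entry in entries:
        old = entry[field]
        if whole_field:
            # Only touch fields whose entire content equals the search text.
            new = replacement if old == search else old
        else:
            # Substring replacement: also hits longer codes that contain "EN".
            new = old.replace(search, replacement)
        if new != old:
            entry[field] = new
            changed += 1
    return changed


def sample():
    return [
        {"SourceLanguageCode": "EN", "target": "Smith"},
        {"SourceLanguageCode": "EN-GB", "target": "Smith"},
    ]


naive = sample()
replace_in_field(naive, "SourceLanguageCode", "EN", "EN-US")

careful = sample()
replace_in_field(careful, "SourceLanguageCode", "EN", "EN-US", whole_field=True)
```

In the substring variant the second entry ends up with the code EN-US-GB, which is the sort of side effect worth checking for before and after a replace operation.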


TM Tools (Special Filters)

TM Tools contains special filters that are meant to perform operations that would be difficult or impossible to perform with just filtering and sorting. These operations are:

Mark redundant entries

This feature marks entries that are considered duplicates (the definition of a redundant entry varies depending on whether you use a TM or a glossary). Once the marking is done, you can review the marked entries, then delete them all by using the Cut shortcut (Ctrl+X) followed by a hard-delete command (Ctrl+Delete). With a TM, such entries are grouped together if the TM is sorted on the source segment.
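
As a rough picture of what marking duplicates involves, here is a Python sketch. The definition used, identical source and identical target with the first occurrence kept, is an assumption chosen for illustration; WFC's actual criteria are not described on this page.

```python
def mark_redundant(entries: list) -> int:
    """Mark every entry whose (source, target) pair was already seen.

    The first occurrence stays unmarked. Returns the number of marked entries.
    """
    seen = set()
    for entry in entries:
        key = (entry["source"], entry["target"])
        entry["marked"] = key in seen
        seen.add(key)
    return sum(entry["marked"] for entry in entries)


tm = [
    {"source": "Save", "target": "Enregistrer"},
    {"source": "Open", "target": "Ouvrir"},
    {"source": "Save", "target": "Enregistrer"},   # exact repeat: marked
    {"source": "Save", "target": "Sauvegarder"},   # different target: kept
]
marked_count = mark_redundant(tm)
```

Marking rather than deleting leaves room for the review step before anything is cut.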

Reverse source and target

This will rewrite the current file and reverse source and target fields.

Export to Unicode

Exports the current file to a Unicode format.

Export to TMX (TM only)

Exports the current file to the TMX format. The TM is not overwritten - a new file is created, and it has a .tmx extension.
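
For readers unfamiliar with TMX, the sketch below builds a very small TMX 1.4 document with Python's standard library. It only illustrates the general shape of the format (a header, then one tu per translation unit with one tuv per language); it is not WFC's exporter, and the header values, the to_tmx function and the language codes are placeholders.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def to_tmx(entries, src_lang, tgt_lang):
    """Return a TMX string holding one <tu> per entry (placeholder header values)."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "sketch", "creationtoolversion": "0",
        "segtype": "sentence", "o-tmf": "unknown",
        "adminlang": "EN-US", "srclang": src_lang, "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for entry in entries:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, entry["source"]), (tgt_lang, entry["target"])):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


tmx_text = to_tmx([{"source": "Hello", "target": "Bonjour"}], "EN-US", "FR-FR")
```

Because the result is a separate string (or file), the original data is left as it was, in the same way that the export creates a new .tmx file next to the TM.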

Remove tags

This special filter removes tags from a TM. This is recommended after finishing a project with tagged files. The leverage of TUs with tags is precious within the scope of a particular project, but tagged TUs are very rarely leveraged outside that project. This is why it is recommended to remove tags from a TM that will be used on different translation projects. Tags also bloat TMs considerably.

Export as segment document

This filter will create a segmented document in MS Word that contains all Translation Units (TUs) in the TM.

Repair and compact TM

This maintenance operation rewrites the entire TM, removing lines marked for deletion as well as empty lines, and re-creates the index. This filter can be run if the TM does not perform well, or before storing or archiving a TM.

Mark suspicious TUs

This powerful feature can clean up a TM by marking TUs that look suspicious for various reasons. After the filter has been executed, TUs that look suspicious are simply marked (checked, or selected). You can review and delete them as needed.


Rewrite Entries with a Mask

This powerful feature is used to replace a particular field, or many fields, with some given value, or erase the content of the fields, in all visible entries. Visible entries are those that are displayed in the editor. If a filter is set, only some entries are visible.

You are first presented with an empty entry (a mask). You can:

  • enter an equal sign (=) followed by some text in any field, in which case the text after the equal sign will replace whatever is found in the corresponding fields in all visible entries in the file (TM or glossary);
  • enter "=null" in a field to erase the content of that field.

All fields that are left blank (or which do not begin with = followed by at least one character) in the mask will remain untouched in the file.

For example, a mask with "=FOO" in the User field and "=null" in Attribute fields 1, 2, 3 and 4 would replace all User fields with "FOO", and erase Attribute fields 1, 2, 3, 4 in the entire TM.
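
The mask rules described above can be sketched in Python as follows. The field names (User, Source, Attr1, Attr2) and the apply_mask function are hypothetical, and only two attribute fields are shown to keep the example short.

```python
def apply_mask(visible_entries: list, mask: dict) -> None:
    """Apply a mask to the visible entries, following the rules in the text."""
    for entry in visible_entries:
        for field, instruction in mask.items():
            # Blank fields, or fields without "=" plus at least one character,
            # leave the entry untouched.
            if len(instruction) < 2 or not instruction.startswith("="):
                continue
            value = instruction[1:]
            # "=null" erases the field; anything else replaces its content.
            entry[field] = "" if value == "null" else value


tm = [
    {"User": "ANNA", "Source": "Hello", "Attr1": "legal", "Attr2": "v1"},
    {"User": "BEN", "Source": "Goodbye", "Attr1": "medical", "Attr2": ""},
]
mask = {"User": "=FOO", "Source": "", "Attr1": "=null", "Attr2": "=null"}
apply_mask(tm, mask)
```

Passing only the filtered subset of entries to apply_mask would correspond to running the mask while a filter is set.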

Practical example: "I have that older, bulky TM that combines TUs from various translators. I want these entries grouped by user (translator) name. I want to delete all entries that have a usage counter of less than 2, and that are older than August 31, 2004. Then I want to review them one by one and perhaps have some entries not marked for deletion if I think they're useful after all. Only then will I erase all marked entries that remain".

  • Start the TM/Glossary editor, click the Tools button.
  • Sort on "User".
  • Set the following filter: Counter < 2 AND Date < 20040831 .
  • Press Ctrl+D to view only marked entries.
  • Review marked entries, un-mark the ones you wish to keep.
  • Press Ctrl+X to cut all marked entries.
  • Press Ctrl+Delete to permanently erase all marked entries.
  • Sort on Date to revert to a "natural" order in the TM.

Note that all operations except the hard delete (Ctrl+Delete, the seventh step) can be undone.
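
The logic of the practical example above can be followed in a short Python sketch. It assumes that dates are stored as YYYYMMDD text (so that text comparison matches date order), that the filtered entries are marked before review, and it uses hypothetical field names; none of this is WFC's real data layout.

```python
entries = [
    {"user": "B", "counter": 0, "date": "20030115", "source": "old, unused"},
    {"user": "A", "counter": 5, "date": "20030220", "source": "old, reused"},
    {"user": "A", "counter": 1, "date": "20050301", "source": "recent"},
    {"user": "A", "counter": 1, "date": "20040101", "source": "old, worth keeping"},
]

# Sort on user so that each translator's entries are grouped.
entries.sort(key=lambda e: e["user"])

# Filter: Counter < 2 AND Date < 20040831.
candidates = [e for e in entries if e["counter"] < 2 and e["date"] < "20040831"]

# Mark the filtered entries, then un-mark the ones worth keeping after review.
marked = [e for e in candidates if e["source"] != "old, worth keeping"]

# Cut plus hard delete: only entries that are still marked disappear.
remaining = [e for e in entries if e not in marked]
remaining_sources = [e["source"] for e in remaining]
```

Everything up to the last step only builds lists of candidates; the data is reduced only when remaining is computed, which parallels the point that the hard delete is the one step that cannot be undone.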

TMs and glossaries must be created for one language pair only. I also advise keeping separate TMs for different subjects (domains) and clients, and having them in dedicated folders so that keeping track of them, and especially backing them up, remains easy.

TMs keep growing all the time. Most TUs are very unlikely to be re-used, while a minority of them will be. Since WFC keeps track of how many times a TU is re-used in the usage counter field, it is advised, when a TM reaches a large size (over 100,000 TUs), or when finishing a large translation project, to perform a compression by eliminating all TUs that have never been re-used. As a result, the TM's size will be considerably reduced, while its overall efficiency will be preserved. To do so:

  • Start the TM/glossary editor on the required TM.
  • Press F7, and set the following filter: Counter = 0 . Click OK.
  • Mark all (Ctrl+A). Cut marked (Ctrl+X). Hard-delete (Ctrl+Delete).

Creating a startup TM. Create one single, large TM by combining all the TMs you have. Delete all TUs that have a usage counter of less than 3. To compress further, you can visually review the TM and delete TUs that are unlikely to pop up again. To do so, sort the TM on "SourceWords", go to the end of it and review the longest TUs, which are the likely "ghost" candidates: longish TUs that are unlikely to show up again. Delete them. This TM can then be used as a primer: if you need to create a new, empty TM, use a copy of that TM instead, because it contains a "Top 50" or perhaps a "Top 1000" of your previous work. It's like priming a pump with a cup of water.

A Wordfast TM may contain TUs where the first figure of the date (normally "2", but it can be "1" for TMs created in the previous millennium) is replaced with "x", and which, as a consequence, appear to be "cut" in the editor. This is because, in the course of a translation session, the TU was proposed as a 100% match on a green background, but the target segment was edited, so WFC deleted the original version of the TU in the TM and re-wrote the TU's edited version at the end of the TM. This is normal. Do not "resurrect" or un-delete such TUs: their correct version appears further down in the TM. During translation sessions, WFC is blind to TUs that are marked "x". As a rule of thumb, perform a "Reorganisation" of the TM before working on it. This is done with the WFC > Translation memory > TM "Reorganise" button, and it erases all TUs that were marked as "Deleted" with an "x" mark in the course of previous translation sessions.
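
The "x" convention can be pictured with a few lines of Python. The sketch assumes an eight-digit date string and hypothetical dictionary entries; only the rule itself (a leading "x" means deleted, and reorganising drops such TUs) comes from the paragraph above.

```python
def is_deleted(date: str) -> bool:
    """A TU counts as deleted when the first figure of its date is 'x'."""
    return date.startswith("x")


def reorganise(entries: list) -> list:
    """Drop every TU marked as deleted, keeping the others in order."""
    return [e for e in entries if not is_deleted(e["date"])]


tm = [
    # Original version, superseded during a session: date now starts with "x".
    {"date": "x0040831", "source": "Save file", "target": "Sauver fichier"},
    # Edited version, re-written further down in the TM.
    {"date": "20040902", "source": "Save file", "target": "Enregistrer le fichier"},
]
live = reorganise(tm)
```

After reorganising, only the edited version remains, which is why there is no need to un-delete the superseded one.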

Sharing TMs with other WFC users, or with other CAT tools.

Sharing TMs with other CAT tools: open the TM with the TM/Glossary editor, click Tools, and apply the "Export TM as TMX" special filter. The TM is exported to a new file with a .tmx extension; the original TM is not overwritten.
