

Machine translation whatever the language

Sunda Systems has developed a new-generation machine translation technology that can be used to develop efficient machine translators for any pair of languages you care to name. The Sunda MT Workbench provides a set of tools for building high-quality machine translators for even small languages that have been bypassed so far by major vendors.
Sunda Systems Oy

Sunda Systems applies a number of key technologies to make its Sunda MT Workbench suitable for a wide variety of languages and usable for any pair of languages. The Workbench contains all the tools needed for building a machine translator from scratch.

Good-quality machine translation suitable for a wide variety of languages needs to be based on a sound linguistic and IT theoretical foundation, according to Sunda Systems.

Dependency theory, for example, is employed to handle sentence structure, as structures that can seem quite different on the surface in different languages can actually be quite close when projected on this abstract level.
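The idea can be illustrated with a minimal Python sketch. It is not Sunda's implementation: the token format, the relation names, and the capitalised concept labels are all assumptions made for illustration. Two sentences with different surface word orders are reduced to sets of head-dependent relations, and those sets turn out to be identical.

```python
def dependencies(tokens):
    """Return the set of (head, relation, dependent) triples for a sentence.

    Each token is (position, concept, head_position, relation);
    head_position 0 marks the root. This format is hypothetical.
    """
    concept_at = {pos: concept for pos, concept, _, _ in tokens}
    return {
        (concept_at.get(head, "ROOT"), relation, concept)
        for _, concept, head, relation in tokens
    }

# Subject-verb-object surface order: "MARY READ BOOK"
svo = [
    (1, "MARY", 2, "subject"),
    (2, "READ", 0, "root"),
    (3, "BOOK", 2, "object"),
]

# Subject-object-verb surface order: "MARY BOOK READ"
sov = [
    (1, "MARY", 3, "subject"),
    (2, "BOOK", 3, "object"),
    (3, "READ", 0, "root"),
]

# The word order differs, but the abstract structure is the same.
same_structure = dependencies(svo) == dependencies(sov)
```

Because the triples ignore token positions, any reordering that keeps the same heads and relations yields the same set.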

The company has also pioneered the principle of parallel translation. One of the most important benefits of this approach is efficiency, as an ordinary processor can translate thousands of sentences per second. The principle also keeps linguistic and computational issues strictly separate, which lets linguists concentrate on linguistic issues and see the effects of a linguistic change on the system in only a few seconds.

The Sunda MT Workbench also includes tools for quality control and teamwork.

A high-quality English-Finnish translator built using the Sunda MT Workbench is already in wide use and has yielded good results.

Efficient code use

To minimise the reworking needed for different languages, the Sunda approach is based on the principle of late commitment: the processing of source language sentences is tied to a specific target language only when this becomes unavoidable, not before.

This means that a major portion of the source language processing developed for one pair of languages can be reused in a translator for another target language.
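A toy Python sketch of late commitment follows. Everything in it is hypothetical (the `Analysis` type, the `analyse` and `generate` functions, and the two tiny word lists); it only shows the shape of the idea, in which the source sentence is analysed once without reference to any target language and the target is chosen at the last step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analysis:
    """Target-independent result of processing a source sentence."""
    tokens: tuple


def analyse(sentence: str) -> Analysis:
    # Shared source-side step: no target language is involved here.
    return Analysis(tokens=tuple(sentence.lower().split()))


# Hypothetical toy lexicons, one per target language.
LEXICONS = {
    "fi": {"good": "hyvä", "morning": "huomenta"},
    "sv": {"good": "god", "morning": "morgon"},
}


def generate(analysis: Analysis, target: str) -> str:
    # The only step that commits to a target language.
    lexicon = LEXICONS[target]
    return " ".join(lexicon.get(token, token) for token in analysis.tokens)


shared = analyse("Good morning")      # done once
finnish = generate(shared, "fi")      # reused for Finnish
swedish = generate(shared, "sv")      # reused for Swedish
```

Adding a third target language in this sketch means adding a lexicon entry; `analyse` is left untouched, which is the reuse the principle aims for.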

Smoothing the path for developers

In addition to good translation quality, Sunda has also prioritised the need for its technology and the applications built around it to be efficient, user-friendly, adaptable, and robust.

Sunda's core translation engine can be embedded easily in external systems using, for example, standard programming interfaces and Internet protocols, and runs seamlessly under most commonly used operating systems.

The engine also has built-in support for common file formats, such as RTF and HTML, and documents written in these formats retain their formatting in translation.
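One simple way to keep markup intact during translation is to pass only the text between tags to the translator. The Python sketch below assumes a hypothetical `translate_html` helper and uses `str.upper` as a stand-in for a real translator; it says nothing about how the Sunda engine itself handles formats, and a regular expression like this one does not cope with every HTML construct.

```python
import re

# Matches a single tag such as <p> or </b>; the capture group makes
# re.split keep the tags in the result.
TAG = re.compile(r"(<[^>]+>)")


def translate_html(html, translate):
    """Apply translate() to text segments and leave tags unchanged."""
    parts = TAG.split(html)
    return "".join(
        part if TAG.fullmatch(part) or not part.strip() else translate(part)
        for part in parts
    )


# str.upper stands in for a translation function.
result = translate_html("<p>Hello <b>world</b></p>", str.upper)
```

A production system would more likely use a proper HTML parser, but the separation of markup from translatable text is the same.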

Language-independent, end-user applications are already available to translate home pages in a web browser, translate formatted documents, and translate general text content in desktop applications. And more are on the way.

Harri Arnola
(Published in High Technology Finland 2007)