The Korean language is spoken by over 75 million people worldwide, and with over 40 million Internet users, South Korea is one of the most wired countries in the world. Yet although Korean is the tenth most popular language on the web, the Korean software localization industry remains underdeveloped. In a recent article in Multilingual magazine, Hyelim Chang, a senior language specialist at Google, describes some of the special challenges of localizing web content in Korean.

Isolated particles

One feature of Korean that frustrates localization is a distinctive grammatical structure: the “postposition,” or particle. There are around 20 particles, each with its own rules for use. Particles are attached to the end of words to indicate their role in a sentence; for instance, a particle marks a word as the subject or the object. Many particles also come in two forms, one for words that end in a consonant and one for words that end in a vowel. This makes it hard to program placeholders, the blanks that are filled in at run time with information supplied by a user or drawn from a database, because the correct particle depends on whatever word ends up in the blank.

For example, a personalized string such as “XXX likes this photo” is hard to implement because one cannot know in advance whether every name that could take the place of XXX ends in a vowel or a consonant. Google solved this particular problem by automatically adding the honorific “-nim” after every name. Because “-nim” ends in a consonant, the consonant form of the particle could be used uniformly, and the honorific made Google look extra polite as well.
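
The particle rule and the “-nim” workaround described above can be sketched in a few lines of Python. This is only an illustration, not Google's implementation, which the article does not describe. It relies on the way Unicode encodes precomposed Hangul syllables (each syllable's code point records whether it has a final consonant), and it assumes the subject particle pair 이/가 for the example. The function names are hypothetical.

```python
HANGUL_BASE = 0xAC00   # first precomposed Hangul syllable, "가"
HANGUL_LAST = 0xD7A3   # last precomposed Hangul syllable, "힣"
FINALS_PER_BLOCK = 28  # 27 possible final consonants plus "no final"


def ends_in_consonant(word: str) -> bool:
    """Return True if the last character is a Hangul syllable with a final consonant."""
    if not word:
        return False
    code = ord(word[-1])
    if not (HANGUL_BASE <= code <= HANGUL_LAST):
        # Not a precomposed Hangul syllable (e.g. a Latin-script name).
        # Real software would need extra rules here; this sketch gives up.
        return False
    return (code - HANGUL_BASE) % FINALS_PER_BLOCK != 0


def with_subject_particle(name: str) -> str:
    """Attach the subject particle: 이 after a consonant, 가 after a vowel."""
    return name + ("이" if ends_in_consonant(name) else "가")


def with_honorific(name: str) -> str:
    """Append 님 first; it ends in a consonant, so the particle is always 이."""
    return with_subject_particle(name + "님")
```

With these definitions, `with_subject_particle("민준")` gives `민준이` and `with_subject_particle("지수")` gives `지수가`, while `with_honorific` produces `민준님이` and `지수님이`: the same particle for both names, which is why the honorific lets a single template string work.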

Levels of formality

A second complicating factor in Korean localization is that the Korean language has seven different levels of formality, depending on who is speaking and who is listening. We are used to the idea of different registers of speech. In English we might use a formal register to talk to a judge and an intimate register to talk to our spouse, but the difference lies in the vocabulary we use rather than in the grammar. In Korean, each of the seven levels of formality requires different verb endings. A linguist must be careful to choose an appropriate middle level when translating a user command.

Non-linear writing

Hangeul, the writing system in which Korean is written, uses “syllabic blocks”: what looks like a single character to an English speaker is actually several letters grouped together into a block. A single word might be made of more than one syllabic block. For example, the word hangeul (한글) looks like two characters, but it is actually six letters grouped into two syllabic blocks. Text wrap becomes problematic under these conditions, because a word made of multiple syllabic blocks might be improperly chopped up if a line break falls in the wrong place. One possible solution, manually adding line breaks, won’t always work because different devices have different screen sizes.
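
Both points can be checked with Python's standard library. The sketch below decomposes syllabic blocks into their letters using Unicode normalization, then wraps text only at spaces so that no word is split between blocks. The function names are hypothetical, and the width is counted in characters rather than pixels or display columns, so this is a simplification of what a real layout engine does.

```python
import textwrap
import unicodedata


def letters(text: str) -> list[str]:
    """Split each syllabic block into its individual letters (jamo) via NFD."""
    return list(unicodedata.normalize("NFD", text))


def wrap_by_word(text: str, width: int) -> list[str]:
    """Wrap at whitespace only, never between the blocks of a single word."""
    # break_long_words=False keeps a word whole even if it exceeds the width.
    return textwrap.wrap(text, width=width, break_long_words=False)
```

Here `len(letters("한글"))` is 6, matching the six letters in two blocks described above, and `wrap_by_word("한글 문자 체계", 3)` returns one word per line instead of splitting any word. On web pages, the CSS declaration `word-break: keep-all` asks the browser for similar behavior, though how well it fits a given layout is something to verify per project.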

As Chang’s article demonstrates, the unique linguistic characteristics of a language can also pose unique technical problems when it comes to software localization. Localization is not simply a matter of translating content and reimporting it. Good localization takes more time, money, and thought than simple translation. We recommend working with language partners who are professionals and who are familiar with the issues involved in getting your software and online apps to function properly for your audience.

