Sunday, September 28, 2008


As you probably already know, Google Translate has added 11 more languages, including Slovak, to its already impressive portfolio. While testing the English-Slovak service, I was pleasantly surprised at the MT engine's ability to handle syntax like noun phrases containing adjectives, although I noted a number of problems with translating English idiomatic structures, such as those involving the verbs "give" and "take" or multiword expressions. Overall, about 60% of the work-related documents I put in came out as comprehensible and usable texts, so color me impressed. More testing will be required to see if the Slovak translators who work for me should start to worry about their jobs (and trust me, I do have a shit list), but I'm pleased to inform you that we already have a candidate for the mistranslation of the year. Consider the headline of this report on the first US presidential debate and then have a look at the translation, especially the items in red:

English: Analysis: A few jabs, but no knockout in first debate
Slovak: Analýza: Za pár popíchnutí, ale žiadna kokot nedved v prvom diskusie

OK, my praise of syntax handling now sounds premature, since in "v prvom diskusie", neither the noun nor the ordinal numeral is declined properly (it should be "v prvej diskusii"), but that's not the interesting bit. That rests with the translation of the word "knockout": "kokot nedved". "Kokot" = "dick, prick" is of course the basic Slovak insult for a man, for more information see here. It is also a very vulgar term, rarely seen in print or heard on the airwaves, so its appearance here will not only elicit a chuckle for its own sake, but also raise the question of just what corpus Google was using to train the MT engine. The web, sure, but I can't think of any sufficiently large bilingual corpus where that word would crop up.

And that question is even more justified with the second part of the translation: what the hell is a "nedved"? The only word that even comes close is the Czech surname Nedvěd, which is a form of "medvěd" = "bear". There are a few people with that name whose presence on the web is significant enough to be included in a web corpus, like the football player Pavel Nedvěd, the hockey player Petr Nedvěd and the folk singers Jan (Honza) and František Nedvěd. But how did their name get into the translation for "knockout"? "Knockout" ("knokaut" in Slovak) is a sports term, but I know of no boxer by the name of Nedvěd. Then again, football and hockey players as fellow athletes could probably fit the bill. That still leaves the question of how this Czech word got into a translation into Slovak. And it's not the only one - if you look at the screenshot, you will see at least three more words clearly identifiable as Czech (highlighted in green):

- "štípnout" for "tweak" - Slovak: "upraviť, vyladiť"; "štípnout" = "pinch, sting", slang: "steal", although one of my dictionaries gives "štípnout" for "tweak" without any further context or explanation.
- "slíbený" for "vowed" - Slovak: "sľúbený". Note that this is a past participle while the original has "vowed" as a past tense verb.
- "poldové" for "cops" - Slovak: "policajti", slang: "fízli". Note the context mismatch: both "poldové" and "fízli" are stylistically marked and not very likely to appear in a newspaper save perhaps for direct quotes.

Once again, I assume that web corpora were to some extent used to train the MT engine. As my own feeble attempts at corpus research have shown, the country code cz or sk in the domain name in no way guarantees that you will find only Czech or Slovak text there. The actual ratio is hard to determine, but it is definitely nice to see that one of the better aspects of Czechoslovakia - its almost fully bilingual citizens - survives to this day.
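
For the curious, here is a rough sketch of how one might flag Czech text hiding in a supposedly Slovak web corpus. It relies on a single fact about the two alphabets: ě, ř and ů occur only in Czech, while ä, ľ, ĺ, ŕ and ô occur only in Slovak. The function name `guess_language` and the return codes are my own invention for illustration, and I have no idea whether Google does anything like this. A real language identifier would use word or n-gram statistics rather than a handful of letters.

```python
import unicodedata

# Letters found in only one of the two alphabets.
CZECH_ONLY = set("ěřů")
SLOVAK_ONLY = set("äľĺŕô")


def guess_language(text: str) -> str:
    """Return 'cs', 'sk' or 'unknown' based on language-specific letters.

    Hypothetical helper: it only counts distinguishing letters, so any
    text without them (or with an equal number of both) is 'unknown'.
    """
    # Normalize to composed form so that "e" + combining caron becomes "ě".
    lowered = unicodedata.normalize("NFC", text).lower()
    czech_hits = sum(ch in CZECH_ONLY for ch in lowered)
    slovak_hits = sum(ch in SLOVAK_ONLY for ch in lowered)
    if czech_hits > slovak_hits:
        return "cs"
    if slovak_hits > czech_hits:
        return "sk"
    return "unknown"


if __name__ == "__main__":
    for word in ["Nedvěd", "sľúbený", "slíbený", "štípnout"]:
        print(word, guess_language(word))
```

Note that two of the Czech intruders above, "slíbený" and "štípnout", contain no distinguishing letter at all, which goes to show why letter-spotting alone would not be enough to keep such words out of the training data.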

And one last interesting bit from this small test: Barack Obama's full name is translated as it should be. But whenever his last name shows up on its own, Google translates it as "osobách" = "person-LOC.PL" (highlighted in light blue). Buggered if I know why...

(h/t: filer)


John Cowan said...

I have notified the proper folks at Google about this. Feel free to post more blunders and I'll pass them along too.

bulbul said...


great, thanks. I did submit my own translation for the entire headline and a number of phrases containing "knockout", but I expect a note from you will be handled more expeditiously.

John Cowan said...

It turns out that I sent this one to the wrong part of Google. I fixed that today.

John Cowan said...

The headline now translates to "Pred niekoľkými popíchnutí, ale žiadny knockout v prvej diskusii". Try some more things, if you don't mind.
