| | |
| | |
Stat |
Members: 3669 Articles: 2'599'751 Articles rated: 2609
18 March 2025 |
|
| | | |
|
Article overview
| |
|
Bi-Text Alignment of Movie Subtitles for Spoken English-Arabic Statistical Machine Translation | Fahad Al-Obaidli
; Stephen Cox
; Preslav Nakov
; | Date: |
5 Sep 2016 | Abstract: | We describe efforts towards getting better resources for English-Arabic
machine translation of spoken text. In particular, we look at movie subtitles
as a unique, rich resource, as subtitles in one language often get translated
into other languages. Movie subtitles are not new as a resource and have been
explored in previous research; however, here we create a much larger bi-text
(the biggest to date), and we further generate better quality alignment for it.
Given the subtitles for the same movie in different languages, a key problem is
how to align them at the fragment level. Typically, this is done using
length-based alignment, but for movie subtitles, there is also time
information. Here we exploit this information to develop an original algorithm
that outperforms the current best subtitle alignment tool, subalign. The
evaluation results show that adding our bi-text to the IWSLT training bi-text
yields an improvement of over two BLEU points absolute. | Source: | arXiv, 1609.1188 | Services: | Forum | Review | PDF | Favorites |
|
|
No review found.
Did you like this article?
Note: answers to reviews or questions about the article must be posted in the forum section.
Authors are not allowed to review their own article. They can use the forum section.
|
| |
|
|
|