TY - BOOK
ID - 13971223
TI - Bitext alignment
PY - 2011
SN - 9781608455119 1608455114 9781608455102
PB - San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) Morgan & Claypool
DB - UniCat
KW - Machine translating.
KW - Computational linguistics.
KW - Bitexts.
KW - Sentence alignment.
KW - Languages & Literatures
KW - Philology & Linguistics
KW - Natural language processing (Computer science)
KW - NLP (Computer science)
KW - Automatic translating
KW - Computer translating
KW - Electronic translating
KW - Mechanical translating
KW - Alignment
KW - Bitexts
KW - Parallel corpora
KW - Sentence alignment
KW - Word alignment
KW - Tree alignment
KW - Statistical machine translation
KW - Transduction grammars
KW - Text mining
KW - Lexicon induction
KW - Artificial intelligence
KW - Electronic data processing
KW - Human-computer interaction
KW - Semantic computing
KW - Algorithms
KW - Applied linguistics
KW - Natural language generation (Computer science)
KW - Information theory
KW - Translating and interpreting
KW - Cross-language information retrieval
KW - Translating machines
KW - 681.3*I27
KW - 800:311
KW - 800:311 Quantitative linguistics. Computational linguistics
KW - Quantitative linguistics. Computational linguistics
KW - 681.3*I27 Natural language processing: language generation; language models; language parsing and understanding; machine translation; speech recognition and understanding; text analysis (Artificial intelligence)
KW - Natural language processing: language generation; language models; language parsing and understanding; machine translation; speech recognition and understanding; text analysis (Artificial intelligence)
KW - Automatic language processing
KW - Language and languages
KW - Language data processing
KW - Linguistics
KW - Natural language processing (Linguistics)
KW - Mathematical linguistics
KW - Multilingual computing
KW - Data processing
UR - https://www.unicat.be/uniCat?func=search&query=sysid:13971223
AB - Preface -- Acknowledgments -- 1. Introduction -- Applications -- Further readings -- 2. Basic concepts and terminology -- Bitext and alignment -- Alignment and segmentation -- Alignment spaces and constraints -- Correlations and cues -- Alignment models and search algorithms -- Evaluation of bitext alignment -- Summary and further reading -- 3. Building parallel corpora -- Document alignment -- Mining the web -- Extracting parallel data from comparable corpora -- Summary and further reading -- 4. Sentence alignment -- Length-based approaches -- Lexical matching approaches -- Combined and resource-specific techniques -- Summary and further reading -- 5. Word alignment -- Generative alignment models -- Constraints and heuristics -- Discriminative alignment models -- Translation spotting and bilingual lexicon induction -- Summary and further reading -- 6. Phrase and tree alignment -- Parallel treebanks and tree alignment -- Hierarchical alignment and transduction grammars -- Summary and further reading -- 7. Concluding remarks -- Final recommendations -- A. Resources & tools -- Bibliography -- Author's biography. This book provides an overview of various techniques for the alignment of bitexts.
It describes general concepts and strategies that can be applied to map corresponding parts of parallel documents at various levels of granularity. Bitexts are valuable linguistic resources for many different research fields and practical applications. The predominant application is machine translation, in particular statistical machine translation. However, the rich linguistic knowledge implicitly stored in parallel resources can support various other lines of work. Bitexts have been explored in lexicography, word sense disambiguation, terminology extraction, computer-aided language learning, and translation studies, to name just a few. The book covers the essential tasks that have to be carried out when building parallel corpora, from the collection of translated documents through to sub-sentential alignment. In particular, it describes various approaches to document alignment, sentence alignment, word alignment, and tree structure alignment. It also includes a list of resources and a comprehensive review of the literature on alignment techniques.
ER -