Listing 1 - 10 of 10 |
Speech dynamics refer to the temporal characteristics in all stages of the human speech communication process. This speech chain starts with the formation of a linguistic message in a speaker's brain and ends with the arrival of the message in a listener's brain. Given the intricacy of the dynamic speech process and its fundamental importance in human communication, this monograph is intended to provide comprehensive material on mathematical models of speech dynamics and to address the following issues: How do we make sense of the complex speech process in terms of its functional role of speech communication? How do we quantify the special role of speech timing? How do the dynamics relate to the variability of speech that has often been said to seriously hamper automatic speech recognition? How do we put the dynamic process of speech into a quantitative form to enable detailed analyses? And finally, how can we incorporate the knowledge of speech dynamics into computerized speech analysis and recognition algorithms? The answers to all these questions require building and applying computational models for the dynamic speech process.
Oral communication --- Speech --- Automatic speech recognition. --- Mathematical models. --- Mechanical speech recognizer --- Speech recognition, Automatic --- Talking --- Oral transmission --- Speech communication --- Verbal communication --- Articulatory trajectories. --- Coarticulation. --- Discretizing hidden dynamics. --- Dynamic Bayesian network. --- Formant tracking. --- Generative modeling. --- Speech acoustics. --- Speech dynamics. --- Vocal tract resonance. --- Pattern recognition systems --- Perceptrons --- Speech, Intelligibility of --- Speech perception --- Speech processing systems --- Language and languages --- Phonetics --- Voice --- Communication
Speech dynamics refer to the temporal characteristics in all stages of the human speech communication process. This speech “chain” starts with the formation of a linguistic message in a speaker's brain and ends with the arrival of the message in a listener's brain. Given the intricacy of the dynamic speech process and its fundamental importance in human communication, this monograph is intended to provide comprehensive material on mathematical models of speech dynamics and to address the following issues: How do we make sense of the complex speech process in terms of its functional role of speech communication? How do we quantify the special role of speech timing? How do the dynamics relate to the variability of speech that has often been said to seriously hamper automatic speech recognition? How do we put the dynamic process of speech into a quantitative form to enable detailed analyses? And finally, how can we incorporate the knowledge of speech dynamics into computerized speech analysis and recognition algorithms? The answers to all these questions require building and applying computational models for the dynamic speech process. What are the compelling reasons for carrying out dynamic speech modeling? We provide the answer in two related aspects. First, scientific inquiry into the human speech code has been relentlessly pursued for several decades. As an essential carrier of human intelligence and knowledge, speech is the most natural form of human communication. Embedded in the speech code are linguistic (as well as para-linguistic) messages, which are conveyed through four levels of the speech chain. Underlying the robust encoding and transmission of the linguistic messages are the speech dynamics at all four levels. Mathematical modeling of speech dynamics provides an effective tool in the scientific methods of studying the speech chain.
Such scientific studies help explain why humans speak as they do and how humans exploit redundancy and variability by way of multitiered dynamic processes to enhance the efficiency and effectiveness of human speech communication. Second, advancement of human language technology, especially in automatic recognition of natural-style human speech, is also expected to benefit from comprehensive computational modeling of speech dynamics. The limitations of current speech recognition technology are serious and well known. A commonly acknowledged and frequently discussed weakness of the statistical model underlying current speech recognition technology is the lack of adequate dynamic modeling schemes to provide correlation structure across the temporal speech observation sequence. Unfortunately, for a variety of reasons, the majority of current research activities in this area favor only incremental modifications and improvements to the existing HMM-based state of the art. For example, while dynamic and correlation modeling is known to be an important topic, most systems nevertheless employ only an ultra-weak form of speech dynamics, e.g., differential or delta parameters. Strong-form dynamic speech modeling, which is the focus of this monograph, may serve as an ultimate solution to this problem. After the introduction chapter, the main body of this monograph consists of four chapters. They cover various aspects of theory, algorithms, and applications of dynamic speech models, and provide a comprehensive survey of the research work in this area spanning the past 20 years. This monograph is intended as advanced material on speech and signal processing for graduate-level teaching, for professionals and engineering practitioners, as well as for seasoned researchers and engineers specialized in speech processing.
In recent years, deep learning has fundamentally changed the landscapes of a number of areas in artificial intelligence, including speech, vision, natural language, robotics, and game playing. In particular, the striking success of deep learning in a wide variety of natural language processing (NLP) applications has served as a benchmark for the advances in one of the most important tasks in artificial intelligence. This book reviews the state of the art of deep learning research and its successful applications to major NLP tasks, including speech recognition and understanding, dialogue systems, lexical analysis, parsing, knowledge graphs, machine translation, question answering, sentiment analysis, social computing, and natural language generation from images. Outlining and analyzing various research frontiers of NLP in the deep learning era, it features self-contained, comprehensive chapters written by leading researchers in the field. A glossary of technical terms and commonly used acronyms in the intersection of deep learning and NLP is also provided. The book appeals to advanced undergraduate and graduate students, post-doctoral researchers, lecturers and industrial researchers, as well as anyone interested in deep learning and natural language processing.
Natural language processing (Computer science) --- Computer science. --- Mathematical statistics. --- Artificial intelligence. --- Text processing (Computer science). --- Computational linguistics. --- Computer Science. --- Artificial Intelligence (incl. Robotics). --- Document Preparation and Text Processing. --- Probability and Statistics in Computer Science. --- Language Translation and Linguistics. --- Automatic language processing --- Language and languages --- Language data processing --- Linguistics --- Natural language processing (Linguistics) --- Applied linguistics --- Cross-language information retrieval --- Mathematical linguistics --- Multilingual computing --- Processing, Text (Computer science) --- Database management --- Electronic data processing --- Information storage and retrieval systems --- Word processing --- AI (Artificial intelligence) --- Artificial thinking --- Electronic brains --- Intellectronics --- Intelligence, Artificial --- Intelligent machines --- Machine intelligence --- Thinking, Artificial --- Bionics --- Cognitive science --- Digital computer simulation --- Logic machines --- Machine theory --- Self-organizing systems --- Simulation methods --- Fifth generation computers --- Neural computers --- Mathematics --- Statistical inference --- Statistics, Mathematical --- Statistics --- Probabilities --- Sampling (Statistics) --- Informatics --- Science --- Data processing --- Statistical methods --- NLP (Computer science) --- Artificial intelligence --- Human-computer interaction --- Semantic computing --- Natural language processing (Computer science). --- Artificial Intelligence. --- Natural Language Processing (NLP). --- Tractament del llenguatge natural (Informàtica) --- Lingüística computacional
This book provides a comprehensive overview of recent advances in the field of automatic speech recognition, with a focus on deep learning models including deep neural networks and many of their variants. This is the first automatic speech recognition book dedicated to the deep learning approach. In addition to a rigorous mathematical treatment of the subject, the book also presents insights into, and the theoretical foundations of, a series of highly successful deep learning models.
Acoustics in engineering. --- Social sciences --- Signal, Image and Speech Processing. --- Engineering Acoustics. --- Computer Appl. in Social and Behavioral Sciences. --- Data processing. --- Automatic speech recognition. --- Markov processes --- Mathematical models. --- Analysis, Markov --- Chains, Markov --- Markoff processes --- Markov analysis --- Markov chains --- Markov models --- Models, Markov --- Processes, Markov --- Stochastic processes --- Mechanical speech recognizer --- Speech recognition, Automatic --- Pattern recognition systems --- Perceptrons --- Speech, Intelligibility of --- Speech perception --- Speech processing systems --- Signal processing. --- Image processing. --- Speech processing systems. --- Acoustical engineering. --- Application software. --- Application computer programs --- Application computer software --- Applications software --- Apps (Computer software) --- Computer software --- Acoustic engineering --- Sonic engineering --- Sonics --- Sound engineering --- Sound-waves --- Engineering --- Computational linguistics --- Electronic systems --- Information theory --- Modulation theory --- Oral communication --- Speech --- Telecommunication --- Singing voice synthesizers --- Pictorial data processing --- Picture processing --- Processing, Image --- Imaging systems --- Optical data processing --- Processing, Signal --- Information measurement --- Signal theory (Telecommunication) --- Industrial applications
This book provides a comprehensive overview of recent advances in the field of automatic speech recognition, with a focus on deep learning models including deep neural networks and many of their variants. This is the first automatic speech recognition book dedicated to the deep learning approach. In addition to a rigorous mathematical treatment of the subject, the book also presents insights into, and the theoretical foundations of, a series of highly successful deep learning models.
In recent years, deep learning has fundamentally changed the landscapes of a number of areas in artificial intelligence, including speech, vision, natural language, robotics, and game playing. In particular, the striking success of deep learning in a wide variety of natural language processing (NLP) applications has served as a benchmark for the advances in one of the most important tasks in artificial intelligence. This book reviews the state of the art of deep learning research and its successful applications to major NLP tasks, including speech recognition and understanding, dialogue systems, lexical analysis, parsing, knowledge graphs, machine translation, question answering, sentiment analysis, social computing, and natural language generation from images. Outlining and analyzing various research frontiers of NLP in the deep learning era, it features self-contained, comprehensive chapters written by leading researchers in the field. A glossary of technical terms and commonly used acronyms in the intersection of deep learning and NLP is also provided. The book appeals to advanced undergraduate and graduate students, post-doctoral researchers, lecturers and industrial researchers, as well as anyone interested in deep learning and natural language processing.
Operational research. Game theory --- Computer science --- Artificial intelligence. Robotics. Simulation. Graphics --- Computer. Automation --- NLP (neurolinguïstisch programmeren) --- spraaktechnologie --- stochastische analyse --- machine learning --- deep learning --- computers --- informatietechnologie --- KI (kunstmatige intelligentie) --- computerkunde --- AI (artificiële intelligentie)
In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-function form. This common form enables the use of the growth transformation (or extended Baum–Welch) optimization framework in discriminative learning of model parameters. In addition to the necessary background and tutorial material on the subject, we also include technical details on the derivation of the parameter optimization formulas for exponential-family distributions, discrete hidden Markov models (HMMs), and continuous-density HMMs in discriminative learning. Selected experimental results obtained firsthand by the authors are presented to show that discriminative learning can lead to superior speech recognition performance over conventional parameter learning. Details on major algorithmic implementation issues of practical significance are provided to enable practitioners to translate the theory in the earlier part of the book directly into engineering practice. Table of Contents: Introduction and Background / Statistical Speech Recognition: A Tutorial / Discriminative Learning: A Unified Objective Function / Discriminative Learning Algorithm for Exponential-Family Distributions / Discriminative Learning Algorithm for Hidden Markov Model / Practical Implementation of Discriminative Learning / Selected Experimental Results / Epilogue / Major Symbols Used in the Book and Their Descriptions / Mathematical Notation / Bibliography.
"In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-function form. In addition to all the necessary introduction of the background and tutorial material on the subject, we also included technical details on the derivation of the parameter optimization formulas for exponential-family distributions, discrete hidden Markov models (HMMs), and continuous-density HMMs in discriminative learning. Selected experimental results obtained by the authors in firsthand are presented to show that discriminative learning can lead to superior speech recognition performance over conventional parameter learning. Details on major algorithmic implementation issues with practical significance are provided to enable the practitioners to directly reproduce the theory in the earlier part of the book into engineering practice."--BOOK JACKET.
Robust Automatic Speech Recognition: A Bridge to Practical Applications establishes a solid foundation for automatic speech recognition that is robust against acoustic environmental distortion. It provides a thorough overview of classical and modern noise- and reverberation-robust techniques that have been developed over the past thirty years, with an emphasis on practical methods that have been proven to be successful and which are likely to be further developed for future applications. The strengths and weaknesses of robustness-enhancing speech recognition techniques are carefully analyzed. The book covers noise-robust techniques designed for acoustic models which are based on both Gaussian mixture models and deep neural networks. In addition, a guide to selecting the best methods for practical applications is provided. The reader will: gain a unified, deep and systematic understanding of the state-of-the-art technologies for robust speech recognition; learn the links and relationships between alternative technologies for robust speech recognition; be able to use the technology analysis and categorization detailed in the book to guide future technology development; and be able to develop new noise-robust methods in the current era of deep learning for acoustic modeling in speech recognition.