Listing 1 - 8 of 8
The influence of gender on conversation has been studied with methodologies that use a relatively small sample of words. Stable Lexical Marker Analysis (SLMA) is a method that analyses large numbers of words and determines which words are most typical of a text. Conversations in which a man talks to a man were compared with conversations in which a woman talks to a man. The analysis points to some collaborative word use when men talk among themselves. The size of the difference between the dyads depends on the context in which the conversation takes place. The results of the SLMA analysis were originally scattered across many different data frames. To make the results more readable, several visualisations were made. These could also help future researchers get an overview of their SLMA results.
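The abstract does not say how typicality is computed. One common building block for keyword analysis of this kind is the log-likelihood ratio (G2), which compares a word's frequency in a target text with its frequency in a reference text. The sketch below shows that generic statistic only; it is not an implementation of SLMA, and the function names, threshold, and toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def log_likelihood(a, b, n1, n2):
    """G2 for a word seen a times in n1 target tokens and b times in n2 reference tokens."""
    e1 = n1 * (a + b) / (n1 + n2)  # expected count in the target text
    e2 = n2 * (a + b) / (n1 + n2)  # expected count in the reference text
    g2 = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

def keywords(target_tokens, reference_tokens, threshold=3.84):
    """Words relatively more frequent in the target, with G2 at or above the threshold.

    3.84 is the chi-square critical value for p < 0.05 with one degree of
    freedom; using it as the cut-off is an assumption of this sketch.
    """
    t, r = Counter(target_tokens), Counter(reference_tokens)
    n1, n2 = len(target_tokens), len(reference_tokens)
    scored = []
    for word in t:
        a, b = t[word], r[word]
        g2 = log_likelihood(a, b, n1, n2)
        if g2 >= threshold and a / n1 > b / n2:
            scored.append((word, g2))
    return sorted(scored, key=lambda kv: (-kv[1], kv[0]))

# Toy data, invented for illustration only.
target = ["we"] * 10 + ["go"] * 10
reference = ["i"] * 10 + ["go"] * 10
typical = keywords(target, reference)
```

In this toy comparison only "we" is returned, because "go" is equally frequent in both texts.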
The philosophical community is currently engaged in comparative philosophy, a research paradigm with a high level of variation. The scholarly definition of comparative philosophy is not settled, which has resulted in a convergence of various, perhaps conflicting, ideas in the field. Philosophers steer clear of misunderstandings and keep improving their own viewpoints through confrontation with ideas. Earlier research has shown that comparative philosophers currently use four methods, including archival methods and equivalent methods, yet the majority of these methods are applied through close readings of philosophical classics. This study applies computational text analysis to bring the classical philosophical literature into the digital world. Using a combined set of analytical methods, including topic modeling, collocation analysis, and semantic network analysis, the researcher investigates concepts and their relationships in the work of two non-Western philosophers: Mencius, the 'Second Sage' of Confucianism, and al-Farabi, 'the second master' of Islamic philosophy. Topic modeling groups the words of texts in two different languages into topics, enabling the researcher to explore the relationship between the common topics and the others through the topic distribution of documents. Collocation analysis concentrates on how philosophers of different traditions have conceptualized a notion or a set of concepts. Semantic network analysis creates collocation networks both of concepts of interest to the researcher and of concepts selected by statistical methods without any prior knowledge, allowing the researcher to view the philosophical literature from a new perspective. The findings indicate that computational text analysis can bring discoveries to the concept analysis of cross-linguistic comparative philosophical projects that differ from those of close reading. Thus, these quantitative textual analysis techniques can create a new consensus for cross-cultural interaction.
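To make the collocation step more concrete, here is a minimal sketch that ranks the words occurring near a chosen concept by pointwise mutual information (PMI). The abstract does not name an association measure, so PMI, the window size, the function name, and the sample sentence are all assumptions made for illustration.

```python
import math
from collections import Counter

def collocates(tokens, node, window=2):
    """Rank words found within +/- `window` tokens of `node` by PMI."""
    total = len(tokens)
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(total, i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[tokens[j]] += 1
    scores = {}
    for word, joint in pair_counts.items():
        # PMI = log2(P(node, word) / (P(node) * P(word))), using simple
        # relative-frequency estimates over the token count.
        scores[word] = math.log2(
            (joint * total) / (word_counts[node] * word_counts[word])
        )
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Invented sample sentence, not drawn from Mencius or al-Farabi.
tokens = "the ruler loves virtue and the people love the ruler who loves virtue".split()
ranked = collocates(tokens, "virtue")
```

PMI tends to favour rare words, so a frequency floor or a different measure would probably be needed on real texts. The ranked pairs could also serve as weighted edges in a collocation network.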
This thesis project presents a working prototype of a corpus of Indian English fictional texts. It begins with a discussion of the qualitative background to the need for such a corpus. Various debates and arguments made in previous studies are laid out. An argument is made for the need for quantitative and corpus stylistic research in the study of Indian English literature. This project aims to fill the gap in the research by building a prototype of a corpus that can be used to that end. The report moves on to discuss the state of the art in corpus design and annotation principles, available corpora that are similar to the one proposed, and available corpus architectures. Two main approaches to corpus architecture are discussed: the IMS Corpus Workbench and the relational database approach. The latter approach is the one used for building the prototype. An overview of the corpus prototype then follows, along with a description of the web interface, the possible queries, and a discussion of the rationale behind the decisions made in the process of creating it. The pros and cons of the approach taken are discussed. The main advantage is that the relational database approach allows for a simplified and scalable model for building a corpus. A drawback is that, despite the claims that it allows for faster queries, the database tables can be quite large, resulting in slower execution times. The strengths and shortcomings of the prototype are also laid out in the discussion. The prototype is a valuable starting point for scaling up into a fully fledged working corpus with a web interface. However, it presently lacks markup that separates the text from the non-textual elements, which creates discrepancies in the frequency and n-gram tables. In conclusion, future prospects of the prototype are discussed, including the possibility of adding markup and additional annotations, along with improvements to the user interface.
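As a rough illustration of the relational database approach, the sketch below stores one token per row in an in-memory SQLite table and derives frequency and bigram counts with SQL. The schema, table and column names, and sample sentences are hypothetical and are not taken from the thesis.

```python
import sqlite3

def build_corpus(texts):
    """Load whitespace-tokenised texts into an in-memory SQLite token table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE token (text_id INTEGER, position INTEGER, word TEXT)"
    )
    rows = [
        (text_id, pos, word.lower())
        for text_id, text in enumerate(texts)
        for pos, word in enumerate(text.split())
    ]
    conn.executemany("INSERT INTO token VALUES (?, ?, ?)", rows)
    # The index supports the self-join used for bigrams.
    conn.execute("CREATE INDEX idx_token_pos ON token (text_id, position)")
    return conn

def word_frequencies(conn):
    """Return (word, count) rows, most frequent first."""
    return conn.execute(
        "SELECT word, COUNT(*) AS n FROM token GROUP BY word ORDER BY n DESC, word"
    ).fetchall()

def bigram_frequencies(conn):
    """Return (word1, word2, count) rows by joining each token to its successor."""
    return conn.execute(
        """
        SELECT a.word, b.word, COUNT(*) AS n
        FROM token AS a
        JOIN token AS b
          ON b.text_id = a.text_id AND b.position = a.position + 1
        GROUP BY a.word, b.word
        ORDER BY n DESC, a.word, b.word
        """
    ).fetchall()

# Invented sample texts for illustration.
conn = build_corpus(["the tea was hot", "the tea was sweet"])
freqs = word_frequencies(conn)
bigrams = bigram_frequencies(conn)
```

Every whitespace-separated string is stored as a token here, so any non-textual material left in the input would be counted too. That is the kind of discrepancy in frequency and n-gram tables that the abstract attributes to missing markup.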
This thesis explores, in a practical manner, how rhythmic constraints can be modeled and incorporated within an existing neural-network-based poetry generation system developed by Van de Cruys (2020). As the poetry generation system is trained on prosaic texts, content and form constraints must be added to give the generated texts a poetic character. In other words, our goal is to integrate rhythmic and topic aspects into the existing neural network so as to teach the system to generate texts that resemble human-written poems. In this thesis, our task focuses on the rhythm constraints for English poetry generation. The thesis starts with a broad review of academic work related to poetry generation, then elaborates on the details of our methodologies as its main body, followed by an evaluation of the outputs from the generators using different methods and at different implementation stages, and ends with conclusions, reflections on the limitations of our work, and visions for future improvements. The experimentation is designed to be carried out in two major steps. The first step is to create a model that yields lexical stress patterns for given English words: we built a set of annotated data based on open language data that contains phonetic information, and then used the preprocessed dataset as training data for deep neural networks that automatically predict lexical stress patterns. The second step is to construct representations of stress patterns in context and use these representations as constraints on the neural network for poetry generation. Ideally, the generated verse will be rhythmically constrained. It should be noted that the final implementation differs from our original experiment design.
In the second step, we created a very accurate and handy stress dictionary and exclusively made use of the stress patterns for the vocabulary present in the dictionary as constraints within the language generation model, which performs very well on rhythm and is able to generate fluent and meaningful verse. The neural model constructed in the first step, though it performs very well in predicting stress patterns for unknown English words, was not actually employed in the final incorporation. We decided to retain this step in the thesis, as it offers a possible future improvement of the rhythmic constraints: combining the stress dictionary (highly accurate rhythmic labeling for common words) and the prediction model (applying to words that are not included in the dictionary) within the generation system. The lines generated by our final generator, which incorporates rhythmic, rhyme, and topical constraints, have very good rhythmic patterns and rhyme schemes. The rhythmic constraints also combine well with the other two constraints. Ultimately, the model is able to generate four-line poems that are fair in terms of fluency and meaningfulness, but there is still a gap between its output and both models with more sophisticated algorithms and human-written poetry.
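The idea of a dictionary-based rhythmic constraint can be sketched as a lookup of per-word stress patterns, which are then compared with a target meter. The tiny dictionary, the 0/1 encoding, and the function names below are hypothetical. The sketch only checks finished lines, whereas the thesis applies its constraints inside the generation model.

```python
# Hypothetical mini stress dictionary: 1 = stressed syllable, 0 = unstressed.
# Treating monosyllables as fixed is a simplification; their stress is
# context-dependent in real verse.
STRESS = {
    "the": "0", "sun": "1", "above": "01", "a": "0", "silent": "10",
    "garden": "10", "sea": "1", "is": "0", "bright": "1",
}

def line_pattern(line):
    """Concatenate per-word stress patterns; return None if any word is unknown."""
    parts = []
    for word in line.lower().split():
        if word not in STRESS:
            return None  # out-of-dictionary word: no pattern available
        parts.append(STRESS[word])
    return "".join(parts)

def fits_meter(line, foot="01", feet=4):
    """True if the line's stress pattern equals `feet` repetitions of `foot`."""
    return line_pattern(line) == foot * feet

# "01" repeated four times is iambic tetrameter.
iambic = fits_meter("The sun above a silent sea")
not_iambic = fits_meter("the garden is bright", feet=2)
```

Words missing from the dictionary yield no pattern in this sketch. That is the gap a stress prediction model like the one from the first step could fill.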
This study compares the Chinese reason connectives yinwei and youyu from the viewpoint of corpus linguistics. Three independent statistical methods (collocation analysis, correspondence analysis, and mixed-effects logistic regression analysis) were applied to data from two balanced and comparable corpora, the UCLA2 (original Chinese text) and the ZCTC (English-translated Chinese text). The results revealed that the choice between yinwei and youyu is determined by a number of language-internal factors, including the length of the related sentences, the position of reason subjects and reason clauses, and the use of result connectives. They also showed that youyu is preferred in objective domains as well as in the original Chinese text.
Contemporary data analytics involves extracting insights from data and translating them into action. With its turn towards empirical methods and convergent data sources, cognitive linguistics is a fertile context for data analytics. There are key differences between data analytics and statistical analysis as typically conceived. Though the former requires the latter, it emphasizes the role of domain-specific knowledge. Statistical analysis also tends to be associated with preconceived hypotheses and controlled data. Data analytics, on the other hand, can help explore unstructured datasets and inspire emergent questions. This volume addresses two key aspects in data analytics for cognitive linguistic work. Firstly, it elaborates the bottom-up guiding role of data analytics in the research trajectory, and how it helps to formulate and refine questions. Secondly, it shows how data analytics can suggest concrete courses of research-based action, which is crucial for cognitive linguistics to be truly applied. The papers in this volume impart various data analytic methods and report empirical studies across different areas of research and application. They aim to benefit new and experienced researchers alike.
Electronic data processing --- Business intelligence --- Big data --- Data sets, Large --- Large data sets --- Data sets --- Business espionage --- Competitive intelligence --- Corporate intelligence --- Economic espionage --- Espionage, Business --- Espionage, Economic --- Espionage, Industrial --- Industrial espionage --- Intelligence, Business --- Intelligence, Corporate --- Business ethics --- Competition, Unfair --- Industrial management --- Confidential business information --- ADP (Data processing) --- Automatic data processing --- Data processing --- EDP (Data processing) --- IDP (Data processing) --- Integrated data processing --- Computers --- Office practice --- Automation
Linguistics --- Computer. Automation --- History as a science