Kalloniatis Antonis (PhD Candidate)

Thesis title: Analysis and generation of humorous texts in Greek
Supervisor: Adamidis Panagiotis
Advisory Committee Members:
Goulianas Konstantinos, Professor, Dept. of Information and Electronic Engineering, IHU
Tantos Alexandros, Ass. Professor, School of Philology, AUTH

The appreciation and use of humor come so naturally to humans that they rarely grasp the complex cognitive steps, world knowledge, intelligence and creativity behind it. Codifying all of this so that a computer system can behave like a human is a difficult task, and computational humor is considered one of the most complex problems in artificial intelligence. Humor perception requires a computational model that, in addition to linguistic knowledge, has been trained on data that capture the context of text or speech. Similarly, humor generation requires the system not only to generate a funny text that fits the context of the situation, but also to judge whether the situation is appropriate for a joke. In this thesis, we will investigate a very widespread but difficult to analyze emotion: humor. This study, as far as we know, will be one of the first investigations of computational humor in the Greek language. Considering the weaknesses of previous work, we will investigate specific questions for both strands of computational humor: its understanding and its generation.
Scientific research in humor recognition is mainly based on Machine Learning techniques applied to a set of language features. Research using data extracted from dialogue synthesis is very limited. Even Purandare and Litman [32], who used speech data from a TV series, selected small individual segments of speech, ignoring any kind of contextual information. Our goal is to overcome the limitations of previous research, namely the treatment of humor as a classification problem, and to create a computational model that incorporates data based on domain ontologies, which define relationships with characteristics of the environment and the given situation.
For the humor understanding part, the question is whether and how we can improve the humor recognition performance of the computational model we will create.
On the humor generation side, scientific research places great emphasis on pattern-based models that produce a large amount of canned output, which may or may not be funny and hardly fits the context of a dialogue or situation. This greatly limits the share of positive human evaluations of these systems, which does not exceed 20% of the total. Research into the creation of spontaneous conversational humor is in its infancy. Even recent interactive speech systems (Virtual Assistants), which can now structure complex conversations, still sound very robotic and lack a sense of humor. In this thesis we will develop a model that, after judging a dialogue suitable for a joke, will generate funny words or sentences through dynamic pattern modification in different sentence forms and insert them into the dialogue to match the context. For humor generation, the question is whether and how we can create expressions (responses in a dialogue) that are funny and relevant to the context of the dialogue.
Natural language understanding, which is part of the computational humor problem, is a difficult problem in itself and is not yet solved for all languages. Supervised Machine Learning techniques applied to a set of language features have been shown to achieve very good results [10, 47, 27]. However, most Machine Learning research is based on text documents written in English, a language spoken all over the world and with huge datasets available. Schroeder’s study [38] highlights the high percentage of social media texts that are not written in English, and Fischer [10] gives some interesting information about the languages used on Twitter by geographical location.
With the huge amount of such data available on social media, there is a need to develop technologies for other languages as well. With our research, we will try to fill this gap in the literature by studying computational humor in the Greek language. Our research question is whether we can build computational models for recognizing and generating humor in Greek.