The goal of this course is to give an in-depth introduction to the fascinating
problem of sentiment analysis and to present a comprehensive survey of all
important research topics and the latest developments in the field. As evidence
of that, this course covers more than 400 references from all major conferences and journals.
Although the field deals with natural language text, which is often considered unstructured data, this course takes a structured approach to introducing the problem, with the aim of bridging the unstructured and structured worlds and facilitating qualitative and quantitative analysis of opinions. This is crucial for practical applications. In this course, we first define the problem in order to give it an abstraction or structure.
Opinions are central to almost all human activities and are key influencers of our behaviors. Our beliefs and perceptions of reality, and the choices we make, are, to a considerable degree, conditioned upon how others see and evaluate the world. For this reason, when we need to make a decision we often seek out the opinions of others. This is not only true for individuals but also true for organizations.
Opinions and related concepts such as sentiments, evaluations, attitudes, and emotions are the subjects of study of sentiment analysis and opinion mining. The inception and rapid growth of the field coincide with those of social media on the Web, e.g., reviews, forum discussions, blogs, microblogs, Twitter, and social networks, because for the first time in human history we have a huge volume of opinionated data recorded in digital form. Since the early 2000s, sentiment analysis has grown to be one of the most active research areas in natural language processing. It is also widely studied in data mining, Web mining, and text mining. In fact, it has spread from computer science to management sciences and social sciences due to its importance to business and society as a whole. In recent years, industrial activities surrounding sentiment analysis have also thrived.
Numerous startups have emerged. Many large corporations have built their own in-house capabilities. Sentiment analysis systems have found applications in almost every business and social domain.
From this abstraction, we will naturally see the key subproblems.
The subsequent chapters discuss the existing techniques for solving these subproblems. This course is suitable for engineers, researchers, and practitioners who are interested in social media analysis in general and sentiment analysis in particular.
Sentiment analysis, also called opinion mining, is the field of study that
analyzes people’s opinions, sentiments, evaluations, appraisals, attitudes,
and emotions towards entities such as products, services, organizations,
individuals, issues, events, topics, and their attributes. It represents a large
problem space. There are also many names and slightly different tasks, e.g.,
sentiment analysis, opinion mining, opinion extraction, sentiment mining,
subjectivity analysis, affect analysis, emotion analysis, review mining, etc.
However, they are now all under the umbrella of sentiment analysis or
opinion mining. In industry, the term sentiment analysis is more
commonly used, while in academia both sentiment analysis and opinion mining
are frequently employed. They basically represent the same field of study.
The term sentiment analysis perhaps first appeared in (Nasukawa and Yi, 2003), and the term opinion mining first appeared in (Dave, Lawrence and Pennock, 2003). However, the research on sentiments and opinions appeared earlier (Das and Chen, 2001; Morinaga et al., 2002; Pang, Lee and Vaithyanathan, 2002; Tong, 2001; Turney, 2002; Wiebe, 2000).
In this course, we use the terms sentiment analysis and opinion mining interchangeably. To simplify the presentation, throughout this course we will use the term opinion to denote opinion, sentiment, evaluation, appraisal, attitude, and emotion. However, these concepts are not equivalent, and we will distinguish them when needed. The meaning of opinion itself is still very broad. Sentiment analysis and opinion mining mainly focus on opinions that express or imply positive or negative sentiments.
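As a rough illustration of what giving structure to opinions can look like, the Python sketch below stores each opinion as a small record and counts positive and negative opinions per entity and attribute. The `Opinion` fields, the `summarize` helper, and the sample data are assumptions made for this example only; they are not the formal problem definition given later in the course.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One structured opinion record (hypothetical schema for illustration)."""
    entity: str    # what the opinion is about, e.g. a product
    aspect: str    # attribute of the entity being evaluated
    polarity: str  # "positive" or "negative"
    holder: str    # who expressed the opinion


def summarize(opinions):
    """Count opinions by polarity for each (entity, aspect) pair."""
    counts = {}
    for op in opinions:
        key = (op.entity, op.aspect)
        counts.setdefault(key, Counter())[op.polarity] += 1
    return counts


# Made-up sample records, as might be extracted from review text.
opinions = [
    Opinion("phone", "battery", "negative", "reviewer_1"),
    Opinion("phone", "battery", "negative", "reviewer_2"),
    Opinion("phone", "screen", "positive", "reviewer_1"),
]
summary = summarize(opinions)
```

Once opinions are held in records like these, quantitative questions (for example, how many reviewers dislike the battery) reduce to simple counting, which is the kind of analysis that raw text alone does not support directly.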
Although linguistics and natural language processing (NLP) have a long history, little research had been done on people’s opinions and sentiments before the year 2000. Since then, the field has become a very active research area. There are several reasons for this. First, it has a wide range of applications, in almost every domain. The industry surrounding sentiment analysis has also flourished due to the proliferation of commercial applications. This provides a strong motivation for research. Second, it offers many challenging research problems that had never been studied before. This course will systematically define and discuss these problems, and describe the current state-of-the-art techniques for solving them.
Third, for the first time in human history, we now have a huge volume of opinionated data in the social media on the Web. Without this data, a lot of research would not have been possible. Not surprisingly, the inception and the rapid growth of sentiment analysis coincide with those of the social media.
In fact, sentiment analysis is now right at the center of social media research. Hence, research in sentiment analysis not only has an important impact on NLP, but may also have a profound impact on management sciences, political science, economics, and social sciences, as they are all affected by people’s opinions. Although sentiment analysis research mainly started in the early 2000s, there was some earlier work on the interpretation of metaphors, sentiment adjectives, subjectivity, viewpoints, and affect (Hatzivassiloglou and McKeown, 1997; Hearst, 1992; Wiebe, 1990; Wiebe, 1994; Wiebe, Bruce and O'Hara, 1999). This course serves as an up-to-date and comprehensive introductory text, as well as a survey of the subject.
• Sentiment Analysis Applications
• Sentiment Analysis Research
• Different Levels of Analysis
• Sentiment Lexicon and Its Issues
• Natural Language Processing Issues
• Opinion Spam Detection
• What’s Ahead
• Problem Definitions
• Opinion Definition
• Sentiment Analysis Tasks
• Opinion Summarization
• Different Types of Opinions
• Regular and Comparative Opinions
• Explicit and Implicit Opinions
• Subjectivity and Emotion
• Author and Reader Standing Point
• Sentiment Classification Using Supervised Learning
• Sentiment Classification Using Unsupervised Learning
• Sentiment Rating Prediction
• Cross-Domain Sentiment Classification
• Cross-Language Sentiment Classification
• Sentence Subjectivity and Sentiment Classification
• Subjectivity Classification
• Sentence Sentiment Classification
• Dealing with Conditional Sentences
• Dealing with Sarcastic Sentences
• Cross-Language Subjectivity and Sentiment Classification
• Using Discourse Information for Sentiment Classification
• Dictionary-based Approach
• Corpus-based Approach
• Desirable and Undesirable Facts
• Quality as Regression Problem
• Other Methods
• Aspect Sentiment Classification
• Basic Rules of Opinions and Compositional Semantics
• Aspect Extraction
• Finding Frequent Nouns and Noun Phrases
• Using Opinion and Target Relations
• Using Supervised Learning
• Using Topic Models
• Mapping Implicit Aspects
• Identifying Resource Usage Aspect
• Simultaneous Opinion Lexicon Expansion and Aspect Extraction
• Grouping Aspects into Categories
• Entity, Opinion Holder and Time Extraction
• Coreference Resolution and Word Sense Disambiguation
• Aspect-based Opinion Summarization
• Improvements to Aspect-based Opinion Summarization
• Contrastive View Summarization
• Traditional Summarization
• Analysis of Comparative Opinions
• Problem Definitions
• Identifying Comparative Sentences
• Identifying Preferred Entities
• Web Search vs. Opinion Search
• Existing Opinion Retrieval Techniques
• Types of Spam and Spamming
• Harmful Fake Reviews
• Individual and Group Spamming
• Types of Data, Features and Detection
• Supervised Spam Detection
• Unsupervised Spam Detection
• Spam Detection based on Atypical Behaviors
• Spam Detection Using Review Graph
• Group Spam Detection