Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. Text data are unique in that they are usually generated directly by humans rather than a computer system or sensors, and are thus especially valuable for discovering knowledge about people’s opinions and preferences, in addition to many other kinds of knowledge that we encode in text.
Offered By
About this Course
Skills you will gain
- Information Retrieval (IR)
- Document Retrieval
- Machine Learning
- Recommender Systems
Offered by
University of Illinois at Urbana-Champaign
The University of Illinois at Urbana-Champaign is a world leader in research, teaching and public engagement, distinguished by the breadth of its programs, broad academic excellence, and internationally renowned faculty and alumni. Illinois serves the world by creating knowledge, preparing students for lives of impact, and finding solutions to critical societal needs.
Syllabus - What you will learn from this course
Orientation
You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course.
Week 1
During this week's lessons, you will learn of natural language processing techniques, which are the foundation for all kinds of text-processing applications, the concept of a retrieval model, and the basic idea of the vector space model.
Week 2
In this week's lessons, you will learn how the vector space model works in detail, the major heuristics used in designing a retrieval function for ranking documents with respect to a query, and how to implement an information retrieval system (i.e., a search engine), including how to build an inverted index and how to score documents quickly for a query.
Week 3
In this week's lessons, you will learn how to evaluate an information retrieval system (a search engine), including the basic measures for evaluating a set of retrieved results and the major measures for evaluating a ranked list, including the average precision (AP) and the normalized discounted cumulative gain (nDCG), and practical issues in evaluation, including statistical significance testing and pooling.
Week 4
In this week's lessons, you will learn probabilistic retrieval models and statistical language models, particularly the detail of the query likelihood retrieval function with two specific smoothing methods, and how the query likelihood retrieval function is connected with the retrieval heuristics used in the vector space model.
Reviews
- 5 stars65.45%
- 4 stars23.98%
- 3 stars6.82%
- 2 stars1.65%
- 1 star2.09%
TOP REVIEWS FROM TEXT RETRIEVAL AND SEARCH ENGINES
This course is complemented with a software and text (not mandatory) and good explanation with diagrams...
I have learned a lot of concepts through this course, but at a shallow level. It is a great introduction course to IR. It can be improved by adding more programming tasks for hands-on exercise
A great overview of text retrieval methods. Good coverage of search engines. A longer course will cover search engine better (remember this is a 6 weeker)
A bit difficult to complete as the Quiz questions were tougher. But when you go through all, you might feel good.
About the Data Mining Specialization
The Data Mining Specialization teaches data mining techniques for both structured data which conform to a clearly defined schema, and unstructured data which exist in the form of natural language text. Specific course topics include pattern discovery, clustering, text retrieval, text mining and analytics, and data visualization. The Capstone project task is to solve real-world data mining challenges using a restaurant review data set from Yelp.
Frequently Asked Questions
When will I have access to the lectures and assignments?
What will I get if I subscribe to this Specialization?
Is financial aid available?
More questions? Visit the Learner Help Center.