This course covers models for information retrieval, techniques for indexing
and searching, and algorithms for text mining. Time permitting, it also covers,
in varying degrees of detail, SVMs, latent semantic indexing, link analysis and
ranking, the Map-Reduce architecture and Hadoop, and applications of deep
learning to IR and text-related tasks.
The course load includes programming projects, reading assignments, and a
take-home final exam.
Note that you will need significant programming skills (Python is preferred) to
finish the projects.
(1) You are encouraged to work on the project assignments in teams of two
students. You may discuss and share your work with your teammate to improve
your understanding of the material, but not with other teams.
(3) Students are prohibited from using any computer (e.g., laptop),
calculator, or cell phone during the exam.
(4) Any academic impropriety during an exam (which includes copying from other
students, or accessing web resources and passing them off as your own work) or
on the assignments will carry a minimum penalty of an 'F' grade, plus additional
disciplinary action for unethical behavior. See
http://www.wright.edu/students/judicial/integrity.html for details.
2 Project Assignments (50%)
2-4 Reading Assignments (20%)
Take-home Final Exam (30%)
A letter grade (A/B/C/D/F) will be assigned at the end of the course. Final
grades may be curved according to the distribution of overall grades
(projects + exams).
Tentative Class Schedule
| | Topics | Additional Reading |
| 1. | Information Retrieval; The Boolean Model | IIR-1 |
| 2. | The Vector Space Model: Term Weighting and Scoring | IIR-6 |
| 3. | Inverted Index Construction | IIR-1 |
| 4. | Dictionary and Postings; Query Processing | IIR-2 |
| 5. | Tolerant Retrieval (B-Trees) | IIR-3 |
| 6. | Index Construction | IIR-4 |
| 7. | Map Reduce Architecture | |
| 8. | Index Compression | IIR-5 |
| 9. | Vector Space Model: TF-IDF | IIR-6.2 |
| 10. | Vector Space Model: Ranking Revisited | IIR-6.1 |
| 11. | Midterm | |
| 12. | Evaluation in Information Retrieval | IIR-8 |
| 13. | Relevance Feedback and Query Expansion | IIR-9 |
| 14. | Text Classification and Naive Bayes | IIR-13 |
| 15. | Vector Space Classification | IIR-14 |
| 16. | Support Vector Machines | IIR-15, SVM |
| 17. | Flat and Hierarchical Clustering | IIR-16, IIR-17 |
| 18. | Latent Semantic Indexing | IIR-18, Refs |
| 19. | Linear Algebra: Matrix Decompositions | |
| 20. | Link Analysis | IIR-21 |
| 21. | Other topics | |
| | Final Exam | |