Web Search and Text Analysis (COMP90042)

Graduate coursework | Points: 12.5 | On Campus (Parkville)

Year of offer: 2019
Subject level: Graduate coursework
Subject code: COMP90042
Semester 1
Fees: Subject EFTSL, Level, Discipline & Census Date


The aim of this subject is for students to develop an understanding of the main algorithms used in natural language processing and text retrieval, for use in a diverse range of applications including search engines, machine translation, text mining, sentiment analysis, and question answering. Topics to be covered include text pre-processing, part-of-speech tagging, context-free grammars, n-gram language modelling, and text classification. The programming language used is Python.


Topics covered may include:

  • Text classification algorithms such as logistic regression
  • Vector space models for natural language semantics
  • Structured prediction, hidden Markov models
  • N-gram language modelling, including statistical estimation
  • Alignment of parallel corpora
  • Term indexing, term weighting for information retrieval
  • Query expansion and relevance feedback

Intended learning outcomes


On completion of this subject the student is expected to:

  • Identify basic challenges associated with the computational modelling of natural language
  • Understand and articulate the mathematical and/or algorithmic basis of common techniques used in natural language processing and information retrieval
  • Implement relevant techniques and/or interface with existing libraries
  • Carry out end-to-end research experiments, including evaluation with text corpora as well as presentation and interpretation of results.

Generic skills

On completing this subject, students should be able to:

  • Formulate and implement algorithmic solutions to computational problems, with reference to the research literature
  • Apply a systems approach to complex problems, and design for operational efficiency
  • Design, implement and test programs for small and medium size problems in the Python programming language.

Last updated: 16 August 2019