Web Search and Text Analysis (COMP90042)
Graduate coursework; Points: 12.5; On Campus (Parkville)
Overview
Availability | Semester 1 |
---|---|
AIMS
The aim of this subject is for students to develop an understanding of the main algorithms used in natural language processing and text retrieval, for use in a diverse range of applications including search engines, machine translation, text mining, sentiment analysis, and question answering. Topics to be covered include text pre-processing, part-of-speech tagging, context-free grammars, n-gram language modelling, and text classification. The programming language used is Python.
INDICATIVE CONTENT
Topics covered may include:
- Text classification algorithms such as logistic regression
- Vector space models for natural language semantics
- Structured prediction, Hidden Markov models
- N-gram language modelling, including statistical estimation
- Alignment of parallel corpora
- Term indexing, term weighting for information retrieval
- Query expansion and relevance feedback
Intended learning outcomes
On completion of this subject the student is expected to:
- Identify basic challenges associated with the computational modelling of natural language
- Understand and articulate the mathematical and/or algorithmic basis of common techniques used in natural language processing and information retrieval
- Implement relevant techniques and/or interface with existing libraries
- Carry out end-to-end research experiments, including evaluation with text corpora as well as presentation and interpretation of results.
Generic skills
On completing this subject, students should have the following skills:
- Formulate and implement algorithmic solutions to computational problems, with reference to the research literature
- Apply a systems approach to complex problems, and design for operational efficiency
- Design, implement and test programs for small and medium size problems in the Python programming language.
Last updated: 3 November 2022
Eligibility and requirements
Prerequisites
One of the following:
Code | Name | Teaching period | Credit Points |
---|---|---|---|
COMP30018 | Knowledge Technologies | No longer available | |
COMP90049 | Knowledge Technologies | Semester 2 (On Campus - Parkville), Semester 1 (On Campus - Parkville) | 12.5 |
COMP30027 | Machine Learning | Semester 1 (On Campus - Parkville) | 12.5 |
OR admission into MC-IT Master of Technology, 100 or 150 point program in Distributed Computing or Computing
Corequisites
None
Non-allowed subjects
433-460 Human Language Technology
433-467 Text and Document Management
433-660 Human Language Technology
433-667 Text and Document Management
433-476 Text and Document Management
Inherent requirements (core participation requirements)
The University of Melbourne is committed to providing students with reasonable adjustments to assessment and participation under the Disability Standards for Education (2005), and the Assessment and Results Policy (MPF1326). Students are expected to meet the core participation requirements for their course. These can be viewed under Entry and Participation Requirements for the course outlines in the Handbook.
Further details on how to seek academic adjustments can be found on the Student Equity and Disability Support website: http://services.unimelb.edu.au/student-equity/home
Assessment
Additional details
- Up to 4 short programming assignments that will be due between week 3 and week 10, requiring approximately 20-30 hours of work (20% total).
- A final, open-ended research project requiring approximately 30-40 hours of work (30%), due at the end of the semester (week 11-12).
- One 2-hour end-of-semester examination (50%).
Hurdle requirement: To pass the subject, students must obtain at least:
- 50% overall
- 25/50 in the continuous assessment
- 25/50 in the end-of-semester written examination.
Intended Learning Outcomes (ILOs) 1 and 2 are addressed in the lectures, workshops, and exam; ILOs 3 and 4 are addressed in the assignments and project.
Dates & times
- Semester 1
Principal coordinator | Trevor Cohn |
---|---|
Mode of delivery | On Campus (Parkville) |
Contact hours | 36 hours, comprising one 2-hour lecture and one 1-hour workshop per week |
Total time commitment | 200 hours |
Teaching period | 4 March 2019 to 2 June 2019 |
Last self-enrol date | 15 March 2019 |
Census date | 31 March 2019 |
Last date to withdraw without fail | 10 May 2019 |
Assessment period ends | 28 June 2019 |
Time commitment details
200 hours
Further information
- Texts
Prescribed texts
There are no specifically prescribed or recommended texts for this subject.
- Subject notes
LEARNING AND TEACHING METHOD
The subject comprises a weekly 2-hour lecture followed by a 1-hour laboratory exercise. Weekly readings are assigned from relevant textbooks and the research literature, and weekly laboratory exercises are assigned. Additionally, a significant amount of project work is assigned.
INDICATIVE KEY LEARNING RESOURCES
At the beginning of the semester, the coordinator will post a list of readings taken from relevant textbooks as well as research literature and research monographs. An indicative source of relevant material is the textbook Speech and Language Processing by Dan Jurafsky and James H. Martin (2008).
CAREERS / INDUSTRY LINKS
A growing sector of the IT industry is concerned with leveraging the information that is locked up in semi-structured text data on the web. Large scale analysis and exploitation of this information depends on graduates with a solid grounding in natural language processing and text retrieval algorithms, and experience with implementing systems that are informed by the research literature.
- Related Handbook entries
This subject contributes to the following:
Type | Name |
---|---|
Course | Doctor of Philosophy - Engineering |
Course | Master of Philosophy - Engineering |
Course | Master of Science (Computer Science) |
Course | Master of Data Science |
Course | Ph.D.- Engineering |
Specialisation (formal) | Distributed Computing |
Specialisation (formal) | Computing |
Major | Computer Science |
Specialisation (formal) | Software |
- Available through the Community Access Program
About the Community Access Program (CAP)
This subject is available through the Community Access Program (also called Single Subject Studies) which allows you to enrol in single subjects offered by the University of Melbourne, without the commitment required to complete a whole degree.
Entry requirements including prerequisites may apply. Please refer to the CAP applications page for further information.
Additional information for this subject
Subject coordinator approval required
- Available to Study Abroad and/or Study Exchange Students
This subject is available to students studying at the University from eligible overseas institutions on exchange and study abroad. Students are required to satisfy any listed requirements, such as pre- and co-requisites, for enrolment in the subject.