THE UNIVERSITY of EDINBURGH

DEGREE REGULATIONS & PROGRAMMES OF STUDY 2011/2012

Undergraduate Course: Foundations of Natural Language Processing (INFR09028)

Course Outline
School School of Informatics
College College of Science and Engineering
Course type Standard
Availability Available to all students
Credit level (Normal year taken) SCQF Level 9 (Year 3 Undergraduate)
Credits 10
Home subject area Informatics
Other subject area None
Course website http://www.inf.ed.ac.uk/teaching/courses/fnlp
Taught in Gaelic? No
Course description This course covers some of the linguistic and algorithmic foundations of natural language processing. It builds on the material introduced in Informatics 2A and aims to equip students for more advanced NLP courses in years 3 or 4. The course is strongly empirical, using corpus data to illustrate both core linguistic concepts and algorithms, including language modeling, part of speech tagging, syntactic processing, the syntax-semantics interface, and aspects of semantic processing. Linguistic and algorithmic content will be interleaved throughout the course.
Entry Requirements
Pre-requisites Students MUST have passed: Informatics 2A - Processing Formal and Natural Languages (INFR08008) OR Informatics Research Review (INFR11034)
Co-requisites
Prohibited Combinations Students MUST NOT also be taking Advanced Natural Language Processing (INFR11059)
Other requirements None
Additional Costs None
Information for Visiting Students
Pre-requisites None
Displayed in Visiting Students Prospectus? Yes
Course Delivery Information
Delivery period: 2011/12 Semester 2, Available to all students (SV1) WebCT enabled: No Quota: None
Location Activity Description Weeks Monday Tuesday Wednesday Thursday Friday
Central Lecture 1-11 10:00 - 10:50
Central Lecture 1-11 10:00 - 10:50
First Class First class information not currently available
Exam Information
Exam Diet Paper Name Hours:Minutes
Main Exam Diet S2 (April/May) 2:00
Resit Exam Diet (August) 2:00
Summary of Intended Learning Outcomes
1 - Given an appropriate NLP problem, students should be able to select a corpus and an annotation scheme for the problem and justify the choice over other candidates.
2 - Students should also be able to identify suitable evaluation measures for the problem and provide a written explanation of the role of annotated corpora in natural language processing.
3 - Given one of the main linguistic issues relevant to NLP (including the representation and induction of syntactic knowledge, the modelling of lexical and semantic information, and the syntax-semantics interface), students should be able to construct an example of the issue and explain how their example illustrates the issue in general.
4 - Given an example of one of the main linguistic issues identified above, students should be able to classify it as belonging to that issue and relate the example to the issue in general.
5 - Given an NLP problem, students should be able to analyse, assess and justify which algorithms are most appropriate for solving the problem, based on an understanding of fundamental algorithms such as the Viterbi algorithm, the inside-outside algorithm, and chart-based parsing and generation.
Assessment Information
Written Examination 70
Assessed Assignments 30
Oral Presentations 0

Assessment
Two assignments involving both programming and short essays.

If delivered in semester 1, this course will have an option for semester 1 only visiting undergraduate students, providing assessment prior to the end of the calendar year.
Special Arrangements
None
Additional Information
Academic description Not entered
Syllabus
1. Creating annotated corpora:
* markup, annotation
* evaluation measures
* corpora and the web

2. Lexicon and lexical processing:
* language modeling
* Hidden Markov Models
* part of speech tagging (e.g., for a language other than English) to illustrate HMMs
* Viterbi algorithm
* smoothing
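
As an illustration of how the Viterbi algorithm finds the best tag sequence under a Hidden Markov Model, the following is a minimal Python sketch. It is not taken from the course materials: the function name `viterbi`, the three-tag tag set, and all probabilities in `START_P`, `TRANS_P` and `EMIT_P` are made-up toy values chosen for this example (and it uses English for readability). No smoothing is applied, so unseen words or transitions simply get probability zero.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden state sequence for the observations."""

    def log(p):
        # Work in log space to avoid underflow; zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p.get(s, 0.0)) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            # Choose the previous state that maximises the path score into s.
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r].get(s, 0.0))) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model (hypothetical values, for illustration only).
STATES = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
TRANS_P = {
    "DET": {"NOUN": 0.9, "VERB": 0.1},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 1.0},
    "NOUN": {"dog": 0.6, "barks": 0.4},
    "VERB": {"dog": 0.3, "barks": 0.7},
}

if __name__ == "__main__":
    print(viterbi(["the", "dog", "barks"], STATES, START_P, TRANS_P, EMIT_P))
```

In a realistic tagger the start, transition and emission tables would be estimated from an annotated corpus and smoothed, so that unseen events do not receive zero probability.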

3. Syntax and syntactic processing:
* revision of context-free grammars and chart parsing
* syntactic concepts: constituency, subcategorization, bounded and unbounded dependencies, feature representations
* lexicalized grammar formalisms (e.g., TAG, CCG, dependency grammar)
* treebanks: lexicalized grammars and corpus annotation

4. Semantics and semantic processing:
* compositionality
* argument structure
* word sense disambiguation
* anaphora resolution
* treebanks: argument structure, WSD (e.g., Propbank, Semcor)

Relevant QAA Computing Curriculum Sections: Not yet available
Transferable skills Not entered
Reading list Jurafsky and Martin, Speech and Language Processing, 2nd edition, 2008.
Study Abroad Not entered
Study Pattern Lectures 20
Tutorials 8
Timetabled Laboratories 0
Non-timetabled assessed assignments 30
Private Study/Other 42
Total 100
Keywords Not entered
Contacts
Course organiser Dr Marcelo Cintra
Tel: (0131 6)50 5118
Email: mc@inf.ed.ac.uk
Course secretary Miss Tamise Totterdell
Tel: 0131 650 9970
Email: t.totterdell@ed.ac.uk
copyright 2011 The University of Edinburgh - 3 April 2011 11:19 am