# Computer Science   722   Course Outline

```COMPUTING PATTERNS IN STRINGS
```
```[Winter Term (January-April) 2014-2015]
```

## Instructor(s)

```Bill Smyth
```

## Course Assistance

```Please contact the instructor (see above).
```

## Schedule

```We meet

Mondays   	12:00 - 13:30   ITB-222
Thursdays      	12:00 - 13:30   ITB-222

The first class is on Thursday 08 January.
```

## Calendar Description

```This course deals with algorithms for finding "patterns"
in strings, patterns of three main kinds: specific, generic,
and intrinsic.  The importance of approximate patterns and
algorithms which identify them is made clear.  Applications
to DNA sequence analysis and other scientific areas are
emphasized.
```

## Course Objectives

```The study of string algorithms is increasingly important in
computer science, primarily due to its many applications,
in computational biology, data compression, cryptology,
and other areas.  Over the last 40 years a variety of
elegant algorithms have been discovered.  The objective of
this course is to provide students with some sense of this
mathematical elegance while at the same time making clear
the subject's relevance to applications.

See Calendar Description.
```

## Outline of Topics

```The topics are selected from the course text, Computing Patterns
in Strings, whose main components are given below:

PART I -- STRINGS & ALGORITHMS

Chapter  1 -- Properties of Strings
Chapter  2 -- Patterns?  What Patterns?
Chapter  3 -- Strings Famous & Infamous
Chapter  4 -- Good Algorithms & Good Test Data

PART II -- COMPUTING INTRINSIC PATTERNS

Chapter  5 -- Trees Derived from Strings
Chapter  6 -- Decomposing a String

PART III -- COMPUTING SPECIFIC PATTERNS

Chapter  7 -- Basic Algorithms
Chapter  8 -- Son of BM Rides Again!
Chapter  9 -- String Distance Algorithms
Chapter 10 -- Approximate Pattern-Matching
Chapter 11 -- Regular Expressions & Multiple Patterns

PART IV -- COMPUTING GENERIC PATTERNS

Chapter 12 -- Repetitions (Periodicity)
Chapter 13 -- Extensions of Periodicity

Please note: This course is highly mathematical in nature;
students taking it should both enjoy mathematics and have a good
undergraduate foundation in discrete mathematics (the mathematics
of computer science).
```

## Format

```See Schedule.
```

```The course will be run as a seminar course with heavy student
participation throughout the term: lecturing, presentations,
and problem-solving.  Student assessment will be based primarily
on a major written project that will also be presented to the
class.  The student's participation during the term may also
contribute to the final mark.
```

## Resources

```The lecturer's monograph, Computing Patterns in Strings, will be
the main course text.  It is available both in the bookstore and in
the library.  Numerous other references will also be made available
to students, both journal papers and books.  Among the books are

* Algorithms on Strings, Maxime Crochemore, Christophe Hancart
& Thierry Lecroq, Cambridge University Press (2007)
* Jewels of Stringology, Maxime Crochemore & Wojciech Rytter
World Scientific Publishing (2003)
* Algorithms on Strings, Trees & Sequences, Dan Gusfield
Cambridge University Press (1997)
* Introduction to Computational Molecular Biology
Joao Carlos Setubal & Joao Meidanis
PWS Publishing (1997)
* String Searching Algorithms, Graham A. Stephen
World Scientific Publishing (1994)

All of these books are available in the Thode Science Library.

Some useful files for testing purposes can be picked up at
the lecturer's website.
```

## Instructor Specific Information

```N/A
```

```Academic dishonesty consists of misrepresentation by deception or by other
fraudulent means and can result in serious consequences, e.g. the grade of
zero on an assignment, loss of credit with a notation on the transcript
suspension or expulsion from the university.

For information on the various kinds of academic dishonesty please refer to
the Academic Integrity Policy, specifically Appendix 3, located at

The following illustrates only three forms of academic dishonesty:
1. Plagiarism, e.g. the submission of work that is not one's own or for
which other credit has been obtained.
2. Improper collaboration in group work.
3. Copying or using unauthorized aids in tests and examinations.

The instructor in this course may be using a software package designed
to reveal plagiarism.  In such cases, students will be required to submit
their work electronically and in hard copy so that it can be checked
```

## Faculty Notices

```"The Faculty of Engineering is concerned with ensuring an
environment that is free of all discrimination.  If there
is a problem, individuals are reminded that they should
contact the Department Chair, the Sexual Harassment Officer
or the Human Rights Consultant, as the problem occurs."
```
```"The instructor and university reserve the right to modify elements of the
course during the term. The university may change the dates and deadlines
for any or all courses in extreme circumstances. If either type of
modification becomes necessary, reasonable notice and communication with the
students will be given with explanation and the opportunity to comment on
changes. It is the responsibility of the student to check their McMaster
email and course websites weekly during the term and to note any changes."

```

Outline revised: Jan 5, 2015
Script revised: Jan 26, 2010 J. Nakamura