Assignment 1: Finding modes

Deadline: Nov 30, 14:00 CET

In this assignment, we are interested in finding a mode or modes of a sample (typically of a probability distribution, it is just a list of numbers for our purposes).

A mode of a sample is a ‘peak’ you observe when you plot a histogram of the sample. For example, for the sample [4 3 7 9 6 1], the first (4) and the fourth (9) elements are modes. For this assignment, we define a mode as an element of the sample that is larger than its neighbors. More formally, ith element of a sequence a, a[i], is a mode if a[i-1] < a[i] and a[i] > a[i+1]. For the first element, we will relax the first requirement, only requiring than a[i] > a[i+1] (as in our example above). Similarly, for the last element, the condition a[i-1] < a[i] is enough for being a mode.

For simplicity, we further require/assume that the input sequence consists of unique numbers. That is, the same value is not repeated in the input. So there are no ‘plateaus’ or ‘ridges’ (in higher dimensions).

General instructions

Remember that the assignments are accepted only through git repositories (no email submissions). You need to work on this assignment and the later assignments by accepting the link provided at the private course repository. Note, however, to access the private course repository, you are required to complete the introductory assignment.

Please implement all exercises in the provided template, following the defined structure and API for the tasks.

Important: after completing your assignment, please tag it as final, and push the tag to your GitHub repository. Here is the reminder of how to do it on the command line:

git tag final
git push --tags

The repositories without the final tag will not be checked until the late assignment deadline.

Exercises

1.1 Implement an algorithm that finds a mode of a given sequence

Implement the function find_mode_linear() in the template, which performs a linear search for a mode and returns any of the modes of a given sample.

1.2 Finding the unique mode efficiently

Assuming there is only a single mode in the input, implement the function find_unique_mode() in the template which finds the mode using a more efficient algorithm than the linear search.

Tip: note the similarity of the problem with binary search.

1.3 Test and output average running times

Implement the time_search() function in the template that returns the average running time of multiple runs of the given search function over specified number of random samples with a single mode.

Please follow the template for details.

1.4 Finding a mode in 2D

Implement the functions find_mode_2d() in the template, which finds the/a mode of a 2D matrix. Make sure that the (2D) index returned is always a mode, which should naturally be the only mode of a unimodal distribution.

Note: for this exercise a greedy ‘hill-climbing’ algorithm is sufficient for our purposes. However, you are recommended to think about whether you can do better (in terms of worst-time complexity) given the input description; what sort of data would result in the worst-time complexity; and would you need any changes to your implementation if the input was allowed to have duplicate values (hence plateaus).