Assignment 1: Finding modes
Deadline: Nov 30, 14:00 CET
In this assignment, we are interested in finding a mode or modes of a sample (typically of a probability distribution, it is just a list of numbers for our purposes).
A mode of a sample is a ‘peak’ you observe when you plot a histogram of the sample. For example, for the sample [4 3 7 9 6 1], the first (4) and the fourth (9) elements are modes. For this assignment, we define a mode as an element of the sample that is larger than its neighbors. More formally, ith element of a sequence a, a[i], is a mode if a[i-1] < a[i] and a[i] > a[i+1]. For the first element, we will relax the first requirement, only requiring than a[i] > a[i+1] (as in our example above). Similarly, for the last element, the condition a[i-1] < a[i] is enough for being a mode.
For simplicity, we further require/assume that the input sequence consists of unique numbers. That is, the same value is not repeated in the input. So there are no ‘plateaus’ or ‘ridges’ (in higher dimensions).
General instructions
Remember that the assignments are accepted only through git repositories (no email submissions). You need to work on this assignment and the later assignments by accepting the link provided at the private course repository. Note, however, to access the private course repository, you are required to complete the introductory assignment.
Please implement all exercises in the provided template, following the defined structure and API for the tasks.
Important: after completing your assignment,
please tag it as final
, and push the tag to your GitHub repository.
Here is the reminder of how to do it on the command line:
git tag final
git push --tags
The repositories without the final
tag will not be checked
until the late assignment deadline.
Exercises
1.1 Implement an algorithm that finds a mode of a given sequence
Implement the function find_mode_linear()
in the template,
which performs a linear search for a mode
and returns any of the modes of a given sample.
1.2 Finding the unique mode efficiently
Assuming there is only a single mode in the input,
implement the function find_unique_mode()
in the template
which finds the mode using a more efficient algorithm than the linear search.
Tip: note the similarity of the problem with binary search.
1.3 Test and output average running times
Implement the time_search()
function in the template
that returns the average running time of multiple runs of
the given search function
over specified number of random samples with a single mode.
Please follow the template for details.
1.4 Finding a mode in 2D
Implement the functions find_mode_2d()
in the template,
which finds the/a mode of a 2D matrix.
Make sure that the (2D) index returned is always a mode,
which should naturally be the only mode of a unimodal distribution.
Note: for this exercise a greedy ‘hill-climbing’ algorithm is sufficient for our purposes. However, you are recommended to think about whether you can do better (in terms of worst-time complexity) given the input description; what sort of data would result in the worst-time complexity; and would you need any changes to your implementation if the input was allowed to have duplicate values (hence plateaus).