# Chapter 8: How Did Yeast Become a Wine-Maker?

### (Coursera Week 1)

*decreases*during the course of an experiment?

We are interested in all genes that may be implicated in the diauxic shift. If the expression of a gene decreases during the diauxic shift, then chances are it is still relevant to the diauxic shift. For example, the gene may have to be repressed in order to enable the diauxic shift.

**FarthestFirstTraversal**to solve the

*k*-Means Clustering Problem?

Yes. Simply substitute *d*(*DataPoint*, *Centers*) by *Distortion*(*DataPoint*, *Centers*) in the psesudocode for **FarthestFirstTraversal**.

To make sense of a gene expression matrix, biologists often rearrange its genes so that similar expression vectors correspond to consecutive rows in the gene expression matrix. Then, they often assign colors to each element of the gene expression matrix according to their expression levels, resulting in a **heatmap** (shown below).

**Figure:** The heatmap of a gene expression matrix. The rows and columns of the gene expression matrix are ordered according to the order of leaves constructed using hierarchical clustering.

*k*clusters are there?

Let {*n*, *k*} denote the number of partitions of *n* points into *k* clusters. We will establish a recurrence relation to compute {*n*, *k*}. To do so, select an arbitrary point *x*, and note that there are two possibilities. First, *x* could belong to a cluster all by itself. In this case, the remaining points form *k* - 1 clusters, and we know that there are {*n*-1, *k*-1} ways to partition them. Second, *x* could belong to a cluster with other points. In this case, there are {*n*-1, *k*} ways to partition the other points into *k* clusters, and then there are *k* different choices for the cluster to which *x* belongs.

Combining these two cases results in the following recurrence relation for all *n* > 1 and all *k* satisfying 1 < *k* < *n*:

*n*,

*k*} = {

*n*-1,

*k*-1} +

*k*· {

*n*-1,

*k*}

As for a base case, note that {*n*, 1} = 1 for all *n *≥ 1 because there is only one way to partition *n* points into a single cluster. By the same reasoning, {*n, n*} = 1 for all *n *≥ 1.

**Exercise Break:** Find a formula for {*n, *2} in terms of *n*.

The numbers {*n*, *k*} are called **Stirling numbers of the second kind**.

Given two vectors *a* = (*a*_{1}, …, *a*_{n}) and *b* = (*b*_{1}, …, *b*_{n}), their **dot product** is defined as the sum of pairwise products of their elements:

*a*·

*b*= Σ

^{n}

_{i=1}

*a*

_{i}·

*b*

_{i}=

*a*

_{1}·

*b*

_{1}+

*a*

_{2}·

*b*

_{2}+ ... +

*a*

_{n}·

*b*

_{n}

Note that the two occurrences of the "·" symbol in this equation mean different things. On the left side, it represents the dot product of vectors, which we have just defined. On the right side, it indicates a normal product of two numbers *a*_{i} and *b*_{i}.

### (Coursera Week 2)

*k*-Means Clustering Problem. Does the

*soft*

*k*-means clustering algorithm converge to an optimal solution?

No, not necessarily.