Home > Erasure Coding > Erasure Coding for Big Data

Erasure Coding for Big Data

1.3 Response of Coding Theory

To address this issue, coding theorists came up with two new classes of codes, known respectively as regenerating codes (RGC) [9] and locally recoverable codes (LRC) [10]. LRC also go by the name codes with locality and we will interchangeably use the two terms. These developments have led to the evolution of coding theory in two new directions (see Figure 8). While these developments also resulted in finding improved methods of repairing RS codes, we will not cover them in this survey and point the reader instead, to a representative set of references on this topic [11, 12, 13]. We will also not cover variants of regenerating codes such as the piggybacking framework introduced in [14], the $\epsilon-$MSR framework [15], codes with cooperative repair [16] and fractional-repetition codes [17].

Figure 5
Figure 5: Number of node failures over the period of a single month in a $3000$-node production cluster of Facebook. Image taken from the work of Sathiamoorthy et al. [7].
Figure 6
Figure 6: Cross-rack traffic generated during the reconstruction of the failed blocks in the production cluster of Facebook. Image taken from the work of Rashmi et al. [8].
Figure 7
Figure 7: Figure shows the conventional means of repairing a failed node in Facebook’s $[14,10]$ Reed-Solomon code. To repair a failed node storing $100$MB, in this example, the replacement node would have to download $1$TB of data.
Figure 8
Figure 8: Showing the two new branches of coding theory that have sprung up in response to the need for efficient handling of node failures in erasure-coded distributed storage.

The goal in the case of an RGC is to reduce the amount of data that has to be downloaded to repair a failed node, termed as the repair bandwidth. The aim in the case of an LRC is to minimize the number of helper nodes contacted for repair of a failed node. This is termed as the repair degree.

2 Regenerating Codes

Each code symbol in an RS code is an element in a finite field. Regenerating codes are codes over a vector alphabet. That is, each code symbol is a vector as opposed to a scalar. This property is key to enabling a regenerating code to achieve savings in repair bandwidth. This is illustrated in the example depicted in Figure 9.

2.1 Explaining the Need for Sub-packetization

In Figure 9, the setup on the left represents an $[4,2]$ MDS code. The symbols stored in the $4$ nodes are respectively, $A,B,A+B,A+2B$. This makes the code an MDS code. However, to repair a failed node, say the node $1$, that stored $A$, we still have to download $2$ symbols to repair the node. Consider next, the setup on the right. Here the sub-packetization level is $2$, each symbol is replaced by $2$ ‘half-symbols’. Thus $A$ is replaced by $A_1,A_2$, $B$ by $B_1,B_2$. Note that if the data stored in the remaining two parity nodes is as shown in the figure, then node $1$ can be repaired by downloading $3$ half-symbols in place of two full symbols, thereby achieving a reduction in repair bandwidth. Note however, that in the case of the regenerating code, we have contacted all the remaining nodes, $3$ in this case, as opposed to $k=2$ in the case of the MDS code on the left. Thus while regenerating codes reduce the repair bandwidth, they do in general, result in increased repair degree.

2.2 Formal Definition of a Regenerating Code

A regenerating code is characterized by the parameter set

\begin{eqnarray*}
\{(n,k,d), (\alpha,\beta), B, \mathbb{F}_q\ \}.
\end{eqnarray*}

The parameter $n$ denotes the block length of the code, which is the number of nodes that the data associated with a codeword in the RGC is stored across. Each node stores $\alpha$ symbols over $\mathbb{F}_q$. A value of $\alpha=1$ would indicate a scalar code, such as an RS code. Thus $\alpha$ is the level of sub-packetization of the code. The parameter $k$ indicates that the RGC has the ‘any $k$ of $n$’ property. This is illustrated in Figure 10, on the left. Node repair is accomplished by having the replacement of a failed node contact any $d$ of the remaining nodes, with $k \leq d \leq (n-1)$ and download $\beta$ symbols from each of the $d$ helper nodes (see Figure 10, on the right). The parameter $\beta$ is typically much smaller than $\alpha$. $B$ is the size of the file being stored, as measured in the number of $\mathbb{F}_q$ symbols. The savings in repair bandwidth comes about since $d \beta < k \alpha $.

2.3 Bound on File Size

A cut-set bound derived from network information-flow considerations [18] gives us the following relationship [9] between code parameters:

\begin{eqnarray*}
B & \leq & \sum_{i=0}^{k-1} \min\{\alpha, (d-i)\beta\}.
\end{eqnarray*}

Figure 10
Figure 9: Showing how breaking up a single scalar symbol into two smaller symbols helps improve the repair efficiency. This breaking up of a symbol is referred to as sub-packetization. The sub-packetization level equals $2$ here.
Figure 10
Figure 10: Illustrating data collection (left) and node repair (right) in a regenerating code.

Clearly, an RGC that achieves the above bound on file size with equality is an optimal RGC. Turns out that there are many flavours of optimality, in the sense that for a given file size $B$, there can be several values of the parameter pair $(\alpha,\beta)$ for which the bound holds with equality. Two special cases of the above bounds are shown below:

\begin{eqnarray}
B & \leq & k \alpha, \tag{1}\label{eq:msr}\\
B & \leq & \sum_{i=0}^{k-1} (d-i)\beta \ = \ kd\beta – \left(\begin{array}{@{}c@{}}k\\ 2\end{array}\right)\beta.\tag{2}\label{eq:mbr}
\end{eqnarray}

Codes which achieve the first upper bound are termed as Minimum Storage Regenerating (MSR) codes while those that achieve the second bound with equality are known as Minimum Bandwidth Regenerating (MBR) codes. MSR codes have the advantage of having the least possible storage overhead as they can be shown to belong to the class of MDS codes. MBR codes have the minimum possible repair bandwidth, but are not MDS. The minimum storage overhead of an MBR code is close to the value $2$.


Pages ( 2 of 6 ): « Previous1 2 34 ... 6Next »

Leave a Comment:

Your email address will not be published. Required fields are marked *