Locally Repairable Linear Block Codes for Distributed Storage Systems

  • Author / Creator
    Shahabinejad, Mostafa
  • Distributed storage systems (DSSs) traditionally replicate data blocks to achieve storage reliability. Considering the rapid growth of data volume as well as costly maintenance of storage components in DSSs, the replication method is becoming unattractive because of its very large storage overhead. Recently, locally repairable codes (LRCs) have been proposed and used in practice e.g., in Facebook HDFS-RAID and Windows Azure storage. LRCs are attractive because they i) significantly decrease the storage overhead of the replication method and ii) considerably decrease communications traffic for data recovery compared to traditional coding methods—such as Reed-Solomon codes. An LRC can reconstruct any coded block by accessing a small number of other coded blocks. The minimum number of blocks required to reconstruct a missing block is defined as the block’s locality. The maximum locality of all blocks is defined as code locality. Similarly, the average locality of a code is defined as the average locality of its blocks. The main focus of this dissertation is on studying and designing LRCs with low computational complexity and on LRCs with small average locality. Because of immense size of modern energy-hungry DSSs, reducing the com- putational complexity of coding methods is of great importance. In that regard, binary LRCs are attractive because they eliminate the need for multiplication in operations such as encoding, decoding, and reconstruction. In the first part of this dissertation, we propose a class of binary LRCs. Using storage overhead and reliability of the code as design metrics, we show that some instances of our proposed binary codes are optimal, while others sacrifice the storage overhead marginally to gain a low coding complexity. Also, by analyzing mean-time to data-loss as a reliability metric, we verify that the reliability of our proposed binary LRC is more than sufficient from a practical point of view. In the second part of the dissertation, we study average locality of LRCs. We derive lower bounds on average locality and design three classes of LRCs with minimum average locality. We also establish an achievable lower bound on the average locality of information blocks of LRCs, and design LRCs that achieve the obtained bound. Minimizing the average locality of information blocks is important as only information blocks are needed to be recovered during a temporal node unavailability, which accounts for 90% of all block recoveries triggered in DSSs.

  • Subjects / Keywords
  • Graduation date
    Spring 2018
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.