A Coding Tale of a Tail at Scale

Dr. Emina Soljanin
Distinguished Member of Technical Staff, Bell Labs
Given on: April 3rd, 2014

Abstract

We study how coding in distributed storage reduces the download time of large files, in addition to providing reliability against disk failures. When a file is encoded to add redundancy and distributed across multiple disks, reading from only a subset of the disks is sufficient to reconstruct the original content. For the same total storage used, coding exploits the diversity and parallelism in the system better than simple replication, and hence gives faster downloads. We introduce a fork-join queuing framework to model multiple users requesting the content simultaneously, and demonstrate the trade-off between the download time and the amount of storage space.
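
The comparison between coding and replication can be illustrated with a small simulation. The sketch below is not the fork-join queuing model from the talk: it assumes a single request with no queuing, independent exponentially distributed read times with rate mu per block, and an idealized code in which any k of the n stored blocks suffice to rebuild the file. All function names and parameter values are hypothetical, chosen only for illustration.

```python
import random


def coded_download_time(n, k, mu, rng):
    """Read from all n disks in parallel; the file is rebuilt once the
    fastest k blocks arrive (assumes any k of n coded blocks suffice)."""
    times = sorted(rng.expovariate(mu) for _ in range(n))
    return times[k - 1]


def replicated_download_time(k, r, mu, rng):
    """File split into k blocks, each stored on r disks. A block arrives
    when its fastest replica finishes; the file needs all k blocks."""
    return max(min(rng.expovariate(mu) for _ in range(r)) for _ in range(k))


def expected_coded_time(n, k, mu):
    """Mean of the k-th smallest of n i.i.d. Exp(mu) read times."""
    return sum(1.0 / (mu * (n - i)) for i in range(k))


def simulate(trials=20000, seed=0, mu=1.0):
    """Compare both schemes on 4 disks, each scheme storing 2x the file size."""
    rng = random.Random(seed)
    coded = sum(coded_download_time(4, 2, mu, rng) for _ in range(trials)) / trials
    repl = sum(replicated_download_time(2, 2, mu, rng) for _ in range(trials)) / trials
    return coded, repl


if __name__ == "__main__":
    coded_mean, repl_mean = simulate()
    print(f"coded (4,2):        {coded_mean:.3f}")
    print(f"replicated (2 x 2): {repl_mean:.3f}")
    print(f"closed form, coded: {expected_coded_time(4, 2, 1.0):.3f}")
```

Both schemes here occupy four disks with half a file each. Under these assumptions the coded scheme finishes sooner on average, because it can use any two of the four responses, whereas replication must wait for one copy of each specific half.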

Biography

Emina Soljanin received the MS and PhD degrees from Texas A&M University, College Station, in 1989 and 1994, respectively, and the European Diploma degree from the University of Sarajevo, Bosnia, in 1986, all in Electrical Engineering. From 1986 to 1988, she worked at the Energoinvest Company, Bosnia, developing optimization algorithms and software for power systems control. After graduating from Texas A&M, she joined Bell Laboratories, Murray Hill, NJ, where she is now a Distinguished Member of Technical Staff in the Mathematics of Networks research department. Dr. Soljanin's research interests are in the broad area of coding and information theory and their applications. In the course of her twenty-year employment with Bell Labs, she has participated in a very wide range of research and business projects, including the first distance-enhancing codes to be implemented in commercial magnetic storage devices, the first forward error correction for Bell Labs optical transmission devices, color space quantization and color image processing, quantum computation, and link error prediction methods for hybrid ARQ wireless network standards. Her most recent activities are in the area of network and rateless coding for data transmission and storage.