Privacy and Optimization for Distributed Machine Learning and Storage

  • Author / Creator
    Hu, Yaochen
  • We study how to apply optimization methods under realistic and challenging constraints, such as privacy and efficiency, and demonstrate our designs in two settings: distributed machine learning with distributed features and load balancing for erasure-coded cloud storage systems.

    First, we study an important distributed machine learning problem in which features are inherently distributed, or vertically partitioned, among multiple parties, and sharing raw data or model parameters among parties is prohibited due to privacy concerns. We propose an asynchronous stochastic gradient descent (SGD) algorithm for this feature distributed machine learning (FDML) problem, with theoretical convergence guarantees under bounded asynchrony. We also propose an alternating direction method of multipliers (ADMM) sharing framework, which is more robust than the SGD method, especially for data sets with high-dimensional feature spaces, and we establish convergence and iteration complexity results under non-convex loss. In addition, we apply a novel differential privacy technique to both algorithms to further protect data privacy.
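
    As a rough illustration of the feature-distributed setting, the sketch below trains a logistic regression model in which each party holds its own feature slice and weights and shares only a noise-perturbed scalar score. It is a minimal, synchronous, single-process simplification, not the asynchronous algorithm of the thesis. The names `local_output`, `train`, and `predict`, the data layout, the assumption that a coordinator holds the labels, and the Gaussian noise (which is not calibrated to any formal differential privacy guarantee) are all hypothetical choices made for this example.

```python
import math
import random

def local_output(weights, features, noise_std, rng):
    """One party's contribution: a scalar score plus Gaussian noise.

    Only this scalar leaves the party; raw features and weights stay local.
    """
    score = sum(w * x for w, x in zip(weights, features))
    return score + rng.gauss(0.0, noise_std)

def train(parties_X, labels, epochs=200, lr=0.1, noise_std=0.0, seed=0):
    """Logistic regression over vertically partitioned features.

    parties_X[k][i] is party k's feature slice for sample i (assumed layout).
    The coordinator is assumed to hold the labels.
    """
    rng = random.Random(seed)
    weights = [[0.0] * len(X[0]) for X in parties_X]
    for _ in range(epochs):
        for i, y in enumerate(labels):
            # Coordinator sums the (possibly noisy) local scores.
            z = sum(local_output(w, X[i], noise_std, rng)
                    for w, X in zip(weights, parties_X))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # derivative of the log loss w.r.t. the summed score
            # Each party updates its own weights from g and its own features.
            for w, X in zip(weights, parties_X):
                for j, x in enumerate(X[i]):
                    w[j] -= lr * g * x
    return weights

def predict(weights, parties_X, i):
    """Classify sample i from the sum of noise-free local scores."""
    z = sum(sum(w_j * x_j for w_j, x_j in zip(w, X[i]))
            for w, X in zip(weights, parties_X))
    return 1 if z > 0 else 0
```

    In this sketch the only values that cross party boundaries are the local scores and the scalar `g`, which is the property the feature-distributed setting requires; setting `noise_std` above zero trades some accuracy for perturbation of the shared scores.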

    Second, we study the load balancing and tail latency reduction problem for erasure-coded cloud storage systems, which lack flexibility in load balancing. We provide a new perspective by proactively and intelligently launching degraded reads, and we propose a variety of strategies for deciding when and where to launch them. We also address load balancing through block migration for statistical load balancing, and propose a local block migration scheme with a theoretical approximation ratio.
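
    The idea of a proactive degraded read can be sketched as follows: when the server holding a requested block is heavily loaded, the block is rebuilt from the other blocks in its stripe instead. This is only an illustration under stated assumptions. A single XOR parity block stands in for a general erasure code, and the fixed queue-length threshold, the `load` metric, and the function names are hypothetical, not the strategies proposed in the thesis.

```python
def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

def read_block(target, servers, load, threshold):
    """Return the target block, using a degraded read if its server is busy.

    servers maps server name -> stored block (data blocks plus one XOR parity).
    load maps server name -> queue length (an assumed load metric).
    """
    if load[target] <= threshold:
        return servers[target], "normal"
    # Degraded read: rebuild the block from all other blocks in the stripe.
    others = [blk for name, blk in servers.items() if name != target]
    return xor_blocks(others), "degraded"
```

    A degraded read costs more total I/O, since it touches every other block in the stripe, so a policy like this only helps when the target server is a clear hotspot; that trade-off is why the question of when and where to launch such reads matters.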

  • Subjects / Keywords
  • Graduation date
    Fall 2019
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.