Hardware Accelerators for Deep Neural Networks

  • Author / Creator
    Machupalli, Raju
  • Deep Neural Networks (DNNs) have recently emerged as the state-of-the-art method for machine learning applications such as object detection, face recognition, and image classification. However, a DNN typically has high computational complexity, and specialized hardware accelerators help to obtain real-time performance.
    Over the last decade, many accelerators for DNN models have been proposed in the literature. This thesis presents a comprehensive review of the existing DNN accelerators. The accelerators are classified into four categories (ALU, Dataflow, Sparsity, and Hybrid) based on the optimization techniques used. The classification provides a good starting point for identifying significant areas where an accelerator can be further optimized for better throughput, latency, and energy performance.
    In this thesis, we also explore the bit-precision requirement of the multiply-accumulate (MAC) units for DNN implementation. A DNN has two modes of operation: training and inference. It is generally known that inference can be done using lower-precision MAC units, whereas training requires higher-precision MAC units. Lower-precision MAC units consume less energy, which may be desirable for low-power applications. We propose an iterative MAC model in which inference is done using a low-precision MAC in a single pass, and training is done with the same low-precision MAC using multiple passes (to achieve higher bit precision). During training, the proposed model determines the number of iterations on the fly by checking the error magnitude. Experimental results with a LeNet-300-100 model implemented using the iterative MAC show satisfactory performance for digit classification.
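
The idea of reusing one narrow multiplier over several passes can be illustrated with a minimal Python sketch. This is not the design from the thesis: the 4-bit multiplier width, the unsigned operands, and the names `low_precision_mul` and `iterative_mul` are all assumptions made for illustration, and the on-the-fly choice of the number of passes from the error magnitude is not modeled here (the caller picks `passes` by hand).

```python
def low_precision_mul(a: int, b: int, bits: int = 4) -> int:
    """Model a narrow unsigned multiplier (hypothetical): operands must fit in `bits` bits."""
    limit = 1 << bits
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError("operand exceeds the multiplier width")
    return a * b


def iterative_mul(x: int, y: int, passes: int, bits: int = 4) -> int:
    """Multiply two unsigned (2 * bits)-wide values using 1 to 4 narrow passes.

    Partial products are ordered from most to least significant, so stopping
    early yields a coarse product and running all four passes yields the exact one.
    """
    mask = (1 << bits) - 1
    x_hi, x_lo = x >> bits, x & mask
    y_hi, y_lo = y >> bits, y & mask

    # (operand a, operand b, left shift) for each partial product
    partials = [
        (x_hi, y_hi, 2 * bits),
        (x_hi, y_lo, bits),
        (x_lo, y_hi, bits),
        (x_lo, y_lo, 0),
    ]
    if not 1 <= passes <= len(partials):
        raise ValueError("passes must be between 1 and 4")

    acc = 0
    for a, b, shift in partials[:passes]:
        # Each pass reuses the same narrow multiplier and accumulates the result.
        acc += low_precision_mul(a, b, bits) << shift
    return acc
```

Under these assumptions, `iterative_mul(200, 100, passes=1)` gives the coarse product 18432, while `iterative_mul(200, 100, passes=4)` gives the exact product 20000. The single-pass case stands in for low-precision inference and the multi-pass case for higher-precision training.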

  • Subjects / Keywords
  • Graduation date
    Fall 2023
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-gywx-8e23
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.