Automatic image captioning

  • Author(s) / Creator(s)
  • Automatic image captioning is the difficult, multidisciplinary task of generating accurate and coherent textual descriptions for photographs. The process typically involves two main steps: image understanding and caption generation. Modern neural networks excel at the underlying computer vision and natural language processing tasks, but their memory and compute demands hinder deployment on resource-limited edge devices, so researchers have developed pruning and quantization algorithms that compress networks without compromising efficacy. This work presents an unconventional end-to-end compression pipeline for a CNN (convolutional neural network)-LSTM (long short-term memory)-based image captioning model, achieving a 73.1% reduction in model size, a 71.3% reduction in inference time, and a 7.7% increase in BLEU (bilingual evaluation understudy) score compared to the uncompressed model. Caption quality is evaluated with metrics such as BLEU and METEOR (metric for evaluation of translation with explicit ordering), which compare generated captions against human-written reference captions. More broadly, automatic image analysis aims to use computational algorithms to extract useful information from photographs, making analysis, interpretation, and manipulation faster, more precise, and more effective across many fields. (A minimal illustrative sketch of the pruning and quantization techniques mentioned here follows the record below.)

  • Date created
    2023
  • Subjects / Keywords
  • Type of Item
    Research Material
  • DOI
    https://doi.org/10.7939/r3-8xqr-m878
  • License
    Attribution-NonCommercial 4.0 International
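
The record above only summarizes the approach; the authors' actual compression pipeline is not included here. Below is a minimal, hypothetical PyTorch sketch of the two techniques the abstract names, magnitude pruning and post-training dynamic INT8 quantization, applied to a toy CNN-LSTM captioner. The ToyCaptioner class, its layer sizes, the 50% sparsity level, and the model_size_mb helper are illustrative assumptions, not details taken from the source.

```python
# Illustrative sketch only (not the authors' pipeline): prune a toy CNN-LSTM
# captioning model, then quantize its Linear/LSTM layers, and compare sizes.
import os
import tempfile

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


class ToyCaptioner(nn.Module):
    """CNN encoder (image understanding) + LSTM decoder (caption generation)."""

    def __init__(self, vocab_size=5000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        feats = self.encoder(images).unsqueeze(1)               # (B, 1, E)
        inputs = torch.cat([feats, self.embed(captions)], dim=1)
        out, _ = self.lstm(inputs)
        return self.fc(out)                                     # (B, T+1, V)


def model_size_mb(model):
    """Serialize the state dict to a temporary file and report its size in MB."""
    with tempfile.NamedTemporaryFile(suffix=".pt", delete=False) as f:
        torch.save(model.state_dict(), f)
    size_mb = os.path.getsize(f.name) / 1e6
    os.remove(f.name)
    return size_mb


model = ToyCaptioner()
print(f"original:   {model_size_mb(model):.1f} MB")

# 1) Unstructured L1 (magnitude) pruning: zero out 50% of the smallest weights
#    in the convolutional and fully connected layers (sparsity level is assumed).
for module in model.modules():
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# 2) Post-training dynamic quantization of the Linear and LSTM layers to INT8.
compressed = torch.quantization.quantize_dynamic(
    model, {nn.Linear, nn.LSTM}, dtype=torch.qint8
)
print(f"compressed: {model_size_mb(compressed):.1f} MB")
```

In this sketch the size reduction comes mainly from the INT8 quantization of the LSTM and Linear weights; the pruning step only zeroes weights in place, and a real pipeline would typically exploit that sparsity with sparse storage or structured pruning before deployment.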