-
Improving Hyperparameter Optimization with Checkpointed Model Weights
Authors:
Nikhil Mehta,
Jonathan Lorraine,
Steve Masson,
Ramanathan Arunachalam,
Zaid Pervaiz Bhat,
James Lucas,
Arun George Zachariah
Abstract:
When training deep learning models, the performance depends largely on the selected hyperparameters. However, hyperparameter optimization (HPO) is often one of the most expensive parts of model design. Classical HPO methods treat this as a black-box optimization problem. Gray-box HPO methods, which incorporate more information about the setup, have emerged as a promising direction for more efficient optimization, for example, by using intermediate loss evaluations to terminate bad selections. In this work, we propose an HPO method for neural networks using logged checkpoints of the trained weights to guide future hyperparameter selections. Our method, Forecasting Model Search (FMS), embeds weights into a Gaussian process deep kernel surrogate model, using a permutation-invariant graph metanetwork to be data-efficient with the logged network weights. To facilitate reproducibility and further research, we open-source our code at https://github.com/NVlabs/forecasting-model-search.
Submitted 26 June, 2024;
originally announced June 2024.
-
Data-centric Artificial Intelligence: A Survey
Authors:
Daochen Zha,
Zaid Pervaiz Bhat,
Kwei-Herng Lai,
Fan Yang,
Zhimeng Jiang,
Shaochen Zhong,
Xia Hu
Abstract:
Artificial Intelligence (AI) is making a profound impact in almost every domain. A vital enabler of its great success is the availability of abundant and high-quality data for building machine learning models. Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI. The attention of researchers and practitioners has gradually shifted from advancing model design to enhancing the quality and quantity of the data. In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals (training data development, inference data development, and data maintenance) and the representative methods. We also organize the existing literature from automation and collaboration perspectives, discuss the challenges, and tabulate the benchmarks for various tasks. We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle. We hope it can help the readers efficiently grasp a broad picture of this field, and equip them with the techniques and further research ideas to systematically engineer data for building AI systems. A companion list of data-centric AI resources will be regularly updated on https://github.com/daochenzha/data-centric-AI
Submitted 11 June, 2023; v1 submitted 17 March, 2023;
originally announced March 2023.
-
Data-centric AI: Perspectives and Challenges
Authors:
Daochen Zha,
Zaid Pervaiz Bhat,
Kwei-Herng Lai,
Fan Yang,
Xia Hu
Abstract:
The role of data in building AI systems has recently been significantly magnified by the emerging concept of data-centric AI (DCAI), which advocates a fundamental shift from model advancements to ensuring data quality and reliability. Although our community has continuously invested efforts into enhancing data in different aspects, these efforts are often isolated initiatives on specific tasks. To facilitate the collective initiative in our community and push forward DCAI, we draw a big picture and bring together three general missions: training data development, inference data development, and data maintenance. We provide a top-level discussion on representative DCAI tasks and share perspectives. Finally, we list open challenges. More resources are summarized at https://github.com/daochenzha/data-centric-AI
Submitted 2 April, 2023; v1 submitted 12 January, 2023;
originally announced January 2023.
-
BED: A Real-Time Object Detection System for Edge Devices
Authors:
Guanchu Wang,
Zaid Pervaiz Bhat,
Zhimeng Jiang,
Yi-Wei Chen,
Daochen Zha,
Alfredo Costilla Reyes,
Afshin Niktash,
Gorkem Ulkar,
Erman Okman,
Xuanting Cai,
Xia Hu
Abstract:
Deploying deep neural networks (DNNs) on edge devices provides efficient and effective solutions for real-world tasks. Edge devices have been used for collecting a large volume of data efficiently in different domains. DNNs have been an effective tool for data processing and analysis. However, designing DNNs on edge devices is challenging due to the limited computational resources and memory. To tackle this challenge, we demonstrate an Object Detection System for Edge Devices (BED) on the MAX78000 DNN accelerator. It integrates on-device DNN inference with a camera and an LCD display for image acquisition and detection exhibition, respectively. BED is a concise, effective and detailed solution, including model training, quantization, synthesis and deployment. The entire repository is open-sourced on GitHub, including a Graphical User Interface (GUI) for on-chip debugging. Experimental results indicate that BED can produce accurate detection with a 300-KB tiny DNN model, which takes only 91.9 ms of inference time and 1.845 mJ of energy. The real-time detection is available on YouTube.
Submitted 25 September, 2022; v1 submitted 14 February, 2022;
originally announced February 2022.
-
AutoVideo: An Automated Video Action Recognition System
Authors:
Daochen Zha,
Zaid Pervaiz Bhat,
Yi-Wei Chen,
Yicheng Wang,
Sirui Ding,
Jiaben Chen,
Kwei-Herng Lai,
Mohammad Qazim Bhat,
Anmoll Kumar Jain,
Alfredo Costilla Reyes,
Na Zou,
Xia Hu
Abstract:
Action recognition is an important task for video understanding with broad applications. However, developing an effective action recognition solution often requires extensive engineering efforts in building and testing different combinations of the modules and their hyperparameters. In this demo, we present AutoVideo, a Python system for automated video action recognition. AutoVideo features 1) a highly modular and extendable infrastructure following the standard pipeline language, 2) an exhaustive list of primitives for pipeline construction, 3) data-driven tuners to save the efforts of pipeline tuning, and 4) an easy-to-use Graphical User Interface (GUI). AutoVideo is released under the MIT license at https://github.com/datamllab/autovideo
Submitted 16 July, 2022; v1 submitted 9 August, 2021;
originally announced August 2021.