
Showing 1–19 of 19 results for author: Jones, A K

  1. arXiv:2403.04976

    cs.DC

    Towards Data-center Level Carbon Modeling and Optimization for Deep Learning Inference

    Authors: Shixin Ji, Zhuoping Yang, Xingzhen Chen, Jingtong Hu, Yiyu Shi, Alex K. Jones, Peipei Zhou

    Abstract: Recently, the increasing need for computing resources has led to the prosperity of data centers, which poses challenges to the environmental impacts and calls for improvements in data center provisioning strategies. In this work, we show a comprehensive analysis based on profiling a variety of deep-learning inference applications on different generations of GPU servers. Our analysis reveals severa…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: 12 pages, 9 figures

  2. arXiv:2401.16694

    cs.LG cs.CV cs.DC

    EdgeOL: Efficient in-situ Online Learning on Edge Devices

    Authors: Sheng Li, Geng Yuan, Yawen Wu, Yue Dai, Chao Wu, Alex K. Jones, Jingtong Hu, Yanzhi Wang, Xulong Tang

    Abstract: Emerging applications, such as robot-assisted eldercare and object recognition, generally employ deep learning neural networks (DNNs) and naturally require: i) handling streaming-in inference requests and ii) adapting to possible deployment scenario changes. Online model fine-tuning is widely adopted to satisfy these needs. However, an inappropriate fine-tuning scheme could involve significant ene…

    Submitted 15 March, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  3. SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration

    Authors: Jinming Zhuang, Zhuoping Yang, Shixin Ji, Heng Huang, Alex K. Jones, Jingtong Hu, Yiyu Shi, Peipei Zhou

    Abstract: With the increase in the computation intensity of the chip, the mismatch between computation layer shapes and the available computation resource significantly limits the utilization of the chip. Driven by this observation, prior works discuss spatial accelerators or dataflow architecture to maximize the throughput. However, using spatial accelerators could potentially increase the execution latenc…

    Submitted 18 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Journal ref: 2024 ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA '24)

  4. arXiv:2401.06270

    cs.DC

    SCARIF: Towards Carbon Modeling of Cloud Servers with Accelerators

    Authors: Shixin Ji, Zhuoping Yang, Xingzhen Chen, Stephen Cahoon, Jingtong Hu, Yiyu Shi, Alex K. Jones, Peipei Zhou

    Abstract: Embodied carbon has been widely reported as a significant component in the full system lifecycle of various computing systems' greenhouse gas emissions. Many efforts have been undertaken to quantify the elements that comprise this embodied carbon, from tools that evaluate semiconductor manufacturing to those that can quantify different elements of the computing system from commercial and academic…

    Submitted 22 May, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: 6 pages; 6 figures; 3 tables. Accepted by ISVLSI '24

  5. arXiv:2312.02991

    cs.AR

    REFRESH FPGAs: Sustainable FPGA Chiplet Architectures

    Authors: Peipei Zhou, Jinming Zhuang, Stephen Cahoon, Yue Tang, Zhuoping Yang, Xingzhen Chen, Yiyu Shi, Jingtong Hu, Alex K. Jones

    Abstract: There is a growing call for greater amounts of increasingly agile computational power for edge and cloud infrastructure to serve the computationally complex needs of ubiquitous computing devices. Thus, an important challenge is addressing the holistic environmental impacts of these next-generation computing systems. To accomplish this, a life-cycle view of sustainability for computing advancements…

    Submitted 27 November, 2023; originally announced December 2023.

  6. arXiv:2309.12275

    cs.AR

    AIM: Accelerating Arbitrary-precision Integer Multiplication on Heterogeneous Reconfigurable Computing Platform Versal ACAP

    Authors: Zhuoping Yang, Jinming Zhuang, Jiaqi Yin, Cunxi Yu, Alex K. Jones, Peipei Zhou

    Abstract: Arbitrary-precision integer multiplication is the core kernel of many applications in simulation, cryptography, etc. Existing acceleration of arbitrary-precision integer multiplication includes CPUs, GPUs, FPGAs, and ASICs. Among these accelerators, FPGAs promise to provide both good energy efficiency and flexibility. Surprisingly, in our implementations, FPGA has the lowest energy efficiency…

    Submitted 21 September, 2023; originally announced September 2023.

  7. arXiv:2207.01209

    cs.AR cs.AI

    Sustainable AI Processing at the Edge

    Authors: Sébastien Ollivier, Sheng Li, Yue Tang, Chayanika Chaudhuri, Peipei Zhou, Xulong Tang, Jingtong Hu, Alex K. Jones

    Abstract: Edge computing is a popular target for accelerating machine learning algorithms supporting mobile devices without requiring the communication latencies to handle them in the cloud. Edge deployments of machine learning primarily consider traditional concerns such as SWaP constraints (Size, Weight, and Power) for their installations. However, such metrics are not entirely sufficient to consider envi…

    Submitted 4 July, 2022; originally announced July 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  8. arXiv:2205.12494

    cs.ET

    A Multi-domain Magneto Tunnel Junction for Racetrack Nanowire Strips

    Authors: Prayash Dutta, Albert Lee, Kang L. Wang, Alex K. Jones, Sanjukta Bhanja

    Abstract: Domain-wall memory (DWM) has SRAM class access performance, low energy, high endurance, high density, and CMOS compatibility. Recently, shift reliability and processing-using-memory (PuM) proposals developed a need to count the number of parallel or anti-parallel domains in a portion of the DWM nanowire. In this paper we propose a multi-domain magneto-tunnel junction (MTJ) that can detect differen…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: This paper is under review for possible publication by the IEEE

  9. DNA Pre-alignment Filter using Processing Near Racetrack Memory

    Authors: Fazal Hameed, Asif Ali Khan, Sebastien Ollivier, Alex K. Jones, Jeronimo Castrillon

    Abstract: Recent DNA pre-alignment filter designs employ DRAM for storing the reference genome and its associated meta-data. However, DRAM incurs increasingly high background and refresh energy consumption as devices scale. To overcome this problem, this paper explores a design with racetrack memory (RTM)--an emerging non-volatile memory that promises higher storage density, faster access latency, an…

    Submitted 4 May, 2022; originally announced May 2022.

    Report number: Volume 21, Issue 2

    Journal ref: IEEE Computer Architecture Letters 2022

  10. arXiv:2204.13788

    cs.ET cs.AR

    FPIRM: Floating-point Processing in Racetrack Memories

    Authors: Sébastien Ollivier, Xinyi Zhang, Yue Tang, Chayanika Choudhuri, Jingtong Hu, Alex K. Jones

    Abstract: Convolutional neural networks (CNN) have become a ubiquitous algorithm with growing applications in mobile and edge settings. We describe a compute-in-memory (CIM) technique called FPIRM using Racetrack Memory (RM) to accelerate CNNs for edge systems. Using transverse read, a technique that can determine the number of '1's in multiple adjacent domains, FPIRM can efficiently implement multi-operand bu…

    Submitted 1 August, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: This paper is accepted to the IEEE Micro Magazine with the title "POD-RACING: Bulk-Bitwise to Floating-point Compute In Racetrack Memory for Machine Learning at the Edge"

  11. Pinning Fault Mode Modeling for DWM Shifting

    Authors: Kawsher Roxy, Stephen Longofono, Sebastien Ollivier, Sanjukta Bhanja, Alex K. Jones

    Abstract: Extreme scaling for purposes of achieving higher density and lower energy continues to increase the probability of memory faults. For domain wall (DW) memories, misalignment faults arise when aligning domains with access points. A previously understudied type of shifting fault, a pinning fault, may occur due to non-uniform pinning potential distribution caused by notches with fabrication imperfecti…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: IEEE Transactions on Circuits and Systems--II, 2022

  12. XDWM: A 2D Domain Wall Memory

    Authors: Arifa Hoque, Alex K. Jones, Sanjukta Bhanja

    Abstract: Domain-Wall Memory (DWM) structures typically bundle nanowires shifted together for parallel access. Ironically, this organization does not allow the natural shifting of DWM to realize logical shifting within data elements. We describe a novel 2-D DWM cross-point (X-Cell) that allows two individual nanowires placed orthogonally to share the X-Cell. Each nanowire can operate independently…

    Submitted 23 December, 2021; originally announced December 2021.

    Comments: in IEEE Transactions on Nanotechnology

    Journal ref: IEEE Transactions on Nanotechnology

  13. arXiv:2112.01658

    cs.AR

    Virtual Coset Coding for Encrypted Non-Volatile Memories with Multi-Level Cells

    Authors: Stephen Longofono, Seyed Mohammad Seyedzadeh, Alex K. Jones

    Abstract: PCM is a popular backing memory for DRAM main memory in tiered memory systems. PCM has asymmetric access energy; writes dominate reads. MLC asymmetry can vary by an order of magnitude. Many schemes have been developed to take advantage of the asymmetric patterns of 0s and 1s in the data to reduce write energy. Because the memory is non-volatile, data can be recovered via physical attack or across…

    Submitted 2 December, 2021; originally announced December 2021.

    Comments: Preprint: Accepted to HPCA 2022

  14. arXiv:2111.02246

    cs.LG cs.AR cs.ET

    Brain-inspired Cognition in Next Generation Racetrack Memories

    Authors: Asif Ali Khan, Sebastien Ollivier, Stephen Longofono, Gerald Hempel, Jeronimo Castrillon, Alex K. Jones

    Abstract: Hyperdimensional computing (HDC) is an emerging computational framework inspired by the brain that operates on vectors with thousands of dimensions to emulate cognition. Unlike conventional computational frameworks that operate on numbers, HDC, like the brain, uses high dimensional random vectors and is capable of one-shot learning. HDC is based on a well-defined set of arithmetic operations and i…

    Submitted 15 March, 2022; v1 submitted 3 November, 2021; originally announced November 2021.

    Comments: Preprint, accepted for publication, ACM Transactions on Embedded Computing Systems. ACM Trans. Embed. Comput. Syst. (March 2022)

  15. arXiv:2108.01202

    cs.ET

    PIRM: Processing In Racetrack Memories

    Authors: Sebastien Ollivier, Stephen Longofono, Prayash Dutta, Jingtong Hu, Sanjukta Bhanja, Alex K. Jones

    Abstract: The growth in data needs of modern applications has created significant challenges for modern systems leading to a "memory wall." Spintronic Domain Wall Memory (DWM), related to Spin-Transfer Torque Memory (STT-MRAM), provides near-SRAM read/write performance, energy savings and nonvolatility, potential for extremely high storage density, and does not have significant endurance limitations. However,…

    Submitted 1 August, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: This paper is accepted to the IEEE/ACM Symposium on Microarchitecture, October 2022 under the title "CORUSCANT: Fast Efficient Processing-in-Racetrack Memories"

  16. arXiv:2005.01588

    cs.CY

    Workshops on Extreme Scale Design Automation (ESDA) Challenges and Opportunities for 2025 and Beyond

    Authors: R. Iris Bahar, Alex K. Jones, Srinivas Katkoori, Patrick H. Madden, Diana Marculescu, Igor L. Markov

    Abstract: Integrated circuits and electronic systems, as well as design technologies, are evolving at a great rate -- both quantitatively and qualitatively. Major developments include new interconnects and switching devices with atomic-scale uncertainty, the depth and scale of on-chip integration, electronic system-level integration, the increasing significance of software, as well as more effective means o…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: A Computing Community Consortium (CCC) workshop report, 32 pages

    Report number: ccc2014report_1

  17. arXiv:1806.02498

    cs.AR

    Mitigating Wordline Crosstalk using Adaptive Trees of Counters

    Authors: Seyed Mohammad Seyedzadeh, Alex K. Jones, Rami Melhem

    Abstract: High access frequency of certain rows in the DRAM may cause data loss in cells of physically adjacent rows due to crosstalk. The malicious exploit of this crosstalk by repeatedly accessing a row to induce this effect is known as row hammering. Additionally, inadvertent row hammering may also occur due to the natural weighted nature of applications' access patterns. In this paper, we analyze the…

    Submitted 6 June, 2018; originally announced June 2018.

    Comments: 12 pages

  18. arXiv:1711.08572

    cs.AR

    Enabling Fine-Grain Restricted Coset Coding Through Word-Level Compression for PCM

    Authors: Seyed Mohammad Seyedzadeh, Alex K. Jones, Rami Melhem

    Abstract: Phase change memory (PCM) has recently emerged as a promising technology to meet the fast growing demand for large capacity memory in computer systems, replacing DRAM that is impeded by physical limitations. Multi-level cell (MLC) PCM offers high density with low per-byte fabrication cost. However, despite many advantages, such as scalability and low leakage, the energy for programming intermediat…

    Submitted 22 November, 2017; originally announced November 2017.

    Comments: 12 pages

  19. arXiv:1710.08940

    cs.ET

    A Variable Length Coding Framework for Cost Function Reduction in Non-Volatile Memory Systems

    Authors: Seyed Mohammad Seyedzadeh, Alex K. Jones, Rami Melhem

    Abstract: Variable length coding for Non-Volatile Memory (NVM) technologies is a promising method to improve memory capacity and system performance through compressing memory blocks. However, compression techniques used to improve capacity or bandwidth utilization do not take into consideration the asymmetric costs of writing 1's and 0's in NVMs. Taking into account this asymmetry, we propose a variable len…

    Submitted 24 October, 2017; originally announced October 2017.

    Comments: NVMW2017