subscribe to arXiv mailings

Accelerating Decision Diagram-based Multi-node Quantum Simulation with Ring Communication and Automatic SWAP Insertion

Authors: Yusuke Kimura, Shaowen Li, Hiroyuki Sato, Masahiro Fujita

Abstract: An N-bit quantum state requires a vector of length $2^N$, leading to an exponential increase in the required memory with N in conventional statevector-based quantum simulators. A proposed solution to this issue is the decision diagram-based quantum simulator, which can significantly decrease the necessary memory and is expected to operate faster for specific quantum circuits. However, decision dia… ▽ More An N-bit quantum state requires a vector of length $2^N$, leading to an exponential increase in the required memory with N in conventional statevector-based quantum simulators. A proposed solution to this issue is the decision diagram-based quantum simulator, which can significantly decrease the necessary memory and is expected to operate faster for specific quantum circuits. However, decision diagram-based quantum simulators are not easily parallelizable because data must be manipulated dynamically, and most implementations run on one thread. This paper introduces ring communication-based optimal parallelization and automatic swap insertion techniques for multi-node implementation of decision diagram-based quantum simulators. The ring communication approach is designed so that each node communicates with its neighboring nodes, which can facilitate faster and more parallel communication than broadcasting where one node needs to communicate with all nodes simultaneously. The automatic swap insertion method, an approach to minimize inter-node communication, has been employed in existing multi-node state vector-based simulators, but this paper proposes two methods specifically designed for decision diagram-based quantum simulators. These techniques were implemented and evaluated using the Shor algorithm and random circuits with up to 38 qubits using a maximum of 256 nodes. The experimental results have revealed that multi-node implementation can reduce run-time by up to 26 times. For example, Shor circuits that need 38 qubits can finish simulation in 147 seconds. Additionally, it was shown that ring communication has a higher speed-up effect than broadcast communication, and the importance of selecting the appropriate automatic swap insertion method was revealed. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted at IEEE QSW 2024

arXiv:2312.01570 [pdf, other]

doi 10.1109/QSW59989.2023.00026

Parallelizing quantum simulation with decision diagrams

Authors: Shaowen Li, Yusuke Kimura, Hiroyuki Sato, Junwei Yu, Masahiro Fujita

Abstract: Recent technological advancements show promise in leveraging quantum mechanical phenomena for computation. This brings substantial speed-ups to problems that are once considered to be intractable in the classical world. However, the physical realization of quantum computers is still far away from us, and a majority of research work is done using quantum simulators running on classical computers. C… ▽ More Recent technological advancements show promise in leveraging quantum mechanical phenomena for computation. This brings substantial speed-ups to problems that are once considered to be intractable in the classical world. However, the physical realization of quantum computers is still far away from us, and a majority of research work is done using quantum simulators running on classical computers. Classical computers face a critical obstacle in simulating quantum algorithms. Quantum states reside in a Hilbert space whose size grows exponentially to the number of subsystems, i.e., qubits. As a result, the straightforward statevector approach does not scale due to the exponential growth of the memory requirement. Decision diagrams have gained attention in recent years for representing quantum states and operations in quantum simulations. The main advantage of this approach is its ability to exploit redundancy. However, mainstream quantum simulators still rely on statevectors or tensor networks. We consider the absence of decision diagrams due to the lack of parallelization strategies. This work explores several strategies for parallelizing decision diagram operations, specifically for quantum simulations. We propose optimal parallelization strategies. Based on the experiment results, our parallelization strategy achieves a 2-3 times faster simulation of Grover's algorithm and random circuits than the state-of-the-art single-thread DD-based simulator DDSIM. △ Less

Submitted 3 December, 2023; originally announced December 2023.

arXiv:2207.01214 [pdf, other]

Integrating a Manual Pipette into a Collaborative Robot Manipulator for Flexible Liquid Dispensing

Authors: Junbo Zhang, Weiwei Wan, Nobuyuki Tanaka, Miki Fujita, Kensuke Harada

Abstract: This paper presents a system integration approach for a 6-DoF (Degree of Freedom) collaborative robot to operate a pipette for liquid dispensing. Its technical development is threefold. First, we designed an end-effector for holding and triggering manual pipettes. Second, we took advantage of a collaborative robot to recognize labware poses and planned robotic motion based on the recognized poses.… ▽ More This paper presents a system integration approach for a 6-DoF (Degree of Freedom) collaborative robot to operate a pipette for liquid dispensing. Its technical development is threefold. First, we designed an end-effector for holding and triggering manual pipettes. Second, we took advantage of a collaborative robot to recognize labware poses and planned robotic motion based on the recognized poses. Third, we developed vision-based classifiers to predict and correct positioning errors and thus precisely attached pipettes to disposable tips. Through experiments and analysis, we confirmed that the developed system, especially the planning and visual recognition methods, could help secure high-precision and flexible liquid dispensing. The developed system is suitable for low-frequency, high-repetition biochemical liquid dispensing tasks. We expect it to promote the deployment of collaborative robots for laboratory automation and thus improve the experimental efficiency without significantly customizing a laboratory environment. △ Less

Submitted 4 July, 2022; originally announced July 2022.

arXiv:2112.15443 [pdf]

FPGA Based Accelerator for Neural Networks Computation with Flexible Pipelining

Authors: Qingyang Yi, Heming Sun, Masahiro Fujita

Abstract: FPGA is appropriate for fix-point neural networks computing due to high power efficiency and configurability. However, its design must be intensively refined to achieve high performance using limited hardware resources. We present an FPGA-based neural networks accelerator and its optimization framework, which can achieve optimal efficiency for various CNN models and FPGA resources. Targeting high… ▽ More FPGA is appropriate for fix-point neural networks computing due to high power efficiency and configurability. However, its design must be intensively refined to achieve high performance using limited hardware resources. We present an FPGA-based neural networks accelerator and its optimization framework, which can achieve optimal efficiency for various CNN models and FPGA resources. Targeting high throughput, we adopt layer-wise pipeline architecture for higher DSP utilization. To get the optimal performance, a flexible algorithm to allocate balanced hardware resources to each layer is also proposed, supported by activation buffer design. Through our well-balanced implementation of four CNN models on ZC706, the DSP utilization and efficiency are over 90%. For VGG16 on ZC706, the proposed accelerator achieves the performance of 2.58x, 1.53x and 1.35x better than the referenced non-pipeline architecture [1], pipeline architecture [2] and [3], respectively. △ Less

Submitted 28 December, 2021; originally announced December 2021.

Comments: 6 pages

arXiv:2012.02530 [pdf, other]

Logic Synthesis Meets Machine Learning: Trading Exactness for Generalization

Authors: Shubham Rai, Walter Lau Neto, Yukio Miyasaka, Xinpei Zhang, Mingfei Yu, Qingyang Yi Masahiro Fujita, Guilherme B. Manske, Matheus F. Pontes, Leomar S. da Rosa Junior, Marilton S. de Aguiar, Paulo F. Butzen, Po-Chun Chien, Yu-Shan Huang, Hoa-Ren Wang, Jie-Hong R. Jiang, Jiaqi Gu, Zheng Zhao, Zixuan Jiang, David Z. Pan, Brunno A. de Abreu, Isac de Souza Campos, Augusto Berndt, Cristina Meinhardt, Jonata T. Carvalho, Mateus Grellert , et al. (15 additional authors not shown)

Abstract: Logic synthesis is a fundamental step in hardware design whose goal is to find structural representations of Boolean functions while minimizing delay and area. If the function is completely-specified, the implementation accurately represents the function. If the function is incompletely-specified, the implementation has to be true only on the care set. While most of the algorithms in logic synthes… ▽ More Logic synthesis is a fundamental step in hardware design whose goal is to find structural representations of Boolean functions while minimizing delay and area. If the function is completely-specified, the implementation accurately represents the function. If the function is incompletely-specified, the implementation has to be true only on the care set. While most of the algorithms in logic synthesis rely on SAT and Boolean methods to exactly implement the care set, we investigate learning in logic synthesis, attempting to trade exactness for generalization. This work is directly related to machine learning where the care set is the training set and the implementation is expected to generalize on a validation set. We present learning incompletely-specified functions based on the results of a competition conducted at IWLS 2020. The goal of the competition was to implement 100 functions given by a set of care minterms for training, while testing the implementation using a set of validation minterms sampled from the same function. We make this benchmark suite available and offer a detailed comparative analysis of the different approaches to learning △ Less

Submitted 15 December, 2020; v1 submitted 4 December, 2020; originally announced December 2020.

Comments: In this 23 page manuscript, we explore the connection between machine learning and logic synthesis which was the main goal for International Workshop on logic synthesis. It includes approaches applied by ten teams spanning 6 countries across the world

arXiv:2012.01114 [pdf, other]

Parallel Scheduling Self-attention Mechanism: Generalization and Optimization

Authors: Mingfei Yu, Masahiro Fujita

Abstract: Over the past few years, self-attention is shining in the field of deep learning, especially in the domain of natural language processing(NLP). Its impressive effectiveness, along with ubiquitous implementations, have aroused our interest in efficiently scheduling the data-flow of corresponding computations onto architectures with many computing units to realize parallel computing. In this paper,… ▽ More Over the past few years, self-attention is shining in the field of deep learning, especially in the domain of natural language processing(NLP). Its impressive effectiveness, along with ubiquitous implementations, have aroused our interest in efficiently scheduling the data-flow of corresponding computations onto architectures with many computing units to realize parallel computing. In this paper, based on the theory of self-attention mechanism and state-of-the-art realization of self-attention in language models, we propose a general scheduling algorithm, which is derived from the optimum scheduling for small instances solved by a satisfiability checking(SAT) solver, to parallelize typical computations of self-attention. Strategies for further optimization on skipping redundant computations are put forward as well, with which reductions of almost 25% and 50% of the original computations are respectively achieved for two widely-adopted application schemes of self-attention. With the proposed optimization adopted, we have correspondingly come up with another two scheduling algorithms. The proposed algorithms are applicable regardless of problem sizes, as long as the number of input vectors is divisible to the number of computing units available in the architecture. Due to the complexity of proving the correctness of the algorithms mathematically for general cases, we have conducted experiments to reveal their validity, together with the superior quality of the solutions provided by which, by solving SAT problems for particular instances. △ Less

Submitted 2 December, 2020; originally announced December 2020.

arXiv:2009.07513 [pdf, ps, other]

Perfectly Secure Message Transmission against Rational Adversaries

Authors: Maiki Fujita, Takeshi Koshiba, Kenji Yasunaga

Abstract: Secure Message Transmission (SMT) is a two-party cryptographic protocol by which the sender can securely and reliably transmit messages to the receiver using multiple channels. An adversary can corrupt a subset of the channels and commit eavesdropping and tampering attacks over the channels. In this work, we introduce a game-theoretic security model for SMT in which adversaries have some preferenc… ▽ More Secure Message Transmission (SMT) is a two-party cryptographic protocol by which the sender can securely and reliably transmit messages to the receiver using multiple channels. An adversary can corrupt a subset of the channels and commit eavesdropping and tampering attacks over the channels. In this work, we introduce a game-theoretic security model for SMT in which adversaries have some preferences for protocol execution. We define rational "timid" adversaries who prefer to violate security requirements but do not prefer the tampering to be detected. First, we consider the basic setting where a single adversary attacks the protocol. We construct perfect SMT protocols against any rational adversary corrupting all but one of the channels. Since minority corruption is required in the traditional setting, our results demonstrate a way of circumventing the cryptographic impossibility results by a game-theoretic approach. Next, we study the setting in which all the channels can be corrupted by multiple adversaries who do not cooperate. Since we cannot hope for any security if a single adversary corrupts all the channels or multiple adversaries cooperate maliciously, the scenario can arise from a game-theoretic model. We also study the scenario in which both malicious and rational adversaries exist. △ Less

Submitted 29 December, 2021; v1 submitted 16 September, 2020; originally announced September 2020.

arXiv:1703.06416 [pdf, other]

A Passivity-Based Distributed Reference Governor for Constrained Robotic Networks

Authors: Tam Nguyen, Takeshi Hatanaka, Mamoru Doi, Emanuele Garone, Masayuki Fujita

Abstract: This paper focuses on a passivity-based distributed reference governor (RG) applied to a pre-stabilized mobile robotic network. The novelty of this paper lies in the method used to solve the RG problem, where a passivity-based distributed optimization scheme is proposed. In particular, the gradient descent method minimizes the global objective function while the dual ascent method maximizes the Ha… ▽ More This paper focuses on a passivity-based distributed reference governor (RG) applied to a pre-stabilized mobile robotic network. The novelty of this paper lies in the method used to solve the RG problem, where a passivity-based distributed optimization scheme is proposed. In particular, the gradient descent method minimizes the global objective function while the dual ascent method maximizes the Hamiltonian. To make the agents converge to the agreed optimal solution, a proportional-integral consensus estimator is used. This paper proves the convergence of the state estimates of the RG to the optimal solution through passivity arguments, considering the physical system static. Then, the effectiveness of the scheme considering the dynamics of the physical system is demonstrated through simulations and experiments. △ Less

Submitted 19 March, 2017; originally announced March 2017.

Comments: 8 pages, International Federation of Automatic Conference 2017, 8 figures

arXiv:1307.7129 [pdf, ps, other]

An Architecture for Autonomously Controlling Robot with Embodiment in Real World

Authors: Megumi Fujita, Yuki Goto, Naoyuki Nide, Ken Satoh, Hiroshi Hosobe

Abstract: In the real world, robots with embodiment face various issues such as dynamic continuous changes of the environment and input/output disturbances. The key to solving these issues can be found in daily life; people `do actions associated with sensing' and `dynamically change their plans when necessary'. We propose the use of a new concept, enabling robots to do these two things, for autonomously co… ▽ More In the real world, robots with embodiment face various issues such as dynamic continuous changes of the environment and input/output disturbances. The key to solving these issues can be found in daily life; people `do actions associated with sensing' and `dynamically change their plans when necessary'. We propose the use of a new concept, enabling robots to do these two things, for autonomously controlling mobile robots. We implemented our concept to make two experiments under static/dynamic environments. The results of these experiments show that our idea provides a way to adapt to dynamic changes of the environment in the real world. △ Less

Submitted 26 July, 2013; originally announced July 2013.

Comments: Submission for proc. of KRR-ICLP2013 (resubmitted using LLNCS style)

Showing 1–9 of 9 results for author: Fujita, M