Plug-and-Play Acceleration of Occupancy Grid-based NeRF Rendering using VDB Grid and Hierarchical Ray Traversal

Abstract: Transmittance estimators such as Occupancy Grid (OG) can accelerate the training and rendering of Neural Radiance Field (NeRF) by predicting important samples that contribute much to the generated image. However, OG manages occupied regions in the form of a dense binary grid, in which there are many blocks with the same values that cause redundant examination of voxels' emptiness in ray-tracing. In our work, we introduce two techniques to improve the efficiency of ray-tracing in trained OG without fine-tuning. First, we replace the dense grids with VDB grids to reduce the spatial redundancy. Second, we use hierarchical digital differential analyzer (HDDA) to efficiently trace voxels in the VDB grids. Our experiments on NeRF-Synthetic and Mip-NeRF 360 datasets show that our proposed method successfully accelerates rendering NeRF-Synthetic dataset by 12% on average and Mip-NeRF 360 dataset by 4% on average, compared to a fast implementation of OG, NerfAcc, without losing the quality of rendered images.

Submitted 16 April, 2024; originally announced April 2024.

Comments: Short paper for CVPR Neural Rendering Intelligence Workshop 2024. Code: https://github.com/Yosshi999/faster-occgrid

arXiv:2404.04465 [pdf, other]

Aligning Diffusion Models by Optimizing Human Utility

Authors: Shufan Li, Konstantinos Kallidromitis, Akash Gokul, Yusuke Kato, Kazuki Kozuka

Abstract: We present Diffusion-KTO, a novel approach for aligning text-to-image diffusion models by formulating the alignment objective as the maximization of expected human utility. Since this objective applies to each generation independently, Diffusion-KTO does not require collecting costly pairwise preference data nor training a complex reward model. Instead, our objective requires simple per-image binary feedback signals, e.g. likes or dislikes, which are abundantly available. After fine-tuning using Diffusion-KTO, text-to-image diffusion models exhibit superior performance compared to existing techniques, including supervised fine-tuning and Diffusion-DPO, both in terms of human judgment and automatic evaluation metrics such as PickScore and ImageReward. Overall, Diffusion-KTO unlocks the potential of leveraging readily available per-image binary signals and broadens the applicability of aligning text-to-image diffusion models with human preferences.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 27 pages, 11 figures

arXiv:2310.13368 [pdf, other]

AP Connection Method for Maximizing Throughput Considering User Moving and Degree of Interference Based on Potential Game

Authors: Yu Kato, Jiquan Xie, Tutomu Murase, Sumiko Miyata

Abstract: For multi-transmission rate environments, access point (AP) connection methods have been proposed for maximizing system throughput, which is the throughput of an entire system, on the basis of the cooperative behavior of users. These methods derive optimal positions for the cooperative behavior of users, which means that new users move to improve the system throughput when connecting to an AP. However, the conventional method only considers the transmission rate of new users and does not consider existing users, even though it is necessary to consider the transmission rate of all users to improve system throughput. In addition, these methods do not take into account the frequency of interference between users. In this paper, we propose an AP connection method which maximizes system throughput by considering the interference between users and the initial position of all users. In addition, our proposed method can improve system throughput by about 6% at most compared to conventional methods.

Submitted 5 December, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

Comments: 14 pages, 15 figures, It is being submitted to IEEE Open Journal of the Communications Society

arXiv:2308.02136 [pdf, other]

World-Model-Based Control for Industrial box-packing of Multiple Objects using NewtonianVAE

Authors: Yusuke Kato, Ryo Okumura, Tadahiro Taniguchi

Abstract: The process of industrial box-packing, which involves the accurate placement of multiple objects, requires high-accuracy positioning and sequential actions. When a robot is tasked with placing an object at a specific location with high accuracy, it is important not only to have information about the location of the object to be placed, but also the posture of the object grasped by the robotic hand. Often, industrial box-packing requires the sequential placement of identically shaped objects into a single box. The robot's action should be determined by the same learned model. In factories, new kinds of products often appear and there is a need for a model that can easily adapt to them. Therefore, it should be easy to collect data to train the model. In this study, we designed a robotic system to automate real-world industrial tasks, employing a vision-based learning control model. We propose in-hand-view-sensitive Newtonian variational autoencoder (ihVS-NVAE), which employs an RGB camera to obtain in-hand postures of objects. We demonstrate that our model, trained for a single object-placement task, can handle sequential tasks without additional training. To evaluate efficacy of the proposed model, we employed a real robot to perform sequential industrial box-packing of multiple objects. Results showed that the proposed model achieved a 100% success rate in industrial box-packing tasks, thereby outperforming the state-of-the-art and conventional approaches, underscoring its superior effectiveness and potential in industrial tasks.

Submitted 3 April, 2024; v1 submitted 4 August, 2023; originally announced August 2023.

Comments: 7 pages, 8 figures

arXiv:2307.00764 [pdf, other]

Hierarchical Open-vocabulary Universal Image Segmentation

Authors: Xudong Wang, Shufan Li, Konstantinos Kallidromitis, Yusuke Kato, Kazuki Kozuka, Trevor Darrell

Abstract: Open-vocabulary image segmentation aims to partition an image into semantic regions according to arbitrary text descriptions. However, complex visual scenes can be naturally decomposed into simpler parts and abstracted at multiple levels of granularity, introducing inherent segmentation ambiguity. Unlike existing methods that typically sidestep this ambiguity and treat it as an external factor, our approach actively incorporates a hierarchical representation encompassing different semantic-levels into the learning process. We propose a decoupled text-image fusion mechanism and representation learning modules for both "things" and "stuff". Additionally, we systematically examine the differences that exist in the textual and visual features between these types of categories. Our resulting model, named HIPIE, tackles HIerarchical, oPen-vocabulary, and unIvErsal segmentation tasks within a unified framework. Benchmarked on over 40 datasets, e.g., ADE20K, COCO, Pascal-VOC Part, RefCOCO/RefCOCOg, ODinW and SeginW, HIPIE achieves the state-of-the-art results at various levels of image comprehension, including semantic-level (e.g., semantic segmentation), instance-level (e.g., panoptic/referring segmentation and object detection), as well as part-level (e.g., part/subpart segmentation) tasks. Our code is released at https://github.com/berkeley-hipie/HIPIE.

Submitted 21 December, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: Project web-page: http://people.eecs.berkeley.edu/~xdwang/projects/HIPIE/; NeurIPS 2023 Camera-ready

arXiv:2306.13961 [pdf, ps, other]

Categorical Approach to Conflict Resolution: Integrating Category Theory into the Graph Model for Conflict Resolution

Authors: Yukiko Kato

Abstract: This paper introduces the Categorical Graph Model for Conflict Resolution (C-GMCR), a novel framework that integrates category theory into the traditional Graph Model for Conflict Resolution (GMCR). The C-GMCR framework provides a more abstract and general way to model and analyze conflict resolution, enabling researchers to uncover deeper insights and connections. We present the basic concepts, methods, and application of the C-GMCR framework to the well-known Prisoner's Dilemma and other representative cases. The findings suggest that the categorical approach offers new perspectives on stability concepts and can potentially lead to the development of more effective conflict resolution strategies.

Submitted 30 June, 2023; v1 submitted 24 June, 2023; originally announced June 2023.

Comments: This work has been submitted to IEEE SMC 2023 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2306.11983 [pdf, other]

Stability analysis of admittance control using asymmetric stiffness matrix

Authors: Toshiaki Tsuji, Yasuhiro Kato

Abstract: In contact-rich tasks, setting the stiffness of the control system is a critical factor in its performance. Although the setting range can be extended by making the stiffness matrix asymmetric, its stability has not been proven. This study focuses on the stability of compliance control in a robot arm that deals with an asymmetric stiffness matrix. It discusses the convergence stability of the admittance control. The paper explains how to derive an asymmetric stiffness matrix and how to incorporate it into the admittance model. The authors also present simulation and experimental results that demonstrate the effectiveness of their proposed method.

Submitted 20 June, 2023; originally announced June 2023.

arXiv:2211.04972 [pdf, ps, other]

Hibikino-Musashi@Home 2018 Team Description Paper

Authors: Yutaro Ishida, Sansei Hori, Yuichiro Tanaka, Yuma Yoshimoto, Kouhei Hashimoto, Gouki Iwamoto, Yoshiya Aratani, Kenya Yamashita, Shinya Ishimoto, Kyosuke Hitaka, Fumiaki Yamaguchi, Ryuhei Miyoshi, Kentaro Honda, Yushi Abe, Yoshitaka Kato, Takashi Morie, Hakaru Tamukoh

Abstract: Our team, Hibikino-Musashi@Home (the shortened name is HMA), was founded in 2010. It is based in the Kitakyushu Science and Research Park, Japan. We have participated in the RoboCup@Home Japan open competition open platform league every year since 2010. Moreover, we participated in the RoboCup 2017 Nagoya as open platform league and domestic standard platform league teams. Currently, the Hibikino-Musashi@Home team has 20 members from seven different laboratories based in the Kyushu Institute of Technology. In this paper, we introduce the activities of our team and the technologies.

Submitted 9 November, 2022; originally announced November 2022.

Comments: 8 pages, 5 figures, RoboCup@Home

arXiv:2210.16938 [pdf, ps, other]

A view on model misspecification in uncertainty quantification

Authors: Yuko Kato, David M. J. Tax, Marco Loog

Abstract: Estimating uncertainty of machine learning models is essential to assess the quality of the predictions that these models provide. However, there are several factors that influence the quality of uncertainty estimates, one of which is the amount of model misspecification. Model misspecification always exists as models are mere simplifications or approximations to reality. The question arises whether the estimated uncertainty under model misspecification is reliable or not. In this paper, we argue that model misspecification should receive more attention, by providing thought experiments and contextualizing these with relevant literature.

Submitted 2 November, 2022; v1 submitted 30 October, 2022; originally announced October 2022.

Comments: An initial version of the current work has been accepted to be presented at BNAIC/BeNeLearn 2022, to which it was submitted on August 27, 2022

arXiv:2208.11821 [pdf, other]

Refine and Represent: Region-to-Object Representation Learning

Authors: Akash Gokul, Konstantinos Kallidromitis, Shufan Li, Yusuke Kato, Kazuki Kozuka, Trevor Darrell, Colorado J Reed

Abstract: Recent works in self-supervised learning have demonstrated strong performance on scene-level dense prediction tasks by pretraining with object-centric or region-based correspondence objectives. In this paper, we present Region-to-Object Representation Learning (R2O) which unifies region-based and object-centric pretraining. R2O operates by training an encoder to dynamically refine region-based segments into object-centric masks and then jointly learns representations of the contents within the mask. R2O uses a "region refinement module" to group small image regions, generated using a region-level prior, into larger regions which tend to correspond to objects by clustering region-level features. As pretraining progresses, R2O follows a region-to-object curriculum which encourages learning region-level features early on and gradually progresses to train object-centric representations. Representations learned using R2O lead to state-of-the-art performance in semantic segmentation for PASCAL VOC (+0.7 mIOU) and Cityscapes (+0.4 mIOU) and instance segmentation on MS COCO (+0.3 mask AP). Further, after pretraining on ImageNet, R2O pretrained models are able to surpass existing state-of-the-art in unsupervised object segmentation on the Caltech-UCSD Birds 200-2011 dataset (+2.9 mIoU) without any further training. We provide the code/models from this work at https://github.com/KKallidromitis/r2o.

Submitted 20 December, 2022; v1 submitted 24 August, 2022; originally announced August 2022.

arXiv:2207.11733 [pdf, other]

doi 10.1109/SMC53654.2022.9945371

State Definition for Conflict Analysis with Four-valued Logic

Authors: Yukiko Kato

Abstract: We examined a four-valued logic method for state settings in conflict resolution models. Decision-making models of conflict resolution, such as game theory and graph model for conflict resolution (GMCR), assume the description of a state to be the outcome of a combination of strategies or the consequence of option selection by the decision-makers. However, for a framework to function as a decision-making system, unless a clear definition of the task of placing information out of an infinite world exists, logical consistency cannot be ensured, and thus, the function may be incomputable. The introduction of paraconsistent four-valued logic can prevent incorrect state setting and analysis with insufficient information and provide logical validity to analytical methods that vary the analysis resolution depending on the degree of coarseness of the available information. This study proposes a GMCR stability analysis with state configuration based on Belnap's four-valued logic.

Submitted 24 July, 2022; originally announced July 2022.

Comments: This work has been submitted to the IEEE SMC 2022 for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Journal ref: IEEE Syst. Man Cybern.2022, pp. 3186-3191

arXiv:2205.09924 [pdf]

doi 10.1109/ICDMW53433.2021.00072

Anomaly Detection for Multivariate Time Series on Large-scale Fluid Handling Plant Using Two-stage Autoencoder

Authors: Susumu Naito, Yasunori Taguchi, Kouta Nakata, Yuichi Kato

Abstract: This paper focuses on anomaly detection for multivariate time series data in large-scale fluid handling plants with dynamic components, such as power generation, water treatment, and chemical plants, where signals from various physical phenomena are observed simultaneously. In these plants, the need for anomaly detection techniques is increasing in order to reduce the cost of operation and maintenance, in view of a decline in the number of skilled engineers and a shortage of manpower. However, considering the complex behavior of high-dimensional signals and the demand for interpretability, the techniques constitute a major challenge. We introduce a Two-Stage AutoEncoder (TSAE) as an anomaly detection method suitable for such plants. This is a simple autoencoder architecture that makes anomaly detection more interpretable and more accurate, in which based on the premise that plant signals can be separated into two behaviors that have almost no correlation with each other, the signals are separated into long-term and short-term components in a stepwise manner, and the two components are trained independently to improve the inference capability for normal signals. Through experiments on two publicly available datasets of water treatment systems, we have confirmed the high detection performance, the validity of the premise, and that the model behavior was as intended, i.e., the technical effectiveness of TSAE.

Submitted 19 May, 2022; originally announced May 2022.

Comments: The 2nd Workshop on Large-scale Industrial Time Series Analysis at the 21st IEEE International Conference on Data Mining (ICDM), 2021

Journal ref: 2021 International Conference on Data Mining Workshops (ICDMW), 2021, pp. 542-551

arXiv:2203.08496 [pdf, other]

doi 10.1038/s41598-022-27183-x

Dynamic Grass Color Scale Display Technique Based on Grass Length for Green Landscape-Friendly Animation Display

Authors: Kojiro Tanaka, Yuichi Kato, Akito Mizuno, Masahiko Mikawa, Makoto Fujisawa

Abstract: Recently, public displays such as liquid crystal displays (LCDs) are often used in urban green spaces; however, the display devices can spoil the green landscape of urban green spaces because they look like artificial materials. We previously proposed a green landscape-friendly grass animation display method by controlling a pixel-by-pixel grass color dynamically. The grass color can be changed by moving a green grass length in yellow grass, and the grass animation display can play simple animations using grayscale images. In the previous research, the color scale was mapped to the green grass length subjectively; however, this method has not achieved displaying the grass colors corresponding to the color scale based on objective evaluations. Here, we introduce a dynamic grass color scale display technique based on a grass length. In this paper, we developed a grass color scale setting procedure to map the grass length to the color scale with five levels through image processing. Through the outdoor experiment of the grass color scale setting procedure, the color scale can correspond to the green grass length based on a viewpoint. After the experiments, we demonstrated a grass animation display to show the animations with the color scale using the experiment results.

Submitted 18 December, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

Comments: 17 pages

arXiv:2203.05413 [pdf, other]

doi 10.1109/LRA.2022.3190806

A Self-Tuning Impedance-based Interaction Planner for Robotic Haptic Exploration

Authors: Yasuhiro Kato, Pietro Balatti, Juan M. Gandarias, Mattia Leonori, Toshiaki Tsuji, Arash Ajoudani

Abstract: This paper presents a novel interaction planning method that exploits impedance tuning techniques in response to environmental uncertainties and unpredictable conditions using haptic information only. The proposed algorithm plans the robot's trajectory based on the haptic interaction with the environment and adapts planning strategies as needed. Two approaches are considered: Exploration and Bouncing strategies. The Exploration strategy takes the actual motion of the robot into account in planning, while the Bouncing strategy exploits the forces and the motion vector of the robot. Moreover, self-tuning impedance is performed according to the planned trajectory to ensure compliant contact and low contact forces. In order to show the performance of the proposed methodology, two experiments with a torque-controller robotic arm are carried out. The first considers a maze exploration without obstacles, whereas the second includes obstacles. The proposed method performance is analyzed and compared against previously proposed solutions in both cases. Experimental results demonstrate that: i) the robot can successfully plan its trajectory autonomously in the most feasible direction according to the interaction with the environment, and ii) a compliant interaction with an unknown environment despite the uncertainties is achieved. Finally, a scalability demonstration is carried out to show the potential of the proposed method under multiple scenarios.

Submitted 2 September, 2022; v1 submitted 10 March, 2022; originally announced March 2022.

Comments: 8 pages, 9 figures, accepted for IEEE Robotics and Automation Letters (RA-L) and IEEE/RSJ International Conference on Intelligent Robots and Systems 2022

arXiv:2109.14348 [pdf, ps, other]

Smart-home anomaly detection using combination of in-home situation and user behavior

Authors: Masaaki Yamauchi, Masahiro Tanaka, Yuichi Ohsita, Masayuki Murata, Kensuke Ueda, Yoshiaki Kato

Abstract: Internet-of-things (IoT) devices are vulnerable to malicious operations by attackers, which can cause physical and economic harm to users; therefore, we previously proposed a sequence-based method that modeled user behavior as sequences of in-home events and a base home state to detect anomalous operations. However, that method modeled users' home states based on the time of day; hence, attackers could exploit the system to maximize attack opportunities. Therefore, we then proposed an estimation-based detection method that estimated the home state using not only the time of day but also the observable values of home IoT sensors and devices. However, it ignored short-term operational behaviors. Consequently, in the present work, we propose a behavior-modeling method that combines home state estimation and event sequences of IoT devices within the home to enable a detailed understanding of long- and short-term user behavior. We compared the proposed model to our previous methods using data collected from real homes. Compared with the estimation-based method, the proposed method achieved a 15.4% higher detection ratio with fewer than 10% misdetections. Compared with the sequence-based method, the proposed method achieved a 46.0% higher detection ratio with fewer than 10% misdetections.

Submitted 29 September, 2021; originally announced September 2021.

Comments: 13 pages, 22 figures

arXiv:2108.08631 [pdf, other]

doi 10.1103/PhysRevResearch.3.043126

Determinant-free fermionic wave function using feed-forward neural networks

Authors: Koji Inui, Yasuyuki Kato, Yukitoshi Motome

Abstract: We propose a general framework for finding the ground state of many-body fermionic systems by using feed-forward neural networks. The anticommutation relation for fermions is usually implemented to a variational wave function by the Slater determinant (or Pfaffian), which is a computational bottleneck because of the numerical cost of $O(N^3)$ for $N$ particles. We bypass this bottleneck by explicitly calculating the sign changes associated with particle exchanges in real space and using fully connected neural networks for optimizing the rest parts of the wave function. This reduces the computational cost to $O(N^2)$ or less. We show that the accuracy of the approximation can be improved by optimizing the "variance" of the energy simultaneously with the energy itself. We also find that a reweighting method in Monte Carlo sampling can stabilize the calculation. These improvements can be applied to other approaches based on variational Monte Carlo methods. Moreover, we show that the accuracy can be further improved by using the symmetry of the system, the representative states, and an additional neural network implementing a generalized Gutzwiller-Jastrow factor. We demonstrate the efficiency of the method by applying it to a two-dimensional Hubbard model.

Submitted 22 August, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

Journal ref: Phys. Rev. Research 3, 043126 (2021)
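
Illustrative note on the entry above: the abstract says the sign changes from particle exchanges are computed explicitly at a cost of $O(N^2)$ or less, but it does not say how. One standard way to obtain an exchange sign at that cost is to count inversions in the particle ordering, since each pairwise swap flips the sign. The sketch below shows that generic technique only; the function name `exchange_sign` and the list-of-labels input are assumptions for illustration, not the authors' implementation.

```python
def exchange_sign(order):
    """Return +1 or -1, the parity of the permutation given by `order`.

    `order` is a hypothetical list of distinct particle labels as they
    appear in a configuration. Each inversion (a pair out of ascending
    order) corresponds to one exchange, which contributes a factor of -1.
    """
    inversions = 0
    n = len(order)
    # Compare every pair once: O(N^2) work for N particles.
    for i in range(n):
        for j in range(i + 1, n):
            if order[i] > order[j]:
                inversions += 1
    return -1 if inversions % 2 else 1


if __name__ == "__main__":
    print(exchange_sign([0, 1, 2]))  # identity ordering
    print(exchange_sign([1, 0, 2]))  # one exchange
```

A merge-sort based inversion count would bring this down to $O(N \log N)$, which is consistent with the abstract's "or less" but is again only a general observation, not a claim about the paper.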

arXiv:1609.08144 [pdf, other]

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Authors: Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, et al. (6 additional authors not shown)

Abstract: Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves competitive results to state-of-the-art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.

Submitted 8 October, 2016; v1 submitted 26 September, 2016; originally announced September 2016.
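
Illustrative note on the entry above: the abstract mentions a length-normalization procedure and a coverage penalty in beam search without giving formulas. The sketch below assumes the commonly cited form, in which a hypothesis score is log P(Y|X) divided by a length penalty ((5 + |Y|) / 6)^alpha, plus beta times the sum over source words of log(min(total attention received, 1)). The function names, the nested-list attention layout, and the default values of `alpha` and `beta` are assumptions made for illustration, not the production system's code.

```python
import math


def length_penalty(length, alpha=0.2):
    """Length normalizer; equals 1.0 for a one-token hypothesis."""
    return ((5.0 + length) / 6.0) ** alpha


def coverage_penalty(attention, beta=0.2):
    """Penalize source words that received little total attention.

    attention[j][i] is the (assumed) attention weight of target token j
    on source token i. Every source token is assumed to receive strictly
    positive total attention, otherwise the logarithm is undefined.
    """
    n_src = len(attention[0])
    total = 0.0
    for i in range(n_src):
        covered = sum(row[i] for row in attention)
        total += math.log(min(covered, 1.0))
    return beta * total


def beam_score(log_prob, attention, alpha=0.2, beta=0.2):
    """Combine log-probability, length normalization, and coverage."""
    target_len = len(attention)
    return log_prob / length_penalty(target_len, alpha) + coverage_penalty(attention, beta)


if __name__ == "__main__":
    # Two target tokens fully covering two source tokens: no coverage penalty.
    full = [[1.0, 0.0], [0.0, 1.0]]
    print(beam_score(-2.0, full))
```

Setting `alpha=0` and `beta=0` reduces the score to the raw log-probability, which makes it easy to see what each term contributes.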

arXiv:1202.4883 [pdf, ps, other]

The Dissecting Power of Regular Languages

Authors: Tomoyuki Yamakami, Yuichi Kato

Abstract: A recent study on structural properties of regular and context-free languages has greatly promoted our basic understandings of the complex behaviors of those languages. We continue the study to examine how regular languages behave when they need to cut numerous infinite languages. A particular interest rests on a situation in which a regular language needs to "dissect" a given infinite language into two subsets of infinite size. Every context-free language is dissected by carefully chosen regular languages (or it is REG-dissectible). In a larger picture, we show that constantly-growing languages and semi-linear languages are REG-dissectible. Under certain natural conditions, complements and finite intersections of semi-linear languages also become REG-dissectible. Restricted to bounded languages, the intersections of finitely many context-free languages and, more surprisingly, the entire Boolean hierarchy over bounded context-free languages are REG-dissectible. As an immediate application of the REG-dissectibility, we show another structural property, in which an appropriate bounded context-free language can "separate with infinite margins" two given nested infinite bounded context-free languages.

Submitted 12 December, 2012; v1 submitted 22 February, 2012; originally announced February 2012.

Comments: A4, 10pt, 9 pages, 2 figures

Journal ref: Information Processing Letters, Vol.113, pp.116-122, 2013

Showing 1–18 of 18 results for author: Kato, Y