
Showing 1–34 of 34 results for author: Saito, H

  1. arXiv:2406.14978  [pdf, other]

    cs.CV

    E2GS: Event Enhanced Gaussian Splatting

    Authors: Hiroyuki Deguchi, Mana Masuda, Takuya Nakabayashi, Hideo Saito

    Abstract: Event cameras, known for their high dynamic range, absence of motion blur, and low energy usage, have recently found a wide range of applications thanks to these attributes. In the past few years, the field of event-based 3D reconstruction saw remarkable progress, with the Neural Radiance Field (NeRF) based approach demonstrating photorealistic view synthesis results. However, the volume rendering…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 7 pages

  2. arXiv:2406.03095  [pdf, other]

    cs.CV cs.AI cs.LG

    EgoSurgery-Tool: A Dataset of Surgical Tool and Hand Detection from Egocentric Open Surgery Videos

    Authors: Ryo Fujii, Hideo Saito, Hiroki Kajita

    Abstract: Surgical tool detection is a fundamental task for understanding egocentric open surgery videos. However, detecting surgical tools presents significant challenges due to their highly imbalanced class distribution, similar shapes and similar textures, and heavy occlusion. The lack of a comprehensive large-scale dataset compounds these challenges. In this paper, we introduce EgoSurgery-Tool, an exten…

    Submitted 6 June, 2024; v1 submitted 5 June, 2024; originally announced June 2024.

  3. arXiv:2405.20030  [pdf, other]

    cs.CV

    EMAG: Ego-motion Aware and Generalizable 2D Hand Forecasting from Egocentric Videos

    Authors: Masashi Hatano, Ryo Hachiuma, Hideo Saito

    Abstract: Predicting future human behavior from egocentric videos is a challenging but critical task for human intention understanding. Existing methods for forecasting 2D hand positions rely on visual representations and mainly focus on hand-object interactions. In this paper, we investigate the hand forecasting task and tackle two significant issues that persist in the existing methods: (1) 2D hand positi…

    Submitted 30 May, 2024; originally announced May 2024.

  4. arXiv:2405.19917  [pdf, other]

    cs.CV

    Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition

    Authors: Masashi Hatano, Ryo Hachiuma, Ryo Fujii, Hideo Saito

    Abstract: We address a novel cross-domain few-shot learning task (CD-FSL) with multimodal input and unlabeled target data for egocentric action recognition. This paper simultaneously tackles two critical challenges associated with egocentric action recognition in CD-FSL settings: (1) the extreme domain gap in egocentric videos (e.g., daily life vs. industrial domain) and (2) the computational cost for real-…

    Submitted 16 July, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted at ECCV'24

  5. arXiv:2405.19644  [pdf, other]

    cs.CV cs.AI cs.LG

    EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos

    Authors: Ryo Fujii, Masashi Hatano, Hideo Saito, Hiroki Kajita

    Abstract: Surgical phase recognition has gained significant attention due to its potential to offer solutions to numerous demands of the modern operating room. However, most existing methods concentrate on minimally invasive surgery (MIS), leaving surgical phase recognition for open surgery understudied. This discrepancy is primarily attributed to the scarcity of publicly available open surgery video datase…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Early accepted by MICCAI 2024

  6. arXiv:2401.02791  [pdf, other]

    cs.CV cs.LG

    Weakly Semi-supervised Tool Detection in Minimally Invasive Surgery Videos

    Authors: Ryo Fujii, Ryo Hachiuma, Hideo Saito

    Abstract: Surgical tool detection is essential for analyzing and evaluating minimally invasive surgery videos. Current approaches are mostly based on supervised methods that require large, fully instance-level labels (i.e., bounding boxes). However, large image datasets with instance-level labels are often limited because of the burden of annotation. Thus, surgical tool detection is important when providing…

    Submitted 8 January, 2024; v1 submitted 5 January, 2024; originally announced January 2024.

    Comments: Accepted to ICASSP 2024

  7. arXiv:2305.07152  [pdf, other]

    cs.CV

    Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge

    Authors: Aneeq Zia, Kiran Bhattacharyya, Xi Liu, Max Berniker, Ziheng Wang, Rogerio Nespolo, Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa, Bo Liu, David Austin, Yiheng Wang, Michal Futrega, Jean-Francois Puget, Zhenqiang Li, Yoichi Sato, Ryo Fujii, Ryo Hachiuma, Mana Masuda, Hideo Saito, An Wang, Mengya Xu, Mobarakol Islam, Long Bai, Winnie Pang, et al. (46 additional authors not shown)

    Abstract: The ability to automatically detect and track surgical instruments in endoscopic videos can enable transformational interventions. Assessing surgical performance and efficiency, identifying skilled tool use and choreography, and planning operational and logistical aspects of OR resources are just a few of the applications that could benefit. Unfortunately, obtaining the annotations needed to train…

    Submitted 31 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

  8. arXiv:2305.04531  [pdf, ps, other]

    cs.SD eess.AS physics.ins-det

    A method for analyzing sampling jitter in audio equipment

    Authors: Makoto Takeuchi, Haruo Saito

    Abstract: A method for analyzing sampling jitter in audio equipment is proposed. The method is based on the time-domain analysis where the time fluctuations of zero-crossing points in recorded sinusoidal waves are employed to characterize jitter. This method enables the separate evaluation of jitter in an audio player from those in audio recorders when the same playback signal is simultaneously fed into two…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: 15 pages, 12 figures

  9. arXiv:2304.04559  [pdf, other]

    cs.CV

    Event-based Camera Tracker by $\nabla$t NeRF

    Authors: Mana Masuda, Yusuke Sekikawa, Hideo Saito

    Abstract: When a camera travels across a 3D world, only a fraction of pixel values change; an event-based camera observes the change as sparse events. How can we utilize sparse events for efficient recovery of the camera pose? We show that we can recover the camera pose by minimizing the error between sparse events and the temporal gradient of the scene represented as a neural radiance field (NeRF). To enab…

    Submitted 7 April, 2023; originally announced April 2023.

  10. arXiv:2304.03420  [pdf, other]

    cs.CV

    Toward Unsupervised 3D Point Cloud Anomaly Detection using Variational Autoencoder

    Authors: Mana Masuda, Ryo Hachiuma, Ryo Fujii, Hideo Saito, Yusuke Sekikawa

    Abstract: In this paper, we present an end-to-end unsupervised anomaly detection framework for 3D point clouds. To the best of our knowledge, this is the first work to tackle the anomaly detection task on a general object represented by a 3D point cloud. We propose a deep variational autoencoder-based unsupervised anomaly detection network adapted to the 3D point cloud and an anomaly score specifically for…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: ICIP 2021

  11. arXiv:2303.15947  [pdf, other]

    cs.CV cs.AI cs.LG

    Deep Selection: A Fully Supervised Camera Selection Network for Surgery Recordings

    Authors: Ryo Hachiuma, Tomohiro Shimizu, Hideo Saito, Hiroki Kajita, Yoshifumi Takatsume

    Abstract: Recording surgery in operating rooms is an essential task for education and evaluation of medical treatment. However, recording the desired targets, such as the surgery field, surgical tools, or doctor's hands, is difficult because the targets are heavily occluded during surgery. We use a recording system in which multiple cameras are embedded in the surgical lamp, and we assume that at least one…

    Submitted 28 March, 2023; originally announced March 2023.

    Comments: MICCAI 2020

  12. arXiv:2303.13465  [pdf, other]

    cs.CL cs.AI

    Deep RL with Hierarchical Action Exploration for Dialogue Generation

    Authors: Itsugun Cho, Ryota Takahashi, Yusaku Yanase, Hiroaki Saito

    Abstract: Traditionally, approximate dynamic programming is employed in dialogue generation with greedy policy improvement through action sampling, as the natural language action space is vast. However, this practice is inefficient for reinforcement learning (RL) due to the sparsity of eligible responses with high action values, which leads to weak improvement sustained by random sampling. This paper presen…

    Submitted 15 May, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  13. arXiv:2204.07372  [pdf, other]

    cs.CL cs.LG

    A Personalized Dialogue Generator with Implicit User Persona Detection

    Authors: Itsugun Cho, Dongyang Wang, Ryota Takahashi, Hiroaki Saito

    Abstract: Current works in the generation of personalized dialogue primarily contribute to the agent presenting a consistent personality and driving a more informative response. However, we found that the generated responses from most previous models tend to be self-centered, with little care for the user in the dialogue. Moreover, we consider that human-like conversation is essentially built based on infer…

    Submitted 21 August, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

    Comments: 9 pages, 7 figures, accepted by COLING 2022

  14. arXiv:2203.07098  [pdf, other]

    cs.CV

    A Two-Block RNN-based Trajectory Prediction from Incomplete Trajectory

    Authors: Ryo Fujii, Jayakorn Vongkulbhisal, Ryo Hachiuma, Hideo Saito

    Abstract: Trajectory prediction has gained great attention and significant progress has been made in recent years. However, most works rely on a key assumption that each video is successfully preprocessed by detection and tracking algorithms and the complete observed trajectory is always available. However, in complex real-world environments, we often encounter miss-detection of target agents (e.g., pedestr…

    Submitted 16 March, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: Accepted by IEEE Access

  15. arXiv:2202.00210  [pdf, other]

    cs.RO

    INPUT Team Description Paper in 2022

    Authors: Masaki Yasuhara, Tomoya Takahashi, Hiroki Maruta, Hiroyuki Saito, Shota Higuchi, Takaaki Nara, Keitaro Takeuchi, Yota Sakai, Kazuki Ishibashi

    Abstract: INPUT is a team participating in the RoboCup Soccer Small League (SSL). It aims to show the world the technological capabilities of the Nagaoka region of Niigata Prefecture, which is where the team members are from. For this purpose, we are working on one of the projects from the Nagaoka Activation Zone of Energy (NAZE). Herein, we introduce two robots, v2019 and v2022, as well as AI systems that…

    Submitted 31 January, 2022; originally announced February 2022.

  16. arXiv:2111.03824  [pdf, other]

    cs.CV

    Neural Implicit Event Generator for Motion Tracking

    Authors: Mana Masuda, Yusuke Sekikawa, Ryo Fujii, Hideo Saito

    Abstract: We present a novel framework for motion tracking from event data using implicit expression. Our framework uses a pre-trained event generation MLP named implicit event generator (IEG) and performs motion tracking by updating its state (position and velocity) based on the difference between the observed event and the event generated from the current state estimate. The difference is computed implicitly by the I…

    Submitted 6 November, 2021; originally announced November 2021.

    Comments: Submitted to ICRA 2022

  17. arXiv:2110.07413  [pdf, other]

    cs.CV

    RGB-D Image Inpainting Using Generative Adversarial Network with a Late Fusion Approach

    Authors: Ryo Fujii, Ryo Hachiuma, Hideo Saito

    Abstract: Diminished reality is a technology that aims to remove objects from video images and fill in the missing region with plausible pixels. Most conventional methods utilize different cameras that capture the same scene from different viewpoints to allow regions to be removed and restored. In this paper, we propose an RGB-D image inpainting method using a generative adversarial network, which does n…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: Accepted at AVR 2020

  18. arXiv:2108.12971  [pdf, other]

    cs.PL cs.CL cs.LO cs.SE

    HELMHOLTZ: A Verifier for Tezos Smart Contracts Based on Refinement Types

    Authors: Yuki Nishida, Hiromasa Saito, Ran Chen, Akira Kawata, Jun Furuse, Kohei Suenaga, Atsushi Igarashi

    Abstract: A smart contract is a program executed on a blockchain, based on which many cryptocurrencies are implemented, and is being used for automating transactions. Due to the large amount of money that smart contracts deal with, there is a surging demand for a method that can statically and formally verify them. This article describes our type-based static verification tool HELMHOLTZ for Michelson, whi…

    Submitted 10 September, 2021; v1 submitted 29 August, 2021; originally announced August 2021.

  19. arXiv:2105.00151  [pdf, other]

    cs.NI

    Theoretical Analysis for Determining Geographical Route of Cable Network with Various Disaster-Endurance Levels

    Authors: Hiroshi Saito

    Abstract: This paper theoretically analyzes cable network disconnection due to randomly occurring natural disasters, where the disaster-endurance (DE) levels of the network are determined by a network entity such as the type of shielding method used for a duct containing cables. The network operator can determine which parts have a high DE level. When a part of a network can be protected, the placement of t…

    Submitted 30 April, 2021; originally announced May 2021.

  20. arXiv:2010.06318  [pdf, other]

    cs.CV

    Audio-Visual Self-Supervised Terrain Type Discovery for Mobile Platforms

    Authors: Akiyoshi Kurobe, Yoshikatsu Nakajima, Hideo Saito, Kris Kitani

    Abstract: The ability to both recognize and discover terrain characteristics is an important function required for many autonomous ground robots such as social robots, assistive robots, autonomous vehicles, and ground exploration robots. Recognizing and discovering terrain characteristics is challenging because similar terrains may have very different appearances (e.g., carpet comes in many colors), while t…

    Submitted 13 October, 2020; originally announced October 2020.

  21. Deep Learning in Diabetic Foot Ulcers Detection: A Comprehensive Evaluation

    Authors: Moi Hoon Yap, Ryo Hachiuma, Azadeh Alavi, Raphael Brungel, Bill Cassidy, Manu Goyal, Hongtao Zhu, Johannes Ruckert, Moshe Olshansky, Xiao Huang, Hideo Saito, Saeed Hassanpour, Christoph M. Friedrich, David Ascher, Anping Song, Hiroki Kajita, David Gillespie, Neil D. Reeves, Joseph Pappachan, Claire O'Shea, Eibe Frank

    Abstract: There has been a substantial amount of research involving computer methods and technology for the detection and recognition of diabetic foot ulcers (DFUs), but there is a lack of systematic comparisons of state-of-the-art deep learning object detection frameworks applied to this problem. DFUC2020 provided participants with a comprehensive dataset consisting of 2,000 images for training and 2,000 i…

    Submitted 24 May, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

    Comments: 19 pages, 18 figures, 10 tables

    Journal ref: Computers in Biology and Medicine, Volume 135, 2021, 104596, ISSN 0010-4825

  22. Spatio-Temporal Correlation of Interference in MANET Under Spatially Correlated Shadowing Environment

    Authors: Tatsuaki Kimura, Hiroshi Saito

    Abstract: Correlation of interference affects spatio-temporal aspects of various wireless mobile systems, such as retransmission, multiple antennas and cooperative relaying. In this paper, we study the spatial and temporal correlation of interference in mobile ad-hoc networks under a correlated shadowing environment. By modeling the node locations as a Poisson point process with an i.i.d. mobility model and…

    Submitted 20 December, 2019; originally announced December 2019.

    Comments: to appear in IEEE Transactions on Mobile Computing

  23. arXiv:1907.10008  [pdf, other]

    cs.CV cs.RO

    Incremental Class Discovery for Semantic Segmentation with RGBD Sensing

    Authors: Yoshikatsu Nakajima, Byeongkeun Kang, Hideo Saito, Kris Kitani

    Abstract: This work addresses the task of open world semantic segmentation using RGBD sensing to discover new semantic classes over time. Although there are many types of objects in the real world, current semantic segmentation methods make a closed world assumption and are trained only to segment a limited number of object classes. Towards a more open world approach, we propose a novel method that increment…

    Submitted 23 July, 2019; originally announced July 2019.

    Comments: 10 pages, To appear at IEEE International Conference on Computer Vision (ICCV 2019)

  24. arXiv:1907.09127  [pdf, other]

    cs.CV

    DetectFusion: Detecting and Segmenting Both Known and Unknown Dynamic Objects in Real-time SLAM

    Authors: Ryo Hachiuma, Christian Pirchheim, Dieter Schmalstieg, Hideo Saito

    Abstract: We present DetectFusion, an RGB-D SLAM system that runs in real-time and can robustly handle semantically known and unknown objects that can move dynamically in the scene. Our system detects, segments and assigns semantic class labels to known objects in the scene, while tracking and reconstructing them even when they move independently in front of the monocular camera. In contrast to related work…

    Submitted 22 July, 2019; originally announced July 2019.

    Comments: 12 pages, 4 figures, 4 tables, accepted by BMVC 2019 spotlight session

  25. arXiv:1812.07045  [pdf, other]

    cs.CV cs.LG

    EventNet: Asynchronous Recursive Event Processing

    Authors: Yusuke Sekikawa, Kosuke Hara, Hideo Saito

    Abstract: Event cameras are bio-inspired vision sensors that mimic retinas to asynchronously report per-pixel intensity changes rather than outputting an actual intensity image at regular intervals. This new paradigm of image sensor offers significant potential advantages; namely, sparse and non-redundant data representation. Unfortunately, however, most of the existing artificial neural network architectur…

    Submitted 1 April, 2019; v1 submitted 7 December, 2018; originally announced December 2018.

  26. arXiv:1803.02784  [pdf, other]

    cs.CV cs.RO

    Fast and Accurate Semantic Mapping through Geometric-based Incremental Segmentation

    Authors: Yoshikatsu Nakajima, Keisuke Tateno, Federico Tombari, Hideo Saito

    Abstract: We propose an efficient and scalable method for incrementally building a dense, semantically annotated 3D map in real-time. The proposed method assigns class probabilities to each region, not each element (e.g., surfel and voxel), of the 3D map which is built up through a robust SLAM framework and incrementally segmented with a geometric-based segmentation method. Differently from all other approa…

    Submitted 7 March, 2018; originally announced March 2018.

  27. arXiv:1707.06128  [pdf, ps, other]

    cs.IT cs.CG math.PR

    Geometric Analysis of Observability of Target Object Shape Using Location-Unknown Distance Sensors

    Authors: Hiroshi Saito, Hirotada Honda

    Abstract: We geometrically analyze the problem of estimating parameters related to the shape and size of a two-dimensional target object on the plane by using randomly distributed distance sensors whose locations are unknown. Based on the analysis using geometric probability, we discuss the observability of these parameters: which parameters we can estimate and what conditions are required to estimate them.…

    Submitted 15 May, 2017; originally announced July 2017.

    Comments: under submission

  28. arXiv:1706.09606  [pdf, ps, other]

    cs.PF

    Theoretical Performance Analysis of Vehicular Broadcast Communications at Intersection and their Optimization

    Authors: Tatsuaki Kimura, Hiroshi Saito

    Abstract: In this paper, we propose an optimization method for the broadcast rate in vehicle-to-vehicle (V2V) broadcast communications at an intersection on the basis of theoretical analysis. We consider a model in which locations of vehicles are modeled separately as queuing and running segments and derive key performance metrics of V2V broadcast communications via a stochastic geometry approach. Since the…

    Submitted 29 March, 2019; v1 submitted 29 June, 2017; originally announced June 2017.

  29. arXiv:1403.2486  [pdf, ps, other]

    cs.PF

    Theoretical Evaluation of Offloading through Wireless LANs

    Authors: Hiroshi Saito, Ryoichi Kawahara

    Abstract: Offloading of cellular traffic through a wireless local area network (WLAN) is theoretically evaluated. First, empirical data sets of the locations of WLAN internet access points are analyzed and an inhomogeneous Poisson process consisting of high, normal, and low density regions is proposed as a spatial point process model for these configurations. Second, performance metrics, such as mean availa…

    Submitted 11 March, 2014; originally announced March 2014.

  30. Spatial Design of Physical Network Robust against Earthquakes

    Authors: Hiroshi Saito

    Abstract: This paper analyzes the survivability of a physical network against earthquakes and proposes spatial network design rules to make a network robust against earthquakes. The disaster area model used is fairly generic and bounded. The proposed design rules for physical networks include: (i) a shorter zigzag route can reduce the probability that a network intersects a disaster area, (ii) an additive p…

    Submitted 27 February, 2014; originally announced February 2014.

    Comments: arXiv admin note: text overlap with arXiv:1312.7187

  31. arXiv:1402.1637  [pdf]

    cs.CE

    Vertical Clustering of 3D Elliptical Helical Data

    Authors: Wasantha Samarathunga, Masatoshi Seki, Hidenobu Saito, Ken Ichiryu, Yasuhiro Ohyama

    Abstract: This research proposes an effective vertical clustering strategy for 3D data in an elliptical helical shape based on 2D geometry. The clustering object is a metal pipe with an elliptical cross-section that has been bent into an elliptical helical shape and is used in wearable muscle support design for the welfare industry. The aim of this proposed method is to maximize the vertical clustering (verti…

    Submitted 7 February, 2014; originally announced February 2014.

    Journal ref: International Journal of Computer Trends and Technology, volume 6, number 2, Dec 2013

  32. arXiv:1402.1635  [pdf]

    cs.CE

    Product Evaluation In Elliptical Helical Pipe Bending

    Authors: Wasantha Samarathunga, Masatoshi Seki, Hidenobu Saito, Ken Ichiryu, Yasuhiro Ohyama

    Abstract: This research proposes a computational approach to evaluating end-product machining accuracy in elliptical-surfaced helical pipe bending using a 6-DOF parallel manipulator as a pipe bender. The target end product is wearable metal muscle supporters used in build-to-order welfare product manufacturing. This paper proposes a product testing model that mainly corrects the surface direction…

    Submitted 7 February, 2014; originally announced February 2014.

    Journal ref: International Journal of Computer Trends and Technology, volume 4, issue 10, Oct 2013

  33. arXiv:1312.7187  [pdf, ps, other]

    cs.NI

    Analysis of Geometric Disaster Evaluation Model for Physical Networks

    Authors: Hiroshi Saito

    Abstract: A geometric model of a physical network affected by a disaster is proposed and analyzed using integral geometry (geometric probability). This analysis provides a theoretical method of evaluating performance metrics, such as the probability of maintaining connectivity, and a network design rule that can make the network robust against disasters. The proposed model is of when the disaster area is…

    Submitted 26 December, 2013; originally announced December 2013.

    Comments: 12 pages

  34. arXiv:0911.3842  [pdf, other]

    physics.data-an cs.IR cs.SD physics.soc-ph

    Musical Genres: Beating to the Rhythms of Different Drums

    Authors: Debora C. Correa, Jose H. Saito, Luciano da F. Costa

    Abstract: Online music databases have increased significantly as a consequence of the rapid growth of the Internet and digital audio, requiring the development of faster and more efficient tools for music content analysis. Musical genres are widely used to organize music collections. In this paper, the problem of automatic music genre classification is addressed by exploring rhythm-based features obtained f…

    Submitted 19 November, 2009; originally announced November 2009.

    Comments: 35 pages, 13 figures, 13 tables