Showing 1–13 of 13 results for author: Kawamura, M

  1. arXiv:2406.07969  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

    Authors: Masaya Kawamura, Ryuichi Yamamoto, Yuma Shirahata, Takuya Hasumi, Kentaro Tachibana

    Abstract: We introduce LibriTTS-P, a new corpus based on LibriTTS-R that includes utterance-level descriptions (i.e., prompts) of speaking style and speaker-level prompts of speaker characteristics. We employ a hybrid approach to construct prompt annotations: (1) manual annotations that capture human perceptions of speaker characteristics and (2) synthetic annotations on speaking style. Compared to existing…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024

  2. A Method of Joint Angle Estimation Using Only Relative Changes in Muscle Lengths for Tendon-driven Humanoids with Complex Musculoskeletal Structures

    Authors: Kento Kawaharazuka, Shogo Makino, Masaya Kawamura, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: Tendon-driven musculoskeletal humanoids typically have complex structures similar to those of human beings, such as ball joints and the scapula, in which encoders cannot be installed. Therefore, joint angles cannot be directly obtained and need to be estimated using the changes in muscle lengths. In previous studies, methods using table search and the extended Kalman filter have been developed. These…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted at Humanoids2018

  3. Online Learning of Joint-Muscle Mapping Using Vision in Tendon-driven Musculoskeletal Humanoids

    Authors: Kento Kawaharazuka, Shogo Makino, Masaya Kawamura, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: The body structures of tendon-driven musculoskeletal humanoids are complex, and accurate modeling is difficult, because they are made by imitating the body structures of human beings. For this reason, we have not been able to move them accurately like ordinary humanoids driven by actuators in each axis, and large internal muscle tension and slack of tendon wires have emerged by the model error bet…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted at IEEE Robotics and Automation Letters, 2018

  4. Online Self-body Image Acquisition Considering Changes in Muscle Routes Caused by Softness of Body Tissue for Tendon-driven Musculoskeletal Humanoids

    Authors: Kento Kawaharazuka, Shogo Makino, Masaya Kawamura, Ayaka Fujii, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: Tendon-driven musculoskeletal humanoids have many benefits in terms of the flexible spine, multiple degrees of freedom, and variable stiffness. At the same time, because of their body complexity, there are problems in controllability. First, due to the large difference between the actual robot and its geometric model, it cannot move as intended and large internal muscle tension may emerge. Second, m…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted at IROS2018

  5. High-Power, Flexible, Robust Hand: Development of Musculoskeletal Hand Using Machined Springs and Realization of Self-Weight Supporting Motion with Humanoid

    Authors: Shogo Makino, Kento Kawaharazuka, Masaya Kawamura, Yuki Asano, Kei Okada, Masayuki Inaba

    Abstract: Humans can support their bodies not only while standing or walking but also with their hands, for example when hanging from a bar. Most humanoid robots, however, support their bodies only with their feet and use their hands just to manipulate objects, because their hands are too weak to support their bodies. Strong hands are expected to enable humanoid robots to act in a much broader range of scenes. Therefore,…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted at IROS2017

  6. Five-fingered Hand with Wide Range of Thumb Using Combination of Machined Springs and Variable Stiffness Joints

    Authors: Shogo Makino, Kento Kawaharazuka, Ayaka Fujii, Masaya Kawamura, Tasuku Makabe, Moritaka Onitsuka, Yuki Asano, Kei Okada, Koji Kawasaki, Masayuki Inaba

    Abstract: Human hands can not only grasp objects of various shapes and sizes and manipulate them in the hand, but also exert such a large gripping force that they can support the body in situations such as hanging from a bar and climbing a ladder. On the other hand, it is difficult for most robot hands to manage both. Therefore, in this paper, we developed a hand that can grasp various objects and exert large gr…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted at IROS2018

  7. arXiv:2402.05508  [pdf, ps, other]

    cs.MM cond-mat.stat-mech

    Performance Evaluation of Associative Watermarking Using Statistical Neurodynamics

    Authors: Ryoto Kanegae, Masaki Kawamura

    Abstract: We theoretically evaluated the performance of our proposed associative watermarking method in which the watermark is not embedded directly into the image. We previously proposed a watermarking method that extends the zero-watermarking model by applying associative memory models. In this model, the hetero-associative memory model is introduced to the mapping process between image features and water…

    Submitted 8 February, 2024; originally announced February 2024.

  8. arXiv:2309.08140  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

    Authors: Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana

    Abstract: We propose PromptTTS++, a prompt-based text-to-speech (TTS) synthesis system that allows control over speaker identity using natural language descriptions. To control speaker identity within the prompt-based TTS framework, we introduce the concept of speaker prompt, which describes voice characteristics (e.g., gender-neutral, young, old, and muffled) designed to be approximately independent of spe…

    Submitted 27 December, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024

  9. arXiv:2210.15975  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

    Authors: Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana

    Abstract: We propose a lightweight end-to-end text-to-speech model using multi-band generation and inverse short-time Fourier transform. Our model is based on VITS, a high-quality end-to-end text-to-speech model, but adopts two changes for more efficient inference: 1) the most computationally expensive component is partially replaced with a simple inverse short-time Fourier transform, and 2) multi-band gene…

    Submitted 21 February, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

    Comments: Accepted to ICASSP 2023

  10. arXiv:2202.00200  [pdf, other]

    cs.SD cs.LG eess.AS

    Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds

    Authors: Masaya Kawamura, Tomohiko Nakamura, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo

    Abstract: A differentiable digital signal processing (DDSP) autoencoder is a musical sound synthesizer that combines a deep neural network (DNN) and spectral modeling synthesis. It allows us to flexibly edit sounds by changing the fundamental frequency, timbre feature, and loudness (synthesis parameters) extracted from an input sound. However, it is designed for a monophonic harmonic sound and cannot handle…

    Submitted 31 January, 2022; originally announced February 2022.

    Comments: 5 pages, 2 figures, to appear in 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2022)

  11. The Study about the Analysis of Responsiveness Pair Clustering to Social Network Bipartite Graph

    Authors: Akira Otsuki, Masayoshi Kawamura

    Abstract: In this study, regional (cities, towns and villages) data and tweet data are obtained from Twitter, and purchase information (where and what was bought) is extracted from the tweet data by morphological analysis and rule-based dependency analysis. Then, "the regional information" and "the information of purchase history (where and what was bought)" are captured as bipartite graph…

    Submitted 9 December, 2013; originally announced December 2013.

    Comments: 14 pages, 8 figures, 3 tables

    Journal ref: Advanced Computing: An International Journal (ACIJ), Vol.4, No.6, November 2013

  12. GV-Index: Scientific Contribution Rating Index That Takes into Account the Growth Degree of Research Area and Variance Values of the Publication Year of Cited Paper

    Authors: Akira Otsuki, Masayoshi Kawamura

    Abstract: There is a wide variety of scientific contribution rating indices, including the impact factor and h-index. These are used for quantitative analyses of research papers published in the past, and are therefore unable to incorporate into the assessment the growth, or deterioration, of the research area: whether the research area of a particular paper is in decline or conversely in a growing trend. Other h…

    Submitted 17 October, 2013; originally announced October 2013.

    Comments: 11 pages, 9 figures, 8 tables

    Journal ref: International Journal of Data Mining & Knowledge Management Process (IJDKP) Vol.3, No.5, September 2013

  13. arXiv:1209.4772  [pdf, ps, other]

    cond-mat.stat-mech cs.IT

    Statistical mechanical evaluation of spread spectrum watermarking model with image restoration

    Authors: Masaki Kawamura, Kao Hayashi, Tatsuya Uezu, Masato Okada

    Abstract: In cases in which an original image is blind, a decoding method where both the image and the messages can be estimated simultaneously is desirable. We propose a spread spectrum watermarking model with image restoration based on Bayes estimation. We therefore need to assume some prior probabilities. The probability for estimating the messages is given by the uniform distribution, and the ones for t…

    Submitted 26 June, 2019; v1 submitted 21 September, 2012; originally announced September 2012.

    Journal ref: Phys. Rev. E 99, 062132 (2019)