Showing 1–40 of 40 results for author: Hu, N

  1. arXiv:2404.07721  [pdf, other]

    eess.SP cs.IT

    Trainable Joint Channel Estimation, Detection and Decoding for MIMO URLLC Systems

    Authors: Yi Sun, Hong Shen, Bingqing Li, Wei Xu, Pengcheng Zhu, Nan Hu, Chunming Zhao

    Abstract: The receiver design for multi-input multi-output (MIMO) ultra-reliable and low-latency communication (URLLC) systems can be a tough task due to the use of short channel codes and few pilot symbols. Consequently, error propagation can occur in traditional turbo receivers, leading to performance degradation. Moreover, the processing delay induced by information exchange between different modules may…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 17 pages, 12 figures, accepted by IEEE Transactions on Wireless Communications

  2. arXiv:2403.19723  [pdf, other]

    cs.CL cs.AI cs.DB cs.MM

    HGT: Leveraging Heterogeneous Graph-enhanced Large Language Models for Few-shot Complex Table Understanding

    Authors: Rihui Jin, Yu Li, Guilin Qi, Nan Hu, Yuan-Fang Li, Jiaoyan Chen, Jianan Wang, Yongrui Chen, Dehai Min

    Abstract: Table understanding (TU) has achieved promising advancements, but it faces the challenges of the scarcity of manually labeled tables and the presence of complex table structures. To address these challenges, we propose HGT, a framework with a heterogeneous graph (HG)-enhanced large language model (LLM) to tackle few-shot TU tasks. It leverages the LLM by aligning the table semantics with the LLM's p…

    Submitted 27 March, 2024; originally announced March 2024.

  3. arXiv:2403.18802  [pdf, other]

    cs.CL cs.AI cs.LG

    Long-form factuality in large language models

    Authors: Jerry Wei, Chengrun Yang, Xinying Song, Yifeng Lu, Nathan Hu, Jie Huang, Dustin Tran, Daiyi Peng, Ruibo Liu, Da Huang, Cosmo Du, Quoc V. Le

    Abstract: Large language models (LLMs) often generate content that contains factual errors when responding to fact-seeking prompts on open-ended topics. To benchmark a model's long-form factuality in open domains, we first use GPT-4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for long-form factua…

    Submitted 3 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  4. arXiv:2403.18760  [pdf, other]

    cs.RO

    MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model

    Authors: Yike Wu, Jiatao Zhang, Nan Hu, LanLing Tang, Guilin Qi, Jun Shao, Jie Ren, Wei Song

    Abstract: In the realm of data-driven AI technology, the application of open-source large language models (LLMs) in robotic task planning represents a significant milestone. Recent robotic task planning methods based on open-source LLMs typically leverage vast task planning datasets to enhance models' planning abilities. While these methods show promise, they struggle with complex long-horizon tasks, which…

    Submitted 1 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

  5. arXiv:2402.14835  [pdf, other]

    cs.CL cs.AI cs.LG

    MIKE: A New Benchmark for Fine-grained Multimodal Entity Knowledge Editing

    Authors: Jiaqi Li, Miaozeng Du, Chuanyi Zhang, Yongrui Chen, Nan Hu, Guilin Qi, Haiyun Jiang, Siyuan Cheng, Bozhong Tian

    Abstract: Multimodal knowledge editing represents a critical advancement in enhancing the capabilities of Multimodal Large Language Models (MLLMs). Despite its potential, current benchmarks predominantly focus on coarse-grained knowledge, leaving the intricacies of fine-grained (FG) multimodal entity knowledge largely unexplored. This gap presents a notable challenge, as FG entity recognition is pivotal for…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 8 pages

  6. arXiv:2402.13179  [pdf, other]

    cs.LO math.CT

    homotopy.io: a proof assistant for finitely-presented globular $n$-categories

    Authors: Nathan Corbyn, Lukas Heidemann, Nick Hu, Chiara Sarti, Calin Tataru, Jamie Vicary

    Abstract: We present the proof assistant homotopy.io for working with finitely-presented semistrict higher categories. The tool runs in the browser with a point-and-click interface, allowing direct manipulation of proof objects via a graphical representation. We describe the user interface and explain how the tool can be used in practice. We also describe the essential subsystems of the tool, including coll…

    Submitted 20 February, 2024; originally announced February 2024.

  7. arXiv:2402.12869  [pdf, other]

    cs.CL

    Exploring the Impact of Table-to-Text Methods on Augmenting LLM-based Question Answering with Domain Hybrid Data

    Authors: Dehai Min, Nan Hu, Rihui Jin, Nuo Lin, Jiaoyan Chen, Yongrui Chen, Yu Li, Guilin Qi, Yun Li, Nijun Li, Qianren Wang

    Abstract: Augmenting Large Language Models (LLMs) for Question Answering (QA) with domain-specific data has attracted wide attention. However, domain data often exists in a hybrid format, including text and semi-structured tables, posing challenges for the seamless integration of information. Table-to-Text Generation is a promising solution by facilitating the transformation of hybrid data into a uniformly…

    Submitted 9 April, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

    Comments: Accepted to NAACL 2024 Industry Track Paper

  8. arXiv:2401.14640  [pdf, other]

    cs.CL

    Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs

    Authors: Nan Hu, Jiaoyan Chen, Yike Wu, Guilin Qi, Sheng Bi, Tongtong Wu, Jeff Z. Pan

    Abstract: The attribution of question answering is to provide citations for supporting generated statements, and has attracted wide research attention. The current methods for automatically evaluating the attribution, which are often based on Large Language Models (LLMs), are still inadequate, particularly in recognizing subtle differences between attributions, and complex relationships between citations an…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 13 pages, 5 figures

  9. arXiv:2312.12484  [pdf, other]

    cs.CR cs.DC cs.LG

    SkyMask: Attack-agnostic Robust Federated Learning with Fine-grained Learnable Masks

    Authors: Peishen Yan, Hao Wang, Tao Song, Yang Hua, Ruhui Ma, Ningxin Hu, Mohammad R. Haghighat, Haibing Guan

    Abstract: Federated Learning (FL) is becoming a popular paradigm for leveraging distributed data and preserving data privacy. However, due to the distributed characteristic, FL systems are vulnerable to Byzantine attacks in which compromised clients attack the global model by uploading malicious model updates. Most existing Byzantine-robust FL systems statistically analyze the weights of whole individual model…

    Submitted 19 December, 2023; originally announced December 2023.

  10. arXiv:2309.11206  [pdf, other]

    cs.CL cs.AI

    Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering

    Authors: Yike Wu, Nan Hu, Sheng Bi, Guilin Qi, Jie Ren, Anhuan Xie, Wei Song

    Abstract: Despite their competitive performance on knowledge-intensive tasks, large language models (LLMs) still have limitations in memorizing all world knowledge, especially long-tail knowledge. In this paper, we study the KG-augmented language model approach for solving the knowledge graph question answering (KGQA) task that requires rich world knowledge. Existing work has shown that retrieving KG knowled…

    Submitted 21 September, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  11. arXiv:2308.08446  [pdf, other]

    cs.IR cs.LG

    CSPM: A Contrastive Spatiotemporal Preference Model for CTR Prediction in On-Demand Food Delivery Services

    Authors: Guyu Jiang, Xiaoyun Li, Rongrong Jing, Ruoqi Zhao, Xingliang Ni, Guodong Cao, Ning Hu

    Abstract: Click-through rate (CTR) prediction is a crucial task in the context of an online on-demand food delivery (OFD) platform for precisely estimating the probability of a user clicking on food items. Unlike universal e-commerce platforms such as Taobao and Amazon, user behaviors and interests on the OFD platform are more location- and time-sensitive due to limited delivery ranges and regional commodity…

    Submitted 10 August, 2023; originally announced August 2023.

  12. arXiv:2308.04019  [pdf, other]

    cs.IR

    Exploring the Spatiotemporal Features of Online Food Recommendation Service

    Authors: Shaochuan Lin, Jiayan Pei, Taotao Zhou, Hengxu He, Jia Jia, Ning Hu

    Abstract: Online Food Recommendation Service (OFRS) has remarkable spatiotemporal characteristics and the advantage of being able to conveniently satisfy users' needs in a timely manner. There have been a variety of studies that have begun to explore its spatiotemporal properties, but a comprehensive and in-depth analysis of the OFRS spatiotemporal features is yet to be conducted. Therefore, this paper stud…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: accepted by SIGIR 2023

  13. arXiv:2308.04017  [pdf, other]

    cs.IR cs.AI

    Multi-Granularity Attention Model for Group Recommendation

    Authors: Jianye Ji, Jiayan Pei, Shaochuan Lin, Taotao Zhou, Hengxu He, Jia Jia, Ning Hu

    Abstract: Group recommendation provides personalized recommendations to a group of users based on their shared interests, preferences, and characteristics. Current studies have explored different methods for integrating individual preferences and making collective decisions that benefit the group as a whole. However, most of them heavily rely on users with rich behavior and ignore latent preferences of user…

    Submitted 7 August, 2023; originally announced August 2023.

  14. arXiv:2308.03855  [pdf, other]

    cs.IR cs.AI

    Mobile Supply: The Last Piece of Jigsaw of Recommender System

    Authors: Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jie Zhang, Jia Jia, Ning Hu

    Abstract: The recommendation system is a fundamental functionality of online platforms. With the development of the computing power of mobile phones, some researchers have deployed recommendation algorithms on users' mobile devices to address the problems of data transmission delay and pagination trigger mechanism. However, the existing edge-side mobile rankings cannot completely solve the problem of pagination tri…

    Submitted 8 August, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  15. arXiv:2307.09193  [pdf, other]

    cs.AI cs.IR

    ESMC: Entire Space Multi-Task Model for Post-Click Conversion Rate via Parameter Constraint

    Authors: Zhenhao Jiang, Biao Zeng, Hao Feng, Jin Liu, Jicong Fan, Jie Zhang, Jia Jia, Ning Hu, Xingyu Chen, Xuguang Lan

    Abstract: Large-scale online recommender systems are spread all over the Internet and are in charge of two basic tasks: Click-Through Rate (CTR) and Post-Click Conversion Rate (CVR) estimation. However, traditional CVR estimators suffer from the well-known Sample Selection Bias and Data Sparsity issues. Entire space models were proposed to address the two issues via tracing the decision-making path of "exposure_clic…

    Submitted 29 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  16. arXiv:2306.13518  [pdf, other]

    cs.CV cs.RO

    Segmentation and Tracking of Vegetable Plants by Exploiting Vegetable Shape Feature for Precision Spray of Agricultural Robots

    Authors: Nan Hu, Daobilige Su, Shuo Wang, Xuechang Wang, Huiyu Zhong, Zimeng Wang, Yongliang Qiao, Yu Tan

    Abstract: With the increasing deployment of agricultural robots, the traditional manual spray of liquid fertilizer and pesticide is gradually being replaced by agricultural robots. For robotic precision spray application in vegetable farms, accurate plant phenotyping through instance segmentation and robust plant tracking are of great importance and a prerequisite for the following spray action. Regarding t…

    Submitted 26 June, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

  17. arXiv:2305.15076  [pdf, other]

    cs.CL

    Meta-Learning Online Adaptation of Language Models

    Authors: Nathan Hu, Eric Mitchell, Christopher D. Manning, Chelsea Finn

    Abstract: Large language models encode impressively broad world knowledge in their parameters. However, the knowledge in static language models falls out of date, limiting the model's effective "shelf life." While online fine-tuning can reduce this degradation, we find that naively fine-tuning on a stream of documents leads to a low level of information uptake. We hypothesize that online fine-tuning does no…

    Submitted 20 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023 Camera Ready

  18. arXiv:2304.04718  [pdf, other]

    cs.CL

    Investigating Graph Structure Information for Entity Alignment with Dangling Cases

    Authors: Jin Xu, Yangning Li, Xiangjin Xie, Yinghui Li, Niu Hu, Haitao Zheng, Yong Jiang

    Abstract: Entity alignment (EA) aims to discover the equivalent entities in different knowledge graphs (KGs), which plays an important role in knowledge engineering. Recently, EA with dangling entities has been proposed as a more realistic setting, which assumes that not all entities have corresponding equivalent entities. In this paper, we focus on this setting. Some work has explored this problem by levera…

    Submitted 10 April, 2023; originally announced April 2023.

  19. arXiv:2303.10368  [pdf, other]

    cs.CL

    An Empirical Study of Pre-trained Language Models in Simple Knowledge Graph Question Answering

    Authors: Nan Hu, Yike Wu, Guilin Qi, Dehai Min, Jiaoyan Chen, Jeff Z. Pan, Zafar Ali

    Abstract: Large-scale pre-trained language models (PLMs) such as BERT have recently achieved great success and become a milestone in natural language processing (NLP). It is now the consensus of the NLP community to adopt PLMs as the backbone for downstream tasks. In recent works on knowledge graph question answering (KGQA), BERT or its variants have become necessary in their KGQA models. However, there is…

    Submitted 18 March, 2023; originally announced March 2023.

    Comments: Accepted by World Wide Web Journal

  20. arXiv:2303.07992  [pdf, other]

    cs.CL

    Can ChatGPT Replace Traditional KBQA Models? An In-depth Analysis of the Question Answering Performance of the GPT LLM Family

    Authors: Yiming Tan, Dehai Min, Yu Li, Wenbo Li, Nan Hu, Yongrui Chen, Guilin Qi

    Abstract: ChatGPT is a powerful large language model (LLM) that covers knowledge resources such as Wikipedia and supports natural language question answering using its own knowledge. Therefore, there is growing interest in exploring whether ChatGPT can replace traditional knowledge-based question answering (KBQA) models. Although there have been some works analyzing the question answering performance of Cha…

    Submitted 20 September, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: To be published in Proceedings of ISWC 2023, 22nd International Semantic Web Conference

  21. arXiv:2211.12033  [pdf, other]

    cs.LG

    BASM: A Bottom-up Adaptive Spatiotemporal Model for Online Food Ordering Service

    Authors: Boya Du, Shaochuan Lin, Jiong Gao, Xiyu Ji, Mengya Wang, Taotao Zhou, Hengxu He, Jia Jia, Ning Hu

    Abstract: Online Food Ordering Service (OFOS) is a popular location-based service that helps people order what they want. Compared with traditional e-commerce recommendation systems, users' interests may be diverse under different spatiotemporal contexts, leading to various spatiotemporal data distributions, which limits the fitting capacity of the model. However, numerous current works simply mix all samp…

    Submitted 22 November, 2022; originally announced November 2022.

  22. arXiv:2209.09427  [pdf, other]

    cs.IR

    Spatiotemporal-Enhanced Network for Click-Through Rate Prediction in Location-based Services

    Authors: Shaochuan Lin, Yicong Yu, Xiyu Ji, Taotao Zhou, Hengxu He, Zisen Sang, Jia Jia, Guodong Cao, Ning Hu

    Abstract: In Location-Based Services (LBS), user behavior naturally has a strong dependence on the spatiotemporal information, i.e., in different geographical locations and at different times, user click behavior will change significantly. Appropriate spatiotemporal enhancement modeling of user click behavior and large-scale sparse attributes is key to building an LBS model. Although most existing methods…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: accepted by CIKM workshop 2022

  23. arXiv:2207.00756  [pdf, other]

    cs.SD eess.AS

    Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers

    Authors: Liumeng Xue, Shan Yang, Na Hu, Dan Su, Lei Xie

    Abstract: Building a voice conversion system for noisy target speakers, such as users providing noisy samples or Internet-found data, is a challenging task since the use of contaminated speech in model training will apparently degrade the conversion performance. In this paper, we leverage the advances of our recently proposed Glow-WaveGAN and propose a noise-independent speech representation learning approa…

    Submitted 2 July, 2022; originally announced July 2022.

    Comments: Accepted by INTERSPEECH 2022

  24. arXiv:2205.00692  [pdf, other]

    cs.NI

    Energy-efficient Caching and Task offloading for Timely Status Updates in UAV-assisted VANETs

    Authors: Nan Hu, Xiaoqi Qin, Nan Ma, Yiming Liu, Yuanyuan Yao, Ping Zhang

    Abstract: Intelligent edge networks are maturing to enable smart and efficient transportation systems. In this letter, we consider unmanned aerial vehicle (UAV)-assisted vehicular networks where UAVs provide caching and computing services to complement the base station (BS). One major challenge is that vehicles need to obtain timely situational awareness via orchestration of ubiquitous caching and computing…

    Submitted 4 May, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

  25. arXiv:2202.13352  [pdf, other]

    cs.CL cs.AI

    HiCLRE: A Hierarchical Contrastive Learning Framework for Distantly Supervised Relation Extraction

    Authors: Dongyang Li, Taolin Zhang, Nan Hu, Chengyu Wang, Xiaofeng He

    Abstract: Distant supervision assumes that any sentence containing the same entity pairs reflects identical relationships. Previous work on the distantly supervised relation extraction (DSRE) task generally focuses on sentence-level or bag-level de-noising techniques independently, neglecting the explicit interaction across levels. In this paper, we propose a hierarchical contrastive learning Framework for D…

    Submitted 27 February, 2022; originally announced February 2022.

  26. arXiv:2112.01047  [pdf, other]

    cs.CL

    DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding

    Authors: Taolin Zhang, Chengyu Wang, Nan Hu, Minghui Qiu, Chengguang Tang, Xiaofeng He, Jun Huang

    Abstract: Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples injected from knowledge graphs to improve language understanding abilities. To guarantee effective knowledge injection, previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs. The operations for knowledge retrieval and encoding bring si…

    Submitted 15 October, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: Accepted by AAAI22

  27. arXiv:2107.07741  [pdf, other]

    cs.LG

    When does loss-based prioritization fail?

    Authors: Niel Teng Hu, Xinyu Hu, Rosanne Liu, Sara Hooker, Jason Yosinski

    Abstract: Not all examples are created equal, but standard deep neural network training protocols treat each training point uniformly. Each example is propagated forward and backward through the network the same number of times, independent of how much the example contributes to the learning protocol. Recent work has proposed ways to accelerate training by deviating from this uniform treatment. Popular meth…

    Submitted 16 July, 2021; originally announced July 2021.

  28. arXiv:2106.10828  [pdf, other]

    eess.AS cs.SD

    Controllable Context-aware Conversational Speech Synthesis

    Authors: Jian Cong, Shan Yang, Na Hu, Guangzhi Li, Lei Xie, Dan Su

    Abstract: In spoken conversations, spontaneous behaviors like filled pauses and prolongations always happen. Conversational partners tend to align features of their speech with those of their interlocutor, which is known as entrainment. To produce human-like conversations, we propose a unified controllable spontaneous conversational speech synthesis framework to model the above two phenomena. Specifically, we use expl…

    Submitted 20 June, 2021; originally announced June 2021.

    Comments: Accepted to INTERSPEECH 2021

  29. Immersive Operation of a Semi-Autonomous Aerial Platform for Detecting and Mapping Radiation

    Authors: P. Dayani, N. Orr, A. Thomopoulos, V. Saran, S. Krishnaswamy, E. Zhang, N. Hu, D. McPherson, J. Menke, A. Yang, K. Vetter

    Abstract: Recent advancements in radiation detection and computer vision have enabled small unmanned aerial systems (sUASs) to produce 3D nuclear radiation maps in real-time. Currently these state-of-the-art systems still require two operators: one to pilot the sUAS and another operator to monitor the detected radiation. In this work we present a system that integrates real-time 3D radiation visualization w…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: 3 pages, 2 figures. The first three authors contributed equally. Accepted to the 2020 IEEE Nuclear Science Symposium & Medical Imaging Conference

  30. arXiv:2102.06431  [pdf, other]

    cs.SD cs.CL eess.AS

    VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention

    Authors: Peng Liu, Yuewen Cao, Songxiang Liu, Na Hu, Guangzhi Li, Chao Weng, Dan Su

    Abstract: This paper proposes VARA-TTS, a non-autoregressive (non-AR) text-to-speech (TTS) model using a very deep Variational Autoencoder (VDVAE) with Residual Attention mechanism, which refines the textual-to-acoustic alignment layer-wise. Hierarchical latent variables with different temporal resolutions from the VDVAE are used as queries for the residual attention module. By leveraging the coarse global al…

    Submitted 12 February, 2021; originally announced February 2021.

  31. arXiv:2012.09502  [pdf, ps, other]

    cs.DS math.CO math.PR

    Sampling Arborescences in Parallel

    Authors: Nima Anari, Nathan Hu, Amin Saberi, Aaron Schild

    Abstract: We study the problem of sampling a uniformly random directed rooted spanning tree, also known as an arborescence, from a possibly weighted directed graph. Classically, this problem has long been known to be polynomial-time solvable; the exact number of arborescences can be computed by a determinant [Tut48], and sampling can be reduced to counting [JVV86, JS96]. However, the classic reduction from…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: To appear in ITCS 2021

  32. arXiv:2012.01837  [pdf, other]

    cs.SD cs.AI cs.MM eess.AS

    Phonetic Posteriorgrams based Many-to-Many Singing Voice Conversion via Adversarial Training

    Authors: Haohan Guo, Heng Lu, Na Hu, Chunlei Zhang, Shan Yang, Lei Xie, Dan Su, Dong Yu

    Abstract: This paper describes an end-to-end adversarial singing voice conversion (EA-SVC) approach. It can directly generate an arbitrary singing waveform given a phonetic posteriorgram (PPG) representing content, F0 representing pitch, and a speaker embedding representing timbre. The proposed system is composed of three modules: generator $G$, the audio generation discriminator $D_{A}$, and the feat…

    Submitted 3 December, 2020; originally announced December 2020.

  33. arXiv:2009.02878  [pdf, other]

    cs.CV

    Benchmarking off-the-shelf statistical shape modeling tools in clinical applications

    Authors: Anupama Goparaju, Alexandre Bone, Nan Hu, Heath B. Henninger, Andrew E. Anderson, Stanley Durrleman, Matthijs Jacxsens, Alan Morris, Ibolya Csecs, Nassir Marrouche, Shireen Y. Elhabian

    Abstract: Statistical shape modeling (SSM) is widely used in biology and medicine as a new generation of morphometric approaches for the quantitative analysis of anatomical shapes. Technological advancements of in vivo imaging have led to the development of open-source computational tools that automate the modeling of anatomical shapes and their population-level variability. However, little work has been do…

    Submitted 6 September, 2020; originally announced September 2020.

    Comments: 22 pages, 22 figures

  34. arXiv:1909.01700  [pdf, other]

    cs.CL cs.CV cs.SD eess.AS

    DurIAN: Duration Informed Attention Network For Multimodal Synthesis

    Authors: Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu

    Abstract: In this paper, we present a generic and robust multimodal synthesis system that produces highly natural speech and facial expression simultaneously. The key component of this system is the Duration Informed Attention Network (DurIAN), an autoregressive model in which the alignments between the input text and the output acoustic features are inferred from a duration model. This is different from th…

    Submitted 5 September, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

  35. arXiv:1907.09634  [pdf, other]

    cs.LO

    Codensity Games for Bisimilarity

    Authors: Yuichi Komorida, Shin-ya Katsumata, Nick Hu, Bartek Klin, Ichiro Hasuo

    Abstract: Bisimilarity as an equivalence notion of systems has been central to process theory. Due to the recent rise of interest in quantitative systems (probabilistic, weighted, hybrid, etc.), bisimilarity has been extended in various ways: notably, bisimulation metric between probabilistic systems. An important feature of bisimilarity is its game-theoretic characterization, where Spoiler and Duplicator p…

    Submitted 22 July, 2019; originally announced July 2019.

    Comments: 13 pages + 3 page appendix, to appear in Proceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2019)

  36. arXiv:1804.03715  [pdf, other]

    cs.CV

    Graph Matching with Anchor Nodes: A Learning Approach

    Authors: Nan Hu, Raif M. Rustamov, Leonidas Guibas

    Abstract: In this paper, we consider the weighted graph matching problem with partially disclosed correspondences between a number of anchor nodes. Our construction exploits recently introduced node signatures based on graph Laplacians, namely the Laplacian family signature (LFS) on the nodes, and the pairwise heat kernel map on the edges. In this paper, without assuming an explicit form of parametric depen…

    Submitted 10 April, 2018; originally announced April 2018.

    Comments: final version for CVPR2013

  37. arXiv:1611.07191  [pdf, other]

    cs.DS cs.CV

    Distributable Consistent Multi-Object Matching

    Authors: Nan Hu, Qixing Huang, Boris Thibert, Leonidas Guibas

    Abstract: In this paper we propose an optimization-based framework for multiple object matching. The framework takes maps computed between pairs of objects as input, and outputs maps that are consistent among all pairs of objects. The central idea of our approach is to divide the input object collection into overlapping sub-collections and enforce map consistency among each sub-collection. This leads to a di…

    Submitted 10 April, 2018; v1 submitted 22 November, 2016; originally announced November 2016.

    Comments: Final version for CVPR2018

  38. Analytical reconstruction of isotropic turbulence spectra based on the Gaussian transform

    Authors: Attila Wohlbrandt, Nan Hu, Sebastien Guerin, Roland Ewert

    Abstract: The Random Particle Mesh (RPM) method used to simulate turbulence-induced broadband noise in several aeroacoustic applications is extended to realise isotropic turbulence spectra. With this method turbulent fluctuations are synthesised by filtering white noise with a Gaussian filter kernel that in turn gives a Gaussian spectrum. The Gaussian function is smooth and its derivatives and integrals are…

    Submitted 19 November, 2015; v1 submitted 29 June, 2015; originally announced June 2015.

    Comments: Preprint, submitted to Computers & Fluids

  39. arXiv:1503.01820  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Latent Hierarchical Model for Activity Recognition

    Authors: Ninghang Hu, Gwenn Englebienne, Zhongyu Lou, Ben Kröse

    Abstract: We present a novel hierarchical model for human activity recognition. In contrast to approaches that successively recognize actions and activities, our approach jointly models actions and activities in a unified framework, and their labels are simultaneously predicted. The model is embedded with a latent layer that is able to capture a richer class of contextual information in both state-state and…

    Submitted 5 March, 2015; originally announced March 2015.

  40. arXiv:1304.1572  [pdf, other]

    cs.CV

    Stable and Informative Spectral Signatures for Graph Matching

    Authors: Nan Hu, Raif M. Rustamov, Leonidas Guibas

    Abstract: In this paper, we consider the approximate weighted graph matching problem and introduce stable and informative first and second order compatibility terms suitable for inclusion into the popular integer quadratic program formulation. Our approach relies on a rigorous analysis of stability of spectral signatures based on the graph Laplacian. In the case of the first order term, we derive an objecti…

    Submitted 10 April, 2018; v1 submitted 4 April, 2013; originally announced April 2013.

    Comments: final version for CVPR2014