
Showing 1–50 of 88 results for author: Dai, M

  1. arXiv:2405.18797  [pdf, other]

    cs.NI

    User Association and Channel Allocation in 5G Mobile Asymmetric Multi-band Heterogeneous Networks

    Authors: Miao Dai, Gang Sun, Hongfang Yu, Sheng Wang, Dusit Niyato

    Abstract: With the proliferation of mobile terminals and the continuous upgrading of services, 4G LTE networks are showing signs of weakness. To enhance the capacity of wireless networks, millimeter waves are introduced to drive the evolution of networks towards multi-band 5G heterogeneous networks. The distinct propagation characteristics of mmWaves and microwaves, as well as the vastly different hardware…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 17 pages, 5 figures

  2. arXiv:2404.11816  [pdf, other]

    cs.LG

    Tailoring Generative Adversarial Networks for Smooth Airfoil Design

    Authors: Joyjit Chattoraj, Jian Cheng Wong, Zhang Zexuan, Manna Dai, Xia Yingzhi, Li Jichao, Xu Xinxing, Ooi Chin Chun, Yang Feng, Dao My Ha, Liu Yong

    Abstract: In the realm of aerospace design, achieving smooth curves is paramount, particularly when crafting objects such as airfoils. Generative Adversarial Network (GAN), a widely employed generative AI technique, has proven instrumental in synthesizing airfoil designs. However, a common limitation of GAN is the inherent lack of smoothness in the generated airfoil surfaces. To address this issue, we prese…

    Submitted 17 April, 2024; originally announced April 2024.

  3. arXiv:2404.07503  [pdf, other]

    cs.CL

    Best Practices and Lessons Learned on Synthetic Data for Language Models

    Authors: Ruibo Liu, Jerry Wei, Fangyu Liu, Chenglei Si, Yanzhe Zhang, Jinmeng Rao, Steven Zheng, Daiyi Peng, Diyi Yang, Denny Zhou, Andrew M. Dai

    Abstract: The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challeng…

    Submitted 11 April, 2024; originally announced April 2024.

  4. arXiv:2404.04815  [pdf, other]

    cs.PL cs.AR cs.LG

    Allo: A Programming Model for Composable Accelerator Design

    Authors: Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang

    Abstract: Special-purpose hardware accelerators are increasingly pivotal for sustaining performance improvements in emerging applications, especially as the benefits of technology scaling continue to diminish. However, designers currently lack effective tools and methodologies to construct complex, high-performance accelerator architectures in a productive manner. Existing high-level synthesis (HLS) tools o…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted to PLDI'24

  5. arXiv:2401.15118  [pdf]

    cs.CV cs.AI

    GeoDecoder: Empowering Multimodal Map Understanding

    Authors: Feng Qi, Mian Dai, Zixian Zheng, Chao Wang

    Abstract: This paper presents GeoDecoder, a dedicated multimodal model designed for processing geospatial information in maps. Built on the BeitGPT architecture, GeoDecoder incorporates specialized expert modules for image and text processing. On the image side, GeoDecoder utilizes GaoDe Amap as the underlying base map, which inherently encompasses essential details about road and building shapes, relative…

    Submitted 18 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

  6. arXiv:2401.02961  [pdf, other]

    cs.LG cs.CV eess.IV physics.optics

    A Surrogate-Assisted Extended Generative Adversarial Network for Parameter Optimization in Free-Form Metasurface Design

    Authors: Manna Dai, Yang Jiang, Feng Yang, Joyjit Chattoraj, Yingzhi Xia, Xinxing Xu, Weijiang Zhao, My Ha Dao, Yong Liu

    Abstract: Metasurfaces have widespread applications in fifth-generation (5G) microwave communication. Among the metasurface family, free-form metasurfaces excel in achieving intricate spectral responses compared to regular-shape counterparts. However, conventional numerical methods for free-form metasurfaces are time-consuming and demand specialized expertise. Alternatively, recent studies demonstrate that…

    Submitted 18 October, 2023; originally announced January 2024.

  7. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  8. arXiv:2312.11797  [pdf, other]

    q-fin.PM cs.LG q-fin.CP

    Learning Merton's Strategies in an Incomplete Market: Recursive Entropy Regularization and Biased Gaussian Exploration

    Authors: Min Dai, Yuchao Dong, Yanwei Jia, Xun Yu Zhou

    Abstract: We study Merton's expected utility maximization problem in an incomplete market, characterized by a factor process in addition to the stock price process, where all the model primitives are unknown. We take the reinforcement learning (RL) approach to learn optimal portfolio policies directly by exploring the unknown market, without attempting to estimate the model parameters. Based on the entropy-…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: 43 pages, 5 figures, 3 tables

  9. arXiv:2312.06134  [pdf, other]

    cs.CL cs.LG

    Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

    Authors: Dami Choi, Derrick Xin, Hamid Dadkhahi, Justin Gilmer, Ankush Garg, Orhan Firat, Chih-Kuan Yeh, Andrew M. Dai, Behrooz Ghorbani

    Abstract: In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high/low-resource tasks. We provide a thorough empirical study and analysis of this method's be…

    Submitted 11 December, 2023; originally announced December 2023.

  10. arXiv:2311.12386  [pdf, other]

    cs.CV

    Point, Segment and Count: A Generalized Framework for Object Counting

    Authors: Zhizhong Huang, Mingliang Dai, Yi Zhang, Junping Zhang, Hongming Shan

    Abstract: Class-agnostic object counting aims to count all objects in an image with respect to example boxes or class names, a.k.a. few-shot and zero-shot counting. In this paper, we propose a generalized framework for both few-shot and zero-shot object counting based on detection. Our framework combines the superior advantages of two foundation models without compromising their zero-shot capability:…

    Submitted 27 March, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Accepted by CVPR 2024. Camera ready

  11. arXiv:2310.03179  [pdf, other]

    cs.RO

    Multi-Domain Walking with Reduced-Order Models of Locomotion

    Authors: Min Dai, Jaemin Lee, Aaron D. Ames

    Abstract: Drawing inspiration from human multi-domain walking, this work presents a novel reduced-order model based framework for realizing multi-domain robotic walking. At the core of our approach is the viewpoint that human walking can be represented by a hybrid dynamical system, with continuous phases that are fully-actuated, under-actuated, and over-actuated and discrete changes in actuation type occurr…

    Submitted 4 October, 2023; originally announced October 2023.

    Comments: submitted to ACC 2024

  12. arXiv:2308.06457  [pdf, other]

    cs.CV cs.CL

    Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation

    Authors: Zhichao Wang, Mengyu Dai, Keld Lundgaard

    Abstract: The advent of ChatGPT has introduced innovative methods for information gathering and analysis. However, the information provided by ChatGPT is limited to text, and the visualization of this information remains constrained. Previous research has explored zero-shot text-to-video (TTV) approaches to transform text into videos. However, these methods lacked control over the identity of the generated…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: 6 pages

  13. arXiv:2308.04498  [pdf, other]

    cs.CL

    DialogRE^C+: An Extension of DialogRE to Investigate How Much Coreference Helps Relation Extraction in Dialogs

    Authors: Yiyun Xiong, Mengwei Dai, Fei Li, Hao Fei, Bobo Li, Shengqiong Wu, Donghong Ji, Chong Teng

    Abstract: Dialogue relation extraction (DRE) that identifies the relations between argument pairs in dialogue text, suffers much from the frequent occurrence of personal pronouns, or entity and speaker coreference. This work introduces a new benchmark dataset DialogRE^C+, introducing coreference resolution into the DRE scenario. With the aid of high-quality coreference knowledge, the reasoning of argument r…

    Submitted 12 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

    Comments: Accepted by NLPCC 2023

  14. arXiv:2306.17207  [pdf, other]

    cs.CV eess.IV

    A Fast Fourier Convolutional Deep Neural Network For Accurate and Explainable Discrimination Of Wheat Yellow Rust And Nitrogen Deficiency From Sentinel-2 Time-Series Data

    Authors: Yue Shi, Liangxiu Han, Pablo González-Moreno, Darren Dancey, Wenjiang Huang, Zhiqiang Zhang, Yuanyuan Liu, Mengning Huan, Hong Miao, Min Dai

    Abstract: Accurate and timely detection of plant stress is essential for yield protection, allowing better-targeted intervention strategies. Recent advances in remote sensing and deep learning have shown great potential for rapid non-invasive detection of plant stress in a fully automated and reproducible manner. However, the existing models always face several challenges: 1) computational inefficiency and…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 24 pages

  15. arXiv:2305.16960  [pdf, ps, other]

    cs.CL cs.AI cs.CY cs.HC

    Training Socially Aligned Language Models on Simulated Social Interactions

    Authors: Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou, Andrew M. Dai, Diyi Yang, Soroush Vosoughi

    Abstract: Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current language models (LMs) are trained to rigidly replicate their training corpus in isolation, leading to subpar generalization in unfamiliar scenarios and vulnerability to adversarial attack…

    Submitted 28 October, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Code, data, and models can be downloaded via https://github.com/agi-templar/Stable-Alignment

  16. arXiv:2305.10403  [pdf, other]

    cs.CL cs.AI

    PaLM 2 Technical Report

    Authors: Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, et al. (103 additional authors not shown)

    Abstract: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on…

    Submitted 13 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  17. arXiv:2304.10057  [pdf, other]

    cs.NI

    Maximize the Long-term Average Revenue of Network Slice Provider via Admission Control Among Heterogeneous Slices

    Authors: Miao Dai, Gang Sun, Hongfang Yu, Dusit Niyato

    Abstract: Network slicing endows 5G/B5G with differentiated and customized capabilities to cope with the proliferation of diversified services, whereas limited physical network resources may not be able to support all service requests. Slice admission control is regarded as an essential means to ensure service quality and service isolation when the network is under burden. Herein, the scenario where rationa…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: 16 pages, 6 figures

  18. Cross-head Supervision for Crowd Counting with Noisy Annotations

    Authors: Mingliang Dai, Zhizhong Huang, Jiaqi Gao, Hongming Shan, Junping Zhang

    Abstract: Noisy annotations such as missing annotations and location shifts often exist in crowd counting datasets due to multi-scale head sizes, high occlusion, etc. These noisy annotations severely affect the model training, especially for density map-based methods. To alleviate the negative impact of noisy annotations, we propose a novel crowd counting model with one convolution head and one transformer…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: accepted by ICASSP 2023

    Journal ref: IEEE ICASSP 2023

  19. arXiv:2303.08562  [pdf, other]

    cs.CV

    MGA: Medical generalist agent through text-guided knowledge transformation

    Authors: Weijian Huang, Hao Yang, Cheng Li, Mingtong Dai, Rui Yang, Shanshan Wang

    Abstract: Multi-modal representation methods have achieved advanced performance in medical applications by extracting more robust features from multi-domain data. However, existing methods usually need to train additional branches for downstream tasks, which may increase the model complexities in clinical applications as well as introduce additional human inductive bias. Besides, very few studies exploit th…

    Submitted 15 March, 2023; originally announced March 2023.

  20. arXiv:2303.08250  [pdf, other]

    cs.CV cs.LG

    Transforming Transformers for Resilient Lifelong Learning

    Authors: Chinmay Savadikar, Michelle Dai, Tianfu Wu

    Abstract: Lifelong learning without catastrophic forgetting (i.e., resiliency) remains an open problem for deep neural networks. The prior art mostly focuses on convolutional neural networks. With the increasing dominance of Transformers in deep learning, it is a pressing need to study lifelong learning with Transformers. Due to the complexity of training Transformers in practice, for lifelong learning, a q…

    Submitted 3 October, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

  21. arXiv:2303.08103  [pdf, other]

    cs.LG cs.CE

    Multi-task Meta Label Correction for Time Series Prediction

    Authors: Luxuan Yang, Ting Gao, Wei Wei, Min Dai, Cheng Fang, Jinqiao Duan

    Abstract: Time series classification faces two unavoidable problems. One is partial feature information and the other is poor label quality, which may affect model performance. To address the above issues, we create a label correction method to time series data with meta-learning under a multi-task framework. There are three main contributions. First, we train the label correction model with a two-branch ne…

    Submitted 18 February, 2024; v1 submitted 9 March, 2023; originally announced March 2023.

  22. arXiv:2302.08917  [pdf, other]

    cs.CL cs.LG

    Massively Multilingual Shallow Fusion with Large Language Models

    Authors: Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman

    Abstract: While large language models (LLM) have made impressive progress in natural language processing, it remains unclear how to utilize them in improving automatic speech recognition (ASR). In this work, we propose to train a single multilingual language model (LM) for shallow fusion in multiple languages. We push the limits of the multilingual LM to cover up to 84 languages by scaling up using a mixtur…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: Accepted to IEEE ICASSP 2023

  23. A Survey on Digital Twins: Architecture, Enabling Technologies, Security and Privacy, and Future Prospects

    Authors: Yuntao Wang, Zhou Su, Shaolong Guo, Minghui Dai, Tom H. Luan, Yiliang Liu

    Abstract: By interacting, synchronizing, and cooperating with its physical counterpart in real time, digital twin is promised to promote an intelligent, predictive, and optimized modern city. Via interconnecting massive physical entities and their virtual twins with inter-twin and intra-twin communications, the Internet of digital twins (IoDT) enables free data exchange, dynamic mission cooperation, and eff…

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: 21 pages, 7 figures

  24. arXiv:2211.01772  [pdf, other]

    eess.SY cs.GT

    Collaborative Honeypot Defense in UAV Networks: A Learning-Based Game Approach

    Authors: Yuntao Wang, Zhou Su, Abderrahim Benslimane, Qichao Xu, Minghui Dai, Ruidong Li

    Abstract: The proliferation of unmanned aerial vehicles (UAVs) opens up new opportunities for on-demand service provisioning anywhere and anytime, but also exposes UAVs to a variety of cyber threats. Low/medium interaction honeypots offer a promising lightweight defense for actively protecting mobile Internet of things, particularly UAV networks. While previous research has primarily focused on honeypot sys…

    Submitted 29 August, 2023; v1 submitted 28 October, 2022; originally announced November 2022.

    Comments: Accepted Aug. 28, 2023 by IEEE Transactions on Information Forensics & Security. arXiv admin note: text overlap with arXiv:2209.13815

  25. arXiv:2210.05359  [pdf, other]

    cs.CL cs.AI

    Mind's Eye: Grounded Language Model Reasoning through Simulation

    Authors: Ruibo Liu, Jason Wei, Shixiang Shane Gu, Te-Yen Wu, Soroush Vosoughi, Claire Cui, Denny Zhou, Andrew M. Dai

    Abstract: Successful and effective communication between humans and AI relies on a shared experience of the world. By training solely on written text, current language models (LMs) miss the grounded experience of humans in the real-world -- their failure to relate language to the physical world causes knowledge to be misrepresented and obvious mistakes in their reasoning. We present Mind's Eye, a paradigm t…

    Submitted 11 October, 2022; originally announced October 2022.

  26. arXiv:2209.13815  [pdf, other]

    cs.GT

    A Learning-based Honeypot Game for Collaborative Defense in UAV Networks

    Authors: Yuntao Wang, Zhou Su, Abderrahim Benslimane, Qichao Xu, Minghui Dai, Ruidong Li

    Abstract: The proliferation of unmanned aerial vehicles (UAVs) opens up new opportunities for on-demand service provisioning anywhere and anytime, but it also exposes UAVs to various cyber threats. Low/medium-interaction honeypot is regarded as a promising lightweight defense to actively protect mobile Internet of things, especially UAV networks. Existing works primarily focused on honeypot design and attac…

    Submitted 27 September, 2022; originally announced September 2022.

    Comments: Accepted by IEEE Globecom2022

  27. arXiv:2209.08458  [pdf, other]

    cs.RO eess.SY

    Data-driven Adaptation for Robust Bipedal Locomotion with Step-to-Step Dynamics

    Authors: Min Dai, Xiaobin Xiong, Jaemin Lee, Aaron D. Ames

    Abstract: This paper presents an online framework for synthesizing agile locomotion for bipedal robots that adapts to unknown environments, modeling errors, and external disturbances. To this end, we leverage step-to-step (S2S) dynamics which has proven effective in realizing dynamic walking on underactuated robots -- assuming known dynamics and environments. This paper considers the case of uncertain model…

    Submitted 4 August, 2023; v1 submitted 17 September, 2022; originally announced September 2022.

  28. arXiv:2208.06561  [pdf, other]

    cs.CV

    Finding Point with Image: A Simple and Efficient Method for UAV Self-Localization

    Authors: Ming Dai, Enhui Zheng, Zhenhua Feng, Jiahao Chen, Wankou Yang

    Abstract: Image retrieval has emerged as a prominent solution for the self-localization task of unmanned aerial vehicles (UAVs). However, this approach involves complicated pre-processing and post-processing operations, placing significant demands on both computational and storage resources. To mitigate this issue, this paper presents an end-to-end positioning framework, namely Finding Point with Image (FPI…

    Submitted 5 December, 2023; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: 15 pages, 14 figures

  29. arXiv:2206.13963  [pdf, other]

    cs.CV

    Primitive Graph Learning for Unified Vector Mapping

    Authors: Lei Wang, Min Dai, Jianan He, Jingwei Huang, Mingwei Sun

    Abstract: Large-scale vector mapping is important for transportation, city planning, and survey and census. We propose GraphMapper, a unified framework for end-to-end vector map extraction from satellite images. Our key idea is a novel unified representation of shapes of different topologies named "primitive graph", which is a set of shape primitives and their pairwise relationship matrix. Then, we convert…

    Submitted 10 November, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

  30. arXiv:2205.04151  [pdf, other]

    stat.ML cs.LG

    Learning effective dynamics from data-driven stochastic systems

    Authors: Lingyu Feng, Ting Gao, Min Dai, Jinqiao Duan

    Abstract: Multiscale stochastic dynamical systems have been widely adopted to a variety of scientific and engineering problems due to their capability of depicting complex phenomena in many real world applications. This work is devoted to investigating the effective dynamics for slow-fast stochastic dynamical systems. Given observation data on a short-term period satisfying some unknown slow-fast stochastic…

    Submitted 29 December, 2023; v1 submitted 9 May, 2022; originally announced May 2022.

  31. arXiv:2204.02311  [pdf, other]

    cs.CL

    PaLM: Scaling Language Modeling with Pathways

    Authors: Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, et al. (42 additional authors not shown)

    Abstract: Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Tran…

    Submitted 5 October, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

  32. arXiv:2202.05447  [pdf, other]

    cs.NI cs.PF

    PSACCF: Prioritized Online Slice Admission Control Considering Fairness in 5G/B5G Networks

    Authors: Miao Dai, Long Luo, Jing Ren, Hongfang Yu, Gang Sun

    Abstract: 5G/B5G is envisioned to support various services with the assistance of network slices, each slice instance asks for adequate resources to provide the pre-negotiated service quality to its subscribers. Slice Admission Control (SAC) algorithm is a necessity for Slice Providers (SPs) to guarantee the QoS and QoE of each admitted request with limited resources. In that circumstance, the priority conc…

    Submitted 12 April, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

  33. A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization

    Authors: Ming Dai, Jianhong Hu, Jiedong Zhuang, Enhui Zheng

    Abstract: Cross-view geo-localization is a task of matching the same geographic image from different views, e.g., unmanned aerial vehicle (UAV) and satellite. The most difficult challenges are the position shift and the uncertainty of distance and scale. Existing methods are mainly aimed at digging for more comprehensive fine-grained information. However, it underestimates the importance of extracting robus…

    Submitted 23 January, 2022; originally announced January 2022.

    Comments: 14 pages, 13 figures, IEEE Transactions on Circuits and Systems for Video Technology

  34. arXiv:2201.09201  [pdf, other]

    cs.CV cs.AI

    Vision-Based UAV Self-Positioning in Low-Altitude Urban Environments

    Authors: Ming Dai, Enhui Zheng, Zhenhua Feng, Jiedong Zhuang, Wankou Yang

    Abstract: Unmanned Aerial Vehicles (UAVs) rely on satellite systems for stable positioning. However, due to limited satellite coverage or communication disruptions, UAVs may lose signals from satellite-based positioning systems. In such situations, vision-based techniques can serve as an alternative, ensuring the self-positioning capability of UAVs. However, most of the existing datasets are developed for t…

    Submitted 10 August, 2023; v1 submitted 23 January, 2022; originally announced January 2022.

    Comments: 13 pages,8 figures

  35. arXiv:2112.07175  [pdf, other]

    cs.CV

    Co-training Transformer with Videos and Images Improves Action Recognition

    Authors: Bowen Zhang, Jiahui Yu, Christopher Fifty, Wei Han, Andrew M. Dai, Ruoming Pang, Fei Sha

    Abstract: In learning action recognition, models are typically pre-trained on object recognition with images, such as ImageNet, and later fine-tuned on target action recognition with videos. This approach has achieved good empirical performance especially with recent transformer-based video architectures. While recently many works aim to design more advanced transformer architectures for action recognition,…

    Submitted 14 December, 2021; originally announced December 2021.

  36. arXiv:2112.06905  [pdf, other]

    cs.CL

    GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

    Authors: Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V Le, Yonghui Wu, et al. (2 additional authors not shown)

    Abstract: Scaling language models with more data, compute and parameters has driven significant progress in natural language processing. For example, thanks to scaling, GPT-3 was able to achieve strong results on in-context learning tasks. However, training these large dense models requires significant amounts of computing resources. In this paper, we propose and develop a family of language models named GL…

    Submitted 1 August, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Accepted to ICML 2022

  37. arXiv:2112.02450  [pdf, other]

    cs.CV

    Adaptive Feature Interpolation for Low-Shot Image Generation

    Authors: Mengyu Dai, Haibin Hang, Xiaoyang Guo

    Abstract: Training of generative models especially Generative Adversarial Networks can easily diverge in low-data setting. To mitigate this issue, we propose a novel implicit data augmentation approach which facilitates stable training and synthesize high-quality samples without need of label information. Specifically, we view the discriminator as a metric embedding of the real data manifold, which offers p…

    Submitted 14 July, 2022; v1 submitted 4 December, 2021; originally announced December 2021.

    Comments: ECCV'22. Code available at https://github.com/dzld00/Adaptive-Feature-Interpolation-for-Low-Shot-Image-Generation

  38. arXiv:2111.07109  [pdf, other]

    cs.LG stat.ML

    Nyström Regularization for Time Series Forecasting

    Authors: Zirui Sun, Mingwei Dai, Yao Wang, Shao-Bo Lin

    Abstract: This paper focuses on learning rate analysis of Nyström regularization with sequential sub-sampling for $τ$-mixing time series. Using a recently developed Banach-valued Bernstein inequality for $τ$-mixing sequences and an integral operator approach based on second-order decomposition, we succeed in deriving almost optimal learning rates of Nyström regularization with sequential sub-sampling for…

    Submitted 13 November, 2021; originally announced November 2021.

    Comments: 35 pages

  39. arXiv:2109.03378  [pdf, other]

    stat.ML cs.LG

    Rethinking Multidimensional Discriminator Output for Generative Adversarial Networks

    Authors: Mengyu Dai, Haibin Hang, Anuj Srivastava

    Abstract: The study of multidimensional discriminator (critic) output for Generative Adversarial Networks has been underexplored in the literature. In this paper, we generalize the Wasserstein GAN framework to take advantage of multidimensional critic output and explore its properties. We also introduce a square-root velocity transformation (SRVT) block which favors training in the multidimensional setting.…

    Submitted 14 July, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: Frontiers in Adversarial Machine Learning ICML 2022

  40. arXiv:2109.01652  [pdf, other]

    cs.CL

    Finetuned Language Models Are Zero-Shot Learners

    Authors: Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le

    Abstract: This paper explores a simple method for improving the zero-shot learning abilities of language models. We show that instruction tuning -- finetuning language models on a collection of tasks described via instructions -- substantially improves zero-shot performance on unseen tasks. We take a 137B parameter pretrained language model and instruction-tune it on over 60 NLP tasks verbalized via natur…

    Submitted 8 February, 2022; v1 submitted 3 September, 2021; originally announced September 2021.

    Comments: Version 5. Find list of changes in Appendix F (page 35)

  41. arXiv:2107.08189  [pdf, other]

    cs.LG cs.CY

    BEDS-Bench: Behavior of EHR-models under Distributional Shift--A Benchmark

    Authors: Anand Avati, Martin Seneviratne, Emily Xue, Zhen Xu, Balaji Lakshminarayanan, Andrew M. Dai

    Abstract: Machine learning has recently demonstrated impressive progress in predictive accuracy across a wide array of tasks. Most ML approaches focus on generalization performance on unseen data that are similar to the training data (In-Distribution, or IND). However, real world applications and deployments of ML rarely enjoy the comfort of encountering examples that are always IND. In such situations, mos…

    Submitted 17 July, 2021; originally announced July 2021.

  42. arXiv:2106.10777  [pdf, other]

    cs.CV

    Manifold Matching via Deep Metric Learning for Generative Modeling

    Authors: Mengyu Dai, Haibin Hang

    Abstract: We propose a manifold matching approach to generative models which includes a distribution generator (or data generator) and a metric generator. In our framework, we view the real data set as some manifold embedded in a high-dimensional Euclidean space. The distribution generator aims at generating samples that follow some distribution condensed around the real data manifold. It is achieved by mat…

    Submitted 26 August, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

    Comments: ICCV 2021. Code available at https://github.com/dzld00/pytorch-manifold-matching.git

  43. arXiv:2105.01697  [pdf, other]

    cs.RO

    Episodic Learning for Safe Bipedal Locomotion with Control Barrier Functions and Projection-to-State Safety

    Authors: Noel Csomay-Shanklin, Ryan K. Cosner, Min Dai, Andrew J. Taylor, Aaron D. Ames

    Abstract: This paper combines episodic learning and control barrier functions in the setting of bipedal locomotion. The safety guarantees that control barrier functions provide are only valid with perfect model knowledge; however, this assumption cannot be met on hardware platforms. To address this, we utilize the notion of projection-to-state safety paired with a machine learning framework in an attempt to…

    Submitted 4 May, 2021; originally announced May 2021.

    Comments: 13 pages, 4 figures, to appear at the Conference on Learning for Dynamics and Control 2021

  44. arXiv:2104.10367  [pdf, other]

    cs.RO eess.SY

    Bipedal Walking on Constrained Footholds: Momentum Regulation via Vertical COM Control

    Authors: Min Dai, Xiaobin Xiong, Aaron Ames

    Abstract: This paper presents an online walking synthesis methodology to enable dynamic and stable walking on constrained footholds for underactuated bipedal robots. Our approach modulates the change of angular momentum about the foot-ground contact pivot at discrete impact using pre-impact vertical center of mass (COM) velocity. To this end, we utilize the underactuated Linear Inverted Pendulum (LIP) model…

    Submitted 23 September, 2021; v1 submitted 21 April, 2021; originally announced April 2021.

  45. arXiv:2102.13201  [pdf, other]

    cs.RO

    Learning Controller Gains on Bipedal Walking Robots via User Preferences

    Authors: Noel Csomay-Shanklin, Maegan Tucker, Min Dai, Jenna Reher, Aaron D. Ames

    Abstract: Experimental demonstration of complex robotic behaviors relies heavily on finding the correct controller gains. This painstaking process is often completed by a domain expert, requiring deep knowledge of the relationship between parameter values and the resulting behavior of the system. Even when such knowledge is possessed, it can take significant effort to navigate the nonintuitive landscape of…

    Submitted 2 March, 2022; v1 submitted 25 February, 2021; originally announced February 2021.

    Comments: 6 pages + 1 page of references, 7 figures, accepted to ICRA 2022

  46. arXiv:2102.02340  [pdf, other]

    cs.LG cs.AI cs.CL

    MUFASA: Multimodal Fusion Architecture Search for Electronic Health Records

    Authors: Zhen Xu, David R. So, Andrew M. Dai

    Abstract: One important challenge of applying deep learning to electronic health records (EHR) is the complexity of their multimodal structure. EHR usually contains a mixture of structured (codes) and unstructured (free-text) data with sparse and irregular longitudinal features -- all of which doctors utilize when making decisions. In the deep learning regime, determining how different modality representati…

    Submitted 5 October, 2021; v1 submitted 3 February, 2021; originally announced February 2021.

    Comments: Accepted for publication at the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

  47. arXiv:2012.13121  [pdf, other]

    cs.LG q-fin.ST

    Memory-Gated Recurrent Networks

    Authors: Yaquan Zhang, Qi Wu, Nanbo Peng, Min Dai, Jing Zhang, Hu Wang

    Abstract: The essence of multivariate sequential learning is all about how to extract dependencies in data. These data sets, such as hourly medical records in intensive care units and multi-frequency phonetic time series, often exhibit not only strong serial dependencies in the individual components (the "marginal" memory) but also non-negligible memories in the cross-sectional dependencies (the "joint…

    Submitted 30 December, 2020; v1 submitted 24 December, 2020; originally announced December 2020.

    Comments: This paper was accepted and will be published in the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

  48. arXiv:2010.11983  [pdf, other]

    quant-ph cs.CC cs.LG

    Learnability and Complexity of Quantum Samples

    Authors: Murphy Yuezhen Niu, Andrew M. Dai, Li Li, Augustus Odena, Zhengli Zhao, Vadim Smelyanskyi, Hartmut Neven, Sergio Boixo

    Abstract: Given a quantum circuit, a quantum computer can sample the output distribution exponentially faster in the number of bits than classical computers. A similar exponential separation has yet to be established in generative models through quantum sample learning: given samples from an n-qubit computation, can we learn the underlying quantum distribution using models with training parameters that scal…

    Submitted 22 October, 2020; originally announced October 2020.

  49. arXiv:2010.06610  [pdf, other]

    cs.LG cs.CV stat.ML

    Training independent subnetworks for robust prediction

    Authors: Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

    Abstract: Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple pred…

    Submitted 4 August, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: Updated to the ICLR camera ready version, added reference to Soflaei et al. 2020

  50. arXiv:2010.05941  [pdf, other]

    astro-ph.IM cs.AI

    Active learning with RESSPECT: Resource allocation for extragalactic astronomical transients

    Authors: Noble Kennamer, Emille E. O. Ishida, Santiago Gonzalez-Gaitan, Rafael S. de Souza, Alexander Ihler, Kara Ponder, Ricardo Vilalta, Anais Moller, David O. Jones, Mi Dai, Alberto Krone-Martins, Bruno Quint, Sreevarsha Sreejith, Alex I. Malz, Lluis Galbany

    Abstract: The recent increase in volume and complexity of available astronomical data has led to a wide use of supervised machine learning techniques. Active learning strategies have been proposed as an alternative to optimize the distribution of scarce labeling resources. However, due to the specific conditions in which labels can be acquired, fundamental assumptions, such as sample representativeness and…

    Submitted 26 October, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: Accepted to the 2020 IEEE Symposium Series on Computational Intelligence