Showing 1–50 of 79 results for author: Yu, A

  1. arXiv:2406.07003  [pdf, other]

    cs.SE

    GraphCoder: Enhancing Repository-Level Code Completion via Code Context Graph-based Retrieval and Language Model

    Authors: Wei Liu, Ailun Yu, Daoguang Zan, Bo Shen, Wei Zhang, Haiyan Zhao, Zhi Jin, Qianxiang Wang

    Abstract: The performance of repository-level code completion depends upon the effective leverage of both general and repository-specific knowledge. Despite the impressive capability of code LLMs in general code completion tasks, they often exhibit less satisfactory performance on repository-level completion due to the lack of repository-specific knowledge in these LLMs. To address this problem, we propose…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2405.16587  [pdf, other]

    cs.LG cs.AI cs.HC

    Cost-Effective Online Multi-LLM Selection with Versatile Reward Models

    Authors: Xiangxiang Dai, Jin Li, Xutong Liu, Anqi Yu, John C. S. Lui

    Abstract: With the rapid advancement of large language models (LLMs), the diversity of multi-LLM tasks and the variability in their pricing structures have become increasingly important, as costs can vary greatly between different LLMs. To tackle these challenges, we introduce the C2MAB-V, a Cost-effective Combinatorial Multi-armed Bandit with…

    Submitted 26 May, 2024; originally announced May 2024.

    Comments: 29 pages, 12 figures, conference

  3. arXiv:2405.14893  [pdf, other]

    cs.LG econ.EM

    YUI: Day-ahead Electricity Price Forecasting Using Invariance Simplified Supply and Demand Curve

    Authors: Linian Wang, Anlan Yu, Jianghong Liu, Huibing Zhang, Leye Wang

    Abstract: In day-ahead electricity market, it is crucial for all market participants to have access to reliable and accurate price forecasts for their decision-making processes. Forecasting methods currently utilized in industrial applications frequently neglect the underlying mechanisms of price formation, while economic research from the perspective of supply and demand have stringent data collection requ…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 13 pages

  4. arXiv:2405.13975  [pdf, other]

    cs.LG stat.ML

    There is HOPE to Avoid HiPPOs for Long-memory State Space Models

    Authors: Annan Yu, Michael W. Mahoney, N. Benjamin Erichson

    Abstract: State-space models (SSMs) that utilize linear, time-invariant (LTI) systems are known for their effectiveness in learning long sequences. However, these models typically face several challenges: (i) they require specifically designed initializations of the system matrices to achieve state-of-the-art performance, (ii) they require training of state matrices on a logarithmic scale with very small le…

    Submitted 22 May, 2024; originally announced May 2024.

  5. arXiv:2405.10020  [pdf, other]

    cs.RO cs.CL cs.CV cs.LG

    Natural Language Can Help Bridge the Sim2Real Gap

    Authors: Albert Yu, Adeline Foote, Raymond Mooney, Roberto Martín-Martín

    Abstract: The main challenge in learning image-conditioned robotic policies is acquiring a visual representation conducive to low-level control. Due to the high dimensionality of the image space, learning a good visual representation requires a considerable amount of visual data. However, when learning in the real world, data is expensive. Sim2Real is a promising paradigm for overcoming data scarcity in the…

    Submitted 2 July, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: To appear in RSS 2024. Project website at https://robin-lab.cs.utexas.edu/lang4sim2real/

    ACM Class: I.2.9; I.2.7; I.2.6

  6. arXiv:2405.03949  [pdf, other]

    cs.LG cs.CR eess.SP

    FedSC: Provable Federated Self-supervised Learning with Spectral Contrastive Objective over Non-i.i.d. Data

    Authors: Shusen Jing, Anlan Yu, Shuai Zhang, Songyang Zhang

    Abstract: Recent efforts have been made to integrate self-supervised learning (SSL) with the framework of federated learning (FL). One unique challenge of federated self-supervised learning (FedSSL) is that the global objective of FedSSL usually does not equal the weighted sum of local SSL objectives. Consequently, conventional approaches, such as federated averaging (FedAvg), fail to precisely minimize the…

    Submitted 6 May, 2024; originally announced May 2024.

  7. arXiv:2403.16443  [pdf, other]

    cs.CL cs.AI cs.SE

    CodeS: Natural Language to Code Repository via Multi-Layer Sketch

    Authors: Daoguang Zan, Ailun Yu, Wei Liu, Dong Chen, Bo Shen, Wei Li, Yafen Yao, Yongshun Gong, Xiaolin Chen, Bei Guan, Zhiguang Yang, Yongji Wang, Qianxiang Wang, Lizhen Cui

    Abstract: The impressive performance of large language models (LLMs) on code-related tasks has shown the potential of fully automated software development. In light of this, we introduce a new software engineering task, namely Natural Language to code Repository (NL2Repo). This task aims to generate an entire code repository from its natural language requirements. To address this task, we propose a simple y…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: https://github.com/NL2Code/CodeS

  8. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  9. arXiv:2403.05205  [pdf, other]

    cs.CY

    Interoperability of the Metaverse: A Digital Ecosystem Perspective Review

    Authors: Liang Yang, Shi-Ting Ni, Yuyang Wang, Ao Yu, Jyh-An Lee, Pan Hui

    Abstract: The Metaverse is at the vanguard of the impending digital revolution, with the potential to significantly transform industries and lifestyles. However, in 2023, skepticism surfaced within industrial and academic spheres, raising concerns that excitement may outpace actual technological progress. Interoperability, recognized as a major barrier to the Metaverse's full potential, is central to this d…

    Submitted 15 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  10. arXiv:2402.15231  [pdf, other]

    cs.LG cs.CV

    Which Model to Transfer? A Survey on Transferability Estimation

    Authors: Yuhe Ding, Bo Jiang, Aijing Yu, Aihua Zheng, Jian Liang

    Abstract: Transfer learning methods endeavor to leverage relevant knowledge from existing source pre-trained models or datasets to solve downstream target tasks. With the increase in the scale and quantity of available pre-trained models nowadays, it becomes critical to assess in advance whether they are suitable for a specific target task. Model transferability estimation is an emerging and growing area of…

    Submitted 23 February, 2024; originally announced February 2024.

  11. arXiv:2402.06501  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    Scalable Interactive Machine Learning for Future Command and Control

    Authors: Anna Madison, Ellen Novoseller, Vinicius G. Goecks, Benjamin T. Files, Nicholas Waytowich, Alfred Yu, Vernon J. Lawhern, Steven Thurman, Christopher Kelshaw, Kaleb McDowell

    Abstract: Future warfare will require Command and Control (C2) personnel to make decisions at shrinking timescales in complex and potentially ill-defined situations. Given the need for robust decision-making processes and decision-support tools, integration of artificial and human intelligence holds the potential to revolutionize the C2 operations process to ensure adaptability and efficiency in rapidly cha…

    Submitted 28 March, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Accepted at the NATO Science and Technology Organization Symposium (ICMCIS) organized by the Information Systems Technology (IST) Panel, IST-205-RSY - the ICMCIS, held in Koblenz, Germany, 23-24 April 2024

    ACM Class: I.2.6; I.2.7; J.7

  12. arXiv:2401.14242  [pdf, other]

    cs.CL

    Improving Natural Language Capability of Code Large Language Model

    Authors: Wei Li, Daoguang Zan, Bei Guan, Ailun Yu, Xiaolin Chen, Yongji Wang

    Abstract: Code large language models (Code LLMs) have demonstrated remarkable performance in code generation. Nonetheless, most existing works focus on boosting code LLMs from the perspective of programming capabilities, while their natural language capabilities receive less attention. To fill this gap, we thus propose a novel framework, comprising two modules: AttentionExtractor, which is responsible for e…

    Submitted 25 January, 2024; originally announced January 2024.

  13. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  14. arXiv:2310.01798  [pdf, other]

    cs.CL cs.AI

    Large Language Models Cannot Self-Correct Reasoning Yet

    Authors: Jie Huang, Xinyun Chen, Swaroop Mishra, Huaixiu Steven Zheng, Adams Wei Yu, Xinying Song, Denny Zhou

    Abstract: Large Language Models (LLMs) have emerged as a groundbreaking technology with their unparalleled text generation capabilities across various applications. Nevertheless, concerns persist regarding the accuracy and appropriateness of their generated content. A contemporary methodology, self-correction, has been proposed as a remedy to these issues. Building upon this premise, this paper critically e…

    Submitted 14 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  15. arXiv:2310.01698  [pdf, other]

    cs.LG stat.ML

    Robustifying State-space Models for Long Sequences via Approximate Diagonalization

    Authors: Annan Yu, Arnur Nigmetov, Dmitriy Morozov, Michael W. Mahoney, N. Benjamin Erichson

    Abstract: State-space models (SSMs) have recently emerged as a framework for learning long-range sequence tasks. An example is the structured state-space sequence (S4) layer, which uses the diagonal-plus-low-rank structure of the HiPPO initialization framework. However, the complicated structure of the S4 layer poses challenges; and, in an effort to address these challenges, models such as S4D and S5 have c…

    Submitted 2 October, 2023; originally announced October 2023.

  16. arXiv:2309.15790  [pdf, other]

    cs.CR

    Some Constructions of Private, Efficient, and Optimal $K$-Norm and Elliptic Gaussian Noise

    Authors: Matthew Joseph, Alexander Yu

    Abstract: Differentially private computation often begins with a bound on some $d$-dimensional statistic's $\ell_p$ sensitivity. For pure differential privacy, the $K$-norm mechanism can improve on this approach using a norm tailored to the statistic's sensitivity space. Writing down a closed-form description of this optimal norm is often straightforward. However, running the $K$-norm mechanism reduces to u…

    Submitted 21 May, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: This version corresponds to the camera-ready accepted at COLT 2024

  17. arXiv:2308.16824  [pdf, other]

    cs.CL cs.AI cs.PL cs.SE

    Can Programming Languages Boost Each Other via Instruction Tuning?

    Authors: Daoguang Zan, Ailun Yu, Bo Shen, Jiaxin Zhang, Taihong Chen, Bing Geng, Bei Chen, Jichuan Ji, Yafen Yao, Yongji Wang, Qianxiang Wang

    Abstract: When human programmers have mastered a programming language, it would be easier when they learn a new programming language. In this report, we focus on exploring whether programming languages can boost each other during the instruction fine-tuning phase of code large language models. We conduct extensive experiments of 8 popular programming languages (Python, JavaScript, TypeScript, C, C++, Java,…

    Submitted 3 September, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

    Comments: Work in progress

  18. arXiv:2308.07931  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG cs.RO

    Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation

    Authors: William Shen, Ge Yang, Alan Yu, Jansen Wong, Leslie Pack Kaelbling, Phillip Isola

    Abstract: Self-supervised and language-supervised image models contain rich knowledge of the world that is important for generalization. Many robotic tasks, however, require a detailed understanding of 3D geometry, which is often lacking in 2D image features. This work bridges this 2D-to-3D gap for robotic manipulation by leveraging distilled feature fields to combine accurate 3D geometry with rich semantic…

    Submitted 29 December, 2023; v1 submitted 27 July, 2023; originally announced August 2023.

    Comments: Project website at https://f3rm.csail.mit.edu, Accepted at the 7th Annual Conference on Robot Learning (CoRL), 2023 in Atlanta, US

  19. arXiv:2307.14936  [pdf, other]

    cs.CL cs.AI cs.LG cs.PL cs.SE

    PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback

    Authors: Bo Shen, Jiaxin Zhang, Taihong Chen, Daoguang Zan, Bing Geng, An Fu, Muhan Zeng, Ailun Yu, Jichuan Ji, Jingyang Zhao, Yuenan Guo, Qianxiang Wang

    Abstract: Large Language Models for Code (Code LLM) are flourishing. New and powerful models are released on a weekly basis, demonstrating remarkable performance on the code generation task. Various approaches have been proposed to boost the code generation performance of pre-trained Code LLMs, such as supervised fine-tuning, instruction tuning, reinforcement learning, etc. In this paper, we propose a novel…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Preprint

  20. arXiv:2306.00303  [pdf, other]

    cs.CV eess.IV

    Sea Ice Extraction via Remote Sensed Imagery: Algorithms, Datasets, Applications and Challenges

    Authors: Anzhu Yu, Wenjun Huang, Qing Xu, Qun Sun, Wenyue Guo, Song Ji, Bowei Wen, Chunping Qiu

    Abstract: The deep learning, which is a dominating technique in artificial intelligence, has completely changed the image understanding over the past decade. As a consequence, the sea ice extraction (SIE) problem has reached a new era. We present a comprehensive review of four important aspects of SIE, including algorithms, datasets, applications, and the future trends. Our review focuses on researches publ…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: 24 pages, 6 figures

  21. arXiv:2305.10429  [pdf, other]

    cs.CL cs.LG

    DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

    Authors: Sang Michael Xie, Hieu Pham, Xuanyi Dong, Nan Du, Hanxiao Liu, Yifeng Lu, Percy Liang, Quoc V. Le, Tengyu Ma, Adams Wei Yu

    Abstract: The mixture proportions of pretraining data domains (e.g., Wikipedia, books, web text) greatly affect language model (LM) performance. In this paper, we propose Domain Reweighting with Minimax Optimization (DoReMi), which first trains a small proxy model using group distributionally robust optimization (Group DRO) over domains to produce domain weights (mixture proportions) without knowledge of do…

    Submitted 20 November, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  22. arXiv:2305.09893  [pdf, other]

    cs.CV

    Integrating Multiple Sources Knowledge for Class Asymmetry Domain Adaptation Segmentation of Remote Sensing Images

    Authors: Kuiliang Gao, Anzhu Yu, Xiong You, Wenyue Guo, Ke Li, Ningbo Huang

    Abstract: In the existing unsupervised domain adaptation (UDA) methods for remote sensing images (RSIs) semantic segmentation, class symmetry is a widely followed ideal assumption, where the source and target RSIs have exactly the same class space. In practice, however, it is often very difficult to find a source RSI with exactly the same classes as the target RSI. More commonly, there are multiple source…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: 17 pages, 10 figures

  23. Freeform Templates: Combining Freeform Curation with Structured Templates

    Authors: Stephen MacNeil, Ziheng Huang, Kenneth Chen, Zijian Ding, Alex Yu, Kendall Nakai, Steven P. Dow

    Abstract: Online whiteboards are becoming a popular way to facilitate collaborative design work, providing a free-form environment to curate ideas. However, as templates are increasingly being used to scaffold contributions from non-experts designers, it is crucial to understand their impact on the creative process. In this paper, we present the results from a study with 114 students in a large introductory…

    Submitted 1 May, 2023; originally announced May 2023.

    ACM Class: H.5.0

  24. arXiv:2304.14972  [pdf, other]

    cs.CV

    Semi-supervised Road Updating Network (SRUNet): A Deep Learning Method for Road Updating from Remote Sensing Imagery and Historical Vector Maps

    Authors: Xin Chen, Anzhu Yu, Qun Sun, Wenyue Guo, Qing Xu, Bowei Wen

    Abstract: A road is the skeleton of a city and is a fundamental and important geographical component. Currently, many countries have built geo-information databases and gathered large amounts of geographic data. However, with the extensive construction of infrastructure and rapid expansion of cities, automatic updating of road data is imperative to maintain the high quality of current basic geographic infor…

    Submitted 28 April, 2023; originally announced April 2023.

    Comments: 22 pages, 8 figures

  25. arXiv:2210.11416  [pdf, other]

    cs.LG cs.CL

    Scaling Instruction-Finetuned Language Models

    Authors: Hyung Won Chung, Le Hou, Shayne Longpre, Barret Zoph, Yi Tay, William Fedus, Yunxuan Li, Xuezhi Wang, Mostafa Dehghani, Siddhartha Brahma, Albert Webson, Shixiang Shane Gu, Zhuyun Dai, Mirac Suzgun, Xinyun Chen, Aakanksha Chowdhery, Alex Castro-Ros, Marie Pellat, Kevin Robinson, Dasha Valter, Sharan Narang, Gaurav Mishra, Adams Yu, Vincent Zhao, Yanping Huang, et al. (10 additional authors not shown)

    Abstract: Finetuning language models on a collection of datasets phrased as instructions has been shown to improve model performance and generalization to unseen tasks. In this paper we explore instruction finetuning with a particular focus on (1) scaling the number of tasks, (2) scaling the model size, and (3) finetuning on chain-of-thought data. We find that instruction finetuning with the above aspects d…

    Submitted 6 December, 2022; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: Public checkpoints: https://huggingface.co/docs/transformers/model_doc/flan-t5

  26. arXiv:2210.04476  [pdf, other]

    cs.RO cs.CL cs.LG

    Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks

    Authors: Albert Yu, Raymond J. Mooney

    Abstract: Demonstrations and natural language instructions are two common ways to specify and teach robots novel tasks. However, for many complex tasks, a demonstration or language instruction alone contains ambiguities, preventing tasks from being specified clearly. In such cases, a combination of both a demonstration and an instruction more concisely and effectively conveys the task to the robot than eith…

    Submitted 28 April, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: 24 pages, 10 figures. Project website at https://deltaco-robot.github.io/

    ACM Class: I.2.9; I.2.7; I.2.6

  27. arXiv:2208.13979  [pdf]

    physics.app-ph cs.LG eess.IV physics.optics

    Virtual impactor-based label-free bio-aerosol detection using holography and deep learning

    Authors: Yi Luo, Yijie Zhang, Tairan Liu, Alan Yu, Yichen Wu, Aydogan Ozcan

    Abstract: Exposure to bio-aerosols such as mold spores and pollen can lead to adverse health effects. There is a need for a portable and cost-effective device for long-term monitoring and quantification of various bio-aerosols. To address this need, we present a mobile and cost-effective label-free bio-aerosol sensor that takes holographic images of flowing particulate matter concentrated by a virtual impac…

    Submitted 30 August, 2022; originally announced August 2022.

    Comments: 23 Pages, 5 Figures, 1 Table

    Journal ref: ACS Sensors (2022)

  28. arXiv:2207.06010  [pdf, other]

    cs.LG q-bio.BM

    Does GNN Pretraining Help Molecular Representation?

    Authors: Ruoxi Sun, Hanjun Dai, Adams Wei Yu

    Abstract: Extracting informative representations of molecules using Graph neural networks (GNNs) is crucial in AI-driven drug discovery. Recently, the graph research community has been trying to replicate the success of self-supervised pretraining in natural language processing, with several successes claimed. However, we find the benefit brought by self-supervised pretraining on small molecular data can be…

    Submitted 2 November, 2022; v1 submitted 13 July, 2022; originally announced July 2022.

  29. arXiv:2207.04703  [pdf, other]

    cs.RO cs.LG

    Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning

    Authors: Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, Sergey Levine

    Abstract: Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems. However, in practice, real-world robotic RL typically requires time consuming data collection and frequent human intervention to reset the environment. Moreover, robotic policies learned with RL often fail when deployed beyond the carefully controlled setting in which they were lea…

    Submitted 17 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

    Comments: 17 pages, project website at https://sites.google.com/view/ariel-berkeley/

  30. arXiv:2205.14300  [pdf, ps, other]

    cs.LG

    Tuning Frequency Bias in Neural Network Training with Nonuniform Data

    Authors: Annan Yu, Yunan Yang, Alex Townsend

    Abstract: Small generalization errors of over-parameterized neural networks (NNs) can be partially explained by the frequency biasing phenomenon, where gradient-based algorithms minimize the low-frequency misfit before reducing the high-frequency residuals. Using the Neural Tangent Kernel (NTK), one can provide a theoretically rigorous analysis for training where data are drawn from constant or piecewise-co…

    Submitted 25 September, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    MSC Class: 68T07; 68Q32

  31. arXiv:2204.03021  [pdf, other]

    cs.CL

    The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems

    Authors: Caleb Ziems, Jane A. Yu, Yi-Chia Wang, Alon Halevy, Diyi Yang

    Abstract: Conversational agents have come increasingly closer to human competence in open-domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely incoherent viewpoints that erode a user's trust in the moral integrity of the system. Moral deviations are difficult to mitigate because moral judgments are not universal, and there may be multiple competing judgments that appl…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: ACL 2022 main conference

  32. arXiv:2203.08195  [pdf, other]

    cs.CV

    DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection

    Authors: Yingwei Li, Adams Wei Yu, Tianjian Meng, Ben Caine, Jiquan Ngiam, Daiyi Peng, Junyang Shen, Bo Wu, Yifeng Lu, Denny Zhou, Quoc V. Le, Alan Yuille, Mingxing Tan

    Abstract: Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving. While prevalent multi-modal methods simply decorate raw lidar point clouds with camera features and feed them directly to existing 3D detection models, our study shows that fusing camera features with deep lidar features instead of raw points, can lead to better performance. Howev…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: CVPR 2022. 1st rank 3D detection method on Waymo Challenge Leaderboard: https://waymo.com/open/challenges/entry/?timestamp=1647356360224524&challenge=DETECTION_3D&emailId=5451f123-a0ea

  33. arXiv:2112.14205  [pdf]

    cs.CR cs.CY cs.HC

    Analysis of Longitudinal Changes in Privacy Behavior of Android Applications

    Authors: Alexander Yu, Yuvraj Agarwal, Jason I. Hong

    Abstract: Privacy concerns have long been expressed around smart devices, and the concerns around Android apps have been studied by many past works. Over the past 10 years, we have crawled and scraped data for almost 1.9 million apps, and also stored the APKs for 135,536 of them. In this paper, we examine the trends in how Android apps have changed over time with respect to privacy and look at it from two p…

    Submitted 28 December, 2021; originally announced December 2021.

  34. arXiv:2112.06905  [pdf, other]

    cs.CL

    GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

    Authors: Nan Du, Yanping Huang, Andrew M. Dai, Simon Tong, Dmitry Lepikhin, Yuanzhong Xu, Maxim Krikun, Yanqi Zhou, Adams Wei Yu, Orhan Firat, Barret Zoph, Liam Fedus, Maarten Bosma, Zongwei Zhou, Tao Wang, Yu Emma Wang, Kellie Webster, Marie Pellat, Kevin Robinson, Kathleen Meier-Hellstern, Toju Duke, Lucas Dixon, Kun Zhang, Quoc V Le, Yonghui Wu, et al. (2 additional authors not shown)

    Abstract: Scaling language models with more data, compute and parameters has driven significant progress in natural language processing. For example, thanks to scaling, GPT-3 was able to achieve strong results on in-context learning tasks. However, training these large dense models requires significant amounts of computing resources. In this paper, we propose and develop a family of language models named GL…

    Submitted 1 August, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: Accepted to ICML 2022

  35. arXiv:2112.05131  [pdf, other]

    cs.CV cs.GR

    Plenoxels: Radiance Fields without Neural Networks

    Authors: Alex Yu, Sara Fridovich-Keil, Matthew Tancik, Qinhong Chen, Benjamin Recht, Angjoo Kanazawa

    Abstract: We introduce Plenoxels (plenoptic voxels), a system for photorealistic view synthesis. Plenoxels represent a scene as a sparse 3D grid with spherical harmonics. This representation can be optimized from calibrated images via gradient methods and regularization without any neural components. On standard, benchmark tasks, Plenoxels are optimized two orders of magnitude faster than Neural Radiance Fi…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: For video and code, please see https://alexyu.net/plenoxels

  36. arXiv:2111.10434  [pdf, other]

    cs.LG

    Machine Learning for Mechanical Ventilation Control (Extended Abstract)

    Authors: Daniel Suo, Naman Agarwal, Wenhan Xia, Xinyi Chen, Udaya Ghai, Alexander Yu, Paula Gradu, Karan Singh, Cyril Zhang, Edgar Minasyan, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan

    Abstract: Mechanical ventilation is one of the most widely used therapies in the ICU. However, despite broad application from anaesthesia to COVID-related life support, many injurious challenges remain. We frame these as a control problem: ventilators must let air in and out of the patient's lungs according to a prescribed trajectory of airway pressure. Industry-standard controllers, based on the PID method…

    Submitted 23 December, 2021; v1 submitted 19 November, 2021; originally announced November 2021.

    Comments: Machine Learning for Health (ML4H) at NeurIPS 2021 - Extended Abstract. arXiv admin note: substantial text overlap with arXiv:2102.06779

  37. arXiv:2111.10050  [pdf, other]

    cs.LG cs.CL cs.CV

    Combined Scaling for Zero-shot Transfer Learning

    Authors: Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

    Abstract: We present a combined scaling method - named BASIC - that achieves 85.7% top-1 accuracy on the ImageNet ILSVRC-2012 validation set without learning from any labeled ImageNet example. This accuracy surpasses best published similar models - CLIP and ALIGN - by 9.3%. Our BASIC model also shows significant improvements in robustness benchmarks. For instance, on 5 test sets with natural distribution sh…

    Submitted 12 April, 2023; v1 submitted 19 November, 2021; originally announced November 2021.

  38. arXiv:2110.11489  [pdf, ps, other]

    cs.AR cs.LG

    Supporting Massive DLRM Inference Through Software Defined Memory

    Authors: Ehsan K. Ardestani, Changkyu Kim, Seung Jae Lee, Luoshang Pan, Valmiki Rampersad, Jens Axboe, Banit Agrawal, Fuxun Yu, Ansha Yu, Trung Le, Hector Yuen, Shishir Juluri, Akshat Nanda, Manoj Wodekar, Dheevatsa Mudigere, Krishnakumar Nair, Maxim Naumov, Chris Peterson, Mikhail Smelyanskiy, Vijay Rao

    Abstract: Deep Learning Recommendation Models (DLRM) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model size soon to be in terabytes range, leveraging Storage Class Memory (SCM) for inference enables lower power consumption and cost. This paper evaluates the major challenges in extending the memory hierarchy to SCM for DLRM, and presents differen…

    Submitted 8 November, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: 14 pages, 5 figures

  39. arXiv:2109.12109  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Autonomy and Perception for Space Mining

    Authors: Ragav Sachdeva, Ravi Hammond, James Bockman, Alec Arthur, Brandon Smart, Dustin Craggs, Anh-Dzung Doan, Thomas Rowntree, Elijah Schutz, Adrian Orenstein, Andy Yu, Tat-Jun Chin, Ian Reid

    Abstract: Future Moon bases will likely be constructed using resources mined from the surface of the Moon. The difficulty of maintaining a human workforce on the Moon and communications lag with Earth means that mining will need to be conducted using collaborative robots with a high degree of autonomy. In this paper, we describe our solution for Phase 2 of the NASA Space Robotics Challenge, which provided a…

    Submitted 13 April, 2022; v1 submitted 26 September, 2021; originally announced September 2021.

    Comments: This paper describes our 3rd place and innovation award winning solution to the NASA Space Robotics Challenge Phase 2

  40. arXiv:2109.11354  [pdf, other]

    cs.LG math.CA math.NA

    Arbitrary-Depth Universal Approximation Theorems for Operator Neural Networks

    Authors: Annan Yu, Chloé Becquey, Diana Halikias, Matthew Esmaili Mallory, Alex Townsend

    Abstract: The standard Universal Approximation Theorem for operator neural networks (NNs) holds for arbitrary width and bounded depth. Here, we prove that operator NNs of bounded width and arbitrary depth are universal approximators for continuous nonlinear operators. In our main result, we prove that for non-polynomial activation functions that are continuously differentiable at a point with a nonzero deri…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: 12 pages

  41. arXiv:2109.09193  [pdf, other]

    cs.CL cs.LG

    Towards Zero-Label Language Learning

    Authors: Zirui Wang, Adams Wei Yu, Orhan Firat, Yuan Cao

    Abstract: This paper explores zero-label learning in Natural Language Processing (NLP), whereby no human-annotated data is used anywhere during training and models are trained purely on synthetic data. At the core of our framework is a novel approach for better leveraging the powerful pretrained language models. Specifically, inspired by the recent success of few-shot inference on GPT-3, we present a traini…

    Submitted 19 September, 2021; originally announced September 2021.

  42. arXiv:2109.02734  [pdf, other]

    cs.CL

    Detecting Inspiring Content on Social Media

    Authors: Oana Ignat, Y-Lan Boureau, Jane A. Yu, Alon Halevy

    Abstract: Inspiration moves a person to see new possibilities and transforms the way they perceive their own potential. Inspiration has received little attention in psychology, and has not been researched before in the NLP community. To the best of our knowledge, this work is the first to study inspiration through machine learning methods. We aim to automatically detect inspiring content from social media d…

    Submitted 29 May, 2023; v1 submitted 6 September, 2021; originally announced September 2021.

    Comments: accepted at ACII 2021

  43. arXiv:2109.01652  [pdf, other]

    cs.CL

    Finetuned Language Models Are Zero-Shot Learners

    Authors: Jason Wei, Maarten Bosma, Vincent Y. Zhao, Kelvin Guu, Adams Wei Yu, Brian Lester, Nan Du, Andrew M. Dai, Quoc V. Le

    Abstract: This paper explores a simple method for improving the zero-shot learning abilities of language models. We show that instruction tuning -- finetuning language models on a collection of tasks described via instructions -- substantially improves zero-shot performance on unseen tasks. We take a 137B parameter pretrained language model and instruction-tune it on over 60 NLP tasks verbalized via natur…

    Submitted 8 February, 2022; v1 submitted 3 September, 2021; originally announced September 2021.

    Comments: Version 5. Find list of changes in Appendix F (page 35)

  44. arXiv:2108.10904  [pdf, other]

    cs.CV cs.CL cs.LG

    SimVLM: Simple Visual Language Model Pretraining with Weak Supervision

    Authors: Zirui Wang, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, Yuan Cao

    Abstract: With recent progress in joint modeling of visual and textual representations, Vision-Language Pretraining (VLP) has achieved impressive performance on many multimodal downstream tasks. However, the requirement for expensive annotations including clean image captions and regional labels limits the scalability of existing approaches, and complicates the pretraining procedure with the introduction of…

    Submitted 15 May, 2022; v1 submitted 24 August, 2021; originally announced August 2021.

    Comments: Published at ICLR 2022

  45. arXiv:2108.00819  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Learning in Gaussian Process State Space Model

    Authors: Hon Sum Alec Yu, Dingling Yao, Christoph Zimmer, Marc Toussaint, Duy Nguyen-Tuong

    Abstract: We investigate active learning in Gaussian Process state-space models (GPSSM). Our problem is to actively steer the system through latent states by determining its inputs such that the underlying dynamics can be optimally learned by a GPSSM. In order that the most informative inputs are selected, we employ mutual information as our active learning criterion. In particular, we present two approache…

    Submitted 30 July, 2021; originally announced August 2021.

    Comments: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2021

  46. arXiv:2104.09806  [pdf, other]

    cs.CR

    DeepHunter: A Graph Neural Network Based Approach for Robust Cyber Threat Hunting

    Authors: Renzheng Wei, Lijun Cai, Aimin Yu, Dan Meng

    Abstract: Cyber Threat hunting is a proactive search for known attack behaviors in the organizational information system. It is an important component to mitigate advanced persistent threats (APTs). However, the attack behaviors recorded in provenance data may not be completely consistent with the known attack behaviors. In this paper, we propose DeepHunter, a graph neural network (GNN) based graph pattern…

    Submitted 20 April, 2021; originally announced April 2021.

  47. arXiv:2103.14024  [pdf, other]

    cs.CV cs.GR

    PlenOctrees for Real-time Rendering of Neural Radiance Fields

    Authors: Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, Angjoo Kanazawa

    Abstract: We introduce a method to render Neural Radiance Fields (NeRFs) in real time using PlenOctrees, an octree-based 3D representation which supports view-dependent effects. Our method can render 800x800 images at more than 150 FPS, which is over 3000 times faster than conventional NeRFs. We do so without sacrificing quality while preserving the ability of NeRFs to perform free-viewpoint rendering of sc…

    Submitted 17 August, 2021; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: ICCV 2021 (Oral)

  48. arXiv:2102.09968  [pdf, other]

    cs.RO cs.LG

    Deluca -- A Differentiable Control Library: Environments, Methods, and Benchmarking

    Authors: Paula Gradu, John Hallman, Daniel Suo, Alex Yu, Naman Agarwal, Udaya Ghai, Karan Singh, Cyril Zhang, Anirudha Majumdar, Elad Hazan

    Abstract: We present an open-source library of natively differentiable physics and robotics environments, accompanied by gradient-based control methods and a benchmarking suite. The introduced environments allow auto-differentiation through the simulation dynamics, and thereby permit fast training of controllers. The library features several popular environments, including classical control settings from O…

    Submitted 19 February, 2021; originally announced February 2021.

  49. arXiv:2102.06779  [pdf, other]

    cs.LG

    Machine Learning for Mechanical Ventilation Control

    Authors: Daniel Suo, Naman Agarwal, Wenhan Xia, Xinyi Chen, Udaya Ghai, Alexander Yu, Paula Gradu, Karan Singh, Cyril Zhang, Edgar Minasyan, Julienne LaChance, Tom Zajdel, Manuel Schottdorf, Daniel Cohen, Elad Hazan

    Abstract: We consider the problem of controlling an invasive mechanical ventilator for pressure-controlled ventilation: a controller must let air in and out of a sedated patient's lungs according to a trajectory of airway pressures specified by a clinician. Hand-tuned PID controllers and similar variants have comprised the industry standard for decades, yet can behave poorly by over- or under-shooting their…

    Submitted 18 January, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

  50. arXiv:2012.02190  [pdf, other]

    cs.CV cs.GR cs.LG

    pixelNeRF: Neural Radiance Fields from One or Few Images

    Authors: Alex Yu, Vickie Ye, Matthew Tancik, Angjoo Kanazawa

    Abstract: We propose pixelNeRF, a learning framework that predicts a continuous neural scene representation conditioned on one or few input images. The existing approach for constructing neural radiance fields involves optimizing the representation to every scene independently, requiring many calibrated views and significant compute time. We take a step towards resolving these shortcomings by introducing an…

    Submitted 30 May, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: CVPR 2021