subscribe to arXiv mailings

SGCCNet: Single-Stage 3D Object Detector With Saliency-Guided Data Augmentation and Confidence Correction Mechanism

Authors: Ao Liang, Wenyu Chen, Jian Fang, Huaici Zhao

Abstract: The single-stage point-based 3D object detectors have attracted widespread research interest due to their advantages of lightweight and fast inference speed. However, they still face challenges such as inadequate learning of low-quality objects (ILQ) and misalignment between localization accuracy and classification confidence (MLC). In this paper, we propose SGCCNet to alleviate these two issues.… ▽ More The single-stage point-based 3D object detectors have attracted widespread research interest due to their advantages of lightweight and fast inference speed. However, they still face challenges such as inadequate learning of low-quality objects (ILQ) and misalignment between localization accuracy and classification confidence (MLC). In this paper, we propose SGCCNet to alleviate these two issues. For ILQ, SGCCNet adopts a Saliency-Guided Data Augmentation (SGDA) strategy to enhance the robustness of the model on low-quality objects by reducing its reliance on salient features. Specifically, We construct a classification task and then approximate the saliency scores of points by moving points towards the point cloud centroid in a differentiable process. During the training process, SGCCNet will be forced to learn from low saliency features through dropping points. Meanwhile, to avoid internal covariate shift and contextual features forgetting caused by dropping points, we add a geometric normalization module and skip connection block in each stage. For MLC, we design a Confidence Correction Mechanism (CCM) specifically for point-based multi-class detectors. This mechanism corrects the confidence of the current proposal by utilizing the predictions of other key points within the local region in the post-processing stage. Extensive experiments on the KITTI dataset demonstrate the generality and effectiveness of our SGCCNet. On the KITTI \textit{test} set, SGCCNet achieves $80.82\%$ for the metric of $AP_{3D}$ on the \textit{Moderate} level, outperforming all other point-based detectors, surpassing IA-SSD and Fast Point R-CNN by $2.35\%$ and $3.42\%$, respectively. Additionally, SGCCNet demonstrates excellent portability for other point-based detectors △ Less

Submitted 1 July, 2024; originally announced July 2024.

Comments: 16 pages, 16 figures

arXiv:2405.04816 [pdf, other]

Testing the Fairness-Accuracy Improvability of Algorithms

Authors: Eric Auerbach, Annie Liang, Kyohei Okumura, Max Tabord-Meehan

Abstract: Many organizations use algorithms that have a disparate impact, i.e., the benefits or harms of the algorithm fall disproportionately on certain social groups. Addressing an algorithm's disparate impact can be challenging, however, because it is often unclear whether it is possible to reduce this impact without sacrificing other objectives of the organization, such as accuracy or profit. Establishi… ▽ More Many organizations use algorithms that have a disparate impact, i.e., the benefits or harms of the algorithm fall disproportionately on certain social groups. Addressing an algorithm's disparate impact can be challenging, however, because it is often unclear whether it is possible to reduce this impact without sacrificing other objectives of the organization, such as accuracy or profit. Establishing the improvability of algorithms with respect to multiple criteria is of both conceptual and practical interest: in many settings, disparate impact that would otherwise be prohibited under US federal law is permissible if it is necessary to achieve a legitimate business interest. The question is how a policy-maker can formally substantiate, or refute, this "necessity" defense. In this paper, we provide an econometric framework for testing the hypothesis that it is possible to improve on the fairness of an algorithm without compromising on other pre-specified objectives. Our proposed test is simple to implement and can be applied under any exogenous constraint on the algorithm space. We establish the large-sample validity and consistency of our test, and illustrate its practical application by evaluating a healthcare algorithm originally considered by Obermeyer et al. (2019). In this application, we reject the null hypothesis that it is not possible to reduce the algorithm's disparate impact without compromising the accuracy of its predictions. △ Less

Submitted 3 July, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

arXiv:2404.15772 [pdf, other]

Bi-Mamba+: Bidirectional Mamba for Time Series Forecasting

Authors: Aobo Liang, Xingguo Jiang, Yan Sun, Xiaohou Shi, Ke Li

Abstract: Long-term time series forecasting (LTSF) provides longer insights into future trends and patterns. Over the past few years, deep learning models especially Transformers have achieved advanced performance in LTSF tasks. However, LTSF faces inherent challenges such as long-term dependencies capturing and sparse semantic characteristics. Recently, a new state space model (SSM) named Mamba is proposed… ▽ More Long-term time series forecasting (LTSF) provides longer insights into future trends and patterns. Over the past few years, deep learning models especially Transformers have achieved advanced performance in LTSF tasks. However, LTSF faces inherent challenges such as long-term dependencies capturing and sparse semantic characteristics. Recently, a new state space model (SSM) named Mamba is proposed. With the selective capability on input data and the hardware-aware parallel computing algorithm, Mamba has shown great potential in balancing predicting performance and computational efficiency compared to Transformers. To enhance Mamba's ability to preserve historical information in a longer range, we design a novel Mamba+ block by adding a forget gate inside Mamba to selectively combine the new features with the historical features in a complementary manner. Furthermore, we apply Mamba+ both forward and backward and propose Bi-Mamba+, aiming to promote the model's ability to capture interactions among time series elements. Additionally, multivariate time series data in different scenarios may exhibit varying emphasis on intra- or inter-series dependencies. Therefore, we propose a series-relation-aware decider that controls the utilization of channel-independent or channel-mixing tokenization strategy for specific datasets. Extensive experiments on 8 real-world datasets show that our model achieves more accurate predictions compared with state-of-the-art methods. △ Less

Submitted 26 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

Comments: New Mamba-based architecture. All experiments rerun

arXiv:2404.04424 [pdf, ps, other]

Algorithmic Fairness and Social Welfare

Authors: Annie Liang, Jay Lu

Abstract: Algorithms are increasingly used to guide high-stakes decisions about individuals. Consequently, substantial interest has developed around defining and measuring the ``fairness'' of these algorithms. These definitions of fair algorithms share two features: First, they prioritize the role of a pre-defined group identity (e.g., race or gender) by focusing on how the algorithm's impact differs system… ▽ More Algorithms are increasingly used to guide high-stakes decisions about individuals. Consequently, substantial interest has developed around defining and measuring the ``fairness'' of these algorithms. These definitions of fair algorithms share two features: First, they prioritize the role of a pre-defined group identity (e.g., race or gender) by focusing on how the algorithm's impact differs systematically across groups. Second, they are statistical in nature; for example, comparing false positive rates, or assessing whether group identity is independent of the decision (where both are viewed as random variables). These notions are facially distinct from a social welfare approach to fairness, in particular one based on ``veil of ignorance'' thought experiments in which individuals choose how to structure society prior to the realization of their social identity. In this paper, we seek to understand and organize the relationship between these different approaches to fairness. Can the optimization criteria proposed in the algorithmic fairness literature also be motivated as the choices of someone from behind the veil of ignorance? If not, what properties distinguish either approach to fairness? △ Less

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2403.10940 [pdf, other]

ViSaRL: Visual Reinforcement Learning Guided by Human Saliency

Authors: Anthony Liang, Jesse Thomason, Erdem Bıyık

Abstract: Training robots to perform complex control tasks from high-dimensional pixel input using reinforcement learning (RL) is sample-inefficient, because image observations are comprised primarily of task-irrelevant information. By contrast, humans are able to visually attend to task-relevant objects and areas. Based on this insight, we introduce Visual Saliency-Guided Reinforcement Learning (ViSaRL). U… ▽ More Training robots to perform complex control tasks from high-dimensional pixel input using reinforcement learning (RL) is sample-inefficient, because image observations are comprised primarily of task-irrelevant information. By contrast, humans are able to visually attend to task-relevant objects and areas. Based on this insight, we introduce Visual Saliency-Guided Reinforcement Learning (ViSaRL). Using ViSaRL to learn visual representations significantly improves the success rate, sample efficiency, and generalization of an RL agent on diverse tasks including DeepMind Control benchmark, robot manipulation in simulation and on a real robot. We present approaches for incorporating saliency into both CNN and Transformer-based encoders. We show that visual representations learned using ViSaRL are robust to various sources of visual perturbations including perceptual noise and scene variations. ViSaRL nearly doubles success rate on the real-robot tasks compared to the baseline which does not use saliency. △ Less

Submitted 16 March, 2024; originally announced March 2024.

arXiv:2402.15957 [pdf, other]

DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

Authors: Anthony Liang, Guy Tennenholtz, Chih-wei Hsu, Yinlam Chow, Erdem Bıyık, Craig Boutilier

Abstract: We introduce DynaMITE-RL, a meta-reinforcement learning (meta-RL) approach to approximate inference in environments where the latent state evolves at varying rates. We model episode sessions - parts of the episode where the latent state is fixed - and propose three key modifications to existing meta-RL methods: consistency of latent information within sessions, session masking, and prior latent co… ▽ More We introduce DynaMITE-RL, a meta-reinforcement learning (meta-RL) approach to approximate inference in environments where the latent state evolves at varying rates. We model episode sessions - parts of the episode where the latent state is fixed - and propose three key modifications to existing meta-RL methods: consistency of latent information within sessions, session masking, and prior latent conditioning. We demonstrate the importance of these modifications in various domains, ranging from discrete Gridworld environments to continuous-control and simulated robot assistive tasks, demonstrating that DynaMITE-RL significantly outperforms state-of-the-art baselines in sample efficiency and inference returns. △ Less

Submitted 24 February, 2024; originally announced February 2024.

arXiv:2402.11157 [pdf, other]

The Value of Context: Human versus Black Box Evaluators

Authors: Andrei Iakovlev, Annie Liang

Abstract: Machine learning algorithms are now capable of performing evaluations previously conducted by human experts (e.g., medical diagnoses). How should we conceptualize the difference between evaluation by humans and by algorithms, and when should an individual prefer one over the other? We propose a framework to examine one key distinction between the two forms of evaluation: Machine learning algorithm… ▽ More Machine learning algorithms are now capable of performing evaluations previously conducted by human experts (e.g., medical diagnoses). How should we conceptualize the difference between evaluation by humans and by algorithms, and when should an individual prefer one over the other? We propose a framework to examine one key distinction between the two forms of evaluation: Machine learning algorithms are standardized, fixing a common set of covariates by which to assess all individuals, while human evaluators customize which covariates are acquired to each individual. Our framework defines and analyzes the advantage of this customization -- the value of context -- in environments with high-dimensional data. We show that unless the agent has precise knowledge about the joint distribution of covariates, the benefit of additional covariates generally outweighs the value of context. △ Less

Submitted 29 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.03447 [pdf, other]

Challenges in Variable Importance Ranking Under Correlation

Authors: Annie Liang, Thomas Jemielita, Andy Liaw, Vladimir Svetnik, Lingkang Huang, Richard Baumgartner, Jason M. Klusowski

Abstract: Variable importance plays a pivotal role in interpretable machine learning as it helps measure the impact of factors on the output of the prediction model. Model agnostic methods based on the generation of "null" features via permutation (or related approaches) can be applied. Such analysis is often utilized in pharmaceutical applications due to its ability to interpret black-box models, including… ▽ More Variable importance plays a pivotal role in interpretable machine learning as it helps measure the impact of factors on the output of the prediction model. Model agnostic methods based on the generation of "null" features via permutation (or related approaches) can be applied. Such analysis is often utilized in pharmaceutical applications due to its ability to interpret black-box models, including tree-based ensembles. A major challenge and significant confounder in variable importance estimation however is the presence of between-feature correlation. Recently, several adjustments to marginal permutation utilizing feature knockoffs were proposed to address this issue, such as the variable importance measure known as conditional predictive impact (CPI). Assessment and evaluation of such approaches is the focus of our work. We first present a comprehensive simulation study investigating the impact of feature correlation on the assessment of variable importance. We then theoretically prove the limitation that highly correlated features pose for the CPI through the knockoff construction. While we expect that there is always no correlation between knockoff variables and its corresponding predictor variables, we prove that the correlation increases linearly beyond a certain correlation threshold between the predictor variables. Our findings emphasize the absence of free lunch when dealing with high feature correlation, as well as the necessity of understanding the utility and limitations behind methods in variable importance estimation. △ Less

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2401.09988 [pdf, other]

Developing an AI-based Integrated System for Bee Health Evaluation

Authors: Andrew Liang

Abstract: Honey bees pollinate about one-third of the world's food supply, but bee colonies have alarmingly declined by nearly 40% over the past decade due to several factors, including pesticides and pests. Traditional methods for monitoring beehives, such as human inspection, are subjective, disruptive, and time-consuming. To overcome these limitations, artificial intelligence has been used to assess beeh… ▽ More Honey bees pollinate about one-third of the world's food supply, but bee colonies have alarmingly declined by nearly 40% over the past decade due to several factors, including pesticides and pests. Traditional methods for monitoring beehives, such as human inspection, are subjective, disruptive, and time-consuming. To overcome these limitations, artificial intelligence has been used to assess beehive health. However, previous studies have lacked an end-to-end solution and primarily relied on data from a single source, either bee images or sounds. This study introduces a comprehensive system consisting of bee object detection and health evaluation. Additionally, it utilized a combination of visual and audio signals to analyze bee behaviors. An Attention-based Multimodal Neural Network (AMNN) was developed to adaptively focus on key features from each type of signal for accurate bee health assessment. The AMNN achieved an overall accuracy of 92.61%, surpassing eight existing single-signal Convolutional Neural Networks and Recurrent Neural Networks. It outperformed the best image-based model by 32.51% and the top sound-based model by 13.98% while maintaining efficient processing times. Furthermore, it improved prediction robustness, attaining an F1-score higher than 90% across all four evaluated health conditions. The study also shows that audio signals are more reliable than images for assessing bee health. By seamlessly integrating AMNN with image and sound data in a comprehensive bee health monitoring system, this approach provides a more efficient and non-invasive solution for the early detection of bee diseases and the preservation of bee colonies. △ Less

Submitted 18 January, 2024; originally announced January 2024.

arXiv:2312.16364 [pdf, other]

Robustness Verification for Knowledge-Based Logic of Risky Driving Scenes

Authors: Xia Wang, Anda Liang, Jonathan Sprinkle, Taylor T. Johnson

Abstract: Many decision-making scenarios in modern life benefit from the decision support of artificial intelligence algorithms, which focus on a data-driven philosophy and automated programs or systems. However, crucial decision issues related to security, fairness, and privacy should consider more human knowledge and principles to supervise such AI algorithms to reach more proper solutions and to benefit… ▽ More Many decision-making scenarios in modern life benefit from the decision support of artificial intelligence algorithms, which focus on a data-driven philosophy and automated programs or systems. However, crucial decision issues related to security, fairness, and privacy should consider more human knowledge and principles to supervise such AI algorithms to reach more proper solutions and to benefit society more effectively. In this work, we extract knowledge-based logic that defines risky driving formats learned from public transportation accident datasets, which haven't been analyzed in detail to the best of our knowledge. More importantly, this knowledge is critical for recognizing traffic hazards and could supervise and improve AI models in safety-critical systems. Then we use automated verification methods to verify the robustness of such logic. More specifically, we gather 72 accident datasets from Data.gov and organize them by state. Further, we train Decision Tree and XGBoost models on each state's dataset, deriving accident judgment logic. Finally, we deploy robustness verification on these tree-based models under multiple parameter combinations. △ Less

Submitted 26 December, 2023; originally announced December 2023.

arXiv:2308.09119 [pdf, other]

ICAR: Image-based Complementary Auto Reasoning

Authors: Xijun Wang, Anqi Liang, Junbang Liang, Ming Lin, Yu Lou, Shan Yang

Abstract: Scene-aware Complementary Item Retrieval (CIR) is a challenging task which requires to generate a set of compatible items across domains. Due to the subjectivity, it is difficult to set up a rigorous standard for both data collection and learning objectives. To address this challenging task, we propose a visual compatibility concept, composed of similarity (resembling in color, geometry, texture,… ▽ More Scene-aware Complementary Item Retrieval (CIR) is a challenging task which requires to generate a set of compatible items across domains. Due to the subjectivity, it is difficult to set up a rigorous standard for both data collection and learning objectives. To address this challenging task, we propose a visual compatibility concept, composed of similarity (resembling in color, geometry, texture, and etc.) and complementarity (different items like table vs chair completing a group). Based on this notion, we propose a compatibility learning framework, a category-aware Flexible Bidirectional Transformer (FBT), for visual "scene-based set compatibility reasoning" with the cross-domain visual similarity input and auto-regressive complementary item generation. We introduce a "Flexible Bidirectional Transformer (FBT)" consisting of an encoder with flexible masking, a category prediction arm, and an auto-regressive visual embedding prediction arm. And the inputs for FBT are cross-domain visual similarity invariant embeddings, making this framework quite generalizable. Furthermore, our proposed FBT model learns the inter-object compatibility from a large set of scene images in a self-supervised way. Compared with the SOTA methods, this approach achieves up to 5.3% and 9.6% in FITB score and 22.3% and 31.8% SFID improvement on fashion and furniture, respectively. △ Less

Submitted 17 August, 2023; originally announced August 2023.

arXiv:2305.17386 [pdf, other]

HyperFormer: Learning Expressive Sparse Feature Representations via Hypergraph Transformer

Authors: Kaize Ding, Albert Jiongqian Liang, Bryan Perrozi, Ting Chen, Ruoxi Wang, Lichan Hong, Ed H. Chi, Huan Liu, Derek Zhiyuan Cheng

Abstract: Learning expressive representations for high-dimensional yet sparse features has been a longstanding problem in information retrieval. Though recent deep learning methods can partially solve the problem, they often fail to handle the numerous sparse features, particularly those tail feature values with infrequent occurrences in the training data. Worse still, existing methods cannot explicitly lev… ▽ More Learning expressive representations for high-dimensional yet sparse features has been a longstanding problem in information retrieval. Though recent deep learning methods can partially solve the problem, they often fail to handle the numerous sparse features, particularly those tail feature values with infrequent occurrences in the training data. Worse still, existing methods cannot explicitly leverage the correlations among different instances to help further improve the representation learning on sparse features since such relational prior knowledge is not provided. To address these challenges, in this paper, we tackle the problem of representation learning on feature-sparse data from a graph learning perspective. Specifically, we propose to model the sparse features of different instances using hypergraphs where each node represents a data instance and each hyperedge denotes a distinct feature value. By passing messages on the constructed hypergraphs based on our Hypergraph Transformer (HyperFormer), the learned feature representations capture not only the correlations among different instances but also the correlations among features. Our experiments demonstrate that the proposed approach can effectively improve feature representation learning on sparse features. △ Less

Submitted 27 May, 2023; originally announced May 2023.

Comments: Accepted by SIGIR 2023

arXiv:2303.13056 [pdf, other]

Predicting the Initial Conditions of the Universe using a Deterministic Neural Network

Authors: Vaibhav Jindal, Albert Liang, Aarti Singh, Shirley Ho, Drew Jamieson

Abstract: Finding the initial conditions that led to the current state of the universe is challenging because it involves searching over an intractable input space of initial conditions, along with modeling their evolution via tools such as N-body simulations which are computationally expensive. Recently, deep learning has emerged as a surrogate for N-body simulations by directly learning the mapping betwee… ▽ More Finding the initial conditions that led to the current state of the universe is challenging because it involves searching over an intractable input space of initial conditions, along with modeling their evolution via tools such as N-body simulations which are computationally expensive. Recently, deep learning has emerged as a surrogate for N-body simulations by directly learning the mapping between the linear input of an N-body simulation and the final nonlinear output from the simulation, significantly accelerating the forward modeling. However, this still does not reduce the search space for initial conditions. In this work, we pioneer the use of a deterministic convolutional neural network for learning the reverse mapping and show that it accurately recovers the initial linear displacement field over a wide range of scales ($<1$-$2\%$ error up to nearly $k\simeq0.8$-$0.9 \text{ Mpc}^{-1}h$), despite the one-to-many mapping of the inverse problem (due to the divergent backward trajectories at smaller scales). Specifically, we train a V-Net architecture, which outputs the linear displacement of an N-body simulation, given the nonlinear displacement at redshift $z=0$ and the cosmological parameters. The results of our method suggest that a simple deterministic neural network is sufficient for accurately approximating the initial linear states, potentially obviating the need for the more complex and computationally demanding backward modeling methods that were recently proposed. △ Less

Submitted 13 December, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: Camera ready version for NeurIPS 2023 AI for Science workshop https://ai4sciencecommunity.github.io/neurips23.html

arXiv:2303.01778 [pdf, other]

FedML Parrot: A Scalable Federated Learning System via Heterogeneity-aware Scheduling on Sequential and Hierarchical Training

Authors: Zhenheng Tang, Xiaowen Chu, Ryan Yide Ran, Sunwoo Lee, Shaohuai Shi, Yonggang Zhang, Yuxin Wang, Alex Qiaozhong Liang, Salman Avestimehr, Chaoyang He

Abstract: Federated Learning (FL) enables collaborations among clients for train machine learning models while protecting their data privacy. Existing FL simulation platforms that are designed from the perspectives of traditional distributed training, suffer from laborious code migration between simulation and production, low efficiency, low GPU utility, low scalability with high hardware requirements and d… ▽ More Federated Learning (FL) enables collaborations among clients for train machine learning models while protecting their data privacy. Existing FL simulation platforms that are designed from the perspectives of traditional distributed training, suffer from laborious code migration between simulation and production, low efficiency, low GPU utility, low scalability with high hardware requirements and difficulty of simulating stateful clients. In this work, we firstly demystify the challenges and bottlenecks of simulating FL, and design a new FL system named as FedML \texttt{Parrot}. It improves the training efficiency, remarkably relaxes the requirements on the hardware, and supports efficient large-scale FL experiments with stateful clients by: (1) sequential training clients on devices; (2) decomposing original aggregation into local and global aggregation on devices and server respectively; (3) scheduling tasks to mitigate straggler problems and enhance computing utility; (4) distributed client state manager to support various FL algorithms. Besides, built upon our generic APIs and communication interfaces, users can seamlessly transform the simulation into the real-world deployment without modifying codes. We evaluate \texttt{Parrot} through extensive experiments for training diverse models on various FL datasets to demonstrate that \texttt{Parrot} can achieve simulating over 1000 clients (stateful or stateless) with flexible GPU devices setting ($4 \sim 32$) and high GPU utility, 1.2 $\sim$ 4 times faster than FedScale, and 10 $\sim$ 100 times memory saving than FedML. And we verify that \texttt{Parrot} works well with homogeneous and heterogeneous devices in three different clusters. Two FL algorithms with stateful clients and four algorithms with stateless clients are simulated to verify the wide adaptability of \texttt{Parrot} to different algorithms. △ Less

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2301.02232 [pdf, other]

CA$^2$T-Net: Category-Agnostic 3D Articulation Transfer from Single Image

Authors: Jasmine Collins, Anqi Liang, Jitendra Malik, Hao Zhang, Frédéric Devernay

Abstract: We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model. Our network learns to predict the object's pose, part segmentation, and corresponding motion parameters to reproduce the articulation shown in the input image. The network is composed of three distinct branches that take a shared joint image-shape… ▽ More We present a neural network approach to transfer the motion from a single image of an articulated object to a rest-state (i.e., unarticulated) 3D model. Our network learns to predict the object's pose, part segmentation, and corresponding motion parameters to reproduce the articulation shown in the input image. The network is composed of three distinct branches that take a shared joint image-shape embedding and is trained end-to-end. Unlike previous methods, our approach is independent of the topology of the object and can work with objects from arbitrary categories. Our method, trained with only synthetic data, can be used to automatically animate a mesh, infer motion from real images, and transfer articulation to functionally similar but geometrically distinct 3D models at test time. △ Less

Submitted 22 March, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: 8 pages

arXiv:2203.14019 [pdf, other]

TridentNetV2: Lightweight Graphical Global Plan Representations for Dynamic Trajectory Generation

Authors: David Paz, Hao Xiang, Andrew Liang, Henrik I. Christensen

Abstract: We present a framework for dynamic trajectory generation for autonomous navigation, which does not rely on HD maps as the underlying representation. High Definition (HD) maps have become a key component in most autonomous driving frameworks, which include complete road network information annotated at a centimeter-level that include traversable waypoints, lane information, and traffic signals. Ins… ▽ More We present a framework for dynamic trajectory generation for autonomous navigation, which does not rely on HD maps as the underlying representation. High Definition (HD) maps have become a key component in most autonomous driving frameworks, which include complete road network information annotated at a centimeter-level that include traversable waypoints, lane information, and traffic signals. Instead, the presented approach models the distributions of feasible ego-centric trajectories in real-time given a nominal graph-based global plan and a lightweight scene representation. By embedding contextual information, such as crosswalks, stop signs, and traffic signals, our approach achieves low errors across multiple urban navigation datasets that include diverse intersection maneuvers, while maintaining real-time performance and reducing network complexity. Underlying datasets introduced are available online. △ Less

Submitted 26 March, 2022; originally announced March 2022.

Comments: 7 pages, Accepted at ICRA 2022

arXiv:2203.13116 [pdf, other]

Egocentric Prediction of Action Target in 3D

Authors: Yiming Li, Ziang Cao, Andrew Liang, Benjamin Liang, Luoyao Chen, Hang Zhao, Chen Feng

Abstract: We are interested in anticipating as early as possible the target location of a person's object manipulation action in a 3D workspace from egocentric vision. It is important in fields like human-robot collaboration, but has not yet received enough attention from vision and learning communities. To stimulate more research on this challenging egocentric vision task, we propose a large multimodality… ▽ More We are interested in anticipating as early as possible the target location of a person's object manipulation action in a 3D workspace from egocentric vision. It is important in fields like human-robot collaboration, but has not yet received enough attention from vision and learning communities. To stimulate more research on this challenging egocentric vision task, we propose a large multimodality dataset of more than 1 million frames of RGB-D and IMU streams, and provide evaluation metrics based on our high-quality 2D and 3D labels from semi-automatic annotation. Meanwhile, we design baseline methods using recurrent neural networks and conduct various ablation studies to validate their effectiveness. Our results demonstrate that this new task is worthy of further study by researchers in robotics, vision, and learning communities. △ Less

Submitted 24 March, 2022; originally announced March 2022.

Comments: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

arXiv:2010.15195 [pdf, other]

doi 10.24963/ijcai.2021/306

Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment

Authors: Wilka Carvalho, Anthony Liang, Kimin Lee, Sungryull Sohn, Honglak Lee, Richard L. Lewis, Satinder Singh

Abstract: First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor virtual home-environment pose significant sample-efficiency challenges for reinforcement learning (RL) agents learning from sparse task rewards. To alleviate these challenges, prior work has provided extensive supervision via a combination of reward-shaping, ground-truth object-information, and e… ▽ More First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor virtual home-environment pose significant sample-efficiency challenges for reinforcement learning (RL) agents learning from sparse task rewards. To alleviate these challenges, prior work has provided extensive supervision via a combination of reward-shaping, ground-truth object-information, and expert demonstrations. In this work, we show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task during task learning with an object-centric relational RL agent. Our key insight is that learning an object-model that incorporates object-attention into forward prediction provides a dense learning signal for unsupervised representation learning of both objects and their relationships. This, in turn, enables faster policy learning for an object-centric relational RL agent. We demonstrate our agent by introducing a set of challenging object-interaction tasks in the AI2Thor environment where learning with our attentive object-model is key to strong performance. Specifically, we compare our agent and relational RL agents with alternative auxiliary tasks to a relational RL agent equipped with ground-truth object-information, and show that learning with our object-model best closes the performance gap in terms of both learning speed and maximum success rate. Additionally, we find that incorporating object-attention into an object-model's forward predictions is key to learning representations which capture object-category and object-state. △ Less

Submitted 20 May, 2021; v1 submitted 28 October, 2020; originally announced October 2020.

Comments: Accepted to IJCAI 2021

Journal ref: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021)

arXiv:2007.09213 [pdf, other]

How Flexible is that Functional Form? Quantifying the Restrictiveness of Theories

Authors: Drew Fudenberg, Wayne Gao, Annie Liang

Abstract: We propose a restrictiveness measure for economic models based on how well they fit synthetic data from a pre-defined class. This measure, together with a measure for how well the model fits real data, outlines a Pareto frontier, where models that rule out more regularities, yet capture the regularities that are present in real data, are preferred. To illustrate our approach, we evaluate the restr… ▽ More We propose a restrictiveness measure for economic models based on how well they fit synthetic data from a pre-defined class. This measure, together with a measure for how well the model fits real data, outlines a Pareto frontier, where models that rule out more regularities, yet capture the regularities that are present in real data, are preferred. To illustrate our approach, we evaluate the restrictiveness of popular models in two laboratory settings -- certainty equivalents and initial play -- and in one field setting -- takeup of microfinance in Indian villages. The restrictiveness measure reveals new insights about each of the models, including that some economic models with only a few parameters are very flexible. △ Less

Submitted 23 August, 2023; v1 submitted 17 July, 2020; originally announced July 2020.

arXiv:2006.06543 [pdf, other]

Data and Incentives

Authors: Annie Liang, Erik Madsen

Abstract: "Big data" gives markets access to previously unmeasured characteristics of individual agents. Policymakers must decide whether and how to regulate the use of this data. We study how new data affects incentives for agents to exert effort in settings such as the labor market, where an agent's quality is initially unknown but is forecast from an observable outcome. We show that measurement of a new… ▽ More "Big data" gives markets access to previously unmeasured characteristics of individual agents. Policymakers must decide whether and how to regulate the use of this data. We study how new data affects incentives for agents to exert effort in settings such as the labor market, where an agent's quality is initially unknown but is forecast from an observable outcome. We show that measurement of a new covariate has a systematic effect on the average effort exerted by agents, with the direction of the effect determined by whether the covariate is informative about long-run quality or about a shock to short-run outcomes. For a class of covariates satisfying a statistical property we call strong homoskedasticity, this effect is uniform across agents. More generally, new measurements can impact agents unequally, and we show that these distributional effects have a first-order impact on social welfare. △ Less

Submitted 1 September, 2022; v1 submitted 11 June, 2020; originally announced June 2020.

arXiv:1910.07022 [pdf, other]

Measuring the Completeness of Theories

Authors: Drew Fudenberg, Jon Kleinberg, Annie Liang, Sendhil Mullainathan

Abstract: We use machine learning to provide a tractable measure of the amount of predictable variation in the data that a theory captures, which we call its "completeness." We apply this measure to three problems: assigning certain equivalents to lotteries, initial play in games, and human generation of random sequences. We discover considerable variation in the completeness of existing models, which sheds… ▽ More We use machine learning to provide a tractable measure of the amount of predictable variation in the data that a theory captures, which we call its "completeness." We apply this measure to three problems: assigning certain equivalents to lotteries, initial play in games, and human generation of random sequences. We discover considerable variation in the completeness of existing models, which sheds light on whether to focus on developing better models with the same features or instead to look for new features that will improve predictions. We also illustrate how and why completeness varies with the experiments considered, which highlights the role played in choosing which experiments to run. △ Less

Submitted 15 October, 2019; originally announced October 2019.

arXiv:1910.07018 [pdf, other]

Games of Incomplete Information Played By Statisticians

Authors: Annie Liang

Abstract: Players are statistical learners who learn about payoffs from data. They may interpret the same data differently, but have common knowledge of a class of learning procedures. I propose a metric for the analyst's "confidence" in a strategic prediction, based on the probability that the prediction is consistent with the realized data. The main results characterize the analyst's confidence in a given… ▽ More Players are statistical learners who learn about payoffs from data. They may interpret the same data differently, but have common knowledge of a class of learning procedures. I propose a metric for the analyst's "confidence" in a strategic prediction, based on the probability that the prediction is consistent with the realized data. The main results characterize the analyst's confidence in a given prediction as the quantity of data grows large, and provide bounds for small datasets. The approach generates new predictions, e.g. that speculative trade is more likely given high-dimensional data, and that coordination is less likely given noisy data. △ Less

Submitted 9 July, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

arXiv:1910.07015 [pdf, other]

Dynamically Aggregating Diverse Information

Authors: Annie Liang, Xiaosheng Mu, Vasilis Syrgkanis

Abstract: An agent has access to multiple information sources, each of which provides information about a different attribute of an unknown state. Information is acquired continuously -- where the agent chooses both which sources to sample from, and also how to allocate attention across them -- until an endogenously chosen time, at which point a decision is taken. We provide an exact characterization of the… ▽ More An agent has access to multiple information sources, each of which provides information about a different attribute of an unknown state. Information is acquired continuously -- where the agent chooses both which sources to sample from, and also how to allocate attention across them -- until an endogenously chosen time, at which point a decision is taken. We provide an exact characterization of the optimal information acquisition strategy under weak conditions on the agent's prior belief about the different attributes. We then apply this characterization to derive new results regarding: (1) endogenous information acquisition for binary choice, (2) strategic information provision by biased news sources, and (3) the dynamic consequences of attention manipulation. △ Less

Submitted 23 April, 2021; v1 submitted 15 October, 2019; originally announced October 2019.

arXiv:1805.08134 [pdf, ps, other]

Overabundant Information and Learning Traps

Authors: Annie Liang, Xiaosheng Mu

Abstract: We develop a model of social learning from overabundant information: Short-lived agents sequentially choose from a large set of (flexibly correlated) information sources for prediction of an unknown state. Signal realizations are public. We demonstrate two starkly different long-run outcomes: (1) efficient information aggregation, where the community eventually learns as fast as possible; (2) "lea… ▽ More We develop a model of social learning from overabundant information: Short-lived agents sequentially choose from a large set of (flexibly correlated) information sources for prediction of an unknown state. Signal realizations are public. We demonstrate two starkly different long-run outcomes: (1) efficient information aggregation, where the community eventually learns as fast as possible; (2) "learning traps," where the community gets stuck observing suboptimal sources and learns inefficiently. Our main results identify a simple property of the signal correlation structure that separates these outcomes. In both regimes, we characterize which sources are observed in the long run and how often. △ Less

Submitted 18 June, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

arXiv:1801.01019 [pdf]

Proteomics Analysis of FLT3-ITD Mutation in Acute Myeloid Leukemia Using Deep Learning Neural Network

Authors: Christine A. Liang, Lei Chen, Amer Wahed, Andy N. D. Nguyen

Abstract: Deep Learning can significantly benefit cancer proteomics and genomics. In this study, we attempt to determine a set of critical proteins that are associated with the FLT3-ITD mutation in newly-diagnosed acute myeloid leukemia patients. A Deep Learning network consisting of autoencoders forming a hierarchical model from which high-level features are extracted without labeled training data. Dimensi… ▽ More Deep Learning can significantly benefit cancer proteomics and genomics. In this study, we attempt to determine a set of critical proteins that are associated with the FLT3-ITD mutation in newly-diagnosed acute myeloid leukemia patients. A Deep Learning network consisting of autoencoders forming a hierarchical model from which high-level features are extracted without labeled training data. Dimensional reduction reduced the number of critical proteins from 231 to 20. Deep Learning found an excellent correlation between FLT3-ITD mutation with the levels of these 20 critical proteins (accuracy 97%, sensitivity 90%, specificity 100%). Our Deep Learning network could hone in on 20 proteins with the strongest association with FLT3-ITD. The results of this study allow a novel approach to determine critical protein pathways in the FLT3-ITD mutation, and provide proof-of-concept for an accurate approach to model big data in cancer proteomics and genomics. △ Less

Submitted 29 December, 2017; originally announced January 2018.

Comments: 12 pages, 4 figures, 2 tables

arXiv:1706.06974 [pdf, other]

The Theory is Predictive, but is it Complete? An Application to Human Perception of Randomness

Authors: Jon Kleinberg, Annie Liang, Sendhil Mullainathan

Abstract: When we test a theory using data, it is common to focus on correctness: do the predictions of the theory match what we see in the data? But we also care about completeness: how much of the predictable variation in the data is captured by the theory? This question is difficult to answer, because in general we do not know how much "predictable variation" there is in the problem. In this paper, we co… ▽ More When we test a theory using data, it is common to focus on correctness: do the predictions of the theory match what we see in the data? But we also care about completeness: how much of the predictable variation in the data is captured by the theory? This question is difficult to answer, because in general we do not know how much "predictable variation" there is in the problem. In this paper, we consider approaches motivated by machine learning algorithms as a means of constructing a benchmark for the best attainable level of prediction. We illustrate our methods on the task of predicting human-generated random sequences. Relative to an atheoretical machine learning algorithm benchmark, we find that existing behavioral models explain roughly 15 percent of the predictable variation in this problem. This fraction is robust across several variations on the problem. We also consider a version of this approach for analyzing field data from domains in which human perception and generation of randomness has been used as a conceptual framework; these include sequential decision-making and repeated zero-sum games. In these domains, our framework for testing the completeness of theories provides a way of assessing their effectiveness over different contexts; we find that despite some differences, the existing theories are fairly stable across our field domains in their performance relative to the benchmark. Overall, our results indicate that (i) there is a significant amount of structure in this problem that existing models have yet to capture and (ii) there are rich domains in which machine learning may provide a viable approach to testing completeness. △ Less

Submitted 21 June, 2017; originally announced June 2017.

arXiv:1703.06367 [pdf, other]

Optimal and Myopic Information Acquisition

Authors: Annie Liang, Xiaosheng Mu, Vasilis Syrgkanis

Abstract: We consider the problem of optimal dynamic information acquisition from many correlated information sources. Each period, the decision-maker jointly takes an action and allocates a fixed number of observations across the available sources. His payoff depends on the actions taken and on an unknown state. In the canonical setting of jointly normal information sources, we show that the optimal dynami… ▽ More We consider the problem of optimal dynamic information acquisition from many correlated information sources. Each period, the decision-maker jointly takes an action and allocates a fixed number of observations across the available sources. His payoff depends on the actions taken and on an unknown state. In the canonical setting of jointly normal information sources, we show that the optimal dynamic information acquisition rule proceeds myopically after finitely many periods. If signals are acquired in large blocks each period, then the optimal rule turns out to be myopic from period 1. These results demonstrate the possibility of robust and "simple" optimal information acquisition, and simplify the analysis of dynamic information acquisition in a widely used informational environment. △ Less

Submitted 14 May, 2018; v1 submitted 18 March, 2017; originally announced March 2017.

Showing 1–27 of 27 results for author: Liang, A