subscribe to arXiv mailings

Generative Image as Action Models

Authors: Mohit Shridhar, Yat Long Lo, Stephen James

Abstract: Image-generation diffusion models have been fine-tuned to unlock new capabilities such as image-editing and novel view synthesis. Can we similarly unlock image-generation models for visuomotor control? We present GENIMA, a behavior-cloning agent that fine-tunes Stable Diffusion to 'draw joint-actions' as targets on RGB images. These images are fed into a controller that maps the visual targets int… ▽ More Image-generation diffusion models have been fine-tuned to unlock new capabilities such as image-editing and novel view synthesis. Can we similarly unlock image-generation models for visuomotor control? We present GENIMA, a behavior-cloning agent that fine-tunes Stable Diffusion to 'draw joint-actions' as targets on RGB images. These images are fed into a controller that maps the visual targets into a sequence of joint-positions. We study GENIMA on 25 RLBench and 9 real-world manipulation tasks. We find that, by lifting actions into image-space, internet pre-trained diffusion models can generate policies that outperform state-of-the-art visuomotor approaches, especially in robustness to scene perturbations and generalizing to novel objects. Our method is also competitive with 3D agents, despite lacking priors such as depth, keypoints, or motion-planners. △ Less

Submitted 10 July, 2024; originally announced July 2024.

Comments: Project website, code, checkpoints: https://genima-robot.github.io/

arXiv:2401.11445 [pdf, other]

Towards Non-Robocentric Dynamic Landing of Quadrotor UAVs

Authors: Li-Yu Lo, Boyang Li, Chih-Yung Wen, Ching-Wei Chang

Abstract: In this work, we propose a dynamic landing solution without the need for onboard exteroceptive sensors and an expensive computation unit, where all localization and control modules are carried out on the ground in a non-inertial frame. Our system starts with a relative state estimator of the aerial robot from the perspective of the landing platform, where the state tracking of the UAV is done thro… ▽ More In this work, we propose a dynamic landing solution without the need for onboard exteroceptive sensors and an expensive computation unit, where all localization and control modules are carried out on the ground in a non-inertial frame. Our system starts with a relative state estimator of the aerial robot from the perspective of the landing platform, where the state tracking of the UAV is done through a set of onboard LED markers and an on-ground camera; the state is expressed geometrically on manifold, and is returned by Iterated Extended Kalman filter (IEKF) algorithm. Subsequently, a motion planning module is developed to guide the landing process, formulating it as a minimum jerk trajectory by applying the differential flatness property. Considering visibility and dynamic constraints, the problem is solved using quadratic programming, and the final motion primitive is expressed through piecewise polynomials. Through a series of experiments, the applicability of this approach is validated by successfully landing 18 cm x 18 cm quadrotor on a 43 cm x 43 cm platform, exhibiting performance comparable to conventional methods. Finally, we provide comprehensive hardware and software details to the research community for future reference. △ Less

Submitted 21 January, 2024; originally announced January 2024.

arXiv:2309.01445 [pdf, other]

Why Change My Design: Explaining Poorly Constructed Visualization Designs with Explorable Explanations

Authors: Leo Yu-Ho Lo, Yifan Cao, Leni Yang, Huamin Qu

Abstract: Although visualization tools are widely available and accessible, not everyone knows the best practices and guidelines for creating accurate and honest visual representations of data. Numerous books and articles have been written to expose the misleading potential of poorly constructed charts and teach people how to avoid being deceived by them or making their own mistakes. These readings use vari… ▽ More Although visualization tools are widely available and accessible, not everyone knows the best practices and guidelines for creating accurate and honest visual representations of data. Numerous books and articles have been written to expose the misleading potential of poorly constructed charts and teach people how to avoid being deceived by them or making their own mistakes. These readings use various rhetorical devices to explain the concepts to their readers. In our analysis of a collection of books, online materials, and a design workshop, we identified six common explanation methods. To assess the effectiveness of these methods, we conducted two crowdsourced studies (each with N = 125) to evaluate their ability to teach and persuade people to make design changes. In addition to these existing methods, we brought in the idea of Explorable Explanations, which allows readers to experiment with different chart settings and observe how the changes are reflected in the visualization. While we did not find significant differences across explanation methods, the results of our experiments indicate that, following the exposure to the explanations, the participants showed improved proficiency in identifying deceptive charts and were more receptive to proposed alterations of the visualization design. We discovered that participants were willing to accept more than 60% of the proposed adjustments in the persuasiveness assessment. Nevertheless, we found no significant differences among different explanation methods in convincing participants to accept the modifications. △ Less

Submitted 4 September, 2023; originally announced September 2023.

Comments: To be presented at IEEE VIS 2023

arXiv:2307.14267 [pdf, other]

Improving International Climate Policy via Mutually Conditional Binding Commitments

Authors: Jobst Heitzig, Jörg Oechssler, Christoph Pröschel, Niranjana Ragavan, Yat Long Lo

Abstract: The Paris Agreement, considered a significant milestone in climate negotiations, has faced challenges in effectively addressing climate change due to the unconditional nature of most Nationally Determined Contributions (NDCs). This has resulted in a prevalence of free-riding behavior among major polluters and a lack of concrete conditionality in NDCs. To address this issue, we propose the implemen… ▽ More The Paris Agreement, considered a significant milestone in climate negotiations, has faced challenges in effectively addressing climate change due to the unconditional nature of most Nationally Determined Contributions (NDCs). This has resulted in a prevalence of free-riding behavior among major polluters and a lack of concrete conditionality in NDCs. To address this issue, we propose the implementation of a decentralized, bottom-up approach called the Conditional Commitment Mechanism. This mechanism, inspired by the National Popular Vote Interstate Compact, offers flexibility and incentives for early adopters, aiming to formalize conditional cooperation in international climate policy. In this paper, we provide an overview of the mechanism, its performance in the AI4ClimateCooperation challenge, and discuss potential real-world implementation aspects. Prior knowledge of the climate mitigation collective action problem, basic economic principles, and game theory concepts are assumed. △ Less

Submitted 26 July, 2023; originally announced July 2023.

Comments: Presented at AI For Global Climate Cooperation Competition, 2023 (arXiv:cs/2307.06951)

Report number: AI4GCC/2023/track2/3

arXiv:2307.12882 [pdf, other]

doi 10.1145/3588001.3609364

FoodWise: Food Waste Reduction and Behavior Change on Campus with Data Visualization and Gamification

Authors: Yue Yu, Sophia Yi, Xi Nan, Leo Yu-Ho Lo, Kento Shigyo, Liwenhan Xie, Jeffry Wicaksana, Kwang-Ting Cheng, Huamin Qu

Abstract: Food waste presents a substantial challenge with significant environmental and economic ramifications, and its severity on campus environments is of particular concern. In response to this, we introduce FoodWise, a dual-component system tailored to inspire and incentivize campus communities to reduce food waste. The system consists of a data storytelling dashboard that graphically displays food wa… ▽ More Food waste presents a substantial challenge with significant environmental and economic ramifications, and its severity on campus environments is of particular concern. In response to this, we introduce FoodWise, a dual-component system tailored to inspire and incentivize campus communities to reduce food waste. The system consists of a data storytelling dashboard that graphically displays food waste information from university canteens, coupled with a mobile web application that encourages users to log their food waste reduction actions and rewards active participants for their efforts. Deployed during a two-week food-saving campaign at The Hong Kong University of Science and Technology (HKUST) in March 2023, FoodWise engaged over 200 participants from the university community, resulting in the logging of over 800 daily food-saving actions. Feedback collected post-campaign underscores the system's efficacy in elevating user consciousness about food waste and prompting behavioral shifts towards a more sustainable campus. This paper also provides insights for enhancing our system, contributing to a broader discourse on sustainable campus initiatives. △ Less

Submitted 27 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Comments: Accepted in ACM SIGCAS/SIGCHI Conference on Computing and Sustainable Societies (COMPASS) 2023

arXiv:2307.01403 [pdf, other]

Learning Multi-Agent Communication with Contrastive Learning

Authors: Yat Long Lo, Biswa Sengupta, Jakob Foerster, Michael Noukhovitch

Abstract: Communication is a powerful tool for coordination in multi-agent RL. But inducing an effective, common language is a difficult challenge, particularly in the decentralized setting. In this work, we introduce an alternative perspective where communicative messages sent between agents are considered as different incomplete views of the environment state. By examining the relationship between message… ▽ More Communication is a powerful tool for coordination in multi-agent RL. But inducing an effective, common language is a difficult challenge, particularly in the decentralized setting. In this work, we introduce an alternative perspective where communicative messages sent between agents are considered as different incomplete views of the environment state. By examining the relationship between messages sent and received, we propose to learn to communicate using contrastive learning to maximize the mutual information between messages of a given trajectory. In communication-essential environments, our method outperforms previous work in both performance and learning speed. Using qualitative metrics and representation probing, we show that our method induces more symmetric communication and captures global state information from the environment. Overall, we show the power of contrastive learning and the importance of leveraging messages as encodings for effective communication. △ Less

Submitted 1 February, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: The 12th International Conference on Learning Representations (ICLR)

arXiv:2303.10733 [pdf, other]

Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning

Authors: Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson

Abstract: By enabling agents to communicate, recent cooperative multi-agent reinforcement learning (MARL) methods have demonstrated better task performance and more coordinated behavior. Most existing approaches facilitate inter-agent communication by allowing agents to send messages to each other through free communication channels, i.e., cheap talk channels. Current methods require these channels to be co… ▽ More By enabling agents to communicate, recent cooperative multi-agent reinforcement learning (MARL) methods have demonstrated better task performance and more coordinated behavior. Most existing approaches facilitate inter-agent communication by allowing agents to send messages to each other through free communication channels, i.e., cheap talk channels. Current methods require these channels to be constantly accessible and known to the agents a priori. In this work, we lift these requirements such that the agents must discover the cheap talk channels and learn how to use them. Hence, the problem has two main parts: cheap talk discovery (CTD) and cheap talk utilization (CTU). We introduce a novel conceptual framework for both parts and develop a new algorithm based on mutual information maximization that outperforms existing algorithms in CTD/CTU settings. We also release a novel benchmark suite to stimulate future research in CTD/CTU. △ Less

Submitted 19 March, 2023; originally announced March 2023.

Comments: The 11th International Conference on Learning Representations (ICLR)

arXiv:2209.08751 [pdf, other]

doi 10.1145/3555597

Bias-Aware Design for Informed Decisions: Raising Awareness of Self-Selection Bias in User Ratings and Reviews

Authors: Qian Zhu, Leo Yu-Ho Lo, Meng Xia, Zixin Chen, Xiaojuan Ma

Abstract: People often take user ratings and reviews into consideration when shopping for products or services online. However, such user-generated data contains self-selection bias that could affect people decisions and it is hard to resolve this issue completely by algorithms. In this work, we propose to raise the awareness of the self-selection bias by making three types of information concerning user ra… ▽ More People often take user ratings and reviews into consideration when shopping for products or services online. However, such user-generated data contains self-selection bias that could affect people decisions and it is hard to resolve this issue completely by algorithms. In this work, we propose to raise the awareness of the self-selection bias by making three types of information concerning user ratings and reviews transparent. We distill these three pieces of information (reviewers experience, the extremity of emotion, and reported aspects) from the definition of self-selection bias and exploration of related literature. We further conduct an online survey to assess the perceptions of the usefulness of such information and identify the exact facets people care about in their decision process. Then, we propose a visual design to make such details behind user reviews transparent and integrate the design into an experimental website for evaluation. The results of a between-subjects study demonstrate that our bias-aware design significantly increases the awareness of bias and their satisfaction with decision-making. We further offer a series of design implications for improving information transparency and awareness of bias in user-generated content. △ Less

Submitted 19 September, 2022; originally announced September 2022.

Journal ref: Proc. ACM Hum.-Comput. Interact. 6, CSCW2022

arXiv:2208.10603 [pdf, other]

Exploring Interactions with Printed Data Visualizations in Augmented Reality

Authors: Wai Tong, Zhutian Chen, Meng Xia, Leo Yu-Ho Lo, Linping Yuan, Benjamin Bach, Huamin Qu

Abstract: This paper presents a design space of interaction techniques to engage with visualizations that are printed on paper and augmented through Augmented Reality. Paper sheets are widely used to deploy visualizations and provide a rich set of tangible affordances for interactions, such as touch, folding, tilting, or stacking. At the same time, augmented reality can dynamically update visualization cont… ▽ More This paper presents a design space of interaction techniques to engage with visualizations that are printed on paper and augmented through Augmented Reality. Paper sheets are widely used to deploy visualizations and provide a rich set of tangible affordances for interactions, such as touch, folding, tilting, or stacking. At the same time, augmented reality can dynamically update visualization content to provide commands such as pan, zoom, filter, or detail on demand. This paper is the first to provide a structured approach to mapping possible actions with the paper to interaction commands. This design space and the findings of a controlled user study have implications for future designs of augmented reality systems involving paper sheets and visualizations. Through workshops (N=20) and ideation, we identified 81 interactions that we classify in three dimensions: 1) commands that can be supported by an interaction, 2) the specific parameters provided by an (inter)action with paper, and 3) the number of paper sheets involved in an interaction. We tested user preference and viability of 11 of these interactions with a prototype implementation in a controlled study (N=12, HoloLens 2) and found that most of the interactions are intuitive and engaging to use. We summarized interactions (e.g., tilt to pan) that have strong affordance to complement "point" for data exploration, physical limitations and properties of paper as a medium, cases requiring redundancy and shortcuts, and other implications for design. △ Less

Submitted 22 August, 2022; originally announced August 2022.

Comments: 11 pages, 9 figures, 1 table, accepted at IEEE VIS 2022

arXiv:2204.09548 [pdf, other]

Misinformed by Visualization: What Do We Learn From Misinformative Visualizations?

Authors: Leo Yu-Ho Lo, Ayush Gupta, Kento Shigyo, Aoyu Wu, Enrico Bertini, Huamin Qu

Abstract: Data visualization is powerful in persuading an audience. However, when it is done poorly or maliciously, a visualization may become misleading or even deceiving. Visualizations give further strength to the dissemination of misinformation on the Internet. The visualization research community has long been aware of visualizations that misinform the audience, mostly associated with the terms "lie" a… ▽ More Data visualization is powerful in persuading an audience. However, when it is done poorly or maliciously, a visualization may become misleading or even deceiving. Visualizations give further strength to the dissemination of misinformation on the Internet. The visualization research community has long been aware of visualizations that misinform the audience, mostly associated with the terms "lie" and "deceptive." Still, these discussions have focused only on a handful of cases. To better understand the landscape of misleading visualizations, we open-coded over one thousand real-world visualizations that have been reported as misleading. From these examples, we discovered 74 types of issues and formed a taxonomy of misleading elements in visualizations. We found four directions that the research community can follow to widen the discussion on misleading visualizations: (1) informal fallacies in visualizations, (2) exploiting conventions and data literacy, (3) deceptive tricks in uncommon charts, and (4) understanding the designers' dilemma. This work lays the groundwork for these research directions, especially in understanding, detecting, and preventing them. △ Less

Submitted 20 April, 2022; originally announced April 2022.

Comments: To be presented at EuroVis 2022. Appendix available at https://github.com/leoyuholo/bad-vis-browser/blob/master/misinformed_by_visualization_appendix.pdf

arXiv:2203.03344 [pdf, other]

Learning to Ground Decentralized Multi-Agent Communication with Contrastive Learning

Authors: Yat Long Lo, Biswa Sengupta

Abstract: For communication to happen successfully, a common language is required between agents to understand information communicated by one another. Inducing the emergence of a common language has been a difficult challenge to multi-agent learning systems. In this work, we introduce an alternative perspective to the communicative messages sent between agents, considering them as different incomplete view… ▽ More For communication to happen successfully, a common language is required between agents to understand information communicated by one another. Inducing the emergence of a common language has been a difficult challenge to multi-agent learning systems. In this work, we introduce an alternative perspective to the communicative messages sent between agents, considering them as different incomplete views of the environment state. Based on this perspective, we propose a simple approach to induce the emergence of a common language by maximizing the mutual information between messages of a given trajectory in a self-supervised manner. By evaluating our method in communication-essential environments, we empirically show how our method leads to better learning performance and speed, and learns a more consistent common language than existing methods, without introducing additional learning parameters. △ Less

Submitted 7 March, 2022; originally announced March 2022.

Journal ref: EmeCom at ICLR 2022

arXiv:2107.03891 [pdf, other]

Technical Report for Valence-Arousal Estimation in ABAW2 Challenge

Authors: Hong-Xia Xie, I-Hsuan Li, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng

Abstract: In this work, we describe our method for tackling the valence-arousal estimation challenge from ABAW2 ICCV-2021 Competition. The competition organizers provide an in-the-wild Aff-Wild2 dataset for participants to analyze affective behavior in real-life settings. We use a two stream model to learn emotion features from appearance and action respectively. To solve data imbalanced problem, we apply l… ▽ More In this work, we describe our method for tackling the valence-arousal estimation challenge from ABAW2 ICCV-2021 Competition. The competition organizers provide an in-the-wild Aff-Wild2 dataset for participants to analyze affective behavior in real-life settings. We use a two stream model to learn emotion features from appearance and action respectively. To solve data imbalanced problem, we apply label distribution smoothing (LDS) to re-weight labels. Our proposed method achieves Concordance Correlation Coefficient (CCC) of 0.591 and 0.617 for valence and arousal on the validation set of Aff-wild2 dataset. △ Less

Submitted 8 July, 2021; originally announced July 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2105.01502

arXiv:2012.14477 [pdf, other]

doi 10.1145/3411764.3445444

Causal Perception in Question-Answering Systems

Authors: Po-Ming Law, Leo Yu-Ho Lo, Alex Endert, John Stasko, Huamin Qu

Abstract: Root cause analysis is a common data analysis task. While question-answering systems enable people to easily articulate a why question (e.g., why students in Massachusetts have high ACT Math scores on average) and obtain an answer, these systems often produce questionable causal claims. To investigate how such claims might mislead users, we conducted two crowdsourced experiments to study the impac… ▽ More Root cause analysis is a common data analysis task. While question-answering systems enable people to easily articulate a why question (e.g., why students in Massachusetts have high ACT Math scores on average) and obtain an answer, these systems often produce questionable causal claims. To investigate how such claims might mislead users, we conducted two crowdsourced experiments to study the impact of showing different information on user perceptions of a question-answering system. We found that in a system that occasionally provided unreasonable responses, showing a scatterplot increased the plausibility of unreasonable causal claims. Also, simply warning participants that correlation is not causation seemed to lead participants to accept reasonable causal claims more cautiously. We observed a strong tendency among participants to associate correlation with causation. Yet, the warning appeared to reduce the tendency. Grounded in the findings, we propose ways to reduce the illusion of causality when using question-answering systems. △ Less

Submitted 6 January, 2021; v1 submitted 28 December, 2020; originally announced December 2020.

Comments: ACM Conference on Human Factors in Computing Systems (CHI 2021)

arXiv:2012.11307 [pdf, other]

An Overview of Facial Micro-Expression Analysis: Data, Methodology and Challenge

Authors: Hong-Xia Xie, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng

Abstract: Facial micro-expressions indicate brief and subtle facial movements that appear during emotional communication. In comparison to macro-expressions, micro-expressions are more challenging to be analyzed due to the short span of time and the fine-grained changes. In recent years, micro-expression recognition (MER) has drawn much attention because it can benefit a wide range of applications, e.g. pol… ▽ More Facial micro-expressions indicate brief and subtle facial movements that appear during emotional communication. In comparison to macro-expressions, micro-expressions are more challenging to be analyzed due to the short span of time and the fine-grained changes. In recent years, micro-expression recognition (MER) has drawn much attention because it can benefit a wide range of applications, e.g. police interrogation, clinical diagnosis, depression analysis, and business negotiation. In this survey, we offer a fresh overview to discuss new research directions and challenges these days for MER tasks. For example, we review MER approaches from three novel aspects: macro-to-micro adaptation, recognition based on key apex frames, and recognition based on facial action units. Moreover, to mitigate the problem of limited and biased ME data, synthetic data generation is surveyed for the diversity enrichment of micro-expression data. Since micro-expression spotting can boost micro-expression analysis, the state-of-the-art spotting works are also introduced in this paper. At last, we discuss the challenges in MER research and provide potential solutions as well as possible directions for further investigation. △ Less

Submitted 21 December, 2020; originally announced December 2020.

Comments: 20 pages, 7 figures

arXiv:2011.12566 [pdf]

ColdGAN: Resolving Cold Start User Recommendation by using Generative Adversarial Networks

Authors: Po-Lin Lai, Chih-Yun Chen, Liang-Wei Lo, Chien-Chin Chen

Abstract: Mitigating the new user cold-start problem has been critical in the recommendation system for online service providers to influence user experience in decision making which can ultimately affect the intention of users to use a particular service. Previous studies leveraged various side information from users and items; however, it may be impractical due to privacy concerns. In this paper, we prese… ▽ More Mitigating the new user cold-start problem has been critical in the recommendation system for online service providers to influence user experience in decision making which can ultimately affect the intention of users to use a particular service. Previous studies leveraged various side information from users and items; however, it may be impractical due to privacy concerns. In this paper, we present ColdGAN, an end-to-end GAN based model with no use of side information to resolve this problem. The main idea of the proposed model is to train a network that learns the rating distributions of experienced users given their cold-start distributions. We further design a time-based function to restore the preferences of users to cold-start states. With extensive experiments on two real-world datasets, the results show that our proposed method achieves significantly improved performance compared with the state-of-the-art recommenders. △ Less

Submitted 25 November, 2020; originally announced November 2020.

arXiv:2010.01959 [pdf, other]

Topological Analysis of Magnetic Reconnection in Kinetic Plasma Simulations

Authors: Divya Banesh, Li-Ta Lo, Patrick Kilian, Fan Guo, Bernd Hamann

Abstract: Magnetic reconnection is a ubiquitous plasma process in which oppositely directed magnetic field lines break and rejoin, resulting in a change of the magnetic field topology. Reconnection generates magnetic islands: regions enclosed by magnetic field lines and separated by reconnection points. Proper identification of these features is important to understand particle acceleration and overall beha… ▽ More Magnetic reconnection is a ubiquitous plasma process in which oppositely directed magnetic field lines break and rejoin, resulting in a change of the magnetic field topology. Reconnection generates magnetic islands: regions enclosed by magnetic field lines and separated by reconnection points. Proper identification of these features is important to understand particle acceleration and overall behavior of plasma. We present a contour-tree based visualization for robust and objective identification of islands and reconnection points in two-dimensional (2D) magnetic reconnection simulations. The application of this visualization to a simple simulation has revealed a physical phenomenon previously not reported, resulting in a more comprehensive understanding of magnetic reconnection. △ Less

Submitted 1 September, 2020; originally announced October 2020.

Comments: To to published in IEEE Transactions on Visualization and Computer Graphics 2020

arXiv:2007.10491 [pdf, other]

BeeSwarm: Enabling Scalability Tests in Continuous Integration

Authors: Jieyang Chen, Qiang Guan, Li-Ta Lo, Patricia Grubel, Tim Randles

Abstract: Testing is one of the most important steps in software development. It ensures the quality of software. Continuous Integration (CI) is a widely used testing system that can report software quality to the developer in a timely manner during the development progress. Performance, especially scalability, is another key factor for High Performance Computing (HPC) applications. Though there are many ap… ▽ More Testing is one of the most important steps in software development. It ensures the quality of software. Continuous Integration (CI) is a widely used testing system that can report software quality to the developer in a timely manner during the development progress. Performance, especially scalability, is another key factor for High Performance Computing (HPC) applications. Though there are many applications and tools to profile the performance of HPC applications, none of them are integrated into the continuous integration. On the other hand, no current continuous integration tools provide easy-to-use scalability test capabilities. In this work, we propose BeeSwarm, a scalability test system that can be easily applied to the current CI test environment enabling scalability test capability for HPC developers. As a showcase, BeeSwarm is integrated into Travis CI and GitLab CI to execute the scalability test workflow on Chameleon cloud. △ Less

Submitted 20 July, 2020; originally announced July 2020.

arXiv:2004.08915 [pdf, other]

MER-GCN: Micro Expression Recognition Based on Relation Modeling with Graph Convolutional Network

Authors: Ling Lo, Hong-Xia Xie, Hong-Han Shuai, Wen-Huang Cheng

Abstract: Micro-Expression (ME) is the spontaneous, involuntary movement of a face that can reveal the true feeling. Recently, increasing researches have paid attention to this field combing deep learning techniques. Action units (AUs) are the fundamental actions reflecting the facial muscle movements and AU detection has been adopted by many researches to classify facial expressions. However, the time-cons… ▽ More Micro-Expression (ME) is the spontaneous, involuntary movement of a face that can reveal the true feeling. Recently, increasing researches have paid attention to this field combing deep learning techniques. Action units (AUs) are the fundamental actions reflecting the facial muscle movements and AU detection has been adopted by many researches to classify facial expressions. However, the time-consuming annotation process makes it difficult to correlate the combinations of AUs to specific emotion classes. Inspired by the nodes relationship building Graph Convolutional Networks (GCN), we propose an end-to-end AU-oriented graph classification network, namely MER-GCN, which uses 3D ConvNets to extract AU features and applies GCN layers to discover the dependency laying between AU nodes for ME categorization. To our best knowledge, this work is the first end-to-end architecture for Micro-Expression Recognition (MER) using AUs based GCN. The experimental results show that our approach outperforms CNN-based MER networks. △ Less

Submitted 19 April, 2020; originally announced April 2020.

Comments: Accepted by IEEE MIPR 2020

arXiv:2003.07417 [pdf, other]

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

Authors: Sina Ghiassian, Banafsheh Rafiee, Yat Long Lo, Adam White

Abstract: Reinforcement learning systems require good representations to work well. For decades practical success in reinforcement learning was limited to small domains. Deep reinforcement learning systems, on the other hand, are scalable, not dependent on domain specific prior knowledge and have been successfully used to play Atari, in 3D navigation from pixels, and to control high degree of freedom robots… ▽ More Reinforcement learning systems require good representations to work well. For decades practical success in reinforcement learning was limited to small domains. Deep reinforcement learning systems, on the other hand, are scalable, not dependent on domain specific prior knowledge and have been successfully used to play Atari, in 3D navigation from pixels, and to control high degree of freedom robots. Unfortunately, the performance of deep reinforcement learning systems is sensitive to hyper-parameter settings and architecture choices. Even well tuned systems exhibit significant instability both within a trial and across experiment replications. In practice, significant expertise and trial and error are usually required to achieve good performance. One potential source of the problem is known as catastrophic interference: when later training decreases performance by overriding previous learning. Interestingly, the powerful generalization that makes Neural Networks (NN) so effective in batch supervised learning might explain the challenges when applying them in reinforcement learning tasks. In this paper, we explore how online NN training and interference interact in reinforcement learning. We find that simply re-mapping the input observations to a high-dimensional space improves learning speed and parameter sensitivity. We also show this preprocessing reduces interference in prediction tasks. More practically, we provide a simple approach to NN training that is easy to implement, and requires little additional computation. We demonstrate that our approach improves performance in both prediction and control with an extensive batch of experiments in classic control domains. △ Less

Submitted 16 March, 2020; originally announced March 2020.

Comments: 10 pages; Accepted to AAMAS 2020

arXiv:2002.09195 [pdf, other]

Nonlinearity Compensation in a Multi-DoF Shoulder Sensing Exosuit for Real-Time Teleoperation

Authors: Rejin John Varghese, Anh Nguyen, Etienne Burdet, Guang-Zhong Yang, Benny P L Lo

Abstract: The compliant nature of soft wearable robots makes them ideal for complex multiple degrees of freedom (DoF) joints, but also introduce additional structural nonlinearities. Intuitive control of these wearable robots requires robust sensing to overcome the inherent nonlinearities. This paper presents a joint kinematics estimator for a bio-inspired multi-DoF shoulder exosuit capable of compensating… ▽ More The compliant nature of soft wearable robots makes them ideal for complex multiple degrees of freedom (DoF) joints, but also introduce additional structural nonlinearities. Intuitive control of these wearable robots requires robust sensing to overcome the inherent nonlinearities. This paper presents a joint kinematics estimator for a bio-inspired multi-DoF shoulder exosuit capable of compensating the encountered nonlinearities. To overcome the nonlinearities and hysteresis inherent to the soft and compliant nature of the suit, we developed a deep learning-based method to map the sensor data to the joint space. The experimental results show that the new learning-based framework outperforms recent state-of-the-art methods by a large margin while achieving 12ms inference time using only a GPU-based edge-computing device. The effectiveness of our combined exosuit and learning framework is demonstrated through real-time teleoperation with a simulated NAO humanoid robot. △ Less

Submitted 21 February, 2020; originally announced February 2020.

Comments: 8 pages, 7 figures, 3 tables. Accepted to be published in IEEE RoboSoft 2020

arXiv:2001.02731 [pdf, other]

SirenLess: reveal the intention behind news

Authors: Xumeng Chen, Leo Yu-Ho Lo, Huamin Qu

Abstract: News articles tend to be increasingly misleading nowadays, preventing readers from making subjective judgments towards certain events. While some machine learning approaches have been proposed to detect misleading news, most of them are black boxes that provide limited help for humans in decision making. In this paper, we present SirenLess, a visual analytical system for misleading news detection… ▽ More News articles tend to be increasingly misleading nowadays, preventing readers from making subjective judgments towards certain events. While some machine learning approaches have been proposed to detect misleading news, most of them are black boxes that provide limited help for humans in decision making. In this paper, we present SirenLess, a visual analytical system for misleading news detection by linguistic features. The system features article explorer, a novel interactive tool that integrates news metadata and linguistic features to reveal semantic structures of news articles and facilitate textual analysis. We use SirenLess to analyze 18 news articles from different sources and summarize some helpful patterns for misleading news detection. A user study with journalism professionals and university students is conducted to confirm the usefulness and effectiveness of our system. △ Less

Submitted 8 January, 2020; originally announced January 2020.

Comments: 9 pages, 8 figures

arXiv:1910.13213 [pdf, other]

Overcoming Catastrophic Interference in Online Reinforcement Learning with Dynamic Self-Organizing Maps

Authors: Yat Long Lo, Sina Ghiassian

Abstract: Using neural networks in the reinforcement learning (RL) framework has achieved notable successes. Yet, neural networks tend to forget what they learned in the past, especially when they learn online and fully incrementally, a setting in which the weights are updated after each sample is received and the sample is then discarded. Under this setting, an update can lead to overly global generalizati… ▽ More Using neural networks in the reinforcement learning (RL) framework has achieved notable successes. Yet, neural networks tend to forget what they learned in the past, especially when they learn online and fully incrementally, a setting in which the weights are updated after each sample is received and the sample is then discarded. Under this setting, an update can lead to overly global generalization by changing too many weights. The global generalization interferes with what was previously learned and deteriorates performance, a phenomenon known as catastrophic interference. Many previous works use mechanisms such as experience replay (ER) buffers to mitigate interference by performing minibatch updates, ensuring the data distribution is approximately independent-and-identically-distributed (i.i.d.). But using ER would become infeasible in terms of memory as problem complexity increases. Thus, it is crucial to look for more memory-efficient alternatives. Interference can be averted if we replace global updates with more local ones, so only weights responsible for the observed data sample are updated. In this work, we propose the use of dynamic self-organizing map (DSOM) with neural networks to induce such locality in the updates without ER buffers. Our method learns a DSOM to produce a mask to reweigh each hidden unit's output, modulating its degree of use. It prevents interference by replacing global updates with local ones, conditioned on the agent's state. We validate our method on standard RL benchmarks including Mountain Car and Lunar Lander, where existing methods often fail to learn without ER. Empirically, we show that our online and fully incremental method is on par with and in some cases, better than state-of-the-art in terms of final performance and learning speed. We provide visualizations and quantitative measures to show that our method indeed mitigates interference. △ Less

Submitted 29 October, 2019; originally announced October 2019.

Comments: 9 Pages, 7 Figures, NeurIPS Workshop on Biological and Artificial Reinforcement Learning, 2019

Journal ref: Biological and Artificial RL Workshop at NeurIPS 2019

arXiv:1910.04787 [pdf, other]

Design and Prototyping of a Bio-inspired Kinematic Sensing Suit for the Shoulder Joint: Precursor to a Multi-DoF Shoulder Exosuit

Authors: Rejin John Varghese, Benny P L Lo, Guang-Zhong Yang

Abstract: Soft wearable robots are a promising new design paradigm for rehabilitation and active assistance applications. Their compliant nature makes them ideal for complex joints like the shoulder, but intuitive control of these robots require robust and compliant sensing mechanisms. In this work, we introduce the sensing framework for a multi-DoF shoulder exosuit capable of sensing the kinematics of the… ▽ More Soft wearable robots are a promising new design paradigm for rehabilitation and active assistance applications. Their compliant nature makes them ideal for complex joints like the shoulder, but intuitive control of these robots require robust and compliant sensing mechanisms. In this work, we introduce the sensing framework for a multi-DoF shoulder exosuit capable of sensing the kinematics of the shoulder joint. The proposed tendon-based sensing system is inspired by the concept of muscle synergies, the body's sense of proprioception, and finds its basis in the organization of the muscles responsible for shoulder movements. A motion-capture-based evaluation of the developed sensing system showed conformance to the behaviour exhibited by the muscles that inspired its routing and validates the hypothesis of the tendon-routing to be extended to the actuation framework of the exosuit in the future. The mapping from multi-sensor space to joint space is a multivariate multiple regression problem and was derived using an Artificial Neural Network (ANN). The sensing framework was tested with a motion-tracking system and achieved performance with root mean square error (RMSE) of approximately 5.43 degrees and 3.65 degrees for the azimuth and elevation joint angles, respectively, measured over 29000 frames (4+ minutes) of motion-capture data. △ Less

Submitted 10 October, 2019; originally announced October 2019.

Comments: 8 pages, 7 figures, 1 table

arXiv:1909.04749 [pdf, other]

Visual Analytics of Student Learning Behaviors on K-12 Mathematics E-learning Platforms

Authors: Meng Xia, Huan Wei, Min Xu, Leo Yu Ho Lo, Yong Wang, Rong Zhang, Huamin Qu

Abstract: With increasing popularity in online learning, a surge of E-learning platforms have emerged to facilitate education opportunities for k-12 (from kindergarten to 12th grade) students and with this, a wealth of information on their learning logs are getting recorded. However, it remains unclear how to make use of these detailed learning behavior data to improve the design of learning materials and g… ▽ More With increasing popularity in online learning, a surge of E-learning platforms have emerged to facilitate education opportunities for k-12 (from kindergarten to 12th grade) students and with this, a wealth of information on their learning logs are getting recorded. However, it remains unclear how to make use of these detailed learning behavior data to improve the design of learning materials and gain deeper insight into students' thinking and learning styles. In this work, we propose a visual analytics system to analyze student learning behaviors on a K-12 mathematics E-learning platform. It supports both correlation analysis between different attributes and a detailed visualization of user mouse-movement logs. Our case studies on a real dataset show that our system can better guide the design of learning resources (e.g., math questions) and facilitate quick interpretation of students' problem-solving and learning styles. △ Less

Submitted 21 September, 2019; v1 submitted 7 September, 2019; originally announced September 2019.

Comments: 2 pages, 6 figures, 2019 VAST conference, Best Poster, Learning Analytics, Visual Analytic System for Education, Online Learning, Learning Data Analysis, Learning Trajectories Analysis, Mouse Movement

arXiv:1907.08796 [pdf, other]

Learning Vis Tools: Teaching Data Visualization Tutorials

Authors: Leo Yu-Ho Lo, Yao Ming, Huamin Qu

Abstract: Teaching and advocating data visualization are among the most important activities in the visualization community. With growing interest in data analysis from business and science professionals, data visualization courses attract students across different disciplines. However, comprehensive visualization training requires students to have a certain level of proficiency in programming, a requiremen… ▽ More Teaching and advocating data visualization are among the most important activities in the visualization community. With growing interest in data analysis from business and science professionals, data visualization courses attract students across different disciplines. However, comprehensive visualization training requires students to have a certain level of proficiency in programming, a requirement that imposes challenges on both teachers and students. With recent developments in visualization tools, we have managed to overcome these obstacles by teaching a wide range of visualization and supporting tools. Starting with GUI-based visualization tools and data analysis with Python, students put visualization knowledge into practice with increasing amounts of programming. At the end of the course, students can design and implement visualizations with D3 and other programming-based visualization tools. Throughout the course, we continuously collect student feedback and refine the teaching materials. This paper documents our teaching methods and considerations when designing the teaching materials. △ Less

Submitted 7 October, 2019; v1 submitted 20 July, 2019; originally announced July 2019.

Comments: 5 pages, 1 figure, IEEE VIS 2019

arXiv:1901.01331 [pdf, other]

The ISTI Rapid Response on Exploring Cloud Computing 2018

Authors: Carleton Coffrin, James Arnold, Stephan Eidenbenz, Derek Aberle, John Ambrosiano, Zachary Baker, Sara Brambilla, Michael Brown, K. Nolan Carter, Pinghan Chu, Patrick Conry, Keeley Costigan, Ariane Eberhardt, David M. Fobes, Adam Gausmann, Sean Harris, Donovan Heimer, Marlin Holmes, Bill Junor, Csaba Kiss, Steve Linger, Rodman Linn, Li-Ta Lo, Jonathan MacCarthy, Omar Marcillo , et al. (23 additional authors not shown)

Abstract: This report describes eighteen projects that explored how commercial cloud computing services can be utilized for scientific computation at national laboratories. These demonstrations ranged from deploying proprietary software in a cloud environment to leveraging established cloud-based analytics workflows for processing scientific datasets. By and large, the projects were successful and collectiv… ▽ More This report describes eighteen projects that explored how commercial cloud computing services can be utilized for scientific computation at national laboratories. These demonstrations ranged from deploying proprietary software in a cloud environment to leveraging established cloud-based analytics workflows for processing scientific datasets. By and large, the projects were successful and collectively they suggest that cloud computing can be a valuable computational resource for scientific computation at national laboratories. △ Less

Submitted 4 January, 2019; originally announced January 2019.

Report number: LA-UR-18-31581

arXiv:1806.07382 [pdf, other]

In situ TensorView: In situ Visualization of Convolutional Neural Networks

Authors: Xinyu Chen, Qiang Guan, Li-Ta Lo, Simon Su, James Ahrens, Trilce Estrada

Abstract: Convolutional Neural Networks(CNNs) are complex systems. They are trained so they can adapt their internal connections to recognize images, texts and more. It is both interesting and helpful to visualize the dynamics within such deep artificial neural networks so that people can understand how these artificial networks are learning and making predictions. In the field of scientific simulations, vi… ▽ More Convolutional Neural Networks(CNNs) are complex systems. They are trained so they can adapt their internal connections to recognize images, texts and more. It is both interesting and helpful to visualize the dynamics within such deep artificial neural networks so that people can understand how these artificial networks are learning and making predictions. In the field of scientific simulations, visualization tools like Paraview have long been utilized to provide insights and understandings. We present in situ TensorView to visualize the training and functioning of CNNs as if they are systems of scientific simulations. In situ TensorView is a loosely coupled in situ visualization open framework that provides multiple viewers to help users to visualize and understand their networks. It leverages the capability of co-processing from Paraview to provide real-time visualization during training and predicting phases. This avoid heavy I/O overhead for visualizing large dynamic systems. Only a small number of lines of codes are injected in TensorFlow framework. The visualization can provide guidance to adjust the architecture of networks, or compress the pre-trained networks. We showcase visualizing the training of LeNet-5 and VGG16 using in situ TensorView. △ Less

Submitted 16 June, 2018; originally announced June 2018.

arXiv:1712.06790 [pdf, other]

Build and Execution Environment (BEE): an Encapsulated Environment Enabling HPC Applications Running Everywhere

Authors: Jieyang Chen, Qiang Guan, Xin Liang, Louis James Vernon, Allen McPherson, Li-Ta Lo, Zizhong Chen, James Paul Ahrens

Abstract: Variations in High Performance Computing (HPC) system software configurations mean that applications are typically configured and built for specific HPC environments. Building applications can require a significant investment of time and effort for application users and requires application users to have additional technical knowledge. Linux container technologies such as Docker and Charliecloud b… ▽ More Variations in High Performance Computing (HPC) system software configurations mean that applications are typically configured and built for specific HPC environments. Building applications can require a significant investment of time and effort for application users and requires application users to have additional technical knowledge. Linux container technologies such as Docker and Charliecloud bring great benefits to the application development, build and deployment processes. While cloud platforms already widely support containers, HPC systems still have non-uniform support of container technologies. In this work, we propose a unified runtime framework -- Build and Execution Environment (BEE) across both HPC and cloud platforms that allows users to run their containerized HPC applications across all supported platforms without modification. We design four BEE backends for four different classes of HPC or cloud platform so that together they cover the majority of mainstream computing platforms for HPC users. Evaluations show that BEE provides an easy-to-use unified user interface, execution environment, and comparable performance. △ Less

Submitted 27 February, 2021; v1 submitted 19 December, 2017; originally announced December 2017.

arXiv:1708.08144 [pdf, other]

doi 10.1145/3131473.3131482

Finding by Counting: A Probabilistic Packet Count Model for Indoor Localization in BLE Environments

Authors: Subham De, Shreyans Chowdhary, Aniket Shirke, Yat Long Lo, Robin Kravets, Hari Sundaram

Abstract: We propose a probabilistic packet reception model for Bluetooth Low Energy (BLE) packets in indoor spaces and we validate the model by using it for indoor localization. We expect indoor localization to play an important role in indoor public spaces in the future. We model the probability of reception of a packet as a generalized quadratic function of distance, beacon power and advertising frequenc… ▽ More We propose a probabilistic packet reception model for Bluetooth Low Energy (BLE) packets in indoor spaces and we validate the model by using it for indoor localization. We expect indoor localization to play an important role in indoor public spaces in the future. We model the probability of reception of a packet as a generalized quadratic function of distance, beacon power and advertising frequency. Then, we use a Bayesian formulation to determine the coefficients of the packet loss model using empirical observations from our testbed. We develop a new sequential Monte-Carlo algorithm that uses our packet count model. The algorithm is general enough to accommodate different spatial configurations. We have good indoor localization experiments: our approach has an average error of ~1.2m, 53% lower than the baseline range-free Monte-Carlo localization algorithm. △ Less

Submitted 27 August, 2017; originally announced August 2017.

Comments: 8 pages, 6 figures, to be published in WiNTECH 2017

arXiv:1502.02772 [pdf]

A HMAX with LLC for visual recognition

Authors: Kean Hong Lau, Yong Haur Tay, Fook Loong Lo

Abstract: Today's high performance deep artificial neural networks (ANNs) rely heavily on parameter optimization, which is sequential in nature and even with a powerful GPU, would have taken weeks to train them up for solving challenging tasks [22]. HMAX [17] has demonstrated that a simple high performing network could be obtained without heavy optimization. In this paper, we had improved on the existing be… ▽ More Today's high performance deep artificial neural networks (ANNs) rely heavily on parameter optimization, which is sequential in nature and even with a powerful GPU, would have taken weeks to train them up for solving challenging tasks [22]. HMAX [17] has demonstrated that a simple high performing network could be obtained without heavy optimization. In this paper, we had improved on the existing best HMAX neural network [12] in terms of structural simplicity and performance. Our design replaces the L1 minimization sparse coding (SC) with a locality-constrained linear coding (LLC) [20] which has a lower computational demand. We also put the simple orientation filter bank back into the front layer of the network replacing PCA. Our system's performance has improved over the existing architecture and reached 79.0% on the challenging Caltech-101 [7] dataset, which is state-of-the-art for ANNs (without transfer learning). From our empirical data, the main contributors to our system's performance include an introduction of partial signal whitening, a spot detector, and a spatial pyramid matching (SPM) [14] layer. △ Less

Submitted 10 February, 2015; v1 submitted 9 February, 2015; originally announced February 2015.

Comments: 10 pages, 3 figures, 2 tables, 23 references

Showing 1–30 of 30 results for author: Lo, L