Showing 1–50 of 80 results for author: Ho, M

  1. arXiv:2407.10583  [pdf, other]

    cs.AI cs.LG

    Three Dogmas of Reinforcement Learning

    Authors: David Abel, Mark K. Ho, Anna Harutyunyan

    Abstract: Modern reinforcement learning has been conditioned by at least three dogmas. The first is the environment spotlight, which refers to our tendency to focus on modeling environments rather than agents. The second is our treatment of learning as finding the solution to a task, rather than adaptation. The third is the reward hypothesis, which states that all goals and purposes can be well thought of a…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: RLC 2024

  2. arXiv:2407.04245  [pdf, other]

    cs.CV

    Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization

    Authors: Ming-Yang Ho, Che-Ming Wu, Min-Sheng Wu, Yufeng Jane Tseng

    Abstract: Recent advancements in ultra-high-resolution unpaired image-to-image translation have aimed to mitigate the constraints imposed by limited GPU memory through patch-wise inference. Nonetheless, existing methods often compromise between the reduction of noticeable tiling artifacts and the preservation of color and hue contrast, attributed to the reliance on global image- or patch-level statistics in…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  3. arXiv:2406.04302  [pdf, other]

    cs.LG

    Representational Alignment Supports Effective Machine Teaching

    Authors: Ilia Sucholutsky, Katherine M. Collins, Maya Malaviya, Nori Jacoby, Weiyang Liu, Theodore R. Sumers, Michalis Korakakis, Umang Bhatt, Mark Ho, Joshua B. Tenenbaum, Brad Love, Zachary A. Pardos, Adrian Weller, Thomas L. Griffiths

    Abstract: A good teacher should not only be knowledgeable; but should be able to communicate in a way that the student understands -- to share the student's representation of the world. In this work, we integrate insights from machine teaching and pragmatic communication with the burgeoning literature on representational alignment to characterize a utility curve defining a relationship between representatio…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Preprint

  4. arXiv:2404.13097  [pdf, other]

    eess.IV cs.CV cs.LG q-bio.QM

    DISC: Latent Diffusion Models with Self-Distillation from Separated Conditions for Prostate Cancer Grading

    Authors: Man M. Ho, Elham Ghelichkhan, Yosep Chong, Yufei Zhou, Beatrice Knudsen, Tolga Tasdizen

    Abstract: Latent Diffusion Models (LDMs) can generate high-fidelity images from noise, offering a promising approach for augmenting histopathology images for training cancer grading models. While previous works successfully generated high-fidelity histopathology images using LDMs, the generation of image tiles to improve prostate cancer grading has not yet been explored. Additionally, LDMs face challenges i…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Abstract accepted for ISBI 2024. Extended version to be presented at SynData4CV @ CVPR 2024. See more at https://minhmanho.github.io/disc/

  5. arXiv:2404.12650  [pdf, other]

    eess.IV cs.CV cs.LG

    F2FLDM: Latent Diffusion Models with Histopathology Pre-Trained Embeddings for Unpaired Frozen Section to FFPE Translation

    Authors: Man M. Ho, Shikha Dubey, Yosep Chong, Beatrice Knudsen, Tolga Tasdizen

    Abstract: The Frozen Section (FS) technique is a rapid and efficient method, taking only 15-30 minutes to prepare slides for pathologists' evaluation during surgery, enabling immediate decisions on further surgical interventions. However, FS process often introduces artifacts and distortions like folds and ice-crystal effects. In contrast, these artifacts and distortions are absent in the higher-quality for…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Preprint. Our work is available at https://minhmanho.github.io/f2f_ldm/

  6. arXiv:2404.09275  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning

    Authors: Quang Minh Dinh, Minh Khoi Ho, Anh Quan Dang, Hung Phong Tran

    Abstract: Traffic video description and analysis have received much attention recently due to the growing demand for efficient and reliable urban surveillance systems. Most existing methods only focus on locating traffic event segments, which severely lack descriptive details related to the behaviour and context of all the subjects of interest in the events. In this paper, we present TrafficVLM, a novel mul…

    Submitted 14 April, 2024; originally announced April 2024.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024, pp. 7134-7143

  7. arXiv:2402.05137  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA cs.LG

    LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology

    Authors: Matthew Ho, Deaglan J. Bartlett, Nicolas Chartier, Carolina Cuesta-Lazaro, Simon Ding, Axel Lapel, Pablo Lemos, Christopher C. Lovell, T. Lucas Makinen, Chirag Modi, Viraj Pandya, Shivam Pandey, Lucia A. Perez, Benjamin Wandelt, Greg L. Bryan

    Abstract: This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It i…

    Submitted 2 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: 22 pages, 10 figures, accepted in the Open Journal of Astrophysics. Code available at https://github.com/maho3/ltu-ili

    Journal ref: 2024 OJA, Vol. 7

  8. arXiv:2312.13615  [pdf, other]

    eess.AS cs.SD eess.SP

    Self-supervised Complex Network for Machine Sound Anomaly Detection

    Authors: Miseul Kim, Minh Tri Ho, Hong-Goo Kang

    Abstract: In this paper, we propose an anomaly detection algorithm for machine sounds with a deep complex network trained by self-supervision. Using the fact that phase continuity information is crucial for detecting abnormalities in time-series signals, our proposed algorithm utilizes the complex spectrum as an input and performs complex number arithmetic throughout the entire process. Since the usefulness…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: Published in EUSIPCO 2021

  9. arXiv:2312.06978  [pdf, other]

    cs.CV

    CLASS-M: Adaptive stain separation-based contrastive learning with pseudo-labeling for histopathological image classification

    Authors: Bodong Zhang, Hamid Manoochehri, Man Minh Ho, Fahimeh Fooladgar, Yosep Chong, Beatrice S. Knudsen, Deepika Sirohi, Tolga Tasdizen

    Abstract: Histopathological image classification is an important task in medical image analysis. Recent approaches generally rely on weakly supervised learning due to the ease of acquiring case-level labels from pathology reports. However, patch-level classification is preferable in applications where only a limited number of cases are available or when local prediction accuracy is critical. On the other ha…

    Submitted 4 January, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

  10. arXiv:2311.18644  [pdf, other]

    cs.AI

    Exploring the hierarchical structure of human plans via program generation

    Authors: Carlos G. Correa, Sophia Sanborn, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

    Abstract: Human behavior is inherently hierarchical, resulting from the decomposition of a task into subtasks or an abstract action into concrete actions. However, behavior is typically measured as a sequence of actions, which makes it difficult to infer its hierarchical structure. In this paper, we explore how people form hierarchically-structured plans, using an experimental paradigm that makes hierarchic…

    Submitted 30 November, 2023; originally announced November 2023.

  11. arXiv:2311.07298  [pdf, other]

    cs.SE

    Energy and Time Complexity for Sorting Algorithms in Java

    Authors: Kristina Carter, Su Mei Gwen Ho, Mathias Marquar Arhipenko Larsen, Martin Sundman, Maja H. Kirkeby

    Abstract: The article investigates the relationship between time complexity and energy consumption in sorting algorithms, focusing on commonly-used algorithms implemented in Java: Bubble Sort, Counting Sort, Merge Sort, and Quick Sort. The significance of understanding this relationship is driven by the increasing energy demands of Information and Communication Technology systems and the potential for softw…

    Submitted 8 May, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

  12. arXiv:2310.20059  [pdf, other]

    cs.AI

    Concept Alignment as a Prerequisite for Value Alignment

    Authors: Sunayana Rane, Mark Ho, Ilia Sucholutsky, Thomas L. Griffiths

    Abstract: Value alignment is essential for building AI systems that can safely and reliably interact with people. However, what a person values -- and is even capable of valuing -- depends on the concepts that they are currently using to understand and evaluate what happens in the world. The dependence of values on concepts means that concept alignment is a prerequisite for value alignment -- agents need to…

    Submitted 30 October, 2023; originally announced October 2023.

  13. arXiv:2310.12528  [pdf, other]

    astro-ph.IM cs.LG

    Constructing Impactful Machine Learning Research for Astronomy: Best Practices for Researchers and Reviewers

    Authors: D. Huppenkothen, M. Ntampaka, M. Ho, M. Fouesneau, B. Nord, J. E. G. Peek, M. Walmsley, J. F. Wu, C. Avestruz, T. Buck, M. Brescia, D. P. Finkbeiner, A. D. Goulding, T. Kacprzak, P. Melchior, M. Pasquato, N. Ramachandra, Y. -S. Ting, G. van de Ven, S. Villar, V. A. Villar, E. Zinger

    Abstract: Machine learning has rapidly become a tool of choice for the astronomical community. It is being applied across a wide range of wavelengths and problems, from the classification of transients to neural network emulators of cosmological simulations, and is shifting paradigms about how we generate and report scientific results. At the same time, this class of method comes with its own set of best pr…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 14 pages, 3 figures; submitted to the Bulletin of the American Astronomical Society

  14. arXiv:2310.02221  [pdf, other]

    cs.LG

    Structurally guided task decomposition in spatial navigation tasks

    Authors: Ruiqi He, Carlos G. Correa, Thomas L. Griffiths, Mark K. Ho

    Abstract: How are people able to plan so efficiently despite limited cognitive resources? We aimed to answer this question by extending an existing model of human task decomposition that can explain a wide range of simple planning problems by adding structure information to the task to facilitate planning in more complex tasks. The extended model was then applied to a more complex planning domain of spatial…

    Submitted 3 October, 2023; originally announced October 2023.

  15. arXiv:2309.14617  [pdf]

    cs.CY cs.AI

    Towards A Unified Utilitarian Ethics Framework for Healthcare Artificial Intelligence

    Authors: Forhan Bin Emdad, Shuyuan Mary Ho, Benhur Ravuri, Shezin Hussain

    Abstract: Artificial Intelligence (AI) aims to elevate healthcare to a pinnacle by aiding clinical decision support. Overcoming the challenges related to the design of ethical AI will enable clinicians, physicians, healthcare professionals, and other stakeholders to use and trust AI in healthcare settings. This study attempts to identify the major ethical principles influencing the utility performance of AI…

    Submitted 25 September, 2023; originally announced September 2023.

  16. arXiv:2307.06333  [pdf, other]

    cs.LG cs.AI cs.HC cs.RO

    Diagnosis, Feedback, Adaptation: A Human-in-the-Loop Framework for Test-Time Policy Adaptation

    Authors: Andi Peng, Aviv Netanyahu, Mark Ho, Tianmin Shu, Andreea Bobu, Julie Shah, Pulkit Agrawal

    Abstract: Policies often fail due to distribution shift -- changes in the state and reward that occur when a policy is deployed in new environments. Data augmentation can increase robustness by making the model invariant to task-irrelevant changes in the agent's observation. However, designers don't know which concepts are irrelevant a priori, especially when different end users have different preferences a…

    Submitted 13 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: International Conference on Machine Learning (ICML) 2023

  17. arXiv:2306.13262  [pdf, other]

    cs.IT

    Reliable computation by large-alphabet formulas in the presence of noise

    Authors: Andrew K. Tan, Matthew Ho, Isaac L. Chuang

    Abstract: We present two new positive results for reliable computation using formulas over physical alphabets of size $q > 2$. First, we show that for logical alphabets of size $\ell = q$ the threshold for denoising using gates subject to $q$-ary symmetric noise with error probability $\varepsilon$ is strictly larger than that for Boolean computation, and is possible as long as signals remain distinguishabl…

    Submitted 25 June, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

    Comments: 20 pages, 4 figures

  18. arXiv:2305.11213  [pdf, other]

    cs.LG

    Information-Ordered Bottlenecks for Adaptive Semantic Compression

    Authors: Matthew Ho, Xiaosheng Zhao, Benjamin Wandelt

    Abstract: We present the information-ordered bottleneck (IOB), a neural layer designed to adaptively compress data into latent variables ordered by likelihood maximization. Without retraining, IOB nodes can be truncated at any bottleneck width, capturing the most crucial information in the first latent variables. Unifying several previous approaches, we show that IOBs achieve near-optimal compression for a…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 14 pages, 6 figures, 1 table, Submitted to NeurIPS 2023

  19. arXiv:2305.03263  [pdf, other]

    cs.LG cs.AI

    Bayesian Reinforcement Learning with Limited Cognitive Load

    Authors: Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

    Abstract: All biological and artificial agents must learn and make decisions given limits on their ability to process information. As such, a general theory of adaptive behavior should be able to account for the complex interactions between an agent's learning history, decisions, and capacity constraints. Recent work in computer science has begun to clarify the principles that shape these dynamics by bridgi…

    Submitted 4 May, 2023; originally announced May 2023.

  20. arXiv:2211.15352  [pdf, other]

    cs.CV cs.AI cs.LG

    Interactive Image Manipulation with Complex Text Instructions

    Authors: Ryugo Morita, Zhiqiang Zhang, Man M. Ho, Jinjia Zhou

    Abstract: Recently, text-guided image manipulation has received increasing attention in the research field of multimedia processing and computer vision due to its high flexibility and controllability. Its goal is to semantically manipulate parts of an input reference image according to the text descriptions. However, most of the existing works have the following problems: (1) text-irrelevant content cannot…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: Accepted to WACV2023

  21. Humans decompose tasks by trading off utility and computational cost

    Authors: Carlos G. Correa, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

    Abstract: Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions. How are these decompositions created and used? Here, we propose and evaluate a normative framework for task decomposition based on the simple idea that people decompose tasks to reduce the overall cost of planning while maintaining task performance. Analyzing 11,117 distinct gra…

    Submitted 7 November, 2022; originally announced November 2022.

  22. arXiv:2210.16877  [pdf, ps, other]

    cs.LG cs.AI

    On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning

    Authors: Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

    Abstract: Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources. Prior work has drawn inspiration from this fact and leveraged an information-theoretic model of such behaviors or policies as communication cha…

    Submitted 30 October, 2022; originally announced October 2022.

    Comments: Accepted to the NeurIPS Workshop on Information-Theoretic Principles in Cognitive Systems (InfoCog) 2022. arXiv admin note: text overlap with arXiv:2206.02072

  23. arXiv:2210.12152  [pdf, other]

    cs.CL cs.AI

    WikiWhy: Answering and Explaining Cause-and-Effect Questions

    Authors: Matthew Ho, Aditya Sharma, Justin Chang, Michael Saxon, Sharon Levy, Yujie Lu, William Yang Wang

    Abstract: As large language models (LLMs) grow larger and more sophisticated, assessing their "reasoning" capabilities in natural language grows more challenging. Recent question answering (QA) benchmarks that attempt to assess reasoning are often limited by a narrow scope of covered situations and subject matters. We introduce WikiWhy, a QA dataset built around a novel auxiliary task: explaining why an ans…

    Submitted 30 November, 2022; v1 submitted 21 October, 2022; originally announced October 2022.

  24. arXiv:2208.10730  [pdf, other]

    cs.CV eess.IV

    Ultra-high-resolution unpaired stain transformation via Kernelized Instance Normalization

    Authors: Ming-Yang Ho, Min-Sheng Wu, Che-Ming Wu

    Abstract: While hematoxylin and eosin (H&E) is a standard staining procedure, immunohistochemistry (IHC) staining further serves as a diagnostic and prognostic method. However, acquiring special staining results requires substantial costs. Hence, we proposed a strategy for ultra-high-resolution unpaired image-to-image translation: Kernelized Instance Normalization (KIN), which preserves local information…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: Accepted to ECCV 2022

  25. Deep Learning for Classification of Thyroid Nodules on Ultrasound: Validation on an Independent Dataset

    Authors: Jingxi Weng, Benjamin Wildman-Tobriner, Mateusz Buda, Jichen Yang, Lisa M. Ho, Brian C. Allen, Wendy L. Ehieli, Chad M. Miller, Jikai Zhang, Maciej A. Mazurowski

    Abstract: Objectives: The purpose is to apply a previously validated deep learning algorithm to a new thyroid nodule ultrasound image dataset and compare its performances with radiologists. Methods: Prior study presented an algorithm which is able to detect thyroid nodules and then make malignancy classifications with two ultrasound images. A multi-task deep convolutional neural network was trained from 127…

    Submitted 4 May, 2023; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: Clinical Imaging (2023)

  26. arXiv:2206.07870  [pdf, other]

    cs.AI

    How to talk so AI will learn: Instructions, descriptions, and autonomy

    Authors: Theodore R Sumers, Robert D Hawkins, Mark K Ho, Thomas L Griffiths, Dylan Hadfield-Menell

    Abstract: From the earliest years of our lives, humans use language to express our beliefs and desires. Being able to talk to artificial agents about our preferences would thus fulfill a central goal of value alignment. Yet today, we lack computational models explaining such language use. To address this challenge, we formalize learning from language in a contextual bandit setting and ask how a human might…

    Submitted 10 October, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

    Comments: 10 pages, 5 figures. Published as a conference paper at NeurIPS 2022

  27. arXiv:2206.01777  [pdf, other]

    cs.CV eess.IV

    Real-Time Super-Resolution for Real-World Images on Mobile Devices

    Authors: Jie Cai, Zibo Meng, Jiaming Ding, Chiu Man Ho

    Abstract: Image Super-Resolution (ISR), which aims at recovering High-Resolution (HR) images from the corresponding Low-Resolution (LR) counterparts. Although recent progress in ISR has been remarkable. However, they are way too computationally intensive to be deployed on edge devices, since most of the recent approaches are deep learning-based. Besides, these methods always fail in real-world scenes, since…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: arXiv admin note: text overlap with arXiv:2004.13674

  28. arXiv:2204.05091  [pdf, other]

    cs.AI cs.CL

    Linguistic communication as (inverse) reward design

    Authors: Theodore R. Sumers, Robert D. Hawkins, Mark K. Ho, Thomas L. Griffiths, Dylan Hadfield-Menell

    Abstract: Natural language is an intuitive and expressive way to communicate reward information to autonomous agents. It encompasses everything from concrete instructions to abstract descriptions of the world. Despite this, natural language is often challenging to learn from: it is difficult for machine learning methods to make appropriate inferences from such a wide range of input. This paper proposes a ge…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: 6 pages, 3 figures. Accepted at Learning from Natural Language Supervision workshop (ACL 2022)

  29. arXiv:2202.09892  [pdf, other]

    cs.RO cs.CC

    Towards a Framework for Comparing the Complexity of Robotic Tasks

    Authors: Michelle Ho, Alec Farid, Anirudha Majumdar

    Abstract: We are motivated by the problem of comparing the complexity of one robotic task relative to another. To this end, we define a notion of reduction that formalizes the following intuition: Task 1 reduces to Task 2 if we can efficiently transform any policy that solves Task 2 into a policy that solves Task 1. We further define a quantitative measure of the relative complexity between any two tasks fo…

    Submitted 24 June, 2022; v1 submitted 20 February, 2022; originally announced February 2022.

  30. arXiv:2111.12317  [pdf, other]

    cs.CL

    Handling tree-structured text: parsing directory pages

    Authors: Sarang Shrivastava, Afreen Shaikh, Shivani Shrivastava, Chung Ming Ho, Pradeep Reddy, Vijay Saraswat

    Abstract: The determination of the reading sequence of text is fundamental to document understanding. This problem is easily solved in pages where the text is organized into a sequence of lines and vertical alignment runs the height of the page (producing multiple columns which can be read from left to right). We present a situation -- the directory page parsing problem -- where information is presented on…

    Submitted 24 November, 2021; originally announced November 2021.

  31. arXiv:2111.00876  [pdf, other]

    cs.LG cs.AI

    On the Expressivity of Markov Reward

    Authors: David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh

    Abstract: Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajector…

    Submitted 18 January, 2022; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: Accepted to NeurIPS 2021

  32. arXiv:2109.00127  [pdf, other]

    cs.AI cs.RO eess.SY

    Cognitive science as a source of forward and inverse models of human decisions for robotics and control

    Authors: Mark K. Ho, Thomas L. Griffiths

    Abstract: Those designing autonomous systems that interact with humans will invariably face questions about how humans think and make decisions. Fortunately, computational cognitive science offers insight into human decision-making using tools that will be familiar to those with backgrounds in optimization and control (e.g., probability theory, statistical machine learning, and reinforcement learning). Here…

    Submitted 31 August, 2021; originally announced September 2021.

    Comments: Invited submission for Annual Review of Control, Robotics, and Autonomous Systems

  33. arXiv:2108.09322  [pdf, other]

    cs.CV

    MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition

    Authors: Jiawei Chen, Chiu Man Ho

    Abstract: This paper presents a pure transformer-based approach, dubbed the Multi-Modal Video Transformer (MM-ViT), for video action recognition. Different from other schemes which solely utilize the decoded RGB frames, MM-ViT operates exclusively in the compressed video domain and exploits all readily available modalities, i.e., I-frames, motion vectors, residuals and audio waveform. In order to handle the…

    Submitted 12 November, 2021; v1 submitted 20 August, 2021; originally announced August 2021.

    Comments: Winter Conference on Applications of Computer Vision (WACV) 2022

  34. arXiv:2105.12789  [pdf, other]

    cs.CV

    RSCA: Real-time Segmentation-based Context-Aware Scene Text Detection

    Authors: Jiachen Li, Yuan Lin, Rongrong Liu, Chiu Man Ho, Humphrey Shi

    Abstract: Segmentation-based scene text detection methods have been widely adopted for arbitrary-shaped text detection recently, since they make accurate pixel-level predictions on curved text instances and can facilitate real-time inference without time-consuming processing on anchors. However, current segmentation-based models are unable to learn the shapes of curved texts and often require complex label…

    Submitted 26 May, 2021; originally announced May 2021.

    Comments: CVPR 2021 Workshop

  35. arXiv:2105.11950  [pdf, other]

    cs.CL

    Extending rational models of communication from beliefs to actions

    Authors: Theodore R. Sumers, Robert D. Hawkins, Mark K. Ho, Thomas L. Griffiths

    Abstract: Speakers communicate to influence their partner's beliefs and shape their actions. Belief- and action-based objectives have been explored independently in recent computational models, but it has been challenging to explicitly compare or integrate them. Indeed, we find that they are conflated in standard referential communication tasks. To distinguish these accounts, we introduce a new paradigm cal…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: 7 pages, 4 figures. Proceedings for the 43rd Annual Meeting of the Cognitive Science Society

  36. arXiv:2105.08826  [pdf, other]

    eess.IV cs.CV cs.LG

    Real-Time Video Super-Resolution on Smartphones with Deep Learning, Mobile AI 2021 Challenge: Report

    Authors: Andrey Ignatov, Andres Romero, Heewon Kim, Radu Timofte, Chiu Man Ho, Zibo Meng, Kyoung Mu Lee, Yuxiang Chen, Yutong Wang, Zeyu Long, Chenhao Wang, Yifei Chen, Boshen Xu, Shuhang Gu, Lixin Duan, Wen Li, Wang Bofei, Zhang Diankai, Zheng Chengjian, Liu Shaoli, Gao Si, Zhang Xiaofeng, Lu Kaidi, Xu Tianyu, Zheng Hui, et al. (6 additional authors not shown)

    Abstract: Video super-resolution has recently become one of the most important mobile-related problems due to the rise of video communication and streaming services. While many solutions have been proposed for this task, the majority of them are too computationally expensive to run on portable devices with limited hardware resources. To address this problem, we introduce the first Mobile AI challenge, where…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Mobile AI 2021 Workshop and Challenges: https://ai-benchmark.com/workshops/mai/2021/. arXiv admin note: substantial text overlap with arXiv:2105.07825. substantial text overlap with arXiv:2105.08629, arXiv:2105.07809, arXiv:2105.08630

  37. People construct simplified mental representations to plan

    Authors: Mark K. Ho, David Abel, Carlos G. Correa, Michael L. Littman, Jonathan D. Cohen, Thomas L. Griffiths

    Abstract: One of the most striking features of human cognition is the capacity to plan. Two aspects of human planning stand out: its efficiency and flexibility. Efficiency is especially impressive because plans must often be made in complex environments, and yet people successfully plan solutions to myriad everyday problems despite having limited cognitive resources. Standard accounts in psychology, economi…

    Submitted 26 November, 2022; v1 submitted 14 May, 2021; originally announced May 2021.

    Comments: 56 pages, 5 main figures, 10 extended data figures, supplementary information is included in ancillary files

    Journal ref: Nature, 606(7912), 129-136 (2022)

  38. arXiv:2105.00328  [pdf, other]

    cs.CL

    When to Fold'em: How to answer Unanswerable questions

    Authors: Marshall Ho, Zhipeng Zhou, Judith He

    Abstract: We present 3 different question-answering models trained on the SQuAD2.0 dataset -- BIDAF, DocumentQA and ALBERT Retro-Reader -- demonstrating the improvement of language models in the past three years. Through our research in fine-tuning pre-trained models for question-answering, we developed a novel approach capable of achieving a 2% point improvement in SQuAD2.0 F1 in reduced training time. Our…

    Submitted 1 May, 2021; originally announced May 2021.

  39. arXiv:2103.14053  [pdf, other]

    quant-ph cond-mat.stat-mech cs.IT nlin.CG

    Quantum-inspired identification of complex cellular automata

    Authors: Matthew Ho, Andri Pradana, Thomas J. Elliott, Lock Yue Chew, Mile Gu

    Abstract: Elementary cellular automata (ECA) present iconic examples of complex systems. Though described only by one-dimensional strings of binary cells evolving according to nearest-neighbour update rules, certain ECA rules manifest complex dynamics capable of universal computation. Yet, the classification of precisely which rules exhibit complex behaviour remains a significant challenge. Here we approach…

    Submitted 20 March, 2024; v1 submitted 25 March, 2021; originally announced March 2021.

    Comments: 22 pages, 9 figures

    Journal ref: Eur. Phys. J. Plus 138 (6) 540 (2023)

  40. arXiv:2103.12357  [pdf]

    cs.PL cs.SE

    Unleashing the Hidden Power of Compiler Optimization on Binary Code Difference: An Empirical Study

    Authors: Xiaolei Ren, Michael Ho, Jiang Ming, Yu Lei, Li Li

    Abstract: Since compiler optimization is the most common source contributing to binary code differences in syntax, testing the resilience against the changes caused by different compiler optimization settings has become a standard evaluation step for most binary diffing approaches. For example, 47 top-venue papers in the last 12 years compared different program versions compiled by default optimization leve…

    Submitted 25 March, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

  41. arXiv:2102.06120  [pdf, other]

    cs.CV

    Deep Photo Scan: Semi-Supervised Learning for dealing with the real-world degradation in Smartphone Photo Scanning

    Authors: Man M. Ho, Jinjia Zhou

    Abstract: Physical photographs now can be conveniently scanned by smartphones and stored forever as a digital version, yet the scanned photos are not restored well. One solution is to train a supervised deep neural network on many digital photos and the corresponding scanned photos. However, it requires a high labor cost, leading to limited training data. Previous works create training pairs by simulating d…

    Submitted 18 August, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: Our work is available at https://minhmanho.github.io/dpscan

  42. arXiv:2102.04213  [pdf]

    cs.CY

    My Boss the Computer: A Bayesian analysis of socio-demographic and cross-cultural determinants of attitude toward the Non-Human Resource Management

    Authors: Mantello Peter, Manh-Tung Ho, Minh-Hoang Nguyen, Quan-Hoang Vuong

    Abstract: Human resource management technologies have moved from biometric surveillance to emotional artificial intelligence (AI) that monitor employees' engagement and productivity, analyze video interviews and CVs of job applicants. The rise of the US$20 billion emotional AI industry will transform the future workplace. Yet, besides no international consensus on the principles or standards for such techno…

    Submitted 24 January, 2021; originally announced February 2021.

    Comments: 58 pages, 9 tables, 10 figures

  43. arXiv:2102.03882  [pdf, other]

    cs.CL

    Spoiler Alert: Using Natural Language Processing to Detect Spoilers in Book Reviews

    Authors: Allen Bao, Marshall Ho, Saarthak Sangamnerkar

    Abstract: This paper presents an NLP (Natural Language Processing) approach to detecting spoilers in book reviews, using the University of California San Diego (UCSD) Goodreads Spoiler dataset. We explored the use of LSTM, BERT, and RoBERTa language models to perform spoiler detection at the sentence-level. This was contrasted with a UCSD paper which performed the same task, but using handcrafted features i…

    Submitted 7 February, 2021; originally announced February 2021.

  44. On the binary adder channel with complete feedback, with an application to quantitative group testing

    Authors: Samuel H. Florin, Matthew H. Ho, Zilin Jiang

    Abstract: We determine the exact value of the optimal symmetric rate point $(r, r)$ in the Dueck zero-error capacity region of the binary adder channel with complete feedback. We proved that the average zero-error capacity $r = h(1/2-\delta) \approx 0.78974$, where $h(\cdot)$ is the binary entropy function and $\delta = 1/(2\log_2(2+\sqrt{3}))$. Our motivation is a problem in quantitative group testing. Given a set of…

    Submitted 28 December, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

    Comments: 39 pages, 3 figures, accepted to IEEE Trans. Inf. Theory, corrections suggested by the referees have been incorporated

    MSC Class: 94A17; 94A24; 94A40

    Journal ref: IEEE Transactions on Information Theory, Volume 68, Issue 5, pp 2839-2856, May 2022

  45. arXiv:2012.09035  [pdf, other]

    cs.CL

    Show or Tell? Demonstration is More Robust to Changes in Shared Perception than Explanation

    Authors: Theodore R. Sumers, Mark K. Ho, Thomas L. Griffiths

    Abstract: Successful teaching entails a complex interaction between a teacher and a learner. The teacher must select and convey information based on what they think the learner perceives and believes. Teaching always involves misaligned beliefs, but studies of pedagogy often focus on situations where teachers and learners share perceptions. Nonetheless, a teacher and learner may not always experience or att…

    Submitted 16 December, 2020; originally announced December 2020.

    Comments: 7 pages, 4 figures. Proceedings for the 42nd Annual Meeting of the Cognitive Science Society

  46. arXiv:2011.13011  [pdf, other]

    cs.LG cs.CV eess.IV

    Advancing diagnostic performance and clinical usability of neural networks via adversarial training and dual batch normalization

    Authors: Tianyu Han, Sven Nebelung, Federico Pedersoli, Markus Zimmermann, Maximilian Schulze-Hagen, Michael Ho, Christoph Haarburger, Fabian Kiessling, Christiane Kuhl, Volkmar Schulz, Daniel Truhn

    Abstract: Unmasking the decision-making process of machine learning models is essential for implementing diagnostic support systems in clinical practice. Here, we demonstrate that adversarially trained models can significantly enhance the usability of pathology detection as compared to their standard counterparts. We let six experienced radiologists rate the interpretability of saliency maps in datasets of…

    Submitted 25 November, 2020; originally announced November 2020.

  47. arXiv:2009.14715  [pdf, other]

    cs.AI

    Learning Rewards from Linguistic Feedback

    Authors: Theodore R. Sumers, Mark K. Ho, Robert D. Hawkins, Karthik Narasimhan, Thomas L. Griffiths

    Abstract: We explore unconstrained natural language feedback as a learning signal for artificial agents. Humans use rich and varied language to teach, yet most prior work on interactive learning from language assumes a particular form of input (e.g., commands). We propose a general framework which does not make this assumption, using aspect-based sentiment analysis to decompose feedback into sentiment about…

    Submitted 3 July, 2021; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: 9 pages, 4 figures. AAAI '21

  48. arXiv:2009.06943  [pdf, other]

    eess.IV cs.CV

    AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results

    Authors: Kai Zhang, Martin Danelljan, Yawei Li, Radu Timofte, Jie Liu, Jie Tang, Gangshan Wu, Yu Zhu, Xiangyu He, Wenjie Xu, Chenghua Li, Cong Leng, Jian Cheng, Guangyang Wu, Wenyi Wang, Xiaohong Liu, Hengyuan Zhao, Xiangtao Kong, Jingwen He, Yu Qiao, Chao Dong, Xiaotong Luo, Liang Chen, Jiangtao Zhang, Maitreya Suin, et al. (60 additional authors not shown)

    Abstract: This paper reviews the AIM 2020 challenge on efficient single image super-resolution with focus on the proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor x4 based on a set of prior examples of low and corresponding high resolution images. The goal is to devise a network that reduces one or several aspects such as runtime, parameter co…

    Submitted 15 September, 2020; originally announced September 2020.

  49. GIA-Net: Global Information Aware Network for Low-light Imaging

    Authors: Zibo Meng, Runsheng Xu, Chiu Man Ho

    Abstract: It is extremely challenging to acquire perceptually plausible images under low-light conditions due to low SNR. Most recently, U-Nets have shown promising results for low-light imaging. However, vanilla U-Nets generate images with artifacts such as color inconsistency due to the lack of global color information. In this paper, we propose a global information aware (GIA) module, which is capable of…

    Submitted 14 September, 2020; originally announced September 2020.

    Comments: 16 pages 6 figures; accepted to AIM at ECCV 2020

    Journal ref: Computer Vision -- ECCV 2020 Workshops, 2020, 327--342

  50. arXiv:2009.02476  [pdf, other]

    cs.LG cs.AI cs.HC

    Using Machine Teaching to Investigate Human Assumptions when Teaching Reinforcement Learners

    Authors: Yun-Shiuan Chuang, Xuezhou Zhang, Yuzhe Ma, Mark K. Ho, Joseph L. Austerweil, Xiaojin Zhu

    Abstract: Successful teaching requires an assumption of how the learner learns - how the learner uses experiences from the world to update their internal states. We investigate what expectations people have about a learner when they teach them in an online manner using rewards and punishment. We focus on a common reinforcement learning method, Q-learning, and examine what assumptions people have using a beh…

    Submitted 29 June, 2023; v1 submitted 5 September, 2020; originally announced September 2020.

    Comments: 21 pages, 4 figures