Showing 1–49 of 49 results for author: Chan, D

  1. arXiv:2407.06576  [pdf, other]

    cs.CL cs.AI

    Virtual Personas for Language Models via an Anthology of Backstories

    Authors: Suhong Moon, Marwa Abdulhai, Minwoo Kang, Joseph Suh, Widyadewi Soedarmadji, Eran Kohen Behar, David M. Chan

    Abstract: Large language models (LLMs) are trained from vast repositories of text authored by millions of distinct authors, reflecting an enormous diversity of human traits. While these models bear the potential to be used as approximations of human subjects in behavioral studies, prior efforts have been limited in steering model responses to match individual human users. In this work, we introduce "Antholo…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2405.15113  [pdf, other]

    cs.RO

    A Wearable Resistance Devices Motor Learning Effects in Exercise

    Authors: Eugenio Frias-Miranda, Hong-Anh Nguyen, Jeremy Hampton, Trenner Jones, Benjamin Spotts, Matthew Cochran, Deva Chan, Laura H Blumenschein

    Abstract: The integration of technology into exercise regimens has emerged as a strategy to enhance normal human capabilities and return human motor function after injury or illness by enhancing motor learning and retention. Much research has focused on how active devices, whether confined to a lab or made into a wearable format, can apply forces at set times and conditions to optimize the process of learni…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 8 pages, 9 figures, To be published in IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob) 2024

  3. arXiv:2405.08272  [pdf, other]

    cs.CV

    VS-Assistant: Versatile Surgery Assistant on the Demand of Surgeons

    Authors: Zhen Chen, Xingjian Luo, Jinlin Wu, Danny T. M. Chan, Zhen Lei, Jinqiao Wang, Sebastien Ourselin, Hongbin Liu

    Abstract: The surgical intervention is crucial to patient healthcare, and many studies have developed advanced algorithms to provide understanding and decision-making assistance for surgeons. Despite great progress, these algorithms are developed for a single specific task and scenario, and in practice require the manual combination of different functions, thus limiting the applicability. Thus, an intellige…

    Submitted 13 May, 2024; originally announced May 2024.

  4. arXiv:2404.05696  [pdf]

    cs.DB q-bio.QM

    BOLD v4: A Centralized Bioinformatics Platform for DNA-based Biodiversity Data

    Authors: Sujeevan Ratnasingham, Catherine Wei, Dean Chan, Jireh Agda, Josh Agda, Liliana Ballesteros-Mejia, Hamza Ait Boutou, Zak Mohammad El Bastami, Eddie Ma, Ramya Manjunath, Dana Rea, Chris Ho, Angela Telfer, Jaclyn McKeowan, Miduna Rahulan, Claudia Steinke, Justin Dorsheimer, Megan Milton, Paul D. N. Hebert

    Abstract: BOLD, the Barcode of Life Data System, supports the acquisition, storage, validation, analysis, and publication of DNA barcodes, activities requiring the integration of molecular, morphological, and distributional data. Its pivotal role in curating the reference library of DNA barcodes, coupled with its data management and analysis capabilities, make it a central resource for biodiversity science.…

    Submitted 5 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  5. arXiv:2404.02904  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    ALOHa: A New Measure for Hallucination in Captioning Models

    Authors: Suzanne Petryk, David M. Chan, Anish Kachinthaya, Haodi Zou, John Canny, Joseph E. Gonzalez, Trevor Darrell

    Abstract: Despite recent advances in multimodal pre-training for visual description, state-of-the-art models still produce captions containing errors, such as hallucinating objects not present in a scene. The existing prominent metric for object hallucination, CHAIR, is limited to a fixed set of MS COCO objects and synonyms. In this work, we propose a modernized open-vocabulary metric, ALOHa, which leverage…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: To appear at NAACL 2024

  6. arXiv:2403.19822  [pdf, other]

    cs.CL cs.AI

    Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

    Authors: Yash Jain, David Chan, Pranav Dheram, Aparna Khare, Olabanji Shonibare, Venkatesh Ravichandran, Shalini Ghosh

    Abstract: Recent advances in machine learning have demonstrated that multi-modal pre-training can improve automatic speech recognition (ASR) performance compared to randomly initialized models, even when models are fine-tuned on uni-modal tasks. Existing multi-modal pre-training methods for the ASR task have primarily focused on single-stage pre-training where a single unsupervised task is used for pre-trai…

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted in LREC-COLING 2024 - The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation

  7. arXiv:2402.11590  [pdf, other]

    cs.HC

    Designing interactive data visualizations representing recovery progress for patients after stroke

    Authors: Alicia Ouskine, Adrian D. C. Chan, Fateme Rajabiyazdi

    Abstract: Stroke is one of the leading causes of disability worldwide. The efficacy of recovery is determined by a variety of factors, including patient adherence to rehabilitation programs. One way to increase patient adherence to their rehabilitation program is to show patients their progress that is visualized in a simple and intuitive way. We begin to gather preliminary information on Functional Capacit…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 2 pages

  8. arXiv:2402.09679  [pdf, other]

    cs.RO eess.SY

    Design and Visual Servoing Control of a Hybrid Dual-Segment Flexible Neurosurgical Robot for Intraventricular Biopsy

    Authors: Jian Chen, Mingcong Chen, Qingxiang Zhao, Shuai Wang, Yihe Wang, Ying Xiao, Jian Hu, Danny Tat Ming Chan, Kam Tong Leo Yeung, David Yuen Chung Chan, Hongbin Liu

    Abstract: Traditional rigid endoscopes have challenges in flexibly treating tumors located deep in the brain, and low operability and fixed viewing angles limit its development. This study introduces a novel dual-segment flexible robotic endoscope MicroNeuro, designed to perform biopsies with dexterous surgical manipulation deep in the brain. Taking into account the uncertainty of the control model, an imag…

    Submitted 23 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted by IEEE International Conference on Robotics and Automation (ICRA) 2024, 7 pages, 9 figures

  9. arXiv:2402.08205  [pdf, other]

    cs.RO

    TurtleRabbit 2024 SSL Team Description Paper

    Authors: Linh Trinh, Alif Anzuman, Eric Batkhuu, Dychen Chan, Lisa Graf, Darpan Gurung, Tharunimm Jamal, Jigme Namgyal, Jason Ng, Wing Lam Tsang, X. Rosalind Wang, Eren Yilmaz, Oliver Obst

    Abstract: TurtleRabbit is a new RoboCup SSL team from Western Sydney University. This team description paper presents our approach in navigating some of the challenges in developing a new SSL team from scratch. SSL is dominated by teams with extensive experience and customised equipment that has been developed over many years. Here, we outline our approach in overcoming some of the complexities associated w…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: Submitted paper as part of the qualification for RoboCup 2024

  10. arXiv:2401.05314  [pdf, other]

    eess.AS cs.CL cs.CV cs.SD

    ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video

    Authors: Kevin Cai, Chonghua Liu, David M. Chan

    Abstract: The Internet's wealth of content, with up to 60% published in English, starkly contrasts the global population, where only 18.8% are English speakers, and just 5.1% consider it their native language, leading to disparities in online information access. Unfortunately, automated processes for dubbing of video - replacing the audio track of a video with a translated alternative - remains a complex an…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: To appear in ICASSP 2024

  11. arXiv:2401.03384  [pdf, other]

    cs.LG cs.CV

    conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks

    Authors: Tahseen Rabbani, Jiahao Su, Xiaoyu Liu, David Chan, Geoffrey Sangston, Furong Huang

    Abstract: Modern ConvNets continue to achieve state-of-the-art results over a vast array of vision and image classification tasks, but at the cost of increasing parameters. One strategy for compactifying a network without sacrificing much expressive power is to reshape it into a tensorial neural network (TNN), which is a higher-order tensorization of its layers, followed by a factorization, such as a CP-dec…

    Submitted 6 January, 2024; originally announced January 2024.

  12. arXiv:2401.02417  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition

    Authors: David M. Chan, Shalini Ghosh, Hitesh Tulsiani, Ariya Rastrow, Björn Hoffmeister

    Abstract: While word error rates of automatic speech recognition (ASR) systems have consistently fallen, natural language understanding (NLU) applications built on top of ASR systems still attribute significant numbers of failures to low-quality speech recognition results. Existing assistant systems collect large numbers of these unsuccessful interactions, but these systems usually fail to learn from these…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: To appear in ICASSP 2024

  13. arXiv:2312.14378  [pdf, other]

    cs.LG cs.SD eess.AS

    Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

    Authors: Anirudh S. Sundar, Chao-Han Huck Yang, David M. Chan, Shalini Ghosh, Venkatesh Ravichandran, Phani Sankar Nidadavolu

    Abstract: Training large foundation models using self-supervised objectives on unlabeled data, followed by fine-tuning on downstream tasks, has emerged as a standard procedure. Unfortunately, the efficacy of this approach is often constrained by both limited fine-tuning compute and scarcity in labeled downstream data. We introduce Multimodal Attention Merging (MAM), an attempt that facilitates direct knowle…

    Submitted 9 February, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: 5 pages, 1 figure, ICASSP 2024 Workshop on Self-supervision in Audio, Speech and Beyond

  14. arXiv:2312.08366  [pdf, other]

    cs.CV

    See, Say, and Segment: Teaching LMMs to Overcome False Premises

    Authors: Tsung-Han Wu, Giscard Biamby, David Chan, Lisa Dunlap, Ritwik Gupta, Xudong Wang, Joseph E. Gonzalez, Trevor Darrell

    Abstract: Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary language grounding and segmentation but can suffer under false premises when queries imply the existence of something that is not actually present in the image. We observe that existing methods that fine-tune an LMM to segment images significantly degrade their ability to reliably determine ("see") if an obje…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: Project Page: https://see-say-segment.github.io

  15. arXiv:2310.12971  [pdf, other]

    cs.CV cs.AI cs.CL

    CLAIR: Evaluating Image Captions with Large Language Models

    Authors: David Chan, Suzanne Petryk, Joseph E. Gonzalez, Trevor Darrell, John Canny

    Abstract: The evaluation of machine-generated image captions poses an interesting yet persistent challenge. Effective evaluation measures must consider numerous dimensions of similarity, including semantic relevance, visual structure, object interactions, caption diversity, and specificity. Existing highly-engineered measures attempt to capture specific aspects, but fall short in providing a holistic score…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: To Appear at EMNLP 2023

  16. Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data

    Authors: Vladislav Lialin, Stephen Rawls, David Chan, Shalini Ghosh, Anna Rumshisky, Wael Hamza

    Abstract: Scaling up weakly-supervised datasets has shown to be highly effective in the image-text domain and has contributed to most of the recent state-of-the-art computer vision and multimodal neural networks. However, existing large-scale video-text datasets and mining techniques suffer from several limitations, such as the scarcity of aligned data, the lack of diversity in the data, and the difficulty…

    Submitted 4 April, 2023; originally announced April 2023.

    Journal ref: 2023 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW)

  17. arXiv:2302.01328  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    IC3: Image Captioning by Committee Consensus

    Authors: David M. Chan, Austin Myers, Sudheendra Vijayanarasimhan, David A. Ross, John Canny

    Abstract: If you ask a human to describe an image, they might do so in a thousand different ways. Traditionally, image captioning models are trained to generate a single "best" (most like a reference) image caption. Unfortunately, doing so encourages captions that are "informationally impoverished," and focus on only a subset of the possible details, while ignoring other potentially useful information in th…

    Submitted 19 October, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: To Appear at EMNLP 2023

  18. arXiv:2301.02736  [pdf, other]

    eess.AS cs.LG cs.SD

    Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition

    Authors: David M. Chan, Shalini Ghosh, Ariya Rastrow, Björn Hoffmeister

    Abstract: Despite improvements to the generalization performance of automated speech recognition (ASR) models, specializing ASR models for downstream tasks remains a challenging task, primarily due to reduced data availability (necessitating increased data collection), and rapidly shifting data distributions (requiring more frequent model fine-tuning). In this work, we investigate the potential of leveragin…

    Submitted 6 January, 2023; originally announced January 2023.

  19. arXiv:2209.07518  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    Distribution Aware Metrics for Conditional Natural Language Generation

    Authors: David M Chan, Yiming Ni, David A Ross, Sudheendra Vijayanarasimhan, Austin Myers, John Canny

    Abstract: Traditional automated metrics for evaluating conditional natural language generation use pairwise comparisons between a single generated text and the best-matching gold-standard ground truth text. When multiple ground truths are available, scores are aggregated using an average or max operation across references. While this approach works well when diversity in the ground truth data (i.e. dispersi…

    Submitted 29 September, 2022; v1 submitted 15 September, 2022; originally announced September 2022.

  20. arXiv:2207.08024  [pdf, other]

    cs.CV

    LAVA: Language Audio Vision Alignment for Contrastive Video Pre-Training

    Authors: Sumanth Gurram, Andy Fang, David Chan, John Canny

    Abstract: Generating representations of video data is of key importance in advancing the field of machine perception. Most current techniques rely on hand-annotated data, which can be difficult to work with, expensive to generate, and hard to scale. In this work, we propose a novel learning approach based on contrastive learning, LAVA, which is capable of learning joint language, audio, and video representa…

    Submitted 16 July, 2022; originally announced July 2022.

    Comments: Workshop Paper at ICML 2022

  21. arXiv:2206.08353  [pdf, other]

    cs.LG stat.ML

    Towards Understanding How Machines Can Learn Causal Overhypotheses

    Authors: Eliza Kosoy, David M. Chan, Adrian Liu, Jasmine Collins, Bryanna Kaufmann, Sandy Han Huang, Jessica B. Hamrick, John Canny, Nan Rosemary Ke, Alison Gopnik

    Abstract: Recent work in machine learning and cognitive science has suggested that understanding causal information is essential to the development of intelligence. The extensive literature in cognitive science using the "blicket detector" environment shows that children are adept at many kinds of causal inference and learning. We propose to adapt that environment for machine learning agents. One of the k…

    Submitted 16 June, 2022; originally announced June 2022.

  22. arXiv:2205.09872  [pdf, other]

    eess.AS cs.LG cs.SD

    Content-Context Factorized Representations for Automated Speech Recognition

    Authors: David M. Chan, Shalini Ghosh

    Abstract: Deep neural networks have largely demonstrated their ability to perform automated speech recognition (ASR) by extracting meaningful features from input audio frames. Such features, however, may consist not only of information about the spoken language content, but also may contain information about unnecessary contexts such as background noise and sounds or speaker identity, accent, or protected a…

    Submitted 15 September, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: Presented at Interspeech 2022 (On-Site Oral Presentation)

  23. arXiv:2205.06253  [pdf, other]

    cs.CV cs.CL

    What's in a Caption? Dataset-Specific Linguistic Diversity and Its Effect on Visual Description Models and Metrics

    Authors: David M. Chan, Austin Myers, Sudheendra Vijayanarasimhan, David A. Ross, Bryan Seybold, John F. Canny

    Abstract: While there have been significant gains in the field of automated video description, the generalization performance of automated description models to novel domains remains a major barrier to using these systems in the real world. Most visual description methods are known to capture and exploit patterns in the training data leading to evaluation metric increases, but what are those patterns? In th…

    Submitted 12 January, 2023; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: The 1st Workshop on Vision Datasets Understanding, IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR), 2022

  24. arXiv:2202.10430  [pdf, other]

    cs.LG cs.AI cs.NE

    Learning Causal Overhypotheses through Exploration in Children and Computational Models

    Authors: Eliza Kosoy, Adrian Liu, Jasmine Collins, David M Chan, Jessica B Hamrick, Nan Rosemary Ke, Sandy H Huang, Bryanna Kaufmann, John Canny, Alison Gopnik

    Abstract: Despite recent progress in reinforcement learning (RL), RL algorithms for exploration still remain an active area of research. Existing methods often focus on state-based metrics, which do not consider the underlying causal structures of the environment, and while recent research has begun to explore RL environments for causal learning, these environments primarily leverage causal information thro…

    Submitted 21 February, 2022; originally announced February 2022.

  25. arXiv:2202.07706   

    cs.CV

    Misinformation Detection in Social Media Video Posts

    Authors: Kehan Wang, David Chan, Seth Z. Zhao, John Canny, Avideh Zakhor

    Abstract: With the growing adoption of short-form video by social media platforms, reducing the spread of misinformation through video posts has become a critical challenge for social media providers. In this paper, we develop methods to detect misinformation in social media posts, exploiting modalities such as video and text. Due to the lack of large-scale public data for misinformation detection in multi-…

    Submitted 30 July, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

    Comments: We discovered an error in our dataset construction where retweets were not properly filtered. This resulted in test data leakage in training data, and the results reported are affected

  26. arXiv:2110.09890  [pdf, other]

    eess.AS cs.LG cs.SD

    Multi-Modal Pre-Training for Automated Speech Recognition

    Authors: David M. Chan, Shalini Ghosh, Debmalya Chakrabarty, Björn Hoffmeister

    Abstract: Traditionally, research in automated speech recognition has focused on local-first encoding of audio representations to predict the spoken phonemes in an utterance. Unfortunately, approaches relying on such hyper-local information tend to be vulnerable to both local-level corruption (such as audio-frame drops, or loud noises) and global-level noise (such as environmental noise, or background noise…

    Submitted 15 September, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Presented at ICASSP 2022

  27. arXiv:2110.03588  [pdf]

    eess.IV cs.CV physics.med-ph

    A transformer-based deep learning approach for classifying brain metastases into primary organ sites using clinical whole brain MRI

    Authors: Qing Lyu, Sanjeev V. Namjoshi, Emory McTyre, Umit Topaloglu, Richard Barcus, Michael D. Chan, Christina K. Cramer, Waldemar Debinski, Metin N. Gurcan, Glenn J. Lesser, Hui-Kuan Lin, Reginald F. Munden, Boris C. Pasche, Kiran Kumar Solingapuram Sai, Roy E. Strowd, Stephen B. Tatter, Kounosuke Watabe, Wei Zhang, Ge Wang, Christopher T. Whitlow

    Abstract: Treatment decisions for brain metastatic disease rely on knowledge of the primary organ site, and currently made with biopsy and histology. Here we develop a novel deep learning approach for accurate non-invasive digital histology with whole-brain MRI data. Our IRB-approved single-site retrospective study was comprised of patients (n=1,399) referred for MRI treatment-planning and gamma knife radio…

    Submitted 20 April, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

  28. arXiv:2108.13947  [pdf, other]

    stat.ML cs.CY cs.LG

    Decision Tree-Based Predictive Models for Academic Achievement Using College Students' Support Networks

    Authors: Anthony Frazier, Joethi Silva, Rachel Meilak, Indranil Sahoo, David Chan, Michael Broda

    Abstract: In this study, we examine a set of primary data collected from 484 students enrolled in a large public university in the Mid-Atlantic United States region during the early stages of the COVID-19 pandemic. The data, called Ties data, included students' demographic and support network information. The support network data comprised of information that highlighted the type of support, (i.e. emotional…

    Submitted 12 September, 2022; v1 submitted 31 August, 2021; originally announced August 2021.

  29. arXiv:2108.01651  [pdf, ps, other]

    cs.DC

    An Impossibility Result on Strong Linearizability in Message-Passing Systems

    Authors: David Yu Cheng Chan, Vassos Hadzilacos, Xing Hu, Sam Toueg

    Abstract: We prove that in asynchronous message-passing systems where at most one process may crash, there is no lock-free strongly linearizable implementation of a weak object that we call Test-or-Set (ToS). This object allows a single distinguished process to apply the set operation once, and a different distinguished process to apply the test operation also once. Since this weak object can be directly im…

    Submitted 9 August, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

    Comments: 12 pages

  30. arXiv:2106.03185  [pdf, ps, other]

    cs.DC

    Tight Lower Bounds for the RMR Complexity of Recoverable Mutual Exclusion

    Authors: David Yu Cheng Chan, Philipp Woelfel

    Abstract: We present a tight RMR complexity lower bound for the recoverable mutual exclusion (RME) problem, defined by Golab and Ramaraju \cite{GR2019a}. In particular, we show that any $n$-process RME algorithm using only atomic read, write, fetch-and-store, fetch-and-increment, and compare-and-swap operations, has an RMR complexity of $Ω(\log n/\log\log n)$ on the CC and DSM model. This lower bound covers…

    Submitted 6 June, 2021; originally announced June 2021.

    Comments: 36 pages, 0 figures

  31. arXiv:2105.10880  [pdf, other]

    cs.LG cs.HC cs.IR

    RtFPS: An Interactive Map that Visualizes and Predicts Wildfires in the US

    Authors: Yang Li, Hermawan Mulyono, Ying Chen, Zhiyin Lu, Desmond Chan

    Abstract: Climate change has largely impacted our daily lives. As one of its consequences, we are experiencing more wildfires. In the year 2020, wildfires burned a record number of 8,888,297 acres in the US. To awaken people's attention to climate change, and to visualize the current risk of wildfires, we developed RtFPS, "Real-Time Fire Prediction System". It provides a real-time prediction visualization o…

    Submitted 21 June, 2021; v1 submitted 23 May, 2021; originally announced May 2021.

    Comments: Source code: https://github.com/yangland/rtfps

    MSC Class: 68U05; 68T30 ACM Class: J.2.5; H.4.0; I.5.1

  32. arXiv:2104.01263  [pdf, other]

    cs.CV

    A Semantic Segmentation Network for Urban-Scale Building Footprint Extraction Using RGB Satellite Imagery

    Authors: Aatif Jiwani, Shubhrakanti Ganguly, Chao Ding, Nan Zhou, David M. Chan

    Abstract: Urban areas consume over two-thirds of the world's energy and account for more than 70 percent of global CO2 emissions. As stated in IPCC's Global Warming of 1.5C report, achieving carbon neutrality by 2050 requires a clear understanding of urban geometry. High-quality building footprint generation from satellite images can accelerate this predictive process and empower municipal decision-making a…

    Submitted 18 November, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

    Comments: 11 pages, 5 figures. Code available at https://github.com/aatifjiwani/rgb-footprint-extract/

  33. arXiv:2103.11926  [pdf, other]

    cs.DC

    Differentiated nonblocking: a new progress condition and a matching queue algorithm

    Authors: David Y. C. Chan, Shucheng Chi, Vassos Hadzilacos, Sam Toueg

    Abstract: In this paper, we first propose a new liveness requirement for shared objects and data structures, we then give a shared queue algorithm that satisfies this requirement and we prove its correctness. We also implement this algorithm and compare it to a well-known shared queue algorithm that is used in practice. In addition to having a stronger worst-case progress guarantee, our experimental results…

    Submitted 22 March, 2021; originally announced March 2021.

  34. arXiv:2008.02787  [pdf, other]

    cs.CV cs.GR eess.IV eess.SP physics.optics

    Efficient Non-Line-of-Sight Imaging from Transient Sinograms

    Authors: Mariko Isogawa, Dorian Chan, Ye Yuan, Kris Kitani, Matthew O'Toole

    Abstract: Non-line-of-sight (NLOS) imaging techniques use light that diffusely reflects off of visible surfaces (e.g., walls) to see around corners. One approach involves using pulsed lasers and ultrafast sensors to measure the travel time of multiply scattered light. Unlike existing NLOS techniques that generally require densely raster scanning points across the entirety of a relay wall, we explore a more…

    Submitted 6 August, 2020; originally announced August 2020.

    Comments: ECCV 2020. Project page: https://marikoisogawa.github.io/project/c2nlos

  35. arXiv:2007.13913  [pdf, other]

    cs.CV cs.CL cs.LG

    Active Learning for Video Description With Cluster-Regularized Ensemble Ranking

    Authors: David M. Chan, Sudheendra Vijayanarasimhan, David A. Ross, John Canny

    Abstract: Automatic video captioning aims to train models to generate text descriptions for all segments in a video, however, the most effective approaches require large amounts of manual annotation which is slow and expensive. Active learning is a promising way to efficiently build a training set for video captioning tasks while reducing the need to manually label uninformative examples. In this work we bo…

    Submitted 2 December, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

    Comments: Published at the 15th Asian Conference on Computer Vision (ACCV 2020)

  36. arXiv:2006.08335  [pdf, other]

    cs.CL cs.CV cs.MM

    A Dataset and Benchmarks for Multimedia Social Analysis

    Authors: Bofan Xue, David Chan, John Canny

    Abstract: We present a new publicly available dataset with the goal of advancing multi-modality learning by offering vision and language data within the same context. This is achieved by obtaining data from a social media website with posts containing multiple paired images/videos and text, along with comment trees containing images/videos and/or text. With a total of 677k posts, 2.9 million post images, 48…

    Submitted 5 June, 2020; originally announced June 2020.

    Comments: Published as a workshop paper at "Multimodality Learning" (CVPR 2020)

  37. arXiv:2005.05023  [pdf, other]

    cs.HC cs.LG

    Facial Electromyography-based Adaptive Virtual Reality Gaming for Cognitive Training

    Authors: Lorcan Reidy, Dennis Chan, Charles Nduka, Hatice Gunes

    Abstract: Cognitive training has shown promising results for delivering improvements in human cognition related to attention, problem solving, reading comprehension and information retrieval. However, two frequently cited problems in cognitive training literature are a lack of user engagement with the training programme, and a failure of developed skills to generalise to daily life. This paper introduces a…

    Submitted 30 August, 2020; v1 submitted 27 April, 2020; originally announced May 2020.

    ACM Class: I.2; K.8

  38. arXiv:2005.02880  [pdf, other]

    cs.AI

    Exploring Exploration: Comparing Children with RL Agents in Unified Environments

    Authors: Eliza Kosoy, Jasmine Collins, David M. Chan, Sandy Huang, Deepak Pathak, Pulkit Agrawal, John Canny, Alison Gopnik, Jessica B. Hamrick

    Abstract: Research in developmental psychology consistently shows that children explore the world thoroughly and efficiently and that this exploration allows them to learn. In turn, this early learning supports more robust generalization and intelligent behavior later in life. While much work has gone into developing methods for exploration in machine learning, artificial agents have not yet reached the hig…

    Submitted 1 July, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

    Comments: Published as a workshop paper at "Bridging AI and Cognitive Science" (ICLR 2020)

  39. arXiv:1910.12154  [pdf, other]

    cs.LG cs.AI

    ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations

    Authors: Daniel Seita, David Chan, Roshan Rao, Chen Tang, Mandi Zhao, John Canny

    Abstract: Learning from demonstrations is a popular tool for accelerating and reducing the exploration requirements of reinforcement learning. When providing expert demonstrations to human students, we know that the demonstrations must fall within a particular range of difficulties called the "Zone of Proximal Development (ZPD)". If they are too easy the student learns nothing, but if they are too difficult…

    Submitted 26 October, 2019; originally announced October 2019.

    Comments: Deep Reinforcement Learning Workshop at NeurIPS 2019

  40. arXiv:1902.06085  [pdf]

    cs.CV eess.IV

    DC-AL GAN: Pseudoprogression and True Tumor Progression of Glioblastoma Multiform Image Classification Based on DCGAN and AlexNet

    Authors: Meiyu Li, Hailiang Tang, Michael D. Chan, Xiaobo Zhou, Xiaohua Qian

    Abstract: Pseudoprogression (PsP) occurs in 20-30% of patients with glioblastoma multiforme (GBM) after receiving the standard treatment. In the course of post-treatment magnetic resonance imaging (MRI), PsP exhibits similarities in shape and intensity to the true tumor progression (TTP) of GBM. These similarities pose challenges for the differentiation of these types of progression and hence the selecti…

    Submitted 18 May, 2019; v1 submitted 16 February, 2019; originally announced February 2019.

  41. arXiv:1902.04168  [pdf, other]

    math.NA cs.CE physics.flu-dyn

    A robust and non-singular formulation of the boundary integral method for the potential problem

    Authors: Q. Sun, E. Klaseboer, B. C. Khoo, D. Y. C. Chan

    Abstract: A non-singular formulation of the boundary integral method (BIM) is presented for the Laplace equation whereby the well-known singularities that arise from the fundamental solution are eliminated analytically. A key advantage of this approach is that numerical errors that arise due to the proximity of nodes located on osculating boundaries are suppressed. This is particularly relevant in multi-sca…

    Submitted 7 February, 2019; originally announced February 2019.

    Journal ref: Engineering Analysis with Boundary Elements 43 (2014) 117

  42. arXiv:1901.05305  [pdf, other]

    eess.SP cs.CV cs.LG

    Seizure Detection using Least EEG Channels by Deep Convolutional Neural Network

    Authors: Mustafa Talha Avcu, Zhuo Zhang, Derrick Wei Shih Chan

    Abstract: This work aims to develop an end-to-end solution for seizure onset detection. We design SeizNet, a Convolutional Neural Network for seizure detection. To compare SeizNet with a traditional machine learning approach, a baseline classifier is implemented using spectrum band power features with Support Vector Machines (BPsvm). We explore the possibility of using the least number of channels for accur…

    Submitted 14 January, 2019; originally announced January 2019.

  43. arXiv:1812.09744  [pdf, other]

    cs.CV

    Leveraging Class Similarity to Improve Deep Neural Network Robustness

    Authors: Pooran Singh Negi, David Chan, Mohammad Mahoor

    Abstract: Traditionally, artificial neural networks (ANNs) are trained by minimizing the cross-entropy between a provided groundtruth delta distribution (encoded as one-hot vector) and the ANN's predictive softmax distribution. It seems, however, unacceptable to penalize networks equally for misclassification between classes. Confusing the class "Automobile" with the class "Truck" should be penalized less t…

    Submitted 27 December, 2018; v1 submitted 23 December, 2018; originally announced December 2018.

  44. arXiv:1812.04604  [pdf, other]

    cs.CV cs.AI

    Diagnostic Visualization for Deep Neural Networks Using Stochastic Gradient Langevin Dynamics

    Authors: Biye Jiang, David M. Chan, Tianhao Zhang, John F. Canny

    Abstract: The internal states of most deep neural networks are difficult to interpret, which makes diagnosis and debugging during training challenging. Activation maximization methods are widely used, but lead to multiple optima and are hard to interpret (appear noise-like) for complex neurons. Image-based methods use maximally-activating image regions which are easier to interpret, but do not provide pixel…

    Submitted 11 December, 2018; originally announced December 2018.

  45. Brain-Computer Interface in Virtual Reality

    Authors: Reza Abbasi-Asl, Mohammad Keshavarzi, Dorian Yao Chan

    Abstract: We study the performance of a brain-computer interface (BCI) system in a virtual reality (VR) environment and compare it to regular 2D displays. First, we design a headset that consists of three components: a wearable electroencephalography (EEG) device, a VR headset and an interface. Recordings of brain and behavior from human subjects performing a wide variety of tasks using our device are collec…

    Submitted 13 November, 2018; originally announced November 2018.

  46. arXiv:1810.00216  [pdf, other]

    stat.AP cs.CV

    Parameter Estimation for the Single-Look $\mathcal{G}^0$ Distribution

    Authors: Débora Chan, Andrea Rey, Juliana Gambini, Alejandro C. Frery

    Abstract: The statistical properties of Synthetic Aperture Radar (SAR) image texture reveal useful target characteristics. It is well-known that these images are affected by speckle, and prone to contamination as double bounce and corner reflectors. The $\mathcal{G}^0$ distribution is flexible enough to model different degrees of texture in speckled data. It is indexed by three parameters: $α$, related to…

    Submitted 29 September, 2018; originally announced October 2018.

  47. arXiv:1807.11824  [pdf, other]

    cs.LG cs.PF stat.ML

    t-SNE-CUDA: GPU-Accelerated t-SNE and its Applications to Modern Data

    Authors: David M. Chan, Roshan Rao, Forrest Huang, John F. Canny

    Abstract: Modern datasets and models are notoriously difficult to explore and analyze due to their inherent high dimensionality and massive numbers of samples. Existing visualization methods which employ dimensionality reduction to two or three dimensions are often inefficient and/or ineffective for these datasets. This paper introduces t-SNE-CUDA, a GPU-accelerated implementation of t-distributed Symmetric…

    Submitted 31 July, 2018; originally announced July 2018.

    Comments: To appear in HPML 2018 High Performance Machine Learning Workshop (Accepted, 2018)

  48. Facial Expression Recognition from World Wild Web

    Authors: Ali Mollahosseini, Behzad Hassani, Michelle J. Salvador, Hojjat Abdollahi, David Chan, Mohammad H. Mahoor

    Abstract: Recognizing facial expression in a wild setting has remained a challenging task in computer vision. The World Wide Web is a good source of facial images, most of which are captured in uncontrolled conditions. In fact, the Internet is a World Wild Web of facial images with expressions. This paper presents the results of a new study on collecting, annotating, and analyzing wild facial expressions…

    Submitted 5 January, 2017; v1 submitted 11 May, 2016; originally announced May 2016.

  49. Going Deeper in Facial Expression Recognition using Deep Neural Networks

    Authors: Ali Mollahosseini, David Chan, Mohammad H. Mahoor

    Abstract: Automated Facial Expression Recognition (FER) has remained a challenging and interesting problem. Despite efforts made in developing various methods for FER, existing approaches traditionally lack generalizability when applied to unseen images or those that are captured in a wild setting. Most of the existing approaches are based on engineered features (e.g. HOG, LBPH, and Gabor) where the classifie…

    Submitted 12 November, 2015; originally announced November 2015.

    Comments: To appear in IEEE Winter Conference on Applications of Computer Vision (WACV), 2016 {Accepted in first round submission}

    Journal ref: IEEE Winter Conference on Applications of Computer Vision (WACV), 2016