arXiv:2404.19360

Large Language Model Informed Patent Image Retrieval

Authors: Hao-Cheng Lo, Jung-Mei Chu, Jieh Hsiang, Chun-Chieh Cho

Abstract: In patent prosecution, image-based retrieval systems for identifying similarities between current patent images and prior art are pivotal to ensuring the novelty and non-obviousness of patent applications. Despite their growing popularity in recent years, existing attempts, while effective at recognizing images within the same patent, fail to deliver practical value due to their limited generalizability in retrieving relevant prior art. Moreover, this task inherently involves the challenges posed by the abstract visual features of patent images, the skewed distribution of image classifications, and the semantic information of image descriptions. Therefore, we propose a language-informed, distribution-aware multimodal approach to patent image feature learning, which enriches the semantic understanding of patent images by integrating Large Language Models and improves the performance of underrepresented classes with our proposed distribution-aware contrastive losses. Extensive experiments on the DeepPatent2 dataset show that our proposed method achieves state-of-the-art or comparable performance in image-based patent retrieval with mAP +53.3%, Recall@10 +41.8%, and MRR@10 +51.9%. Furthermore, through an in-depth user analysis, we explore our model in aiding patent professionals in their image retrieval efforts, highlighting the model's real-world applicability and effectiveness.

Submitted 30 April, 2024; originally announced April 2024.

Comments: 8 pages. Under review

arXiv:2403.15675

An active learning model to classify animal species in Hong Kong

Authors: Gareth Lamb, Ching Hei Lo, Jin Wu, Calvin K. F. Lee

Abstract: Camera traps are used by ecologists globally as an efficient and non-invasive method to monitor animals. While it is time-consuming to manually label the collected images, recent advances in deep learning and computer vision have made it possible to automate this process [1]. A major obstacle to this is the generalisability of these models when they are applied to independently collected data from other parts of the world [2]. Here, we use a deep active learning workflow [3] and train a model that is applicable to camera trap images collected in Hong Kong.

Submitted 22 March, 2024; originally announced March 2024.

Comments: 6 pages, 2 figures, 1 table

arXiv:2402.00421

From PARIS to LE-PARIS: Toward Patent Response Automation with Recommender Systems and Collaborative Large Language Models

Authors: Jung-Mei Chu, Hao-Cheng Lo, Jieh Hsiang, Chun-Chieh Cho

Abstract: In patent prosecution, timely and effective responses to Office Actions (OAs) are crucial for securing patents. However, past automation and artificial intelligence research has largely overlooked this aspect. To bridge this gap, our study introduces the Patent Office Action Response Intelligence System (PARIS) and its advanced version, the Large Language Model (LLM) Enhanced PARIS (LE-PARIS). These systems are designed to enhance the efficiency of patent attorneys in handling OA responses through collaboration with AI. The systems' key features include the construction of an OA Topics Database, development of Response Templates, and implementation of Recommender Systems and LLM-based Response Generation. To validate the effectiveness of the systems, we have employed a multi-paradigm analysis using the USPTO Office Action database and longitudinal data based on attorney interactions with our systems over six years. Through five studies, we have examined the constructiveness of OA topics (studies 1 and 2) using topic modeling and our proposed Delphi process, the efficacy of our proposed hybrid LLM-based recommender system tailored for OA responses (study 3), the quality of generated responses (study 4), and the systems' practical value in real-world scenarios through user studies (study 5). The results indicate that both PARIS and LE-PARIS significantly achieve key metrics and have a positive impact on attorney performance.

Submitted 4 March, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: 28 pages, 5 figures, typos corrected, references added, under review

arXiv:2309.11017

3SAT on an All-to-All-Connected CMOS Ising Solver Chip

Authors: Hüsrev Cılasun, Ziqing Zeng, Ramprasath S, Abhimanyu Kumar, Hao Lo, William Cho, Chris H. Kim, Ulya R. Karpuzcu, Sachin S. Sapatnekar

Abstract: This work solves 3SAT, a classical NP-complete problem, on a CMOS-based Ising hardware chip with all-to-all connectivity. The paper addresses practical issues in going from algorithms to hardware. It considers several degrees of freedom in mapping the 3SAT problem to the chip - using multiple Ising formulations for 3SAT; exploring multiple strategies for decomposing large problems into subproblems that can be accommodated on the Ising chip; and executing a sequence of these subproblems on CMOS hardware to obtain the solution to the larger problem. These are evaluated within a software framework, and the results are used to identify the most promising formulations and decomposition techniques. These best approaches are then mapped to the all-to-all hardware, and the performance of 3SAT is evaluated on the chip. Experimental data shows that the deployed decomposition and mapping strategies impact SAT solution quality: without our methods, the CMOS hardware cannot achieve 3SAT solutions on SATLIB benchmarks.

Submitted 19 September, 2023; originally announced September 2023.

ACM Class: B.7

arXiv:2309.08325

Distributional Inclusion Hypothesis and Quantifications: Probing for Hypernymy in Functional Distributional Semantics

Authors: Chun Hei Lo, Wai Lam, Hong Cheng, Guy Emerson

Abstract: Functional Distributional Semantics (FDS) models the meaning of words by truth-conditional functions. This provides a natural representation for hypernymy but no guarantee that it can be learnt when FDS models are trained on a corpus. In this paper, we probe into FDS models and study the representations learnt, drawing connections between quantifications, the Distributional Inclusion Hypothesis (DIH), and the variational-autoencoding objective of FDS model training. Using synthetic data sets, we reveal that FDS models learn hypernymy on a restricted class of corpora that strictly follow the DIH. We further introduce a training objective that both enables hypernymy learning under the reverse of the DIH and improves hypernymy detection from real corpora.

Submitted 10 February, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: 12 pages

arXiv:2304.13789

Composable Security of Distributed Symmetric Key Exchange Protocol

Authors: Jie Lin, Manfred von Willich, Hoi-Kwong Lo

Abstract: The Distributed Symmetric Key Exchange (DSKE) protocol provides secure secret exchange (e.g., for key exchange) between two honest parties that need not have had prior contact, and use intermediaries with whom they each securely share confidential data. We show the composable security of the DSKE protocol in the constructive cryptography framework of Maurer. Specifically, we prove the security (correctness and confidentiality) and robustness of this protocol against any computationally unbounded adversary, who additionally may have fully compromised a bounded number of the intermediaries and can eavesdrop on all communication. As DSKE is highly scalable in a network setting with no distance limit, it is expected to be a cost-effective quantum-safe cryptographic solution to safeguarding the network security against the threat of quantum computers.

Submitted 26 April, 2023; originally announced April 2023.

Comments: 15+6 pages, 5 figures

arXiv:2205.00615

Distributed Symmetric Key Exchange: A scalable, quantum-proof key distribution system

Authors: Hoi-Kwong Lo, Mattia Montagna, Manfred von Willich

Abstract: We propose and implement a protocol for a scalable, cost-effective, information-theoretically secure key distribution and management system. The system, called Distributed Symmetric Key Exchange (DSKE), relies on pre-shared random numbers between DSKE clients and a group of Security Hubs. Any group of DSKE clients can use the DSKE protocol to distill from the pre-shared numbers a secret key. The clients are protected from Security Hub compromise via a secret sharing scheme that allows the creation of the final key without the need to trust individual Security Hubs. Precisely, if the number of compromised Security Hubs does not exceed a certain threshold, confidentiality is guaranteed to DSKE clients and, at the same time, robustness against denial-of-service (DoS) attacks. The DSKE system can be used for quantum-secure communication, can be easily integrated into existing network infrastructures, and can support arbitrary groups of communication parties that have access to a key. We discuss the high-level protocol and analyze its security, including its robustness against disruption. A proof-of-principle demonstration of secure communication between two distant clients with a DSKE-based VPN using Security Hubs on Amazon Web Services (AWS) nodes thousands of kilometres away from them was performed, demonstrating the feasibility of DSKE-enabled secret sharing one-time-pad encryption with a data rate above 50 Mbit/s and a latency below 70 ms.

Submitted 24 November, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

Comments: Our protocol has been renamed Distributed Symmetric Key Exchange (DSKE). 11 pages, 6 figures

MSC Class: 94A60 ACM Class: E.3

arXiv:2110.13041

doi 10.3389/fdata.2022.787421

Applications and Techniques for Fast Machine Learning in Science

Authors: Allison McCarn Deiana, Nhan Tran, Joshua Agar, Michaela Blott, Giuseppe Di Guglielmo, Javier Duarte, Philip Harris, Scott Hauck, Mia Liu, Mark S. Neubauer, Jennifer Ngadiuba, Seda Ogrenci-Memik, Maurizio Pierini, Thea Aarrestad, Steffen Bahr, Jurgen Becker, Anne-Sophie Berthold, Richard J. Bonventre, Tomas E. Muller Bravo, Markus Diefenthaler, Zhen Dong, Nick Fritzsche, Amir Gholami, Ekaterina Govorkova, Kyle J Hazelwood, et al. (62 additional authors not shown)

Abstract: In this community review report, we discuss applications and techniques for fast machine learning (ML) in science: the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. We also present overlapping challenges across the multiple scientific domains where common solutions can be found. This community report is intended to give plenty of examples and inspiration for scientific discovery through integrated and accelerated ML solutions. This is followed by a high-level overview and organization of technical advances, including an abundance of pointers to source material, which can enable these breakthroughs.

Submitted 25 October, 2021; originally announced October 2021.

Comments: 66 pages, 13 figures, 5 tables

Report number: FERMILAB-PUB-21-502-AD-E-SCD

Journal ref: Front. Big Data 5, 787421 (2022)

arXiv:2101.06066

Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation

Authors: Mudit Chaudhary, Borislav Dzodzo, Sida Huang, Chun Hei Lo, Mingzhi Lyu, Lun Yiu Nie, Jinbo Xing, Tianhua Zhang, Xiaoying Zhang, Jingyan Zhou, Hong Cheng, Wai Lam, Helen Meng

Abstract: Dialog systems enriched with external knowledge can handle user queries that are outside the scope of the supporting databases/APIs. In this paper, we follow the baseline provided in DSTC9 Track 1 and propose three subsystems, KDEAK, KnowleDgEFactor, and Ens-GPT, which form the pipeline for a task-oriented dialog system capable of accessing unstructured knowledge. Specifically, KDEAK performs knowledge-seeking turn detection by formulating the problem as natural language inference using knowledge from dialogs, databases and FAQs. KnowleDgEFactor accomplishes the knowledge selection task by formulating a factorized knowledge/document retrieval problem with three modules performing domain, entity and knowledge level analyses. Ens-GPT generates a response by first processing multiple knowledge snippets, followed by an ensemble algorithm that decides if the response should be solely derived from a GPT2-XL model, or regenerated in combination with the top-ranking knowledge snippet. Experimental results demonstrate that the proposed pipeline system outperforms the baseline and generates high-quality responses, achieving at least 58.77% improvement on BLEU-4 score.

Submitted 15 January, 2021; originally announced January 2021.

arXiv:2009.13271

Embedding and generation of indoor climbing routes with variational autoencoder

Authors: K. H. Lo

Abstract: The recent increase in the popularity of indoor climbing allows possible applications of deep learning algorithms to classify and generate climbing routes. In this work, we apply a variational autoencoder to climbing routes on the MoonBoard, a standardized training apparatus that is well known within the climbing community. By sampling the encoded latent space, it is observed that the algorithm can generate high-quality climbing routes. 22 generated problems are uploaded to the MoonBoard app for user review. This algorithm could serve as a first step to facilitate indoor climbing route setting.

Submitted 16 September, 2020; originally announced September 2020.

arXiv:1909.04749

Visual Analytics of Student Learning Behaviors on K-12 Mathematics E-learning Platforms

Authors: Meng Xia, Huan Wei, Min Xu, Leo Yu Ho Lo, Yong Wang, Rong Zhang, Huamin Qu

Abstract: With the increasing popularity of online learning, a surge of E-learning platforms has emerged to facilitate education opportunities for K-12 (from kindergarten to 12th grade) students and, with this, a wealth of information on their learning logs is being recorded. However, it remains unclear how to make use of these detailed learning behavior data to improve the design of learning materials and gain deeper insight into students' thinking and learning styles. In this work, we propose a visual analytics system to analyze student learning behaviors on a K-12 mathematics E-learning platform. It supports both correlation analysis between different attributes and a detailed visualization of user mouse-movement logs. Our case studies on a real dataset show that our system can better guide the design of learning resources (e.g., math questions) and facilitate quick interpretation of students' problem-solving and learning styles.

Submitted 21 September, 2019; v1 submitted 7 September, 2019; originally announced September 2019.

Comments: 2 pages, 6 figures, 2019 VAST conference, Best Poster, Learning Analytics, Visual Analytic System for Education, Online Learning, Learning Data Analysis, Learning Trajectories Analysis, Mouse Movement

arXiv:1707.06684

ShortScience.org - Reproducing Intuition

Authors: Joseph Paul Cohen, Henry Z. Lo

Abstract: We present ShortScience.org, a platform for post-publication discussion of research papers. On ShortScience.org, the research community can read and write summaries of papers in order to increase accessibility and reproducibility. Summaries contain the perspective and insight of other readers, why they liked or disliked it, and their attempt to demystify complicated sections. ShortScience.org has over 600 paper summaries, all of which are searchable and organized by paper, conference, and year. Many regular contributors are expert machine learning researchers. We present statistics from the last year of operation, user demographics, and responses from a usage survey. Results indicate that ShortScience benefits students most, by providing short, understandable summaries reflecting expert opinions.

Submitted 20 July, 2017; originally announced July 2017.

Comments: To appear in International Conference on Machine Learning 2017 Workshop on Reproducibility in Machine Learning

arXiv:1703.08710

Count-ception: Counting by Fully Convolutional Redundant Counting

Authors: Joseph Paul Cohen, Genevieve Boucher, Craig A. Glastonbury, Henry Z. Lo, Yoshua Bengio

Abstract: Counting objects in digital images is a process that should be replaced by machines. This tedious task is time-consuming and prone to errors due to fatigue of human annotators. The goal is to have a system that takes as input an image and returns a count of the objects inside and justification for the prediction in the form of object localization. We repose a problem, originally posed by Lempitsky and Zisserman, to instead predict a count map which contains redundant counts based on the receptive field of a smaller regression network. The regression network predicts a count of the objects that exist inside this frame. By processing the image in a fully convolutional way, each pixel is going to be accounted for some number of times: the number of windows which include it, which is the size of each window (i.e., 32x32 = 1024). To recover the true count we take the average over the redundant predictions. Our contribution is redundant counting instead of predicting a density map in order to average over errors. We also propose a novel deep neural network architecture adapted from the Inception family of networks called the Count-ception network. Together our approach results in a 20% relative improvement (2.9 to 2.3 MAE) over the state-of-the-art method by Xie, Noble, and Zisserman in 2016.

Submitted 23 July, 2017; v1 submitted 25 March, 2017; originally announced March 2017.

Comments: Under Review

arXiv:1610.00318

MinMax Radon Barcodes for Medical Image Retrieval

Authors: H. R. Tizhoosh, Shujin Zhu, Hanson Lo, Varun Chaudhari, Tahmid Mehdi

Abstract: Content-based medical image retrieval can support diagnostic decisions by clinical experts. Examining similar images may provide clues to the expert to remove uncertainties in his/her final diagnosis. Beyond conventional feature descriptors, binary features in different ways have been recently proposed to encode the image content. A recent proposal is "Radon barcodes" that employ binarized Radon projections to tag/annotate medical images with content-based binary vectors, called barcodes. In this paper, MinMax Radon barcodes are introduced which are superior to the "local thresholding" scheme suggested in the literature. Using the IRMA dataset with 14,410 x-ray images from 193 different classes, the advantage of using MinMax Radon barcodes over thresholded Radon barcodes is demonstrated. The retrieval error for direct search drops by more than 15%. As well, SURF, as a well-established non-binary approach, and BRISK, as a recent binary method, are examined to compare their results with MinMax Radon barcodes when retrieving images from the IRMA dataset. The results demonstrate that MinMax Radon barcodes are faster and more accurate when applied to IRMA images.

Submitted 2 October, 2016; originally announced October 2016.

Comments: To appear in proceedings of the 12th International Symposium on Visual Computing, December 12-14, 2016, Las Vegas, Nevada, USA

arXiv:1604.07796

Scale Normalization

Authors: Henry Z. Lo, Kevin Amaral, Wei Ding

Abstract: One of the difficulties of training deep neural networks is caused by improper scaling between layers. Scaling issues introduce exploding / gradient problems, and have typically been addressed by careful scale-preserving initialization. We investigate the value of preserving scale, or isometry, beyond the initial weights. We propose two methods of maintaining isometry, one exact and one stochastic. Preliminary experiments show that both determinant and scale normalization effectively speed up learning. Results suggest that isometry is important in the beginning of learning, and maintaining it leads to faster learning.

Submitted 26 April, 2016; originally announced April 2016.

Comments: Preliminary version submitted to ICLR workshop 2016

arXiv:1603.04395

Academic Torrents: Scalable Data Distribution

Authors: Henry Z. Lo, Joseph Paul Cohen

Abstract: As competitions get more popular, transferring ever-larger data sets becomes infeasible and costly. For example, downloading the 157.3 GB 2012 ImageNet data set incurs about $4.33 in bandwidth costs per download. Downloading the full ImageNet data set takes 33 days. ImageNet has since become popular beyond the competition, and many papers and models now revolve around this data set. To share such an important resource with the machine learning community, the sharers of ImageNet must shoulder a large bandwidth burden. Academic Torrents reduces this burden for disseminating competition data, and also increases download speeds for end users. Academic Torrents is run by a pending nonprofit. By augmenting an existing HTTP server with a peer-to-peer swarm, requests get re-routed to get data from downloaders. While existing systems slow down with more users, the benefits of Academic Torrents grow, with noticeable effects even when only one other person is downloading.

Submitted 14 March, 2016; originally announced March 2016.

Comments: Presented at Neural Information Processing Systems 2015 Challenges in Machine Learning (CiML) workshop http://ciml.chalearn.org/home/schedule

arXiv:1602.05931

RandomOut: Using a convolutional gradient norm to rescue convolutional filters

Authors: Joseph Paul Cohen, Henry Z. Lo, Wei Ding

Abstract: Filters in convolutional neural networks are sensitive to their initialization. The random numbers used to initialize filters are a bias and determine if you will "win" and converge to a satisfactory local minimum, so we call this The Filter Lottery. We observe that the 28x28 Inception-V3 model without Batch Normalization fails to train 26% of the time when varying the random seed alone. This is a problem that affects the trial and error process of designing a network. Because random seeds have a large impact, it is hard to evaluate a network design without trying many different random starting weights. This work aims to reduce the bias imposed by the initial weights so a network converges more consistently. We propose to evaluate and replace specific convolutional filters that have little impact on the prediction. We use the gradient norm to evaluate the impact of a filter on error, and re-initialize a filter when the gradient norm of its weights falls below a specific threshold. This consistently improves accuracy on the 28x28 Inception-V3 with a median increase of +3.3%. In effect our method RandomOut increases the number of filters explored without increasing the size of the network. We observe that the RandomOut method has more consistent generalization performance, having a standard deviation of 1.3% instead of 2% when varying random seeds, and does so faster and with fewer parameters.

Submitted 29 May, 2017; v1 submitted 18 February, 2016; originally announced February 2016.

Comments: Extended version of the ICLR 2016 workshop track paper

arXiv:1601.00978

Crater Detection via Convolutional Neural Networks

Authors: Joseph Paul Cohen, Henry Z. Lo, Tingting Lu, Wei Ding

Abstract: Craters are among the most studied geomorphic features in the Solar System because they yield important information about the past and present geological processes and provide information about the relative ages of observed geologic formations. We present a method for automatic crater detection using advanced machine learning to deal with the large amount of satellite imagery collected. The challenge of automatically detecting craters comes from their complex surface, because their shape erodes over time to blend into the surface. Bandeira provided a seminal dataset that embodied this challenge, which is still an unsolved pattern recognition problem to this day. There has been work to solve this challenge based on extracting shape and contrast features and then applying classification models on those features. The limiting factor in this existing work is the use of hand-crafted filters on the image such as Gabor or Sobel filters or Haar features. These hand-crafted methods rely on domain knowledge to construct. We would like to learn the optimal filters and features based on training examples. In order to dynamically learn filters and features we look to Convolutional Neural Networks (CNNs), which have shown their dominance in computer vision. The power of CNNs is that they can learn image filters which generate features for high accuracy classification.

Submitted 5 January, 2016; originally announced January 2016.

Comments: 2 Pages. Submitted to 47th Lunar and Planetary Science Conference (LPSC 2016)

arXiv:1405.0198

No Superluminal Signaling Implies Unconditionally Secure Bit Commitment

Authors: H. F. Chau, C. -H. Fred Fung, H. -K. Lo

Abstract: Bit commitment (BC) is an important cryptographic primitive for an agent to convince a mutually mistrustful party that she has already made a binding choice of 0 or 1 but only to reveal her choice at a later time. Ideally, a BC protocol should be simple, reliable, easy to implement using existing technologies, and most importantly unconditionally secure in the sense that its security is based on an information-theoretic proof rather than computational complexity assumption or the existence of a trustworthy arbitrator. Here we report such a provably secure scheme involving only one-way classical communications whose unconditional security is based on no superluminal signaling (NSS). Our scheme is inspired by the earlier works by Kent, who proposed two impractical relativistic protocols whose unconditional securities are yet to be established as well as several provably unconditionally secure protocols which rely on both quantum mechanics and NSS. Our scheme is conceptually simple and shows for the first time that quantum communication is not needed to achieve unconditional security for BC. Moreover, with purely classical communications, our scheme is practical and easy to implement with existing telecom technologies. This completes the cycle of study of unconditionally secure bit commitment based on known physical laws.

Submitted 18 November, 2014; v1 submitted 1 May, 2014; originally announced May 2014.

Comments: This paper has been withdrawn by the authors due to a crucial oversight on an earlier work by A. Kent

arXiv:1207.1473 [pdf, ps, other]

doi 10.1103/PhysRevA.87.062327

Postprocessing for quantum random number generators: entropy evaluation and randomness extraction

Authors: Xiongfeng Ma, Feihu Xu, He Xu, Xiaoqing Tan, Bing Qi, Hoi-Kwong Lo

Abstract: Quantum random-number generators (QRNGs) can offer a means to generate information-theoretically provable random numbers, in principle. In practice, unfortunately, the quantum randomness is inevitably mixed with classical randomness due to classical noises. To distill this quantum randomness, one needs to quantify the randomness of the source and apply a randomness extractor. Here, we propose a generic framework for evaluating quantum randomness of real-life QRNGs by min-entropy, and apply it to two different existing quantum random-number systems in the literature. Moreover, we provide a guideline of QRNG data postprocessing for which we implement two information-theoretically provable randomness extractors: Toeplitz-hashing extractor and Trevisan's extractor.

Submitted 21 June, 2013; v1 submitted 5 July, 2012; originally announced July 2012.

Comments: 13 pages, 2 figures

Journal ref: Phys. Rev. A 87, 062327 (2013)

arXiv:quant-ph/0601115 [pdf, ps, other]

doi 10.1103/PhysRevA.75.032314

Phase-Remapping Attack in Practical Quantum Key Distribution Systems

Authors: Chi-Hang Fred Fung, Bing Qi, Kiyoshi Tamaki, Hoi-Kwong Lo

Abstract: Quantum key distribution (QKD) can be used to generate secret keys between two distant parties. Even though QKD has been proven unconditionally secure against eavesdroppers with unlimited computation power, practical implementations of QKD may contain loopholes that may lead to the generated secret keys being compromised. In this paper, we propose a phase-remapping attack targeting two practical bidirectional QKD systems (the "plug & play" system and the Sagnac system). We showed that if the users of the systems are unaware of our attack, the final key shared between them can be compromised in some situations. Specifically, we showed that, in the case of the Bennett-Brassard 1984 (BB84) protocol with ideal single-photon sources, when the quantum bit error rate (QBER) is between 14.6% and 20%, our attack renders the final key insecure, whereas the same range of QBER values has been proved secure if the two users are unaware of our attack; also, we demonstrated three situations with realistic devices where positive key rates are obtained without the consideration of Trojan horse attacks but in fact no key can be distilled. We remark that our attack is feasible with only current technology. Therefore, it is very important to be aware of our attack in order to ensure absolute security. In finding our attack, we minimize the QBER over individual measurements described by a general POVM, which has some similarity with the standard quantum state discrimination problem.

Submitted 5 March, 2007; v1 submitted 17 January, 2006; originally announced January 2006.

Comments: 13 pages, 8 figures

Journal ref: Phys. Rev. A 75, 032314 (2007)

arXiv:cs/0508094 [pdf, ps, other]

doi 10.1109/ISIT.2005.1523616

Conference Key Agreement and Quantum Sharing of Classical Secrets with Noisy GHZ States

Authors: Kai Chen, Hoi-Kwong Lo

Abstract: We propose a wide class of distillation schemes for multi-partite entangled states that are CSS-states. Our proposal provides not only superior efficiency, but also new insights on the connection between CSS-states and bipartite graph states. We then consider the applications of our distillation schemes for two cryptographic tasks, namely, (a) conference key agreement and (b) quantum sharing of classical secrets. In particular, we construct "prepare-and-measure" protocols. Also we study the yield of those protocols and the threshold value of the fidelity above which the protocols can function securely. Surprisingly, our protocols will function securely even when the initial state does not violate the standard Bell-inequalities for GHZ states. Experimental realization involving only bi-partite entanglement is also suggested.

Submitted 22 August, 2005; originally announced August 2005.

Comments: 5 pages, to appear in Proc. 2005 IEEE International Symposium on Information Theory (ISIT 2005, Adelaide, Australia)

Report number: CQIQC-ISIT 2005-CL1

Journal ref: Information Theory, 2005. ISIT 2005. Proceedings. International Symposium on 4-9 Sept. 2005 Page(s):1607 - 1611

arXiv:quant-ph/9904091 [pdf, ps, other]

doi 10.1088/0305-4470/34/35/321

A simple proof of the unconditional security of quantum key distribution

Authors: Hoi-Kwong Lo

Abstract: Quantum key distribution is the most well-known application of quantum cryptography. Previously proposed proofs of security of quantum key distribution contain various technical subtleties. Here, a conceptually simpler proof of security of quantum key distribution is presented. The new insight is the invariance of the error rate of a teleportation channel: We show that the error rate of a teleportation channel is independent of the signals being transmitted. This is because the non-trivial error patterns are permuted under teleportation. This new insight is combined with the recently proposed quantum to classical reduction theorem. Our result shows that assuming that Alice and Bob have fault-tolerant quantum computers, quantum key distribution can be made unconditionally secure over arbitrarily long distances even against the most general type of eavesdropping attacks and in the presence of all types of noises.

Submitted 27 April, 1999; originally announced April 1999.

Comments: 13 pages, extended abstract. Comments will be appreciated

Journal ref: J.Phys.A34:6957-6968,2001

arXiv:quant-ph/9611031 [pdf, ps, other]

doi 10.1103/PhysRevA.56.1154

Insecurity of Quantum Secure Computations

Authors: Hoi-Kwong Lo

Abstract: It had been widely claimed that quantum mechanics can protect private information during public decision in, for example, the so-called two-party secure computation. If this were the case, quantum smart-cards could prevent fake teller machines from learning the PIN (Personal Identification Number) from the customers' input. Although such optimism has been challenged by the recent surprising discovery of the insecurity of the so-called quantum bit commitment, the security of quantum two-party computation itself remains unaddressed. Here I answer this question directly by showing that all "one-sided" two-party computations (which allow only one of the two parties to learn the result) are necessarily insecure. As corollaries to my results, quantum one-way oblivious password identification and the so-called quantum one-out-of-two oblivious transfer are impossible. I also construct a class of functions that cannot be computed securely in any "two-sided" two-party computation. Nevertheless, quantum cryptography remains useful in key distribution and can still provide partial security in "quantum money" proposed by Wiesner.

Submitted 28 April, 1997; v1 submitted 19 November, 1996; originally announced November 1996.

Comments: The discussion on the insecurity of even non-ideal protocols has been greatly extended. Other technical points are also clarified. Version accepted for publication in Phys. Rev. A

arXiv:quant-ph/9603004 [pdf, ps, other]

doi 10.1103/PhysRevLett.78.3410

Is Quantum Bit Commitment Really Possible?

Authors: Hoi-Kwong Lo, H. F. Chau

Abstract: We show that all proposed quantum bit commitment schemes are insecure because the sender, Alice, can almost always cheat successfully by using an Einstein-Podolsky-Rosen type of attack and delaying her measurement until she opens her commitment.

Submitted 2 April, 1997; v1 submitted 4 March, 1996; originally announced March 1996.

Comments: Major revisions to include a more extensive introduction and an example of bit commitment. Overlap with independent work by Mayers acknowledged. More recent works by Mayers, by Lo and Chau and by Lo are also noted. Accepted for publication in Phys. Rev. Lett

Journal ref: Phys. Rev. Lett. 78, 3410 (1997)

Showing 1–25 of 25 results for author: Lo, H