Showing 1–50 of 57 results for author: Ross, J

  1. arXiv:2406.05882  [pdf, other]

    cs.LG stat.ML

    Distributional Preference Alignment of LLMs via Optimal Transport

    Authors: Igor Melnyk, Youssef Mroueh, Brian Belgodere, Mattia Rigotti, Apoorva Nitsure, Mikhail Yurochkin, Kristjan Greenewald, Jiri Navratil, Jerret Ross

    Abstract: Current LLM alignment techniques use pairwise human preferences at a sample level, and as such, they do not imply an alignment on the distributional level. We propose in this paper Alignment via Optimal Transport (AOT), a novel method for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples stochastically…

    Submitted 9 June, 2024; originally announced June 2024.

  2. arXiv:2405.04912  [pdf, other]

    q-bio.BM cs.LG physics.chem-ph

    GP-MoLFormer: A Foundation Model For Molecular Generation

    Authors: Jerret Ross, Brian Belgodere, Samuel C. Hoffman, Vijil Chenthamarakshan, Youssef Mroueh, Payel Das

    Abstract: Transformer-based models trained on large and general purpose datasets consisting of molecular strings have recently emerged as a powerful tool for successfully modeling various structure-property relations. Inspired by this success, we extend the paradigm of training chemical language transformers on large-scale chemical datasets to generative tasks in this work. Specifically, we propose GP-MoLFo…

    Submitted 4 April, 2024; originally announced May 2024.

  3. arXiv:2403.14797  [pdf, other]

    cs.CV cs.LG

    Preventing Catastrophic Forgetting through Memory Networks in Continuous Detection

    Authors: Gaurav Bhatt, James Ross, Leonid Sigal

    Abstract: Modern pre-trained architectures struggle to retain previous information while undergoing continuous fine-tuning on new tasks. Despite notable progress in continual classification, systems designed for complex vision tasks such as detection or segmentation still struggle to attain satisfactory performance. In this work, we introduce a memory-based detection transformer architecture to adapt a pre-…

    Submitted 15 July, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Journal ref: European Conference on Computer Vision, 2024

  4. arXiv:2310.07132  [pdf, other]

    cs.LG math.ST q-fin.RM stat.ML

    Risk Aware Benchmarking of Large Language Models

    Authors: Apoorva Nitsure, Youssef Mroueh, Mattia Rigotti, Kristjan Greenewald, Brian Belgodere, Mikhail Yurochkin, Jiri Navratil, Igor Melnyk, Jerret Ross

    Abstract: We propose a distributional framework for benchmarking socio-technical risks of foundation models with quantified statistical significance. Our approach hinges on a new statistical relative testing based on first and second order stochastic dominance of real random variables. We show that the second order statistics in this test are linked to mean-risk models commonly used in econometrics and math…

    Submitted 9 June, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: ICML 2024

  5. arXiv:2307.01753  [pdf, other]

    astro-ph.CO cs.LG physics.comp-ph physics.data-an

    Local primordial non-Gaussianity from the large-scale clustering of photometric DESI luminous red galaxies

    Authors: Mehdi Rezaie, Ashley J. Ross, Hee-Jong Seo, Hui Kong, Anna Porredon, Lado Samushia, Edmond Chaussidon, Alex Krolewski, Arnaud de Mattia, Florian Beutler, Jessica Nicole Aguilar, Steven Ahlen, Shadab Alam, Santiago Avila, Benedict Bahr-Kalus, Jose Bermejo-Climent, David Brooks, Todd Claybaugh, Shaun Cole, Kyle Dawson, Axel de la Macorra, Peter Doel, Andreu Font-Ribera, Jaime E. Forero-Romero, Satya Gontcho A Gontcho, et al. (24 additional authors not shown)

    Abstract: We use angular clustering of luminous red galaxies from the Dark Energy Spectroscopic Instrument (DESI) imaging surveys to constrain the local primordial non-Gaussianity parameter $f_{\rm NL}$. Our sample comprises over 12 million targets, covering 14,000 square degrees of the sky, with redshifts in the range $0.2 < z < 1.35$. We identify Galactic extinction, survey depth, and astronomical seeing as the…

    Submitted 25 June, 2024; v1 submitted 4 July, 2023; originally announced July 2023.

    Comments: 21 pages, 17 figures, 7 tables (Appendix excluded). Published in MNRAS

  6. arXiv:2305.11305  [pdf, other]

    quant-ph cs.ET

    Improved Synthesis of Toffoli-Hadamard Circuits

    Authors: Matthew Amy, Andrew N. Glaudell, Sarah Meng Li, Neil J. Ross

    Abstract: The matrices that can be exactly represented by a circuit over the Toffoli-Hadamard gate set are the orthogonal matrices of the form $M/ \sqrt{2}{}^k$, where $M$ is an integer matrix and $k$ is a nonnegative integer. The exact synthesis problem for this gate set is the problem of constructing a circuit for a given such matrix. Existing methods produce circuits consisting of $O(2^n \log(n)k)$ gates…

    Submitted 18 May, 2023; originally announced May 2023.

  7. arXiv:2304.10819  [pdf, other]

    cs.LG cs.AI stat.ML

    Auditing and Generating Synthetic Data with Controllable Trust Trade-offs

    Authors: Brian Belgodere, Pierre Dognin, Adam Ivankay, Igor Melnyk, Youssef Mroueh, Aleksandra Mojsilovic, Jiri Navratil, Apoorva Nitsure, Inkit Padhi, Mattia Rigotti, Jerret Ross, Yair Schiff, Radhika Vedpathak, Richard A. Young

    Abstract: Real-world data often exhibits bias, imbalance, and privacy risks. Synthetic datasets have emerged to address these issues. This paradigm relies on generative AI models to generate unbiased, privacy-preserving data while maintaining fidelity to the original data. However, assessing the trustworthiness of synthetic datasets and models is a critical challenge. We introduce a holistic auditing framew…

    Submitted 9 June, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: submitted

  8. arXiv:2212.08072  [pdf]

    cs.CL cs.AI cs.LG

    Foresight -- Generative Pretrained Transformer (GPT) for Modelling of Patient Timelines using EHRs

    Authors: Zeljko Kraljevic, Dan Bean, Anthony Shek, Rebecca Bendayan, Harry Hemingway, Joshua Au Yeung, Alexander Deng, Alfie Baston, Jack Ross, Esther Idowu, James T Teo, Richard J Dobson

    Abstract: Background: Electronic Health Records hold detailed longitudinal information about each patient's health status and general clinical history, a large portion of which is stored within the unstructured text. Existing approaches focus mostly on structured data and a subset of single-domain outcomes. We explore how temporal modelling of patients from free text and structured data, using deep generati…

    Submitted 24 January, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

  9. arXiv:2210.11729  [pdf, other]

    cs.CL

    An Exploration of Data Efficiency in Intra-Dataset Task Transfer for Dialog Understanding

    Authors: Josiah Ross, Luke Yoffe, Alon Albalak, William Yang Wang

    Abstract: Transfer learning is an exciting area of Natural Language Processing that has the potential to both improve model performance and increase data efficiency. This study explores the effects of varying quantities of target task training data on sequential transfer learning in the dialog domain. We hypothesize that a model can utilize the information learned from a source task to better learn a target…

    Submitted 21 October, 2022; originally announced October 2022.

  10. arXiv:2208.06665  [pdf, other]

    cs.LG

    Cloud-Based Real-Time Molecular Screening Platform with MolFormer

    Authors: Brian Belgodere, Vijil Chenthamarakshan, Payel Das, Pierre Dognin, Toby Kurien, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young

    Abstract: With the prospect of automating a number of chemical tasks with high fidelity, chemical language processing models are emerging at a rapid speed. Here, we present a cloud-based real-time platform that allows users to virtually screen molecules of interest. For this purpose, molecular embeddings inferred from a recently proposed large chemical language model, named MolFormer, are leveraged. The pla…

    Submitted 13 August, 2022; originally announced August 2022.

    Comments: Paper accepted at ECML PKDD 2022 demo track

  11. arXiv:2205.06068  [pdf, ps, other]

    math.CT cs.LO

    On the Lambek embedding and the category of product-preserving presheaves

    Authors: Peng Fu, Kohei Kishida, Neil J. Ross, Peter Selinger

    Abstract: It is well-known that the category of presheaf functors is complete and cocomplete, and that the Yoneda embedding into the presheaf category preserves products. However, the Yoneda embedding does not preserve coproducts. It is perhaps less well-known that if we restrict the codomain of the Yoneda embedding to the full subcategory of limit-preserving functors, then this embedding preserves colimits…

    Submitted 12 May, 2022; originally announced May 2022.

  12. arXiv:2204.13041  [pdf, ps, other]

    cs.PL math.CT quant-ph

    Proto-Quipper with dynamic lifting

    Authors: Peng Fu, Kohei Kishida, Neil J. Ross, Peter Selinger

    Abstract: Quipper is a functional programming language for quantum computing. Proto-Quipper is a family of languages aiming to provide a formal foundation for Quipper. In this paper, we extend Proto-Quipper-M with a construct called dynamic lifting, which is present in Quipper. By virtue of being a circuit description language, Proto-Quipper has two separate runtimes: circuit generation time and circuit exe…

    Submitted 8 November, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

  13. arXiv:2204.13039  [pdf, ps, other]

    cs.PL math.CT quant-ph

    A Biset-Enriched Categorical Model for Proto-Quipper with Dynamic Lifting

    Authors: Peng Fu, Kohei Kishida, Neil J. Ross, Peter Selinger

    Abstract: Quipper and Proto-Quipper are a family of quantum programming languages that, by their nature as circuit description languages, involve two runtimes: one at which the program generates a circuit and one at which the circuit is executed, normally with probabilistic results due to measurements. Accordingly, the language distinguishes two kinds of data: parameters, which are known at circuit generati…

    Submitted 15 November, 2023; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: In Proceedings QPL 2022, arXiv:2311.08375

    Journal ref: EPTCS 394, 2023, pp. 302-342

  14. arXiv:2204.00205  [pdf, other]

    cs.LG cond-mat.mtrl-sci q-bio.TO

    A Physics-Guided Neural Operator Learning Approach to Model Biological Tissues from Digital Image Correlation Measurements

    Authors: Huaiqian You, Quinn Zhang, Colton J. Ross, Chung-Hao Lee, Ming-Chen Hsu, Yue Yu

    Abstract: We present a data-driven workflow for biological tissue modeling, which aims to predict the displacement field based on digital image correlation (DIC) measurements under unseen loading scenarios, without postulating a specific constitutive model form or possessing knowledge of the material microstructure. To this end, a material database is constructed from the DIC displacement tracking measurem…

    Submitted 1 April, 2022; originally announced April 2022.

  15. arXiv:2203.08205  [pdf, other]

    cs.LG cond-mat.mtrl-sci

    Learning Deep Implicit Fourier Neural Operators (IFNOs) with Applications to Heterogeneous Material Modeling

    Authors: Huaiqian You, Quinn Zhang, Colton J. Ross, Chung-Hao Lee, Yue Yu

    Abstract: Constitutive modeling based on continuum mechanics theory has been a classical approach for modeling the mechanical responses of materials. However, when constitutive laws are unknown or when defects and/or high degrees of heterogeneity are present, these classical models may become inaccurate. In this work, we propose to use data-driven modeling, which directly utilizes high-fidelity simulation a…

    Submitted 15 March, 2022; originally announced March 2022.

  16. arXiv:2109.05655  [pdf, other]

    quant-ph cs.ET cs.LO

    Generators and Relations for Real Stabilizer Operators

    Authors: Justin Makary, Neil J. Ross, Peter Selinger

    Abstract: Real stabilizer operators, which are also known as real Clifford operators, are generated, through composition and tensor product, by the Hadamard gate, the Pauli Z gate, and the controlled-Z gate. We introduce a normal form for real stabilizer circuits and show that every real stabilizer operator admits a unique normal form. Moreover, we give a finite set of relations that suffice to rewrite any…

    Submitted 12 September, 2021; originally announced September 2021.

    Comments: In Proceedings QPL 2021, arXiv:2109.04886

    Journal ref: EPTCS 343, 2021, pp. 14-36

  17. arXiv:2106.13724  [pdf, other]

    astro-ph.CO cs.LG physics.comp-ph physics.data-an

    Primordial non-Gaussianity from the Completed SDSS-IV extended Baryon Oscillation Spectroscopic Survey I: Catalogue Preparation and Systematic Mitigation

    Authors: Mehdi Rezaie, Ashley J. Ross, Hee-Jong Seo, Eva-Maria Mueller, Will J. Percival, Grant Merz, Reza Katebi, Razvan C. Bunescu, Julian Bautista, Joel R. Brownstein, Etienne Burtin, Kyle Dawson, Héctor Gil-Marín, Jiamin Hou, Eleanor B. Lyke, Axel de la Macorra, Graziano Rossi, Donald P. Schneider, Pauline Zarrouk, Gong-Bo Zhao

    Abstract: We investigate the large-scale clustering of the final spectroscopic sample of quasars from the recently completed extended Baryon Oscillation Spectroscopic Survey (eBOSS). The sample contains $343708$ objects in the redshift range $0.8<z<2.2$ and $72667$ objects with redshifts $2.2<z<3.5$, covering an effective area of $4699~{\rm deg}^{2}$. We develop a neural network-based approach to mitigate s…

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: 17 pages, 13 figures, 2 tables. Accepted for publication in MNRAS. For the associated code and value-added catalogs see https://github.com/mehdirezaie/sysnetdev and https://github.com/mehdirezaie/eBOSSDR16QSOE

  18. arXiv:2106.09553  [pdf, other]

    cs.LG cs.CL q-bio.BM

    Large-Scale Chemical Language Representations Capture Molecular Structure and Properties

    Authors: Jerret Ross, Brian Belgodere, Vijil Chenthamarakshan, Inkit Padhi, Youssef Mroueh, Payel Das

    Abstract: Models based on machine learning can enable accurate and fast molecular property predictions, which is of interest in drug discovery and material design. Various supervised machine learning models have demonstrated promising performance, but the vast chemical space and the limited availability of property labels make supervised learning challenging. Recently, unsupervised transformer-based languag…

    Submitted 14 December, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: NMI 2022

  19. arXiv:2106.01175  [pdf, ps, other]

    quant-ph cs.ET cs.LO

    Generators and Relations for the Group O_n(Z[1/2])

    Authors: Sarah Meng Li, Neil J. Ross, Peter Selinger

    Abstract: We give a finite presentation by generators and relations for the group O_n(Z[1/2]) of n-dimensional orthogonal matrices with entries in Z[1/2]. We then obtain a similar presentation for the group of n-dimensional orthogonal matrices of the form M/sqrt(2)^k, where k is a nonnegative integer and M is an integer matrix. Both groups arise in the study of quantum circuits. In particular, when the dime…

    Submitted 12 September, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: In Proceedings QPL 2021, arXiv:2109.04886

    Journal ref: EPTCS 343, 2021, pp. 210-264

  20. arXiv:2102.09299  [pdf, other]

    cs.DS stat.CO

    Theory meets Practice at the Median: a worst case comparison of relative error quantile algorithms

    Authors: Graham Cormode, Abhinav Mishra, Joseph Ross, Pavel Veselý

    Abstract: Estimating the distribution and quantiles of data is a foundational task in data mining and data science. We study algorithms which provide accurate results for extreme quantile queries using a small amount of space, thus helping to understand the tails of the input distribution. Namely, we focus on two recent state-of-the-art solutions: $t$-digest and ReqSketch. While $t$-digest is a popular comp…

    Submitted 10 June, 2021; v1 submitted 18 February, 2021; originally announced February 2021.

    Comments: Updated experiments, improved presentation. To appear in KDD 2021

    ACM Class: F.2.2

  21. arXiv:2012.11696  [pdf, other]

    cs.CV cs.LG

    Image Captioning as an Assistive Technology: Lessons Learned from VizWiz 2020 Challenge

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff, Richard A. Young, Brian Belgodere

    Abstract: Image captioning has recently demonstrated impressive progress largely owing to the introduction of neural network algorithms trained on curated datasets like MS-COCO. Often work in this field is motivated by the promise of deployment of captioning systems in practical applications. However, the scarcity of data and contexts in many competition datasets renders the utility of systems trained on the…

    Submitted 18 June, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: In submission to JAIR. Copyright may be transferred without notice, after which this version may no longer be accessible

  22. arXiv:2012.11691  [pdf, other]

    cs.CV cs.LG

    Alleviating Noisy Data in Image Captioning with Cooperative Distillation

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Inkit Padhi, Mattia Rigotti, Jarret Ross, Yair Schiff

    Abstract: Image captioning systems have made substantial progress, largely due to the availability of curated datasets like Microsoft COCO or Vizwiz that have accurate descriptions of their corresponding images. Unfortunately, scarce availability of such cleanly labeled data results in trained algorithms producing captions that can be terse and idiosyncratically specific to details in the image. We propose…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: CVPR 2020 VizWiz Challenge

  23. arXiv:2011.14901  [pdf, other]

    cs.CL cs.CV cs.LG cs.NE

    Language-Driven Region Pointer Advancement for Controllable Image Captioning

    Authors: Annika Lindh, Robert J. Ross, John D. Kelleher

    Abstract: Controllable Image Captioning is a recent sub-field in the multi-modal task of Image Captioning wherein constraints are placed on which regions in an image should be described in the generated natural language caption. This puts a stronger focus on producing more detailed descriptions, and opens the door for more end-user control over results. A vital component of the Controllable Image Captioning…

    Submitted 30 November, 2020; originally announced November 2020.

    Comments: Accepted to COLING 2020

    MSC Class: 68T07; 68T45; 68T50 ACM Class: I.2.7; I.2.10; I.5.1

  24. arXiv:2011.01843  [pdf, other]

    cs.LG cs.AI

    Tabular Transformers for Modeling Multivariate Time Series

    Authors: Inkit Padhi, Yair Schiff, Igor Melnyk, Mattia Rigotti, Youssef Mroueh, Pierre Dognin, Jerret Ross, Ravi Nair, Erik Altman

    Abstract: Tabular datasets are ubiquitous in data science applications. Given their importance, it seems natural to apply state-of-the-art deep learning algorithms in order to fully unlock their potential. Here we propose neural network models that represent tabular time series that can optionally leverage their hierarchical structure. This results in two architectures for tabular time series: one for learn…

    Submitted 11 February, 2021; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: Accepted to ICASSP, 2021; https://github.com/IBM/TabFormer

  25. arXiv:2009.05560  [pdf, other]

    cs.CY cs.CL cs.SI

    Narratives and Needs: Analyzing Experiences of Cyclone Amphan Using Twitter Discourse

    Authors: Ancil Crayton, João Fonseca, Kanav Mehra, Michelle Ng, Jared Ross, Marcelo Sandoval-Castañeda, Rachel von Gnechten

    Abstract: People often turn to social media to comment upon and share information about major global events. Accordingly, social media is receiving increasing attention as a rich data source for understanding people's social, political and economic experiences of extreme weather events. In this paper, we contribute two novel methodologies that leverage Twitter discourse to characterize narratives and identi…

    Submitted 11 September, 2020; originally announced September 2020.

    Comments: 6 pages, 4 figures, 1 table

  26. arXiv:2006.12021  [pdf, ps, other]

    cs.DM

    Sampling hypergraphs with given degrees

    Authors: Martin Dyer, Catherine Greenhill, Pieter Kleer, James Ross, Leen Stougie

    Abstract: There is a well-known connection between hypergraphs and bipartite graphs, obtained by treating the incidence matrix of the hypergraph as the biadjacency matrix of a bipartite graph. We use this connection to describe and analyse a rejection sampling algorithm for sampling simple uniform hypergraphs with a given degree sequence. Our algorithm uses, as a black box, an algorithm $\mathcal{A}$ for sa…

    Submitted 13 July, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: 21 pages. This version addresses referees' comments

  27. arXiv:2006.11166  [pdf, other]

    stat.ML cs.LG

    Fast Mixing of Multi-Scale Langevin Dynamics under the Manifold Hypothesis

    Authors: Adam Block, Youssef Mroueh, Alexander Rakhlin, Jerret Ross

    Abstract: Recently, the task of image generation has attracted much attention. In particular, the recent empirical successes of the Markov Chain Monte Carlo (MCMC) technique of Langevin Dynamics have prompted a number of theoretical advances; despite this, several outstanding problems remain. First, the Langevin Dynamics is run in very high dimension on a nonconvex landscape; in the worst case, due to the N…

    Submitted 22 June, 2020; v1 submitted 19 June, 2020; originally announced June 2020.

  28. arXiv:2005.09599  [pdf, other]

    cs.DS

    Asymmetric scale functions for $t$-digests

    Authors: Joseph Ross

    Abstract: The $t$-digest is a data structure that can be queried for approximate quantiles, with greater accuracy near the minimum and maximum of the distribution. We develop a $t$-digest variant with accuracy asymmetric about the median, thereby making possible alternative tradeoffs between computational resources and accuracy which may be of particular interest for distributions with significant skew. Aft…

    Submitted 19 May, 2020; originally announced May 2020.

    Comments: 18 pages, 8 figures; submitted

  29. A tutorial introduction to quantum circuit programming in dependently typed Proto-Quipper

    Authors: Peng Fu, Kohei Kishida, Neil J. Ross, Peter Selinger

    Abstract: We introduce dependently typed Proto-Quipper, or Proto-Quipper-D for short, an experimental quantum circuit programming language with linear dependent types. We give several examples to illustrate how linear dependent types can help in the construction of correct quantum circuits. Specifically, we show how dependent types enable programming families of circuits, and how dependent types solve the p…

    Submitted 12 December, 2020; v1 submitted 17 May, 2020; originally announced May 2020.

    Comments: Added a section on related work and a paragraph explaining qubit initialization and termination

    Journal ref: LNCS 12227:153-168 (2020)

  30. arXiv:2002.05702  [pdf, other]

    eess.IV cs.CV cs.LG physics.med-ph stat.ML

    Generative-based Airway and Vessel Morphology Quantification on Chest CT Images

    Authors: Pietro Nardelli, James C. Ross, Raúl San José Estépar

    Abstract: Accurately and precisely characterizing the morphology of small pulmonary structures from Computed Tomography (CT) images, such as airways and vessels, is becoming of great importance for diagnosis of pulmonary diseases. The smaller conducting airways are the major site of increased airflow resistance in chronic obstructive pulmonary disease (COPD), while accurately sizing vessels can help identif…

    Submitted 13 March, 2020; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: 19 pages, 13 figures

    MSC Class: 68T20 ACM Class: I.2.1; I.4.7; J.2

  31. arXiv:1912.11940  [pdf, other]

    math.OC cs.LG

    Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets

    Authors: Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang

    Abstract: Adaptive gradient algorithms perform gradient-based updates using the history of gradients and are ubiquitous in training deep neural networks. While adaptive gradient methods theory is well understood for minimization problems, the underlying factors driving their empirical success in min-max problems such as GANs remain unclear. In this paper, we aim at bridging this gap from both theoretical an…

    Submitted 24 December, 2020; v1 submitted 26 December, 2019; originally announced December 2019.

    Comments: Accepted by ICLR 2020

  32. arXiv:1910.12999  [pdf, other]

    math.OC cs.LG

    A Decentralized Parallel Algorithm for Training Generative Adversarial Nets

    Authors: Mingrui Liu, Wei Zhang, Youssef Mroueh, Xiaodong Cui, Jerret Ross, Tianbao Yang, Payel Das

    Abstract: Generative Adversarial Networks (GANs) are a powerful class of generative models in the deep learning community. Current practice on large-scale GAN training utilizes large models and distributed large-batch training strategies, and is implemented on deep learning frameworks (e.g., TensorFlow, PyTorch, etc.) designed in a centralized manner. In the centralized network topology, every worker needs…

    Submitted 19 October, 2020; v1 submitted 28 October, 2019; originally announced October 2019.

    Comments: Accepted by NeurIPS 2020

  33. arXiv:1902.04999  [pdf, other]

    cs.LG stat.ML

    Wasserstein Barycenter Model Ensembling

    Authors: Pierre Dognin, Igor Melnyk, Youssef Mroueh, Jerret Ross, Cicero Dos Santos, Tom Sercu

    Abstract: In this paper we propose to perform model ensembling in a multiclass or a multilabel learning setting using Wasserstein (W.) barycenters. Optimal transport metrics, such as the Wasserstein distance, allow incorporating semantic side information such as word embeddings. Using W. barycenters to find the consensus between models allows us to balance confidence and semantics in finding the agreement b…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: ICLR 2019

  34. Generating Diverse and Meaningful Captions

    Authors: Annika Lindh, Robert J. Ross, Abhijit Mahalunkar, Giancarlo Salton, John D. Kelleher

    Abstract: Image Captioning is a task that requires models to acquire a multi-modal understanding of the world and to express this understanding in natural language text. While the state-of-the-art for this task has rapidly improved in terms of n-gram metrics, these models tend to output the same generic captions for similar images. In this work, we address this limitation and train a model that generates mo…

    Submitted 19 December, 2018; originally announced December 2018.

    Comments: Accepted for presentation at The 27th International Conference on Artificial Neural Networks (ICANN 2018)

    Journal ref: Artificial Neural Networks and Machine Learning - ICANN 2018 (pp. 176-187). Springer International Publishing

  35. arXiv:1810.06695  [pdf, other]

    cs.CL cs.LG stat.ML

    Exploring the Use of Attention within an Neural Machine Translation Decoder States to Translate Idioms

    Authors: Giancarlo D. Salton, Robert J. Ross, John D. Kelleher

    Abstract: Idioms pose problems to almost all Machine Translation systems. This type of language is very frequent in day-to-day language use and cannot be simply ignored. The recent interest in memory augmented models in the field of Language Modelling has aided the systems to achieve good results by bridging long-distance dependencies. In this paper we explore the use of such techniques into a Neural Machin…

    Submitted 10 October, 2018; originally announced October 2018.

  36. arXiv:1805.00063  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    Adversarial Semantic Alignment for Improved Image Captions

    Authors: Pierre L. Dognin, Igor Melnyk, Youssef Mroueh, Jarret Ross, Tom Sercu

    Abstract: In this paper we study image captioning as a conditional GAN training, proposing both a context-aware LSTM captioner and co-attentive discriminator, which enforces semantic alignment between images and captions. We empirically focus on the viability of two training methods: Self-critical Sequence Training (SCST) and Gumbel Straight-Through (ST) and demonstrate that SCST shows more stable gradient…

    Submitted 6 June, 2019; v1 submitted 30 April, 2018; originally announced May 2018.

    Comments: Authors Equal Contribution, CVPR 2019

  37. Automated optimization of large quantum circuits with continuous parameters

    Authors: Yunseong Nam, Neil J. Ross, Yuan Su, Andrew M. Childs, Dmitri Maslov

    Abstract: We develop and implement automated methods for optimizing quantum circuits of the size and type expected in quantum computations that outperform classical computers. We show how to handle continuous gate parameters and report a collection of fast algorithms capable of optimizing large-scale quantum circuits. For the suite of benchmarks considered, we obtain substantial reductions in gate counts. I…

    Submitted 1 June, 2018; v1 submitted 19 October, 2017; originally announced October 2017.

    Comments: 21 pages

    Journal ref: npj:Quantum Information 4, 23 (2018)

  38. arXiv:1704.08343  [pdf]

    cs.DC

    A Distributed Shared Memory Model and C++ Templated Meta-Programming Interface for the Epiphany RISC Array Processor

    Authors: David Richie, James Ross, Jamie Infantolino

    Abstract: The Adapteva Epiphany many-core architecture comprises a scalable 2D mesh Network-on-Chip (NoC) of low-power RISC cores with minimal uncore functionality. Whereas such a processor offers high computational energy efficiency and parallel scalability, developing effective programming models that address the unique architecture features has presented many challenges. We present here a distributed sha…

    Submitted 26 April, 2017; originally announced April 2017.

    Comments: 10 pages, 2 figures, ICCS/ALCHEMY Workshop 2017

  39. arXiv:1704.04760  [pdf]

    cs.AR cs.LG cs.NE

    In-Datacenter Performance Analysis of a Tensor Processing Unit

    Authors: Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson, Gaurav Agrawal, Raminder Bajwa, Sarah Bates, Suresh Bhatia, Nan Boden, Al Borchers, Rick Boyle, Pierre-luc Cantin, Clifford Chao, Chris Clark, Jeremy Coriell, Mike Daley, Matt Dau, Jeffrey Dean, Ben Gelb, Tara Vazir Ghaemmaghami, Rajendra Gottipati, William Gulland, Robert Hagmann, C. Richard Ho, Doug Hogberg, et al. (50 additional authors not shown)

    Abstract: Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOp…

    Submitted 16 April, 2017; originally announced April 2017.

    Comments: 17 pages, 11 figures, 8 tables. To appear at the 44th International Symposium on Computer Architecture (ISCA), Toronto, Canada, June 24-28, 2017

  40. arXiv:1703.10242  [pdf]

    cs.DC cs.PL

    I CAN HAS SUPERCOMPUTER? A Novel Approach to Teaching Parallel and Distributed Computing Concepts Using a Meme-Based Programming Language

    Authors: David Richie, James Ross

    Abstract: A novel approach is presented to teach the parallel and distributed computing concepts of synchronization and remote memory access. The single program multiple data (SPMD) partitioned global address space (PGAS) model presented in this paper uses a procedural programming language appealing to undergraduate students. We propose that the amusing nature of the approach may engender creativity and int…

    Submitted 29 March, 2017; originally announced March 2017.

    Comments: 7 pages, 2 figures, example code, accepted for publication at the 7th NSF/TCPP Workshop on Parallel and Distributed Computing Education (EduPar-17) workshop in conjunction with the 31st IEEE International Parallel & Distributed Processing Symposium (IPDPS 17)

  41. arXiv:1701.00140  [pdf, other]

    quant-ph cs.ET cs.LO

    A Finite Presentation of CNOT-Dihedral Operators

    Authors: Matthew Amy, Jianxin Chen, Neil J. Ross

    Abstract: We give a finite presentation by generators and relations of the unitary operators expressible over the {CNOT, T, X} gate set, also known as CNOT-dihedral operators. To this end, we introduce a notion of normal form for CNOT-dihedral circuits and prove that every CNOT-dihedral operator admits a unique normal form. Moreover, we show that in the presence of certain structural rules only finitely man…

    Submitted 28 April, 2019; v1 submitted 31 December, 2016; originally announced January 2017.

    Comments: In Proceedings QPL 2017, arXiv:1802.09737

    Journal ref: EPTCS 266, 2018, pp. 84-97

  42. arXiv:1612.00563  [pdf, other]

    cs.LG cs.AI cs.CV

    Self-critical Sequence Training for Image Captioning

    Authors: Steven J. Rennie, Etienne Marcheret, Youssef Mroueh, Jarret Ross, Vaibhava Goel

    Abstract: Recently it has been shown that policy-gradient methods for reinforcement learning can be utilized to train deep end-to-end systems directly on non-differentiable metrics for the task at hand. In this paper we consider the problem of optimizing image captioning systems using reinforcement learning, and show that by carefully optimizing our systems using the test metrics of the MSCOCO task, signifi…

    Submitted 15 November, 2017; v1 submitted 1 December, 2016; originally announced December 2016.

    Comments: CVPR 2017 + additional analysis + fixed baseline results, 16 pages

  43. arXiv:1609.08283  [pdf, other]

    cs.SI physics.soc-ph q-bio.PE

    A data-driven model for influenza transmission incorporating media effects

    Authors: Lewis Mitchell, Joshua V. Ross

    Abstract: Numerous studies have attempted to model the effect of mass media on the transmission of diseases such as influenza, however quantitative data on media engagement has until recently been difficult to obtain. With the recent explosion of "big data" coming from online social media and the like, large volumes of data on a population's engagement with mass media during an epidemic are becoming availab…

    Submitted 27 September, 2016; originally announced September 2016.

    Comments: To appear in Royal Society Open Science

  44. OpenCL + OpenSHMEM Hybrid Programming Model for the Adapteva Epiphany Architecture

    Authors: David Richie, James Ross

    Abstract: There is interest in exploring hybrid OpenSHMEM + X programming models to extend the applicability of the OpenSHMEM interface to more hardware architectures. We present a hybrid OpenCL + OpenSHMEM programming model for device-level programming for architectures like the Adapteva Epiphany many-core RISC array processor. The Epiphany architecture comprises a 2D array of low-power RISC cores with min…

    Submitted 11 August, 2016; originally announced August 2016.

    Comments: 12 pages, 5 figures, OpenSHMEM 2016: Third workshop on OpenSHMEM and Related Technologies

  45. An OpenSHMEM Implementation for the Adapteva Epiphany Coprocessor

    Authors: James Ross, David Richie

    Abstract: This paper reports the implementation and performance evaluation of the OpenSHMEM 1.3 specification for the Adapteva Epiphany architecture within the Parallella single-board computer. The Epiphany architecture exhibits massive many-core scalability with a physically compact 2D array of RISC CPU cores and a fast network-on-chip (NoC). While fully capable of MPMD execution, the physical topology and…

    Submitted 11 August, 2016; originally announced August 2016.

    Comments: 14 pages, 9 figures, OpenSHMEM 2016: Third workshop on OpenSHMEM and Related Technologies

  46. arXiv:1605.04693  [pdf]

    cs.CY

    Overcoming the language barrier in mobile user interface design: A case study on a mobile health app

    Authors: Jason Ross, Jing Gao

    Abstract: This research report proposes a structured solution to address the need for awareness of cultural and language in user design. It will include evaluated research on established methods that already exist. Discussed ideas about how to address this situation include: what others have found to take into consideration when using design principles to develop an interface, detailed troubles and critical…

    Submitted 16 May, 2016; originally announced May 2016.

    Comments: Research-in-progress ISBN# 978-0-646-95337-3 Presented at the Australasian Conference on Information Systems 2015 (arXiv:1605.01032)

    Report number: ACIS/2015/13

  47. Advances in Run-Time Performance and Interoperability for the Adapteva Epiphany Coprocessor

    Authors: David A. Richie, James A. Ross

    Abstract: The energy-efficient Adapteva Epiphany architecture exhibits massive many-core scalability in a physically compact 2D array of RISC cores with a fast network-on-chip (NoC). The architecture presents many features and constraints which contribute to software design challenges for the application developer. Addressing these challenges within the software stack that supports application development i…

    Submitted 14 April, 2016; originally announced April 2016.

    Comments: 11 pages, 3 figures, accepted to ICCS'16 ALCHEMY workshop

  48. arXiv:1604.04205  [pdf]

    cs.DC

    Implementing OpenSHMEM for the Adapteva Epiphany RISC Array Processor

    Authors: James A. Ross, David A. Richie

    Abstract: The energy-efficient Adapteva Epiphany architecture exhibits massive many-core scalability in a physically compact 2D array of RISC cores with a fast network-on-chip (NoC). With fully divergent cores capable of MIMD execution, the physical topology and memory-mapped capabilities of the core and network translate well to partitioned global address space (PGAS) parallel programming models. Following…

    Submitted 14 April, 2016; originally announced April 2016.

    Comments: 4 pages, 1 figure, accepted to ICCS'16 ALCHEMY workshop

  49. arXiv:1506.05442  [pdf]

    cs.DC

    Parallel Programming Model for the Epiphany Many-Core Coprocessor Using Threaded MPI

    Authors: James A. Ross, David A. Richie, Song J. Park, Dale R. Shires

    Abstract: The Adapteva Epiphany many-core architecture comprises a 2D tiled mesh Network-on-Chip (NoC) of low-power RISC cores with minimal uncore functionality. It offers high computational energy efficiency for both integer and floating point calculations as well as parallel scalability. Yet despite the interesting architectural features, a compelling programming model has not been presented to date. This…

    Submitted 17 June, 2015; originally announced June 2015.

    Comments: 7 pages, 6 figures, presented at ISCA'15, Third ACM International Workshop on Manycore Embedded Systems

    ACM Class: C.1.4; D.1.3

  50. arXiv:1505.01547  [pdf, other]

    physics.soc-ph cs.SI stat.AP

    Understanding the Heavy Tailed Dynamics in Human Behavior

    Authors: Gordon J Ross, Tim Jones

    Abstract: The recent availability of electronic datasets containing large volumes of communication data has made it possible to study human behavior on a larger scale than ever before. From this, it has been discovered that across a diverse range of data sets, the inter-event times between consecutive communication events obey heavy tailed power law dynamics. Explaining this has proved controversial, and tw…

    Submitted 6 May, 2015; originally announced May 2015.

    Comments: 9 pages in Physical Review E, 2015