Showing 1–50 of 77 results for author: White, C

  1. arXiv:2406.19314  [pdf, other]

    cs.CL cs.AI cs.LG

    LiveBench: A Challenging, Contamination-Free LLM Benchmark

    Authors: Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

    Abstract: Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource new prompts and evaluations from human or LLM judges; however, these can introduce significant biases, and break down when scoring hard questions. In…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2405.04515  [pdf, other]

    cs.CL

    A Transformer with Stack Attention

    Authors: Jiaoda Li, Jennifer C. White, Mrinmaya Sachan, Ryan Cotterell

    Abstract: Natural languages are believed to be (mildly) context-sensitive. Despite underpinning remarkably capable large language models, transformers are unable to model many context-free language tasks. In an attempt to address this limitation in the modeling power of transformer-based language models, we propose augmenting them with a differentiable, stack-based attention mechanism. Our stack-based atten…

    Submitted 13 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: NAACL 2024 Findings

  3. arXiv:2404.16436  [pdf]

    cs.SD cs.AI cs.LG eess.AS

    Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics

    Authors: Ben Williams, Bart van Merriënboer, Vincent Dumoulin, Jenny Hamer, Eleni Triantafillou, Abram B. Fleishman, Matthew McKown, Jill E. Munger, Aaron N. Rice, Ashlee Lillis, Clemency E. White, Catherine A. D. Hobbs, Tries B. Razak, Kate E. Jones, Tom Denton

    Abstract: Machine learning has the potential to revolutionize passive acoustic monitoring (PAM) for ecological assessments. However, high annotation and compute costs limit the field's efficacy. Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, limiting its current applicability primarily to bird taxa. Here, we identify the optimum pr…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures

  4. arXiv:2404.04633  [pdf, other]

    cs.CL

    Context versus Prior Knowledge in Language Models

    Authors: Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell

    Abstract: To answer a question, language models often need to integrate prior knowledge learned during pretraining and new information presented in context. We hypothesize that models perform this integration in a predictable way across different questions and contexts: models will rely more on prior knowledge for questions about entities (e.g., persons, places, etc.) that they are more familiar with due to…

    Submitted 16 June, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

    Comments: Long paper accepted at ACL 2024

  5. arXiv:2403.12553  [pdf, other]

    cs.LG

    Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs

    Authors: Md Ashiqur Rahman, Robert Joseph George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A. Yeh, Jean Kossaifi, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: Existing neural operator architectures face challenges when solving multiphysics problems with coupled partial differential equations (PDEs), due to complex geometries, interactions between physical variables, and the lack of large amounts of high-resolution training data. To address these issues, we propose Codomain Attention Neural Operator (CoDA-NO), which tokenizes functions along the codomain…

    Submitted 5 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  6. arXiv:2403.00607  [pdf, other]

    cs.GT

    Dynamic Operational Planning in Warfare: A Stochastic Game Approach to Military Campaigns

    Authors: Joseph E. McCarthy, Mathieu Dahan, Chelsea C. White III

    Abstract: We study a two-player discounted zero-sum stochastic game model for dynamic operational planning in military campaigns. At each stage, the players manage multiple commanders who order military actions on objectives that have an open line of control. When a battle over the control of an objective occurs, its stochastic outcome depends on the actions and the enabling support provided by the control…

    Submitted 1 March, 2024; originally announced March 2024.

  7. arXiv:2402.13228  [pdf, other]

    cs.CL cs.AI cs.LG

    Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive

    Authors: Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White

    Abstract: Direct Preference Optimisation (DPO) is effective at significantly improving the performance of large language models (LLMs) on downstream tasks such as reasoning, summarisation, and alignment. Using pairs of preferred and dispreferred data, DPO models the relative probability of picking one response over another. In this work, first we show theoretically that the standard DPO loss can lead to a r…

    Submitted 3 July, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  8. arXiv:2402.11137  [pdf, other]

    cs.LG

    TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks

    Authors: Benjamin Feuer, Robin Tibor Schirrmeister, Valeriia Cherepanova, Chinmay Hegde, Frank Hutter, Micah Goldblum, Niv Cohen, Colin White

    Abstract: While tabular classification has traditionally relied on from-scratch training, a recent breakthrough called prior-data fitted networks (PFNs) challenges this approach. Similar to large language models, PFNs make use of pretraining and in-context learning to achieve strong performance on new tasks in a single forward pass. However, current PFNs have limitations that prohibit their widespread adopt…

    Submitted 18 March, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  9. arXiv:2402.07600  [pdf, other]

    cs.NI

    Optical Routing with Binary Optimisation and Quantum Annealing

    Authors: Ethan Davies, Darren Banfield, Vlad Carare, Ben Weaver, Catherine White, Nigel Walker

    Abstract: A challenge for scalability of demand-responsive, elastic optical Dense Wavelength Division Multiplexing (DWDM) and Flexgrid networks is the computational complexity of allocating many optical routes on large networks. We demonstrate that demand satisfaction problems in communication networks can be formulated as quadratic unconstrained binary optimisation (QUBO) problems, and solved using a hybri…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 7 pages, 3 figures

    MSC Class: F.2.m

  10. arXiv:2311.16452  [pdf, other]

    cs.CL

    Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

    Authors: Harsha Nori, Yin Tat Lee, Sheng Zhang, Dean Carignan, Richard Edgar, Nicolo Fusi, Nicholas King, Jonathan Larson, Yuanzhi Li, Weishung Liu, Renqian Luo, Scott Mayer McKinney, Robert Osazuwa Ness, Hoifung Poon, Tao Qin, Naoto Usuyama, Chris White, Eric Horvitz

    Abstract: Generalist foundation models such as GPT-4 have displayed surprising capabilities in a wide variety of domains and tasks. Yet, there is a prevalent assumption that they cannot match specialist capabilities of fine-tuned models. For example, most explorations to date on medical competency benchmarks have leveraged domain-specific training, as exemplified by efforts on BioGPT and Med-PaLM. We build…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: 21 pages, 7 figures

    ACM Class: I.2.7

  11. arXiv:2311.01933  [pdf, other]

    cs.LG

    ForecastPFN: Synthetically-Trained Zero-Shot Forecasting

    Authors: Samuel Dooley, Gurnoor Singh Khurana, Chirag Mohapatra, Siddartha Naidu, Colin White

    Abstract: The vast majority of time-series forecasting approaches require a substantial training dataset. However, many real-life forecasting applications have very little initial observations, sometimes just 40 or fewer. Thus, the applicability of most forecasting methods is restricted in data-sparse commercial applications. While there is recent work in the setting of very limited initial data (so-called…

    Submitted 3 November, 2023; originally announced November 2023.

    Journal ref: Thirty-seventh Conference on Neural Information Processing Systems, 2023

  12. arXiv:2310.10628  [pdf, other]

    cs.CL

    Data Contamination Through the Lens of Time

    Authors: Manley Roberts, Himanshu Thakur, Christine Herlihy, Colin White, Samuel Dooley

    Abstract: Recent claims about the impressive abilities of large language models (LLMs) are often supported by evaluating publicly available benchmarks. Since LLMs train on wide swaths of the internet, this practice raises concerns of data contamination, i.e., evaluating on examples that are explicitly or implicitly included in the training data. Data contamination remains notoriously challenging to measure…

    Submitted 16 October, 2023; originally announced October 2023.

  13. arXiv:2307.15034  [pdf, other]

    cs.LG math.NA

    Guaranteed Approximation Bounds for Mixed-Precision Neural Operators

    Authors: Renbo Tu, Colin White, Jean Kossaifi, Boris Bonev, Nikola Kovachki, Gennady Pekhimenko, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: Neural operators, such as Fourier Neural Operators (FNO), form a principled approach for learning solution operators for PDEs and other mappings between function spaces. However, many real-world problems require high-resolution training data, and the training time and limited GPU memory pose big barriers. One solution is to train neural operators in mixed precision to reduce the memory requirement…

    Submitted 5 May, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: ICLR 2024

  14. arXiv:2305.18703  [pdf, other]

    cs.CL cs.AI

    Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey

    Authors: Chen Ling, Xujiang Zhao, Jiaying Lu, Chengyuan Deng, Can Zheng, Junxiang Wang, Tanmoy Chowdhury, Yun Li, Hejie Cui, Xuchao Zhang, Tianjiao Zhao, Amit Panalkar, Dhagash Mehta, Stefano Pasquali, Wei Cheng, Haoyu Wang, Yanchi Liu, Zhengzhang Chen, Haifeng Chen, Chris White, Quanquan Gu, Jian Pei, Carl Yang, Liang Zhao

    Abstract: Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of dom…

    Submitted 29 March, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

  15. arXiv:2305.02997  [pdf, other]

    cs.LG cs.AI stat.ML

    When Do Neural Nets Outperform Boosted Trees on Tabular Data?

    Authors: Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, Vishak Prasad C, Benjamin Feuer, Chinmay Hegde, Ganesh Ramakrishnan, Micah Goldblum, Colin White

    Abstract: Tabular data is one of the most commonly used types of data in machine learning. Despite recent advances in neural nets (NNs) for tabular data, there is still an active discussion on whether or not NNs generally outperform gradient-boosted decision trees (GBDTs) on tabular data, with several recent works arguing either that GBDTs consistently outperform NNs on tabular data, or vice versa. In this…

    Submitted 15 July, 2024; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: NeurIPS Datasets and Benchmarks Track 2023

  16. arXiv:2305.00086  [pdf, other]

    cs.MA eess.SY

    An Integrated System Dynamics and Discrete Event Supply Chain Simulation Framework for Supply Chain Resilience with Non-Stationary Pandemic Demand

    Authors: Mustafa Can Camur, Chin-Yuan Tseng, Aristotelis E. Thanos, Chelsea C. White, Walter Yund, Eleftherios Iakovou

    Abstract: COVID-19 resulted in some of the largest supply chain disruptions in recent history. To mitigate the impact of future disruptions, we propose an integrated hybrid simulation framework to couple nonstationary demand signals from an event like COVID-19 with a model of an end-to-end supply chain. We first create a system dynamics susceptible-infected-recovered (SIR) model, augmenting a classic epidem…

    Submitted 15 August, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

  17. arXiv:2301.08727  [pdf, other]

    cs.LG cs.AI stat.ML

    Neural Architecture Search: Insights from 1000 Papers

    Authors: Colin White, Mahmoud Safari, Rhea Sukthanker, Binxin Ru, Thomas Elsken, Arber Zela, Debadeepta Dey, Frank Hutter

    Abstract: In the past decade, advances in deep learning have resulted in breakthroughs in a variety of areas, including computer vision, natural language understanding, speech recognition, and reinforcement learning. Specialized, high-performing neural architectures are crucial to the success of deep learning in these areas. Neural architecture search (NAS), the process of automating the design of neural ar…

    Submitted 25 January, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

  18. arXiv:2211.13095  [pdf, other]

    cs.CL

    Schrödinger's Bat: Diffusion Models Sometimes Generate Polysemous Words in Superposition

    Authors: Jennifer C. White, Ryan Cotterell

    Abstract: Recent work has shown that despite their impressive capabilities, text-to-image diffusion models such as DALL-E 2 (Ramesh et al., 2022) can display strange behaviours when a prompt contains a word with multiple possible meanings, often generating images containing both senses of the word (Rassin et al., 2022). In this work we seek to put forward a possible explanation of this phenomenon. Using the…

    Submitted 23 November, 2022; originally announced November 2022.

  19. arXiv:2211.01454  [pdf, other]

    cs.LG

    Speeding up NAS with Adaptive Subset Selection

    Authors: Vishak Prasad C, Colin White, Paarth Jain, Sibasis Nayak, Ganesh Ramakrishnan

    Abstract: A majority of recent developments in neural architecture search (NAS) have been aimed at decreasing the computational cost of various techniques without affecting their final performance. Towards this goal, several low-fidelity and performance prediction methods have been considered, including those that train only on subsets of the training data. In this work, we present an adaptive subset select…

    Submitted 2 November, 2022; originally announced November 2022.

  20. arXiv:2210.09943  [pdf, other]

    cs.CV cs.AI cs.CY cs.LG

    Rethinking Bias Mitigation: Fairer Architectures Make for Fairer Face Recognition

    Authors: Samuel Dooley, Rhea Sanjay Sukthanker, John P. Dickerson, Colin White, Frank Hutter, Micah Goldblum

    Abstract: Face recognition systems are widely deployed in safety-critical applications, including law enforcement, yet they exhibit bias across a range of socio-demographic dimensions, such as gender and race. Conventional wisdom dictates that model biases arise from biased training data. As a consequence, previous works on bias mitigation largely focused on pre-processing the training data, adding penaltie…

    Submitted 6 December, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

  21. arXiv:2210.03324  [pdf, other]

    cs.LG cs.AI stat.ML

    AutoML for Climate Change: A Call to Action

    Authors: Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White

    Abstract: The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications. The climate change AI (CCAI) community works on a diverse, challenging set of problems which often involve physics-constrained ML or heterogeneous spatiotemporal data. It would be desirable to use automated machine learning (AutoML)…

    Submitted 7 October, 2022; originally announced October 2022.

  22. arXiv:2210.03230  [pdf, other]

    cs.LG cs.AI stat.ML

    NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies

    Authors: Arjun Krishnakumar, Colin White, Arber Zela, Renbo Tu, Mahmoud Safari, Frank Hutter

    Abstract: Zero-cost proxies (ZC proxies) are a recent architecture performance prediction technique aiming to significantly speed up algorithms for neural architecture search (NAS). Recent work has shown that these techniques show great promise, but certain aspects, such as evaluating and exploiting their complementary strengths, are under-studied. In this work, we create NAS-Bench-Suite: we evaluate 13 ZC…

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: NeurIPS Datasets and Benchmarks Track 2022

  23. arXiv:2209.13515  [pdf, other]

    cs.CL

    Assessing Digital Language Support on a Global Scale

    Authors: Gary F. Simons, Abbey L. Thomas, Chad K. White

    Abstract: The users of endangered languages struggle to thrive in a digitally-mediated world. We have developed an automated method for assessing how well every language recognized by ISO 639 is faring in terms of digital language support. The assessment is based on scraping the names of supported languages from the websites of 143 digital tools selected to represent a full range of ways that digital techno…

    Submitted 27 September, 2022; originally announced September 2022.

    Comments: 7 pages, 3 figures, 3 tables, to be published in Proceedings of the 29th International Conference on Computational Linguistics

  24. arXiv:2209.10926  [pdf, other]

    cs.CL cs.LG

    Equivariant Transduction through Invariant Alignment

    Authors: Jennifer C. White, Ryan Cotterell

    Abstract: The ability to generalize compositionally is key to understanding the potentially infinite number of sentences that can be constructed in a human language from only a finite number of words. Investigating whether NLP models possess this ability has been a topic of interest: SCAN (Lake and Baroni, 2018) is one task specifically proposed to test for this property. Previous work has achieved impressi…

    Submitted 22 September, 2022; originally announced September 2022.

    Comments: Accepted at COLING 2022

  25. arXiv:2206.11886  [pdf, other]

    cs.IR cs.AI cs.LG

    On the Generalizability and Predictability of Recommender Systems

    Authors: Duncan McElfresh, Sujay Khandagale, Jonathan Valverde, John P. Dickerson, Colin White

    Abstract: While other areas of machine learning have seen more and more automation, designing a high-performing recommender system still requires a high level of human effort. Furthermore, recent work has shown that modern recommender system algorithms do not always improve over well-tuned baselines. A natural follow-up question is, "how do we choose the right algorithm for a new dataset and performance met…

    Submitted 6 October, 2022; v1 submitted 23 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022

  26. arXiv:2204.05112  [pdf, other]

    cs.CV cs.LG physics.geo-ph

    FastMapSVM: Classifying Complex Objects Using the FastMap Algorithm and Support-Vector Machines

    Authors: Malcolm C. A. White, Kushal Sharma, Ang Li, T. K. Satish Kumar, Nori Nakata

    Abstract: Neural Networks and related Deep Learning methods are currently at the leading edge of technologies used for classifying objects. However, they generally demand large amounts of time and data for model training; and their learned models can sometimes be difficult to interpret. In this paper, we advance FastMapSVM -- an interpretable Machine Learning framework for classifying complex objects -- as…

    Submitted 15 June, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: 27 pages, 12 figures

  27. arXiv:2201.13396  [pdf, other]

    cs.LG cs.AI stat.ML

    NAS-Bench-Suite: NAS Evaluation is (Now) Surprisingly Easy

    Authors: Yash Mehta, Colin White, Arber Zela, Arjun Krishnakumar, Guri Zabergja, Shakiba Moradian, Mahmoud Safari, Kaicheng Yu, Frank Hutter

    Abstract: The release of tabular benchmarks, such as NAS-Bench-101 and NAS-Bench-201, has significantly lowered the computational overhead for conducting scientific research in neural architecture search (NAS). Although they have been widely adopted and used to tune real-world NAS algorithms, these benchmarks are limited to small search spaces and focus solely on image classification. Recently, several new…

    Submitted 11 February, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: ICLR 2022

  28. arXiv:2201.07372  [pdf, other]

    cs.LG cs.AI

    Prospective Learning: Principled Extrapolation to the Future

    Authors: Ashwin De Silva, Rahul Ramesh, Lyle Ungar, Marshall Hussain Shuler, Noah J. Cowan, Michael Platt, Chen Li, Leyla Isik, Seung-Eon Roh, Adam Charles, Archana Venkataraman, Brian Caffo, Javier J. How, Justus M Kebschull, John W. Krakauer, Maxim Bichuch, Kaleab Alemayehu Kinfu, Eva Yezerets, Dinesh Jayaraman, Jong M. Shin, Soledad Villar, Ian Phillips, Carey E. Priebe, Thomas Hartung, Michael I. Miller, et al. (18 additional authors not shown)

    Abstract: Learning is a process which can update decision rules, based on past experience, such that future performance improves. Traditionally, machine learning is often evaluated under the assumption that the future will be identical to the past in distribution or change adversarially. But these assumptions can be either too optimistic or pessimistic for many problems in the real world. Real world scenari…

    Submitted 13 July, 2023; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: Accepted at the 2nd Conference on Lifelong Learning Agents (CoLLAs), 2023

  29. arXiv:2112.03276  [pdf, other]

    eess.IV cs.CV cs.LG

    Organ localisation using supervised and semi supervised approaches combining reinforcement learning with imitation learning

    Authors: Sankaran Iyer, Alan Blair, Laughlin Dawes, Daniel Moses, Christopher White, Arcot Sowmya

    Abstract: Computer aided diagnostics often requires analysis of a region of interest (ROI) within a radiology scan, and the ROI may be an organ or a suborgan. Although deep learning algorithms have the ability to outperform other methods, they rely on the availability of a large amount of annotated data. Motivated by the need to address this limitation, an approach to localisation and detection of multiple…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 16 pages, 12 figures

  30. arXiv:2111.03602  [pdf, other]

    cs.LG cs.AI cs.NE stat.ML

    NAS-Bench-x11 and the Power of Learning Curves

    Authors: Shen Yan, Colin White, Yash Savani, Frank Hutter

    Abstract: While early research in neural architecture search (NAS) required extreme computational resources, the recent releases of tabular and surrogate benchmarks have greatly increased the speed and reproducibility of NAS research. However, two of the most popular benchmarks do not provide the full training information for each architecture. As a result, on these benchmarks it is not possible to run many…

    Submitted 5 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021

  31. arXiv:2108.13637  [pdf, other]

    cs.LG cs.AI q-bio.NC stat.ML

    When are Deep Networks really better than Decision Forests at small sample sizes, and how?

    Authors: Haoyin Xu, Kaleab A. Kinfu, Will LeVine, Sambit Panda, Jayanta Dey, Michael Ainsworth, Yu-Chung Peng, Madi Kusmanov, Florian Engert, Christopher M. White, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: Deep networks and decision forests (such as random forests and gradient boosted trees) are the leading machine learning methods for structured and tabular data, respectively. Many papers have empirically compared large numbers of classifiers on one or two different domains (e.g., on 100 different tabular data settings). However, a careful conceptual and empirical comparison of these two strategies…

    Submitted 2 November, 2021; v1 submitted 31 August, 2021; originally announced August 2021.

  32. arXiv:2108.09585  [pdf, other]

    math.OC cs.LG stat.ML

    Sequential Stochastic Optimization in Separable Learning Environments

    Authors: R. Reid Bishop, Chelsea C. White III

    Abstract: We consider a class of sequential decision-making problems under uncertainty that can encompass various types of supervised learning concepts. These problems have a completely observed state process and a partially observed modulation process, where the state process is affected by the modulation process only through an observation process, the observation process only observes the modulation proc…

    Submitted 21 August, 2021; originally announced August 2021.

    Comments: 30 pages (Main), 12 pages (Figures, References, Appendices), 5 figures

  33. Quantum Technologies in the Telecommunications Industry

    Authors: Vicente Martin, Juan Pedro Brito, Carmen Escribano, Marco Menchetti, Catherine White, Andrew Lord, Felix Wissel, Matthias Gunkel, Paulette Gavignet, Naveena Genay, Olivier Le Moult, Carlos Abellán, Antonio Manzalini, Antonio Pastor-Perales, Victor López, Diego López

    Abstract: Quantum based technologies have been fundamental in our world. After producing the laser and the transistor, the devices that have shaped our modern information society, the possibilities enabled by the ability to create and manipulate individual quantum states opens the door to a second quantum revolution. In this paper we explore the possibilities that these new technologies bring to the Telecom…

    Submitted 28 July, 2021; originally announced July 2021.

    Journal ref: EPJ Quantum Technology 8:19 (2021)

  34. arXiv:2106.12621  [pdf, other]

    cs.LG cs.IR stat.ME

    Leveraging semantically similar queries for ranking via combining representations

    Authors: Hayden S. Helm, Marah Abdin, Benjamin D. Pedigo, Shweti Mahajan, Vince Lyzinski, Youngser Park, Amitabh Basu, Piali Choudhury, Christopher M. White, Weiwei Yang, Carey E. Priebe

    Abstract: In modern ranking problems, different and disparate representations of the items to be ranked are often available. It is sensible, then, to try to combine these representations to improve ranking. Indeed, learning to rank via combining representations is both principled and practical for learning a ranking function for a particular query. In extremely data-scarce settings, however, the amount of l…

    Submitted 23 June, 2021; originally announced June 2021.

  35. arXiv:2106.12543  [pdf, other]

    cs.LG cs.AI stat.ML

    Synthetic Benchmarks for Scientific Research in Explainable Machine Learning

    Authors: Yang Liu, Sujay Khandagale, Colin White, Willie Neiswanger

    Abstract: As machine learning models grow more complex and their applications become more high-stakes, tools for explaining model predictions have become increasingly important. This has spurred a flurry of research in model explainability and has given rise to feature attribution methods such as LIME and SHAP. Despite their widespread use, evaluating and comparing different feature attribution methods rema…

    Submitted 4 November, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

    Comments: NeurIPS Datasets and Benchmarks Track 2021

  36. arXiv:2106.01044  [pdf, other]

    cs.CL

    Examining the Inductive Bias of Neural Language Models with Artificial Languages

    Authors: Jennifer C. White, Ryan Cotterell

    Abstract: Since language models are used to model a wide variety of languages, it is natural to ask whether the neural architectures used for the task have inductive biases towards modeling particular types of languages. Investigation of these biases has proved complicated due to the many variables that appear in the experimental setup. Languages vary in many typological dimensions, and it is difficult to s…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted at ACL 2021

  37. arXiv:2105.10185  [pdf, other]

    cs.CL cs.LG

    A Non-Linear Structural Probe

    Authors: Jennifer C. White, Tiago Pimentel, Naomi Saphra, Ryan Cotterell

    Abstract: Probes are models devised to investigate the encoding of knowledge -- e.g. syntactic structure -- in contextual representations. Probes are often designed for simplicity, which has led to restrictions on probe design that may not allow for the full exploitation of the structure of encoded information; one such restriction is linearity. We examine the case of a structural probe (Hewitt and Manning,…

    Submitted 21 May, 2021; originally announced May 2021.

    Comments: Accepted at NAACL 2021

  38. When Can Accessibility Help?: An Exploration of Accessibility Feature Recommendation on Mobile Devices

    Authors: Jason Wu, Gabriel Reyes, Sam C. White, Xiaoyi Zhang, Jeffrey P. Bigham

    Abstract: Numerous accessibility features have been developed and included in consumer operating systems to provide people with a variety of disabilities additional ways to access computing devices. Unfortunately, many users, especially older adults who are more likely to experience ability changes, are not aware of these features or do not know which combination to use. In this paper, we first quantify thi…

    Submitted 4 May, 2021; originally announced May 2021.

    Comments: Accepted to Web4All 2021 (W4A '21)

  39. arXiv:2104.01177  [pdf, other]

    cs.LG cs.NE stat.ML

    How Powerful are Performance Predictors in Neural Architecture Search?

    Authors: Colin White, Arber Zela, Binxin Ru, Yang Liu, Frank Hutter

    Abstract: Early methods in the rapidly developing field of neural architecture search (NAS) required fully training thousands of neural networks. To reduce this extreme computational cost, dozens of techniques have since been proposed to predict the final performance of neural architectures. Despite the success of such performance prediction methods, it is not well-understood how different families of techn…

    Submitted 27 October, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

    Comments: NeurIPS 2021

  40. arXiv:2104.00641  [pdf]

    stat.ML cs.LG

    Dynamic Silos: Increased Modularity in Intra-organizational Communication Networks during the Covid-19 Pandemic

    Authors: Tiona Zuzul, Emily Cox Pahnke, Jonathan Larson, Patrick Bourke, Nicholas Caurvina, Neha Parikh Shah, Fereshteh Amini, Jeffrey Weston, Youngser Park, Joshua Vogelstein, Christopher White, Carey E. Priebe

    Abstract: Workplace communications around the world were drastically altered by Covid-19, related work-from-home orders, and the rise of remote work. To understand these shifts, we analyzed aggregated, anonymized metadata from over 360 billion emails within 4,361 organizations worldwide. By comparing month-to-month and year-over-year metrics, we examined changes in network community structures over 24 month…

    Submitted 28 July, 2023; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: 48 pages, 15 figures

  41. arXiv:2103.08878  [pdf, other]

    cs.LG cs.AI cs.NE

    Learning without gradient descent encoded by the dynamics of a neurobiological model

    Authors: Vivek Kurien George, Vikash Morar, Weiwei Yang, Jonathan Larson, Bryan Tower, Shweti Mahajan, Arkin Gupta, Christopher White, Gabriel A. Silva

    Abstract: The success of state-of-the-art machine learning is essentially all based on different variations of gradient descent algorithms that minimize some version of a cost or loss function. A fundamental limitation, however, is the need to train these systems in either supervised or unsupervised ways by exposing them to typically large numbers of training examples. Here, we introduce a fundamentally nov…

    Submitted 23 March, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: Version 2 includes a new subsection 4.1 and associated table and figure benchmarking our biologically-inspired neural network against a traditional ANN

  42. arXiv:2102.10263  [pdf, other]

    stat.ML cs.LG stat.ME

    Inducing a hierarchy for multi-class classification problems

    Authors: Hayden S. Helm, Weiwei Yang, Sujeeth Bharadwaj, Kate Lytvynets, Oriana Riva, Christopher White, Ali Geisa, Carey E. Priebe

    Abstract: In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not. Unfortunately, the majority of classification datasets do not come pre-equipped with a hierarchical structure and classical flat classifiers must be employed. In this paper, we investigate a class of methods that induce a hierarchy that c…

    Submitted 20 February, 2021; originally announced February 2021.

  43. arXiv:2011.06557  [pdf, other]

    stat.ML cs.LG stat.ME

    A partition-based similarity for classification distributions

    Authors: Hayden S. Helm, Ronak D. Mehta, Brandon Duderstadt, Weiwei Yang, Christoper M. White, Ali Geisa, Joshua T. Vogelstein, Carey E. Priebe

    Abstract: Herein we define a measure of similarity between classification distributions that is both principled from the perspective of statistical pattern recognition and useful from the perspective of machine learning practitioners. In particular, we propose a novel similarity on classification distributions, dubbed task similarity, that quantifies how an optimally-transformed optimal representation for a…

    Submitted 12 November, 2020; originally announced November 2020.

  44. Detection of Local Mixing in Time-Series Data Using Permutation Entropy

    Authors: Michael Neuder, Elizabeth Bradley, Edward Dlugokencky, James W. C. White, Joshua Garland

    Abstract: While it is tempting in experimental practice to seek as high a data rate as possible, oversampling can become an issue if one takes measurements too densely. These effects can take many forms, some of which are easy to detect: e.g., when the data sequence contains multiple copies of the same measured value. In other situations, as when there is mixing -- in the measurement apparatus and/or the sys…

    Submitted 23 October, 2020; originally announced October 2020.

    Comments: Submission for Physical Review E

    Journal ref: Phys. Rev. E 103, 022217 (2021)

  45. arXiv:2007.11270  [pdf, other]

    math.OC cs.CE

    Dynamic Pooled Capacity Deployment for Urban Parcel Logistics

    Authors: Louis Faugère, Walid Klibi, Chelsea White III, Benoit Montreuil

    Abstract: Last-mile logistics is regarded as an essential yet highly expensive component of parcel logistics. In dense urban environments, this is partially caused by inherent inefficiencies due to traffic congestion and the disparity and accessibility of customer locations. In parcel logistics, access hubs are facilities supporting relay-based last-mile activities by offering temporary storage locations en…

    Submitted 22 July, 2020; originally announced July 2020.

    Comments: 36 pages, 10 figures

  46. arXiv:2007.04965  [pdf, other]

    cs.LG cs.NE stat.ML

    A Study on Encodings for Neural Architecture Search

    Authors: Colin White, Willie Neiswanger, Sam Nolen, Yash Savani

    Abstract: Neural architecture search (NAS) has been extensively studied in the past few years. A popular approach is to represent each neural architecture in the search space as a directed acyclic graph (DAG), and then search over all DAGs by encoding the adjacency matrix and list of operations as a set of hyperparameters. Recent work has demonstrated that even small changes to the way each architecture is…

    Submitted 17 February, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Journal ref: Advances in Neural Information Processing Systems 2020

  47. arXiv:2007.03377  [pdf, other]

    cs.NI quant-ph

    5G Network Slicing with QKD and Quantum-Safe Security

    Authors: Paul Wright, Catherine White, Ryan C. Parker, Jean-Sébastien Pegon, Marco Menchetti, Joseph Pearse, Arash Bahrami, Anastasia Moroz, Adrian Wonfor, Richard V. Penty, Timothy P. Spiller, Andrew Lord

    Abstract: We demonstrate how the 5G network slicing model can be extended to address data security requirements. In this work we demonstrate two different slice configurations, with different encryption requirements, representing two diverse use-cases for 5G networking: namely, an enterprise application hosted at a metro network site, and a content delivery network. We create a modified software-defined net…

    Submitted 8 January, 2021; v1 submitted 7 July, 2020; originally announced July 2020.

    Comments: 9 pages, 7 figures

    Journal ref: J. Opt. Commun. Netw. 13, 33-40 (2021)

  48. arXiv:2006.08564  [pdf, other]

    cs.LG stat.ML

    Intra-Processing Methods for Debiasing Neural Networks

    Authors: Yash Savani, Colin White, Naveen Sundar Govindarajulu

    Abstract: As deep learning models become tasked with more and more decisions that impact human lives, such as criminal recidivism, loan repayment, and face recognition for law enforcement, bias is becoming a growing concern. Debiasing algorithms are typically split into three paradigms: pre-processing, in-processing, and post-processing. However, in computer vision or natural language applications, it is co…

    Submitted 7 December, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Journal ref: Advances in Neural Information Processing Systems 2020

  49. arXiv:2005.10700  [pdf, other]

    cs.LG cs.IR stat.ML

    Distance-based Positive and Unlabeled Learning for Ranking

    Authors: Hayden S. Helm, Amitabh Basu, Avanti Athreya, Youngser Park, Joshua T. Vogelstein, Carey E. Priebe, Michael Winding, Marta Zlatic, Albert Cardona, Patrick Bourke, Jonathan Larson, Marah Abdin, Piali Choudhury, Weiwei Yang, Christopher W. White

    Abstract: Learning to rank -- producing a ranked list of items specific to a query and with respect to a set of supervisory items -- is a problem of general interest. The setting we consider is one in which no analytic description of what constitutes a good ranking is available. Instead, we have a collection of representations and supervisory information consisting of a (target item, interesting items set)…

    Submitted 28 September, 2022; v1 submitted 19 May, 2020; originally announced May 2020.

    Comments: 21 pages, 5 figures

  50. arXiv:2005.05688  [pdf, other]

    cs.HC

    Design of a Privacy-Preserving Data Platform for Collaboration Against Human Trafficking

    Authors: Darren Edge, Weiwei Yang, Kate Lytvynets, Harry Cook, Claire Galez-Davis, Hannah Darnton, Christopher M. White

    Abstract: Case records on victims of human trafficking are highly sensitive, yet the ability to share such data is critical to evidence-based practice and policy development across government, business, and civil society. We present new methods to anonymize, publish, and explore such data, implemented as a pipeline generating three artifacts: (1) synthetic data mitigating the privacy risk that published att…

    Submitted 18 September, 2020; v1 submitted 12 May, 2020; originally announced May 2020.