
Showing 1–39 of 39 results for author: Sala, F

  1. arXiv:2407.11004  [pdf, other]

    cs.CL cs.AI cs.LG

    The ALCHEmist: Automated Labeling 500x CHEaper Than LLM Data Annotators

    Authors: Tzu-Heng Huang, Catherine Cao, Vaishnavi Bhargava, Frederic Sala

    Abstract: Large pretrained models can be used as annotators, helping replace or augment crowdworkers and enabling distilling generalist models into smaller specialist models. Unfortunately, this comes at a cost: employing top-of-the-line models often requires paying thousands of dollars for API calls, while the resulting datasets are static and challenging to audit. To address these challenges, we propose a…

    Submitted 25 June, 2024; originally announced July 2024.

  2. arXiv:2407.03651  [pdf, other]

    cs.CL cs.AI

    Evaluating Language Model Context Windows: A "Working Memory" Test and Inference-time Correction

    Authors: Amanda Dsouza, Christopher Glaze, Changho Shin, Frederic Sala

    Abstract: Large language models are prominently used in real-world applications, often tasked with reasoning over large volumes of documents. An exciting development in this space is models boasting extended context capabilities, with some accommodating over 2 million tokens. Such long context model capabilities remain uncertain in production systems, motivating the need to benchmark their performance on re…

    Submitted 14 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2406.03642  [pdf, other]

    cs.CL cs.LG

    Is Free Self-Alignment Possible?

    Authors: Dyah Adila, Changho Shin, Yijing Zhang, Frederic Sala

    Abstract: Aligning pretrained language models (LMs) is a complex and resource-intensive process, often requiring access to large amounts of ground-truth preference data and substantial compute. Are these costs necessary? That is, is it possible to align using only inherent model knowledge and without additional training? We tackle this challenge with AlignEZ, a novel approach that uses (1) self-generated pr…

    Submitted 5 June, 2024; originally announced June 2024.

  4. arXiv:2406.00894  [pdf, other]

    cs.LG cs.AI cs.CL

    Pretrained Hybrids with MAD Skills

    Authors: Nicholas Roberts, Samuel Guo, Zhiqi Gao, Satya Sai Srinath Namburi GNVV, Sonia Cromp, Chengjun Wu, Chengyu Duan, Frederic Sala

    Abstract: While Transformers underpin modern large language models (LMs), there is a growing list of alternative architectures with new capabilities, promises, and tradeoffs. This makes choosing the right LM architecture challenging. Recently-proposed $\textit{hybrid architectures}$ seek a best-of-all-worlds approach that reaps the benefits of all architectures. Hybrid design is difficult for two reasons: i…

    Submitted 2 June, 2024; originally announced June 2024.

  5. arXiv:2404.16188  [pdf, other]

    cs.LG cs.AI stat.ML

    Pearls from Pebbles: Improved Confidence Functions for Auto-labeling

    Authors: Harit Vishwakarma, Reid Chen, Sui Jiet Tay, Satya Sai Srinath Namburi, Frederic Sala, Ramya Korlakai Vinayak

    Abstract: Auto-labeling is an important family of techniques that produce labeled training sets with minimum manual labeling. A prominent variant, threshold-based auto-labeling (TBAL), works by finding a threshold on a model's confidence scores above which it can accurately label unlabeled data points. However, many models are known to produce overconfident scores, leading to poor TBAL performance. While a…

    Submitted 24 April, 2024; originally announced April 2024.

  6. arXiv:2404.08461  [pdf, other]

    cs.LG cs.AI

    OTTER: Improving Zero-Shot Classification via Optimal Transport

    Authors: Changho Shin, Jitian Zhao, Sonia Cromp, Harit Vishwakarma, Frederic Sala

    Abstract: Popular zero-shot models suffer due to artifacts inherited from pretraining. A particularly detrimental artifact, caused by unbalanced web-scale pretraining data, is mismatched label distribution. Existing approaches that seek to repair the label distribution are not suitable in zero-shot settings, as they have incompatible requirements such as access to labeled downstream task data or knowledge o…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 29 pages

  7. arXiv:2401.15478  [pdf, other]

    q-bio.QM cs.LG q-bio.MN

    Product Manifold Representations for Learning on Biological Pathways

    Authors: Daniel McNeela, Frederic Sala, Anthony Gitter

    Abstract: Machine learning models that embed graphs in non-Euclidean spaces have shown substantial benefits in a variety of contexts, but their application has not been studied extensively in the biological domain, particularly with respect to biological pathway graphs. Such graphs exhibit a variety of complex network structures, presenting challenges to existing embedding approaches. Learning high-quality…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 28 pages, 19 figures

  8. arXiv:2401.12225  [pdf, other]

    cs.CV cs.LG

    Multimodal Data Curation via Object Detection and Filter Ensembles

    Authors: Tzu-Heng Huang, Changho Shin, Sui Jiet Tay, Dyah Adila, Frederic Sala

    Abstract: We propose an approach for curating multimodal data that we used for our entry in the 2023 DataComp competition filtering track. Our technique combines object detection and weak supervision-based ensembling. In the first of two steps in our approach, we employ an out-of-the-box zero-shot object detection model to extract granular information and produce a variety of filter designs. In the second s…

    Submitted 5 January, 2024; originally announced January 2024.

    Comments: Appeared in the Towards the Next Generation of Computer Vision Datasets (TNGCV) workshop at ICCV 2023

  9. arXiv:2312.04740  [pdf, other]

    cs.LG cs.AI cs.GT

    Train 'n Trade: Foundations of Parameter Markets

    Authors: Tzu-Heng Huang, Harit Vishwakarma, Frederic Sala

    Abstract: Organizations typically train large models individually. This is costly and time-consuming, particularly for large-scale foundation models. Such vertical production is known to be suboptimal. Inspired by this economic insight, we ask whether it is possible to leverage others' expertise by trading the constituent parts in models, i.e., sets of weights, as if they were market commodities. While rece…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted at NeurIPS 2023

  10. arXiv:2312.00960  [pdf]

    cs.CL cs.AI cs.LG

    The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models

    Authors: Satya Sai Srinath Namburi, Makesh Sreedhar, Srinath Srinivasan, Frederic Sala

    Abstract: Compressing large language models (LLMs), often consisting of billions of parameters, provides faster inference and smaller memory footprints, and enables local deployment. Two standard compression techniques are pruning and quantization, with the former eliminating redundant connections in model layers and the latter representing model parameters with fewer bits. The key tradeoff is between the degr…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Accepted to EMNLP 2023 Findings

  11. arXiv:2309.04344  [pdf, other]

    cs.LG cs.AI

    Zero-Shot Robustification of Zero-Shot Models

    Authors: Dyah Adila, Changho Shin, Linrong Cai, Frederic Sala

    Abstract: Zero-shot inference is a powerful paradigm that enables the use of large pretrained models for downstream classification tasks without further training. However, these models are vulnerable to inherited biases that can impact their performance. The traditional solution is fine-tuning, but this undermines the key advantage of pretrained models, which is their ability to be used out-of-the-box. We p…

    Submitted 12 February, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: International Conference on Learning Representations (ICLR), 2024

  12. arXiv:2307.14430  [pdf, other]

    cs.CL cs.LG

    Skill-it! A Data-Driven Skills Framework for Understanding and Training Language Models

    Authors: Mayee F. Chen, Nicholas Roberts, Kush Bhatia, Jue Wang, Ce Zhang, Frederic Sala, Christopher Ré

    Abstract: The quality of training data impacts the performance of pre-trained large language models (LMs). Given a fixed budget of tokens, we study how to best select data that leads to good downstream model performance across tasks. We develop a new framework based on a simple hypothesis: just as humans acquire interdependent skills in a deliberate order, language models also follow a natural order when le…

    Submitted 26 July, 2023; originally announced July 2023.

  13. arXiv:2307.12226  [pdf, other]

    cs.LG cs.AI stat.ML

    Geometry-Aware Adaptation for Pretrained Models

    Authors: Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala

    Abstract: Machine learning models -- including prominent zero-shot models -- are often trained on datasets whose labels are only a small proportion of a larger label space. Such spaces are commonly equipped with a metric that relates the labels via distances between them. We propose a simple approach to exploit this information to adapt the trained model to reliably predict new classes -- or, in the case of…

    Submitted 27 November, 2023; v1 submitted 23 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023

  14. arXiv:2307.11031  [pdf, ps, other]

    cs.LG cs.CL

    Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification

    Authors: Neel Guha, Mayee F. Chen, Kush Bhatia, Azalia Mirhoseini, Frederic Sala, Christopher Ré

    Abstract: Recent work has shown that language models' (LMs) prompt-based learning capabilities make them well suited for automating data labeling in domains where manual annotation is expensive. The challenge is that while writing an initial prompt is cheap, improving a prompt is costly -- practitioners often require significant labeled data in order to evaluate the impact of prompt modifications. Our work…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: 38 pages, 22 figures, 8 tables

  15. arXiv:2303.17713  [pdf, other]

    cs.LG cs.CY stat.ML

    Mitigating Source Bias for Fairer Weak Supervision

    Authors: Changho Shin, Sonia Cromp, Dyah Adila, Frederic Sala

    Abstract: Weak supervision enables efficient development of training sets by reducing the need for ground truth labels. However, the techniques that make weak supervision attractive -- such as integrating any source of signal to estimate unknown labels -- also entail the danger that the produced pseudolabels are highly biased. Surprisingly, given everyday use and the potential for increased bias, weak super…

    Submitted 29 November, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  16. arXiv:2303.07527  [pdf, other]

    cs.LG cs.CV

    Domain Generalization via Nuclear Norm Regularization

    Authors: Zhenmei Shi, Yifei Ming, Ying Fan, Frederic Sala, Yingyu Liang

    Abstract: The ability to generalize to unseen domains is crucial for machine learning systems deployed in the real world, especially when we only have data from limited training domains. In this paper, we propose a simple and effective regularization method based on the nuclear norm of the learned features for domain generalization. Intuitively, the proposed regularizer mitigates the impacts of environmenta…

    Submitted 4 December, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: 23 pages

  17. arXiv:2212.10579  [pdf, other]

    hep-ph cs.LG hep-ex stat.ML

    Resonant Anomaly Detection with Multiple Reference Datasets

    Authors: Mayee F. Chen, Benjamin Nachman, Frederic Sala

    Abstract: An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal. Such techniques, including Classification Without Labels (CWoLa) and Simulation Assisted Likelihood-free Anomaly Detection (SALAD), rely on a single reference dataset. They cannot take advantage…

    Submitted 20 December, 2022; originally announced December 2022.

  18. arXiv:2211.13375  [pdf, other]

    cs.LG cs.AI stat.ML

    Lifting Weak Supervision To Structured Prediction

    Authors: Harit Vishwakarma, Nicholas Roberts, Frederic Sala

    Abstract: Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates from a variety of sources. WS is theoretically well understood for binary classification, where simple approaches enable consistent estimation of pseudolabel noise rates. Using this result, it has been shown that downstream models trained on the pseudolab…

    Submitted 23 November, 2022; originally announced November 2022.

  19. arXiv:2211.12620  [pdf, other]

    cs.LG cs.AI stat.ML

    Promises and Pitfalls of Threshold-based Auto-labeling

    Authors: Harit Vishwakarma, Heguang Lin, Frederic Sala, Ramya Korlakai Vinayak

    Abstract: Creating large-scale high-quality labeled datasets is a major bottleneck in supervised machine learning workflows. Threshold-based auto-labeling (TBAL), where validation data obtained from humans is used to find a confidence threshold above which the data is machine-labeled, reduces reliance on manual annotation. TBAL is emerging as a widely-used solution in practice. Given the long shelf-life and…

    Submitted 21 February, 2024; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2023 (Spotlight)

    Journal ref: Thirty Seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  20. arXiv:2210.03324  [pdf, other]

    cs.LG cs.AI stat.ML

    AutoML for Climate Change: A Call to Action

    Authors: Renbo Tu, Nicholas Roberts, Vishak Prasad, Sibasis Nayak, Paarth Jain, Frederic Sala, Ganesh Ramakrishnan, Ameet Talwalkar, Willie Neiswanger, Colin White

    Abstract: The challenge that climate change poses to humanity has spurred a rapidly developing field of artificial intelligence research focused on climate change applications. The climate change AI (CCAI) community works on a diverse, challenging set of problems which often involve physics-constrained ML or heterogeneous spatiotemporal data. It would be desirable to use automated machine learning (AutoML)…

    Submitted 7 October, 2022; originally announced October 2022.

  21. arXiv:2210.02441  [pdf, other]

    cs.CL

    Ask Me Anything: A simple strategy for prompting language models

    Authors: Simran Arora, Avanika Narayan, Mayee F. Chen, Laurel Orr, Neel Guha, Kush Bhatia, Ines Chami, Frederic Sala, Christopher Ré

    Abstract: Large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt that demonstrates how to perform the task and no additional training. Prompting is a brittle process wherein small modifications to the prompt can cause large variations in the model predictions, and therefore significant effort is dedicated towards designing a painstakingly "perfect promp…

    Submitted 19 November, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

  22. arXiv:2208.14362  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    AutoWS-Bench-101: Benchmarking Automated Weak Supervision with 100 Labels

    Authors: Nicholas Roberts, Xintong Li, Tzu-Heng Huang, Dyah Adila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, Aws Albarghouthi, Frederic Sala

    Abstract: Weak supervision (WS) is a powerful method to build labeled datasets for training supervised models in the face of little-to-no labeled data. It replaces hand-labeling data with aggregating multiple noisy-but-cheap label estimates expressed by labeling functions (LFs). While it has been used successfully in many domains, weak supervision's application scope is limited by the difficulty of construc…

    Submitted 24 November, 2023; v1 submitted 30 August, 2022; originally announced August 2022.

    Comments: NeurIPS 2022 Datasets and Benchmarks Track

  23. arXiv:2203.13270  [pdf, other]

    stat.ML cs.LG

    Shoring Up the Foundations: Fusing Model Embeddings and Weak Supervision

    Authors: Mayee F. Chen, Daniel Y. Fu, Dyah Adila, Michael Zhang, Frederic Sala, Kayvon Fatahalian, Christopher Ré

    Abstract: Foundation models offer an exciting new paradigm for constructing models with out-of-the-box embeddings and a few labeled examples. However, it is not clear how to best apply foundation models without labeled data. A potential approach is to fuse foundation models with weak supervision frameworks, which use weak label sources -- pre-trained models, heuristics, crowd-workers -- to construct pseudol…

    Submitted 1 August, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: UAI 2022 Camera Ready

  24. arXiv:2203.12023  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Generative Modeling Helps Weak Supervision (and Vice Versa)

    Authors: Benedikt Boecking, Nicholas Roberts, Willie Neiswanger, Stefano Ermon, Frederic Sala, Artur Dubrawski

    Abstract: Many promising applications of supervised machine learning face hurdles in the acquisition of labeled data in sufficient quantity and quality, creating an expensive bottleneck. To overcome such limitations, techniques that do not depend on ground truth labels have been studied, including weak supervision and generative modeling. While these techniques would seem to be usable in concert, improving…

    Submitted 11 March, 2023; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: Published as a conference paper at ICLR 2023

    ACM Class: I.2.0; I.4.m

  25. arXiv:2112.03865  [pdf, other]

    cs.LG cs.AI

    Universalizing Weak Supervision

    Authors: Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

    Abstract: Weak supervision (WS) frameworks are a popular way to bypass hand-labeling large datasets for training data-hungry models. These approaches synthesize multiple noisy but cheaply-acquired estimates of labels into a set of high-quality pseudolabels for downstream training. However, the synthesis technique is specific to a particular kind of label, such as binary labels or sequences, and each new lab…

    Submitted 29 November, 2023; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: ICLR 2022

  26. arXiv:2110.05668  [pdf, other]

    cs.CV cs.LG

    NAS-Bench-360: Benchmarking Neural Architecture Search on Diverse Tasks

    Authors: Renbo Tu, Nicholas Roberts, Mikhail Khodak, Junhong Shen, Frederic Sala, Ameet Talwalkar

    Abstract: Most existing neural architecture search (NAS) benchmarks and algorithms prioritize well-studied tasks, e.g. image classification on CIFAR or ImageNet. This makes the performance of NAS approaches in more diverse areas poorly understood. In this paper, we present NAS-Bench-360, a benchmark suite to evaluate methods on domains beyond those traditionally studied in architecture search, and use it to…

    Submitted 19 January, 2023; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2022 Datasets and Benchmarks Track

  27. arXiv:2103.02761  [pdf, other]

    cs.LG stat.ML

    Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation

    Authors: Mayee F. Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Ré

    Abstract: Labeling data for modern machine learning is expensive and time-consuming. Latent variable models can be used to infer labels from weaker, easier-to-acquire sources operating on unlabeled data. Such models can also be trained using labeled data, presenting a key question: should a user invest in few labeled or many unlabeled points? We answer this via a framework centered on model misspecification…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: To appear in AISTATS 2021

  28. arXiv:2006.15168  [pdf, other]

    stat.ML cs.LG

    Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings

    Authors: Mayee F. Chen, Daniel Y. Fu, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, Christopher Ré

    Abstract: Our goal is to enable machine learning systems to be trained interactively. This requires models that perform well and train quickly, without large amounts of hand-labeled data. We take a step forward in this direction by borrowing from weak supervision (WS), wherein models can be trained with noisy sources of signal instead of hand-labeled data. But WS relies on training downstream deep networks…

    Submitted 26 June, 2020; originally announced June 2020.

  29. arXiv:2005.00545  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Low-Dimensional Hyperbolic Knowledge Graph Embeddings

    Authors: Ines Chami, Adva Wolf, Da-Cheng Juan, Frederic Sala, Sujith Ravi, Christopher Ré

    Abstract: Knowledge graph (KG) embeddings learn low-dimensional representations of entities and relations to predict missing facts. KGs often exhibit hierarchical and logical patterns which must be preserved in the embedding space. For hierarchical data, hyperbolic embedding methods have shown promise for high-fidelity and parsimonious representations. However, existing hyperbolic embedding methods do not a…

    Submitted 1 May, 2020; originally announced May 2020.

  30. arXiv:2004.05316  [pdf, other]

    cs.LG stat.ML

    Ivy: Instrumental Variable Synthesis for Causal Inference

    Authors: Zhaobin Kuang, Frederic Sala, Nimit Sohoni, Sen Wu, Aldo Córdova-Palomera, Jared Dunnmon, James Priest, Christopher Ré

    Abstract: A popular way to estimate the causal effect of a variable x on y from observational data is to use an instrumental variable (IV): a third variable z that affects y only through x. The more strongly z is associated with x, the more reliable the estimate is, but such strong IVs are difficult to find. Instead, practitioners combine more commonly available IV candidates---which are not necessarily str…

    Submitted 11 April, 2020; originally announced April 2020.

  31. arXiv:2002.11955  [pdf, other]

    stat.ML cs.LG

    Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods

    Authors: Daniel Y. Fu, Mayee F. Chen, Frederic Sala, Sarah M. Hooper, Kayvon Fatahalian, Christopher Ré

    Abstract: Weak supervision is a popular method for building machine learning models without relying on ground truth annotations. Instead, it generates probabilistic training labels by estimating the accuracies of multiple noisy labeling sources (e.g., heuristics, crowd workers). Existing approaches use latent variable estimation to model the noisy sources, but these methods can be computationally expensive,…

    Submitted 15 July, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

  32. arXiv:1910.09505  [pdf, other]

    stat.ML cs.CV cs.LG

    Multi-Resolution Weak Supervision for Sequential Data

    Authors: Frederic Sala, Paroma Varma, Jason Fries, Daniel Y. Fu, Shiori Sagawa, Saelig Khattar, Ashwini Ramamoorthy, Ke Xiao, Kayvon Fatahalian, James Priest, Christopher Ré

    Abstract: Since manually labeling training data is slow and expensive, recent industrial and scientific research efforts have turned to weaker or noisier forms of supervision sources. However, existing weak supervision approaches fail to model multi-resolution sources for sequential data, like video, that can assign labels to individual elements or collections of elements in a sequence. A key challenge in w…

    Submitted 21 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019 (Conference on Neural Information Processing Systems)

  33. arXiv:1903.05844  [pdf, other]

    stat.ML cs.LG

    Learning Dependency Structures for Weak Supervision Models

    Authors: Paroma Varma, Frederic Sala, Ann He, Alexander Ratner, Christopher Ré

    Abstract: Labeling training data is a key bottleneck in the modern machine learning pipeline. Recent weak supervision approaches combine labels from multiple noisy sources by estimating their accuracies without access to ground truth labels; however, estimating the dependencies among these sources is a critical challenge. We focus on a robust PCA-based algorithm for learning these dependency structures, est…

    Submitted 14 March, 2019; originally announced March 2019.

  34. arXiv:1810.02840  [pdf, other]

    stat.ML cs.LG

    Training Complex Models with Multi-Task Weak Supervision

    Authors: Alexander Ratner, Braden Hancock, Jared Dunnmon, Frederic Sala, Shreyash Pandey, Christopher Ré

    Abstract: As machine learning models continue to increase in complexity, collecting large hand-labeled training sets has become one of the biggest roadblocks in practice. Instead, weaker forms of supervision that provide noisier but cheaper labels are often used. However, these weak supervision sources have diverse and unknown accuracies, may output correlated labels, and may label different tasks or apply…

    Submitted 7 December, 2018; v1 submitted 5 October, 2018; originally announced October 2018.

  35. arXiv:1804.03329  [pdf, other]

    cs.LG stat.ML

    Representation Tradeoffs for Hyperbolic Embeddings

    Authors: Christopher De Sa, Albert Gu, Christopher Ré, Frederic Sala

    Abstract: Hyperbolic embeddings offer excellent quality with few dimensions when embedding hierarchical data structures like synonym or type hierarchies. Given a tree, we give a combinatorial construction that embeds the tree in hyperbolic space with arbitrarily low distortion without using optimization. On WordNet, our combinatorial embedding obtains a mean-average-precision of 0.989 with only two dimensio…

    Submitted 24 April, 2018; v1 submitted 9 April, 2018; originally announced April 2018.

  36. arXiv:1712.07222  [pdf, other]

    cs.IT

    Codes Correcting Two Deletions

    Authors: Ryan Gabrys, Frederic Sala

    Abstract: In this work, we investigate the problem of constructing codes capable of correcting two deletions. In particular, we construct a code that requires approximately 8 log n + O(log log n) bits of redundancy, where n is the length of the code. To the best of the authors' knowledge, this represents the best known construction in that it requires the lowest number of redundant bits for a cod…

    Submitted 30 April, 2018; v1 submitted 19 December, 2017; originally announced December 2017.

  37. arXiv:1703.02641  [pdf, other]

    stat.ML cs.LG

    Don't Fear the Bit Flips: Optimized Coding Strategies for Binary Classification

    Authors: Frederic Sala, Shahroze Kabir, Guy Van den Broeck, Lara Dolecek

    Abstract: After being trained, classifiers must often operate on data that has been corrupted by noise. In this paper, we consider the impact of such noise on the features of binary classifiers. Inspired by tools for classifier robustness, we introduce the same classification probability (SCP) to measure the resulting distortion on the classifier outputs. We introduce a low-complexity estimate of the SCP ba…

    Submitted 7 March, 2017; originally announced March 2017.

    Comments: 11 pages, 4 figures

  38. arXiv:1604.03000  [pdf, other]

    cs.IT

    Exact Reconstruction from Insertions in Synchronization Codes

    Authors: Frederic Sala, Ryan Gabrys, Clayton Schoeny, Lara Dolecek

    Abstract: This work studies problems in data reconstruction, an important area with numerous applications. In particular, we examine the reconstruction of binary and non-binary sequences from synchronization (insertion/deletion-correcting) codes. These sequences have been corrupted by a fixed number of symbol insertions (larger than the minimum edit distance of the code), yielding a number of distinct trace…

    Submitted 7 March, 2017; v1 submitted 11 April, 2016; originally announced April 2016.

    Comments: 18 pages, 3 figures. Accepted to IEEE Transactions on Information Theory

  39. arXiv:1602.03206  [pdf, other]

    cs.GR cs.CV

    Design of false color palettes for grayscale reproduction

    Authors: Filip A. Sala

    Abstract: Designing a false color palette is quite easy, but some effort is needed to achieve good dynamic range, contrast, and overall appearance of the palette. Such palettes, for instance, are commonly used in scientific papers for presenting data. However, to lower the cost of the paper, most scientists decide to let the data be printed in grayscale. The same applies to e-book readers based on e-…

    Submitted 31 March, 2016; v1 submitted 6 February, 2016; originally announced February 2016.