-
Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations
Authors:
Arie Cattan,
Alon Jacovi,
Alex Fabrikant,
Jonathan Herzig,
Roee Aharoni,
Hannah Rashkin,
Dror Marcus,
Avinatan Hassidim,
Yossi Matias,
Idan Szpektor,
Avi Caciularu
Abstract:
Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scenario; However, naively adding ICL examples with long context introduces challenges, including substantial token overhead added for each few-shot examp…
▽ More
Despite recent advancements in Large Language Models (LLMs), their performance on tasks involving long contexts remains sub-optimal. In-Context Learning (ICL) with few-shot examples may be an appealing solution to enhance LLM performance in this scenario; However, naively adding ICL examples with long context introduces challenges, including substantial token overhead added for each few-shot example and context mismatch between the demonstrations and the target query. In this work, we propose to automatically generate few-shot examples for long context QA tasks by recycling contexts. Specifically, given a long input context (1-3k tokens) and a query, we generate additional query-output pairs from the given context as few-shot examples, while introducing the context only once. This ensures that the demonstrations are leveraging the same context as the target query while only adding a small number of tokens to the prompt. We further enhance each demonstration by instructing the model to explicitly identify the relevant paragraphs before the answer, which improves performance while providing fine-grained attribution to the answer source. We apply our method on multiple LLMs and obtain substantial improvements (+23\% on average across models) on various QA datasets with long context, especially when the answer lies within the middle of the context. Surprisingly, despite introducing only single-hop ICL examples, LLMs also successfully generalize to multi-hop long-context QA using our approach.
△ Less
Submitted 23 June, 2024; v1 submitted 19 June, 2024;
originally announced June 2024.
-
Multi-turn Reinforcement Learning from Preference Human Feedback
Authors:
Lior Shani,
Aviv Rosenberg,
Asaf Cassel,
Oran Lang,
Daniele Calandriello,
Avital Zipori,
Hila Noga,
Orgad Keller,
Bilal Piot,
Idan Szpektor,
Avinatan Hassidim,
Yossi Matias,
Rémi Munos
Abstract:
Reinforcement Learning from Human Feedback (RLHF) has become the standard approach for aligning Large Language Models (LLMs) with human preferences, allowing LLMs to demonstrate remarkable abilities in various tasks. Existing methods work by emulating the preferences at the single decision (turn) level, limiting their capabilities in settings that require planning or multi-turn interactions to ach…
▽ More
Reinforcement Learning from Human Feedback (RLHF) has become the standard approach for aligning Large Language Models (LLMs) with human preferences, allowing LLMs to demonstrate remarkable abilities in various tasks. Existing methods work by emulating the preferences at the single decision (turn) level, limiting their capabilities in settings that require planning or multi-turn interactions to achieve a long-term goal. In this paper, we address this issue by developing novel methods for Reinforcement Learning (RL) from preference feedback between two full multi-turn conversations. In the tabular setting, we present a novel mirror-descent-based policy optimization algorithm for the general multi-turn preference-based RL problem, and prove its convergence to Nash equilibrium. To evaluate performance, we create a new environment, Education Dialogue, where a teacher agent guides a student in learning a random topic, and show that a deep RL variant of our algorithm outperforms RLHF baselines. Finally, we show that in an environment with explicit rewards, our algorithm recovers the same performance as a reward-based RL baseline, despite relying solely on a weaker preference signal.
△ Less
Submitted 23 May, 2024;
originally announced May 2024.
-
The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models
Authors:
Ariel Goldstein,
Eric Ham,
Mariano Schain,
Samuel Nastase,
Zaid Zada,
Avigail Dabush,
Bobbi Aubrey,
Harshvardhan Gazula,
Amir Feder,
Werner K Doyle,
Sasha Devore,
Patricia Dugan,
Daniel Friedman,
Roi Reichart,
Michael Brenner,
Avinatan Hassidim,
Orrin Devinsky,
Adeen Flinker,
Omer Levy,
Uri Hasson
Abstract:
Deep Language Models (DLMs) provide a novel computational paradigm for understanding the mechanisms of natural language processing in the human brain. Unlike traditional psycholinguistic models, DLMs use layered sequences of continuous numerical vectors to represent words and context, allowing a plethora of emerging applications such as human-like text generation. In this paper we show evidence th…
▽ More
Deep Language Models (DLMs) provide a novel computational paradigm for understanding the mechanisms of natural language processing in the human brain. Unlike traditional psycholinguistic models, DLMs use layered sequences of continuous numerical vectors to represent words and context, allowing a plethora of emerging applications such as human-like text generation. In this paper we show evidence that the layered hierarchy of DLMs may be used to model the temporal dynamics of language comprehension in the brain by demonstrating a strong correlation between DLM layer depth and the time at which layers are most predictive of the human brain. Our ability to temporally resolve individual layers benefits from our use of electrocorticography (ECoG) data, which has a much higher temporal resolution than noninvasive methods like fMRI. Using ECoG, we record neural activity from participants listening to a 30-minute narrative while also feeding the same narrative to a high-performing DLM (GPT2-XL). We then extract contextual embeddings from the different layers of the DLM and use linear encoding models to predict neural activity. We first focus on the Inferior Frontal Gyrus (IFG, or Broca's area) and then extend our model to track the increasing temporal receptive window along the linguistic processing hierarchy from auditory to syntactic and semantic areas. Our results reveal a connection between human language processing and DLMs, with the DLM's layer-by-layer accumulation of contextual information mirroring the timing of neural activity in high-order language areas.
△ Less
Submitted 10 October, 2023;
originally announced October 2023.
-
AI Increases Global Access to Reliable Flood Forecasts
Authors:
Grey Nearing,
Deborah Cohen,
Vusumuzi Dube,
Martin Gauch,
Oren Gilon,
Shaun Harrigan,
Avinatan Hassidim,
Daniel Klotz,
Frederik Kratzert,
Asher Metzger,
Sella Nevo,
Florian Pappenberger,
Christel Prudhomme,
Guy Shalev,
Shlomo Shenzis,
Tadele Tekalign,
Dana Weitzner,
Yoss Matias
Abstract:
Floods are one of the most common natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow gauge networks. Accurate and timely warnings are critical for mitigating flood risks, but hydrological simulation models typically must be calibrated to long data records in each watershed. Using AI, we achieve reliability in predicting extreme riverine event…
▽ More
Floods are one of the most common natural disasters, with a disproportionate impact in developing countries that often lack dense streamflow gauge networks. Accurate and timely warnings are critical for mitigating flood risks, but hydrological simulation models typically must be calibrated to long data records in each watershed. Using AI, we achieve reliability in predicting extreme riverine events in ungauged watersheds at up to a 5-day lead time that is similar to or better than the reliability of nowcasts (0-day lead time) from a current state of the art global modeling system (the Copernicus Emergency Management Service Global Flood Awareness System). Additionally, we achieve accuracies over 5-year return period events that are similar to or better than current accuracies over 1-year return period events. This means that AI can provide flood warnings earlier and over larger and more impactful events in ungauged basins. The model developed in this paper was incorporated into an operational early warning system that produces publicly available (free and open) forecasts in real time in over 80 countries. This work highlights a need for increasing the availability of hydrological data to continue to improve global access to reliable flood warnings.
△ Less
Submitted 3 November, 2023; v1 submitted 29 July, 2023;
originally announced July 2023.
-
Using generative AI to investigate medical imagery models and datasets
Authors:
Oran Lang,
Doron Yaya-Stupp,
Ilana Traynis,
Heather Cole-Lewis,
Chloe R. Bennett,
Courtney Lyles,
Charles Lau,
Michal Irani,
Christopher Semturs,
Dale R. Webster,
Greg S. Corrado,
Avinatan Hassidim,
Yossi Matias,
Yun Liu,
Naama Hammel,
Boris Babenko
Abstract:
AI models have shown promise in many medical imaging tasks. However, our ability to explain what signals these models have learned is severely lacking. Explanations are needed in order to increase the trust in AI-based models, and could enable novel scientific discovery by uncovering signals in the data that are not yet known to experts. In this paper, we present a method for automatic visual expl…
▽ More
AI models have shown promise in many medical imaging tasks. However, our ability to explain what signals these models have learned is severely lacking. Explanations are needed in order to increase the trust in AI-based models, and could enable novel scientific discovery by uncovering signals in the data that are not yet known to experts. In this paper, we present a method for automatic visual explanations leveraging team-based expertise by generating hypotheses of what visual signals in the images are correlated with the task. We propose the following 4 steps: (i) Train a classifier to perform a given task (ii) Train a classifier guided StyleGAN-based image generator (StylEx) (iii) Automatically detect and visualize the top visual attributes that the classifier is sensitive towards (iv) Formulate hypotheses for the underlying mechanisms, to stimulate future research. Specifically, we present the discovered attributes to an interdisciplinary panel of experts so that hypotheses can account for social and structural determinants of health. We demonstrate results on eight prediction tasks across three medical imaging modalities: retinal fundus photographs, external eye photographs, and chest radiographs. We showcase examples of attributes that capture clinically known features, confounders that arise from factors beyond physiological mechanisms, and reveal a number of physiologically plausible novel attributes. Our approach has the potential to enable researchers to better understand, improve their assessment, and extract new knowledge from AI-based models. Importantly, we highlight that attributes generated by our framework can capture phenomena beyond physiology or pathophysiology, reflecting the real world nature of healthcare delivery and socio-cultural factors. Finally, we intend to release code to enable researchers to train their own StylEx models and analyze their predictive tasks.
△ Less
Submitted 4 July, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback
Authors:
Paul Roit,
Johan Ferret,
Lior Shani,
Roee Aharoni,
Geoffrey Cideron,
Robert Dadashi,
Matthieu Geist,
Sertan Girgin,
Léonard Hussenot,
Orgad Keller,
Nikola Momchev,
Sabela Ramos,
Piotr Stanczyk,
Nino Vieillard,
Olivier Bachem,
Gal Elidan,
Avinatan Hassidim,
Olivier Pietquin,
Idan Szpektor
Abstract:
Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work, we leverage recent progress on textual entailment models to directly address this p…
▽ More
Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work, we leverage recent progress on textual entailment models to directly address this problem for abstractive summarization systems. We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency and explore the ensuing trade-offs, as improved consistency may come at the cost of less informative or more extractive summaries. Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
△ Less
Submitted 31 May, 2023;
originally announced June 2023.
-
Leximin Approximation: From Single-Objective to Multi-Objective
Authors:
Eden Hartman,
Avinatan Hassidim,
Yonatan Aumann,
Erel Segal-Halevi
Abstract:
Leximin is a common approach to multi-objective optimization, frequently employed in fair division applications. In leximin optimization, one first aims to maximize the smallest objective value; subject to this, one maximizes the second-smallest objective; and so on. Often, even the single-objective problem of maximizing the smallest value cannot be solved accurately. What can we hope to accomplis…
▽ More
Leximin is a common approach to multi-objective optimization, frequently employed in fair division applications. In leximin optimization, one first aims to maximize the smallest objective value; subject to this, one maximizes the second-smallest objective; and so on. Often, even the single-objective problem of maximizing the smallest value cannot be solved accurately. What can we hope to accomplish for leximin optimization in this situation? Recently, Henzinger et al. (2022) defined a notion of \emph{approximate} leximin optimality. Their definition, however, considers only an additive approximation.
In this work, we first define the notion of approximate leximin optimality, allowing both multiplicative and additive errors. We then show how to compute, in polynomial time, such an approximate leximin solution, using an oracle that finds an approximation to a single-objective problem. The approximation factors of the algorithms are closely related: an $(α,ε)$-approximation for the single-objective problem (where $α\in (0,1]$ and $ε\geq 0$ are the multiplicative and additive factors respectively) translates into an $\left(\frac{α^2}{1-α+ α^2}, \fracε{1-α+α^2}\right)$-approximation for the multi-objective leximin problem, regardless of the number of objectives.
△ Less
Submitted 28 September, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning
Authors:
Deborah Cohen,
Moonkyung Ryu,
Yinlam Chow,
Orgad Keller,
Ido Greenberg,
Avinatan Hassidim,
Michael Fink,
Yossi Matias,
Idan Szpektor,
Craig Boutilier,
Gal Elidan
Abstract:
Despite recent advances in natural language understanding and generation, and decades of research on the development of conversational bots, building automated agents that can carry on rich open-ended conversations with humans "in the wild" remains a formidable challenge. In this work we develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversa…
▽ More
Despite recent advances in natural language understanding and generation, and decades of research on the development of conversational bots, building automated agents that can carry on rich open-ended conversations with humans "in the wild" remains a formidable challenge. In this work we develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversational skill at scale. Our work pairs the succinct embedding of the conversation state generated using SOTA (supervised) language models with RL techniques that are particularly suited to a dynamic action space that changes as the conversation progresses. Trained using crowd-sourced data, our novel system is able to substantially exceeds the (strong) baseline supervised model with respect to several metrics of interest in a live experiment with real users of the Google Assistant.
△ Less
Submitted 25 July, 2022;
originally announced August 2022.
-
TRUE: Re-evaluating Factual Consistency Evaluation
Authors:
Or Honovich,
Roee Aharoni,
Jonathan Herzig,
Hagai Taitelbaum,
Doron Kukliansy,
Vered Cohen,
Thomas Scialom,
Idan Szpektor,
Avinatan Hassidim,
Yossi Matias
Abstract:
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability. Automatic factual consistency evaluation may help alleviate this limitation by accelerating evaluation cycles, filtering inconsistent outputs and augmenting training data. While attracting increasing attention, such evaluation metrics are usually developed and evalu…
▽ More
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability. Automatic factual consistency evaluation may help alleviate this limitation by accelerating evaluation cycles, filtering inconsistent outputs and augmenting training data. While attracting increasing attention, such evaluation metrics are usually developed and evaluated in silo for a single task or dataset, slowing their adoption. Moreover, previous meta-evaluation protocols focused on system-level correlations with human annotations, which leave the example-level accuracy of such metrics unclear. In this work, we introduce TRUE: a comprehensive survey and assessment of factual consistency metrics on a standardized collection of existing texts from diverse tasks, manually annotated for factual consistency. Our standardization enables an example-level meta-evaluation protocol that is more actionable and interpretable than previously reported correlations, yielding clearer quality measures. Across diverse state-of-the-art metrics and 11 datasets we find that large-scale NLI and question generation-and-answering-based approaches achieve strong and complementary results. We recommend those methods as a starting point for model and metric developers, and hope TRUE will foster progress towards even better evaluation methods.
△ Less
Submitted 3 May, 2022; v1 submitted 11 April, 2022;
originally announced April 2022.
-
Flood forecasting with machine learning models in an operational framework
Authors:
Sella Nevo,
Efrat Morin,
Adi Gerzi Rosenthal,
Asher Metzger,
Chen Barshai,
Dana Weitzner,
Dafi Voloshin,
Frederik Kratzert,
Gal Elidan,
Gideon Dror,
Gregory Begelman,
Grey Nearing,
Guy Shalev,
Hila Noga,
Ira Shavitt,
Liora Yuklea,
Moriah Royz,
Niv Giladi,
Nofar Peled Levi,
Ofir Reich,
Oren Gilon,
Ronnie Maor,
Shahar Timnat,
Tal Shechter,
Vladimir Anisimov
, et al. (6 additional authors not shown)
Abstract:
The operational flood forecasting system by Google was developed to provide accurate real-time flood warnings to agencies and the public, with a focus on riverine floods in large, gauged rivers. It became operational in 2018 and has since expanded geographically. This forecasting system consists of four subsystems: data validation, stage forecasting, inundation modeling, and alert distribution. Ma…
▽ More
The operational flood forecasting system by Google was developed to provide accurate real-time flood warnings to agencies and the public, with a focus on riverine floods in large, gauged rivers. It became operational in 2018 and has since expanded geographically. This forecasting system consists of four subsystems: data validation, stage forecasting, inundation modeling, and alert distribution. Machine learning is used for two of the subsystems. Stage forecasting is modeled with the Long Short-Term Memory (LSTM) networks and the Linear models. Flood inundation is computed with the Thresholding and the Manifold models, where the former computes inundation extent and the latter computes both inundation extent and depth. The Manifold model, presented here for the first time, provides a machine-learning alternative to hydraulic modeling of flood inundation. When evaluated on historical data, all models achieve sufficiently high-performance metrics for operational use. The LSTM showed higher skills than the Linear model, while the Thresholding and Manifold models achieved similar performance metrics for modeling inundation extent. During the 2021 monsoon season, the flood warning system was operational in India and Bangladesh, covering flood-prone regions around rivers with a total area of 287,000 km2, home to more than 350M people. More than 100M flood alerts were sent to affected populations, to relevant authorities, and to emergency organizations. Current and future work on the system includes extending coverage to additional flood-prone locations, as well as improving modeling capabilities and accuracy.
△ Less
Submitted 4 November, 2021;
originally announced November 2021.
-
Adversarial Robustness of Streaming Algorithms through Importance Sampling
Authors:
Vladimir Braverman,
Avinatan Hassidim,
Yossi Matias,
Mariano Schain,
Sandeep Silwal,
Samson Zhou
Abstract:
In this paper, we introduce adversarially robust streaming algorithms for central machine learning and algorithmic tasks, such as regression and clustering, as well as their more general counterparts, subspace embedding, low-rank approximation, and coreset construction. For regression and other numerical linear algebra related tasks, we consider the row arrival streaming model. Our results are bas…
▽ More
In this paper, we introduce adversarially robust streaming algorithms for central machine learning and algorithmic tasks, such as regression and clustering, as well as their more general counterparts, subspace embedding, low-rank approximation, and coreset construction. For regression and other numerical linear algebra related tasks, we consider the row arrival streaming model. Our results are based on a simple, but powerful, observation that many importance sampling-based algorithms give rise to adversarial robustness which is in contrast to sketching based algorithms, which are very prevalent in the streaming literature but suffer from adversarial attacks. In addition, we show that the well-known merge and reduce paradigm in streaming is adversarially robust. Since the merge and reduce paradigm allows coreset constructions in the streaming setting, we thus obtain robust algorithms for $k$-means, $k$-median, $k$-center, Bregman clustering, projective clustering, principal component analysis (PCA) and non-negative matrix factorization. To the best of our knowledge, these are the first adversarially robust results for these problems yet require no new algorithmic implementations. Finally, we empirically confirm the robustness of our algorithms on various adversarial attacks and demonstrate that by contrast, some common existing algorithms are not robust.
(Abstract shortened to meet arXiv limits)
△ Less
Submitted 25 October, 2021; v1 submitted 28 June, 2021;
originally announced June 2021.
-
Explaining in Style: Training a GAN to explain a classifier in StyleSpace
Authors:
Oran Lang,
Yossi Gandelsman,
Michal Yarom,
Yoav Wald,
Gal Elidan,
Avinatan Hassidim,
William T. Freeman,
Phillip Isola,
Amir Globerson,
Michal Irani,
Inbar Mosseri
Abstract:
Image classification models can depend on multiple different semantic attributes of the image. An explanation of the decision of the classifier needs to both discover and visualize these properties. Here we present StylEx, a method for doing this, by training a generative model to specifically explain multiple attributes that underlie classifier decisions. A natural source for such attributes is t…
▽ More
Image classification models can depend on multiple different semantic attributes of the image. An explanation of the decision of the classifier needs to both discover and visualize these properties. Here we present StylEx, a method for doing this, by training a generative model to specifically explain multiple attributes that underlie classifier decisions. A natural source for such attributes is the StyleSpace of StyleGAN, which is known to generate semantically meaningful dimensions in the image. However, because standard GAN training is not dependent on the classifier, it may not represent these attributes which are important for the classifier decision, and the dimensions of StyleSpace may represent irrelevant attributes. To overcome this, we propose a training procedure for a StyleGAN, which incorporates the classifier model, in order to learn a classifier-specific StyleSpace. Explanatory attributes are then selected from this space. These can be used to visualize the effect of changing multiple attributes per image, thus providing image-specific explanations. We apply StylEx to multiple domains, including animals, leaves, faces and retinal images. For these, we show how an image can be modified in different ways to change its classifier output. Our results show that the method finds attributes that align well with semantic ones, generate meaningful image-specific explanations, and are human-interpretable as measured in user-studies.
△ Less
Submitted 1 September, 2021; v1 submitted 27 April, 2021;
originally announced April 2021.
-
ML-based Flood Forecasting: Advances in Scale, Accuracy and Reach
Authors:
Sella Nevo,
Gal Elidan,
Avinatan Hassidim,
Guy Shalev,
Oren Gilon,
Grey Nearing,
Yossi Matias
Abstract:
Floods are among the most common and deadly natural disasters in the world, and flood warning systems have been shown to be effective in reducing harm. Yet the majority of the world's vulnerable population does not have access to reliable and actionable warning systems, due to core challenges in scalability, computational costs, and data availability. In this paper we present two components of flo…
▽ More
Floods are among the most common and deadly natural disasters in the world, and flood warning systems have been shown to be effective in reducing harm. Yet the majority of the world's vulnerable population does not have access to reliable and actionable warning systems, due to core challenges in scalability, computational costs, and data availability. In this paper we present two components of flood forecasting systems which were developed over the past year, providing access to these critical systems to 75 million people who didn't have this access before.
△ Less
Submitted 5 December, 2020; v1 submitted 29 November, 2020;
originally announced December 2020.
-
An Optimal Elimination Algorithm for Learning a Best Arm
Authors:
Avinatan Hassidim,
Ron Kupfer,
Yaron Singer
Abstract:
We consider the classic problem of $(ε,δ)$-PAC learning a best arm where the goal is to identify with confidence $1-δ$ an arm whose mean is an $ε$-approximation to that of the highest mean arm in a multi-armed bandit setting. This problem is one of the most fundamental problems in statistics and learning theory, yet somewhat surprisingly its worst-case sample complexity is not well understood. In…
▽ More
We consider the classic problem of $(ε,δ)$-PAC learning a best arm where the goal is to identify with confidence $1-δ$ an arm whose mean is an $ε$-approximation to that of the highest mean arm in a multi-armed bandit setting. This problem is one of the most fundamental problems in statistics and learning theory, yet somewhat surprisingly its worst-case sample complexity is not well understood. In this paper, we propose a new approach for $(ε,δ)$-PAC learning a best arm. This approach leads to an algorithm whose sample complexity converges to \emph{exactly} the optimal sample complexity of $(ε,δ)$-learning the mean of $n$ arms separately and we complement this result with a conditional matching lower bound. More specifically:
△ Less
Submitted 20 June, 2020;
originally announced June 2020.
-
Adversarially Robust Streaming Algorithms via Differential Privacy
Authors:
Avinatan Hassidim,
Haim Kaplan,
Yishay Mansour,
Yossi Matias,
Uri Stemmer
Abstract:
A streaming algorithm is said to be adversarially robust if its accuracy guarantees are maintained even when the data stream is chosen maliciously, by an adaptive adversary. We establish a connection between adversarial robustness of streaming algorithms and the notion of differential privacy. This connection allows us to design new adversarially robust streaming algorithms that outperform the cur…
▽ More
A streaming algorithm is said to be adversarially robust if its accuracy guarantees are maintained even when the data stream is chosen maliciously, by an adaptive adversary. We establish a connection between adversarial robustness of streaming algorithms and the notion of differential privacy. This connection allows us to design new adversarially robust streaming algorithms that outperform the current state-of-the-art constructions for many interesting regimes of parameters.
△ Less
Submitted 13 April, 2020;
originally announced April 2020.
-
Spectral Algorithm for Low-rank Multitask Regression
Authors:
Yotam Gigi,
Ami Wiesel,
Sella Nevo,
Gal Elidan,
Avinatan Hassidim,
Yossi Matias
Abstract:
Multitask learning, i.e. taking advantage of the relatedness of individual tasks in order to improve performance on all of them, is a core challenge in the field of machine learning. We focus on matrix regression tasks where the rank of the weight matrix is constrained to reduce sample complexity. We introduce the common mechanism regression (CMR) model which assumes a shared left low-rank compone…
▽ More
Multitask learning, i.e. taking advantage of the relatedness of individual tasks in order to improve performance on all of them, is a core challenge in the field of machine learning. We focus on matrix regression tasks where the rank of the weight matrix is constrained to reduce sample complexity. We introduce the common mechanism regression (CMR) model which assumes a shared left low-rank component across all tasks, but allows an individual per-task right low-rank component. This dramatically reduces the number of samples needed for accurate estimation. The problem of jointly recovering the common and the local components has a non-convex bi-linear structure. We overcome this hurdle and provide a provably beneficial non-iterative spectral algorithm. Appealingly, the solution has favorable behavior as a function of the number of related tasks and the small number of samples available for each one. We demonstrate the efficacy of our approach for the challenging task of remote river discharge estimation across multiple river sites, where data for each task is naturally scarce. In this scenario sharing a low-rank component between the tasks translates to a shared spectral reflection of the water, which is a true underlying physical model. We also show the benefit of the approach on the markedly different setting of image classification where the common component can be interpreted as the shared convolution filters.
△ Less
Submitted 27 October, 2019;
originally announced October 2019.
-
Personalizing ASR for Dysarthric and Accented Speech with Limited Data
Authors:
Joel Shor,
Dotan Emanuel,
Oran Lang,
Omry Tuval,
Michael Brenner,
Julie Cattiau,
Fernando Vieira,
Maeve McNally,
Taylor Charbonneau,
Melissa Nollstadt,
Avinatan Hassidim,
Yossi Matias
Abstract:
Automatic speech recognition (ASR) systems have dramatically improved over the last few years. ASR systems are most often trained from 'typical' speech, which means that underrepresented groups don't experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech:…
▽ More
Automatic speech recognition (ASR) systems have dramatically improved over the last few years. ASR systems are most often trained from 'typical' speech, which means that underrepresented groups don't experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech: speech from people with amyotrophic lateral sclerosis (ALS) and accented speech. We train personalized models that achieve 62% and 35% relative WER improvement on these two groups, bringing the absolute WER for ALS speakers, on a test set of message bank phrases, down to 10% for mild dysarthria and 20% for more serious dysarthria. We show that 71% of the improvement comes from only 5 minutes of training data. Finetuning a particular subset of layers (with many fewer parameters) often gives better results than finetuning the entire model. This is the first step towards building state of the art ASR models for dysarthric speech.
△ Less
Submitted 31 July, 2019;
originally announced July 2019.
-
Audio De-identification: A New Entity Recognition Task
Authors:
Ido Cohn,
Itay Laish,
Genady Beryozkin,
Gang Li,
Izhak Shafran,
Idan Szpektor,
Tzvika Hartman,
Avinatan Hassidim,
Yossi Matias
Abstract:
Named Entity Recognition (NER) has been mostly studied in the context of written text. Specifically, NER is an important step in de-identification (de-ID) of medical records, many of which are recorded conversations between a patient and a doctor. In such recordings, audio spans with personal information should be redacted, similar to the redaction of sensitive character spans in de-ID for written…
▽ More
Named Entity Recognition (NER) has been mostly studied in the context of written text. Specifically, NER is an important step in de-identification (de-ID) of medical records, many of which are recorded conversations between a patient and a doctor. In such recordings, audio spans with personal information should be redacted, similar to the redaction of sensitive character spans in de-ID for written text. The application of NER in the context of audio de-identification has yet to be fully investigated. To this end, we define the task of audio de-ID, in which audio spans with entity mentions should be detected. We then present our pipeline for this task, which involves Automatic Speech Recognition (ASR), NER on the transcript text, and text-to-audio alignment. Finally, we introduce a novel metric for audio de-ID and a new evaluation benchmark consisting of a large labeled segment of the Switchboard and Fisher audio datasets and detail our pipeline's results on it.
△ Less
Submitted 5 May, 2019; v1 submitted 17 March, 2019;
originally announced March 2019.
-
Learning to Screen
Authors:
Alon Cohen,
Avinatan Hassidim,
Haim Kaplan,
Yishay Mansour,
Shay Moran
Abstract:
Imagine a large firm with multiple departments that plans a large recruitment. Candidates arrive one-by-one, and for each candidate the firm decides, based on her data (CV, skills, experience, etc), whether to summon her for an interview. The firm wants to recruit the best candidates while minimizing the number of interviews. We model such scenarios as an assignment problem between items (candidat…
▽ More
Imagine a large firm with multiple departments that plans a large recruitment. Candidates arrive one-by-one, and for each candidate the firm decides, based on her data (CV, skills, experience, etc), whether to summon her for an interview. The firm wants to recruit the best candidates while minimizing the number of interviews. We model such scenarios as an assignment problem between items (candidates) and categories (departments): the items arrive one-by-one in an online manner, and upon processing each item the algorithm decides, based on its value and the categories it can be matched with, whether to retain or discard it (this decision is irrevocable). The goal is to retain as few items as possible while guaranteeing that the set of retained items contains an optimal matching.
We consider two variants of this problem: (i) in the first variant it is assumed that the $n$ items are drawn independently from an unknown distribution $D$. (ii) In the second variant it is assumed that before the process starts, the algorithm has an access to a training set of $n$ items drawn independently from the same unknown distribution (e.g.\ data of candidates from previous recruitment seasons). We give tight bounds on the minimum possible number of retained items in each of these variants. These results demonstrate that one can retain exponentially less items in the second variant (with the training set).
△ Less
Submitted 31 May, 2019; v1 submitted 13 February, 2019;
originally announced February 2019.
-
ML for Flood Forecasting at Scale
Authors:
Sella Nevo,
Vova Anisimov,
Gal Elidan,
Ran El-Yaniv,
Pete Giencke,
Yotam Gigi,
Avinatan Hassidim,
Zach Moshe,
Mor Schlesinger,
Guy Shalev,
Ajai Tirumali,
Ami Wiesel,
Oleg Zlydenko,
Yossi Matias
Abstract:
Effective riverine flood forecasting at scale is hindered by a multitude of factors, most notably the need to rely on human calibration in current methodology, the limited amount of data for a specific location, and the computational difficulty of building continent/global level models that are sufficiently accurate. Machine learning (ML) is primed to be useful in this scenario: learned models oft…
▽ More
Effective riverine flood forecasting at scale is hindered by a multitude of factors, most notably the need to rely on human calibration in current methodology, the limited amount of data for a specific location, and the computational difficulty of building continent/global level models that are sufficiently accurate. Machine learning (ML) is primed to be useful in this scenario: learned models often surpass human experts in complex high-dimensional scenarios, and the framework of transfer or multitask learning is an appealing solution for leveraging local signals to achieve improved global performance. We propose to build on these strengths and develop ML systems for timely and accurate riverine flood prediction.
△ Less
Submitted 28 January, 2019;
originally announced January 2019.
-
Towards Global Remote Discharge Estimation: Using the Few to Estimate The Many
Authors:
Yotam Gigi,
Gal Elidan,
Avinatan Hassidim,
Yossi Matias,
Zach Moshe,
Sella Nevo,
Guy Shalev,
Ami Wiesel
Abstract:
Learning hydrologic models for accurate riverine flood prediction at scale is a challenge of great importance. One of the key difficulties is the need to rely on in-situ river discharge measurements, which can be quite scarce and unreliable, particularly in regions where floods cause the most damage every year. Accordingly, in this work we tackle the problem of river discharge estimation at differ…
▽ More
Learning hydrologic models for accurate riverine flood prediction at scale is a challenge of great importance. One of the key difficulties is the need to rely on in-situ river discharge measurements, which can be quite scarce and unreliable, particularly in regions where floods cause the most damage every year. Accordingly, in this work we tackle the problem of river discharge estimation at different river locations. A core characteristic of the data at hand (e.g. satellite measurements) is that we have few measurements for many locations, all sharing the same physics that underlie the water discharge. We capture this scenario in a simple but powerful common mechanism regression (CMR) model with a local component as well as a shared one which captures the global discharge mechanism. The resulting learning objective is non-convex, but we show that we can find its global optimum by leveraging the power of joining local measurements across sites. In particular, using a spectral initialization with provable near-optimal accuracy, we can find the optimum using standard descent methods. We demonstrate the efficacy of our approach for the problem of discharge estimation using simulations.
△ Less
Submitted 3 January, 2019;
originally announced January 2019.
-
Online Linear Quadratic Control
Authors:
Alon Cohen,
Avinatan Hassidim,
Tomer Koren,
Nevena Lazic,
Yishay Mansour,
Kunal Talwar
Abstract:
We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Cruci…
▽ More
We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to "strongly stable" policies that mix exponentially fast to a steady state.
△ Less
Submitted 19 June, 2018;
originally announced June 2018.
-
Planning and Learning with Stochastic Action Sets
Authors:
Craig Boutilier,
Alon Cohen,
Amit Daniely,
Avinatan Hassidim,
Yishay Mansour,
Ofer Meshi,
Martin Mladenov,
Dale Schuurmans
Abstract:
In many practical uses of reinforcement learning (RL) the set of actions available at a given state is a random variable, with realizations governed by an exogenous stochastic process. Somewhat surprisingly, the foundations for such sequential decision processes have been unaddressed. In this work, we formalize and investigate MDPs with stochastic action sets (SAS-MDPs) to provide these foundation…
▽ More
In many practical uses of reinforcement learning (RL) the set of actions available at a given state is a random variable, with realizations governed by an exogenous stochastic process. Somewhat surprisingly, the foundations for such sequential decision processes have been unaddressed. In this work, we formalize and investigate MDPs with stochastic action sets (SAS-MDPs) to provide these foundations. We show that optimal policies and value functions in this model have a structure that admits a compact representation. From an RL perspective, we show that Q-learning with sampled action sets is sound. In model-based settings, we consider two important special cases: when individual actions are available with independent probabilities; and a sampling-based model for unknown distributions. We develop poly-time value and policy iteration methods for both cases; and in the first, we offer a poly-time linear programming solution.
△ Less
Submitted 12 February, 2021; v1 submitted 7 May, 2018;
originally announced May 2018.
-
Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation
Authors:
Ariel Ephrat,
Inbar Mosseri,
Oran Lang,
Tali Dekel,
Kevin Wilson,
Avinatan Hassidim,
William T. Freeman,
Michael Rubinstein
Abstract:
We present a joint audio-visual model for isolating a single speech signal from a mixture of sounds such as other speakers and background noise. Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video. In this paper, we present a deep network-based model that incorporates both visual and aud…
▽ More
We present a joint audio-visual model for isolating a single speech signal from a mixture of sounds such as other speakers and background noise. Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video. In this paper, we present a deep network-based model that incorporates both visual and auditory signals to solve this task. The visual features are used to "focus" the audio on desired speakers in a scene and to improve the speech separation quality. To train our joint audio-visual model, we introduce AVSpeech, a new dataset comprised of thousands of hours of video segments from the Web. We demonstrate the applicability of our method to classic speech separation tasks, as well as real-world scenarios involving heated interviews, noisy bars, and screaming children, only requiring the user to specify the face of the person in the video whose speech they want to isolate. Our method shows clear advantage over state-of-the-art audio-only speech separation in cases of mixed speech. In addition, our model, which is speaker-independent (trained once, applicable to any speaker), produces better results than recent audio-visual speech separation methods that are speaker-dependent (require training a separate model for each speaker of interest).
△ Less
Submitted 9 August, 2018; v1 submitted 10 April, 2018;
originally announced April 2018.
-
Self-Similar Epochs: Value in Arrangement
Authors:
Eliav Buchnik,
Edith Cohen,
Avinatan Hassidim,
Yossi Matias
Abstract:
Optimization of machine learning models is commonly performed through stochastic gradient updates on randomly ordered training examples. This practice means that sub-epochs comprise of independent random samples of the training data that may not preserve informative structure present in the full data. We hypothesize that the training can be more effective with {\em self-similar} arrangements that…
▽ More
Optimization of machine learning models is commonly performed through stochastic gradient updates on randomly ordered training examples. This practice means that sub-epochs comprise of independent random samples of the training data that may not preserve informative structure present in the full data. We hypothesize that the training can be more effective with {\em self-similar} arrangements that potentially allow each epoch to provide benefits of multiple ones. We study this for "matrix factorization" -- the common task of learning metric embeddings of entities such as queries, videos, or words from example pairwise associations. We construct arrangements that preserve the weighted Jaccard similarities of rows and columns and experimentally observe training acceleration of 3\%-37\% on synthetic and recommendation datasets. Principled arrangements of training examples emerge as a novel and potentially powerful enhancement to SGD that merits further exploration.
△ Less
Submitted 18 June, 2019; v1 submitted 14 March, 2018;
originally announced March 2018.
-
MUDA: A Truthful Multi-Unit Double-Auction Mechanism
Authors:
Erel Segal-Halevi,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
In a seminal paper, McAfee (1992) presented a truthful mechanism for double auctions, attaining asymptotically-optimal gain-from-trade without any prior information on the valuations of the traders. McAfee's mechanism handles single-parametric agents, allowing each seller to sell a single unit and each buyer to buy a single unit. This paper presents a double-auction mechanism that handles multi-pa…
▽ More
In a seminal paper, McAfee (1992) presented a truthful mechanism for double auctions, attaining asymptotically-optimal gain-from-trade without any prior information on the valuations of the traders. McAfee's mechanism handles single-parametric agents, allowing each seller to sell a single unit and each buyer to buy a single unit. This paper presents a double-auction mechanism that handles multi-parametric agents and allows multiple units per trader, as long as the valuation functions of all traders have decreasing marginal returns. The mechanism is prior-free, ex-post individually-rational, dominant-strategy truthful and strongly-budget-balanced. Its gain-from-trade approaches the optimum when the market size is sufficiently large.
△ Less
Submitted 19 December, 2017;
originally announced December 2017.
-
Truthful Bilateral Trade is Impossible even with Fixed Prices
Authors:
Erel Segal-Halevi,
Avinatan Hassidim
Abstract:
A seminal theorem of Myerson and Satterthwaite (1983) proves that, in a game of bilateral trade between a single buyer and a single seller, no mechanism can be simultaneously individually-rational, budget-balanced, incentive-compatible and socially-efficient. However, the impossibility disappears if the price is fixed exogenously and the social-efficiency goal is subject to individual-rationality…
▽ More
A seminal theorem of Myerson and Satterthwaite (1983) proves that, in a game of bilateral trade between a single buyer and a single seller, no mechanism can be simultaneously individually-rational, budget-balanced, incentive-compatible and socially-efficient. However, the impossibility disappears if the price is fixed exogenously and the social-efficiency goal is subject to individual-rationality at the given price.
We show that the impossibility comes back if there are multiple units of the same good, or multiple types of goods, even when the prices are fixed exogenously. Particularly, if there are $M$ units of the same good or $M$ kinds of goods, for some $M\geq 2$, then no truthful mechanism can guarantee more than $1/M$ of the optimal gain-from-trade. In the single-good multi-unit case, if both agents have submodular valuations (decreasing marginal returns), then no truthful mechanism can guarantee more than $1/H_M$ of the optimal gain-from-trade, where $H_M$ is the $M$-th harmonic number ($H_M\approx \ln{M}+1/2$). All upper bounds are tight.
△ Less
Submitted 19 November, 2017;
originally announced November 2017.
-
New Approximations for Coalitional Manipulation in General Scoring Rules
Authors:
Orgad Keller,
Avinatan Hassidim,
Noam Hazon
Abstract:
We study the problem of coalitional manipulation---where $k$ manipulators try to manipulate an election on $m$ candidates---under general scoring rules, with a focus on the Borda protocol. We do so both in the weighted and unweighted settings. We focus on minimizing the maximum score obtainable by a non-preferred candidate.
In the strongest, most general setting, we provide an algorithm for any…
▽ More
We study the problem of coalitional manipulation---where $k$ manipulators try to manipulate an election on $m$ candidates---under general scoring rules, with a focus on the Borda protocol. We do so both in the weighted and unweighted settings. We focus on minimizing the maximum score obtainable by a non-preferred candidate.
In the strongest, most general setting, we provide an algorithm for any scoring rule as described by a vector $\vecα=(α_1,\ldots,α_m)$: for some $β=O(\sqrt{m\log m})$, it obtains an additive approximation equal to $W\cdot \max_i \lvert α_{i+β}-α_i \rvert$, where $W$ is the sum of voter weights.
For Borda, both the weighted and unweighted variants are known to be $NP$-hard. For the unweighted case, our simpler algorithm provides a randomized, additive $O(k \sqrt{m \log m} )$ approximation; in other words, if there exists a strategy enabling the preferred candidate to win by an $Ω(k \sqrt{m \log m} )$ margin, our method, with high probability, will find a strategy enabling her to win (albeit with a possibly smaller margin). It thus provides a somewhat stronger guarantee compared to the previous methods, which implicitly implied a strategy that provides an $Ω(m)$-additive approximation to the maximum score of a non-preferred candidate.
For the weighted case, our generalized algorithm provides an $O(W \sqrt{m \log m} )$-additive approximation, where $W$ is the sum of voter weights. This is a clear advantage over previous methods: some of them do not generalize to the weighted case, while others---which approximate the number of manipulators---pose restrictions on the weights of extra manipulators added.
Our methods are based on carefully rounding an exponentially-large configuration linear program that is solved by using the ellipsoid method with an efficient separation oracle.
△ Less
Submitted 16 August, 2017;
originally announced August 2017.
-
Fair Allocation based on Diminishing Differences
Authors:
Erel Segal-Halevi,
Haris Aziz,
Avinatan Hassidim
Abstract:
Ranking alternatives is a natural way for humans to explain their preferences. It is being used in many settings, such as school choice, course allocations and residency matches. In some cases, several `items' are given to each participant. Without having any information on the underlying cardinal utilities, arguing about fairness of allocation requires extending the ordinal item ranking to ordina…
▽ More
Ranking alternatives is a natural way for humans to explain their preferences. It is being used in many settings, such as school choice, course allocations and residency matches. In some cases, several `items' are given to each participant. Without having any information on the underlying cardinal utilities, arguing about fairness of allocation requires extending the ordinal item ranking to ordinal bundle ranking. The most commonly used such extension is stochastic dominance (SD), where a bundle X is preferred over a bundle Y if its score is better according to all additive score functions. SD is a very conservative extension, by which few allocations are necessarily fair while many allocations are possibly fair. We propose to make a natural assumption on the underlying cardinal utilities of the players, namely that the difference between two items at the top is larger than the difference between two items at the bottom. This assumption implies a preference extension which we call diminishing differences (DD), where X is preferred over Y if its score is better according to all additive score functions satisfying the DD assumption. We give a full characterization of allocations that are necessarily-proportional or possibly-proportional according to this assumption. Based on this characterization, we present a polynomial-time algorithm for finding a necessarily-DD-proportional allocation if it exists. Using simulations, we show that with high probability, a necessarily-proportional allocation does not exist but a necessarily-DD-proportional allocation exists, and moreover, that allocation is proportional according to the underlying cardinal utilities. We also consider chore allocation under the analogous condition --- increasing-differences.
△ Less
Submitted 1 August, 2019; v1 submitted 22 May, 2017;
originally announced May 2017.
-
Envy-Free Division of Land
Authors:
Erel Segal-Halevi,
Shmuel Nitzan,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
Classic cake-cutting algorithms enable people with different preferences to divide among them a heterogeneous resource (``cake''), such that the resulting division is fair according to each agent's individual preferences. However, these algorithms either ignore the geometry of the resource altogether, or assume it is one-dimensional. In practice, it is often required to divide multi-dimensional re…
▽ More
Classic cake-cutting algorithms enable people with different preferences to divide among them a heterogeneous resource (``cake''), such that the resulting division is fair according to each agent's individual preferences. However, these algorithms either ignore the geometry of the resource altogether, or assume it is one-dimensional. In practice, it is often required to divide multi-dimensional resources, such as land-estates or advertisement spaces in print or electronic media. In such cases, the geometric shape of the allotted piece is of crucial importance. For example, when building houses or designing advertisements, in order to be useful, the allotments should be squares or rectangles with bounded aspect-ratio. We thus introduce the problem of fair land division --- fair division of a multi-dimensional resource wherein the allocated piece must have a pre-specified geometric shape. We present constructive division algorithms that satisfy the two most prominent fairness criteria, namely envy-freeness and proportionality. In settings where proportionality cannot be achieved due to the geometric constraints, our algorithms provide a partially-proportional division, guaranteeing that the fraction allocated to each agent be at least a certain positive constant. We prove that in many natural settings the envy-freeness requirement is compatible with the best attainable partial-proportionality.
△ Less
Submitted 9 March, 2019; v1 submitted 13 September, 2016;
originally announced September 2016.
-
SBBA: a Strongly-Budget-Balanced Double-Auction Mechanism
Authors:
Erel Segal-Halevi,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
In a seminal paper, McAfee (1992) presented the first dominant strategy truthful mechanism for double auction. His mechanism attains nearly optimal gain-from-trade when the market is sufficiently large. However, his mechanism may leave money on the table, since the price paid by the buyers may be higher than the price paid to the sellers. This money is included in the gain-from-trade and in some c…
▽ More
In a seminal paper, McAfee (1992) presented the first dominant strategy truthful mechanism for double auction. His mechanism attains nearly optimal gain-from-trade when the market is sufficiently large. However, his mechanism may leave money on the table, since the price paid by the buyers may be higher than the price paid to the sellers. This money is included in the gain-from-trade and in some cases it accounts for almost all the gain-from-trade, leaving almost no gain-from-trade to the traders. We present SBBA: a variant of McAfee's mechanism which is strongly budget-balanced. There is a single price, all money is exchanged between buyers and sellers and no money is left on the table. This means that all gain-from-trade is enjoyed by the traders. We generalize this variant to spatially-distributed markets with transit costs.
△ Less
Submitted 18 July, 2016;
originally announced July 2016.
-
Demand-Flow of Agents with Gross-Substitute Valuations
Authors:
Erel Segal-Halevi,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
We consider the class of valuations on indivisible items called gross-substitute (GS). This class was introduced by Kelso and Crawford (1982) and is widely used in studies of markets with indivisibilities. GS is a condition on the demand-flow in a specific scenario: some items become more expensive while other items retain their price. We prove that GS implies a much stronger condition, describing…
▽ More
We consider the class of valuations on indivisible items called gross-substitute (GS). This class was introduced by Kelso and Crawford (1982) and is widely used in studies of markets with indivisibilities. GS is a condition on the demand-flow in a specific scenario: some items become more expensive while other items retain their price. We prove that GS implies a much stronger condition, describing the demand-flow in the general scenario in which all prices may change. We prove that the demand of GS agents always flows (weakly) downwards, i.e, from items with higher price-increase to items with lower price-increase. We show that this property is equivalent to GS and is not true when there are complementarities.
△ Less
Submitted 1 October, 2016; v1 submitted 7 July, 2016;
originally announced July 2016.
-
Double Auctions in Markets for Multiple Kinds of Goods
Authors:
Erel Segal-Halevi,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
Motivated by applications such as stock exchanges and spectrum auctions, there is a growing interest in mechanisms for arranging trade in two-sided markets. Existing mechanisms are either not truthful, or do not guarantee an asymptotically-optimal gain-from-trade, or rely on a prior on the traders' valuations, or operate in limited settings such as a single kind of good. We extend the random marke…
▽ More
Motivated by applications such as stock exchanges and spectrum auctions, there is a growing interest in mechanisms for arranging trade in two-sided markets. Existing mechanisms are either not truthful, or do not guarantee an asymptotically-optimal gain-from-trade, or rely on a prior on the traders' valuations, or operate in limited settings such as a single kind of good. We extend the random market-halving technique used in earlier works to markets with multiple kinds of goods, where traders have gross-substitute valuations. We present MIDA: a Multi Item-kind Double-Auction mechanism. It is prior-free, truthful, strongly-budget-balanced, and guarantees near-optimal gain from trade when market sizes of all goods grow to $\infty$ at a similar rate.
△ Less
Submitted 1 May, 2018; v1 submitted 21 April, 2016;
originally announced April 2016.
-
Submodular Optimization under Noise
Authors:
Avinatan Hassidim,
Yaron Singer
Abstract:
We consider the problem of maximizing a monotone submodular function under noise. There has been a great deal of work on optimization of submodular functions under various constraints, resulting in algorithms that provide desirable approximation guarantees. In many applications, however, we do not have access to the submodular function we aim to optimize, but rather to some erroneous or noisy vers…
▽ More
We consider the problem of maximizing a monotone submodular function under noise. There has been a great deal of work on optimization of submodular functions under various constraints, resulting in algorithms that provide desirable approximation guarantees. In many applications, however, we do not have access to the submodular function we aim to optimize, but rather to some erroneous or noisy version of it. This raises the question of whether provable guarantees are obtainable in presence of error and noise. We provide initial answers, by focusing on the question of maximizing a monotone submodular function under a cardinality constraint when given access to a noisy oracle of the function. We show that:
- For a cardinality constraint $k \geq 2$, there is an approximation algorithm whose approximation ratio is arbitrarily close to $1-1/e$;
- For $k=1$ there is an algorithm whose approximation ratio is arbitrarily close to $1/2$. No randomized algorithm can obtain an approximation ratio better than $1/2+o(1)$;
-If the noise is adversarial, no non-trivial approximation guarantee can be obtained.
△ Less
Submitted 4 November, 2016; v1 submitted 12 January, 2016;
originally announced January 2016.
-
Waste Makes Haste: Bounded Time Protocols for Envy-Free Cake Cutting with Free Disposal
Authors:
Erel Segal-Halevi,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
We consider the classic problem of envy-free division of a heterogeneous good ("cake") among several agents. It is known that, when the allotted pieces must be connected, the problem cannot be solved by a finite algorithm for 3 or more agents. The impossibility result, however, assumes that the entire cake must be allocated. In this paper we replace the entire-allocation requirement with a weaker…
▽ More
We consider the classic problem of envy-free division of a heterogeneous good ("cake") among several agents. It is known that, when the allotted pieces must be connected, the problem cannot be solved by a finite algorithm for 3 or more agents. The impossibility result, however, assumes that the entire cake must be allocated. In this paper we replace the entire-allocation requirement with a weaker \emph{partial-proportionality} requirement: the piece given to each agent must be worth for it at least a certain positive fraction of the entire cake value. We prove that this version of the problem is solvable in bounded time even when the pieces must be connected. We present simple, bounded-time envy-free cake-cutting algorithms for: (1) giving each of $n$ agents a connected piece with a positive value; (2) giving each of 3 agents a connected piece worth at least 1/3; (3) giving each of 4 agents a connected piece worth at least 1/7; (4) giving each of 4 agents a disconnected piece worth at least 1/4; (5) giving each of $n$ agents a disconnected piece worth at least $(1-ε)/n$ for any positive $ε$.
△ Less
Submitted 9 February, 2018; v1 submitted 9 November, 2015;
originally announced November 2015.
-
Fair and Square: Cake-Cutting in Two Dimensions
Authors:
Erel Segal-Halevi,
Shmuel Nitzan,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
We consider the classic problem of fairly dividing a heterogeneous good ("cake") among several agents with different valuations. Classic cake-cutting procedures either allocate each agent a collection of disconnected pieces, or assume that the cake is a one-dimensional interval. In practice, however, the two-dimensional shape of the allotted pieces is important. In particular, when building a hous…
▽ More
We consider the classic problem of fairly dividing a heterogeneous good ("cake") among several agents with different valuations. Classic cake-cutting procedures either allocate each agent a collection of disconnected pieces, or assume that the cake is a one-dimensional interval. In practice, however, the two-dimensional shape of the allotted pieces is important. In particular, when building a house or designing an advertisement in printed or electronic media, squares are more usable than long and narrow rectangles. We thus introduce and study the problem of fair two-dimensional division wherein the allotted pieces must be of some restricted two-dimensional geometric shape(s), particularly squares and fat rectangles. Adding such geometric constraints re-opens most questions and challenges related to cake-cutting. Indeed, even the most elementary fairness criterion --- proportionality --- can no longer be guaranteed. In this paper we thus examine the level of proportionality that can be guaranteed, providing both impossibility results and constructive division procedures.
△ Less
Submitted 30 January, 2018; v1 submitted 12 October, 2015;
originally announced October 2015.
-
Optimistic-Conservative Bidding in Sequential Auctions
Authors:
Avinatan Hassidim,
Yishay Mansour
Abstract:
In this work we consider selling items using a sequential first price auction mechanism. We generalize the assumption of conservative bidding to extensive form games (henceforth optimistic conservative bidding), and show that for both linear and unit demand valuations, the only pure subgame perfect equilibrium where buyers are bidding in an optimistic conservative manner is the minimal Walrasian e…
▽ More
In this work we consider selling items using a sequential first price auction mechanism. We generalize the assumption of conservative bidding to extensive form games (henceforth optimistic conservative bidding), and show that for both linear and unit demand valuations, the only pure subgame perfect equilibrium where buyers are bidding in an optimistic conservative manner is the minimal Walrasian equilibrium.
In addition, we show examples where without the requirement of conservative bidding, subgame perfect equilibria can admit a variety of unlikely predictions, including high price of anarchy and low revenue in markets composed of additive bidders, equilibria which elicit all the surplus as revenue, and more. We also show that the order in which the items are sold can influence the outcome.
△ Less
Submitted 30 January, 2015;
originally announced January 2015.
-
Sorting and Selection with Imprecise Comparisons
Authors:
Miklos Ajtai,
Vitaly Feldman,
Avinatan Hassidim,
Jelani Nelson
Abstract:
We consider a simple model of imprecise comparisons: there exists some $δ>0$ such that when a subject is given two elements to compare, if the values of those elements (as perceived by the subject) differ by at least $δ$, then the comparison will be made correctly; when the two elements have values that are within $δ$, the outcome of the comparison is unpredictable. This model is inspired by both…
▽ More
We consider a simple model of imprecise comparisons: there exists some $δ>0$ such that when a subject is given two elements to compare, if the values of those elements (as perceived by the subject) differ by at least $δ$, then the comparison will be made correctly; when the two elements have values that are within $δ$, the outcome of the comparison is unpredictable. This model is inspired by both imprecision in human judgment of values and also by bounded but potentially adversarial errors in the outcomes of sporting tournaments.
Our model is closely related to a number of models commonly considered in the psychophysics literature where $δ$ corresponds to the {\em just noticeable difference unit (JND)} or {\em difference threshold}. In experimental psychology, the method of paired comparisons was proposed as a means for ranking preferences amongst $n$ elements of a human subject. The method requires performing all $\binom{n}{2}$ comparisons, then sorting elements according to the number of wins. The large number of comparisons is performed to counter the potentially faulty decision-making of the human subject, who acts as an imprecise comparator.
We show that in our model the method of paired comparisons has optimal accuracy, minimizing the errors introduced by the imprecise comparisons. However, it is also wasteful, as it requires all $\binom{n}{2}$. We show that the same optimal guarantees can be achieved using $4 n^{3/2}$ comparisons, and we prove the optimality of our method. We then explore the general tradeoff between the guarantees on the error that can be made and number of comparisons for the problems of sorting, max-finding, and selection. Our results provide strong lower bounds and close-to-optimal solutions for each of these problems.
△ Less
Submitted 13 January, 2015;
originally announced January 2015.
-
Fair and Square: Cake-cutting in Two Dimensions
Authors:
Erel Segal-Halevi,
Avinatan Hassidim,
Yonatan Aumann
Abstract:
We consider the problem of fairly dividing a two dimensional heterogeneous good among multiple players. Applications include division of land as well as ad space in print and electronic media. Classical cake cutting protocols primarily consider a one-dimensional resource, or allocate each player multiple infinitesimally small "pieces". In practice, however, the two dimensional \emph{shape} of the…
▽ More
We consider the problem of fairly dividing a two dimensional heterogeneous good among multiple players. Applications include division of land as well as ad space in print and electronic media. Classical cake cutting protocols primarily consider a one-dimensional resource, or allocate each player multiple infinitesimally small "pieces". In practice, however, the two dimensional \emph{shape} of the allotted piece is of crucial importance in many applications (e.g. squares or bounded aspect-ratio rectangles are most useful for building houses, as well as advertisements). We thus introduce and study the problem of fair two-dimensional division wherein the allotted plots must be of some restricted two-dimensional geometric shape(s). Adding this geometric constraint re-opens most questions and challenges related to cake-cutting. Indeed, even the elementary \emph{proportionality} fairness criteria can no longer be guaranteed in all cases. In this paper we thus examine the \emph{level} of proportionality that \emph{can} be guaranteed, providing both impossibility results (for proportionality that cannot be guaranteed), and algorithmic constructions (for proportionality that can be guaranteed). We focus primarily on the case when the cake is a rectilinear polygon and the allotted plots must be squares or bounded aspect-ratio rectangles.
△ Less
Submitted 26 November, 2019; v1 submitted 16 September, 2014;
originally announced September 2014.
-
An Approximate "Law of One Price" in Random Assignment Games
Authors:
Avinatan Hassidim,
Assaf Romm
Abstract:
Assignment games represent a tractable yet versatile model of two-sided markets with transfers. We study the likely properties of the core of randomly generated assignment games. If the joint productivities of every firm and worker are i.i.d bounded random variables, then with high probability all workers are paid roughly equal wages, and all firms make similar profits. This implies that core allo…
▽ More
Assignment games represent a tractable yet versatile model of two-sided markets with transfers. We study the likely properties of the core of randomly generated assignment games. If the joint productivities of every firm and worker are i.i.d bounded random variables, then with high probability all workers are paid roughly equal wages, and all firms make similar profits. This implies that core allocations vary significantly in balanced markets, but that there is core convergence in even slightly unbalanced markets. For the benchmark case of uniform distribution, we provide a tight bound for the workers' share of the surplus under the firm-optimal core allocation. We present simulation results suggesting that the phenomena analyzed appear even in medium-sized markets. Finally, we briefly discuss the effects of unbounded distributions and the ways in which they may affect wage dispersion.
△ Less
Submitted 24 April, 2014;
originally announced April 2014.
-
Local computation mechanism design
Authors:
Avinatan Hassidim,
Yishay Mansour,
Shai Vardi
Abstract:
We introduce the notion of Local Computation Mechanism Design - designing game theoretic mechanisms which run in polylogarithmic time and space. Local computation mechanisms reply to each query in polylogarithmic time and space, and the replies to different queries are consistent with the same global feasible solution. In addition, the computation of the payments is also done in polylogarithmic ti…
▽ More
We introduce the notion of Local Computation Mechanism Design - designing game theoretic mechanisms which run in polylogarithmic time and space. Local computation mechanisms reply to each query in polylogarithmic time and space, and the replies to different queries are consistent with the same global feasible solution. In addition, the computation of the payments is also done in polylogarithmic time and space. Furthermore, the mechanisms need to maintain incentive compatibility with respect to the allocation and payments.
We present local computation mechanisms for a variety of classical game-theoretical problems: 1. stable matching, 2. job scheduling, 3. combinatorial auctions for unit-demand and k-minded bidders, and 4. the housing allocation problem.
For stable matching, some of our techniques may have general implications. Specifically, we show that when the men's preference lists are bounded, we can achieve an arbitrarily good approximation to the stable matching within a fixed number of iterations of the Gale-Shapley algorithm.
△ Less
Submitted 9 June, 2014; v1 submitted 15 November, 2013;
originally announced November 2013.
-
Finding the Minimum-Weight k-Path
Authors:
Avinatan Hassidim,
Orgad Keller,
Moshe Lewenstein,
Liam Roditty
Abstract:
Given a weighted $n$-vertex graph $G$ with integer edge-weights taken from a range $[-M,M]$, we show that the minimum-weight simple path visiting $k$ vertices can be found in time $\tilde{O}(2^k \poly(k) M n^ω) = O^*(2^k M)$. If the weights are reals in $[1,M]$, we provide a $(1+\varepsilon)$-approximation which has a running time of $\tilde{O}(2^k \poly(k) n^ω(\log\log M + 1/\varepsilon))$. For t…
▽ More
Given a weighted $n$-vertex graph $G$ with integer edge-weights taken from a range $[-M,M]$, we show that the minimum-weight simple path visiting $k$ vertices can be found in time $\tilde{O}(2^k \poly(k) M n^ω) = O^*(2^k M)$. If the weights are reals in $[1,M]$, we provide a $(1+\varepsilon)$-approximation which has a running time of $\tilde{O}(2^k \poly(k) n^ω(\log\log M + 1/\varepsilon))$. For the more general problem of $k$-tree, in which we wish to find a minimum-weight copy of a $k$-node tree $T$ in a given weighted graph $G$, under the same restrictions on edge weights respectively, we give an exact solution of running time $\tilde{O}(2^k \poly(k) M n^3) $ and a $(1+\varepsilon)$-approximate solution of running time $\tilde{O}(2^k \poly(k) n^3(\log\log M + 1/\varepsilon))$. All of the above algorithms are randomized with a polynomially-small error probability.
△ Less
Submitted 9 July, 2013;
originally announced July 2013.
-
Probe Scheduling for Efficient Detection of Silent Failures
Authors:
Edith Cohen,
Avinatan Hassidim,
Haim Kaplan,
Yishay Mansour,
Danny Raz,
Yoav Tzur
Abstract:
Most discovery systems for silent failures work in two phases: a continuous monitoring phase that detects presence of failures through probe packets and a localization phase that pinpoints the faulty element(s). This separation is important because localization requires significantly more resources than detection and should be initiated only when a fault is present.
We focus on improving the eff…
▽ More
Most discovery systems for silent failures work in two phases: a continuous monitoring phase that detects presence of failures through probe packets and a localization phase that pinpoints the faulty element(s). This separation is important because localization requires significantly more resources than detection and should be initiated only when a fault is present.
We focus on improving the efficiency of the detection phase, where the goal is to balance the overhead with the cost associated with longer failure detection times. We formulate a general model which unifies the treatment of probe scheduling mechanisms, stochastic or deterministic, and different cost objectives - minimizing average detection time (SUM) or worst-case detection time (MAX).
We then focus on two classes of schedules. {\em Memoryless schedules} -- a subclass of stochastic schedules which is simple and suitable for distributed deployment. We show that the optimal memorlyess schedulers can be efficiently computed by convex programs (for SUM objectives) or linear programs (for MAX objectives), and surprisingly perhaps, are guaranteed to have expected detection times that are not too far off the (NP hard) stochastic optima. {\em Deterministic schedules} allow us to bound the maximum (rather than expected) cost of undetected faults, but like stochastic schedules, are NP hard to optimize. We develop novel efficient deterministic schedulers with provable approximation ratios.
An extensive simulation study on real networks, demonstrates significant performance gains of our memoryless and deterministic schedulers over previous approaches. Our unified treatment also facilitates a clear comparison between different objectives and scheduling mechanisms.
△ Less
Submitted 19 June, 2014; v1 submitted 4 February, 2013;
originally announced February 2013.
-
Joint Cache Partition and Job Assignment on Multi-Core Processors
Authors:
Avinatan Hassidim,
Haim Kaplan,
Omry Tuval
Abstract:
Multicore shared cache processors pose a challenge for designers of embedded systems who try to achieve minimal and predictable execution time of workloads consisting of several jobs. To address this challenge the cache is statically partitioned among the cores and the jobs are assigned to the cores so as to minimize the makespan. Several heuristic algorithms have been proposed that jointly decide…
▽ More
Multicore shared cache processors pose a challenge for designers of embedded systems who try to achieve minimal and predictable execution time of workloads consisting of several jobs. To address this challenge the cache is statically partitioned among the cores and the jobs are assigned to the cores so as to minimize the makespan. Several heuristic algorithms have been proposed that jointly decide how to partition the cache among the cores and assign the jobs. We initiate a theoretical study of this problem which we call the joint cache partition and job assignment problem.
By a careful analysis of the possible cache partitions we obtain a constant approximation algorithm for this problem. For some practical special cases we obtain a 2-approximation algorithm, and show how to improve the approximation factor even further by allowing the algorithm to use additional cache. We also study possible improvements that can be obtained by allowing dynamic cache partitions and dynamic job assignments.
We define a natural special case of the well known scheduling problem on unrelated machines in which machines are ordered by "strength". Our joint cache partition and job assignment problem generalizes this scheduling problem which we think is of independent interest. We give a polynomial time algorithm for this scheduling problem for instances obtained by fixing the cache partition in a practical case of the joint cache partition and job assignment problem where job loads are step functions.
△ Less
Submitted 22 November, 2012; v1 submitted 15 October, 2012;
originally announced October 2012.
-
The AND-OR game: Equilibrium Characterization (Working Paper)
Authors:
Avinatan Hassidim,
Haim Kaplan,
Yishay Mansour,
Noam Nisan
Abstract:
We consider a simple simultaneous first price auction for multiple items in a complete information setting. Our goal is to completely characterize the mixed equilibria in this setting, for a simple, yet highly interesting, {\tt AND}-{\tt OR} game, where one agent is single minded and the other is unit demand.
We consider a simple simultaneous first price auction for multiple items in a complete information setting. Our goal is to completely characterize the mixed equilibria in this setting, for a simple, yet highly interesting, {\tt AND}-{\tt OR} game, where one agent is single minded and the other is unit demand.
△ Less
Submitted 5 October, 2012;
originally announced October 2012.
-
Computing Socially-Efficient Cake Divisions
Authors:
Yonatan Aumann,
Yair Dombb,
Avinatan Hassidim
Abstract:
We consider a setting in which a single divisible good ("cake") needs to be divided between n players, each with a possibly different valuation function over pieces of the cake. For this setting, we address the problem of finding divisions that maximize the social welfare, focusing on divisions where each player needs to get one contiguous piece of the cake. We show that for both the utilitarian a…
▽ More
We consider a setting in which a single divisible good ("cake") needs to be divided between n players, each with a possibly different valuation function over pieces of the cake. For this setting, we address the problem of finding divisions that maximize the social welfare, focusing on divisions where each player needs to get one contiguous piece of the cake. We show that for both the utilitarian and the egalitarian social welfare functions it is NP-hard to find the optimal division. For the utilitarian welfare, we provide a constant factor approximation algorithm, and prove that no FPTAS is possible unless P=NP. For egalitarian welfare, we prove that it is NP-hard to approximate the optimum to any factor smaller than 2. For the case where the number of players is small, we provide an FPT (fixed parameter tractable) FPTAS for both the utilitarian and the egalitarian welfare objectives.
△ Less
Submitted 17 May, 2012;
originally announced May 2012.
-
An Efficient Partitioning Oracle for Bounded-Treewidth Graphs
Authors:
Alan Edelman,
Avinatan Hassidim,
Huy N. Nguyen,
Krzysztof Onak
Abstract:
Partitioning oracles were introduced by Hassidim et al. (FOCS 2009) as a generic tool for constant-time algorithms. For any epsilon > 0, a partitioning oracle provides query access to a fixed partition of the input bounded-degree minor-free graph, in which every component has size poly(1/epsilon), and the number of edges removed is at most epsilon*n, where n is the number of vertices in the graph.…
▽ More
Partitioning oracles were introduced by Hassidim et al. (FOCS 2009) as a generic tool for constant-time algorithms. For any epsilon > 0, a partitioning oracle provides query access to a fixed partition of the input bounded-degree minor-free graph, in which every component has size poly(1/epsilon), and the number of edges removed is at most epsilon*n, where n is the number of vertices in the graph.
However, the oracle of Hassidimet al. makes an exponential number of queries to the input graph to answer every query about the partition. In this paper, we construct an efficient partitioning oracle for graphs with constant treewidth. The oracle makes only O(poly(1/epsilon)) queries to the input graph to answer each query about the partition.
Examples of bounded-treewidth graph classes include k-outerplanar graphs for fixed k, series-parallel graphs, cactus graphs, and pseudoforests. Our oracle yields poly(1/epsilon)-time property testing algorithms for membership in these classes of graphs. Another application of the oracle is a poly(1/epsilon)-time algorithm that approximates the maximum matching size, the minimum vertex cover size, and the minimum dominating set size up to an additive epsilon*n in graphs with bounded treewidth. Finally, the oracle can be used to test in poly(1/epsilon) time whether the input bounded-treewidth graph is k-colorable or perfect.
△ Less
Submitted 22 June, 2011;
originally announced June 2011.
-
Non-Price Equilibria in Markets of Discrete Goods
Authors:
Avinatan Hassidim,
Haim Kaplan,
Yishay Mansour,
Noam Nisan
Abstract:
We study markets of indivisible items in which price-based (Walrasian) equilibria often do not exist due to the discrete non-convex setting. Instead we consider Nash equilibria of the market viewed as a game, where players bid for items, and where the highest bidder on an item wins it and pays his bid. We first observe that pure Nash-equilibria of this game excatly correspond to price-based equili…
▽ More
We study markets of indivisible items in which price-based (Walrasian) equilibria often do not exist due to the discrete non-convex setting. Instead we consider Nash equilibria of the market viewed as a game, where players bid for items, and where the highest bidder on an item wins it and pays his bid. We first observe that pure Nash-equilibria of this game excatly correspond to price-based equilibiria (and thus need not exist), but that mixed-Nash equilibria always do exist, and we analyze their structure in several simple cases where no price-based equilibrium exists. We also undertake an analysis of the welfare properties of these equilibria showing that while pure equilibria are always perfectly efficient ("first welfare theorem"), mixed equilibria need not be, and we provide upper and lower bounds on their amount of inefficiency.
△ Less
Submitted 21 March, 2011;
originally announced March 2011.
-
Topology Discovery of Sparse Random Graphs With Few Participants
Authors:
Animashree Anandkumar,
Avinatan Hassidim,
Jonathan Kelner
Abstract:
We consider the task of topology discovery of sparse random graphs using end-to-end random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obta…
▽ More
We consider the task of topology discovery of sparse random graphs using end-to-end random measurements (e.g., delay) between a subset of nodes, referred to as the participants. The rest of the nodes are hidden, and do not provide any information for topology discovery. We consider topology discovery under two routing models: (a) the participants exchange messages along the shortest paths and obtain end-to-end measurements, and (b) additionally, the participants exchange messages along the second shortest path. For scenario (a), our proposed algorithm results in a sub-linear edit-distance guarantee using a sub-linear number of uniformly selected participants. For scenario (b), we obtain a much stronger result, and show that we can achieve consistent reconstruction when a sub-linear number of uniformly selected nodes participate. This implies that accurate discovery of sparse random graphs is tractable using an extremely small number of participants. We finally obtain a lower bound on the number of participants required by any algorithm to reconstruct the original random graph up to a given edit distance. We also demonstrate that while consistent discovery is tractable for sparse random graphs using a small number of participants, in general, there are graphs which cannot be discovered by any algorithm even with a significant number of participants, and with the availability of end-to-end information along all the paths between the participants.
△ Less
Submitted 3 March, 2012; v1 submitted 24 February, 2011;
originally announced February 2011.
-
Matching with Couples Revisited
Authors:
Itai Ashlagi,
Mark Braverman,
Avinatan Hassidim
Abstract:
It is well known that a stable matching in a many-to-one matching market with couples need not exist. We introduce a new matching algorithm for such markets and show that for a general class of large random markets the algorithm will find a stable matching with high probability. In particular we allow the number of couples to grow at a near-linear rate. Furthermore, truth-telling is an approximate…
▽ More
It is well known that a stable matching in a many-to-one matching market with couples need not exist. We introduce a new matching algorithm for such markets and show that for a general class of large random markets the algorithm will find a stable matching with high probability. In particular we allow the number of couples to grow at a near-linear rate. Furthermore, truth-telling is an approximated equilibrium in the game induced by the new matching algorithm. Our results are tight: for markets in which the number of couples grows at a linear rate, we show that with constant probability no stable matching exists.
△ Less
Submitted 17 November, 2010; v1 submitted 9 November, 2010;
originally announced November 2010.