Skip to main content

Showing 1–50 of 61 results for author: Louis, A

  1. arXiv:2407.04371  [pdf, other

    quant-ph cs.AI

    Exploiting the equivalence between quantum neural networks and perceptrons

    Authors: Chris Mingard, Jessica Pointing, Charles London, Yoonsoo Nam, Ard A. Louis

    Abstract: Quantum machine learning models based on parametrized quantum circuits, also called quantum neural networks (QNNs), are considered to be among the most promising candidates for applications on near-term quantum devices. Here we explore the expressivity and inductive bias of QNNs by exploiting an exact mapping from QNNs with inputs $x$ to classical perceptrons acting on $x \otimes x$ (generalised t… ▽ More

    Submitted 5 July, 2024; originally announced July 2024.

  2. arXiv:2405.00427  [pdf, other

    cs.DS

    Improved linearly ordered colorings of hypergraphs via SDP rounding

    Authors: Anand Louis, Alantha Newman, Arka Ray

    Abstract: We consider the problem of linearly ordered (LO) coloring of hypergraphs. A hypergraph has an LO coloring if there is a vertex coloring, using a set of ordered colors, so that (i) no edge is monochromatic, and (ii) each edge has a unique maximum color. It is an open question as to whether or not a 2-LO colorable 3-uniform hypergraph can be LO colored with 3 colors in polynomial time. Nakajima and… ▽ More

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: 19 pages; 13 pages for the main body

  3. arXiv:2404.17563  [pdf, other

    cs.LG cond-mat.dis-nn stat.ML

    An exactly solvable model for emergence and scaling laws

    Authors: Yoonsoo Nam, Nayara Fonseca, Seok Hyeong Lee, Chris Mingard, Ard A. Louis

    Abstract: Deep learning models can exhibit what appears to be a sudden ability to solve a new problem as training time, training data, or model size increases, a phenomenon known as emergence. In this paper, we present a framework where each new ability (a skill) is represented as a basis function. We solve a simple multi-linear model in this skill-basis, finding analytic expressions for the emergence of ne… ▽ More

    Submitted 14 July, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  4. arXiv:2403.05530  [pdf, other

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  5. arXiv:2402.15059  [pdf, other

    cs.CL cs.IR

    ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval

    Authors: Antoine Louis, Vageesh Saxena, Gijs van Dijck, Gerasimos Spanakis

    Abstract: State-of-the-art neural retrievers predominantly focus on high-resource languages like English, which impedes their adoption in retrieval scenarios involving other languages. Current approaches circumvent the lack of high-quality labeled data in non-English languages by leveraging multilingual pretrained language models capable of cross-lingual transfer. However, these models require substantial t… ▽ More

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: Under review. Code is available at https://github.com/ant-louis/xm-retrievers

  6. arXiv:2402.12368  [pdf, other

    cs.CL

    A synthetic data approach for domain generalization of NLI models

    Authors: Mohammad Javad Hosseini, Andrey Petrov, Alex Fabrikant, Annie Louis

    Abstract: Natural Language Inference (NLI) remains an important benchmark task for LLMs. NLI datasets are a springboard for transfer learning to other semantic tasks, and NLI models are standard tools for identifying the faithfulness of model-generated text. There are several large scale NLI datasets today, and models have improved greatly by hill-climbing on these collections. Yet their realistic performan… ▽ More

    Submitted 28 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  7. arXiv:2312.11805  [pdf, other

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  8. arXiv:2311.17001  [pdf, ps, other

    cs.DS

    New Approximation Bounds for Small-Set Vertex Expansion

    Authors: Suprovat Ghoshal, Anand Louis

    Abstract: The vertex expansion of the graph is a fundamental graph parameter. Given a graph $G=(V,E)$ and a parameter $δ\in (0,1/2]$, its $δ$-Small-Set Vertex Expansion (SSVE) is defined as \[ \min_{S : |S| = δ|V|} \frac{|{\partial^V(S)}|}{ \min \{ |S|, |S^c| \} } \] where $\partial^V(S)$ is the vertex boundary of a set $S$. The SSVE~problem, in addition to being of independent interest as a natural graph p… ▽ More

    Submitted 28 November, 2023; originally announced November 2023.

    Comments: 55 Pages

  9. arXiv:2309.17050  [pdf, other

    cs.CL

    Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models

    Authors: Antoine Louis, Gijs van Dijck, Gerasimos Spanakis

    Abstract: Many individuals are likely to face a legal dispute at some point in their lives, but their lack of understanding of how to navigate these complex issues often renders them vulnerable. The advancement of natural language processing opens new avenues for bridging this legal literacy gap through the development of automated legal aid systems. However, existing legal question answering (LQA) approach… ▽ More

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Under review. Code is available at https://github.com/maastrichtlawtech/lleqa

  10. arXiv:2308.13242  [pdf, other

    cs.LG cs.CY cs.IR

    Optimizing Group-Fair Plackett-Luce Ranking Models for Relevance and Ex-Post Fairness

    Authors: Sruthi Gorantla, Eshaan Bhansali, Amit Deshpande, Anand Louis

    Abstract: In learning-to-rank (LTR), optimizing only the relevance (or the expected ranking utility) can cause representational harm to certain categories of items. Moreover, if there is implicit bias in the relevance scores, LTR models may fail to optimize for true relevance. Previous works have proposed efficient algorithms to train stochastic ranking models that achieve fairness of exposure to the groups… ▽ More

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: 20 pages

  11. arXiv:2306.11964  [pdf, other

    cs.CY cs.DS cs.IR cs.LG stat.ML

    Sampling Individually-Fair Rankings that are Always Group Fair

    Authors: Sruthi Gorantla, Anay Mehrotra, Amit Deshpande, Anand Louis

    Abstract: Rankings on online platforms help their end-users find the relevant information -- people, news, media, and products -- quickly. Fair ranking tasks, which ask to rank a set of items to maximize utility subject to satisfying group-fairness constraints, have gained significant interest in the Algorithmic Fairness, Information Retrieval, and Machine Learning literature. Recent works, however, identif… ▽ More

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: Full version of a paper accepted for presentation in ACM AIES 2023

  12. arXiv:2305.19585  [pdf, other

    cs.CL cs.LG

    LAIT: Efficient Multi-Segment Encoding in Transformers with Layer-Adjustable Interaction

    Authors: Jeremiah Milbauer, Annie Louis, Mohammad Javad Hosseini, Alex Fabrikant, Donald Metzler, Tal Schuster

    Abstract: Transformer encoders contextualize token representations by attending to all other tokens at each layer, leading to quadratic increase in compute effort with the input length. In practice, however, the input text of many NLP tasks can be seen as a sequence of related segments (e.g., the sequence of sentences within a passage, or the hypothesis and premise in NLI). While attending across these segm… ▽ More

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  13. arXiv:2305.00034  [pdf, other

    cs.CL

    Text-Blueprint: An Interactive Platform for Plan-based Conditional Generation

    Authors: Fantine Huot, Joshua Maynez, Shashi Narayan, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Anders Sandholm, Dipanjan Das, Mirella Lapata

    Abstract: While conditional generation models can now generate natural language well enough to create fluent text, it is still difficult to control the generation process, leading to irrelevant, repetitive, and hallucinated content. Recent work shows that planning can be a useful intermediate step to render conditional generation less opaque and more grounded. We present a web browser-based demonstration fo… ▽ More

    Submitted 28 April, 2023; originally announced May 2023.

    Comments: Accepted at EACL Call for System Demonstrations 2023

  14. arXiv:2304.06670  [pdf, other

    cs.LG cs.AI stat.ML

    Do deep neural networks have an inbuilt Occam's razor?

    Authors: Chris Mingard, Henry Rees, Guillermo Valle-Pérez, Ard A. Louis

    Abstract: The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components, we apply a Bayesian picture, based on the functions expressed by a DNN, to supervised learning. The prior over functions is determined by the network, and is varied by exploiting… ▽ More

    Submitted 13 April, 2023; originally announced April 2023.

  15. arXiv:2301.12847  [pdf, other

    cs.IR cs.CL

    Finding the Law: Enhancing Statutory Article Retrieval via Graph Neural Networks

    Authors: Antoine Louis, Gijs van Dijck, Gerasimos Spanakis

    Abstract: Statutory article retrieval (SAR), the task of retrieving statute law articles relevant to a legal question, is a promising application of legal text processing. In particular, high-quality SAR systems can improve the work efficiency of legal professionals and provide basic legal assistance to citizens in need at no cost. Unlike traditional ad-hoc information retrieval, where each document is cons… ▽ More

    Submitted 30 January, 2023; originally announced January 2023.

    Comments: EACL 2023. Code is available at https://github.com/maastrichtlawtech/gdsr

  16. arXiv:2212.13406  [pdf, other

    cs.DS math.CO

    Sparse Cuts in Hypergraphs from Random Walks on Simplicial Complexes

    Authors: Anand Louis, Rameesh Paul, Arka Ray

    Abstract: There are a lot of recent works on generalizing the spectral theory of graphs and graph partitioning to hypergraphs. There have been two broad directions toward this goal. One generalizes the notion of graph conductance to hypergraph conductance [LM16, CLTZ18]. In the second approach one can view a hypergraph as a simplicial complex and study its various topological properties [LM06, MW09, DKW16,… ▽ More

    Submitted 3 October, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: 27 pages;

  17. arXiv:2212.10933  [pdf, other

    cs.CL

    Resolving Indirect Referring Expressions for Entity Selection

    Authors: Mohammad Javad Hosseini, Filip Radlinski, Silvia Pareti, Annie Louis

    Abstract: Recent advances in language modeling have enabled new conversational systems. In particular, it is often desirable for people to make choices among specified options when using such systems. We address this problem of reference resolution, when people use natural expressions to choose between the entities. For example, given the choice `Should we make a Simnel cake or a Pandan cake?' a natural res… ▽ More

    Submitted 26 May, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

  18. arXiv:2212.10791  [pdf, other

    cs.CL cs.AI

    OpineSum: Entailment-based self-training for abstractive opinion summarization

    Authors: Annie Louis, Joshua Maynez

    Abstract: A typical product or place often has hundreds of reviews, and summarization of these texts is an important and challenging problem. Recent progress on abstractive summarization in domains such as news has been driven by supervised systems trained on hundreds of thousands of news articles paired with human-written summaries. However for opinion texts, such large scale datasets are rarely available.… ▽ More

    Submitted 21 December, 2022; originally announced December 2022.

  19. arXiv:2212.10471  [pdf, other

    cs.CL

    Little Red Riding Hood Goes Around the Globe:Crosslingual Story Planning and Generation with Large Language Models

    Authors: Evgeniia Razumovskaia, Joshua Maynez, Annie Louis, Mirella Lapata, Shashi Narayan

    Abstract: Previous work has demonstrated the effectiveness of planning for story generation exclusively in a monolingual setting focusing primarily on English. We consider whether planning brings advantages to automatic story generation across languages. We propose a new task of cross-lingual story generation with planning and present a new dataset for this task. We conduct a comprehensive study of differen… ▽ More

    Submitted 25 March, 2024; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted to LREC-COLING 2024

  20. arXiv:2208.11378  [pdf, other

    cs.DS

    Online Algorithms for Matchings with Proportional Fairness Constraints and Diversity Constraints

    Authors: Anand Louis, Meghana Nasre, Prajakta Nimbhorkar, Govind S. Sankar

    Abstract: Matching problems with group-fairness constraints and diversity constraints have numerous applications such as in allocation problems, committee selection, school choice, etc. Moreover, online matching problems have lots of applications in ad allocations and other e-commerce problems like product recommendation in digital marketing. We study two problems involving assigning {\em items} to {\em p… ▽ More

    Submitted 3 August, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

    Comments: 16 pages, Full version of a paper accepted in ECAI 2023

  21. arXiv:2208.10095  [pdf, other

    cs.LG cs.CY cs.DS

    Socially Fair Center-based and Linear Subspace Clustering

    Authors: Sruthi Gorantla, Kishen N. Gowda, Amit Deshpande, Anand Louis

    Abstract: Center-based clustering (e.g., $k$-means, $k$-medians) and clustering using linear subspaces are two most popular techniques to partition real-world data into smaller clusters. However, when the data consists of sensitive demographic groups, significantly different clustering cost per point for different sensitive groups can lead to fairness-related harms (e.g., different quality-of-service). The… ▽ More

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: 17 pages

  22. arXiv:2208.09951  [pdf, ps, other

    cs.AI cs.DS

    Individual Fairness under Varied Notions of Group Fairness in Bipartite Matching - One Framework to Approximate Them All

    Authors: Atasi Panda, Anand Louis, Prajakta Nimbhorkar

    Abstract: We study the probabilistic assignment of items to platforms that satisfies both group and individual fairness constraints. Each item belongs to specific groups and has a preference ordering over platforms. Each platform enforces group fairness by limiting the number of items per group that can be assigned to it. There could be multiple optimal solutions that satisfy the group fairness constraints,… ▽ More

    Submitted 10 May, 2024; v1 submitted 21 August, 2022; originally announced August 2022.

    Comments: 31 pages. To appear in IJCAI'24

  23. arXiv:2207.00397  [pdf, ps, other

    cs.CL

    Conditional Generation with a Question-Answering Blueprint

    Authors: Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Anders Sandholm, Dipanjan Das, Mirella Lapata

    Abstract: The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. Our wo… ▽ More

    Submitted 1 May, 2023; v1 submitted 1 July, 2022; originally announced July 2022.

    Comments: 22 pages, Accepted at TACL. Pre-MIT Press publication version

  24. arXiv:2205.11328  [pdf, ps, other

    cs.DS

    Approximating CSPs with Outliers

    Authors: Suprovat Ghoshal, Anand Louis

    Abstract: Constraint satisfaction problems (CSPs) are ubiquitous in theoretical computer science. We study the problem of StrongCSPs, i.e. instances where a large induced sub-instance has a satisfying assignment. More formally, given a CSP instance $Ψ(V, E, [k], \{Π_{ij}\}_{(i,j) \in E})$ consisting of a set of vertices $V$, a set of edges $E$, alphabet $[k]$, a constraint $Π_{ij} \subset [k] \times [k]$ fo… ▽ More

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: 45 Pages

  25. arXiv:2205.03727  [pdf, other

    cs.DS

    Exact recovery algorithm for Planted Bipartite Graph in Semi-random Graphs

    Authors: Akash Kumar, Anand Louis, Rameesh Paul

    Abstract: The problem of finding the largest induced balanced bipartite subgraph in a given graph is NP-hard. This problem is closely related to the problem of finding the smallest Odd Cycle Transversal. In this work, we consider the following model of instances: starting with a set of vertices $V$, a set $S \subseteq V$ of $k$ vertices is chosen and an arbitrary $d$-regular bipartite graph is added on it… ▽ More

    Submitted 12 May, 2022; v1 submitted 7 May, 2022; originally announced May 2022.

    Comments: 46 pages

  26. arXiv:2203.00887  [pdf, other

    cs.LG cs.DS

    Sampling Ex-Post Group-Fair Rankings

    Authors: Sruthi Gorantla, Amit Deshpande, Anand Louis

    Abstract: Randomized rankings have been of recent interest to achieve ex-ante fairer exposure and better robustness than deterministic rankings. We propose a set of natural axioms for randomized group-fair rankings and prove that there exists a unique distribution $D$ that satisfies our axioms and is supported only over ex-post group-fair rankings, i.e., rankings that satisfy given lower and upper bounds on… ▽ More

    Submitted 29 May, 2023; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: 31 pages. Accepted for publication as a full paper in IJCAI 2023

  27. arXiv:2108.11792  [pdf, other

    cs.CL

    A Statutory Article Retrieval Dataset in French

    Authors: Antoine Louis, Gerasimos Spanakis

    Abstract: Statutory article retrieval is the task of automatically retrieving law articles relevant to a legal question. While recent advances in natural language processing have sparked considerable interest in many legal tasks, statutory article retrieval remains primarily untouched due to the scarcity of large-scale and high-quality annotated datasets. To address this bottleneck, we introduce the Belgian… ▽ More

    Submitted 15 March, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

    Comments: ACL 2022. Code and dataset are available at https://github.com/maastrichtlawtech/bsard

  28. arXiv:2105.09522  [pdf, ps, other

    cs.DS

    Matchings with Group Fairness Constraints: Online and Offline Algorithms

    Authors: Govind S. Sankar, Anand Louis, Meghana Nasre, Prajakta Nimbhorkar

    Abstract: We consider the problem of assigning items to platforms in the presence of group fairness constraints. In the input, each item belongs to certain categories, called classes in this paper. Each platform specifies the group fairness constraints through an upper bound on the number of items it can serve from each class. Additionally, each platform also has an upper bound on the total number of items… ▽ More

    Submitted 20 May, 2021; originally announced May 2021.

    Comments: 16 pages, to appear in IJCAI 2021

  29. arXiv:2104.00927  [pdf, other

    cs.DS

    Independent Sets in Semi-random Hypergraphs

    Authors: Yash Khanna, Anand Louis, Rameesh Paul

    Abstract: A set of vertices in a hypergraph is called an independent set if no hyperedge is completely contained inside the set. Given a hypergraph, computing its largest size independent set is an NP-hard problem. In this work, we study the independent set problem on hypergraphs in a natural semi-random family of instances. Our semi-random model is inspired by the Feige-Kilian model [FK01]. This popular… ▽ More

    Submitted 2 April, 2021; originally announced April 2021.

    Comments: 21 pages

  30. arXiv:2103.06219  [pdf, other

    cs.LG stat.ML

    Why flatness does and does not correlate with generalization for deep neural networks

    Authors: Shuofeng Zhang, Isaac Reid, Guillermo Valle Pérez, Ard Louis

    Abstract: The intuition that local flatness of the loss landscape is correlated with better generalization for deep neural networks (DNNs) has been explored for decades, spawning many different flatness measures. Recently, this link with generalization has been called into question by a demonstration that many measures of flatness are vulnerable to parameter re-scaling which arbitrarily changes their value… ▽ More

    Submitted 21 June, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

  31. arXiv:2102.07238  [pdf, other

    stat.ML cs.LG

    Double-descent curves in neural networks: a new perspective using Gaussian processes

    Authors: Ouns El Harzli, Bernardo Cuenca Grau, Guillermo Valle-Pérez, Ard A. Louis

    Abstract: Double-descent curves in neural networks describe the phenomenon that the generalisation error initially descends with increasing parameters, then grows after reaching an optimal number of parameters which is less than the number of data points, but then descends again in the overparameterized regime. In this paper, we use techniques from random matrix theory to characterize the spectral distribut… ▽ More

    Submitted 25 May, 2023; v1 submitted 14 February, 2021; originally announced February 2021.

  32. arXiv:2012.04115  [pdf, other

    stat.ML cs.AI cs.LG cs.NE

    Generalization bounds for deep learning

    Authors: Guillermo Valle-Pérez, Ard A. Louis

    Abstract: Generalization in deep learning has been the topic of much recent theoretical and empirical research. Here we introduce desiderata for techniques that predict generalization errors for deep learning models in supervised learning. Such predictions should 1) scale correctly with data complexity; 2) scale correctly with training set size; 3) capture differences between architectures; 4) capture diffe… ▽ More

    Submitted 9 December, 2020; v1 submitted 7 December, 2020; originally announced December 2020.

  33. arXiv:2010.06986  [pdf, other

    cs.IR cs.CY cs.DS cs.LG stat.ML

    On the Problem of Underranking in Group-Fair Ranking

    Authors: Sruthi Gorantla, Amit Deshpande, Anand Louis

    Abstract: Search and recommendation systems, such as search engines, recruiting tools, online marketplaces, news, and social media, output ranked lists of content, products, and sometimes, people. Credit ratings, standardized tests, risk assessments output only a score, but are also used implicitly for ranking. Bias in such ranking systems, especially among the top ranks, can worsen social and economic ineq… ▽ More

    Submitted 18 February, 2021; v1 submitted 24 September, 2020; originally announced October 2020.

    Comments: 27 pages

  34. arXiv:2010.03450  [pdf, other

    cs.CL

    "I'd rather just go to bed": Understanding Indirect Answers

    Authors: Annie Louis, Dan Roth, Filip Radlinski

    Abstract: We revisit a pragmatic inference problem in dialog: understanding indirect responses to questions. Humans can interpret 'I'm starving.' in response to 'Hungry?', even without direct cue words such as 'yes' and 'no'. In dialog systems, allowing natural responses rather than closed vocabularies would be similarly beneficial. However, today's systems are only as sensitive to these pragmatic moves as… ▽ More

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: 15 pages, 3 figures. To appear at the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020

  35. arXiv:2007.06869  [pdf, other

    cs.LG cs.AI cs.DS stat.ML

    Robust Identifiability in Linear Structural Equation Models of Causal Inference

    Authors: Karthik Abinav Sankararaman, Anand Louis, Navin Goyal

    Abstract: In this work, we consider the problem of robust parameter estimation from observational data in the context of linear structural equation models (LSEMs). LSEMs are a popular and well-studied class of models for inferring causality in the natural and social sciences. One of the main problems related to LSEMs is to recover the model parameters from the observational data. Under various conditions on… ▽ More

    Submitted 14 July, 2020; originally announced July 2020.

  36. arXiv:2006.15191  [pdf, other

    cs.LG stat.ML

    Is SGD a Bayesian sampler? Well, almost

    Authors: Chris Mingard, Guillermo Valle-Pérez, Joar Skalse, Ard A. Louis

    Abstract: Overparameterised deep neural networks (DNNs) are highly expressive and so can, in principle, generate almost any function that fits a training dataset with zero error. The vast majority of these functions will perform poorly on unseen data, and yet in practice DNNs often generalise remarkably well. This success suggests that a trained DNN must have a strong inductive bias towards functions with l… ▽ More

    Submitted 24 October, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

    Journal ref: Journal of Machine Learning Research, 22 79 (2021), 1-64

  37. arXiv:2006.07832  [pdf, ps, other

    cs.DS

    Group Fairness for Knapsack Problems

    Authors: Deval Patel, Arindam Khan, Anand Louis

    Abstract: We study the knapsack problem with group fairness constraints. The input of the problem consists of a knapsack of bounded capacity and a set of items, each item belongs to a particular category and has and associated weight and value. The goal of this problem is to select a subset of items such that all categories are fairly represented, the total weight of the selected items does not exceed the c… ▽ More

    Submitted 16 January, 2021; v1 submitted 14 June, 2020; originally announced June 2020.

  38. arXiv:2005.08918  [pdf, ps, other

    cs.DS cs.CC

    Approximation Algorithms and Hardness for Strong Unique Games

    Authors: Suprovat Ghoshal, Anand Louis

    Abstract: The UNIQUE GAMES problem is a central problem in algorithms and complexity theory. Given an instance of UNIQUE GAMES, the STRONG UNIQUE GAMES problem asks to find the largest subset of vertices, such that the UNIQUE GAMES instance induced on them is completely satisfiable. In this paper, we give new algorithmic and hardness results for STRONG UNIQUE GAMES. Given an instance with label set size… ▽ More

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: 67 Pages

  39. arXiv:2004.13978  [pdf, other

    cs.DS

    Planted Models for the Densest $k$-Subgraph Problem

    Authors: Yash Khanna, Anand Louis

    Abstract: Given an undirected graph $G$, the Densest $k$-subgraph problem (DkS) asks to compute a set $S \subset V$ of cardinality $\left\lvert S\right\rvert \leq k$ such that the weight of edges inside $S$ is maximized. This is a fundamental NP-hard problem whose approximability, inspite of many decades of research, is yet to be settled. The current best known approximation algorithm due to Bhaskara et al.… ▽ More

    Submitted 7 November, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: 31 pages. To appear in FSTTCS 2020

  40. arXiv:2003.05582  [pdf, ps, other

    cs.CC cs.DS

    $λ_\infty$ & Maximum Variance Embedding: Measuring and Optimizing Connectivity of A Graph Metric

    Authors: Majid Farhadi, Anand Louis, Mohit Singh, Prasad Tetali

    Abstract: Bobkov, Houdré, and the last author [2000] introduced a Poincaré-type functional parameter, $λ_\infty$, of a graph and related it to connectivity of the graph via Cheeger-type inequalities. A work by the second author, Raghavendra, and Vempala [2013] related the complexity of $λ_\infty$ to the so-called small-set expansion (SSE) problem and further set forth the desiderata for NP-hardness of this… ▽ More

    Submitted 30 June, 2022; v1 submitted 11 March, 2020; originally announced March 2020.

  41. arXiv:1910.08889  [pdf, ps, other

    cs.DS

    Planted Models for $k$-way Edge and Vertex Expansion

    Authors: Anand Louis, Rakesh Venkat

    Abstract: Graph partitioning problems are a central topic of study in algorithms and complexity theory. Edge expansion and vertex expansion, two popular graph partitioning objectives, seek a $2$-partition of the vertex set of the graph that minimizes the considered objective. However, for many natural applications, one might require a graph to be partitioned into $k$ parts, for some $k \geq 2$. For a $k$-pa… ▽ More

    Submitted 20 October, 2019; originally announced October 2019.

    Comments: An extended abstract of this paper has been accepted to FSTTCS 2019

  42. arXiv:1909.11522  [pdf, other

    cs.LG stat.ML

    Neural networks are a priori biased towards Boolean functions with low entropy

    Authors: Chris Mingard, Joar Skalse, Guillermo Valle-Pérez, David Martínez-Rubio, Vladimir Mikulik, Ard A. Louis

    Abstract: Understanding the inductive bias of neural networks is critical to explaining their ability to generalise. Here, for one of the simplest neural networks -- a single-layer perceptron with n input neurons, one output neuron, and no threshold bias term -- we prove that upon random initialisation of weights, the a priori probability $P(t)$ that it represents a Boolean function that classifies t points… ▽ More

    Submitted 2 January, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

  43. arXiv:1909.04028  [pdf, other

    cs.CL cs.LG

    Countering the Effects of Lead Bias in News Summarization via Multi-Stage Training and Auxiliary Losses

    Authors: Matt Grenander, Yue Dong, Jackie Chi Kit Cheung, Annie Louis

    Abstract: Sentence position is a strong feature for news summarization, since the lead often (but not always) summarizes the key points of the article. In this paper, we show that recent neural systems excessively exploit this trend, which although powerful for many inputs, is also detrimental when summarizing documents where important content should be extracted from later parts of the article. We propose… ▽ More

    Submitted 8 September, 2019; originally announced September 2019.

    Comments: 5 pages, accepted at EMNLP 2019

  44. arXiv:1908.11631  [pdf, ps, other

    cs.DS

    Approximation Algorithms for Partially Colorable Graphs

    Authors: Suprovat Ghoshal, Anand Louis, Rahul Raychaudhury

    Abstract: Graph coloring problems are a central topic of study in the theory of algorithms. We study the problem of partially coloring partially colorable graphs. For $α\leq 1$ and $k \in \mathbb{Z}^+$, we say that a graph $G=(V,E)$ is $α$-partially $k$-colorable, if there exists a subset $S\subset V$ of cardinality $ |S | \geq α| V |$ such that the graph induced on $S$ is $k$-colorable. Partial $k$-colorab… ▽ More

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: 25 Pages

  45. arXiv:1905.06836  [pdf, other

    cs.LG stat.ML

    Stability of Linear Structural Equation Models of Causal Inference

    Authors: Karthik Abinav Sankararaman, Anand Louis, Navin Goyal

    Abstract: We consider the numerical stability of the parameter recovery problem in Linear Structural Equation Model ($\LSEM$) of causal inference. A long line of work starting from Wright (1920) has focused on understanding which sub-classes of $\LSEM$ allow for efficient parameter recovery. Despite decades of study, this question is not yet fully resolved. The goal of this paper is complementary to this li… ▽ More

    Submitted 17 August, 2020; v1 submitted 16 May, 2019; originally announced May 2019.

    Comments: To appear in UAI 2019

  46. arXiv:1901.03859  [pdf, other

    cs.CL

    What comes next? Extractive summarization by next-sentence prediction

    Authors: Jingyun Liu, Jackie C. K. Cheung, Annie Louis

    Abstract: Existing approaches to automatic summarization assume that a length limit for the summary is given, and view content selection as an optimization problem to maximize informativeness and minimize redundancy within this budget. This framework ignores the fact that human-written summaries have rich internal structure which can be exploited to train a summarization system. We present NEXTSUM, a novel… ▽ More

    Submitted 12 January, 2019; originally announced January 2019.

  47. arXiv:1809.02589  [pdf, other

    cs.LG stat.ML

    HyperGCN: A New Method of Training Graph Convolutional Networks on Hypergraphs

    Authors: Naganand Yadati, Madhav Nimishakavi, Prateek Yadav, Vikram Nitin, Anand Louis, Partha Talukdar

    Abstract: In many real-world network datasets such as co-authorship, co-citation, email communication, etc., relationships are complex and go beyond pairwise. Hypergraphs provide a flexible and natural modeling tool to model such complex relationships. The obvious existence of such complex relationships in many real-world networks naturaly motivates the problem of learning with hypergraphs. A popular learni… ▽ More

    Submitted 22 May, 2019; v1 submitted 7 September, 2018; originally announced September 2018.

  48. arXiv:1806.04616  [pdf, ps, other

    cs.SE cs.CL

    Deep Learning to Detect Redundant Method Comments

    Authors: Annie Louis, Santanu Kumar Dash, Earl T. Barr, Charles Sutton

    Abstract: Comments in software are critical for maintenance and reuse. But apart from prescriptive advice, there is little practical support or quantitative understanding of what makes a comment useful. In this paper, we introduce the task of identifying comments which are uninformative about the code they are meant to document. To address this problem, we introduce the notion of comment entailment from cod… ▽ More

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: 12 pages

  49. arXiv:1805.09747  [pdf, other

    cs.DS

    Semi-Random Graphs with Planted Sparse Vertex Cuts: Algorithms for Exact and Approximate Recovery

    Authors: Anand Louis, Rakesh Venkat

    Abstract: The problem of computing the vertex expansion of a graph is an NP-hard problem. The current best worst-case approximation guarantees for computing the vertex expansion of a graph are a $O(\sqrt{\log n})$-approximation algorithm due to Feige, Hajiaghayi and Lee [SIAM J. Comp., 2008], and $O(\sqrt{OPT \log d})$ bound in graphs having vertex degrees at most $d$, due to Louis, Raghavendra and Vempala… ▽ More

    Submitted 24 May, 2018; originally announced May 2018.

    Comments: Full version of paper to appear in ICALP '18

  50. arXiv:1805.08522  [pdf, other

    stat.ML cs.AI cs.LG cs.NE

    Deep learning generalizes because the parameter-function map is biased towards simple functions

    Authors: Guillermo Valle-Pérez, Chico Q. Camargo, Ard A. Louis

    Abstract: Deep neural networks (DNNs) generalize remarkably well without explicit regularization even in the strongly over-parametrized regime where classical learning theory would instead predict that they would severely overfit. While many proposals for some kind of implicit regularization have been made to rationalise this success, there is no consensus for the fundamental reason why DNNs do not strongly… ▽ More

    Submitted 21 April, 2019; v1 submitted 22 May, 2018; originally announced May 2018.

    Comments: Published as a conference paper at ICLR 2019