Showing 1–30 of 30 results for author: Katz, B

  1. arXiv:2406.14481  [pdf, other]

    cs.LG cs.AI cs.NE q-bio.NC

    Revealing Vision-Language Integration in the Brain with Multimodal Networks

    Authors: Vighnesh Subramaniam, Colin Conwell, Christopher Wang, Gabriel Kreiman, Boris Katz, Ignacio Cases, Andrei Barbu

    Abstract: We use (multi)modal deep neural networks (DNNs) to probe for sites of multimodal integration in the human brain by predicting stereoencephalography (SEEG) recordings taken while human subjects watched movies. We operationalize sites of multimodal integration as regions where a multimodal vision-language model predicts recordings better than unimodal language, unimodal vision, or linearly-integrate…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: ICML 2024; 23 pages, 11 figures

  2. arXiv:2406.03044  [pdf, other]

    cs.LG q-bio.NC

    Population Transformer: Learning Population-level Representations of Intracranial Activity

    Authors: Geeling Chau, Christopher Wang, Sabera Talukder, Vighnesh Subramaniam, Saraswati Soedarmadji, Yisong Yue, Boris Katz, Andrei Barbu

    Abstract: We present a self-supervised framework that learns population-level codes for intracranial neural recordings at scale, unlocking the benefits of representation learning for a key neuroscience recording modality. The Population Transformer (PopT) lowers the amount of data required for decoding experiments, while increasing accuracy, even on never-before-seen subjects and tasks. We address two key c…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 17 pages, 10 figures, submitted to NeurIPS 2024

  3. arXiv:2405.09805  [pdf, other]

    cs.CL cs.CR

    SecureLLM: Using Compositionality to Build Provably Secure Language Models for Private, Sensitive, and Secret Data

    Authors: Abdulrahman Alabdulkareem, Christian M Arnold, Yerim Lee, Pieter M Feenstra, Boris Katz, Andrei Barbu

    Abstract: Traditional security mechanisms isolate resources from users who should not access them. We reflect the compositional nature of such security mechanisms back into the structure of LLMs to build a provably secure LLM; that we term SecureLLM. Other approaches to LLM safety attempt to protect against bad actors or bad outcomes, but can only do so to an extent making them inappropriate for sensitive d…

    Submitted 13 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

  4. arXiv:2401.06967  [pdf]

    q-bio.QM cs.LG stat.AP

    NHANES-GCP: Leveraging the Google Cloud Platform and BigQuery ML for reproducible machine learning with data from the National Health and Nutrition Examination Survey

    Authors: B. Ross Katz, Abdul Khan, James York-Winegar, Alexander J. Titus

    Abstract: Summary: NHANES, the National Health and Nutrition Examination Survey, is a program of studies led by the Centers for Disease Control and Prevention (CDC) designed to assess the health and nutritional status of adults and children in the United States (U.S.). NHANES data is frequently used by biostatisticians and clinical scientists to study health trends across the U.S., but every analysis requir…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: 7 pages, 1 figure

  5. arXiv:2302.14367  [pdf, other]

    cs.LG eess.SP q-bio.NC

    BrainBERT: Self-supervised representation learning for intracranial recordings

    Authors: Christopher Wang, Vighnesh Subramaniam, Adam Uri Yaari, Gabriel Kreiman, Boris Katz, Ignacio Cases, Andrei Barbu

    Abstract: We create a reusable Transformer, BrainBERT, for intracranial recordings bringing modern representation learning approaches to neuroscience. Much like in NLP and speech recognition, this Transformer enables classifying complex concepts, i.e., decoding neural data, with higher accuracy and with much less data by being pretrained in an unsupervised manner on a large corpus of unannotated neural reco…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: 9 pages, 6 figures, ICLR 2023

  6. arXiv:2210.02585  [pdf, other]

    cs.LG cs.AI

    Query The Agent: Improving sample efficiency through epistemic uncertainty estimation

    Authors: Julian Alverio, Boris Katz, Andrei Barbu

    Abstract: Curricula for goal-conditioned reinforcement learning agents typically rely on poor estimates of the agent's epistemic uncertainty or fail to consider the agents' epistemic uncertainty altogether, resulting in poor sample efficiency. We propose a novel algorithm, Query The Agent (QTA), which significantly improves sample efficiency by estimating the agent's epistemic uncertainty throughout the sta…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: Submitted to ICLR 2023

  7. arXiv:2207.07033  [pdf, other]

    cs.AI cs.CY

    Developing a Series of AI Challenges for the United States Department of the Air Force

    Authors: Vijay Gadepally, Gregory Angelides, Andrei Barbu, Andrew Bowne, Laura J. Brattain, Tamara Broderick, Armando Cabrera, Glenn Carl, Ronisha Carter, Miriam Cha, Emilie Cowen, Jesse Cummings, Bill Freeman, James Glass, Sam Goldberg, Mark Hamilton, Thomas Heldt, Kuan Wei Huang, Phillip Isola, Boris Katz, Jamie Koerner, Yen-Chen Lin, David Mayo, Kyle McAlpin, Taylor Perron, et al. (17 additional authors not shown)

    Abstract: Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requireme…

    Submitted 14 July, 2022; originally announced July 2022.

  8. arXiv:2110.10298  [pdf, other]

    cs.RO

    Incorporating Rich Social Interactions Into MDPs

    Authors: Ravi Tejwani, Yen-Ling Kuo, Tianmin Shu, Bennett Stankovits, Dan Gutfreund, Joshua B. Tenenbaum, Boris Katz, Andrei Barbu

    Abstract: Much of what we do as humans is engage socially with other agents, a skill that robots must also eventually possess. We demonstrate that a rich theory of social interactions originating from microsociology and economics can be formalized by extending a nested MDP where agents reason about arbitrary functions of each other's hidden rewards. This extended Social MDP allows us to encode the five basi…

    Submitted 7 February, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: Accepted to the 39th International Conference on Robotics and Automation (ICRA 2022)

  9. arXiv:2110.09741  [pdf, other]

    cs.RO cs.AI cs.CL cs.LG

    Trajectory Prediction with Linguistic Representations

    Authors: Yen-Ling Kuo, Xin Huang, Andrei Barbu, Stephen G. McGill, Boris Katz, John J. Leonard, Guy Rosman

    Abstract: Language allows humans to build mental models that interpret what is happening around them resulting in more accurate long-term predictions. We present a novel trajectory prediction model that uses linguistic intermediate representations to forecast trajectories, and is trained using trajectory samples with partially-annotated captions. The model learns the meaning of each of the words without dir…

    Submitted 9 March, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: Accepted in ICRA 2022

  10. arXiv:2110.07575  [pdf, other]

    cs.CL cs.CV cs.MM eess.AS

    Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

    Authors: Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James Glass

    Abstract: Visually-grounded spoken language datasets can enable models to learn cross-modal correspondences with very weak supervision. However, modern audio-visual datasets contain biases that undermine the real-world performance of models trained on that data. We introduce Spoken ObjectNet, which is designed to remove some of these biases and provide a way to better evaluate how effectively models will pe…

    Submitted 14 October, 2021; originally announced October 2021.

    Comments: Presented at Interspeech 2021. This version contains additional experiments on the Spoken ObjectNet test set

  11. arXiv:2103.01933  [pdf, other]

    cs.AI cs.CV cs.LG stat.ML

    PHASE: PHysically-grounded Abstract Social Events for Machine Social Perception

    Authors: Aviv Netanyahu, Tianmin Shu, Boris Katz, Andrei Barbu, Joshua B. Tenenbaum

    Abstract: The ability to perceive and reason about social interactions in the context of physical environments is core to human social intelligence and human-machine cooperation. However, no prior dataset or benchmark has systematically evaluated physically grounded perception of complex social interactions that go beyond short actions, such as high-fiving, or simple group activities, such as gathering. In…

    Submitted 19 March, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: The first two authors contributed equally; AAAI 2021; 13 pages, 7 figures; Project page: https://www.tshu.io/PHASE

  12. arXiv:2010.13319  [pdf, other]

    cs.RO cs.AI

    Migratable AI: Investigating users' affect on identity and information migration of a conversational AI agent

    Authors: Ravi Tejwani, Boris Katz, Cynthia Breazeal

    Abstract: Conversational AI agents are becoming ubiquitous and provide assistance to us in our everyday activities. In recent years, researchers have explored the migration of these agents across different embodiments in order to maintain the continuity of the task and improve user experience. In this paper, we investigate user's affective responses in different configurations of the migration parameters. W…

    Submitted 4 September, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: Accepted to ICSR. arXiv admin note: text overlap with arXiv:2007.05801

    Journal ref: 13th International Conference on Social Robotics (ICSR 2021)

  13. arXiv:2010.12091  [pdf, ps, other]

    cs.RO cs.AI

    Migratable AI: Personalizing Dialog Conversations with migration context

    Authors: Ravi Tejwani, Boris Katz, Cynthia Breazeal

    Abstract: The migration of conversational AI agents across different embodiments in order to maintain the continuity of the task has been recently explored to further improve user experience. However, these migratable agents lack contextual understanding of the user information and the migrated device during the dialog conversations with the user. This opens the question of how an agent might behave when mi…

    Submitted 22 October, 2020; originally announced October 2020.

  14. arXiv:2008.03277  [pdf, other]

    cs.CL

    Learning a natural-language to LTL executable semantic parser for grounded robotics

    Authors: Christopher Wang, Candace Ross, Yen-Ling Kuo, Boris Katz, Andrei Barbu

    Abstract: Children acquire their native language with apparent ease by observing how language is used in context and attempting to use it themselves. They do so without laborious annotations, negative examples, or even direct corrections. We take a step toward robots that can do the same by training a grounded semantic parser, which discovers latent linguistic representations that can be used for the execut…

    Submitted 16 March, 2021; v1 submitted 7 August, 2020; originally announced August 2020.

    Comments: 10 pages, 2 figures, Accepted in Conference on Robot Learning (CoRL) 2020

    ACM Class: I.2.7

  15. arXiv:2008.02742  [pdf, other]

    cs.CL cs.AI cs.RO

    Compositional Networks Enable Systematic Generalization for Grounded Language Understanding

    Authors: Yen-Ling Kuo, Boris Katz, Andrei Barbu

    Abstract: Humans are remarkably flexible when understanding new sentences that include combinations of concepts they have never encountered before. Recent work has shown that while deep networks can mimic some human language abilities when presented with novel sentences, systematic variation uncovers the limitations in the language-understanding abilities of networks. We demonstrate that these limitations c…

    Submitted 19 October, 2021; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: Accepted in Findings of EMNLP 2021

  16. arXiv:2006.01110  [pdf, other]

    cs.RO cs.CL

    Encoding formulas as deep networks: Reinforcement learning for zero-shot execution of LTL formulas

    Authors: Yen-Ling Kuo, Boris Katz, Andrei Barbu

    Abstract: We demonstrate a reinforcement learning agent which uses a compositional recurrent neural network that takes as input an LTL formula and determines satisfying actions. The input LTL formulas have never been seen before, yet the network performs zero-shot generalization to satisfy them. This is a novel form of multi-task learning for RL agents where agents learn from one diverse set of tasks and ge…

    Submitted 6 August, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: Accepted in IROS 2020

  17. arXiv:2003.03716  [pdf, other]

    cs.CL

    Investigating the Decoders of Maximum Likelihood Sequence Models: A Look-ahead Approach

    Authors: Yu-Siang Wang, Yen-Ling Kuo, Boris Katz

    Abstract: We demonstrate how we can practically incorporate multi-step future information into a decoder of maximum likelihood sequence models. We propose a "k-step look-ahead" module to consider the likelihood information of a rollout up to k steps. Unlike other approaches that need to train another value network to evaluate the rollouts, we can directly apply this look-ahead module to improve the decoding…

    Submitted 7 March, 2020; originally announced March 2020.

    Comments: 7 pages, 5 figures

  18. arXiv:2002.08911  [pdf, other]

    cs.CL cs.AI

    Measuring Social Biases in Grounded Vision and Language Embeddings

    Authors: Candace Ross, Boris Katz, Andrei Barbu

    Abstract: We generalize the notion of social biases from language embeddings to grounded vision and language embeddings. Biases are present in grounded embeddings, and indeed seem to be equally or more significant than for ungrounded embeddings. This is despite the fact that vision and language can suffer from different biases, which one might hope could attenuate the biases in both. Multiple ways exist to…

    Submitted 21 August, 2023; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: Camera-ready from NAACL 2021. Previous arXiv version was from before conference and was not the most recent version

  19. arXiv:2002.05201  [pdf, other]

    cs.RO cs.CL

    Deep compositional robotic planners that follow natural language commands

    Authors: Yen-Ling Kuo, Boris Katz, Andrei Barbu

    Abstract: We demonstrate how a sampling-based robotic planner can be augmented to learn to understand a sequence of natural language commands in a continuous configuration space to move and manipulate objects. Our approach combines a deep network structured according to the parse of a complex command that includes objects, verbs, spatial relations, and attributes, with a sampling-based planner, RRT. A recur…

    Submitted 19 February, 2020; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: Accepted in ICRA 2020

  20. arXiv:1909.06586  [pdf, other]

    cs.RO

    Highly Dynamic Quadruped Locomotion via Whole-Body Impulse Control and Model Predictive Control

    Authors: Donghyun Kim, Jared Di Carlo, Benjamin Katz, Gerardo Bledt, Sangbae Kim

    Abstract: Dynamic legged locomotion is a challenging topic because of the lack of established control schemes which can handle aerial phases, short stance times, and high-speed leg swings. In this paper, we propose a controller combining whole-body control (WBC) and model predictive control (MPC). In our framework, MPC finds an optimal reaction force profile over a longer time horizon with a simple model, a…

    Submitted 14 September, 2019; originally announced September 2019.

  21. arXiv:1811.06966  [pdf, other]

    cs.RO

    Temporal Grounding Graphs for Language Understanding with Accrued Visual-Linguistic Context

    Authors: Rohan Paul, Andrei Barbu, Sue Felshin, Boris Katz, Nicholas Roy

    Abstract: A robot's ability to understand or ground natural language instructions is fundamentally tied to its knowledge about the surrounding world. We present an approach to grounding natural language utterances in the context of factual information gathered through natural-language interactions and past visual observations. A probabilistic model estimates, from a natural language utterance, the objects,r…

    Submitted 16 November, 2018; originally announced November 2018.

    Comments: Published in IJCAI 2017

  22. arXiv:1810.00804  [pdf, other]

    cs.RO

    Deep sequential models for sampling-based planning

    Authors: Yen-Ling Kuo, Andrei Barbu, Boris Katz

    Abstract: We demonstrate how a sequence model and a sampling-based planner can influence each other to produce efficient plans and how such a model can automatically learn to take advantage of observations of the environment. Sampling-based planners such as RRT generally know nothing of their environments even if they have traversed similar spaces many times. A sequence model, such as an HMM or LSTM, guides…

    Submitted 1 October, 2018; originally announced October 2018.

    Comments: Published in IROS 2018

  23. arXiv:1805.09462  [pdf, other]

    cs.CV

    Complex Relations in a Deep Structured Prediction Model for Fine Image Segmentation

    Authors: Cristina Mata, Guy Ben-Yosef, Boris Katz

    Abstract: Many deep learning architectures for semantic segmentation involve a Fully Convolutional Neural Network (FCN) followed by a Conditional Random Field (CRF) to carry out inference over an image. These models typically involve unary potentials based on local appearance features computed by FCNs, and binary potentials based on the displacement between pixels. We show that while current methods succeed…

    Submitted 23 May, 2018; originally announced May 2018.

  24. arXiv:1804.07329  [pdf, other]

    cs.CL

    Assessing Language Proficiency from Eye Movements in Reading

    Authors: Yevgeni Berzak, Boris Katz, Roger Levy

    Abstract: We present a novel approach for determining learners' second language proficiency which utilizes behavioral traces of eye movements during reading. Our approach provides stand-alone eyetracking based English proficiency scores which reflect the extent to which the learner's gaze patterns in reading are similar to those of native English speakers. We show that our scores correlate strongly with sta…

    Submitted 23 April, 2018; v1 submitted 19 April, 2018; originally announced April 2018.

    Comments: NAACL 2018 (license change to CC BY)

  25. arXiv:1704.07398  [pdf, other]

    cs.CL

    Predicting Native Language from Gaze

    Authors: Yevgeni Berzak, Chie Nakamura, Suzanne Flynn, Boris Katz

    Abstract: A fundamental question in language learning concerns the role of a speaker's first language in second language acquisition. We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. Using this methodology, we demonstrate for the first time that the native language of English learners can be predicted from their gaze f…

    Submitted 2 May, 2017; v1 submitted 24 April, 2017; originally announced April 2017.

    Comments: ACL 2017

  26. arXiv:1605.04481  [pdf, ps, other]

    cs.CL

    Anchoring and Agreement in Syntactic Annotations

    Authors: Yevgeni Berzak, Yan Huang, Andrei Barbu, Anna Korhonen, Boris Katz

    Abstract: We present a study on two key characteristics of human syntactic annotations: anchoring and agreement. Anchoring is a well known cognitive bias in human decision making, where judgments are drawn towards pre-existing values. We study the influence of anchoring on a standard approach to creation of syntactic resources where syntactic annotations are obtained via human editing of tagger and parser o…

    Submitted 21 September, 2016; v1 submitted 14 May, 2016; originally announced May 2016.

    Comments: EMNLP 2016

  27. arXiv:1605.04278  [pdf, ps, other]

    cs.CL

    Universal Dependencies for Learner English

    Authors: Yevgeni Berzak, Jessica Kenney, Carolyn Spadine, Jing Xian Wang, Lucia Lam, Keiko Sophie Mori, Sebastian Garza, Boris Katz

    Abstract: We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic…

    Submitted 7 June, 2016; v1 submitted 13 May, 2016; originally announced May 2016.

    Comments: Updated parsing experiments to EWT v1.3, improved grammatical error marking, minor revisions. To appear in ACL 2016

  28. arXiv:1603.08079  [pdf, other]

    cs.CV cs.AI cs.CL

    Do You See What I Mean? Visual Resolution of Linguistic Ambiguities

    Authors: Yevgeni Berzak, Andrei Barbu, Daniel Harari, Boris Katz, Shimon Ullman

    Abstract: Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, r…

    Submitted 26 March, 2016; originally announced March 2016.

    Comments: EMNLP 2015

    Journal ref: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015, pages 1477–1487

  29. arXiv:1603.07609  [pdf, other]

    cs.CL

    Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL

    Authors: Yevgeni Berzak, Roi Reichart, Boris Katz

    Abstract: This work examines the impact of cross-linguistic transfer on grammatical errors in English as Second Language (ESL) texts. Using a computational framework that formalizes the theory of Contrastive Analysis (CA), we demonstrate that language specific error distributions in ESL writing can be predicted from the typological properties of the native language and their relation to the typology of Engl…

    Submitted 24 March, 2016; originally announced March 2016.

    Comments: Published in CoNLL 2015

    Journal ref: Proceedings of the 19th Conference on Computational Language Learning, pages 94-102, Beijing, China, July 30-31, 2015

  30. arXiv:1404.6312  [pdf, other]

    cs.CL

    Reconstructing Native Language Typology from Foreign Language Usage

    Authors: Yevgeni Berzak, Roi Reichart, Boris Katz

    Abstract: Linguists and psychologists have long been studying cross-linguistic transfer, the influence of native language properties on linguistic performance in a foreign language. In this work we provide empirical evidence for this process in the form of a strong correlation between language similarities derived from structural features in English as Second Language (ESL) texts and equivalent similarities…

    Submitted 28 May, 2014; v1 submitted 25 April, 2014; originally announced April 2014.

    Comments: CoNLL 2014

    Journal ref: Proceedings of the Eighteenth Conference on Computational Language Learning, pages 21-29, Baltimore, Maryland USA, June 26-27 2014