Showing 1–19 of 19 results for author: Jacoby, N

  1. arXiv:2406.04302  [pdf, other]

    cs.LG

    Representational Alignment Supports Effective Machine Teaching

    Authors: Ilia Sucholutsky, Katherine M. Collins, Maya Malaviya, Nori Jacoby, Weiyang Liu, Theodore R. Sumers, Michalis Korakakis, Umang Bhatt, Mark Ho, Joshua B. Tenenbaum, Brad Love, Zachary A. Pardos, Adrian Weller, Thomas L. Griffiths

    Abstract: A good teacher should not only be knowledgeable, but should be able to communicate in a way that the student understands -- to share the student's representation of the world. In this work, we integrate insights from machine teaching and pragmatic communication with the burgeoning literature on representational alignment to characterize a utility curve defining a relationship between representatio…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Preprint

  2. arXiv:2406.04278  [pdf, other]

    cs.CL cs.HC

    Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People

    Authors: Dun-Ming Huang, Pol van Rijn, Ilia Sucholutsky, Raja Marjieh, Nori Jacoby

    Abstract: Conversational tones -- the manners and attitudes in which speakers communicate -- are essential to effective communication. Amidst the increasing popularization of Large Language Models (LLMs) over recent years, it becomes necessary to characterize the divergences in their conversational tones relative to humans. However, existing investigations of conversational modalities rely on pre-existing t…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Accepted to Main Conference at ACL 2024

  3. arXiv:2402.06992  [pdf, other]

    q-bio.NC cs.AI cs.CL stat.AP

    A Rational Analysis of the Speech-to-Song Illusion

    Authors: Raja Marjieh, Pol van Rijn, Ilia Sucholutsky, Harin Lee, Thomas L. Griffiths, Nori Jacoby

    Abstract: The speech-to-song illusion is a robust psychological phenomenon whereby a spoken sentence sounds increasingly more musical as it is repeated. Despite decades of research, a complete formal account of this transformation is still lacking, and some of its nuanced characteristics, namely, that certain phrases appear to transform while others do not, are not well understood. Here we provide a formal a…

    Submitted 10 February, 2024; originally announced February 2024.

    Comments: 7 pages, 5 figures

  4. Giving Robots a Voice: Human-in-the-Loop Voice Creation and open-ended Labeling

    Authors: Pol van Rijn, Silvan Mertes, Kathrin Janowski, Katharina Weitz, Nori Jacoby, Elisabeth André

    Abstract: Speech is a natural interface for humans to interact with robots. Yet, aligning a robot's voice to its appearance is challenging due to the rich vocabulary of both modalities. Previous research has explored a few labels to describe robots and tested them on a limited number of robots and existing voices. Here, we develop a robot-voice creation tool followed by large-scale behavioral human experime…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Accepted to CHI 2024, May 11 to 16, 2024, Honolulu, HI, USA

  5. arXiv:2310.13018  [pdf, other]

    q-bio.NC cs.AI cs.LG cs.NE

    Getting aligned on representational alignment

    Authors: Ilia Sucholutsky, Lukas Muttenthaler, Adrian Weller, Andi Peng, Andreea Bobu, Been Kim, Bradley C. Love, Erin Grant, Iris Groen, Jascha Achterberg, Joshua B. Tenenbaum, Katherine M. Collins, Katherine L. Hermann, Kerem Oktar, Klaus Greff, Martin N. Hebart, Nori Jacoby, Qiuyi Zhang, Raja Marjieh, Robert Geirhos, Sherol Chen, Simon Kornblith, Sunayana Rane, Talia Konkle, Thomas P. O'Connell, et al. (5 additional authors not shown)

    Abstract: Biological and artificial information processing systems form representations that they can use to categorize, reason, plan, navigate, and make decisions. How can we measure the extent to which the representations formed by these diverse systems agree? Do similarities in representations then translate into similar behavior? How can a system's representations be modified to better match those of an…

    Submitted 2 November, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: Working paper, changes to be made in upcoming revisions

  6. arXiv:2306.08564  [pdf, other]

    q-bio.NC cs.AI stat.AP

    The Universal Law of Generalization Holds for Naturalistic Stimuli

    Authors: Raja Marjieh, Nori Jacoby, Joshua C. Peterson, Thomas L. Griffiths

    Abstract: Shepard's universal law of generalization is a remarkable hypothesis about how intelligent organisms should perceive similarity. In its broadest form, the universal law states that the level of perceived similarity between a pair of stimuli should decay as a concave function of their distance when embedded in an appropriate psychological space. While extensively studied, evidence in support of the…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: 36 pages, 6 figures

  7. arXiv:2302.01614  [pdf, other]

    cs.CL

    Around the world in 60 words: A generative vocabulary test for online research

    Authors: Pol van Rijn, Yue Sun, Harin Lee, Raja Marjieh, Ilia Sucholutsky, Francesca Lanzarini, Elisabeth André, Nori Jacoby

    Abstract: Conducting experiments with diverse participants in their native languages can uncover insights into culture, cognition, and language that may not be revealed otherwise. However, conducting these experiments online makes it difficult to validate self-reported language proficiency. Furthermore, existing proficiency tests are small and cover only a few languages. We present an automated pipeline to…

    Submitted 3 February, 2023; originally announced February 2023.

  8. arXiv:2302.01308  [pdf, other]

    cs.CL cs.LG stat.ML

    Large language models predict human sensory judgments across six modalities

    Authors: Raja Marjieh, Ilia Sucholutsky, Pol van Rijn, Nori Jacoby, Thomas L. Griffiths

    Abstract: Determining the extent to which the perceptual world can be recovered from language is a longstanding problem in philosophy and cognitive science. We show that state-of-the-art large language models can unlock new insights into this problem by providing a lower bound on the amount of perceptual information that can be extracted from language. Specifically, we elicit pairwise similarity judgments f…

    Submitted 15 June, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: 9 pages, 3 figures

  9. arXiv:2211.01407  [pdf, other]

    cs.LG cs.AI

    On the Informativeness of Supervision Signals

    Authors: Ilia Sucholutsky, Ruairidh M. Battleday, Katherine M. Collins, Raja Marjieh, Joshua C. Peterson, Pulkit Singh, Umang Bhatt, Nori Jacoby, Adrian Weller, Thomas L. Griffiths

    Abstract: Supervised learning typically focuses on learning transferable representations from training examples annotated by humans. While rich annotations (like soft labels) carry more information than sparse annotations (like hard labels), they are also more expensive to collect. For example, while hard labels only provide information about the closest class an object belongs to (e.g., "this is a dog"), s…

    Submitted 4 July, 2023; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Proceedings of UAI 2023

  10. arXiv:2209.14821  [pdf, other]

    cs.LG stat.ML

    Analyzing Diffusion as Serial Reproduction

    Authors: Raja Marjieh, Ilia Sucholutsky, Thomas A. Langlois, Nori Jacoby, Thomas L. Griffiths

    Abstract: Diffusion models are a class of generative models that learn to synthesize samples by inverting a diffusion process that gradually maps data into noise. While these models have enjoyed great success recently, a full theoretical understanding of their observed properties is still lacking, in particular, their weak sensitivity to the choice of noise family and the role of adequate scheduling of nois…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: 10 pages, 4 figures

    Journal ref: PMLR 202:24005-24019, 2023

  11. arXiv:2206.04105  [pdf, other]

    cs.CL cs.LG stat.ML

    Words are all you need? Language as an approximation for human similarity judgments

    Authors: Raja Marjieh, Pol van Rijn, Ilia Sucholutsky, Theodore R. Sumers, Harin Lee, Thomas L. Griffiths, Nori Jacoby

    Abstract: Human similarity judgments are a powerful supervision signal for machine learning applications based on techniques such as contrastive learning, information retrieval, and model alignment, but classical methods for collecting human similarity judgments are too expensive to be used at scale. Recent methods propose using pre-trained deep neural networks (DNNs) to approximate human similarity, but pr…

    Submitted 23 February, 2023; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Accepted to ICLR 2023, final revision. https://openreview.net/forum?id=O-G91-4cMdv

  12. arXiv:2205.04820  [pdf, other]

    cs.CL

    Bridging the prosody GAP: Genetic Algorithm with People to efficiently sample emotional prosody

    Authors: Pol van Rijn, Harin Lee, Nori Jacoby

    Abstract: The human voice effectively communicates a range of emotions with nuanced variations in acoustics. Existing emotional speech corpora are limited in that they are either (a) highly curated to induce specific emotions with predefined categories that may not capture the full extent of emotional experiences, or (b) entangled in their semantic and prosodic cues, limiting the ability to study these cues…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: Accepted to CogSci'22

  13. arXiv:2203.16930  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    WavThruVec: Latent speech representation as intermediate features for neural speech synthesis

    Authors: Hubert Siuzdak, Piotr Dura, Pol van Rijn, Nori Jacoby

    Abstract: Recent advances in neural text-to-speech research have been dominated by two-stage pipelines utilizing low-level intermediate speech representation such as mel-spectrograms. However, such predetermined features are fundamentally limited, because they do not allow one to exploit the full potential of a data-driven approach through learning hidden representations. For this reason, several end-to-end met…

    Submitted 11 July, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Accepted to INTERSPEECH 2022. Audio samples are available at: https://charactr-platform.github.io/WavThruVec/

  14. arXiv:2203.15379  [pdf, other]

    cs.SD cs.HC eess.AS

    VoiceMe: Personalized voice generation in TTS

    Authors: Pol van Rijn, Silvan Mertes, Dominik Schiller, Piotr Dura, Hubert Siuzdak, Peter M. C. Harrison, Elisabeth André, Nori Jacoby

    Abstract: Novel text-to-speech systems can generate entirely new voices that were not seen during training. However, it remains a difficult task to efficiently create personalized voices from a high-dimensional speaker space. In this work, we use speaker embeddings from a state-of-the-art speaker verification model (SpeakerNet) trained on thousands of speakers to condition a TTS model. We employ a human sam…

    Submitted 11 July, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

    Comments: Accepted to Interspeech'22. Audio and video samples are available at: https://polvanrijn.github.io/VoiceMe/

  15. arXiv:2202.04728  [pdf, other]

    cs.LG cs.CL

    Predicting Human Similarity Judgments Using Large Language Models

    Authors: Raja Marjieh, Ilia Sucholutsky, Theodore R. Sumers, Nori Jacoby, Thomas L. Griffiths

    Abstract: Similarity judgments provide a well-established method for accessing mental representations, with applications in psychology, neuroscience and machine learning. However, collecting similarity judgments can be prohibitively expensive for naturalistic datasets as the number of comparisons grows quadratically in the number of stimuli. One way to tackle this problem is to construct approximation proce…

    Submitted 9 February, 2022; originally announced February 2022.

    Comments: 7 pages, 6 figures

  16. arXiv:2108.00768  [pdf]

    cs.IR cs.SD eess.AS

    Cross-cultural Mood Perception in Pop Songs and its Alignment with Mood Detection Algorithms

    Authors: Harin Lee, Frank Hoeger, Marc Schoenwiesner, Minsu Park, Nori Jacoby

    Abstract: Do people from different cultural backgrounds perceive the mood in music the same way? How closely do human ratings across different cultures approximate automatic mood detection algorithms that are often trained on corpora of predominantly Western popular music? Analyzing 166 participants' responses from Brazil, South Korea, and the US, we examined the similarity between the ratings of nine catego…

    Submitted 2 August, 2021; originally announced August 2021.

    Comments: 8 pages, 5 figures, to be included as proceedings for the 22nd International Society of Music Information Retrieval (ISMIR)

    Journal ref: Proceedings of the 22nd International Society for Music Information Retrieval Conference, Nov. 2021, pp. 366-373

  17. arXiv:2107.07013  [pdf, other]

    cs.CV

    Passive Attention in Artificial Neural Networks Predicts Human Visual Selectivity

    Authors: Thomas A. Langlois, H. Charles Zhao, Erin Grant, Ishita Dasgupta, Thomas L. Griffiths, Nori Jacoby

    Abstract: Developments in machine learning interpretability techniques over the past decade have provided new tools to observe the image regions that are most informative for classification and localization in artificial neural networks (ANNs). Are the same regions similarly informative to human observers? Using data from 79 new experiments and 7,810 participants, we show that passive attention techniques r…

    Submitted 31 October, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

  18. Exploring emotional prototypes in a high dimensional TTS latent space

    Authors: Pol van Rijn, Silvan Mertes, Dominik Schiller, Peter M. C. Harrison, Pauline Larrouy-Maestri, Elisabeth André, Nori Jacoby

    Abstract: Recent TTS systems are able to generate prosodically varied and realistic speech. However, it is unclear how this prosodic variation contributes to the perception of speakers' emotional states. Here we use the recent psychological paradigm 'Gibbs Sampling with People' to search the prosodic latent space in a trained GST Tacotron model to explore prototypes of emotional prosody. Participants are re…

    Submitted 5 May, 2021; originally announced May 2021.

    Comments: Submitted to INTERSPEECH'21

  19. arXiv:2008.02595  [pdf, other]

    q-bio.NC cs.AI cs.CV stat.AP

    Gibbs Sampling with People

    Authors: Peter M. C. Harrison, Raja Marjieh, Federico Adolfi, Pol van Rijn, Manuel Anglada-Tort, Ofer Tchernichovski, Pauline Larrouy-Maestri, Nori Jacoby

    Abstract: A core problem in cognitive science and machine learning is to understand how humans derive semantic representations from perceptual objects, such as color from an apple, pleasantness from a musical chord, or seriousness from a face. Markov Chain Monte Carlo with People (MCMCP) is a prominent method for studying such representations, in which participants are presented with binary choice trials co…

    Submitted 2 November, 2020; v1 submitted 6 August, 2020; originally announced August 2020.

    Comments: Accepted for oral presentation at NeurIPS 2020