
Showing 1–19 of 19 results for author: Lang, O

  1. arXiv:2406.18563  [pdf]

    cs.CY cs.AI cs.CV cs.LG

    Interdisciplinary Expertise to Advance Equitable Explainable AI

    Authors: Chloe R. Bennett, Heather Cole-Lewis, Stephanie Farquhar, Naama Hammel, Boris Babenko, Oran Lang, Mat Fleck, Ilana Traynis, Charles Lau, Ivor Horn, Courtney Lyles

    Abstract: The field of artificial intelligence (AI) is rapidly influencing health and healthcare, but bias and poor performance persists for populations who face widespread structural oppression. Previous work has clearly outlined the need for more rigorous attention to data representativeness and model performance to advance equity and reduce bias. However, there is an opportunity to also improve the expla…

    Submitted 29 May, 2024; originally announced June 2024.

  2. arXiv:2405.14655  [pdf, other]

    cs.LG

    Multi-turn Reinforcement Learning from Preference Human Feedback

    Authors: Lior Shani, Aviv Rosenberg, Asaf Cassel, Oran Lang, Daniele Calandriello, Avital Zipori, Hila Noga, Orgad Keller, Bilal Piot, Idan Szpektor, Avinatan Hassidim, Yossi Matias, Rémi Munos

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has become the standard approach for aligning Large Language Models (LLMs) with human preferences, allowing LLMs to demonstrate remarkable abilities in various tasks. Existing methods work by emulating the preferences at the single decision (turn) level, limiting their capabilities in settings that require planning or multi-turn interactions to ach…

    Submitted 23 May, 2024; originally announced May 2024.

  3. arXiv:2403.10578  [pdf, other]

    stat.ML cs.LG math.DS math.NA physics.flu-dyn

    Generative Modelling of Stochastic Rotating Shallow Water Noise

    Authors: Dan Crisan, Oana Lang, Alexander Lobbe

    Abstract: In recent work, the authors have developed a generic methodology for calibrating the noise in fluid dynamics stochastic partial differential equations where the stochasticity was introduced to parametrize subgrid-scale processes. The stochastic parameterization of sub-grid scale processes is required in the estimation of uncertainty in weather and climate predictions, to represent systematic model…

    Submitted 15 March, 2024; originally announced March 2024.

    MSC Class: 68T05; 76M35

  4. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  5. arXiv:2308.12591  [pdf, other]

    eess.SP cs.AI

    SICNN: Soft Interference Cancellation Inspired Neural Network Equalizers

    Authors: Stefan Baumgartner, Oliver Lang, Mario Huemer

    Abstract: In recent years data-driven machine learning approaches have been extensively studied to replace or enhance traditionally model-based processing in digital communication systems. In this work, we focus on equalization and propose a novel neural network (NN-)based approach, referred to as SICNN. SICNN is designed by deep unfolding a model-based iterative soft interference cancellation (SIC) method.…

    Submitted 11 March, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

  6. arXiv:2306.16428  [pdf, ps, other]

    cs.LG math.ST

    Complex-valued Adaptive System Identification via Low-Rank Tensor Decomposition

    Authors: Oliver Ploder, Christina Auer, Oliver Lang, Thomas Paireder, Mario Huemer

    Abstract: Machine learning (ML) and tensor-based methods have been of significant interest for the scientific community for the last few decades. In a previous work we presented a novel tensor-based system identification framework to ease the computational burden of tensor-only architectures while still being able to achieve exceptionally good performance. However, the derived approach only allows to proces…

    Submitted 28 June, 2023; originally announced June 2023.

  7. arXiv:2306.00985  [pdf]

    eess.IV cs.CV cs.LG

    Using generative AI to investigate medical imagery models and datasets

    Authors: Oran Lang, Doron Yaya-Stupp, Ilana Traynis, Heather Cole-Lewis, Chloe R. Bennett, Courtney Lyles, Charles Lau, Michal Irani, Christopher Semturs, Dale R. Webster, Greg S. Corrado, Avinatan Hassidim, Yossi Matias, Yun Liu, Naama Hammel, Boris Babenko

    Abstract: AI models have shown promise in many medical imaging tasks. However, our ability to explain what signals these models have learned is severely lacking. Explanations are needed in order to increase the trust in AI-based models, and could enable novel scientific discovery by uncovering signals in the data that are not yet known to experts. In this paper, we present a method for automatic visual expl…

    Submitted 4 July, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 43 pages, 1 figure

    Journal ref: EBioMedicine 102 (2024)

  8. arXiv:2306.00966  [pdf, other]

    cs.CV

    The Hidden Language of Diffusion Models

    Authors: Hila Chefer, Oran Lang, Mor Geva, Volodymyr Polosukhin, Assaf Shocher, Michal Irani, Inbar Mosseri, Lior Wolf

    Abstract: Text-to-image diffusion models have demonstrated an unparalleled ability to generate high-quality, diverse images from a textual prompt. However, the internal representations learned by these models remain an enigma. In this work, we present Conceptor, a novel method to interpret the internal representation of a textual concept by a diffusion model. This interpretation is obtained by decomposing t…

    Submitted 5 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

  9. arXiv:2305.10400  [pdf, other]

    cs.CL cs.CV

    What You See is What You Read? Improving Text-Image Alignment Evaluation

    Authors: Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor

    Abstract: Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks. In this work, we study methods for automatic text-image alignment evaluation. We first introduce SeeTRUE: a comprehensive evaluation set, spanning multiple datasets from both text-to…

    Submitted 26 December, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023. Website: https://wysiwyr-itm.github.io/

  10. arXiv:2303.12438  [pdf, ps, other]

    cs.IT eess.SP

    Doppler-Division Multiplexing for MIMO OFDM Joint Sensing and Communications

    Authors: Oliver Lang, Christian Hofbauer, Reinhard Feger, Mario Huemer

    Abstract: A promising waveform candidate for future joint sensing and communication systems is orthogonal frequency-division multiplexing (OFDM). For such systems, supporting multiple transmit antennas requires multiplexing methods for the generation of orthogonal transmit signals, where equidistant subcarrier interleaving (ESI) is the most popular multiplexing method. In this work, we analyze a multiplexing…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 13 pages, 11 figures

  11. arXiv:2211.06054  [pdf, other]

    cs.IT eess.SP

    Neural Network Approaches for Data Estimation in Unique Word OFDM Systems

    Authors: Stefan Baumgartner, Gergő Bognár, Oliver Lang, Mario Huemer

    Abstract: Data estimation is conducted with model-based estimation methods since the beginning of digital communications. However, motivated by the growing success of machine learning, current research focuses on replacing model-based data estimation methods by data-driven approaches, mainly neural networks (NNs). In this work, we particularly investigate the incorporation of existing model knowledge into d…

    Submitted 11 November, 2022; originally announced November 2022.

  12. arXiv:2210.09276  [pdf, other]

    cs.CV

    Imagic: Text-Based Real Image Editing with Diffusion Models

    Authors: Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, Michal Irani

    Abstract: Text-conditioned image editing has recently attracted considerable interest. However, most methods are currently either limited to specific editing types (e.g., object overlay, style transfer), or apply to synthetically generated images, or require multiple input images of a common object. In this paper we demonstrate, for the very first time, the ability to apply complex (e.g., non-rigid) text-gu…

    Submitted 20 March, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: Project page: https://imagic-editing.github.io/

  13. arXiv:2202.12211  [pdf, other]

    cs.CV

    Self-Distilled StyleGAN: Towards Generation from Internet Photos

    Authors: Ron Mokady, Michal Yarom, Omer Tov, Oran Lang, Daniel Cohen-Or, Tali Dekel, Michal Irani, Inbar Mosseri

    Abstract: StyleGAN is known to produce high-fidelity images, while also offering unprecedented semantic editing. However, these fascinating abilities have been demonstrated only on a limited set of datasets, which are usually structurally aligned and well curated. In this paper, we show how StyleGAN can be adapted to work on raw uncurated images collected from the Internet. Such image collections impose two…

    Submitted 24 February, 2022; originally announced February 2022.

  14. arXiv:2104.13369  [pdf, other]

    cs.CV cs.LG cs.NE eess.IV stat.ML

    Explaining in Style: Training a GAN to explain a classifier in StyleSpace

    Authors: Oran Lang, Yossi Gandelsman, Michal Yarom, Yoav Wald, Gal Elidan, Avinatan Hassidim, William T. Freeman, Phillip Isola, Amir Globerson, Michal Irani, Inbar Mosseri

    Abstract: Image classification models can depend on multiple different semantic attributes of the image. An explanation of the decision of the classifier needs to both discover and visualize these properties. Here we present StylEx, a method for doing this, by training a generative model to specifically explain multiple attributes that underlie classifier decisions. A natural source for such attributes is t…

    Submitted 1 September, 2021; v1 submitted 27 April, 2021; originally announced April 2021.

    Comments: Accepted to ICCV 2021. Project page: https://explaining-in-style.github.io/, Code: https://github.com/google/explaining-in-style

  15. arXiv:2004.06130  [pdf, other]

    cs.CV

    SpeedNet: Learning the Speediness in Videos

    Authors: Sagie Benaim, Ariel Ephrat, Oran Lang, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Michal Irani, Tali Dekel

    Abstract: We wish to automatically predict the "speediness" of moving objects in videos---whether they move faster, at, or slower than their "natural" speed. The core component in our approach is SpeedNet---a novel deep network trained to detect if a video is playing at normal rate, or if it is sped up. SpeedNet is trained on a large corpus of natural videos in a self-supervised manner, without requiring an…

    Submitted 26 July, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: Accepted to CVPR 2020 (oral). Project webpage: http://speednet-cvpr20.github.io

  16. arXiv:2002.12764  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Towards Learning a Universal Non-Semantic Representation of Speech

    Authors: Joel Shor, Aren Jansen, Ronnie Maor, Oran Lang, Omry Tuval, Felix de Chaumont Quitry, Marco Tagliasacchi, Ira Shavitt, Dotan Emanuel, Yinnon Haviv

    Abstract: The ultimate goal of transfer learning is to reduce labeled data requirements by exploiting a pre-existing embedding model trained for different datasets or tasks. The visual and language communities have established benchmarks to compare embeddings, but the speech community has yet to do so. This paper proposes a benchmark for comparing speech representations on non-semantic tasks, and proposes a…

    Submitted 6 August, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Journal ref: Proceedings of INTERSPEECH 2020

  17. arXiv:1907.13511  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Personalizing ASR for Dysarthric and Accented Speech with Limited Data

    Authors: Joel Shor, Dotan Emanuel, Oran Lang, Omry Tuval, Michael Brenner, Julie Cattiau, Fernando Vieira, Maeve McNally, Taylor Charbonneau, Melissa Nollstadt, Avinatan Hassidim, Yossi Matias

    Abstract: Automatic speech recognition (ASR) systems have dramatically improved over the last few years. ASR systems are most often trained from 'typical' speech, which means that underrepresented groups don't experience the same level of improvement. In this paper, we present and evaluate finetuning techniques to improve ASR for users with non-standard speech. We focus on two types of non-standard speech:…

    Submitted 31 July, 2019; originally announced July 2019.

    Comments: 5 pages

  18. arXiv:1804.03619  [pdf, other]

    cs.SD cs.CV eess.AS

    Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation

    Authors: Ariel Ephrat, Inbar Mosseri, Oran Lang, Tali Dekel, Kevin Wilson, Avinatan Hassidim, William T. Freeman, Michael Rubinstein

    Abstract: We present a joint audio-visual model for isolating a single speech signal from a mixture of sounds such as other speakers and background noise. Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video. In this paper, we present a deep network-based model that incorporates both visual and aud…

    Submitted 9 August, 2018; v1 submitted 10 April, 2018; originally announced April 2018.

    Comments: Accepted to SIGGRAPH 2018. Project webpage: https://looking-to-listen.github.io

    Journal ref: ACM Trans. Graph. 37(4): 112:1-112:11 (2018)

  19. arXiv:1612.04059  [pdf, ps, other]

    math.ST cs.IT

    Parameter Estimation Under Model Uncertainties by Iterative Covariance Approximation

    Authors: Oliver Lang, Michael Lunglmayr, Mario Huemer

    Abstract: We propose a novel iterative algorithm for estimating a deterministic but unknown parameter vector in the presence of model uncertainties. This iterative algorithm is based on a system model where an overall noise term describes both, the measurement noise and the noise resulting from the model uncertainties. This overall noise term is a function of the true parameter vector, allowing for an itera…

    Submitted 23 November, 2017; v1 submitted 13 December, 2016; originally announced December 2016.