Showing 1–12 of 12 results for author: Dufour, N

  1. arXiv:2407.01516  [pdf, other]

    cs.CV

    E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness

    Authors: Robin Courant, Nicolas Dufour, Xi Wang, Marc Christie, Vicky Kalogeiton

    Abstract: Stories and emotions in movies emerge through the effect of well-thought-out directing decisions, in particular camera placement and movement over time. Crafting compelling camera trajectories remains a complex iterative process, even for skilful artists. To tackle this, in this paper, we propose a dataset called the Exceptional Trajectories (E.T.) with camera trajectories along with character inf…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: ECCV 2024. Project page: https://www.lix.polytechnique.fr/vista/projects/2024_et_courant/

  2. arXiv:2405.20324  [pdf, other]

    cs.CV cs.LG

    Don't drop your samples! Coherence-aware training benefits Conditional diffusion

    Authors: Nicolas Dufour, Victor Besnier, Vicky Kalogeiton, David Picard

    Abstract: Conditional diffusion models are powerful generative models that can leverage various types of conditional information, such as class labels, segmentation masks, or text captions. However, in many real-world scenarios, conditional information may be noisy or unreliable due to human annotation errors or weak alignment. In this paper, we propose the Coherence-Aware Diffusion (CAD), a novel method th…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted at CVPR 2024 as a Highlight. Project page: https://nicolas-dufour.github.io/cad.html

  3. arXiv:2405.11697  [pdf, other]

    cs.CY

    AMMeBa: A Large-Scale Survey and Dataset of Media-Based Misinformation In-The-Wild

    Authors: Nicholas Dufour, Arkanath Pathak, Pouya Samangouei, Nikki Hariri, Shashi Deshetti, Andrew Dudfield, Christopher Guess, Pablo Hernández Escayola, Bobby Tran, Mevan Babakar, Christoph Bregler

    Abstract: The prevalence and harms of online misinformation is a perennial concern for internet platforms, institutions and society at large. Over time, information shared online has become more media-heavy and misinformation has readily adapted to these new modalities. The rise of generative AI-based tools, which provide widely-accessible methods for synthesizing realistic audio, images, video and human-li…

    Submitted 21 May, 2024; v1 submitted 19 May, 2024; originally announced May 2024.

    Comments: Grammar, spelling corrections. Minor rewording and clarification of one sentence. 24 pages, 31 figures

  4. arXiv:2404.18873  [pdf, other]

    cs.CV cs.AI

    OpenStreetView-5M: The Many Roads to Global Visual Geolocation

    Authors: Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis, Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao XU, Hongyu Zhou, Loic Landrieu

    Abstract: Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms. Yet, the absence of standard, large-scale, open-access datasets with reliably localizable images has limited its potential. To address this issue, we introduce OpenStreetView-5M, a large-scale, open-access dataset comprising over 5.1 milli…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: CVPR 2024

  5. arXiv:2404.13040  [pdf, other]

    cs.CV cs.LG

    Analysis of Classifier-Free Guidance Weight Schedulers

    Authors: Xi Wang, Nicolas Dufour, Nefeli Andreou, Marie-Paule Cani, Victoria Fernandez Abrevaya, David Picard, Vicky Kalogeiton

    Abstract: Classifier-Free Guidance (CFG) enhances the quality and condition adherence of text-to-image diffusion models. It operates by combining the conditional and unconditional predictions using a fixed weight. However, recent works vary the weights throughout the diffusion process, reporting superior results but without providing any rationale or analysis. By conducting comprehensive experiments, this p…

    Submitted 19 April, 2024; originally announced April 2024.

  6. arXiv:2303.15533  [pdf, other]

    cs.LG cs.CV

    Sequential training of GANs against GAN-classifiers reveals correlated "knowledge gaps" present among independently trained GAN instances

    Authors: Arkanath Pathak, Nicholas Dufour

    Abstract: Modern Generative Adversarial Networks (GANs) generate realistic images remarkably well. Previous work has demonstrated the feasibility of "GAN-classifiers" that are distinct from the co-trained discriminator, and operate on images generated from a frozen GAN. That such classifiers work at all affirms the existence of "knowledge gaps" (out-of-distribution artifacts across samples) present in GAN t…

    Submitted 27 March, 2023; originally announced March 2023.

  7. arXiv:2303.12068  [pdf, other]

    cs.CV

    Machine Learning for Brain Disorders: Transformers and Visual Transformers

    Authors: Robin Courant, Maika Edberg, Nicolas Dufour, Vicky Kalogeiton

    Abstract: Transformers were initially introduced for natural language processing (NLP) tasks, but fast they were adopted by most deep learning fields, including computer vision. They measure the relationships between pairs of input tokens (words in the case of text strings, parts of images for visual Transformers), termed attention. The cost is exponential with the number of tokens. For image classification…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: To appear in O. Colliot (Ed.), Machine Learning for Brain Disorders, Springer

  8. arXiv:2212.10957  [pdf, other]

    cs.CV

    TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization

    Authors: Fabrizio Guillaro, Davide Cozzolino, Avneesh Sud, Nicholas Dufour, Luisa Verdoliva

    Abstract: In this paper we present TruFor, a forensic framework that can be applied to a large variety of image manipulation methods, from classic cheapfakes to more recent manipulations based on deep learning. We rely on the extraction of both high-level and low-level traces through a transformer-based fusion architecture that combines the RGB image and a learned noise-sensitive fingerprint. The latter lea…

    Submitted 25 May, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

  9. arXiv:2210.04883  [pdf, other]

    cs.CV cs.AI cs.LG

    SCAM! Transferring humans between images with Semantic Cross Attention Modulation

    Authors: Nicolas Dufour, David Picard, Vicky Kalogeiton

    Abstract: A large body of recent work targets semantically conditioned image generation. Most such methods focus on the narrower task of pose transfer and ignore the more challenging task of subject transfer that consists in not only transferring the pose but also the appearance and background. In this work, we introduce SCAM (Semantic Cross Attention Modulation), a system that encodes rich and diverse info…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: Accepted at ECCV 2022

  10. arXiv:2003.12170  [pdf, other]

    cs.LG stat.ML

    Log-Likelihood Ratio Minimizing Flows: Towards Robust and Quantifiable Neural Distribution Alignment

    Authors: Ben Usman, Avneesh Sud, Nick Dufour, Kate Saenko

    Abstract: Distribution alignment has many applications in deep learning, including domain adaptation and unsupervised image-to-image translation. Most prior work on unsupervised distribution alignment relies either on minimizing simple non-parametric statistical distances such as maximum mean discrepancy or on adversarial alignment. However, the former fails to capture the structure of complex real-world di…

    Submitted 26 October, 2020; v1 submitted 26 March, 2020; originally announced March 2020.

  11. arXiv:1901.10024  [pdf, other]

    cs.LG cs.GR stat.ML

    Cross-Domain Image Manipulation by Demonstration

    Authors: Ben Usman, Nick Dufour, Kate Saenko, Chris Bregler

    Abstract: In this work we propose a model that can manipulate individual visual attributes of objects in a real scene using examples of how respective attribute manipulations affect the output of a simulation. As an example, we train our model to manipulate the expression of a human face using nonphotorealistic 3D renders of a face with varied expression. Our model manages to preserve all other visual attri…

    Submitted 3 April, 2019; v1 submitted 28 January, 2019; originally announced January 2019.

  12. arXiv:1707.07204  [pdf, other]

    cs.CV

    Eyemotion: Classifying facial expressions in VR using eye-tracking cameras

    Authors: Steven Hickson, Nick Dufour, Avneesh Sud, Vivek Kwatra, Irfan Essa

    Abstract: One of the main challenges of social interaction in virtual reality settings is that head-mounted displays occlude a large portion of the face, blocking facial expressions and thereby restricting social engagement cues among users. Hence, auxiliary means of sensing and conveying these expressions are needed. We present an algorithm to automatically infer expressions by analyzing only a partially o…

    Submitted 28 July, 2017; v1 submitted 22 July, 2017; originally announced July 2017.

    Comments: Uploaded Supplementary PDF. Fixed author affiliation. Corrected typo in personalization accuracy