Showing 1–9 of 9 results for author: Ricco, S

  1. arXiv:2401.14322  [pdf, other]

    cs.CV cs.CY

    Generalized People Diversity: Learning a Human Perception-Aligned Diversity Representation for People Images

    Authors: Hansa Srinivasan, Candice Schumann, Aradhana Sinha, David Madras, Gbolahan Oluwafemi Olanubi, Alex Beutel, Susanna Ricco, Jilin Chen

    Abstract: Capturing the diversity of people in images is challenging: recent literature tends to focus on diversifying one or two attributes, requiring expensive attribute labels or building classifiers. We introduce a diverse people image ranking method which more flexibly aligns with human notions of people diversity in a less prescriptive, label-free manner. The Perception-Aligned Text-derived Human repr…

    Submitted 25 January, 2024; originally announced January 2024.

  2. arXiv:2305.09073  [pdf, other]

    cs.CV cs.CY

    Consensus and Subjectivity of Skin Tone Annotation for ML Fairness

    Authors: Candice Schumann, Gbolahan O. Olanubi, Auriel Wright, Ellis Monk Jr., Courtney Heldreth, Susanna Ricco

    Abstract: Understanding different human attributes and how they affect model behavior may become a standard need for all model creation and usage, from traditional computer vision tasks to the newest multimodal generative AI systems. In computer vision specifically, we have relied on datasets augmented with perceived attribute signals (e.g., gender presentation, skin tone, and age) and benchmarks enabled by…

    Submitted 2 January, 2024; v1 submitted 15 May, 2023; originally announced May 2023.

  3. A Step Toward More Inclusive People Annotations for Fairness

    Authors: Candice Schumann, Susanna Ricco, Utsav Prabhu, Vittorio Ferrari, Caroline Pantofaru

    Abstract: The Open Images Dataset contains approximately 9 million images and is a widely accepted dataset for computer vision research. As is common practice for large datasets, the annotations are not exhaustive, with bounding boxes and attribute labels for only a subset of the classes in each image. In this paper, we present a new set of annotations on a subset of the Open Images dataset called the MIAP…

    Submitted 5 May, 2021; originally announced May 2021.

    Journal ref: AIES (2021)

  4. arXiv:1705.08421  [pdf, other]

    cs.CV

    AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

    Authors: Chunhui Gu, Chen Sun, David A. Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik

    Abstract: This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions (AVA). The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently. The key characteristics of our dataset are: (1) the definition of atomic visual…

    Submitted 30 April, 2018; v1 submitted 23 May, 2017; originally announced May 2017.

    Comments: To appear in CVPR 2018. Check dataset page https://research.google.com/ava/ for details

  5. arXiv:1705.02082  [pdf, other]

    cs.CV

    Motion Prediction Under Multimodality with Conditional Stochastic Networks

    Authors: Katerina Fragkiadaki, Jonathan Huang, Alex Alemi, Sudheendra Vijayanarasimhan, Susanna Ricco, Rahul Sukthankar

    Abstract: Given a visual history, multiple future outcomes for a video scene are equally probable, in other words, the distribution of future outcomes has multiple modes. Multimodality is notoriously hard to handle by standard regressors or classifiers: the former regress to the mean and the latter discretize a continuous high dimensional output space. In this work, we present stochastic neural network arch…

    Submitted 5 May, 2017; originally announced May 2017.

  6. arXiv:1704.07804  [pdf, other]

    cs.CV

    SfM-Net: Learning of Structure and Motion from Video

    Authors: Sudheendra Vijayanarasimhan, Susanna Ricco, Cordelia Schmid, Rahul Sukthankar, Katerina Fragkiadaki

    Abstract: We propose SfM-Net, a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations. Given a sequence of frames, SfM-Net predicts depth, segmentation, camera and rigid object motions, converts those into a dense frame-to-frame motion field (optical flow), different…

    Submitted 25 April, 2017; originally announced April 2017.

  7. Behavior Discovery and Alignment of Articulated Object Classes from Unstructured Video

    Authors: Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

    Abstract: We propose an automatic system for organizing the content of a collection of unstructured videos of an articulated object class (e.g. tiger, horse). By exploiting the recurring motion patterns of the class across videos, our system: 1) identifies its characteristic behaviors; and 2) recovers pixel-to-pixel alignments across different instances. Our system can be useful for organizing video collect…

    Submitted 10 August, 2016; v1 submitted 30 November, 2015; originally announced November 2015.

    Comments: 19 pages, 19 figures, 3 tables. arXiv admin note: substantial text overlap with arXiv:1411.7883

    Journal ref: International Journal of Computer Vision (IJCV), July 2016

  8. arXiv:1412.0477  [pdf, other]

    cs.CV

    Recovering Spatiotemporal Correspondence between Deformable Objects by Exploiting Consistent Foreground Motion in Video

    Authors: Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

    Abstract: Given unstructured videos of deformable objects, we automatically recover spatiotemporal correspondences to map one object to another (such as animals in the wild). While traditional methods based on appearance fail in such challenging conditions, we exploit consistency in object motion between instances. Our approach discovers pairs of short video intervals where the object moves in a consistent…

    Submitted 16 August, 2016; v1 submitted 1 December, 2014; originally announced December 2014.

    Comments: 9 pages, 14 figures. This article is obsolete. Its contents are now covered in arXiv:1511.09319, where we discuss a comprehensive system for behavior discovery and spatial alignment of articulated object classes from unstructured video (available at https://arxiv.org/abs/1511.09319)

  9. arXiv:1411.7883  [pdf, other]

    cs.CV

    Articulated motion discovery using pairs of trajectories

    Authors: Luca Del Pero, Susanna Ricco, Rahul Sukthankar, Vittorio Ferrari

    Abstract: We propose an unsupervised approach for discovering characteristic motion patterns in videos of highly articulated objects performing natural, unscripted behaviors, such as tigers in the wild. We discover consistent patterns in a bottom-up manner by analyzing the relative displacements of large numbers of ordered trajectory pairs through time, such that each trajectory is attached to a different m…

    Submitted 24 April, 2015; v1 submitted 28 November, 2014; originally announced November 2014.

    Comments: 10 pages, 5 figures, 2 tables