-
Depth Extraction from Video Using Non-parametric Sampling
Authors:
Kevin Karsch,
Ce Liu,
Sing Bing Kang
Abstract:
We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large dataset containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.
Submitted 24 December, 2019;
originally announced February 2020.
-
DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling
Authors:
Kevin Karsch,
Ce Liu,
Sing Bing Kang
Abstract:
We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling. We demonstrate our technique in cases where past methods fail (non-translating cameras and dynamic scenes). Our technique is applicable to single images as well as videos. For videos, we use local motion cues to improve the inferred depth maps, while optical flow is used to ensure temporal depth consistency. For training and evaluation, we use a Kinect-based system to collect a large dataset containing stereoscopic videos with known depths. We show that our depth estimation technique outperforms the state-of-the-art on benchmark databases. Our technique can be used to automatically convert a monoscopic video into stereo for 3D visualization, and we demonstrate this through a variety of visually pleasing results for indoor and outdoor scenes, including results from the feature film Charade.
Submitted 24 December, 2019;
originally announced January 2020.
-
Inverse Rendering Techniques for Physically Grounded Image Editing
Authors:
Kevin Karsch
Abstract:
From a single picture of a scene, people can typically grasp the spatial layout immediately and even make good guesses at material properties and where light is coming from to illuminate the scene. For example, we can reliably tell which objects occlude others, what an object is made of and its rough shape, regions that are illuminated or in shadow, and so on. It is interesting how little is known about our ability to make these determinations; as such, we are still not able to robustly "teach" computers to make the same high-level observations as people. This document presents algorithms for understanding intrinsic scene properties from single images. The goal of these inverse rendering techniques is to estimate the configurations of scene elements (geometry, materials, luminaires, camera parameters, etc.) using only information visible in an image. Such algorithms have applications in robotics and computer graphics. One such application is in physically grounded image editing: photo editing made easier by leveraging knowledge of the physical space. These applications allow sophisticated editing operations to be performed in a matter of seconds, enabling seamless addition, removal, or relocation of objects in images.
Submitted 24 December, 2019;
originally announced January 2020.
-
Lightform: Procedural Effects for Projected AR
Authors:
Brittany Factura,
Laura LaPerche,
Phil Reyneri,
Brett Jones,
Kevin Karsch
Abstract:
Projected augmented reality, also called projection mapping or video mapping, is a form of augmented reality that uses projected light to directly augment 3D surfaces, as opposed to using pass-through screens or headsets. The value of projected AR is its ability to add a layer of digital content directly onto physical objects or environments in a way that can be instantaneously viewed by multiple people, unencumbered by a screen or additional setup.
Because projected AR typically involves projecting onto non-flat, textured objects (especially those that are conventionally not used as projection surfaces), the digital content needs to be mapped and aligned to precisely fit the physical scene to ensure a compelling experience. Current projected AR techniques require extensive calibration at the time of installation, which is not conducive to iteration or change, whether intentional (the scene is reconfigured) or not (the projector is bumped or settles). The workflows are undefined and fragmented, thus making it confusing and difficult for many to approach projected AR. For example, a digital artist may have the software expertise to create AR content, but could not complete an installation without experience in mounting, blending, and realigning projector(s); the converse is true for many A/V installation teams/professionals. Projection mapping has therefore been limited to high-end event productions, concerts, and films, because it requires expensive, complex tools, and skilled teams ($100K+ budgets).
Lightform provides a technology that makes projected AR approachable, practical, intelligent, and robust through integrated hardware and computer-vision software. Lightform brings together and unites a currently fragmented workflow into a single cohesive process that provides users with an approachable and robust method to create and control projected AR experiences.
Submitted 24 December, 2019;
originally announced January 2020.
-
Automatic Scene Inference for 3D Object Compositing
Authors:
Kevin Karsch,
Kalyan Sunkavalli,
Sunil Hadap,
Nathan Carr,
Hailin Jin,
Rafael Fonte,
Michael Sittig
Abstract:
We present a user-friendly image editing system that supports drag-and-drop object insertion (where the user merely drags objects into the image, and the system automatically places them in 3D and relights them appropriately), post-process illumination editing, and depth-of-field manipulation. Underlying our system is a fully automatic technique for recovering a comprehensive 3D scene model (geometry, illumination, diffuse albedo and camera parameters) from a single, low dynamic range photograph. This is made possible by two novel contributions: an illumination inference algorithm that recovers a full lighting model of the scene (including light sources that are not directly visible in the photograph), and a depth estimation algorithm that combines data-driven depth transfer with geometric reasoning about the scene layout. A user study shows that our system produces perceptually convincing results, and achieves the same level of realism as techniques that require significant user interaction.
Submitted 24 December, 2019;
originally announced December 2019.
-
Blind Recovery of Spatially Varying Reflectance from a Single Image
Authors:
Kevin Karsch,
David Forsyth
Abstract:
We propose a new technique for estimating spatially varying parametric materials from a single image of an object with unknown shape in unknown illumination. Our method uses a low-order parametric reflectance model, and incorporates strong assumptions about lighting and shape. We develop new priors about how materials mix over space, and jointly infer all of these properties from a single image. This produces a decomposition of an image which corresponds, in one sense, to microscopic features (material reflectance) and macroscopic features (weights defining the mixing properties of materials over space). We have built a large dataset of real objects rendered with different material models under different illumination fields for training and ground truth evaluation. Extensive experiments on both our synthetic dataset images as well as real images show that (a) our method recovers parameters with reasonable accuracy; (b) material parameters recovered by our method give accurate predictions of new renderings of the object; and (c) our low-order reflectance model still provides a good fit to many real-world reflectances.
Submitted 24 December, 2019;
originally announced December 2019.
-
ConstructAide: Analyzing and Visualizing Construction Sites through Photographs and Building Models
Authors:
Kevin Karsch,
Mani Golparvar-Fard,
David Forsyth
Abstract:
We describe a set of tools for analyzing, visualizing, and assessing architectural/construction progress with unordered photo collections and 3D building models. With our interface, a user guides the registration of the model in one of the images, and our system automatically computes the alignment for the rest of the photos using a novel Structure-from-Motion (SfM) technique; images with nearby viewpoints are also brought into alignment with each other. After aligning the photo(s) and model(s), our system allows a user, such as a project manager or facility owner, to explore the construction site seamlessly in time, monitor the progress of construction, assess errors and deviations, and create photorealistic architectural visualizations. These interactions are facilitated by automatic reasoning performed by our system: static and dynamic occlusions are removed automatically, rendering information is collected, and semantic selection tools help guide user input. We also demonstrate that our user-assisted SfM method outperforms existing techniques on both real-world construction data and established multi-view datasets.
Submitted 24 December, 2019;
originally announced December 2019.
-
Boundary Cues for 3D Object Shape Recovery
Authors:
Kevin Karsch,
Zicheng Liao,
Jason Rock,
Jonathan T. Barron,
Derek Hoiem
Abstract:
Early work in computer vision considered a host of geometric cues for both shape reconstruction and recognition. However, since then, the vision community has focused heavily on shading cues for reconstruction, and moved towards data-driven approaches for recognition. In this paper, we reconsider these perhaps overlooked "boundary" cues (such as self occlusions and folds in a surface), as well as many other established constraints for shape reconstruction. In a variety of user studies and quantitative tasks, we evaluate how well these cues inform shape reconstruction (relative to each other) in terms of both shape quality and shape recognition. Our findings suggest many new directions for future research in shape reconstruction, such as automatic boundary cue detection and relaxing assumptions in shape from shading (e.g. orthographic projection, Lambertian surfaces).
Submitted 24 December, 2019;
originally announced December 2019.
-
Rendering Synthetic Objects into Legacy Photographs
Authors:
Kevin Karsch,
Varsha Hedau,
David Forsyth,
Derek Hoiem
Abstract:
We propose a method to realistically insert synthetic objects into existing photographs without requiring access to the scene or any additional scene measurements. With a single image and a small amount of annotation, our method creates a physical model of the scene that is suitable for realistically rendering synthetic objects with diffuse, specular, and even glowing materials while accounting for lighting interactions between the objects and the scene. We demonstrate in a user study that synthetic images produced by our method are confusable with real scenes, even for people who believe they are good at telling the difference. Further, our study shows that our method is competitive with other insertion methods while requiring less scene information. We also collected new illumination and reflectance datasets; renderings produced by our system compare well to ground truth. Our system has applications in the movie and gaming industry, as well as home decorating and user content creation, among others.
Submitted 24 December, 2019;
originally announced December 2019.
-
A Fast, Semi-Automatic Brain Structure Segmentation Algorithm for Magnetic Resonance Imaging
Authors:
Kevin Karsch,
Qing He,
Ye Duan
Abstract:
Medical image segmentation has become an essential technique in clinical and research-oriented applications. Because manual segmentation methods are tedious, and fully automatic segmentation lacks the flexibility of human intervention or correction, semi-automatic methods have become the preferred type of medical image segmentation. We present a hybrid, semi-automatic segmentation method in 3D that integrates both region-based and boundary-based procedures. Our method differs from previous hybrid methods in that we perform region-based and boundary-based approaches separately, which allows for more efficient segmentation. A region-based technique is used to generate an initial seed contour that roughly represents the boundary of a target brain structure, alleviating the local minima problem in the subsequent model deformation phase. The contour is deformed under a unique force equation independent of image edges. Experiments on MRI data show that this method can achieve high accuracy and efficiency primarily due to the unique seed initialization technique.
Submitted 20 April, 2019;
originally announced April 2019.
-
Web Based Brain Volume Calculation for Magnetic Resonance Images
Authors:
Kevin Karsch,
Brian Grinstead,
Qing He,
Ye Duan
Abstract:
Brain volume calculations are crucial in modern medical research, especially in the study of neurodevelopmental disorders. In this paper, we present an algorithm for calculating two classifications of brain volume, total brain volume (TBV) and intracranial volume (ICV). Our algorithm takes MRI data as input, performs several preprocessing and intermediate steps, and then returns each of the two calculated volumes. To simplify this process and make our algorithm publicly accessible to anyone, we have created a web-based interface that allows users to upload their own MRI data and calculate the TBV and ICV for the given data. This interface provides a simple and efficient method for calculating these two classifications of brain volume, and it also removes the need for the user to download or install any applications.
Submitted 20 April, 2019;
originally announced April 2019.
-
Snaxels on a Plane
Authors:
Kevin Karsch,
John C. Hart
Abstract:
While many algorithms exist for tracing various contours for illustrating a meshed object, few algorithms organize these contours into region-bounding closed loops. Tracing closed-loop boundaries on a mesh can be problematic due to switchbacks caused by subtle surface variation, and the organization of these regions into a planar map can lead to many small region components due to imprecision and noise. This paper adapts "snaxels," an energy minimizing active contour method designed for robust mesh processing, and repurposes it to generate visual, shadow and shading contours, and a simplified visual-surface planar map, useful for stylized vector art illustration of the mesh. The snaxel active contours can also track contours as the mesh animates, and frame-to-frame correspondences between snaxels lead to a new method to convert the moving contours on a 3-D animated mesh into 2-D SVG curve animations for efficient embedding in Flash, PowerPoint and other dynamic vector art platforms.
Submitted 20 April, 2019;
originally announced April 2019.
-
User interface design for military AR applications
Authors:
Mark A. Livingston,
Zhuming Ai,
Kevin Karsch,
Gregory O. Gibson
Abstract:
Designing a user interface for military situation awareness presents challenges for managing information in a useful and usable manner. We present an integrated set of functions for the presentation of and interaction with information for a mobile augmented reality application for military applications. Our research has concentrated on four areas. We filter information based on relevance to the user (in turn based on location), evaluate methods for presenting information that represents entities occluded from the user's view, enable interaction through a top-down map view metaphor akin to current techniques used in the military, and facilitate collaboration with other mobile users and/or a command center. In addition, we refined the user interface architecture to conform to requirements from subject matter experts. We discuss the lessons learned in our work and directions for future research.
Submitted 20 April, 2019;
originally announced April 2019.
-
An Approximate Shading Model with Detail Decomposition for Object Relighting
Authors:
Zicheng Liao,
Kevin Karsch,
Hongyi Zhang,
David Forsyth
Abstract:
We present an object relighting system that allows an artist to select an object from an image and insert it into a target scene. Through simple interactions, the system can adjust illumination on the inserted object so that it appears naturally in the scene. To support image-based relighting, we build an object model from the image, and propose a perceptually inspired approximate shading model for the relighting. It decomposes the shading field into (a) a rough shape term that can be reshaded, (b) a parametric shading detail that encodes missing features from the first term, and (c) a geometric detail term that captures fine-scale material properties. With this decomposition, the shading model combines 3D rendering and image-based composition and allows more flexible compositing than image-based methods. Quantitative evaluation and a set of user studies suggest our method is a promising alternative to existing methods of object insertion.
Submitted 20 April, 2018;
originally announced April 2018.
-
Where's My Drink? Enabling Peripheral Real World Interactions While Using HMDs
Authors:
Pulkit Budhiraja,
Rajinder Sodhi,
Brett Jones,
Kevin Karsch,
Brian Bailey,
David Forsyth
Abstract:
Head Mounted Displays (HMDs) allow users to experience virtual reality with a great level of immersion. However, even simple physical tasks like drinking a beverage can be difficult and awkward while in a virtual reality experience. We explore mixed reality renderings that selectively incorporate the physical world into the virtual world for interactions with physical objects. We conducted a user study comparing four rendering techniques that balance immersion in a virtual world with ease of interaction with the physical world. Finally, we discuss the pros and cons of each approach, suggesting guidelines for future rendering techniques that bring physical objects into virtual reality.
Submitted 16 February, 2015;
originally announced February 2015.