E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness

Authors: Robin Courant, Nicolas Dufour, Xi Wang, Marc Christie, Vicky Kalogeiton

Abstract: Stories and emotions in movies emerge through the effect of well-thought-out directing decisions, in particular camera placement and movement over time. Crafting compelling camera trajectories remains a complex iterative process, even for skilful artists. To tackle this, in this paper, we propose a dataset called the Exceptional Trajectories (E.T.) with camera trajectories along with character information and textual captions encompassing descriptions of both camera and character. To our knowledge, this is the first dataset of its kind. To show the potential applications of the E.T. dataset, we propose a diffusion-based approach, named DIRECTOR, which generates complex camera trajectories from textual captions that describe the relation and synchronisation between the camera and characters. To ensure robust and accurate evaluations, we train CLaTr, a Contrastive Language-Trajectory embedding for evaluation metrics, on the E.T. dataset. We posit that our proposed dataset and method significantly advance the democratization of cinematography, making it more accessible to common users.

Submitted 1 July, 2024; originally announced July 2024.

Comments: ECCV 2024. Project page: https://www.lix.polytechnique.fr/vista/projects/2024_et_courant/

arXiv:2402.16143 [pdf, other]

Cinematographic Camera Diffusion Model

Authors: Hongda Jiang, Xi Wang, Marc Christie, Libin Liu, Baoquan Chen

Abstract: Designing effective camera trajectories in virtual 3D environments is a challenging task even for experienced animators. Despite an elaborate film grammar, forged through years of experience, that enables the specification of camera motions through cinematographic properties (framing, shot sizes, angles, motions), there are endless possibilities in deciding how to place and move cameras with characters. Dealing with these possibilities is part of the complexity of the problem. While numerous techniques have been proposed in the literature (optimization-based solving, encoding of empirical rules, learning from real examples, ...), the results lack either variety or ease of control. In this paper, we propose a cinematographic camera diffusion model using a transformer-based architecture to handle temporality and exploit the stochasticity of diffusion models to generate diverse and qualitative trajectories conditioned by high-level textual descriptions. We extend the work by integrating keyframing constraints and the ability to blend naturally between motions using latent interpolation, in a way to augment the degree of control of the designers. We demonstrate the strengths of this text-to-camera motion approach through qualitative and quantitative experiments and gather feedback from professional artists. The code and data are available at https://github.com/jianghd1996/Camera-control.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2309.03933 [pdf, other]

BluNF: Blueprint Neural Field

Authors: Robin Courant, Xi Wang, Marc Christie, Vicky Kalogeiton

Abstract: Neural Radiance Fields (NeRFs) have revolutionized scene novel view synthesis, offering visually realistic, precise, and robust implicit reconstructions. While recent approaches enable NeRF editing, such as object removal, 3D shape modification, or material property manipulation, the manual annotation prior to such edits makes the process tedious. Additionally, traditional 2D interaction tools lack an accurate sense of 3D space, preventing precise manipulation and editing of scenes. In this paper, we introduce a novel approach, called Blueprint Neural Field (BluNF), to address these editing issues. BluNF provides a robust and user-friendly 2D blueprint, enabling intuitive scene editing. By leveraging implicit neural representation, BluNF constructs a blueprint of a scene using prior semantic and depth information. The generated blueprint allows effortless editing and manipulation of NeRF representations. We demonstrate BluNF's editability through an intuitive click-and-change mechanism, enabling 3D manipulations, such as masking, appearance modification, and object removal. Our approach significantly contributes to visual content creation, paving the way for further research in this area.

Submitted 7 September, 2023; originally announced September 2023.

Comments: ICCV-W (AI3DCC) 2023. Project page with videos and code: https://www.lix.polytechnique.fr/vista/projects/2023_iccvw_courant/

arXiv:2303.15427 [pdf, other]

JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields

Authors: Xi Wang, Robin Courant, Jinglei Shi, Eric Marchand, Marc Christie

Abstract: This paper presents JAWS, an optimization-driven approach that achieves the robust transfer of visual cinematic features from a reference in-the-wild video clip to a newly generated clip. To this end, we rely on an implicit neural representation (INR) to compute a clip that shares the same cinematic features as the reference clip. We propose a general formulation of a camera optimization problem in an INR that computes extrinsic and intrinsic camera parameters as well as timing. By leveraging the differentiability of neural representations, we can back-propagate our designed cinematic losses measured on proxy estimators through a NeRF network to the proposed cinematic parameters directly. We also introduce specific enhancements such as guidance maps to improve the overall quality and efficiency. Results display the capacity of our system to replicate well-known camera sequences from movies, adapting the framing, camera parameters and timing of the generated video clip to maximize the similarity with the reference clip.

Submitted 27 March, 2023; originally announced March 2023.

Comments: CVPR 2023. Project page with videos and code: http://www.lix.polytechnique.fr/vista/projects/2023_cvpr_wang

arXiv:2107.03882 [pdf, other]

A Multi-Protocol, Secure, and Dynamic Data Storage Integration Framework for Multi-tenanted Science Gateway Middleware

Authors: Dimuthu Wannipurage, Isuru Ranawaka, Eroma Abeysinghe, Marcus Christie, Suresh Marru, Marlon Pierce

Abstract: Science gateways are user-centric, end-to-end cyberinfrastructure for managing scientific data and executions of computational software on distributed resources. In order to simplify the creation and management of science gateways, we have pursued a multi-tenanted, platform-as-a-service approach that allows multiple gateway front-ends (portals) to be integrated with a consolidated middleware that manages the movement of data and the execution of workflows on multiple back-end scientific computing resources. An important challenge for this approach is to provide an end-to-end data movement and management solution that allows gateway users to integrate their own data stores with the gateway platform. These user-provided data stores may include commercial cloud-based object store systems, third-party data stores accessed through APIs such as REST endpoints, and users' own local storage resources. In this paper, we present a solution design and implementation based on the integration of a managed file transfer (MFT) service (Airavata MFT) into the platform.

Submitted 8 July, 2021; originally announced July 2021.

arXiv:2102.13378 [pdf, other]

Where to look at the movies : Analyzing visual attention to understand movie editing

Authors: Alexandre Bruckert, Marc Christie, Olivier Le Meur

Abstract: In the process of making a movie, directors constantly care about where the spectator will look on the screen. Shot composition, framing, camera movements or editing are tools commonly used to direct attention. In order to provide a quantitative analysis of the relationship between those tools and gaze patterns, we propose a new eye-tracking database, containing gaze pattern information on movie sequences, as well as editing annotations, and we show how state-of-the-art computational saliency techniques behave on this dataset. In this work, we expose strong links between movie editing and spectators' scanpaths, and open several leads on how the knowledge of editing information could improve human visual attention modeling for cinematic content. The dataset generated and analysed during the current study is available at https://github.com/abruckert/eye_tracking_filmmaking

Submitted 26 February, 2021; originally announced February 2021.

arXiv:1907.02336 [pdf, other]

Deep Saliency Models : The Quest For The Loss Function

Authors: Alexandre Bruckert, Hamed R. Tavakoli, Zhi Liu, Marc Christie, Olivier Le Meur

Abstract: Recent advances in deep learning have pushed the performance of visual saliency models further than it has ever been. Numerous models in the literature present new ways to design neural networks, to arrange gaze pattern data, or to extract as many high- and low-level image features as possible in order to create the best saliency representation. However, one key part of a typical deep learning model is often neglected: the choice of the loss function. In this work, we explore some of the most popular loss functions that are used in deep saliency models. We demonstrate that on a fixed network architecture, modifying the loss function can significantly improve (or degrade) the results, hence emphasizing the importance of the choice of the loss function when designing a model. We also introduce new loss functions that, to our knowledge, have never been used for saliency prediction. Finally, we show that a linear combination of several well-chosen loss functions leads to significant improvements in performance on different datasets as well as on a different network architecture, hence demonstrating the robustness of a combined metric.

Submitted 4 July, 2019; originally announced July 2019.

Comments: 10 pages, 4 figures

arXiv:1712.04216 [pdf, other]

Directing Cinematographic Drones

Authors: Quentin Galvane, Christophe Lino, Marc Christie, Julien Fleureau, Fabien Servant, Francois-Louis Tariolle, Philippe Guillotel

Abstract: Quadrotor drones equipped with high-quality cameras have rapidly emerged as novel, cheap and stable devices for filmmakers. While professional drone pilots can create aesthetically pleasing videos in a short time, the smooth and cinematographic control of a camera drone remains challenging for most users, despite recent tools that either automate part of the process or enable the manual design of waypoints to create drone trajectories. This paper proposes to move a step further towards more accessible cinematographic drones by designing techniques to automatically or interactively plan quadrotor drone motions in 3D dynamic environments that satisfy both cinematographic and physical quadrotor constraints. We first propose the design of a Drone Toric Space as a dedicated camera parameter space with embedded constraints and derive some intuitive on-screen viewpoint manipulators. Second, we propose a specific path planning technique which ensures both that cinematographic properties can be enforced along the path, and that the path is physically feasible by a quadrotor drone. Finally, we build on the Drone Toric Space and the specific path planning technique to coordinate the motion of multiple drones around dynamic targets. A number of results then demonstrate the interactive and automated capacities of our approaches on several use cases.

Submitted 14 December, 2017; v1 submitted 12 December, 2017; originally announced December 2017.

arXiv:cs/0007002 [pdf, ps, other]

Interval Constraint Solving for Camera Control and Motion Planning

Authors: Frederic Benhamou, Frederic Goualard, Eric Languenou, Marc Christie

Abstract: Many problems in robust control and motion planning can be reduced either to finding a sound approximation of the solution space determined by a set of nonlinear inequalities, or to the "guaranteed tuning problem" as defined by Jaulin and Walter, which amounts to finding a value for some tuning parameter such that a set of inequalities is verified for all the possible values of some perturbation vector. A classical approach to solve these problems, which satisfies the strong soundness requirement, involves some quantifier elimination procedure such as Collins' Cylindrical Algebraic Decomposition symbolic method. Sound numerical methods using interval arithmetic and local consistency enforcement to prune the search space are presented in this paper as much faster alternatives for both soundly solving systems of nonlinear inequalities, and addressing the guaranteed tuning problem whenever the perturbation vector has dimension one. The use of these methods in camera control is investigated, and experiments with the prototype of a declarative modeller to express camera motion using a cinematic language are reported and discussed.

Submitted 20 June, 2003; v1 submitted 3 July, 2000; originally announced July 2000.

Comments: 35 pages, 13 figures, revised and extended version of a paper published in the proceedings of CP '00

ACM Class: D.3.3; D.2.2; G.1.0; H.5.1

Showing 1–9 of 9 results for author: Christie, M