Speakers: Sebastien Ehrhardt, Oliver Groth
In this joint talk, we will present our recent work that challenges the ‘independence’ assumption in Computer Vision. We will start with RELATE, an object-centric generative model that generates scenes component-wise. Unlike current state-of-the-art models, RELATE does not assume object positions to be independent but treats them as correlated: inspired by the neural physics literature, it models this correlation at the core of its architecture. We show that this leads to more physically plausible scene generation, and that it extends to video prediction.

We will then present D2D, a novel method for computing correspondences between two images. Correspondences are typically established with a detect-and-describe approach, where keypoints are extracted independently from each image and then matched using a distance function. This is generally not robust to large changes in viewpoint or illumination. D2D instead extracts keypoints conditioned on both images: rather than assuming keypoints are independent, we extract them with knowledge of the image we want to match. This makes our method robust to large changes in viewpoint and illumination.
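As background for the talk, the classic detect-and-describe matching step it contrasts with can be sketched as follows: descriptors are computed independently per image, then matched by nearest neighbour under a distance function. This is a minimal illustrative sketch, not the D2D method itself; all names and shapes are hypothetical.

```python
import numpy as np

# Hypothetical descriptors for keypoints detected independently in two images.
rng = np.random.default_rng(0)
desc_a = rng.normal(size=(5, 128))   # 5 keypoint descriptors in image A
desc_b = rng.normal(size=(7, 128))   # 7 keypoint descriptors in image B

# Pairwise Euclidean distances between every descriptor in A and every one in B.
dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=-1)

# For each keypoint in A, propose its nearest neighbour in B as the match.
matches = dists.argmin(axis=1)
print(matches.shape)  # one proposed match per keypoint in A
```

Because each set of descriptors is computed without knowledge of the other image, such matches can break down under large viewpoint or illumination changes, which is the failure mode D2D addresses.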
To request a Zoom invitation, please contact firstname.lastname@example.org