14: Using multiple images

The previous chapter discussed corner detectors, which find particularly distinctive points in a scene that can be reliably detected in different views of the same scene irrespective of viewpoint or lighting conditions. However, the 3-dimensional coordinate of the corresponding world point was lost in the perspective projection process discussed in Chapter 11, which mapped a 3-dimensional world point to a 2-dimensional image coordinate. To recover the missing third dimension we need additional information, which we can obtain from multiple views of the same scene. This allows us to determine the 3D location of the point relative to the camera and, even more powerfully, to estimate the 3D motion of the camera between the views as well as the 3D structure of the world (a minimal triangulation sketch follows the topic list below). This chapter covers:
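To see why the third dimension is lost, consider the standard pinhole model of perspective projection from Chapter 11. A camera with focal length $f$ maps the world point $\mathbf{P} = (X, Y, Z)$ to the image-plane point

$$
\mathbf{p} = \left( \frac{fX}{Z},\ \frac{fY}{Z} \right)
$$

and every scaled point $\lambda \mathbf{P}$ with $\lambda > 0$ projects to the same $\mathbf{p}$, so all world points along the ray through the camera center are indistinguishable in a single image. A second image taken from a different camera center defines a second ray, and the intersection of the two rays fixes the point in 3D; this is triangulation.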

  • Finding corresponding points between two images, feature matching using Harris and SURF features
  • Geometry of two views, epipolar plane, fundamental matrix, epipolar lines
  • The essential matrix, its relationship to the fundamental matrix, and its relationship to camera translation and rotation
  • Estimating the fundamental matrix from data, the effect of bad correspondence, the need for robust matching and RANSAC
  • Planar homographies, estimation from data, relationship to camera translation and rotation, determining points within a plane
  • Sparse stereo
  • Dense stereo, disparity, matching measures and similarity scores, the disparity space image, stereo failure modes, sub-pixel disparity estimation
  • 3D reconstruction, texture mapping, anaglyph
  • Stereo image rectification
  • Dealing with 3D points: plane fitting, matching sets of points (ICP)
  • Structure and motion, the scale estimation problem
  • Applications
    • Perspective correction, remove keystone distortion from an image
    • Mosaicing, stitch images into a mosaic
    • Image matching and retrieval using the “bag of words” technique
    • Image sequence processing, track features over time
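
To make the recovery of the third dimension concrete, the sketch below shows two-view triangulation by the linear (DLT) method: given the projection matrices of two cameras and the matching pixel coordinates of one point, the 3D point is found where the two viewing rays intersect. The intrinsic parameters, the 0.5 m baseline, and the function names `project` and `triangulate` are illustrative assumptions, not taken from any particular library; the sketch assumes calibrated cameras and a correct correspondence.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X through a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)        # homogeneous image point
    return x[:2] / x[2]              # dehomogenize

def triangulate(P1, P2, p1, p2):
    """Linear (DLT) triangulation: intersect the viewing rays defined by
    pixel p1 in camera P1 and pixel p2 in camera P2."""
    A = np.array([
        p1[0] * P1[2] - P1[0],       # each pixel coordinate contributes one
        p1[1] * P1[2] - P1[1],       # linear constraint on the homogeneous
        p2[0] * P2[2] - P2[0],       # world point
        p2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)      # null vector of A is the homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]

# Two calibrated cameras with identical intrinsics; the second camera is
# translated 0.5 m along the x-axis (the stereo baseline). Values are illustrative.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

Xw = np.array([0.2, -0.1, 4.0])      # true world point, 4 m in front of camera 1
p1, p2 = project(P1, Xw), project(P2, Xw)
print(triangulate(P1, P2, p1, p2))   # recovers [ 0.2 -0.1  4. ]
```

With noisy correspondences the two rays do not intersect exactly and the SVD solution is a least-squares compromise; correspondences that are outright wrong are the subject of the sections on robust matching and RANSAC listed above.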