Automatically Making Photorealistic 3D Models from 2D Pictures

March 25th, 2011

Making 3D models is time-consuming. Recent programs like Google’s SketchUp (it’s free) have simplified the process of making digital 3D models, but SketchUp is definitely not automatic.

Example of photorealistic SketchUp Model created manually and placed into Google Earth

To make a 3D model look photorealistic, real world pictures can be “projected” onto a SketchUp model. While this technique can add realism, SketchUp is still a manual approach that can take hours, weeks, or even months to produce good results.

 

Many in the 3D and animation world would like an automatic process that can produce 3D models from a series of 2D pictures. Our goal is to create a system that automatically produces photorealistic digital 3D models that can be processed in existing 3D programs like 3D Studio Max, GeoMagic, or SketchUp.

The Microsoft Photosynth project can create 3D-like effects (some call it 2.5D) by automatically processing tens to hundreds of 2D images. While this process is automatic, it does not produce a 3D model that can be used by other programs.

Garbage in, garbage out.

A challenge for Photosynth and other automatic stitching/panoramic approaches is that they often use regular uncalibrated cameras. While this is convenient, it forces the programs to analyze each camera image to determine the field of view and other essential lens/camera characteristics: the cameras are essentially calibrated during processing. Precisely calibrating a camera is challenging in a lab setting, so it is reasonable to expect that on-the-fly calibration results will not be very precise. Any errors in the camera calibration step will build on each other and cause problems later in the process. While calibration problems cause annoying alignment errors in panoramic 2D & 2.5D images, they cause unacceptable distortion in 3D models. Here is a list of variables that must be determined before using a 2D image to create an accurate 3D model:

Camera Variables that must be determined for Precise Stereoscopic 3D Reconstruction
- The exact center of the image sensor behind the lens: sensors are normally a few pixels off-center
- Camera Horizontal & Vertical Field of View to within 1/100 degree
- Camera lens distortion correction variables: radial distortion (barrel or pincushion).
- Camera horizontal orientation (0.00 to 360.00 degrees) to within 1/100 of a degree
- Camera vertical orientation (tilt, roll) to within 1/100 degree
- Camera location for each shot: X, Y, and Z coordinates to within one millimeter
- Camera dynamic range and gamma
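
To get a feel for why the field-of-view tolerance in the list above is so tight, here is a minimal Python sketch under a simple pinhole-camera assumption. The 2000-pixel image width and 90-degree field of view are made-up illustration values, not Proto-4F specifications, and the function names are hypothetical. It estimates how far a point at the edge of the image lands from where it should when the field-of-view estimate is slightly wrong. Lens distortion is ignored.

```python
import math


def focal_length_px(image_width_px: float, hfov_deg: float) -> float:
    """Pinhole focal length in pixels for a given horizontal field of view."""
    return (image_width_px / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)


def edge_shift_px(image_width_px: float, hfov_deg: float, fov_error_deg: float) -> float:
    """Pixel displacement of a ray at the true image edge when the
    field-of-view estimate is off by fov_error_deg."""
    # Focal length computed from the (wrong) FOV estimate.
    f_wrong = focal_length_px(image_width_px, hfov_deg + fov_error_deg)
    # Angle of a ray that really hits the edge of the sensor.
    true_edge_angle = math.radians(hfov_deg) / 2.0
    # Where the miscalibrated model believes that ray lands.
    predicted = f_wrong * math.tan(true_edge_angle)
    return abs(image_width_px / 2.0 - predicted)


if __name__ == "__main__":
    # Assumed values for illustration only.
    for err in (0.01, 0.1, 1.0):
        shift = edge_shift_px(2000, 90.0, err)
        print(f"FOV error {err:>5} deg -> edge shift {shift:.3f} px")
```

With these assumed numbers, a 1/100 degree error already moves an edge pixel by a fraction of a pixel, and the shift grows roughly in proportion to the error.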

The quality of a 3D model is limited by the quality of the 2D pictures used to make it. Here’s how we calibrate our camera system:

1) Design and build a calibration routine/facility to determine the key camera variables.
2) Design and build a system of cameras that can be easily calibrated.

The important point is that the camera system and the calibration system need to be built for each other: they literally fit together like a lock and key. As we see it, a calibrated system produces “clean” images that simplify and speed up the 3D reconstruction process. Our current 8-camera system (Proto-4F) has been designed to produce sets of calibrated images, and these images are used to automatically produce 3D models.

We are currently refining the calibration of Proto-4F, and another model should be completed by the end of April.

Automatic 3D Model Creation Using the 3D-360

February 28th, 2011


This 3D model includes alignment errors, and we know how to fix them. Our objective is to develop an automatic 3D model creation system, and we know from experience that the errors will get smaller as our calibration process is refined. Below is a description of how this model was made using images from Proto-4F of our 8-camera 3D-360 scanner.

A 3D model requires images from multiple perspectives, so for this model we scanned from 4 different locations: two scans from a high perspective with the scanner cameras at 6 feet, and two low scans with the scanner 3 feet above the floor. Once the scans were completed (all of the pictures have been taken and downloaded) the images from the 4 scans were processed using our automatic 3D reconstruction software. This processing resulted in 4 “point clouds” of 3D data: one point cloud for each scan. Next the 4 point clouds were aligned with each other to create a single “point cloud” of, in this case, 20 million points.
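
The merge step can be sketched in a few lines of Python. This is only an illustration: it assumes each scan's pose (a rotation about the vertical axis plus a position) is already known, which is a simplification, and the names `to_world` and `merge_scans` and all the sample numbers are hypothetical. The post does not describe how the alignment itself was computed.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Pose = Tuple[float, Point]  # (yaw in degrees, scanner position in metres)


def to_world(points: Sequence[Point], yaw_deg: float, offset: Point) -> List[Point]:
    """Rotate scanner-local points about the vertical (Z) axis, then translate."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    ox, oy, oz = offset
    return [(c * x - s * y + ox, s * x + c * y + oy, z + oz) for x, y, z in points]


def merge_scans(scans: Sequence[Sequence[Point]], poses: Sequence[Pose]) -> List[Point]:
    """Bring every scan into one shared frame and concatenate the points."""
    merged: List[Point] = []
    for points, (yaw_deg, offset) in zip(scans, poses):
        merged.extend(to_world(points, yaw_deg, offset))
    return merged


# Tiny stand-in scans; heights are roughly 6 ft and 3 ft expressed in metres.
high_scan = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
low_scan = [(1.0, 0.0, 0.0)]
poses = [(0.0, (0.0, 0.0, 1.83)), (90.0, (2.0, 0.0, 0.91))]

cloud = merge_scans([high_scan, low_scan], poses)
print(len(cloud), "points in the merged cloud")
```

Any error in a scan's assumed pose shifts every point from that scan, which is one reason alignment between scans shows up as a visible error source later on.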

Point clouds are a precise but inefficient way to format and store 3D data. Point clouds for 3D data can be compared to the BMP format for 2D images. Just as compressed JPEGs are about 10x more efficient than uncompressed BMPs for storing 2D images, triangular meshes are a more efficient way to store 3D data than uncompressed point clouds. Meshes are efficient because the 3 points of a single triangle can replace thousands (or millions) of points if those points lie in a plane. Decades of work by people around the world have resulted in mature procedures to generate meshes from point clouds. Our current meshing routine turned the 400 MB “point cloud” of 20,000,000 points into a 20 MB mesh of 24,000 triangles. In the future we will use more efficient meshing procedures that produce better meshes with even fewer triangles.
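
The storage figures quoted above can be turned into a few ratios. The short Python sketch below does only that arithmetic; the one assumption is that a megabyte means 1,000,000 bytes.

```python
MB = 1_000_000  # assumption: decimal megabytes

cloud_bytes = 400 * MB
cloud_points = 20_000_000
mesh_bytes = 20 * MB
mesh_triangles = 24_000

bytes_per_point = cloud_bytes / cloud_points          # 20.0
size_reduction = cloud_bytes / mesh_bytes             # 20.0
points_per_triangle = cloud_points / mesh_triangles   # about 833

print(f"bytes per point:     {bytes_per_point:.1f}")
print(f"size reduction:      {size_reduction:.0f}x")
print(f"points per triangle: {points_per_triangle:.0f}")
```

On these figures the mesh is about 20 times smaller than the point cloud, and each triangle stands in for roughly 830 points on average.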

After meshing we have a 3D model of the area that was scanned, but at this point the mesh is not photorealistic. We make the model photorealistic by “projecting” the original color images taken during the scanning process onto the mesh. This automatic process is called “texture projection,” and when it is done well it results in a photorealistic 3D model.
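
As a rough illustration of the idea behind texture projection, the Python sketch below projects mesh vertices into one camera image and looks up a color for each. It assumes an ideal pinhole camera at the origin looking along +Z with no lens distortion, and every name and number in it is hypothetical. It is not the routine used for the model above.

```python
from typing import List, Optional, Sequence, Tuple

Color = Tuple[int, int, int]
Vertex = Tuple[float, float, float]


def project_vertex(vertex: Vertex, focal_px: float, cx: float, cy: float
                   ) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-space vertex to pixel coordinates (u, v)."""
    x, y, z = vertex
    if z <= 0:  # behind the camera: not visible
        return None
    return (cx + focal_px * x / z, cy + focal_px * y / z)


def sample_nearest(image: Sequence[Sequence[Color]], u: float, v: float
                   ) -> Optional[Color]:
    """Nearest-pixel color lookup; returns None outside the image."""
    col, row = int(round(u)), int(round(v))
    if 0 <= row < len(image) and 0 <= col < len(image[0]):
        return image[row][col]
    return None


def texture_vertices(vertices: Sequence[Vertex], image: Sequence[Sequence[Color]],
                     focal_px: float, cx: float, cy: float) -> List[Optional[Color]]:
    """Assign each vertex the color of the pixel it projects onto."""
    colors: List[Optional[Color]] = []
    for vertex in vertices:
        uv = project_vertex(vertex, focal_px, cx, cy)
        colors.append(sample_nearest(image, uv[0], uv[1]) if uv is not None else None)
    return colors


# A 3x3 stand-in "photo" where each pixel has a distinct color.
photo = [[(r * 3 + c, 0, 0) for c in range(3)] for r in range(3)]
verts = [(0.0, 0.0, 2.0), (2.0, 0.0, 2.0), (0.0, 0.0, -1.0), (10.0, 0.0, 1.0)]
print(texture_vertices(verts, photo, focal_px=1.0, cx=1.0, cy=1.0))
```

Because the lookup depends directly on the focal length, image center, and camera pose, any calibration error moves the sampled pixel and paints the wrong color onto the mesh.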

Texture projection works very well when everything is correctly aligned and registered, but alignment errors can rapidly build on each other and produce errors that make a model look bad. The alignment errors in this process come from several different sources in the calibration/scanning/processing pipeline:

- Lens distortion correction errors inside each camera
- Alignment errors between the left and right camera in each of the 4 pairs of cameras
- Alignment errors between each of the 4 pairs of cameras
- Alignment errors between the 4 scans

These are all well-defined problems that we are working on. We could proceed slowly and reduce the errors by recalibrating the existing Proto-4F 3D-360 camera system. This approach would take weeks and it could cut the errors in half a few times, but it cannot correct the built-in limitations of our current lenses and calibration facility.

Another option is to build on our two-plus years of experience with the Proto-4x family and design a new Proto-5x series. The new design will have more lenses, higher resolution sensors, faster processors (ARM/AMD Fusion/Tegra/FPGA/other?), and it will be calibrated with a 10x larger “calibration bunker.” I am currently working on Proto-5x designs, and a key characteristic may be to increase the number of cameras from the current 8 to 32, or even as many as 100. A large array of inexpensive lenses can cost less and outperform a small number of expensive lenses. The trick is to design a manufacturable and inexpensive array of sensors, lenses and processors. While a design with up to 100 cameras may sound extravagant, remember that the fly’s eyes have over 1,000 lenses.

Because Proto-5x will require the design, layout, fabrication and testing of a new camera/processor board, this approach will take at least four months. Software porting, calibration, and testing could add another 4 to 8 months to the process. Depending on the final design, the Proto-5x family could reduce the errors by a factor of 10 or more.

3D-360 Camera vs Canon 5D

October 27th, 2009

The Prototype-4.x family of 3D-360s is based on a camera that we have been developing for over a year.  While several areas of enhancement are still left to be implemented, the new camera is ready to be compared against the Canon 5D.  Prototype-3 used eight Canon 5Ds, and the new camera in Prototype-4 needs to meet or exceed the 5D’s performance.

One significant difference between our camera and the Canon 5D is that the 5D (and most other color cameras) uses tiny color filters arranged in a Bayer pattern on top of the individual pixels inside of the camera. While the 5D has 12 million pixels, only 3 million are RED, 6 million are GREEN, and 3 million are BLUE. Our camera is arguably a 15 million pixel sensor because it cycles through three large filters with the 5 million pixel monochrome sensor to produce 5 million RED pixels, 5 million GREEN pixels, and 5 million BLUE pixels. Our camera is immune to color artifacts caused by the Bayer pattern, but taking a picture takes three times longer because the filters must be rotated into place between shots. Fortunately our system automatically changes between filters in less than one second. In the future we may want to add filters for other parts of the spectrum including infrared (IR) and ultraviolet (UV).
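
The pixel counts above follow directly from the layout of the Bayer mosaic. The Python sketch below counts samples per color for an RGGB tile and for a monochrome sensor shot through three filters in turn. The 4000 x 3000 and 2500 x 2000 dimensions are round stand-ins chosen to give 12 million and 5 million pixels; they are not the real sensor dimensions, and the function names are hypothetical.

```python
from typing import Dict


def bayer_counts(width: int, height: int) -> Dict[str, int]:
    """Samples per color for an RGGB Bayer mosaic of the given size."""
    tile = (("R", "G"), ("G", "B"))  # the repeating 2x2 filter pattern
    counts = {"R": 0, "G": 0, "B": 0}
    for r in (0, 1):
        rows = (height + 1 - r) // 2      # rows whose index % 2 == r
        for c in (0, 1):
            cols = (width + 1 - c) // 2   # columns whose index % 2 == c
            counts[tile[r][c]] += rows * cols
    return counts


def filter_wheel_counts(width: int, height: int) -> Dict[str, int]:
    """Samples per color when a monochrome sensor is exposed once per filter."""
    n = width * height
    return {"R": n, "G": n, "B": n}


print("Bayer, 12 MP stand-in:       ", bayer_counts(4000, 3000))
print("Filter wheel, 5 MP stand-in: ", filter_wheel_counts(2500, 2000))
```

The Bayer sensor must interpolate the two missing colors at every pixel, which is where its color artifacts come from. The filter-wheel approach measures all three colors at every pixel at the cost of three exposures.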

The purpose of this test is to compare the color reproduction, noise, and Bayer pattern artifacts between the two cameras. The 5D has a 14mm Canon lens, and the FOV is similar to our custom lens. Here is the test procedure:

1) Take a picture with each camera in RAW mode

2) Use minimal automatic processing on each image. For the 3D-360, Photoshop was used for color balance and sharpening. For the Canon 5D, the image was processed with DxO.

3) Compare the cropped images at actual size and zoomed to 600%

Here are the results:

scan001_face01_cam01_texturecropped-600wide

Above is the shot from the Prototype-4 camera,

And below is the shot from the Canon 5D.

5d-cropped-600wide

The two shots show that our camera compares well to the Canon 5D.  A slight BLUE halo is visible to the left of some objects, but this may be caused by a dirty or warped Wratten filter.

Below is a zoomed comparison of the areas inside the GREEN circles.

5d-vs-mycam-zoom-66

Close inspection shows that the 3D-360 camera has less noise and fewer Bayer pattern artifacts, but the 5D seems a little sharper. The difference in sharpness could be related to the dynamic range of the two images. The raw 3D-360 image covers a linear range of 24 bits, but the 5D covers a smaller range of only 12 bits. We use a combination of linear and logarithmic curves to squeeze the 24 bits per pixel per color channel down to 16 bits per pixel per channel. To improve contrast we may reduce our range from 24 bits to 22 bits.
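
One way a linear-plus-logarithmic curve can squeeze 24 bits into 16 is sketched below in Python. The actual curve used in the 3D-360 pipeline is not described in this post, so the shape here is an assumption: values up to a hypothetical knee of 8192 pass through unchanged, and everything above the knee is mapped logarithmically onto the remaining output range.

```python
import math

IN_MAX = 2**24 - 1   # largest 24-bit input value
OUT_MAX = 2**16 - 1  # largest 16-bit output value


def compress_24_to_16(value: int, knee: int = 8192) -> int:
    """Map a 24-bit linear value to 16 bits: linear below the knee, log above.

    The knee position is an illustrative choice, not a value from the post.
    """
    if not 0 <= value <= IN_MAX:
        raise ValueError("value must fit in 24 bits")
    if not 0 < knee < OUT_MAX:
        raise ValueError("knee must lie inside the 16-bit output range")
    if value <= knee:
        return value  # shadows keep full linear precision
    # Fraction of the way from the knee to IN_MAX on a logarithmic scale.
    frac = math.log(value / knee) / math.log(IN_MAX / knee)
    return round(knee + (OUT_MAX - knee) * frac)


for v in (0, 1000, 8192, 100_000, 1_000_000, IN_MAX):
    print(f"{v:>10} -> {compress_24_to_16(v)}")
```

A curve like this keeps dark values untouched and spends progressively fewer output codes on bright values, which lowers contrast in the highlights.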

I am pleased with this early test, and we are currently implementing upgrades that should make the difference even more dramatic.

Color from a Black & White Camera

July 7th, 2009

This is the first color image produced by the new camera & lens combination. The bilinear rectification routine that we completed last week was automatically applied to correct chromatic aberration.  In the future bicubic interpolation will make the image even sharper.  The original 16-bit image had levels and curves adjusted in Photoshop, and the result was converted to the 8-bit JPG below.

door-rgb-goodexposure

Interpolation: Bilinear vs Bicubic

July 5th, 2009

Stereo reconstruction works by identifying similar features within two images, and we will use any technique that enhances small features.  As a first step in our stereo reconstruction pipeline we currently use bilinear interpolation to rectify/dewarp images.  While bilinear interpolation is easy to code and does a good job, there are many other types of interpolation worth considering. The two images below have been modified with bicubic interpolation and bilinear interpolation. The results confirm that bicubic is sharper, so we will eventually migrate to bicubic interpolation.
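
For readers who want to see the difference numerically, here is a small pure-Python sketch of both interpolators on a grayscale image stored as a list of rows. The bicubic version uses the Catmull-Rom kernel, which is one common choice and an assumption here; the post does not say which bicubic variant was tested. Pixels outside the image are clamped to the nearest edge.

```python
import math
from typing import Sequence

Image = Sequence[Sequence[float]]


def _px(image: Image, row: int, col: int) -> float:
    """Pixel lookup with coordinates clamped to the image borders."""
    row = min(max(row, 0), len(image) - 1)
    col = min(max(col, 0), len(image[0]) - 1)
    return image[row][col]


def bilinear(image: Image, x: float, y: float) -> float:
    """Weighted average of the 4 pixels surrounding (x, y)."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * _px(image, y0, x0) + fx * _px(image, y0, x0 + 1)
    bottom = (1 - fx) * _px(image, y0 + 1, x0) + fx * _px(image, y0 + 1, x0 + 1)
    return (1 - fy) * top + fy * bottom


def _cubic(p0: float, p1: float, p2: float, p3: float, t: float) -> float:
    """Catmull-Rom cubic through p1 (t=0) and p2 (t=1)."""
    return p1 + 0.5 * t * (p2 - p0 + t * (2 * p0 - 5 * p1 + 4 * p2 - p3
                                          + t * (3 * (p1 - p2) + p3 - p0)))


def bicubic(image: Image, x: float, y: float) -> float:
    """Cubic interpolation over the 4x4 neighbourhood of (x, y)."""
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    rows = [
        _cubic(*[_px(image, y0 + j, x0 + i) for i in (-1, 0, 1, 2)], fx)
        for j in (-1, 0, 1, 2)
    ]
    return _cubic(rows[0], rows[1], rows[2], rows[3], fy)


# A vertical step edge: dark on the left, bright on the right.
step = [[0.0, 0.0, 100.0, 100.0] for _ in range(4)]
for x in (1.25, 1.5, 1.75):
    print(x, bilinear(step, x, 1.0), bicubic(step, x, 1.0))
```

On the step edge, the bicubic values stay closer to the dark and bright sides just off the center of the transition, so the edge is steeper than with bilinear interpolation. That is the numerical counterpart of the sharper look in the comparison image.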

bilinear-vs-bicubic

Wikipedia has some more examples.