Fast & Beautiful Surfel Rendering on a Smartphone

Lili Sang
Matterport Engineering Techblog
Apr 24, 2017

One of Matterport’s newest products is Matterport Scenes. With a Google Tango compatible smartphone and the Matterport Scenes app, your mobile device becomes a hand-held 3D scanner.

Using the depth sensor integrated into the device, Matterport Scenes captures objects and single rooms to create 3D models that you can trim, measure, and share with others. We call these Scenes. Like a movie scene, they’re a snapshot of a particular place at a particular time.

Matterport Scenes is complementary to our company’s main product, the Matterport Pro 3D Camera and related software that is used to scan entire rooms, houses, and properties.

Once you’ve scanned an object or partial room with Matterport Scenes, you can view the Scene immediately on your smartphone as a large point cloud in 3D space. Each point in the point cloud is shown as a small disk called a surfel (minimal surface element).

We have a unique challenge with Matterport Scenes: how to render surfels so they look good while still maintaining a high frame rate on smartphones with limited processing power.

I. An Evolution of Rendering

A. Three Methods of Rendering

When we first developed Matterport Scenes, we used a very simple rendering process: we displayed all points as pixels with the same fixed size.

The biggest problem with rendering this way is noticeable depth artifacts (holes) when you zoom in. Consider the images below. While the scene looks fine from a distance, when you zoom in you can see holes where depth and color data are missing. In particular, notice how you can see through the black tablecloth onto the stage.

Method 1: Rendering with fixed point sizes
Method 1: Notice the holes in the table and the podium

To remove the holes, we created another renderer that displays points with variable sizes. Unlike Method 1, where all points have the same size, in Methods 2 and 3 point sizes and radii are based on actual scan data, so they vary in object space (what’s stored as data for the model).

When rendering, we convert the object-space size to image space (actual pixels displayed to the user) based on the current viewing transform. So the same point will appear larger or smaller on the screen depending on the user’s current viewpoint.
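
For the curious, here is a rough sketch (in C++, not our production code) of that object-space-to-image-space conversion for a simple pinhole camera. The function and parameter names are ours for this example:

    #include <cmath>
    #include <cstdio>

    // Illustrative sketch: convert a point's object-space radius to an
    // on-screen diameter in pixels under a standard pinhole projection.
    // Assumes the point has already been transformed into camera space.
    float projectedPointSizePx(float radiusObj,        // splat radius in world units
                               float depthCam,         // distance along the view axis
                               float fovYRadians,      // vertical field of view
                               float viewportHeightPx) // viewport height in pixels
    {
        // World units covered by one pixel at this depth.
        float worldPerPixel =
            2.0f * depthCam * std::tan(fovYRadians * 0.5f) / viewportHeightPx;
        // Diameter in pixels; closer points (smaller depth) appear larger.
        return 2.0f * radiusObj / worldPerPixel;
    }

    int main() {
        // A 1 cm splat seen from 0.5 m vs. 2 m away (60-degree FOV, 1080 px tall).
        std::printf("near: %.1f px\n", projectedPointSizePx(0.01f, 0.5f, 1.047f, 1080.0f));
        std::printf("far:  %.1f px\n", projectedPointSizePx(0.01f, 2.0f, 1.047f, 1080.0f));
        return 0;
    }

The same splat’s projected diameter is roughly four times larger when the camera is four times closer, which is exactly why fixed-size points (Method 1) fall apart up close.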

This helped fill the holes, but the quality can still appear poor because points with large radii often intersect one another. Without interpolation, this leads to blocky contours, also called visual noise. Note the text “2017 IEEE VR LOS ANGELES” in the images below.

Method 2: Rendering with variable point sizes
Method 2: Notice the noisy text on “2017 IEEE VR LOS ANGELES”

Another issue with this rendering method is that the points are not perspective-correct; that is, their orientation does not reflect the current viewing perspective. Regardless of the viewing angle, the points are always screen-aligned, so each one appears as a circle no matter where you view it from. You can see this more clearly with the car example later in this article (Section II C).

We resolved all these issues with the first two methods by using Elliptical Weighted Average (EWA) splatting. This is a rendering technique that’s specifically designed for high quality visualization of surfaces that have been sampled with discrete points — very similar to scanning with Matterport Scenes. The pictures below show rendering done with EWA splatting.

Method 3: Rendering with Elliptical Weighted Average (EWA) splatting
Method 3: Notice the clear reconstruction and the lack of noise throughout

B. Comparing the Three Methods (From Afar)

Method 1: Rendering with a fixed point size. Here, all points are 2 pixels big.
Method 2: Rendering with variable point sizes. Point sizes are determined based on the user’s viewpoint.
Method 3: Surfel rendering with EWA splatting.

C. Comparing the Three Methods (From Closeup)

Method 1: Rendering with a fixed point size. Here, all points are 2 pixels big.
Method 2: Rendering with variable point sizes. Point sizes are determined based on the user’s viewpoint.
Method 3: Surfel rendering with EWA splatting.

II. Elliptical Weighted Average (EWA)

A. What is a Splat?

A splat is a small, disk-like object. Imagine it as a tiny frisbee. Just like a point from a point cloud, a splat has an X, Y, and Z position and a color. In addition, it has a radius and a surface normal vector; the normal is used to orient the splat in the right direction.

If done right, splats are a good approximation for any geometry. Ideally, each splat represents the minimal surface element (surfel) required to cover the surface.
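
In code, a surfel might be stored along these lines (a minimal sketch; field names and layout are illustrative, not our actual data format):

    #include <array>
    #include <cstdint>

    // Illustrative surfel record: everything a point-cloud point has,
    // plus a radius and a normal to size and orient the disk.
    struct Surfel {
        std::array<float, 3> position;       // X, Y, Z in object space
        std::array<float, 3> normal;         // unit normal; orients the disk
        float radius;                        // disk radius in object-space units
        std::array<std::uint8_t, 4> color;   // RGBA, 8 bits per channel
    };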

The following is an excellent example of EWA splatting in action. In this example, we have a trefoil knot that we sampled and then reconstructed with EWA splatting.

The surface has been sampled and then reconstructed with EWA splatting.
Same reconstruction, but we have shrunk the radius of each splat to show how they are distributed.

B. What is Elliptical Weighted Average (EWA)?

Because each splat has a radius, splats that are close to each other will intersect (overlap). Where two splats intersect, if we just choose one of them to draw on top, the entire image can look bad. Blending the splats together leads to a more accurate and prettier image. EWA refers specifically to the kind of blending we do.

EWA works by applying a special image filter, a Gaussian reconstruction kernel, to all of the splats. When we project this filter from object space (splat position, radius, tangential vectors, etc.) to image space (the 2D set of pixels that are actually displayed to the user), the filter has an elliptical shape. Thus the ‘elliptical’ part of the filter’s name.

The second part of the filter’s name (‘weighted average’) refers to its Gaussian nature. Because the kernel is a Gaussian centered on the splat, nearby points are weighted higher than points farther away. The filter is evaluated at every pixel that will be displayed in image space. For overlapping splats, the contributions from all of them are weighted and combined together into a weighted average. Overall, this gets us a smooth surface in image space.
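
Here is a simplified sketch of that accumulate-and-normalize step. A real EWA renderer evaluates an anisotropic (elliptical) Gaussian derived from each splat’s projected shape; for clarity, this sketch uses an isotropic falloff, and all names and constants are illustrative:

    #include <array>
    #include <cmath>

    // Per-pixel accumulator: each overlapping splat contributes its color
    // weighted by a Gaussian falloff; the sum is normalized at the end.
    struct PixelAccum {
        std::array<float, 3> colorSum{0.0f, 0.0f, 0.0f};
        float weightSum = 0.0f;

        // distSq: squared distance (pixels) from the pixel center to the
        // splat center in image space; radiusPx: the splat's projected radius.
        void addSplat(const std::array<float, 3>& color, float distSq, float radiusPx) {
            float sigmaSq = radiusPx * radiusPx / 4.0f;  // assumed falloff width
            float w = std::exp(-0.5f * distSq / sigmaSq);
            for (int i = 0; i < 3; ++i) colorSum[i] += w * color[i];
            weightSum += w;
        }

        // The weighted average over all contributing splats.
        std::array<float, 3> resolve() const {
            if (weightSum <= 0.0f) return {0.0f, 0.0f, 0.0f};  // no splat hit: a hole
            return {colorSum[0] / weightSum,
                    colorSum[1] / weightSum,
                    colorSum[2] / weightSum};
        }
    };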

1986 Bertone X1/9 without blending
Same scene with EWA blending

C. Screen-aligned Points with Blending

EWA splatting can create beautiful surfaces but is also computationally intensive. Even the newest smartphones have limited computing resources (memory, bandwidth, CPU, GPU), and some scenes can have millions of data points. So it can be a challenge to accurately render the scene yet also maintain a high frame rate for a smooth and interactive user experience.

On closer inspection, a high frame rate is most necessary when the user is changing their viewpoint within the scene; that is, specifically when they rotate, zoom in/out, or move around the scene. For these situations, we enhanced our screen-aligned point renderer (Method 2 from Section I A) with some of the blending and anti-aliasing techniques from the EWA splatting approach.

The resulting screen-aligned points are not completely accurate, but rendering is significantly faster. By blending the overlapping points, the 3D model looks reasonably good from a distance and for the split second while the user is navigating.

For the best of both worlds, Matterport Scenes automatically switches to this faster rendering style during user navigation (when the app is in motion) to maintain a high frame rate. Once the user has settled on a viewpoint, we render with the slower, more accurate style for a more beautiful presentation.
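
In sketch form, the switch might look like the following. The mode names and the settle delay are illustrative assumptions, not our actual values:

    #include <chrono>

    enum class RenderMode { FastScreenAligned, AccurateEWA };

    class RenderModeSelector {
    public:
        // Call whenever the user rotates, zooms, or moves the viewpoint.
        void onCameraMoved(std::chrono::steady_clock::time_point now) {
            lastMotion_ = now;
        }

        // Stay in the fast mode until the viewpoint has been still briefly,
        // then pay once for the slower, more accurate EWA render.
        RenderMode mode(std::chrono::steady_clock::time_point now) const {
            return (now - lastMotion_ < std::chrono::milliseconds(200))
                       ? RenderMode::FastScreenAligned
                       : RenderMode::AccurateEWA;
        }

    private:
        std::chrono::steady_clock::time_point lastMotion_{};
    };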

Blended, screen-aligned points. Inaccurate but quick to calculate. Used in motion shots (when user is changing their viewpoint).
Perspective-correct splats. More accurate but takes longer to calculate. Used in still shots (when user is not moving their viewpoint).

The images above are also a good example of screen-aligned points versus perspective-correct splats. In the first picture (rendered with Method 2), the grey circles on the hood are drawn as flat, screen-aligned circles. In the second image (rendered with Method 3), they are drawn as ellipses that match the user’s current perspective.

D. Hole Reconstruction

Even with very high point density, if you zoom in all the way you can still see holes. A pixel in image space is considered a hole if its color value is not drawn from a visible surfel. In other words, the pixel’s color was pulled from the black background. In the picture below, you can see black pixels in the car’s front shoulder.

Holes can also come from pixels that get their color from behind the visible surface. In this case, the pixel’s associated depth value is much higher than that of neighboring pixels. In the example below, this kind of hole would pull color data from the other side of the car. Most cars are the same color throughout, so this is usually not an issue. However, if the opposite side of the car were painted red, you might see red bleed through the hole.
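
A simplified sketch of this two-part hole test (background pixels, plus pixels whose depth sits far behind their neighbors) might look like this. The depth buffer convention and the gap threshold are illustrative assumptions:

    #include <algorithm>
    #include <limits>
    #include <vector>

    // Flags hole pixels in a rendered frame. Assumes background pixels hold
    // +infinity in the depth buffer (nothing was drawn there).
    std::vector<bool> detectHoles(const std::vector<float>& depth,
                                  int width, int height,
                                  float depthGap = 0.05f) {  // e.g. 5 cm
        std::vector<bool> hole(depth.size(), false);
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                int i = y * width + x;
                // Case 1: color came from the black background.
                if (depth[i] == std::numeric_limits<float>::infinity()) {
                    hole[i] = true;
                    continue;
                }
                // Case 2: much deeper than the nearest 4-neighbor, so the
                // color leaked through from behind the visible surface.
                float nearest = std::numeric_limits<float>::infinity();
                if (x > 0)          nearest = std::min(nearest, depth[i - 1]);
                if (x + 1 < width)  nearest = std::min(nearest, depth[i + 1]);
                if (y > 0)          nearest = std::min(nearest, depth[i - width]);
                if (y + 1 < height) nearest = std::min(nearest, depth[i + width]);
                if (depth[i] - nearest > depthGap) hole[i] = true;
            }
        }
        return hole;
    }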

Without hole filling

Hole detection and reconstruction is a post-processing step done by Matterport Scenes. We reconstruct holes by applying a simple Gaussian filter that interpolates from nearby pixels that are not holes. The app dynamically enables hole reconstruction based on the frame rate and the scene size.
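
A minimal sketch of that reconstruction pass, reusing the hole mask from the detection sketch above (kernel radius and sigma are illustrative):

    #include <array>
    #include <cmath>
    #include <vector>

    // Replaces each hole pixel with a Gaussian-weighted average of the
    // surrounding non-hole pixels.
    void fillHoles(std::vector<std::array<float, 3>>& color,
                   const std::vector<bool>& hole,
                   int width, int height,
                   int kernelRadius = 3, float sigma = 1.5f) {
        std::vector<std::array<float, 3>> out = color;
        for (int y = 0; y < height; ++y) {
            for (int x = 0; x < width; ++x) {
                if (!hole[y * width + x]) continue;
                std::array<float, 3> sum{0.0f, 0.0f, 0.0f};
                float wsum = 0.0f;
                for (int dy = -kernelRadius; dy <= kernelRadius; ++dy) {
                    for (int dx = -kernelRadius; dx <= kernelRadius; ++dx) {
                        int nx = x + dx, ny = y + dy;
                        if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                        int j = ny * width + nx;
                        if (hole[j]) continue;  // only sample valid pixels
                        float w = std::exp(-(dx * dx + dy * dy) / (2.0f * sigma * sigma));
                        for (int c = 0; c < 3; ++c) sum[c] += w * color[j][c];
                        wsum += w;
                    }
                }
                if (wsum > 0.0f)
                    for (int c = 0; c < 3; ++c) out[y * width + x][c] = sum[c] / wsum;
            }
        }
        color = out;
    }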

With hole filling

III. Conclusion

In order for Matterport Scenes to render point clouds with high visual quality, we implemented Elliptical Weighted Average (EWA) splatting. This is a technique designed to reconstruct smooth surfaces that have been point sampled. EWA is unique in that it also uses blending and avoids aliasing artifacts.

Furthermore, to improve performance, we switch to a faster screen-aligned point renderer while the scene is in motion, which keeps user interaction smooth. Finally, we improved image quality with a hole-filling algorithm in image space.

In the future, we plan on improving EWA splatting performance so we can use this method in all possible cases. Blending between surfels with different exposures (amounts of light) is another potential improvement.

Download Matterport Scenes in the Play Store and use it today. The app is currently available for Google Tango compatible smartphones. Learn more about Matterport Scenes.

* * *

Lili Sang is a senior graphics engineer on the Matterport vision team. Matterport is a hardware and software company offering 3D capture, processing, and hosting solutions for real-world applications like real estate, construction, travel/hospitality, and business listings. Our 3D content can be experienced in any desktop or mobile web browser, and on virtual reality headsets like Samsung Gear VR, Google Cardboard, and Google Daydream.

We’re hiring!
