Like Apple and Google, Facebook has been working on computer vision shortcuts designed to give mobile apps augmented reality superpowers.
And while Apple's ARKit and Google's ARCore use computer vision to estimate the position of horizontal and vertical surfaces, Facebook researchers now claim to have prototype methods that can derive the 3D shape of an object from 2D image information.
If Facebook's research is as groundbreaking as it sounds, then just as Apple's and Google's mobile toolkits represent the basis for future AR wearables, Facebook's breakthroughs could contribute to its own smartglasses development.
This week, Facebook research scientists Georgia Gkioxari, Shubham Tulsiani, and David Novotny published their findings on four new methods for 3D image recognition, to be presented at the International Conference on Computer Vision (ICCV) in Seoul, South Korea, which runs through November 2.
Two of those methods involve identifying 3D objects from 2D images. Building on the Mask R-CNN model for segmenting the objects in an image (presented at the same conference last year), Mesh R-CNN deduces the 3D shapes of those identified objects while compensating for occlusion, clutter, and other challenging photo compositions.
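To make the two-stage flow concrete, here is a minimal sketch of the pipeline shape: a 2D detector proposes labeled boxes, then a per-object head predicts a 3D mesh. The function names and the stub "networks" below are hypothetical stand-ins for illustration; the real system uses learned CNNs, not hard-coded outputs.

```python
# Hypothetical sketch of a Mesh R-CNN-style pipeline: detect objects in
# 2D, then predict a 3D mesh for each detection. The stubs below stand
# in for learned networks and always return fixed toy data.

def detect_objects(image):
    """Stand-in for the Mask R-CNN stage: returns (label, box) pairs."""
    # A real detector would run a CNN over the image; here we pretend
    # it found a single chair with a pixel bounding box.
    return [("chair", (40, 60, 180, 220))]

def predict_mesh(image, box):
    """Stand-in for the mesh head: returns (vertices, faces).
    Here we just return the 8 corners of a unit cube and a few faces."""
    verts = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
    faces = [(0, 1, 3), (0, 3, 2), (4, 5, 7), (4, 7, 6)]
    return verts, faces

def mesh_rcnn_like(image):
    """Two stages chained: each 2D detection gets its own 3D mesh."""
    results = []
    for label, box in detect_objects(image):
        verts, faces = predict_mesh(image, box)
        results.append({"label": label, "verts": verts, "faces": faces})
    return results

print(mesh_rcnn_like("photo.jpg")[0]["label"])  # prints "chair"
```

The point of the structure is that 3D shape prediction is attached per detected object, which is what lets the system handle scenes with many objects and partial occlusion rather than reconstructing the whole image at once.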
"Adding a third dimension to object detection systems that are robust against such complexities requires stronger technical capabilities and current technical frameworks have hampered progress in this area," the team wrote in a blog post.
In addition, the team has built another computer vision model that serves as both an alternative and a complement to Mesh R-CNN. The boldly named C3DPO (Canonical 3D Pose Networks) system can perform large-scale reconstruction of 3D objects from only 2D keypoints, across 14 object categories such as birds, people, and cars.
"Previously, such reconstructions were primarily not feasible due to memory constraints with the previous matrix factorization-based methods that, unlike our deep network, cannot work in a & # 39; minibatch & # 39; regime. Previous methods have "We are talking about modeling distortions by using multiple simultaneous images and making similarities between immediate 3D reconstructions that require hardware that is usually found in special laboratories," the team wrote. "The efficiency introduced by C3DPO makes it possible to enable 3D reconstruction in cases where the use of hardware for 3D recording is not feasible, such as with large-scale objects such as aircraft."
The team also built a system for canonical surface mapping, which takes generic image collections and maps them to 3D shapes. This helps computer vision applications better understand common properties shared between different objects in an AR scene.
"For example, if we train a system to get to know the right place to sit on a chair or grab a mug, our view may be useful the next time the system needs to understand where to sit on another chair or how to grab another mug, "wrote the team. "Such tasks can not only help deepen our understanding of traditional 2D images and video content, but also improve AR / VR experiences by transferring representations of objects."
Finally, VoteNet is an experimental 3D object detection network that can accurately understand a 3D point cloud using only geometric information, rather than color images.
"VoteNet has a simple design, compact model size and high efficiency, with a speed of about 100 milliseconds for a full scene and a smaller memory footprint than previous methods designed for research," the team claims. "Our algorithm takes 3D point clouds from depth cameras & returns 3D-bound boxes of objects with their semantic classes."
The Facebook research team's latest innovations in 3D imaging echo a similar development last year that came out of a collaboration with researchers at the University of Washington's Facebook-funded UW Reality Lab. That collaboration produced Photo Wake-Up, a method for generating 3D animations from 2D images.
Some of Facebook's AR research is commercialized faster than other work, so while it may take some time before we see all of these innovations in action, the company tends to move quickly when something is worthwhile to the public. For example, using the Mask R-CNN model, Facebook researchers developed a body-masking method that has already found its way into Facebook's Spark AR platform.
However, this latest research could contribute to Facebook's continued development of AR smartglasses. Together with its Live Maps AR cloud platform, Facebook may in the future be able to match the real-world occlusion and spatial computing capabilities of the HoloLens 2 or the Magic Leap One by giving its smartglasses the ability to identify 3D objects.
Of course, the key is how all of this will ultimately look in a form factor that meets mainstream tastes in a way that could actually usher in the golden age of AR smartglasses.