
Multi layer forests

Learning to be a Depth Camera for Close-Range Human Capture and Interaction

Sean Ryan Fanello1,2 Cem Keskin1 Shahram Izadi1 Pushmeet Kohli1 David Kim1 David Sweeney1
Antonio Criminisi1 Jamie Shotton1 Sing Bing Kang1 Tim Paek1

1 Microsoft Research
2 iCub Facility - Istituto Italiano di Tecnologia

Figure 1: (a, b) Our approach turns any 2D camera into a cheap depth sensor for close-range human capture and 3D interaction scenarios. (c, d) Simple hardware modifications allow active illuminated near infrared images to be captured from the camera. (e, f) This is used as input into our machine learning algorithm for depth estimation. (g, h) Our algorithm outputs dense metric depth maps of hands or faces in real-time.

Abstract

We present a machine learning technique for estimating absolute, per-pixel depth using any conventional monocular 2D camera, with minor hardware modifications. Our approach targets close-range human capture and interaction where dense 3D estimation of hands and faces is desired. We use hybrid classification-regression forests to learn how to map from near infrared intensity images to absolute, metric depth in real-time. We demonstrate a variety of human-computer interaction and capture scenarios. Experiments show an accuracy that outperforms a conventional light fall-off baseline, and is comparable to high-quality consumer depth cameras, but with a dramatically reduced cost, power consumption, and form factor.
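The abstract names a "conventional light fall-off baseline" without spelling it out. As a rough illustration only, the sketch below assumes the simplest form of such a baseline: the observed near infrared intensity of an actively lit surface falls off with the inverse square of its distance, so depth can be recovered as d = sqrt(k / I) after calibrating the constant k from one reference measurement. The function names, the single-point calibration, and the sample values are hypothetical and are not taken from the paper.

```python
import math

def calibrate_falloff(intensity: float, depth_m: float) -> float:
    """Fit k in the assumed model intensity = k / depth^2 from one reference point."""
    if intensity <= 0 or depth_m <= 0:
        raise ValueError("reference intensity and depth must be positive")
    return intensity * depth_m ** 2

def depth_from_intensity(intensity: float, k: float) -> float:
    """Invert the assumed inverse-square model: depth = sqrt(k / intensity)."""
    if intensity <= 0:
        # No returned light: the model cannot place the surface, so report infinity.
        return math.inf
    return math.sqrt(k / intensity)

def depth_map(image, k):
    """Apply the per-pixel inversion to a 2D list of normalized intensities."""
    return [[depth_from_intensity(pixel, k) for pixel in row] for row in image]

# Hypothetical calibration: a surface at 0.25 m returns intensity 0.64.
k = calibrate_falloff(intensity=0.64, depth_m=0.25)

# Hypothetical 2x2 near infrared image (normalized intensities).
nir_image = [[0.64, 0.16],
             [0.04, 0.0]]
depths = depth_map(nir_image, k)
```

A per-pixel inversion like this ignores surface reflectance and orientation, which both change the returned intensity independently of distance. That limitation is one plausible reason a learned mapping from intensity images to depth can do better than such a baseline.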

1 Introduction

While range sensing technologies have existed for a long time, consumer depth cameras such as the Microsoft Kinect have begun to make real-time depth acquisition a commodity. This in turn has opened up many exciting new applications for gaming, 3D scanning and fabrication, natural user interfaces, augmented reality, and robotics. One important domain where depth cameras have had clear impact is in human-computer interaction. In particular, the ability to



