Education

Carnegie Mellon University

Aug 2018 - December 2019
Master of Science, Computer Vision | CGPA: 4.17 / 4.33

Relevant Courses: Geometric Based Methods for Vision (16-822), Advanced Multimodal Machine Learning (11-777), Visual Learning and Recognition (16-824), Robot Localization and Mapping (16-833), Computer Vision (16-720), Introduction to Machine Learning (10-601), Mathematical Fundamentals for Robotics (16-811)

Delhi Technological University

Aug 2014 - May 2018
Bachelor of Technology, Computer Science and Engineering | CGPA: 9.368 / 10

Relevant Courses: Artificial Intelligence, Neural Networks, Data Warehousing and Data Mining, Data Structures and Algorithms

Experience

Microsoft

February 2020 - Present
Senior Applied Science Manager

I oversee the development of object detection, semantic segmentation, and tracking models for federal customers using aerial/satellite imagery. We collaborate closely with clients to understand their needs, design tailored models, and employ the latest computer vision techniques. Our goal is to provide accurate and reliable insights from visual data to aid their decision-making processes.

Carnegie Mellon University, Robotics Institute

August 2019 - December 2019
Graduate Teaching Assistant

TA for the graduate Computer Vision course (16-720) taught by Prof. John Galeotti. Responsibilities included preparing assignments, grading, and holding office hours for students.

Uber Advanced Technologies Group

May 2018 - Present
Perception Intern | Advisor: Mr. Warren Smith

Perception Intern at Uber Advanced Technologies Group (https://www.uber.com/info/atg/), the self-driving division of Uber.

Built a model to characterize LiDAR performance and identify the key factors that contribute to detector performance, then used that model to drive decisions about which specifications matter for next-generation LiDARs.

Successfully tested transfer learning approaches that let models be reused across LiDARs with different beam spacings and scanning patterns.

Carnegie Mellon University, Robotics Institute

Jan 2018 - May 2018
Graduate Teaching Assistant

TA for the graduate Computer Vision course (16-720) taught by Prof. Srinivasa Narasimhan. Responsibilities included preparing assignments, grading, and holding office hours for students.

Carnegie Mellon University, Computer Science Department

June - August 2018
Visiting Research Scholar | Advisor: Prof. Dave Touretzky

Developed a multi-camera/multi-robot facility for Cozmo, an autonomous robot, by repurposing old phones to act as perched cameras.

Performed camera calibration and worked on SLAM. Created an independent server that shares a world map with its clients (the robots), helping them plan paths and navigate more effectively.

IBM Research Labs, New Delhi, India

June - August 2017
Research Intern | Manager: Dr. Sameep Mehta

Worked on a contextual in-video advertising project that helps advertisers place their brand in a contextually relevant video, at the least intrusive position in that video.

Created a context-aware ad recommendation/insertion system using multi-modal analytics through semantic understanding of video content.

Shopclues (an E-Commerce Marketplace), Gurugram, India

June - August 2016
Software Development Intern

Handled the notification module in Shopclues’s Point of Sale (POS) app by integrating Firebase Cloud Messaging and Firebase Notifications.

Integrated Firebase Analytics in the application to log important events.

Publications

Research and Projects

Sensor Fusion with Single-Photon Detectors | Advisor: Prof. Matthew O’Toole
  • Explored and developed a suite of sensor fusion techniques around an emerging sensing technology known as a single-photon avalanche diode, or SPAD.
  • Developed a novel sensor fusion algorithm to fuse input data from single-photon LiDARs, a stereo camera pair, and radars.
  • Proposed a solution that builds an intermediate cost-volume representation from the different sensors; when passed through a deep neural network (PSM-Net), it yields better disparity and depth estimates of the scene.
Action Recognition using Synthetic Data | Advisor: Prof. Kris Kitani
  • Worked on recognizing multi-object activities such as a car turning left or right or making a U-turn.
  • Worked on generating synthetic data that closely matches real-world data, using a graphics environment such as Unreal Engine.
  • Deployed a bi-directional RNN to recognize the activities.
Semantic Understanding of a Video | Advisor: Dr. Rajni Jindal
  • Preserved the context as well as the temporal sequence of a video by understanding its semantics.
  • Created a sequence-to-sequence model in which an LSTM serves as both the encoder and the decoder, and used it to generate a natural-language summary of the video.

Skills & Proficiency

Python

OpenCV

C++ & Java

Android Studio

HTML5 & CSS

Cozmo Tools