A computer vision engineer giving robots the ability to interact with the world

For this installment of the “Mujinians Voice” series, we would like to introduce Jeronimo, who leads our computer vision team. He worked on the same project as Mujin CTO Rosen at Carnegie Mellon University and later joined Mujin as one of the first engineers.

– Could you tell us about your background? 

I was born in Portugal. While growing up, I became increasingly interested in physics, computers, and math. At university, I mainly studied electrical engineering for my bachelor’s degree. I studied power grids, computer science, electronics, telecommunications, and robotics, and decided to pursue robotics for my master’s. While I worked on many topics in robotics, such as navigation, distributed formation control, and system identification, my main research topic became computer vision for shape recognition, object detection, and classification. I presented my work at top international conferences such as IEEE CVPR, and that opened the path to more research topics and to the start of my PhD at the Robotics Institute at Carnegie Mellon University.

– Why did you become interested in computer vision? 

I was always fascinated by robots, and always searching for ways to make them smarter and more autonomous. I could see the role that vision plays in human tasks, and how applying that same perception loop to robots could play an important role in robot autonomy.

The main difference between robots and machines is the existence of perception as part of the loop that commands actions. If a robot cannot perceive its surroundings, somebody has to push a button or send a signal to move it. In such a case, the robot is just a machine, and its range of use and autonomy is very limited.

Computer vision can enable robots to perceive, model, and interact with their surrounding environment, adding a vast range of capabilities. Smart robots can, for example, map the environment, localize themselves in it, and navigate through it. They can interact with people, react based on the current surrounding context, manipulate objects, and command other machines. By giving robots the power of computer vision, we can make them smarter and much more useful.

– How did you get to know Mujin CTO, Rosen?  

At the end of my master’s degree, I got to know one of the advisors from the Robotics Institute at Carnegie Mellon University, Prof. Takeo Kanade, who was also Rosen’s advisor and later became an advisor for Mujin. People call him “the father of computer vision.” Because of him, I decided to enter Carnegie Mellon University for my doctoral degree, where I met Rosen.

Rosen was, from the beginning, someone I admired for his focus and hard work, as well as his impressive capability to work on very complex projects while making them look fairly easy and smooth. While giving a presentation he would often create complex robotics content on the fly to demonstrate his point (programming faster than I could move my fingers!) and create very meaningful technical discussions. That really held my attention! 

We started to interact with each other when Rosen and I began working on the same project. It was a bin-picking project run as a partnership between the Robotics Institute and a major Japanese car manufacturer. At that time, performing generic object pose estimation of textureless shiny objects from single 2D images and demonstrating picking with generic motion planning was unique work. It still is! We flew to Japan to present that system. I remember Rosen saying to me at that time, “Wow, awesome hard work!” for my contributions to the project.

After less than a year, Rosen graduated from Carnegie Mellon University and moved to Japan, to the University of Tokyo. Meanwhile, I continued working on bin picking, pushing the project further, while I also interned at Qualcomm and Google to create computer vision applications for augmented reality on hand-held devices.

– What happened after that? 

Rosen started Mujin with Issei in 2011. As time went by, Mujin felt an increasing need for in-house computer vision applications to blend with its motion planning algorithms. When Prof. Takeo Kanade visited their office, they asked him if he knew someone suitable for a computer vision engineer position at Mujin, and he suggested me. While I was still working on my PhD, I started developing computer vision for picking applications at Mujin, remotely.

One day, Mujin was preparing a metal picking demonstration for an exhibition, DMS 2014, using a computer vision program I wrote. When there were only two days left before the exhibition, I got a call from Rosen: “Hey, we have just put the hardware and software together, but the system is not working and we need to ship it now! Can you come and take a look at it at the expo?” After that call, I immediately checked when the next flight was and got onto the first plane the next morning. Because of the time difference, I arrived at Narita airport at 6 am on the day of the exhibition, and reached the exhibition hall at 8 am. Then, I fixed the system just before 10 am, when the exhibition started!

“Join, join, join!” Rosen and others kept telling me. During that visit I also became fascinated by Mujin and its initial core team. Making a robot move autonomously at the command of your code is very fascinating, and a completely different feeling from writing research papers. Mujin had an amazing company vision, a great and smart team, and was working on something nobody had ever accomplished before. Two months after the exhibition, I officially joined Mujin!

– How was life at Mujin at the beginning? 

At that time, most people joined Mujin from outside Japan. While it was very challenging to create an entirely new robotics product from scratch, life in a new country was also hard for us. But CEO Issei made our initial days so much easier.

When I first came to Japan, he let me live in his house. I used his bathroom, his sofa, and his room. He cooked for us frequently. Even if I ate all the food, Issei would say it’s okay. I ate a lot actually. 

I believe the strength of Mujin comes from the technology and the people. Because of Issei’s sense of taking care of people, we were able to create strong bonds and concentrate on our work. 

I was the first computer vision engineer, so there was no code until then and I had to start mostly from scratch. At that time, Mujin didn’t have a fixed, solid product, and we frequently had to work on completely different demonstrations. We were trying to figure out which kinds of robotics applications would be most useful to the real world and to the customers we could find, while building a comprehensive library from scratch.

We showed our systems at international robotics conferences such as ICRA and at exhibitions such as iREX. We started to provide picking solutions in manufacturing. Later, we also entered logistics, where there were no other actual solutions for picking automation. We gradually came to understand the market and customer needs, and evolved our products.

Since then, Mujin has developed so rapidly and dramatically that, when I joined, I could not have imagined the company as it is now.

– How’s your work now? 

Now we have several kinds of solutions in manufacturing and logistics. While we grow into new technologies, we work hard at making the current solutions very, very solid. We are trying to close the gap between the ideal and reality. We are making our products faster and smarter. It takes lots of hard work to make robotics perfect and easy to use, and I am very excited to make that happen to the very high standards of production settings.

I am now leading our computer vision team. I want to bring more great engineers together to give better perception to our robots and achieve the company mission of improving quality of life through automation. Creating very reliable computer vision or robotics applications is incredibly challenging, in particular when we need to design applications that are production ready, sustainable, and reusable in different applications and environments. I think it is important to share this big picture of the work with teammates and to aim for this great common goal.

– What kind of people do you think would fit your team? 

Those would be super smart people with a very strong will to realize our intelligent robot system. There are lots of challenges in front of us, and we want to push the limits of what can be done today even further. Some challenges need very technical contributions, while others need innovative and creative ideas. But above all, whoever joins should have a strong passion for robotics and automation and a never-surrender spirit!

– What do you think of Mujin? 

People have realized automation throughout history. Thanks to transportation, people can travel in an hour a distance that would take several days on foot. Thanks to PCs and search engines, people can acquire information instantly without manually searching libraries for valuable content. Automation helps people get the same output with less effort or in a shorter time. It makes our lives easier.

Because of technological limitations, there are still many repetitive tasks people have to deal with. We aim to make robots work on these tasks autonomously, so that people can spend more time on creative work. The path will not be easy, but I feel super excited about the future Mujin will create!
