The Transformation of Machine Vision: The Tech Between Us | Part Two
Raymond Yin
Welcome back to The Tech Between Us...
We're continuing our conversation with Jeff Bier, President of the consulting firm BDTI and Founder of the Edge AI and Vision Alliance, on our topic of Machine Vision. To catch up on part one of our conversation, visit our Empowering Innovation Together website.
We've talked about artificial intelligence, deep learning … and people here at Mouser have heard me say many times that I truly believe AI, in all its forms, will be the most disruptive technology mankind has ever known. And you've already talked a lot about how deep neural networks are enabling these types of things. Where do you see that heading as we get into some of the newer applications and as new algorithms become available?
Jeff Bier
Well, first of all, Raymond, I completely agree with you that AI is going to be the most important technology of our lifetimes. And in the realm we focus on, let's call it perceptual AI, I think it's going to become probably the most important technology for almost any kind of device or system. And the reason I think that is by analogy with how humans work. Most of the information that comes into the human brain comes in visually, whether it's by reading something, recognizing a person or an object, or understanding a three-dimensional space. And so, it's odd, if you think about it, that almost all of our machines don't have access to this information. They don't understand the world around them. They have few, if any, sensors that help them perceive. That's going to change very quickly. And when machines are able to understand what's going on around them, we're going to find that we can make them much more effective, efficient, safe, autonomous, and so on.
The algorithms are the fundamental technology that's enabling us to do that. And the advances, especially in deep neural networks in the last 10 years, are really what's propelling this field forward. But the algorithms themselves are not sufficient. In fact, these algorithms are extremely computationally demanding. So, if we're going to deploy them out in the field and make all of these devices able to perceive the world, and by perceiving the world make them better, we're going to need to be able to deploy very powerful processors in these devices. Powerful, but not very expensive. Because if we're going to proliferate them into everyday devices, we can't be talking about a $5,000 GPU. We're probably talking about a $10 embedded processor that's going to go into a more everyday kind of system. So, we need processors that are very powerful and yet very efficient in terms of their cost and power consumption. Fortunately, there's been huge investment in the semiconductor industry over the last 10 years specifically to make processors that are very, very efficient on the kinds of algorithms we've been talking about: classical computer vision, image processing, and especially the deep learning algorithms. They are sometimes hundreds of times more efficient than previous-generation processors. And that often makes the difference between a product being viable and not. If it's going to take a $1,000 processor, well, that's probably not going in my warehouse robot. If I can do it with a $10 processor? Yeah, okay, that sounds feasible.
Raymond Yin
Right; now, you're going to put it everywhere.
Jeff Bier
Exactly. And that's the path we're on. So, after the advances in algorithms, the advances in processors are the second thing that's really propelling this field forward. So, then you say, okay, we have these incredibly powerful perception algorithms, and we have processors that can run them at acceptable cost and power consumption. So why isn't this technology already everywhere? What's taking so long? The next obstacle turns out to be the development tools and processes, because it's not simple to develop this kind of technology, this kind of solution.
Raymond Yin
Right, yes.
Jeff Bier
I would say the advances in the development tools and techniques are lagging behind the advances in the algorithms and the processors. Fortunately, there's been quite a bit of investment in the last few years in development tools that make this process easier. I'm not going to say easy, but easier and, most importantly, more accessible to people who don't have 10 or 20 years of experience in this field. People who are maybe embedded system developers, who understand software, hardware, sensors, traditional embedded software tools, and development processes, should be able to, in a matter of weeks, understand enough about how embedded computer vision and deep learning work to create some kind of initial demonstration, start getting comfortable with how this technology works, and start experimenting with how they want to use it in their product. That hasn't been the case, but it's starting to become the case now, thanks to improvements in development tools.
Raymond Yin
Oh, interesting! I know that a lot of our manufacturers are starting to come out with their own software to create these algorithms, to create the models that you're talking about. ST and Renesas, for example, have each come out with their own versions of that kind of software.
Jeff Bier
Yeah, and when it comes to deep neural network models, the first thing that people need to understand is that they're incredibly powerful. And also, it's a totally different way of doing things from traditional algorithm development. You don't so much specify a step-by-step process like you would with traditional software and algorithm development. Instead, you take an existing, generalized learning algorithm and you teach it: hey, here's my situation, here's a good one, here's a bad one, here's a safe situation, here's a dangerous situation. And to do that often requires collecting a lot of data, a lot of examples, and organizing those examples to be able to train the algorithm and say good, bad, dog, cat, whatever the case may be.
And then running some experiments to see, okay, here are my thousand examples of good and thousand examples of bad I trained the algorithm with. Now here's another thousand examples of good and a thousand examples of bad that I'll use to evaluate how well the algorithm learned. And did it learn well enough? Usually on the first try, no, it didn't. So where did it fail and why did it fail? Oh, there were some special cases where one object was blocking the camera's view of another, or an object was in a weird orientation. Okay, so now I need to supplement my training data so that the algorithm can learn those cases. So, it's a fundamentally different way of creating algorithms than what we're all used to, and it requires some effort for people to get their heads around it.
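To put that train-then-evaluate loop in concrete terms, here is a minimal sketch in Python using TensorFlow/Keras. The directory layout ("data/train" and "data/eval", each with "good" and "bad" subfolders of labeled images), the image size, and the tiny network are illustrative assumptions, not anything from a specific vendor's toolchain.

```python
# Minimal sketch of training a "good vs. bad" classifier from labeled
# examples, then checking it on examples it has never seen.
import tensorflow as tf

IMG_SIZE = (96, 96)

# Labeled examples used to teach the network ...
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=32)
# ... and a separate set held back to judge how well it learned.
eval_ds = tf.keras.utils.image_dataset_from_directory(
    "data/eval", image_size=IMG_SIZE, batch_size=32)

# A small convolutional network: no step-by-step rules, just a
# generalized learner whose weights are fitted to the examples.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255, input_shape=IMG_SIZE + (3,)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2),  # two classes: bad / good
])

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# "Teach" the algorithm with the labeled examples.
model.fit(train_ds, epochs=10)

# Evaluate on unseen examples. If accuracy is poor, inspect the failures
# (occlusions, odd orientations, ...) and supplement the training data
# to cover those cases, then train again.
loss, accuracy = model.evaluate(eval_ds)
print(f"held-out accuracy: {accuracy:.2%}")
model.save("inspection_model.keras")
```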
And it requires a whole different set of software tools, development tools, and processes compared to traditional embedded software and algorithm development. And those are the kinds of things you were just referring to. Yeah, most of the embedded processor suppliers now have a tool chain that incorporates a path for, oh, okay, you want to incorporate a deep neural network into your application? Here's how you do that with our tools. Here's how you train the network, evaluate the network, and then map it onto our chip to run efficiently. Then you can integrate it with the rest of your software, developed with more conventional techniques, to create your full application.
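As a rough illustration of the "map it onto the chip" step, here is a hedged sketch using TensorFlow Lite's converter as a generic stand-in; the vendor toolchains mentioned here (ST, Renesas, and others) each have their own flow, so none of this is specific to them. Converting a trained model to an 8-bit quantized form is one common way to fit a network into a small embedded processor; the saved model path and the placeholder calibration data are assumptions carried over from the previous sketch.

```python
# Sketch: quantize a trained Keras model to int8 for embedded deployment.
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("inspection_model.keras")

def representative_data():
    # A sample of typical inputs lets the converter pick sensible 8-bit
    # quantization ranges. Random data is only a placeholder here; in
    # practice you'd feed real captured frames.
    for _ in range(100):
        yield [np.random.rand(1, 96, 96, 3).astype(np.float32) * 255.0]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()
with open("inspection_model_int8.tflite", "wb") as f:
    f.write(tflite_model)
print(f"quantized model size: {len(tflite_model) / 1024:.1f} KiB")
```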
Raymond Yin
Exactly, and there are third parties, too. You've probably heard of Edge Impulse. We've worked with those guys for many years, and we worked with one of their ambassadors to create an eight-part series that walks engineers through how to make those models for any processor.
Jeff Bier
Yeah, Edge Impulse is a great example of the class of companies I was referring to a minute ago that have developed a suite of tools and a platform specifically oriented toward this type of application, including the deep learning portion, but not limited to it. Because most of these applications wind up using a combination of deep learning, as well as traditional image processing or signal processing or computer vision techniques, to come up with an effective and efficient solution, and to work out the flow of steps that is going to yield a successful algorithm. And then to be able to map that algorithm into efficient code running on the target embedded processor, which often doesn't have massive computation resources and massive memory, so that the algorithm can run with acceptable performance, cost, and power consumption.
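Here is a minimal sketch of that kind of combined pipeline on the target device: OpenCV handles the conventional image-processing steps (grab a frame, crop a region of interest, resize), and the quantized network from the earlier sketches does the classification. The camera index, region of interest, class ordering, and model file name are illustrative assumptions, not part of any particular vendor's or Edge Impulse's flow.

```python
# Sketch: classical preprocessing + deep learning inference on-device.
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter  # or tf.lite.Interpreter

interpreter = Interpreter(model_path="inspection_model_int8.tflite")
interpreter.allocate_tensors()
input_detail = interpreter.get_input_details()[0]
output_detail = interpreter.get_output_details()[0]

cap = cv2.VideoCapture(0)      # embedded camera module (assumed index 0)
ROI = (100, 100, 400, 400)     # x, y, width, height of the inspection area

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Classical steps: crop the region of interest, convert color order,
    # and resize to the network's expected input.
    x, y, w, h = ROI
    patch = frame[y:y + h, x:x + w]
    patch = cv2.cvtColor(patch, cv2.COLOR_BGR2RGB)
    patch = cv2.resize(patch, (96, 96))
    input_tensor = np.expand_dims(patch, axis=0).astype(np.uint8)

    # Deep learning inference on the embedded processor.
    interpreter.set_tensor(input_detail["index"], input_tensor)
    interpreter.invoke()
    scores = interpreter.get_tensor(output_detail["index"])[0]
    # Class indices follow alphabetical folder order: 0 = "bad", 1 = "good".
    label = "good" if int(np.argmax(scores)) == 1 else "bad"
    print(label)
```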
Raymond Yin
Yeah, and like I said, the guys at Edge Impulse have done a great job, as well as the folks over at ST. You're right, I think that is opening up a whole new realm of software and algorithm development beyond standard embedded software coding, I mean standard C++ or whatever it may be. So, I'm really excited, as it sounds like you are as well, about where that is leading. At what point does that completely democratize the usage of machine learning for these types of applications?
Jeff Bier
Yeah, it's very exciting. I think we're at a point now where the democratization has reached a level where it's pretty easy to get started. Somebody with an embedded software, electrical engineering, or embedded processor background can create a demonstration-level solution, presuming they have the ability to get the needed data. The data is key for training these algorithms. Getting from that initial demo to a production-worthy solution that's really robust, for example under all the challenging environmental conditions, if we go back to the truck unloading a tubular container versus a rectangular container, that tends to still require the help of experts with a lot of experience. There's still a lot of what you might call tribal knowledge, artisanal knowledge, that comes primarily through experience. It's hard to automate that, but there's more and more investment going into doing so, and the progress is really encouraging. So, hopefully in the not-too-distant future, anybody with typical embedded system development skills will be able to create a production-worthy machine-learning-based computer vision solution in a matter of hours. But we still have a way to go. Right now, you can get started in a matter of days or weeks, if not hours, and that's exciting and encouraging.
Getting the rest of the way to a production-worthy solution still requires a certain level of expertise and effort. It will be good to keep paring that down, but it'll take some time.
Raymond Yin
Yeah, but I think we will eventually get there. And, once again, before we know it, creating these types of algorithms and models will truly be as simple as writing C++ for your typical embedded designer.
Jeff Bier
Well, I would say even simpler. C++, if you think about it, is simple if you do it for a living, but as the programmer you're expressing your wishes in a way that is very carefully structured for the benefit of the machine. What if you didn't have to program at all, and could just express yourself in plain language, as if you were having a conversation with ChatGPT? We're not that far from having access to that kind of approach, where you can express yourself in natural language, show examples explaining what you're trying to do, and let the AI interpret that and do the tedious, repetitive work of, okay, should it be this deep neural network topology or that one? Do I need more training data, and if so, can I synthesize the additional training data I need with computer graphics, instead of telling you, the developer, I need you to go find me another 500 images so that I can properly train this algorithm? We can make it even easier than writing C++ code for an embedded system, but it'll take a while to get there.
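One small piece of that automation, synthesizing extra training images instead of asking the developer for 500 more photos, can be sketched simply. This is only a hedged illustration using basic geometric and lighting variations with OpenCV; full computer-graphics rendering of new scenes, as described above, is well beyond this sketch, and the file paths are assumptions matching the earlier examples.

```python
# Sketch: generate synthetic variants of existing training images to
# supplement the dataset (rotations, small shifts, lighting changes).
import glob
import random
import cv2
import numpy as np

def synthesize_variant(image):
    """Return a randomly rotated, shifted, and re-lit copy of an image."""
    h, w = image.shape[:2]
    angle = random.uniform(-25, 25)                     # odd orientations
    tx, ty = random.randint(-10, 10), random.randint(-10, 10)
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    m[:, 2] += (tx, ty)                                 # small translation
    variant = cv2.warpAffine(image, m, (w, h), borderMode=cv2.BORDER_REFLECT)
    gain = random.uniform(0.6, 1.4)                     # lighting changes
    return np.clip(variant.astype(np.float32) * gain, 0, 255).astype(np.uint8)

for path in glob.glob("data/train/good/*.jpg"):
    image = cv2.imread(path)
    for i in range(5):  # five synthetic variants per original image
        cv2.imwrite(path.replace(".jpg", f"_synth{i}.jpg"),
                    synthesize_variant(image))
```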
Raymond Yin
Sounds good! Something to look forward to!
Jeff, we've talked a lot about getting started with perceptual AI, and especially the AI portion. We've pointed to some resources on the Edge AI and Vision Alliance website; where else can engineers go to jumpstart their knowledge of all this?
Jeff Bier
Our organization, the Edge AI and Vision Alliance, is really focused on exactly that: on helping engineers first get the inspiration on how they can use this technology to solve the problems they're trying to solve with their products, and secondly, to get the know-how, the practical skills and knowledge they need, the connections to suppliers, and so on, so that they can effectively implement visual AI or perceptual AI in their product. We do that in a number of ways. One, like you mentioned, is the website. The website is edge-ai-vision.com. That's the main Alliance website. Another main way we do this is through our annual conference called the Embedded Vision Summit, which takes place every year in May in Silicon Valley. And it's really the only conference that is entirely focused on the needs of engineers and engineering managers who are developing products, across all industries, incorporating this kind of machine perception.
I would recommend people check out the Embedded Vision Summit. This year, it'll be May 21 - 23 in Santa Clara, California, in Silicon Valley. It's really a fantastic place to learn, from the fundamentals up through advanced practical techniques, how to incorporate perceptual AI into products, and to learn from other product developers who've used these techniques and know how they work and what their limitations are in the real world. And for folks listening to this podcast who are interested in coming to the Embedded Vision Summit in May, I have a discount code they can use. It's Summit24-Mouser. It's not case sensitive, Summit24-Mouser, and that'll get you the best available price on registration for this year's Embedded Vision Summit in May in Santa Clara.
Raymond Yin
Wow, thanks, Jeff! We really appreciate that!
The curriculum is amazing. And the participants, for the most part, are people working at large corporations who, like you said, have gone through the trials and tribulations of creating these systems and algorithms and want to share that knowledge with other engineers.
Jeff Bier
Yeah, what we're trying to do is present what's working in the real world: here's what it does, here's what it doesn't do, here are some of the problems that can come up when you try to use it, and here's what's worked for other people when they've run into those problems.
Raymond Yin
That's awesome! And just a couple of months from now.
Let's shift for a moment and highlight one of Mouser's in-depth content articles on machine vision. We take a look at how integrating deep learning into embedded systems requires adept handling of various hardware and algorithmic challenges. You can read the full article and other pieces by visiting mouser.com/empowering-innovation. Now back to the conversation as we discuss the future of machine vision.
All right, Jeff, I'm going to ask you to get your crystal ball out now. Looking forward, we've talked about artificial intelligence, deep neural networks, the advancements in embedded processing, advancements in sensors. When you put all this together, where do you think it's going to have the largest societal impact?
Jeff Bier
It's a great question. And I think it's hard to pick the single area where it's going to have the biggest impact, because honestly, like we touched on earlier, I think it's going to have huge impacts in so many different areas. But if we talk about the public at large, I think where it's going to have the biggest impact is in three areas.
One is safety, like the safety of driving a car or being a pedestrian or a cyclist around cars. We're already seeing a really positive impact from perceptual AI being incorporated into cars and trucks for safety features, and that's going to get more widespread and more effective very quickly.
A second area is ease of use. With most of the technology that we, as consumers or workers, interact with these days, we have to interact with the technology on its terms. We have to explain our intent to it in a way that the machine understands, versus the machine being able to interact on human terms and understand, for example, speech and gestures and facial expressions and eye gaze. So, I'll give you an example. You go into the airport and you're looking for your gate, and there's a huge bank of monitors with 100 or 200 flights listed. And if you're like me, you're middle-aged and your eyesight isn't so great, it's hard to pick out your flight.
Well, what if, with your consent, the airline knows who you are and can identify you? And it says, oh, you're Raymond, you're on flight 1722 to Atlanta, and I'm going to show you, and only you, using projection techniques. So, based on your physical location, the screen is now showing an image that only you can see, in a font so big and bold that even your middle-aged eyes can pick it out from 30 feet away: "Hi Raymond, your flight to Atlanta is leaving in 96 minutes from gate 14, ahead and to the left."
Now, some people might find that a little creepy, but let's face it, if you're in an airport, your picture has already been taken. You're giving up any expectation of privacy inside an airport. And so, to me, that's a powerful example of how a machine that's able to perceive can make our lives much, much easier.
And the third area is healthcare. There are so many aspects of healthcare that are benefiting from this kind of machine perception. A simple example: let's say you've got a mole on your skin and you're wondering, could this mole be cancerous? Could this be dangerous? It's been shown that deep neural network algorithms operating on photographs of skin moles are as accurate as trained, experienced dermatologists in classifying whether the moles are dangerous or not. What's great about that is, if you're somebody who is susceptible to skin cancer, it's something you can check at home maybe once a week, versus going to the dermatologist once a year and risking, oh, this thing's been developing for nine months, but since I only see the dermatologist once a year, we didn't catch it until nine months later.
An even more amazing set of examples comes from retinal imaging. Go to the optometrist or the ophthalmologist and they look into your eye.
Raymond Yin
And they shine that blinding light.
Jeff Bier
One of the things they're looking at is your retina. They're looking at the blood vessels to assess the health of your eye. So, it won't be surprising if I tell you that deep learning algorithms can do as good a job as a trained physician at interpreting the health of your blood vessels. By now, we kind of expect that sort of thing. But what is astounding is that from looking into your eye, deep neural network algorithms can also infer all kinds of other things about you and your health. For example, they can tell your gender, which a trained physician cannot do from a retinal image. They can estimate an important parameter of your heart health called the ejection fraction…again, from an image of your eye.
These visual perception technologies are opening up huge capabilities for the early, non-invasive detection of health conditions. So the potential for improvements in our health is massive.
Raymond Yin
We've talked about, and written about, how AI and machine learning are becoming more accurate than radiologists at reading X-rays and MRIs and things like that. I had not heard that about the eye. That's really interesting.
Jeff, those are some amazing new ways that I think vision will really affect society in general rather than individual companies. Where do engineers go to learn where to start developing that kind of stuff?
Jeff Bier
The Edge AI and Vision Alliance website is a great resource, with lots of practical articles, demos, tutorials, and so on. That's edge-ai-vision.com. But I think the single best resource is the Embedded Vision Summit conference and trade show, which takes place May 21 - 23 in Silicon Valley. It's a great place to take inspiration from other engineers who've developed vision-based solutions for the real world, and to see the latest building-block technologies, the camera modules, development tools, processors, and so on, to figure out how you can effectively incorporate this kind of perceptual AI into your system.
Raymond Yin
And there you have it, a deep dive into the latest on machine vision or vision AI.
Jeff, thanks so much for being our guest today on The Tech Between us!
For those looking for more content, be sure to use that discount code to see Jeff and his team at the Embedded Vision Summit in May and explore the rest of Mouser's offering on this subject by visiting mouser.com/empowering-innovation.