Wrapping Up Evolving Automotive: Zonal Architecture - The Tech Between Us, Part 3
The Tech Between Us: Zonal Architecture, Part 3
Raymond Yin
Welcome back to The Tech Between Us. The conversation isn’t over yet. Christian Uebber, CTO of ETAS, is with us again today to explore software-defined vehicles and the need for stronger hardware and software co-design to enable freedom on the software layer. To catch up on our previous episodes, visit mouser.com/empowering-innovation.
So, Christian, as we move to a zonal architecture, how is that going to change current software engineering processes? For example, the way code is written, tested, or even the frequency of updates?
Christian Uebber
We have been integrating all the functions we run on a microcontroller into large execution tables where we say: you run every 20 milliseconds, you run every second, you run every 10 seconds, and so on. Everything a function did had to be integrated into these kinds of tightly packed schedules. And if you do that, you naturally become really coupled to your actual hardware, no matter how many hardware abstraction layers you introduce architecturally. If you tightly pack a lot of functions in this kind of integration pattern onto a microcontroller and utilize it by, like, 97%, everything becomes highly dependent on that piece of hardware. If you want to simulate it, the simulation needs to behave exactly like that hardware or it's worthless, and so on.
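As an illustration, this kind of execution table might be sketched as follows; the task names, periods, and tick resolution are invented, and a real ECU would use an OSEK/AUTOSAR-style scheduler, not anything like this Python toy:

```python
# Sketch of the cyclic execution-table pattern: every function is assigned
# a fixed period and dispatched from one tightly packed schedule.
# Task names and periods are made up for illustration.

calls = {"fast": 0, "medium": 0, "slow": 0}

def fast_task():            # e.g. runs every 10 ms
    calls["fast"] += 1

def medium_task():          # e.g. runs every 20 ms
    calls["medium"] += 1

def slow_task():            # e.g. runs every 1000 ms
    calls["slow"] += 1

SCHEDULE = [(10, fast_task), (20, medium_task), (1000, slow_task)]

def run_schedule(duration_ms, tick_ms=10):
    """Dispatch each task whenever its period elapses (cooperative, no preemption)."""
    for now in range(0, duration_ms, tick_ms):
        for period_ms, task in SCHEDULE:
            if now % period_ms == 0:
                task()
```

The coupling Christian describes is visible even here: every task implicitly depends on the tick resolution and on whatever else shares its time slice.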
And as long as we write functions like that, it's really hard to evolve the system in a stable way. It'll be coupled to the cycles of your hardware. There are some tricks you can do to get less reliant on that. One is that you try to write functions that are less dependent on actual physical time. You write functions that depend mostly on the data at the input port of the function. So you say: the same data in always produces the same result, and then you build more complex functional chains out of that. And if you can assure that the same data in always produces the same data out, no matter what the timing is, then all we need is to find fast enough hardware, deploy this function chain, and usually it's going to run well. And this requires a change in how we write our functions. For a function developer, a large part of the work has been a mental model of: hey, I'm living in this time slice.
It's 10 milliseconds long, and after seven to eight milliseconds you start asking your runtime how much time you have left. Depending on the answer, you do additional stuff before your time slice is over. And if you write your function like that, it becomes super dependent on the actual micro timing of that hardware, and you cannot run that function anywhere else. So you take just this function call away from developers. You say: you cannot, must not ask the runtime how much time you have left. There's not even a concept of physical time that you can base your calculations on. All you have is a mailbox; if I have new data for you, I will put it there. You get an activation, then you run, you make a transformation on that data, and then you put it in your outbox.
That is all you see of the world. You see the input data, and when you're finished calculating, you put the result in your outbox. And you also forget everything you knew before; if I call you again, there's nothing left. So if that's not okay for you, you have to put even the stuff you want to memorize in your outbox, so that I can give it back to you in your inbox when I call you again. This is a change of mental model for a lot of function developers who come from traditional embedded development, but it gives you tremendous benefits in how you can scale the development of that system. Because, for example, imagine you have such a system and you build a virtual test in the Cloud.
All this depends on is actual data. There are no dependencies on actual timing and so on. You can run this function anywhere; it will always produce the same result. And you give that back to developers: hey, for everything you do, you get very fast, high-quality feedback on whether the code you just submitted is still working correctly, functionally correct, no regressions, better or worse than in the last iteration. It makes it much easier for developers to collaborate with each other. So this is a development paradigm that is changing. And to sum it up: the key is not abstraction layers. The key is reducing the dependencies of the functions that we provide. We try to be only data dependent, and then we gain a lot of flexibility in the long run.
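The mailbox pattern Christian describes, same data in, same data out, with remembered state round-tripped through the outbox, might be sketched like this; the signal names and the filter itself are invented for illustration:

```python
# Sketch of the data-driven pattern: a function sees only its inbox,
# returns an outbox, and keeps no memory between activations. Any state
# it wants to "remember" must travel out through the outbox and come
# back in the next inbox. Field names here are hypothetical.

def filter_speed(inbox):
    """Pure transformation: the same inbox always yields the same outbox."""
    raw = inbox["raw_speed"]
    prev = inbox.get("filtered_speed", raw)   # state handed back by the runtime
    filtered = 0.8 * prev + 0.2 * raw         # simple low-pass filter
    return {"filtered_speed": filtered}

def run_chain(samples):
    """A runtime (or a Cloud test harness) driving the mailbox loop."""
    state = {}
    outputs = []
    for raw in samples:
        state = filter_speed({"raw_speed": raw, **state})
        outputs.append(state["filtered_speed"])
    return outputs
```

Because `filter_speed` never consults a clock or a runtime, `run_chain` behaves identically on a microcontroller target, a desktop, or a Cloud test machine.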
Raymond Yin
But like you said, that's going to require a complete paradigm shift in the way code is written, especially that lower-level microcontroller code, because you're removing all of the time dependencies. You're just saying: you've got data coming in and you've got data going out, and what you do with it in between is up to you.
Christian Uebber
Yes. So, the mental model changes a lot. Also, development executives sometimes believe you pay your developers for function development. In the embedded space, this is only a fraction of what they do,
Raymond Yin
Right?
Christian Uebber
Of course they need to communicate a lot. There's a lot of finding out what actually works. But a big part of embedded development is actually making your code run together with all the stuff your colleagues have created, in a very tight package: integration, downsizing, making it stable, not oscillating, and so on. This is a lot of the work, and it's all based on physical timing and how the functions interact with each other based on physical time. If you take this out and say, no, all you see of the outside world, and all you need to care about, is the data in your inbox and an efficient transformation to your outbox, this is a big constraint, but it also opens a lot of new freedoms. Especially the developers who follow the newest C++ standards, with pure functions and new memory semantics and so on, can relate to that really well, and they like to try out these kinds of patterns and write functions like that.
Raymond Yin
Right. Now, some of these new microcontrollers coming out are still essentially control-level cores. For example, the new Cortex-M85 from Arm, I'm not sure if you're familiar with it. It has some amazing capabilities. Can microcontroller developers and function developers simply bludgeon it with pure CPU cycles and pure compute power at that same lower level and achieve what they need to do?
Christian Uebber
Well, I think there's certainly space for that. You cannot write all of your system in the data-dependent way I've just described; it is a good fit for a subset of the overall problem to solve. I think there's certainly a case to just
continue applying the expertise we have developed in the microcontroller space to bigger farms of microcontroller cores, where we just continue what we have been doing so far, maybe with the additional benefit of hypervisors, where we can more easily partition how we utilize the system, but overall continue in a very classical style of microcontroller programming. I think there's certainly a lot of room for that. Vehicle dynamics, for example, is still far away from the end of the road of doing really high-precision maneuvers. What you can do with each wheel is really amazing, being like glued to the street. There's still a lot of super real-time, latency-critical stuff that we haven't tapped into. And if we want to tap into it in a safe way, we need microcontroller-style programming.
Raymond Yin
Okay, so let's shift a little bit. We've already talked about that move from the pure lower-level, traditional microcontroller ECU type of space to a higher-level compute platform. You've talked about the difficulty of migrating that code, and about some of the potential changes, like the microcontroller engineer no longer getting a specific time slice and instead having to work within a larger box. What other types of changes are going to be required? You briefly touched on Cloud-based testing going to a higher level. What types of changes are going to be required to effectively test these new platforms while maintaining those safety-critical functions?
Christian Uebber
Yes, you raised a super important point. If we want to become higher-performing software delivery organizations, which means working in increments at a much higher frequency, this is impossible without high-quality testing, which is to a large extent automated and runs in the Cloud, where you have a lot of cheap, dynamic compute resources to run these tests. And there's a pattern in automotive projects: for like 20 years, we've done retrospectives after a task force, and every time we conclude we should have invested earlier into virtual testing. Either the real hardware that you can do the extra testing on is available too late, or the high-quality simulators also come too late. I think there is some improvement nowadays, so we get earlier availability of virtual hardware, but it's still very compute intensive, and you cannot spin up massive compute resources for a short test cycle with every small increment a developer submits. This is where we can actually learn a lot from the IT industry: have a test pyramid with the real hardware at the top, the virtual hardware below, and then different types of tests, especially for the data-driven execution that we talked about. You can do a lot of things much lower in the pyramid to provide very fast test feedback to your developers, who collaborate in a Cloud-based environment. The Cloud is a really great place to have hundreds of developers collaborating with each other.
To actually do embedded development in the Cloud, you need to offer your developers representative test behavior, because, as we talked about, function development is only a tiny fraction, and a lot of development time is spent fitting many functions into this tight space. If your virtual target in the Cloud behaves differently than the stuff on your desk, people will continue to work at that desk. It will not be highly collaborative, it will not be incremental, it will just be the old model. So, offering high-quality tests and smart test solutions for fast feedback in Cloud-based environments is really critical going forward. It's also a portfolio topic that we invest heavily in.
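The lower levels of that test pyramid are cheap precisely because a data-dependent function needs no timing-accurate simulator: a regression test is just recorded inputs plus expected outputs, runnable on any Cloud machine. A minimal sketch, with an invented function and invented golden data:

```python
# A data-dependent function has no timing or hardware dependencies, so a
# "golden data" regression test can run anywhere, instantly, on every
# code increment. Function and data are invented for illustration.

def saturate(x, lo=-1.0, hi=1.0):
    """Clamp a normalized actuator command into its allowed range."""
    return max(lo, min(hi, x))

# Recorded (input, expected_output) pairs from a known-good version.
GOLDEN = [(-2.0, -1.0), (0.5, 0.5), (3.0, 1.0), (0.0, 0.0)]

def regression_passes():
    """True if the current code still reproduces the golden outputs."""
    return all(saturate(x) == expected for x, expected in GOLDEN)
```

Tests like this sit at the bottom of the pyramid; only the small remainder that genuinely depends on physical timing needs the scarce virtual or real hardware above.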
Raymond Yin
Okay, now on testing, and once again coming from what we've always used in the lower-level embedded world: every MCU ever made has a dev kit, and I guess there's really no such thing as a dev kit for a Mercedes. So you're right, you do have to have these virtualized systems that accurately mimic the actual operation of the system.
Christian Uebber
Yes, and if it's worthless, they're not going to use it. They'll just continue with the dev kit on their desk.
Raymond Yin
Right, exactly.
Before we get to the second half of this episode, let's highlight a particular use case for zonal architecture. In software-defined vehicles, how will the flexibility and scalability of zonal architecture enable vehicles to adapt to changing needs while increasing safety, efficiency, and personalization? To explore this use case and other exclusive content, visit mouser.com/empowering-innovation.
You also talked a little bit about updating and software update cycles, where consumer devices are just constantly updating. Is that going to shift once again, moving to some of these new architectures and platforms, the way automotive software is being done for these new compute platforms on wheels?
Christian Uebber
That's an interesting question. So certainly there's a demand for a higher frequency of updates, and there's also proof from new players that this is possible, especially on heavily centralized architectures. But we have to look at what kind of updates are delivered. It's certainly very easy to offer something like a fart mode, a feature that puts a smile on your customer's face, but it's a gimmick. And this is certainly possible, and we should make sure that we have at least part of our tech stack where such an update is easily possible, much faster than in the past. That can be on the infotainment system or on an SDV edge stack, something like that. But on the other hand, if we look at what kind of functions OEMs, and we as tier ones at Bosch, are actually investing into: how much of this investment is really this kind of user gimmick, compared to everything else we're investing into?
And then we see a lot of investment really go into hardcore vehicle functions that influence the actual movement of the vehicle on the street. So it's safety critical, it's highly dynamic. It's amazing how sensitive customers are to really microsecond behavior in their vehicle dynamics. Really interesting. You really feel whether you have a great application of your vehicle, where everything just fits and works nicely together, versus a vehicle that is just the default configuration with a little bit on top. A lot of these functions are actually doing something concerning vehicle dynamics, concerning safety, concerning the actual physical world, and offering a user-managed, individual update mechanism there is a much harder problem than we think. So I think there's a class of functions where making a high-frequency, even user-triggered update easy is possible. And there's a part where we talk about actual functions that interact, where the OEM is still responsible: for every configuration newly made possible by updates, they need to deliver on the responsibility that the vehicle is always in a safe state, no matter what update or combination of updates is applied. And I think there could be at least two lanes.
So one lane for the part of the vehicle that contains all the safety-critical functions, which are released together, like in a platform release. This is managed by the OEM; they do all the testing before they send it to the fleet, including all the tests of combinations with each other, where they say: hey, this is version 1.3 of the base platform, if we upload this to the vehicle, we certify this is safe. Then you have another part of the vehicle tech stack where you allow much more independent, user-triggered updates. This can be infotainment, this can be an SDV stack, where you say the base platform is sufficiently separated from that domain of the vehicle, you make a good safety case for why this will hold under all conditions, and then you can allow much more dynamic update behavior on that partition. And I think for a really good end-to-end customer experience, you need to supply both; just building these platforms for the vehicle is not enough. It's a combination of how the OEM can keep up with the responsibility of always guaranteeing safe operation on the road, combined with higher update rates on these new, more dynamic platforms.
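As a sketch of how such a two-lane update policy might look in code; the partition names, version strings, and certified set are all hypothetical:

```python
# Sketch of the "two lanes" update policy described above: safety-critical
# software only moves in whole, OEM-certified platform releases, while a
# sufficiently isolated SDV/infotainment partition accepts independent,
# user-triggered updates. All names and versions are illustrative.

CERTIFIED_PLATFORM_RELEASES = {"1.2", "1.3"}   # tested as a whole by the OEM

DYNAMIC_PARTITIONS = {"infotainment", "sdv_stack"}

def update_allowed(partition, version, user_triggered):
    """Decide whether an update may be installed on a given partition."""
    if partition == "safety_platform":
        # Lane 1: only certified platform releases, never user-triggered.
        return (not user_triggered) and version in CERTIFIED_PLATFORM_RELEASES
    if partition in DYNAMIC_PARTITIONS:
        # Lane 2: isolated by a safety case, so dynamic updates are acceptable.
        return True
    return False   # unknown partitions get nothing
```

The point of the split is that the expensive whole-release certification cost is paid only on the first lane, while the second lane can iterate at consumer-electronics speed.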
Raymond Yin
Alright, yeah, like you said, there are the features that make people happy, and these feel more like a consumer type of feature or add-on, as opposed to the underlying work of making sure the car is safe and continues to operate reliably, and so on, which the consumer either isn't aware of, doesn't care about, or just takes for granted, versus, hey, I just got an update, my new app just updated. So, like you said, that makes a lot of sense: doing it at two different levels, where one is features and noncritical stuff, and one is you've got to do this or else.
Christian Uebber
I can give you one concrete example: your battery management. Maybe you have a part of the function that is written in classical embedded style, in a high-safety manner, that always assures your battery cannot overheat, that you keep it within the bounds of safe physical parameters, and so on. In the classical development mode, you would have a battery controller where these features are allocated, along with all the other features, like: how do I communicate with the charging station, how do I communicate with the backend network of the charging provider. But that need not be the case. All you need to offer in the base platform is making sure the battery cannot be damaged. An SDV stack is much more capable of keeping pace with the innovation and ecosystem momentum that's happening: there are new standards, new charging providers, new alliances, new protocols to speak over the wire. You can allocate these functions on a much more dynamic, easily updateable stack, and they communicate with the safety-critical functions that actually control the battery. I think we'll see many more of these combinations, where we combine the best of both worlds.
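A rough sketch of that battery split, with invented limits and function names: the base platform enforces the physical invariant, while the frequently updated SDV-stack code can only ever request a charging current:

```python
# Sketch of the battery-management split described above. The base platform
# owns the invariant "the battery cannot be damaged"; the charging-protocol
# logic lives on the easily updateable SDV stack and merely requests a
# current. Limits and names are invented for illustration.

MAX_SAFE_CURRENT_A = 250.0   # hypothetical pack limit
MAX_SAFE_TEMP_C = 55.0       # hypothetical thermal limit

def safety_clamp(requested_current_a, cell_temp_c):
    """Base-platform function: enforce safe physical bounds, nothing else."""
    if cell_temp_c >= MAX_SAFE_TEMP_C:
        return 0.0   # too hot: stop charging entirely
    return min(requested_current_a, MAX_SAFE_CURRENT_A)

def charging_session(requested_current_a, cell_temp_c):
    """SDV-stack function: talks protocols and backends (updated often),
    but every request still passes through the base-platform clamp."""
    # ...negotiate with the charging station / provider backend here...
    return safety_clamp(requested_current_a, cell_temp_c)
```

However often the protocol layer is updated, the battery stays protected, because the clamp is the only path to the hardware.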
Raymond Yin
Okay. Yeah, that makes a lot of sense. Like you said, the overall charging and maintaining of the battery changes a little, but it really doesn't change over time as often as the ecosystem, the alliances, the availability of chargers, especially as you start tying into the grid and knowing when to charge.
All that changes very dynamically. Now, we've talked a lot about the challenges of moving from the current domain-based, MCU-based platform architectures to some of these newer, centralized or zonal architectures. What's that going to be like on a cultural level? Because you mentioned that you've got the culture of the data center over here versus the culture of the automotive industry over there. Are those two cultures, not necessarily the companies themselves, going to merge eventually, or is the automotive culture going to stay separate from the data center culture?
Christian Uebber
That's a good question. I think the first thing that needs to happen, and it has already happened to some extent, is this: the classical microcontroller experts had concerns which were actually not addressed by the new kinds of architecture patterns that came along, with regard to guaranteed latency and low jitter under all conditions. And there's now a lot of technology, for example for the zonal architectures with time-sensitive networking, the new Ethernet standards, where you can actually offer these kinds of features for the backbone that connects the zones to each other. So you can offer highly safeguarded, fast communication lanes for important latency-critical functions and so on. This hasn't been there so far. And it was very easy to get this wrong: a lot of companies got an IT guy to lead an automotive organization, and the IT guy came in and said, this should be much easier, and he just didn't get them, and they didn't get him.
This crashed, because they had very legitimate concerns about the new solutions, and they didn't speak the same language. So one technical part of improving this is that the new systems actually give you those features, and only then can the cultural adaptation start, because you actually have a base architecture where you can start solving problems together.
If you enable people from different cultures to solve problems together, it gets much easier to go on a cultural journey that combines the best of both worlds.
Raymond Yin
Okay. Yeah. Sounds great. One technology that has come up over and over while doing these podcasts is artificial intelligence and machine learning, and clearly there's going to be a lot of that in automobiles moving forward, between autonomy and driver safety. On the software side, how is that being integrated into the car while also maintaining those safety-critical systems and the level of safety that is required?
Christian Uebber
Yes, also a good question. One thing is for sure: we cannot deploy these workloads on microcontrollers. They are compute intensive; we need other architectures, where dedicated accelerators make much more sense.
Raymond Yin
Absolutely.
Christian Uebber
But on the other hand, these architectures come from the data center, where they are optimized for batch processing, and they're often not optimized for latency-critical stuff. So this is the first technical challenge that we need to solve. There has been some development here: two years ago it wasn't possible to interrupt a long-running GPU task with something safety critical, where you say, hey, whatever you're doing, I need to execute this now. There was just no API for that. It has been improving. It's not fully in the target zone yet, but things are improving, and we see more and more use cases where we can actually deploy moderate safety payloads, like ASIL B kinds of functions, on accelerators and GPUs with further developed architectures. But it remains a challenge. So I don't think we're going to see the super mission-critical, ASIL D kind of stuff, which we deploy on a hardware-software co-designed microcontroller as a cornerstone of our safety case, moving to a GPU; there's just too much code and runtime optimization involved for me to expect that in the next five years. So we'll probably have combinations: some moderate safety, QM or ASIL B or ASIL A, then maybe some diversity, different variants, and the combination of that with a microcontroller, or stuff that runs on a CPU, together creates a safe system. But this is evolving, and it's a very interesting topic.
Raymond Yin
Yeah, absolutely. And once again, artificial intelligence and machine learning are going to affect almost every technology out there, but none more so than the autonomous driving capabilities of some of the new cars.
Christian Uebber
Yes, an interesting angle. So, GPUs or accelerators are a good host for a new class of functions, like a more powerful perception of the outside environment and so on. There's a good match between what these technologies offer and what we need. On the other hand, I see a lot of experimentation, in the sense that you have some simple Linux edge stack and customers are deploying ML-based experiments: very simple trigger functions where they monitor some stuff in the vehicle and run an experiment, because they have a hypothesis about what could be correlated, what could predict something, and so on. If you make it very easy to deploy ML functions to the vehicle and make them really easily updateable, you see a lot of creativity at OEMs to just try stuff. This is not really a function that you sell to a user, but there are constantly experiments happening, and OEMs seem to get a lot of value out of that.
Raymond Yin
Oh, interesting. I didn't realize that sort of work was already ongoing in vehicles.
Well, Christian, I want to thank you so much for sharing your expertise and your insight into the software side of the automotive world. As an old hardware guy, I've learned a lot, and I really appreciate your time today.
Christian Uebber
Thank you very much. I also really enjoyed it.
Raymond Yin
Well, I guess that brings us to a close on our exploration of zonal architecture and the software-defined vehicle. Christian, we are so pleased to have you as our guest today on The Tech Between Us! For all the latest technology news and information visit us at mouser.com/empowering-innovation. For Mouser Electronics, I’m Raymond Yin and thank you for listening.