Audio player
Audio Player
```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
```
rh-audio-player {
  margin: var(--rh-space-xl, 24px);
}
```
```
<rh-audio-player id="player" layout="full" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<rh-audio-player-about slot="about">
<h4 slot="heading">About the episode</h4>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to
be able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h4 slot="heading">Subscribe</h4>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript id="regular" slot="transcript">
<h4 slot="heading">Transcript</h4>
<rh-cue start="00:02" voice="Burr Sutter">
Hi, I'm Burr Sutter. I'm a Red Hatter who spends a lot of time talking to technologists about technologies. We say this a lot at Red Hat. No single technology provider holds the key to
success, including us. And I would say the same thing about myself. I love to share ideas, so I thought it would be awesome to talk to some brilliant technologists at Red Hat Partners. This is
Code Comments, an original podcast from Red Hat.
</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter">
I'm sure, like many of you here, you have been thinking about AI/ML, artificial intelligence and machine learning. I've been thinking about that for quite some time and I actually had the
opportunity to work on a few successful projects, here at Red Hat, using those technologies, actually enabling a data set, gathering a data set, working with a data scientist and data
engineering team, and then training a model and putting that model into production runtime environment. It was an exciting set of projects and you can see those on numerous YouTube videos that
have been published out there before. But I want you to think about the problem space a little bit, because there are some interesting challenges about AI/ML. One is simply just getting access to
the data, and while there are numerous publicly available data sets, when it comes to your specific enterprise use case, you might not be able to find publicly available data.
</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter">
In many cases you cannot, even for our applications that we created, we had to create our data set, capture our data set, explore the data set, and of course, train a model accordingly. And
we also found there's another challenge to be overcome in this AI/ML world, and that is access to certain types of hardware. If you think about an enterprise environment and the creation of
an enterprise application specifically for AI/ML, end users need an inference engine to run on their hardware. Hardware that's available to them, to be effective for their application. Let's
say an application like Computer Vision, one that can detect anomalies in medical imaging or maybe on a factory floor. As those things are whizzing by on the factory line there, looking at
them and trying to determine if there is an error or not.
</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter">
Well, how do you actually make it run on your hardware, your accessible technology that you have today? Well, there's a solution for this, an open toolkit called OpenVINO. And you might be
thinking, "Hey, wait a minute, don't you need a GPU for AI inferencing, a GPU for artificial intelligence, machine learning?" Well, not according to Ryan Loney, product manager of OpenVINO
Developer Tools at Intel.
</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney">
I guess I'll start with trying to maybe dispel a myth. I think that CPUs are widely used for inference today. So if we look at the data center segment, about 70% of the AI inference is
happening on Intel Xeon, on our data center CPUs. And so you don't need a GPU especially for running inference. And that's part of the value of OpenVINO, is that we're taking models that may
have been trained on a GPU using deep learning frameworks like PyTorch or TensorFlow, and then optimizing them to run on Intel hardware.
</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter">
Ryan joined me to discuss AI/ML in the enterprise across various industries and exploring numerous use cases. Let's talk a little bit about the origin story behind OpenVINO. Tell us more
about it and how it came to be and why it came out of Intel.
</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney">
Definitely. The first release of OpenVINO was back in 2018, so still relatively new. And at that time, we were focused on Computer Vision and pretty tightly coupled with OpenCV, which
is another open source library with origins at Intel. It had its first release back in 1999, so it's been around a little bit longer. And many of the software engineers and architects at Intel
that were involved with and contributing to OpenCV are working on OpenVINO. So you can think of OpenVINO as complementary software to OpenCV and we're providing an engine for executing
inferences as part of a Computer Vision pipeline, or at least that's how we started.
</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney">
But since 2018, we've started to move beyond just Computer Vision inference. So when I say Computer Vision inference, I mean image classification, object detection, segmentation, and now
we're moving into natural language processing. Things like speech synthesis, speech recognition, knowledge graphs, time series forecasting and other use cases that don't involve Computer
Vision and don't involve inference on pixels. Our latest release, the 2022.1 that came out earlier this year, that was the most significant update that we've had to OpenVINO, since we started
in 2018. And the major focus of that release was optimizing for use cases that go beyond Computer Vision.
</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter">
And I like that concept that you just mentioned right there, Computer Vision, and you said that you extended those use cases and went beyond that. Could you give us some more concrete
examples of Computer Vision?
</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney">
Sure. When you think about manufacturing, quality control in factories, everything from arc welding, defect detection to inspecting BMW cars on assembly lines, they're using cameras or
sensors to collect data and usually it's cameras collecting images like RGB images that you and I can see and looks like something taken from a camera or video camera. But also, things like
infrared or computerized tomography scans used in healthcare, X-ray, different types of images where we can draw bounding boxes around regions of interest and say, "This is a defect," or,
"This is not a defect." And also, "Is this worker wearing a safety hat or did they forget to put it on?" And so, you can take this and integrate it into a pipeline where you're triggering an
alert if somebody forgets to wear their safety mask, or if there's a defect in a product on an assembly line, you can just use cameras and OpenVINO and OpenCV running these on Intel hardware
and help to analyze.
</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney">
And that's what a lot of the partners that we work with are doing, so these independent software vendors. And there's other use cases for things like retail. You think about going to a store
and using an automated checkout system. Sometimes people use those automated checkouts and they slide a few extra items into their bag that they don't scan and it's a huge loss for the retail
outlets that are providing this way to check out. Real-time shelf monitoring: we have Vispera, one of our ISVs, that helps keep store shelves stocked by just analyzing the cameras in the
stores, detecting when objects are missing from the shelves so that they can be restocked. We have Vistry, another ISV that works with quick service restaurants. When you think about
automating the process of, when do I drop the fries into the fryer so that they're warm when the car gets to the drive through window, there's quite a bit of industrial healthcare retail
examples that we can walk through.
</rh-cue>
<rh-cue start="06:55" voice="Burr Sutter">
And we should dig into some more of those, but I got to tell you, I have a personal experience in this category that I want to share with you, and you can tell me how silly you might think at this
point in time it is. We actually built a keynote demonstration for the Red Hat big stage back in 2015. And I really want to illustrate the concept of asset tracking. So we actually gave
everybody in the conference a little Bluetooth token with a little battery, a little watch battery, and a little Bluetooth emitter. And we basically tracked those things around the conference.
We basically put a Raspberry Pi in each of the meeting rooms and up in the lunch room and you could see how the tokens moved from room to room to room.
</rh-cue>
<rh-cue start="07:28" voice="Burr Sutter">
It was a relatively simple application, but it occurred to me, after we figured out how to do that with Bluetooth and triangulating Bluetooth signals by looking at relative signal strength
from one radio to another and putting that through an Apache Spark application at the time, we then realized, "You know what? This is easier done with cameras." And just simply looking at a
camera and having some form of an AI/ML model, a machine learning model, that would say, "There are people here now," or, "There are no people here now." What do you think about that?
</rh-cue>
<rh-cue start="07:56" voice="Ryan Loney">
What you just described is exactly the product that Pathr, one of our partners is offering, but they're doing it with Computer Vision and cameras. So when Pathr tries to help retail stores
analyze the foot traffic and understand, with heat maps, where are people spending the most time in stores, how many people are coming in, what size groups are coming into the store and trying
to help understand if there was a successful transaction from the people who entered the store and left the store, to help with the retail analytics and marketing sales and positioning of
products. And so, they're doing that in a way that also protects privacy. And that's something that's really important. So when you talked about those Bluetooth beacons, probably if everyone
who walked into a grocery store was asked to put a tracking device in their cart or on their person and say, "You're going to be tracked around the store," they probably wouldn't want to do
that.
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney">
The way that you can do this with cameras, is you can detect people as they enter and remove their face. So you can ignore any biometric information and just track the person based on pixels
that are present in the detected region of interest. So they're able to analyze... Say a family walks in the door and they can group those people together with object detection and then they
can track their movement throughout the store without keeping track of their face, or any biometric, or any personal identifiable information, to avoid things like bias and to make sure that
they're protecting the privacy of the shoppers in the store, while still getting that really useful marketing analytics data. So that they can make better decisions about where to place their
products. That's one really good example of how Computer Vision, AI with OpenVINO is being used today.
</rh-cue>
<rh-cue start="09:49" voice="Burr Sutter">
And that is a great example, because you're definitely spot on. It is invasive when you hand someone a Bluetooth device and say, "Please, keep this with you as you go throughout our store,
our mall or throughout our hospital, wherever you might be." Now you mentioned another example earlier in the conversation which was related to worker safety. "Are they wearing a helmet?" I
want to talk more about that concept in a real industrial setting, a manufacturing setting, where there might be a factory floor and there's certain requirements. Or better yet there's like a
quality assurance requirement, let's say, when it comes to looking at a factory line. I've run that use case often with some of our customers. Can you talk more about those kinds of use cases?
</rh-cue>
<rh-cue start="10:23" voice="Ryan Loney">
One of our partners, Robotron, we published a case study, I think last year, where they were working with BMW at one of their factories. And they do quality control inspection, but they're
also doing things related to worker safety and analyzing. I use the safety hat example. There's a number of our ISVs and partners who have similar use cases and it comes down to, there's a few
reasons that are motivating this and some are related to insurance. It's important to make sure that if you want to have your factory insured, that your workers are protecting themselves and
wearing the gear. Regulatory compliance: you're being asked to properly protect from exposure to chemicals or potentially having something fall and hit someone on the head. So wearing a safety
vest, wearing goggles, wearing a helmet, these are things that you need to do inside the factory and you can really easily automate and detect and sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney">
I think that's one of the interesting things about the Robotron-BMW example is that they were also blurring, blacking out, so drawing a box to cover the face of the workers in the factory, so
that somebody who was analyzing the video footage and getting the alerts saying that, "Bay 21 has a worker without a hat on," that it's not sending their face in the alert and potentially
invading or going against privacy laws or just the ethics of the company. They don't want to introduce bias or have people targeted because it's much better to blur the face and alert and have
somebody take care of it on the floor. And then, if you ever need to audit that information later, they have a way to do it where people who need to be able to see who the employee was and
look up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney">
But then just for the purposes of maintaining safety, they don't need to have access to that personal information, or biometric information. Because that's one thing that when you hear about
Computer Vision or person tracking, object detection, there's a lot of concern, and rightfully so, about privacy being invaded and about tracking information, face re-identification,
identifying people who may have committed crimes through video footage. And that's just not something that a lot of companies want to... They want to protect privacy and they don't want to be
in a situation where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter">
Well, privacy is certainly opening up Pandora's box. There's a lot to be explored in that area, especially in a digital world that we now live in. But for now, let's move on and explore a
different area. I'm interested in how machines and computers offer advantages specifically in certain use cases like a quality control scenario. I asked Ryan to explain how AI/ML and
specifically machines, computers, could augment that capability.
</rh-cue>
<rh-cue start="13:20" voice="Ryan Loney">
I can give a specific example where we have a partner that's doing defect detection, looking for anomalies in batteries. I'm sure you've heard there's a lot of interest right now in electric
vehicles, a lot of batteries being produced. And so, if you go into one of these factories, they have images that they collect of every battery that's going through this assembly line. And
through these images, people can look and see and visually inspect with their eyes and say, "This battery has a defect, send it back." And that's one step in the quality control process,
there's other steps I'm sure, like running diagnostic tests and measuring voltage and doing other types of non-visual inspection. But for the visual inspection piece, where you can really
easily identify some problems, it's much more efficient to introduce Computer Vision. And so, that's where we have this new library that we've introduced, called Anomalib.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney">
So OpenVINO, while we're focused on inference, we're also thinking about the pipeline, or the funnel, that gets these models to OpenVINO. And so, we've invested in this anomaly segmentation,
anomaly detection library that we've recently open sourced and there's a great research paper about it, about Anomalib, but the idea is you can take just a few images and train a model and
start detecting these defects. And so, for this battery example, that's a more advanced example, but to make it simpler, take some bolts and... Take 10 bolts. You have one that has a scratch
on it, or one that is chipped, or has some damage to it, and you can easily get started in training to recognize the bolts that do not have an anomaly and the ones that do, which is a small
data set. And I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney">
Challenges, one is access to data, but the other is needing a massive amount of data to do something meaningful. And so we're starting to try to change that dynamic with Anomalib. You may not
need 100,000 images, you may need 100 images and you can start detecting anomalies in everything from batteries to bolts to, maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter">
That is a very key point because often in that data scientist process, that data engineering data scientist process, the one key thing is, can you gather the data that you need for the input
for the model training? And we've often said, at least people I've worked with over the last couple years, "You need a lot of data, you need tens of thousands of correct images, so we can sort
out the difference between dogs versus cats," let's say. Or you need dozens and dozens of situations where if it's a natural language processing scenario, a good customer interaction, a good
customer conversation. And in this case it sounds like what you're saying is, "Show us just the bad things, fewer images, fewer incorrect things, and then let us look for those kinds of
anomalies." Can you tell us more about that? Because that is very interesting. The concept that I can use a much smaller data set as my input, as opposed to gathering terabytes of data in some
cases, to just simply get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney">
Like you described, the idea is, if you have some good images and then you have some of the known defects, and you can just label, "Here's a set of good images and here's a few of the
defects." And you can right away start detecting those specific defects that you've identified. And then, also be able to determine when it doesn't match the expected appearance of a non
defective item. So if I have the undamaged screw and then I introduce one with some new anomaly that's never been seen before, I can say this one is not a valid screw. And so, that's the
approach that we're taking and it's really important because so often you need to have subject matter experts. Take the battery example, there's these workers who are on the floor, in a
factory and they're the ones who know best when they look at these images, which one's going to have an issue, which one's defective.
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney">
And then they also need to take that subject matter expertise and then use it to annotate data sets. And when you have these tens of thousands of images you need to annotate, it's asking
those people to stop working on the factory floor so they can come annotate some images. That's a tough business call to make, right? But if you only need them to annotate a handful of images,
it's a much easier ask to get the ball rolling and demonstrate value. And maybe over time you will want to annotate more and more images because you'll get even better accuracy in the model.
Even better, even if it's just small incremental improvements, that's something that if it generates value for the business, it's something the business will invest in over time. But you have
to convince the decision makers that it's worth the time of these subject matter experts to stop what they're doing and go and label some images of the things that they're working on in the
factory.
</rh-cue>
<rh-cue start="18:27" voice="Burr Sutter">
And that labeling process can be very labor intensive. If the annotation is basically saying what is correct, what's wrong, what is this, what is that. And therefore if we can minimize that
timeframe to get the value quicker, then there's something that's useful for the business, useful for the organization, long before we necessarily go through a whole huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter">
So we talked about labeling and how that is a labor-intensive activity, but I love the idea of helping the human. And helping the human most specifically not get bored. Basically if the human
is eyeballing a bunch of widgets flying by, over time they make mistakes, they get bored and they don't pay as close attention as they should. That's why the concept of AI/ML, and
specifically Computer Vision augmenting that capability and really helping the human identify anomalies faster, more quickly, maybe with greater accuracy, could be a big win. We focused on
manufacturing, but let's actually go into healthcare and learn how these tools can be used in that sector and that industry. Ryan talked to me about how OpenVINO's runtime can be incorporated
into medical imaging equipment with Intel processors embedded in CT, MRI and ultrasound machines, while these inferences, this AI/ML workload, can be operating and executing right there in the
same physical room as the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney">
We did a presentation with GE last year, I think they said there's at least 80 countries that have their x-ray machines deployed. And they're doing things like helping doctors place breathing
tubes in patients. So during COVID, during the pandemic, that was a really important tool to help with nurses and doctors who were intubating patients, sometimes in a parking lot or a hallway
of a hospital. And they had a statistic that GE said, I think one out of four breathing tubes gets placed incorrectly when you're doing it outside the operating room. Because when you're
in an operating room it's much more controlled and there's someone who's an expert at placing the tubes, it's something you have more of a controlled environment. But when you're out, in a
parking lot, in a tent, when the hospital's completely full and you're triaging patients with COVID, that's when they're more likely to make mistakes. And so, they had this endotracheal tube
placement, ETT, model that they trained and it helped to use an x-ray and give an alert and say, "This tube is placed wrong, pull it out and do it again." And so, things like that help doctors
so that they can avoid mistakes. And having a breathing tube placed incorrectly can cause collapsed lung and a number of other unwanted side effects. So it's really important to do it
correctly. Another example is Samsung Medison. They actually are estimating fetal angle of progression. So this is analyzing ultrasound of pregnant women being able to help take measurements
that are usually hard to calculate, but it can be done in an automated way. They're already taking an ultrasound scan and now they're executing this model that can take some of these
measurements to help the doctor avoid potentially more intrusive alternative methods. So the patient wins, it makes their life better and the doctor is getting help from this AI model. And
those are just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter">
Those are some amazing examples when it comes to all these things, we're talking CT scans and x-rays, other examples of Computer Vision. One thing that's kind of interesting in this space, I
think, whenever I get a chance to work on, let's say an object detection model, and one of our workshops, by the way, is actually putting that out in front of people to say, "Look, you can use
your phone and it basically sends the image over to our OpenShift with our data science platform and then analyzes what you see." And even in my case, where I take a picture of my dog as an
example, it can't really decide, is it a dog or a cat? I have a very funny looking dog.
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter">
And so there's always a percentage outcome. In other words, "I think it's a dog, 52%." So I want to talk about that more. How important is it to get to that a hundred percent accuracy? How
important is it to really, depending on the use case, to allow for the gray area if you will, where it's an 80% accuracy or a 70% accuracy, and what are the trade offs there associated with
the application? Can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney">
Accuracy is definitely a touchy subject, because how you measure it makes a huge difference. I think what you were describing with the dog example, there's sort of a top five potential
classes that might maybe be identified. So let's say you're doing object detection and you detect a region of interest, and it says 65% confidence this is a dog. Well, the next potential label
that could be maybe 50% confidence or 20% confidence might be something similar to a dog. Or in the case of models that have been trained on the ImageNet dataset or on COCO dataset, they have
actual breeds of dogs. If I want to look at the top five labels for a dog, for my dog for example, she's a mix, mostly a Labrador retriever, but I may look at the top five labels and it may
say 65% confidence that she's a flat coated retriever.
</rh-cue>
<rh-cue start="23:32" voice="Ryan Loney">
And then confidence that she's a husky at 20%, and then 5% confidence that she's a greyhound or something. Those labels, all of them are dogs. So if I'm just trying to figure out, is this a
dog? I could probably find all of the classes within the data set and say, "Well, these all, class ID 65, 132, 92 and 158, all belong to a group of dogs." So if I want to just write an
application to tell me if this is a dog or not, I would probably use that to determine if it's a dog. But how you measure that as accuracy, well that's where it gets a little bit complicated.
Because if you're being really strict about the definition and you're trying to validate against the data set of labeled images, and I have specific dog breeds or some specific detail and it
doesn't match, well then, the accuracy's going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney">
And that's especially important when we talk about things like compression and quantization, which historically, has been difficult to get adoption in some domains, like healthcare, where
even the hint of accuracy going down implies that we're not going to be able to help. In some small case, maybe if it's even half a percent of the time, we won't detect that that tube is
placed incorrectly or that that patient's lung has collapsed or something like that. And that's something that really prevents adoption of some of these methods that can really boost
performance, like quantization. But if you take that example of... Different from the dog example, and you think about segmentation of kidneys. If I'm doing kidney segmentation, which is
taking a CT scan and then trying to pick the pixels out of that scan that belong to a kidney, how I measure accuracy may be how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney">
Missing some of the pixels is maybe not a problem, depending on how you've built the application, because you still detect the kidney, and maybe you just need to apply padding around the
region of interest, so that you don't miss any of the actual kidney when you compress the model and when you quantize the model. But that requires a data scientist, an ML engineer, somebody to
really, they have to be able to go and apply that after the fact, after the inference happens, to make sure that you're not losing critical information. Because the next step from detecting
the kidney, may be detecting a tumor.
</rh-cue>
<rh-cue start="26:04" voice="Ryan Loney">
And so, maybe you can use the more optimized model to detect the kidney, but then you can use a slower model to detect the tumor. But that also requires somebody to architect and make that
decision or that trade off and say, "Well, I need to add padding," or, "I should only use the quantized model to detect the region of interest for the kidney." And then, use the model that
takes longer to do the inference just to find the tumor, which is going to be on a smaller size. The dimensions are going to be much smaller once we crop to the region of interest. But all of
those details, that's maybe not easy to explain in a few sentences and even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter">
I do love that use case, like you mentioned, the cropping, even in one scenario that we worked on for another project, we specifically decided to pixelate the image that we had taken, because
we knew that we could get the outcome we wanted by even just using a smaller image or having less resolution in our image. And therefore, as we transferred it from the mobile device, the edge
device, up into the cloud, we wanted that smaller image just for transfer purposes. And still, we could get the accuracy we needed by a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter">
And one thing that's interesting about that, from my perspective, is, if you're doing image processing, sometimes it takes a while for this transaction to occur. I come from a traditional
application background, where I'm reading and writing things from a database, or a message broker, or moving data from one place to another. Those things happen sub-second normally, even with
great latency between your data centers, it's still sub-second in most cases. While a transaction like this one can actually take two seconds or four seconds, as it's doing its analysis and
actually coming back with its, "I think it's a dog, I think it's a kidney, I think it's whatever." And providing me that accuracy statement. That concept of optimization is very important in
the overall application architecture. Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney">
Definitely. It depends too on the use case. So if you think about how important it is to reduce the latency and increase the number of frames per second that you can process when you're
talking about a loss prevention model that's running at a grocery store. You want to keep the lines moving, you don't want every person who's at the self checkout to have to wait five seconds
for every item they scan. You need it to happen as quickly as possible. And if sometimes the accuracy decreases slightly, or I'd say the accuracy of the whole pipeline, so not just looking at
the individual model or the individual inference, but let's say that the whole pipeline is not as successful at detecting when somebody steals one item from the self checkout, it's not going
to be a life threatening situation. Whereas being hooked up to the x-ray machine with the tube placement model, they might be willing to have the doctor or the nurse wait five seconds to get
the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney">
They don't need it to happen in 500 milliseconds. Their threshold for waiting is a little bit higher. That, I think, also drives some of the decision. You want to keep people moving through
the checkout line and you can afford to, potentially, if you lose a little bit of accuracy here and there, it's not going to cost the company that much money or it's not going to be life
threatening. It's going to be worth the trade off of keeping the line moving and not having people leave the store and not check out at all, to say, "I'm not going to shop today because the
line's too long."
</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter">
There are so many trade-offs in enterprise AI/ML use cases, things like latency, accuracy and availability, and certainly complexities abound, especially in an obviously ever-evolving
technological landscape where we are still very early in the adoption of AI/ML. And to navigate that complexity, that direct feedback from real world end users is essential to Ryan and his
team at Intel. What would you say are some of the big hurdles or big outcomes, big opportunities in that space? And do you agree that we're still at the very beginning, in our infancy if you
will, of adopting these technologies and discovering what they can do for us?
</rh-cue>
<rh-cue start="30:06" voice="Ryan Loney">
Yeah, I think we're definitely in the infancy and I think that what we've seen is, our customers are evolving and the people who are deploying on Intel hardware, they're trying to run more
complicated models. They're the models that are doing object detection or detecting defects and doing segmentation. In the past you could say, "Here's a generic model that will do face
detection, or person detection, or vehicle detection, license plate detection." And those are general purpose models that you can just grab off the shelf and use them. But now we're moving
into the Anomalib scenarios, where I've got my own data and I'm trying to do something very specific and I'm the only one that has access to this data. You don't have that public data set that
you can go download that's under Creative Commons license for car batteries. It's just not something that's available.
</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney">
And so, those use cases, the challenge with training those models and getting them optimized is the beginning of the pipeline. It's the data. You have to get the data, you have to annotate it
and the tools have to exist for you to do that. And that's part of the problem that we're trying to help solve. And then, the models are getting more complex. So if you think, just from
working with customers recently, they're no longer just trying to do image classification, "Is it a dog or a cat?" They've moved on to 3D point clouds and 3D segmentation models and things
that are like the speech synthesis example. These GPT models that are generating... You put a text input and it generates an image for you. It's just becoming much more advanced, much more
sophisticated and on larger images.
</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney">
And so things like running super resolution and enhancing images, upscaling images, instead of just trying to take that 200 by 200 pixel image and classifying if it's a cat, now we're talking
about gigantic, huge images that we're processing and that all requires more resources or more optimized models. And every Computer Vision conference or AI conference, there's a new latest and
greatest architecture, there's new research paper, and things are getting adopted much faster. The lead time for a NeurIPS paper, CVPR, for a company to actually adopt and put those into
production, the time shortens every year.
</rh-cue>
<rh-cue start="32:34" voice="Burr Sutter">
Well Ryan, I got to tell you, I could talk to you, literally, all day about these topics, the various use cases, the various ways models are being optimized, how to put models into a pipeline
for average enterprise applications. I've enjoyed learning about OpenVINO and Anomalib. I'm fascinated by this, because I'll have a chance to go try this myself, taking advantage of Red Hat
OpenShift and taking advantage of our data science platform. On top of that, I will definitely go be poking at this myself. Thank you so much for your time today.
</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney">
Thanks, Burr. This was a lot of fun. Thanks for having me.
</rh-cue>
<rh-cue start="33:05" voice="Burr Sutter">
You can check out the full transcript of our conversation and more resources, like a link to a white paper on OpenVINO and Anomalib at redhat.com/codecommentspodcast. This episode was
produced by Brent Simoneaux and Caroline Creaghead. Our sound designer is Christian Prohom. Our audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Laura Barnes, Claire Allison,
Nick Burns, Aaron Williamson, Karen King, Boo Boo Howse, Rachel Ertel, Mike Compton, Ocean Matthews, Laura Walters, Alex Traboulsi, and Victoria Lawton. I'm your host, Burr Sutter. Thank you
for joining me today on Code Comments. I hope you enjoyed today's session and today's conversation, and I look forward to many more.
</rh-cue>
</rh-transcript>
</rh-audio-player>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
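The player can also be driven from your own scripts. Here is a minimal sketch (an illustration, not part of the demo markup above) that waits for the custom element definition and then sets the same `layout` and `poster` properties the Color Context demo script below manipulates:

```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';

// Wait for the custom element to be defined before touching its properties.
customElements.whenDefined('rh-audio-player').then(() => {
  const player = document.querySelector('rh-audio-player');
  // `layout` mirrors the attribute values used in the demos: 'full', 'compact-wide', 'compact', or 'mini'.
  player.layout = 'compact';
  // Setting `poster` to undefined removes the episode artwork, as the demo form below
  // does when the Poster checkbox is unchecked.
  player.poster = undefined;
});
```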
Color Context
```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
import '@rhds/elements/lib/elements/rh-context-demo/rh-context-demo.js';
const form = document.querySelector('form');
const player = document.querySelector('rh-audio-player');
/**
 * update audio player demo based on form selections
 */
function updateDemo() {
  const data = new FormData(form);
  const values = Object.fromEntries(data.entries());
  const { layout } = values;
  player.layout = layout;
  player.poster =
    !values.poster ? undefined
    : 'https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png';
}
form.addEventListener('input', updateDemo);
updateDemo();
```
```
<rh-context-demo id="player-context-demo" target="player">
<form slot="controls">
<label>Poster: <input name="poster" type="checkbox" checked=""></label>
<label>Layout:
<select name="layout">
<option value="full" selected="">Full</option>
<option value="compact-wide">Compact Wide</option>
<option value="compact">Compact</option>
<option value="mini">Mini</option>
</select>
</label>
</form>
<rh-audio-player id="player" layout="full" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<rh-audio-player-about slot="about">
<h4 slot="heading">About the episode</h4>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to
be able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h4 slot="heading">Subscribe</h4>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript id="regular" slot="transcript">
<h4 slot="heading">Transcript</h4>
<rh-cue start="00:02" voice="Burr Sutter">
Hi, I'm Burr Sutter. I'm a Red Hatter who spends a lot of time talking to technologists about technologies. We say this a lot at Red Hat. No single technology provider holds the key to success,
including us. And I would say the same thing about myself. I love to share ideas, so I thought it would be awesome to talk to some brilliant technologists at Red Hat Partners. This is Code
Comments, an original podcast from Red Hat.
</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter">
I'm sure, like many of you here, you have been thinking about AI/ML, artificial intelligence and machine learning. I've been thinking about that for quite some time and I actually had the
opportunity to work on a few successful projects, here at Red Hat, using those technologies, actually enabling a data set, gathering a data set, working with a data scientist and data
engineering team, and then training a model and putting that model into production runtime environment. It was an exciting set of projects and you can see those on numerous YouTube videos that
have been published out there before. But I want you to think about the problem space a little bit, because there are some interesting challenges about AI/ML. One is simply just getting access to
the data, and while there are numerous publicly available data sets, when it comes to your specific enterprise use case, you might not be able to find publicly available data.
</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter">
In many cases you cannot, even for our applications that we created, we had to create our data set, capture our data set, explore the data set, and of course, train a model accordingly. And we
also found there's another challenge to be overcome in this AI/ML world, and that is access to certain types of hardware. If you think about an enterprise environment and the creation of an
enterprise application specifically for AI/ML, end users need an inference engine to run on their hardware. Hardware that's available to them, to be effective for their application. Let's say
an application like Computer Vision, one that can detect anomalies in medical imaging or maybe on a factory floor. As those things are whizzing by on the factory line there, looking at them and
trying to determine if there is an error or not.
</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter">
Well, how do you actually make it run on your hardware, your accessible technology that you have today? Well, there's a solution for this, an open toolkit called OpenVINO. And you might be
thinking, "Hey, wait a minute, don't you need a GPU for AI inferencing, a GPU for artificial intelligence, machine learning?" Well, not according to Ryan Loney, product manager of OpenVINO
Developer Tools at Intel.
</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney">
I guess I'll start with trying to maybe dispel a myth. I think that CPUs are widely used for inference today. So if we look at the data center segment, about 70% of the AI inference is happening
on Intel Xeon, on our data center CPUs. And so you don't need a GPU especially for running inference. And that's part of the value of OpenVINO, is that we're taking models that may have been
trained on a GPU using deep learning frameworks like PyTorch or TensorFlow, and then optimizing them to run on Intel hardware.
</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter">
Ryan joined me to discuss AI/ML in the enterprise across various industries and exploring numerous use cases. Let's talk a little bit about the origin story behind OpenVINO. Tell us more about
it and how it came to be and why it came out of Intel.
</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney">
Definitely. The first release of OpenVINO was back in 2018, so still relatively new. And at that time, we were focused on Computer Vision and pretty tightly coupled with OpenCV, which is
another open source library with origins at Intel. It had its first release back in 1999, so it's been around a little bit longer. And many of the software engineers and architects at Intel that
were involved with and contributing to OpenCV are working on OpenVINO. So you can think of OpenVINO as complementary software to OpenCV and we're providing an engine for executing inferences as
part of a Computer Vision pipeline, or at least that's how we started.
</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney">
But since 2018, we've started to move beyond just Computer Vision inference. So when I say Computer Vision inference, I mean image classification, object detection, segmentation, and now we're
moving into natural language processing. Things like speech synthesis, speech recognition, knowledge graphs, time series forecasting and other use cases that don't involve Computer Vision and
don't involve inference on pixels. Our latest release, the 2022.1 that came out earlier this year, that was the most significant update that we've had to OpenVINO, since we started in 2018. And
the major focus of that release was optimizing for use cases that go beyond Computer Vision.
</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter">
And I like that concept that you just mentioned right there, Computer Vision, and you said that you extended those use cases and went beyond that. Could you give us some more concrete examples
of Computer Vision?
</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney">
Sure. When you think about manufacturing, quality control in factories, everything from arc welding, defect detection to inspecting BMW cars on assembly lines, they're using cameras or sensors
to collect data and usually it's cameras collecting images like RGB images that you and I can see and looks like something taken from a camera or video camera. But also, things like infrared or
computerized tomography scans used in healthcare, X-ray, different types of images where we can draw bounding boxes around regions of interest and say, "This is a defect," or, "This is not a
defect." And also, "Is this worker wearing a safety hat or did they forget to put it on?" And so, you can take this and integrate it into a pipeline where you're triggering an alert if somebody
forgets to wear their safety mask, or if there's a defect in a product on an assembly line, you can just use cameras and OpenVINO and OpenCV running these on Intel hardware and help to analyze.
</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney">
And that's what a lot of the partners that we work with are doing, so these independent software vendors. And there's other use cases for things like retail. You think about going to a store and
using an automated checkout system. Sometimes people use those automated checkouts and they slide a few extra items into their bag that they don't scan and it's a huge loss for the retail
outlets that are providing this way to check out. Real-time shelf monitoring: we have Vispera, one of our ISVs, that helps keep store shelves stocked by just analyzing the cameras in the stores,
detecting when objects are missing from the shelves so that they can be restocked. We have Vistry, another ISV that works with quick service restaurants. When you think about automating the
process of, when do I drop the fries into the fryer so that they're warm when the car gets to the drive through window, there's quite a bit of industrial healthcare retail examples that we can
walk through.
</rh-cue>
<rh-cue start="06:55" voice="Burr Sutter">
And we should dig into some more of those, but I got to tell you, I have a personal experience in this category that I want to share with you, and you can tell me how silly you might think at this
point in time it is. We actually built a keynote demonstration for the Red Hat big stage back in 2015. And I really want to illustrate the concept of asset tracking. So we actually gave
everybody in the conference a little Bluetooth token with a little battery, a little watch battery, and a little Bluetooth emitter. And we basically tracked those things around the conference.
We basically put a Raspberry Pi in each of the meeting rooms and up in the lunch room and you could see how the tokens moved from room to room to room.
</rh-cue>
<rh-cue start="07:28" voice="Burr Sutter">
It was a relatively simple application, but it occurred to me, after we figured out how to do that with Bluetooth and triangulating Bluetooth signals by looking at relative signal strength from
one radio to another and putting that through an Apache Spark application at the time, we then realized, "You know what? This is easier done with cameras." And just simply looking at a camera
and having some form of an AI/ML model, a machine learning model, that would say, "There are people here now," or, "There are no people here now." What do you think about that?
</rh-cue>
<rh-cue start="07:56" voice="Ryan Loney">
What you just described is exactly the product that Pathr, one of our partners is offering, but they're doing it with Computer Vision and cameras. So when Pathr tries to help retail stores
analyze the foot traffic and understand, with heat maps, where are people spending the most time in stores, how many people are coming in, what size groups are coming into the store and trying
to help understand if there was a successful transaction from the people who entered the store and left the store, to help with the retail analytics and marketing sales and positioning of
products. And so, they're doing that in a way that also protects privacy. And that's something that's really important. So when you talked about those Bluetooth beacons, probably if everyone who walked into a
grocery store was asked to put a tracking device in their cart or on their person and say, "You're going to be tracked around the store," they probably wouldn't want to do that.
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney">
The way that you can do this with cameras, is you can detect people as they enter and remove their face. So you can ignore any biometric information and just track the person based on pixels
that are present in the detected region of interest. So they're able to analyze... Say a family walks in the door and they can group those people together with object detection and then they can
track their movement throughout the store without keeping track of their face, or any biometric, or any personal identifiable information, to avoid things like bias and to make sure that they're
protecting the privacy of the shoppers in the store, while still getting that really useful marketing analytics data. So that they can make better decisions about where to place their products.
That's one really good example of how Computer Vision, AI with OpenVINO is being used today.
</rh-cue>
<rh-cue start="09:49" voice="Burr Sutter">
And that is a great example, because you're definitely spot on. It is invasive when you hand someone a Bluetooth device and say, "Please, keep this with you as you go throughout our store, our
mall or throughout our hospital, wherever you might be." Now you mentioned another example earlier in the conversation which was related to worker safety. "Are they wearing a helmet?" I want to
talk more about that concept in a real industrial setting, a manufacturing setting, where there might be a factory floor and there's certain requirements. Or better yet there's like a quality
assurance requirement, let's say, when it comes to looking at a factory line. I've run that use case often with some of our customers. Can you talk more about those kinds of use cases?
</rh-cue>
<rh-cue start="10:23" voice="Ryan Loney">
One of our partners, Robotron, we published a case study, I think last year, where they were working with BMW at one of their factories. And they do quality control inspection, but they're also
doing things related to worker safety and analyzing. I use the safety hat example. There's a number of our ISVs and partners who have similar use cases and it comes down to, there's a few
reasons that are motivating this and some are related to insurance. It's important to make sure that if you want to have your factory insured, that your workers are protecting themselves and
wearing the gear. Regulatory compliance: you're being asked to properly protect from exposure to chemicals or potentially having something fall and hit someone on the head. So wearing a safety
vest, wearing goggles, wearing a helmet, these are things that you need to do inside the factory and you can really easily automate and detect and sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney">
I think that's one of the interesting things about the Robotron-BMW example is that they were also blurring, blacking out, so drawing a box to cover the face of the workers in the factory, so
that somebody who was analyzing the video footage and getting the alerts saying that, "Bay 21 has a worker without a hat on," that it's not sending their face in the alert and potentially
invading or going against privacy laws or just the ethics of the company. They don't want to introduce bias or have people targeted because it's much better to blur the face and alert and have
somebody take care of it on the floor. And then, if you ever need to audit that information later, they have a way to do it where people who need to be able to see who the employee was and look
up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney">
But then just for the purposes of maintaining safety, they don't need to have access to that personal information, or biometric information. Because that's one thing that when you hear about
Computer Vision or person tracking, object detection, there's a lot of concern, and rightfully so, about privacy being invaded and about tracking information, face re-identification, identifying
people who may have committed crimes through video footage. And that's just not something that a lot of companies want to... They want to protect privacy and they don't want to be in a situation
where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter">
Well, privacy is certainly opening up Pandora's box. There's a lot to be explored in that area, especially in a digital world that we now live in. But for now, let's move on and explore a
different area. I'm interested in how machines and computers offer advantages specifically in certain use cases like a quality control scenario. I asked Ryan to explain how a AI/ML and
specifically machines, computers, could augment that capability.
</rh-cue>
<rh-cue start="13:20" voice="Ryan Loney">
I can give a specific example where we have a partner that's doing defect detection, looking for anomalies in batteries. I'm sure you've heard there's a lot of interest right now in electric
vehicles, a lot of batteries being produced. And so, if you go into one of these factories, they have images that they collect of every battery that's going through this assembly line. And
through these images, people can look and see and visually inspect with their eyes and say, "This battery has a defect, send it back." And that's one step in the quality control process, there's
other steps I'm sure, like running diagnostic tests and measuring voltage and doing other types of non-visual inspection. But for the visual inspection piece, where you can really easily
identify some problems, it's much more efficient to introduce Computer Vision. And so, that's where we have this new library that we've introduced, called Anomalib.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney">
So OpenVINO, while we're focused on inference, we're also thinking about the pipeline, or the funnel, that gets these models to OpenVINO. And so, we've invested in this anomaly segmentation,
anomaly detection library that we've recently open sourced and there's a great research paper about it, about Anomalib, but the idea is you can take just a few images and train a model and start
detecting these defects. And so, for this battery example, that's a more advanced example, but to make it simpler, take some bolts and... Take 10 bolts. You have one that has a scratch on it, or
one that is chipped, or has some damage to it, and you can easily get started in training to recognize the bolts that do not have an anomaly and the ones that do, which is a small data set. And
I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney">
Challenges, one is access to data, but the other is needing a massive amount of data to do something meaningful. And so we're starting to try to change that dynamic with Anomalib. You may not
need a 100,000 images, you may need 100 images and you can start detecting anomalies in everything from batteries to bolts to, maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter">
That is a very key point because often in that data scientist process, that data engineering data scientist process, the one key thing is, can you gather the data that you need for the input for
the model training? And we've often said, at least people I've worked with over the last couple years, "You need a lot of data, you need tens of thousands of correct images, so we can sort out
the difference between dogs versus cats," let's say. Or you need dozens and dozens of situations where if it's a natural language processing scenario, a good customer interaction, a good
customer conversation. And in this case it sounds like what you're saying is, "Show us just the bad things, fewer images, fewer incorrect things, and then let us look for those kinds of anomalies."
Can you tell us more about that? Because that is very interesting. The concept that I can use a much smaller data set as my input, as opposed to gathering terabytes of data in some cases, to just simply
get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney">
Like you described, the idea is, if you have some good images and then you have some of the known defects, and you can just label, "Here's a set of good images and here's a few of the defects."
And you can right away start detecting those specific defects that you've identified. And then, also be able to determine when it doesn't match the expected appearance of a non defective item.
So if I have the undamaged screw and then I introduce one with some new anomaly that's never been seen before, I can say this one is not a valid screw. And so, that's the approach that we're
taking and it's really important because so often you need to have subject matter experts. Take the battery example, there's these workers who are on the floor, in a factory and they're the ones
who know best when they look at these images, which one's going to have an issue, which one's defective.
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney">
And then they also need to take that subject matter expertise and then use it to annotate data sets. And when you have these tens of thousands of images you need to annotate, it's asking those
people to stop working on the factory floor so they can come annotate some images. That's a tough business call to make, right? But if you only need them to annotate a handful of images, it's a
much easier ask to get the ball rolling and demonstrate value. And maybe over time you will want to annotate more and more images because you'll get even better accuracy in the model. Even
better, even if it's just small incremental improvements, that's something that if it generates value for the business, it's something the business will invest in over time. But you have to
convince the decision makers that it's worth the time of these subject matter experts to stop what they're doing and go and label some images of the things that they're working on in the
factory.
</rh-cue>
<rh-cue start="18:27" voice="Burr Sutter">
And that labeling process can be very labor intensive. If the annotation is basically saying what is correct, what's wrong, what is this, what is that. And therefore if we can minimize that
timeframe to get the value quicker, then there's something that's useful for the business, useful for the organization, long before we necessarily go through a whole huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter">
So we talked about labeling and how that is labor intensive activity, but I love the idea of helping the human. And helping the human most specifically not get bored. Basically if the human is
eyeballing a bunch of widgets flying by, over time they make mistakes, they get bored and they don't pay as close attention as they should. That's why the concept of AI/ML, and specifically
Computer Vision augmenting that capability and really helping the human identify anomalies faster, more quickly, maybe with greater accuracy, could be a big win. We focused on manufacturing, but
let's actually go into healthcare and learn how these tools can be used in that sector and that industry. Ryan talked to me about how OpenVINO's runtime can be incorporated into medical imaging
equipment with Intel processors embedded in CT, MRI and ultrasound machines, where these inferences, this AI/ML workload, can be operating and executing right there in the same physical room as
the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney">
We did a presentation with GE last year, I think they said there's at least 80 countries that have their x-ray machines deployed. And they're doing things like helping doctors place breathing
tubes in patients. So during COVID, during the pandemic, that was a really important tool to help with nurses and doctors who were intubating patients, sometimes in a parking lot or a hallway of
a hospital. And they had a statistic that GE said, I think one out of four breathing tubes gets placed incorrectly when you're doing it outside the operating room. Because when you're in an
operating room it's much more controlled and there's someone who's an expert at placing the tubes, it's something you have more of a controlled environment. But when you're out, in a parking
lot, in a tent, when the hospital's completely full and you're triaging patients with COVID, that's when they're more likely to make mistakes. And so, they had this endotracheal tube placement,
ETT, model that they trained and it helped to use an x-ray and give an alert and say, "This tube is placed wrong, pull it out and do it again." And so, things like that help doctors so that they
can avoid mistakes. And having a breathing tube placed incorrectly can cause collapsed lung and a number of other unwanted side effects. So it's really important to do it correctly. Another example
is Samsung Medison. They actually are estimating fetal angle of progression. So this is analyzing ultrasound of pregnant women being able to help take measurements that are usually hard to
calculate, but it can be done in an automated way. They're already taking an ultrasound scan and now they're executing this model that can take some of these measurements to help the doctor
avoid potentially more intrusive alternative methods. So the patient wins, it makes their life better and the doctor is getting help from this AI model. And those are just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter">
Those are some amazing examples when it comes to all these things, we're talking CT scans and x-rays, other examples of Computer Vision. One thing that's kind of interesting in this space, I
think, whenever I get a chance to work on, let's say an object detection model, and one of our workshops, by the way, is actually putting that out in front of people to say, "Look, you can use
your phone and it basically sends the image over to our OpenShift with our data science platform and then analyzes what you see." And even in my case, where I take a picture of my dog as an
example, it can't really decide, is it a dog or a cat? I have a very funny looking dog.
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter">
And so there's always a percentage outcome. In other words, "I think it's a dog, 52%." So I want to talk about that more. How important is it to get to that a hundred percent accuracy? How
important is it to really, depending on the use case, to allow for the gray area if you will, where it's an 80% accuracy or a 70% accuracy, and what are the trade offs there associated with the
application? Can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney">
Accuracy is definitely a touchy subject, because how you measure it makes a huge difference. I think what you were describing with the dog example, there's sort of a top five potential classes
that might maybe be identified. So let's say you're doing object detection and you detect a region of interest, and it says 65% confidence this is a dog. Well, the next potential label that
could be maybe 50% confidence or 20% confidence might be something similar to a dog. Or in the case of models that have been trained on the ImageNet dataset or on COCO dataset, they have actual
breeds of dogs. If I want to look at the top five labels for a dog, for my dog for example, she's a mix, mostly a Labrador retriever, but I may look at the top five labels and it may say 65%
confidence that she's a flat coated retriever.
</rh-cue>
<rh-cue start="23:32" voice="Ryan Loney">
And then confidence that she's a husky as 20%, and then 5% confidence that she's a greyhound or something. Those labels, all of them are dogs. So if I'm just trying to figure out, is this a dog?
I could probably find all of the classes within the data set and say, "Well, these all, class ID 65, 132, 92 and 158, all belong to a group of dogs." So if I want to just write an application to
tell me if this is a dog or not, I would probably use that to determine if it's a dog. But how you measure that as accuracy, well that's where it gets a little bit complicated. Because if you're
being really strict about the definition and you're trying to validate against the data set of labeled images, and I have specific dog breeds or some specific detail and it doesn't match, well
then, the accuracy's going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney">
And that's especially important when we talk about things like compression and quantization, which historically, has been difficult to get adoption in some domains, like healthcare, where even
the hint of accuracy going down implies that we're not going to be able to help. In some small case, maybe if it's even half a percent of the time, we won't detect that that tube is placed
incorrectly or that that patient's lung has collapsed or something like that. And that's something that really prevents adoption of some of these methods that can really boost performance, like
quantization. But if you take that example of... Different from the dog example, and you think about segmentation of kidneys. If I'm doing kidney segmentation, which is taking a CT scan and then
trying to pick the pixels out of that scan that belong to a kidney, how I measure accuracy may be how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney">
Missing some of the pixels is maybe not a problem, depending on how you've built the application, because you still detect the kidney, and maybe you just need to apply padding around the region
of interest, so that you don't miss any of the actual kidney when you compress the model and when you quantize the model. But that requires a data scientist, an ML engineer, somebody to really,
they have to be able to go and apply that after the fact, after the inference happens, to make sure that you're not losing critical information. Because the next step from detecting the kidney,
may be detecting a tumor.
</rh-cue>
<rh-cue start="26:04" voice="Ryan Loney">
And so, maybe you can use the more optimized model to detect the kidney, but then you can use a slower model to detect the tumor. But that also requires somebody to architect and make that
decision or that trade off and say, "Well, I need to add padding," or, "I should only use the quantized model to detect the region of interest for the kidney." And then, use the model that takes
longer to do the inference just to find the tumor, which is going to be on a smaller size. The dimensions are going to be much smaller once we crop to the region of interest. But all of those
details, that's maybe not easy to explain in a few sentences and even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter">
I do love that use case, like you mentioned, the cropping, even in one scenario that we worked on for another project, we specifically decided to pixelate the image that we had taken, because we
knew that we could get the outcome we wanted by even just using a smaller or having less resolution in our image. And therefore, as we transferred it from the mobile device, the edge device, up
into the cloud, we wanted that smaller image just for transfer purposes. And still, we could get the accuracy we needed by a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter">
And one thing that's interesting about that, from my perspective, is, if you're doing image processing, sometimes it takes a while for this transaction to occur. I come from a traditional
application background, where I'm reading and writing things from a database, or a message broker, or moving data from one place to another. Those things happen sub-second normally, even with
great latency between your data centers, it's still sub-second in most cases. While a transaction like this one can actually take two seconds or four seconds, as it's doing its analysis and
actually coming back with its, "I think it's a dog, I think it's a kidney, I think it's whatever." And providing me that accuracy statement. That concept of optimization is very important in the
overall application architecture. Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney">
Definitely. It depends too on the use case. So if you think about how important it is to reduce the latency and increase the number of frames per second that you can process when you're talking
about a loss prevention model that's running at a grocery store. You want to keep the lines moving, you don't want every person who's at the self checkout to have to wait five seconds for every
item they scan. You need it to happen as quickly as possible. And if sometimes the accuracy decreases slightly, or I'd say the accuracy of the whole pipeline, so not just looking at the
individual model or the individual inference, but let's say that the whole pipeline is not as successful at detecting when somebody steals one item from the self checkout, it's not going to be a
life threatening situation. Whereas being hooked up to the x-ray machine with the tube placement model, they might be willing to have the doctor or the nurse wait five seconds to get the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney">
They don't need it to happen in 500 milliseconds. Their threshold for waiting is a little bit higher. That, I think, also drives some of the decision. You want to keep people moving through the
checkout line and you can afford to, potentially, if you lose a little bit of accuracy here and there, it's not going to cost the company that much money or it's not going to be life
threatening. It's going to be worth the trade off of keeping the line moving and not having people leave the store and not check out at all, to say, "I'm not going to shop today because the
line's too long."
</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter">
There are so many trade-offs in enterprise AI/ML use cases, things like latency, accuracy and availability, and certainly complexities abound, especially in an obviously ever-evolving
technological landscape where we are still very early in the adoption of AI/ML. And to navigate that complexity, that direct feedback from real world end users is essential to Ryan and his team
at Intel. What would you say are some of the big hurdles or big outcomes, big opportunities in that space? And do you agree that we're still at the very beginning, in our infancy if you will, of
adopting these technologies and discovering what they can do for us?
</rh-cue>
<rh-cue start="30:06" voice="Ryan Loney">
Yeah, I think we're definitely in the infancy and I think that what we've seen is, our customers are evolving and the people who are deploying on Intel hardware, they're trying to run more
complicated models. They're the models that are doing object detection or detecting defects and doing segmentation. In the past you could say, "Here's a generic model that will do face
detection, or person detection, or vehicle detection, license plate detection." And those are general purpose models that you can just grab off the shelf and use them. But now we're moving into
the Anomalib scenarios, where I've got my own data and I'm trying to do something very specific and I'm the only one that has access to this data. You don't have that public data set that you can go
download that's under Creative Commons license for car batteries. It's just not something that's available.
</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney">
And so, those use cases, the challenge with training those models and getting them optimized is the beginning of the pipeline. It's the data. You have to get the data, you have to annotate it
and the tools have to exist for you to do that. And that's part of the problem that we're trying to help solve. And then, the models are getting more complex. So if you think, just from working
with customers recently, they're no longer just trying to do image classification, "Is it a dog or a cat?" They've moved on to 3D point clouds and 3D segmentation models and things that are like the
speech synthesis example. These GPT models that are generating... You put a text input and it generates an image for you. It's just becoming much more advanced, much more sophisticated and on
larger images.
</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney">
And so things like running super resolution and enhancing images, upscaling images, instead of just trying to take that 200 by 200 pixel image and classifying if it's a cat, now we're talking
about gigantic, huge images that we're processing and that all requires more resources or more optimized models. And every Computer Vision conference or AI conference, there's a new latest and
greatest architecture, there's new research paper, and things are getting adopted much faster. The lead time for a NeurIPS paper, CVPR, for a company to actually adopt and put those into
production, the time shortens every year.
</rh-cue>
<rh-cue start="32:34" voice="Burr Sutter">
Well Ryan, I got to tell you, I could talk to you, literally, all day about these topics, the various use cases, the various ways models are being optimized, how to put models into a pipeline
for average enterprise applications. I've enjoyed learning about OpenVINO and Anomalib. I'm fascinated by this, because I'll have a chance to go try this myself, taking advantage of Red Hat
OpenShift and taking advantage of our data science platform. On top of that, I will definitely go be poking at this myself. Thank you so much for your time today.
</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney">
Thanks, Burr. This was a lot of fun. Thanks for having me.
</rh-cue>
<rh-cue start="33:05" voice="Burr Sutter">
You can check out the full transcript of our conversation and more resources, like a link to a white paper on OpenVINO and Anomalib at redhat.com/codecommentspodcast. This episode was produced
by Brent Simoneaux and Caroline Creaghead. Our sound designer is Christian Prohom. Our audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Laura Barnes, Claire Allison, Nick Burns,
Aaron Williamson, Karen King, Boo Boo Howse, Rachel Ertel, Mike Compton, Ocean Matthews, Laura Walters, Alex Traboulsi, and Victoria Lawton. I'm your host, Burr Sutter. Thank you for joining me
today on Code Comments. I hope you enjoyed today's session and today's conversation, and I look forward to many more.
</rh-cue>
</rh-transcript>
</rh-audio-player>
</rh-context-demo>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
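The full layout demo above also loads the element's light DOM stylesheet, rh-audio-player-lightdom.css, with a path relative to this demo page. In your own page, load that same stylesheet from wherever your build serves the @rhds/elements package; the href below is only a sketch with an assumed path, so adjust it to match your setup.
```
<!-- Minimal sketch: load the player's light DOM styles before the player markup.
     The href is an assumption; point it at your copy of rh-audio-player-lightdom.css. -->
<link rel="stylesheet" href="/assets/rh-audio-player-lightdom.css">
```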
Compact Wide
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
rh-audio-player {
margin: var(--rh-space-xl, 24px);
}
```
<rh-audio-player id="player" layout="compact-wide" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<rh-audio-player-about slot="about">
<h4 slot="heading">About the episode</h4>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to be
able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h4 slot="heading">Subscribe</h4>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript id="regular" slot="transcript">
<h4 slot="heading">Transcript</h4>
<rh-cue start="00:02" voice="Burr Sutter">
Hi, I'm Burr Sutter. I'm a Red Hatter who spends a lot of time talking to technologists about technologies. We say this a lot at Red Hat. No single technology provider holds the key to
success, including us. And I would say the same thing about myself. I love to share ideas, so I thought it would be awesome to talk to some brilliant technologists at Red Hat Partners. This is
Code Comments, an original podcast from Red Hat.
</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter">
I'm sure, like many of you here, you have been thinking about AI/ML, artificial intelligence and machine learning. I've been thinking about that for quite some time and I actually had the
opportunity to work on a few successful projects, here at Red Hat, using those technologies, actually enabling a data set, gathering a data set, working with a data scientist and data
engineering team, and then training a model and putting that model into production runtime environment. It was an exciting set of projects and you can see those on numerous YouTube videos that
have published out there before. But I want you to think about the problem space a little bit, because there are some interesting challenges about a AI/ML. One is simply just getting access to
the data, and while there are numerous publicly available data sets, when it comes to your specific enterprise use case, you might not be able to find publicly available data.
</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter">
In many cases you cannot, even for our applications that we created, we had to create our data set, capture our data set, explore the data set, and of course, train a model accordingly. And
we also found there's another challenge to be overcome in this a AI/ML world, and that is access to certain types of hardware. If you think about an enterprise environment and the creation of
an enterprise application specifically for a AI/ML, end users need an inference engine to run on their hardware. Hardware that's available to them, to be effective for their application. Let's
say an application like Computer Vision, one that can detect anomalies and medical imaging or maybe on a factory floor. As those things are whizzing by on the factory line there, looking at
them and trying to determine if there is an error or not.
</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter">
Well, how do you actually make it run on your hardware, your accessible technology that you have today? Well, there's a solution for this: an open toolkit called OpenVINO. And you might be
thinking, "Hey, wait a minute, don't you need a GPU for AI inferencing, a GPU for artificial intelligence, machine learning?" Well, not according to Ryan Loney, product manager of OpenVINO
Developer Tools at Intel.
</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney">
I guess I'll start with trying to maybe dispel a myth. I think that CPUs are widely used for inference today. So if we look at the data center segment, about 70% of the AI inference is
happening on Intel Xeon, on our data center CPUs. And so you don't need a GPU especially for running inference. And that's part of the value of OpenVINO, is that we're taking models that may
have been trained on a GPU using deep learning frameworks like PyTorch or TensorFlow, and then optimizing them to run on Intel hardware.
</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter">
Ryan joined me to discuss AI/ML in the enterprise across various industries and exploring numerous use cases. Let's talk a little bit about the origin story behind OpenVINO. Tell us more
about it and how it came to be and why it came out of Intel.
</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney">
Definitely. We had the first release of OpenVINO back in 2018, so it's still relatively new. And at that time, we were focused on Computer Vision and pretty tightly coupled with OpenCV, which
is another open source library with origins at Intel. It had its first release back in 1999, so it's been around a little bit longer. And many of the software engineers and architects at Intel
that were involved with and contributing to OpenCV are working on OpenVINO. So you can think of OpenVINO as complementary software to OpenCV and we're providing an engine for executing
inferences as part of a Computer Vision pipeline, or at least that's how we started.
</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney">
But since 2018, we've started to move beyond just Computer Vision inference. So when I say Computer Vision inference, I mean image classification, object detection, segmentation, and now
we're moving into natural language processing. Things like speech synthesis, speech recognition, knowledge graphs, time series forecasting and other use cases that don't involve Computer
Vision and don't involve inference on pixels. Our latest release, the 2022.1 that came out earlier this year, that was the most significant update that we've had to OpenVINO, since we started
in 2018. And the major focus of that release was optimizing for use cases that go beyond Computer Vision.
</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter">
And I like that concept that you just mentioned right there, Computer Vision, and you said that you extended those use cases and went beyond that. Could you give us some more concrete
examples of Computer Vision?
</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney">
Sure. When you think about manufacturing, quality control in factories, everything from arc welding, defect detection to inspecting BMW cars on assembly lines, they're using cameras or
sensors to collect data and usually it's cameras collecting images like RGB images that you and I can see and looks like something taken from a camera or video camera. But also, things like
infrared or computerized tomography scans used in healthcare, X-ray, different types of images where we can draw bounding boxes around regions of interest and say, "This is a defect," or,
"This is not a defect." And also, "Is this worker wearing a safety hat or did they forget to put it on?" And so, you can take this and integrate it into a pipeline where you're triggering an
alert if somebody forgets to wear their safety mask, or if there's a defect in a product on an assembly line, you can just use cameras and OpenVINO and OpenCV running these on Intel hardware
and help to analyze.
</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney">
And that's what a lot of the partners that we work with are doing, so these independent software vendors. And there's other use cases for things like retail. You think about going to a store
and using an automated checkout system. Sometimes people use those automated checkouts and they slide a few extra items into their bag that they don't scan and it's a huge loss for the retail
outlets that are providing this way to check out. Realtime shelf monitoring: we have Vispera, one of our ISVs that helps keep store shelves stocked by just analyzing the cameras in the
stores, detecting when objects are missing from the shelves so that they can be restocked. We have Vistry, another ISV that works with quick service restaurants. When you think about
automating the process of, when do I drop the fries into the fryer so that they're warm when the car gets to the drive-through window, there's quite a bit of industrial, healthcare, and retail
examples that we can walk through.
</rh-cue>
<rh-cue start="06:55" voice="Burr Sutter">
And we should dig into some more of those, but I got to tell you, I have a personal experience in this category that I want to share with you, and you can tell me how silly you might think at this
point in time it is. We actually built a keynote demonstration for the Red Hat big stage back in 2015. And I really want to illustrate the concept of asset tracking. So we actually gave
everybody in the conference a little Bluetooth token with a little battery, a little watch battery, and a little Bluetooth emitter. And we basically tracked those things around the conference.
We basically put a Raspberry Pi in each of the meeting rooms and up in the lunch room and you could see how the tokens moved from room to room to room.
</rh-cue>
<rh-cue start="07:28" voice="Burr Sutter">
It was a relatively simple application, but it occurred to me, after we figured out how to do that with Bluetooth and triangulating Bluetooth signals by looking at relative signal strength
from one radio to another and putting that through an Apache Spark application at the time, we then realized, "You know what? This is easier done with cameras." And just simply looking at a
camera and having some form of a AI/ML model, a machine learning model, that would say, "There are people here now," or, "There are no people here now." What do you think about that?
</rh-cue>
<rh-cue start="07:56" voice="Ryan Loney">
What you just described is exactly the product that Pathr, one of our partners is offering, but they're doing it with Computer Vision and cameras. So when Pathr tries to help retail stores
analyze the foot traffic and understand, with heat maps, where are people spending the most time in stores, how many people are coming in, what size groups are coming into the store and trying
to help understand if there was a successful transaction from the people who entered the store and left the store, to help with the retail analytics and marketing sales and positioning of
products. And so, they're doing that in a way that also protects privacy. And that's something that's really important. So when you talked about those Bluetooth beacons, probably if everyone
who walked into a grocery store was asked to put a tracking device in their cart or on their person and say, "You're going to be tracked around the store," they probably wouldn't want to do
that.
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney">
The way that you can do this with cameras, is you can detect people as they enter and remove their face. So you can ignore any biometric information and just track the person based on pixels
that are present in the detected region of interest. So they're able to analyze... Say a family walks in the door and they can group those people together with object detection and then they
can track their movement throughout the store without keeping track of their face, or any biometric, or any personal identifiable information, to avoid things like bias and to make sure that
they're protecting the privacy of the shoppers in the store, while still getting that really useful marketing analytics data. So that they can make better decisions about where to place their
products. That's one really good example of how Computer Vision, AI with OpenVINO is being used today.
</rh-cue>
<rh-cue start="09:49" voice="Burr Sutter">
And that is a great example, because you're definitely spot on. It is invasive when you hand someone a Bluetooth device and say, "Please, keep this with you as you go throughout our store,
our mall or throughout our hospital, wherever you might be." Now you mentioned another example earlier in the conversation which was related to worker safety. "Are they wearing a helmet?" I
want to talk more about that concept in a real industrial setting, a manufacturing setting, where there might be a factory floor and there's certain requirements. Or better yet there's like a
quality assurance requirement, let's say, when it comes to looking at a factory line. I've run that use case often with some of our customers. Can you talk more about those kinds of use cases?
</rh-cue>
<rh-cue start="10:23" voice="Ryan Loney">
One of our partners, Robotron, we published a case study, I think last year, where they were working with BMW at one of their factories. And they do quality control inspection, but they're
also doing things related to worker safety and analyzing. I use the safety hat example. There's a number of our ISVs and partners who have similar use cases and it comes down to, there's a few
reasons that are motivating this and some are related to insurance. It's important to make sure that if you want to have your factory insured, that your workers are protecting themselves and
wearing the gear. Regulatory compliance: you're being asked to properly protect from exposure to chemicals or potentially having something fall and hit someone on the head. So wearing a safety
vest, wearing goggles, wearing a helmet, these are things that you need to do inside the factory and you can really easily automate and detect and sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney">
I think that's one of the interesting things about the Robotron-BMW example is that they were also blurring, blacking out, so drawing a box to cover the face of the workers in the factory, so
that somebody who was analyzing the video footage and getting the alerts saying that, "Bay 21 has a worker without a hat on," that it's not sending their face in the alert and potentially
invading or going against privacy laws or just the ethics of the company. They don't want to introduce bias or have people targeted because it's much better to blur the face and alert and have
somebody take care of it on the floor. And then, if you ever need to audit that information later, they have a way to do it where people who need to be able to see who the employee was and
look up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney">
But then just for the purposes of maintaining safety, they don't need to have access to that personal information, or biometric information. Because that's one thing that when you hear about
Computer Vision or person tracking, object detection, there's a lot of concern, and rightfully so, about privacy being invaded and about tracking information, face re-identification,
identifying people who may have committed crimes through video footage. And that's just not something that a lot of companies want to... They want to protect privacy and they don't want to be
in a situation where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter">
Well, privacy is certainly opening up Pandora's box. There's a lot to be explored in that area, especially in a digital world that we now live in. But for now, let's move on and explore a
different area. I'm interested in how machines and computers offer advantages specifically in certain use cases like a quality control scenario. I asked Ryan to explain how a AI/ML and
specifically machines, computers, could augment that capability.
</rh-cue>
<rh-cue start="13:20" voice="Ryan Loney">
I can give a specific example where we have a partner that's doing defect detection, looking for anomalies in batteries. I'm sure you've heard there's a lot of interest right now in electric
vehicles, a lot of batteries being produced. And so, if you go into one of these factories, they have images that they collect of every battery that's going through this assembly line. And
through these images, people can look and see and visually inspect with their eyes and say, "This battery has a defect, send it back." And that's one step in the quality control process,
there's other steps I'm sure, like running diagnostic tests and measuring voltage and doing other types of non-visual inspection. But for the visual inspection piece, where you can really
easily identify some problems, it's much more efficient to introduce Computer Vision. And so, that's where we have this new library that we've introduced, called Anomalib.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney">
So OpenVINO, while we're focused on inference, we're also thinking about the pipeline, or the funnel, that gets these models to OpenVINO. And so, we've invested in this anomaly segmentation,
anomaly detection library that we've recently open sourced and there's a great research paper about it, about Anomalib, but the idea is you can take just a few images and train a model and
start detecting these defects. And so, for this battery example, that's a more advanced example, but to make it simpler, take some bolts and... Take 10 bolts. You have one that has a scratch
on it, or one that is chipped, or has some damage to it, and you can easily get started in training to recognize the bolts that do not have an anomaly and the ones that do, which is a small
data set. And I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney">
Challenges, one is access to data, but the other is needing a massive amount of data to do something meaningful. And so we're starting to try to change that dynamic with Anomalib. You may not
need a 100,000 images, you may need 100 images and you can start detecting anomalies in everything from batteries to bolts to, maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter">
That is a very key point because often in that data scientist process, that data engineering data scientist process, the one key thing is, can you gather the data that you need for the input
for the model training? And we've often said, at least people I've worked with over the last couple years, "You need a lot of data, you need tens of thousands of correct images, so we can sort
out the difference between dogs versus cats," let's say. Or you need dozens and dozens of situations where if it's a natural language processing scenario, a good customer interaction, a good
customer conversation. And in this case it sounds like what you're saying is, "Show us just the bad things, fewer images, fewer incorrect things, and then let us look for those kinds of
anomalies." Can you tell us more about that? Because that is very interesting. The concept that I can use a much smaller data set as my input, as opposed to gathering terabytes of data in some
cases, to just simply get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney">
Like you described, the idea is, if you have some good images and then you have some of the known defects, and you can just label, "Here's a set of good images and here's a few of the
defects." And you can right away start detecting those specific defects that you've identified. And then, also be able to determine when it doesn't match the expected appearance of a non
defective item. So if I have the undamaged screw and then I introduce one with some new anomaly that's never been seen before, I can say this one is not a valid screw. And so, that's the
approach that we're taking and it's really important because so often you need to have subject matter experts. Take the battery example, there's these workers who are on the floor, in a
factory and they're the ones who know best when they look at these images, which one's going to have an issue, which one's defective.
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney">
And then they also need to take that subject matter expertise and then use it to annotate data sets. And when you have these tens of thousands of images you need to annotate, it's asking
those people to stop working on the factory floor so they can come annotate some images. That's a tough business call to make, right? But if you only need them to annotate a handful of images,
it's a much easier ask to get the ball rolling and demonstrate value. And maybe over time you will want to annotate more and more images because you'll get even better accuracy in the model.
Even better, even if it's just small incremental improvements, that's something that if it generates value for the business, it's something the business will invest in over time. But you have
to convince the decision makers that it's worth the time of these subject matter experts to stop what they're doing and go and label some images of the things that they're working on in the
factory.
</rh-cue>
<rh-cue start="18:27" voice="Burr Sutter">
And that labeling process can be very labor intensive. If the annotation is basically saying what is correct, what's wrong, what is this, what is that. And therefore if we can minimize that
timeframe to get the value quicker, then there's something that's useful for the business, useful for the organization, long before we necessarily go through a whole huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter">
So we talked about labeling and how that is labor intensive activity, but I love the idea of helping the human. And helping the human most specifically not get bored. Basically if the human
is eyeballing a bunch of widgets flying by, over time they make mistakes, they get bored and they don't pay as close attention as they should. That's why the concept of AI/ML, and
specifically Computer Vision augmenting that capability and really helping the human identify anomalies faster, more quickly, maybe with greater accuracy, could be a big win. We focused on
manufacturing, but let's actually go into healthcare and learn how these tools can be used in that sector and that industry. Ryan talked to me about how OpenVINO's runtime can be incorporated
into medical imaging equipment with Intel processors embedded in CT, MRI and ultrasound machines, where these inferences, this AI/ML workload, can be operating and executing right there in the
same physical room as the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney">
We did a presentation with GE last year, I think they said there's at least 80 countries that have their x-ray machines deployed. And they're doing things like helping doctors place breathing
tubes in patients. So during COVID, during the pandemic, that was a really important tool to help with nurses and doctors who were intubating patients, sometimes in a parking lot or a hallway
of a hospital. And they had a statistic that GE said, I think one out of four breathing tubes gets placed incorrectly when you're doing it outside the operating room. Because when you're
in an operating room it's much more controlled and there's someone who's an expert at placing the tubes, it's something you have more of a controlled environment. But when you're out, in a
parking lot, in a tent, when the hospital's completely full and you're triaging patients with COVID, that's when they're more likely to make mistakes. And so, they had this endotracheal tube
placement, ETT, model that they trained and it helped to use an x-ray and give an alert and say, "This tube is placed wrong, pull it out and do it again." And so, things like that help doctors
so that they can avoid mistakes. And having a breathing tube placed incorrectly can cause collapsed lung and a number of other unwanted side effects. So it's really important to do it
correctly. Another example is Samsung Medison. They actually are estimating fetal angle of progression. So this is analyzing ultrasound of pregnant women being able to help take measurements
that are usually hard to calculate, but it can be done in an automated way. They're already taking an ultrasound scan and now they're executing this model that can take some of these
measurements to help the doctor avoid potentially more intrusive alternative methods. So the patient wins, it makes their life better and the doctor is getting help from this AI model. And
those are just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter">
Those are some amazing examples when it comes to all these things, we're talking CT scans and x-rays, other examples of Computer Vision. One thing that's kind of interesting in this space, I
think, whenever I get a chance to work on, let's say an object detection model, and one of our workshops, by the way, is actually putting that out in front of people to say, "Look, you can use
your phone and it basically sends the image over to our OpenShift with our data science platform and then analyzes what you see." And even in my case, where I take a picture of my dog as an
example, it can't really decide, is it a dog or a cat? I have a very funny looking dog.
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter">
And so there's always a percentage outcome. In other words, "I think it's a dog, 52%." So I want to talk about that more. How important is it to get to that a hundred percent accuracy? How
important is it to really, depending on the use case, to allow for the gray area if you will, where it's an 80% accuracy or a 70% accuracy, and what are the trade offs there associated with
the application? Can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney">
Accuracy is definitely a touchy subject, because how you measure it makes a huge difference. I think what you were describing with the dog example, there's sort of a top five potential
classes that might maybe be identified. So let's say you're doing object detection and you detect a region of interest, and it says 65% confidence this is a dog. Well, the next potential label
that could be maybe 50% confidence or 20% confidence might be something similar to a dog. Or in the case of models that have been trained on the ImageNet dataset or on COCO dataset, they have
actual breeds of dogs. If I want to look at the top five labels for a dog, for my dog for example, she's a mix, mostly a Labrador retriever, but I may look at the top five labels and it may
say 65% confidence that she's a flat coated retriever.
</rh-cue>
<rh-cue start="23:32" voice="Ryan Loney">
And then confidence that she's a husky as 20%, and then 5% confidence that she's a greyhound or something. Those labels, all of them are dogs. So if I'm just trying to figure out, is this a
dog? I could probably find all of the classes within the data set and say, "Well, these all, class ID 65, 132, 92 and 158, all belong to a group of dogs." So if I want to just write an
application to tell me if this is a dog or not, I would probably use that to determine if it's a dog. But how you measure that as accuracy, well that's where it gets a little bit complicated.
Because if you're being really strict about the definition and you're trying to validate against the data set of labeled images, and I have specific dog breeds or some specific detail and it
doesn't match, well then, the accuracy's going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney">
And that's especially important when we talk about things like compression and quantization, which historically, has been difficult to get adoption in some domains, like healthcare, where
even the hint of accuracy going down implies that we're not going to be able to help. In some small case, maybe if it's even half a percent of the time, we won't detect that that tube is
placed incorrectly or that that patient's lung has collapsed or something like that. And that's something that really prevents adoption of some of these methods that can really boost
performance, like quantization. But if you take that example of... Different from the dog example, and you think about segmentation of kidneys. If I'm doing kidney segmentation, which is
taking a CT scan and then trying to pick the pixels out of that scan that belong to a kidney, how I measure accuracy may be how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney">
Missing some of the pixels is maybe not a problem, depending on how you've built the application, because you still detect the kidney, and maybe you just need to apply padding around the
region of interest, so that you don't miss any of the actual kidney when you compress the model and when you quantize the model. But that requires a data scientist, an ML engineer, somebody to
really, they have to be able to go and apply that after the fact, after the inference happens, to make sure that you're not losing critical information. Because the next step from detecting
the kidney, may be detecting a tumor.
</rh-cue>
<rh-cue start="26:04" voice="Ryan Loney">
And so, maybe you can use the more optimized model to detect the kidney, but then you can use a slower model to detect the tumor. But that also requires somebody to architect and make that
decision or that trade off and say, "Well, I need to add padding," or, "I should only use the quantized model to detect the region of interest for the kidney." And then, use the model that
takes longer to do the inference just to find the tumor, which is going to be on a smaller size. The dimensions are going to be much smaller once we crop to the region of interest. But all of
those details, that's maybe not easy to explain in a few sentences and even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter">
I do love that use case, like you mentioned, the cropping, even in one scenario that we worked on for another project, we specifically decided to pixelate the image that we had taken, because
we knew that we could get the outcome we wanted by even just using a smaller or having less resolution in our image. And therefore, as we transferred it from the mobile device, the edge
device, up into the cloud, we wanted that smaller image just for transfer purposes. And still, we could get the accuracy we needed by a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter">
And one thing that's interesting about that, from my perspective, is, if you're doing image processing, sometimes it takes a while for this transaction to occur. I come from a traditional
application background, where I'm reading and writing things from a database, or a message broker, or moving data from one place to another. Those things happen sub-second normally, even with
great latency between your data centers, it's still sub-second in most cases. While a transaction like this one can actually take two seconds or four seconds, as it's doing its analysis and
actually coming back with its, "I think it's a dog, I think it's a kidney, I think it's whatever." And providing me that accuracy statement. That concept of optimization is very important in
the overall application architecture. Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney">
Definitely. It depends too on the use case. So if you think about how important it is to reduce the latency and increase the number of frames per second that you can process when you're
talking about a loss prevention model that's running at a grocery store. You want to keep the lines moving, you don't want every person who's at the self checkout to have to wait five seconds
for every item they scan. You need it to happen as quickly as possible. And if sometimes the accuracy decreases slightly, or I'd say the accuracy of the whole pipeline, so not just looking at
the individual model or the individual inference, but let's say that the whole pipeline is not as successful at detecting when somebody steals one item from the self checkout, it's not going
to be a life threatening situation. Whereas being hooked up to the x-ray machine with the tube placement model, they might be willing to have the doctor or the nurse wait five seconds to get
the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney">
They don't need it to happen in 500 milliseconds. Their threshold for waiting is a little bit higher. That, I think, also drives some of the decision. You want to keep people moving through
the checkout line and you can afford to, potentially, if you lose a little bit of accuracy here and there, it's not going to cost the company that much money or it's not going to be life
threatening. It's going to be worth the trade off of keeping the line moving and not having people leave the store and not check out at all, to say, "I'm not going to shop today because the
line's too long."
</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter">
There are so many trade-offs in enterprise AI/ML use cases, things like latency, accuracy and availability, and certainly complexities abound, especially in an obviously ever-evolving
technological landscape where we are still very early in the adoption of AI/ML. And to navigate that complexity, that direct feedback from real world end users is essential to Ryan and his
team at Intel. What would you say are some of the big hurdles or big outcomes, big opportunities in that space? And do you agree that we're still at the very beginning, in our infancy if you
will, of adopting these technologies and discovering what they can do for us?
</rh-cue>
<rh-cue start="30:06" voice="Ryan Loney">
Yeah, I think we're definitely in the infancy and I think that what we've seen is, our customers are evolving and the people who are deploying on Intel hardware, they're trying to run more
complicated models. They're the models that are doing object detection or detecting defects and doing segmentation. In the past you could say, "Here's a generic model that will do face
detection, or person detection, or vehicle detection, license plate detection." And those are general purpose models that you can just grab off the shelf and use them. But now we're moving
into the Anomalib scenarios, where I've got my own data and I'm trying to do something very specific and I'm the only one that has access to this data. You don't have that public data set that
you can go download that's under Creative Commons license for car batteries. It's just not something that's available.
</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney">
And so, those use cases, the challenge with training those models and getting them optimized is the beginning of the pipeline. It's the data. You have to get the data, you have to annotate it
and the tools have to exist for you to do that. And that's part of the problem that we're trying to help solve. And then, the models are getting more complex. So if you think, just from
working with customers recently, they're no longer just trying to do image classification, "Is it a dog or a cat?" They've moved on to 3D point clouds and 3D segmentation models and things
that are like the speech synthesis example. These GPT models that are generating... You put a text input and it generates an image for you. It's just becoming much more advanced, much more
sophisticated and on larger images.
</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney">
And so things like running super resolution and enhancing images, upscaling images, instead of just trying to take that 200 by 200 pixel image and classifying if it's a cat, now we're talking
about gigantic, huge images that we're processing and that all requires more resources or more optimized models. And every Computer Vision conference or AI conference, there's a new latest and
greatest architecture, there's a new research paper, and things are getting adopted much faster. The lead time for a NeurIPS paper, CVPR, for a company to actually adopt and put those into
production, the time shortens every year.
</rh-cue>
<rh-cue start="32:34" voice="Burr Sutter">
Well Ryan, I got to tell you, I could talk to you, literally, all day about these topics, the various use cases, the various ways models are being optimized, how to put models into a pipeline
for average enterprise applications. I've enjoyed learning about OpenVINO and Anomalib. I'm fascinated by this, because I'll have a chance to go try this myself, taking advantage of Red Hat
OpenShift and taking advantage of our data science platform. On top of that, I will definitely go be poking at this myself. Thank you so much for your time today.
</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney">
Thanks, Burr. This was a lot of fun. Thanks for having me.
</rh-cue>
<rh-cue start="33:05" voice="Burr Sutter">
You can check out the full transcript of our conversation and more resources, like a link to a white paper on OpenVINO and Anomalib at redhat.com/codecommentspodcast. This episode was
produced by Brent Simoneaux and Caroline Creaghead. Our sound designer is Christian Prohom. Our audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Laura Barnes, Claire Allison,
Nick Burns, Aaron Williamson, Karen King, Boo Boo Howse, Rachel Ertel, Mike Compton, Ocean Matthews, Laura Walters, Alex Traboulsi, and Victoria Lawton. I'm your host, Burr Sutter. Thank you
for joining me today on Code Comments. I hope you enjoyed today's session and today's conversation, and I look forward to many more.
</rh-cue>
</rh-transcript>
</rh-audio-player>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
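The slotted audio element in the markup above stays a regular media element, so it can also be driven from script. Below is a minimal sketch that seeks playback to one of the transcript cue timestamps; the cueStartToSeconds helper is illustrative and not part of the element's API.
```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';

// The slotted media element lives in light DOM, so a normal query finds it.
const audio = document.querySelector('rh-audio-player audio[slot="media"]');

// Convert the "mm:ss" strings used by <rh-cue start="..."> into seconds.
function cueStartToSeconds(start) {
  const [minutes, seconds] = start.split(':').map(Number);
  return minutes * 60 + seconds;
}

// Jump to the 02:20 cue; play() may be rejected until the user has interacted with the page.
audio.currentTime = cueStartToSeconds('02:20');
audio.play();
```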
Compact
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
rh-audio-player {
margin: var(--rh-space-xl, 24px);
}
```
<rh-audio-player id="player" layout="compact" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<rh-audio-player-about slot="about">
<h4 slot="heading">About the episode</h4>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to
be able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h4 slot="heading">Subscribe</h4>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript id="regular" slot="transcript">
<h4 slot="heading">Transcript</h4>
<rh-cue start="00:02" voice="Burr Sutter">
Hi, I'm Burr Sutter. I'm a Red Hatter who spends a lot of time talking to technologists about technologies. We say this a lot at Red Hat. No single technology provider holds the key to
success, including us. And I would say the same thing about myself. I love to share ideas, so I thought it would be awesome to talk to some brilliant technologists at Red Hat Partners. This is
Code Comments, an original podcast from Red Hat.
</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter">
I'm sure, like many of you here, you have been thinking about AI/ML, artificial intelligence and machine learning. I've been thinking about that for quite some time and I actually had the
opportunity to work on a few successful projects, here at Red Hat, using those technologies, actually enabling a data set, gathering a data set, working with a data scientist and data
engineering team, and then training a model and putting that model into production runtime environment. It was an exciting set of projects and you can see those on numerous YouTube videos that
have been published out there before. But I want you to think about the problem space a little bit, because there are some interesting challenges about AI/ML. One is simply just getting access to
the data, and while there are numerous publicly available data sets, when it comes to your specific enterprise use case, you might not be able to find publicly available data.
</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter">
In many cases you cannot, even for our applications that we created, we had to create our data set, capture our data set, explore the data set, and of course, train a model accordingly. And
we also found there's another challenge to be overcome in this AI/ML world, and that is access to certain types of hardware. If you think about an enterprise environment and the creation of
an enterprise application specifically for AI/ML, end users need an inference engine to run on their hardware, hardware that's available to them, to be effective for their application. Let's
say an application like Computer Vision, one that can detect anomalies in medical imaging or maybe on a factory floor, as those things are whizzing by on the factory line there, looking at
them and trying to determine if there is an error or not.
</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter">
Well, how do you actually make it run on your hardware, your accessible technology that you have today? Well, there's a solution for this, an open toolkit called OpenVINO. And you might be
thinking, "Hey, wait a minute, don't you need a GPU for AI inferencing, a GPU for artificial intelligence, machine learning?" Well, not according to Ryan Loney, product manager of OpenVINO
Developer Tools at Intel.
</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney">
I guess I'll start with trying to maybe dispel a myth. I think that CPUs are widely used for inference today. So if we look at the data center segment, about 70% of the AI inference is
happening on Intel Xeon, on our data center CPUs. And so you don't need a GPU especially for running inference. And that's part of the value of OpenVINO, is that we're taking models that may
have been trained on a GPU using deep learning frameworks like PyTorch or TensorFlow, and then optimizing them to run on Intel hardware.
</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter">
Ryan joined me to discuss AI/ML in the enterprise across various industries and exploring numerous use cases. Let's talk a little bit about the origin story behind OpenVINO. Tell us more
about it and how it came to be and why it came out of Intel.
</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney">
Definitely. We had the first release of OpenVINO back in 2018, so it's still relatively new. And at that time, we were focused on Computer Vision and pretty tightly coupled with OpenCV, which
is another open source library with origins at Intel. It had its first release back in 1999, so it's been around a little bit longer. And many of the software engineers and architects at Intel
that were involved with and contributing to OpenCV are working on OpenVINO. So you can think of OpenVINO as complementary software to OpenCV and we're providing an engine for executing
inferences as part of a Computer Vision pipeline, or at least that's how we started.
</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney">
But since 2018, we've started to move beyond just Computer Vision inference. So when I say Computer Vision inference, I mean image classification, object detection, segmentation, and now
we're moving into natural language processing. Things like speech synthesis, speech recognition, knowledge graphs, time series forecasting and other use cases that don't involve Computer
Vision and don't involve inference on pixels. Our latest release, the 2022.1 that came out earlier this year, that was the most significant update that we've had to OpenVINO, since we started
in 2018. And the major focus of that release was optimizing for use cases that go beyond Computer Vision.
</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter">
And I like that concept that you just mentioned right there, Computer Vision, and you said that you extended those use cases and went beyond that. Could you give us some more concrete
examples of Computer Vision?
</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney">
Sure. When you think about manufacturing, quality control in factories, everything from arc welding, defect detection to inspecting BMW cars on assembly lines, they're using cameras or
sensors to collect data and usually it's cameras collecting images like RGB images that you and I can see and looks like something taken from a camera or video camera. But also, things like
infrared or computerized tomography scans used in healthcare, X-ray, different types of images where we can draw bounding boxes around regions of interest and say, "This is a defect," or,
"This is not a defect." And also, "Is this worker wearing a safety hat or did they forget to put it on?" And so, you can take this and integrate it into a pipeline where you're triggering an
alert if somebody forgets to wear their safety mask, or if there's a defect in a product on an assembly line, you can just use cameras and OpenVINO and OpenCV running these on Intel hardware
and help to analyze.
</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney">
And that's what a lot of the partners that we work with are doing, so these independent software vendors. And there's other use cases for things like retail. You think about going to a store
and using an automated checkout system. Sometimes people use those automated checkouts and they slide a few extra items into their bag that they don't scan and it's a huge loss for the retail
outlets that are providing this way to check out. For real-time shelf monitoring, we have Vispera, one of our ISVs, which helps keep store shelves stocked by just analyzing the cameras in the
stores, detecting when objects are missing from the shelves so that they can be restocked. We have Vistry, another ISV that works with quick service restaurants. When you think about
automating the process of, when do I drop the fries into the fryer so that they're warm when the car gets to the drive through window, there's quite a bit of industrial healthcare retail
examples that we can walk through.
</rh-cue>
<rh-cue start="06:55" voice="Burr Sutter">
And we should dig into some more of those, but I got to tell you, I have a personal experience in this category that I want to share with you, and you can tell me how silly you might think at this
point in time it is. We actually built a keynote demonstration for the Red Hat big stage back in 2015. And I really wanted to illustrate the concept of asset tracking. So we actually gave
everybody in the conference a little Bluetooth token with a little battery, a little watch battery, and a little Bluetooth emitter. And we basically tracked those things around the conference.
We basically put a Raspberry Pi in each of the meeting rooms and up in the lunch room and you could see how the tokens moved from room to room to room.
</rh-cue>
<rh-cue start="07:28" voice="Burr Sutter">
It was a relatively simple application, but it occurred to me, after we figured out how to do that with Bluetooth and triangulating Bluetooth signals by looking at relative signal strength
from one radio to another and putting that through an Apache Spark application at the time, we then realized, "You know what? This is easier done with cameras." And just simply looking at a
camera and having some form of an AI/ML model, a machine learning model, that would say, "There are people here now," or, "There are no people here now." What do you think about that?
</rh-cue>
<rh-cue start="07:56" voice="Ryan Loney">
What you just described is exactly the product that Pathr, one of our partners is offering, but they're doing it with Computer Vision and cameras. So when Pathr tries to help retail stores
analyze the foot traffic and understand, with heat maps, where are people spending the most time in stores, how many people are coming in, what size groups are coming into the store and trying
to help understand if there was a successful transaction from the people who entered the store and left the store, to help with the retail analytics and marketing sales and positioning of
products. And so, they're doing that in a way that also protects privacy. And that's something that's really important. So when you talked about those Bluetooth beacons, probably if everyone
who walked into a grocery store was asked to put a tracking device in their cart or on their person and say, "You're going to be tracked around the store," they probably wouldn't want to do
that.
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney">
The way that you can do this with cameras, is you can detect people as they enter and remove their face. So you can ignore any biometric information and just track the person based on pixels
that are present in the detected region of interest. So they're able to analyze... Say a family walks in the door and they can group those people together with object detection and then they
can track their movement throughout the store without keeping track of their face, or any biometric, or any personal identifiable information, to avoid things like bias and to make sure that
they're protecting the privacy of the shoppers in the store, while still getting that really useful marketing analytics data. So that they can make better decisions about where to place their
products. That's one really good example of how Computer Vision, AI with OpenVINO is being used today.
</rh-cue>
<rh-cue start="09:49" voice="Burr Sutter">
And that is a great example, because you're definitely spot on. It is invasive when you hand someone a Bluetooth device and say, "Please, keep this with you as you go throughout our store,
our mall or throughout our hospital, wherever you might be." Now you mentioned another example earlier in the conversation which was related to worker safety. "Are they wearing a helmet?" I
want to talk more about that concept in a real industrial setting, a manufacturing setting, where there might be a factory floor and there's certain requirements. Or better yet there's like a
quality assurance requirement, let's say, when it comes to looking at a factory line. I've run that use case often with some of our customers. Can you talk more about those kinds of use cases?
</rh-cue>
<rh-cue start="10:23" voice="Ryan Loney">
One of our partners, Robotron, we published a case study, I think last year, where they were working with BMW at one of their factories. And they do quality control inspection, but they're
also doing things related to worker safety and analyzing. I use the safety hat example. There's a number of our ISVs and partners who have similar use cases and it comes down to, there's a few
reasons that are motivating this and some are related to insurance. It's important to make sure that if you want to have your factory insured, that your workers are protecting themselves and
wearing the gear. For regulatory compliance, you're being asked to properly protect from exposure to chemicals or potentially having something fall and hit someone on the head. So wearing a safety
vest, wearing goggles, wearing a helmet, these are things that you need to do inside the factory and you can really easily automate and detect and sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney">
I think that's one of the interesting things about the Robotron-BMW example is that they were also blurring, blacking out, so drawing a box to cover the face of the workers in the factory, so
that somebody who was analyzing the video footage and getting the alerts saying that, "Bay 21 has a worker without a hat on," that it's not sending their face in the alert and potentially
invading or going against privacy laws or just the ethics of the company. They don't want to introduce bias or have people targeted because it's much better to blur the face and alert and have
somebody take care of it on the floor. And then, if you ever need to audit that information later, they have a way to do it where people who need to be able to see who the employee was and
look up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney">
But then just for the purposes of maintaining safety, they don't need to have access to that personal information, or biometric information. Because that's one thing that when you hear about
Computer Vision or person tracking, object detection, there's a lot of concern, and rightfully so, about privacy being invaded and about tracking information, face re-identification,
identifying people who may have committed crimes through video footage. And that's just not something that a lot of companies want to... They want to protect privacy and they don't want to be
in a situation where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter">
Well, privacy is certainly opening up Pandora's box. There's a lot to be explored in that area, especially in a digital world that we now live in. But for now, let's move on and explore a
different area. I'm interested in how machines and computers offer advantages specifically in certain use cases like a quality control scenario. I asked Ryan to explain how AI/ML and
specifically machines, computers, could augment that capability.
</rh-cue>
<rh-cue start="13:20" voice="Ryan Loney">
I can give a specific example where we have a partner that's doing defect detection, looking for anomalies in batteries. I'm sure you've heard there's a lot of interest right now in electric
vehicles, a lot of batteries being produced. And so, if you go into one of these factories, they have images that they collect of every battery that's going through this assembly line. And
through these images, people can look and visually inspect with their eyes and say, "This battery has a defect, send it back." And that's one step in the quality control process,
there's other steps I'm sure, like running diagnostic tests and measuring voltage and doing other types of non-visual inspection. But for the visual inspection piece, where you can really
easily identify some problems, it's much more efficient to introduce Computer Vision. And so, that's where we have this new library that we've introduced, called Anomalib.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney">
So OpenVINO, while we're focused on inference, we're also thinking about the pipeline, or the funnel, that gets these models to OpenVINO. And so, we've invested in this anomaly segmentation,
anomaly detection library that we've recently open sourced and there's a great research paper about it, about Anomalib, but the idea is you can take just a few images and train a model and
start detecting these defects. And so, for this battery example, that's a more advanced example, but to make it simpler, take some bolts and... Take 10 bolts. You have one that has a scratch
on it, or one that is chipped, or has some damage to it, and you can easily get started in training to recognize the bolts that do not have an anomaly and the ones that do, which is a small
data set. And I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney">
Challenges, one is access to data, but the other is needing a massive amount of data to do something meaningful. And so we're starting to try to change that dynamic with Anomalib. You may not
need a 100,000 images, you may need 100 images and you can start detecting anomalies in everything from batteries to bolts to, maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter">
That is a very key point because often in that data scientist process, that data engineering data scientist process, the one key thing is, can you gather the data that you need for the input
for the model training? And we've often said, at least people I've worked with over the last couple years, "You need a lot of data, you need tens of thousands of correct images, so we can sort
out the difference between dogs versus cats," let's say. Or you need dozens and dozens of situations where if it's a natural language processing scenario, a good customer interaction, a good
customer conversation. And in this case it sounds like what you're saying is, "Show us just the bad things, fewer images, fewer incorrect things, and then let us look for those kinds of
anomalies." Can you tell us more about that? Because that is very interesting. The concept that I can use a much smaller data set as my input, as opposed to gathering terabytes of data in some
cases, to just simply get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney">
Like you described, the idea is, if you have some good images and then you have some of the known defects, and you can just label, "Here's a set of good images and here's a few of the
defects." And you can right away start detecting those specific defects that you've identified. And then, also be able to determine when it doesn't match the expected appearance of a non
defective item. So if I have the undamaged screw and then I introduce one with some new anomaly that's never been seen before, I can say this one is not a valid screw. And so, that's the
approach that we're taking and it's really important because so often you need to have subject matter experts. Take the battery example, there's these workers who are on the floor, in a
factory and they're the ones who know best when they look at these images, which one's going to have an issue, which one's defective.
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney">
And then they also need to take that subject matter expertise and then use it to annotate data sets. And when you have these tens of thousands of images you need to annotate, it's asking
those people to stop working on the factory floor so they can come annotate some images. That's a tough business call to make, right? But if you only need them to annotate a handful of images,
it's a much easier ask to get the ball rolling and demonstrate value. And maybe over time you will want to annotate more and more images because you'll get even better accuracy in the model.
Even better, even if it's just small incremental improvements, that's something that if it generates value for the business, it's something the business will invest in over time. But you have
to convince the decision makers that it's worth the time of these subject matter experts to stop what they're doing and go and label some images of the things that they're working on in the
factory.
</rh-cue>
<rh-cue start="18:27" voice="Burr Sutter">
And that labeling process can be very labor intensive. If the annotation is basically saying what is correct, what's wrong, what is this, what is that. And therefore if we can minimize that
timeframe to get the value quicker, then there's something that's useful for the business, useful for the organization, long before we necessarily go through a whole huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter">
So we talked about labeling and how that is labor intensive activity, but I love the idea of helping the human. And helping the human most specifically not get bored. Basically if the human
is eyeballing a bunch of widgets flying by, over time they make mistakes, they get bored and they don't pay as close attention as they should. That's why the concept of AI/ML, and
specifically Computer Vision augmenting that capability and really helping the human identify anomalies faster, more quickly, maybe with greater accuracy, could be a big win. We focused on
manufacturing, but let's actually go into healthcare and learn how these tools can be used in that sector and that industry. Ryan talked to me about how OpenVINO's runtime can be incorporated
into medical imaging equipment with Intel processors embedded in CT, MRI and ultrasound machines, while these inferences, this AI/ML workload, can be operating and executing right there in the
same physical room as the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney">
We did a presentation with GE last year, I think they said there's at least 80 countries that have their x-ray machines deployed. And they're doing things like helping doctors place breathing
tubes in patients. So during COVID, during the pandemic, that was a really important tool to help with nurses and doctors who were intubating patients, sometimes in a parking lot or a hallway
of a hospital. And they had a statistic: GE said, I think, one out of four breathing tubes gets placed incorrectly when you're doing it outside the operating room. Because when you're
in an operating room it's much more controlled and there's someone who's an expert at placing the tubes, it's something you have more of a controlled environment. But when you're out, in a
parking lot, in a tent, when the hospital's completely full and you're triaging patients with COVID, that's when they're more likely to make mistakes. And so, they had this endotracheal tube
placement, ETT, model that they trained and it helped to use an x-ray and give an alert and say, "This tube is placed wrong, pull it out and do it again." And so, things like that help doctors
so that they can avoid mistakes. And having a breathing tube placed incorrectly can cause a collapsed lung and a number of other unwanted side effects. So it's really important to do it
correctly. Another example is Samsung Medison. They actually are estimating fetal angle of progression. So this is analyzing ultrasound of pregnant women being able to help take measurements
that are usually hard to calculate, but it can be done in an automated way. They're already taking an ultrasound scan and now they're executing this model that can take some of these
measurements to help the doctor avoid potentially more intrusive alternative methods. So the patient wins, it makes their life better and the doctor is getting help from this AI model. And
those are just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter">
Those are some amazing examples when it comes to all these things, we're talking CT scans and x-rays, other examples of Computer Vision. One thing that's kind of interesting in this space, I
think, whenever I get a chance to work on, let's say an object detection model, and one of our workshops, by the way, is actually putting that out in front of people to say, "Look, you can use
your phone and it basically sends the image over to our OpenShift with our data science platform and then analyzes what you see." And even in my case, where I take a picture of my dog as an
example, it can't really decide, is it a dog or a cat? I have a very funny looking dog.
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter">
And so there's always a percentage outcome. In other words, "I think it's a dog, 52%." So I want to talk about that more. How important is it to get to that a hundred percent accuracy? How
important is it to really, depending on the use case, to allow for the gray area if you will, where it's an 80% accuracy or a 70% accuracy, and what are the trade offs there associated with
the application? Can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney">
Accuracy is definitely a touchy subject, because how you measure it makes a huge difference. I think what you were describing with the dog example, there's sort of a top five potential
classes that might maybe be identified. So let's say you're doing object detection and you detect a region of interest, and it says 65% confidence this is a dog. Well, the next potential label
that could be maybe 50% confidence or 20% confidence might be something similar to a dog. Or in the case of models that have been trained on the ImageNet dataset or on COCO dataset, they have
actual breeds of dogs. If I want to look at the top five labels for a dog, for my dog for example, she's a mix, mostly a Labrador retriever, but I may look at the top five labels and it may
say 65% confidence that she's a flat coated retriever.
</rh-cue>
<rh-cue start="23:32" voice="Ryan Loney">
And then confidence that she's a husky at 20%, and then 5% confidence that she's a greyhound or something. Those labels, all of them are dogs. So if I'm just trying to figure out, is this a
dog? I could probably find all of the classes within the data set and say, "Well, these all, class ID 65, 132, 92 and 158, all belong to a group of dogs." So if I want to just write an
application to tell me if this is a dog or not, I would probably use that to determine if it's a dog. But how you measure that as accuracy, well that's where it gets a little bit complicated.
Because if you're being really strict about the definition and you're trying to validate against the data set of labeled images, and I have specific dog breeds or some specific detail and it
doesn't match, well then, the accuracy's going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney">
And that's especially important when we talk about things like compression and quantization, which historically, has been difficult to get adoption in some domains, like healthcare, where
even the hint of accuracy going down implies that we're not going to be able to help. In some small case, maybe if it's even half a percent of the time, we won't detect that that tube is
placed incorrectly or that that patient's lung has collapsed or something like that. And that's something that really prevents adoption of some of these methods that can really boost
performance, like quantization. But if you take that example of... Different from the dog example, and you think about segmentation of kidneys. If I'm doing kidney segmentation, which is
taking a CT scan and then trying to pick the pixels out of that scan that belong to a kidney, how I measure accuracy may be how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney">
Missing some of the pixels is maybe not a problem, depending on how you've built the application, because you still detect the kidney, and maybe you just need to apply padding around the
region of interest, so that you don't miss any of the actual kidney when you compress the model and when you quantize the model. But that requires a data scientist, an ML engineer, somebody to
really, they have to be able to go and apply that after the fact, after the inference happens, to make sure that you're not losing critical information. Because the next step from detecting
the kidney, may be detecting a tumor.
</rh-cue>
<rh-cue start="26:04" voice="Ryan Loney">
And so, maybe you can use the more optimized model to detect the kidney, but then you can use a slower model to detect the tumor. But that also requires somebody to architect and make that
decision or that trade off and say, "Well, I need to add padding," or, "I should only use the quantized model to detect the region of interest for the kidney." And then, use the model that
takes longer to do the inference just to find the tumor, which is going to be on a smaller size. The dimensions are going to be much smaller once we crop to the region of interest. But all of
those details, that's maybe not easy to explain in a few sentences and even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter">
I do love that use case, like you mentioned, the cropping, even in one scenario that we worked on for another project, we specifically decided to pixelate the image that we had taken, because
we knew that we could get the outcome we wanted by even just using a smaller or having less resolution in our image. And therefore, as we transferred it from the mobile device, the edge
device, up into the cloud, we wanted that smaller image just for transfer purposes. And still, we could get the accuracy we needed by a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter">
And one thing that's interesting about that, from my perspective, is, if you're doing image processing, sometimes it takes a while for this transaction to occur. I come from a traditional
application background, where I'm reading and writing things from a database, or a message broker, or moving data from one place to another. Those things happen sub-second normally, even with
great latency between your data centers, it's still sub-second in most cases. While a transaction like this one can actually take two seconds or four seconds, as it's doing its analysis and
actually coming back with its, "I think it's a dog, I think it's a kidney, I think it's whatever." And providing me that accuracy statement. That concept of optimization is very important in
the overall application architecture. Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney">
Definitely. It depends too on the use case. So if you think about how important it is to reduce the latency and increase the number of frames per second that you can process when you're
talking about a loss prevention model that's running at a grocery store. You want to keep the lines moving, you don't want every person who's at the self checkout to have to wait five seconds
for every item they scan. You need it to happen as quickly as possible. And if sometimes the accuracy decreases slightly, or I'd say the accuracy of the whole pipeline, so not just looking at
the individual model or the individual inference, but let's say that the whole pipeline is not as successful at detecting when somebody steals one item from the self checkout, it's not going
to be a life threatening situation. Whereas being hooked up to the x-ray machine with the tube placement model, they might be willing to have the doctor or the nurse wait five seconds to get
the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney">
They don't need it to happen in 500 milliseconds. Their threshold for waiting is a little bit higher. That, I think, also drives some of the decision. You want to keep people moving through
the checkout line and you can afford to, potentially, if you lose a little bit of accuracy here and there, it's not going to cost the company that much money or it's not going to be life
threatening. It's going to be worth the trade off of keeping the line moving and not having people leave the store and not check out at all, to say, "I'm not going to shop today because the
line's too long."
</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter">
There are so many trade-offs in enterprise AI/ML use cases, things like latency, accuracy and availability, and certainly complexities abound, especially in an obviously ever-evolving
technological landscape where we are still very early in the adoption of AI/ML. And to navigate that complexity, that direct feedback from real world end users is essential to Ryan and his
team at Intel. What would you say are some of the big hurdles or big outcomes, big opportunities in that space? And do you agree that we're still at the very beginning, in our infancy if you
will, of adopting these technologies and discovering what they can do for us?
</rh-cue>
<rh-cue start="30:06" voice="Ryan Loney">
Yeah, I think we're definitely in the infancy and I think that what we've seen is, our customers are evolving and the people who are deploying on Intel hardware, they're trying to run more
complicated models. They're the models that are doing object detection or detecting defects and doing segmentation. In the past you could say, "Here's a generic model that will do face
detection, or person detection, or vehicle detection, license plate detection." And those are general purpose models that you can just grab off the shelf and use them. But now we're moving
into the Anomalib scenarios, where I've got my own data and I'm trying to do something very specific and I'm the only one that has access to this data. You don't have that public data set that
you can go download that's under Creative Commons license for car batteries. It's just not something that's available.
</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney">
And so, those use cases, the challenge with training those models and getting them optimized is the beginning of the pipeline. It's the data. You have to get the data, you have to annotate it
and the tools have to exist for you to do that. And that's part of the problem that we're trying to help solve. And then, the models are getting more complex. So if you think, just from
working with customers recently, they're no longer just trying to do image classification, "Is it a dog or a cat?" They've moved on to 3D point clouds and 3D segmentation models and things
that are like the speech synthesis example. These GPT models that are generating... You put a text input and it generates an image for you. It's just becoming much more advanced, much more
sophisticated and on larger images.
</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney">
And so things like running super resolution and enhancing images, upscaling images, instead of just trying to take that 200 by 200 pixel image and classifying if it's a cat, now we're talking
about gigantic, huge images that we're processing and that all requires more resources or more optimized models. And every Computer Vision conference or AI conference, there's a new latest and
greatest architecture, there's a new research paper, and things are getting adopted much faster. The lead time for a NeurIPS paper, CVPR, for a company to actually adopt and put those into
production, the time shortens every year.
</rh-cue>
<rh-cue start="32:34" voice="Burr Sutter">
Well Ryan, I got to tell you, I could talk to you, literally, all day about these topics, the various use cases, the various ways models are being optimized, how to put models into a pipeline
for average enterprise applications. I've enjoyed learning about OpenVINO and Anomalib. I'm fascinated by this, because I'll have a chance to go try this myself, taking advantage of Red Hat
OpenShift and taking advantage of our data science platform. On top of that, I will definitely go be poking at this myself. Thank you so much for your time today.
</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney">
Thanks, Burr. This was a lot of fun. Thanks for having me.
</rh-cue>
<rh-cue start="33:05" voice="Burr Sutter">
You can check out the full transcript of our conversation and more resources, like a link to a white paper on OpenVINO and Anomalib at redhat.com/codecommentspodcast. This episode was
produced by Brent Simoneaux and Caroline Creaghead. Our sound designer is Christian Prohom. Our audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Laura Barnes, Claire Allison,
Nick Burns, Aaron Williamson, Karen King, Boo Boo Howse, Rachel Ertel, Mike Compton, Ocean Matthews, Laura Walters, Alex Traboulsi, and Victoria Lawton. I'm your host, Burr Sutter. Thank you
for joining me today on Code Comments. I hope you enjoyed today's session and today's conversation, and I look forward to many more.
</rh-cue>
</rh-transcript>
</rh-audio-player>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
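The compact demo above uses the same markup as the full demo; only the layout attribute changes. The layout property can also be switched at runtime, for example to pick a denser layout on narrow viewports. This is a minimal sketch assuming a player with id="player" as in the demos; the breakpoint is only illustrative.
```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';

const player = document.getElementById('player');
const narrow = window.matchMedia('(max-width: 576px)');

// layout accepts the same values as the attribute: full, compact-wide, compact, mini.
function applyLayout() {
  player.layout = narrow.matches ? 'compact' : 'full';
}

narrow.addEventListener('change', applyLayout);
applyLayout();
```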
Customization
#customization {
height: 100%;
& form {
display: contents;
}
/**
* Warning:
* The following are demonstrations of using CSS variables to customize player color.
* They do not use our design token values for color.
*/
& rh-audio-player {
--purple: oklch(48.72% 0.198 289.56);
--cyan: oklch(70.5% 0.143 231.78);
--purple-light: oklch(78.72% 0.198 289.56);
--cyan-dark: oklch(30.5% 0.143 231.78);
/* --purple-light: oklch(from var(--purple) calc(l + 10%) c h); */
/* --cyan-dark: oklch(from var(--cyan) calc(l - 10%) c h); */
&.purple {
--rh-color-surface-lightest: var(--purple-light);
--rh-color-surface-darkest: var(--purple);
--rh-audio-player-range-thumb-color: #f56d6d;
--rh-audio-player-range-progress-color: #f56d6d;
&.img {
--rh-color-surface-darkest: black;
--rh-color-surface-lightest: white;
&::part(toolbar) {
background-image: url("https://www.redhat.com/cms/managed-files/episode-1-art-hero.png");
background-size: cover;
background-repeat: no-repeat;
background-position: right;
background-blend-mode: difference;
}
}
}
&.cyan {
--rh-color-surface-lightest: var(--cyan);
--rh-color-surface-darkest: var(--cyan-dark);
--rh-audio-player-range-thumb-color: #ffe953;
--rh-audio-player-range-progress-color: #ffe953;
}
}
}
```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
import '@rhds/elements/lib/elements/rh-context-demo/rh-context-demo.js';
const form = document.querySelector('form');
const player = document.querySelector('rh-audio-player');
/**
* update audio player demo based on form selections
*/
function updateDemo() {
const data = new FormData(form);
const values = Object.fromEntries(data.entries());
const { custom, layout } = values;
const colorClass = custom || '';
player.layout = layout;
player.className = colorClass;
// use a boolean here rather than the match array returned by String.prototype.match()
player.hasAccentColor = /^(cyan|purple)/.test(custom);
// hide the poster when the checkbox is unchecked, or when the purple-with-image theme supplies its own toolbar artwork
player.poster =
!values.poster || colorClass === 'purple img' ? undefined
: 'https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png';
}
form.addEventListener('input', updateDemo);
updateDemo();
```
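The form above drives the demo, but the same hooks can be used without it. This is a minimal sketch that applies the cyan theme from script, assuming the player sits inside the #customization section below so the .cyan rules above apply; the class name and custom properties are the ones defined in that CSS, and everything else is standard DOM API.
```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';

const player = document.querySelector('#customization rh-audio-player');

// Opt in to the .cyan rules defined in the customization stylesheet above.
player.classList.add('cyan');
player.hasAccentColor = true;

// Or set the custom properties used above directly when no stylesheet rule applies.
player.style.setProperty('--rh-audio-player-range-thumb-color', '#ffe953');
player.style.setProperty('--rh-audio-player-range-progress-color', '#ffe953');
```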
<section id="customization">
<rh-context-demo target="player">
<form slot="controls">
<label>Poster: <input name="poster" type="checkbox" checked=""></label>
<label>Custom color theme:
<select name="custom">
<option value="purple">Purple</option>
<option value="purple img">Purple with Image</option>
<option value="cyan">Cyan</option>
</select>
</label>
<label>Layout:
<select name="layout">
<option value="full" selected="">Full</option>
<option value="compact-wide">Compact Wide</option>
<option value="compact">Compact</option>
<option value="mini">Mini</option>
</select>
</label>
</form>
<rh-audio-player id="player" layout="full" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<rh-audio-player-about slot="about">
<h4 slot="heading">About the episode</h4>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to
be able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h4 slot="heading">Subscribe</h4>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript id="regular" slot="transcript">
<h4 slot="heading">Transcript</h4>
<rh-cue start="00:02" voice="Burr Sutter">
Hi, I'm Burr Sutter. I'm a Red Hatter who spends a lot of time talking to technologists about technologies. We say this a lot at Red Hat. No single technology provider holds the key to success,
including us. And I would say the same thing about myself. I love to share ideas, so I thought it would be awesome to talk to some brilliant technologists at Red Hat Partners. This is Code
Comments, an original podcast from Red Hat.
</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter">
I'm sure, like many of you here, you have been thinking about AI/ML, artificial intelligence and machine learning. I've been thinking about that for quite some time and I actually had the
opportunity to work on a few successful projects, here at Red Hat, using those technologies, actually enabling a data set, gathering a data set, working with a data scientist and data
engineering team, and then training a model and putting that model into production runtime environment. It was an exciting set of projects and you can see those on numerous YouTube videos that
have been published out there before. But I want you to think about the problem space a little bit, because there are some interesting challenges about AI/ML. One is simply just getting access to
the data, and while there are numerous publicly available data sets, when it comes to your specific enterprise use case, you might not be able to find publicly available data.
</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter">
In many cases you cannot, even for our applications that we created, we had to create our data set, capture our data set, explore the data set, and of course, train a model accordingly. And we
also found there's another challenge to be overcome in this AI/ML world, and that is access to certain types of hardware. If you think about an enterprise environment and the creation of an
enterprise application specifically for AI/ML, end users need an inference engine to run on their hardware, hardware that's available to them, to be effective for their application. Let's say
an application like Computer Vision, one that can detect anomalies in medical imaging or maybe on a factory floor, as those things are whizzing by on the factory line there, looking at them and
trying to determine if there is an error or not.
</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter">
Well, how do you actually make it run on your hardware, your accessible technology that you have today? Well, there's a solution for this, an open toolkit called OpenVINO. And you might be
thinking, "Hey, wait a minute, don't you need a GPU for AI inferencing, a GPU for artificial intelligence, machine learning?" Well, not according to Ryan Loney, product manager of OpenVINO
Developer Tools at Intel.
</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney">
I guess I'll start with trying to maybe dispel a myth. I think that CPUs are widely used for inference today. So if we look at the data center segment, about 70% of the AI inference is happening
on Intel Xeon, on our data center CPUs. And so you don't need a GPU especially for running inference. And that's part of the value of OpenVINO, is that we're taking models that may have been
trained on a GPU using deep learning frameworks like PyTorch or TensorFlow, and then optimizing them to run on Intel hardware.
</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter">
Ryan joined me to discuss AI/ML in the enterprise across various industries and exploring numerous use cases. Let's talk a little bit about the origin story behind OpenVINO. Tell us more about
it and how it came to be and why it came out of Intel.
</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney">
Definitely. We had the first release of OpenVINO back in 2018, so it's still relatively new. And at that time, we were focused on Computer Vision and pretty tightly coupled with OpenCV, which is
another open source library with origins at Intel. It had its first release back in 1999, so it's been around a little bit longer. And many of the software engineers and architects at Intel that
were involved with and contributing to OpenCV are working on OpenVINO. So you can think of OpenVINO as complementary software to OpenCV and we're providing an engine for executing inferences as
part of a Computer Vision pipeline, or at least that's how we started.
</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney">
But since 2018, we've started to move beyond just Computer Vision inference. So when I say Computer Vision inference, I mean image classification, object detection, segmentation, and now we're
moving into natural language processing. Things like speech synthesis, speech recognition, knowledge graphs, time series forecasting and other use cases that don't involve Computer Vision and
don't involve inference on pixels. Our latest release, the 2022.1 that came out earlier this year, that was the most significant update that we've had to OpenVINO, since we started in 2018. And
the major focus of that release was optimizing for use cases that go beyond Computer Vision.
</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter">
And I like that concept that you just mentioned right there, Computer Vision, and you said that you extended those use cases and went beyond that. Could you give us some more concrete examples
of Computer Vision?
</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney">
Sure. When you think about manufacturing, quality control in factories, everything from arc welding defect detection to inspecting BMW cars on assembly lines, they're using cameras or sensors
to collect data, and usually it's cameras collecting images, like RGB images that you and I can see and that look like something taken from a camera or video camera. But also, things like infrared or
computerized tomography scans used in healthcare, X-ray, different types of images where we can draw bounding boxes around regions of interest and say, "This is a defect," or, "This is not a
defect." And also, "Is this worker wearing a safety hat or did they forget to put it on?" And so, you can take this and integrate it into a pipeline where you're triggering an alert if somebody
forgets to wear their safety mask, or if there's a defect in a product on an assembly line, you can just use cameras and OpenVINO and OpenCV running these on Intel hardware and help to analyze.
</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney">
And that's what a lot of the partners that we work with are doing, so these independent software vendors. And there's other use cases for things like retail. You think about going to a store and
using an automated checkout system. Sometimes people use those automated checkouts and they slide a few extra items into their bag that they don't scan, and it's a huge loss for the retail
outlets that are providing this way to check out. For real-time shelf monitoring, we have Vispera, one of our ISVs, that helps keep store shelves stocked by just analyzing the cameras in the stores,
detecting when objects are missing from the shelves so that they can be restocked. We have Vistry, another ISV that works with quick service restaurants. When you think about automating the
process of, when do I drop the fries into the fryer so that they're warm when the car gets to the drive-through window, there are quite a few industrial, healthcare, and retail examples that we can
walk through.
</rh-cue>
<rh-cue start="06:55" voice="Burr Sutter">
And we should dig into some more of those, but I got to tell you, I have a personal experience in this category that I want to share with you, and you can tell me how silly you think it is at this
point in time. We actually built a keynote demonstration for the Red Hat big stage back in 2015, and I really wanted to illustrate the concept of asset tracking. So we actually gave
everybody in the conference a little Bluetooth token with a little battery, a little watch battery, and a little Bluetooth emitter. And we basically tracked those things around the conference.
We basically put a Raspberry Pi in each of the meeting rooms and up in the lunch room, and you could see how the tokens moved from room to room to room.
</rh-cue>
<rh-cue start="07:28" voice="Burr Sutter">
It was a relatively simple application, but it occurred to me, after we figured out how to do that with Bluetooth and triangulating Bluetooth signals by looking at relative signal strength from
one radio to another and putting that through an Apache Spark application at the time, we then realized, "You know what? This is easier done with cameras." And just simply looking at a camera
and having some form of an AI/ML model, a machine learning model, that would say, "There are people here now," or, "There are no people here now." What do you think about that?
</rh-cue>
<rh-cue start="07:56" voice="Ryan Loney">
What you just described is exactly the product that Pathr, one of our partners, is offering, but they're doing it with Computer Vision and cameras. Pathr tries to help retail stores
analyze the foot traffic and understand, with heat maps, where people are spending the most time in stores, how many people are coming in, what size groups are coming into the store, and trying
to help understand if there was a successful transaction from the people who entered the store and left the store, to help with the retail analytics and marketing sales and positioning of
products. And so, they're doing that in a way that also protects privacy. And that's something that's really important. So when you talked about those Bluetooth beacons, probably if everyone who walked into a
grocery store was asked to put a tracking device in their cart or on their person and say, "You're going to be tracked around the store," they probably wouldn't want to do that.
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney">
The way that you can do this with cameras, is you can detect people as they enter and remove their face. So you can ignore any biometric information and just track the person based on pixels
that are present in the detected region of interest. So they're able to analyze... Say a family walks in the door and they can group those people together with object detection and then they can
track their movement throughout the store without keeping track of their face, or any biometric, or any personal identifiable information, to avoid things like bias and to make sure that they're
protecting the privacy of the shoppers in the store, while still getting that really useful marketing analytics data. So that they can make better decisions about where to place their products.
That's one really good example of how Computer Vision, AI with OpenVINO is being used today.
</rh-cue>
<rh-cue start="09:49" voice="Burr Sutter">
And that is a great example, because you're definitely spot on. It is invasive when you hand someone a Bluetooth device and say, "Please, keep this with you as you go throughout our store, our
mall or throughout our hospital, wherever you might be." Now you mentioned another example earlier in the conversation which was related to worker safety. "Are they wearing a helmet?" I want to
talk more about that concept in a real industrial setting, a manufacturing setting, where there might be a factory floor and there's certain requirements. Or better yet there's like a quality
assurance requirement, let's say, when it comes to looking at a factory line. I've run that use case often with some of our customers. Can you talk more about those kinds of use cases?
</rh-cue>
<rh-cue start="10:23" voice="Ryan Loney">
One of our partners, Robotron, we published a case study, I think last year, where they were working with BMW at one of their factories. And they do quality control inspection, but they're also
doing things related to worker safety and analyzing. I use the safety hat example. There's a number of our ISVs and partners who have similar use cases and it comes down to, there's a few
reasons that are motivating this and some are related to insurance. It's important to make sure that if you want to have your factory insured, that your workers are protecting themselves and
wearing the gear. Regulatory compliance: you're being asked to properly protect from exposure to chemicals or potentially having something fall and hit someone on the head. So wearing a safety
vest, wearing goggles, wearing a helmet, these are things that you need to do inside the factory and you can really easily automate and detect and sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney">
I think that's one of the interesting things about the Robotron-BMW example is that they were also blurring, blacking out, so drawing a box to cover the face of the workers in the factory, so
that somebody who was analyzing the video footage and getting the alerts saying that, "Bay 21 has a worker without a hat on," it's not sending their face in the alert and potentially
invading or going against privacy laws or just the ethics of the company. They don't want to introduce bias or have people targeted because it's much better to blur the face and alert and have
somebody take care of it on the floor. And then, if you ever need to audit that information later, they have a way to do it where people who need to be able to see who the employee was and look
up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney">
But then just for the purposes of maintaining safety, they don't need to have access to that personal information, or biometric information. Because that's one thing that when you hear about
Computer Vision or person tracking, object detection, there's a lot of concern, and rightfully so, about privacy being invaded and about tracking information, face re-identification, identifying
people who may have committed crimes through video footage. And that's just not something that a lot of companies want to... They want to protect privacy and they don't want to be in a situation
where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter">
Well, privacy is certainly opening up Pandora's box. There's a lot to be explored in that area, especially in a digital world that we now live in. But for now, let's move on and explore a
different area. I'm interested in how machines and computers offer advantages specifically in certain use cases like a quality control scenario. I asked Ryan to explain how AI/ML, and
specifically machines and computers, could augment that capability.
</rh-cue>
<rh-cue start="13:20" voice="Ryan Loney">
I can give a specific example where we have a partner that's doing defect detection, looking for anomalies in batteries. I'm sure you've heard there's a lot of interest right now in electric
vehicles, a lot of batteries being produced. And so, if you go into one of these factories, they have images that they collect of every battery that's going through this assembly line. And
through these images, people can look and visually inspect with their eyes and say, "This battery has a defect, send it back." And that's one step in the quality control process; there's
other steps I'm sure, like running diagnostic tests and measuring voltage and doing other types of non-visual inspection. But for the visual inspection piece, where you can really easily
identify some problems, it's much more efficient to introduce Computer Vision. And so, that's where we have this new library that we've introduced, called Anomalib.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney">
So OpenVINO, while we're focused on inference, we're also thinking about the pipeline, or the funnel, that gets these models to OpenVINO. And so, we've invested in this anomaly segmentation,
anomaly detection library that we've recently open sourced and there's a great research paper about it, about Anomalib, but the idea is you can take just a few images and train a model and start
detecting these defects. And so, for this battery example, that's a more advanced example, but to make it simpler, take some bolts and... Take 10 bolts. You have one that has a scratch on it, or
one that is chipped, or has some damage to it, and you can easily get started in training to recognize the bolts that do not have an anomaly and the ones that do, which is a small data set. And
I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney">
Challenges, one is access to data, but the other is needing a massive amount of data to do something meaningful. And so we're starting to try to change that dynamic with Anomalib. You may not
need 100,000 images; you may need 100 images, and you can start detecting anomalies in everything from batteries to bolts to maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter">
That is a very key point because often in that data scientist process, that data engineering data scientist process, the one key thing is, can you gather the data that you need for the input for
the model training? And we've often said, at least people I've worked with over the last couple years, "You need a lot of data, you need tens of thousands of correct images, so we can sort out
the difference between dogs versus cats," let's say. Or you need dozens and dozens of situations where if it's a natural language processing scenario, a good customer interaction, a good
customer conversation. In this case it sounds like what you're saying is, "Show us just the bad things, fewer images, fewer incorrect things, and then let us look for those kinds of anomalies."
Can you tell us more about that? Because that is very interesting. The concept that I can use a much smaller data set as my input, as opposed to gathering terabytes of data in some cases, to just simply
get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney">
Like you described, the idea is, if you have some good images and then you have some of the known defects, and you can just label, "Here's a set of good images and here's a few of the defects."
And you can right away start detecting those specific defects that you've identified. And then, also be able to determine when it doesn't match the expected appearance of a non-defective item.
So if I have the undamaged screw and then I introduce one with some new anomaly that's never been seen before, I can say this one is not a valid screw. And so, that's the approach that we're
taking and it's really important because so often you need to have subject matter experts. Take the battery example, there's these workers who are on the floor, in a factory and they're the ones
who know best when they look at these images, which one's going to have an issue, which one's defective.
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney">
And then they also need to take that subject matter expertise and then use it to annotate data sets. And when you have these tens of thousands of images you need to annotate, it's asking those
people to stop working on the factory floor so they can come annotate some images. That's a tough business call to make, right? But if you only need them to annotate a handful of images, it's a
much easier ask to get the ball rolling and demonstrate value. And maybe over time you will want to annotate more and more images because you'll get even better accuracy in the model. Even
better, even if it's just small incremental improvements, that's something that if it generates value for the business, it's something the business will invest in over time. But you have to
convince the decision makers that it's worth the time of these subject matter experts to stop what they're doing and go and label some images of the things that they're working on in the
factory.
</rh-cue>
<rh-cue start="18:27" voice="Burr Sutter">
And that labeling process can be very labor intensive. The annotation is basically saying what is correct, what's wrong, what is this, what is that. Therefore, if we can minimize that
timeframe to get the value quicker, then there's something that's useful for the business, useful for the organization, long before we necessarily go through a whole huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter">
So we talked about labeling and how that is labor intensive activity, but I love the idea of helping the human. And helping the human most specifically not get bored. Basically if the human is
eyeballing a bunch of widgets flying by, over time they make mistakes, they get bored and they don't pay as close attention as they should. That's why the concept of AI/ML, and specifically
Computer Vision augmenting that capability and really helping the human identify anomalies faster, more quickly, maybe with greater accuracy, could be a big win. We focused on manufacturing, but
let's actually go into healthcare and learn how these tools can be used in that sector and that industry. Ryan talked to me about how OpenVINO's runtime can be incorporated into medical imaging
equipment with Intel processors embedded in CT, MRI and ultrasound machines, where these inferences, this AI/ML workload, can be operating and executing right there in the same physical room as
the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney">
We did a presentation with GE last year, I think they said there's at least 80 countries that have their x-ray machines deployed. And they're doing things like helping doctors place breathing
tubes in patients. So during COVID, during the pandemic, that was a really important tool to help with nurses and doctors who were intubating patients, sometimes in a parking lot or a hallway of
a hospital. And they had a statistic that GE said, I think, one out of four breathing tubes gets placed incorrectly when you're doing it outside the operating room. Because when you're in an
operating room it's much more controlled and there's someone who's an expert at placing the tubes; you have more of a controlled environment. But when you're out, in a parking
lot, in a tent, when the hospital's completely full and you're triaging patients with COVID, that's when they're more likely to make mistakes. And so, they had this endotracheal tube placement,
ETT, model that they trained and it helped to use an x-ray and give an alert and say, "This tube is placed wrong, pull it out and do it again." And so, things like that help doctors so that they
can avoid mistakes. And having a breathing tube placed incorrectly can cause a collapsed lung and a number of other unwanted side effects. So it's really important to do it correctly. Another example
is Samsung Medison. They actually are estimating fetal angle of progression. So this is analyzing ultrasound of pregnant women being able to help take measurements that are usually hard to
calculate, but it can be done in an automated way. They're already taking an ultrasound scan and now they're executing this model that can take some of these measurements to help the doctor
avoid potentially more intrusive alternative methods. So the patient wins, it makes their life better and the doctor is getting help from this AI model. And those are just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter">
Those are some amazing examples when it comes to all these things, we're talking CT scans and x-rays, other examples of Computer Vision. One thing that's kind of interesting in this space, I
think, whenever I get a chance to work on, let's say an object detection model, and one of our workshops, by the way, is actually putting that out in front of people to say, "Look, you can use
your phone and it basically sends the image over to our OpenShift with our data science platform and then analyzes what you see." And even in my case, where I take a picture of my dog as an
example, it can't really decide, is it a dog or a cat? I have a very funny looking dog.
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter">
And so there's always a percentage outcome. In other words, "I think it's a dog, 52%." So I want to talk about that more. How important is it to get to that a hundred percent accuracy? How
important is it to really, depending on the use case, to allow for the gray area if you will, where it's an 80% accuracy or a 70% accuracy, and what are the trade offs there associated with the
application? Can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney">
Accuracy is definitely a touchy subject, because how you measure it makes a huge difference. I think what you were describing with the dog example, there's sort of a top five potential classes
that might be identified. So let's say you're doing object detection and you detect a region of interest, and it says 65% confidence this is a dog. Well, the next potential label that
could be maybe 50% confidence or 20% confidence might be something similar to a dog. Or in the case of models that have been trained on the ImageNet dataset or on COCO dataset, they have actual
breeds of dogs. If I want to look at the top five labels for a dog, for my dog for example, she's a mix, mostly a Labrador retriever, but I may look at the top five labels and it may say 65%
confidence that she's a flat coated retriever.
</rh-cue>
<rh-cue start="23:32" voice="Ryan Loney">
And then the confidence that she's a husky is 20%, and then 5% confidence that she's a greyhound or something. Those labels, all of them are dogs. So if I'm just trying to figure out, is this a dog?
I could probably find all of the classes within the data set and say, "Well, these all, class ID 65, 132, 92 and 158, all belong to a group of dogs." So if I want to just write an application to
tell me if this is a dog or not, I would probably use that to determine if it's a dog. But how you measure that as accuracy, well that's where it gets a little bit complicated. Because if you're
being really strict about the definition and you're trying to validate against the data set of labeled images, and I have specific dog breeds or some specific detail and it doesn't match, well
then, the accuracy's going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney">
And that's especially important when we talk about things like compression and quantization, which historically has been difficult to get adopted in some domains, like healthcare, where even
the hint of accuracy going down implies that we're not going to be able to help in some small case; maybe, if it's even half a percent of the time, we won't detect that that tube is placed
incorrectly or that that patient's lung has collapsed or something like that. And that's something that really prevents adoption of some of these methods that can really boost performance, like
quantization. But if you take that example of... Different from the dog example, and you think about segmentation of kidneys. If I'm doing kidney segmentation, which is taking a CT scan and then
trying to pick the pixels out of that scan that belong to a kidney, how I measure accuracy may be how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney">
Missing some of the pixels is maybe not a problem, depending on how you've built the application, because you still detect the kidney, and maybe you just need to apply padding around the region
of interest, so that you don't miss any of the actual kidney when you compress the model and when you quantize the model. But that requires a data scientist, an ML engineer, somebody to really,
they have to be able to go and apply that after the fact, after the inference happens, to make sure that you're not losing critical information. Because the next step from detecting the kidney,
may be detecting a tumor.
</rh-cue>
<rh-cue start="26:04" voice="Ryan Loney">
And so, maybe you can use the more optimized model to detect the kidney, but then you can use a slower model to detect the tumor. But that also requires somebody to architect and make that
decision or that trade off and say, "Well, I need to add padding," or, "I should only use the quantized model to detect the region of interest for the kidney." And then, use the model that takes
longer to do the inference just to find the tumor, which is going to be on a smaller size. The dimensions are going to be much smaller once we crop to the region of interest. But all of those
details, that's maybe not easy to explain in a few sentences and even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter">
I do love that use case, like you mentioned, the cropping, even in one scenario that we worked on for another project, we specifically decided to pixelate the image that we had taken, because we
knew that we could get the outcome we wanted by even just using a smaller image, or having less resolution in our image. And therefore, as we transferred it from the mobile device, the edge device, up
into the cloud, we wanted that smaller image just for transfer purposes. And still, we could get the accuracy we needed, as confirmed by a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter">
And one thing that's interesting about that, from my perspective, is, if you're doing image processing, sometimes it takes a while for this transaction to occur. I come from a traditional
application background, where I'm reading and writing things from a database, or a message broker, or moving data from one place to another. Those things happen sub-second normally, even with
great latency between your data centers, it's still sub-second in most cases. While a transaction like this one can actually take two seconds or four seconds, as it's doing its analysis and
actually coming back with its, "I think it's a dog, I think it's a kidney, I think it's whatever." And providing me that accuracy statement. That concept of optimization is very important in the
overall application architecture. Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney">
Definitely. It depends too on the use case. So if you think about how important it is to reduce the latency and increase the number of frames per second that you can process when you're talking
about a loss prevention model that's running at a grocery store. You want to keep the lines moving, you don't want every person who's at the self checkout to have to wait five seconds for every
item they scan. You need it to happen as quickly as possible. And if sometimes the accuracy decreases slightly, or I'd say the accuracy of the whole pipeline, so not just looking at the
individual model or the individual inference, but let's say that the whole pipeline is not as successful at detecting when somebody steals one item from the self checkout, it's not going to be a
life threatening situation. Whereas being hooked up to the x-ray machine with the tube placement model, they might be willing to have the doctor or the nurse wait five seconds to get the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney">
They don't need it to happen in 500 milliseconds. Their threshold for waiting is a little bit higher. That, I think, also drives some of the decision. You want to keep people moving through the
checkout line and you can afford to, potentially, if you lose a little bit of accuracy here and there, it's not going to cost the company that much money or it's not going to be life
threatening. It's going to be worth the trade off of keeping the line moving and not having people leave the store and not check out at all, to say, "I'm not going to shop today because the
line's too long."
</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter">
There are so many trade-offs in enterprise AI/ML use cases, things like latency, accuracy and availability, and certainly complexities abound, especially in an obviously ever-evolving
technological landscape where we are still very early in the adoption of AI/ML. And to navigate that complexity, that direct feedback from real world end users is essential to Ryan and his team
at Intel. What would you say are some of the big hurdles or big outcomes, big opportunities in that space? And do you agree that we're still at the very beginning, in our infancy if you will, of
adopting these technologies and discovering what they can do for us?
</rh-cue>
<rh-cue start="30:06" voice="Ryan Loney">
Yeah, I think we're definitely in the infancy and I think that what we've seen is, our customers are evolving and the people who are deploying on Intel hardware, they're trying to run more
complicated models. They're the models that are doing object detection or detecting defects and doing segmentation. In the past you could say, "Here's a generic model that will do face
detection, or person detection, or vehicle detection, license plate detection." And those are general purpose models that you can just grab off the shelf and use them. But now we're moving into
the Anomalib scenarios, where I've got my own data and I'm trying to do something very specific and I'm the only one that has access to this data. You don't have that public data set that you can go
download that's under Creative Commons license for car batteries. It's just not something that's available.
</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney">
And so, those use cases, the challenge with training those models and getting them optimized is the beginning of the pipeline. It's the data. You have to get the data, you have to annotate it
and the tools have to exist for you to do that. And that's part of the problem that we're trying to help solve. And then, the models are getting more complex. So if you think, just from working
with customers recently, they're no longer just trying to do image classification, "Is it a dog or a cat?" They've moved on to 3D point clouds and 3D segmentation models and things that are like the
speech synthesis example. These GPT models that are generating... You put a text input and it generates an image for you. It's just becoming much more advanced, much more sophisticated and on
larger images.
</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney">
And so things like running super resolution and enhancing images, upscaling images, instead of just trying to take that 200 by 200 pixel image and classifying if it's a cat, now we're talking
about gigantic, huge images that we're processing and that all requires more resources or more optimized models. And every Computer Vision conference or AI conference, there's a new latest and
greatest architecture, there's a new research paper, and things are getting adopted much faster. The lead time for a NeurIPS or CVPR paper, for a company to actually adopt and put those into
production, the time shortens every year.
</rh-cue>
<rh-cue start="32:34" voice="Burr Sutter">
Well Ryan, I got to tell you, I could talk to you, literally, all day about these topics, the various use cases, the various ways models are being optimized, how to put models into a pipeline
for average enterprise applications. I've enjoyed learning about OpenVINO and Anomalib. I'm fascinated by this, because I'll have a chance to go try this myself, taking advantage of Red Hat
OpenShift and taking advantage of our data science platform. On top of that, I will definitely go be poking at this myself. Thank you so much for your time today.
</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney">
Thanks, Burr. This was a lot of fun. Thanks for having me.
</rh-cue>
<rh-cue start="33:05" voice="Burr Sutter">
You can check out the full transcript of our conversation and more resources, like a link to a white paper on OpenVINO and Anomalib at redhat.com/codecommentspodcast. This episode was produced
by Brent Simoneaux and Caroline Creaghead. Our sound designer is Christian Prohom. Our audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Laura Barnes, Claire Allison, Nick Burns,
Aaron Williamson, Karen King, Boo Boo Howse, Rachel Ertel, Mike Compton, Ocean Matthews, Laura Walters, Alex Traboulsi, and Victoria Lawton. I'm your host, Burr Sutter. Thank you for joining me
today on Code Comments. I hope you enjoyed today's session and today's conversation, and I look forward to many more.
</rh-cue>
</rh-transcript>
</rh-audio-player>
</rh-context-demo>
</section>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
Detailed Transcript
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
rh-audio-player {
margin: var(--rh-space-xl, 24px);
}
```
<rh-audio-player layout="full" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<rh-audio-player-about slot="about">
<h4 slot="heading">About the episode</h4>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to
able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h4 slot="heading">Subscribe</h4>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript slot="transcript">
<rh-cue start="00:02" voice="Burr Sutter"></rh-cue>
<rh-cue start="00:02" end="00:04">Hi, I'm Burr Sutter.</rh-cue>
<rh-cue start="00:04" end="00:05">I'm a Red Hatter</rh-cue>
<rh-cue start="00:05" end="00:08">who spends a lot of time talking to technologists about technologies.</rh-cue>
<rh-cue start="00:09" end="00:10">We say this a lot of Red Hat.</rh-cue>
<rh-cue start="00:10" end="00:14">No single technology provider holds the key to success, including us.</rh-cue>
<rh-cue start="00:15" end="00:17">And I would say the same thing about myself.</rh-cue>
<rh-cue start="00:17" end="00:18">I love to share ideas.</rh-cue>
<rh-cue start="00:18" end="00:19">So I thought it'd be awesome</rh-cue>
<rh-cue start="00:19" end="00:22">to talk to some brilliant technologists at Red Hat Partners.</rh-cue>
<rh-cue start="00:23" end="00:26">This is Code Comments, an original podcast</rh-cue>
<rh-cue start="00:26" end="00:29">from Red Hat.</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter"></rh-cue>
<rh-cue start="00:29" end="00:32">I'm sure, like many of you here, you have been thinking about</rh-cue>
<rh-cue start="00:32" end="00:36">AI, ML, artificial intelligence and machine learning.</rh-cue>
<rh-cue start="00:36" end="00:38">I've been thinking about that for quite some time</rh-cue>
<rh-cue start="00:38" end="00:39">and actually had the opportunity</rh-cue>
<rh-cue start="00:39" end="00:43">to work on a few successful projects here at Red Hat using those technologies,</rh-cue>
<rh-cue start="00:43" end="00:46">actually enabling a dataset, gathering a dataset,</rh-cue>
<rh-cue start="00:46" end="00:49">working with data scientists and data engineering team,</rh-cue>
<rh-cue start="00:49" end="00:51">and then training a model and putting that model into production</rh-cue>
<rh-cue start="00:51" end="00:53">runtime environment.</rh-cue>
<rh-cue start="00:53" end="00:55">It was an exciting set of projects and you can kind of see</rh-cue>
<rh-cue start="00:55" end="00:58">those on numerous YouTube videos I have published out there before.</rh-cue>
<rh-cue start="00:59" end="01:01">But I want you to think about the problem space a little bit</rh-cue>
<rh-cue start="01:01" end="01:04">because there are some interesting challenges about AI/ML.</rh-cue>
<rh-cue start="01:04" end="01:06">One is simply just getting access to the data,</rh-cue>
<rh-cue start="01:06" end="01:09">and while there are numerous publicly available datasets</rh-cue>
<rh-cue start="01:09" end="01:12">when it comes to your specific enterprise use case, you might not be to find</rh-cue>
<rh-cue start="01:12" end="01:14">publicly available data.</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter"></rh-cue>
<rh-cue start="01:14" end="01:17">In many cases, you cannot, even for our applications that we created,</rh-cue>
<rh-cue start="01:17" end="01:20">we had to create our dataset, capture our dataset,</rh-cue>
<rh-cue start="01:21" end="01:24">explore the dataset, and of course train a model accordingly.</rh-cue>
<rh-cue start="01:24" end="01:27">And we also found there's another challenge to be overcome</rh-cue>
<rh-cue start="01:27" end="01:30">in this AML world, and that is access to certain types of hardware.</rh-cue>
<rh-cue start="01:31" end="01:33">If you think about the enterprise environment</rh-cue>
<rh-cue start="01:33" end="01:36">and the creation of an enterprise application specifically for AML</rh-cue>
<rh-cue start="01:37" end="01:40">and users need an inference engine to run on their hardware,</rh-cue>
<rh-cue start="01:40" end="01:43">hardware that's available to them to be effective for their application.</rh-cue>
<rh-cue start="01:43" end="01:45">Let's say an application like computer vision,</rh-cue>
<rh-cue start="01:45" end="01:49">one that can detect anomalies in medical imaging or maybe on a factory floor,</rh-cue>
<rh-cue start="01:49" end="01:52">You know, those things are whizzing by on the factory line.</rh-cue>
<rh-cue start="01:52" end="01:55">They're looking at them and trying to determine if there is an error or not.</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter"></rh-cue>
<rh-cue start="01:56" end="01:58">Well, how do you actually make it run on your hardware,</rh-cue>
<rh-cue start="01:58" end="02:01">your accessible technology that you have today?</rh-cue>
<rh-cue start="02:01" end="02:05">Well, there's a solution for this as an open toolkit called Open vino.</rh-cue>
<rh-cue start="02:05" end="02:07">And you might be thinking, hey, wait a minute,</rh-cue>
<rh-cue start="02:07" end="02:10">don't you need a GPU for a I inferencing a GPU</rh-cue>
<rh-cue start="02:10" end="02:12">for artificial intelligence machine learning?</rh-cue>
<rh-cue start="02:12" end="02:15">Well, not according to Ryan Loney, product manager of Open Vino Developer</rh-cue>
<rh-cue start="02:15" end="02:16">Tools at Intel.</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney"></rh-cue>
<rh-cue start="02:20" end="02:20">I guess we'll</rh-cue>
<rh-cue start="02:20" end="02:23">start with trying to maybe dispel the myths, right?</rh-cue>
<rh-cue start="02:23" end="02:27">I think that CPUs are widely used for inference today.</rh-cue>
<rh-cue start="02:27" end="02:32">So and if we look at the data center segment, you know, about 70% of the A.I.</rh-cue>
<rh-cue start="02:32" end="02:36">inference is happening on Intel Xeon on our data center CPUs.</rh-cue>
<rh-cue start="02:36" end="02:40">And so you don't needed a GPU, especially for running inference.</rh-cue>
<rh-cue start="02:40" end="02:43">And that's part of the value of open vino, is that we're you know,</rh-cue>
<rh-cue start="02:43" end="02:47">we're taking models that may have been trained on a GPU</rh-cue>
<rh-cue start="02:47" end="02:50">using deep learning frameworks like PyTorch or TensorFlow</rh-cue>
<rh-cue start="02:51" end="02:54">and then optimizing them to run on Intel hardware.</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter"></rh-cue>
<rh-cue start="02:56" end="03:00">Ryan joined me to discuss AI/ML and the enterprise</rh-cue>
<rh-cue start="03:00" end="03:03">across various industries and exploring numerous use cases.</rh-cue>
<rh-cue start="03:05" end="03:08">Let's talk a little bit about the origin story behind Open Vino.</rh-cue>
<rh-cue start="03:08" end="03:10">Tell us more about it and how it came to be</rh-cue>
<rh-cue start="03:10" end="03:13">and why it came out of Intel.</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney"></rh-cue>
<rh-cue start="03:12" end="03:16">Definitely. So we had the first release of Open Vino</rh-cue>
<rh-cue start="03:16" end="03:20">was back in 2018, so still relatively new.</rh-cue>
<rh-cue start="03:20" end="03:25">And at that time we were focused on computer vision and pretty tightly coupled</rh-cue>
<rh-cue start="03:25" end="03:31">with open CV, which is another open source library with origins at Intel.</rh-cue>
<rh-cue start="03:31" end="03:31">You know, it</rh-cue>
<rh-cue start="03:31" end="03:36">had its first release back in 1999, so it's been around a little bit longer.</rh-cue>
<rh-cue start="03:36" end="03:40">And many of the software engineers and architects at Intel</rh-cue>
<rh-cue start="03:40" end="03:45">that were involved with and contributing to open CV are working on open Vino.</rh-cue>
<rh-cue start="03:45" end="03:49">So you can think of open vino as complementary software to open CV.</rh-cue>
<rh-cue start="03:50" end="03:53">And we're providing like an engine for executing inference</rh-cue>
<rh-cue start="03:53" end="03:57">as part of a computer vision pipeline, or at least that's how we started.</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney"></rh-cue>
<rh-cue start="03:58" end="04:01">But since 2018, we've we've started to move beyond</rh-cue>
<rh-cue start="04:01" end="04:02">just computer vision inference.</rh-cue>
<rh-cue start="04:02" end="04:05">So when I say computer vision inference, I mean like image</rh-cue>
<rh-cue start="04:05" end="04:08">classification, object detection, segmentation.</rh-cue>
<rh-cue start="04:09" end="04:12">And now we're moving into natural language processing, things</rh-cue>
<rh-cue start="04:12" end="04:16">like speech synthesis, speech recognition, knowledge, graphs,</rh-cue>
<rh-cue start="04:17" end="04:21">time series forecasting, and other use cases that don't involve</rh-cue>
<rh-cue start="04:21" end="04:24">computer vision and don't involve inference on pixels.</rh-cue>
<rh-cue start="04:25" end="04:28">Our latest release, the 20 22.1 that came out earlier this year,</rh-cue>
<rh-cue start="04:29" end="04:32">there was a most significant update that we've had to open vino</rh-cue>
<rh-cue start="04:32" end="04:36">since we started in 2018, and the major focus of that release</rh-cue>
<rh-cue start="04:36" end="04:40">was optimizing for use cases that go beyond computer vision.</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter"></rh-cue>
<rh-cue start="04:41" end="04:44">And I like that concept that you just mentioned right there, computer vision.</rh-cue>
<rh-cue start="04:44" end="04:47">And you said that you extended those use cases and went beyond that.</rh-cue>
<rh-cue start="04:47" end="04:50">So could you give us more concrete examples of computer vision?</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney"></rh-cue>
<rh-cue start="04:50" end="04:50">Yeah, sure.</rh-cue>
<rh-cue start="04:50" end="04:55">So when you think about manufacturing quality control in factories, everything</rh-cue>
<rh-cue start="04:55" end="05:01">from ARC welding, defect detection to inspecting BMW cars on assembly lines,</rh-cue>
<rh-cue start="05:02" end="05:05">they're using cameras or sensors to collect data.</rh-cue>
<rh-cue start="05:05" end="05:11">And usually it's cameras collecting images like RGV images that you and I can see.</rh-cue>
<rh-cue start="05:11" end="05:14">And looks like something taken from a camera or video camera,</rh-cue>
<rh-cue start="05:15" end="05:19">but also things like infrared or computerized tomography</rh-cue>
<rh-cue start="05:19" end="05:24">scans used in health care, X-ray, different types of images where we can</rh-cue>
<rh-cue start="05:25" end="05:28">draw bounding boxes around regions of interest</rh-cue>
<rh-cue start="05:29" end="05:32">and say, you know, this is a defect or this is not a defect.</rh-cue>
<rh-cue start="05:32" end="05:37">And also, is this worker wearing a safety hat or did they forget to put it on?</rh-cue>
<rh-cue start="05:37" end="05:41">And so you can take this and integrate it into a pipeline</rh-cue>
<rh-cue start="05:41" end="05:44">where you're triggering an alert if somebody forgets</rh-cue>
<rh-cue start="05:44" end="05:49">to wear their safety mask or if there's a defect in a product</rh-cue>
<rh-cue start="05:49" end="05:53">on an assembly line, you can just use cameras and open</rh-cue>
<rh-cue start="05:53" end="05:58">vino and open CV running these on Intel hardware and help to analyze.</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney"></rh-cue>
<rh-cue start="05:58" end="06:01">And that's what a lot of the partners that we work with are doing.</rh-cue>
<rh-cue start="06:01" end="06:03">So these independent software vendors</rh-cue>
<rh-cue start="06:03" end="06:06">and there's other use cases for things like retail.</rh-cue>
<rh-cue start="06:06" end="06:10">You think about going to a store and using an automated checkout system.</rh-cue>
<rh-cue start="06:11" end="06:13">You know, sometimes people use those automated checkouts</rh-cue>
<rh-cue start="06:13" end="06:17">and they they slide a few extra items into their bag that they don't scan.</rh-cue>
<rh-cue start="06:17" end="06:21">And it's a huge loss for the retail outlets</rh-cue>
<rh-cue start="06:21" end="06:25">that are providing this way to check out real time shelf monitoring.</rh-cue>
<rh-cue start="06:25" end="06:29">We have this bear on one of our is fees that helps keep store shelves</rh-cue>
<rh-cue start="06:29" end="06:33">stocked by just analyzing the cameras in the stores, detecting</rh-cue>
<rh-cue start="06:33" end="06:37">when objects are missing from the shelves so that they can be restocked.</rh-cue>
<rh-cue start="06:37" end="06:41">We have Vistry, another ISP that works with quick service restaurants.</rh-cue>
<rh-cue start="06:41" end="06:44">So when you think about automating the process of</rh-cue>
<rh-cue start="06:44" end="06:48">when do I drop the fries into the fryer so that they're warm</rh-cue>
<rh-cue start="06:48" end="06:50">when the car gets to the drive thru window,</rh-cue>
<rh-cue start="06:50" end="06:54">you know, there's quite a bit of industrial health care retail examples</rh-cue>
<rh-cue start="06:54" end="06:57">that we can walk through and we should dig into some more of those.</rh-cue>
<rh-cue start="06:57" voice="Burr Sutter"></rh-cue>
<rh-cue start="06:57" end="06:59">But I got to tell you, I have I have a personal experience</rh-cue>
<rh-cue start="06:59" end="06:59">in this category</rh-cue>
<rh-cue start="06:59" end="07:01">that I want to share with, and you can tell me how</rh-cue>
<rh-cue start="07:01" end="07:03">how silly you might think at this point in time.</rh-cue>
<rh-cue start="07:03" end="07:04">It is.</rh-cue>
<rh-cue start="07:04" end="07:08">We actually built an AI keynote demonstration for the Red Hat big stage</rh-cue>
<rh-cue start="07:08" end="07:12">back in 2015, and I really want to illustrate the concept of asset tracking.</rh-cue>
<rh-cue start="07:12" end="07:15">So we actually gave everybody in the conference a little Bluetooth token,</rh-cue>
<rh-cue start="07:16" end="07:19">but a little battery, a little watch battery and a little Bluetooth emitter.</rh-cue>
<rh-cue start="07:19" end="07:22">And we basically tracked those things around the conference.</rh-cue>
<rh-cue start="07:22" end="07:24">We basically put a Raspberry Pi in each of the meeting rooms</rh-cue>
<rh-cue start="07:24" end="07:26">and up in the lunch room, and you could see how the tokens</rh-cue>
<rh-cue start="07:26" end="07:30">moved from room to room to room as a relatively simple application.</rh-cue>
<rh-cue start="07:30" voice="Burr Sutter"></rh-cue>
<rh-cue start="07:30" end="07:32">But it occurred to me after we figured out,</rh-cue>
<rh-cue start="07:32" end="07:34">okay, how to do that with Bluetooth and triangulating</rh-cue>
<rh-cue start="07:34" end="07:39">Bluetooth signals by looking at relative signal strength from one radio to another</rh-cue>
<rh-cue start="07:39" end="07:42">and putting that through an Apache Spark application at the time,</rh-cue>
<rh-cue start="07:42" end="07:45">we then realized, you know what, this is easier done with cameras</rh-cue>
<rh-cue start="07:45" end="07:49">and just simply looking at a camera and having some form of animal</rh-cue>
<rh-cue start="07:49" end="07:51">or machine learning model that would say, Oh,</rh-cue>
<rh-cue start="07:51" end="07:53">there are people here now are there are no people here now.</rh-cue>
<rh-cue start="07:53" end="07:55">What do you think about that?</rh-cue>
<rh-cue start="07:55" voice="Ryan Loney"></rh-cue>
<rh-cue start="07:55" end="07:59">Yeah, I mean, what you just described is sort of exactly that the product</rh-cue>
<rh-cue start="07:59" end="08:02">that either one of our partners is offering,</rh-cue>
<rh-cue start="08:02" end="08:04">you know, that they're doing it with computer vision and cameras.</rh-cue>
<rh-cue start="08:04" end="08:08">So when partner tries to help retail stores</rh-cue>
<rh-cue start="08:08" end="08:12">analyze the foot traffic and understand with Heatmaps,</rh-cue>
<rh-cue start="08:12" end="08:16">where people are spending the most time in stores, how many people are coming</rh-cue>
<rh-cue start="08:16" end="08:19">in, what size groups are coming into the store,</rh-cue>
<rh-cue start="08:19" end="08:23">you know, and trying to help understand if there was a successful transaction</rh-cue>
<rh-cue start="08:23" end="08:27">from the people who entered the store and left the store so that you can,</rh-cue>
<rh-cue start="08:27" end="08:30">you know, to help with the, you know, retail analytics</rh-cue>
<rh-cue start="08:30" end="08:33">and marketing sales and positioning of products.</rh-cue>
<rh-cue start="08:34" end="08:37">And so they're doing that in a way that also protects privacy.</rh-cue>
<rh-cue start="08:37" end="08:38">And that's something that's really important.</rh-cue>
<rh-cue start="08:38" end="08:41">So when you talked about those Bluetooth beacons, probably,</rh-cue>
<rh-cue start="08:41" end="08:44">
you know, if everyone who walked into a grocery store was asked
</rh-cue>
<rh-cue start="08:44" end="08:49">
to put a tracking device in their cart or on their person and say, you know,
</rh-cue>
<rh-cue start="08:49" end="08:50">
you're going
</rh-cue>
<rh-cue start="08:50" end="08:53">
to be tracked around the store, they probably wouldn't want to do that.
</rh-cue>
<rh-cue start="08:53" end="08:56">
The way that you can do this with cameras is you can,
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney"></rh-cue>
<rh-cue start="08:56" end="09:01">
you know, detect people as they enter and, you know, remove their face.
</rh-cue>
<rh-cue start="09:01" end="09:01">
Right.
</rh-cue>
<rh-cue start="09:01" end="09:05">
So you can ignore any biometric information
</rh-cue>
<rh-cue start="09:05" end="09:08">
and and just track the person based on pixels
</rh-cue>
<rh-cue start="09:09" end="09:12">
that are present in the detected region of interest.
</rh-cue>
<rh-cue start="09:12" end="09:15">
So they're able to analyze, say, a family walks in the door
</rh-cue>
<rh-cue start="09:16" end="09:20">
and they can group those people together with object detection
</rh-cue>
<rh-cue start="09:20" end="09:23">
and then they can track their movement throughout the store
</rh-cue>
<rh-cue start="09:23" end="09:26">
without keeping track of their face or any biometric
</rh-cue>
<rh-cue start="09:26" end="09:30">
or any personal identifiable information to avoid things like bias
</rh-cue>
<rh-cue start="09:30" end="09:35">
and to make sure that they're protecting the privacy of the shoppers in the store
</rh-cue>
<rh-cue start="09:35" end="09:39">
while still getting that really useful marketing analytics data rate
</rh-cue>
<rh-cue start="09:39" end="09:42">
so that they can make better decisions about where to place their products.
</rh-cue>
<rh-cue start="09:42" end="09:45">
So that's one really good example of how
</rh-cue>
<rh-cue start="09:45" end="09:48">
computer vision AI with open vino is being used today.
</rh-cue>
<rh-cue start="09:48" voice="Burr Sutter"></rh-cue>
<rh-cue start="09:48" end="09:51">
And that is a great example because you're definitely spot on.
</rh-cue>
<rh-cue start="09:51" end="09:53">
It is invasive when you hand someone a Bluetooth device and
</rh-cue>
<rh-cue start="09:53" end="09:56">
say, please keep this with you as you go throughout our store
</rh-cue>
<rh-cue start="09:56" end="09:59">
or our mall or throughout our hospital, wherever you might be.
</rh-cue>
<rh-cue start="09:59" end="10:01">
Now, you mentioned another example earlier
</rh-cue>
<rh-cue start="10:01" end="10:03">
in the conversation which was related to like worker safety.
</rh-cue>
<rh-cue start="10:03" end="10:05">
Are they wearing a helmet?
</rh-cue>
<rh-cue start="10:05" end="10:08">
I want to talk more about that concept in a real industrial setting,
</rh-cue>
<rh-cue start="10:08" end="10:11">
a manufacturing setting where there might be a factory floor
</rh-cue>
<rh-cue start="10:11" end="10:13">
and there are certain requirements, or better yet, there's like a
</rh-cue>
<rh-cue start="10:13" end="10:16">
a quality assurance requirement, let's say, when it comes to looking
</rh-cue>
<rh-cue start="10:16" end="10:20">
at a factory line. I run into that use case often with some of our customers.
</rh-cue>
<rh-cue start="10:20" end="10:22">
Can you talk more about those kinds of use cases? Yeah.
</rh-cue>
<rh-cue start="10:22" voice="Ryan Loney"></rh-cue>
<rh-cue start="10:22" end="10:27">
So one of our partners, Robotron, we, you know, published a case study
</rh-cue>
<rh-cue start="10:27" end="10:31">
I think last year where they're working with BMW at one of their factories
</rh-cue>
<rh-cue start="10:32" end="10:35">
and they do quality control inspection, but they're also doing
</rh-cue>
<rh-cue start="10:35" end="10:38">
things related to worker safety and analyzing.
</rh-cue>
<rh-cue start="10:38" end="10:40">
You know, I used the safety hat example.
</rh-cue>
<rh-cue start="10:40" end="10:45">
There's a number of our ISVs and partners who have similar use cases.
</rh-cue>
<rh-cue start="10:45" end="10:48">
And it comes down to there's a few reasons
</rh-cue>
<rh-cue start="10:48" end="10:51">
that are motivating this and some are related to like insurance, right?
</rh-cue>
<rh-cue start="10:51" end="10:53">
It's important to make sure that
</rh-cue>
<rh-cue start="10:53" end="10:56">
if you want to have your factory insured and that your workers
</rh-cue>
<rh-cue start="10:56" end="10:58">
are protecting themselves and wearing the gear.
</rh-cue>
<rh-cue start="10:58" end="11:00">
Regulatory compliance. Right.
</rh-cue>
<rh-cue start="11:00" end="11:05">
You're you're being asked to properly protect from exposure to chemicals or,
</rh-cue>
<rh-cue start="11:05" end="11:09">
you know, potentially having something fall and and hit someone on the head.
</rh-cue>
<rh-cue start="11:09" end="11:13">
So wearing a safety vest, wearing goggles, wearing a helmet,
</rh-cue>
<rh-cue start="11:14" end="11:17">
these are things that you need to do inside the factory.
</rh-cue>
<rh-cue start="11:17" end="11:21">
And you can really easily automate and detect and sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney"></rh-cue>
<rh-cue start="11:21" end="11:26">
I think that's one of the interesting things about the Robotron-BMW example
</rh-cue>
<rh-cue start="11:26" end="11:31">
is that they were also blurring, sort of blocking out, drawing a box
</rh-cue>
<rh-cue start="11:31" end="11:35">
to cover the face of the workers in the factory
</rh-cue>
<rh-cue start="11:35" end="11:38">
so that somebody who was analyzing the video footage
</rh-cue>
<rh-cue start="11:38" end="11:43">
and getting the alerts saying that, hey, you know, Bay 21 has a worker
</rh-cue>
<rh-cue start="11:43" end="11:47">
without a hat on, that it's not sending their face
</rh-cue>
<rh-cue start="11:47" end="11:50">
in the alert and potentially, you know, invading
</rh-cue>
<rh-cue start="11:50" end="11:54">
or going against privacy laws or just the ethics of the company.
</rh-cue>
<rh-cue start="11:54" end="11:54">
Right.
</rh-cue>
<rh-cue start="11:54" end="11:58">
They don't want to introduce bias or have people targeted because
</rh-cue>
<rh-cue start="11:58" end="12:02">
it's much better to have it, you know, blur the face
</rh-cue>
<rh-cue start="12:02" end="12:06">
and alert and have somebody take care of it on the floor.
</rh-cue>
<rh-cue start="12:06" end="12:09">
And then if you ever need to audit that information later,
</rh-cue>
<rh-cue start="12:09" end="12:12">
they have a way to do it where people who need to be able to see
</rh-cue>
<rh-cue start="12:12" end="12:17">
who the employee was and look up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney"></rh-cue>
<rh-cue start="12:17" end="12:20">
But then just for the purposes of maintaining safety,
</rh-cue>
<rh-cue start="12:20" end="12:21">
they don't need to have access
</rh-cue>
<rh-cue start="12:21" end="12:24">
to that personal information or biometric information,
</rh-cue>
<rh-cue start="12:25" end="12:28">
because that's one thing that when you hear about computer vision
</rh-cue>
<rh-cue start="12:28" end="12:31">
or object person tracking, object detection,
</rh-cue>
<rh-cue start="12:32" end="12:36">
there's a lot of concern, and rightfully so, about privacy
</rh-cue>
<rh-cue start="12:36" end="12:40">
being invaded and about tracking information, face ID,
</rh-cue>
<rh-cue start="12:41" end="12:45">
identifying people who may have committed crimes through video footage.
</rh-cue>
<rh-cue start="12:45" end="12:48">
And that's just not something that a lot of companies want to
</rh-cue>
<rh-cue start="12:49" end="12:51">
you know, they want to protect privacy
</rh-cue>
<rh-cue start="12:51" end="12:52">
and they don't want to be in a situation
</rh-cue>
<rh-cue start="12:52" end="12:55">
where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter"></rh-cue>
<rh-cue start="12:56" end="12:58">
Well, privacy is certainly opening up Pandora's box.
</rh-cue>
<rh-cue start="12:58" end="13:00">
There's a lot to be explored in that area,
</rh-cue>
<rh-cue start="13:00" end="13:02">
especially in a digital world that we now live in.
</rh-cue>
<rh-cue start="13:02" end="13:05">
But for now, let's move on and explore a different area.
</rh-cue>
<rh-cue start="13:05" end="13:08">
I'm interested in how machines and computers offer advantages,
</rh-cue>
<rh-cue start="13:08" end="13:12">
specifically in certain use cases like a quality control scenario.
</rh-cue>
<rh-cue start="13:12" end="13:15">
I asked Ryan to explain how AI/ML, and specifically machines and
</rh-cue>
<rh-cue start="13:15" end="13:18">
computers, can augment that capability.
</rh-cue>
<rh-cue start="13:19" voice="Ryan Loney"></rh-cue>
<rh-cue start="13:19" end="13:22">
I can give a specific example where we have a partner
</rh-cue>
<rh-cue start="13:22" end="13:25">
that's doing defect detection
</rh-cue>
<rh-cue start="13:25" end="13:28">
and looking for anomalies in batteries.
</rh-cue>
<rh-cue start="13:28" end="13:31">
So, you know, I'm sure you've heard there's a lot of interest right now
</rh-cue>
<rh-cue start="13:31" end="13:34">
in electric vehicles, a lot of batteries being produced.
</rh-cue>
<rh-cue start="13:34" end="13:36">
And so if you go into one of these factories,
</rh-cue>
<rh-cue start="13:36" end="13:40">
they have images that they collect of every battery that's going through this
</rh-cue>
<rh-cue start="13:40" end="13:44">
assembly line and through these images, people
</rh-cue>
<rh-cue start="13:44" end="13:47">
can look and see and visually inspect with their eyes and say,
</rh-cue>
<rh-cue start="13:48" end="13:50">
this battery has a defect, send it back.
</rh-cue>
<rh-cue start="13:50" end="13:53">
And that's one step in the quality control process.
</rh-cue>
<rh-cue start="13:53" end="13:58">
And there's other steps, I'm sure, like running diagnostic tests and, you know,
</rh-cue>
<rh-cue start="13:58" end="14:02">
measuring voltage and doing other types of non-visual inspection.
</rh-cue>
<rh-cue start="14:02" end="14:06">
But for the visual inspection piece where you can really easily identify
</rh-cue>
<rh-cue start="14:06" end="14:10">
some problems, it's much more efficient to introduce computer vision.
</rh-cue>
<rh-cue start="14:11" end="14:14">
And so that's where we have this new library that we've introduced
</rh-cue>
<rh-cue start="14:14" end="14:17">
called Anomalib, that's part of OpenVINO.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney"></rh-cue>
<rh-cue start="14:17" end="14:20">
While we're focused on inference, you know, we're also thinking
</rh-cue>
<rh-cue start="14:20" end="14:25">
about the pipeline or the funnel that gets these models to OpenVINO.
</rh-cue>
<rh-cue start="14:25" end="14:28">
And so we've invested in this anomaly segmentation,
</rh-cue>
<rh-cue start="14:28" end="14:32">
anomaly detection library that we've recently open sourced,
</rh-cue>
<rh-cue start="14:32" end="14:35">
and there's a great research paper about it, about Anomalib.
</rh-cue>
<rh-cue start="14:36" end="14:39">
But the idea is you can take just a few images
</rh-cue>
<rh-cue start="14:39" end="14:43">
and train a model and start detecting these defects.
</rh-cue>
<rh-cue start="14:43" end="14:46">
And so for this battery example, that's a more advanced example.
</rh-cue>
<rh-cue start="14:46" end="14:52">
But to make it simpler, you know, take some bolts and, you know, take ten bolts.
</rh-cue>
<rh-cue start="14:52" end="14:55">
You have one that has a scratch on it or one that is chipped
</rh-cue>
<rh-cue start="14:56" end="14:58">
or has some damage to it.
</rh-cue>
<rh-cue start="14:58" end="15:00">
And you can easily get started in training
</rh-cue>
<rh-cue start="15:00" end="15:03">
to recognize the bolts that do not have an anomaly.
</rh-cue>
<rh-cue start="15:04" end="15:06">
And the ones that do, which is a small data set
</rh-cue>
<rh-cue start="15:06" end="15:10">
and I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney"></rh-cue>
<rh-cue start="15:11" end="15:14">
Challenges: one is access to data, but the other is
</rh-cue>
<rh-cue start="15:14" end="15:17">
needing a massive amount of data to do something meaningful.
</rh-cue>
<rh-cue start="15:18" end="15:22">
And so we're starting to try to change that dynamic with Anomalib.
</rh-cue>
<rh-cue start="15:22" end="15:27">
So you may not need 100,000 images, you may need 100 images,
</rh-cue>
<rh-cue start="15:27" end="15:33">
and you can start detecting anomalies in everything from batteries to bolts to,
</rh-cue>
<rh-cue start="15:33" end="15:37">
you know, maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter"></rh-cue>
<rh-cue start="15:37" end="15:40">
That is a very key point because often in that data scientist
</rh-cue>
<rh-cue start="15:40" end="15:43">
process, that data engineer and data scientist process, right.
</rh-cue>
<rh-cue start="15:43" end="15:44">
The one key thing is can you gather
</rh-cue>
<rh-cue start="15:44" end="15:47">
the data that you need for the input for the model training?
</rh-cue>
<rh-cue start="15:47" end="15:49">
And we've often said, at least people I've worked
</rh-cue>
<rh-cue start="15:49" end="15:52">
with over the last couple of years, you know, you need a lot of data.
</rh-cue>
<rh-cue start="15:52" end="15:55">
You need tens of thousands of correct images
</rh-cue>
<rh-cue start="15:55" end="15:58">
so we can sort out the difference between dogs versus cats, let's say,
</rh-cue>
<rh-cue start="15:58" end="16:01">
or you need dozens and dozens of situations
</rh-cue>
<rh-cue start="16:01" end="16:03">
where if it's a natural language processing scenario,
</rh-cue>
<rh-cue start="16:03" end="16:06">
you know, a good customer interaction, a good customer conversation,
</rh-cue>
<rh-cue start="16:06" end="16:07">
and in this case,
</rh-cue>
<rh-cue start="16:07" end="16:11">
it sounds like what you're saying is show us just the bad things, right?
</rh-cue>
<rh-cue start="16:11" end="16:14">
Fewer images, fewer incorrect things,
</rh-cue>
<rh-cue start="16:14" end="16:17">
and then let us look for those kind of anomalies.
</rh-cue>
<rh-cue start="16:18" end="16:20">
Can you tell us more about that? Because that is very interesting.
</rh-cue>
<rh-cue start="16:20" end="16:23">
The concept that I can use a much smaller dataset as my input
</rh-cue>
<rh-cue start="16:23" end="16:26">
as opposed to gathering terabytes of data in some cases
</rh-cue>
<rh-cue start="16:26" end="16:29">
to just simply get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney"></rh-cue>
<rh-cue start="16:30" end="16:34">
You know, like you described, the idea is if you have some good images
</rh-cue>
<rh-cue start="16:34" end="16:37">
and then you have some of the known defects
</rh-cue>
<rh-cue start="16:38" end="16:41">
and you can just label here's a set of good images
</rh-cue>
<rh-cue start="16:41" end="16:44">
and here's a few of the defects and you can right away
</rh-cue>
<rh-cue start="16:44" end="16:48">
start detecting those specific defects that you've identified.
</rh-cue>
<rh-cue start="16:48" end="16:49">
And then also, you know, be able to
</rh-cue>
<rh-cue start="16:50" end="16:53">
determine when it doesn't match
</rh-cue>
<rh-cue start="16:53" end="16:57">
the expected appearance of a non defective item.
</rh-cue>
<rh-cue start="16:57" end="17:00">
So if I have the undamaged screw and then I introduce
</rh-cue>
<rh-cue start="17:00" end="17:03">
one with some new anomaly that's never been seen before,
</rh-cue>
<rh-cue start="17:04" end="17:07">
I can say, you know, this one is not a valid screw.
</rh-cue>
<rh-cue start="17:07" end="17:11">
And so that's sort of the approach that we're taking.
</rh-cue>
<rh-cue start="17:11" end="17:15">
And it's really important because so often you need to have
</rh-cue>
<rh-cue start="17:15" end="17:19">
subject matter experts. Often, like if you take the battery example,
</rh-cue>
<rh-cue start="17:20" end="17:23">
there's these workers who are on the floor
</rh-cue>
<rh-cue start="17:23" end="17:27">
in a factory and they're the ones who know best when they look at these images,
</rh-cue>
<rh-cue start="17:28" end="17:31">
which one's going to have an issue, which one's defective?
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney"></rh-cue>
<rh-cue start="17:31" end="17:34">
And then they also need to take that subject matter expertise
</rh-cue>
<rh-cue start="17:35" end="17:38">
and then use it to annotate data sets.
</rh-cue>
<rh-cue start="17:38" end="17:39">
And when you have these, you know,
</rh-cue>
<rh-cue start="17:39" end="17:43">
tens of thousands of images you need to annotate, it's asking those people
</rh-cue>
<rh-cue start="17:43" end="17:47">
to stop working on the factory floor so they can come annotate some images.
</rh-cue>
<rh-cue start="17:47" end="17:49">
That's a tough business call to make, right?
</rh-cue>
<rh-cue start="17:49" end="17:53">
But if you only need them to annotate a handful of images, it's a much easier
</rh-cue>
<rh-cue start="17:53" end="17:56">
ask to get the ball rolling and demonstrate value.
</rh-cue>
<rh-cue start="17:56" end="17:59">
And maybe over time you will want to annotate more
</rh-cue>
<rh-cue start="17:59" end="18:03">
and more images because you'll get even better accuracy in the model.
</rh-cue>
<rh-cue start="18:03" end="18:07">
Even better, even if it's just small incremental improvements.
</rh-cue>
<rh-cue start="18:08" end="18:11">
You know, that's something that if it generates value for the business,
</rh-cue>
<rh-cue start="18:11" end="18:14">
it's something the business will invest in over time.
</rh-cue>
<rh-cue start="18:14" end="18:17">
But you have to convince the decision makers that it's worth
</rh-cue>
<rh-cue start="18:17" end="18:22">
the time of these subject matter experts to stop what they're doing
</rh-cue>
<rh-cue start="18:22" end="18:26">
and go and label some images of the things that they're working on in the factory.
</rh-cue>
<rh-cue start="18:26" voice="Burr Sutter"></rh-cue>
<rh-cue start="18:26" end="18:30">
And that labeling process, the annotations, can be very labor intensive,
</rh-cue>
<rh-cue start="18:30" end="18:33">
basically saying what is correct, what's wrong, what is this, what is that?
</rh-cue>
<rh-cue start="18:33" end="18:36">
And therefore, if we can minimize that time frame to get the value quicker,
</rh-cue>
<rh-cue start="18:36" end="18:40">
then there's something that's useful for the business, useful for the organization
</rh-cue>
<rh-cue start="18:40" end="18:41">
long before we necessarily go do
</rh-cue>
<rh-cue start="18:41" end="18:43">
a huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter"></rh-cue>
<rh-cue start="18:49" end="18:52">
So we talked about labeling and how that is a labor-intensive activity.
</rh-cue>
<rh-cue start="18:52" end="18:54">
But I love the idea of helping the human
</rh-cue>
<rh-cue start="18:54" end="18:57">
and helping the human, specifically, not get bored.
</rh-cue>
<rh-cue start="18:57" end="19:01">
Basically, if the human is eyeballing a bunch of widgets flying by over time,
</rh-cue>
<rh-cue start="19:01" end="19:03">
they make mistakes, they get bored
</rh-cue>
<rh-cue start="19:03" end="19:06">
and they don't pay as close attention as they should.
</rh-cue>
<rh-cue start="19:07" end="19:10">
That's why the concept of AI/ML, specifically computer vision, augmenting
</rh-cue>
<rh-cue start="19:10" end="19:14">
that capability and really helping the human identify anomalies faster,
</rh-cue>
<rh-cue start="19:14" end="19:17">
more quickly, maybe with greater accuracy could be a big win.
</rh-cue>
<rh-cue start="19:18" end="19:21">
We focused on manufacturing, but let's actually go into health care
</rh-cue>
<rh-cue start="19:21" end="19:24">
and learn how these tools can be used in that sector and that industry.
</rh-cue>
<rh-cue start="19:24" end="19:28">
Ryan talked to me about how the OpenVINO runtime can be incorporated into medical
</rh-cue>
<rh-cue start="19:28" end="19:32">
imaging equipment with Intel processors, be that in CT,
</rh-cue>
<rh-cue start="19:32" end="19:34">
MRI and ultrasound machines.
</rh-cue>
<rh-cue start="19:34" end="19:37">
Well, these inferences, this AI/ML workload, can be operating
</rh-cue>
<rh-cue start="19:37" end="19:41">
and executing right there in the same physical room as the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney"></rh-cue>
<rh-cue start="19:44" end="19:46">
We did a presentation with GE last year.
</rh-cue>
<rh-cue start="19:46" end="19:47">
I think they said
</rh-cue>
<rh-cue start="19:47" end="19:50">
there's at least 80 countries that have their X-ray machines deployed
</rh-cue>
<rh-cue start="19:51" end="19:56">
and they're doing things like helping doctors place breathing tubes in patients.
</rh-cue>
<rh-cue start="19:56" end="20:01">
So during COVID, during the pandemic, that was a really important tool
</rh-cue>
<rh-cue start="20:01" end="20:05">
to help with nurses and doctors who were intubating patients
</rh-cue>
<rh-cue start="20:05" end="20:09">
sometimes like in a parking lot or a hallway of the hospital.
</rh-cue>
<rh-cue start="20:09" end="20:14">
And, you know, they had a statistic that said, I think one out of four
</rh-cue>
<rh-cue start="20:14" end="20:17">
breathing tubes gets placed incorrectly
</rh-cue>
<rh-cue start="20:17" end="20:19">
when you're doing it outside the operating room,
</rh-cue>
<rh-cue start="20:19" end="20:22">
because when you're in an operating room, it's much more controlled
</rh-cue>
<rh-cue start="20:22" end="20:24">
and there's someone who's an expert at placing the tubes.
</rh-cue>
<rh-cue start="20:24" end="20:28">
It's somewhere you have more of a controlled environment
</rh-cue>
<rh-cue start="20:28" end="20:31">
than when you're out in a parking lot, in a tent.
</rh-cue>
<rh-cue start="20:31" end="20:34">
You know, when the hospital's completely full and you're triaging patients
</rh-cue>
<rh-cue start="20:34" end="20:37">
with COVID, that's when they're more likely to make mistakes.
</rh-cue>
<rh-cue start="20:37" end="20:40">
And so they had this endotracheal tube placement
</rh-cue>
<rh-cue start="20:42" end="20:43">
model that they trained,
</rh-cue>
<rh-cue start="20:43" end="20:47">
and it helps to use an X-ray and give an alert and say, hey,
</rh-cue>
<rh-cue start="20:47" end="20:50">
this tube is placed wrong, pull it out and do it again.
</rh-cue>
<rh-cue start="20:50" end="20:53">
And so things like that help doctors so that they can avoid mistakes.
</rh-cue>
<rh-cue start="20:54" end="20:57">
And, you know, having a breathing tube placed incorrectly
</rh-cue>
<rh-cue start="20:57" end="21:01">
can cause collapsed lung and a number of other unwanted side effects.
</rh-cue>
<rh-cue start="21:01" end="21:03">
So it's really important to do it correctly.
</rh-cue>
<rh-cue start="21:03" end="21:06">
Another example is Samsung Medison.
</rh-cue>
<rh-cue start="21:06" end="21:10">
They're actually estimating fetal angle of progression.
</rh-cue>
<rh-cue start="21:10" end="21:13">
So this is analyzing ultrasound
</rh-cue>
<rh-cue start="21:14" end="21:18">
of pregnant women and, with that, being able to help take measurements
</rh-cue>
<rh-cue start="21:18" end="21:22">
that are usually hard to calculate that can be done in an automated way.
</rh-cue>
<rh-cue start="21:22" end="21:26">
They're already taking the ultrasound scan and now they're executing this model.
</rh-cue>
<rh-cue start="21:26" end="21:31">
They can take some of these measurements to help the doctor avoid potentially more
</rh-cue>
<rh-cue start="21:31" end="21:34">
intrusive alternative methods so the patient wins.
</rh-cue>
<rh-cue start="21:35" end="21:36">
It makes their life better.
</rh-cue>
<rh-cue start="21:36" end="21:39">
And the doctor is getting help from this A.I.
</rh-cue>
<rh-cue start="21:39" end="21:40">
model.
</rh-cue>
<rh-cue start="21:40" end="21:42">
And those are, you know, just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter"></rh-cue>
<rh-cue start="21:42" end="21:45">
Those are some amazing examples when it comes to all these things.
</rh-cue>
<rh-cue start="21:45" end="21:49">
We're talking like CT scans, right, and x rays, other examples of computer vision.
</rh-cue>
<rh-cue start="21:49" end="21:52">
One thing that's kind of interesting in the space, I think
</rh-cue>
<rh-cue start="21:52" end="21:56">
whenever I get a chance to work on, let's say, an object detection model
</rh-cue>
<rh-cue start="21:56" end="21:58">
in one of our workshops, by the way, we're actually putting that out
</rh-cue>
<rh-cue start="21:58" end="22:01">
in front of people to say, Hey, look, you can use your phone.
</rh-cue>
<rh-cue start="22:01" end="22:04">
And it basically sends the image over to our OpenShift, right,
</rh-cue>
<rh-cue start="22:04" end="22:07">
with our data science platform and then analyzes what you see.
</rh-cue>
<rh-cue start="22:08" end="22:09">
And even in my case, where I take a picture of my dog
</rh-cue>
<rh-cue start="22:09" end="22:13">
as an example, it can't really decide is it a dog or a cat?
</rh-cue>
<rh-cue start="22:13" end="22:15">
I have a very funny looking dog,
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter"></rh-cue>
<rh-cue start="22:15" end="22:18">
and so there's always a percentage outcome, you know?
</rh-cue>
<rh-cue start="22:18" end="22:21">
In other words, I think it's a dog 52%.
</rh-cue>
<rh-cue start="22:21" end="22:22">
So I want to talk about that more.
</rh-cue>
<rh-cue start="22:22" end="22:25">
How important is it to get to 100% accuracy?
</rh-cue>
<rh-cue start="22:25" end="22:29">
How important is it to really, depending on the use case, to allow
</rh-cue>
<rh-cue start="22:29" end="22:34">
for the gray area, if you will, where it's an 80% accuracy or 70% accuracy?
</rh-cue>
<rh-cue start="22:34" end="22:36">
And where are the trade offs there associated with the application?
</rh-cue>
<rh-cue start="22:36" end="22:38">
Can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney"></rh-cue>
<rh-cue start="22:38" end="22:40">
Accuracy is definitely, you know, a touchy subject
</rh-cue>
<rh-cue start="22:40" end="22:43">
because how you measure it makes a huge difference.
</rh-cue>
<rh-cue start="22:43" end="22:46">
And then I think with like what you were describing with the dog example, there's
</rh-cue>
<rh-cue start="22:46" end="22:51">
sort of a top five potential classes that may be identified.
</rh-cue>
<rh-cue start="22:51" end="22:55">
So let's say you're doing object detection and you detect a region of interest
</rh-cue>
<rh-cue start="22:55" end="22:57">
and it says 65% confidence.
</rh-cue>
<rh-cue start="22:57" end="22:58">
This is a dog.
</rh-cue>
<rh-cue start="22:58" end="23:03">
Well, the next potential label that could be maybe 50% confidence
</rh-cue>
<rh-cue start="23:03" end="23:08">
or 20% confidence might be something similar to a dog or in the case of models
</rh-cue>
<rh-cue start="23:08" end="23:11">
that have been trained on, like, the ImageNet dataset
</rh-cue>
<rh-cue start="23:11" end="23:15">
or on the COCO dataset, they have, like, actual breeds of dogs.
</rh-cue>
<rh-cue start="23:15" end="23:20">
So if I want to look at the top five labels for a dog,
</rh-cue>
<rh-cue start="23:20" end="23:24">
for my dog, for example, she's a mix, mostly Labrador retriever.
</rh-cue>
<rh-cue start="23:24" voice="Ryan Loney"></rh-cue>
<rh-cue start="23:24" end="23:29">
But I may look at the top five labels and it may say 65% confidence that she's
</rh-cue>
<rh-cue start="23:29" end="23:34">
a flat coated retriever and then confidence that she's a husky,
</rh-cue>
<rh-cue start="23:34" end="23:39">
at, you know, 20% and then 5% confidence that she's a Greyhound or something.
</rh-cue>
<rh-cue start="23:40" end="23:42">
Those labels, all of them are dogs.
</rh-cue>
<rh-cue start="23:42" end="23:45">
So if I'm just trying to figure out is, is this a dog,
</rh-cue>
<rh-cue start="23:45" end="23:50">
I could probably find all of the, you know, classes within the data set
</rh-cue>
<rh-cue start="23:50" end="23:53">
and say, well, these are all, you know, class ID
</rh-cue>
<rh-cue start="23:53" end="24:00">
65, 132, 92 and 158 all belong to a group of dogs.
</rh-cue>
<rh-cue start="24:00" end="24:04">
So if I wanted to just write an application to tell me if this is a dog
</rh-cue>
<rh-cue start="24:04" end="24:07">
or not, I would probably use that to determine if it's a dog.
</rh-cue>
<rh-cue start="24:08" end="24:10">
But how you measure that as accuracy,
</rh-cue>
<rh-cue start="24:10" end="24:11">
well, that's where it gets a little bit complicated,
</rh-cue>
<rh-cue start="24:11" end="24:15">
because if you're being really strict about the definition and you're
</rh-cue>
<rh-cue start="24:15" end="24:18">
trying to validate against the data set of labeled images
</rh-cue>
<rh-cue start="24:18" end="24:22">
and I have specific dog breeds or some specific detail
</rh-cue>
<rh-cue start="24:22" end="24:25">
and it doesn't match, well, then the accuracy is going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney"></rh-cue>
<rh-cue start="24:25" end="24:29">
That's especially important when we talk about things like compression
</rh-cue>
<rh-cue start="24:29" end="24:30">
and quantization,
</rh-cue>
<rh-cue start="24:30" end="24:34">
which, you know, historically has been difficult to get adoption
</rh-cue>
<rh-cue start="24:34" end="24:40">
in some domains like health care, where even the hint of accuracy going down
</rh-cue>
<rh-cue start="24:40" end="24:44">
implies that we're not going to be able to help in some small case,
</rh-cue>
<rh-cue start="24:44" end="24:47">
maybe if it's even half a percent of the time
</rh-cue>
<rh-cue start="24:47" end="24:51">
we won't detect that that tube is placed incorrectly or that, you know,
</rh-cue>
<rh-cue start="24:51" end="24:54">
that patient's, you know, lung has collapsed or something like that.
</rh-cue>
<rh-cue start="24:54" end="24:58">
And that's something that really prevents adoption of some of these methods
</rh-cue>
<rh-cue start="24:58" end="25:01">
that can really boost performance like quantization.
</rh-cue>
<rh-cue start="25:01" end="25:05">
But if you take that example of sort of different from the dog example
</rh-cue>
<rh-cue start="25:05" end="25:07">
and you think about like segmentation of kidneys.
</rh-cue>
<rh-cue start="25:07" end="25:11">
So if I'm doing kidney segmentation, which is, you know, taking a CT scan
</rh-cue>
<rh-cue start="25:12" end="25:14">
and then trying to pick the pixels out of that
</rh-cue>
<rh-cue start="25:14" end="25:17">
scan that belong to a kidney,
</rh-cue>
<rh-cue start="25:17" end="25:20">
how I measure accuracy may be
</rh-cue>
<rh-cue start="25:20" end="25:24">
how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney"></rh-cue>
<rh-cue start="25:25" end="25:29">
Missing some of the pixels is maybe not a problem, right,
</rh-cue>
<rh-cue start="25:29" end="25:33">
depending on how you built the application because you still detect the kidney
</rh-cue>
<rh-cue start="25:34" end="25:38">
and maybe you just need to apply padding around the region of interest
</rh-cue>
<rh-cue start="25:38" end="25:41">
so that you don't miss any of the actual kidney
</rh-cue>
<rh-cue start="25:42" end="25:45">
when you compress the model and when you quantize the model. But
</rh-cue>
<rh-cue start="25:45" end="25:50">
that requires, you know, a data scientist and an ML engineer, somebody to really
</rh-cue>
<rh-cue start="25:51" end="25:52">
they have to be
</rh-cue>
<rh-cue start="25:52" end="25:55">
able to go and apply that after the fact, after the inference
</rh-cue>
<rh-cue start="25:55" end="25:59">
happens to make sure that you're not losing critical information,
</rh-cue>
<rh-cue start="25:59" end="26:02">
because the next step from detecting the kidney may be detecting a tumor.
</rh-cue>
<rh-cue start="26:03" voice="Ryan Loney"></rh-cue>
<rh-cue start="26:03" end="26:06">
And so maybe you can use the more optimized model
</rh-cue>
<rh-cue start="26:06" end="26:11">
to detect the kidney, but then you can use a slower model to detect the tumor.
</rh-cue>
<rh-cue start="26:11" end="26:15">
But that also requires somebody to architect and make that decision
</rh-cue>
<rh-cue start="26:15" end="26:16">
or that tradeoff and say,
</rh-cue>
<rh-cue start="26:16" end="26:20">
well, I need to add padding, or I should only use the quantized model
</rh-cue>
<rh-cue start="26:20" end="26:24">
to detect the region of interest for the kidney and then use the model
</rh-cue>
<rh-cue start="26:24" end="26:27">
that takes longer to do the inference
</rh-cue>
<rh-cue start="26:27" end="26:30">
just to find the tumor, which is going to be of a smaller size.
</rh-cue>
<rh-cue start="26:30" end="26:33">
Right. The dimensions are going to be much smaller
</rh-cue>
<rh-cue start="26:33" end="26:35">
once we crop to the region of interest.
</rh-cue>
<rh-cue start="26:35" end="26:40">
But all of those details, that's maybe not easy to explain in a few sentences.
</rh-cue>
<rh-cue start="26:40" end="26:43">
And even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter"></rh-cue>
<rh-cue start="26:45" end="26:46">
I do love that use case.
</rh-cue>
<rh-cue start="26:46" end="26:47">
Like you mentioned, the cropping
</rh-cue>
<rh-cue start="26:47" end="26:50">
even in one such area that we worked on for another project,
</rh-cue>
<rh-cue start="26:50" end="26:53">
we specifically decided to pixelate the image that we had taken
</rh-cue>
<rh-cue start="26:53" end="26:57">
because we knew that we could get the outcome we wanted by even
</rh-cue>
<rh-cue start="26:57" end="27:01">
just using a smaller image, or having less resolution in our image.
</rh-cue>
<rh-cue start="27:01" end="27:04">
And therefore, as we transferred it from the mobile device, or storage device,
</rh-cue>
<rh-cue start="27:04" end="27:08">
up into the cloud, we wanted that smaller image just for transfer purposes
</rh-cue>
<rh-cue start="27:08" end="27:11">
and still we could get the accuracy we needed, by a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter"></rh-cue>
<rh-cue start="27:11" end="27:14">
And one thing that's interesting about that from my perspective is
</rh-cue>
<rh-cue start="27:15" end="27:18">
if you're doing image processing, sometimes it takes a while
</rh-cue>
<rh-cue start="27:18" end="27:20">
for this transaction to occur.
</rh-cue>
<rh-cue start="27:20" end="27:20">
Like I,
</rh-cue>
<rh-cue start="27:20" end="27:24">
I come from a traditional application background, you know, where I'm reading
</rh-cue>
<rh-cue start="27:24" end="27:25">
and writing things from a database
</rh-cue>
<rh-cue start="27:25" end="27:28">
or a message broker or moving data from one place to another.
</rh-cue>
<rh-cue start="27:28" end="27:29">
Those things happen subsecond.
</rh-cue>
<rh-cue start="27:29" end="27:33">
Normally, even with great latency between your data centers, you know,
</rh-cue>
<rh-cue start="27:33" end="27:34">
it's still subsecond
</rh-cue>
<rh-cue start="27:34" end="27:38">
in most cases. While a transaction like this one can actually take 2 seconds
</rh-cue>
<rh-cue start="27:38" end="27:42">
or 4 seconds as it's doing its analysis and actually coming back with, you know,
</rh-cue>
<rh-cue start="27:42" end="27:46">
I think it's a dog, I think it's a kidney, I think it's whatever, and provided me
</rh-cue>
<rh-cue start="27:46" end="27:48">
that accuracy statement.
</rh-cue>
<rh-cue start="27:48" end="27:51">
So that concept of optimization is very important
</rh-cue>
<rh-cue start="27:51" end="27:53">
in the overall application architecture.
</rh-cue>
<rh-cue start="27:53" end="27:56">
Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" end="27:56">
Yeah, definitely.
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney"></rh-cue>
<rh-cue start="27:56" end="27:58">
It depends too on the use case.
</rh-cue>
<rh-cue start="27:58" end="28:02">
So if you think about how important it is to reduce the latency
</rh-cue>
<rh-cue start="28:02" end="28:06">
and increase the number of frames per second that you can process when you're
</rh-cue>
<rh-cue start="28:06" end="28:10">
talking about a loss prevention model that's running at a grocery store.
</rh-cue>
<rh-cue start="28:10" end="28:13">
So you want to keep the lines moving.
</rh-cue>
<rh-cue start="28:13" end="28:16">
You don't want every person who's at the self-checkout
</rh-cue>
<rh-cue start="28:16" end="28:19">
to have to wait 5 seconds for every item they scan.
</rh-cue>
<rh-cue start="28:19" end="28:22">
You need it to happen as quickly as possible.
</rh-cue>
<rh-cue start="28:22" end="28:25">
And if sometimes, you know, the accuracy
</rh-cue>
<rh-cue start="28:25" end="28:28">
decreases slightly, or, I'd say, the accuracy of the whole pipeline.
</rh-cue>
<rh-cue start="28:28" end="28:32">
So not just looking at the individual model or the individual inference, but
</rh-cue>
<rh-cue start="28:32" end="28:36">
let's say that the whole pipeline is not as successful at detecting
</rh-cue>
<rh-cue start="28:37" end="28:40">
when somebody steals one item from the self-checkout,
</rh-cue>
<rh-cue start="28:41" end="28:43">
it's not going to be a life threatening situation.
</rh-cue>
<rh-cue start="28:43" end="28:47">
Whereas, you know, being hooked up to the X-ray machine
</rh-cue>
<rh-cue start="28:47" end="28:51">
with the tube placement model, they might be willing to have the doctor,
</rh-cue>
<rh-cue start="28:51" end="28:54">
the nurse wait 5 seconds to get the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney"></rh-cue>
<rh-cue start="28:55" end="28:58">
They don't need it to happen in 500 milliseconds.
</rh-cue>
<rh-cue start="28:58" end="29:02">
So they're willing their threshold for waiting is a little bit higher.
</rh-cue>
<rh-cue start="29:02" end="29:05">
So that, I think, also drives some of the decision, like
</rh-cue>
<rh-cue start="29:06" end="29:09">
you want to keep people moving through the checkout line
</rh-cue>
<rh-cue start="29:09" end="29:13">
and you can afford to, potentially, if you lose a little bit of accuracy here
</rh-cue>
<rh-cue start="29:13" end="29:14">
and there, it's not going to
</rh-cue>
<rh-cue start="29:14" end="29:18">
cost the company that much money or it's not going to be life threatening.
</rh-cue>
<rh-cue start="29:18" end="29:21">It's going to be worth the tradeoff of keeping the line moving</rh-cue>
<rh-cue start="29:21" end="29:24">and not having people leave the store and not check out at all.</rh-cue>
<rh-cue start="29:24" end="29:27">And to say, I'm not going to shop today because the line's too long.</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter"></rh-cue>
<rh-cue start="29:30" end="29:32">There are so many trade offs and enterprise</rh-cue>
<rh-cue start="29:32" end="29:35">AML use cases, things like latency, accuracy and availability.</rh-cue>
<rh-cue start="29:35" end="29:40">And certainly complexities abound, especially in an obviously ever evolving</rh-cue>
<rh-cue start="29:40" end="29:43">technological landscape where we are still very early in the adoption of AML.</rh-cue>
<rh-cue start="29:44" end="29:47">And to navigate that complexity, the direct feedback from real world</rh-cue>
<rh-cue start="29:47" end="29:51">end users is essential to Ryan and his team at Intel.</rh-cue>
<rh-cue start="29:52" end="29:54">What would you say are some of the big hurdles or big</rh-cue>
<rh-cue start="29:54" end="29:57">outcomes, big opportunities in that space?</rh-cue>
<rh-cue start="29:57" end="30:01">And do you agree that we're kind of still at the very beginning in our infancy,</rh-cue>
<rh-cue start="30:01" end="30:01">if you will,</rh-cue>
<rh-cue start="30:01" end="30:05">of adopting these technologies and and discovering what they can do for us?</rh-cue>
<rh-cue start="30:05" voice="Ryan Loney"></rh-cue>
<rh-cue start="30:05" end="30:07">Yeah, I think we're definitely in the infancy</rh-cue>
<rh-cue start="30:07" end="30:10">and I think that what we've seen is our customers are evolving</rh-cue>
<rh-cue start="30:10" end="30:14">and the people who are deploying on Intel hardware, they're trying to run</rh-cue>
<rh-cue start="30:14" end="30:16">more complicated models.</rh-cue>
<rh-cue start="30:16" end="30:19">They're the models that are doing object detection or, you know,</rh-cue>
<rh-cue start="30:19" end="30:22">detecting defects and, you know, doing segmentation.</rh-cue>
<rh-cue start="30:23" end="30:27">You know, in the past you could say, oh, here's a generic model that will do face</rh-cue>
<rh-cue start="30:27" end="30:31">detection or person detection or vehicle detection and license plate detection.</rh-cue>
<rh-cue start="30:32" end="30:33">And those are sort of like</rh-cue>
<rh-cue start="30:33" end="30:36">general purpose models that you can just grab off the shelf and use them.</rh-cue>
<rh-cue start="30:37" end="30:40">But now we're moving into like the anomaly scenarios</rh-cue>
<rh-cue start="30:40" end="30:44">where I've got my own data and I'm trying to do something very specific</rh-cue>
<rh-cue start="30:45" end="30:47">and I'm the only one that has access to this data.</rh-cue>
<rh-cue start="30:47" end="30:51">And you don't have a public data set that you can go download</rh-cue>
<rh-cue start="30:51" end="30:54">that's under Creative Commons license for, you know, car batteries.</rh-cue>
<rh-cue start="30:54" end="30:57">It's, you know, it's just not something that's available.</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney"></rh-cue>
<rh-cue start="30:57" end="31:02">And so those use cases, the challenge with with training those models</rh-cue>
<rh-cue start="31:02" end="31:06">and and getting them optimized is the beginning of the pipeline.</rh-cue>
<rh-cue start="31:06" end="31:10">It's the data you have to get the data you have to annotated</rh-cue>
<rh-cue start="31:10" end="31:12">and the tools have to exist for you to do that.</rh-cue>
<rh-cue start="31:12" end="31:15">And that's part of the problem that we're trying to help solve.</rh-cue>
<rh-cue start="31:16" end="31:17">And then the models are getting more complex.</rh-cue>
<rh-cue start="31:17" end="31:21">So if you think, you know, just from working with customers recently,</rh-cue>
<rh-cue start="31:21" end="31:22">you know, they're no longer</rh-cue>
<rh-cue start="31:22" end="31:26">just trying to do image classification and, you know, like is it a dog or a cat?</rh-cue>
<rh-cue start="31:26" end="31:29">They've moved on to like 3D point clouds</rh-cue>
<rh-cue start="31:29" end="31:34">and, you know, 3D segmentation models and things that are like the speech</rh-cue>
<rh-cue start="31:34" end="31:39">synthesis example, doing things these GPT models that are generating,</rh-cue>
<rh-cue start="31:40" end="31:44">you know, you, you put a text input and it generates an image for you.</rh-cue>
<rh-cue start="31:44" end="31:47">It's just becoming much more advanced, much more sophisticated</rh-cue>
<rh-cue start="31:48" end="31:50">and on larger images.</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney"></rh-cue>
<rh-cue start="31:50" end="31:54">And so things like running super resolution enhancing images, upscaling</rh-cue>
<rh-cue start="31:54" end="31:59">images, instead of just trying to take that, you know, 200 by 200 pixel</rh-cue>
<rh-cue start="32:00" end="32:02">image and classifying if it's a cat.</rh-cue>
<rh-cue start="32:02" end="32:05">Now we're talking about gigantic</rh-cue>
<rh-cue start="32:05" end="32:09">huge images that we're processing and that all requires</rh-cue>
<rh-cue start="32:09" end="32:12">more resources or more optimized models.</rh-cue>
<rh-cue start="32:13" end="32:16">And, you know, every computer vision conference or A.I.</rh-cue>
<rh-cue start="32:16" end="32:19">conference, there's there's a new latest and greatest architecture.</rh-cue>
<rh-cue start="32:19" end="32:22">There's new research paper, and things are getting adopted much faster.</rh-cue>
<rh-cue start="32:23" end="32:27">The lead time for a nurse paper or CV PR</rh-cue>
<rh-cue start="32:27" end="32:30">for a company to actually adopt and put those into production.</rh-cue>
<rh-cue start="32:30" end="32:32">It's like the time shortens every year.</rh-cue>
<rh-cue start="32:33" voice="Burr Sutter"></rh-cue>
<rh-cue start="32:33" end="32:35">Well, Ryan, I got to tell you, I could talk to you</rh-cue>
<rh-cue start="32:35" end="32:39">literally all day about these topics, the various use cases, the various ways</rh-cue>
<rh-cue start="32:39" end="32:41">models are being optimized,</rh-cue>
<rh-cue start="32:41" end="32:44">how to put models into a pipeline for average enterprise applications.</rh-cue>
<rh-cue start="32:44" end="32:47">I've enjoyed learning about pop and vino and anomalies,</rh-cue>
<rh-cue start="32:47" end="32:50">but I'm fascinated by this because I will have a chance to go try this myself.</rh-cue>
<rh-cue start="32:51" end="32:52">Taking advantage of Red Hat OpenShift</rh-cue>
<rh-cue start="32:52" end="32:54">and taking advantage of our data science platform.</rh-cue>
<rh-cue start="32:54" end="32:58">On top of that, I will definitely go be poking at this myself.</rh-cue>
<rh-cue start="32:58" end="33:00">So thank you so much for your time today.</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney"></rh-cue>
<rh-cue start="33:00" end="33:00">Thanks, Burr.</rh-cue>
<rh-cue start="33:00" end="33:01">This was a lot of fun.</rh-cue>
<rh-cue start="33:01" end="33:04">Thanks for having me.</rh-cue>
<rh-cue start="33:04" voice="Burr Sutter"></rh-cue>
<rh-cue start="33:04" end="33:06">And you can check out</rh-cue>
<rh-cue start="33:06" end="33:09">the full transcript of our conversation and more resources,</rh-cue>
<rh-cue start="33:09" end="33:12">like a link to a white paper on open vino and normal lib at Red Hat dot</rh-cue>
<rh-cue start="33:12" end="33:15">com slash code Comments Podcast.</rh-cue>
<rh-cue start="33:15" end="33:19">This episode was produced by Brant Seminole and Caroline Prickett.</rh-cue>
<rh-cue start="33:20" end="33:21">Our sound designer is Christian.</rh-cue>
<rh-cue start="33:21" end="33:26">From our audio team includes Lee Day, Stephanie Wunderlich, Mike Esser,</rh-cue>
<rh-cue start="33:27" end="33:32">Laura Barnes, Claire Allison, Nick Burns, Aaron Williamson, Karen King,</rh-cue>
<rh-cue start="33:32" end="33:36">Booboo House, Rachel Artell, Mike Compton, Ocean</rh-cue>
<rh-cue start="33:36" end="33:40">Mathews, Laura Walters, Alex Trabelsi and Victoria Lutton.</rh-cue>
<rh-cue start="33:41" end="33:43">I'm your host, Burt Sutter.</rh-cue>
<rh-cue start="33:43" end="33:45">Thank you for joining me today on Code Comments.</rh-cue>
<rh-cue start="33:45" end="33:48">I hope you enjoyed today's session and today's conversation, and I</rh-cue>
<rh-cue start="33:48" end="33:52">look forward to many more in.</rh-cue>
</rh-transcript>
</rh-audio-player>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
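If you also need to drive playback from your own script alongside the built-in controls, one low-risk option is to talk to the slotted `<audio>` element directly, since it stays in light DOM. Below is a minimal sketch, assuming the `id="player"` example above and relying only on standard `HTMLMediaElement` APIs rather than any component-specific method:
```
// Minimal sketch: jump the example player above to a transcript cue.
// Assumes the <audio slot="media"> element is a light-DOM child of
// <rh-audio-player id="player">, so standard DOM APIs can reach it.
const player = document.getElementById('player');
const media = player.querySelector('audio[slot="media"]');

function seekTo(minutes, seconds) {
  media.currentTime = minutes * 60 + seconds; // set playback position in seconds
  return media.play();                        // play() returns a Promise in modern browsers
}

seekTo(9, 48); // e.g. jump to the "09:48" cue in the transcript
```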
Heading Levels
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
#heading-levels {
padding: var(--rh-space-xl, 24px);
}
```
<section id="heading-levels">
<p>Audio player should automatically calculate its heading levels</p>
<h3>Root Level h3</h3>
<p>Transcript should be <code>h5</code>, cues should be <code>h6</code></p>
<rh-audio-player layout="full" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h4 slot="title">Bringing Deep Learning to Enterprise Applications</h4>
<rh-audio-player-about slot="about">
<h5 slot="heading">About the episode</h5>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to
be able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h5 slot="heading">Subscribe</h5>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript slot="transcript">
<rh-cue start="00:02" voice="Burr Sutter"></rh-cue>
<rh-cue start="00:02" end="00:04">Hi, I'm Burr Sutter.</rh-cue>
<rh-cue start="00:04" end="00:05">I'm a Red Hatter</rh-cue>
<rh-cue start="00:05" end="00:08">who spends a lot of time talking to technologists about technologies.</rh-cue>
<rh-cue start="00:09" end="00:10">We say this a lot of Red Hat.</rh-cue>
<rh-cue start="00:10" end="00:14">No single technology provider holds the key to success, including us.</rh-cue>
<rh-cue start="00:15" end="00:17">And I would say the same thing about myself.</rh-cue>
<rh-cue start="00:17" end="00:18">I love to share ideas.</rh-cue>
<rh-cue start="00:18" end="00:19">So I thought it'd be awesome</rh-cue>
<rh-cue start="00:19" end="00:22">to talk to some brilliant technologists at Red Hat Partners.</rh-cue>
<rh-cue start="00:23" end="00:26">This is Code Comments, an original podcast</rh-cue>
<rh-cue start="00:26" end="00:29">from Red Hat.</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter"></rh-cue>
<rh-cue start="00:29" end="00:32">I'm sure, like many of you here, you have been thinking about</rh-cue>
<rh-cue start="00:32" end="00:36">AI, ML, artificial intelligence and machine learning.</rh-cue>
<rh-cue start="00:36" end="00:38">I've been thinking about that for quite some time</rh-cue>
<rh-cue start="00:38" end="00:39">and actually had the opportunity</rh-cue>
<rh-cue start="00:39" end="00:43">to work on a few successful projects here at Red Hat using those technologies,</rh-cue>
<rh-cue start="00:43" end="00:46">actually enabling a dataset, gathering a dataset,</rh-cue>
<rh-cue start="00:46" end="00:49">working with data scientists and data engineering team,</rh-cue>
<rh-cue start="00:49" end="00:51">and then training a model and putting that model into production</rh-cue>
<rh-cue start="00:51" end="00:53">runtime environment.</rh-cue>
<rh-cue start="00:53" end="00:55">It was an exciting set of projects and you can kind of see</rh-cue>
<rh-cue start="00:55" end="00:58">those on numerous YouTube videos I have published out there before.</rh-cue>
<rh-cue start="00:59" end="01:01">But I want you to think about the problem space a little bit</rh-cue>
<rh-cue start="01:01" end="01:04">because there are some interesting challenges about AI/ML.</rh-cue>
<rh-cue start="01:04" end="01:06">One is simply just getting access to the data,</rh-cue>
<rh-cue start="01:06" end="01:09">and while there are numerous publicly available datasets</rh-cue>
<rh-cue start="01:09" end="01:12">when it comes to your specific enterprise use case, you might not be to find</rh-cue>
<rh-cue start="01:12" end="01:14">publicly available data.</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter"></rh-cue>
<rh-cue start="01:14" end="01:17">In many cases, you cannot, even for our applications that we created,</rh-cue>
<rh-cue start="01:17" end="01:20">we had to create our dataset, capture our dataset,</rh-cue>
<rh-cue start="01:21" end="01:24">explore the dataset, and of course train a model accordingly.</rh-cue>
<rh-cue start="01:24" end="01:27">And we also found there's another challenge to be overcome</rh-cue>
<rh-cue start="01:27" end="01:30">in this AML world, and that is access to certain types of hardware.</rh-cue>
<rh-cue start="01:31" end="01:33">If you think about the enterprise environment</rh-cue>
<rh-cue start="01:33" end="01:36">and the creation of an enterprise application specifically for AML</rh-cue>
<rh-cue start="01:37" end="01:40">and users need an inference engine to run on their hardware,</rh-cue>
<rh-cue start="01:40" end="01:43">hardware that's available to them to be effective for their application.</rh-cue>
<rh-cue start="01:43" end="01:45">Let's say an application like computer vision,</rh-cue>
<rh-cue start="01:45" end="01:49">one that can detect anomalies in medical imaging or maybe on a factory floor,</rh-cue>
<rh-cue start="01:49" end="01:52">You know, those things are whizzing by on the factory line.</rh-cue>
<rh-cue start="01:52" end="01:55">They're looking at them and trying to determine if there is an error or not.</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter"></rh-cue>
<rh-cue start="01:56" end="01:58">Well, how do you actually make it run on your hardware,</rh-cue>
<rh-cue start="01:58" end="02:01">your accessible technology that you have today?</rh-cue>
<rh-cue start="02:01" end="02:05">Well, there's a solution for this as an open toolkit called Open vino.</rh-cue>
<rh-cue start="02:05" end="02:07">And you might be thinking, hey, wait a minute,</rh-cue>
<rh-cue start="02:07" end="02:10">don't you need a GPU for a I inferencing a GPU</rh-cue>
<rh-cue start="02:10" end="02:12">for artificial intelligence machine learning?</rh-cue>
<rh-cue start="02:12" end="02:15">Well, not according to Ryan Loney, product manager of Open Vino Developer</rh-cue>
<rh-cue start="02:15" end="02:16">Tools at Intel.</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney"></rh-cue>
<rh-cue start="02:20" end="02:20">I guess we'll</rh-cue>
<rh-cue start="02:20" end="02:23">start with trying to maybe dispel the myths, right?</rh-cue>
<rh-cue start="02:23" end="02:27">I think that CPUs are widely used for inference today.</rh-cue>
<rh-cue start="02:27" end="02:32">So and if we look at the data center segment, you know, about 70% of the A.I.</rh-cue>
<rh-cue start="02:32" end="02:36">inference is happening on Intel Xeon on our data center CPUs.</rh-cue>
<rh-cue start="02:36" end="02:40">And so you don't needed a GPU, especially for running inference.</rh-cue>
<rh-cue start="02:40" end="02:43">And that's part of the value of open vino, is that we're you know,</rh-cue>
<rh-cue start="02:43" end="02:47">we're taking models that may have been trained on a GPU</rh-cue>
<rh-cue start="02:47" end="02:50">using deep learning frameworks like PyTorch or TensorFlow</rh-cue>
<rh-cue start="02:51" end="02:54">and then optimizing them to run on Intel hardware.</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter"></rh-cue>
<rh-cue start="02:56" end="03:00">Ryan joined me to discuss AI/ML and the enterprise</rh-cue>
<rh-cue start="03:00" end="03:03">across various industries and exploring numerous use cases.</rh-cue>
<rh-cue start="03:05" end="03:08">Let's talk a little bit about the origin story behind Open Vino.</rh-cue>
<rh-cue start="03:08" end="03:10">Tell us more about it and how it came to be</rh-cue>
<rh-cue start="03:10" end="03:13">and why it came out of Intel.</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney"></rh-cue>
<rh-cue start="03:12" end="03:16">Definitely. So we had the first release of Open Vino</rh-cue>
<rh-cue start="03:16" end="03:20">was back in 2018, so still relatively new.</rh-cue>
<rh-cue start="03:20" end="03:25">And at that time we were focused on computer vision and pretty tightly coupled</rh-cue>
<rh-cue start="03:25" end="03:31">with open CV, which is another open source library with origins at Intel.</rh-cue>
<rh-cue start="03:31" end="03:31">You know, it</rh-cue>
<rh-cue start="03:31" end="03:36">had its first release back in 1999, so it's been around a little bit longer.</rh-cue>
<rh-cue start="03:36" end="03:40">And many of the software engineers and architects at Intel</rh-cue>
<rh-cue start="03:40" end="03:45">that were involved with and contributing to open CV are working on open Vino.</rh-cue>
<rh-cue start="03:45" end="03:49">So you can think of open vino as complementary software to open CV.</rh-cue>
<rh-cue start="03:50" end="03:53">And we're providing like an engine for executing inference</rh-cue>
<rh-cue start="03:53" end="03:57">as part of a computer vision pipeline, or at least that's how we started.</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney"></rh-cue>
<rh-cue start="03:58" end="04:01">But since 2018, we've we've started to move beyond</rh-cue>
<rh-cue start="04:01" end="04:02">just computer vision inference.</rh-cue>
<rh-cue start="04:02" end="04:05">So when I say computer vision inference, I mean like image</rh-cue>
<rh-cue start="04:05" end="04:08">classification, object detection, segmentation.</rh-cue>
<rh-cue start="04:09" end="04:12">And now we're moving into natural language processing, things</rh-cue>
<rh-cue start="04:12" end="04:16">like speech synthesis, speech recognition, knowledge, graphs,</rh-cue>
<rh-cue start="04:17" end="04:21">time series forecasting, and other use cases that don't involve</rh-cue>
<rh-cue start="04:21" end="04:24">computer vision and don't involve inference on pixels.</rh-cue>
<rh-cue start="04:25" end="04:28">Our latest release, the 20 22.1 that came out earlier this year,</rh-cue>
<rh-cue start="04:29" end="04:32">there was a most significant update that we've had to open vino</rh-cue>
<rh-cue start="04:32" end="04:36">since we started in 2018, and the major focus of that release</rh-cue>
<rh-cue start="04:36" end="04:40">was optimizing for use cases that go beyond computer vision.</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter"></rh-cue>
<rh-cue start="04:41" end="04:44">And I like that concept that you just mentioned right there, computer vision.</rh-cue>
<rh-cue start="04:44" end="04:47">And you said that you extended those use cases and went beyond that.</rh-cue>
<rh-cue start="04:47" end="04:50">So could you give us more concrete examples of computer vision?</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney"></rh-cue>
<rh-cue start="04:50" end="04:50">Yeah, sure.</rh-cue>
<rh-cue start="04:50" end="04:55">So when you think about manufacturing quality control in factories, everything</rh-cue>
<rh-cue start="04:55" end="05:01">from ARC welding, defect detection to inspecting BMW cars on assembly lines,</rh-cue>
<rh-cue start="05:02" end="05:05">they're using cameras or sensors to collect data.</rh-cue>
<rh-cue start="05:05" end="05:11">And usually it's cameras collecting images like RGV images that you and I can see.</rh-cue>
<rh-cue start="05:11" end="05:14">And looks like something taken from a camera or video camera,</rh-cue>
<rh-cue start="05:15" end="05:19">but also things like infrared or computerized tomography</rh-cue>
<rh-cue start="05:19" end="05:24">scans used in health care, X-ray, different types of images where we can</rh-cue>
<rh-cue start="05:25" end="05:28">draw bounding boxes around regions of interest</rh-cue>
<rh-cue start="05:29" end="05:32">and say, you know, this is a defect or this is not a defect.</rh-cue>
<rh-cue start="05:32" end="05:37">And also, is this worker wearing a safety hat or did they forget to put it on?</rh-cue>
<rh-cue start="05:37" end="05:41">And so you can take this and integrate it into a pipeline</rh-cue>
<rh-cue start="05:41" end="05:44">where you're triggering an alert if somebody forgets</rh-cue>
<rh-cue start="05:44" end="05:49">to wear their safety mask or if there's a defect in a product</rh-cue>
<rh-cue start="05:49" end="05:53">on an assembly line, you can just use cameras and open</rh-cue>
<rh-cue start="05:53" end="05:58">vino and open CV running these on Intel hardware and help to analyze.</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney"></rh-cue>
<rh-cue start="05:58" end="06:01">And that's what a lot of the partners that we work with are doing.</rh-cue>
<rh-cue start="06:01" end="06:03">So these independent software vendors</rh-cue>
<rh-cue start="06:03" end="06:06">and there's other use cases for things like retail.</rh-cue>
<rh-cue start="06:06" end="06:10">You think about going to a store and using an automated checkout system.</rh-cue>
<rh-cue start="06:11" end="06:13">You know, sometimes people use those automated checkouts</rh-cue>
<rh-cue start="06:13" end="06:17">and they they slide a few extra items into their bag that they don't scan.</rh-cue>
<rh-cue start="06:17" end="06:21">And it's a huge loss for the retail outlets</rh-cue>
<rh-cue start="06:21" end="06:25">that are providing this way to check out real time shelf monitoring.</rh-cue>
<rh-cue start="06:25" end="06:29">We have this bear on one of our is fees that helps keep store shelves</rh-cue>
<rh-cue start="06:29" end="06:33">stocked by just analyzing the cameras in the stores, detecting</rh-cue>
<rh-cue start="06:33" end="06:37">when objects are missing from the shelves so that they can be restocked.</rh-cue>
<rh-cue start="06:37" end="06:41">We have Vistry, another ISP that works with quick service restaurants.</rh-cue>
<rh-cue start="06:41" end="06:44">So when you think about automating the process of</rh-cue>
<rh-cue start="06:44" end="06:48">when do I drop the fries into the fryer so that they're warm</rh-cue>
<rh-cue start="06:48" end="06:50">when the car gets to the drive thru window,</rh-cue>
<rh-cue start="06:50" end="06:54">you know, there's quite a bit of industrial health care retail examples</rh-cue>
<rh-cue start="06:54" end="06:57">that we can walk through and we should dig into some more of those.</rh-cue>
<rh-cue start="06:57" voice="Burr Sutter"></rh-cue>
<rh-cue start="06:57" end="06:59">But I got to tell you, I have I have a personal experience</rh-cue>
<rh-cue start="06:59" end="06:59">in this category</rh-cue>
<rh-cue start="06:59" end="07:01">that I want to share with, and you can tell me how</rh-cue>
<rh-cue start="07:01" end="07:03">how silly you might think at this point in time.</rh-cue>
<rh-cue start="07:03" end="07:04">It is.</rh-cue>
<rh-cue start="07:04" end="07:08">We actually built an AI keynote demonstration for the Red Hat big stage</rh-cue>
<rh-cue start="07:08" end="07:12">back in 2015, and I really want to illustrate the concept of asset tracking.</rh-cue>
<rh-cue start="07:12" end="07:15">So we actually gave everybody in the conference a little Bluetooth token,</rh-cue>
<rh-cue start="07:16" end="07:19">but a little battery, a little watch battery and a little Bluetooth emitter.</rh-cue>
<rh-cue start="07:19" end="07:22">And we basically tracked those things around the conference.</rh-cue>
<rh-cue start="07:22" end="07:24">We basically put a Raspberry Pi in each of the meeting rooms</rh-cue>
<rh-cue start="07:24" end="07:26">and up in the lunch room, and you could see how the tokens</rh-cue>
<rh-cue start="07:26" end="07:30">moved from room to room to room as a relatively simple application.</rh-cue>
<rh-cue start="07:30" voice="Burr Sutter"></rh-cue>
<rh-cue start="07:30" end="07:32">But it occurred to me after we figured out,</rh-cue>
<rh-cue start="07:32" end="07:34">okay, how to do that with Bluetooth and triangulating</rh-cue>
<rh-cue start="07:34" end="07:39">Bluetooth signals by looking at relative signal strength from one radio to another</rh-cue>
<rh-cue start="07:39" end="07:42">and putting that through an Apache Spark application at the time,</rh-cue>
<rh-cue start="07:42" end="07:45">we then realized, you know what, this is easier done with cameras</rh-cue>
<rh-cue start="07:45" end="07:49">and just simply looking at a camera and having some form of animal</rh-cue>
<rh-cue start="07:49" end="07:51">or machine learning model that would say, Oh,</rh-cue>
<rh-cue start="07:51" end="07:53">there are people here now are there are no people here now.</rh-cue>
<rh-cue start="07:53" end="07:55">What do you think about that?</rh-cue>
<rh-cue start="07:55" voice="Ryan Loney"></rh-cue>
<rh-cue start="07:55" end="07:59">Yeah, I mean, what you just described is sort of exactly that the product</rh-cue>
<rh-cue start="07:59" end="08:02">that either one of our partners is offering,</rh-cue>
<rh-cue start="08:02" end="08:04">you know, that they're doing it with computer vision and cameras.</rh-cue>
<rh-cue start="08:04" end="08:08">So when partner tries to help retail stores</rh-cue>
<rh-cue start="08:08" end="08:12">analyze the foot traffic and understand with Heatmaps,</rh-cue>
<rh-cue start="08:12" end="08:16">where people are spending the most time in stores, how many people are coming</rh-cue>
<rh-cue start="08:16" end="08:19">in, what size groups are coming into the store,</rh-cue>
<rh-cue start="08:19" end="08:23">you know, and trying to help understand if there was a successful transaction</rh-cue>
<rh-cue start="08:23" end="08:27">from the people who entered the store and left the store so that you can,</rh-cue>
<rh-cue start="08:27" end="08:30">you know, to help with the, you know, retail analytics</rh-cue>
<rh-cue start="08:30" end="08:33">and marketing sales and positioning of products.</rh-cue>
<rh-cue start="08:34" end="08:37">And so they're doing that in a way that also protects privacy.</rh-cue>
<rh-cue start="08:37" end="08:38">And that's something that's really important.</rh-cue>
<rh-cue start="08:38" end="08:41">So when you talked about those Bluetooth beacons, probably,</rh-cue>
<rh-cue start="08:41" end="08:44">
you know, if everyone who walked into a grocery store was asked
</rh-cue>
<rh-cue start="08:44" end="08:49">
to put a tracking device in their cart or on their person and say, you know,
</rh-cue>
<rh-cue start="08:49" end="08:50">
you're going
</rh-cue>
<rh-cue start="08:50" end="08:53">
to be tracked around the store, they probably wouldn't want to do that.
</rh-cue>
<rh-cue start="08:53" end="08:56">
The way that you can do this with cameras is you can,
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney"></rh-cue>
<rh-cue start="08:56" end="09:01">
you know, detect people as they enter and, you know, remove their face.
</rh-cue>
<rh-cue start="09:01" end="09:01">
Right.
</rh-cue>
<rh-cue start="09:01" end="09:05">
So you can ignore any biometric information
</rh-cue>
<rh-cue start="09:05" end="09:08">
and just track the person based on pixels
</rh-cue>
<rh-cue start="09:09" end="09:12">
that are present in the detected region of interest.
</rh-cue>
<rh-cue start="09:12" end="09:15">
So they're able to analyze, say, a family walks in the door
</rh-cue>
<rh-cue start="09:16" end="09:20">
and they can group those people together with object detection
</rh-cue>
<rh-cue start="09:20" end="09:23">
and then they can track their movement throughout the store
</rh-cue>
<rh-cue start="09:23" end="09:26">
without keeping track of their face or any biometric
</rh-cue>
<rh-cue start="09:26" end="09:30">
or any personal identifiable information to avoid things like bias
</rh-cue>
<rh-cue start="09:30" end="09:35">
and to make sure that they're protecting the privacy of the shoppers in the store
</rh-cue>
<rh-cue start="09:35" end="09:39">
while still getting that really useful marketing analytics data, right,
</rh-cue>
<rh-cue start="09:39" end="09:42">
so that they can make better decisions about where to place their products.
</rh-cue>
<rh-cue start="09:42" end="09:45">
So that's one really good example of how
</rh-cue>
<rh-cue start="09:45" end="09:48">
computer vision AI with OpenVINO is being used today.
</rh-cue>
<rh-cue start="09:48" voice="Burr Sutter"></rh-cue>
<rh-cue start="09:48" end="09:51">
And that is a great example because you're definitely spot on.
</rh-cue>
<rh-cue start="09:51" end="09:53">
It is invasive when you hand someone a Bluetooth device and
</rh-cue>
<rh-cue start="09:53" end="09:56">
say, please keep this with you as you go throughout our store
</rh-cue>
<rh-cue start="09:56" end="09:59">
or our mall or throughout our hospital, wherever you might be.
</rh-cue>
<rh-cue start="09:59" end="10:01">
Now, you mentioned another example earlier
</rh-cue>
<rh-cue start="10:01" end="10:03">
in the conversation which was related to like worker safety.
</rh-cue>
<rh-cue start="10:03" end="10:05">
Are they wearing a helmet?
</rh-cue>
<rh-cue start="10:05" end="10:08">
I want to talk more about that concept in a real industrial setting,
</rh-cue>
<rh-cue start="10:08" end="10:11">
a manufacturing setting where there might be a factory floor
</rh-cue>
<rh-cue start="10:11" end="10:13">
and there are certain requirements, or better yet, there's like a
</rh-cue>
<rh-cue start="10:13" end="10:16">
a quality assurance requirement, let's say, when it comes to looking
</rh-cue>
<rh-cue start="10:16" end="10:20">
at a factory line. I run into that use case often with some of our customers.
</rh-cue>
<rh-cue start="10:20" end="10:22">
Can you talk more about those kinds of use cases? Yeah.
</rh-cue>
<rh-cue start="10:22" voice="Ryan Loney"></rh-cue>
<rh-cue start="10:22" end="10:27">
So one of our partners, Robotron, you know, published a case study
</rh-cue>
<rh-cue start="10:27" end="10:31">
I think last year where they're working with BMW at one of their factories
</rh-cue>
<rh-cue start="10:32" end="10:35">
and they do quality control inspection, but they're also doing
</rh-cue>
<rh-cue start="10:35" end="10:38">
things related to worker safety and analyzing.
</rh-cue>
<rh-cue start="10:38" end="10:40">
You know, I used the safety hat example.
</rh-cue>
<rh-cue start="10:40" end="10:45">
There's a number of our ISVs and partners who have similar use cases.
</rh-cue>
<rh-cue start="10:45" end="10:48">
And it comes down to there's a few reasons
</rh-cue>
<rh-cue start="10:48" end="10:51">
that are motivating this and some are related to like insurance, right?
</rh-cue>
<rh-cue start="10:51" end="10:53">
It's important to make sure that
</rh-cue>
<rh-cue start="10:53" end="10:56">
if you want to have your factory insured and that your workers
</rh-cue>
<rh-cue start="10:56" end="10:58">
are protecting themselves and wearing the gear.
</rh-cue>
<rh-cue start="10:58" end="11:00">
Regulatory compliance. Right.
</rh-cue>
<rh-cue start="11:00" end="11:05">
You're being asked to properly protect from exposure to chemicals or,
</rh-cue>
<rh-cue start="11:05" end="11:09">
you know, potentially having something fall and hit someone on the head.
</rh-cue>
<rh-cue start="11:09" end="11:13">
So wearing a safety vest, wearing goggles, wearing a helmet,
</rh-cue>
<rh-cue start="11:14" end="11:17">
these are things that you need to do inside the factory.
</rh-cue>
<rh-cue start="11:17" end="11:21">
And you can really easily automate detecting that, and sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney"></rh-cue>
<rh-cue start="11:21" end="11:26">
I think that's one of the interesting things about the Robotron BMW example
</rh-cue>
<rh-cue start="11:26" end="11:31">
is that they were also blurring, sort of blocking out, drawing a box
</rh-cue>
<rh-cue start="11:31" end="11:35">
to cover the face of the workers in the factory
</rh-cue>
<rh-cue start="11:35" end="11:38">
so that somebody who was analyzing the video footage
</rh-cue>
<rh-cue start="11:38" end="11:43">
and getting the alerts saying that, hey, you know, Bay 21 has a worker
</rh-cue>
<rh-cue start="11:43" end="11:47">
without a hat on, it's not sending their face
</rh-cue>
<rh-cue start="11:47" end="11:50">
in the alert and potentially, you know, invading
</rh-cue>
<rh-cue start="11:50" end="11:54">
or going against privacy laws or just the ethics of the company.
</rh-cue>
<rh-cue start="11:54" end="11:54">
Right.
</rh-cue>
<rh-cue start="11:54" end="11:58">
They don't want to introduce bias or have people targeted because
</rh-cue>
<rh-cue start="11:58" end="12:02">
it's much better to have it be, you know, blur the face
</rh-cue>
<rh-cue start="12:02" end="12:06">
and alert and have somebody take care of it on the floor.
</rh-cue>
<rh-cue start="12:06" end="12:09">
And then if you ever need to audit that information later,
</rh-cue>
<rh-cue start="12:09" end="12:12">
they have a way to do it where people who need to be able to see
</rh-cue>
<rh-cue start="12:12" end="12:17">
who the employee was and look up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney"></rh-cue>
<rh-cue start="12:17" end="12:20">
But then just for the purposes of maintaining safety,
</rh-cue>
<rh-cue start="12:20" end="12:21">
they don't need to have access
</rh-cue>
<rh-cue start="12:21" end="12:24">
to that personal information or biometric information,
</rh-cue>
<rh-cue start="12:25" end="12:28">
because that's one thing that when you hear about computer vision
</rh-cue>
<rh-cue start="12:28" end="12:31">
or object person tracking, object detection,
</rh-cue>
<rh-cue start="12:32" end="12:36">
there's a lot of concern, and rightfully so, about privacy
</rh-cue>
<rh-cue start="12:36" end="12:40">
being invaded and about tracking information, face ID,
</rh-cue>
<rh-cue start="12:41" end="12:45">
identifying people who may have committed crimes through video footage.
</rh-cue>
<rh-cue start="12:45" end="12:48">
And that's just not something that a lot of companies want to
</rh-cue>
<rh-cue start="12:49" end="12:51">
you know, they want to protect privacy
</rh-cue>
<rh-cue start="12:51" end="12:52">
and they don't want to be in a situation
</rh-cue>
<rh-cue start="12:52" end="12:55">
where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter"></rh-cue>
<rh-cue start="12:56" end="12:58">
Well, privacy is certainly opening up Pandora's box.
</rh-cue>
<rh-cue start="12:58" end="13:00">
There's a lot to be explored in that area,
</rh-cue>
<rh-cue start="13:00" end="13:02">
especially in a digital world that we now live in.
</rh-cue>
<rh-cue start="13:02" end="13:05">
But for now, let's move on and explore a different area.
</rh-cue>
<rh-cue start="13:05" end="13:08">
I'm interested in how machines and computers offer advantages,
</rh-cue>
<rh-cue start="13:08" end="13:12">
specifically in certain use cases like a quality control scenario.
</rh-cue>
<rh-cue start="13:12" end="13:15">
I asked Ryan to explain how AI/ML and specifically machines
</rh-cue>
<rh-cue start="13:15" end="13:18">
and computers can augment that capability.
</rh-cue>
<rh-cue start="13:19" voice="Ryan Loney"></rh-cue>
<rh-cue start="13:19" end="13:22">
I can give a specific example where we have a partner
</rh-cue>
<rh-cue start="13:22" end="13:25">
that's doing defect detection
</rh-cue>
<rh-cue start="13:25" end="13:28">
and looking for anomalies in batteries.
</rh-cue>
<rh-cue start="13:28" end="13:31">
So, you know, I'm sure you've heard there's a lot of interest right now
</rh-cue>
<rh-cue start="13:31" end="13:34">
in electric vehicles, a lot of batteries being produced.
</rh-cue>
<rh-cue start="13:34" end="13:36">
And so if you go into one of these factories,
</rh-cue>
<rh-cue start="13:36" end="13:40">
they have images that they collect of every battery that's going through this
</rh-cue>
<rh-cue start="13:40" end="13:44">
assembly line and through these images, people
</rh-cue>
<rh-cue start="13:44" end="13:47">
can look and see and visually inspect with their eyes and say,
</rh-cue>
<rh-cue start="13:48" end="13:50">
this battery has a defect, send it back.
</rh-cue>
<rh-cue start="13:50" end="13:53">
And that's one step in the quality control process.
</rh-cue>
<rh-cue start="13:53" end="13:58">
And there's other steps, I'm sure, like running diagnostic tests and, you know,
</rh-cue>
<rh-cue start="13:58" end="14:02">
measuring voltage and doing other types of non-visual inspection.
</rh-cue>
<rh-cue start="14:02" end="14:06">
But for the visual inspection piece where you can really easily identify
</rh-cue>
<rh-cue start="14:06" end="14:10">
some problems, it's much more efficient to introduce computer vision.
</rh-cue>
<rh-cue start="14:11" end="14:14">
And so that's where we have this new library that we've introduced
</rh-cue>
<rh-cue start="14:14" end="14:17">
called Anomalib, that's part of OpenVINO.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney"></rh-cue>
<rh-cue start="14:17" end="14:20">
While we're focused on inference, you know, we're also thinking
</rh-cue>
<rh-cue start="14:20" end="14:25">
about the pipeline or the funnel that gets these models to OpenVINO.
</rh-cue>
<rh-cue start="14:25" end="14:28">
And so we've invested in this anomaly segmentation,
</rh-cue>
<rh-cue start="14:28" end="14:32">
anomaly detection library that we've recently open sourced,
</rh-cue>
<rh-cue start="14:32" end="14:35">
and there's a great research paper about it, about Anomalib.
</rh-cue>
<rh-cue start="14:36" end="14:39">
But the idea is you can take just a few images
</rh-cue>
<rh-cue start="14:39" end="14:43">
and train a model and start detecting these defects.
</rh-cue>
<rh-cue start="14:43" end="14:46">
And so for this battery example, that's a more advanced example.
</rh-cue>
<rh-cue start="14:46" end="14:52">
But to make it simpler, you know, take some bolts and, you know, take ten bolts.
</rh-cue>
<rh-cue start="14:52" end="14:55">
You have one that has a scratch on it or one that is chipped
</rh-cue>
<rh-cue start="14:56" end="14:58">
or has some damage to it.
</rh-cue>
<rh-cue start="14:58" end="15:00">
And you can easily get started in training
</rh-cue>
<rh-cue start="15:00" end="15:03">
to recognize the bolts that do not have an anomaly.
</rh-cue>
<rh-cue start="15:04" end="15:06">
And the ones that do, which is a small data set
</rh-cue>
<rh-cue start="15:06" end="15:10">
and I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney"></rh-cue>
<rh-cue start="15:11" end="15:14">
Challenges: one is access to data, but the other is
</rh-cue>
<rh-cue start="15:14" end="15:17">
needing a massive amount of data to do something meaningful.
</rh-cue>
<rh-cue start="15:18" end="15:22">
And so we're starting to try to change that dynamic with Anomalib.
</rh-cue>
<rh-cue start="15:22" end="15:27">
So you may not need 100,000 images, you may need 100 images,
</rh-cue>
<rh-cue start="15:27" end="15:33">
and you can start detecting anomalies in everything from batteries to bolts to,
</rh-cue>
<rh-cue start="15:33" end="15:37">
you know, maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter"></rh-cue>
<rh-cue start="15:37" end="15:40">
That is a very key point because often in that data scientist
</rh-cue>
<rh-cue start="15:40" end="15:43">
process, that data engineer and data scientist process, right.
</rh-cue>
<rh-cue start="15:43" end="15:44">
The one key thing is can you gather
</rh-cue>
<rh-cue start="15:44" end="15:47">
the data that you need for the input for the model training?
</rh-cue>
<rh-cue start="15:47" end="15:49">
And we've often said, at least people I've worked
</rh-cue>
<rh-cue start="15:49" end="15:52">
with over the last couple of years, you know, you need a lot of data.
</rh-cue>
<rh-cue start="15:52" end="15:55">
You need tens of thousands of correct images
</rh-cue>
<rh-cue start="15:55" end="15:58">
so we can sort out the difference between dogs versus cats, let's say,
</rh-cue>
<rh-cue start="15:58" end="16:01">
or you need dozens and dozens of situations
</rh-cue>
<rh-cue start="16:01" end="16:03">
where if it's a natural language processing scenario,
</rh-cue>
<rh-cue start="16:03" end="16:06">
you know, a good customer interaction, a good customer conversation,
</rh-cue>
<rh-cue start="16:06" end="16:07">
and in this case,
</rh-cue>
<rh-cue start="16:07" end="16:11">
it sounds like what you're saying is show us just the bad things, right?
</rh-cue>
<rh-cue start="16:11" end="16:14">
Fewer images, fewer incorrect things,
</rh-cue>
<rh-cue start="16:14" end="16:17">
and then let us look for those kind of anomalies.
</rh-cue>
<rh-cue start="16:18" end="16:20">
Can you tell us more about that? Because that is very interesting.
</rh-cue>
<rh-cue start="16:20" end="16:23">
The concept that I can use a much smaller dataset as my input
</rh-cue>
<rh-cue start="16:23" end="16:26">
as opposed to gathering terabytes of data in some cases
</rh-cue>
<rh-cue start="16:26" end="16:29">
to just simply get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney"></rh-cue>
<rh-cue start="16:30" end="16:34">
You know, like you described, the idea is if you have some good images
</rh-cue>
<rh-cue start="16:34" end="16:37">
and then you have some of the known defects
</rh-cue>
<rh-cue start="16:38" end="16:41">
and you can just label here's a set of good images
</rh-cue>
<rh-cue start="16:41" end="16:44">
and here's a few of the defects and you can right away
</rh-cue>
<rh-cue start="16:44" end="16:48">
start detecting those specific defects that you've identified.
</rh-cue>
<rh-cue start="16:48" end="16:49">
And then also, you know, be able to
</rh-cue>
<rh-cue start="16:50" end="16:53">
determine when it doesn't match
</rh-cue>
<rh-cue start="16:53" end="16:57">
the expected appearance of a non defective item.
</rh-cue>
<rh-cue start="16:57" end="17:00">
So if I have the undamaged screw and then I introduce
</rh-cue>
<rh-cue start="17:00" end="17:03">
one with some new anomaly that's never been seen before,
</rh-cue>
<rh-cue start="17:04" end="17:07">
I can say, you know, this one is not a valid screw.
</rh-cue>
<rh-cue start="17:07" end="17:11">
And so that's sort of the approach that we're taking.
</rh-cue>
<rh-cue start="17:11" end="17:15">
And it's really important because so often you need to have
</rh-cue>
<rh-cue start="17:15" end="17:19">
subject matter experts. Often, like if you take the battery example,
</rh-cue>
<rh-cue start="17:20" end="17:23">
there's these workers who are on the floor
</rh-cue>
<rh-cue start="17:23" end="17:27">
in a factory and they're the ones who know best when they look at these images,
</rh-cue>
<rh-cue start="17:28" end="17:31">
which one's going to have an issue, which one's defective?
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney"></rh-cue>
<rh-cue start="17:31" end="17:34">
And then they also need to take that subject matter expertise
</rh-cue>
<rh-cue start="17:35" end="17:38">
and then use it to annotate data sets.
</rh-cue>
<rh-cue start="17:38" end="17:39">
And when you have these, you know,
</rh-cue>
<rh-cue start="17:39" end="17:43">
tens of thousands of images you need to annotate, it's asking those people
</rh-cue>
<rh-cue start="17:43" end="17:47">
to stop working on the factory floor so they can come annotate some images.
</rh-cue>
<rh-cue start="17:47" end="17:49">
That's a tough business call to make, right?
</rh-cue>
<rh-cue start="17:49" end="17:53">
But if you only need them to annotate a handful of images, it's a much easier
</rh-cue>
<rh-cue start="17:53" end="17:56">
ask to get the ball rolling and demonstrate value.
</rh-cue>
<rh-cue start="17:56" end="17:59">
And maybe over time you will want to annotate more
</rh-cue>
<rh-cue start="17:59" end="18:03">
and more images because you'll get even better accuracy in the model.
</rh-cue>
<rh-cue start="18:03" end="18:07">
Even better, even if it's just small incremental improvements.
</rh-cue>
<rh-cue start="18:08" end="18:11">
You know, that's something that if it generates value for the business,
</rh-cue>
<rh-cue start="18:11" end="18:14">
it's something the business will invest in over time.
</rh-cue>
<rh-cue start="18:14" end="18:17">
But you have to convince the decision makers that it's worth
</rh-cue>
<rh-cue start="18:17" end="18:22">
the time of these subject matter experts to stop what they're doing
</rh-cue>
<rh-cue start="18:22" end="18:26">
and go and label some images of the things that they're working on in the factory.
</rh-cue>
<rh-cue start="18:26" voice="Burr Sutter"></rh-cue>
<rh-cue start="18:26" end="18:30">
And that labeling process can be very labor intensive of the annotations,
</rh-cue>
<rh-cue start="18:30" end="18:33">
basically saying what is correct, what's wrong, what is this, what is that?
</rh-cue>
<rh-cue start="18:33" end="18:36">
And therefore, if we can minimize that time frame to get the value quicker,
</rh-cue>
<rh-cue start="18:36" end="18:40">
then there's something that's useful for the business, useful for the organization
</rh-cue>
<rh-cue start="18:40" end="18:41">
long before we necessarily go
</rh-cue>
<rh-cue start="18:41" end="18:43">
do that huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter"></rh-cue>
<rh-cue start="18:49" end="18:52">
so we talk about labeling and how that is labor intensive activity.
</rh-cue>
<rh-cue start="18:52" end="18:54">
But I love the idea of helping the human
</rh-cue>
<rh-cue start="18:54" end="18:57">
and helping the human, specifically, not get bored.
</rh-cue>
<rh-cue start="18:57" end="19:01">
Basically, if the human is eyeballing a bunch of widgets flying by over time,
</rh-cue>
<rh-cue start="19:01" end="19:03">
they make mistakes, they get bored
</rh-cue>
<rh-cue start="19:03" end="19:06">
and they don't pay as close attention as they should.
</rh-cue>
<rh-cue start="19:07" end="19:10">
That's why the concept of AI/ML, specifically computer vision, augmenting
</rh-cue>
<rh-cue start="19:10" end="19:14">
that capability and really helping the human identify anomalies faster,
</rh-cue>
<rh-cue start="19:14" end="19:17">
more quickly, maybe with greater accuracy could be a big win.
</rh-cue>
<rh-cue start="19:18" end="19:21">
We focused on manufacturing, but let's actually go into health care
</rh-cue>
<rh-cue start="19:21" end="19:24">
and learn how these tools can be used in that sector and that industry.
</rh-cue>
<rh-cue start="19:24" end="19:28">
Ryan talked to me about how OpenVINO's runtime can be incorporated into medical
</rh-cue>
<rh-cue start="19:28" end="19:32">
imaging equipment with Intel processors embedded in CT,
</rh-cue>
<rh-cue start="19:32" end="19:34">
MRI and ultrasound machines.
</rh-cue>
<rh-cue start="19:34" end="19:37">
These inferences, this AI/ML workload, can be operating
</rh-cue>
<rh-cue start="19:37" end="19:41">
and executing right there in the same physical room as the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney"></rh-cue>
<rh-cue start="19:44" end="19:46">
We did a presentation with GE last year.
</rh-cue>
<rh-cue start="19:46" end="19:47">
I think they said
</rh-cue>
<rh-cue start="19:47" end="19:50">
there's at least 80 countries that have their X-ray machines deployed
</rh-cue>
<rh-cue start="19:51" end="19:56">
and they're doing things like helping doctors place breathing tubes in patients.
</rh-cue>
<rh-cue start="19:56" end="20:01">
So during COVID, during the pandemic, that was a really important tool
</rh-cue>
<rh-cue start="20:01" end="20:05">
to help with nurses and doctors who were intubating patients
</rh-cue>
<rh-cue start="20:05" end="20:09">
sometimes like in a parking lot or a hallway of the hospital.
</rh-cue>
<rh-cue start="20:09" end="20:14">
And, you know, they had a statistic that said, I think, one out of four
</rh-cue>
<rh-cue start="20:14" end="20:17">
breathing tubes gets placed incorrectly
</rh-cue>
<rh-cue start="20:17" end="20:19">
when you're doing it outside the operating room,
</rh-cue>
<rh-cue start="20:19" end="20:22">
because when you're in an operating room, it's much more controlled
</rh-cue>
<rh-cue start="20:22" end="20:24">
and there's someone who's an expert at placing the tubes.
</rh-cue>
<rh-cue start="20:24" end="20:28">
It's somewhere you have more of a controlled environment
</rh-cue>
<rh-cue start="20:28" end="20:31">
than when you're out in a parking lot, in a tent.
</rh-cue>
<rh-cue start="20:31" end="20:34">
You know, when the hospital's completely full and you're triaging patients
</rh-cue>
<rh-cue start="20:34" end="20:37">
with COVID, that's when they're more likely to make mistakes.
</rh-cue>
<rh-cue start="20:37" end="20:40">
And so they had this endotracheal tube placement
</rh-cue>
<rh-cue start="20:42" end="20:43">
model that they trained,
</rh-cue>
<rh-cue start="20:43" end="20:47">
and it helped to use an X-ray and give an alert and say, hey,
</rh-cue>
<rh-cue start="20:47" end="20:50">
this tube is placed wrong, pull it out and do it again.
</rh-cue>
<rh-cue start="20:50" end="20:53">
And so things like that help doctors so that they can avoid mistakes.
</rh-cue>
<rh-cue start="20:54" end="20:57">
And, you know, having a breathing tube placed incorrectly
</rh-cue>
<rh-cue start="20:57" end="21:01">
can cause a collapsed lung and a number of other unwanted side effects.
</rh-cue>
<rh-cue start="21:01" end="21:03">
So it's really important to do it correctly.
</rh-cue>
<rh-cue start="21:03" end="21:06">
Another example is Samsung Medison.
</rh-cue>
<rh-cue start="21:06" end="21:10">
They're actually estimating fetal angle of progression.
</rh-cue>
<rh-cue start="21:10" end="21:13">
So this is analyzing ultrasound
</rh-cue>
<rh-cue start="21:14" end="21:18">
of pregnant women and being able to help take measurements
</rh-cue>
<rh-cue start="21:18" end="21:22">
that are usually hard to calculate that can be done in an automated way.
</rh-cue>
<rh-cue start="21:22" end="21:26">
They're already taking the ultrasound scan and now they're executing this model.
</rh-cue>
<rh-cue start="21:26" end="21:31">
They can take some of these measurements to help the doctor avoid potentially more
</rh-cue>
<rh-cue start="21:31" end="21:34">
intrusive alternative methods so the patient wins.
</rh-cue>
<rh-cue start="21:35" end="21:36">
It makes their life better.
</rh-cue>
<rh-cue start="21:36" end="21:39">
And the doctor is getting help from this AI
</rh-cue>
<rh-cue start="21:39" end="21:40">
model.
</rh-cue>
<rh-cue start="21:40" end="21:42">
And those are, you know, just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter"></rh-cue>
<rh-cue start="21:42" end="21:45">
Those are some amazing examples when it comes to all these things.
</rh-cue>
<rh-cue start="21:45" end="21:49">
We're talking like CT scans, right, and x rays, other examples of computer vision.
</rh-cue>
<rh-cue start="21:49" end="21:52">
One thing that's kind of interesting in the space, I think
</rh-cue>
<rh-cue start="21:52" end="21:56">
whenever I get a chance to work on, let's say, an object detection model
</rh-cue>
<rh-cue start="21:56" end="21:58">
in one of our workshops, by the way, we're actually putting that out
</rh-cue>
<rh-cue start="21:58" end="22:01">
in front of people to say, Hey, look, you can use your phone.
</rh-cue>
<rh-cue start="22:01" end="22:04">
And it basically sends the image over to our OpenShift, right,
</rh-cue>
<rh-cue start="22:04" end="22:07">
with our data science platform and then analyzes what you see.
</rh-cue>
<rh-cue start="22:08" end="22:09">
And even in my case, where I take a picture of my dog
</rh-cue>
<rh-cue start="22:09" end="22:13">
as an example, it can't really decide is it a dog or a cat?
</rh-cue>
<rh-cue start="22:13" end="22:15">
I have a very funny looking dog,
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter"></rh-cue>
<rh-cue start="22:15" end="22:18">
and so there's always a percentage outcome, you know?
</rh-cue>
<rh-cue start="22:18" end="22:21">
In other words, I think it's a dog 52%.
</rh-cue>
<rh-cue start="22:21" end="22:22">
So I want to talk about that more.
</rh-cue>
<rh-cue start="22:22" end="22:25">
How important is it to get to 100% accuracy?
</rh-cue>
<rh-cue start="22:25" end="22:29">
How important is it to really, depending on the use case, to allow
</rh-cue>
<rh-cue start="22:29" end="22:34">
for the gray area, if you will, where it's an 80% accuracy or 70% accuracy?
</rh-cue>
<rh-cue start="22:34" end="22:36">
And where are the trade offs there associated with the application?
</rh-cue>
<rh-cue start="22:36" end="22:38">
Can you can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney"></rh-cue>
<rh-cue start="22:38" end="22:40">
Accuracy is definitely, you know, a touchy subject
</rh-cue>
<rh-cue start="22:40" end="22:43">
because how you measure it makes a huge difference.
</rh-cue>
<rh-cue start="22:43" end="22:46">
And then I think with like what you were describing with the dog example, there's
</rh-cue>
<rh-cue start="22:46" end="22:51">
sort of a top five potential classes that may be identified.
</rh-cue>
<rh-cue start="22:51" end="22:55">
So let's say you're doing object detection and you detect a region of interest
</rh-cue>
<rh-cue start="22:55" end="22:57">
and it says 65% confidence.
</rh-cue>
<rh-cue start="22:57" end="22:58">
This is a dog.
</rh-cue>
<rh-cue start="22:58" end="23:03">
Well, the next potential label that could be maybe 50% confidence
</rh-cue>
<rh-cue start="23:03" end="23:08">
or 20% confidence might be something similar to a dog or in the case of models
</rh-cue>
<rh-cue start="23:08" end="23:11">
that have been trained on, like, the ImageNet dataset
</rh-cue>
<rh-cue start="23:11" end="23:15">
or on the COCO dataset, they have like actual breeds of dogs.
</rh-cue>
<rh-cue start="23:15" end="23:20">
So if I want to look at the top five labels for a dog,
</rh-cue>
<rh-cue start="23:20" end="23:24">
for my dog, for example, she's a mix, mostly Labrador retriever.
</rh-cue>
<rh-cue start="23:24" voice="Ryan Loney"></rh-cue>
<rh-cue start="23:24" end="23:29">
But I may look at the top five labels and it may say 65% confidence that she's
</rh-cue>
<rh-cue start="23:29" end="23:34">
a flat coated retriever and then confidence that she's a husky,
</rh-cue>
<rh-cue start="23:34" end="23:39">
at, you know, 20%, and then 5% confidence that she's a Greyhound or something.
</rh-cue>
<rh-cue start="23:40" end="23:42">
Those labels, all of them are dogs.
</rh-cue>
<rh-cue start="23:42" end="23:45">
So if I'm just trying to figure out is, is this a dog,
</rh-cue>
<rh-cue start="23:45" end="23:50">
I could probably find all of the, you know, classes within the data set
</rh-cue>
<rh-cue start="23:50" end="23:53">
and say, well, these are all, you know, class ID
</rh-cue>
<rh-cue start="23:53" end="24:00">
65, 132, 92 and 158 all belong to a group of dogs.
</rh-cue>
<rh-cue start="24:00" end="24:04">
So if I wanted to just write an application to tell me if this is a dog
</rh-cue>
<rh-cue start="24:04" end="24:07">
or not, I would probably use that to determine if it's a dog.
</rh-cue>
<rh-cue start="24:08" end="24:10">
But how you measure that accuracy,
</rh-cue>
<rh-cue start="24:10" end="24:11">
Well, that's where it gets a little bit complicated,
</rh-cue>
<rh-cue start="24:11" end="24:15">
because if you're being really strict about the definition and you're
</rh-cue>
<rh-cue start="24:15" end="24:18">
trying to validate against the data set of labeled images
</rh-cue>
<rh-cue start="24:18" end="24:22">
and I have specific dog breeds or some specific detail
</rh-cue>
<rh-cue start="24:22" end="24:25">
and it doesn't match, well, then the accuracy is going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney"></rh-cue>
<rh-cue start="24:25" end="24:29">
That's especially important when we talk about things like compression
</rh-cue>
<rh-cue start="24:29" end="24:30">
and quantization,
</rh-cue>
<rh-cue start="24:30" end="24:34">
which, you know, historically has been difficult to get adoption
</rh-cue>
<rh-cue start="24:34" end="24:40">
in some domains like health care, where even the hint of accuracy going down
</rh-cue>
<rh-cue start="24:40" end="24:44">
implies that we're not going to be able to help in some small case,
</rh-cue>
<rh-cue start="24:44" end="24:47">
maybe if it's even half a percent of the time
</rh-cue>
<rh-cue start="24:47" end="24:51">
we won't detect that the tube is placed incorrectly or that, you know,
</rh-cue>
<rh-cue start="24:51" end="24:54">
that patient's, you know, lung has collapsed or something like that.
</rh-cue>
<rh-cue start="24:54" end="24:58">
And that's something that really prevents adoption of some of these methods
</rh-cue>
<rh-cue start="24:58" end="25:01">
that can really boost performance like quantization.
</rh-cue>
<rh-cue start="25:01" end="25:05">
But if you take that example of sort of different from the dog example
</rh-cue>
<rh-cue start="25:05" end="25:07">
and you think about like segmentation of kidneys.
</rh-cue>
<rh-cue start="25:07" end="25:11">
So if I'm doing kidney segmentation, which is, you know, taking a CT scan
</rh-cue>
<rh-cue start="25:12" end="25:14">
and then trying to pick the pixels out of that
</rh-cue>
<rh-cue start="25:14" end="25:17">
scan that belong to a kidney,
</rh-cue>
<rh-cue start="25:17" end="25:20">
how I measure accuracy may be
</rh-cue>
<rh-cue start="25:20" end="25:24">
how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney"></rh-cue>
<rh-cue start="25:25" end="25:29">
Missing some of the pixels is maybe not a problem, right,
</rh-cue>
<rh-cue start="25:29" end="25:33">
depending on how you built the application because you still detect the kidney
</rh-cue>
<rh-cue start="25:34" end="25:38">
and maybe you just need to apply padding around the region of interest
</rh-cue>
<rh-cue start="25:38" end="25:41">
so that you don't miss any of the actual kidney
</rh-cue>
<rh-cue start="25:42" end="25:45">
when you compress the model and when you quantize the model. But
</rh-cue>
<rh-cue start="25:45" end="25:50">
that requires, you know, a data scientist or an ML engineer, somebody to really
</rh-cue>
<rh-cue start="25:51" end="25:52">
they have to be
</rh-cue>
<rh-cue start="25:52" end="25:55">
able to go and apply that after the fact, after the inference
</rh-cue>
<rh-cue start="25:55" end="25:59">
happens to make sure that you're not losing critical information,
</rh-cue>
<rh-cue start="25:59" end="26:02">
because the next step from detecting the kidney may be detecting a tumor.
</rh-cue>
<rh-cue start="26:03" voice="Ryan Loney"></rh-cue>
<rh-cue start="26:03" end="26:06">
And so maybe you can use the more optimized model
</rh-cue>
<rh-cue start="26:06" end="26:11">
to detect the kidney, but then you can use a slower model to detect the tumor.
</rh-cue>
<rh-cue start="26:11" end="26:15">
But that also requires somebody to architect and make that decision
</rh-cue>
<rh-cue start="26:15" end="26:16">
or that tradeoff and say,
</rh-cue>
<rh-cue start="26:16" end="26:20">
well, I need to add padding, or I should only use the quantized model
</rh-cue>
<rh-cue start="26:20" end="26:24">
to detect the region of interest for the kidney and then use the model
</rh-cue>
<rh-cue start="26:24" end="26:27">
that takes longer to do the inference
</rh-cue>
<rh-cue start="26:27" end="26:30">
just to find the tumor, which is going to be on a smaller size.
</rh-cue>
<rh-cue start="26:30" end="26:33">
Right. The dimensions are going to be much smaller
</rh-cue>
<rh-cue start="26:33" end="26:35">
once we crop to the region of interest.
</rh-cue>
<rh-cue start="26:35" end="26:40">
But all of those details, that's maybe not easy to explain in a few sentences.
</rh-cue>
<rh-cue start="26:40" end="26:43">
And even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter"></rh-cue>
<rh-cue start="26:45" end="26:46">
I do love that use case.
</rh-cue>
<rh-cue start="26:46" end="26:47">
Like you mentioned, the cropping
</rh-cue>
<rh-cue start="26:47" end="26:50">
even in one such area that we worked on for another project,
</rh-cue>
<rh-cue start="26:50" end="26:53">
we specifically decided to pixelate the image that we had taken
</rh-cue>
<rh-cue start="26:53" end="26:57">
because we knew that we could get the outcome we wanted by even
</rh-cue>
<rh-cue start="26:57" end="27:01">
just using a smaller image or having less resolution in our image.
</rh-cue>
<rh-cue start="27:01" end="27:04">
And therefore, as we transferred it from the mobile device storage device
</rh-cue>
<rh-cue start="27:04" end="27:08">
up into the cloud, we wanted that smaller image just for transfer purposes
</rh-cue>
<rh-cue start="27:08" end="27:11">
and we could still get the accuracy we needed, verified by a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter"></rh-cue>
<rh-cue start="27:11" end="27:14">
And one thing that's interesting about that from my perspective is
</rh-cue>
<rh-cue start="27:15" end="27:18">
if you're doing image processing, sometimes it takes a while
</rh-cue>
<rh-cue start="27:18" end="27:20">
for this transaction to occur.
</rh-cue>
<rh-cue start="27:20" end="27:20">
Like I,
</rh-cue>
<rh-cue start="27:20" end="27:24">
I come from a traditional application background, you know, where I'm reading
</rh-cue>
<rh-cue start="27:24" end="27:25">
and writing things from a database
</rh-cue>
<rh-cue start="27:25" end="27:28">
or a message broker or moving data from one place to another.
</rh-cue>
<rh-cue start="27:28" end="27:29">
Those things happen sub-second.
</rh-cue>
<rh-cue start="27:29" end="27:33">
Normally, even with great latency between your data centers, you know,
</rh-cue>
<rh-cue start="27:33" end="27:34">
it's still sub-second
</rh-cue>
<rh-cue start="27:34" end="27:38">
in most cases. While a transaction like this one can actually take 2 seconds
</rh-cue>
<rh-cue start="27:38" end="27:42">
or 4 seconds as it's doing its analysis and actually coming back with, you know,
</rh-cue>
<rh-cue start="27:42" end="27:46">
I think it's a dog, I think it's a kidney, I think it's whatever, and provided me
</rh-cue>
<rh-cue start="27:46" end="27:48">
that accuracy statement.
</rh-cue>
<rh-cue start="27:48" end="27:51">
So that concept of optimization is very important
</rh-cue>
<rh-cue start="27:51" end="27:53">
in the overall application architecture.
</rh-cue>
<rh-cue start="27:53" end="27:56">
Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" end="27:56">
Yeah, definitely.
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney"></rh-cue>
<rh-cue start="27:56" end="27:58">
It depends too on the use case.
</rh-cue>
<rh-cue start="27:58" end="28:02">
So if you think about how important it is to reduce the latency
</rh-cue>
<rh-cue start="28:02" end="28:06">
and increase the number of frames per second that you can process when you're
</rh-cue>
<rh-cue start="28:06" end="28:10">
talking about a loss prevention model that's running at a grocery store.
</rh-cue>
<rh-cue start="28:10" end="28:13">
So you want to keep the lines moving.
</rh-cue>
<rh-cue start="28:13" end="28:16">
You don't want every person who's at the self-checkout
</rh-cue>
<rh-cue start="28:16" end="28:19">
to have to wait 5 seconds for every item they scan.
</rh-cue>
<rh-cue start="28:19" end="28:22">
You need it to happen as quickly as possible.
</rh-cue>
<rh-cue start="28:22" end="28:25">
And sometimes, you know, if the accuracy
</rh-cue>
<rh-cue start="28:25" end="28:28">
decreases slightly, or I'd say the accuracy of the whole pipeline.
</rh-cue>
<rh-cue start="28:28" end="28:32">
So not just looking at the individual model or the individual inference, but
</rh-cue>
<rh-cue start="28:32" end="28:36">
let's say that the whole pipeline is not as successful at detecting
</rh-cue>
<rh-cue start="28:37" end="28:40">
when somebody steals one item from the self-checkout,
</rh-cue>
<rh-cue start="28:41" end="28:43">
it's not going to be a life threatening situation.
</rh-cue>
<rh-cue start="28:43" end="28:47">
Whereas, you know, being hooked up to the X-ray machine
</rh-cue>
<rh-cue start="28:47" end="28:51">
with the tube placement model, they might be willing to have the doctor,
</rh-cue>
<rh-cue start="28:51" end="28:54">
the nurse wait 5 seconds to get the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney"></rh-cue>
<rh-cue start="28:55" end="28:58">
They don't need it to happen in 500 milliseconds.
</rh-cue>
<rh-cue start="28:58" end="29:02">
So they're willing to wait; their threshold for waiting is a little bit higher.
</rh-cue>
<rh-cue start="29:02" end="29:05">
So that, I think, also drives some of the decision, like
</rh-cue>
<rh-cue start="29:06" end="29:09">
you want to keep people moving through the checkout line
</rh-cue>
<rh-cue start="29:09" end="29:13">
and you can afford to potentially lose a little bit of accuracy here
</rh-cue>
<rh-cue start="29:13" end="29:14">
and there, it's not going to
</rh-cue>
<rh-cue start="29:14" end="29:18">
cost the company that much money or it's not going to be life threatening.
</rh-cue>
<rh-cue start="29:18" end="29:21">It's going to be worth the tradeoff of keeping the line moving</rh-cue>
<rh-cue start="29:21" end="29:24">and not having people leave the store and not check out at all.</rh-cue>
<rh-cue start="29:24" end="29:27">And to say, I'm not going to shop today because the line's too long.</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter"></rh-cue>
<rh-cue start="29:30" end="29:32">There are so many trade offs and enterprise</rh-cue>
<rh-cue start="29:32" end="29:35">AML use cases, things like latency, accuracy and availability.</rh-cue>
<rh-cue start="29:35" end="29:40">And certainly complexities abound, especially in an obviously ever evolving</rh-cue>
<rh-cue start="29:40" end="29:43">technological landscape where we are still very early in the adoption of AML.</rh-cue>
<rh-cue start="29:44" end="29:47">And to navigate that complexity, the direct feedback from real world</rh-cue>
<rh-cue start="29:47" end="29:51">end users is essential to Ryan and his team at Intel.</rh-cue>
<rh-cue start="29:52" end="29:54">What would you say are some of the big hurdles or big</rh-cue>
<rh-cue start="29:54" end="29:57">outcomes, big opportunities in that space?</rh-cue>
<rh-cue start="29:57" end="30:01">And do you agree that we're kind of still at the very beginning in our infancy,</rh-cue>
<rh-cue start="30:01" end="30:01">if you will,</rh-cue>
<rh-cue start="30:01" end="30:05">of adopting these technologies and and discovering what they can do for us?</rh-cue>
<rh-cue start="30:05" voice="Ryan Loney"></rh-cue>
<rh-cue start="30:05" end="30:07">Yeah, I think we're definitely in the infancy</rh-cue>
<rh-cue start="30:07" end="30:10">and I think that what we've seen is our customers are evolving</rh-cue>
<rh-cue start="30:10" end="30:14">and the people who are deploying on Intel hardware, they're trying to run</rh-cue>
<rh-cue start="30:14" end="30:16">more complicated models.</rh-cue>
<rh-cue start="30:16" end="30:19">They're the models that are doing object detection or, you know,</rh-cue>
<rh-cue start="30:19" end="30:22">detecting defects and, you know, doing segmentation.</rh-cue>
<rh-cue start="30:23" end="30:27">You know, in the past you could say, oh, here's a generic model that will do face</rh-cue>
<rh-cue start="30:27" end="30:31">detection or person detection or vehicle detection and license plate detection.</rh-cue>
<rh-cue start="30:32" end="30:33">And those are sort of like</rh-cue>
<rh-cue start="30:33" end="30:36">general purpose models that you can just grab off the shelf and use them.</rh-cue>
<rh-cue start="30:37" end="30:40">But now we're moving into like the anomaly scenarios</rh-cue>
<rh-cue start="30:40" end="30:44">where I've got my own data and I'm trying to do something very specific</rh-cue>
<rh-cue start="30:45" end="30:47">and I'm the only one that has access to this data.</rh-cue>
<rh-cue start="30:47" end="30:51">And you don't have a public data set that you can go download</rh-cue>
<rh-cue start="30:51" end="30:54">that's under Creative Commons license for, you know, car batteries.</rh-cue>
<rh-cue start="30:54" end="30:57">It's, you know, it's just not something that's available.</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney"></rh-cue>
<rh-cue start="30:57" end="31:02">And so those use cases, the challenge with with training those models</rh-cue>
<rh-cue start="31:02" end="31:06">and and getting them optimized is the beginning of the pipeline.</rh-cue>
<rh-cue start="31:06" end="31:10">It's the data you have to get the data you have to annotated</rh-cue>
<rh-cue start="31:10" end="31:12">and the tools have to exist for you to do that.</rh-cue>
<rh-cue start="31:12" end="31:15">And that's part of the problem that we're trying to help solve.</rh-cue>
<rh-cue start="31:16" end="31:17">And then the models are getting more complex.</rh-cue>
<rh-cue start="31:17" end="31:21">So if you think, you know, just from working with customers recently,</rh-cue>
<rh-cue start="31:21" end="31:22">you know, they're no longer</rh-cue>
<rh-cue start="31:22" end="31:26">just trying to do image classification and, you know, like is it a dog or a cat?</rh-cue>
<rh-cue start="31:26" end="31:29">They've moved on to like 3D point clouds</rh-cue>
<rh-cue start="31:29" end="31:34">and, you know, 3D segmentation models and things that are like the speech</rh-cue>
<rh-cue start="31:34" end="31:39">synthesis example, doing things these GPT models that are generating,</rh-cue>
<rh-cue start="31:40" end="31:44">you know, you, you put a text input and it generates an image for you.</rh-cue>
<rh-cue start="31:44" end="31:47">It's just becoming much more advanced, much more sophisticated</rh-cue>
<rh-cue start="31:48" end="31:50">and on larger images.</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney"></rh-cue>
<rh-cue start="31:50" end="31:54">And so things like running super resolution enhancing images, upscaling</rh-cue>
<rh-cue start="31:54" end="31:59">images, instead of just trying to take that, you know, 200 by 200 pixel</rh-cue>
<rh-cue start="32:00" end="32:02">image and classifying if it's a cat.</rh-cue>
<rh-cue start="32:02" end="32:05">Now we're talking about gigantic</rh-cue>
<rh-cue start="32:05" end="32:09">huge images that we're processing and that all requires</rh-cue>
<rh-cue start="32:09" end="32:12">more resources or more optimized models.</rh-cue>
<rh-cue start="32:13" end="32:16">And, you know, every computer vision conference or A.I.</rh-cue>
<rh-cue start="32:16" end="32:19">conference, there's there's a new latest and greatest architecture.</rh-cue>
<rh-cue start="32:19" end="32:22">There's new research paper, and things are getting adopted much faster.</rh-cue>
<rh-cue start="32:23" end="32:27">The lead time for a nurse paper or CV PR</rh-cue>
<rh-cue start="32:27" end="32:30">for a company to actually adopt and put those into production.</rh-cue>
<rh-cue start="32:30" end="32:32">It's like the time shortens every year.</rh-cue>
<rh-cue start="32:33" voice="Burr Sutter"></rh-cue>
<rh-cue start="32:33" end="32:35">Well, Ryan, I got to tell you, I could talk to you</rh-cue>
<rh-cue start="32:35" end="32:39">literally all day about these topics, the various use cases, the various ways</rh-cue>
<rh-cue start="32:39" end="32:41">models are being optimized,</rh-cue>
<rh-cue start="32:41" end="32:44">how to put models into a pipeline for average enterprise applications.</rh-cue>
<rh-cue start="32:44" end="32:47">I've enjoyed learning about pop and vino and anomalies,</rh-cue>
<rh-cue start="32:47" end="32:50">but I'm fascinated by this because I will have a chance to go try this myself.</rh-cue>
<rh-cue start="32:51" end="32:52">Taking advantage of Red Hat OpenShift</rh-cue>
<rh-cue start="32:52" end="32:54">and taking advantage of our data science platform.</rh-cue>
<rh-cue start="32:54" end="32:58">On top of that, I will definitely go be poking at this myself.</rh-cue>
<rh-cue start="32:58" end="33:00">So thank you so much for your time today.</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney"></rh-cue>
<rh-cue start="33:00" end="33:00">Thanks, Burr.</rh-cue>
<rh-cue start="33:00" end="33:01">This was a lot of fun.</rh-cue>
<rh-cue start="33:01" end="33:04">Thanks for having me.</rh-cue>
<rh-cue start="33:04" voice="Burr Sutter"></rh-cue>
<rh-cue start="33:04" end="33:06">And you can check out</rh-cue>
<rh-cue start="33:06" end="33:09">the full transcript of our conversation and more resources,</rh-cue>
<rh-cue start="33:09" end="33:12">like a link to a white paper on open vino and normal lib at Red Hat dot</rh-cue>
<rh-cue start="33:12" end="33:15">com slash code Comments Podcast.</rh-cue>
<rh-cue start="33:15" end="33:19">This episode was produced by Brant Seminole and Caroline Prickett.</rh-cue>
<rh-cue start="33:20" end="33:21">Our sound designer is Christian.</rh-cue>
<rh-cue start="33:21" end="33:26">From our audio team includes Lee Day, Stephanie Wunderlich, Mike Esser,</rh-cue>
<rh-cue start="33:27" end="33:32">Laura Barnes, Claire Allison, Nick Burns, Aaron Williamson, Karen King,</rh-cue>
<rh-cue start="33:32" end="33:36">Booboo House, Rachel Artell, Mike Compton, Ocean</rh-cue>
<rh-cue start="33:36" end="33:40">Mathews, Laura Walters, Alex Trabelsi and Victoria Lutton.</rh-cue>
<rh-cue start="33:41" end="33:43">I'm your host, Burt Sutter.</rh-cue>
<rh-cue start="33:43" end="33:45">Thank you for joining me today on Code Comments.</rh-cue>
<rh-cue start="33:45" end="33:48">I hope you enjoyed today's session and today's conversation, and I</rh-cue>
<rh-cue start="33:48" end="33:52">look forward to many more in.</rh-cue>
</rh-transcript>
</rh-audio-player>
<h2>Even with more headings</h2>
<p>This last heading is allowed to go back up to h2, but the player should still take h3 as its root heading level.</p>
</section>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
Language Localization
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
rh-audio-player {
margin: var(--rh-space-xl, 24px);
}
```
<rh-audio-player lang="es" layout="full" poster="https://www.redhat.com/cms/managed-files/img-clh-s4e1-hero-455x539_0.png">
<p slot="series">Temporada 4, Episodio 1</p>
<h3 slot="title">Minicomputadoras: el alma de las máquinas de antes</h3>
<audio crossorigin="anonymous" slot="media" controls="" srclang="es" src="https://cdn.simplecast.com/audio/ec894038-2e91-449e-9bff-e7ebc323c3e6/episodes/7507a7a1-7340-43f9-bde2-bb0645646ff6/audio/ab8ed5c7-9fdc-4a39-8384-141d66f02c46/default_tc.mp3"></audio>
<rh-audio-player-about slot="about" label="Notas del podcast">
<p>Sí, es cierto, las minicomputadoras no caben en tu bolsillo, pero en su momento representaron un avance importante porque redujeron el espacio que necesitaban sus antecesoras, las mainframes,
que ocupaban habitaciones enteras. Además, abrieron la posibilidad de que las computadoras personales cupieran en una bolsa y de que, posteriormente, se convirtieran en el teléfono que traes en
tu bolsillo. </p>
<p>Las computadoras de 16 bits cambiaron el mundo de la tecnología de la información en los años 70. Gracias a ellas, las empresas tuvieron la posibilidad de darle a cada ingeniero su propia
máquina. Pero los avances aún no eran suficientes; todavía faltaba que llegaran las versiones de 32 bits. </p>
<p>Carl Alsing y Jim Guyer nos hablan del trabajo que realizaron en Data General para crear una nueva y revolucionaria máquina de 32 bits. Y aunque ahora esos esfuerzos son toda una leyenda, en su
momento se realizaron en secreto. “Eagle” era el nombre clave de la computadora que diseñaron, cuyo primer propósito era competir con otra máquina que estaba desarrollando otro equipo de la
misma empresa. Los ingenieros nos hablan de las políticas corporativas y nos explican todas las tramas necesarias para que el proyecto pudiera seguir su curso, e incluso nos dicen cómo lograron
que las restricciones jugaran a su favor. Neal Firth nos cuenta cómo vivió un proyecto muy emocionante pero exigente, en que nuestros héroes trabajaron juntos por pura voluntad, sin ninguna
expectativa de fama ni fortuna. Y los tres nos mencionan que la historia quedó inmortalizada en el libro clásico de ingeniería de Tracy Kidder, <em>El alma de una nueva máquina,</em> que se basa
en hechos reales.</p>
</rh-audio-player-about>
<rh-transcript slot="transcript" label="Transcripción">
<rh-cue start="00:03" voice="Presentadora">
Corría el año de 1978 y en el sector de las minicomputadoras había una guerra a punto de estallar. Apenas un año antes, Digital Equipment Corporation, o DEC, había lanzado su computadora VAX
11 780 de 32 bits. Tenía una capacidad mucho mayor que las máquinas de 16 bits del mercado. Las ventas de la VAX pronto arrasaron con las de la competencia, que ofrecía computadoras más
lentas. A Data General, la archienemiga de DEC, le urgía diseñar una nueva máquina capaz de competir con la VAX. Necesitaba su propia computadora de 32 bits y la necesitaba ya, pero la
competencia entre Data General y DEC no era el único conflicto del momento. También había una disputa territorial en el interior de Data General, y el resultado de ambas guerras sería la
creación de una computadora increíble, en circunstancias igual de increíbles. Una laptop de 13 pulgadas pesa como kilo y medio. Hoy en día damos por hecho la portabilidad y la practicidad de
nuestras computadoras, pero en la década de 1970 la mayoría eran mainframes del tamaño de una habitación; eran aparatos que costaban millones de dólares y pesaban varias toneladas. Luego,
cuando se desplomaron los costos del hardware, comenzó la carrera para desarrollar computadoras más pequeñas, más rápidas y más baratas. La minicomputadora abrió la posibilidad de que los
ingenieros y los investigadores tuvieran su propia terminal, y nos trajo a donde estamos en la actualidad.
</rh-cue>
<rh-cue start="01:37" voice="Presentadora">
En la temporada pasada de Command Line Heroes en español, analizamos un área clave para el desarrollo del software: el mundo de los lenguajes de programación. Hablamos de su historia, de los
problemas que resolvieron y de su evolución a través del tiempo. Abordamos lenguajes como JavaScript, Python, C, Perl, COBOL y Go. En esta temporada, que es la cuarta, por si alguien lleva la
cuenta, vamos a profundizar en el hardware en que se ejecuta nuestro software. Te vamos a contar siete historias maravillosas sobre las personas y los equipos que se atrevieron a cambiar las
reglas del hardware. Piensa en la laptop que está en tu escritorio, o en el teléfono que traes en el bolsillo… Los héroes de la línea de comandos siempre dejan el alma en tu hardware; con su
pasión por el diseño informático y su ingenio para que cada pieza se vuelva realidad, han revolucionado la forma en que programamos hoy en día.
</rh-cue>
<rh-cue start="02:36" voice="Presentadora">
Esto es Command Line Heroes en español, un podcast original de Red Hat.
</rh-cue>
<rh-cue start="02:45" voice="Presentadora">
El primer episodio de esta temporada cuenta la carrera contrarreloj de un equipo de ingenieros que tenían que diseñar, depurar y entregar una computadora de vanguardia. Su trabajo se
convirtió en el tema principal del bestseller El alma de una nueva máquina, de Tracy Kidder, que posteriormente se haría acreedor al premio Pulitzer y que habla de muchos de los invitados de
este episodio.
</rh-cue>
<rh-cue start="03:07" voice="Presentadora">
Pero volvamos a Data General. El presidente de la compañía, Ed de Castro, había trazado un plan para competir con DEC. Dividió al departamento de ingeniería y trasladó a una parte del equipo
de la sede de Westboro, Massachusetts, en Estados Unidos, a una nueva oficina que estaba en Carolina del Norte. ¿Su misión? Diseñar una computadora avanzada de 32 bits que hiciera trizas a la
VAX. El proyecto se llamaba Fountainhead, y de Castro le dio apoyo y recursos casi ilimitados. Fountainhead iba a ser la salvación de la empresa. Los pocos ingenieros que se quedaron en
Massachusetts se sintieron terriblemente menospreciados. Sabían que eran capaces de crear una computadora que destrozara a la VAX, y que probablemente sería mejor que la de Fountainhead, pero
de Castro no les daba la oportunidad. Así que Tom West, que era el líder del grupo, decidió ocuparse personalmente del asunto. El ingeniero en Computación Tom West, de formación autodidacta,
dirigía el departamento Eclipse de Data General. Eclipse era la gama de minicomputadoras de 16 bits más exitosa de Data General. Tom sabía fabricar y distribuir computadoras, y también sabía
lo que quería el mercado. Después de poner en marcha el proyecto Fountainhead, de Castro les pidió a los demás ingenieros que siguieran mejorando la gama de productos del año anterior. Ni a
Tom ni a los demás les convencía la idea.
</rh-cue>
<rh-cue start="04:31" voice="Carl Alsing">
No nos hacía ninguna gracia. Algunos decidieron cambiar de empleo y otros estábamos deprimidos y preocupados por nuestras carreras; no nos sentíamos nada entusiasmados. Y nos imaginábamos que
el otro grupo no lo iba a lograr.
</rh-cue>
<rh-cue start="04:46" voice="Presentadora">
Carl Alsing era el gerente del grupo de microprogramación de Data General. Era el segundo al mando, después de Tom. Así que los dos decidieron empezar su propio proyecto.
</rh-cue>
<rh-cue start="04:56" voice="Carl Alsing">
Iba a ser un diseño completamente nuevo, con las técnicas más avanzadas, para diseñar una computadora de 32 bits que superara a la VAX de DEC. Preparamos nuestra propuesta, se la presentamos
al presidente, Ed de Castro, que nos dice: “No, para nada. El grupo de Carolina del Norte está en eso. No se preocupen”. Nos desanimamos, pero se nos ocurrió otra propuesta a la que le pusimos
Víctor. Buscamos formas de mejorar el producto del año pasado. Le pusimos un pequeño interruptor, un bit de modo en el sistema; si lo encendías permitía que la computadora funcionara como una
minicomputadora moderna de 32 bits, pero lenta. Se lo llevamos a Ed de Castro y se lo presentamos. Y total que nos dijo: “Eso es un bit de modo. No quiero ni ver diseños con bits de modo. La
que se encarga de los nuevos diseños es Carolina del Norte”. Entonces otra vez nos desanimamos, y creo que fue en ese momento que Tom West decidió hacer algo a escondidas.
</rh-cue>
<rh-cue start="06:06" voice="Presentadora">
A Tom se le ocurrieron dos cosas. Una de ellas era para de Castro. Iban a mejorar la antigua línea de productos Eclipse: la harían un poco más rápida, le agregarían unos cuantos botones, le
cambiarían el color. Tom lo presentó como una especie de plan b, en caso de que algo saliera mal en Carolina del Norte. De Castro lo aprobó. Pero a su equipo Tom le contó otra historia, una
más interesante.
</rh-cue>
<rh-cue start="06:32" voice="Carl Alsing">
Tom West nos propuso diseñar una computadora moderna, muy buena, que fuera totalmente compatible con las anteriores y que pudiera manejar lo último en alta tecnología. Iba a tener memoria
virtual, 32 bits, códigos de corrección de errores y esas cosas. Multitareas, multiprocesamiento, mucha memoria. “Oigan, vamos a diseñar una computadora nueva que se va a comer vivo al
mercado”.
</rh-cue>
<rh-cue start="07:04" voice="Presentadora">
El código de esta maravilla informática: la “Eagle”. Actualmente parece que no hay límites con lo que podemos hacer con la memoria integrada de nuestra computadora, pero en ese entonces,
pasar de 16 a 32 bits era un paso enorme. De un día para otro, el espacio de direcciones había pasado de 65 mil bytes de información a más de 4 mil millones. Y con ese aumento, el software
podía procesar mayores cantidades de datos. Esto generó dos grandes desafíos para las empresas de informática: obviamente había que pasar de 16 a 32 bits, pero además había que dejar conformes
a los antiguos clientes, que todavía utilizaban el software anterior. Así que había que desarrollar una computadora que pudiera funcionar con el viejo software; una computadora de 32 bits que
fuera compatible con lo anterior. La VAX tenía una gran potencia, pero no tenía ninguna solución elegante para el segundo problema. Tom estaba decidido a que su Eagle fuera la respuesta.
</rh-cue>
<rh-cue start="08:14" voice="Presentadora">
La Eagle estaba escondida en el sótano del edificio Westborough, 14 AB. Tom le pidió a Carl que dirigiera la microcodificación. Carl nombró a Chuck Holland como gerente de los programadores,
que se pusieron el nombre de Micro Kids. Mientras, Ed Rasala supervisaría el hardware. Y Ed designó a Ken Holberger para que dirigiera al equipo, al que llamaron, muy apropiadamente, los Hardy
Boys. Tom encontró un aliado: el vicepresidente de Ingeniería, Carl Carman. Carman también tenía cuentas pendientes con De Castro, que se había negado a ponerlo a cargo del grupo de Carolina
del Norte.
</rh-cue>
<rh-cue start="08:51" voice="Carl Alsing">
Carl Carman sabía en qué andábamos, pero no le dijo nada a su jefe. Él era el que nos financiaba; el problema es que necesitábamos ingenieros muy, pero muy buenos, pero teníamos que mantener
bajos los salarios. Así que decidimos contratar estudiantes universitarios. Una de las ventajas es que no conocen tus límites. Creen que puedes hacer cualquier cosa.
</rh-cue>
<rh-cue start="09:15" voice="Presentadora">
Jim Guyer había egresado de la universidad dos años antes y trabajaba en Data General cuando lo pusieron a cargo de los Hardy Boys.
</rh-cue>
<rh-cue start="09:21" voice="Jim Guyer">
La computadora que estaban desarrollando en Carolina del Norte tenía una tecnología informática mucho más avanzada, casi como una mainframe. Y bueno, digamos que en esa época no era cualquier
cosa ponerse a competir con IBM y las demás empresas de mainframes. Creíamos que teníamos ventaja porque nuestro proyecto no era tan ambicioso y estábamos muy, muy concentrados en una
implementación clara, sencilla y elegante, de bajo costo, pocos componentes... cosas así.
</rh-cue>
<rh-cue start="09:51" voice="Presentadora">
Bajo costo, diseño sencillo... Eso los hizo entender que tendrían que usar el firmware para controlar todo. Mientras más funciones lograran implementar en el firmware en vez del hardware, más
barato y flexible sería el resultado
</rh-cue>
<rh-cue start="10:03" voice="Presentadora">
Además, podrían hacer los cambios a medida que se necesitaran. Actualmente nos parece lógico porque así funcionan las computadoras, pero en 1978 era algo completamente nuevo.
</rh-cue>
<rh-cue start="10:15" voice="Carl Alsing">
El diseño que estábamos haciendo era algo básico. Lo que queríamos era encontrar formas sencillas y directas de hacer las cosas, sin complicaciones, porque sabíamos que no podíamos terminar
diseñando una computadora grande y cara. Necesitábamos usar pocas tarjetas, pocos circuitos, y de hecho eso nos ayudaba para que fuera rápida. No es lo mismo diseñar un producto seguro y sin
riesgos, que diseñar un producto exitoso. Y no nos importaban los riesgos. Nos importaba el éxito. Queríamos que nuestra computadora fuera rápida y barata, y queríamos diseñarla rápido. Así
que le pusimos unas tres o cuatro tarjetas, lo mínimo que podíamos de hardware, y lo compensamos con el firmware.
</rh-cue>
<rh-cue start="11:06" voice="Presentadora">
Pero el equipo de la Eagle se enfrentaba a varios obstáculos difíciles de superar. La VAX era la computadora de 32 bits con mejor rendimiento del mundo. La Eagle necesitaba estar a la altura.
Pero además, tenía que ser compatible con la arquitectura anterior de 16 bits de Data General. Para lograr todo eso, pero con menos tiempo y dinero que los demás equipos, había que apostarle
mucho a la Eagle. Pero el equipo de Tom West estaba dispuesto a jugárselo todo.
</rh-cue>
<rh-cue start="11:32" voice="Jim Guyer">
Había dos sistemas que funcionaban las 24 horas del día, los 7 días de la semana, y teníamos dos turnos de ingenieros que trabajaban en eso. Todos necesitábamos entender cómo funcionaba todo.
Así que tuvimos que aprender qué hacía cada una de las piezas que construían los demás. Me costó mucho trabajo, pero al mismo tiempo aprendí muchísimo. Todos participábamos en el trabajo de
los demás y pensábamos: “¿Cuál es el siguiente paso para resolver este problema? ¿En qué hay que fijarse?” Todos revisábamos los diagramas de circuitos y demás documentos para tratar de
entender: “A ver, fíjate en esta señal, ve el estado de la computadora, revisa la secuencia de pasos del microcódigo. ¿Sí está haciendo lo que tiene que hacer? Uy, espérense, va para el otro
lado. Ay, ¿pero por qué hizo eso?”
</rh-cue>
<rh-cue start="12:13" voice="Carl Alsing">
Lo tomábamos muy en serio, era parte de la ética de trabajo. El ambiente era intenso. A veces había discusiones sobre la manera de hacer las cosas. Por ejemplo, tal vez había una forma un
poco más cara y otra que era más barata pero no tan rápida o eficaz. Y había discusiones acaloradas y reuniones en las que teníamos que esforzarnos por llegar a un acuerdo. Pero al final
lográbamos tomar una decisión. Y empezábamos a trabajar juntos.
</rh-cue>
<rh-cue start="12:44" voice="Carl Alsing">
Trabajábamos día y noche, nos repartíamos las horas que se necesitaban para diseñar el prototipo. Solo teníamos dos prototipos, y era muy importante que los dos equipos trabajaran en ellos.
Algunos trabajaban en la noche, otros trabajaban en el día, y ya empezábamos a cansarnos. Pero estábamos muy motivados, así que sentíamos mucha satisfacción. Así que nadie se quejaba mucho de
las condiciones laborales.
</rh-cue>
<rh-cue start="13:11" voice="Presentadora">
Las condiciones laborales. Algunos relatos de esa época dicen que, para que el equipo funcionara, Tom West puso en práctica una cosa que se llama “la gestión de los hongos”: si les das de
comer cualquier porquería y los mantienes en la oscuridad vas a verlos crecer. Estaban encerrados en un espacio de trabajo abarrotado y caluroso, así que las horas se hacían largas y los
plazos eran poco realistas. Dicen que Tom era enigmático, frío, indiferente. Uno de los ingenieros incluso lo llamaba el “Príncipe de las Tinieblas”. ¿Pero a Tom West le importaba tanto lograr
el éxito que se aprovechó de su equipo? ¿Sacrificó el bienestar de los Micro Kids y los Hardy Boys para diseñar la computadora perfecta?
</rh-cue>
<rh-cue start="13:56" voice="Jim Guyer">
Era interesante trabajar con Tom. Porque tenía muchas expectativas, pero no te daba suficientes instrucciones. Esperaba que entendieras lo que tenías que hacer, y si no, pues qué pena, te
sacaba del equipo.
</rh-cue>
<rh-cue start="14:10" voice="Presentadora">
Los que daban instrucciones eran Carl y Ed, los gerentes de línea que trabajaban codo a codo con Jim y el resto del equipo. Pero estos jóvenes ingenieros también buscaban el éxito, y les
gustaba tener la oportunidad de resolver las cosas ellos mismos.
</rh-cue>
<rh-cue start="14:26" voice="Jim Guyer">
Yo me gané el primer lugar de los Micro Kids por aguantar toda la noche sin dormir. Quién sabe, a lo mejor éramos jóvenes, empezábamos nuestra vida profesional, éramos bravucones, muy
seguros, y no entendíamos nada de nada. Confiábamos en nosotros mismos. Nos sentíamos muy inteligentes, creíamos que podíamos resolver todo, y yo supongo que el ego de los demás también nos...
nos alimentaba, en cierto sentido. Yo me la pasaba muy bien. Yo creo que la mayoría de nosotros nos divertíamos mucho.
</rh-cue>
<rh-cue start="14:56" voice="Presentadora">
Carl no está de acuerdo con lo de la gestión de los hongos. En su opinión no estaban en la oscuridad, sino al contrario: todos sabían exactamente lo que estaba pasando y lo que se esperaba.
Los directores eran los que no sabían. Al mismo tiempo, Tom West estaba bajo una enorme presión de varios frentes, y se la transmitía al grupo.
</rh-cue>
<rh-cue start="15:18" voice="Carl Alsing">
Tom mantenía en secreto la verdadera finalidad del proyecto. Así que no hablaba mucho con los ingenieros, se mantenía a distancia y obviamente les decía que no hablaran del proyecto fuera del
grupo, ni siquiera en su casa. Les decía que ni mencionaran la palabra Eagle. Así que también dejábamos muy claro que esto era muy urgente, que teníamos que lograrlo en un año, que la
competencia ya estaba en el mercado, y si queríamos salir al mercado en medio del pico de ventas, teníamos que lograrlo ya. Estaban muy estresados, y se esperaba que trabajaran en la noche y
los fines de semana; se esperaba que olvidaran los picnics con la familia; no había tiempo para nada que no fuera del trabajo.
</rh-cue>
<rh-cue start="16:06" voice="Presentadora">
Como me daba curiosidad saber cómo era trabajar en las trincheras del Edificio 14 AB, me senté a conversar con Neal Firth, que era uno de los Micro Kids. Acababa de salir de la universidad
cuando se incorporó al equipo.
</rh-cue>
<rh-cue start="16:20" voice="Presentadora">
¿Cómo era trabajar para Tom West? ¿Te comunicabas mucho con él?
</rh-cue>
<rh-cue start="16:24" voice="Neal Firth">
No tanto. Era como un fantasma. A veces lo veíamos por ahí. Intentaba no interferir para que hiciéramos lo que necesitábamos y alcanzáramos los objetivos. El proyecto era algo completamente
nuevo en comparación con lo que hacía Data General, y Tom no quería imponernos nada forzoso respecto a la generación anterior de procesadores.
</rh-cue>
<rh-cue start="16:49" voice="Presentadora">
Suena intenso, suena a que había que trabajar sin descanso y a que siempre había algo que resolver. ¿Cómo te sentías de que no tuvieran el tiempo necesario para lograrlo?
</rh-cue>
<rh-cue start="16:57" voice="Neal Firth">
Sinceramente no nos preocupaba. En realidad la falta de tiempo no era problema. Nosotros nos tomábamos el tiempo que hiciera falta para lograr el resultado. Por eso necesitábamos que nuestras
esposas nos apoyaran y fueran comprensivas, porque no siempre aceptaban todo. Era más o menos lo que sucedía con las personas de Silicon Valley de esa época, o con Jobs y Wozniak: “vamos a
ponernos a hacer esto hasta terminarlo”. No vivíamos en el mismo departamento ni nos sentábamos en el piso a escribir código, pero teníamos mucho en común con ellos.
</rh-cue>
<rh-cue start="17:35" voice="Presentadora">
¿Y qué te impulsaba a seguir adelante? ¿Por qué estabas tan motivado?
</rh-cue>
<rh-cue start="17:39" voice="Neal Firth">
La verdad solo era la posibilidad de resolver algún problema. Siempre me habían gustado los acertijos, los problemas que necesitaban solución. De hecho, así éramos casi todos. Todos
compartíamos eso, y todos lo disfrutábamos. Nos motivaba resolver los problemas, solucionar esas cosas, descubrir una manera nueva de hacer algo.
</rh-cue>
<rh-cue start="18:01" voice="Presentadora">
¿Y cuál fue el momento del proyecto que no vas a poder olvidar?
</rh-cue>
<rh-cue start="18:05" voice="Neal Firth">
Fue... Ya llevábamos mucho tiempo con el proyecto, y estábamos ejecutando el simulador de microcódigo. Y resulta que lo que se estaba ejecutando era la propuesta del simulador de producción,
que ya llevaba como 10 o 12 horas funcionando. Y de repente aparece la letra E en la consola… Nos esperamos un rato y de pronto aparece otra letra, y luego otra. Y entonces nos dimos cuenta de
que lo que estábamos ejecutando como código de prueba era el diagnóstico que estábamos diseñando para que se ejecutara. Así que el simulador ejecutaba el microcódigo, y ya había empezado a
imprimir letras como si realmente estuviera funcionando. Era mil veces más lento que en la vida real, o sea, era más lento que cuando se lanzó realmente, pero ese fue uno de los momentos que
nunca voy a olvidar.
</rh-cue>
<rh-cue start="19:02" voice="Presentadora">
Y ahora que lo piensas, ¿te parece que te explotaron?
</rh-cue>
<rh-cue start="19:07" voice="Neal Firth">
No. O sea, yo sabía. Yo sabía lo que estaba pasando. Entonces… no. No me siento explotado. En realidad, mis expectativas... yo nunca hubiera esperado participar en un proyecto tan importante
justo al salir de la universidad, ni tener la oportunidad de desempeñar un papel tan interesante en un proyecto así.
</rh-cue>
<rh-cue start="19:31" voice="Presentadora">
Me gustaría saber tu opinión sobre el sacrificio que requiere el inventar algo, porque cuando hacemos algo importante, en general hay que renunciar a algo para lograrlo, ¿no? Para lograr
algo, hay que renunciar a algo, ¿no? ¿Ese fue el caso? Y si sí, ¿a qué tuviste que renunciar?
</rh-cue>
<rh-cue start="19:48" voice="Neal Firth">
Yo no creo que haya tenido la conciencia de que iba a renunciar a algo. Más bien creo que lo que pasó fue que empecé a ser un poco más consciente de lo que estaba haciendo, y de que eso
afectaba a los que me rodeaban.
</rh-cue>
<rh-cue start="20:03" voice="Neal Firth">
Pero para mí no... no era un sacrificio, y las personas que me rodeaban lo vivían como algo normal; así son las cosas y punto. A mí me han contado cosas horribles de lo que se vive hoy:
amanece, te despiertas, te inyectas café, muerdes un pedazo de pizza o cualquier cosa... y empiezas a escribir código hasta que te quedas dormido encima del teclado. Y al día siguiente, igual.
</rh-cue>
<rh-cue start="20:35" voice="Neal Firth">
Nosotros no hacíamos tantos sacrificios. Digo, yo seguía casado, tenía amigos, los veía... Sí, no era un trabajo de nueve a cinco, pero me permitió obtener muchos logros personales y
técnicos, y pude compartirlos con mi esposa, mi hermana, mi mamá, mi papá y mi suegro. O sea, mi familia lo apreciaba.
</rh-cue>
<rh-cue start="20:59" voice="Presentadora">
Sí. ¿Y cuál es el secreto para lograr algo maravilloso?
</rh-cue>
<rh-cue start="21:06" voice="Neal Firth">
¿Para lograr algo maravilloso? Qué interesante. Creo que la cosa es que quien participe lo haga porque quiere, no porque busca logros, fama o dinero. Porque esas son cosas muy fugaces y...
casi nunca te dejan satisfecho. Pero si la idea es alcanzar un objetivo, y colaboras con muchas personas y lo logras, ahí vas a saber lo que es la satisfacción.
</rh-cue>
<rh-cue start="21:42" voice="Presentadora">
Neal Firth era uno de los Micro Kids del proyecto Eagle. Hoy en día es el presidente de VIZIM Worldwide, que es una empresa de software.
</rh-cue>
<rh-cue start="21:57" voice="Presentadora">
Como bien dice el libro de Tracy Kidder, la indiferencia y el distanciamiento de Tom West eran a propósito. Era un intento de mantener la cabeza despejada, por encima de toda la cháchara
diaria, para conservar intacto el objetivo de la Eagle. Pero lo que más quería era proteger al equipo, aislarlo de la política y de los estira y afloja corporativos de su entorno. También
protegió a los Micro Kids y a los Hardy Boys de las ideas preconcebidas de lo que se podía lograr.
</rh-cue>
<rh-cue start="22:28" voice="Presentadora">
En 1980 se terminó el proyecto Eagle. Un año después de lo que Tom había prometido, pero se logró, a diferencia de Fountainhead. Y tal como pensaba el equipo sénior, el objetivo de
Fountainhead no se logró, y el proyecto se quedó olvidado en algún cajón. Bill Foster, que en ese entonces era director de desarrollo de software, nos cuenta las dificultades de Fountainhead.
</rh-cue>
<rh-cue start="22:50" voice="Bill Foster">
Creo que el mayor error fue que no se les puso ningún límite. Había que hacer la mejor computadora del mundo. “¿Pero para cuándo?” “Pues... la verdad no tenemos fecha.”. “¿Y cuánto debe
costar?” “Eh… tampoco sabemos”. Y yo le atribuyo el fracaso a Edson. No les puso suficientes límites a los programadores ni a los ingenieros.
</rh-cue>
<rh-cue start="23:15" voice="Bill Foster">
¿Y sabes qué pasa cuando no les pones límites? Diseñan algo tan amplio y complejo que simplemente no se puede concretar.
</rh-cue>
<rh-cue start="23:26" voice="Presentadora">
Pero a ver, vamos a hacer memoria. Tom y su equipo decidieron diseñar la Eagle a escondidas, y es lo que hicieron durante dos años. Y el presidente de la empresa nunca supo lo que estaba
pasando. La computadora ahora se llamaba oficialmente Eclipse MV/8000, y cuando ya estaba lista para salir al mercado, el jefe de marketing fue a ver a Ed de Castro para que aprobara la
campaña de publicidad. Vamos a escuchar a Carl Alsing.
</rh-cue>
<rh-cue start="23:53" voice="Carl Alsing">
El jefe de marketing nos dijo: “Bueno, pues ya estamos listos para lanzar la Eagle, y vamos a necesitar varios miles de dólares. Vamos a hacer una conferencia de prensa en seis ciudades del mundo.
Y después vamos a hacer una gira para visitar muchas ciudades, vamos a filmar una película y a mostrarla, y vamos a ser la sensación”.
</rh-cue>
<rh-cue start="24:14" voice="Carl Alsing">
Pero Ed de Castro contestó: “No entiendo. ¿Para qué quieren hacer eso? Va a ser un peso más para la Eclipse. Es como hacerle cirugía estética: promocionarla por encimita”. Pero el gerente de
marketing le respondió: “No, es una computadora completamente nueva. Es una computadora de 32 bits. Tiene memoria virtual. Es compatible. Va a arrasar con la VAX. Tiene todo”.
</rh-cue>
<rh-cue start="24:37" voice="Carl Alsing">
Ed de Castro no entendía nada. Pensaba que nos habíamos equivocado en Carolina del Norte, y que ese era el fin de la empresa, pero en realidad le habíamos salvado el pellejo. Un día nos invitó a
todos a almorzar. Había sándwiches y gaseosas, y de repente nos dice: “Bueno, pues felicidades por el trabajo que hicieron, estoy sorprendido. Yo no sabía que estaban con ese proyecto, pero vamos
a lanzarlo, y tengo entendido que va a haber una película y varias giras, y ustedes van a participar en eso, así que gracias y buen provecho con los sándwiches”.
</rh-cue>
<rh-cue start="25:19" voice="Presentadora">
La Eagle, que ahora se llamaba MV/8000, apareció en la portada de la revista Computer World. El lanzamiento con bombos y platillos en los medios de comunicación les dio cierta fama a aquellos
empleados, que hasta entonces se habían escondido en el sótano. Habían salvado a Data General.
</rh-cue>
<rh-cue start="25:38" voice="Presentadora">
Pero todo lo bueno dura poco. Tom West ya no podía seguir protegiendo al grupo de la política interna de la empresa. Y el equipo no estaba preparado para los resentimientos que surgieron. En la
compañía había gente que envidiaba sus logros y no podía creer que se hubieran salido con la suya durante tanto tiempo con un proyecto secreto.
</rh-cue>
<rh-cue start="25:57" voice="Presentadora">
Pronto, el nuevo vicepresidente de Ingeniería reemplazó a Carl Carman, que era el aliado del grupo. El recién llegado desarmó el grupo de Eagle y envió a Tom a la oficina de Data General de Japón
antes de que se vendiera la primera MV/8000.
</rh-cue>
<rh-cue start="26:13" voice="Jim Guyer">
Yo creía que habíamos hecho la mejor superminicomputadora de 32 bits que el dinero podía comprar, lo cual era excelente para Data General, y que durante un tiempo destronaríamos a Digital
Equipment Corporation, no que ya habíamos acabado con ellos. La competencia era salvaje en esos tiempos, y no es fácil tener éxito en el sector de la alta tecnología, pero yo pensaba que lo que
habíamos hecho valía la pena.
</rh-cue>
<rh-cue start="26:42" voice="Presentadora">
Sin duda, el lanzamiento de la Eagle salvó a Data General, pero habían perdido participación en el mercado frente a DEC durante tres años, así que la empresa nunca se recuperó realmente y la
industria había seguido avanzando. Las minicomputadoras ya no eran lo más importante. La carrera de las microcomputadoras ya había comenzado, y le abrió camino a la revolución de las computadoras
personales.
</rh-cue>
<rh-cue start="27:04" voice="Carl Alsing">
Data General siguió adelante, sacó nuevas versiones, las mejoró en los siguientes modelos y las vendió durante un tiempo, así que disfrutó de cierto éxito. Pero, bueno, las cosas cambian. El
mercado cambió y... ellos se convirtieron en una empresa de software, y finalmente otra empresa los compró. Y ahora creo que lo único que queda de ellos es algún archivador en alguna empresa de
Hopkinton, Massachusetts.
</rh-cue>
<rh-cue start="27:36" voice="Presentadora">
Un año después, muchos de los integrantes del grupo de la Eagle habían dejado Data General. Algunos estaban agotados. Otros ya querían diseñar alguna otra cosa. Otros se fueron al oeste, hacia
Silicon Valley, y estaban ansiosos por encontrar la siguiente chispa creativa. Cualquiera que fuera el caso, no tenía mucho sentido quedarse en una empresa que no reconocía todo lo que habían
hecho para salvarla. En ese mismo año, en 1981, se publicó El alma de una nueva máquina, de Tracy Kidder. Ahora el mundo sabría cómo se había diseñado la Eagle.
</rh-cue>
<rh-cue start="28:14" voice="Carl Alsing">
Si me preguntas qué constituye el alma de una nueva máquina, yo diría que las personas y lo que les pasa a esas personas; los sacrificios que hacen, el esfuerzo y el entusiasmo que sienten, y las
satisfacciones que esperan obtener. Tal vez lo logren, tal vez no, pero tienen un objetivo y luchan por él.
</rh-cue>
<rh-cue start="28:35" voice="Jim Guyer">
En realidad, la computadora era un personaje secundario. El corazón del proyecto era la gente.
</rh-cue>
<rh-cue start="28:47" voice="Presentadora">
En el próximo episodio de nuestra nueva temporada sobre el hardware, vamos a retroceder en el tiempo hasta la era de las computadoras mainframe, y te contaremos la historia de otro grupo de
empleados rebeldes. La computadora que construyeron hizo surgir un lenguaje de programación que cambió el mundo.
</rh-cue>
<rh-cue start="29:04" voice="Presentadora">
Command Line Heroes en español es un podcast original de Red Hat. Para esta temporada, recopilamos excelentes materiales de investigación para que puedas saber más sobre la historia del hardware
del que estamos hablando. Si quieres saber más sobre la Eagle y el equipo que la diseñó, visita redhat.com/commandlineheroes. Hasta la próxima, sigan programando.
</rh-cue>
</rh-transcript>
</rh-audio-player>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
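The localized demo above boils down to two things: a `lang` attribute on `<rh-audio-player>` and translated `label` attributes on the slotted panels (`rh-audio-player-about`, `rh-transcript`). A trimmed-down sketch of the same pattern is shown below; it only reuses attributes that already appear in the demo, and the media URL and cue text are placeholders.
```
<!-- Condensed localization sketch: lang and the label attributes shown in the
     demo above carry the translated panel text; the src and cue content here
     are placeholders. -->
<rh-audio-player lang="es" layout="full">
  <p slot="series">Temporada 4, Episodio 1</p>
  <h3 slot="title">Minicomputadoras: el alma de las máquinas de antes</h3>
  <audio crossorigin="anonymous" slot="media" controls src="https://example.com/episodio.mp3"></audio>
  <rh-audio-player-about slot="about" label="Notas del podcast">
    <p>Notas del episodio…</p>
  </rh-audio-player-about>
  <rh-transcript slot="transcript" label="Transcripción">
    <rh-cue start="00:03" voice="Presentadora">…</rh-cue>
  </rh-transcript>
</rh-audio-player>
```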
Mini
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
rh-audio-player {
margin: var(--rh-space-xl, 24px);
}
```
<rh-audio-player id="player" layout="mini" poster="https://www.redhat.com/cms/managed-files/CLH-S7-ep1.png">
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<rh-audio-player-about slot="about">
<h4 slot="heading">About the episode</h4>
<p>
There are a lot of publicly available data sets out there. But when it
comes to specific enterprise use cases, you're not necessarily going to
be able to find one to train your models. To realize the power of AI/ML in
enterprise environments, end users need an inference engine to run on
their hardware. Ryan Loney takes us through OpenVINO and Anomalib, open
toolkits from Intel that do precisely that. He looks specifically at
anomaly detection in use cases as varied as medical imaging and
manufacturing.
</p>
<p>
Want to learn more about Anomalib? Check out the research paper that
introduces the deep learning library.
</p>
<rh-avatar slot="profile" src="https://www.redhat.com/cms/managed-files/ryan-loney.png">
Ryan Loney
<span slot="subtitle">Product manager, OpenVINO Developer Tools, <em>Intel®</em></span>
</rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe">
<h4 slot="heading">Subscribe</h4>
<p>Subscribe here:</p>
<a slot="link" href="https://podcasts.apple.com/us/podcast/code-comments/id1649848507" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/6eJc62sKckHs4uEQ8eoKzD" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5wYWNpZmljLWNvbnRlbnQuY29tL2NvZGVjb21tZW50cw" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feeds.pacific-content.com/codecomments" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript id="regular" slot="transcript">
<h4 slot="heading">Transcript</h4>
<rh-cue start="00:02" voice="Burr Sutter">
Hi, I'm Burr Sutter. I'm a Red Hatter who spends a lot of time talking to technologists about technologies. We say this a lot at Red Hat. No single technology provider holds the key to
success, including us. And I would say the same thing about myself. I love to share ideas, so I thought it would be awesome to talk to some brilliant technologists at Red Hat Partners. This is
Code Comments, an original podcast from Red Hat.
</rh-cue>
<rh-cue start="00:29" voice="Burr Sutter">
I'm sure, like many of you here, you have been thinking about AI/ML, artificial intelligence and machine learning. I've been thinking about that for quite some time and I actually had the
opportunity to work on a few successful projects, here at Red Hat, using those technologies, actually enabling a data set, gathering a data set, working with a data scientist and data
engineering team, and then training a model and putting that model into production runtime environment. It was an exciting set of projects and you can see those on numerous YouTube videos that
have published out there before. But I want you to think about the problem space a little bit, because there are some interesting challenges about AI/ML. One is simply just getting access to
the data, and while there are numerous publicly available data sets, when it comes to your specific enterprise use case, you might not be able to find publicly available data.
</rh-cue>
<rh-cue start="01:14" voice="Burr Sutter">
In many cases you cannot, even for our applications that we created, we had to create our data set, capture our data set, explore the data set, and of course, train a model accordingly. And
we also found there's another challenge to be overcome in this AI/ML world, and that is access to certain types of hardware. If you think about an enterprise environment and the creation of
an enterprise application specifically for AI/ML, end users need an inference engine to run on their hardware. Hardware that's available to them, to be effective for their application. Let's
say an application like Computer Vision, one that can detect anomalies in medical imaging or maybe on a factory floor. As those things are whizzing by on the factory line there, looking at
them and trying to determine if there is an error or not.
</rh-cue>
<rh-cue start="01:56" voice="Burr Sutter">
Well, how do you actually make it run on your hardware, your accessible technology that you have today? Well, there's a solution for this, an open toolkit called OpenVINO. And you might be
thinking, "Hey, wait a minute, don't you need a GPU for AI inferencing, a GPU for artificial intelligence, machine learning?" Well, not according to Ryan Loney, product manager of OpenVINO
Developer Tools at Intel.
</rh-cue>
<rh-cue start="02:20" voice="Ryan Loney">
I guess I'll start with trying to maybe dispel a myth. I think that CPUs are widely used for inference today. So if we look at the data center segment, about 70% of the AI inference is
happening on Intel Xeon, on our data center CPUs. And so you don't need a GPU especially for running inference. And that's part of the value of OpenVINO, is that we're taking models that may
have been trained on a GPU using deep learning frameworks like PyTorch or TensorFlow, and then optimizing them to run on Intel hardware.
</rh-cue>
<rh-cue start="02:57" voice="Burr Sutter">
Ryan joined me to discuss AI/ML in the enterprise across various industries and exploring numerous use cases. Let's talk a little bit about the origin story behind OpenVINO. Tell us more
about it and how it came to be and why it came out of Intel.
</rh-cue>
<rh-cue start="03:12" voice="Ryan Loney">
Definitely. We had the first release of OpenVINO, was back in 2018, so still relatively new. And at that time, we were focused on Computer Vision and pretty tightly coupled with OpenCV, which
is another open source library with origins at Intel. It had its first release back in 1999, so it's been around a little bit longer. And many of the software engineers and architects at Intel
that were involved with and contributing to OpenCV are working on OpenVINO. So you can think of OpenVINO as complementary software to OpenCV and we're providing an engine for executing
inferences as part of a Computer Vision pipeline, or at least that's how we started.
</rh-cue>
<rh-cue start="03:58" voice="Ryan Loney">
But since 2018, we've started to move beyond just Computer Vision inference. So when I say Computer Vision inference, I mean image classification, object detection, segmentation, and now
we're moving into natural language processing. Things like speech synthesis, speech recognition, knowledge graphs, time series forecasting and other use cases that don't involve Computer
Vision and don't involve inference on pixels. Our latest release, the 2022.1 that came out earlier this year, that was the most significant update that we've had to OpenVINO, since we started
in 2018. And the major focus of that release was optimizing for use cases that go beyond Computer Vision.
</rh-cue>
<rh-cue start="04:41" voice="Burr Sutter">
And I like that concept that you just mentioned right there, Computer Vision, and you said that you extended those use cases and went beyond that. Could you give us some more concrete
examples of Computer Vision?
</rh-cue>
<rh-cue start="04:50" voice="Ryan Loney">
Sure. When you think about manufacturing, quality control in factories, everything from arc welding, defect detection to inspecting BMW cars on assembly lines, they're using cameras or
sensors to collect data and usually it's cameras collecting images like RGB images that you and I can see and looks like something taken from a camera or video camera. But also, things like
infrared or computerized tomography scans used in healthcare, X-ray, different types of images where we can draw bounding boxes around regions of interest and say, "This is a defect," or,
"This is not a defect." And also, "Is this worker wearing a safety hat or did they forget to put it on?" And so, you can take this and integrate it into a pipeline where you're triggering an
alert if somebody forgets to wear their safety mask, or if there's a defect in a product on an assembly line, you can just use cameras and OpenVINO and OpenCV running these on Intel hardware
and help to analyze.
</rh-cue>
<rh-cue start="05:58" voice="Ryan Loney">
And that's what a lot of the partners that we work with are doing, so these independent software vendors. And there's other use cases for things like retail. You think about going to a store
and using an automated checkout system. Sometimes people use those automated checkouts and they slide a few extra items into their bag that they don't scan, and it's a huge loss for the retail
outlets that are providing this way to check out. There's also realtime shelf monitoring: we have Vispera, one of our ISVs that helps keep store shelves stocked by just analyzing the cameras in the
stores, detecting when objects are missing from the shelves so that they can be restocked. We have Vistry, another ISV that works with quick service restaurants. When you think about
automating the process of deciding when to drop the fries into the fryer so that they're warm when the car gets to the drive-through window, there are quite a few industrial, healthcare, and retail
examples that we can walk through.
</rh-cue>
<rh-cue start="06:55" voice="Burr Sutter">
And we should dig into some more of those, but I got to tell you, I have a personal experience in this category that I want to share with you, and you can tell me how silly you might think it
is at this point in time. We actually built a keynote demonstration for the Red Hat big stage back in 2015. And I really wanted to illustrate the concept of asset tracking. So we actually gave
everybody in the conference a little Bluetooth token with a little battery, a little watch battery, and a little Bluetooth emitter. And we basically tracked those things around the conference.
We basically put a raspberry pi in each of the meeting rooms and up in the lunch room and you could see how the tokens moved from room to room to room.
</rh-cue>
<rh-cue start="07:28" voice="Burr Sutter">
It was a relatively simple application, but it occurred to me, after we figured out how to do that with Bluetooth and triangulating Bluetooth signals by looking at relative signal strength
from one radio to another and putting that through an Apache Spark application at the time, we then realized, "You know what? This is easier done with cameras." And just simply looking at a
camera and having some form of a AI/ML model, a machine learning model, that would say, "There are people here now," or, "There are no people here now." What do you think about that?
</rh-cue>
<rh-cue start="07:56" voice="Ryan Loney">
What you just described is exactly the product that Pathr, one of our partners is offering, but they're doing it with Computer Vision and cameras. So when Pathr tries to help retail stores
analyze the foot traffic and understand, with heat maps, where are people spending the most time in stores, how many people are coming in, what size groups are coming into the store and trying
to help understand if there was a successful transaction from the people who entered the store and left the store, to help with the retail analytics and marketing sales and positioning of
products. And so, they're doing that in a way that also protects privacy. And that's something that's really important. So when you talked about those Bluetooth beacons, probably if everyone
who walked into a grocery store was asked to put a tracking device in their cart or on their person and say, "You're going to be tracked around the store," they probably wouldn't want to do
that.
</rh-cue>
<rh-cue start="08:53" voice="Ryan Loney">
The way that you can do this with cameras, is you can detect people as they enter and remove their face. So you can ignore any biometric information and just track the person based on pixels
that are present in the detected region of interest. So they're able to analyze... Say a family walks in the door and they can group those people together with object detection and then they
can track their movement throughout the store without keeping track of their face, or any biometric, or any personal identifiable information, to avoid things like bias and to make sure that
they're protecting the privacy of the shoppers in the store, while still getting that really useful marketing analytics data. So that they can make better decisions about where to place their
products. That's one really good example of how Computer Vision, AI with OpenVINO is being used today.
</rh-cue>
<rh-cue start="09:49" voice="Burr Sutter">
And that is a great example, because you're definitely spot on. It is invasive when you hand someone a Bluetooth device and say, "Please, keep this with you as you go throughout our store,
our mall or throughout our hospital, wherever you might be." Now you mentioned another example earlier in the conversation which was related to worker safety. "Are they wearing a helmet?" I
want to talk more about that concept in a real industrial setting, a manufacturing setting, where there might be a factory floor and there's certain requirements. Or better yet there's like a
quality assurance requirement, let's say, when it comes to looking at a factory line. I've run that use case often with some of our customers. Can you talk more about those kinds of use cases?
</rh-cue>
<rh-cue start="10:23" voice="Ryan Loney">
One of our partners, Robotron, we published a case study, I think last year, where they were working with BMW at one of their factories. And they do quality control inspection, but they're
also doing things related to worker safety and analyzing. I use the safety hat example. There's a number of our ISVs and partners who have similar use cases and it comes down to, there's a few
reasons that are motivating this and some are related to insurance. It's important to make sure that if you want to have your factory insured, your workers are protecting themselves and
wearing the gear. There's also regulatory compliance: you're being asked to properly protect workers from exposure to chemicals or from potentially having something fall and hit someone on the head. So wearing a safety
vest, wearing goggles, wearing a helmet, these are things that you need to do inside the factory, and you can really easily automate detecting them, sometimes without bias.
</rh-cue>
<rh-cue start="11:21" voice="Ryan Loney">
I think that's one of the interesting things about the Robotron-BMW example is that they were also blurring, blacking out, so drawing a box to cover the face of the workers in the factory, so
that somebody who was analyzing the video footage and getting the alerts saying that, "Bay 21 has a worker without a hat on," it's not sending their face in the alert and potentially
invading or going against privacy laws or just the ethics of the company. They don't want to introduce bias or have people targeted because it's much better to blur the face and alert and have
somebody take care of it on the floor. And then, if you ever need to audit that information later, they have a way to do it where people who need to be able to see who the employee was and
look up their personal information, they can do that.
</rh-cue>
<rh-cue start="12:17" voice="Ryan Loney">
But then just for the purposes of maintaining safety, they don't need to have access to that personal information, or biometric information. Because that's one thing that when you hear about
Computer Vision or person tracking, object detection, there's a lot of concern, and rightfully so, about privacy being invaded and about tracking information, face re-identification,
identifying people who may have committed crimes through video footage. And that's just not something that a lot of companies want to... They want to protect privacy and they don't want to be
in a situation where they might be violating someone's rights.
</rh-cue>
<rh-cue start="12:56" voice="Burr Sutter">
Well, privacy is certainly opening up Pandora's box. There's a lot to be explored in that area, especially in a digital world that we now live in. But for now, let's move on and explore a
different area. I'm interested in how machines and computers offer advantages specifically in certain use cases like a quality control scenario. I asked Ryan to explain how a AI/ML and
specifically machines, computers, could augment that capability.
</rh-cue>
<rh-cue start="13:20" voice="Ryan Loney">
I can give a specific example where we have a partner that's doing defect detection, looking for anomalies in batteries. I'm sure you've heard there's a lot of interest right now in electric
vehicles, a lot of batteries being produced. And so, if you go into one of these factories, they have images that they collect of every battery that's going through this assembly line. And
through these images, people can look and visually inspect with their eyes and say, "This battery has a defect, send it back." And that's one step in the quality control process,
there's other steps I'm sure, like running diagnostic tests and measuring voltage and doing other types of non-visual inspection. But for the visual inspection piece, where you can really
easily identify some problems, it's much more efficient to introduce Computer Vision. And so, that's where we have this new library that we've introduced, called Anomalib.
</rh-cue>
<rh-cue start="14:17" voice="Ryan Loney">
So OpenVINO, while we're focused on inference, we're also thinking about the pipeline, or the funnel, that gets these models to OpenVINO. And so, we've invested in this anomaly segmentation,
anomaly detection library that we've recently open sourced and there's a great research paper about it, about Anomalib, but the idea is you can take just a few images and train a model and
start detecting these defects. And so, for this battery example, that's a more advanced example, but to make it simpler, take some bolts and... Take 10 bolts. You have one that has a scratch
on it, or one that is chipped, or has some damage to it, and you can easily get started in training to recognize the bolts that do not have an anomaly and the ones that do, which is a small
data set. And I think that's really one of the most important things today.
</rh-cue>
<rh-cue start="15:11" voice="Ryan Loney">
Challenges, one is access to data, but the other is needing a massive amount of data to do something meaningful. And so we're starting to try to change that dynamic with Anomalib. You may not
need a 100,000 images, you may need 100 images and you can start detecting anomalies in everything from batteries to bolts to, maybe even the wood varnish use case that you mentioned.
</rh-cue>
<rh-cue start="15:37" voice="Burr Sutter">
That is a very key point because often in that data scientist process, that data engineering data scientist process, the one key thing is, can you gather the data that you need for the input
for the model training? And we've often said, at least people I've worked with over the last couple years, "You need a lot of data, you need tens of thousands of correct images, so we can sort
out the difference between dogs versus cats," let's say. Or you need dozens and dozens of situations where if it's a natural language processing scenario, a good customer interaction, a good
customer conversation. And this case it sounds like what you're saying is, "Show us just the bad things, fewer images, fewer incorrect things, and then let us look for those kind of
anomalies." Can you tell us more about that? Because that is very interesting. The concept that I can use a much smaller data set as my input, as opposed to gathering terabytes of data in some
cases, to just simply get my model training underway.
</rh-cue>
<rh-cue start="16:30" voice="Ryan Loney">
Like you described, the idea is, if you have some good images and then you have some of the known defects, and you can just label, "Here's a set of good images and here's a few of the
defects." And you can right away start detecting those specific defects that you've identified. And then, also be able to determine when it doesn't match the expected appearance of a non
defective item. So if I have the undamaged screw and then I introduce one with some new anomaly that's never been seen before, I can say this one is not a valid screw. And so, that's the
approach that we're taking and it's really important because so often you need to have subject matter experts. Take the battery example, there's these workers who are on the floor, in a
factory and they're the ones who know best when they look at these images, which one's going to have an issue, which one's defective.
</rh-cue>
<rh-cue start="17:31" voice="Ryan Loney">
And then they also need to take that subject matter expertise and then use it to annotate data sets. And when you have these tens of thousands of images you need to annotate, it's asking
those people to stop working on the factory floor so they can come annotate some images. That's a tough business call to make, right? But if you only need them to annotate a handful of images,
it's a much easier ask to get the ball rolling and demonstrate value. And maybe over time you will want to annotate more and more images because you'll get even better accuracy in the model.
Even better, even if it's just small incremental improvements, that's something that if it generates value for the business, it's something the business will invest in over time. But you have
to convince the decision makers that it's worth the time of these subject matter experts to stop what they're doing and go and label some images of the things that they're working on in the
factory.
</rh-cue>
<rh-cue start="18:27" voice="Burr Sutter">
And that labeling process can be very labor intensive. If the annotation is basically saying what is correct, what's wrong, what is this, what is that. And therefore if we can minimize that
timeframe to get the value quicker, then there's something that's useful for the business, useful for the organization, long before we necessarily go through a whole huge model training phase.
</rh-cue>
<rh-cue start="18:49" voice="Burr Sutter">
So we talked about labeling and how that is a labor-intensive activity, but I love the idea of helping the human. And helping the human most specifically not get bored. Basically if the human
is eyeballing a bunch of widgets flying by, over time they make mistakes, they get bored and they don't pay as close attention as they should. That's why the concept of AI/ML, and
specifically Computer Vision, augmenting that capability and really helping the human identify anomalies faster, more quickly, maybe with greater accuracy, could be a big win. We focused on
manufacturing, but let's actually go into healthcare and learn how these tools can be used in that sector and that industry. Ryan talked to me about how OpenVINO's runtime can be incorporated
into medical imaging equipment with Intel processors embedded in CT, MRI and ultrasound machines, while these inferences, this AI/ML workload, can be operating and executing right there in the
same physical room as the patient.
</rh-cue>
<rh-cue start="19:44" voice="Ryan Loney">
We did a presentation with GE last year, I think they said there's at least 80 countries that have their x-ray machines deployed. And they're doing things like helping doctors place breathing
tubes in patients. So during COVID, during the pandemic, that was a really important tool to help with nurses and doctors who were intubating patients, sometimes in a parking lot or a hallway
of a hospital. And when they had a statistic that GE said, I think one out of four breathing tubes gets placed incorrectly when you're doing it outside the operating room. Because when you're
in an operating room it's much more controlled and there's someone who's an expert at placing the tubes, it's something you have more of a controlled environment. But when you're out, in a
parking lot, in a tent, when the hospital's completely full and you're triaging patients with COVID, that's when they're more likely to make mistakes.And so, they had this endotracheal tube
placement, ETT, model that they trained and it helped to use an x-ray and give an alert and say, "This tube is placed wrong, pull it out and do it again." And so, things like that help doctors
so that they can avoid mistakes. And having a breathing tube placed incorrectly can cause collapsed lung and a number of other unwanted side effects. So it's really important to do it
correctly. Another example is Samsung Medison. They actually are estimating fetal angle of progression. So this is analyzing ultrasound of pregnant women being able to help take measurements
that are usually hard to calculate, but it can be done in an automated way. They're already taking an ultrasound scan and now they're executing this model that can take some of these
measurements to help the doctor avoid potentially more intrusive alternative methods. So the patient wins, it makes their life better and the doctor is getting help from this AI model. And
those are just a few examples.
</rh-cue>
<rh-cue start="21:42" voice="Burr Sutter">
Those are some amazing examples when it comes to all these things, we're talking CT scans and x-rays, other examples of Computer Vision. One thing that's kind of interesting in this space, I
think, whenever I get a chance to work on, let's say an object detection model, and one of our workshops, by the way, is actually putting that out in front of people to say, "Look, you can use
your phone and it basically sends the image over to our OpenShift with our data science platform and then analyzes what you see." And even in my case, where I take a picture of my dog as an
example, it can't really decide, is it a dog or a cat? I have a very funny looking dog.
</rh-cue>
<rh-cue start="22:15" voice="Burr Sutter">
And so there's always a percentage outcome. In other words, "I think it's a dog, 52%." So I want to talk about that more. How important is it to get to that hundred percent accuracy? How
important is it really, depending on the use case, to allow for the gray area if you will, where it's an 80% accuracy or a 70% accuracy, and what are the trade-offs there associated with
the application? Can you discuss that more?
</rh-cue>
<rh-cue start="22:38" voice="Ryan Loney">
Accuracy is definitely a touchy subject, because how you measure it makes a huge difference. I think what you were describing with the dog example, there's sort of a top five potential classes that might be identified. So let's say you're doing object detection and you detect a region of interest, and it says 65% confidence this is a dog. Well, the next potential label, at maybe 50% confidence or 20% confidence, might be something similar to a dog. Or in the case of models that have been trained on the ImageNet dataset or on the COCO dataset, they have actual breeds of dogs. If I want to look at the top five labels for a dog, for my dog for example, she's a mix, mostly a Labrador retriever, but I may look at the top five labels and it may say 65% confidence that she's a flat-coated retriever.
</rh-cue>
<rh-cue start="23:32" voice="Ryan Loney">
And then confidence that she's a husky at 20%, and then 5% confidence that she's a greyhound or something. Those labels, all of them are dogs. So if I'm just trying to figure out, is this a dog? I could probably find all of the classes within the data set and say, "Well, class IDs 65, 132, 92, and 158 all belong to a group of dogs." So if I want to just write an
application to tell me if this is a dog or not, I would probably use that to determine if it's a dog. But how you measure that as accuracy, well that's where it gets a little bit complicated.
Because if you're being really strict about the definition and you're trying to validate against the data set of labeled images, and I have specific dog breeds or some specific detail and it
doesn't match, well then, the accuracy's going to go down.
</rh-cue>
<rh-cue start="24:25" voice="Ryan Loney">
And that's especially important when we talk about things like compression and quantization, which, historically, have been difficult to get adopted in some domains, like healthcare, where even the hint of accuracy going down implies that we're not going to be able to help in some small case; maybe, even if it's half a percent of the time, we won't detect that that tube is
placed incorrectly or that that patient's lung has collapsed or something like that. And that's something that really prevents adoption of some of these methods that can really boost
performance, like quantization. But if you take that example of... Different from the dog example, and you think about segmentation of kidneys. If I'm doing kidney segmentation, which is
taking a CT scan and then trying to pick the pixels out of that scan that belong to a kidney, how I measure accuracy may be how many of those pixels I'm able to detect and how many did I miss?
</rh-cue>
<rh-cue start="25:25" voice="Ryan Loney">
Missing some of the pixels is maybe not a problem, depending on how you've built the application, because you still detect the kidney, and maybe you just need to apply padding around the
region of interest, so that you don't miss any of the actual kidney when you compress the model and when you quantize the model. But that requires a data scientist, an ML engineer, somebody who can go and apply that after the fact, after the inference happens, to make sure that you're not losing critical information. Because the next step from detecting the kidney may be detecting a tumor.
</rh-cue>
<rh-cue start="26:04" voice="Ryan Loney">
And so, maybe you can use the more optimized model to detect the kidney, but then you can use a slower model to detect the tumor. But that also requires somebody to architect and make that
decision or that trade-off and say, "Well, I need to add padding," or, "I should only use the quantized model to detect the region of interest for the kidney." And then, use the model that takes longer to do the inference just to find the tumor, which is going to be on a smaller image; the dimensions are going to be much smaller once we crop to the region of interest. But all of
those details, that's maybe not easy to explain in a few sentences and even the way I explained it is probably really confusing.
</rh-cue>
<rh-cue start="26:45" voice="Burr Sutter">
I do love that use case, like you mentioned, the cropping. Even in one scenario that we worked on for another project, we specifically decided to pixelate the image that we had taken, because we knew that we could get the outcome we wanted by just using a smaller image, or having less resolution in our image. And therefore, as we transferred it from the mobile device, the edge device, up into the cloud, we wanted that smaller image just for transfer purposes. And still, we could get the accuracy we needed, through a lot of testing.
</rh-cue>
<rh-cue start="27:11" voice="Burr Sutter">
And one thing that's interesting about that, from my perspective, is, if you're doing image processing, sometimes it takes a while for this transaction to occur. I come from a traditional
application background, where I'm reading and writing things from a database, or a message broker, or moving data from one place to another. Those things happen sub-second normally; even with great latency between your data centers, it's still sub-second in most cases. Meanwhile, a transaction like this one can actually take two seconds or four seconds, as it's doing its analysis and
actually coming back with its, "I think it's a dog, I think it's a kidney, I think it's whatever." And providing me that accuracy statement. That concept of optimization is very important in
the overall application architecture. Would you agree with that or how do you think about that concept?
</rh-cue>
<rh-cue start="27:56" voice="Ryan Loney">
Definitely. It depends too on the use case. So if you think about how important it is to reduce the latency and increase the number of frames per second that you can process when you're
talking about a loss prevention model that's running at a grocery store. You want to keep the lines moving, you don't want every person who's at the self checkout to have to wait five seconds
for every item they scan. You need it to happen as quickly as possible. And if sometimes the accuracy decreases slightly, or I'd say the accuracy of the whole pipeline, so not just looking at
the individual model or the individual inference, but let's say that the whole pipeline is not as successful at detecting when somebody steals one item from the self checkout, it's not going
to be a life threatening situation. Whereas being hooked up to the x-ray machine with the tube placement model, they might be willing to have the doctor or the nurse wait five seconds to get
the result.
</rh-cue>
<rh-cue start="28:55" voice="Ryan Loney">
They don't need it to happen in 500 milliseconds. Their threshold for waiting is a little bit higher. That, I think, also drives some of the decision. You want to keep people moving through the checkout line, and you can afford to, potentially, lose a little bit of accuracy here and there; it's not going to cost the company that much money, and it's not going to be life-threatening. It's going to be worth the trade-off of keeping the line moving and not having people leave the store and not check out at all, to say, "I'm not going to shop today because the
line's too long."
</rh-cue>
<rh-cue start="29:30" voice="Burr Sutter">
There are so many trade-offs in enterprise AI/ML use cases, things like latency, accuracy, and availability, and certainly complexities abound, especially in an ever-evolving technological landscape where we are still very early in the adoption of AI/ML. And to navigate that complexity, direct feedback from real-world end users is essential to Ryan and his team at Intel. What would you say are some of the big hurdles or big outcomes, big opportunities in that space? And do you agree that we're still at the very beginning, in our infancy if you
will, of adopting these technologies and discovering what they can do for us?
</rh-cue>
<rh-cue start="30:06" voice="Ryan Loney">
Yeah, I think we're definitely in the infancy, and I think what we've seen is that our customers are evolving, and the people who are deploying on Intel hardware are trying to run more complicated models. These are models that are doing object detection, or detecting defects, or doing segmentation. In the past you could say, "Here's a generic model that will do face detection, or person detection, or vehicle detection, license plate detection." And those are general-purpose models that you can just grab off the shelf and use. But now we're moving into the Anomalib scenarios, where I've got my own data and I'm trying to do something very specific, and I'm the only one that has access to this data. You don't have a public data set that you can go download under a Creative Commons license for car batteries. It's just not something that's available.
</rh-cue>
<rh-cue start="30:57" voice="Ryan Loney">
And so, those use cases, the challenge with training those models and getting them optimized is the beginning of the pipeline. It's the data. You have to get the data, you have to annotate it
and the tools have to exist for you to do that. And that's part of the problem that we're trying to help solve. And then, the models are getting more complex. Just from working with customers recently, they're no longer just trying to do image classification, "Is it a dog or a cat?" They've moved on to 3D point clouds and 3D segmentation models and things
that are like the speech synthesis example. These GPT models that are generating... You put a text input and it generates an image for you. It's just becoming much more advanced, much more
sophisticated and on larger images.
</rh-cue>
<rh-cue start="31:50" voice="Ryan Loney">
And so, things like running super-resolution and enhancing images, upscaling images: instead of just trying to take that 200-by-200-pixel image and classifying if it's a cat, now we're talking about gigantic, huge images that we're processing, and that all requires more resources or more optimized models. And at every Computer Vision conference or AI conference, there's a new latest and greatest architecture, there's a new research paper, and things are getting adopted much faster. The lead time from a NeurIPS paper or CVPR for a company to actually adopt it and put it into production shortens every year.
</rh-cue>
<rh-cue start="32:34" voice="Burr Sutter">
Well Ryan, I got to tell you, I could talk to you, literally, all day about these topics, the various use cases, the various ways models are being optimized, how to put models into a pipeline
for average enterprise applications. I've enjoyed learning about OpenVINO and Anomalib. I'm fascinated by this, because I'll have a chance to go try this myself, taking advantage of Red Hat
OpenShift and taking advantage of our data science platform. On top of that, I will definitely go be poking at this myself. Thank you so much for your time today.
</rh-cue>
<rh-cue start="33:00" voice="Ryan Loney">
Thanks, Burr. This was a lot of fun. Thanks for having me.
</rh-cue>
<rh-cue start="33:05" voice="Burr Sutter">
You can check out the full transcript of our conversation and more resources, like a link to a white paper on OpenVINO and Anomalib, at redhat.com/codecommentspodcast. This episode was
produced by Brent Simoneaux and Caroline Creaghead. Our sound designer is Christian Prohom. Our audio team includes Leigh Day, Stephanie Wonderlick, Mike Esser, Laura Barnes, Claire Allison,
Nick Burns, Aaron Williamson, Karen King, Boo Boo Howse, Rachel Ertel, Mike Compton, Ocean Matthews, Laura Walters, Alex Traboulsi, and Victoria Lawton. I'm your host, Burr Sutter. Thank you
for joining me today on Code Comments. I hope you enjoyed today's session and today's conversation, and I look forward to many more.
</rh-cue>
</rh-transcript>
</rh-audio-player>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
Prevent Concurrent Playback
import '@rhds/elements/rh-audio-player/rh-audio-player.js';
```
#concurrent {
padding: var(--rh-space-xl, 24px);
}
```
<section id="concurrent">
<p>Pressing play on any <code>rh-audio-player</code> element will pause
any other currently playing <code>rh-audio-player</code> elements.</p>
<rh-audio-player>
<p slot="series">Code Comments</p>
<h3 slot="title">Bringing Deep Learning to Enterprise Applications</h3>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/bd38190e-516f-49c0-b47e-6cf663d80986/audio/dc570fd1-7a5e-41e2-b9a4-96deb346c20f/default_tc.mp3">
</audio>
</rh-audio-player>
<rh-audio-player>
<p slot="series">Code Comments</p>
<h3 slot="title">Rethinking Networks In Telecommunications</h3>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="en" src="https://cdn.simplecast.com/audio/28d037d3-7d17-42d4-a8e2-2e00fd8b602b/episodes/32d79061-21f8-40a1-9ef8-d424e82a8326/audio/0a65ec45-a21a-4c31-94a6-61701a145b1d/default_tc.mp3">
</audio>
</rh-audio-player>
</section>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
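For readers curious how this kind of "only one player at a time" behavior is typically achieved, here is a minimal sketch of the general coordination pattern, using the slotted audio elements as stand-ins. It is an illustration only, not the component's actual implementation (rh-audio-player already handles this for you), and the enforceSinglePlayback helper name is hypothetical.
```
// Illustrative sketch only: a common way to let just one audio element
// play at a time by pausing the rest when any of them starts.
function enforceSinglePlayback(audioElements) {
  for (const audio of audioElements) {
    audio.addEventListener('play', () => {
      // When one element starts playing, pause all of the others.
      for (const other of audioElements) {
        if (other !== audio && !other.paused) {
          other.pause();
        }
      }
    });
  }
}

// Usage: apply the pattern to the demo players' slotted audio elements.
enforceSinglePlayback(document.querySelectorAll('#concurrent audio'));
```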
Right To Left
#rtl {
margin: var(--rh-space-xl, 24px);
& label {
display: flex;
align-items: center;
}
& select {
margin-inline-end: var(--rh-space-md);
flex: 1 0 auto;
}
}
```
import '@rhds/elements/rh-audio-player/rh-audio-player.js';

const player = document.querySelector('rh-audio-player');
const form = document.querySelector('form');

// Sync the player's layout with the select; an empty value clears the
// property, falling back to the default presentation (labeled "Mini" in the form).
form.addEventListener('input', function updateDemo() {
  player.layout = form.layout.value || undefined;
});
```
<section id="rtl">
<!-- Options for demo -->
<form>
<label>Layout:
<select name="layout">
<option value="full" selected="">Full</option>
<option value="compact-wide">Compact Wide</option>
<option value="compact">Compact</option>
<option value="">Mini</option>
</select>
</label>
</form>
<!-- The select above switches the player layout; right-to-left display comes from dir="rtl" on the element below -->
<rh-audio-player lang="he" dir="rtl" layout="full" poster="https://deow9bq0xqvbj.cloudfront.net/ep-logo/pbblog15366739/Copy_of_podcast_5__x3k5r4.jpg">
<p slot="series">מדברים פתוח</p>
<h3 slot="title">מדברים פתוח - תום פטאהיי - המסע מאתיופיה להייטק</h3>
<rh-audio-player-about slot="about" label="אודות ההסכת">
<h4 slot="heading">אודות הפרק</h4>
<p>בפרק הזה תום שיתף אותנו על המסלול שעשה מהילדות באתיופיה אל הייטק הישראלי.
דיברנו על תנאי הפתיחה של העולים מאתיופיה דיברנו על הקשיים על ההצלחות ועל הכישלנות.</p>
<rh-avatar slot="profile" name="אילן פינטו" src="https://deow9bq0xqvbj.cloudfront.net/image-logo/15366739/ALm5wu1GYAPF0EO3GvWnZxLHsbiMsp-C2DfXxzhR9K3o6vM=s96-c_300x300.jpg"></rh-avatar>
</rh-audio-player-about>
<audio crossorigin="anonymous" slot="media" controls="">
<source type="audio/mp3" srclang="he" src="https://mcdn.podbean.com/mf/web/7i9q6v/tom_haftaa.mp3">
</audio>
<rh-audio-player-subscribe slot="subscribe" label="הירשם">
<h4 slot="heading">הירשם</h4>
<p>הירשמו פה:</p>
<a slot="link" href="https://www.podbean.com/site/podcatcher/index/blog/gppMmVUk8uG0" target="_blank" title="Listen on Apple Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Apple Podcasts" data-analytics-category="Hero|Listen on Apple Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_apple-podcast-white.svg" alt="Listen on Apple Podcasts">
</a>
<a slot="link" href="https://open.spotify.com/show/1lR9GVeDtZgkCM9gu0X4ZT" target="_blank" title="Listen on Spotify" data-analytics-linktype="cta" data-analytics-text="Listen on Spotify" data-analytics-category="Hero|Listen on Spotify">
<img src="https://www.redhat.com/cms/managed-files/badge_spotify.svg" alt="Listen on Spotify">
</a>
<a slot="link" href="https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkLnBvZGJlYW4uY29tL3RhbGtpbmdPcGVuL2ZlZWQueG1s" target="_blank" title="Listen on Google Podcasts" data-analytics-linktype="cta" data-analytics-text="Listen on Google Podcasts" data-analytics-category="Hero|Listen on Google Podcasts">
<img src="https://www.redhat.com/cms/managed-files/badge_google-podcast.svg" alt="Listen on Google Podcasts">
</a>
<a slot="link" href="https://feed.podbean.com/talkingOpen/feed.xml" target="_blank" title="Subscribe via RSS Feed" data-analytics-linktype="cta" data-analytics-text="Subscribe via RSS Feed" data-analytics-category="Hero|Subscribe via RSS Feed">
<img class="img-fluid" src="https://www.redhat.com/cms/managed-files/badge_RSS-feed.svg" alt="Subscribe via RSS Feed">
</a>
</rh-audio-player-subscribe>
<rh-transcript id="regular" slot="transcript" label="תמליל">
<h4 slot="heading">תמליל</h4>
<rh-cue start="00:05" voice="ג'וש סלומון">
ברוכים הבאים לפודקאסט מדברים פתוח של רדהט ישראל.
</rh-cue>
<rh-cue start="00:09" voice="ג'וש סלומון">
אני ג'וש סלומון.
</rh-cue>
<rh-cue start="00:11" voice="אילן פינטו">
ואני אילן פינטו ויחד נגיש לכם פודקאסט בעברית על כל מה שחשוב ומעניין.
</rh-cue>
<rh-cue start="00:15" voice="אילן פינטו">
ורדהט עם דגש על טכנולוגיה, אבל לא רק
</rh-cue>
<rh-cue start="00:18" voice="אילן פינטו">
מדברים פתוח.
</rh-cue>
<rh-cue start="00:19" voice="ג'וש סלומון">
מתחילים!
</rh-cue>
<rh-cue start="00:24" voice="ג'וש סלומון">
שלום לכולם וברוכים הבאים לעוד פרק של מדברים פתוח.
</rh-cue>
<rh-cue start="00:28" voice="ג'וש סלומון">
הפודקאסט של רדת ישראל והפעם פרק ראשון בסדרה חדשה פודקאסט בהפתעה.
</rh-cue>
<rh-cue start="00:35" voice="ג'וש סלומון">
אחרי שכמה מרואיינים הבריזו לנו.
</rh-cue>
<rh-cue start="00:38" voice="אילן פינטו">
היום לא נזכיר שמות, לא נזכיר לנו נזכיר שמות.
</rh-cue>
<rh-cue start="00:41" voice="ג'וש סלומון">
עילן, לא נזכיר שמות אוהד גם אותך זה, נזכיר.
</rh-cue>
<rh-cue start="00:44" voice="ג'וש סלומון">
הלכנו למטבחון ושאלנו מי מוכן לבוא לפודקאסט.
</rh-cue>
<rh-cue start="00:48" voice="אילן פינטו">
לא, אבל רגע צריך להגיד שהרבה זמן רצינו לעשות.
</rh-cue>
<rh-cue start="00:51" voice="אילן פינטו">
פרק על דיורסיטי אן קריוזן ויצא לנו טוב.
</rh-cue>
<rh-cue start="00:56" voice="אילן פינטו">
יצא לנו טוב, אז היום את מי יש לנו?
</rh-cue>
</rh-transcript>
</rh-audio-player>
</section>
<link rel="stylesheet" href="../rh-audio-player-lightdom.css">
```
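The form-driven demo above sets the layout property from script. The same switch can be made through the layout attribute; below is a minimal sketch, assuming the attribute is observed the same way the property used in the demo script is, and using the 'compact' value from the demo's own option list.
```
// Switch the RTL demo player to the compact layout via the attribute.
const rtlPlayer = document.querySelector('#rtl rh-audio-player');
rtlPlayer.setAttribute('layout', 'compact');

// Removing the attribute returns the player to its default presentation,
// which the demo's form labels as "Mini".
rtlPlayer.removeAttribute('layout');
```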