The most sophisticated generative AI models in the world, like Stable Diffusion, often run only in the cloud.
But what if that same model could run on a smartphone in your pocket? That’s the challenge that Qualcomm engineers have tackled. In research released today, Qualcomm revealed that, using a combination of software techniques and hardware optimization, it was able to shrink Stable Diffusion so that it could run inference on common Android smartphones.
Stable Diffusion is developed by startup Stability AI and is one of the most popular generative AI models for image generation in use today, often competing against OpenAI’s DALL-E.
To be clear, the technology needed to train generative AI models is massive and is unlikely to run on a smartphone. Rather, what Qualcomm has worked on is the inference side, that is, the “generative” part, which creates a new image from the pretrained model. To date, users have been able to generate Stable Diffusion–based images on their phones only indirectly, with a mobile app or browser accessing a cloud service that generates the image. What Qualcomm is now demonstrating is the ability to generate Stable Diffusion images directly on an Android smartphone, without the need to call out to the cloud to do the heavy lifting.
“For privacy and security, when entering queries through a cloud API for Stable Diffusion, all your information or ideas are sent to the cloud server of some company,” Jilei Hou, VP, engineering at Qualcomm Technologies, told VentureBeat. “With on-device AI, that issue goes away since all your ideas stay solely on the device.”
Hou noted that for enterprise use of generative AI, this could be an even bigger issue where company confidential information needs to be protected.
Hardware alone isn’t enough to run generative AI
The demo that Qualcomm built to prove out its capabilities is running on a Qualcomm Reference Design device with the latest Snapdragon 8 Gen 2 Mobile Platform, which is in many commercial devices today.
Hou said the inferencing part is done on the Hexagon Processor, which is a fully custom design for AI acceleration by Qualcomm engineers and is part of the Snapdragon 8 Gen 2 silicon.
While Qualcomm’s silicon is powerful for a mobile device, Stable Diffusion presents a series of challenges to running directly on a smartphone. For one, Hou noted that the size of the model is over 1.1 billion parameters and the associated computing is more than 10 times the size of the typical workloads that are run on a smartphone.
“This is the biggest model that we have run on a smartphone,” Hou said. “All the full-stack optimizations that we made were very important to make the model fit and run efficiently.”
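To give a rough sense of why a model of this size is hard to fit on a phone, the short Python sketch below estimates how much storage the weights alone would need at different precisions. The 1.1 billion parameter count is the figure Hou cited, and 32-bit floating point is the precision of the model Qualcomm said it started from. The 16-bit and 8-bit rows are illustrative assumptions, not Qualcomm's stated configuration, and the estimate ignores activations and runtime overhead.

```python
# Rough weight-storage estimate. The parameter count is the figure quoted in the article.
PARAMS = 1.1e9  # "over 1.1 billion parameters"


def weight_gigabytes(num_params: float, bits_per_weight: int) -> float:
    """Return the storage needed for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# 32 bits matches the FP32 starting model; 16 and 8 bits are assumed for comparison.
for bits in (32, 16, 8):
    print(f"{bits:>2}-bit weights: {weight_gigabytes(PARAMS, bits):.1f} GB")
```

Under these assumptions, each halving of the bit width halves the memory that has to be stored and moved for every generated image.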
How Qualcomm shrank Stable Diffusion to run on Android
The optimizations that were required involved heavy use of the Qualcomm AI Stack, which is a portfolio of AI tools designed to help optimize models and workloads.
Hou explained that for Stable Diffusion, his team started with the FP32 version 1-5 open-source model from Hugging Face and made optimizations through quantization, compilation and hardware acceleration to run it on a phone powered by the Snapdragon 8 Gen 2 Mobile Platform.
To shrink the model, his team used the AI Model Efficiency Toolkit’s (AIMET) post-training quantization capabilities.
“Quantization not only increases performance, but also saves power by allowing the model to efficiently run on our dedicated AI hardware and to consume less memory bandwidth,” Hou said.
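The general idea behind post-training quantization can be sketched in a few lines of plain Python. This is a generic illustration, not AIMET's API or Qualcomm's actual method: the function names are hypothetical, the 8-bit integer target is an assumption, and the random values merely stand in for trained weights. The weights are mapped to small integers after training, with no retraining, and a single scale factor is kept to convert them back.

```python
import random


def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0  # float value of one integer step
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the integers back to approximate float values."""
    return [v * scale for v in q]


random.seed(0)
weights = [random.gauss(0.0, 0.05) for _ in range(1000)]  # stand-in for trained weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error of any single weight is at most half an integer step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale = {scale:.6f}, worst-case rounding error = {max_error:.6f}")
```

In this sketch each weight shrinks from a 32-bit float to an 8-bit integer at the cost of a small, bounded rounding error. Production toolkits add further steps to keep the quantized model's output quality close to the original.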
For compilation, the Qualcomm AI Engine Direct framework was used to map the neural network into a program that runs efficiently on the smartphone hardware. Hou noted that the overall optimizations made in the Qualcomm AI Engine have significantly reduced runtime latency and power consumption. He added that all the work done to get Stable Diffusion running well on the smartphone will benefit future iterations and users of the Qualcomm AI Stack.
Looking forward, Hou said Qualcomm will build on lessons learned to bring other large generative AI models (for example, GPT-like models) from the cloud to the device. He added that the optimizations that let Stable Diffusion run efficiently on phones can also be used on other platforms such as laptops, XR headsets and virtually any other device powered by Qualcomm Technologies.
“Running all the AI processing in the cloud will be too costly, which is why efficient edge AI processing is so important,” Hou said. “Edge AI processing ensures user privacy while running Stable Diffusion and other generative AI models since the input text and generated image never need to leave the device — this is a big deal for the adoption of both consumer and enterprise applications.”