Join top executives in San Francisco on July 11-12, to hear how leaders are integrating and optimizing AI investments for success.
Synthesis AI, a San Francisco-based startup that specializes in synthetic data technologies, announced today that it has developed a new way to create realistic 3D digital humans from text prompts.
The company said its new text-to-3D technology, which is showcased in its online platform Synthesis Labs, uses generative artificial intelligence and visual effects pipelines to produce high-resolution, cinematic-quality digital humans that can be used for various applications, such as gaming, virtual reality, film and simulation.
Synthesis AI claims it is the first company to demonstrate text-to-3D digital human synthesis at such a high level of quality and detail. The technology allows users to input text descriptions of the desired digital human, such as age, gender, ethnicity, hairstyle and clothing, and then generate a 3D model that matches the specifications. Users can also edit the 3D model by changing the text prompts or using sliders to adjust features like facial expressions and lighting.
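To make the described workflow concrete, here is a minimal sketch of how a text prompt might map to structured attributes plus slider-style controls. All class, field, and function names here are hypothetical illustrations, not Synthesis AI's actual API, and the keyword extraction is a deliberately naive stand-in for a real text encoder.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the workflow the article describes:
# a text prompt is parsed into identity attributes, and iteration
# happens by nudging sliders rather than re-prompting from scratch.

@dataclass
class HumanSpec:
    age: int = 30
    gender: str = "unspecified"
    ethnicity: str = "unspecified"
    hairstyle: str = "short"
    clothing: str = "casual"
    # Slider-style continuous controls (0.0-1.0), e.g. expression, lighting.
    sliders: dict = field(default_factory=lambda: {"smile": 0.0, "lighting": 0.5})

def spec_from_prompt(prompt: str) -> HumanSpec:
    """Very naive keyword extraction standing in for a real text encoder."""
    spec = HumanSpec()
    words = prompt.lower().split()
    for w in words:
        if w.endswith("-year-old"):
            spec.age = int(w.split("-")[0])
    if "woman" in words:
        spec.gender = "female"
    elif "man" in words:
        spec.gender = "male"
    return spec

spec = spec_from_prompt("a 42-year-old woman with short hair")
spec.sliders["smile"] = 0.8  # edit a feature via a slider instead of re-prompting
```

The point of the sketch is the interaction pattern the article describes: coarse identity from the prompt, fine adjustments from continuous controls.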
The company said its text-to-3D technology was part of its broader mission to support advanced artificial intelligence applications by providing perfectly labeled synthetic data to train machine learning models. Synthetic data is artificially generated data that mimics real data, but does not contain any personal or sensitive information.
“The text-to-3D capability we’re showcasing in Synthesis Labs takes a programmatic, API-driven approach as its starting point, adds a dead-simple prompt-based user interface, and outputs a high-resolution 3D model that can be used as synthetic data across a broad range of use cases that require digital humans,” Yashar Behzadi, CEO and founder of Synthesis AI, told VentureBeat. “Synthesis Labs externalizes some of our research and development work with actual customers.”
This announcement follows the launch of Synthesis Humans and Synthesis Scenarios, the company’s in-depth human-centric synthetic data offerings currently available on the market.
Leveraging text-to-3D with generative AI
Synthesis AI has combined generative AI and cinematic VFX pipelines to produce perfectly labeled synthetic data to train machine learning models. According to the company, this marks the first time that text-to-3D digital human synthesis has been demonstrated in high-resolution, cinematic quality, and it is expected to accelerate development and reduce the costs of 3D applications in a variety of industries, including AR/VR, gaming, VFX, smart cities, virtual try-on (VTON), automotive, and industrial and manufacturing simulations.
The creation of 3D models is a multifaceted and intricate process that demands the interplay of several elements, including geometry, meshes, and texture layers. For seasoned gaming and VFX artists, starting with a human model for human-centric characters and scenes has historically been the go-to option. This approach is often faster and more straightforward than building a computer-generated human from scratch.
However, crafting high-quality human models is a challenging feat that requires specialized photogrammetry setups. These setups are designed to capture multiple angles of actual people under controlled settings to create raw 2D images. Images are then meticulously combined using a variety of hand-crafted and optimized tools to ensure optimal quality.
For text-to-3D digital human synthesis, the company developed in-house models that leverage diffusion-based generative AI architectures to generate a diverse array of meshes governed by critical parameters such as gender, age and ethnicity. The texture layers are created by a separate generative model that offers fine-grained, independent control.

Merging these two components produces a complete, high-resolution 3D model.
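The two-stage pipeline described above can be sketched conceptually: one model proposes a mesh conditioned on identity attributes, a second model proposes texture layers, and the two are merged into a single asset. The function names and data shapes below are illustrative assumptions, not Synthesis AI's actual models; real diffusion models are replaced here with deterministic random stand-ins.

```python
import random

# Conceptual sketch (assumed names, not Synthesis AI's implementation):
# stage 1 generates geometry from identity attributes, stage 2 generates
# texture independently, and merging yields the final 3D asset.

def generate_mesh(attrs: dict, n_vertices: int = 8) -> list:
    """Stand-in for a diffusion-based mesh generator: returns 3D vertex positions."""
    rng = random.Random(str(sorted(attrs.items())))  # deterministic per identity
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
            for _ in range(n_vertices)]

def generate_texture(attrs: dict, size: int = 4) -> list:
    """Stand-in for the separate texture model: returns an RGB pixel grid."""
    rng = random.Random(attrs.get("ethnicity", "") + attrs.get("hairstyle", ""))
    return [[(rng.randrange(256),) * 3 for _ in range(size)] for _ in range(size)]

def merge(mesh: list, texture: list) -> dict:
    """Combining the two independently generated components yields the asset."""
    return {"mesh": mesh, "texture": texture}

attrs = {"age": 35, "gender": "male", "ethnicity": "hispanic"}
model = merge(generate_mesh(attrs), generate_texture(attrs))
```

Keeping geometry and texture in separate models is what gives the fine-grained, independent control the article mentions: either component can be regenerated without disturbing the other.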
“Creating a diverse set of humans is further complicated by the logistics of recruiting specific individuals and obtaining waivers. Starting with an inexpensively synthesized digital human is orders of magnitude faster and cheaper than either of those options,” Synthesis AI’s Behzadi told VentureBeat. “The text-to-3D capability enables on-demand generation of high-quality assets, saving weeks of time and thousands of dollars per model.”
The new text-to-3D offerings, featured in Synthesis Labs, introduce prompt-based input and editing, making the no-code 3D generative AI capabilities more accessible to all experience levels.
“For starters, prompt-based generation and iteration brings creative power to anyone capable of using a search engine. However, we think the early adopters and power users will be technical artists across all forms of entertainment and media, as well as product managers in industrial and manufacturing software looking to populate 3D simulations with representative digital humans,” said Behzadi. “These are both technical audiences, but likely don’t have advanced machine learning skills.”
Synthesis AI’s proprietary library of more than 100,000 digital humans (or IDs) provides the underlying data used to train the models. The company’s other products, Synthesis Humans and Synthesis Scenarios, already leverage this library to supply leading computer vision teams with labeled training data for face ID capabilities, driver monitoring, avatars and more.
What’s next for Synthesis AI?
The launch of Synthesis Labs represents a significant milestone in Synthesis AI’s journey to enable enterprise, industrial and public sector customers to simulate reality by synthesizing any person, place or object. Applications include simulation and synthetic data to train computer vision models in VFX, AR/VR, and media and content creation.
The new text-to-3D digital human capabilities will be available to a select group of beta testers starting in Q2 this year.
“Opening up the capability to external users will allow us to leverage community feedback to further refine the underlying generative models,” explained Behzadi. “Reinforcement learning from human feedback (RLHF) is key to continually improving the performance of the underlying models and discovering edge cases.”
Behzadi said that by combining generative AI with cinematic visual effects pipelines, companies would be able to synthesize the world, including humans, environments, and objects.
“We hope to continue to innovate and lower the bar for developers to create assets and synthetic data to drive the state-of-the-art forward in computer vision,” he added.