
Nvidia’s 20 Siggraph papers highlight generative AI, 3D characters and hair


May 2, 2023



Nvidia Graphics Research said it has 20 research papers debuting at SIGGRAPH, the big computer graphics conference coming to Los Angeles.

The papers highlight generative AI, which has taken the world by storm in the past year with the text prompts of ChatGPT and the imagery of AI-created art.

And Nvidia is also showing off more traditional 3D graphics such as natural-looking hair, said Aaron Lefohn, vice president of graphics research at Nvidia, in an interview with GamesBeat. Lefohn said that the research will enable developers and artists to bring their ideas to life — whether still or moving, in 2D or 3D, hyperrealistic or fantastical.

The papers come from collaborations on generative AI and neural graphics with over a dozen universities in the U.S., Europe and Israel. They will be presented at SIGGRAPH 2023, which takes place August 6 to August 10 in Los Angeles.



The papers include generative AI models that turn text into personalized images; inverse rendering tools that transform still images into 3D objects; neural physics models that use AI to simulate complex 3D elements with stunning realism; and neural rendering models that unlock new capabilities for generating real-time, AI-powered visual details.

Nvidia regularly shares its researchers’ innovations with developers on GitHub and they’re incorporated into products, including the Nvidia Omniverse platform for building and operating metaverse applications and Nvidia Picasso, a recently announced foundry for custom generative AI models for visual design.

Years of Nvidia graphics research helped bring film-style rendering to games, like the recently released Cyberpunk 2077 Ray Tracing: Overdrive Mode.

More reflections.

Lefohn said Nvidia Research has more than 300 scientists and engineers around the world. Those teams have done a lot of research that has been turned into products, like ray-tracing cores and DLSS.

“Here’s a bunch of graphics inventions that might someday make our products even better,” said Lefohn.

He noted that Nvidia worked with CD Projekt Red to release Cyberpunk 2077: Overdrive Mode, the world’s first triple-A path-traced game, where the graphics are rendered the way films are, with true physical simulation of light.

“It’s been a goal of the field for a long time. And we’re finally now at this turning point where that’s possible. And this is due to all of Nvidia graphics: the hardware team, the software team, the dev tech team, but also research,” he said.

“You may have an idea that you want to get into some visual imagery, eventually an interactive virtual world,” Lefohn said. “But it can start with 2D storyboarding, and there are all kinds of tools of generative AI now for doing storyboarding, from text to photoreal images. And we have some papers we’ll talk to you about that are helping personalize that so you generate images in that storyboarding phase that are closer to your world rather than just being generic stock imagery.”

The research advancements presented this year at SIGGRAPH will help developers and enterprises rapidly generate synthetic data to populate virtual worlds for robotics and autonomous vehicle training. They’ll also enable creators in art, architecture, graphic design, game development and film to more quickly produce high-quality visuals for storyboarding, previsualization and even production.

“The cost of creating that content is like the single hardest problem, the most expensive piece of the puzzle,” said David Luebke, vice president of research, in an interview with GamesBeat. “That’s why this is super exciting to graphics people, even though it’s not necessarily going to change the way we do ray tracing, right? It’s going to fit into how we do ray tracing, and it’s going to generate the content that we ray trace. And that’s what we’re so excited about.”

Customized text-to-image AI models

Nvidia Research is studying teddy bears.

Generative AI models that transform text into images are powerful tools to create concept art or storyboards for films, video games and 3D virtual worlds. Text-to-image AI tools can turn a prompt like “children’s toys” into nearly infinite visuals a creator can use for inspiration — generating images of stuffed animals, blocks or puzzles.

However, artists may have a particular subject in mind. A creative director for a toy brand, for example, could be planning an ad campaign around a new teddy bear and want to visualize the toy in different situations, such as a teddy bear tea party. To enable this level of specificity in the output of a generative AI model, researchers from Tel Aviv University and Nvidia have published two SIGGRAPH papers that enable users to provide image examples that the model quickly learns from.

One paper describes a technique that needs a single example image to customize its output, accelerating the personalization process from minutes to roughly 11 seconds on a single Nvidia A100 Tensor Core GPU, more than 60 times faster than previous personalization approaches.

A second paper introduces a highly compact model called Perfusion, which takes a handful of concept images to allow users to combine multiple personalized elements — such as a specific teddy bear and teapot — into a single AI-generated visual.

I asked how much more researchers could do with generative AI, given it is already out in the public.

“It’s a reasonable question. AI has become very mainstream,” said Luebke. “It’s very much in the public mind. What’s left to do?”

One thing the researchers are doing is increasing the capability of the generative AI, or increasing your ability to generate something that you envision, something that is personalized or customized to you, Luebke said.

“There is the question of how you can make it performant,” he said. “How do you take something like a meme or postcard or high-quality image that you can use to make moving 3D models? Working in Israel with colleagues at Tel Aviv University, the researchers can now take a single image and customize the output.”

Another paper focuses on the ability to combine concepts and blend them together.

“We’re going to see a lot more generative AI this year,” he said.

Serving in 3D: Advances in inverse rendering and character creation

Nvidia A100 80GB GPU

Once a creator comes up with concept art for a virtual world, the next step is to render the environment and populate it with 3D objects and characters. Nvidia Research is inventing AI techniques to accelerate this time-consuming process by automatically transforming 2D images and videos into 3D representations that creators can import into graphics applications for further editing.

A third paper, created with researchers at the University of California, San Diego, discusses tech that can generate and render a photorealistic 3D head-and-shoulders model based on a single 2D portrait — a major breakthrough that makes 3D avatar creation and 3D video conferencing accessible with AI. The method runs in real time on a consumer desktop, and can generate a photorealistic or stylized 3D telepresence using only conventional webcams or smartphone cameras.

A fourth project, a collaboration with Stanford University, brings lifelike motion to 3D characters. The researchers created an AI system that can learn a range of tennis skills from 2D video recordings of real tennis matches and apply this motion to 3D characters. The simulated tennis players can accurately hit the ball to target positions on a virtual court, and even play extended rallies with other characters.

On the tennis example, Luebke noted that 3D content about how people move is often the hardest to create, the most expensive, and the most difficult to capture by hand.

“What’s interesting about this is that these 3D rendered characters, which are supposed to look like a state-of-the-art tennis game, are created by watching videos of tennis,” Luebke said.

Beyond the test case of tennis, this SIGGRAPH paper addresses the difficult challenge of producing 3D characters that can perform diverse skills with realistic movement — without the use of expensive motion-capture data.

Not a hair out of place: Neural physics enables realistic simulations


Once a 3D character is generated, artists can layer in realistic details such as hair — a complex, computationally expensive challenge for animators.

Humans have an average of 100,000 hairs on their heads, with each reacting dynamically to an individual’s motion and the surrounding environment. Traditionally, creators have used physics formulas to calculate hair movement, simplifying or approximating its motion based on the resources available. That’s why virtual characters in a big-budget film sport much more detailed heads of hair than real-time video game avatars.
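To make the traditional approach concrete, here is a minimal sketch of the kind of physics formula animators have long used for a single strand: a chain of particles integrated with damped Verlet, with relaxation passes that keep each segment at its rest length. All the numbers (segment count, lengths, damping) are illustrative choices, not values from Nvidia's work.

```python
import numpy as np

def simulate_strand(n_segments=20, seg_len=0.01, steps=240, dt=1/240,
                    damping=0.9, relax_passes=15):
    """Classic (non-neural) hair-strand approximation: a particle chain
    under gravity, integrated with damped Verlet, plus distance
    constraints that pull each segment back to its rest length."""
    gravity = np.array([0.0, -9.8, 0.0])
    # Start hanging straight down from a root pinned at the origin.
    pos = np.zeros((n_segments + 1, 3))
    pos[:, 1] = -np.arange(n_segments + 1) * seg_len
    prev = pos.copy()

    for _ in range(steps):
        # Damped Verlet step: the velocity term (pos - prev) decays.
        new = pos + damping * (pos - prev) + gravity * dt * dt
        prev, pos = pos, new
        pos[0] = 0.0  # the root stays attached to the scalp

        # Relaxation passes restore each segment's rest length.
        for _ in range(relax_passes):
            for i in range(n_segments):
                d = pos[i + 1] - pos[i]
                dist = np.linalg.norm(d)
                corr = (dist - seg_len) / dist * d
                if i == 0:
                    pos[1] -= corr              # never move the pinned root
                else:
                    pos[i] += 0.5 * corr
                    pos[i + 1] -= 0.5 * corr
    return pos

strand = simulate_strand()
```

Note the cost: even this toy runs steps × passes × segments constraint updates for one strand. Multiplying that by 100,000 strands is why real-time games simplify so aggressively.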

“We’ll share some breakthroughs in neural physics with hair simulation. Approximating the complex physics is traditionally done offline. With AI models that we can run in real time, we can bring much higher-fidelity hair simulation to real time than we’ve ever been able to do,” Lefohn said. “And with rendering, we’re showing how we are able to replace traditional handwritten components of the rendering pipeline with replacements that are far more capable. We’re bringing AI and graphics closer together than ever.”

Luebke said that creating physically simulated hair used to take a lot of computation. Now that computation can be compressed so that small neural networks can run it interactively. This can provide a much higher fidelity simulation.

A fifth paper showcases a method that can simulate tens of thousands of hairs in high resolution and in real time using neural physics, an AI technique that teaches a neural network to predict how an object would move in the real world.

The team’s novel approach for accurate simulation of full-scale hair is specifically optimized for modern GPUs. It offers significant performance leaps compared to state-of-the-art, CPU-based solvers, reducing simulation times from multiple days to merely hours — while also boosting the quality of hair simulations possible in real time. This technique finally enables both accurate and interactive physically based hair grooming.
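The core idea of neural physics — train a model on a solver's output, then let the model stand in for the solver at runtime — can be sketched in a few lines. This toy uses a damped spring as the "expensive" simulation and a single linear layer fit by least squares as the learned surrogate; it is an illustration of the general technique, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth "physics": a damped spring stepped with explicit Euler.
# state = (position, velocity)
k, c, dt = 4.0, 0.5, 0.01

def step(state):
    x, v = state[..., 0], state[..., 1]
    return np.stack([x + v * dt, v + (-k * x - c * v) * dt], axis=-1)

# Collect (state, next state) training pairs from random starts, then
# fit a one-layer linear "network" by least squares.
states = rng.uniform(-1, 1, size=(1000, 2))
targets = step(states)
W, *_ = np.linalg.lstsq(states, targets, rcond=None)

# At runtime the learned model substitutes for the solver.
def rollout(state, n, model):
    for _ in range(n):
        state = model(state)
    return state

s0 = np.array([1.0, 0.0])
truth = rollout(s0, 500, step)
learned = rollout(s0, 500, lambda s: s @ W)
err = np.abs(truth - learned).max()
```

Because this toy dynamic is linear, the fit is essentially exact; real hair dynamics are nonlinear, which is why the actual work needs deep networks rather than a least-squares fit, but the train-offline, predict-in-real-time pattern is the same.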

Neural rendering brings film-quality detail to real-time graphics

Neural texture compression.

After an environment is filled with animated 3D objects and characters, real-time rendering simulates the physics of light reflecting through the virtual scene. Recent Nvidia research shows how AI models for textures, materials and volumes can deliver film-quality, photorealistic visuals in real time for video games and digital twins.

Nvidia invented programmable shading over two decades ago, enabling developers to customize the graphics pipeline. In these latest neural rendering inventions, researchers extend programmable shading code with AI models that run deep inside Nvidia’s real-time graphics pipelines.

With programmable shading, Nvidia has been able to do AI computations with graphics shaders, replacing huge swaths of previously handwritten code with AI models that are capable of so much more, Lefohn said.

In a sixth SIGGRAPH paper, Nvidia will present neural texture compression that delivers up to 16 times more texture detail without taking additional GPU memory. Neural texture compression can substantially increase the realism of 3D scenes, as seen in the image below, which demonstrates how neural-compressed textures capture sharper detail than previous formats, where the text remains blurry.
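Some back-of-the-envelope arithmetic puts that "16 times more texture detail" claim in context. The formats and resolutions below are generic illustrations (uncompressed RGBA8 and a classic 8-bits-per-texel block format such as BC7), not figures from the paper.

```python
def texture_bytes(side, bits_per_texel, with_mips=True):
    """Bytes for a square texture; a full mip chain adds ~1/3 on top."""
    base = side * side * bits_per_texel // 8
    return base * 4 // 3 if with_mips else base

raw_4k = texture_bytes(4096, 32)    # RGBA8, uncompressed
bc_4k  = texture_bytes(4096, 8)     # 8-bpp block compression (e.g. BC7)
bc_16k = texture_bytes(16384, 8)    # 16x the texels of the 4K texture

print(f"4K raw: {raw_4k / 2**20:.0f} MiB, "
      f"4K block-compressed: {bc_4k / 2**20:.0f} MiB, "
      f"16K block-compressed: {bc_16k / 2**20:.0f} MiB")
```

A 16K texture holds 16 times the texels of a 4K one, so in a fixed-bitrate block format it also costs 16 times the memory. Fitting that level of detail into roughly the 4K budget is what a learned, content-adaptive compressor is claimed to buy.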

Neural materials

A seventh paper features NeuralVDB, an AI-enabled data compression technique that decreases by 100 times the memory needed to represent volumetric data — like smoke, fire, clouds and water.
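A toy calculation shows why volumetric data compresses so dramatically: most voxels in a smoke or cloud grid are empty, so even naive sparse storage beats a dense array before any learned compression is applied. The density field below is a made-up spherical shell, not data from NeuralVDB, which goes much further by adding neural compression on top of sparsity.

```python
import numpy as np

# Dense 64^3 float grid holding a thin "cloud shell" of density.
n = 64
axis = np.linspace(-1.0, 1.0, n)
coords = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
r = np.linalg.norm(coords, axis=-1)
density = np.exp(-500.0 * (r - 0.3) ** 2)
occupied = density > 0.01

dense_bytes = density.astype(np.float32).nbytes
# Naive sparse layout: one float32 value + three int16 coordinates
# per occupied voxel.
sparse_bytes = int(occupied.sum()) * (4 + 3 * 2)
ratio = dense_bytes / sparse_bytes
```

Here sparsity alone yields roughly an order of magnitude; replacing the stored values themselves with a small neural network is how a technique like NeuralVDB reaches reductions on the order of 100 times.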

Nvidia also released today more details about neural materials research that was shown in the most recent Nvidia GTC keynote. The paper describes an AI system that learns how light reflects from photoreal, many-layered materials, reducing the complexity of these assets down to small neural networks that run in real time, enabling up to 10 times faster shading.

The level of realism can be seen in this neural-rendered teapot, which accurately represents the ceramic, the imperfect clear-coat glaze, fingerprints, smudges and even dust.

Nvidia is looking closely at the connection between generative AI and 3D computer graphics, Luebke said.

“We’re seeing amazing new technology there,” Luebke said. “We’ve seen stories about general AI. But there’s still a very long way to go before general AI is creating an interactive experience for you in something like a massively multiplayer computer game.”

