Nvidia ACE for Games Aims to Bring Interactive NPCs to Life Using the NeMo LLM
The new solution leverages three Nvidia generative AI models
Nvidia announced Avatar Cloud Engine (ACE) for Games this week at Computex. The new solution utilizes generative AI to enhance non-player characters (NPCs) in video games. ACE for Games is designed to assist game developers in creating more realistic and dynamic virtual characters that can interact with players. According to the announcement:
The creation of non-playable characters (NPCs) has evolved as games have become more sophisticated. The number of pre-recorded lines has grown, the number of options a player has to interact with NPCs has increased, and facial animations have become more realistic.
Yet player interactions with NPCs still tend to be transactional, scripted, and short-lived, as dialogue options exhaust quickly, serving only to push the story forward. Now, generative AI can make NPCs more intelligent by improving their conversational skills, creating persistent personalities that evolve over time, and enabling dynamic responses that are unique to the player.
Generative AI Triple Play
By employing multiple AI models, ACE can generate NPCs with distinct personalities, behaviors, and dialogue. Nvidia says this enables the NPCs to interact more like real players and generate personalized responses to other players.
The ACE generative AI models include NeMo for large language models (LLMs), Riva for automatic speech recognition (ASR) and text-to-speech, and Audio2Face for animating avatar faces so that speech looks natural. Developers can use ACE for Games to build new models or modify existing models that employ neural networks optimized for different game and system capabilities.
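The division of labor among the three models can be pictured as a simple chain: speech in, text reply, speech out, facial animation. The Python sketch below is only an illustration of that flow under stated assumptions. It does not call NeMo, Riva, or Audio2Face; every function name is hypothetical, and the stand-in stages exist only so the example runs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NpcTurn:
    player_text: str          # what the speech recognizer heard
    reply_text: str           # what the language model answered
    reply_audio: bytes        # synthesized speech for the reply
    face_frames: List[float]  # animation values derived from the audio


def run_npc_turn(
    player_audio: bytes,
    recognize: Callable[[bytes], str],        # ASR stage (Riva's role)
    generate: Callable[[str], str],           # LLM stage (NeMo's role)
    synthesize: Callable[[str], bytes],       # text-to-speech stage (Riva's role)
    animate: Callable[[bytes], List[float]],  # facial animation (Audio2Face's role)
) -> NpcTurn:
    """Run one player-to-NPC exchange through the four stages in order."""
    player_text = recognize(player_audio)
    reply_text = generate(player_text)
    reply_audio = synthesize(reply_text)
    face_frames = animate(reply_audio)
    return NpcTurn(player_text, reply_text, reply_audio, face_frames)


# Hypothetical stand-in stages; a real integration would call actual services.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


def fake_animate(audio: bytes) -> List[float]:
    # One value in [0, 1] per byte of audio, standing in for a jaw-open curve.
    return [b / 255 for b in audio]


turn = run_npc_turn(
    b"hello", fake_recognize, fake_generate, fake_synthesize, fake_animate
)
print(turn.reply_text)  # prints "You said: hello"
```

Passing each stage in as a function keeps the chain independent of any particular vendor or model, which mirrors the article's point that developers can swap in models tuned for different games and hardware.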
Generative AI has recently been a hot topic in the video game industry. Roblox and Unity are experimenting with generative AI tools for game creation, Didimo's Popul8 uses generative AI to modify character templates, Scenario employs generative AI for video game art, and Latitude's AI Dungeon leverages advanced language models for generating game text and character interactions. Even Google has experimented with using its Imagen text-to-image AI tool to add graphics to a classic text adventure game.
AI Safety Included
NeMo's LLM alignment features are intended to add realism and user safety. A feature Nvidia calls behavior cloning is designed to let developers train a base language model to perform role-playing tasks based on player instructions. Nvidia provided no additional information about this feature, though it sounds interesting.
In the future, Nvidia plans to offer reinforcement learning from human feedback (RLHF). This will enable designers to provide real-time input during the NPC's development and testing processes to further align behavior with desired expectations.
Implementing NeMo Guardrails is a final step in NPC alignment. ACE provides programmable rules for NPCs, ensuring that game characters adhere to criteria such as accuracy, appropriateness, relevancy, and security. NeMo Guardrails also supports LangChain, so developers can add prompt engineering alignment beyond fine-tuning.
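To make "programmable rules" concrete, here is a minimal Python sketch of the general idea: each rule inspects a candidate NPC reply, and the first violation swaps in a fallback line. This is not the NeMo Guardrails API. The rule names, the fallback text, and the sample replies are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Tuple

# A rule returns a reason string when the reply violates it, otherwise None.
Rule = Callable[[str], Optional[str]]


def max_words(limit: int) -> Rule:
    """Build a rule that rejects replies longer than `limit` words."""
    def check(reply: str) -> Optional[str]:
        if len(reply.split()) > limit:
            return f"reply exceeds {limit} words"
        return None
    return check


def forbid_terms(terms: List[str]) -> Rule:
    """Build a rule that rejects replies containing any listed term."""
    lowered = [t.lower() for t in terms]

    def check(reply: str) -> Optional[str]:
        text = reply.lower()
        for term in lowered:
            if term in text:
                return f"reply mentions forbidden term '{term}'"
        return None
    return check


def apply_rules(
    reply: str, rules: List[Rule], fallback: str
) -> Tuple[str, Optional[str]]:
    """Return (reply, None) if every rule passes, else (fallback, reason)."""
    for rule in rules:
        reason = rule(reply)
        if reason is not None:
            return fallback, reason
    return reply, None


rules = [max_words(20), forbid_terms(["password"])]
fallback = "I'd rather talk about something else."

safe, reason = apply_rules("Welcome to my shop!", rules, fallback)
blocked, why = apply_rules("The admin password is hunter2.", rules, fallback)
print(safe)     # prints "Welcome to my shop!"
print(blocked)  # prints "I'd rather talk about something else."
```

Returning the reason alongside the text lets a designer log why a line was blocked during testing, without exposing that detail to the player.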
Game Developer Use and Demo
The video at the top of this post used Nvidia’s ACE for Games with the Convai services platform to enable the NPC Jin to respond naturally to open-ended player conversation. Unreal Engine 5 and MetaHuman were used to create the scene. Purnendu Mukherjee, founder and CEO at Convai, commented:
“With NVIDIA ACE for Games, Convai’s tools can achieve the latency and quality needed to make AI non-playable characters available to nearly every developer in a cost-efficient way.”
Nvidia also said several other game studios were using Nvidia generative AI Audio2Face services, including:
GSC Game World, one of Europe’s leading game developers, is adopting Audio2Face in its upcoming game, S.T.A.L.K.E.R. 2: Heart of Chornobyl.
Fallen Leaf, an indie game developer, is also using Audio2Face for character facial animation in Fort Solis, a third-person sci-fi thriller game that takes place on Mars.
Generative AI-focused companies such as Charisma.ai are leveraging Audio2Face to power the animation in their conversation engine.
Business Alignment
Nvidia’s move to package its generative AI solutions for developers is smart. The company is already well known in the gaming community and works closely with game makers to enable increasingly sophisticated video game graphics. Game developers also have specific needs around control and low-latency performance, two areas where Nvidia believes it has a competitive advantage.
Generalized foundation models are available, and many are robust. But their features are largely undifferentiated beyond a threshold level of performance. You should expect many more examples of industry and niche-based generative AI solution packaging throughout the remainder of 2023. Useful generative AI capabilities exist. The trick now is optimizing them around customers and use cases.