How It’s Made: AI Roadtrip, a Pixel Marketing campaign Powered by Generative AI and Followers – Uplaza

What occurs when two telephones cease being rivals and begin being pals? You get the world’s friendliest aggressive marketing campaign: Greatest Telephones Eternally. Throughout 17 episodes, this sequence has taken the telephones on a variety of adventures and constructed a loyal viewers of followers.

Partaking straight with that fan group has at all times been a part of the Greatest Telephones Eternally playbook. For final 12 months’s sequence launch, our workforce skilled a LLM on the tone of the marketing campaign to assist group managers generate friendship-themed responses to 1000’s of feedback. And with fast developments in generative expertise, we noticed a possibility to take that spirit of real-time engagement at scale even additional.

Enter Greatest Telephones Eternally: AI Roadtrip — our first experiment in utilizing generative AI to place followers within the driver’s seat and convey these characters to life.

Right here’s the way it works: An episode on Instagram Reels explains that the 2 characters are happening a street journey powered by AI. When a fan feedback with a location concept, our workforce makes use of a purpose-built software to generate a customized video response inside minutes. Over 16 hours, we plan to create as many distinctive replies as potential.

Utilizing generative AI to create customized, fan-driven content material at scale

Working with our companions The Mill and Left Area Labs, we used a stack of Google AI fashions to design a software that balances machine effectivity with human ingenuity. We’re hoping a few of our takeaways encourage you to discover your individual artistic purposes of those applied sciences.

To see the activation in motion, go to @googlepixel_us on Instagram


In regards to the software

After a consumer feedback a instructed location, we take that location – for instance, “the Grand Canyon” – and enter it into our technology engine to provide personalized belongings:

  • Script Technology: Gemini 1.5 Professional generates a number of scripts based mostly on the commented location, incorporating location-specific references and humor.
  • Picture Technology: Imagen creates a gallery of potential background photos that match the script’s context, setting the scene for the journey.
  • Audio Technology: Cloud Textual content-to-Speech outputs the dialogues from the generated scripts, giving voice to our cellphone besties.

Our artistic workforce is within the loop at every step, choosing, modifying, reviewing, and infrequently re-prompting to verify each video feels prefer it’s really a part of the Greatest Telephones Eternally universe.

The constructing blocks of our reply-generation software

How scripts are generated

We would have liked Gemini to reliably produce scripts within the voice of the marketing campaign, with the proper characters, size, formatting, and magnificence, whereas additionally being entertaining and true to no matter location a consumer instructed.

We discovered the best method to do that wasn’t with prolonged instructions, however by offering quite a few examples within the immediate. Our writers created quick scripts about Pixel and iPhone in several areas and the sorts of dialog they may have in every place.

Sorry, your browser does not assist playback for this video

Our script technology immediate

Feeding these into Gemini as a part of the system immediate achieved two issues. First, it set in place the specified size and construction of our generated scripts, with every cellphone taking a flip in a 4-6-line format. Second, it conditioned the mannequin to output the sorts of dialogue we wished to listen to in these movies (observations concerning the location, phone-related humor, pleasant banter, and various dad jokes).

We designed this immediate to work as a co-writer with human writers, so an vital consideration was ensuring Gemini would produce a variety of scripts that centered on completely different facets of a location and take completely different approaches to the dialog between Pixel and iPhone. That method, our human writers may choose from quite a lot of scripts to both select the one which labored greatest, make edits, or mix scripts.

To make sure this breadth of responses, we had Gemini write scripts conversationally. After Gemini produced one script, we requested it to provide a distinct one, after which a distinct one, and so forth, all within the context of a single dialog. That method, it may see the scripts that had been beforehand generated and ensure the brand new ones lined new floor — giving the human curators a variety of choices.


How photos are created

We used Imagen 2 to supply the picture technology for our backgrounds. As Google’s newest usually obtainable mannequin, it gave our workforce the power to generate the big variety of areas and kinds that this marketing campaign required, with highly effective natural-language controls to assist us tune every output.

We wished Imagen to create backgrounds for all types of areas, however we additionally wished the backgrounds to be compositionally just like accommodate Pixel and iPhone driving within the foreground.

Merely prompting the mannequin with the situation like “Paris” or “the dark side of the moon” would yield photos that seemed just like the areas, however had been inconsistent each stylistically and compositionally. Some can be too zoomed out, some can be black and white, and a few wouldn’t have any space on which Pixel and iPhone may “drive.”

Including further directions may assist generate higher photos, however we discovered tailoring that language to every location was handbook and time-consuming. That’s why we determined to make use of Gemini to generate the picture prompts. After a human author inputs a location, Gemini creates a immediate for that location based mostly on a lot of pattern prompts written by people. That immediate is then despatched to Imagen, which generates the picture.

Utilizing Gemini to generate extra detailed, particular background photos

We discovered utilizing AI-generated prompts yielded photos that had been each extra compositionally constant and likewise extra visually attention-grabbing. The background of our movies aren’t simply static belongings, although; as soon as they’re ingested into Unreal Engine, they grow to be a vital a part of the scene – extra on that within the part beneath.


How sound is created

After we finalize the scripts, we ship every line to Cloud Textual content-to-Speech to generate the audio. This is identical course of we’ve used for all the character voices within the Greatest Telephones Eternally marketing campaign.

Whereas we lean on Cloud TTS to synthesize high-fidelity, natural-sounding speech, our voices for Pixel and iPhone have their very own traits. Right here, we haven’t discovered an AI mannequin that may actually assist our creatives to hit the particular timbre and cadence we would like. As an alternative, we use inner tooling so as to add emphasis and inflections to essentially carry our characters to life.

Artistic tuning on TTS voice outputs

Some movies even have ambient audio beneath the dialogue. We use a mixture of composed sound results, area recordings, and, in fact, AI-generated audio with MusicFX to create soundscapes for the situation and add an additional contact of realism.


The way it all comes collectively

As soon as all the constituent belongings are produced, they mechanically populate a render queue to be ingested by Unreal Engine and composited right into a 3D scene with iPhone, Pixel, and the automotive.

The background picture wraps across the rear and sides of the scene, offering not simply the background for the straight-on pictures of the telephones and the automotive, however the angled views we see when the digicam strikes to spotlight one character talking. Components of the background are captured within the reflections on the automotive hood and even the glass of the telephones’ cameras, whereas the sky above interacts with the lighting of the scene so as to add much more element and realism.

Our nonlinear animation editor permits our creatives so as to add movement to every particular person cellphone in all of our digicam positions. For example, if a cellphone asks a query, they could orient in the direction of the opposite cellphone, somewhat than searching the window or by the windshield, leaning and tilting in a tentative method. Statements, jokes, settlement, and shock all of their very own distinctive animations, and we seamlessly interpolate between all of them and our relaxation state.

Sorry, your browser does not assist playback for this video

Our web-based modifying software

Lastly, our creatives can activate the dynamic parts and textures that actually personalize every video – like mud splatter on the hood for rustic areas and quite a lot of hats for (most) climate circumstances. Some areas may additionally benefit a complete transformation of the automotive, from trusty rover to submarine or spaceship.

Creatives can preview their video’s VO, digicam cuts, and first animations earlier than hitting render. As soon as they’re prepared, all the render jobs are dispatched throughout 15 digital machines on Google Cloud Compute. From begin to end, a brief video might be generated in as little as 10 minutes, together with render time.


Closing ideas and subsequent steps

Utilizing generative AI for artistic improvement and manufacturing is not a brand new concept. However we’re excited to have constructed an utility that stacks collectively Google’s newest, production-ready fashions in a novel method, that takes an concept to real-time supply at scale.

A typical Greatest Telephones Eternally video takes weeks to write down, animate, and render. With this software, our creatives hope to generate lots of of customized mini-episodes in a single day — all impressed by the creativeness of the Pixel group on social.

We hope that this experiment provides you a glimpse of what’s potential utilizing the Gemini and Imagen APIs, no matter your artistic vacation spot could also be.

Share This Article
Leave a comment

Leave a Reply

Your email address will not be published. Required fields are marked *

Exit mobile version