Build a Custom Personality for Real Time Video AI

Yujian Tang

August 26, 2024

Table of Contents

One of the coolest things about being able to converse with an AI in real time is the ability to customize it. Imagine, you can make an AI that can act like anything you want it to be. It can be a life coach, a celebrity, or even your best friend. Does this sound like it could be difficult? Well, it’s not. You can do it all with just a simple API call via Tavus, all you need is an API key, Python, and an internet connection.

In this article we cover:

A Review of Conversational AI
What is a Custom Persona?
How Do You Make a Custom Personality for Your AI?
Summary of Building a Custom Personality for Real Time Video AI

Review of Conversational AI

Tavus’ recent launch of conversational AI allows you to build a real time video AI conversation in just a few lines of code. Each conversation consists of a room, a replica, and a persona. The room is a Daily room by default. You can pick a replica from the list of stock replicas, or create your own. Finally, there’s a persona, and in this tutorial, we’re going to cover how you can make a persona.

For a full overview, check out part one - how to build a real time video AI in 5 minutes.

What is a Custom Persona?

In simple terms, a custom persona is a person that you load up to talk to that you have given custom instructions to. In Tavus, personas are powered by LLMs. There are three main levers you can pull for customization. One, the system prompt. Two, the context. Three, the LLM itself. Beyond these three, the other tweak you can make is that you can bring your own text-to-speech engine as well.

How Do You Make a Custom Personality for Your AI?

Just like creating replicas, conversations, and videos, personas are created via an API interface. Let’s look into some sample code to make a “Life Coach” persona. We’ll start with simply importing `requests` so we can make API requests, and the set the URL that we need, “https://tavusapi.com/v2/personas”.

import requests

url = "https://tavusapi.com/v2/personas"

`‍`Parts of the Persona

Now, we need to make the payload for the API call to make the persona. The first parameter we’ll look at is persona_name. In this example, we’ll call it “Life Coach”. Next, we get to the interesting parts - system_prompt and context.

Think about this like the pieces of setting up an LLM. The system prompt tells the LLM how it should act, and what its goals are. In this example, you can see that it starts with “As a life coach, …” When you’re working with LLMs, the classic system prompt starts off with something like “you are a helpful assistant.”

The next interesting piece is the context. This is the next piece of crafting your AI’s custom personality. If the system prompt is hiring your LLM for a job, the context is like the first month of training. In this example, we use it to give the LLM some examples of things that it has done. Next, we have the default_replica_id parameter, which can be taken from the stock replicas or your own.

The last piece of the puzzle, layer, is actually a set of multiple layers. First, the LLM itself. The way this works is that it’s actually just an API call. It doesn’t have to actually be an LLM, but the API call must be able to take and output strings like an LLM. You can define this like an LLM via the model, a base URL, and an API key for access.

The other two pieces of the layers are the text-to-speech (TTS), and VQA (vision question answer). For the TTS, you need to specify the API key, which text to speech engine you’re using, and an external voice ID. For VQA, there’s only one parameter. You can either enable vision for the persona or not. Enabling vision allows the replica to “see” you, it gives the replica the ability to do image processing via the video input.

payload = {

"persona_name": "Life Coach",

"system_prompt": "As a Life Coach, you are a dedicated professional who specializes in...",

"context": "Here are a few times that you have helped an individual make a breakthrough in...",

"default_replica_id": "r79e1c033f",

"layers": {

"llm": {

"model": "<string>",

"base_url": "your-base-url",

"api_key": "your-api-key"

},

"tts": {

"api_key": "your-api-key",

"tts_engine": "cartesia",

"external_voice_id": "external-voice-id"

},

"vqa": {"enable_vision": "false"}

}

‍

With everything put together, all we have to do is send a POST request. For that, the last thing we need to create is a set of headers. The headers only need to contain your Tavus API key and the content type, which is “application/json”. Everything is created: the URL, the JSON payload, and the headers. Now, you just send a POST request.

headers = {

"x-api-key": "<api-key>",

"Content-Type": "application/json"

}

response = requests.request("POST", url, json=payload, headers=headers)

print(response.text)You’ll get back three pieces: a persona ID, a persona name (which you set earlier), and when the persona was created.

Summary of Building a Custom Personality for Real Time Video AI

In this article, we covered how you can build a custom personality for your real time video AI. We started by reviewing how conversational AI works - you need a virtual room, a replica, and a persona. The virtual room is the interface, the replica is what appears on screen, and the persona is how the replica interacts with you. You can create your own replicas, or use the stock replicas available on Tavus.

We focused on creating the persona, this piece covers the basics of how to create a persona. The persona requires a system prompt and some context for the LLM that operates it. You can also bring your own custom LLM through an API endpoint, as well as specify your own text-to-speech engine and vision question answering. In the future, we’ll look at covering more pieces of customizing a conversational AI such as recording it, what a custom LLM could look like, and retrieving transcripts.