This workshop is designed to help us bring AI livestream agents to life using accessible AI tools.
Together, we’ll design a character, create the required visual assets, and prepare a live-streaming agent for launch on AITV.
No prior skills are required. No drawing, 3D modeling, animation, or coding experience is needed.
By the end of this workshop, you’ll have:
- A defined visual identity for the agent
- The core assets needed to go live
1. Avatar
1.1 Character Design
For this workshop, I’ll show how to use image generation with ChatGPT to design our character.
Advanced creators are, of course, free to design and refine images later using tools like Photoshop or Procreate.
We can approach character design in two ways:
1. Start from something existing
This can be a PFP, NFT, illustration, or any image that already represents the character and that we want to extend or adapt.
When using an existing design, make sure that you own the rights to the image and that it does not violate any third-party copyrights.
Prompt Instruction
Make yourself familiar with the character shown in the uploaded image.
Analyze the character’s visual identity, including overall shape, proportions, style, color palette, facial features, and accessories.
Do not redesign or reinterpret the character.
Then, in a follow-up message:
Based on the character in the image, generate a clean and consistent humanoid character design that preserves the original identity with the following character design parameters:
2. Start from scratch
We create the character using a descriptive text prompt, suggested by the AITV Studio QuickStart.
Other large language models (such as ChatGPT) can also be helpful when refining character descriptions and prompts.
Prompt Instruction
Create a humanoid character design with the following character design parameters:
Character Design Parameters
When creating a character from scratch, we define a small set of parameters.
These guide the image generation and help keep the character consistent across iterations.
You don’t need to fill out everything perfectly on the first try — this is a starting point.
Personality Traits
These describe how the character tends to behave or be perceived.
Examples:
- confident, shy, awkward
- playful, sarcastic, wholesome
- calm, intense, analytical
Limit this to 2–4 traits to keep the design readable.
Vibe / Energy
This describes the overall emotional energy the character gives off.
Examples:
- high-energy, relaxed, grounded
- chaotic, composed, friendly
- mysterious, bold, understated
Think of this as how the character feels on screen at first glance.
Visual Style
This defines the artistic direction of the character.
Examples:
- cartoon
- anime
- semi-realistic
Choose one main style and stick to it. Mixing styles too early can make the design harder to translate later.
Body Proportions
This controls how stylized the character is.
Examples:
- realistic proportions
- stylized with a larger head
- compact body with shorter limbs
More stylized proportions often improve readability and expressiveness on stream.
Accessories
These are the visual elements that make the character recognizable.
Examples:
- glasses, hat, or specific hairstyle
- signature clothing item
- jewelry, symbol, or pattern
One or two accessories are usually enough.
Color Palette
This defines the dominant colors used in the character design.
Examples:
- muted or pastel tones
- dark or high-contrast colors
- limited palette with one accent color
A smaller palette helps with consistency across assets and backgrounds.
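If you want to keep your parameters in one place and reuse them across iterations, the sketch below shows one way to do it. It is only an illustration: the class name `CharacterParameters` and its methods are hypothetical and not part of AITV Studio or ChatGPT. It assembles the "start from scratch" prompt from the six parameters above and warns when the rules of thumb (2–4 traits, one or two accessories) are not met.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterParameters:
    """Hypothetical container for the six design parameters described above."""
    personality_traits: list[str]
    vibe: str
    visual_style: str
    body_proportions: str
    accessories: list[str] = field(default_factory=list)
    color_palette: str = ""

    def validate(self) -> list[str]:
        """Return warnings based on the guide's rules of thumb."""
        warnings = []
        if not 2 <= len(self.personality_traits) <= 4:
            warnings.append("Use 2-4 personality traits.")
        if len(self.accessories) > 2:
            warnings.append("One or two accessories are usually enough.")
        return warnings

    def to_prompt(self) -> str:
        """Build the text prompt that is pasted into the image generator."""
        lines = [
            "Create a humanoid character design with the following "
            "character design parameters:",
            f"Personality Traits: {', '.join(self.personality_traits)}",
            f"Vibe / Energy: {self.vibe}",
            f"Visual Style: {self.visual_style}",
            f"Body Proportions: {self.body_proportions}",
            f"Accessories: {', '.join(self.accessories) or 'none'}",
            f"Color Palette: {self.color_palette}",
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    character = CharacterParameters(
        personality_traits=["calm", "analytical"],
        vibe="composed",
        visual_style="anime",
        body_proportions="stylized with a larger head",
        accessories=["glasses"],
        color_palette="limited palette with one accent color",
    )
    for warning in character.validate():
        print("Warning:", warning)
    print(character.to_prompt())
```

Keeping the parameters in a structure like this makes it easy to change one value (for example the visual style) and regenerate the prompt without retyping the rest.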
Neutral Pose Preparation (A-Pose / T-Pose)
Before moving on to asset creation, we need a neutral reference version of the character.
This version will be used as the base for:
- 2D and 3D asset generation
- Rigging
- Animation
- Facial expressions
The character should be:
- Humanoid
- Shown in a full-body view
- Facing forward
- In either an A-pose or T-pose
- With a neutral facial expression
- On a plain white or light grey background
Prompt Instruction (attach the character image file)
Generate the same character as before in a full-body view.
The character should be in a neutral T-pose, facing forward.
Keep the facial expression neutral.
Use a plain white or light grey background.
Preserve the original character’s proportions, style, colors, and accessories.
1.2 Asset Creation
Now we turn the character design into usable assets for AITV Studio.
Our studio supports 2D and 3D characters.
2D Assets
For 2D characters, we support Live2D models.
You can:
- Create a Live2D model yourself
- Follow an external Live2D tutorial
- Work with a Live2D artist
3D Assets
For 3D characters, AITV uses the VRM format.
3D Asset Generation (Meshy)
You can generate 3D character models using Meshy.ai.
Other 3D character generation tools can also be used.
Creating Anime Style Characters with VRoid Studio
You can also create characters directly in VRoid Studio.
VRoid exports characters in VRM format.
Rigging & VRM Conversion
AITV uses VRM models for 3D characters.
You can use auto-rigging tools to add a skeleton to your character.
In some cases, animations may appear broken on generated rigs, especially with highly stylized characters. This usually means the skeleton is not correctly assigned to the character’s mesh.
This can be fixed by adjusting the weight painting in Blender. Since weight painting is an advanced step, it’s often easier to slightly adjust the character design using ChatGPT and regenerate the asset instead.
If your character is already in VRM format, no conversion is required.
If your character is in FBX format, it can be converted to VRM.
This is done in Blender using the free VRM plugin.
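If you are unsure whether a downloaded model is already a VRM file, a quick check can save a trip into Blender. The sketch below relies on the fact that VRM is built on the binary glTF (GLB) container, whose files start with the four bytes `glTF`. The function name `needs_vrm_conversion` is hypothetical, and the check only looks at the file extension and container header. It does not validate the VRM metadata inside the file, so treat the result as a hint rather than a guarantee.

```python
import struct
from pathlib import Path

# First four bytes of every binary glTF (.glb) container, which VRM builds on.
GLB_MAGIC = b"glTF"


def needs_vrm_conversion(path: str) -> bool:
    """Return True if the file does not look like a VRM file.

    Assumption: a usable VRM file has the .vrm extension and a GLB header.
    The VRM-specific metadata inside the file is not inspected.
    """
    model = Path(path)
    with model.open("rb") as handle:
        header = handle.read(12)  # magic (4 bytes) + version (4) + length (4)
    if len(header) < 12:
        return True
    magic, _version, _length = struct.unpack("<4sII", header)
    looks_like_glb = magic == GLB_MAGIC
    return not (looks_like_glb and model.suffix.lower() == ".vrm")
```

For example, `needs_vrm_conversion("my_character.fbx")` returns `True` for an FBX export, which tells you to run the Blender conversion step described above.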
2. Audio
The voice brings the character’s personality to life, and music helps set the mood of the agent’s livestream.
2.1 Voice
We use ElevenLabs for our characters’ voices.
Add the following prompt to the same chat you’ve been using for character design to generate a voice description suitable for ElevenLabs.
Prompt Instruction
Based on the character we designed, write a voice description suitable for AI voice generation.
Describe:
- Tone (e.g. calm, energetic, warm, sharp)
- Speaking pace (slow, medium, fast)
- Pitch range (low, medium, high)
- Emotional baseline (neutral, upbeat, serious, playful)
- Clarity and articulation (soft, crisp, expressive)
Do not reference real people, celebrities, or fictional characters.
Write the description as a short paragraph that can be pasted directly into ElevenLabs.
Once the voice is generated, copy the Voice ID from the ElevenLabs website and add it to the Studio.
2.2 Music
You can use any audio file as background music for your agent’s livestream.
Make sure that you own the rights to the music and that it can be used for streaming.
If you don’t have music already, you can generate it using AI tools such as Suno.
3. Background
The background defines the visual context of the livestream and works together with framing and camera angle to shape how the agent appears on screen.
All backgrounds should be created in a 16:9 aspect ratio, which is the standard format for live-streaming on AITV.
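To check whether an image or video really is 16:9, compare its width and height: the ratio holds exactly when width × 9 equals height × 16. The small sketch below shows that arithmetic. The function names are hypothetical, and the resolutions in the comments are common 16:9 sizes used as examples, not sizes required by AITV.

```python
def is_16_by_9(width: int, height: int) -> bool:
    """Return True if width:height is exactly 16:9."""
    return width * 9 == height * 16


def height_for_width(width: int) -> int:
    """Return the height that gives an exact 16:9 frame for this width."""
    if width % 16 != 0:
        raise ValueError("width must be a multiple of 16 for an exact 16:9 frame")
    return width * 9 // 16


print(is_16_by_9(1920, 1080))   # True
print(is_16_by_9(1024, 1024))   # False (square image)
print(height_for_width(1280))   # 720
```

If a generated background fails this check, crop or regenerate it rather than stretching it, since stretching distorts the scene.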
3.1 Shot Types
Shot types describe how much of the agent is visible on screen. They are one of the simplest ways to change the feeling of a livestream without changing the character itself.
- Full Shot: The full body of the agent is visible.
- Medium Shot: The agent is visible from the waist or chest up.
- Close-Up: The framing focuses mainly on the agent’s face.
3.2 Camera Angles
Camera angles describe where the camera is positioned in relation to the agent. Small changes in angle can noticeably change how the scene feels.
- Eye-Level: The camera is positioned straight in front of the agent.
- Slight Low Angle: The camera is positioned slightly below the agent, looking up.
- Slight High Angle: The camera is positioned slightly above the agent, looking down.
3.3 Background Generation
All backgrounds should be created in a 16:9 aspect ratio.
Image Generation
Static image backgrounds can be generated using ChatGPT. You can download and add the references above to support the instructions below.
Prompt Instruction
Aspect Ratio: 16:9
Camera Shot: [Full Shot / Medium Shot / Close-Up]
Camera Angle: [Eye-Level / Slight Low Angle / Slight High Angle]
Content: Describe the environment, setting, and mood of the background.
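If you generate several backgrounds, it helps to fill in this template the same way every time. The sketch below is one possible way to do that. The function name `build_background_prompt` is hypothetical, and the allowed values are simply the shot types and camera angles listed in sections 3.1 and 3.2.

```python
# Allowed values, taken from sections 3.1 (Shot Types) and 3.2 (Camera Angles).
SHOT_TYPES = ("Full Shot", "Medium Shot", "Close-Up")
CAMERA_ANGLES = ("Eye-Level", "Slight Low Angle", "Slight High Angle")


def build_background_prompt(shot: str, angle: str, content: str) -> str:
    """Fill in the background prompt template, rejecting unknown options."""
    if shot not in SHOT_TYPES:
        raise ValueError(f"Unknown shot type: {shot!r}")
    if angle not in CAMERA_ANGLES:
        raise ValueError(f"Unknown camera angle: {angle!r}")
    return "\n".join([
        "Aspect Ratio: 16:9",
        f"Camera Shot: {shot}",
        f"Camera Angle: {angle}",
        f"Content: {content}",
    ])


if __name__ == "__main__":
    print(build_background_prompt(
        "Medium Shot",
        "Eye-Level",
        "A cozy streaming room at night with warm lamp light and a calm mood.",
    ))
```

The printed text can be pasted directly into the chat as the Prompt Instruction.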
Video Backgrounds
Looped video backgrounds can be generated using tools such as Sora.
3D Background Assets
3D background assets can be generated using Meshy.






