Tutorial

How to Build an Interactive 360° Character Viewer

A complete, step-by-step guide to creating a drag-to-rotate character portrait using AI video generation and a lightweight web viewer. No AI agent required: every step is manual and reproducible.

6 steps · ~30 min active work · Intermediate level

🎬 Prefer a video? Watch the full video tutorial on YouTube: How to Build a 360° Character Viewer Using AI Video Generation

✍️ Note from the Creator

Thanks for checking out this tutorial. The workflow you're about to learn was developed while building the interactive character gallery for my novel, An Enduring Spark, and its companion site, The Roar of Winchester.

The characters in this guide aren't just tech demos. They're people from a world I've spent years building: a series that will span multiple novels and interwoven timelines. As the world and the novels grow, the website will grow with them, and I'd love for you to visit, subscribe, and be a part of the journey. If you enjoy the creative side of what you see here, you'll find plenty more examples of the worldbuilding and character work I'm exploring. Thank you, and enjoy the tutorial.

You can browse the full character gallery to see more examples of what this workflow produces. Each character in the gallery has their own interactive 360° viewer built using the exact process described in this guide.

— Nick Dowbiggin, Author

Prerequisites

Before you begin, make sure you have the following tools and accounts ready.

Software

  - Python 3.9 or newer, with the google-genai SDK (installed in Step 3)
  - ffmpeg, for frame extraction
  - A modern web browser

Accounts & Access

  - A Google Cloud project with the Vertex AI API enabled and billing active
  - The gcloud CLI, authenticated to that project

ℹ️
This tutorial uses Google Veo 3 via the Vertex AI SDK for video generation. You could substitute another provider (Kling, Luma, Runway, etc.) as long as it produces a smooth 360° orbital rotation. The prompt may need adaptation for other providers.

What You'll Build

The finished product is an interactive web page where visitors can click and drag (or swipe on mobile) to rotate a character in a full 360° circle. It works by rapidly cycling through pre-extracted video frames, creating the illusion of a 3D turntable.

The source video: a smooth 360° orbital rotation generated by Veo 3

The viewer runs entirely client-side with zero dependencies: plain HTML, CSS, and JavaScript. No libraries, no frameworks, no build tools. It works on desktop and mobile.

1 Prepare Your Reference Image

The process begins with a single high-quality reference image of your character. This image anchors the video generation, ensuring the output maintains your character's appearance throughout the full rotation.

Dariah Spence reference image
The input reference image for Dariah Spence, created in a painterly style

Image Requirements

Property   | Requirement
Subject    | Full-body view, standing, facing camera
Pose       | Neutral standing position, arms at sides or relaxed
Background | Simple, dark, non-distracting (studio-style is ideal)
Resolution | At least 1024x1024; higher is better
Style      | Should match your target output (photorealistic, painterly, etc.)
Format     | PNG or high-quality JPG
💡
Key insight: The reference image does not need to show the character from multiple angles. The video model infers 3D structure from a single front-facing view. However, accuracy improves significantly when clothing details, hair texture, and body proportions are clearly visible.
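As a quick pre-flight check on the resolution requirement, a PNG's dimensions can be read straight from its header with the standard library. This is an illustrative sketch, not part of the original workflow; it only handles PNG (use an image library such as Pillow if your reference is a JPG), and the path in the comment is an example.

```python
import struct

def png_size(path):
    """Read width and height from a PNG file's IHDR header (stdlib only)."""
    with open(path, "rb") as f:
        header = f.read(24)
    if header[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError(f"{path} is not a PNG file")
    # Bytes 16-24 of a PNG: 4-byte big-endian width, then 4-byte height
    width, height = struct.unpack(">II", header[16:24])
    return width, height

# Example usage (path is illustrative):
# w, h = png_size("references/dariah.png")
# assert min(w, h) >= 1024, "reference is below the 1024px guideline"
```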

2 Write the Generation Prompt

The prompt is the most critical variable in the entire process. It must accomplish two things simultaneously: describe the character with extreme precision (so the model doesn't drift from the reference) and instruct the camera movement with technical specificity (so the orbit is smooth and consistent).

The Dariah Prompt (Actual)

This is the exact prompt used to generate the Dariah 360° video shown above:

Generation Prompt A high-fidelity, production-grade 360-degree orbital camera rotation shot for a professional 3D character viewer. NO SPECIAL EFFECTS. Subject: A full-body view of this same exact woman, same exact face: a strikingly beautiful woman, age 25, with fair skin and sharp angular features. Short textured black hair just below her ears, wavy, with wispy ends. Deep-set dark brown almond-shaped eyes with slightly hooded lids, high cheekbones, strong defined jawline, full lips, a slightly pointed chin. Slender but athletic build, toned. She is wearing a dark navy fitted V-neck sweater over a crisp white collared shirt with the collar visible above the sweater neckline and white shirt cuffs visible below the sweater sleeves, tailored plaid grey trousers, and polished brown leather shoes. A brown leather belt at the waist. Her arms rest loosely at her sides with a confident upright posture and a subtle composed expression. The subject is in a frozen neutral standing position, completely static with zero facial or body movement. Environment: Standing at the exact center of a minimalist, dark studio stage with a subtle, non-reflective matte-grey concrete floor. Lighting: Professional three-point studio lighting; Key light, Fill light, and Backlight remain stationary relative to the subject to ensure consistent shadows. Camera Movement: The camera performs a mathematically precise, perfectly horizontal 360-degree circular orbit at a constant eye-level height and a fixed 3-meter radius. Zero vertical tilt, zero zoom fluctuation. Temporal Consistency: 100% frame-to-frame coherence. No morphing, no flickering, no background shifting. Maintain heavy cinematic oil painting painterly style with rich warm palette, dramatic chiaroscuro lighting, and visible confident brushstrokes throughout. NO SPECIAL EFFECTS, no particles, no glowing, no aura, no magical elements.

Prompt Anatomy

Each section of the prompt serves a specific purpose:

Section                | Purpose                                                                      | Example
Opening declaration    | Sets the technical context: a controlled studio shot, not a cinematic scene. | "A high-fidelity, production-grade 360-degree orbital camera rotation shot..."
Subject anchor         | Forces the model to keep the same face throughout. Critical for consistency. | "this same exact woman, same exact face"
Physical description   | Hyper-specific details reduce hallucination. Every feature, every garment.   | Hair color/length/texture, eye shape, skin tone, build, each clothing item
Pose lock              | Prevents the model from animating the character.                             | "frozen neutral standing position, completely static with zero facial or body movement"
Environment            | Minimal environment = model focuses all capacity on the character.           | "minimalist, dark studio stage with a subtle, non-reflective matte-grey concrete floor"
Lighting spec          | Stationary lights prevent shadow flickering during the orbit.                | "Professional three-point studio lighting... remain stationary relative to the subject"
Camera spec            | Mathematical precision language produces the smoothest orbit.                | "mathematically precise, perfectly horizontal 360-degree circular orbit at a constant eye-level height and a fixed 3-meter radius"
Temporal lock          | Explicitly demands frame-to-frame consistency.                               | "100% frame-to-frame coherence. No morphing, no flickering, no background shifting."
Style directive        | Controls visual aesthetic across all frames.                                 | "heavy cinematic oil painting painterly style with rich warm palette"
Negative reinforcement | Bans common unwanted model behaviors.                                        | "NO SPECIAL EFFECTS, no particles, no glowing, no aura"
⚠️
Common mistake: Vague clothing descriptions. "Casual outfit" lets the model invent different clothes per frame. Describe each garment individually: the top, what's under it, the pants, the belt, the shoes. Specificity equals consistency.

3 Generate the 360° Video

With your reference image and prompt ready, generate the video using Google Veo 3 via the Vertex AI Python SDK.

Setup

Terminal
# Install the SDK
pip install google-genai

# Authenticate (one-time setup)
gcloud auth application-default login

Generation Script

Save this as generate-360.py:

generate-360.py
#!/usr/bin/env python3
"""Generate a 360 character rotation video with Google Veo 3."""
from google import genai
from google.genai import types
import os, sys, time

# Configuration
PROJECT = "your-gcp-project-id"
LOCATION = "us-central1"
MODEL = "veo-3.0-generate-001"

# Negative prompt: prevent the CHARACTER from moving
# (the camera orbits; the subject stays frozen)
NEGATIVE_PROMPT = (
    "subject rotation, turntable, body movement, "
    "subject turning, walking, shifting, swaying, "
    "zooming, character movement"
)

def main():
    if len(sys.argv) != 3:
        print("usage: generate-360.py <character> <prompt-file>")
        sys.exit(1)
    character = sys.argv[1]       # e.g. "dariah"
    prompt_file = sys.argv[2]     # e.g. "dariah-prompt.txt"
    ref_image = f"references/{character}.png"

    with open(prompt_file) as f:
        prompt = f.read().strip()

    with open(ref_image, "rb") as f:
        img_bytes = f.read()

    client = genai.Client(
        vertexai=True,
        project=PROJECT,
        location=LOCATION
    )

    print(f"Submitting generation for {character}...")
    operation = client.models.generate_videos(
        model=MODEL,
        prompt=prompt,
        image=types.Image(
            image_bytes=img_bytes,
            mime_type="image/png"
        ),
        config=types.GenerateVideosConfig(
            aspect_ratio="16:9",
            number_of_videos=2,
            duration_seconds=8,
            negative_prompt=NEGATIVE_PROMPT,
            person_generation="allow_all",
        ),
    )

    # Poll until complete (typically 3-8 minutes)
    print("Generating... (takes 3-8 minutes)")
    for attempt in range(120):
        time.sleep(10)
        operation = client.operations.get(operation)
        if operation.done:
            print("Done!")
            break
        if attempt % 6 == 0:
            print(f"  ...{attempt * 10}s elapsed")

    if not operation.done:
        print("ERROR: Timed out"); sys.exit(1)

    # Save videos
    os.makedirs("videos", exist_ok=True)
    for i, sample in enumerate(
        operation.result.generated_videos
    ):
        path = f"videos/{character}-v{i+1}.mp4"
        v = sample.video

        if v.video_bytes:
            with open(path, "wb") as f:
                f.write(v.video_bytes)
        elif v.uri:
            import urllib.request, subprocess
            token = subprocess.check_output(
                ["gcloud", "auth",
                 "application-default",
                 "print-access-token"], text=True
            ).strip()
            req = urllib.request.Request(
                v.uri,
                headers={
                    "Authorization": f"Bearer {token}"
                }
            )
            with urllib.request.urlopen(req) as resp:
                with open(path, "wb") as f:
                    f.write(resp.read())

        kb = os.path.getsize(path) // 1024
        print(f"  Saved {path} ({kb} KB)")

if __name__ == "__main__":
    main()

Run It

Terminal
python3 generate-360.py dariah dariah-prompt.txt

This produces two candidate videos. Watch both and pick the one with the smoothest rotation and best consistency.

Configuration Reference

Parameter         | Value       | Why
aspect_ratio      | "16:9"      | Matches standard widescreen viewers
number_of_videos  | 2           | More candidates = better odds of a clean orbit
duration_seconds  | 8           | Maximum for Veo 3. More frames = smoother rotation.
negative_prompt   | (see above) | Prevents the character from moving; only the camera orbits
person_generation | "allow_all" | Required for human subjects

4 Extract Frames from the Video

The interactive viewer works by displaying individual frames. Extract every frame from the selected video as a numbered JPEG sequence using ffmpeg.

The Command

Terminal
# Create the output directory
mkdir -p dariah-frames

# Extract all frames as high-quality JPEGs
ffmpeg -i videos/dariah-v1.mp4 \
       -qscale:v 2 \
       dariah-frames/frame_%04d.jpg

Flags Explained

Flag                    | Meaning
-i videos/dariah-v1.mp4 | Input video file
-qscale:v 2             | JPEG quality (2 = very high, 31 = lowest); use 2
frame_%04d.jpg          | Output pattern: frame_0001.jpg, frame_0002.jpg, etc.

For the Dariah video (8 seconds at 24fps), this produces 192 individual frames.
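Before wiring up the viewer, it's worth confirming the frame count and checking the numbering for gaps, since the viewer's TOTAL_FRAMES constant must match exactly. A small sketch (the directory name in the comment is an example):

```python
import os
import re

def count_frames(directory, pattern=r"frame_(\d{4})\.jpg$"):
    """Return (frame count, list of missing frame numbers) for a sequence."""
    nums = sorted(
        int(m.group(1))
        for name in os.listdir(directory)
        if (m := re.search(pattern, name))
    )
    # Any number absent from the expected 1..N run indicates a gap
    missing = sorted(set(range(1, len(nums) + 1)) - set(nums))
    return len(nums), missing

# total, gaps = count_frames("dariah-frames")
# print(total, gaps)   # expect (192, []) for an 8s video at 24fps
```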

Extracted Frame Samples

Frame 1 (frame_0001) · Frame 48 (frame_0048) · Frame 96 (frame_0096) · Frame 144 (frame_0144)
Four evenly spaced frames: front, quarter, back, three-quarter
💡
Quality check: Scrub through the frames before proceeding. Look for sudden face changes, clothing morphing, or background shifts. If issues exist, try the other video candidate or regenerate.
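To help with that scrub, one crude but dependency-free heuristic: JPEG file size tracks visual complexity, so a sharp size jump between consecutive frames can point at a morph or flicker worth eyeballing. This is an assumption-laden sketch (the 1.3 ratio is a starting guess), not a substitute for viewing the frames yourself.

```python
import os

def size_outliers(directory, ratio=1.3):
    """Flag frames whose file size jumps sharply vs. the previous frame."""
    files = sorted(f for f in os.listdir(directory) if f.endswith(".jpg"))
    flagged = []
    prev = None
    for name in files:
        size = os.path.getsize(os.path.join(directory, name))
        # Compare each frame's size to its predecessor, in either direction
        if prev and max(size, prev) / min(size, prev) > ratio:
            flagged.append(name)
        prev = size
    return flagged

# print(size_outliers("dariah-frames"))  # frames worth inspecting by eye
```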

5 Build the Interactive Web Viewer

The viewer is a single HTML file with inline CSS and JavaScript. Zero external dependencies. It preloads all frames into memory, then maps mouse drag / touch swipe / scroll wheel input to frame index changes.

Complete Viewer Code

Create index.html in the same directory as your frames:

index.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport"
      content="width=device-width, initial-scale=1.0">
<title>Character Name &mdash; 360 Viewer</title>
<style>
  * { margin:0; padding:0; box-sizing:border-box }
  body {
    background: #0a0a0f;
    color: #e0e0e0;
    font-family: -apple-system, BlinkMacSystemFont,
                 'Segoe UI', sans-serif;
    overflow: hidden;
    height: 100vh;
    display: flex;
    flex-direction: column;
    align-items: center;
    justify-content: center;
    user-select: none;
    cursor: grab;
  }
  body.dragging { cursor: grabbing; }

  #viewer {
    position: relative;
    width: 90vw;
    max-width: 1280px;
    aspect-ratio: 16/9;
    background: #111;
    border-radius: 12px;
    overflow: hidden;
    box-shadow:
      0 0 60px rgba(0,0,0,0.8),
      0 0 120px rgba(40,40,80,0.3);
  }

  #frame {
    width: 100%;
    height: 100%;
    object-fit: contain;
    pointer-events: none;
  }

  #instructions {
    position: absolute;
    bottom: 20px; left: 20px;
    font-size: 12px;
    color: rgba(255,255,255,0.3);
    line-height: 1.6;
  }

  .loading {
    position: absolute;
    top: 50%; left: 50%;
    transform: translate(-50%, -50%);
    font-size: 14px;
    color: rgba(255,255,255,0.3);
  }
</style>
</head>
<body>

<div id="viewer">
  <img id="frame" alt="360 viewer">
  <div class="loading" id="loader">
    LOADING... 0%
  </div>
  <div id="instructions">
    Drag left/right to rotate<br>
    Scroll wheel to rotate<br>
    Arrow keys to step
  </div>
</div>

<script>
// ===== CONFIGURATION =====
const TOTAL_FRAMES = 192; // your frame count
const sensitivity = 0.15; // drag sensitivity

const frames = [];
let currentFrame = 0;
let loaded = 0;
let dragging = false;
let lastX = 0;

function preload() {
  for (let i = 1; i <= TOTAL_FRAMES; i++) {
    const img = new Image();
    img.onload = () => {
      loaded++;
      if (loaded === 1) showFrame(0);
      const loader = document.getElementById('loader');
      if (loaded === TOTAL_FRAMES) {
        loader.style.display = 'none';   // all frames ready
      } else {
        loader.textContent = 'LOADING... ' +
          Math.round(loaded/TOTAL_FRAMES*100) + '%';
      }
    };
    img.src = 'frame_' +
      String(i).padStart(4, '0') + '.jpg';
    frames.push(img);
  }
}

function showFrame(idx) {
  idx = ((idx % TOTAL_FRAMES) + TOTAL_FRAMES)
        % TOTAL_FRAMES;
  currentFrame = idx;
  document.getElementById('frame').src =
    frames[idx].src;
}

// Mouse drag
document.addEventListener('mousedown', e => {
  dragging = true;
  lastX = e.clientX;
  document.body.classList.add('dragging');
});
document.addEventListener('mousemove', e => {
  if (!dragging) return;
  const dx = e.clientX - lastX;
  const fd = Math.round(dx * sensitivity);
  if (fd !== 0) {
    showFrame(currentFrame + fd);
    lastX = e.clientX;
  }
});
document.addEventListener('mouseup', () => {
  dragging = false;
  document.body.classList.remove('dragging');
});

// Scroll wheel
document.addEventListener('wheel', e => {
  e.preventDefault();
  showFrame(currentFrame + (e.deltaY>0 ? 2 : -2));
}, { passive: false });

// Touch (mobile)
let touchX = 0;
document.addEventListener('touchstart', e => {
  touchX = e.touches[0].clientX;
});
document.addEventListener('touchmove', e => {
  e.preventDefault();
  const dx = e.touches[0].clientX - touchX;
  const fd = Math.round(dx * sensitivity);
  if (fd !== 0) {
    showFrame(currentFrame + fd);
    touchX = e.touches[0].clientX;
  }
}, { passive: false });

// Keyboard
document.addEventListener('keydown', e => {
  if (e.key==='ArrowLeft')
    showFrame(currentFrame - 1);
  if (e.key==='ArrowRight')
    showFrame(currentFrame + 1);
});

preload();
</script>
</body>
</html>

How It Works

  1. Preload: All 192 frames load into Image objects on page load. A progress indicator shows status. This ensures instant frame switching once loaded.
  2. Frame display: showFrame() wraps the index with modulo arithmetic (frame 193 becomes frame 1) and swaps the <img> source.
  3. Mouse drag: Tracks horizontal movement. Each pixel translates to sensitivity frames. At 0.15, dragging 100px moves ~15 frames.
  4. Touch support: Same logic as mouse drag, using touchstart / touchmove events for mobile.
  5. Scroll wheel: Each tick jumps 2 frames forward or backward.
  6. Keyboard: Arrow keys step one frame at a time for precise control.
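The numbers in steps 2 and 3 above follow from two constants. With 192 frames covering a full circle, each frame represents 360/192 = 1.875° of rotation, so the default sensitivity maps a 100px drag to about 28° of turn:

```python
TOTAL_FRAMES = 192        # one full 360-degree rotation
SENSITIVITY = 0.15        # frames advanced per pixel of horizontal drag

deg_per_frame = 360 / TOTAL_FRAMES

def degrees_for_drag(pixels):
    """Rotation produced by a horizontal drag of the given length."""
    return round(pixels * SENSITIVITY) * deg_per_frame

print(deg_per_frame)           # 1.875 degrees per frame
print(degrees_for_drag(100))   # 28.125 degrees (15 frames)
```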

Customization

Variable     | Default | Effect
TOTAL_FRAMES | 192     | Must match your extracted frame count
sensitivity  | 0.15    | Higher = faster. Try 0.1 for slower, 0.25 for faster.
Scroll delta | 2       | Frames per scroll tick

6 Deploy

The viewer is entirely static files. Your deployment folder:

Directory Structure
dariah/
  index.html
  frame_0001.jpg
  frame_0002.jpg
  frame_0003.jpg
  ... (192 frames total)
  frame_0192.jpg

Hosting Options

Option          | Setup                                        | Best For
nginx           | Copy to served directory, add a server block | Self-hosted, custom domain
GitHub Pages    | Push to a repo, enable Pages                 | Free, public projects
Netlify         | Drag and drop the folder                     | Quick free deployment
Any static host | Upload via FTP/SFTP                          | Existing hosting
💡
Performance tip: The 192 frames total ~14 MB. Set long cache headers so returning visitors load instantly. Frames never change once deployed, so aggressive caching is safe.

Appendix: Reusable Prompt Template

Use this for any character. Replace [BRACKETS] with your details. Keep all technical language exactly as written.

Template A high-fidelity, production-grade 360-degree orbital camera rotation shot for a professional 3D character viewer. NO SPECIAL EFFECTS. Subject: A full-body view of this same exact [man/woman], same exact face: [AGE DESCRIPTION with skin tone]. [HAIR: color, length, style, texture]. [EYES: color and description]. [FACIAL FEATURES: lips, nose, eyebrows, jawline, face shape, distinguishing marks]. [BUILD: body type with specifics]. [CLOTHING: each item individually: top with color/fit/fabric, bottom with color/fit, shoes with color/style]. [POSE: arm positions, posture, expression]. The subject is in a frozen neutral standing position, completely static with zero facial or body movement. Environment: Standing at the exact center of a minimalist, dark studio stage with a subtle, non-reflective matte-grey concrete floor. Lighting: Professional three-point studio lighting; Key light, Fill light, and Backlight remain stationary relative to the subject to ensure consistent shadows. Camera Movement: The camera performs a mathematically precise, perfectly horizontal 360-degree circular orbit at a constant eye-level height and a fixed 3-meter radius. Zero vertical tilt, zero zoom fluctuation. Temporal Consistency: 100% frame-to-frame coherence. No morphing, no flickering, no background shifting. Maintain heavy cinematic oil painting painterly style with rich warm palette, dramatic chiaroscuro lighting, and visible confident brushstrokes throughout. NO SPECIAL EFFECTS, no particles, no glowing, no aura, no magical elements.
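If you're generating several characters, the bracketed fields can be spliced in programmatically so the fixed technical language never drifts between runs. A sketch with an abridged template and illustrative field values taken from the Dariah prompt (the real template above has more fields):

```python
# Abridged template: only the Subject fields vary; everything else is fixed.
TEMPLATE = (
    "A high-fidelity, production-grade 360-degree orbital camera rotation "
    "shot for a professional 3D character viewer. NO SPECIAL EFFECTS. "
    "Subject: A full-body view of this same exact {sex}, same exact face: "
    "{face}. {hair}. {clothing}. "
    "The subject is in a frozen neutral standing position, completely "
    "static with zero facial or body movement."
)

character = {  # illustrative values, not a complete character sheet
    "sex": "woman",
    "face": "age 25, fair skin, sharp angular features",
    "hair": "Short textured black hair, wavy, with wispy ends",
    "clothing": "Dark navy fitted V-neck sweater over a white collared shirt",
}

prompt = TEMPLATE.format(**character)
print(prompt[:80] + "...")
```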
ℹ️
Always include a negative prompt in the API config: subject rotation, turntable, body movement, subject turning, walking, shifting, swaying, zooming, character movement

Troubleshooting

The character's face changes mid-rotation

The most common issue. Fix: make the physical description as specific as possible. Name individual features: jawline shape, nose bridge width, eyebrow arch, lip thickness. More anchor points = better consistency.

The character moves or shifts position

Ensure the prompt includes "frozen neutral standing position, completely static with zero facial or body movement." Also verify the negative prompt includes "subject rotation, turntable, body movement."

Background flickers between frames

Use the "minimalist, dark studio stage" environment. Complex backgrounds give the model more opportunities to hallucinate. The matte-grey floor is deliberately boring.

Clothing changes between front and back

Describe every garment individually. Sweater over a shirt? Describe both. Mention cuffs, collars, belts, accessories. The model can only maintain what it was told about.

Rotation doesn't complete 360°

Generate at maximum duration (8s for Veo 3). Shorter durations often produce partial rotations. If still incomplete, regenerate: there is natural variance between runs.

Viewer loads slowly

The viewer preloads all frames before becoming fully interactive. To speed this up, re-extract with -qscale:v 4 (smaller files, slightly lower quality) or keep every other frame: ffmpeg -i video.mp4 -vf "select=not(mod(n\,2))" -vsync vfr -qscale:v 2 frames/frame_%04d.jpg. If you thin the frames, update TOTAL_FRAMES in index.html to match the new count.