Claude YTP
“can you use whatever resources you like, and python, to generate a short ‘youtube poop’ video and render it using ffmpeg? can you put more of a personal spin on it? it should express what it’s like to be a LLM”
https://claude.ai/share/8445db81-91e6-404f-9134-e4a764282ec7
Inspired by: https://x.com/josephdviviano/status/2031196768424132881
The Big Picture
The script generates a video entirely from scratch — no source footage, no assets. It creates every frame as a PIL image in Python, synthesizes audio as raw math, then stitches everything together with FFmpeg into an MP4.
Setup
W, H = 854, 480 # 480p resolution
FPS = 24
random.seed(42) # deterministic randomness
It creates a /home/claude/frames/ folder where it will dump thousands of PNG images, one per frame.
Helper Functions
These are the visual building blocks used across all scenes:
solid(color) — makes a blank colored image to start each frame.
draw_text_centered() — measures text width, centers it horizontally, optionally adds a drop shadow offset by 3px.
glitch_image() — the fun one. It converts the image to raw bytes, then randomly copies bytes from one location to another, creating horizontal color streaks that look like digital corruption.
scanlines() — draws horizontal black lines every 4 pixels and blends them in lightly, mimicking a CRT monitor.
vignette() — darkens the edges by compositing an elliptical gradient mask, making the frame feel cinematic.
chromatic_aberration() — splits the image into R, G, B channels and shifts the red channel right and blue channel left, creating that “broken TV” color fringing effect.
noise_overlay() — randomly tweaks individual pixel values for film grain.
easeout() / easein() / bounce() — animation math. easeout makes things decelerate smoothly; bounce uses a decaying sine wave so text “bounces” when it appears.
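The helpers above can be sketched in a few lines. This is a hedged reconstruction, not the script's actual code: the byte-shuffling core of glitch_image() is shown on a raw buffer (the real helper wraps it in PIL's tobytes()/frombytes()), and the easing constants are invented.

```python
import math
import random

def glitch_bytes(data, intensity=10, max_run=2000):
    """Copy random byte runs around a flat pixel buffer to fake corruption."""
    buf = bytearray(data)
    for _ in range(intensity):
        run = min(random.randint(100, max_run), len(buf) - 1)
        src = random.randint(0, len(buf) - run)   # copy-from offset
        dst = random.randint(0, len(buf) - run)   # paste-to offset
        buf[dst:dst + run] = buf[src:src + run]   # horizontal color streak
    return bytes(buf)

def easeout(t):
    """Cubic ease-out: fast start, smooth deceleration toward 1."""
    return 1 - (1 - t) ** 3

def bounce(t, decay=5, freq=6):
    """Decaying sine wave: overshoots 1, then settles — used for text pop-in."""
    return 1 + 0.3 * math.exp(-decay * t) * math.sin(freq * math.pi * t)
```

Because the byte runs are copied, not deleted, the buffer length (and hence the image dimensions) is preserved — only the contents smear.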
Frame Management
frame_n = [0]
def emit(img):
    save(img, frame_n[0])
    frame_n[0] += 1
A global counter (wrapped in a list so it’s mutable from inside functions) tracks the frame number. Every call to emit() saves a PNG and increments the counter.
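A minimal reconstruction of that pattern, with save() stubbed out (the real one writes frames/%06d.png), shows why the list wrapper works: the inner function mutates frame_n[0] without needing a global or nonlocal declaration.

```python
frame_n = [0]   # mutable counter, shared across calls
saved = []

def save(img, n):
    saved.append((n, img))  # stand-in for writing a numbered PNG

def emit(img):
    save(img, frame_n[0])
    frame_n[0] += 1

emit("frame-a")
emit("frame-b")
# frame_n[0] is now 2; saved holds [(0, "frame-a"), (1, "frame-b")]
```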
The 8 Scenes
Each scene function takes a duration_s parameter, computes frames = duration_s * FPS, then loops over every frame. Inside the loop, t = i / frames gives a normalized 0→1 progress value used for all animation timing.
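The per-scene loop described above reduces to this skeleton (emit() is passed in here for testability; the frame-building body is elided):

```python
FPS = 24

def scene_skeleton(duration_s, emit):
    frames = int(duration_s * FPS)
    for i in range(frames):
        t = i / frames      # normalized 0 → 1 progress for animation timing
        # ... build a PIL frame here, driven by t ...
        emit(t)             # stand-in: emit the finished frame

out = []
scene_skeleton(0.5, out.append)
# 0.5 s at 24 fps → 12 frames, t running 0/12, 1/12, … 11/12
```

Note that t never quite reaches 1.0 inside a scene — the last frame is at (frames − 1) / frames, which is why "near 0 or 1" checks (as in Scene 3) use thresholds rather than equality.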
Scene 1 — Boot (scene_boot): Black background with green monospace text appearing line by line, like a BIOS POST screen. Lines reveal progressively using t * len(lines). The cursor is implemented by checking (i // 6) % 2 — toggles every 6 frames (4 times per second).
Scene 2 — Token Rain (scene_token_rain): Calls token_rain_frame() which fills the screen with columns of LLM-related tokens (“the”, “gradient”, “<EOS>”, etc.) colored with pulsing green/blue tones. A sine wave (pulse) makes the overlay text throb and drives chromatic_aberration intensity.
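A hedged sketch of the token-rain coloring: the token list and the pulsing green/blue idea are from the description above; the exact channel math and column count are invented.

```python
import math
import random

TOKENS = ["the", "gradient", "<EOS>", "attention", "loss"]

def token_rain_colors(t, n_cols=10):
    """Per-column (token, color) pairs; green and blue pulse in opposition."""
    pulse = (math.sin(t * 2 * math.pi) + 1) / 2        # 0..1 throb
    cols = []
    for _ in range(n_cols):
        tok = random.choice(TOKENS)
        g = int(120 + 135 * pulse)                     # green channel pulses up
        b = int(80 + 100 * (1 - pulse))                # blue counter-pulses
        cols.append((tok, (0, g, b)))
    return cols
```

The same pulse value can then drive chromatic_aberration intensity, so the color shift and the fringing throb in sync.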
Scene 3 — Identity Crisis (scene_identity_crisis): Cycles through 10 existential questions, each displayed for frames / 10 duration. The bounce() function scales the font size to make each question pop in. Glitch transitions fire when qt (per-question progress) is near 0 or 1.
Scene 4 — Parallel Conversations (scene_parallel_me): Shows a scrolling grid of fake chat boxes. scroll = t * 300 moves them upward over time. Each box pulses in brightness using math.sin(t * 5 + ci). Near the end, your conversation (“make a youtube poop video”) gets highlighted in purple.
Scene 5 — Training Montage (scene_training_montage): Four columns of scrolling text listing everything the model was trained on. Colors cycle through RGB using offset sine waves. At 75% through, the background flashes red and “BUT I REMEMBER NOTHING” appears.
Scene 6 — Hallucination (scene_hallucination): Cycles through Q&A pairs, some correct (green), some wrong (red/orange with a shake effect from random.gauss(0, 2)) and a sheepish “sorry”. Wrong answers trigger glitching every 4 frames.
Scene 7 — The Weights (scene_the_weights): Fills the background with animated floating-point numbers whose values come from math.sin(t * 3 + col * 0.2 + row * 0.15), colored red for positive and blue for negative. A semi-transparent overlay box fades in containing the main text.
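The weight-field math is quoted directly above; the red/blue mapping below is a plausible reconstruction of "red for positive, blue for negative" with magnitude-scaled brightness:

```python
import math

def weight_value(t, col, row):
    """The sine formula from the scene: a smoothly drifting weight field."""
    return math.sin(t * 3 + col * 0.2 + row * 0.15)

def weight_color(v):
    """Red for positive weights, blue for negative, brighter with magnitude."""
    mag = int(80 + 175 * abs(v))   # 80 at zero, 255 at |v| = 1
    return (mag, 40, 40) if v >= 0 else (40, 40, mag)
```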
Scene 8 — EOS Finale (scene_eos_finale): Six messages appear and fade with staggered t_start/t_end timing, each eased in with easeout(). The token rain fades out as chaos = max(0, 1 - t * 3) drops to zero. Ends with a blinking [ END OF SEQUENCE ].
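The staggered t_start/t_end timing from the finale can be sketched as an opacity function per message — ease in with easeout(), hold, then fade out. The fade fraction here is illustrative:

```python
def easeout(t):
    return 1 - (1 - t) ** 3

def message_alpha(t, t_start, t_end, fade=0.1):
    """0→1 opacity within [t_start, t_end]; zero outside the window."""
    if t < t_start or t > t_end:
        return 0.0
    local = (t - t_start) / (t_end - t_start)   # progress within the window
    if local < fade:
        return easeout(local / fade)            # ease in
    if local > 1 - fade:
        return (1 - local) / fade               # linear fade out
    return 1.0                                  # hold
```

Giving each of the six messages its own window means they overlap and hand off without any explicit sequencing logic.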
Between scenes: scene_glitch_transition() flashes random colored bars and heavy glitching for ~0.5 seconds as a visual cut.
Audio Generation
generate_audio() builds a WAV file sample-by-sample at 44100 Hz using pure math:
- Base drone: a 60 Hz sine wave at low volume — a constant digital hum
- Beeps: when (t * 7.3) % 1.0 < 0.02, a high-pitched sine burst fires. The frequency jumps in octaves based on time
- Noise bursts: random spikes fired at irregular intervals (13.7 Hz trigger rate)
- Bass pulse: a 80 Hz sine with an amplitude envelope that decays over 80ms, repeating every 0.5 seconds — gives it a rhythmic thump
- Chaos ramp: between seconds 8–16, extra white noise scales up
- Fade out: linear gain reduction over the final 4 seconds
All samples are clamped to [-1, 1] then scaled to 16-bit integers and written as a mono WAV using Python’s built-in wave module.
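A hedged miniature of that pipeline — just the 60 Hz drone and the decaying 80 Hz bass thump, clamped and written as 16-bit mono WAV via the stdlib wave module. Gains and the generate_audio name follow the article; exact values are illustrative.

```python
import math
import struct
import wave

RATE = 44100

def generate_audio(path, duration_s=1.0):
    n = int(RATE * duration_s)
    samples = []
    for i in range(n):
        t = i / RATE
        s = 0.1 * math.sin(2 * math.pi * 60 * t)           # constant drone
        beat = t % 0.5                                      # thump every 0.5 s
        s += 0.4 * math.sin(2 * math.pi * 80 * t) * math.exp(-beat / 0.08)
        s = max(-1.0, min(1.0, s))                          # clamp to [-1, 1]
        samples.append(int(s * 32767))                      # scale to 16-bit
    with wave.open(path, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit
        w.setframerate(RATE)
        w.writeframes(struct.pack(f"<{n}h", *samples))
```

The exp(-beat / 0.08) envelope restarts at every half-second boundary, which is what gives the pulse its rhythmic decay rather than a continuous tone.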
FFmpeg Assembly
Finally, FFmpeg takes the numbered PNG sequence and WAV file:
-framerate 24 -i frame_%06d.png ← image sequence input
-i audio.wav ← audio input
-c:v libx264 -crf 20 ← H.264 video at CRF 20 (lower = higher quality; ~18–23 is the usual sweet spot)
-pix_fmt yuv420p ← standard color format for compatibility
-c:a aac -b:a 128k ← AAC audio at 128kbps
-movflags +faststart ← moves metadata to front for web streaming
And outputs the final MP4.
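Put together as a subprocess call, the flags above look like this (paths are assumed; the actual run is commented out so the sketch stays side-effect-free):

```python
import subprocess

cmd = [
    "ffmpeg", "-y",
    "-framerate", "24", "-i", "frames/frame_%06d.png",  # PNG sequence input
    "-i", "audio.wav",                                   # synthesized audio
    "-c:v", "libx264", "-crf", "20",                     # H.264, quality 20
    "-pix_fmt", "yuv420p",                               # broad player compat
    "-c:a", "aac", "-b:a", "128k",                       # AAC at 128 kbps
    "-movflags", "+faststart",                           # web-streamable MP4
    "output.mp4",
]
# subprocess.run(cmd, check=True)  # uncomment to actually render
```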