2026-07-02 · architecture
Your video never leaves your browser. Here is the whole trick.
"Private" is a claim every converter makes and almost none can prove: the moment your lecture uploads to someone's server, you are trusting a retention policy. Video2Any makes the claim structurally instead. There is no upload endpoint to trust, because the conversion never leaves your machine.
What actually happens when you drop a file
- The file becomes an object URL and feeds a hidden <video> element. Your browser's own decoder does the heavy lifting, hardware accelerated.
- We seek through it, draw frames to a canvas, and compare pixel buffers to find slide changes (the detection pipeline has its own post).
- Kept frames are encoded to JPEG by the canvas, in memory.
- The .pptx is a zip file assembled in the tab and handed to you as a download. PowerPoint files are just XML plus images, so no server is needed to write one.
- Subtitles run the same way: audio is decoded and resampled by the Web Audio API, and Whisper runs in the tab. The model weights download once from a public CDN and cache in your browser.
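To make the "compare pixel buffers" step concrete, here is a minimal sketch of one way such a comparison could work. It is not the actual detection pipeline (that has its own post). The function names, the per-channel tolerance of 24, and the 10% threshold are all assumptions chosen for illustration. The buffers are plain RGBA arrays of the kind a canvas returns from getImageData().data.

```typescript
// Illustrative sketch only: names and numeric defaults are assumptions,
// not the converter's real detection code.

// Fraction of pixels whose red, green, or blue value differs by more than
// `tolerance` between two RGBA buffers of identical size. Alpha is ignored.
function changedFraction(
  a: Uint8ClampedArray,
  b: Uint8ClampedArray,
  tolerance: number = 24, // assumed value; absorbs compression noise
): number {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("buffers must be RGBA and the same size");
  }
  const pixels = a.length / 4;
  if (pixels === 0) return 0;

  let changed = 0;
  for (let i = 0; i < a.length; i += 4) {
    const dr = Math.abs(a[i] - b[i]);
    const dg = Math.abs(a[i + 1] - b[i + 1]);
    const db = Math.abs(a[i + 2] - b[i + 2]);
    if (dr > tolerance || dg > tolerance || db > tolerance) changed++;
  }
  return changed / pixels;
}

// Returns the indices of frames to keep: the first frame, plus every frame
// that differs from the most recently kept frame by more than `threshold`.
function pickSlideFrames(
  frames: Uint8ClampedArray[],
  threshold: number = 0.1, // assumed value: 10% of pixels changed
): number[] {
  const kept: number[] = [];
  let lastKept: Uint8ClampedArray | undefined = undefined;
  for (let i = 0; i < frames.length; i++) {
    const frame = frames[i];
    if (lastKept === undefined || changedFraction(lastKept, frame) > threshold) {
      kept.push(i);
      lastKept = frame;
    }
  }
  return kept;
}
```

Comparing against the last kept frame, rather than the immediately preceding one, is a design choice in this sketch: it keeps slow fades from slipping under the threshold one small step at a time.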
Close the tab and everything is gone. There is no account because there is nothing to attach one to.
Why this makes the free tier boringly sustainable
An upload converter pays for bandwidth, storage, and CPU on every single job, so it has to meter you: ten minutes free, then a paywall. Our marginal cost per conversion is zero, so the web tier is free without a catch: unlimited conversions, two-hour videos, subtitles included. The paid product is the API, where the work genuinely runs on our servers and metering is honest.
The trade-offs, stated plainly
- Your hardware sets the pace. A long video on an old laptop takes longer than it would on a server farm.
- Codec support is your browser's codec support. A format your browser cannot play, we cannot read.
- Whisper in a tab is slower than Whisper on a GPU server, and the small model trades some accuracy for size.
For a confidential all-hands recording or an unpublished lecture, those are usually the right trade-offs to accept.
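The codec trade-off can be checked before any work starts, because browsers expose canPlayType on media elements, which answers "", "maybe", or "probably" for a given MIME type. The sketch below is an assumption about how such a check might be wrapped, not the converter's actual code. The PlayabilityProbe interface and both function names are hypothetical; the interface exists only so the example also runs outside a browser.

```typescript
// Illustrative sketch only: PlayabilityProbe, playability, and
// describeSupport are hypothetical names, not the converter's real code.

// Minimal structural type. A real HTMLVideoElement satisfies it, because
// its canPlayType method returns "", "maybe", or "probably".
interface PlayabilityProbe {
  canPlayType(mimeType: string): string;
}

type Playability = "no" | "maybe" | "probably";

// Normalizes the browser's answer; the empty string means "cannot play".
function playability(probe: PlayabilityProbe, mimeType: string): Playability {
  const answer = probe.canPlayType(mimeType);
  if (answer === "probably") return "probably";
  if (answer === "maybe") return "maybe";
  return "no";
}

// Turns the answer into a message a user could act on.
function describeSupport(probe: PlayabilityProbe, mimeType: string): string {
  switch (playability(probe, mimeType)) {
    case "probably":
      return `${mimeType}: this browser should be able to decode it.`;
    case "maybe":
      return `${mimeType}: this browser might decode it; try it and see.`;
    default:
      return `${mimeType}: this browser cannot play it, so it cannot be read.`;
  }
}
```

In a browser, the probe would be a real element, for example playability(document.createElement("video"), 'video/mp4; codecs="avc1.42E01E"'). A "maybe" is common when the MIME type omits the codecs parameter, so it is worth attempting the conversion rather than refusing outright.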
The converter is here. Watch the network tab while it runs: that is the whole pitch.