Live animation has gained immense popularity for enhancing online engagement, yet achieving high-quality,
real-time, and stable animation with diffusion models remains challenging, especially on consumer-grade
GPUs. Existing methods struggle to generate long, consistent video streams efficiently, as they are
limited by high latency and visual quality that degrades over extended periods. In this paper, we introduce
RAIN, a pipeline solution capable of animating infinite video streams in real-time with low latency using
a single RTX 4090 GPU. The core idea of RAIN is to efficiently compute frame-token attention across
different noise levels and long time-intervals while simultaneously denoising a significantly larger
number of frame-tokens than previous stream-based methods. This design allows RAIN to generate video
frames with much shorter latency and faster speed, while maintaining long-range attention over extended
video streams, resulting in enhanced continuity and consistency. Consequently, a Stable Diffusion model
fine-tuned with RAIN for just a few epochs can produce video streams of arbitrary length in real-time,
with low latency and little compromise in quality or consistency. Despite these capabilities, RAIN
introduces only a few additional 1D attention blocks, imposing minimal extra overhead. Experiments on
benchmark datasets and super-long video generation demonstrate that RAIN animates characters in
real-time with better quality, accuracy, and consistency than competing methods, at lower latency.
All code and models will be made publicly available.