The paper introduces KV Inversion, a method for text-conditioned action editing of real images. It produces results that follow the action semantics of the editing prompt while preserving the content of the original image. The method requires neither training the Stable Diffusion model itself nor time-consuming training on a large-scale dataset. Potential applications include comic book production, video editing, and advertising material production.

Publication date: 28 Sep 2023
Project Page: https://arxiv.org/abs/2309.16608
Paper: https://arxiv.org/pdf/2309.16608