Text or Image-to-Video
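The text- or image-to-video task on Hugging Face covers generating a video from a text description or from an image.
On the text-to-video side, a text description is converted into video content. This can mean generating individual scenes, animations, or a complete video from the provided text.
For example, given a story or a script, a model can produce a video that visually represents the narrative.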
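The text-to-video case described above can be sketched with the generic `DiffusionPipeline` loader from diffusers. This is a minimal sketch under assumptions, not a reference implementation: the checkpoint name `damo-vilab/text-to-video-ms-1.7b`, the helper name `text_to_video`, the step count, and the output file name are illustrative choices that do not come from this page. It also assumes a CUDA GPU and a diffusers version whose pipeline output exposes `.frames[0]`, and `export_to_video` needs a video backend such as OpenCV or imageio installed.

```python
def text_to_video(prompt, output_path="text_to_video.mp4", seed=0):
    """Generate a short clip from a text prompt and return the saved file path.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so the function can be defined without the GPU stack installed.
    import torch
    from diffusers import DiffusionPipeline
    from diffusers.utils import export_to_video

    # Assumption: an example text-to-video checkpoint, not one named in this page.
    pipe = DiffusionPipeline.from_pretrained(
        "damo-vilab/text-to-video-ms-1.7b",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.to("cuda")

    # Fix the seed so repeated calls with the same prompt give the same clip.
    generator = torch.manual_seed(seed)
    frames = pipe(prompt, num_inference_steps=25, generator=generator).frames[0]

    # Write the frames to a video file and return its path.
    return export_to_video(frames, output_path)


# Example call (downloads the model weights and requires a CUDA GPU):
# path = text_to_video("An astronaut riding a horse on the moon")
```

The image-to-video example that follows uses the same load, run, export pattern, with an input image passed in addition to the prompt.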
# Image-to-video example: animate a still image with I2VGen-XL
import torch
from diffusers import I2VGenXLPipeline
from diffusers.utils import export_to_gif, load_image

# Load the half-precision (fp16) weights to reduce GPU memory use
pipeline = I2VGenXLPipeline.from_pretrained(
    "ali-vilab/i2vgen-xl",
    torch_dtype=torch.float16,
    variant="fp16",
)
pipeline.to("cuda")

# Conditioning image and text prompts
image_url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/i2vgen_xl_images/img_0009.png"
image = load_image(image_url).convert("RGB")

prompt = "Papers were floating in the air on a table in the library"
negative_prompt = "Distorted, discontinuous, Ugly, blurry, low resolution, motionless, static, disfigured, disconnected limbs, Ugly faces, incomplete arms"

# Fix the random seed so the result is reproducible
generator = torch.manual_seed(8888)

# Run the pipeline and take the frames of the first (and only) generated video
frames = pipeline(
    prompt=prompt,
    image=image,
    num_inference_steps=50,
    negative_prompt=negative_prompt,
    guidance_scale=9.0,
    generator=generator,
).frames[0]
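The `guidance_scale=9.0` argument controls how strongly the prompt steers generation. Diffusion pipelines commonly apply it as classifier-free guidance: at each denoising step the model makes one prediction without the prompt and one with it, and the final prediction is pushed from the first toward the second by the scale factor. The toy sketch below illustrates that arithmetic on plain lists. It is not the library's code, and the function name `apply_guidance` and the sample numbers are made up for illustration.

```python
def apply_guidance(uncond, cond, guidance_scale):
    """Blend unconditional and prompt-conditioned predictions element-wise.

    Computes uncond + guidance_scale * (cond - uncond) for each element.
    """
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions standing in for the model's noise estimates (illustrative values).
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 0.0]

# A scale of 1.0 reproduces the conditioned prediction unchanged.
print(apply_guidance(uncond, cond, 1.0))  # [1.0, 1.0, 0.0]

# A larger scale exaggerates the difference the prompt makes.
print(apply_guidance(uncond, cond, 9.0))  # [9.0, 1.0, -16.0]
```

Higher values follow the prompt more closely, usually at some cost in variety or visual quality, so the value is typically tuned per model.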
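import os

# Write the frames to an animated GIF; the target directory must exist first
os.makedirs("dataset", exist_ok=True)
export_to_gif(frames, "dataset/i2v.gif")  # returns the output path, 'dataset/i2v.gif'

# Display the result inline in a notebook
from IPython.display import Image, display
display(Image(filename="dataset/i2v.gif"))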