diff --git a/README.md b/README.md
index 6b260359..97ff6d79 100644
--- a/README.md
+++ b/README.md
@@ -78,6 +78,11 @@ If you want to have a quick experiment, you can try it on [![Open In Colab](http
 The [prototype]() of how to build a video preprocessing pipeline for LLM training data in Bytedance, which processes billions of clips each day.
+The input video is split according to scene changes, subtitles in the video are detected and cropped out by the OCR module, and the video quality is assessed by the aesthetic module provided by BMF.
+After that, the finalized video clips are encoded as output.
+
+If you want to have a quick experiment, you can try it on [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/BabitMF/bmf/blob/master/bmf/demo/LLM_video_preprocessing/LLM_video_preprocessing.ipynb)
+
 #### Deoldify
 This demo shows how to integrate state-of-the-art AI algorithms into the BMF video processing pipeline. The famous open source colorization algorithm [DeOldify](https://github.com/jantic/DeOldify) is wrapped as a BMF Python module in less than 100 lines of code. The final effect is illustrated below, with the original video on the left side and the colored video on the right.
diff --git a/bmf/demo/LLM_video_preprocessing/LLM_video_preprocessing.ipynb b/bmf/demo/LLM_video_preprocessing/LLM_video_preprocessing.ipynb
new file mode 100644
index 00000000..2ad075ea
--- /dev/null
+++ b/bmf/demo/LLM_video_preprocessing/LLM_video_preprocessing.ipynb
@@ -0,0 +1 @@
+{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[{"file_id":"https://github.com/BabitMF/bmf/blob/master/bmf/demo/transcode/bmf_transcode_demo.ipynb","timestamp":1731034209070}],"private_outputs":true},"kernelspec":{"name":"python3","display_name":"Python 3"},"language_info":{"name":"python"}},"cells":[{"cell_type":"markdown","source":["## About this demo\n","This is a demo of video preprocessing for LLM video/image generation training data.\n","\n","Based on BMF, it is flexible to build and integrate algorithms into the whole preprocessing pipeline."],"metadata":{"id":"4-zV1WWh4KHR"}},{"cell_type":"markdown","source":["## 1. Environment preparation"],"metadata":{"id":"zfRMFe8T7m7V"}},{"cell_type":"markdown","source":["### 1.1 FFmpeg\n","FFmpeg 4.x or 5.x is needed by BMF for transcoding; check the available version via apt:"],"metadata":{"id":"91HYL6LrOpOS"}},{"cell_type":"code","source":["! apt show ffmpeg | grep \"^Package:\\|^Version:\""],"metadata":{"id":"JRc_qwCp7lKQ"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["If the version meets the requirement, install FFmpeg via apt:"],"metadata":{"id":"AtQwyUAs8Sjt"}},{"cell_type":"code","source":["! apt install -y ffmpeg"],"metadata":{"id":"P2HXgryJ8WM1"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["Otherwise, you need to compile FFmpeg from source; you can use the script we provide (Linux and macOS only for now):"],"metadata":{"id":"UdubJ4ld-s1E"}},{"cell_type":"code","source":["! git clone https://github.com/BabitMF/bmf bmf\n","! ./bmf/scripts/build_ffmpeg.sh nasm yasm x264 x265"],"metadata":{"id":"Fd_ig_h-DRqo"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["### 1.2 BMF\n","BMF can be installed in many ways; we use pip here:"],"metadata":{"id":"ozFWzDUlEZ_2"}},
{"cell_type":"code","source":["! pip3 install BabitMF"],"metadata":{"id":"nifZsafGEtAU"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["### 1.3 pip requirements"],"metadata":{"id":"lvwmhasRsf7g"}},{"cell_type":"code","source":["! pip3 install easyocr pydantic onnxruntime"],"metadata":{"id":"i042rpVksoc4"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["### 1.4 wurlitzer (optional)\n","This package is installed to show the BMF C++ logs in the Colab console; otherwise only Python logs are printed. This step is not necessary if you are not in a Colab or IPython notebook environment."],"metadata":{"id":"DZIEEZrsorvS"}},{"cell_type":"code","source":["!pip install wurlitzer\n","%load_ext wurlitzer"],"metadata":{"id":"SGUtzAwyo0A3"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["## 2. Preprocessing demo"],"metadata":{"id":"K6d5Zx1gE40p"}},{"cell_type":"markdown","source":["First, download the video file we will be using:"],"metadata":{"id":"cWV4P15GpnxI"}},{"cell_type":"code","source":["!gdown --fuzzy https://drive.google.com/file/d/1l8bDSrWn6643aDhyaocVStXdoUbVC3o2/view?usp=sharing -O big_bunny_10s_30fps.mp4"],"metadata":{"id":"nvxgOt8upwVO"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["! ffprobe big_bunny_10s_30fps.mp4"],"metadata":{"id":"4nnPCG_DuYA4"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["### 2.1 Python code and resource download"],"metadata":{"id":"XGT6gZcmFfbU"}},{"cell_type":"code","source":["! git clone https://github.com/BabitMF/bmf.git"],"metadata":{"id":"fGDKnFpOwb46"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["%cd /content/bmf/bmf/demo/LLM_video_preprocessing\n","! pwd\n"],"metadata":{"id":"pRVNAUAcutYd"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["! wget https://github.com/BabitMF/bmf/releases/download/files/models.tar.gz && tar zxvf models.tar.gz -C ../../"],"metadata":{"id":"TpapRIB07wRt"},"execution_count":null,"outputs":[]},
{"cell_type":"code","source":["! ls -lrht ../../models"],"metadata":{"id":"KRDfkrXJ8U5l"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["### 2.2 Run demo"],"metadata":{"id":"WP2AzPPkR3uI"}},{"cell_type":"code","source":["import logging\n","from media_info import MediaInfo\n","from split import get_split_info\n","import torch\n","from clip_process import ClipProcess\n","import argparse\n","import os\n","\n","logger = logging.getLogger('main')\n","logger.setLevel(logging.INFO)\n","\n","# scene-change detection threshold passed to get_split_info\n","scene_thres = 0.3\n","\n","def get_timeline_list(pts_list, last):\n","    current = 0\n","    timelines = []\n","    for pts in pts_list:\n","        pts = pts/1000000  # convert PTS from microseconds to seconds\n","        if pts > current:\n","            timelines.append([current,pts])\n","            current = pts\n","    # append the last segment up to the end of the video\n","    if last > current:\n","        timelines.append([current,last])\n","    return timelines\n","\n","def video_processing_demo(input_file, mode, config):\n","    media_info = MediaInfo(\"ffprobe\", input_file)\n","    duration = media_info.video_duration()\n","    logger.info(f\"duration:{duration}\")\n","\n","    pts_list = get_split_info(input_file, scene_thres)\n","    timelines = get_timeline_list(pts_list, duration)\n","    logger.info(f\"timelines:{timelines}\")\n","    cp = ClipProcess(input_file, timelines, mode, config)\n","    cp.process()\n","\n","def demo_run(input_file):\n","    model_path = \"../../models/aes_transonnx_update3.onnx\"\n","\n","    torch.set_num_threads(4)\n","    mode = \"ocr_crop,aesmod_module\"\n","    config = {\"output_configs\":[{\"res\":\"orig\", \"type\":\"jpg\"}, {\"res\":\"480\", \"type\":\"mp4\"}]}\n","    video_processing_demo(input_file, mode, config)\n","\n"],"metadata":{"id":"AUcjjyR5WM25"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["!ls /content"],"metadata":{"id":"latYuFcA8oDS"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["demo_run(\"/content/big_bunny_10s_30fps.mp4\")"],"metadata":{"id":"xmW17RiI-Fid"},"execution_count":null,"outputs":[]},{"cell_type":"code","source":["! cd clip_output && ls -lrth && cat clip_0_ocr_crop_result.json && cat clip_0_aesmod_module_result.json"],"metadata":{"id":"wBct3CDz-LyO"},"execution_count":null,"outputs":[]},{"cell_type":"markdown","source":["### 2.3 Visualize output video"],"metadata":{"id":"V5w69RK3SFnZ"}},{"cell_type":"code","source":["from IPython.display import HTML\n","from base64 import b64encode\n","\n","def show_video(video_path, video_width=800):\n","\n","    video_file = open(video_path, \"rb\").read()\n","\n","    video_url = f\"data:video/mp4;base64,{b64encode(video_file).decode()}\"\n","    return HTML(f\"\"\"<video width={video_width} controls><source src=\"{video_url}\"></video>\"\"\")\n","\n","# show one of the output clips\n","show_video('clip_output/clip_1_480.mp4')"],"metadata":{"id":"3EVN2Q69SQn1"},"execution_count":null,"outputs":[]}]}
\ No newline at end of file
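Note on the clip-splitting logic in the notebook above: `get_split_info` returns scene-change timestamps as PTS values in microseconds, and `get_timeline_list` converts them into consecutive `[start, end]` ranges in seconds, closing the final range at the video duration. A minimal standalone sketch, using hypothetical PTS values rather than output from the demo video, illustrates the expected result:

```python
# Minimal sketch of the timeline splitting used in the demo notebook;
# mirrors get_timeline_list from the "Run demo" cell.

def get_timeline_list(pts_list, last):
    current = 0
    timelines = []
    for pts in pts_list:
        pts = pts / 1000000  # scene-change PTS is given in microseconds
        if pts > current:
            timelines.append([current, pts])
            current = pts
    if last > current:  # close the final clip at the video duration
        timelines.append([current, last])
    return timelines

# Hypothetical scene changes at 2.5 s and 7.1 s in a 10-second video:
print(get_timeline_list([2500000, 7100000], 10.0))
# -> [[0, 2.5], [2.5, 7.1], [7.1, 10.0]]
```

Each range starts exactly where the previous one ends, so the resulting clips tile the whole video without gaps or overlaps.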