initial publishing

Pseudo-Lab · Jun 30, 2024 · 09b837a
1 parent b92d694
commit 09b837a
Showing 97 changed files with 4,506 additions and 2,608 deletions.
diff --git a/.buildinfo b/.buildinfo
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: e20c211361d9744e9e2885f468391109
+config: a9cef571adfa06844ac20f23e54a8c14
 tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/_images/0-side-view-photo-of-17-year-old-girl-in-a-japanese-school.gif b/_images/0-side-view-photo-of-17-year-old-girl-in-a-japanese-school.gif
diff --git a/_images/0-side-view-photo-of-17-year-old-girl-in-a-japanese-school_edit.gif b/_images/0-side-view-photo-of-17-year-old-girl-in-a-japanese-school_edit.gif
diff --git a/_images/0-side-view-photo-of-man-in-black-padded-jumper,-hallway,.gif b/_images/0-side-view-photo-of-man-in-black-padded-jumper,-hallway,.gif
diff --git a/_images/0-side-view-photo-of-man-in-black-padded-jumper,-hallway,_edit.gif b/_images/0-side-view-photo-of-man-in-black-padded-jumper,-hallway,_edit.gif
diff --git a/_images/6-image-of-a-man-with-blonde-hair-and-blue-eyes,.gif b/_images/6-image-of-a-man-with-blonde-hair-and-blue-eyes,.gif
diff --git a/_images/6-image-of-a-man-with-blonde-hair-and-blue-eyes,_edit.gif b/_images/6-image-of-a-man-with-blonde-hair-and-blue-eyes,_edit.gif
diff --git a/_images/StyleGAN_fig1.png b/_images/StyleGAN_fig1.png
diff --git a/_images/StyleGAN_fig2.png b/_images/StyleGAN_fig2.png
diff --git a/_images/StyleGAN_fig3.png b/_images/StyleGAN_fig3.png
diff --git a/_images/StyleGAN_fig4.png b/_images/StyleGAN_fig4.png
diff --git a/_images/StyleGAN_fig5.png b/_images/StyleGAN_fig5.png
diff --git a/_images/StyleGAN_fig6.png b/_images/StyleGAN_fig6.png
diff --git a/_images/StyleGAN_fig7.png b/_images/StyleGAN_fig7.png
diff --git a/_images/StyleGAN_fig8.png b/_images/StyleGAN_fig8.png
diff --git a/_images/clip.png b/_images/clip.png
diff --git a/_images/eDiff-I.png b/_images/eDiff-I.png
diff --git a/_images/exp-1.png b/_images/exp-1.png
diff --git a/_images/exp-2.png b/_images/exp-2.png
diff --git a/_images/exp-3.png b/_images/exp-3.png
diff --git a/_images/exp-4.png b/_images/exp-4.png
diff --git a/_images/exp-5.png b/_images/exp-5.png
diff --git a/_images/imagen.png b/_images/imagen.png
diff --git a/_images/img14.png b/_images/img14.png
diff --git a/_images/img15.png b/_images/img15.png
diff --git a/_images/img151.png b/_images/img151.png
diff --git a/_images/img152.png b/_images/img152.png
diff --git a/_images/inference_pipeline.png b/_images/inference_pipeline.png
diff --git a/_images/lora.png b/_images/lora.png
diff --git a/_images/training_pipeline.png b/_images/training_pipeline.png
diff --git a/_sources/docs/review/3DGS.md b/_sources/docs/review/3DGS.md
@@ -25,115 +25,101 @@ NeRF-based methods enable high-quality novel scene synthesis
 
 
 ## Overview
-:::{figure-md} 
-    <img src="../../pics/3DGS/image0.png" alt="main process" class="bg-primary mb-1" width="800px">
+<br>
+  :::{figure-md} 
+      <img src="../../pics/3DGS/image0.png" alt="main process" class="bg-primary mb-1" width="800px">
+      Main process of 3D Gaussian Splatting
+  :::
 
-     Main process of 3D Gaussian Splatting  (source: {https://arxiv.org/abs/2308.04079})
-:::
+<br>
+  :::{figure-md} 
+      <img src="../../pics/3DGS/image1.png" alt="main process" class="bg-primary mb-1" width="800px">
+      Pseudo Algorithm of 3D Gaussian Splatting 
+  :::
 
-:::{figure-md} 
-    <img src="../../pics/3DGS/image1.png" alt="main process" class="bg-primary mb-1" width="800px">
-
-     Pseudo Algorithm of 3D Gaussian Splatting (source: {https://arxiv.org/abs/2308.04079})
-:::
-
-
-
 
 
-## Differentiable 3D Gaussian Splatting
-
-The paper starts from a sparse set of Structure-from-Motion (SfM) points without normals and aims to optimize a scene representation that enables high-quality novel view synthesis.
-
-Fast rendering requires an unstructured, explicit primitive, so the authors choose 3D Gaussians, which are differentiable and project easily to 2D splats.
-
-:::{figure-md} 
-    <img src="../../pics/3DGS/image2.png" alt="Eq. 4" class="bg-primary mb-1" width="800px">
-
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
 
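-A 3D Gaussian is centered at a point (mean) μ and defined by a 3D covariance matrix Σ. For rendering, the 3D Gaussian must be projected to 2D; given the viewing transformation W, the result is expressed as a covariance matrix Σ' in camera coordinates. Σ must remain positive semi-definite to be a valid covariance, and the paper notes that this constraint makes optimizing Σ directly difficult.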
+<br>
+  ## Differentiable 3D Gaussian Splatting
 
-:::{figure-md} 
-    <img src="../../pics/3DGS/image3.png" alt="Eq. 5" class="bg-primary mb-1" width="800px">
+  The paper starts from a sparse set of Structure-from-Motion (SfM) points without normals and aims to optimize a scene representation that enables high-quality novel view synthesis.
 
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
+  Fast rendering requires an unstructured, explicit primitive, so the authors choose 3D Gaussians, which are differentiable and project easily to 2D splats.
 
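-The paper therefore chooses a representation that is more intuitive and better suited to optimization. The covariance matrix Σ of a 3D Gaussian is analogous to describing the configuration of an ellipsoid, so it is built from a scaling matrix S and a rotation matrix R.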
+<br>
+  :::{figure-md} 
+      <img src="../../pics/3DGS/image2.png" alt="Eq. 4" class="bg-primary mb-1" width="800px">
+  :::
 
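-Scaling is stored as a 3D vector s and rotation as a quaternion q, each of which can be converted to its matrix. To avoid automatic differentiation (autograd) overhead during training, the gradients for all parameters are derived explicitly.
+  A 3D Gaussian is centered at a point (mean) μ and defined by a 3D covariance matrix Σ. For rendering, the 3D Gaussian must be projected to 2D; given the viewing transformation W, the result is expressed as a covariance matrix Σ' in camera coordinates. Σ must remain positive semi-definite to be a valid covariance, and the paper notes that this constraint makes optimizing Σ directly difficult.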
+  <br>
+    :::{figure-md} 
+        <img src="../../pics/3DGS/image3.png" alt="Eq. 5" class="bg-primary mb-1" width="800px">
+    :::
 
-:::{figure-md} 
-    <img src="../../pics/3DGS/image4.png" alt="Eq. 6" class="bg-primary mb-1" width="800px">
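+  The paper therefore chooses a representation that is more intuitive and better suited to optimization. The covariance matrix Σ of a 3D Gaussian is analogous to describing the configuration of an ellipsoid, so it is built from a scaling matrix S and a rotation matrix R.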
 
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
+  Scaling is stored as a 3D vector s and rotation as a quaternion q, each of which can be converted to its matrix. To avoid automatic differentiation (autograd) overhead during training, the gradients for all parameters are derived explicitly.
+  <br>
+    :::{figure-md} 
+        <img src="../../pics/3DGS/image4.png" alt="Eq. 6" class="bg-primary mb-1" width="800px">
+    :::
 
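-This optimization-friendly anisotropic covariance representation lets the 3D Gaussians be optimized to adapt to the varied geometry in a scene.
+  This optimization-friendly anisotropic covariance representation lets the 3D Gaussians be optimized to adapt to the varied geometry in a scene.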
 
 
 
-
-## Optimization with Adaptive Density Control of 3D Gaussians
-
-- **Optimization**
-- **Adaptive Control of Gaussians**
-
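-The core of the method is the optimization step, which creates a dense set of 3D Gaussians that accurately represents the scene for free-view synthesis. This covers the position 𝑝, opacity 𝛼, and covariance Σ, as well as the SH coefficients representing each Gaussian's color c, which are needed to capture the scene's view-dependent appearance accurately.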
-
-
-## Optimization
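-- Because of the ambiguities of 3D-to-2D projection, the optimization must be able to create, remove, or move geometry
-  - The quality of the covariance parameters is critical for a compact representation, because large homogeneous areas can be captured with a small number of large anisotropic Gaussians
-
-- The paper uses SGD, with some operations implemented as CUDA kernels. Fast rasterization in particular is critical to optimization efficiency
-  - because it is the main computational bottleneck of the optimization
-
-- A sigmoid function is used for the opacity 𝛼 and an exponential activation function for the scale of the covariance
-  - The initial covariance matrix is estimated as an isotropic Gaussian whose axes equal the mean distance to the three nearest points
-  - Exponential decay scheduling is used for the position only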
+<br>
+  ## Optimization with Adaptive Density Control of 3D Gaussians
 
-:::{figure-md} 
-    <img src="../../pics/3DGS/image5.png" alt="Loss(Eq. 6)" class="bg-primary mb-1" width="800px">
+  - **Optimization**
+  - **Adaptive Control of Gaussians**
+
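+  The core of the method is the optimization step, which creates a dense set of 3D Gaussians that accurately represents the scene for free-view synthesis. This covers the position 𝑝, opacity 𝛼, and covariance Σ, as well as the SH coefficients representing each Gaussian's color c, which are needed to capture the scene's view-dependent appearance accurately.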
 
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
-The loss function combines an L1 loss with a D-SSIM term. D-SSIM reflects image distortion well and is differentiable, so it can serve as a loss as well as an evaluation metric.
+<br>
+  ## Optimization
+  - Because of the ambiguities of 3D-to-2D projection, the optimization must be able to create, remove, or move geometry
+    - The quality of the covariance parameters is critical for a compact representation, because large homogeneous areas can be captured with a small number of large anisotropic Gaussians
 
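-For reference, SSIM is a metric that measures the similarity between two images in terms of luminance, contrast, and structure.
+  - The paper uses SGD, with some operations implemented as CUDA kernels. Fast rasterization in particular is critical to optimization efficiency
+    - because it is the main computational bottleneck of the optimization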
 
-:::{figure-md} 
-    <img src="../../pics/3DGS/image6.png" alt="Loss(Eq. 6)" class="bg-primary mb-1" width="800px">
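+  - A sigmoid function is used for the opacity 𝛼 and an exponential activation function for the scale of the covariance
+    - The initial covariance matrix is estimated as an isotropic Gaussian whose axes equal the mean distance to the three nearest points
+    - Exponential decay scheduling is used for the position only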
 
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
+  The loss function combines an L1 loss with a D-SSIM term. D-SSIM reflects image distortion well and is differentiable, so it can serve as a loss as well as an evaluation metric.
 
-:::{figure-md} 
-    <img src="../../pics/3DGS/image7.png" alt="Loss(Eq. 6)" class="bg-primary mb-1" width="800px">
+  For reference, SSIM is a metric that measures the similarity between two images in terms of luminance, contrast, and structure.
 
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
+  <br>
+    :::{figure-md} 
+        <img src="../../pics/3DGS/image6.png" alt="Loss(Eq. 6)" class="bg-primary mb-1" width="800px">
+    :::
 
+  <br>
+    :::{figure-md} 
+        <img src="../../pics/3DGS/image7.png" alt="Loss(Eq. 6)" class="bg-primary mb-1" width="800px">
+    :::
 
-## Adaptive Control of Gaussians
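-Starting from the initial sparse points obtained by Structure-from-Motion (SfM), the paper also proposes progressively optimizing the number of Gaussians and their density per unit volume to represent the scene better.
-- Every 100 iterations, Gaussians are added and any Gaussian whose opacity 𝛼 falls below a threshold is removed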
+<br>
+  ## Adaptive Control of Gaussians
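+  Starting from the initial sparse points obtained by Structure-from-Motion (SfM), the paper also proposes progressively optimizing the number of Gaussians and their density per unit volume to represent the scene better.
+  - Every 100 iterations, Gaussians are added and any Gaussian whose opacity 𝛼 falls below a threshold is removed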
 
-Adaptive control of Gaussians fills empty space, focusing on regions with missing geometric features and on regions where a single Gaussian covers too large an area.
+  Adaptive control of Gaussians fills empty space, focusing on regions with missing geometric features and on regions where a single Gaussian covers too large an area.
 
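-Both kinds of region have large view-space positional gradients, as the optimization tries to move the Gaussians to correct them.
-- Under-reconstruction: clone the Gaussian at the same size and move the copy in the direction of the positional gradient.
-- Over-reconstruction: split the Gaussian into two, initializing their positions by sampling the original Gaussian as a PDF.
+  Both kinds of region have large view-space positional gradients, as the optimization tries to move the Gaussians to correct them.
+  - Under-reconstruction: clone the Gaussian at the same size and move the copy in the direction of the positional gradient.
+  - Over-reconstruction: split the Gaussian into two, initializing their positions by sampling the original Gaussian as a PDF.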
 
-To prevent an unjustified increase in Gaussian density near the input cameras during optimization, the opacity α is reset to a value close to zero every 3000 iterations.
-
-:::{figure-md} 
-    <img src="../../pics/3DGS/image8.png" alt="figure of Adaptive Control of Gaussians" class="bg-primary mb-1" width="800px">
-
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
+  To prevent an unjustified increase in Gaussian density near the input cameras during optimization, the opacity α is reset to a value close to zero every 3000 iterations.
+  <br>
+    :::{figure-md} 
+        <img src="../../pics/3DGS/image8.png" alt="figure of Adaptive Control of Gaussians" class="bg-primary mb-1" width="800px">
+    :::
 
 
 ## Fast Differentiable Rasterization for Gaussians
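Review note on the front-to-back accumulation that this section of 3DGS.md describes: the sketch below is a minimal single-pixel illustration in plain Python, not the paper's CUDA rasterizer. The function name `composite_front_to_back`, the `(depth, rgb, alpha)` tuple layout, and the `min_transmittance` cutoff are assumptions made for illustration only.

```python
def composite_front_to_back(splats, min_transmittance=1e-4):
    """Blend the splats covering one pixel, nearest first.

    splats: iterable of (depth, (r, g, b), alpha) tuples (hypothetical layout).
    Returns the blended color and the accumulated opacity.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed by nearer splats

    # Sort by depth so the nearest splat is blended first.
    for _, rgb, alpha in sorted(splats, key=lambda s: s[0]):
        weight = alpha * transmittance
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
        # Assumed early exit: stop once the pixel is effectively opaque.
        if transmittance < min_transmittance:
            break

    return tuple(color), 1.0 - transmittance


# A half-transparent red splat in front of an opaque blue one.
pixel, opacity = composite_front_to_back([
    (2.0, (0.0, 0.0, 1.0), 1.0),
    (1.0, (1.0, 0.0, 0.0), 0.5),
])
```

Sorting here happens per pixel for clarity; the text describes sorting once per tile with a GPU radix sort, which amortizes that cost across all pixels in the tile.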
@@ -152,42 +138,36 @@ tile-based rasterization proceeds as follows.
 - These are sorted with a GPU radix sort
 - Within each tile, color and α values are accumulated front to back to obtain the pixel values
 
-
-## Results and Evaluation
-
-- Results and Evaluation
-:::{figure-md} 
-    <img src="../../pics/3DGS/image12.png" alt="figure of Results" class="bg-primary mb-1" width="800px">
-
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
-
-Results vary by dataset, but the method matches or exceeds state-of-the-art quality while achieving competitive training time and FPS.
-
-
-:::{figure-md} 
-    <img src="../../pics/3DGS/image13.png" alt="figure of Results" class="bg-primary mb-1" width="800px">
-
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
-
-- Ablations
-
-:::{figure-md} 
-    <img src="../../pics/3DGS/image19.png" alt="figure of Ablations" class="bg-primary mb-1" width="800px">
-
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
-
-:::{figure-md} 
-    <img src="../../pics/3DGS/image18.png" alt="figure of Ablations" class="bg-primary mb-1" width="800px">
-
-     (source: {https://arxiv.org/abs/2308.04079})
-:::
-
-## Limitations
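-- As with previous methods, artifacts appear in regions of the scene that were not well observed
-- As with previous methods, elongated or splotchy artifacts can be produced
-- Popping artifacts occasionally occur when the optimization creates large Gaussians
-- No regularization is applied during optimization
-- Memory consumption is considerably higher than that of NeRF-based methods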
+<br>   
+  ## Results and Evaluation
+  <br>
+    - Results and Evaluation
+      Results vary by dataset, but the method matches or exceeds state-of-the-art quality while achieving competitive training time and FPS.
+    <br>
+      :::{figure-md} 
+          <img src="../../pics/3DGS/image12.png" alt="figure of Results" class="bg-primary mb-1" width="800px">
+      :::
+
+<br>
+  <br>
+    :::{figure-md} 
+        <img src="../../pics/3DGS/image13.png" alt="figure of Results" class="bg-primary mb-1" width="800px">
+    :::
+  <br>
+  - Ablations
+    <br>
+      :::{figure-md}
+          <img src="../../pics/3DGS/image19.png" alt="figure of Ablations" class="bg-primary mb-1" width="800px">
+      :::
+    <br>
+      :::{figure-md}
+          <img src="../../pics/3DGS/image18.png" alt="figure of Ablations" class="bg-primary mb-1" width="800px">
+      :::
+
+<br>
+  ## Limitations
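+  - As with previous methods, artifacts appear in regions of the scene that were not well observed
+  - As with previous methods, elongated or splotchy artifacts can be produced
+  - Popping artifacts occasionally occur when the optimization creates large Gaussians
+  - No regularization is applied during optimization
+  - Memory consumption is considerably higher than that of NeRF-based methods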
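Review note on the covariance parameterization covered in the Differentiable 3D Gaussian Splatting section of this file: the paper builds the covariance as Σ = R S Sᵀ Rᵀ from a scale vector s and a rotation quaternion q, which keeps Σ positive semi-definite by construction. The sketch below is a minimal plain-Python illustration, assuming a `(w, x, y, z)` quaternion ordering; the function names are hypothetical and not taken from the paper's code.

```python
def quaternion_to_rotation(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    norm = (w * w + x * x + y * y + z * z) ** 0.5
    w, x, y, z = w / norm, x / norm, y / norm, z / norm  # normalize first
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def covariance_from_scale_rotation(s, q):
    """Build Sigma = R S S^T R^T, with S = diag(s)."""
    R = quaternion_to_rotation(q)
    # M = R S: scale column j of R by s[j].
    M = [[R[i][j] * s[j] for j in range(3)] for i in range(3)]
    # Sigma = M M^T, which is symmetric positive semi-definite.
    return [
        [sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


# With the identity rotation, Sigma is diagonal with entries s_i ** 2.
sigma_axis_aligned = covariance_from_scale_rotation((1.0, 2.0, 3.0), (1.0, 0.0, 0.0, 0.0))
sigma_rotated = covariance_from_scale_rotation((1.0, 2.0, 3.0), (0.9, 0.1, 0.3, 0.2))
```

Because Σ is formed as M Mᵀ, gradient steps on s and q can never produce an invalid covariance, which is the motivation the reviewed text gives for not optimizing Σ directly.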