Understanding AutoregressiveTransform #16
-
Hi everyone, I'm trying to understand how the `AutoregressiveTransform` works. As far as I understand, the auto-regression's purpose is to create something I would call a "triangular dependency": each output element depends only on the current and the preceding input elements.
Could someone help me understand how the code causes the autoregressive transformation? I'm not that familiar with Python, but I cannot find the auto-regression there. Or am I looking at the wrong place?
Replies: 3 comments 1 reply
-
Hello @tipf, this is a very important question! I'll start by explaining what an autoregressive transformation is formally and then how it is implemented in Zuko. Tell me if this is clear.

Formalization

Let $x$ be a vector in $\mathbb{R}^n$. An autoregressive transformation is a mapping $y = f(x) \in \mathbb{R}^n$ such that the $i$-th element of $y$ is a bijective univariate transformation of the $i$-th element of $x$, conditioned on the preceding elements. That is, $y_i = f_i(x_i \mid x_{1:i-1})$ and $x_i = f_i^{-1}(y_i \mid x_{1:i-1})$, where $x_{1:i} = (x_1, x_2, \dots, x_i)$. It is important to note that $f_i$ is only bijective with respect to $x_i$, hence the vertical bar between $x_i$ and the conditioning variables.

We can decompose the forward pass into $n$ univariate transformations $y_i = f_i(x_i \mid x_{1:i-1})$ that can all be computed in parallel, since each one only depends on elements of $x$. For the inverse pass, however, $x_i = f_i^{-1}(y_i \mid x_{1:i-1})$ depends on the previously recovered elements $x_{1:i-1}$, so the elements must be recovered sequentially: first $x_1 = f_1^{-1}(y_1)$, then $x_2 = f_2^{-1}(y_2 \mid x_1)$, and so on.

Generalization

Unfortunately, in some cases, it is not possible (or not desirable) to condition and invert the transformations $f_i$ one element at a time. Instead, all the univariate transformations are parametrized jointly from $x$ (e.g. by a masked network), under the constraint that the parameters of $f_i$ only depend on $x_{1:i-1}$. This is not much of a problem for the forward pass, as we have access to all elements of $x$ at once. For the inverse pass, starting from an arbitrary $x$ and iterating $x \gets f^{-1}(y \mid x)$ makes at least one more element of $x$ exact at each iteration, for all initializations, which leads to the exact inverse after $n$ iterations.

Zuko's implementation

In Zuko, an `AutoregressiveTransform` is built from a meta function `meta` that takes $x$ as input and returns the elementwise transformation $f(\cdot \mid x)$, together with the number of `passes` required to invert it. The forward pass is a single call `meta(x)(x)`, while the inverse pass iterates `x = meta(x).inv(y)` for `passes` iterations. This design allows for a wide range of autoregressive architectures (e.g. fully autoregressive and coupling) and does not rely on the variables' ordering, which makes it very modular. All the conditioning shenanigans (e.g. masked networks and parametrizations) are hidden in the meta function.

Example

Let's take a simple example with $n = 3$ to illustrate the principles. We have $y = f(x)$ with $y_1 = f_1(x_1)$, $y_2 = f_2(x_2 \mid x_1)$ and $y_3 = f_3(x_3 \mid x_{1:2})$, and its inverse $x_1 = f_1^{-1}(y_1)$, $x_2 = f_2^{-1}(y_2 \mid x_1)$ and $x_3 = f_3^{-1}(y_3 \mid x_{1:2})$. Let $x^{(0)}$ be an arbitrary initialization and $x^{(k+1)} = f^{-1}(y \mid x^{(k)})$. After one pass, $x^{(1)}_1 = f_1^{-1}(y_1)$ is correct, since $f_1$ has no conditioning variables. After a second pass, $x^{(2)}_2 = f_2^{-1}(y_2 \mid x^{(1)}_1)$ is also correct, since it is conditioned on a correct $x^{(1)}_1$. After a third pass, all elements are correct.
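To make the fixed-point inverse concrete, here is a minimal self-contained sketch (plain NumPy, not Zuko's API) of a toy additive autoregressive transform. The names `t`, `forward` and `inverse` are illustrative assumptions, not Zuko functions; the point is only that iterating the inverse pass fixes one more element per pass.

```python
import numpy as np

def t(x):
    # shift whose i-th entry depends only on x_1..x_{i-1}
    # (cumulative sum, shifted right by one)
    return np.concatenate(([0.0], np.cumsum(x)[:-1]))

def forward(x):
    # y_i = x_i + t_i(x_{1:i-1}), computed in one parallel pass
    return x + t(x)

def inverse(y, passes):
    # fixed-point iteration: pass k makes x_k exact
    x = np.zeros_like(y)
    for _ in range(passes):
        x = y - t(x)
    return x

x = np.array([1.0, 2.0, 3.0])
y = forward(x)                        # n = 3 example from above
x_rec = inverse(y, passes=len(x))     # recovers x after 3 passes
```

With `x = [1, 2, 3]`, one pass makes the first element exact, two passes the first two, and three passes recover `x` entirely, mirroring the example above.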
-
Hi @francois-rozet, I will try to digest everything by next week...
-
I think I have got the basic principle, at least to a certain degree... Again, thanks a lot @francois-rozet for your great explanation, you made it pretty clear. 👍

I was able to implement an inverse function that transforms a sample while keeping some of its elements constant. I added this method to `AutoregressiveTransform`:

```python
def inverse_given_partial(self, y: Tensor, x_part: Tensor) -> Tensor:
    # find length of the partial vector
    x_part_len = x_part.shape[-1]
    # copy the partial into x
    x = torch.zeros_like(y)
    x[..., :x_part_len] = x_part
    # do inverse passes for the non-partial dimensions
    for _ in range(self.passes - x_part_len):
        x = self.meta(x).inv(y)
        # overwrite the partial dims, because the partial is known
        # and we only want to update the rest
        x[..., :x_part_len] = x_part
    return x
```

It allows me to implement a conditional sampler. In this example, `x_part` holds the first elements of $x$. Since this is a rather specific use case and the API does not generalize to other transforms, I doubt that it makes sense to add it to Zuko. But maybe the code will help someone...
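To illustrate the same clamp-and-iterate idea outside of Zuko, here is a self-contained sketch on a toy additive transform (plain NumPy; `t`, `forward` and `inverse_given_partial` are illustrative names, not Zuko's API). The known elements are re-clamped after every pass so that later passes condition on the correct values.

```python
import numpy as np

def t(x):
    # shift whose i-th entry depends only on the preceding elements
    return np.concatenate(([0.0], np.cumsum(x)[:-1]))

def forward(x):
    return x + t(x)

def inverse_given_partial(y, x_part, passes):
    # keep the first len(x_part) elements of x fixed,
    # recover the remaining elements from y
    k = x_part.shape[-1]
    x = np.zeros_like(y)
    x[:k] = x_part
    for _ in range(passes - k):
        x = y - t(x)      # one inverse pass
        x[:k] = x_part    # re-clamp the known elements
    return x

y = forward(np.array([1.0, 2.0, 3.0]))          # y = f([1, 2, 3])
x_rec = inverse_given_partial(y, np.array([1.0]), passes=3)
```

Since the first element is clamped to its true value, element 2 is exact after the first pass and element 3 after the second, so `passes - k` iterations suffice.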