Create format for Shapelets #13

JoshVStaden · 2019-07-30T19:44:38Z

No description provided.

JoshVStaden · 2019-07-30T19:57:26Z

@sjperkins Hi Simon. Here is the pull request as requested. So far, Tigger LSM can support shapelets with a single coefficient describing them. At this stage, we just need to change the coefficients column to support a list of coefficients. I think the idea was to think of the shapelet coefficients as a 2 x 2 matrix, and simply write out the matrix row by row in each column of the file.

I imagine it would look something like this:
#format name ....... coeffs_0 coeffs_1 coeffs_2 coeffs_3 ...
J0 .... 1,2,3,4,5 5,4,3,2,1 6,7,8,9,10 10,9,8,7,6
And this would describe a shapelet coefficient matrix that looked like this:
| 1 2 3 4 5 |
| 5 4 3 2 1 |
| 6 7 8 9 10 |
| 10 9 8 7 6 |
Where it is l rows by m columns. Am I on the right track here?
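
A minimal sketch of how the multi-column layout proposed above could be parsed, assuming each coeffs_N field has already been split out of the line as a comma-separated string. The helper name parse_coeff_columns is hypothetical and is not part of Tigger:

```python
def parse_coeff_columns(fields):
    """Turn one comma-separated string per matrix row into a list of rows."""
    matrix = [[float(value) for value in field.split(",")] for field in fields]
    # Every row must have the same length for the result to be a matrix.
    if len({len(row) for row in matrix}) > 1:
        raise ValueError("coefficient rows have different lengths")
    return matrix


# The four coeffs_N fields from the example line for source J0.
rows = parse_coeff_columns(["1,2,3,4,5", "5,4,3,2,1", "6,7,8,9,10", "10,9,8,7,6"])
```

The drawback of this layout is that the number of columns varies with the size of the matrix.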

sjperkins · 2019-07-31T08:07:31Z

> @sjperkins Hi Simon. Here is the pull request as requested. So far, Tigger LSM can support shapelets with a single coefficient describing them. At this stage, we just need to change the coefficients column to support a list of coefficients. I think the idea was to think of the shapelet coefficients as a 2 x 2 matrix, and simply write out the matrix row by row in each column of the file.

When you say a 2 x 2 matrix, do you perhaps mean a 2D matrix?

> I imagine it would look something like this:
> #format name ....... coeffs_0 coeffs_1 coeffs_2 coeffs_3 ...
> J0 .... 1,2,3,4,5 5,4,3,2,1 6,7,8,9,10 10,9,8,7,6
> And this would describe a shapelet coefficient matrix that looked like this:
> | 1 2 3 4 5 |
> | 5 4 3 2 1 |
> | 6 7 8 9 10 |
> | 10 9 8 7 6 |
> Where it is l rows by m columns. Am I on the right track here?

That could work. There's also the option of having a single coefficient column, expressing the coefficients as a list and parsing it with ast.literal_eval:

>>> from __future__ import print_function
>>> import ast
>>> print(ast.literal_eval("[[1,2,3],[4,5,6]]"))
[[1, 2, 3], [4, 5, 6]]

What do you think?

o-smirnov · 2019-07-31T13:03:05Z

Sigh, we need to write things down at meetings -- it seems like everyone is on the same page when we talk, and then we go our own ways with a different mental image...

Mathematically, the coefficients are indeed a 2D matrix of arbitrarily large size. There should be just one coefficients column ("shapelet_coeff") in the LSM, containing an arbitrary-length list. This list is mapped to the 2D matrix as follows:

If shapelet_coeff is 1,2,3,4,5,6,7,8,9,10,11,12 then the matrix is

 1  3  6 10  0
 2  5  9  0
 4  8  0
 7 12
11

and zero everywhere else. This way, you don't need a variable number of columns, and you always, unambiguously, know how many coefficients you have.
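
The mapping above can be sketched as follows. This is an illustration only, assuming the list fills successive anti-diagonals starting from the top-left element and running from the bottom-left to the top-right of each diagonal, which is what the example matrix shows. The function name coeffs_to_matrix is hypothetical and is not taken from Tigger:

```python
def coeffs_to_matrix(coeffs):
    """Place a flat coefficient list on successive anti-diagonals of a square matrix."""
    n = len(coeffs)
    # Smallest size whose triangle (size * (size + 1) / 2 entries) can hold n values.
    size = 0
    while size * (size + 1) // 2 < n:
        size += 1
    mat = [[0] * size for _ in range(size)]
    k = 0
    for diag in range(size):          # anti-diagonal: row + col == diag
        for col in range(diag + 1):   # walk from column 0 up to the first row
            if k == n:
                return mat            # remaining entries stay zero
            mat[diag - col][col] = coeffs[k]
            k += 1
    return mat


# The twelve values shown in the example matrix above.
matrix = coeffs_to_matrix(list(range(1, 13)))
```

Because the matrix size follows from the list length, no separate column count needs to be stored.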

sjperkins · 2019-07-31T14:57:17Z

> Sigh, we need to write things down at meetings -- it seems like everyone is on the same page when we talk, and then we go our own ways with a different mental image...

Yeah, sorry. I think I had a solution in search of a problem.

JoshVStaden · 2019-08-02T07:50:33Z

@o-smirnov Hi Oleg. I have made the changes you requested to how tigger reads in the coefficients field. I have run into an issue with how the coefficients are structured and wanted to run it past you and @landmanbester before going forward.

> Mathematically, the coefficients are indeed a 2D matrix of arbitrarily large size. There should be just one coefficients column ("shapelet_coeff") in the LSM, containing an arbitrary-length list. This list is mapped to the 2D matrix as follows:

According to my shapelet script, the coefficients are not read as a 2D matrix of arbitrary size, but, for each source, they are treated as vectors (i.e. a list of coefficients for l, and a list of coefficients for m).

Just to run it past you: if the coefficients are a 2D matrix, would this matrix simply be the outer product of the two vectors?

So, for example, if vec_coeffs_l is the vector holding the coefficients for l dimension, and likewise for vec_coeffs_m, and mat_coeffs is the matrix of coefficients that you described, then would it be the case that mat_coeffs[3,4] = vec_coeffs_l[3] * vec_coeffs_m[4]?

sjperkins · 2019-08-02T08:32:45Z

Tigger/Models/Formats/ASCII.py

+            n += 1
+            x += n
+        return n
+


You can use np.tril_indices here
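
For reference, np.tril_indices(n) returns the row and column index arrays of the lower triangle of an n x n array, so a hand-written triangular index counter is not needed. The sketch below is an assumption about how it could be applied here, not the code from this PR; fill_lower_triangle is a hypothetical name. Note that np.tril_indices walks the triangle row by row, which is not the same as the anti-diagonal ordering discussed elsewhere in this thread.

```python
import numpy as np


def fill_lower_triangle(coeffs):
    """Write a flat coefficient list into the lower triangle of a square array."""
    # Smallest n whose lower triangle (n * (n + 1) / 2 entries) can hold the list.
    n = 0
    while n * (n + 1) // 2 < len(coeffs):
        n += 1
    rows, cols = np.tril_indices(n)
    mat = np.zeros((n, n))
    k = len(coeffs)
    # Only the first k triangle positions are filled; the rest stay zero.
    mat[rows[:k], cols[:k]] = coeffs
    return mat


triangle = fill_lower_triangle([1, 2, 3, 4])
```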

landmanbester · 2019-08-02T10:28:32Z

@JoshVStaden Yes, the basis function is assumed separable, so to get the basis function for ij you simply take the product of the basis for coeff_l[i] and coeff_m[j], but there is still only a single coefficient per 2D basis function. The ordering that Oleg is suggesting can be illustrated by the following figure:

We enumerate the basis functions by zig-zagging along diagonals. For the above case, starting at the bottom left corner, we enumerate the basis functions as

[(0,0), (0,1), (1,0), (2,0), (1,1), (0,2), (0,3), (1,2), (2,1), (3,0),....]

and so on. This ordering should be implicit in the lsm format. Thus, given a 1D list with n coefficients say, you can always use the ordering to associate each coefficient with its basis function. For example, if we receive the following list of coefficients

[a0, a1, a2, a3]

then we know to reconstruct the function as

f(l, m) = a0 B0(l) B0(m) + a1 B0(l) B1(m) + a2 B1(l) B0(m) + a3 B2(l) B0(m)
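
A short sketch of the enumeration and reconstruction described above, under stated assumptions: the zig-zag direction is inferred from the ten index pairs listed in this comment, and basis(n, x) is a hypothetical callable standing in for the 1D basis function of order n. Neither function is a Tigger API.

```python
def zigzag_indices(count):
    """Return the first `count` (i, j) pairs, zig-zagging along anti-diagonals."""
    pairs = []
    d = 0  # anti-diagonal number: every pair on it satisfies i + j == d
    while len(pairs) < count:
        diag = [(i, d - i) for i in range(d + 1)]
        if d % 2 == 0:
            diag.reverse()  # even diagonals run from (d, 0) to (0, d)
        pairs.extend(diag)
        d += 1
    return pairs[:count]


def reconstruct(coeffs, basis, l, m):
    """Evaluate f(l, m) as the sum of a_k * B_i(l) * B_j(m) in zig-zag order."""
    order = zigzag_indices(len(coeffs))
    return sum(a * basis(i, l) * basis(j, m) for a, (i, j) in zip(coeffs, order))


def monomial(n, x):
    """Toy stand-in basis (x ** n), used only to exercise the code."""
    return x ** n


value = reconstruct([1, 2, 3, 4], monomial, 2.0, 3.0)
```

With the toy basis, the four terms are 1, 6, 6 and 16, matching the term-by-term form of f(l, m) given above.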

Is that a bit clearer now?

landmanbester · 2019-08-02T10:42:00Z

Oh, and I am just realising that to complete the top right corner of the square in that figure you are going to need to specify the maximum order for the 1D basis functions. In the figure, the maximum order for both l and m is 4. If the maximum order for l and m are not the same you end up with a rectangle instead of a square but I am not sure if that will ever happen in practice. Maybe we can just add an order parameter to the lsm and assume it is always square. @o-smirnov what do you think?
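
To illustrate why the maximum order matters, here is a sketch that enumerates a full square in the same zig-zag order. It assumes a single order parameter shared by l and m, with orders running from 0 to max_order inclusive; both the function name zigzag_square and that inclusive convention are assumptions, not something fixed in this thread.

```python
def zigzag_square(max_order):
    """All (i, j) with 0 <= i, j <= max_order, zig-zagging along anti-diagonals."""
    pairs = []
    for d in range(2 * max_order + 1):
        # Past the main anti-diagonal, clip each diagonal to stay inside the square.
        lo = max(0, d - max_order)
        hi = min(d, max_order)
        diag = [(i, d - i) for i in range(lo, hi + 1)]
        if d % 2 == 0:
            diag.reverse()
        pairs.extend(diag)
    return pairs


square = zigzag_square(4)
```

Without max_order there is no way to tell where the diagonals should start shrinking, which is the ambiguity raised above.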

JoshVStaden · 2019-08-05T07:47:15Z

Just so that I am following, we are assuming the coefficient matrix is just the element-wise product between the two coefficient vectors for l and m, correct? And the issue here is that, without specifying the maximum order, the format we specified would not be able to write the coefficients for the top right hand corner in Landman's example image there? So what we would do is specify the maximum order, so that the script can start writing down the diagonals towards the top right hand corner of that example (in the case of the code, it would be towards the bottom right hand side of the matrix).

If what I am saying is correct, would it not make more sense to simply specify the two l and m vectors, and have the code automatically generate the matrix from them? So one would specify shapelet_coeffs as a single field, with an even number of elements in that field, and the code would simply split it up into the l and m parts, and multiply them?

So, for example, for an input of 1,2,3,4,5,6, the code would output the following matrix:

4 5 6
8 10 12
12 15 18

Because the l vector is 1,2,3 and the m vector is 4,5,6. And then, if the user inputs an odd number of coefficients, the code would simply throw an exception.
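
The proposal above can be sketched as below; split_and_multiply is a hypothetical name used only for illustration. Note that a matrix built this way is always an outer product of two vectors, so it cannot represent an arbitrary set of per-basis-function coefficients.

```python
def split_and_multiply(coeffs):
    """Split a flat list into l and m halves and return their outer product."""
    if len(coeffs) % 2 != 0:
        raise ValueError("expected an even number of coefficients")
    half = len(coeffs) // 2
    vec_l, vec_m = coeffs[:half], coeffs[half:]
    # Entry [i][j] is vec_l[i] * vec_m[j].
    return [[cl * cm for cm in vec_m] for cl in vec_l]


product = split_and_multiply([1, 2, 3, 4, 5, 6])
```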

Does this make sense, or does this go against the general goals of this format?

ratt-priv-ci · 2019-11-20T10:41:10Z

Can one of the admins verify this patch?

Added support for Shapelet models.

ac448fd

Changed shapelet coeffs field to read in a single list as a triangle.

775127e

Replaced triangular_index with np.tril_indices

0965723

Changes to shapelet coefficient parsing.

4161e46

JoshVStaden added 2 commits March 4, 2020 09:51

Fix bugs in ASCII.py.

d2ad42f

Changed shapelet format to be expressed in terms of ex and ey.

59f1f34
