-
Notifications
You must be signed in to change notification settings - Fork 332
SPL:Recursively Search Referenced Records
Recursion refers to an operation or function that calls itself directly or indirectly. For example, the tower of Hanoi is a typical recursive operation. It is also very common to use recursion for queries in practice. For instance, we want to know the authorities that are in charge of a certain office. The superior authority that directly dominates the office is not difficult to find. However, the uncertain layers of affiliations are hard to find if we continue to query the superiors of the relative branches repeatedly, and that’s where recursion comes in.
[e.g. 1] Query the level of each office based on the organizational structure table of the company (the head office is level 1, branch office is level 2, and so forth). Some of the data are as follows:
ID | ORG_NAME | PARENT_ID |
---|---|---|
1 | Head Office | 0 |
2 | Beijing Branch Office | 1 |
3 | Shanghai Branch Office | 1 |
4 | Chengdu Branch Office | 1 |
5 | Beijing R&D Center | 2 |
… | … | … |
Every record needs to be looped to recursively find all the superiors of each office. The A.prior(F) function in SPL recursively searches referenced records, and by default, all of the referenced records.
The SPL script is as follows:
A | |
---|---|
1 | =T("Organization.txt") |
2 | >A1.switch(PARENT_ID,A1:ID) |
3 | =A1.new(ID,ORG_NAME,~.prior(PARENT_ID).len():LEVEL) |
A1: import the organizational structure table.
A2: objectify the parent ID foreign key as the corresponding parent office record for self-join
A3: create a new table consists of sequence numbers, organization names, and levels, in which the levels of offices are calculated by A.prior() function with the layer numbers of the referenced records recursively.
[e.g. 2] Query the affiliated offices of Beijing Branch Office and list the names of its superior offices, separated by commas with multi-layers, based on the organizational structure table of the company. Here are some of the data:
ID | ORG_NAME | PARENT_ID |
---|---|---|
1 | Head Office | 0 |
2 | Beijing Branch Office | 1 |
3 | Shanghai Branch Office | 1 |
4 | Chengdu Branch Office | 1 |
5 | Beijing R&D Center | 2 |
… | … | … |
In this example, we want to terminate the recursion and save the current office when the specified value (Beijing Branch Office) is found and skip the offices that cannot be found. The A.prior(F,r) function in SPL recursively searches referenced records until the specified value r.
The SPL script is as follows:
A | |
---|---|
1 | =T("Organization.txt") |
2 | >A1.switch(PARENT_ID,A1:ID) |
3 | =A1.select@1(ORG_NAME=="Beijing Branch Office") |
4 | =A1.new(ID,ORG_NAME,~.prior(PARENT_ID,A3):PARENT_NAME) |
5 | =A4.select(PARENT_NAME!=null) |
6 | =A5.run(PARENT_NAME=PARENT_NAME.(PARENT_ID.ORG_NAME).concat@c()) |
A1: import the organizational structure table.
A2: convert the parent ID to the corresponding parent office record to objectify the foreign key.
A3: select the record of Beijing Branch Office.
A4: create a new table consisting of sequence numbers, organization names, and the record set of all the superiors.
A5: select the record of non-empty parent office, that is, the record of Beijing Branch Office.
A6: loop the names of parent offices to concatenate them as a string separated by commas.
[e.g.3] Query the following counties of Hebei Province according to the Chinese administrative region table. Here are some of the data:
ID | NAME | PARENT_ID |
---|---|---|
1 | China | 0 |
11 | Beijing | 1 |
12 | Tianjin | 1 |
13 | Hebei | 1 |
… | … | … |
1301 | Shijiazhuang | 13 |
1302 | Tangshan | 13 |
… | … | … |
On the contrary to the previous sections, this example searches all the leaf records based on the parent records. The P.nodes@d(F,r) function queries to search all of the leaf records recursively.
The SPL script is as follows:
A | |
---|---|
1 | =T("ChinaRegion.csv") |
2 | >A1.switch(PARENT_ID,A1:ID) |
3 | =A1.select@1(NAME=="Hebei") |
4 | =A1.nodes@d(PARENT_ID,A3) |
5 | =A4.new(ID,NAME,PARENT_ID.NAME:PARENT_NAME) |
A1: import the Chinese administrative region table.
A2: convert the parent ID to the corresponding parent region record to objectify the foreign key.
A3: select the record of Hebei Province.
A4: search recursively with P.nodes() function, and @d is used to recursively search all the leaf records.
A5: create a table consisting of sequence numbers, county names, and province names.
SPL Resource: SPL Official Website | SPL Blog | Download esProc SPL | SPL Source Code