Skip to content

The Open source SPL Boosts MongoDB Computing Ability

esProcSPL edited this page Feb 26, 2024 · 1 revision

MongoDB is a typical NoSQL database. Its document-oriented structure makes both storage and access convenient and efficient. But the database has rather weak computing ability. Computations on MongoDB data, particularly complex ones, are hard to handle. A data computing engine having powerful computing capability is needed to work with MongoDB to achieve relevant computing tasks.

The open-source esProc SPL is a specialized structured data computation engine. It supplies rich class libraries and all-around, database-independent computational capabilities. SPL has an independent procedural syntax that is particularly good at handling complex computations. It can help MongoDB increase its ability to compute, accomplish grouping & aggregation, joins, subqueries, and all the other computing tasks effortlessly.

Regular queries

It is easy to achieve JOINs MongoDB finds it difficult to handle in SPL:

A B
1 =mongo_open("mongodb://127.0.0.1:27017/raqdb") /Connect   to MongDB
2 =mongo_shell(A1,"c1.find()").fetch() /Fetch   data from MongoDB
3 =mongo_shell(A1,"c2.find()").fetch()
4 =A2.join(user1:user2,A3:user1:user2,output) /Perform   join
5 >A1.close() /Close   connection to MongoDB

 

SPL can reuse the result of handling data of a table that is repeatedly involved in computations:

A B
1 =mongo_open("mongodb://127.0.0.1:27017/raqdb")
2 =mongo_shell(A1,“course.find(,{_id:0})”).fetch() /Fetch   data from MongoDB
3 =A2.group(Sno).((avg   = ~.avg(Grade), ~.select(Grade>avg))).conj() /Get   documents where grading level is above average
4 >A1.close()

 

Perform IN conditional query in SPL:

A B
1 =mongo_open("mongodb://localhost:27017/test")
2 =mongo_shell(A1,"orders.find(,{_id:0})") /Fetch   data from MongoDB
3 =mongo_shell(A1,"employee.find({STATE:'California'},{_id:0})").fetch() /Select   certain employee documents
4 =A3.(EID).sort() /Get   EID field and sort it
5 =A2.select(A4.pos@b(SELLERID)).fetch() /Perform   binary search
6 >A1.close()

 

SPL’s technique to turn foreign key values to objects – the object-referencing foreign key – creates efficient foreign key pointers:

A B
1 =mongo_open("mongodb://localhost:27017/local")
2 =mongo_shell(A1,"Progress.find({},   {_id:0})").fetch() /Fetch   Progress data
3 =A2.groups(courseid;   count(userId):popularityCount) /Group   and count by course
4 =mongo_shell(A1,"Course.find(,{title:1})").fetch() /Get   Course data
5 =A3.switch(courseid,A4:_id) /Foreign-key-based   join
6 =A5.new(popularityCount,courseid.title) /Create   result set
7 =A1.close()

 

SPL achieves APPLY algorithm in a simple way:

A B
1 =mongo_open("mongodb://127.0.0.1:27017/raqdb")
2 =mongo_shell(A1,"users.find()").fetch() /Fetch   users data
3 =mongo_shell(A1,"workouts.find()").fetch() /Fetch   workouts data
4 =A2.conj(A3.select(A2.workouts.pos(_id)).derive(A2.name)) /Get   matching _id values from the sequence of workouts documents
5 >A1.close()

 

SPL’s way of performing set-oriented calculations – intersection, union, difference and concatenation:

A B
1 =mongo_open("mongodb://127.0.0.1:27017/raqdb")
2 =mongo_shell(A1,"emp1.find()").fetch()
3 =mongo_shell(A1,"emp2.find()").fetch()
4 =[A2,A3].conj() /Concatenation   of sequences
5 =[A2,A3].merge@ou() /Union   by whole row comparison
6 =[A2,A3].merge@ou(_id,   NAME) /Union   by key value comparison
7 =[A2,A3].merge@oi() /Intersection   by whole row comparison
8 =[A2,A3].merge@oi(_id,   NAME) /Intersection   by key value comparison
9 =[A2,A3].merge@od() /Difference   by whole row comparison
10 =[A2,A3].merge@od(_id,   NAME) /Difference   by key value comparison
11 >A1.close()

 

Get sequence number of a member in a sequence in SPL:

A B
1 =mongo_open("mongodb://localhost:27017/local)
2 =mongo_shell(A1,"users.find({name:'jim'},{name:1,friends:1,_id:0})").fetch()
3 =A2.friends.pos("luke") /Get   sequence numbers of members in sequence friends
4 =A1.close()

 

Perform intersection of multi-member collections in SPL:

A B
1 [Chemical,   Biology, Math] /Courses
2 =mongo_open("mongodb://127.0.0.1:27017/raqdb")
3 =mongo_shell(A2,"student.find()").fetch() /Fetch   student data
4 =A3.select(Lesson^A1!=[]) /Get   documents where at least one course is selected
5 =A4.new(_id,   Name, ~.Lesson^A1:Lession) /Get   the final result
6 >A2.close()

Complex queries

Getting TopN in SPL:

A B
1 =mongo_open("mongodb://127.0.0.1:27017/test")
2 =mongo_shell(A1,"last3.find(,{_id:0};{variable:1})") /Get   last3 data and sort it by variable
3 for   A2;variable =A3.top(3;-timestamp) /Get   the three documents having the latest timestamps
4 =@|B3 /Append   the selected documents to B4
5 =B4.minp(~.timestamp)      / Get   documents with earliest timestamp
6 >mongo_close(A1)

 

Summarize a nested-structure collection in SPL:

A
1 =mongo_open("mongodb://127.0.0.1:27017/raqdb")
2 =mongo_shell(A1,"computer.find()").fetch()
3 =A2.new(_id:ID,income.array().sum():INCOME,output.array().sum():OUTPUT)
4 >A1.close()

 

Combine subdocuments made up of multiple attributes in SPL:

A B C
1 =mongo_open("mongodb://localhost:27017/local")
2 =mongo_shell(A1,"c1.find(,{_id:0};{name:1})")
3 =create(_id,   readUsers) /Create   result table sequence
4 for   A2;name =A4.conj(acls.read.users|acls.append.users|acls.edit.users|acls.fullControl.users).id() /Get   all users fields
5 >A3.insert(0,   A4.name, B4) /Insert   the current group of data to the result set
6 =A1.close()

 

Query nested List subdocument in SPL:

A B
1 =mongo_open("mongodb://localhost:27017/local")
2 =mongo_shell(A1,"Cbettwen.find(,{_id:0})").fetch()
3 =A2.conj((t=~.objList.data.dataList,t.select((s=float(~.split@c1()(1)),   s>6154 && s<=6155)))) / Get   eligible strings
4 =A1.close()

 

SPL Cross-sector aggregation:

A
1 =mongo_open("mongodb://localhost:27017/local")
2 =mongo_shell(A1,"student.find()").fetch()
3 =A2.group(school)
4 =A3.new(school:school,~.align@a(5,sub1).(~.len()):sub1,~.align@a(5,sub2).(~.len()):sub2)
5 =A4.new(school,sub1(5):sub1-5,sub1(4):sub1-4,sub1(3):sub1-3,sub1(2):sub1-2,sub1(1):sub1-1,sub2(5):sub2-5,sub2(4):sub2-4,sub2(3):sub2-3,sub2(2):sub2-2,sub2(1):sub2-1)
6 =A1.close()

 

SPL segment-based grouping:

A B
1 [3000,5000,7500,10000,15000] /Intervals   of Sales for segmentation
2 =mongo_open("mongodb://127.0.0.1:27017/raqdb")
3 =mongo_shell(A2,"sales.find()").fetch()
4 =A3.groups(A1.pseg(~.SALES):Segment;count(1):   number) /Group   data and count employees by SALES intervals
5 >A2.close()

 

SPL class-based grouping:

A B
1 =mongo_open("mongodb://127.0.0.1:27017/raqdb")
2 =mongo_shell(A1,"books.find()")
3 =A2.groups(addr,book;count(book):Count) / Grouping   & count
4 =A3.groups(addr;sum(Count):Total) / Grouping   & sum
5 =A3.join(addr,A4:addr,Total) / Join   operation
6 >A1.close()

 

Data writing

Export data as CSV in SPL:

A B
1 =mongo_open("mongodb://localhost:27017/raqdb")
2 =mongo_shell(A1,"carInfo.find(,{_id:0})")
3 =A2.conj((t=~,cars.car.new(t.id:id,t.cars.name,   ~:car))) /Split   each car field value into multiple rows
4 =file("D:\\data.csv").export@tc(A3) /Export   as csv
5 >A1.close()

 

SPL database update (from MongoDB to MySQL):

A B
1 =mongo_open("mongodb://localhost:27017/raqdb") /Connect   to MongDB
2 =mongo_shell(A1,"course.find(,{_id:0})").fetch()
3 =connect("myDB1") /Connect   to mysql
4 =A3.query@x("select   * from course2").keys(Sno, Cno)
5 >A3.update(A2:A4,course2,Sno,Cno,   Grade; Sno,Cno) /Update   data into mysql
6 >A1.close()

 

SPL database update (from MySQL to MongoDB):

A B
1 =connect("mysql") /Connect   to mysql
2 =A1.query@x("select   * from course2") /Get   data of course2 table
3 =mongo_open("mongodb://localhost:27017/raqdb") /Connect   to MongDB
4 =mongo_insert(A3,   "course",A2) /Insert   records of MySQL table course2 into the MongoDB collection
5 >A3.close()

 

Mixed computations

SPL enables convenient mixed computation between MongoDB and another data source:

A B
1 =mongo_open("mongodb://localhost:27017/test") /Connect to MongDB
2 =mongo_shell(A1,"emp.find({'$and':[{'Birthday':{'$gte':'"+string(begin)+"'}},{'Birthday':{'$lte':'"+string(end)+"'}}]},{_id:0})").fetch() /Get records within a specified time   interval
3 =A1.close() / Close MongoDB connection
4 =myDB1.query("select * from   cities") / Get data of cities table in mysql
5 =A2.switch(CityID,A4:CityID) / Foreign-key-based join
6 =A5.new(EID,Dept,CityID.CityName:CityName,Name,Gender) / Create result set
7 return A6 / Return the result set

 

SQL support

Besides the native syntax, SPL offers support of SQL92 standard. You can use SQL to query MongoDB. To achieve the above join operation, for instance:

A
1 =mongo_open("mongodb://127.0.0.1:27017/test")
2 =mongo_shell(A1,"c1.find()").fetch()
3 =mongo_shell@x(A1,"c2.find()").fetch()
4 $select   s.* from {A2} as s left join {A3} as r on s.user1=r.user1 and s.user2=r.user2   where r.income>0.3

Integration into application

SPL provides standard JDBC/ODBC drivers through which SPL can be conveniently integrated into an application or invoked by it. To invoke SPL code through JDBC, for instance:

…
Class.forName("com.esproc.jdbc.InternalDriver");
Connection conn = DriverManager.getConnection("jdbc:esproc:local://");
PrepareStatement st=con.prepareStatement("call splScript(?)"); // splScript is the name of SPL script file
st.setObject(1,"California");
st.execute();
ResultSet rs = st.getResultSet();
…

With all those functionalities, you’ll sure to be impressed by MongoDB’s strikingly boosted computing ability. Try your hand now.

Download SPL

SPL source code

Clone this wiki locally