Home
Python prerequisites
The S.IM.PL (de)serializer is developed in Python 2.7, so that is the version that should be installed. For the XML deserializer, I used the pulldom module from the standard library package xml.dom. For the serializer I also used a standard library module, xml.etree.ElementTree.
For JSON, I used an external library, ijson, for deserializing. For serializing, I used the built-in json module.
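As a rough illustration of the standard library modules named above, the following sketch reads attributes with xml.dom.pulldom and writes an element with xml.etree.ElementTree. The helper names read_attributes and write_element are hypothetical and are not part of the S.IM.PL code base.

```python
import xml.etree.ElementTree as ET
from xml.dom import pulldom

XML_INPUT = '<point x="10" y="20"/>'


def read_attributes(xml_text):
    """Stream over the XML and collect (tag, attributes) for each start tag."""
    found = []
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            found.append((node.tagName, dict(node.attributes.items())))
    return found


def write_element(tag, attrs):
    """Build a single element with ElementTree and return it as text."""
    element = ET.Element(tag, attrs)
    return ET.tostring(element).decode("ascii")


if __name__ == "__main__":
    print(read_attributes(XML_INPUT))
    print(write_element("point", {"x": "10", "y": "20"}))
```

pulldom yields events one at a time, which suits a deserializer that builds objects as it reads, while ElementTree is convenient for assembling output in memory.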
ijson can be installed on Unix with easy_install:
sudo easy_install ijson
Alternatively, to install the ijson library manually, follow these steps:
- Download the ijson archive from http://pypi.python.org/pypi/ijson/.
- Copy the library directory to the Python directory.
- Run 'python setup.py build install' in the ijson directory to install the Python library.
- To have a working ijson module, the yajl C library must be compiled on the local machine and placed in a directory that is listed in the operating system's PATH environment variable.
- Download yajl 1.0.12 from http://lloyd.github.com/yajl/. Only this version will work with ijson.
- Compile the yajl library using cmake and place the binary (.so or .dll) in a directory in the PATH of the OS.
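After the installation steps, a quick check like the one below can show whether the pieces are visible to Python. This is only a sketch: find_yajl and ijson_available are hypothetical helper names, and ctypes.util.find_library uses the platform's own library search rules, which may differ from the PATH lookup described above.

```python
import ctypes.util


def find_yajl():
    """Return the name or path of the yajl shared library, or None if not found."""
    return ctypes.util.find_library("yajl")


def ijson_available():
    """Return True if the ijson module can be imported."""
    try:
        import ijson  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    print("yajl library: %s" % find_yajl())
    print("ijson importable: %s" % ijson_available())
```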
Repository and tree structure
The Ecologylab Fundamental Python (de)serializer trunk can be found at https://github.com/ecologylab/ecologylabFundamentalPython
The deserializer code is placed in the deserializer package and the serializer is located in the serializer package.
The tests are located in the simpl_testcases directory.
Running tests
S.IM.PL Python uses JSON-serialized simplTypesScopes generated in the Java repository, since the tests were ported from the S.IM.PL tests in Java. To generate these files, I created a new method in the Java environment that serializes the simplTypesScope for each test and places it in a folder where the Python tests can access it. A Python S.IM.PL test works in the following way:
- A S.IM.PL object is deserialized from serialized files (both JSON and XML), which are also generated by the Java tests and stored in the same directory as the serialized simplTypesScopes.
- The serializer serializes the objects back to their original format, XML or JSON.
- The output strings are compared to the initial inputs.
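The three steps above amount to a round-trip check. The sketch below shows the same pattern using only the standard json module as a stand-in for the S.IM.PL deserializer and serializer. The function names are hypothetical, and the real tests operate on S.IM.PL objects rather than plain dictionaries.

```python
import json
from collections import OrderedDict


def round_trip_json(text):
    """Deserialize JSON text and serialize it again, preserving key order."""
    data = json.loads(text, object_pairs_hook=OrderedDict)
    return json.dumps(data)


def check_round_trip(text):
    """Return True if the re-serialized output equals the original input."""
    return round_trip_json(text) == text


if __name__ == "__main__":
    original = '{"point": {"y": "20", "x": "10"}}'
    print(check_round_trip(original))
```

Comparing raw strings only works when the serializer reproduces key order and whitespace exactly, which is why the sketch keeps the keys in an OrderedDict.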
The script run_tests.py, located in simpl_testcases/tests, runs all the S.IM.PL tests.
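A runner of this kind could be written with unittest discovery, as in the sketch below. This is an assumption about the general approach, not a description of what run_tests.py actually does, and the run_all name and the test*.py file pattern are hypothetical.

```python
import sys
import unittest


def run_all(start_dir, pattern="test*.py", stream=None):
    """Discover and run every test module under start_dir.

    Returns True when no test failed or raised an error.
    """
    suite = unittest.TestLoader().discover(start_dir, pattern=pattern)
    runner = unittest.TextTestRunner(stream=stream or sys.stderr, verbosity=1)
    result = runner.run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    # Exit with a non-zero status if any test failed.
    sys.exit(0 if run_all(".") else 1)
```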
The serializer can also be tested from the Python interpreter by entering the following commands from the project's root:
>>> from serializer.simpl_types_scope import *
>>> scope = SimplTypesScope(Format.JSON, "circle_scope")
>>> point = scope.SimplType("Point")
>>> point.x = 10
>>> point.y = 20
>>> scope.printSerialized(point, Format.XML)
<?xml version="1.0" ?>
<point x="10" y="20"/>
>>> scope.printSerialized(point, Format.JSON)
{"point": {"y": "20", "x": "10"}}
Another example that can be tried in the same way as the Point example above is the personDirectory test, which includes a collection with polymorphic fields:
>>> from serializer.simpl_types_scope import *
>>> scope = SimplTypesScope(Format.JSON, "personDirectory_scope")
>>> student = scope.SimplType("Student")
>>> student.name = "John"
>>> student.stuNum = "123"
>>> faculty = scope.SimplType("Faculty")
>>> faculty.name = "Andruid"
>>> faculty.designation = "prof"
>>> personDirectory = scope.SimplType("PersonDirectory")
>>> personDirectory.persons.append(student)
>>> personDirectory.persons.append(faculty)
>>> scope.printSerialized(personDirectory, Format.JSON)
{"person_directory": {"persons": [{"student":]}}
>>> scope.printSerialized(personDirectory, Format.XML)
<?xml version="1.0" ?>
<person_directory>
<persons>
<student name="John" stu_num="123"/>
<faculty designation="prof" name="Andruid"/>
</persons>
</person_directory>
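The XML output above suggests two conventions: each element of the polymorphic collection is written under a tag derived from its own class, and camelCase names become lower_snake_case. The sketch below imitates that with ElementTree. The Student and Faculty classes and the helpers to_tag, element_for and directory_to_xml are hypothetical stand-ins, not the S.IM.PL implementation.

```python
import re
import xml.etree.ElementTree as ET


def to_tag(name):
    """Convert a camelCase or PascalCase name to lower_snake_case."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


class Student(object):
    def __init__(self, name, stuNum):
        self.name = name
        self.stuNum = stuNum


class Faculty(object):
    def __init__(self, name, designation):
        self.name = name
        self.designation = designation


def element_for(obj):
    """Create an element named after the object's class, with fields as attributes."""
    attrs = dict((to_tag(key), str(value)) for key, value in vars(obj).items())
    return ET.Element(to_tag(type(obj).__name__), attrs)


def directory_to_xml(persons):
    """Wrap every person in a <persons> collection, using each object's own tag."""
    root = ET.Element("person_directory")
    wrapper = ET.SubElement(root, "persons")
    for person in persons:
        wrapper.append(element_for(person))
    return root


if __name__ == "__main__":
    tree = directory_to_xml([Student("John", "123"), Faculty("Andruid", "prof")])
    print(ET.tostring(tree).decode("ascii"))
```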
Graph serialization
For serializing graph structures, I followed the algorithms specified in Nabeel Shazad's thesis. Graph serialization can be enabled on the simplTypesScope instance. Once it is enabled, the simpl:id attribute is added to an object's serialization only if the same object has to be serialized again deeper in the serialization tree. When an object has already been serialized, the simpl:ref attribute replaces its serialization.
To generate a unique id for each object reference, I used the following algorithm, which relies on the fact that in Python every object keeps the same id(), its memory address, for as long as it exists at runtime:
- The simpl_id starts as the id() of the object, which returns the object's memory address as a long integer.
- The type() of the object is converted to a string. For this string, I generate a simple hash code by adding up the ASCII codes of all its characters, using the ord() function.
- These two numbers are added, and the result is the generated simpl_id.
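The steps above can be sketched in a few lines. The function names type_hash and simpl_id are chosen here for illustration and may not match the names used in the serializer package.

```python
def type_hash(obj):
    """Sum the character codes of the string form of the object's type."""
    return sum(ord(ch) for ch in str(type(obj)))


def simpl_id(obj):
    """Combine the object's id() with the hash of its type name."""
    return id(obj) + type_hash(obj)


if __name__ == "__main__":
    class Point(object):
        pass

    point = Point()
    # The value stays the same for as long as the object is alive.
    print(simpl_id(point) == simpl_id(point))
```

Because the id is derived from the memory address, it is only stable within a single run and will differ between runs.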
>>> from serializer.simpl_types_scope import *
>>> scope = SimplTypesScope(Format.JSON, "classAclassB_scope")
>>> scope.enableGraphSerialization()
>>> classA1 = scope.SimplType("ClassA")
>>> classB1 = scope.SimplType("ClassB")
>>> classA1.x = 1
>>> classA1.y = 2
>>> classB1.a = 3
>>> classB1.b = 4
>>> classA1.classA = classA1
>>> classA1.classB = classB1
>>> classB1.classA = classA1
>>> scope.printSerialized(classA1, Format.XML)
<?xml version="1.0" ?>
<class_a simpl:id="31405957" x="1" xmlns:simpl="http://ecologylab.net/research/simplGuide/serialization/index.html" y="2">
<class_a simpl:ref="31405957"/>
<class_b a="3" b="4">
<class_a simpl:ref="31405957"/>
</class_b>
</class_a>
>>> scope.printSerialized(classA1, Format.JSON)
{"class_a": {"y": "2", "x": "1", "simpl:id": "31405957", "class_a": {"simpl:ref": "31405957"}, "class_b": {"a": "3", "class_a": {"simpl:ref": "31405957"}, "b": "4"}}}
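The simpl:id and simpl:ref behaviour shown in this output can be imitated with a two-pass traversal: the first pass counts how often each object is reached, and the second emits simpl:id only for objects reached more than once and simpl:ref for objects already written. This is a simplified sketch under those assumptions, not the thesis algorithm itself. The Node class and all function names are hypothetical, and plain id() is used as the identifier.

```python
class Node(object):
    """Hypothetical stand-in for a S.IM.PL object: scalar fields plus child objects."""

    def __init__(self, tag, **scalars):
        self.tag = tag
        self.scalars = scalars
        self.children = []


def count_visits(node, counts):
    """First pass: record how many times each object is reached."""
    key = id(node)
    counts[key] = counts.get(key, 0) + 1
    if counts[key] == 1:
        # Descend only on the first visit so cycles terminate.
        for child in node.children:
            count_visits(child, counts)
    return counts


def to_dict(node, counts, done):
    """Second pass: emit fields, adding simpl:id or simpl:ref where needed."""
    key = id(node)
    if key in done:
        return {"simpl:ref": str(key)}
    done.add(key)
    out = dict((name, str(value)) for name, value in node.scalars.items())
    if counts[key] > 1:
        out["simpl:id"] = str(key)
    for child in node.children:
        out[child.tag] = to_dict(child, counts, done)
    return out


def serialize_graph(root):
    """Return a nested dictionary describing the object graph rooted at root."""
    counts = count_visits(root, {})
    return {root.tag: to_dict(root, counts, set())}


if __name__ == "__main__":
    import json

    class_a = Node("class_a", x=1, y=2)
    class_b = Node("class_b", a=3, b=4)
    class_a.children = [class_a, class_b]
    class_b.children = [class_a]
    print(json.dumps(serialize_graph(class_a)))
```

In this sketch class_b is reached only once, so it gets no simpl:id, which mirrors the output above.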