Using Java serialization

SCaVis Serialized class

Data can be serialized into a file using the Serialized class. The only limitation is the available memory, since the whole object has to be created in memory before it is written. In this example we create a list holding a 2D array (P1D) and a 1D array (P0D) and write it to a file. Then we restore it.

Code example

from jhplot import *
from jhplot.io import *

p1=P1D("test")    # create a 2D array (X-Y points) and fill it
p1.add(10,20)
p1.add(12,40)
print p1.toString()

p2=P0D("test2")   # create a 1D array and fill it
for i in range(10):
    p2.add(i*i)

a=[p1,p2]                        # make a list
Serialized.write(a,'file.ser')   # write the list to the file

fas=Serialized.read('file.ser')  # read the data from the file

p1=fas[0]; p2=fas[1]             # take the arrays from the list
print "After serialization:\n"
print p1.toString(); print p2.toString()

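One can also use the SerializedXML class, which writes/reads data to/from XML files. Let us take a look at how to write a histogram and an array into an XML serialized file and then read them back. The example below uses the XML methods of the Serialized class (writeXML, toXML and readXML).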

Code example

from jhplot import *
from jhplot.io import *
from java.util import Random

h1=H1D("test H1D",20,-3,3)
h1.setFill(1)

p0=P0D("test P0D")
r=Random()           # fill with random numbers
for i in range(100):
    h1.fill(2*r.nextGaussian())
    p0.add(r.nextGaussian()+10)
a=[]                 # put both objects into the list
a.append(h1)
a.append(p0)

Serialized.writeXML(a,'file.xml')  # write to an XML file
print Serialized.toXML(a)          # convert to an XML string

b=Serialized.readXML('file.xml')   # read back from XML
print b

HFile class

The class HFile is designed to store any Java object (including containers) in a compact serialized form. It can store large data volumes without memory limitations (unlike the class Serialized considered above), and it is well suited for sequential input and output. For example, one can store the data containers and functions described in Data structures and functions. The class is based on the standard Java serialization mechanism (compression is enabled by default, but can be switched off). A typical extension for such files is ".ser".
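The idea of sequential, compressed storage can be sketched in plain Python. This is only an analogy under stated assumptions: it uses the standard pickle and gzip modules instead of Java serialization, and the helper names (write_records, read_records, squares) and the file name are hypothetical, not part of SCaVis. The point it shows is that records are written and read one at a time, so the full data set never has to sit in memory.

```python
import gzip
import os
import pickle
import tempfile

def write_records(path, records):
    """Write objects to a gzip-compressed file, one record at a time."""
    with gzip.open(path, "wb") as out:
        for rec in records:
            pickle.dump(rec, out, protocol=2)

def read_records(path):
    """Yield the objects back in the order they were written."""
    with gzip.open(path, "rb") as src:
        while True:
            try:
                yield pickle.load(src)
            except EOFError:   # end of file: no more records
                return

def squares(n):
    # A generator: each record is produced only when it is needed.
    for i in range(n):
        yield {"index": i, "values": [i * i, i * i + 1]}

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "stream.ser.gz")
write_records(path, squares(1000))

count = 0
total = 0
for rec in read_records(path):   # one record in memory at a time
    count += 1
    total += rec["values"][0]
print(count)
print(total)
```

Because both the writer and the reader work record by record, the same pattern scales to files much larger than the available memory, which is the property the text attributes to HFile.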

Essentially, almost any Java object can be stored in and retrieved from an HFile file.

Data stored in a file created by the class HFile can be viewed in a browser based on the class HFileBrowser.

One can also store data in the XML form using the class HFileXML.

Let us give an example: we create several objects (arrays and two histograms) and write them into a compressed serialized file (use HFileXML to write in the XML format instead). Then we read this file and restore the objects:

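from java.util import Random
from jhplot.io import *
from jhplot import *

name="output.ser"      # assumed file name (not given in the source)
f=HFile(name,"w")      # open the file for writing
x=P0D('X'); y=P0D('Y')
h1 = H1D("fixed bins",10, -2, 2.0)
h2 = H1D("variable",[-1,2,4,7])
r = Random()
for i in range(100):   # loop body is an assumption; the original is missing
    x.add(r.nextGaussian()); y.add(r.nextGaussian())
    h1.fill(r.nextGaussian()); h2.fill(r.nextGaussian())
for ob in [x,y,h1,h2]: f.write(ob)   # assumed write step
f.close()
print "Created the file=", name

# reading objects (in the order they were written)
f=HFile(name,"r")
x=f.read(); y=f.read(); h1=f.read(); h2=f.read()
f.close()
print h1.toString()
print h2.toString()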

You can append any number of objects to this file. You can even make a Jython map or list from different objects and write such containers in one go.

There is no restriction on which Java objects can be written. One can write arbitrarily complicated data in the form of arrays, strings, lists, tuples, maps, dictionaries, SCaVis functions, histograms etc. Any Java, Jython, SCaVis or third-party Java container which can hold data can be written into a file. If you want to write objects which can be retrieved using keys, use maps or dictionaries.
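As a rough illustration of writing a whole container in one go and retrieving items by key afterwards, here is a plain-Python sketch. It relies on the standard pickle module rather than HFile, and the dictionary contents and the file name are made up for the example.

```python
import os
import pickle
import tempfile

# A mixed container: numbers, strings, tuples and lists under string keys.
bundle = {
    "run": 42,
    "label": "calibration",
    "points": [(10, 20), (12, 40)],
    "squares": [i * i for i in range(10)],
}

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "bundle.ser")

with open(path, "wb") as out:
    pickle.dump(bundle, out, protocol=2)   # the whole container in one write

with open(path, "rb") as src:
    restored = pickle.load(src)            # the whole container in one read

print(restored["points"])                  # look up one object by its key
```

The container comes back as a single object, so key lookup happens in memory after the read. That is convenient for small collections, but it gives up the record-by-record access that makes large files manageable.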

Using the keys

So far we have considered data records organized sequentially. One can also store objects using keys in the form of strings. The keys should be unique.

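from jhplot import *
from jhplot.io import *

g=HFile("keys.ser","w")      # assumed file name and write mode
p1=P0D("p1"); p2=P0D("p2")   # assumed definitions; missing in the source
p1.randomNormal(1000,0,2) # 1000 random numbers
p2.randomNormal(1000,1,2) # 1000 random numbers
g.write("directory/key4",p1) # store the object in the directory
g.close()

# now reading the objects back
g=HFile("keys.ser","r")
print g.get("directory/key4")  # prints p1
g.close()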

The notion of “directory” is important. Now one can organize data using some meaningful logic and open data in a browser as will be discussed below.

Be careful mixing data inside HFile without keys and with keys. If you have inserted a lot of records without using keys, you will pay a penalty in retrieval of objects with keys, since objects with keys will always be extracted last, after scanning through all other objects without the keys.
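The cost described above can be made concrete with a toy model. The internal layout of HFile is not specified here, so the ToyStore class below is purely hypothetical: it keeps all records in one sequential list and finds a keyed record by scanning from the start, which is the behavior the paragraph describes.

```python
class ToyStore(object):
    """Toy sequential store: each record is a (key, object) pair.

    Records written without a key carry key=None.
    """

    def __init__(self):
        self.records = []

    def write(self, obj, key=None):
        self.records.append((key, obj))

    def get(self, key):
        """Scan from the start; return (object, records inspected)."""
        for steps, (k, obj) in enumerate(self.records, 1):
            if k == key:
                return obj, steps
        raise KeyError(key)

store = ToyStore()
for i in range(10000):
    store.write(i)                        # many records without keys
store.write("payload", key="directory/key4")

obj, steps = store.get("directory/key4")
print(obj)
print(steps)                              # records inspected before the hit
```

In this model the keyed lookup has to pass over every unkeyed record written before it, so the retrieval time grows with the number of such records. Keeping keyed and unkeyed data in separate files avoids that scan.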

DataBrowser to open ".jser" files

All SCaVis objects stored in compressed Java-serialized files can be viewed using a browser. For example, if a serialized file contains P1D, P0D, H1D objects, one can view them and plot them using a mouse-click approach.

If you have a file with the extension ".jser", you can view it using the DataBrowser. Go to the toolbar and select [Plot]→[HPlot canvas]→[File]→[Open data file]. Then you can plot the objects with a mouse click. See input_output.

One can also open the browser from a macro as:

Code example

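from jhplot.io import *
from jhplot import *

c1=HPlot("Browser")   # canvas used by the browser to draw objects
c1.visible()
f=HFile("test.ser")   # open the serialized file
HFileBrowser(c1,f,1)  # attach the browser to the canvas and the file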

I/O performance and benchmarks

Here we compare performance of the PFile and HFile classes for read and write mode. Benchmark results are given together with the code.

Third-party IO classes

There are a lot of other Java-based I/O classes designed for storing and retrieving data. A complete description of how to use Java, Jython and SCaVis for scientific analysis is given in the book Scientific data analysis using Jython and Java, published by Springer Verlag, London, 2010 (by S.V.Chekanov).

Sergei Chekanov 2010/03/07 17:35