SCaVis manual

File input/output

SCaVis supports many different types of I/O (input/output). In most cases, the I/O part of SCaVis is based on self-describing file formats.

Here is the list of I/O supported by SCaVis:

  • The native Java I/O. Access them from the java.io package;
  • The native Python I/O methods and classes;
  • Native SCaVis I/O classes built into the jhplot package. Several of them are based on the standard Java serialization and XML-type serialization. Access them from jhplot.io; some of them are discussed below;
  • Native SCaVis I/O classes built around XML syntax. The file format is cross-platform;
  • External databases such as:
    • SQL-type (Derby and SQLite based on SQLjet). Starting from v3.8, Derby is excluded from the package since it comes with JDK as JavaDB (see the JAVAHOME/db/ directory).
    • Several object-based databases (like NeoDatis)
  • Maps with disk backends to store large data
  • External file formats native to C++, such as ROOT and AIDA
  • Google's Protocol Buffers library, which is fully integrated, so all SCaVis Java data containers can be read or written by C++ programs (or by programs in any other language supported by Protocol Buffers).
  • Data (mainly time series) can be read and saved using ASCII, Gauss, Matlab, Excel formats (PRO edition).
  • DIF - Data Interchange Format (see the DIF description), CSV formats etc.

SCaVis is 100% Java, but the unique feature is that it fully supports many ways to share data between Java and C++ or other programming languages.
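One item in the list above, a map with a disk backend, can be illustrated with a minimal sketch in plain Python. It uses the standard-library shelve module, not a SCaVis class, and the file name and keys are hypothetical. The point is only that the map contents live in a file and survive closing and reopening it.

```python
import contextlib
import os
import shelve
import tempfile

# Hypothetical location of the disk-backed map.
path = os.path.join(tempfile.mkdtemp(), "bigmap")

# Store two entries; the values are pickled and written to disk.
with contextlib.closing(shelve.open(path)) as db:
    db["run1"] = [1.0, 2.0, 3.0]
    db["run2"] = {"mean": 2.0, "entries": 3}

# Reopen the map later and read the entries back from disk.
with contextlib.closing(shelve.open(path)) as db:
    keys = sorted(db.keys())
    run1 = db["run1"]

print(keys)
print(run1)
```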

Input and Output

SCaVis contains several powerful classes for persistent storage of objects (data) in files. Note that many SCaVis objects described in data_structures and histograms have their own methods for file input/output (with or without compression). Any data container for arrays can be initialized from a file (on the disk or from a URL).

For example, let's create a PND object representing a multidimensional matrix. We will initialize this object from a URL. You can examine this file here. The number of columns and rows can be arbitrary.

>>> from jhplot import *
>>> pn=PND('data','http://jwork.org/scavis/examples/data/pnd.d')
>>> print pn.toString()

Run this script and you will see the contents of this container. Now you can project this matrix into 1D, or make an X-Y plot using any column or row. This is explained in the data_structures section.
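To make the idea of an arbitrary number of rows and columns concrete, here is a small sketch in plain Python that does not use jhplot. It assumes the input is text with whitespace-separated numbers, one matrix row per line, and "#" for comment lines. The actual layout of the pnd.d file is not described here, so treat this format, the sample data, and the helper names parse_matrix and column as assumptions.

```python
def parse_matrix(text):
    """Parse whitespace-separated numbers, one matrix row per line.

    Blank lines and lines starting with '#' are skipped; rows may have
    different lengths.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rows.append([float(tok) for tok in line.split()])
    return rows


def column(rows, index):
    """Return the values of one column, skipping rows that are too short."""
    return [row[index] for row in rows if len(row) > index]


# Hypothetical sample data with rows of different lengths.
sample = """# hypothetical sample data
1.0 2.0 3.0
4.0 5.0
6.0 7.0 8.0 9.0
"""

rows = parse_matrix(sample)
third_column = column(rows, 2)
print(len(rows))
print(third_column)
```

Extracting a column in this way is the plain-Python counterpart of projecting the matrix into 1D.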

Still, one can use external classes from the package jhplot.io to write and read data in a persistent form. This will be considered in this section.

Here is a short summary of the input-output classes for data:

  • jhplot.io.Serialized - Write/read any Java object (lists, maps, etc.) using the Java serialization.
  • jhplot.io.HFile - Write/read any Java object in sequential order using the Java serialization.
  • jhplot.io.HFileXML - Write/read any Java object in sequential order using the XML serialization.
  • jhplot.io.PFile - Write/read SCaVis data structures (P1D, P2D, H1D, H2D) in files using Google's Protocol Buffers (cross-platform).
  • jhplot.io.EFile - Write/read SCaVis data structures (P1D, P2D, H1D, H2D) in sequential order into ntuples using Google's Protocol Buffers (cross-platform).
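The idea of writing objects in sequential order and reading them back in the same order can be sketched in plain Python. This is only an analogy: it uses the standard-library pickle module rather than the jhplot.io classes listed above, and the helper names write_objects and read_objects and the file name are hypothetical.

```python
import os
import pickle
import tempfile


def write_objects(path, objects):
    """Append each object to the file, one serialized record after another."""
    with open(path, "wb") as f:
        for obj in objects:
            pickle.dump(obj, f, protocol=2)


def read_objects(path):
    """Read records in the order they were written, until end of file."""
    result = []
    with open(path, "rb") as f:
        while True:
            try:
                result.append(pickle.load(f))
            except EOFError:
                break
    return result


# Hypothetical file name; lists, maps and strings are stored in sequence.
path = os.path.join(tempfile.mkdtemp(), "objects.pkl")
write_objects(path, [[1, 2, 3], {"a": 1.5}, "label"])
restored = read_objects(path)
print(restored)
```

With sequential storage the reader has to consume records in the order they were written, so the writer and the reader must agree on that order.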

SCaVis file formats

SCaVis supports the following formats:

  • Files based on the Java serialization, which include a compressed binary format and human-readable XML formats. The extension is “.jser”. Such files are generated by the HFile class. Generally, all graphical attributes can be saved (which can make the files larger than expected). Read about this in serialized.
  • File formats based on Google's Protocol Buffers. The files can be generated by the PFile class. The extension is “.jpbu”. This is a cross-platform data format, but the data are not human readable. It is also the fastest engine and produces small output files. Read about it in crossplatform.
  • XML-based format with the extension “.jdat”. Such files can be generated by the HBook class. Generally, these files are significantly smaller than with the usual XML serialization, since data blocks do not have XML tags. See hbook_class.

  • External ROOT and AIDA file formats. SCaVis can read and display data from such files, but does not write data in such formats. Read about this in root_aida.

Using a data browser, you can list and read data stored in files with the following extensions:

  • “.jser” (HFile compressed Java serialization). Any Java object can be saved and restored as long as it implements java.io.Serializable. All SCaVis data objects can be saved and restored in this format. Files are JVM independent, meaning an object can be serialized on one platform and deserialized on an entirely different platform.
  • “.jxml” (HFileXML XML serialization). Human readable, but the files are about 5 times larger than *.jser files.
  • “.jpbu” (PFile compressed Google Protocol Buffers serialization with cross-platform support). All data structures from the jhplot package are supported.
  • “.jdat” (HBook human-readable XML format). Good cross-platform compatibility.
  • “.root” (zip-compressed ROOT format)
  • “.aida” or “.xml” (AIDA XML format)

You can open such a data file and plot its contents (or show them as a table) as follows:

  • Start SCaVis IDE.
  • In the toolbar, go to “Plots” and then “HPlot” (or “HPlot3D” for 3D objects). You will see a canvas.
  • In the canvas, select “File” → “Open data file”. Then look for files with the extensions listed above. When you click on a file, you will see a table to the left of the browser. Then select an object and plot it.

DataBrowser

You can open the data browser for any supported file from a Java or Jython script. In this case, the program executes the script and brings up two windows: a canvas and a second window listing the objects inside the input file. The file should have the extension *.jdat, *.jpbu, *.jser, *.root or *.aida. The file can also be located at a URL.
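Before handing a file name or URL to the browser, a script may want to check the extension. The following is a minimal sketch in plain Python; the table BROWSER_FORMATS and the helper describe_format are hypothetical and not part of SCaVis. The extensions and descriptions are taken from the text above.

```python
import os

# Extensions accepted by the data browser, as listed above.
BROWSER_FORMATS = {
    ".jser": "HFile compressed Java serialization",
    ".jpbu": "PFile Protocol Buffers serialization",
    ".jdat": "HBook XML format",
    ".root": "ROOT format",
    ".aida": "AIDA XML format",
}


def describe_format(name):
    """Return a description for a file name or URL, or None if unsupported."""
    extension = os.path.splitext(name)[1].lower()
    return BROWSER_FORMATS.get(extension)


# Hypothetical file names used only for illustration.
print(describe_format("results.jdat"))
print(describe_format("http://example.org/data/run1.JSER"))
print(describe_format("notes.txt"))
```

The lookup is case-insensitive and only inspects the text after the last dot, so a URL with a query string would need extra handling.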

Data objects can be organized in directories and shown in the data browser as trees. This is discussed in hbook_class.

A simple script can bring up the file browser and the canvas. The script itself is available to registered members in the full version of this page.

ASCII formats

The Data Interchange Format (DIF, file extension .dif) and comma-separated values (CSV, file extension .csv) are text file formats used to import and export single spreadsheets. This small example shows how to read DIF files:

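import dif  # pure-Python DIF reader

# Open the DIF file and parse it
f = open("nature04632-s16-2.dif", 'r')
d = dif.DIF(f)
# Print the header, the vectors, and the data
print d.header
print d.vectors
print d.data
f.close()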

Note that this module is pure Python, so this example will not work in Java code. Most data structures supported by SCaVis can be written in the DIF file format. Read more in the DIF file format section, which describes the Java implementation of this format. The CSV format section explains how to write data to CSV files.
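For comparison, here is a minimal CSV round trip in plain Python. It uses the standard-library csv module under standard Python 3, not the Java implementation referred to above, and the file name, column names, and the helpers write_csv and read_csv are hypothetical.

```python
import csv
import os
import tempfile


def write_csv(path, header, rows):
    """Write a header line followed by one line per data row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_csv(path):
    """Read the header and convert every data field to float."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = [[float(value) for value in row] for row in reader]
    return header, rows


# Hypothetical two-column data set.
path = os.path.join(tempfile.mkdtemp(), "points.csv")
write_csv(path, ["x", "y"], [[1.0, 2.5], [2.0, 3.5]])
header, rows = read_csv(path)
print(header)
print(rows)
```

CSV stores everything as text, so the reader has to convert fields back to numbers explicitly, as read_csv does here.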

I/O performance and benchmarks

This section compares the performance of the PFile and HFile classes in read and write modes. The benchmark results and the code are available to registered members in the full version of this page.

Third-party IO classes

There are a lot of other Java-based I/O classes designed for storing and retrieving data. A complete description of how to use Java, Jython and SCaVis for scientific analysis is given in the book Scientific data analysis using Jython and Java, published by Springer-Verlag, London, 2010 (by S.V.Chekanov).

Sergei Chekanov 2010/03/07 17:35

man/io/input_output.txt · Last modified: 2014/09/30 18:06 by admin
CC Attribution-Share Alike 3.0 Unported