
Data Model and I/O

Last significant change: 2004/04/18


Introduction

The purpose of this chapter is to describe the model that is used as the basis for MINOS persistable records and the I/O related management of those records.

Persistency and ROOT

Persistence describes the process of making objects permanent beyond their present application for reuse by some later application, such as when a C++ object is written to a file.

The problem of persisting C++ objects to a file and retrieving those objects back into memory is not trivial. Objects may have complex structure (inheritance hierarchies, and data members that are themselves objects or pointers to objects), and C++ has weak built-in support for object persistency.

The solution to this problem is to supplement C++ with a framework that manages its I/O facilities. In the case of MINOS, ROOT provides this framework. In particular, ROOT provides the following persistency tools:
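At its simplest, ROOT-based persistency looks like the following sketch. This is generic ROOT usage rather than MINOS code, and the file, function, and histogram names are purely illustrative: an object attached to an open TFile is serialized to disk with Write() and retrieved by name in a later session.

   // Generic ROOT persistency sketch (illustrative names, not MINOS code)
   #include "TFile.h"
   #include "TH1F.h"

   void persist_example()
   {
      // Write: the histogram is attached to the open file and is
      // serialized to disk when the file is written and closed.
      TFile out("example.root","RECREATE");
      TH1F* h = new TH1F("h","example histogram",100,0.,1.);
      h->Fill(0.5);
      out.Write();
      out.Close();

      // Read back in a later application: retrieve the object by name.
      TFile in("example.root","READ");
      TH1F* hin = 0;
      in.GetObject("h",hin);
      if ( hin ) hin->Print();
      in.Close();
   }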


Data Model

The data model is defined by the classes used to store the data, the data structures used to hold those data objects, and the organizational scheme that places the data in the data structures and the data structures in the files.

Two considerations that have gone into defining a MINOS data model are:

Definitions

The MINOS data model makes use of the following terms:

More on Records

To be supplied.


Input Stream Management

Figure 13.1: Diagram illustrating the layout of MINOS records, trees and files for a typical far detector run.
[figure: datamodel_layout.eps]

Figure 13.1 illustrates the organization of MINOS records in ROOT TTrees and the organization of ROOT TTrees in data files. The diagram also illustrates how the input stream manager synchronizes the records from the different streams and loads them into the MomNavigator (Mom) according to each record's VldContext.

In particular, note that:

Specifying input files

Input files are specified using the JobCInput::AddFile method, which has the signature:
   void AddFile(const char* filename,const char* streamlist="*",int at = -1);
Calling AddFile without the optional streamlist argument applies the specified file to all active input data streams. The user may optionally specify separate data file lists for separate input streams. For example:
   JobC j;
   ...
   j.Input.AddFile("testntp.root","NtpCand"); // to serve ntuple records
   j.Input.AddFile("F00005903_0000.cand.root","Cand"); // to serve candrecords

The AddFile method also supports wildcards in the input data file name, for example,

  j.Input.AddFile("/mydir/F00059*.mdaq.root","DaqSnarl");
  j.Input.AddFile("/mydir/F00059*.cand.root","Cand");

The method:

   j.Input.DefineStream("streamname","treename");
allows the user to specify the data tree to attach to a stream name. This can be useful when reading two streams from two sets of input files where both streams serve data from the same tree, for example:

  j.Input.DefineStream("Rel8Cand","Cand");
  j.Input.DefineStream("Rel9Cand","Cand");
  j.Input.Set("Streams = Rel8Cand,Rel9Cand");
  j.Input.AddFile("/mydir/F00059*.R0.8.0.root","Rel8Cand");
  j.Input.AddFile("/mydir/F00059*.R0.9.0.root","Rel9Cand");

Finally, JobCInput::List() can be used to view the files attached to a given stream, for example,

 j.Input.List();  // to view all files attached to all streams
 j.Input.List("DaqSnarl");  // to view all files attached to the DaqSnarl stream

Identifying the Source of a Data Record

All records are stamped on input with the streamname, filename, treename, and tree index from which they came. (This information is stamped on the record whether it is received through the dispatcher or through offline file access.) The data is stored in a temporary (not persisted) Registry accessible through the RecRecord base class. The following code shows how to access this data given a pointer ``mom'' to a MomNavigator object:
   // Iterate over all objects contained in Mom
   TIter mitr = mom->FragmentIter();
   TObject* object;
   while ( (object = mitr.Next()) ) {
      RecRecord* record = dynamic_cast<RecRecord*>(object);
      if ( record ) {
         char* streamname = 0;
         if ( record->GetTempTags().Get("stream",streamname) )
            cout << "stream " << streamname << endl;
         char* treename = 0;
         if ( record->GetTempTags().Get("tree",treename) )
            cout << "tree " << treename << endl;
         int treeindex = 0;
         if ( record->GetTempTags().Get("index",treeindex) )
            cout << "index " << treeindex << endl;
         char* filename = 0;
         if ( record->GetTempTags().Get("file",filename) )
            cout << "file " << filename << endl;
      }
   }


Output Stream Management

Data is written to output stream(s) in output file(s) via the Output job module provided by the framework. The Output job module is like any other user job module in that it may be inserted at any point in the user's job path. An example:
loon[0] JobC j;
loon[1] j.Path.Create("Demo","DigitListModule::Get Output::Put");
loon[2] j.Path("Demo").Mod("Output").Set("Streams=DaqSnarl,Cand");
loon[3] j.Path("Demo").Mod("Output").Set("FileName=recons.root");
In this example, the user has specified that the Put method of the Output module be invoked at the end of the Demo job path. The output module has then been configured to write data to streams DaqSnarl and Cand, and to store these data streams in output file recons.root.

To view the list of output module configurable parameters and their default values, use the Report() method:

loon[0] JobC j;
loon[1] j.Path.Create("Demo","DigitListModule::Get Output::Put");
loon[2] j.Path("Demo").Mod("Output").Report();
Output configured with: ['Output.config.default' 
                         'AccessMode'=(string)'Recreate' 
                         'AutoSaveBytes'=(int)10000000 
                         'AutoSaveInt'=(int)0 
                         'AutoSaveTime'=(int)0 
                         'DefaultFileName'=(string)'out.root' 
                         'FileName'=(string)'' 
         'Streams'=(string)'DaqSnarl,DaqMonitor,LightInjection,Cand,SimSnarl']
The AccessMode, Streams, and FileName configuration parameters are described in the following section. The AutoSave parameters are described in Section 13.5.2.
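For instance, to append output to an existing file rather than recreating it, the AccessMode parameter can be changed from its default. The sketch below assumes that AccessMode takes the standard ROOT TFile open-mode strings (e.g. Recreate or Update); the file name is illustrative:

 j.Path("Demo").Mod("Output").Set("AccessMode = Update"); // assumed: append to an existing file
 j.Path("Demo").Mod("Output").Set("FileName = recons.root");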

Specifying output streams and files

User-defined output streams

The user may design their own records, and these may be persisted through the job framework output module. The command sequence to do this is:
  j.Path("Demo").Mod("Output").Cmd("DefineStream MyStreamName MyClassName");
  j.Path("Demo").Mod("Output").Set("Streams=DaqSnarl,Cand,MyStreamName,...");
  j.Path("Demo").Mod("Output").Set("FileName=outputfile.root");
where MyClassName is the name of the user's record class and MyStreamName is the name of the user's record stream. (The convention is to name the record class XxxRecord and the corresponding stream Xxx, but this is not enforced.)

The user then needs to write a job module that creates objects of MyClassName and pushes them into Mom. The framework will automatically create a ROOT TTree named ``MyStreamName'' to hold records of type ``MyClassName'' with split level 99. The output module will search Mom for objects of the user's specified type and persist them to this output tree.
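A rough sketch of such a module method is shown below. The class names, the module method signature, and the MomNavigator call used to hand over the record are assumptions that should be checked against the JobControl and Navigation headers, not confirmed interfaces:

   // Sketch only: MyRecord and MyRecordModule are hypothetical user classes,
   // and the JobCModule/MomNavigator calls below are assumed, not verified.
   JobCResult MyRecordModule::Reco(MomNavigator* mom)
   {
      MyRecord* rec = new MyRecord();   // hypothetical user record (derives from RecRecord)

      // Hand the record to Mom; the Output module, configured with stream
      // "MyStreamName", will then find it and write it to the output tree.
      mom->AdoptFragment(rec);          // assumed MomNavigator method name

      return JobCResult::kPassed;       // assumed result convention
   }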

DefineStream has optional arguments to further configure the user's output stream. The full interface is:

  j.Path("Demo").Mod("Output").Cmd(
  "DefineStream streamName className userName inputStreamName splitLevel");
where userName (the TNamed name) and inputStreamName can be used to further refine which objects the output module searches for in Mom. splitLevel adjusts the split level with which the ROOT TTree is created, in the range 0 to 99. The default of 99 splits the tree branch structure to the finest possible level; a setting of 0 stores all object data on a single tree branch.
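For example, to write (hypothetical) MyRecord objects that were read in on the Cand input stream into an unsplit output tree, one might issue the following, where all names are illustrative:

  j.Path("Demo").Mod("Output").Cmd("DefineStream MyStream MyRecord mytag Cand 0");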

Writing data to multiple output files

Writing data to two or more output files requires creating separate job paths, one for each Output module. Each Output module can then be configured differently with regard to its output file. The separate job paths can be appended to form a single job path.
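A minimal sketch of this configuration, with illustrative path, module, and file names, is:

  // Sketch: one path and one Output module per output file (names illustrative)
  j.Path.Create("CandPath","DigitListModule::Get Output::Put");
  j.Path("CandPath").Mod("Output").Set("Streams=Cand");
  j.Path("CandPath").Mod("Output").Set("FileName=cand.root");

  j.Path.Create("SnarlPath","Output::Put");
  j.Path("SnarlPath").Mod("Output").Set("Streams=DaqSnarl");
  j.Path("SnarlPath").Mod("Output").Set("FileName=snarl.root");

The two paths can then be appended so that both Output modules run in a single job path, as described above.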


Salvaging output data from crashed jobs

It's possible to switch on tree ``autosaving'' in the job control script, so that output trees are saved at regular intervals. If a crash occurs during the course of a job, the data in the output trees is then salvageable up to the last autosave. The IoOutputModule configurable parameters allow the user to specify saving the trees at regular intervals determined by the number of tree entries, the amount of time that has passed since the last save, and/or the number of bytes that have been filled in the tree. For example, the user may specify:
jc.Path("Demo").Mod("Output").Set("AutoSaveInt = N");  // save every N entries
and/or:
jc.Path("Demo").Mod("Output").Set("AutoSaveTime = N"); // save every N seconds
and/or:
jc.Path("Demo").Mod("Output").Set("AutoSaveBytes = N");// save every N filled bytes
By default, only AutoSaveBytes is activated, at 10 Mbyte intervals. Note that setting the autosave intervals so that autosaves occur at high frequency will degrade performance: saving on the order of every 100 to 1000 entries is reasonable, but saving every entry is not.

When reading an aborted file with autosaved trees in a subsequent ROOT session, the file can be opened in ``UPDATE'' mode and the recovery is in principle automatic, although sometimes the TFile::Recover method needs to be invoked explicitly. For example:

root[0] TFile* file = new TFile("abortedfile.root","UPDATE");
root[1] file->Recover();
will recover the data stored in each tree up to the last autosave call before the crash. The recovered information is stored in the file, so that the next open of the file will not need to go through the recovery process.

Miscellanea


Rules for writing a Persistable Class

Since ROOT forms the framework for MINOS persistency, it defines the rules that the user must follow when defining a class to be persisted to file. What follows is an abbreviated list of rules to remember when writing a persistable class. A much more detailed description of defining a class can be found in the ROOT documentation.
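As an illustration of these rules, a persistable class typically derives (directly or indirectly) from TObject, provides a default constructor, and carries the ClassDef macro so that a ROOT dictionary can be generated for it (via rootcint and a LinkDef file, or the package's usual dictionary build). The class below is a minimal sketch with purely illustrative names:

   #include "TObject.h"

   class MyHit : public TObject {
   public:
      MyHit() : fPlane(0), fCharge(0) { }            // default constructor required by ROOT I/O
      MyHit(Int_t plane, Float_t q) : fPlane(plane), fCharge(q) { }
      virtual ~MyHit() { }

   private:
      Int_t   fPlane;    // plane number
      Float_t fCharge;   // measured charge

      ClassDef(MyHit,1)  // class version number used by ROOT schema evolution
   };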

