Script Output Information
This page presents a summary of information concerning the output methods of the production scripts and to a lesser extent job control in general.
For information about concatenated files, see the Concatenated Files section at the end of this page.
Types of Output Files
During standard production there are three main classes of output files: Candidate files, Standard Ntuple files, and Short Standard Ntuple files.
Candidate files contain all of the candidate reconstruction objects and are the largest files produced.
Standard Ntuples have much of the
information reduced into an ntuple form for physics analysis groups to perform later analysis. These files tend to be smaller than the Candidate
files by about an order of magnitude.
Short Standard Ntuples contain the same information as the Standard Ntuples, but with the strip information
removed. This saves close to another factor of five in space. These files are no longer produced by default (as of Sept. 2006).
Each file type can have multiple streams attached to it. Candidate files frequently also contain DaqSnarl or SimSnarl (MC) information.
All data production files also have the Beam Monitoring Ntuples attached, which store information concerning beam data.
Configuring the Output Module
Defining an output module is roughly equivalent to defining an output file. After declaring the module, it is necessary to define what will be
written to that file. By default, an unconfigured Output module will attempt to create a candidate file named out.root. Configuring an
Output module takes three major steps:
1. Declaring the Output Streams
2. Setting the Streams to Output
3. Setting the Output filename
Here is a section of sample code:
// jcm is a generic job control output module, used here as an example
jcm = jc.Path("Reco1").Mod("Output");
jcm.Cmd("DefineStream NtpSt NtpStRecord");
jcm.Cmd("DefineStream NtpBDLite NtpBDLiteRecord");
// The two lines above define the Standard Ntuple and Beam Monitoring Ntuple streams for output
jcm.Set("Streams=NtpBDLite,NtpSt");
// The line above sets the output module to write the two ntuple streams to the file
jcm.Set("FileName=ntupleS.root");
// Finally, the line above sets the output filename
Starting in R1-18, simple functions containing these instructions are included in the production script files to streamline setting the output options.
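As an illustration of the three configuration steps, the sketch below assembles the command strings in plain C++ so the logic can be tried outside the framework. The names StreamDef, OutputConfig and makeOutputConfig are hypothetical and are not the helper functions shipped with the production scripts; only the string formats (DefineStream, Streams=, FileName=) are taken from the sample code above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper types for illustration; not part of the production scripts.
struct StreamDef {
    std::string name;    // stream name, e.g. "NtpSt"
    std::string record;  // record class, e.g. "NtpStRecord"
};

struct OutputConfig {
    std::vector<std::string> cmds;  // strings intended for jcm.Cmd(...)
    std::vector<std::string> sets;  // strings intended for jcm.Set(...)
};

// Build the strings for the three steps: declare the streams,
// select the streams to write, and set the output filename.
// Streams are listed in the order they are given.
OutputConfig makeOutputConfig(const std::vector<StreamDef>& streams,
                              const std::string& fileName) {
    OutputConfig cfg;
    std::string list;
    for (const StreamDef& s : streams) {
        cfg.cmds.push_back("DefineStream " + s.name + " " + s.record);
        if (!list.empty()) list += ",";
        list += s.name;
    }
    cfg.sets.push_back("Streams=" + list);
    cfg.sets.push_back("FileName=" + fileName);
    return cfg;
}
```

In a job macro, each entry of cmds would then be passed to jcm.Cmd and each entry of sets to jcm.Set, in that order.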
How to produce snts files
At the September 2006 collaboration meeting it was decided to no longer write out the snts (short ntuple) files. For future reference, the code is included here:
// Ntuple abridged record
jc.Path.Create("NtpSRFilter",
"NtpStFilterModule::Reco "
"Output::Put "
);
JobCModule jcm = jc.Path("NtpSRFilter").Mod("Output");
jcm.Cmd("DefineStream NtpSt NtpStRecord");
jcm.Set("Streams=NtpSt");
jcm.Set("FileName=ntupleSt.sub.root");
jc.Path.Attach("Reco1", "NtpSRFilter");
What are the standard output files?
The data scripts produce files named in the following format:
CandX.root, ntupleStX.root, ntupleStX.sub.root.
The X will be S for spill or beam MC scripts and A for All or Cosmic scripts.
CandX.root is the standard Candidate file, ntupleStX.root the full Standard Ntuple (NtpSt) file, and ntupleStX.sub.root the trimmed short ntuple.
Near Detector Files
Near Spill Data scripts produce just the three standard output files.
After production they will be renamed to the standard naming scheme as follows:
CandS.root -> N(RUN)_(SUB_RUN).spill.cand.(release).root
ntupleStS.root -> N(RUN)_(SUB_RUN).spill.sntp.(release).root
ntupleStS.sub.root -> N(RUN)_(SUB_RUN).spill.snts.(release).root
Near Beam MC scripts produce just the three standard output files.
After production they will be renamed to the standard naming scheme as follows:
CandS.root -> n(RUN)_(SUB_RUN)_L(beam).cand.(release).root
ntupleStS.root -> n(RUN)_(SUB_RUN)_L(beam).sntp.(release).root
ntupleStS.sub.root -> n(RUN)_(SUB_RUN)_L(beam).snts.(release).root
Near Cosmic Data scripts produce just the three standard output files.
After production they will be renamed to the standard naming scheme as follows:
CandA.root -> N(RUN)_(SUB_RUN).cosmic.cand.(release).root
ntupleStA.root -> N(RUN)_(SUB_RUN).cosmic.sntp.(release).root
ntupleStA.sub.root -> N(RUN)_(SUB_RUN).cosmic.snts.(release).root
Far Detector Files
Far Spill Data scripts produce six output files: the three standard output files for the full data set,
and the same three again for a blinded data set.
After production they will be renamed to the standard naming scheme as follows:
* Closed (Full) Data:
CandSBlind.root -> F(RUN)_(SUB_RUN).spill.bcnd.(release).root
ntupleStSBlind.root -> F(RUN)_(SUB_RUN).spill.bntp.(release).root
ntupleStSBlind.sub.root -> F(RUN)_(SUB_RUN).spill.bnts.(release).root
* Open Data:
CandS.root -> F(RUN)_(SUB_RUN).spill.cand.(release).root
ntupleStS.root -> F(RUN)_(SUB_RUN).spill.sntp.(release).root
ntupleStS.sub.root -> F(RUN)_(SUB_RUN).spill.snts.(release).root
Far MC Beam Data scripts produce just the three standard output files.
After production they will be renamed to the standard naming scheme as follows:
CandS.root -> f(RUN)_(SUB_RUN)_L(beam).cand.(release).root
ntupleStS.root -> f(RUN)_(SUB_RUN)_L(beam).sntp.(release).root
ntupleStS.sub.root -> f(RUN)_(SUB_RUN)_L(beam).snts.(release).root
Far All Data scripts produce just the three standard output files.
After production they will be renamed to the standard naming scheme as follows:
CandA.root -> F(RUN)_(SUB_RUN).all.cand.(release).root
ntupleStA.root -> F(RUN)_(SUB_RUN).all.sntp.(release).root
ntupleStA.sub.root -> F(RUN)_(SUB_RUN).all.snts.(release).root
A more detailed explanation of the MC file naming scheme may be found
here.
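The renaming patterns listed above can be expressed as two small string-building functions. This is only a sketch: dataFileName and mcFileName are hypothetical names, and the run, sub-run and beam fields are passed as preformatted strings because this page does not specify their width or padding.

```cpp
#include <string>

// Hypothetical helper: assemble a data file name of the form
//   (prefix)(RUN)_(SUB_RUN).(sample).(type).(release).root
// where prefix is the detector letter, sample is e.g. "spill" or "cosmic",
// and type is e.g. "cand", "sntp" or "snts".
std::string dataFileName(char prefix,
                         const std::string& run,
                         const std::string& subRun,
                         const std::string& sample,
                         const std::string& type,
                         const std::string& release) {
    return std::string(1, prefix) + run + "_" + subRun + "." + sample +
           "." + type + "." + release + ".root";
}

// Hypothetical helper: assemble a beam MC file name of the form
//   (prefix)(RUN)_(SUB_RUN)_L(beam).(type).(release).root
std::string mcFileName(char prefix,
                       const std::string& run,
                       const std::string& subRun,
                       const std::string& beam,
                       const std::string& type,
                       const std::string& release) {
    return std::string(1, prefix) + run + "_" + subRun + "_L" + beam +
           "." + type + "." + release + ".root";
}
```

For example, dataFileName('N', "RUN", "SUB", "spill", "sntp", "REL") gives NRUN_SUB.spill.sntp.REL.root, matching the Near Spill pattern for ntupleStS.root; the argument values here are placeholders, not real run numbers or release names.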
Concatenated Files
Starting with Cedar processing, an effort is being made to concatenate the standard ntuple files. This should reduce both the total number of files being moved around in dcache and the number of files that people need to locate for access. More information will be reported as it becomes available.
The concatenation uses the hadd utility in ROOT and is currently under the direction of Art Kreymer and Howard Rubin.
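For orientation, hadd is invoked as a target file followed by the source files to merge. The sketch below only builds such a command line as a string; haddCommand is a hypothetical helper, not the production concatenation code, and it assumes file names that contain no spaces or shell metacharacters.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build a command line of the form
//   hadd <target> <source1> <source2> ...
// File names are inserted as-is, without shell quoting.
std::string haddCommand(const std::string& target,
                        const std::vector<std::string>& sources) {
    std::string cmd = "hadd " + target;
    for (const std::string& s : sources) {
        cmd += " " + s;
    }
    return cmd;
}
```

The resulting string could be run from a shell or passed to std::system; the file names used with it are up to the caller.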