Retrieving MINOS Data from FNAL

All the data from the MINOS experiment is archived on tape in the FNAL mass store. The tape robot uses a product called ENSTORE to manage the data and retrieve it from tape. There is a disk cache (DCache) sitting in front of the tape robot that files can be staged into. They remain in the cache for some period of time, meaning that other users can access the file from disk rather than needing to stage it from tape.

Useful monitoring is available at the STKen Enstore web page. To report problems with the system, please contact the FNAL helpdesk either by phoning 840-2345 or by filling out the web form. To ensure that your request gets routed to the correct person, please include the words stken enstore or stken dcache in your request and give as complete a description of the problem as possible. For example, if you get an error message, please include it.

To find out what runs have been taken, use the SAM Database search interface. This will show you all the data files catalogued in SAM. The search also returns the path to the file. This database contains all the raw data and reco data but not CalDet. The Monte Carlo data is not catalogued in SAM yet either, but it will be added soon. If you are running jobs with multiple files on the MINOS cluster or the FNALU batch nodes, you are encouraged to define a SAM dataset and access it via loon. See the SAM tutorial for more details.

  • The far detector raw data is in /pnfs/minos/fardet_data and is organized by the month in which the data were taken. For example December 2001 files can be found in 2001-12.
  • The near detector raw data is in /pnfs/minos/neardet_data and is organized by the month in which the data were taken. For example May 2004 files can be found in 2004-05.
  • The reconstructed data is stored in /pnfs/minos/reco_far and /pnfs/minos/reco_near. It is organized according to the version of production used for processing (e.g. R1.18), the type of output data (cand_data, sntp_data or snts_data) and the month the data was taken.
  • The Monte Carlo data is stored in /pnfs/minos/mcout_data. It is organized according to the version of production used for processing (e.g. R1.18), the detector (far or near) and the type of output data (cand_data, sntp_data or snts_data).
  • The 2002/2003 CalDet detector raw data is in /pnfs/minos/caldet_data and is organized by the month in which the data were taken. For example May 2002 files can be found in 2002-05. The 2001 CalDet data is in /pnfs/minos/caldet_data/2001 and is not divided by month.
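The raw-data layout listed above can be captured in a small shell helper. This is only a sketch: the function name raw_data_dir and its argument order (detector, year, month) are hypothetical, and it covers only the raw-data directories, not the reco or Monte Carlo trees.

```shell
#!/bin/sh
# Hypothetical helper: print the /pnfs/minos directory that holds raw data
# for a detector (far, near or caldet) and a given year and month.
# The body runs in a subshell so its variables do not leak to the caller.
raw_data_dir() (
    detector=$1
    year=$2
    month=$3
    case "$detector" in
        far)
            printf '/pnfs/minos/fardet_data/%s-%02d\n' "$year" "${month#0}" ;;
        near)
            printf '/pnfs/minos/neardet_data/%s-%02d\n' "$year" "${month#0}" ;;
        caldet)
            # The 2001 CalDet data is not divided by month.
            if [ "$year" = 2001 ]; then
                printf '/pnfs/minos/caldet_data/2001\n'
            else
                printf '/pnfs/minos/caldet_data/%s-%02d\n' "$year" "${month#0}"
            fi ;;
        *)
            echo "unknown detector: $detector" >&2
            exit 1 ;;
    esac
)

raw_data_dir far 2001 12
raw_data_dir caldet 2001
```

Stripping one leading zero from the month before formatting means that both 5 and 05 are accepted and that 08 and 09 are not misread as octal numbers.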

Using the Disk Cache command line interface

This requires installation of the dcap package. You can find the tar file and installation instructions on the external packages installation page.

This uses a URL syntax to specify the files to fetch. You will need to know the dcache path for the file. The SAM Database search interface will provide this information if you select the option to produce dccp style output. For example, to copy the raw data file for run 11732 you type

 
setup dcap -q unsecured
dccp dcap://fndca1.fnal.gov:24125/pnfs/fnal.gov/usr/minos/fardet_data/2003-01/F00011732_0000.mdaq.root localfile

There are four ports that you can use to connect. The current list is 24125, 24136, 24137, 24138. This service is accessible off-site but will not work if your site has a firewall in place.
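As an illustration, the URL can be assembled from a /pnfs/minos path and one of the listed ports. This is a sketch under stated assumptions: the function name dcap_url is hypothetical, and the path mapping (/pnfs/minos to /pnfs/fnal.gov/usr/minos) is inferred from the single example above rather than from any documented rule.

```shell
#!/bin/sh
# Hypothetical helper: turn a /pnfs/minos/... path into a dcap URL.
# Host and ports are the ones quoted in the text. The path mapping is an
# assumption inferred from the example command, not a documented rule.
dcap_url() (
    path=$1
    port=${2:-24125}
    case "$port" in
        24125|24136|24137|24138) ;;
        *) echo "unsupported dcap port: $port" >&2; exit 1 ;;
    esac
    case "$path" in
        /pnfs/minos/*) ;;
        *) echo "not a /pnfs/minos path: $path" >&2; exit 1 ;;
    esac
    printf 'dcap://fndca1.fnal.gov:%s/pnfs/fnal.gov/usr/minos/%s\n' \
        "$port" "${path#/pnfs/minos/}"
)

dcap_url /pnfs/minos/fardet_data/2003-01/F00011732_0000.mdaq.root
```

The printed URL is what dccp expects as its source argument. Passing a second argument selects one of the other three ports.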

dccp is not a recommended method for off-site access.

Using the Disk Cache with ROOT/loon

This requires installation of the dcap package and building a TDCacheFile plugin for ROOT. You can find the tar file and installation instructions on the external packages installation page.

You can then access files from the DCache directly from your loon job or from a ROOT prompt.

(An alternative to dcap is available for streaming access to Enstore data. It uses an xrootd server, kept running at Fermilab, to serve MINOS data files to clients running ROOT or loon sessions. This extends dcap-like streaming to clients without dcap access, including remote sites behind firewalls and platforms, such as MacOSX, for which dcap builds are not available.)

Loon job

 loon -bq myscript.C dcap://fndca1.fnal.gov:24125/pnfs/fnal.gov/usr/minos/fardet_data/2003-01/F00011732_0000.mdaq.root

If the file is already in cache then the job starts immediately. If it is only on tape then the file will be staged into the cache and your job will wait until this is complete before it starts to read the file.

Do not list large numbers of files on the command line, because ROOT/loon tries to open them all at the same time, which does not work well with DCache. The same applies to TChain in ROOT: do not use it with DCache. It will try to open all the files in the chain, which will not only cause your job to hang but can also cause a denial of service for other users, because you will use up all the available doors in the system. To read multiple files from DCache, use only the AddFile syntax in JobControl. You should also specify the streams to read; otherwise loon opens all the files to find out which streams are available. See the offline manual for information about JobControl.
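Where each file can be processed independently, one way to avoid long file lists on the command line is to run one loon job per file. The sketch below only prints the commands and executes nothing. The function name loon_commands is hypothetical, and myscript.C is the placeholder macro from the example above. It does not replace the AddFile approach for jobs that must read several files together.

```shell
#!/bin/sh
# Hypothetical helper: print one loon command per dcap URL, so that each job
# opens exactly one file. This is a dry run: nothing is executed here.
loon_commands() (
    for url in "$@"; do
        printf 'loon -bq myscript.C %s\n' "$url"
    done
)

loon_commands \
    dcap://fndca1.fnal.gov:24125/pnfs/fnal.gov/usr/minos/fardet_data/2003-01/F00011732_0000.mdaq.root
```

Piping the output into sh would run the jobs one after another, so only one file is being opened at any time.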

From a ROOT prompt

root [19] TFile *myfile=TFile::Open("dcap://fndca1.fnal.gov:24125/pnfs/fnal.gov/usr/minos/fardet_data/2003-01/F00011732_0000.mdaq.root");

Again this will not work off-site if your site has a firewall installed. TChains do not work well for the reason described above.

Using the Disk Cache FTP interface

The data is also accessible from any machine through the Disk Cache ftp interface. This interface provides read-only access using the usual ftp commands. Please note that issuing the ls command in the ftp client is a resource-intensive operation because of the database lookups, so please do not do this. Either use the SAM Database search interface to find the location of the files that you want to copy, or install the SAM Web Services, which allow you to run SAM commands remotely and copy the files to your local host.

To get a file do the following:

  1. ftp fndca1.fnal.gov 24126
  2. Login as user mindata and use the usual NuMI web password
  3. It may be necessary to put ftp into passive mode if you are behind a firewall. If ftp hangs when you type ls then type the command passive at the ftp prompt before doing any other ftp commands.
  4. The cwd is /pnfs/minos.
  5. cd to the appropriate sub-directory.
  6. To fetch a file type get F00001730_0000.mdaq.root
  7. Contrary to previous advice, do not fetch multiple files using wildcards and mget. As noted above, ls is a very expensive operation, and mget does an implicit ls.
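The steps above can also be scripted so that several named files are fetched without any ls or mget. The sketch below only prints the ftp commands. The function name ftp_fetch_script is hypothetical, the binary line is an addition not mentioned in the steps, and the password is a placeholder that you must supply yourself.

```shell
#!/bin/sh
# Hypothetical helper: print the commands for a non-interactive ftp session
# that fetches explicitly named files from one sub-directory of /pnfs/minos.
# No ls or mget is issued.
ftp_fetch_script() (
    password=$1
    subdir=$2
    shift 2
    printf 'user mindata %s\n' "$password"
    # "passive" toggles passive mode in most command-line ftp clients; drop
    # this line if your client already uses passive mode by default.
    printf 'passive\n'
    # Addition to the steps above: request binary transfers for ROOT files.
    printf 'binary\n'
    printf 'cd %s\n' "$subdir"
    for file in "$@"; do
        printf 'get %s\n' "$file"
    done
    printf 'bye\n'
)

# PASSWORD is a placeholder, not a real credential.
ftp_fetch_script PASSWORD fardet_data/2003-01 F00011732_0000.mdaq.root
```

The output is meant to be piped into the client with auto-login disabled, for example into ftp -n fndca1.fnal.gov 24126.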

From the MINOS Linux Cluster

All of the above DCache methods can be used on the MINOS Linux Cluster and are the preferred method of access because you will avoid a tape mount if the file is already in DCache. As noted above if you are running jobs on multiple files then it is much easier to define a SAM dataset and use that within loon.

Storing Data

Please contact the minos-data mailing list to discuss what you are going to store. Writing to /pnfs/minos has been restricted to node minos01 since 2006 Feb 8, in order to avoid accidental file creation and removal.

The system supports file_families. This groups data together on tapes so that it can be accessed efficiently and can be easily deleted if no longer required. Depending on your needs you may need to create a file_family. We will determine this when you discuss your needs.

If no special file family is required:

  1. setup encp
  2. cd /pnfs/minos/users
  3. Create a directory for your data, e.g. mkdir buckley. You need to belong to group e875 (gid=5111). If this is not your group then please contact helpdesk@fnal.gov and ask to have it changed on FNALU and the MINOS Cluster.
  4. encp ph2le-hh-plug_1.paw.gz /pnfs/minos/users/buckley
  5. If outpath is a directory then inpath can be a list of files.
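As a rough sketch of the steps above, the script below checks for membership of gid 5111 and then prints, without running it, the encp command for one or more input files. The function names has_gid and encp_command are hypothetical, and buckley is the example directory from step 3.

```shell
#!/bin/sh
# Hypothetical helper: succeed if numeric group id $1 appears in the
# space-separated list $2 (for example the output of "id -G").
has_gid() (
    wanted=$1
    for gid in $2; do
        [ "$gid" = "$wanted" ] && exit 0
    done
    exit 1
)

# Hypothetical helper: print (do not run) the encp command that copies the
# given files into /pnfs/minos/users/<dir>. File names are printed unquoted,
# so the output is for inspection rather than for piping into a shell.
encp_command() (
    user_dir=$1
    shift
    printf 'encp'
    for file in "$@"; do
        printf ' %s' "$file"
    done
    printf ' /pnfs/minos/users/%s\n' "$user_dir"
)

if has_gid 5111 "$(id -G)"; then
    encp_command buckley ph2le-hh-plug_1.paw.gz
else
    echo "not in group e875 (gid 5111); contact helpdesk@fnal.gov" >&2
fi
```

Because the output path is a directory, several input files can be passed to encp_command in one call, matching step 5.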
