Reference: Data Access: LFC/LCG (LCG File Catalog and Computing Grid)

Last modified: Fri Oct 31 16:03:10 GMT 2008
Nick West

Overview

For an introduction to GRID data access please see Tutorial: Accessing Storage Elements. For help in selecting the best access protocols see The MINOS UK Data Model.

The LCG File Catalog and Computing Grid tools are an attempt to view the entire GRID as a single logical storage device. At their centre is a file catalogue (LFC - The LCG File Catalogue) that records information for each file, including the locations of its replicas. The lfc-* commands interrogate and update the LFC, and the lcg-utils move data into and out of SEs while maintaining consistency with the LFC.

Stephen Burke: November 2007: "The latest versions of lcg-utils and GFAL can be decoupled from both the LFC and BDII, which is explicitly aimed at use at US sites." Once that is mainstream it might be a way to move data to/from FNAL?

Caution: Before you can use the LFC/LCG you must set up a Short Term Proxy.

Reference: Data Access: The LCG File Catalogue

Caution: We do not use the LFC (see The MINOS UK Data Model: Frontend choice), so you should only read the remainder of this section if you want some background on how the LHC experiments use a global catalogue to interface to their data.

The primary function of the LFC is to provide central registration of data files distributed amongst the various SEs. The catalogue resembles a UNIX directory tree with all the MINOS files below:-

  /grid/minos.vo.gridpp.ac.uk
The catalogue can be listed using the lfc-ls command, but before you can use that you need to set the environment variable LFC_HOST. The easiest way to do that is:-
  setenv LFC_HOST `lcg-infosites --vo minos.vo.gridpp.ac.uk lfc`
after which commands like:-
  lfc-ls -l /grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root
can be used to list files and directories.
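
The setenv line above is csh/tcsh syntax. For Bourne-style shells (sh, bash) a minimal equivalent might look like the sketch below. The function name set_lfc_host is hypothetical, and the sketch assumes that lcg-infosites prints the LFC host name on standard output, as the backquoted form above implies.

```shell
# Sketch: Bourne-shell equivalent of the csh "setenv LFC_HOST ..." line.
# set_lfc_host is a hypothetical helper name, not part of the LCG tools.
set_lfc_host() {
    vo=${1:-minos.vo.gridpp.ac.uk}
    # Ask the information system for the LFC host of this VO.
    host=$(lcg-infosites --vo "$vo" lfc) || return 1
    # Refuse to export an empty value.
    [ -n "$host" ] || { echo "no LFC found for $vo" >&2; return 1; }
    LFC_HOST=$host
    export LFC_HOST
}

# Usage (needs the grid tools and a valid proxy):
#   set_lfc_host && lfc-ls -l /grid/minos.vo.gridpp.ac.uk
```

Wrapping the lookup in a function means a failed query leaves LFC_HOST untouched rather than setting it to an empty string.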

The file names appearing in the /grid structure are called LFNs (Logical File Names), but in order to access data you need a SURL (Storage URL), and the LFC provides a look-up to convert between LFN and SURL. Note that this isn't necessarily a 1:1 mapping. If data is replicated in several SEs, a single LFN can map to multiple SURLs, and if someone has simply made an entry in the catalogue, an LFN could have no corresponding SURL.

The story doesn't end there! A SURL uniquely defines a physical file within a specific SE but is not, in general, sufficient to access the file: as well as identifying the file, an access protocol is needed. For that we need a TURL (Transport URL). While SURLs are in principle invariable (they are entries in the file catalog), TURLs are obtained dynamically from the SURL through the Information System or the SRM interface (for SRM managed SEs). A TURL can therefore change with time and should be considered valid only for a relatively short period after it has been obtained.

There is one final way to refer to a file, by its GUID (Grid Unique IDentifier) of the form:

guid: 40_bytes_unique_string
e.g. guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d
Initially there is a 1:1 mapping between a GUID and an LFN, but it is also possible to define alias LFNs (soft links), and then the mapping is 1:n.
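
The four kinds of name described above can usually be told apart by their prefix. The sketch below is illustrative only: classify_grid_name is a hypothetical helper, the labels it prints are this example's own, and treating every other scheme://... string as a TURL is a simplifying assumption.

```shell
# Classify a grid file reference by its scheme prefix.
# Sketch only: classify_grid_name is a hypothetical helper.
classify_grid_name() {
    case $1 in
        lfn:*)        echo LFN ;;      # logical file name in the catalogue
        guid:*)       echo GUID ;;     # grid unique identifier
        srm:*|sfn:*)  echo SURL ;;     # storage URL inside one SE
        *://*)        echo TURL ;;     # assumption: any other scheme is a transport URL
        *)            echo unknown ;;
    esac
}

classify_grid_name lfn:/grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root
classify_grid_name guid:38ed3f60-c402-11d7-a6b0-f53ee5a37e1d
```

The two calls print LFN and GUID respectively.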

The LFC can associate a comment with each file name. This could be used to record simple meta-data.

Reference: Data Access: lfc-* commands

Caution: We do not use the LFC (see The MINOS UK Data Model: Frontend choice), so you should only read the remainder of this section if you want some background on how the LHC experiments use a global catalogue to interface to their data.

The lfc-* commands interrogate and update the LFC. In general the updating commands should not be used, as that runs the risk of the catalogue getting out of sync with the SEs that it represents. Instead it is far better to use the lcg-utils, as they perform operations jointly on the SEs and the LFC, ensuring that they stay in step. The one exception to this rule is the lfc-mkdir command, which is used to create a directory in the LFC: directories are part of the Logical File Name concept and have no corresponding element in an SE. Indeed you have to use lfc-mkdir whenever you want a new directory; it cannot be done implicitly simply by presenting the LFC with an LFN that refers to the new directory.
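
Because directories are never created implicitly, every missing level of an LFN's parent path needs its own lfc-mkdir. The sketch below only prints the commands that would be needed (a dry run). The helper name lfn_mkdir_plan is hypothetical, and the VO root /grid/minos.vo.gridpp.ac.uk is assumed to exist already.

```shell
# Print one lfc-mkdir command per directory level of an LFN's parent path.
# Dry run only: nothing is executed. lfn_mkdir_plan is a hypothetical helper.
lfn_mkdir_plan() {
    root=${2:-/grid/minos.vo.gridpp.ac.uk}   # assumed to exist already
    dir=${1%/*}                # strip the file name, keep the directory
    rest=${dir#"$root"}        # part of the path below the VO root
    [ "$rest" != "$dir" ] || { echo "not under $root: $1" >&2; return 1; }
    path=$root
    old_ifs=$IFS
    IFS=/
    # Walk the remaining components left to right.
    for part in $rest; do
        [ -n "$part" ] || continue
        path=$path/$part
        echo "lfc-mkdir $path"
    done
    IFS=$old_ifs
}

lfn_mkdir_plan /grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root
```

For the example LFN this prints an lfc-mkdir line for .../nwest and then one for .../nwest/test. A real script would presumably need to skip levels that already exist, for example by checking with lfc-ls first.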

lfc-* equivalents exist for many UNIX file commands and, as far as possible, use the same name prefixed by lfc-. For example: lfc-ln, lfc-ls, lfc-rm, lfc-chmod.

In order to use the lfc-* commands the environment variable LFC_HOST must be set. For example by:-

  setenv LFC_HOST `lcg-infosites --vo minos.vo.gridpp.ac.uk lfc`
Caution:-

Command | Function | Notes

lfc-chmod | Change access mode of an LFC file/directory. |

lfc-chown | Change owner and group of an LFC file/directory. |

lfc-delcomment | Delete the comment associated with a file/directory. | The only way to add/remove user meta-data

lfc-getacl | Get file/directory access control lists. |
  Example:
  lfc-getacl /grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root

lfc-ln | Make a symbolic link to a file/directory. | User responsible for deleting if target deleted
  Example:
  lfc-ln -s /grid/dteam/MyExample/day2/measure2 /grid/dteam/MyExample/interesting/file1

lfc-ls | List file/directory entries in a directory. | -l long format; -c show comment; -R recursive (don't use: expensive!)
  Example:
  lfc-ls -l /grid/minos.vo.gridpp.ac.uk/

lfc-mkdir | Create directory. | The only updating command an average user should need.
  Example:
  lfc-mkdir /grid/minos.vo.gridpp.ac.uk/nwest/test

lfc-rename | Rename a file/directory. |

lfc-rm | Remove a file/directory. | Caution: doesn't remove files from SE.
  Fails for files that still have SURLs
  Needed to remove directories (requires -r)
  Example:
  lfc-rm  /grid/dteam/MyExample/trash

lfc-setacl | Set file/directory access control lists. |

lfc-setcomment | Add/replace a comment. | The only way to add/remove user meta-data
  Example:
  lfc-setcomment /grid/dteam/MyExample/interesting/file1 "Most promising measure"

For more information on individual commands use the info command e.g.

 info lfc-mkdir
This information comes from the LCG-2 User Guide / 7.5 LFC Interaction Commands

Reference: Data Access: lcg-utils

The LCG Data Management tools (usually called lcg-utils) allow you to copy files between UI, CE, WN and an SE, to register entries in the File Catalog and to replicate files between SEs. As explained above, these tools maintain synchronisation between the LFC and the SEs.

However, as explained in The MINOS UK Data Model: Frontend choice, we do not use the LFC, so only a subset of the commands, presented first, is relevant for MINOS.

The lcg-utils use the following environment variables:-

Variable | Type | Setting and use
LCG_GFAL_INFOSYS | Compulsory | IS server; should already be set to: lcgbdii02.gridpp.rl.ac.uk:2170
LCG_GFAL_VO | Optional | VO, used if option --vo omitted. Set to minos.vo.gridpp.ac.uk
LFC_HOST | Optional | LFC host; no need to set, the right one will be taken from the IS

Common options

-v or --verbose | Verbose
--vo vo-name | Specify VO i.e. minos.vo.gridpp.ac.uk
-t time-in-secs | Used by lcg-cr, lcg-del, lcg-gt, lcg-rf, lcg-sd, lcg-rep
  Default 0, i.e. no timeout.
  If a timeout occurs, all actions performed up to that point are undone

Common arguments

MINOS subset of lcg-utils

Command | Function | Notes

lcg-cp | Copy local file to/from SURL (srm:) |
  Example:
  lcg-cp -v \
    srm://srm-minos.gridpp.rl.ac.uk:8443/castor/ads.rl.ac.uk/prod/minos ...
     ... /test/nwest/F00035853_0022.mdaq.root \
    file:./F00035853_0022.mdaq.root
  and
  lcg-cp -v \
    file:./F00035853_0022.mdaq.root \
    srm://srm-minos.gridpp.rl.ac.uk:8443/castor/ads.rl.ac.uk/prod/minos ...
     ... /test/nwest/F00035853_0022.mdaq.root

lcg-del | Deletes one file (either one replica or all replicas). | You must use --nolfc to stop LFC look-up
  Example:
  lcg-del --nolfc srm://srm-minos.gridpp.rl.ac.uk:8443/castor/ads.rl.ac.uk/prod/minos ...
     ... /test/nwest/F00035853_0022.mdaq.root

lcg-ls | Lists directories or files |
  Example:
  lcg-ls -l srm://srm-minos.gridpp.rl.ac.uk:8443/castor/ads.rl.ac.uk/prod/minos ...
     ... /test/nwest
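
When many files are fetched from the same SE, the lcg-cp line shown above can be generated rather than typed. The sketch below is a dry run that only prints the command. The helper name lcg_fetch_cmd is hypothetical, and the SURL prefix in the usage line is a placeholder, not a real MINOS path.

```shell
# Build (but do not run) an lcg-cp command that fetches one file from an SE
# into the current directory. lcg_fetch_cmd is a hypothetical helper.
lcg_fetch_cmd() {
    base_surl=${1%/}             # drop one trailing slash, if any
    rel_path=${2#/}              # drop one leading slash, if any
    local_name=${rel_path##*/}   # file name without its directories
    printf 'lcg-cp -v %s/%s file:./%s\n' "$base_surl" "$rel_path" "$local_name"
}

# Placeholder SURL prefix; substitute the prefix used by your SE.
lcg_fetch_cmd srm://se.example.org:8443/base/dir/ /test/nwest/F00035853_0022.mdaq.root
```

Printing the command first makes it easy to inspect the SURL before any data is moved; piping the output to sh would then run it.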

Caution: The remainder of this document covers commands that deal with replica management and/or keep the LFC in sync and are not used by MINOS. You should only read on if you want some background on how the LHC experiments use these tools.

The utilities are divided into two types: Replica Management (moving data to/from SEs) and File Catalog Interaction (access catalogue in isolation).

Replica Management

Command | Function | Notes

lcg-cp | Copy LFN, GUID or SURL to local file. | Needs VO_MINOS_VO_GRIDPP_AC_UK_DEFAULT_SE if using LFN
  Don't use to copy into an SE (skips catalogue update)
  Example:
  lcg-cp -v --vo minos.vo.gridpp.ac.uk lfn:/grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root \
  file:/tmp/F00024999_0000.mdaq.root

lcg-cr | Copy and Register a file | Returns a GUID
  Example:
  lcg-cr  -v --vo minos.vo.gridpp.ac.uk file:/home/tier1/nwest/grid/scripts/LVJ_F00034638_0000.mdaq.root \
    -l lfn:/grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root \
    -d srm://dcache.gridpp.rl.ac.uk/pnfs/gridpp.rl.ac.uk/data/minos/nwest/test/...
        ...LVJ_F00034638_0000.mdaq.root
  Source file can also be  gsiftp:?file-name
  -d dest_file  either SE host (filename is generated) e.g. dcache.gridpp.rl.ac.uk
             or SURL e.g. see example above
  You must create the directories in the LFC before you start.  However,
  if -d specifies a non-existent directory tree it will be created
  automatically.

lcg-del | Deletes one file (either one replica or all replicas). | Use -a to delete all replicas
  If no replicas remain LFN/GUID also removed.
  Example:
  lcg-del  -v --vo minos.vo.gridpp.ac.uk -a lfn:/grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root

lcg-rep | Replicate a file |
  Example: (not minos - we only have one SE!)
  lcg-rep -v --vo dteam -d lxb0707.cern.ch guid:db7ddbc5-613e-423f-9501-3c0c00a0ae24

lcg-gt | Get TURL for a given SURL | Returns 3 lines of output: TURL, requestID and fileID
  Example:
  lcg-gt srm://dcache.gridpp.rl.ac.uk/pnfs/gridpp.rl.ac.uk/data/...
        ...minos/nwest/test/LVJ_F00034638_0000.mdaq.root dcap
  returns:-
  dcap://dcache-head.gridpp.rl.ac.uk:22125//pnfs/gridpp.rl.ac.uk/...
        ...data/minos/nwest/test/LVJ_F00034638_0000.mdaq.root
  -2147033715
  -2147033714
  It will block while the file is staged if necessary and will lock the file on disk with a timeout (typically 24hrs). There is a limit (300?) on the number of concurrent TURLs. Use lcg-sd to dismiss once I/O is complete.

lcg-sd | Sets file status to "Done" for a given SURL | Use after lcg-gt once finished with file to prevent failure: too many open requests
  Example:
  lcg-sd srm://dcache.gridpp.rl.ac.uk/pnfs/gridpp.rl.ac.uk/data/minos/nwest//...
        ...test/LVJ_F00034638_0000.mdaq.root " -2147033715" " -2147033714" 0
  Note: the quotes and the leading space!
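
Since lcg-gt prints the TURL, requestID and fileID on separate lines, a script can capture them and prepare the matching lcg-sd call. The sketch below is a dry run that reads lcg-gt style output on standard input and prints the lcg-sd command. The helper name lcg_sd_cmd is hypothetical, and the SURL and TURL in the sample are shortened placeholders.

```shell
# Split the three lines printed by lcg-gt (TURL, requestID, fileID) and
# print the matching lcg-sd command. Dry run: neither tool is executed.
# lcg_sd_cmd is a hypothetical helper; it reads lcg-gt output on stdin.
lcg_sd_cmd() {
    surl=$1
    IFS= read -r turl    || return 1   # line 1: TURL (read so the IDs line up)
    IFS= read -r req_id  || return 1   # line 2: requestID
    IFS= read -r file_id || return 1   # line 3: fileID
    # Quote each ID with a leading space, as in the lcg-sd example above,
    # presumably so that negative numbers are not parsed as options.
    printf 'lcg-sd %s " %s" " %s" 0\n' "$surl" "$req_id" "$file_id"
}

# Sample input in the shape lcg-gt returns; the URLs are placeholders.
lcg_sd_cmd srm://se.example.org/data/file.root <<'EOF'
dcap://door.example.org:22125//data/file.root
-2147033715
-2147033714
EOF
```

In real use the here-document would be replaced by the output of lcg-gt, with the I/O on the TURL done between the two steps so that the request is dismissed only once the file is no longer needed.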

File Catalog Interaction

Command | Function | Notes

lcg-aa | Adds an alias LFN for a given GUID. |
  Example:
  lcg-aa --vo dteam guid:baddb707-0cb5-4d9a-8141-a046659d243b lfn:my_new_alias

lcg-ra | Removes an alias LFN for a given GUID. |
  Example:
  lcg-ra --vo dteam guid:baddb707-0cb5-4d9a-8141-a046659d243b lfn:my_alias1

lcg-rf | Register file already in SE. | Returns the GUID
  Example:
  lcg-rf -v --vo dteam -g guid:baddb707-0cb5-4d9a-8141-a046659d243b \
   sfn://lxb0710.cern.ch/flatfiles/SE00/dteam/generated/2004-07-08/...
          ...file0dcabb46-2214-4db8-9ee8-2930de1a6bef

lcg-uf | Unregister file in an SE. | WARNING: Does not remove from SE
  Also removes GUID/LFN if last replica is unregistered.
  Example:
  lcg-uf --vo dteam guid:baddb707-0cb5-4d9a-8141-a046659d243b \
    sfn://lxb0710.cern.ch/flatfiles/SE00/dteam/generated/2004-07-12/...
          ...file04eec6b2-9ce5-4fae-bf62-b6234bf334d6

lcg-la | Lists the aliases for a given LFN, GUID or SURL. |
  Example:
  lcg-la --vo dteam guid:baddb707-0cb5-4d9a-8141-a046659d243b

lcg-lg | List GUID for a given LFN or SURL. |
  Example:
  lcg-lg  --vo minos.vo.gridpp.ac.uk lfn:/grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root

lcg-lr | Lists replicas for a given LFN, GUID or SURL. |
  Example:
  lcg-lr  --vo minos.vo.gridpp.ac.uk lfn:/grid/minos.vo.gridpp.ac.uk/nwest/test/LVJ_F00034638_0000.mdaq.root

For more information on individual commands use the info command e.g.

 info lcg-rf
This information comes from the LCG-2 User Guide / 7.7 File and Replica Management Client Tools
