Database Distribution: Master Site Operation

Contents


Preparation

The first step is to think of a unique update name that expresses that this is an update from a specific database. The updates names FNAL, SOUDAN and CALDET will be the update names for the primary feeds from the 3 database sources. Secondary feeds should choose one of these name and the qualify it e.g. SOUDAN_CVS.

The system produces update files using the naming convention:-

    <update_name>_nnnnnnnn.dbm    nnnnnnnn = update number
e.g.:-
    FNAL_00000123.dbm
If you have followed the instructions in the section Slave: Preparation then you most of the work has been done and you only need to make changes as follows
  1. local.config
    Modify this as follows:-
    1. Set dbm_update_names to be the single update name your database is exporting e.g.:- "FNAL"
    2. Set dbm_mode to:- export

    3. Set dbm_command_options to the list of tables to be exported e.g.:- --Table=* This example selects all tables. If you include the '!' character, it must be preceded by a backslash e.g.:- --Table=*,\!DBU*,DBUSUM* This example selects all tables except those starting DBU unless they are DBUSUM*.

    If you want to set up multiple export jobs, or want to do both export and import, then create multiple config files with different names. e.g. local_export.config, local_import.config.

  2. run_import.sh
    Rename the file run_export.sh. If you want to set up multiple export jobs, or want to do both export and import, then create multiple scripts, one for each config file.

  3. run_checksum.sh
    No changes are necessary, unless you have renamed the config file.

  4. FNAL_import.context
    Rename that to:-
        <update_name>_export.context    e.g. FNAL_EXPORT.context
    
    Change the Update No: to be one less than the first update e.g. 0 if starting a new series, and add two further lines to the context file that should both have the current date time e.g:-
    Export last date:      2003-01-09 12:32:24
    Export previous date:  2003-01-09 12:18:15
    
    If you want to set up multiple export jobs then create multiple context files with different names. e.g. FNAL_export.config, FNAL_CVS_export.config.

  5. FNAL_import.log
    Rename that to:-
        <update_name>_export.log    e.g. FNAL_EXPORT.log
    

Distributing Fixup Files

The principle function of DBMauto is to distribute recently inserted data from a Master database to all its Slaves. However it can also distribute "raw SQL" which could be used, for example, to remove unwanted data or change the format of an existing table. To distribute an update:-
  1. Check the SQL fixup carefully - it will end up be executed at every Slave site!.
  2. Create a file with the extension name .fixup and store in it the required SQL commands. The file may also contain blank lines and comments (# in column 1), for example:-
        cat good_bye_cruel_world.fixup
    
        #   So they think being a Database Manager isn't a responsible job !!
    
        DROP DATABASE OFFLINE;
    
    only kidding, but it makes the point about being careful!
  3. Include in the fixup an update to the INSERTDATE of any modified SEQNO:-
      update ?tableVLD set INSERTDATE =  NOW() where SEQNO = ?;
    
    This will ensure that INSERTDATE reflects that date when the fixup was inserted and also means that the Validation System will recompute the SEQNO checksum.
  4. Place the file in the work directory of the Master site.
  5. The next time the export job run starts it will collect up all .fixup files, extract any sql lines and copy them to the update file. At the end of the export job, the fixup files are renamed to .fixup_done so that they will not be reprocessed.
  6. Note that the fixup file will not be played at the Master site; there the correction has to be made by hand. At Slave sites it will be ignored unless the --EnableSQL option is supplied on the import command.
The same system can also be used to rebroadcast individual SEQNOs which can be useful if data was inserted into a Master site without obeying the rules on insertion dates or to plug a gap in at a Slave site. Simply insert commands of the format:-
  BROADCAST: <table> <seqno>
e.g.
  BROADCAST: CALCHARGEINJECT 200000001
into the .fixup file. To rebroadcast a band of SEQNOs specify the lower and upper limits plus a maximum number (as a safety check):-
  BROADCAST: <table> <seqno_lo> <seqno_hi> <max_num>;
e.g.
  BROADCAST: CALCHARGEINJECT 200000001 200000010 10
All SEQNOs in the band should exist.

BROADCAST and SQL commands can be mixed together in a .fixup file.


Making Updates by Hand

The recommended way to make updates is via DbiWriter (see User Manual) as that ensures the Rules on SEQNO, INSERTDATE and CREATIONDATE will be followed, but there are times when someone comes with an update that is just a mysql script and then there is some work to be done.

To illustrate the steps I will take a request made by Phil Adamson. He supplied a mysql script that looked like this:-

INSERT INTO CALLINEARITYVLD VALUES (400002067,'2002-02-16 14:49:15','2002-09-16 13:45:17',4,1,2,1,'2002-01-26 14:49:15','2013-06-02 13:40:13');
INSERT INTO CALLINEARITYVLD VALUES (400002068,'2002-02-16 15:08:29','2002-09-16 14:04:44',4,1,2,2,'2002-01-26 15:08:29','2013-06-02 13:40:13');
INSERT INTO CALLINEARITYVLD VALUES (400002069,'2002-02-16 15:40:03','2002-09-16 14:24:08',4,1,2,3,'2002-01-26 15:40:03','2013-06-02 13:40:13');
INSERT INTO CALLINEARITYVLD VALUES (400002070,'2002-02-16 15:59:17','2002-09-16 14:43:36',4,1,2,4,'2002-01-26 15:59:17','2013-06-02 13:40:14');
INSERT INTO CALLINEARITYVLD VALUES (400002071,'2002-02-16 16:19:37','2002-09-16 15:03:00',4,1,2,5,'2002-01-26 16:19:37','2013-06-02 13:40:14');
INSERT INTO CALLINEARITYVLD VALUES (400002072,'2002-02-16 16:38:52','2002-09-16 15:22:29',4,1,2,6,'2002-01-26 16:38:52','2013-06-02 13:40:14');

INSERT INTO CALLINEARITY VALUES (400002067,1,2,49,1074049038,7,0,0,1,0,6864.51,6864.51,1,-3.48743e-05,7453.36,7441.27,0.979464,2.23998e-05,8036.26,8019.81,0.992521,-4.54651e-05,9
241.58,9150.06,0.937721,-7.52076e-05,10380,10120.1,0.852105,-5.13853e-05,13834.1,12450.3,0.674613,0,202593,0,0,0,0,0,0,0,0,0,0,0);

.... (lots of data)

INSERT INTO CALLINEARITY VALUES (400002072,6,2,2850,1089253533,8,0,0,1,0,6118.64,6118.64,1,1.0966e-05,7215.16,7228.35,1.01202,-5.13833e-05,8377.31,8335.07,0.95231,-5.21831e-05,9573.04,9399.17,0.889912,-6.92531e-06,10764.6,10449.7,0.881661,-6.55443e-05,11958.8,11409.1,0.803388,-4.67877e-05,23962.9,14311,0.241741,0,988007,0,0,0,0,0,0,0);
Here are the steps that were followed:-
  1. It will be inserted into the Master DB at minos-db1. The first thing to check is that those SEQNOs are not already in use. On minos-db1:-
      mysql -h offline `hostname` -u reader -p
      select * from CALLINEARITYVLD where  SEQNO between 400002067 and 400002072;
      select * from CALLINEARITY where  SEQNO between 400002067 and 400002072;
      quit
    
    shows that there is no data. In fact Phil knows they are O.K., as this is CalDet data and the CalDet Master DB, which issues 4 series SEQNOs, is no more. In general the appropriate GLOALSEQNO (or GlobalSeqNo) table needs to be checked and updated - see Rules on SEQNO, INSERTDATE and CREATIONDATE.

  2. The next thing to do is to ensure that the data is propagated by setting the insertion dates to the far future (greater than 1 January 1st 2030). Phil has only set them to '2013-06-02' so those dates need to changed to '2031-01-01 00:00:00' in the scripts.

  3. Now the creation dates must be checked to ensure that they overlay properly. Phil is an old hand at this so the dates are right - see Overlay Creation Dates in the User Manual.

  4. The data can now be entered into the database:-
      mysql offline -h  `hostname` -u writer -p
      \. macro-file
      quit
    

  5. All that remains is to create a DBILOGENTRY that describes this change. Phil has supplied the text:-
    Hacked up CALLINEARITY for 2002 T7 runs by cloning 2002 T11 run data. 
    Will supercede with real data when available.
    
    1. So this comment is put into a temporary file - say reason.txt.

    2. The TSQL environmental variables are set up:-
        setenv ENV_TSQL_UPDATE_PSWD ...
        setenv ENV_TSQL_UPDATE_USER writer
        setenv ENV_TSQL_UPDATE_URL  mysql://minos-db1.fnal.gov/offline
      

    3. dbmjob is run and the DBILOGENTRY entry created:-
        dbmjob -b
        logentry  CALLINEARITY '@reason.txt' 400002067 --SeqNoMax 400002072
      

Deleting SEQNOs

In general replacing existing data is achieved by adding new SEQNOs with later CREATIONDATEs and allowing the standard DBI context query to resolve ambiguity in favour of the new entries. This allows roll-back to the older data if required. Occasionally though, when the data is plain wrong, it is necessary to remove it. The original way this was carried out was to remove the SEQNOs (both the VLD and main table entries) in the Primary Master database and then distribute a fixup file to remove the SEQNOs from Slave sites.

This method has a number of weaknesses:-

To address these weaknesses a new system is being introduced which propagates expiring SEQNOs. It works as follows:-

  1. To expire a SEQNO the following changes are made in any Master database:- So a typical command might look like:-
    update  ???VLD 
      set TIMESTART    = '2038-01-01 00:00:00',
          TIMEEND      = '1970-01-01 00:00:00',
          TASK         = 999999999,
          INSERTDATE   = '2031-01-01 00:00:00',
          CREATIONDATE = now() + interval 7 day
      where SEQNO = ???;
    

  2. The expiring SEQNO propagates throughout the database network.

  3. When checksumming runs at any site the expiring SEQNOs are identified and an "expiry record" is added to an external file (expiring_seqnos.dat in the work directory). A mail is sent to the database contact telling them that the SEQNO will expire shortly.

  4. When the expiry date is reached the SEQNO is deleted and a second mail sent to the database contact.

  5. When the checksum analysis code detects a conflict it looks to see if the SEQNO is expiring and if it is:-

  6. After a further period of time the expiry record is removed from expiring_seqnos.dat.

Drip Feeding

DBMauto is designed to distribute relatively small updates ( < 100MB per update) which should be sufficient for typical database operation. Occasionally much larger changes are made when introducing new tables or reprocessing all the data used to derive existing ones. Then the updating system can be overwhelmed. In this case a drip-feed system can be used to feed them into the distribution system more slowly as follows:-

  1. In the exporter .config, stop exporting the table. If distributing is going to take days, it might be as well to suppress checksumming until everything is back in sync.

  2. Make a note of the last SEQNO exported. Call this SEQNO_START

  3. Generate the table data and note the last SEQNO created. Call this SEQNO_END

  4. In the exporter .config, start exporting the table.

  5. Decide how many SEQNOs to distribute per update. Aim to have ~ 50MB i.e. 1/3 of number in the column "SEQNOs/150MB" Database: Size Report. Call this MAX_SEQNO.

  6. In the work directory create a file
      drip_feed.dat
    
    and place in it an entry of the form:-
         ?table SEQNO_START SEQNO_END MAX_SEQNO
    e.g. DCS_HV 200421000 200425999 2500
    
    If required the file can have multiple entries to drip feed from more than one sequence range in a table or to drip feed multiple tables.

  7. Each time the exporter runs it checks for this file and if present sets up a BROADCAST fix-up command (see Distributing Fixup Files) to drip feed the next section of the first entry. It then corrects this first entry, to reflect the amount still to do, or removes it entirely if the entry is complete and updates drip_feed.dat ready for the next export job.

  8. Caution: If the export job fails for any reason the drip feed file is not reset. To fix, inspect the export log which will contain entries like this:- log
         2006-08-02 11:27:40 Drip feed: BROADCAST: DCS_HV 200421000
         200425999 5000
    
    and move any failing entries back into drip_feed.dat

Exporting Updates with Local SEQNOs

It is sometimes convenient to develop new data in a non-authorising database using local SEQNOs. Then, once the data has been validated, it can be exported with global SEQNOs using dbmjob, so long as the cascade contains an authorising database.

For the purposes of demonstration suppose the following:-

Proceed as follows:-
  1. Set up the cascade so that the development database is at entry 0 and the authorising is at entry 1:-
      setenv ENV_TSQL_URL   "mysql://dev_database_server/dev_database;mysql://auth_database_sever/auth_database"
      setenv ENV_TSQL_USER  "dev_writer;auth_writer"
      setenv ENV_TSQL_PSWD  "dev_password;auth_password"
    

  2. Run dbmjob and type:-
      export --Table MY_TABLE --File dbm.tmp --SeqNo 12345
    
    This exports the required SEQNO reassigning it a global SEQNO as it does so. The new SEQNO is recorded in the development database so that if it is exported again it won't be assigned another global SEQNO.

    If multiple SEQNOs are to be exported, separate them with commas. If all are to be exported, just drop the --SeqNo option.

    The exported database update file (dbm.tmp) can now be imported to the authorising database at entry number 1 using:-

      import --File dbm.tmp --DatabaseNumber 1
    

Adding a New Table

  1. Check that table is properly indexed
       o  VLD must have 
    
          o SEQNO as a primary key.
    
          o TIMESTART and TIMEEND as an index.
              alter table xxxVLD add index(TIMESTART), add index(TIMEEND);
            
    
       o  The main table must have SEQNO and ROW_COUNTER as a primary key.
    

  2. At FNAL in Master DB update DB_TABLE_ATTRIBUTES:-
       o   insert into DB_TABLE_ATTRIBUTES values( 700000001,NULL,"xxx",diff_thr,full_thr,seqno_gap,max_slice);
           e.g. insert into DB_TABLE_ATTRIBUTES values( 700000001,NULL,"xxx",6,20,1,100000);
    
       o   update DB_TABLE_ATTRIBUTESVLD set creationdate = now(),
                                          insertdate = '2031-01-01 00:00:00';
    

  3. Review and if necessary update:-

  4. Update drop down list in:-

    /afs/fnal.gov/files/expwww/numi/html/minwork/computing/find_db_update.html

    by editing

    /afs/fnal.gov/files/expwww/numi/cgi-bin/find_db_update.cgi


Return to the top-level Database Distribution document
Last Modified: $Date: 2008/01/03 05:59:15 $