Configuring the Monster Script

This script acts as a secretary: it creates subdirectories, copies files, edits text, and submits jobs in a coordinated fashion, so that after the user configures just two text files, a single function call can result in thousands of computers working for days to weeks to calculate up to a combined 10-dimensional parameter scan. There are two steps to configuring MonsterScript.sh: configuring the input file and preparing the run files. An example input file, which contains in-line documentation, can be found at Input_Examples/ParameterScan.inp (see Table 10). Inside the working directory, one must place a properly configured ParameterScan.inp, libmctdhx.so, a binary executable file consistent with $binary defined inside ParameterScan.inp, and an input template consistent with $Input_Template defined inside ParameterScan.inp. Once these four files are properly configured and placed, the script is called by running
$MCTDHXDIR/MonsterScript.sh
The script scans up to 5 user-defined relaxation parameters, runs them until convergence is detected and, if desired, automatically scans each relaxation with up to 5 user-defined propagation parameters. When running on a cluster, the script automatically restarts jobs that end because of time limits before the calculation is complete. To circumvent job number restrictions on clusters, the script builds a series of runscripts that each run many calculations simultaneously in a single job, rather than a single computation per job.
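
Before calling the script, it can help to confirm that the four required files are actually in place. The following is a minimal sketch of such a check, not part of the MCTDH-X distribution: the function name check_workdir is hypothetical, and the binary and template names are passed in by hand on the assumption that they match $binary and the template name set in ParameterScan.inp.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not shipped with MCTDH-X).
# Usage: check_workdir DIR BINARY TEMPLATE
# BINARY and TEMPLATE are assumed to match the names set in ParameterScan.inp.
check_workdir() {
    local dir=$1 binary=$2 template=$3
    local missing=0 f
    for f in ParameterScan.inp libmctdhx.so "$binary" "$template"; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"      # report every absent file, not just the first
            missing=1
        fi
    done
    return "$missing"               # 0 if all four files are present, 1 otherwise
}
```

A possible use is to guard the call itself, for example: check_workdir . MCTDHX_intel MCTDHX.inp && "$MCTDHXDIR/MonsterScript.sh"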

A little under-the-hood knowledge is useful to effectively use and debug MonsterScript.sh. The script initially calls $MCTDHXDIR/Computation_Scripts/ParameterScan_Propagation.sh
or $MCTDHXDIR/Computation_Scripts/ParameterScan_Relaxation.sh depending on the input file configuration, which then iterates through the corresponding parameter set. As the script loops through each parameter set, it calls either
$MCTDHXDIR/Scripts/IterateParameters.sh or $MCTDHXDIR/Scripts/IterateParameters_relax.sh, which then carries out the actual secretarial work using about a dozen smaller scripts inside $MCTDHXDIR/Scripts. If the script determines that a new calculation must be run on a cluster, it adds a few lines to a runscript in the working directory called run#.sh for some number #. To avoid duplicate computations, a file called RunFlag is created in the computation directory; code within the runscript deletes it automatically when the job runs out of time or the computation is complete. The runscript accumulates calculation jobs until the total number of nodes requested reaches a threshold defined by $MPMDNodes in ParameterScan.inp, and then it is submitted using (in most cases) a qsub command within either $MCTDHXDIR/Scripts/MPMDrun_relax.sh or $MCTDHXDIR/Scripts/MPMDrun_prop.sh on line 106. It is often useful to comment out this line for testing purposes, as this prevents the script from submitting any jobs, leaving it to only copy, move, and edit files. Directories with incomplete calculations are stored in the files Relax_Array and Prop_Array*. Directories are removed from these files when their corresponding calculations are complete; once an array is empty, the corresponding ParameterScan_*.sh function call ends.
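
The bookkeeping described above (RunFlag as a duplicate guard, a runscript that fills up until a node threshold is reached) can be sketched roughly as follows. This is an illustration only, not the code in $MCTDHXDIR/Scripts: the function queue_calculation, the placeholder command ./run_calculation, the counters, and the default threshold of 8 are all assumptions, and the actual submission is replaced by a printed message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the runscript bookkeeping; the real scripts differ in detail.

MPMDNodes=${MPMDNodes:-8}   # node threshold per runscript (8 is an assumed default)
run_index=1                 # the "#" in run#.sh
nodes_queued=0              # nodes requested by the current runscript so far

# queue_calculation DIR NODES
# Adds one calculation to the current runscript unless DIR is already claimed.
queue_calculation() {
    local dir=$1 nodes=$2
    if [ -e "$dir/RunFlag" ]; then
        return 0            # already queued or running: skip to avoid a duplicate
    fi
    touch "$dir/RunFlag"
    # The runscript itself removes RunFlag once the calculation stops.
    # ./run_calculation is a placeholder for the real MCTDH-X invocation.
    echo "(cd '$dir' && ./run_calculation; rm -f RunFlag) &" >> "run${run_index}.sh"
    nodes_queued=$((nodes_queued + nodes))
    if [ "$nodes_queued" -ge "$MPMDNodes" ]; then
        echo "wait" >> "run${run_index}.sh"
        echo "would submit run${run_index}.sh"   # the real script submits here (qsub in most cases)
        run_index=$((run_index + 1))
        nodes_queued=0
    fi
}
```

Printing a message instead of submitting mirrors the testing trick mentioned above: the files are still created and edited, but nothing reaches the queue.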

When MonsterScript.sh is run without any arguments, all lingering runscripts, RunFlag files, and arrays are cleared. If the user wishes to keep these files, they must run MonsterScript.sh save.

Table 10: Parameters inside ParameterScan.inp.
Parameter Name | Description | Values
Do_Relax | Specifies whether to do or skip relaxations | "T", "F"
Do_Propagations | Specifies whether to do or skip propagations | "T", "F"
Propagation_Start | Specifies how to start propagations when relaxation is skipped | "BINR", "HAND"
Do_Analysis | Specifies whether to do or skip analysis | "T", "F"
MonsterName | Suffix for job names "MONSTER_$MonsterName" | Any string
runhost | Specifies which cluster is used, or if no cluster is used | "hermit", "hornet", "maia", "bwgrid", "PC"
numnodes | Specifies number of nodes used for relaxation jobs | Integer
MPMD | Specifies whether to run in Multiple Program Multiple Data mode, i.e. multiple computations per submitted job. This only applies to scans on a cluster, and is recommended if more than 20 computations are desired | "T", "F"
MPMDjobs | If using MPMD mode, this specifies how many nodes are requested for each job | Integer
maxjobs | Maximum number of jobs allowed on the queue (20 for hornet and hermit) | Integer
binary | Name of MCTDHX executable in working directory | Usually "MCTDHX_intel"
Relaxation_Template | Name of input template for relaxations in working directory | Usually "MCTDHX.inp"
Propagation_Template | Name of input template for propagations in working directory. If identical to Relaxation_Template, the parameters adjusted for the relaxation are copied; if not, they are not copied | Usually "MCTDHX.inp"
Relaxtime | Time for which relaxations are run | Positive number
NParameters | Number of parameters to scan for relaxations | [1-5]
Parameter# | #=1-5, name of relaxation parameter # | Any parameter found in MCTDHX.inp
List# | #=1-5, specifies if a list of values is used for relaxation parameter # | "T", "F"
Parameter#List | #=1-5, if $List#="T", specifies parameter # values to scan over | Array
Scan#_Start | #=1-5, if $List#="F", specifies beginning of range of relaxation parameter # to be scanned | Number
Scan#_Stop | #=1-5, if $List#="F", specifies end of range of relaxation parameter # to be scanned | Number
Scan#_Step | #=1-5, if $List#="F", specifies step of range of relaxation parameter # to be scanned | Number
MaxNodes | Maximum number of nodes allocated for a single propagation computation | Positive integer
Prop_Time_Final | End time of propagation computations | Positive number
Prop_NParameters | Number of propagation parameters to scan over | [1-5]
Prop_Parameter# | #=1-5, name of propagation parameter # | Any parameter in "MCTDHX.inp"
Prop_List# | #=1-5, specifies if a list of values is used for propagation parameter # | "T", "F"
Prop_Parameter#List | #=1-5, if $Prop_List#="T", specifies propagation parameter # values to scan over | List of appropriate values
Prop_Scan#_Start | #=1-5, if $Prop_List#="F", specifies beginning of scan range for propagation parameter # | Number
Prop_Scan#_Stop | #=1-5, if $Prop_List#="F", specifies end of scan range for propagation parameter # | Number
Prop_Scan#_Step | #=1-5, if $Prop_List#="F", specifies step of scan range for propagation parameter # | Number
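
To make the table more concrete, here is a sketch of what a relaxation-only, two-parameter scan might look like. It assumes that ParameterScan.inp consists of shell-style variable assignments (suggested by the $binary notation, but not confirmed here) and that arrays use bash syntax; the scanned parameter names Npar and xlambda_0 and all numerical values are illustrative placeholders. The documented example in Input_Examples/ParameterScan.inp remains the authoritative reference for the actual syntax.

```shell
# Illustrative fragment only. ASSUMPTION: ParameterScan.inp uses shell-style
# assignments; check Input_Examples/ParameterScan.inp for the real syntax.
Do_Relax="T"                      # run relaxations
Do_Propagations="F"               # skip propagations
Do_Analysis="F"                   # skip analysis
MonsterName="demo"                # jobs would be named MONSTER_demo
runhost="PC"                      # no cluster
binary="MCTDHX_intel"             # executable placed in the working directory
Relaxation_Template="MCTDHX.inp"  # input template placed in the working directory
Relaxtime=20                      # placeholder value

NParameters=2                     # scan two relaxation parameters

Parameter1="Npar"                 # placeholder parameter name
List1="T"                         # parameter 1 is scanned over an explicit list
Parameter1List=(10 20 50)

Parameter2="xlambda_0"            # placeholder parameter name
List2="F"                         # parameter 2 is scanned over a range
Scan2_Start=0.0
Scan2_Stop=1.0
Scan2_Step=0.25
```

Parameter 1 uses the list form (List1="T" with Parameter1List), while parameter 2 uses the range form (List2="F" with Scan2_Start, Scan2_Stop, and Scan2_Step), so both ways of specifying a scan appear side by side.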
     
   
