
Generating parameter sweeps


Introduction

It is sometimes useful to vary certain parameters of your simulation to assess their effect on some part of the output. This can be tedious to set up by hand, especially if you wish to cover a number of parameters and their interacting effects.

The script tools/create_param_sweep.py makes this straightforward.

Usage

Take a directory that contains everything you need for a simulation: mesh files, the FLML file, any input files, Python scripts, etc. Next, generate a parameter file in the following format:

NAME; spud_path; colon:separated:parameter:values
NAME2; spud_path2; colon:separated:parameter:values

The values should be given in Python format, e.g. a tensor is [[0,0,1],[3,5,23],[34,6,2]]. The spud path can be copied from Diamond (click on the parameter you wish to vary and select "copy path"). The name is a human-readable label that will be used in the output files.

An example parameter file is:

MinEdge; /mesh_adaptivity/hr_adaptivity/tensor_field::MinimumEdgeLengths/anisotropic_symmetric/constant; [[100,0,0],[0,100,0],[0,0,0.5]]:[[100,0,0],[0,100,0],[0,0,1]]:[[100,0,0],[0,100,0],[0,0,2]]
PertRhoErr; /material_phase::Fluid/scalar_field::PerturbationDensity/diagnostic/adaptivity_options/relative_measure/scalar_field::InterpolationErrorBound/prescribed/value::WholeMesh/constant; 0.01:0.05:0.1
Initial Vel; /material_phase::Fluid/vector_field::Velocity/prognostic/initial_condition::WholeMesh/constant; [0.0,0.0,0.0]:[0.0,0.1,0.0]

This varies a tensor (the minimum edge length), a scalar (the interpolation error bound on the perturbation density), and a vector (the initial velocity field).
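
A quick way to sanity-check the size of the sweep before generating it is to count the combinations yourself. The sketch below assumes the script expands every combination of the listed values (a full Cartesian product), which is an assumption rather than something stated above, and that the parameter file is called param_file.txt:

# Minimal sketch (assumption: the sweep covers every combination of the
# listed values, i.e. a Cartesian product of the parameter lists).
import itertools

with open("param_file.txt") as f:
    lines = [line.strip() for line in f if line.strip()]

# Each line is "NAME; spud_path; colon:separated:values".
value_lists = [line.split(";")[2].split(":") for line in lines]

runs = list(itertools.product(*value_lists))
print(len(runs))  # 3 * 3 * 2 = 18 for the example file above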

You can then run the script:

./create_param_sweep.py template/ sens_test/ param_file.txt

where template/ is the directory you set up, sens_test/ is where you want all the files to be created (it does not need to exist already), and param_file.txt is your parameter space file.

When the script completes, you will have the following directory structure:

output_dir/
  template/ (copied in if not already here)
  runs/
    1/
      run.flml
      other files
    2/
    3/
    directory_listing.csv

directory_listing.csv records which parameter set each numbered directory contains. Each run lives in its own separate, numbered directory.
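
For post-processing it is handy to pair each run directory with the parameter values it was given. A minimal sketch, assuming the sens_test/ output directory from the command above, and assuming each row of directory_listing.csv starts with the directory number followed by the parameter values (the exact column layout is an assumption; adjust it to match what the script actually writes):

# Minimal sketch: map run directories to their parameter sets.
# Assumption: each CSV row is "<directory number>, <parameter values...>".
import csv
import os

runs_dir = "sens_test/runs"
with open(os.path.join(runs_dir, "directory_listing.csv")) as f:
    for row in csv.reader(f):
        run_number, parameters = row[0], row[1:]
        run_path = os.path.join(runs_dir, run_number)
        print(run_path, parameters)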

These are then easy to run on a cluster via an array job. For example, on CX1, use the following script:

#!/bin/bash

#PBS -N kp_sens
# Time required in hh:mm:ss
#PBS -l walltime=72:00:00
#PBS -l select=1:ncpus=1:mem=1800mb

cd $PBS_O_WORKDIR/$PBS_ARRAY_INDEX

module load fluidity-cx1

/work/jhill1/cx1_fluidity/bin/fluidity gls-MixedLayer_periodised.flml

This script is submitted as:

qsub -J 1-34 array.pbs

for a sweep where the directories go from 1 to 34. The directory number is stored in the environment variable $PBS_ARRAY_INDEX. The command above submits all 34 jobs under the same job name, each identified by its own $PBS_ARRAY_INDEX. They will simply run when there is space on the machine, like any other job. The jobs can be parallel or serial.

A useful command for monitoring the jobs is qstat, which will show your job as:

[jhill1@login-1 jhill1]$ qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
5901636[].cx1       LakeBala         jhill1            00:00:00 R q48   

The [] indicates an array job. qstat -J limits the list to job arrays only, while qstat -t lists the full details of a job, including, for arrays, the status of each individual sub-job. Combined with the -J flag, this confines the output to your array jobs only.
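
Once the array has finished, it is worth checking that every run actually produced output before post-processing. A minimal sketch, again assuming the sens_test/ output directory from above and that each successful Fluidity run writes a .stat file into its run directory (the output file name is an assumption; substitute whatever your simulation actually writes):

# Minimal sketch: list run directories that do not yet contain a .stat file.
# The .stat file name is an assumption; substitute your simulation's output.
import glob
import os

for run in sorted(glob.glob("sens_test/runs/[0-9]*")):
    if os.path.isdir(run) and not glob.glob(os.path.join(run, "*.stat")):
        print("no output yet in", run)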
