Running an array of SS3 jobs on OSG
Similar to the array_lm example, this example also sets up running an array job on OSG. As before, we will use a *.txt file to indicate which directories we want to run jobs in as part of our array.
There are a few main differences that serve to illustrate useful modifications to the workflow:
- in this case we will run Stock Synthesis (SS3) on all models in the SS3 testing suite to conduct a retrospective analysis;
- we will run our jobs within a software container;
- lastly, we will define variables to be passed between job submission and job execution scripts to ensure that the correct output is produced.
The OSG/ss3 example can be set up either by cloning the repository (git clone https://github.com/MOshima-PIFSC/NSASS-HTC-HPC-Computing.git) or by stepping through the following code:
Note: throughout this tutorial we are using User.Name as a stand-in for your actual username and project_name as a stand-in for your project. In all cases, replace User.Name with your actual user name and project_name with your specific project name.
1 Build software container
Software containers allow for portable, reproducible research by letting researchers set up a software environment to their exact specifications and run it on any Linux system. The Apptainer container system is widely used across HPC/HTC systems and makes it easy to build a container from a definition file. Running a job within a container means that you are able to replicate an identical software environment in any location with Apptainer installed, regardless of the native operating system, software, and installed packages. The Apptainer container can be built from any Linux machine with Apptainer installed, including the Open Science Grid (OSG) access points. Here we walk through the steps needed to build a Linux (Ubuntu 20.04) container containing Stock Synthesis (version 3.30.22.1), R (version 4.4.0), and the R packages r4ss, ss3diags, data.table, magrittr, and mvtnorm from a definition file, linux-r4ss-v4.def. In this case we will show the steps needed to build the container using the OSG access point as our Linux virtual machine (VM), though this may not be needed if working from an alternative Linux VM.
Note: you will have to change apXX to match your OSG access point (e.g., ap20 or ap21).
The first step is to log onto your OSG access point via ssh using a Terminal/PowerShell window and make a directory, in this case singularity/linux_r4ss, in which to build your container.
ssh User.Name@apXX.uc.osg-htc.org
mkdir -p singularity/linux_r4ss
Using a second Terminal/PowerShell window, navigate to the directory that you cloned the NSASS-HTC-HPC-Computing repo into and upload the definition file (linux-r4ss-v4.def) to the directory you just created on OSG.
scp apptainer/linux-r4ss-v4.def User.Name@apXX.uc.osg-htc.org:/home/User.Name/singularity/linux_r4ss
Back in your first Terminal/PowerShell window, navigate into the directory and build the container. The second line of code below is what builds the Singularity Image File (.sif) and takes two arguments: the name of the output .sif file and the input definition file (.def).
cd singularity/linux_r4ss
apptainer build linux-r4ss-v4.sif linux-r4ss-v4.def
Move the built .sif file to the Open Science Data Federation (OSDF) so that it can be cached and made available to any available HTCondor jobs. Note that if you update the build of the .sif file in any way you should also rename it (e.g., -v5), as caching may not update the file if it has the same name.
mkdir -p /ospool/apXX/data/User.Name/singularity
cp linux-r4ss-v4.sif /ospool/apXX/data/User.Name/singularity
2 Set up data inputs and directories
Given that our example is to run a 4-year retrospective analysis for each of the SS3 test models, the next step is downloading the SS3 test models from the nmfs-stock-synthesis/test-models GitHub repo. Once you've downloaded the test models, copy the models/ directory into a new example directory ss3/inputs/ within the NSASS-HTC-HPC-Computing/examples/OSG/ directory on your machine. If you cloned the NSASS-HTC-HPC-Computing repo, the SS3 test models will already be in the correct location.
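If you would rather fetch the test models programmatically, the following is a minimal sketch in R; it assumes the repository's default branch is named main and that the extracted folder is named test-models-main, so adjust as needed.

# Minimal sketch (assumptions: default branch is "main"; archive extracts to "test-models-main")
zip_file = file.path(tempdir(), "test-models.zip")
download.file("https://github.com/nmfs-stock-synthesis/test-models/archive/refs/heads/main.zip",
              destfile = zip_file, mode = "wb")
unzip(zip_file, exdir = tempdir())
# copy the models/ directory into the example inputs folder
file.copy(file.path(tempdir(), "test-models-main", "models"),
          file.path(proj_dir, "examples", "OSG", "ss3", "inputs"),
          recursive = TRUE)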
For the sake of example, the job array will be set up to run each retrospective peel (e.g., -0 years, -1 year, … , -4 years of data) as an individual job in the job array. We will store the results of each retrospective peel in its own directory. The directories on OSG will be listed in a text file, and we will use this text file to launch jobs on OSG (as a part of the job array) in each of the named directories.
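For reference, a retrospective peel is applied in SS3 by setting the retrospective year in starter.ss to a negative offset. A minimal sketch of how this could be done with r4ss is shown below; the starter-list element name retro_yr is an assumption here and may differ between r4ss versions, and the implementation actually used by this example lives in osg-ss3-example-calcs.r.

# Minimal sketch: apply a retrospective peel by editing starter.ss with r4ss.
# Assumes the starter list stores the retrospective year as 'retro_yr'; check your r4ss version.
library(r4ss)
peel = 2  # e.g., remove the last 2 years of data
starter = SS_readstarter("starter.ss")
starter$retro_yr = -peel
SS_writestarter(starter, dir = ".", overwrite = TRUE)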
Let us define that text file using R.
- Define a relative path; we are starting from the root directory of this project.
Code used to define the relative path:
proj_dir = this.path::this.proj()
- Write a text file containing the path names for where the directories will be on OSG relative to the examples/OSG/ss3/ folder.
Code used to define the job directory structure:
test_models = list.dirs(paste0(proj_dir,"/examples/OSG/ss3/inputs/models/"),recursive=FALSE,full.names=FALSE)
retro_peels = 0:4

# replace '-' with '_' in model names since we will use '-' as a delimiter
if(length(grep("-",test_models,fixed=TRUE))>0){
  test_models_new = gsub("-","_",test_models)
  rename_models_idx = grep("-",test_models,fixed=TRUE)
  for(i in seq_along(rename_models_idx)){
    # create new dir
    dir.create(paste0(proj_dir,"/examples/OSG/ss3/inputs/models/",test_models_new[rename_models_idx[i]]),recursive=TRUE)
    # copy files
    file.copy(paste0(proj_dir,"/examples/OSG/ss3/inputs/models/",test_models[rename_models_idx[i]],"/",list.files(paste0(proj_dir,"/examples/OSG/ss3/inputs/models/",test_models[rename_models_idx[i]]),full.names=FALSE,recursive=FALSE)),paste0(proj_dir,"/examples/OSG/ss3/inputs/models/",test_models_new[rename_models_idx[i]]))
    # delete old dir
    shell(paste0("powershell rm -r ",proj_dir,"/examples/OSG/ss3/inputs/models/",test_models[rename_models_idx[i]],"/"))
  }
  test_models = test_models_new
}

# define scenarios
scenario_df = expand.grid(model=test_models,peel=retro_peels)
scenario_df$run_id = 1:nrow(scenario_df)
scenario_df = scenario_df[,c(3,1,2)]
scenario_df$run_id = ifelse(scenario_df$run_id<10,paste0(0,scenario_df$run_id),as.character(scenario_df$run_id))

# write text file
# define paths relative to the examples/OSG/ss3/ folder
osg_dir_lines = paste0("output/", apply(scenario_df,1,paste0,collapse="-"), "/")
writeLines(osg_dir_lines, con=paste0(proj_dir, "/examples/OSG/ss3/scripts/osg_job_directories.txt"))
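As a quick sanity check (not part of the workflow scripts), you can inspect the directory names that were written to the text file; with a single hypothetical model directory named Simple, the entries would run from output/01-Simple-0/ through output/05-Simple-4/ (run_id-model-peel).

# optional check: print the job directory names written above
readLines(paste0(proj_dir, "/examples/OSG/ss3/scripts/osg_job_directories.txt"))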
3 Prepare job scripts
In order to execute the HTC workflow, instructions are coordinated using four scripts:
- osg-prep.sh: This script prepares files for HTCondor job execution and makes the directory structure specified by osg_job_directories.txt.
- osg.sub: This is the HTCondor job submission script that specifies job requirements and input/output files.
- osg-wrapper-r.sh: This wrapper script controls file input/output to and from the R script osg-ss3-example-calcs.r, executes the R script, conducts job timing, and tidies up the job working directory.
- osg-ss3-example-calcs.r: This is the actual computation script, which modifies the SS3 input files as needed, executes the appropriate SS3 model run, and conducts any needed post-processing of the output within R (a sketch of how it could receive its job-specific variables is shown after this list).
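As noted above, here is a sketch of how the job-specific variables could reach the R script; the actual argument names and order are defined by osg-wrapper-r.sh and osg-ss3-example-calcs.r in the repository, so treat the names below as hypothetical.

# Hypothetical illustration: read job-specific variables passed on the command line by the wrapper.
args = commandArgs(trailingOnly = TRUE)
run_id = args[1]              # e.g., "01"
model  = args[2]              # e.g., "Simple"
peel   = as.numeric(args[3])  # number of years of data to remove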
In osg.sub you will need to change the following before you upload and run the script:
- Line 15: change the access point apXX and User.Name to match your access point and user name.
- Line 27: change project_name to match your project on OSG.
- From within R, compress the OSG/ss3/inputs/ and OSG/ss3/scripts/ directories into a tar.gz file, upload.example-ss3.tar.gz. This reduces the number of steps needed for file transfers.
shell(paste0("powershell cd ", file.path(proj_dir, "examples", "OSG", "ss3","inputs"), ";tar -czf upload.example-ss3-models.tar.gz models/"))
shell(paste0("powershell cd ", file.path(proj_dir, "examples", "OSG", "ss3"), ";tar -czf upload.example-ss3-scripts.tar.gz scripts/"))
shell(paste0("powershell cd ", file.path(proj_dir, "examples", "OSG", "ss3"), ";tar -czf upload.example-ss3.tar.gz inputs/upload.example-ss3-models.tar.gz upload.example-ss3-scripts.tar.gz"))
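Before uploading, you can optionally confirm the archive contents from R; this check is not part of the repository scripts.

# optional check: list the contents of the upload archive
untar(file.path(proj_dir, "examples", "OSG", "ss3", "upload.example-ss3.tar.gz"), list = TRUE)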
4 OSG workflow
- Connect to OSG
As mentioned, access to OSG and file transfer is done using a pair of Terminal/PowerShell windows; we will call them Terminal A and Terminal B. In Terminal A, log onto your access point and create a directory for this example.
ssh User.Name@apXX.uc.osg-htc.org
mkdir -p examples/OSG/ss3/
- Transfer files
Open a second PowerShell terminal in the NSASS-HTC-HPC-Computing directory on your machine. This will be your local workstation; call it Terminal B. Use this terminal window to upload the needed file (examples/OSG/ss3/upload.example-ss3.tar.gz) to OSG via scp. The upload.example-ss3.tar.gz file will be uploaded to the directory you just created, examples/OSG/ss3/.
scp examples/OSG/ss3/upload.example-ss3.tar.gz User.Name@apXX.uc.osg-htc.org:/home/User.Name/examples/OSG/ss3/
- Prepare files and submit job on OSG
In Terminal A, un-tar the files, change the permissions/line endings for the prep script, execute the script, and launch the HTCondor job.
cd examples/OSG/ss3/
tar -xzf upload.example-ss3.tar.gz
tar -xzf upload.example-ss3-scripts.tar.gz
chmod 777 scripts/osg-prep.sh
dos2unix scripts/osg-prep.sh
./scripts/osg-prep.sh
condor_submit scripts/osg.sub
- Download jobs
Once all jobs are completed (or the job has hit its time limit), use Terminal B to download the job output. You can check job progress from Terminal A with condor_q.
scp -r User.Name@apXX.uc.osg-htc.org:/home/User.Name/examples/OSG/ss3/output examples/OSG/ss3/
5 Process results
After results are downloaded they can be processed in R to extract the model run times, time series of estimated biomass for each model run, and Mohn’s rho across retrospective peels for a given model ‘family’.
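For context, Mohn's rho is the average relative difference between each peeled run's terminal-year estimate and the corresponding estimate from the full (peel 0) run. A minimal standalone sketch of that calculation (the example's actual implementation appears in the processing code below) is:

# Minimal sketch of Mohn's rho: est_peel[p] is the terminal-year estimate from the run with
# p years peeled, and est_full[p] is the full run's estimate for that same year.
mohns_rho = function(est_peel, est_full){
  mean((est_peel - est_full) / est_full)
}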
Output processing code:
# iterate over output files and extract quantities
library(data.table)
library(magrittr)
library(r4ss)
output_dirs = list.dirs(paste0(proj_dir,"/examples/OSG/ss3/output/"),recursive=FALSE,full.names=FALSE)
ssb_dt.list = comptime_dt.list = as.list(rep(NA,length(output_dirs)))
ss_output_list = as.list(rep(NA,length(output_dirs)))
names(ss_output_list) = output_dirs

for(i in seq_along(output_dirs)){
  tmp_model = strsplit(output_dirs[i],"-")[[1]][2]
  tmp_peel = as.numeric(strsplit(output_dirs[i],"-")[[1]][3])
  tmp_index = as.numeric(strsplit(output_dirs[i],"-")[[1]][1])

  # check if the End.tar.gz file got created
  if(file.exists(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/End.tar.gz"))){
    # get snapshot of original files in the directory
    tmp_orig_files = list.files(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/"))

    # un-tar if the End.tar.gz file gets made
    shell(paste0("powershell cd ", paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/"), ";tar -xzf End.tar.gz"))

    # check if runtime.txt was produced and extract output
    if(file.exists(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/runtime.txt"))){
      tmp_time = readLines(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/runtime.txt")) %>%
                 gsub(".*?([0-9]+).*", "\\1", .) %>%
                 as.numeric(.) %>%
                 as.data.table(.) %>%
                 setnames(.,".","time")
      comptime_dt.list[[i]] = data.table(id = output_dirs[i])
      comptime_dt.list[[i]]$index = tmp_index
      comptime_dt.list[[i]]$model = tmp_model
      comptime_dt.list[[i]]$peel = tmp_peel
      comptime_dt.list[[i]]$OSG_start = as.POSIXct(tmp_time$time[1],origin="1970-01-01")
      comptime_dt.list[[i]]$OSG_end = as.POSIXct(tmp_time$time[2],origin="1970-01-01")
      comptime_dt.list[[i]]$OSG_runtime = tmp_time$time[3]/60

      # clean-up
      rm(list=c("tmp_time"))
    }

    # if "ss_report.RData" is produced put it into the storage list
    if(file.exists(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/ss_report.RData"))){
      load(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/ss_report.RData"))
      ss_output_list[[i]] = ss_report

      ssb_dt.list[[i]] = ss_report$derived_quants %>%
                         as.data.table(.) %>%
                         .[Label %in% paste0("SSB_", ss_report$startyr:ss_report$endyr)] %>%
                         .[,id := output_dirs[i]] %>%
                         .[,sbo := Value/subset(ss_report$derived_quants,Label=="SSB_Virgin")$Value] %>%
                         .[,yr := sapply(Label,function(x)as.numeric(strsplit(x,"_")[[1]][2]))] %>%
                         .[,.(id,yr,sbo)]

      # clean-up
      rm(list=c("ss_report"))
    }

    # clean-up
    file.remove(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/",setdiff(list.files(paste0(proj_dir,"/examples/OSG/ss3/output/",output_dirs[i],"/")),tmp_orig_files)))
    rm(list=c("tmp_orig_files"))
  } else {
    comptime_dt.list[[i]] = data.table(id=output_dirs[i],index=tmp_index,model=tmp_model,peel=tmp_peel,OSG_start=NA,OSG_end=NA,OSG_runtime=NA)
    ssb_dt.list[[i]] = data.table(id=output_dirs[i],yr=2023,sbo=NA)
  }
  # clean-up
  rm(list=c("tmp_model","tmp_peel","tmp_index"))
}

comptime_dt = rbindlist(na.omit(comptime_dt.list))
ssb_dt = rbindlist(ssb_dt.list) %>% merge(comptime_dt[,.(id,index,model,peel)],.,by="id")
ss_output_list = na.omit(ss_output_list)
# save
fwrite(comptime_dt,file=paste0(proj_dir,"/examples/OSG/ss3/output/comptime_dt.csv"))
fwrite(ssb_dt,file=paste0(proj_dir,"/examples/OSG/ss3/output/ssb_dt.csv"))
# calculate Mohn's rho
unique_models = unique(comptime_dt$model)
retro_dt.list = as.list(rep(NA,length(unique_models)))

for(i in seq_along(unique_models)){
  tmp_model = unique_models[i]

  retro_dt.list[[i]] = data.table(model=tmp_model)
  retro_dt.list[[i]]$type = c("SBO")
  retro_dt.list[[i]]$rho = NA

  if(uniqueN(na.omit(ssb_dt[model==tmp_model])$peel)==5){
    tmp_dt = ssb_dt[model==tmp_model]
    base_dt = tmp_dt[peel==0]
    year_vec = max(base_dt$yr) - 1:4
    bias_vec = rep(NA,length(year_vec))
    # calc Mohn's rho for runs where all models completed
    for(j in 1:4){
      bias_vec[j] = (ssb_dt[model==tmp_model&peel==j&yr==year_vec[j]]$sbo - base_dt[yr==year_vec[j]]$sbo)/base_dt[yr==year_vec[j]]$sbo
    }
    retro_dt.list[[i]]$rho = mean(bias_vec)
    rm(list=c("tmp_dt","base_dt","year_vec","bias_vec"))
  }
  rm(list=c("tmp_model"))
}

retro_dt = rbindlist(retro_dt.list)
fwrite(retro_dt,file=paste0(proj_dir,"/examples/OSG/ss3/output/retro_dt.csv"))
5.1 Job runtime
The 90 jobs run on OSG completed 4.14 hours of calculations (2.76 minutes per job) in an elapsed time of approximately 34 minutes (Figure 1).
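The summary figures above can be recomputed from comptime_dt; a minimal sketch (not part of the repository scripts) is:

# OSG_runtime is stored in minutes; OSG_start/OSG_end are POSIXct timestamps
total_calc_hours = sum(comptime_dt$OSG_runtime, na.rm = TRUE) / 60
mean_job_minutes = mean(comptime_dt$OSG_runtime, na.rm = TRUE)
elapsed_minutes  = as.numeric(difftime(max(comptime_dt$OSG_end, na.rm = TRUE),
                                       min(comptime_dt$OSG_start, na.rm = TRUE), units = "mins"))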
Plotting code:
library(ggplot2)
p = comptime_dt %>%
    .[,.(id,OSG_start,OSG_end)] %>%
    melt(.,id.vars="id") %>%
    .[,variable:=ifelse(variable%in%c("OSG_start"),"start","end")] %>%
    dcast(.,id~variable) %>%
    .[order(start)] %>%
    ggplot() +
    xlab("Time (GMT)") +
    ylab("Job") +
    geom_segment(aes(x=start,xend=end,y=id,yend=id),color="#003087",alpha=0.5,linewidth=2) +
    theme(panel.background = element_rect(fill = "transparent", color = "black", linetype = "solid"),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          strip.background = element_rect(fill="transparent"),
          legend.key = element_rect(fill = "transparent"),
          axis.text.y=element_blank(),
          axis.ticks.y=element_blank())
p
# save plot
ggsave(
"OSG-ss3-elapsed.png",
plot = p,
device = "png",
path = paste0(proj_dir,"/assets/static/"),
width = 8,
height = 4.5,
units = c("in"),
dpi = 300,
bg = "transparent")
6 Example results
6.1 Retrospectives
Retrospective plots of static biomass depletion for the SS3 test models are shown in Figure 2.
Plotting code:
text_dt.list = as.list(rep(NA,uniqueN(ssb_dt$model)))
for(i in seq_along(text_dt.list)){
  tmp_dt = ssb_dt[model==unique(ssb_dt$model)[i]]
  tmp_min_yr = min(tmp_dt$yr)
  text_dt.list[[i]] = data.table(model=unique(ssb_dt$model)[i],yr=tmp_min_yr,sbo=0.2,rho=round(retro_dt[model==unique(ssb_dt$model)[i]]$rho,digits=2))
}
text_dt = rbindlist(text_dt.list)
p = ssb_dt %>%
    ggplot() +
    facet_wrap(~model,scales="free_x") +
    xlab("Year") +
    ylab(expression(SB/SB[0])) +
    ylim(0,NA) +
    geom_hline(yintercept=0) +
    geom_path(aes(x=yr,y=sbo,color=as.character(peel),group=id)) +
    geom_text(data=text_dt,aes(x=yr,y=sbo,label=rho),size=3,hjust = 0) +
    viridis::scale_color_viridis("Peel",begin = 0.1,end = 0.8,direction = 1,option = "H",discrete=TRUE) +
    viridis::scale_fill_viridis("Peel",begin = 0.1,end = 0.8,direction = 1,option = "H",discrete=TRUE) +
    theme(panel.background = element_rect(fill = "transparent", color = "black", linetype = "solid"),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          strip.background = element_rect(fill="transparent"),
          legend.key = element_rect(fill = "transparent"))
p
# save plot
ggsave(
"OSG-ss3-retro.png",
plot = p,
device = "png",
path = paste0(proj_dir,"/assets/static/"),
width = 8,
height = 4.5,
units = c("in"),
dpi = 300,
bg = "transparent")