Working with Open Science Grid

Authors

Nicholas Ducharme-Barth

Megumi Oshima

Last updated

2024-11-06

1 Open Science Grid

The Open Science Grid (OSG) is an example of a high-throughput computing (HTC) environment where jobs are scheduled and run using HTCondor. OSG is free to use for US-based government staff conducting research or education-related work, and HTCondor is well suited to running many short, non-sequential, single-core jobs. Helpful documentation can be found on the OSG website, and the following information is based on that documentation.

Coding alert!

In the following code you will need to, where necessary:

  1. Replace User.Name with your OSG user name.

  2. Replace osg.project_name with your OSG project name.

  3. Replace apXX to match your OSG access point (e.g. ap20 or ap21).
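One way to handle these substitutions is to set them once as shell variables and reuse them. The sketch below assumes a bash-like shell (not PowerShell), and the variable names osg_user, osg_ap, and osg_host are hypothetical, chosen only for illustration. It prints the commands instead of running them, so it is safe to try before you have an account.

```shell
# Hypothetical variable names; edit the two values once, then reuse them.
osg_user="User.Name"   # your OSG user name
osg_ap="apXX"          # your access point, e.g. ap20 or ap21

# Address used by both ssh and scp in the sections below.
osg_host="${osg_user}@${osg_ap}.uc.osg-htc.org"

# Print the commands rather than running them.
echo "ssh ${osg_host}"
echo "scp my_file.txt ${osg_host}:/home/${osg_user}/"
```

Once the values are correct, drop the echo and the quotes to run the commands for real.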

2 Connecting to OSG via ssh

Access to OSG and file transfer are done using a pair of Terminal/PowerShell windows. In the first window, log in to your access point. You will be prompted for your passphrase (password) and then for a verification code generated by the Google Authenticator app.

ssh User.Name@apXX.uc.osg-htc.org

If you see the following, you have connected successfully:


                *** Unauthorized use is prohibited. ***

      If you log on to this computer system, you acknowledge your
  awareness of and concurrence with the OSG Acceptable Use Policy; see
             https://www.osgconnect.net/aup or /etc/osg/AUP

                              ***

              Cite the OSPool in your publications!
                https://osg-htc.org/acknowledging

                              ***

               OSPool Office Hours are twice a week:
    Tues 4-5:30pm ET/1-2:30pm PT, Thurs 11:30am-1pm ET/8:30-10am PT
             Zoom link: https://osg-htc.org/OfficeHoursZoom

***********************************************************************

3 Transferring files via scp

3.1 Move a file to OSG

If you want to transfer a file (for example, my_file.txt) from your local machine to OSG, open a terminal in the directory containing my_file.txt on your local computer and run scp with the source file name and destination path as shown:


## format is scp <source> <destination> 
scp my_file.txt User.Name@apXX.uc.osg-htc.org:/home/User.Name/

Note that you do not need to log in to OSG first; you will be prompted for your passphrase and then a verification code before the file transfers. You will need to re-enter this information each time you transfer a file.

3.2 Move multiple files at once

To transfer many files more efficiently, you can compress them into a single tarball (.tar.gz file). For example, to bundle all of the input files for an array of linear model jobs so that they can be uploaded at once, in R, run:

system(paste0("powershell cd ",file.path("examples", "OSG", "array_lm"),";tar -czf inputs.tar.gz inputs"))

to create a tarball called inputs.tar.gz. To do this in a terminal, open a terminal window in the directory above the files to compress (e.g. examples/OSG/array_lm) and run:

tar -czf inputs.tar.gz inputs
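Before uploading, it can help to check what the tarball contains, and after uploading it needs to be unpacked on the access point. The following is a minimal sketch of both steps, run in a scratch directory so that it works anywhere; the file names job_1.csv and job_2.csv and the unpacked directory are placeholders, not the real array_lm inputs.

```shell
# Work in a temporary directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Placeholder input files standing in for the real inputs directory.
mkdir inputs
echo "x,y" > inputs/job_1.csv
echo "x,y" > inputs/job_2.csv

# Compress the inputs directory (same command as above).
tar -czf inputs.tar.gz inputs

# List the archive contents without extracting, to confirm what will be uploaded.
tar -tzf inputs.tar.gz

# Unpack the archive. On OSG this would be run on the access point after scp;
# here a separate directory mimics that step.
mkdir unpacked
tar -xzf inputs.tar.gz -C unpacked
ls unpacked/inputs
```

The -t flag lists, -x extracts, and -C sets the directory to extract into; -z and -f have the same meaning as in the compression command.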

3.3 Move files back to local machine

To transfer files from OSG back to your local machine, use the same command as above but swap the source and destination paths. From the same local terminal as before (or from the directory where you want the file saved), run:


scp User.Name@apXX.uc.osg-htc.org:/home/User.Name/my_file.txt ./

where ./ is the current directory on your local computer. Alternatively, you can specify a path relative to the current directory (e.g. ./path/to/files).

4 Transferring large files to OSDF

For large input and output files, including containers, it is recommended to use the Open Science Data Federation (OSDF). To see exactly which OSDF origins to use, see the OSG guidance on Where to Put Your Files. If you are working with a container, you can upload it and move it into place using the following commands.

From a terminal in the directory where the container file (linux-r4ss-v4.sif) is stored on your local computer, run:

scp linux-r4ss-v4.sif User.Name@apXX.uc.osg-htc.org:/home/User.Name

Again, you will be prompted for your passphrase and verification code. Then, in a second terminal that is logged in to OSG, move the container to the OSDF location:

mv linux-r4ss-v4.sif /ospool/apXX/data/User.Name
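For a large file such as a container, one optional check is to compare checksums before and after the transfer. This is not a step from the OSG documentation, only a sketch of the idea. It assumes sha256sum is available (standard on most Linux systems, including typical access points); a small dummy file and a local data directory stand in for the real container and the OSDF location so the block runs anywhere.

```shell
# Work in a temporary directory with a dummy stand-in for the container.
workdir=$(mktemp -d)
cd "$workdir"
printf 'dummy container contents\n' > linux-r4ss-v4.sif

# On your local machine, before scp: record the checksum.
local_sum=$(sha256sum linux-r4ss-v4.sif | cut -d ' ' -f 1)

# Stand-in for the scp and mv steps.
mkdir data
cp linux-r4ss-v4.sif data/

# On the access point, after the move: compute the checksum again.
remote_sum=$(sha256sum data/linux-r4ss-v4.sif | cut -d ' ' -f 1)

# Matching checksums indicate the file arrived intact.
if [ "$local_sum" = "$remote_sum" ]; then
  echo "checksums match"
else
  echo "checksums differ: re-transfer the file"
fi
```

In practice the two sha256sum commands run on different machines and the values are compared by eye. On a Windows machine, PowerShell's Get-FileHash produces a SHA-256 hash that can be compared in the same way.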

To test the container, in that same OSG terminal, run:


apptainer shell /ospool/apXX/data/User.Name/linux-r4ss-v4.sif

# Once the container shell opens, start R and check that the packages you need are available
R
library(r4ss)
q()