Beginner’s Guide to Cluster

  • First, you need to make sure that your KUACC HPC Cluster account is created. You will receive an e-mail when your account is created with detailed instructions on how to use the cluster.
  • Once you have an account, you can access the cluster with your KU NetID and password.
  • Remember that VPN is required to access the system. You can learn how to set up the VPN service from this link: https://confluence.ku.edu.tr/kuhelp/ithelp/it-services/network-and-wireless/vpn-access

How to Learn Details of Cluster Usage?

If you need comprehensive help, there are two main resources available:

  1. The video of HPC usage training
  2. When your KUACC HPC Cluster account is activated, you will receive an email. Open the email and find the attached KUACC_Presentation file. Download it and follow this path: KUACC_Presentation –> docs –> kuaccguide –> pdf –> KUACC-HPC-Start-Guide.pdf

What Do the Video and PDF Cover?

It is important to read the PDF document and watch the video.

If you have never used a Cluster, or are not familiar with this cluster, YOU WILL WANT to read and follow the examples in the document to become familiar with how to run jobs on HPC. It is a common practice for new users to ignore this manual and simply try to run jobs without understanding what they are doing. Such carelessness can and WILL easily impact hundreds/thousands of critical jobs and users currently running on the cluster.

If your actions compromise the health of the HPC cluster, your account will be LOCKED, so please make sure you run through the examples below before you embark on running jobs. Some important remarks to avoid this:

  • Do NOT use the login nodes for work. If everyone does this, the login nodes will crash, keeping HPC users from being able to log in to the cluster.
  • Never submit a large number of jobs (greater than 2) without first running a small test case to make sure all works as expected. Start slow and then ramp up with jobs once you are familiar with how things work.
  • Never create a very large number of files. If your work requires this, ask for help first.
  • If you have a question, in the first instance please carefully check the document.

How to use HPC

Using a High Performance Computing cluster such as the HPC Cluster requires at least a basic understanding of the Linux operating system. Explaining Linux commands or how parallel programs such as MPI work is outside the scope of this page, which simply explains how to run jobs on the HPC cluster.

When you log in to HPC, you are connected to what is called a login node. The HPC Cluster has several major components:

● Login nodes: The login nodes are meant for simple tasks such as submitting jobs, checking on job status, and editing files (emacs, vi).
● A Head node: The head node runs all of the cluster critical services.
● Compute nodes: The compute nodes are the workhorses of the cluster. For computational work, whether serial or parallel, in batch mode or interactive mode, you will be using the compute nodes.

 


How do I login to the HPC Machines?

Windows

Use a secure shell client, e.g. MobaXterm.
1) Here is the direct link to download the MobaXterm program
2) Once you have MobaXterm installed, follow this guide
Note: if you have Cygwin installed, you can open a Cygwin terminal and then use ssh the same way as for Linux and Mac below. If you aren’t sure what Cygwin is, you can safely ignore this note.

Linux and Mac

Use ssh on the command line:
ssh username@login.kuacc.ku.edu.tr
Note: username is your HPC username.

How do I copy files/data to the HPC Machines?

Windows

Use MobaXterm. It includes a GUI-based scp client for MS Windows computers with a drag-and-drop facility and a built-in file editor. If you have Cygwin installed, you can open a Cygwin terminal and then use scp the same way as for Linux and Mac below.
Download MobaXterm

Linux and Mac

Use scp on the command line:
scp source-filename USERNAME@login.kuacc.ku.edu.tr:/HOME_DIR/SUB_FOLDER/new-filename
This copies the file to SUB_FOLDER on the cluster and renames it to new-filename.
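As a hedged illustration of the scp pattern above, the sketch below builds three typical transfer commands and prints them instead of running them, so it can be tried without cluster access. The username `jdoe`, the file and folder names, and the use of a work folder under /scratch/users are assumptions for illustration only; substitute your own values.

```shell
#!/bin/bash
# Hypothetical values: replace with your own username, files and folders.
KUACC_USER="jdoe"
REMOTE_HOST="login.kuacc.ku.edu.tr"
REMOTE_DIR="/scratch/users/${KUACC_USER}/workfolder"

# Upload one file, keeping its name.
UPLOAD_CMD="scp data.csv ${KUACC_USER}@${REMOTE_HOST}:${REMOTE_DIR}/"

# Upload a whole directory (-r copies recursively).
UPLOAD_DIR_CMD="scp -r inputs ${KUACC_USER}@${REMOTE_HOST}:${REMOTE_DIR}/"

# Download a result file from the cluster into the current directory.
DOWNLOAD_CMD="scp ${KUACC_USER}@${REMOTE_HOST}:${REMOTE_DIR}/results.txt ."

# Dry run: print the commands rather than executing them.
echo "$UPLOAD_CMD"
echo "$UPLOAD_DIR_CMD"
echo "$DOWNLOAD_CMD"
```

To perform the transfers for real, run the printed commands from your own machine while connected to the VPN.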

How do I submit a SLURM job script?

Jobs are not run directly from the command line. Instead, the user creates a job script that specifies the required compute resources, the libraries, and the application to be run.

The script is submitted to the job management system (queueing system), and if the requested resources (processors, memory, etc.) are available on the system, the job will be run.

If not, it will be placed in a queue until the resources become available. To share resources fairly among users, the priority of jobs in the queue may vary based on how many resources each user has already consumed, so jobs may not run in the order in which they were submitted.

SLURM Job Submission Scripts

You will find SLURM submission script templates in the folder: /kuacc/jobscripts
Copy the one you need to your work folder and modify it as required:
mkdir /scratch/users/<username>/workfolder
cd /scratch/users/<username>/workfolder/
cp /kuacc/jobscripts/example_submit.sh /scratch/users/<username>/workfolder/my_experiment.sh
vim my_experiment.sh
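For orientation, here is a minimal sketch of what a job script might look like after editing. It is not the content of the KUACC templates: the job name, resource values and output file name are illustrative assumptions, and a partition is deliberately left out because partition names are site-specific. The sketch writes the script to a file, checks it for shell syntax errors, and lists its resource requests.

```shell
#!/bin/bash
# Write a small example job script. All values are illustrative
# assumptions, not KUACC defaults; check the templates for the real ones.
cat > my_experiment.sh <<'EOF'
#!/bin/bash
# Name shown in the queue
#SBATCH --job-name=my_experiment
# Number of compute nodes
#SBATCH --nodes=1
# Tasks (processes) per node
#SBATCH --ntasks-per-node=1
# Wall-clock time limit (HH:MM:SS)
#SBATCH --time=00:10:00
# Memory per node
#SBATCH --mem=1G
# Output file; %j expands to the job ID
#SBATCH --output=my_experiment-%j.out

echo "Running on $(hostname)"
EOF

# Check the script for shell syntax errors without running it.
bash -n my_experiment.sh && echo "syntax OK"

# List the resource requests the script makes.
grep '^#SBATCH' my_experiment.sh
```

Keeping the first test small (one node, one task, a short time limit) fits the advice to run a small test case before submitting larger jobs.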

Submitting jobs to the queue

Jobs are submitted to the system with the command below:
sbatch myscript.sh
See the page about SLURM Queueing System Commands for more information on creating job submission scripts.
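A hedged sketch of a submit-and-check sequence follows. It uses the standard SLURM commands sbatch, squeue and scancel; the script name is a placeholder, and the guard around sbatch is only there so the snippet does nothing harmful when run on a machine without SLURM.

```shell
#!/bin/bash
# Hypothetical script name; use the job script you prepared.
JOB_SCRIPT="my_experiment.sh"

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print only the job ID (followed by
    # ";cluster" on multi-cluster setups), which is easy to capture.
    JOB_ID=$(sbatch --parsable "$JOB_SCRIPT" | cut -d';' -f1)
    echo "Submitted job ${JOB_ID}"

    # Show the state of this job in the queue.
    squeue -j "$JOB_ID"

    # To cancel the job if something looks wrong:
    #   scancel "$JOB_ID"
    STATUS="submitted"
else
    echo "sbatch not found: run this on a cluster login node."
    STATUS="skipped"
fi
```

Note the job ID that is printed; it is one of the details requested when reporting a problem.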

Submitting a SLURM Job Script

Job flags are used with the sbatch command. The syntax for a SLURM directive in a script is “#SBATCH <flag>”. Some of the flags can also be used with the srun and salloc commands, for example for interactive jobs.
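To make the relationship between directives and command-line flags concrete, the sketch below prints the same resource request in three forms: as #SBATCH lines, as sbatch arguments, and as an srun command for an interactive shell. The flag values and the script name are assumptions for illustration, and the commands are printed rather than executed.

```shell
#!/bin/bash
# Illustrative resource request; the values are assumptions, not KUACC defaults.
FLAGS="--nodes=1 --ntasks=1 --time=00:30:00"

# 1) As directives at the top of a job script.
#    $FLAGS is left unquoted on purpose so it splits into separate flags.
for flag in $FLAGS; do
    echo "#SBATCH ${flag}"
done

# 2) On the sbatch command line. Flags given here take precedence over
#    #SBATCH lines inside the script.
BATCH_CMD="sbatch ${FLAGS} my_experiment.sh"
echo "$BATCH_CMD"

# 3) For an interactive shell on a compute node (--pty attaches your terminal).
INTERACTIVE_CMD="srun ${FLAGS} --pty bash"
echo "$INTERACTIVE_CMD"
```

The interactive form is one way to do hands-on work on a compute node instead of on a login node.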

These technical topics are also explained in the PDF document and in the video:

General

  • How do I login to the HPC Machines?
  • How do I copy files/data to the HPC Machines?
  • How do I edit my files on the HPC system?

 

SLURM queuing system

  • SLURM Job Submission Scripts
  • Submitting jobs to the queue
  • SLURM Partitions (Job Queues)
  • Essential SLURM Commands
  • Submitting a SLURM Job Script

Other

  • Running a GUI on the Cluster (X11 Forwarding)
  • Job Reason Codes
  • KUACC QoS settings
  • Software

 

If your problem relates to any topic listed above, it is very important that you read the document and watch the video.

Further Questions

For more information on HPC facilities, systems support, assistance with parallel programming and performance
optimization and to report any problems, contact the KU Service Desk.

Alternatively, you can send an email to hpc-support@ku.edu.tr; this will create a support ticket automatically.
When reporting problems, please give as much information as you can to help us in diagnosis, for example:
● Your username
● Queue/partition name
● Job ID(s)
● A copy of any error messages
● Command used to submit the job(s)
● Path(s) to scripts called by the submission command
● Path(s) to output files from your jobs
● When the problem occurred
● What commands or programs you were trying to execute at the time
● A pointer to the program you were trying to run or compile