You finally got an account on our HPC? Welcome! Here is a short guide on things you need to know before starting computations.
Login
Our cluster can be reached via ssh, provided you are either within the university network or connected via VPN. Under Windows, open the command line (cmd) or PowerShell; under Linux or macOS, open a terminal. Then type
ssh <$USER>@hpc.rz.uni-duesseldorf.de
Of course, replace <$USER> with your actual username. When you log in for the first time, the system may ask whether you want to continue connecting; answer with "yes". When prompted, type in your password (you may not see what you are typing on the screen, this is fine) and hit Enter. You will find yourself in your home directory on our login host hpc-login7.
Please note that this node is only meant for logging in and submitting batch jobs. Do not start computations, heavy tasks or file transfers here: actual computations are performed on our compute nodes, and file transfers to the HPC are described below.
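If you log in often, a small shell function in your local ~/.bashrc can save retyping the host name. The following is only a sketch: the function names hpc_target and hpc_login are made up for illustration, and jdoe stands in for your own username.

```shell
# Host name taken from the login instructions above.
HPC_LOGIN_HOST="hpc.rz.uni-duesseldorf.de"

# hpc_target USER: print the ssh target string for USER (hypothetical helper).
hpc_target() {
    printf '%s@%s\n' "$1" "$HPC_LOGIN_HOST"
}

# hpc_login USER: open an ssh session on the login host (hypothetical helper).
hpc_login() {
    ssh "$(hpc_target "$1")"
}
```

With these definitions loaded, hpc_login jdoe does the same as typing ssh jdoe@hpc.rz.uni-duesseldorf.de by hand.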
The cluster has no direct connection to the internet. Local mirrors are, however, available for PyPI, conda and R.
You can also connect to our JupyterHub server in your web browser at https://jupyter.hpc.rz.uni-duesseldorf.de to start Jupyter notebooks or terminal sessions.
File system
There are three important directories where you can store data:
Your home directory /home/<$USER> - this directory has a size limit of only 60 GB and should only be used to store small files, programs and configuration data.
A project directory /gpfs/project/<$USER> - this directory is used for project data. It has a maximum capacity of 10 TB per user and its contents are backed up regularly.
Temporary files /gpfs/scratch/<$USER> - used to store temporary files during computations. The maximum capacity is 20 TB, there is no backup, and files older than 60 days are deleted automatically.
These directories are accessible from the login node, all compute nodes and the storage server (see below). The /gpfs directories are connected to a fast parallel file system; use them for your computations.
Please do not use temporary directories like /tmp on the local nodes. They fill up quickly and, in the worst case, crash the node together with all computations running on it!
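Many programs honour the TMPDIR environment variable when they create temporary files. One way to keep such files off the local /tmp is to point TMPDIR at a directory below your scratch space at the start of a job. This is a sketch under assumptions: the helper name use_scratch_tmp and the tmp/<job id> layout are illustrative, and whether your program respects TMPDIR has to be checked in its own documentation.

```shell
# use_scratch_tmp [BASE]: create a per-job temp directory below BASE
# (default: /gpfs/scratch/$USER) and export it as TMPDIR.
# Helper name and directory layout are illustrative assumptions.
use_scratch_tmp() {
    local base="${1:-/gpfs/scratch/$USER}"
    # PBS sets PBS_JOBID inside a job; fall back to the shell PID elsewhere.
    local dir="$base/tmp/${PBS_JOBID:-$$}"
    mkdir -p "$dir" || return 1
    export TMPDIR="$dir"
}
```

Calling use_scratch_tmp near the top of a batch script would be enough; removing "$TMPDIR" at the end of the job keeps the scratch space tidy.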
Regular snapshots are created for the contents of /home and /gpfs/project, and you can restore your data from these snapshots. You can check your current quotas for these directories by loading the module "hpc-tools" on the login node and running usage_report.py:
module load hpc-tools
usage_report.py
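Because files in /gpfs/scratch are deleted once they are older than 60 days, it can help to list files that are approaching that limit. The sketch below uses find with the modification time; which timestamp the automatic cleanup actually looks at is not stated here, so treat the result as an approximation. The helper name old_files and the 50-day threshold in the usage note are assumptions.

```shell
# old_files DIR DAYS: list regular files below DIR that were last modified
# more than DAYS days ago (find rounds file ages down to whole days).
# Hypothetical helper, not part of hpc-tools.
old_files() {
    find "$1" -type f -mtime +"$2" -print
}
```

For example, old_files /gpfs/scratch/$USER 50 would print the files you may want to move to your project directory soon.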
Transferring files to the HPC
Files can be transferred from your computer to the HPC via scp. Please use our storage server for this purpose, e.g.:
scp yourfile <$USER>@storage.hpc.rz.uni-duesseldorf.de:/gpfs/project/<$USER>
or use any scp client (FileZilla, WinSCP etc.).
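For whole directories, scp -r copies recursively. The following sketch wraps the storage server address in small functions so it does not have to be retyped; hpc_remote and hpc_upload are hypothetical names, and jdoe and results/ in the usage note are placeholders.

```shell
# Storage server from the transfer instructions above.
STORAGE_HOST="storage.hpc.rz.uni-duesseldorf.de"

# hpc_remote USER PATH: print the scp-style remote location for PATH.
hpc_remote() {
    printf '%s@%s:%s\n' "$1" "$STORAGE_HOST" "$2"
}

# hpc_upload USER LOCAL...: copy files or directories recursively into the
# user's project directory via the storage server.
hpc_upload() {
    local user="$1"
    shift
    scp -r "$@" "$(hpc_remote "$user" "/gpfs/project/$user/")"
}
```

A call such as hpc_upload jdoe results/ notes.txt would then copy both items to /gpfs/project/jdoe/.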
You can also mount the above-mentioned directories on your PC. Use sshfs under Linux or macOS:
sshfs <$USER>@storage.hpc.rz.uni-duesseldorf.de:/gpfs/project/<$USER> your_local_dir
Windows and macOS users can also attach these directories as network drives, see this page for further instructions. Large files can also be transferred using our GLOBUS-Connect service.
Please do not use the login node hpc-login7 for file transfers or rsync. It is a virtual machine with limited bandwidth, and large file transfers will be blocked automatically.
Submitting jobs
We use PBS Professional for batch control and job submission. You can start interactive jobs or use scripts to define your computation tasks.
For interactive tasks, use e.g.:
qsub -I -l select=1:ncpus=1:mem=4GB -l walltime=02:00:00 -A $PROJECT
This will open an interactive shell on one of our compute nodes. The select statement defines the resources you require: the above line requests 1 compute node with 1 CPU and 4 gigabytes of RAM for 2 hours. Please do not forget to specify your project name $PROJECT with the -A switch.
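The resource request is just a string, so it can be assembled from variables when you start similar interactive jobs with different sizes. A minimal sketch, assuming the same select and walltime syntax as above; the helper names pbs_select and pbs_walltime are invented for this example.

```shell
# pbs_select NCPUS MEM_GB: print a select statement for one node.
pbs_select() {
    printf 'select=1:ncpus=%d:mem=%dGB\n' "$1" "$2"
}

# pbs_walltime MINUTES: print MINUTES as HH:MM:SS.
pbs_walltime() {
    printf '%02d:%02d:00\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}
```

An interactive job with 4 CPUs and 16 GB for two and a half hours could then be requested with qsub -I -l "$(pbs_select 4 16)" -l walltime="$(pbs_walltime 150)" -A $PROJECT.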
You can also write a batch script and submit it, e.g.
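#!/bin/bash
# Resource request: 1 node, 1 CPU, 4 GB RAM, 2 hours walltime
#PBS -l select=1:ncpus=1:mem=4GB
#PBS -l walltime=02:00:00
# Replace $PROJECT with your project name
#PBS -A $PROJECT
# Change to the directory from which the job was submitted
cd "$PBS_O_WORKDIR"
module load Python/3.10.4
python myscript.py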
Submit the script with qsub yourscript.sh. Check your jobs with qstat -u <$USER>. Jobs can be cancelled with qdel JOBID, where JOBID is the number reported by qstat.
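qsub prints the identifier of the new job, which makes it easy to keep for later qstat or qdel calls. A sketch, assuming the identifier has the usual PBS form of a number followed by a dot and the server name (the example 123456.hpc-batch is made up); job_number and submit_and_report are hypothetical helpers.

```shell
# job_number JOBID: strip the server part from a PBS job identifier,
# e.g. "123456.hpc-batch" becomes "123456".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# submit_and_report SCRIPT: submit SCRIPT and show its queue entry.
submit_and_report() {
    local jobid
    jobid="$(qsub "$1")" || return 1
    echo "submitted job $(job_number "$jobid")"
    qstat "$jobid"
}
```

On the login node, submit_and_report yourscript.sh would submit the script and immediately print its status line.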
Job Monitoring
To monitor your jobs, you can either use the qstat command as described above or use our graphical cluster tool myjam at https://myjam3.hhu.de.
Under UserTools → my Jobs you can get information on your running and finished jobs. Click on the JobID for further information. The linked page also gives performance information for your finished jobs; it is worth checking this from time to time so you can adapt the resources for future jobs.
Cluster Status → Rackview provides an overview of the current activity of the cluster. Rose-colored nodes are busy, green nodes are free, red nodes are offline. Click on a node to get further information.
Cluster Status → Queues gives an overview of the number of queued and running jobs in each queue.
Where are the programs?
We have a large collection of installed software, which can be accessed via the module load command. Type module avail to get an overview of all installed software. Load a module with module load wanted_module, e.g. module load hpc-tools.
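When a job needs several modules, loading them through one small function makes a missing module fail early and visibly. The following is a sketch, not a site-provided tool: load_modules is a hypothetical helper, and the module names in the usage note are simply the ones mentioned elsewhere in this guide.

```shell
# load_modules NAME...: load each named module in turn and stop at the
# first one that cannot be loaded (hypothetical helper).
load_modules() {
    if ! type module >/dev/null 2>&1; then
        echo "module command not found (are you on the cluster?)" >&2
        return 1
    fi
    local m
    for m in "$@"; do
        module load "$m" || { echo "could not load $m" >&2; return 1; }
    done
}
```

In a batch script, load_modules hpc-tools Python/3.10.4 would replace two separate module load lines; module list then shows what is currently loaded.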
Where do I get help?
You can contact us at any time by email at hpc-support@hhu.de. Questions can also be posted in our Rocket-Chat channel #HPC-Users. Each Wednesday between 14:30 and 16:00 we hold HPC office hours, where you can simply drop in and describe your problem or request. The meeting is online and the link will be posted on #HPC-Users, but you can also come by in person to room 25.41.00.51.