Launching Fluent (GUI)
To launch the Fluent GUI, first allocate nodes on the cluster; for example, 16 tasks (-n 16) across two nodes (-N 2) on the rsa partition for 120 minutes:
salloc -n 16 -N 2 -p rsa -t 120
Once the job is allocated, load the ANSYS module (choosing version 14.5 or 15) and start the GUI:
module load proprietary/ansys/[14.5|15]
runFluentGUI
Submitting a Fluent job (no GUI)
To submit Fluent jobs to the CCI queue, three files are necessary: a Fluent case (.cas) file, a Fluent journal file (.jou), and a run script (.sh). The first is generally created from the Fluent GUI (see above) and contains all problem data, including the mesh, boundary conditions, material specifications, etc. The other required files are discussed below.
A Fluent journal file contains the sequence of commands for Fluent to execute. This file, with extension .jou, can be created either by recording it within Fluent or by writing it from scratch.
To record a journal file, launch the Fluent GUI (see above), select File -> Write -> Start Journal, perform the tasks you wish the run to do, and stop the recording with File -> Write -> Stop Journal.
To write a journal file from scratch, copy the basic file below and save it as basic.jou. Note that ';' indicates a comment.
; Read case file
rc myfluentcase.cas
; Initialize the solution
/solve/initialize/initialize-flow
; Calculate 50 iterations
it 50
; Write data file
wd myfluentcase.dat
exit
yes
Copy the following run script to run_fluent.sh.
#!/bin/bash -x
# Write the short hostnames of the allocated nodes to a per-job machine file
srun hostname -s > /tmp/hosts.$SLURM_JOB_ID
# Derive the total process count if Slurm did not export it directly
if [ "x$SLURM_NPROCS" = "x" ]; then
    if [ "x$SLURM_NTASKS_PER_NODE" = "x" ]; then
        SLURM_NTASKS_PER_NODE=1
    fi
    SLURM_NPROCS=`expr $SLURM_JOB_NUM_NODES \* $SLURM_NTASKS_PER_NODE`
fi
module load mpi/openmpi-1.4-gcc44/1
export OPENMPI_ROOT=/gpfs/sb/software/x86-rhel6/mpi/openmpi-1.4-gcc44/1/
# Run the 3D double-precision solver headless (-g) over InfiniBand (-pib),
# reading commands from the journal file
fluent 3ddp -t $SLURM_NPROCS -mpi=openmpi -mpiopt='--bind-to-core' -mpirun='mpirun' -pib -cnf=/tmp/hosts.$SLURM_JOB_ID -g -i basic.jou
rm /tmp/hosts.$SLURM_JOB_ID
Intel Cluster with UDF
User-defined functions (UDFs) can require additional communication/initialization that, by default, uses rsh. rsh is not installed on the Intel Cluster, so password-less ssh must be used instead.
Set up ssh keys to enable password-less ssh:
ssh rsa35-ib
cd ~/.ssh/
# Accept the defaults at each prompt for prompt-less ssh access;
# id_rsa.pub will be overwritten in the ~/.ssh dir
ssh-keygen
cat id_rsa.pub >> authorized_keys2
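To verify the setup, sshing to a node should now complete without a password prompt; a quick check (using rsa35-ib as in the example above):
ssh rsa35-ib hostname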
Copy the following run script to run_fluent.sh. Note the use of the -rsh=ssh flag on the fluent command line, which tells Fluent to use ssh instead of rsh.
#!/bin/bash -x
# Write the short hostnames of the allocated nodes to a per-job machine file
srun hostname -s > /tmp/hosts.$SLURM_JOB_ID
# Derive the total process count if Slurm did not export it directly
if [ "x$SLURM_NPROCS" = "x" ]; then
    if [ "x$SLURM_NTASKS_PER_NODE" = "x" ]; then
        SLURM_NTASKS_PER_NODE=1
    fi
    SLURM_NPROCS=`expr $SLURM_JOB_NUM_NODES \* $SLURM_NTASKS_PER_NODE`
fi
module load mpi/openmpi-1.4-gcc44/1
export OPENMPI_ROOT=/gpfs/sb/software/x86-rhel6/mpi/openmpi-1.4-gcc44/1/
# Identical to the script above except for -rsh=ssh, which makes Fluent
# use ssh for its node-to-node communication
fluent 3ddp -t $SLURM_NPROCS -mpi=openmpi -mpiopt='--bind-to-core' -mpirun='mpirun' -pib -cnf=/tmp/hosts.$SLURM_JOB_ID -g -i basic.jou -rsh=ssh
rm /tmp/hosts.$SLURM_JOB_ID
Once you have the three required files, submit the Fluent job using the sbatch command:
sbatch -p jobQueue -N nodes -n numberofprocessors -t timeofjob --licenses=acfd:1,anshpc_pack:N ./run_fluent.sh
nodes should be set to ceiling(numberofprocessors/8) to avoid hyperthreads.
Jobs requiring more than four processes must request HPC packs by passing --licenses=anshpc_pack:N, where 1 <= N <= 5; N HPC packs support up to 8*4^(N-1) processors.
In general, the sbatch option --licenses=name:count, where name is the name of the license and count is the number of licenses needed for the job, will prevent Slurm from starting the job until the appropriate licenses are available.
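For example (the partition, time limit, and process count below are placeholders; substitute your own), a 32-process job needs ceiling(32/8) = 4 nodes and, since 8*4^(2-1) = 32, two HPC packs:
sbatch -p rsa -N 4 -n 32 -t 120 --licenses=acfd:1,anshpc_pack:2 ./run_fluent.sh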
Contact CCI support for the names of the licenses available to your project.
Fluent licenses have limited availability. If you are using the GUI and get a license error, first check whether you have a stopped job that is still holding the license:
$ ps -a
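If a stopped fluent process appears in the output, resume it with fg and exit cleanly, or kill it to release the license; for example, with a hypothetical PID of 12345:
kill 12345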
A strong scaling test was run on the RSA nodes for Fluent with a viscous fluid case of roughly 3.5 million elements. The case was run four times at each processor count to test for variance in run time. As seen in Figure 1, Fluent shows near-ideal strong scaling up to 128 cores on the Intel cluster (labeled 'RSA' in the figures below) and little variance between different instances of the run. These runs were performed with one process per physical core.
The y-axis in the plots below is labeled with the log base 2 of the speed-up, so a value of k corresponds to a 2^k-fold speed-up over the baseline run.