2.1.2. Run CMAQv5.3.3 on c6a.2xlarge#

Login to the ec2 instance#

Note, the following command must be modified to specify your key, and ip address (obtained from the previous command):

ssh -v -Y -i ~/downloads/your-pem.pem ubuntu@ip.address

Login to the ec2 instance again, so that you have two windows logged into the machine.#

ssh -Y -i ~/downloads/your-pem.pem ubuntu@your-ip-address

2.1.3. Load the environment modules#

module avail

module load ioapi-3.2/gcc-11.3.0-netcdf  mpi/openmpi-4.1.2  netcdf-4.8.1/gcc-11.3

2.1.4. Update the pcluster-cmaq repo using git#

cd /shared/pcluster-cmaq

git pull

Verify that the input data is available#

Input Data for the 2016_12SE1 Benchmark

ls -lrt /shared/data/CMAQv5.3.2_Benchmark_2Day_Input/2016_12SE1/*

2.1.5. Run CMAQv5.3.3 for 2016_12SE1 1 Day benchmark Case on 4 pe#

' '
'LamCon_40N_97W'
  2        33.000        45.000       -97.000       -97.000        40.000
' '
'SE52BENCH'
'LamCon_40N_97W'    792000.000  -1080000.000     12000.000     12000.000 100  80   1
'2016_12SE1'
'LamCon_40N_97W'    792000.000  -1080000.000     12000.000     12000.000 100  80   1

Use command line to submit the job. This single virtual machine does not have a job scheduler such as slurm installed.#

cd /shared/build/openmpi_gcc/CMAQ_v533/CCTM/scripts
./run_cctm_Bench_2016_12SE1.csh |& tee run_cctm_Bench_2016_12SE1.log

Use HTOP to view performance.#

htop

output

If the ec2 instance was created without specifying 1 thread per core in the Advanced Settings, then it will have 8 vcpus.

Screenshot of HTOP with hyperthreading on

Successful output using the gp3 volume with hyperthreading on (8vcpus)#

==================================
  ***** CMAQ TIMING REPORT *****
==================================
Start Day: 2016-07-01
End Day:   2016-07-01
Number of Simulation Days: 1
Domain Name:               2016_12SE1
Number of Grid Cells:      280000  (ROW x COL x LAY)
Number of Layers:          35
Number of Processes:       4
   All times are in seconds.

Num  Day        Wall Time
01   2016-07-01   2083.32
     Total Time = 2083.32
      Avg. Time = 2083.32

Use lscpu to confirm that there are 4 cores on the c6a.2xlarge ec2 instance that was created with hyperthreading turned off (1 thread per core).#

lscpu

Output:

Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         48 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               AuthenticAMD
  Model name:            AMD EPYC 7R13 Processor
    CPU family:          25
    Model:               1
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            1
    BogoMIPS:            5300.00
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm con
                         stant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt a
                         es xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext invpcid_single ssbd ibrs ibpb stibp vmm
                         call fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr rdpru wbnoinvd arat npt nr
                         ip_save vaes vpclmulqdq rdpid
Virtualization features: 
  Hypervisor vendor:     KVM
  Virtualization type:   full
Caches (sum of all):     
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    2 MiB (4 instances)
  L3:                    16 MiB (1 instance)
NUMA:                    
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-7
Vulnerabilities:         
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2:            Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected