2.2.3. Learn how to use the AWS CLI to launch a c6a.48xlarge EC2 instance using a public AMI#
The public AMI contains the software and data to run 2016_12SE1 using CMAQv5.3.3#
The software was pre-installed and saved to a public AMI.
The input data was also transferred from the AWS Open Data Program and installed on the EBS volume.
This chapter describes the process that was used to test and configure the c6a.48xlarge EC2 instance to run CMAQv5.4 for the 12US2 domain.
Todo: Need to create command line options to copy a public AMI to a different region.
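As a starting point for the todo above, here is a minimal sketch that wraps the `aws ec2 copy-image` command in a shell function. The function name `copy_public_ami` and the example copy name are hypothetical, and the sketch assumes your account is allowed to read the snapshot behind the public AMI; it has not been tested against the AMI in this chapter.

```shell
# Sketch: copy an AMI from one region to another and print the new AMI ID.
# copy_public_ami is a hypothetical helper name, not part of pcluster-cmaq.
copy_public_ami() {
  # $1 = source region, $2 = source AMI ID, $3 = destination region, $4 = name for the copy
  aws ec2 copy-image \
    --source-region "$1" \
    --source-image-id "$2" \
    --region "$3" \
    --name "$4" \
    --query 'ImageId' --output text
}

# Example call (destination region and copy name are placeholders):
# copy_public_ami us-east-1 ami-088f82f334dde0c9f us-west-2 my-cmaq-ami-copy
```

The copy receives a new AMI ID in the destination region, so any JSON launch configuration would need its ImageId and the --region option updated to match.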
Verify that you can see the public AMI in the us-east-1 region.#
aws ec2 describe-images --region us-east-1 --image-ids ami-088f82f334dde0c9f
Output:
{
    "Images": [
        {
            "Architecture": "x86_64",
            "CreationDate": "2023-06-26T18:17:08.000Z",
            "ImageId": "ami-088f82f334dde0c9f",
            "ImageLocation": "440858712842/EC2CMAQv54io2_12LISTOS-training_12NE3_12US1",
            "ImageType": "machine",
            "Public": true,
            "OwnerId": "440858712842",
            "PlatformDetails": "Linux/UNIX",
            "UsageOperation": "RunInstances",
            "State": "available",
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/sda1",
                    "Ebs": {
                        "DeleteOnTermination": true,
                        "Iops": 100000,
                        "SnapshotId": "snap-042b05034228ec830",
                        "VolumeSize": 500,
                        "VolumeType": "io2",
                        "Encrypted": false
                    }
                },
                {
                    "DeviceName": "/dev/sdb",
                    "VirtualName": "ephemeral0"
                },
                {
                    "DeviceName": "/dev/sdc",
                    "VirtualName": "ephemeral1"
                }
            ],
            "EnaSupport": true,
            "Hypervisor": "xen",
            "Name": "EC2CMAQv54io2_12LISTOS-training_12NE3_12US1",
            "RootDeviceName": "/dev/sda1",
            "RootDeviceType": "ebs",
            "SriovNetSupport": "simple",
            "VirtualizationType": "hvm",
            "DeprecationTime": "2025-06-26T18:17:08.000Z"
        }
    ]
}
Press q to exit the pager and return to the command line.
Note that the AMI's root volume is an io2 volume provisioned with 100000 IOPS (see "Iops" in the output above).
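When only a few fields of the describe-images output matter, the `--query` option can extract them directly. The sketch below is one way to do that; the function name `ami_summary` is hypothetical, and setting `AWS_PAGER` to an empty string is an AWS CLI version 2 setting that prints the result without opening the pager.

```shell
# Sketch: print the AMI ID, state, public flag, root volume type, and root volume IOPS.
# ami_summary is a hypothetical helper name.
ami_summary() {
  # $1 = region, $2 = AMI ID
  AWS_PAGER="" aws ec2 describe-images --region "$1" --image-ids "$2" \
    --query 'Images[0].[ImageId,State,Public,BlockDeviceMappings[0].Ebs.VolumeType,BlockDeviceMappings[0].Ebs.Iops]' \
    --output text
}

# Example call:
# ami_summary us-east-1 ami-088f82f334dde0c9f
```

The five values are printed on a single tab-separated line, which is easier to check in a script than the full JSON document.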
AWS resources for launching EC2 instances with the AWS CLI#
Tutorial: Launch Spot Instances
(Note: the tutorial discourages the use of run-instances for launching Spot Instances, but it does provide an example method.)
Launching EC2 Spot Instances using the RunInstances API
Additional resources for Spot Instance provisioning.
To launch a Spot Instance with the RunInstances API, create the configuration file as shown below:
cat <<EoF > ./runinstances-config.json
{
    "DryRun": false,
    "MaxCount": 1,
    "MinCount": 1,
    "InstanceType": "c6a.48xlarge",
    "ImageId": "ami-088f82f334dde0c9f",
    "InstanceMarketOptions": {
        "MarketType": "spot"
    },
    "TagSpecifications": [
        {
            "ResourceType": "instance",
            "Tags": [
                {
                    "Key": "Name",
                    "Value": "EC2SpotCMAQv54"
                }
            ]
        }
    ]
}
EoF
Use the publicly available AMI to launch an on-demand c6a.48xlarge EC2 instance using a gp3 volume with 16000 IOPS and hyperthreading disabled#
Note: we will be using a JSON file that has been preconfigured to specify the ImageId.
Obtain the code using git#
git clone -b main https://github.com/CMASCenter/pcluster-cmaq
cd pcluster-cmaq/json
Note: you will need to obtain a security group ID that allows ssh login access from your IT administrator. If ssh access is enabled by default, you can remove the --security-group-ids launch-wizard-with-tcp-access option.
Example command: replace launch-wizard-with-tcp-access with your security group ID, and replace your-pem with the name of your key pair (the name of your your-pem.pem key file without the .pem extension).
aws ec2 run-instances --debug --key-name your-pem --security-group-ids launch-wizard-with-tcp-access --dry-run --region us-east-1 --cli-input-json file://runinstances-config.json
Command that works for UNC’s security group and pem key:
aws ec2 run-instances --debug --key-name cmaqv5.4 --security-group-ids launch-wizard-179 --region us-east-1 --dry-run --ebs-optimized --cpu-options CoreCount=96,ThreadsPerCore=1 --cli-input-json file://runinstances-config.io2.json
Once you have verified that the command above works with the --dry-run option, rerun it without that option, as follows.
aws ec2 run-instances --debug --key-name cmaqv5.4 --security-group-ids launch-wizard-179 --region us-east-1 --ebs-optimized --cpu-options CoreCount=96,ThreadsPerCore=1 --cli-input-json file://runinstances-config.io2.json
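A dry run that would have succeeded is still reported by the AWS CLI as an error, with the code DryRunOperation and the message "Request would have succeeded, but DryRun flag is set." A permissions problem is reported with a different code, such as UnauthorizedOperation. The sketch below shows one way to tell the two apart in a script; the function name `dry_run_ok` is hypothetical, and the sketch assumes the command output was captured without the --debug option so that it stays short.

```shell
# Sketch: succeed only when captured CLI output shows a passing dry run.
# dry_run_ok is a hypothetical helper name.
dry_run_ok() {
  # $1 = combined stdout/stderr captured from "aws ec2 run-instances ... --dry-run"
  printf '%s\n' "$1" | grep -q 'DryRunOperation'
}

# Example use (key name and security group are the placeholders used in this chapter):
# out=$(aws ec2 run-instances --key-name your-pem \
#         --security-group-ids launch-wizard-with-tcp-access --dry-run \
#         --region us-east-1 --cli-input-json file://runinstances-config.json 2>&1)
# if dry_run_ok "$out"; then echo "dry run passed"; else echo "$out"; fi
```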
Example of security group inbound and outbound rules required to connect to EC2 instance via ssh.
Additional resources
CLI commands to create Security Group
Use the following command to obtain the public IP address of the machine.#
Run this command only after the instance has been created; it is kept here for documentation purposes.
aws ec2 describe-instances --region=us-east-1 --filters "Name=image-id,Values=ami-088f82f334dde0c9f" | grep PublicIpAddress
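As an alternative to piping the JSON through grep, the `--query` option can return just the address. This is a minimal sketch; the function name `get_public_ip` is hypothetical, and the extra instance-state-name filter is an assumption added so that terminated instances launched from the same AMI are skipped.

```shell
# Sketch: print the public IP address of running instances launched from a given AMI.
# get_public_ip is a hypothetical helper name.
get_public_ip() {
  # $1 = region, $2 = AMI ID the instance was launched from
  aws ec2 describe-instances --region "$1" \
    --filters "Name=image-id,Values=$2" "Name=instance-state-name,Values=running" \
    --query 'Reservations[].Instances[].PublicIpAddress' \
    --output text
}

# Example call:
# get_public_ip us-east-1 ami-088f82f334dde0c9f
```

If more than one running instance matches, the addresses are printed together, so check the output before passing it to ssh.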
Log in to the EC2 instance#
Note: modify the following command to specify your key and the IP address obtained from the previous command. You will get a "connection refused" error if you try to log in before the EC2 instance is ready (initialization takes about 5 minutes).
ssh -v -Y -i ~/downloads/your-pem.pem ubuntu@your-ip-address
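Rather than retrying ssh by hand during initialization, the AWS CLI waiter `instance-status-ok` can block until the instance passes its status checks. The sketch below assumes you know the instance ID (shown as the i-xxxx placeholder elsewhere in this chapter); the function name `wait_until_ready` is hypothetical.

```shell
# Sketch: block until the instance passes its status checks, then return.
# wait_until_ready is a hypothetical helper name.
wait_until_ready() {
  # $1 = region, $2 = instance ID
  aws ec2 wait instance-status-ok --region "$1" --instance-ids "$2"
}

# Example use (instance ID, key file, and IP address are placeholders):
# wait_until_ready us-east-1 i-xxxx && ssh -Y -i ~/downloads/your-pem.pem ubuntu@your-ip-address
```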
Log in to the EC2 instance again, so that you have two windows logged in to the machine.#
ssh -Y -i ~/downloads/your-pem.pem ubuntu@your-ip-address
Load the environment modules#
module avail
module load ioapi-3.2/gcc-11.3.0-netcdf mpi/openmpi-4.1.2 netcdf-4.8.1/gcc-11.3
Update the pcluster-cmaq repo using git#
cd /shared/pcluster-cmaq
git pull
Run CMAQv5.3.3 for the 2016_12SE1 1-day benchmark case on 4 PEs#
The grid definitions (GRIDDESC) used for this benchmark are:
' '
'LamCon_40N_97W'
2 33.000 45.000 -97.000 -97.000 40.000
' '
'SE53BENCH'
'LamCon_40N_97W' 792000.000 -1080000.000 12000.000 12000.000 100 80 1
'2016_12SE1'
'LamCon_40N_97W' 792000.000 -1080000.000 12000.000 12000.000 100 80 1
Use the command line to submit the job. This single virtual machine does not have a job scheduler such as Slurm installed.#
cd /shared/build/openmpi_gcc/CMAQ_v533/CCTM/scripts/
./run_cctm_Bench_2016_12SE1.csh |& tee run_cctm_Bench_2016_12SE1.log
Use htop to view performance.#
htop
Successful output#
==================================
***** CMAQ TIMING REPORT *****
==================================
Start Day: 2016-07-01
End Day: 2016-07-01
Number of Simulation Days: 1
Domain Name: 2016_12SE1
Number of Grid Cells: 280000 (ROW x COL x LAY)
Number of Layers: 35
Number of Processes: 4
All times are in seconds.
Num Day Wall Time
01 2016-07-01 2083.32
Total Time = 2083.32
Avg. Time = 2083.32
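The grid cell count in the timing report can be cross-checked against the grid definition shown earlier: 100 columns x 80 rows x 35 layers = 280000. The sketch below does that arithmetic and also pulls the total wall time out of the log written by the tee command. The function names `grid_cells` and `total_time_seconds` are hypothetical, and the sketch assumes the grid line and report layout shown in this section.

```shell
# Sketch: cross-check the timing report against the grid definition.
# grid_cells and total_time_seconds are hypothetical helper names.

# Fields 6 and 7 of a GRIDDESC grid line are the column and row counts.
grid_cells() {
  # $1 = GRIDDESC grid line, $2 = number of layers
  echo "$1" | awk -v nlay="$2" '{ print $6 * $7 * nlay }'
}

# Print the value from the "Total Time = ..." line of a run log.
total_time_seconds() {
  # $1 = path to the log file
  awk '/Total Time =/ { print $4 }' "$1"
}

grid_line="'LamCon_40N_97W' 792000.000 -1080000.000 12000.000 12000.000 100 80 1"
grid_cells "$grid_line" 35    # prints 280000

# Example use with the log from this section:
# total_time_seconds run_cctm_Bench_2016_12SE1.log
```

For the run shown above, the total of 2083.32 seconds is roughly 35 minutes of wall time on 4 processes.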
Use lscpu to confirm that there are 8 processors on the c6a.2xlarge EC2 instance, which was created with hyperthreading turned on.#
lscpu
Output:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: AuthenticAMD
Model name: AMD EPYC 7R13 Processor
CPU family: 25
Model: 1
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
Stepping: 1
BogoMIPS: 5300.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext invpcid_single ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr rdpru wbnoinvd arat npt nrip_save vaes vpclmulqdq rdpid
Virtualization features:
Hypervisor vendor: KVM
Virtualization type: full
Caches (sum of all):
L1d: 128 KiB (4 instances)
L1i: 128 KiB (4 instances)
L2: 2 MiB (4 instances)
L3: 16 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Retbleed: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling, PBRSB-eIBRS Not affected
Srbds: Not affected
Tsx async abort: Not affected
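In the output above, CPU(s) is 8 and Thread(s) per core is 2, so the instance has 8 / 2 = 4 physical cores. The sketch below computes that ratio from lscpu output; the function name `physical_cores` is hypothetical, and the sketch assumes the field labels shown above.

```shell
# Sketch: derive the physical core count from lscpu output read on stdin.
# physical_cores is a hypothetical helper name.
physical_cores() {
  awk -F: '
    /^CPU\(s\):/                          { cpus = $2 + 0 }   # logical CPUs
    /^[[:space:]]*Thread\(s\) per core:/  { tpc  = $2 + 0 }   # 2 when hyperthreading is on
    END { if (tpc > 0) print cpus / tpc }
  '
}

# Example use on the instance:
# lscpu | physical_cores
```

When hyperthreading is disabled at launch with ThreadsPerCore=1, the two counts are equal and the ratio is simply the number of CPUs.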
Run 12US2 benchmark again using gp3 volume#
Stop the instance#
aws ec2 stop-instances --region=us-east-1 --instance-ids i-xxxx
Stopping a Spot Instance that was launched this way returns the following error message.
aws ec2 stop-instances --region=us-east-1 --instance-ids i-041a702cc9f7f7b5d
An error occurred (UnsupportedOperation) when calling the StopInstances operation: You can't stop the Spot Instance 'i-041a702cc9f7f7b5d' because it is associated with a one-time Spot Instance request. You can only stop Spot Instances associated with persistent Spot Instance requests.
Not sure how to make a persistent Spot Instance request.
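One possible approach, not tested in this chapter, is to add a SpotOptions block to the launch configuration. The RunInstances API accepts SpotInstanceType set to "persistent" and InstanceInterruptionBehavior set to "stop" inside InstanceMarketOptions. The sketch below mirrors the Spot configuration file created earlier; the file name runinstances-config.persistent-spot.json and the tag value are hypothetical.

```shell
# Sketch: write a launch configuration that requests a persistent Spot Instance.
# The file name and the Name tag value are hypothetical.
cat <<EoF > ./runinstances-config.persistent-spot.json
{
    "DryRun": false,
    "MaxCount": 1,
    "MinCount": 1,
    "InstanceType": "c6a.48xlarge",
    "ImageId": "ami-088f82f334dde0c9f",
    "InstanceMarketOptions": {
        "MarketType": "spot",
        "SpotOptions": {
            "SpotInstanceType": "persistent",
            "InstanceInterruptionBehavior": "stop"
        }
    },
    "TagSpecifications": [
        {
            "ResourceType": "instance",
            "Tags": [
                {
                    "Key": "Name",
                    "Value": "EC2PersistentSpotCMAQv54"
                }
            ]
        }
    ]
}
EoF
```

A persistent request stays open after the instance is terminated and can launch a replacement, so the Spot request itself should be cancelled (for example with aws ec2 cancel-spot-instance-requests) before terminating the instance for good.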
Terminate Instance#
aws ec2 terminate-instances --region=us-east-1 --instance-ids i-xxxx
Verify that the instance is being shut down.#
aws ec2 describe-instances --region=us-east-1
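The full describe-instances output is long, so a narrower query can make the check easier to read. The sketch below prints only the state name (for example shutting-down or terminated) for one instance; the function name `instance_state` is hypothetical and i-xxxx is the same placeholder used above.

```shell
# Sketch: print only the current state name of one instance.
# instance_state is a hypothetical helper name.
instance_state() {
  # $1 = region, $2 = instance ID
  aws ec2 describe-instances --region "$1" --instance-ids "$2" \
    --query 'Reservations[].Instances[].State.Name' \
    --output text
}

# Example use:
# instance_state us-east-1 i-xxxx
# To block until termination completes instead of polling by hand:
# aws ec2 wait instance-terminated --region us-east-1 --instance-ids i-xxxx
```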