4. Performance and Cost Optimization
This section provides timing information and scaling plots to help users optimize the performance and cost of their ParallelCluster.
- 4.1. Right-sizing Compute Nodes for the ParallelCluster Configuration
- 4.2. An explanation of why a scaling analysis is required for Multinode or Parallel MPI Codes
- 4.3. Slurm Compute Node Provisioning
- 4.4. Spot versus On-Demand Pricing
- 4.5. Benchmark Timings for CMAQv5.3.3 12US2 Benchmark
- 4.6. Benchmark Scaling Plots for CMAQv5.3.3 12US2 Benchmark
  - 4.6.1. Benchmark Scaling Plot for c5n.18xlarge
  - 4.6.2. Investigation of why there is a difference between the total run times for the benchmark when NPCOLxNPROW used 12x9 as compared to 9x12 and 6x18.
  - 4.6.3. Benchmark Scaling Plot for c5n.18xlarge and c5n.9xlarge
  - 4.6.4. Total Time and Cost versus CPU Plot for c5n.18xlarge
  - 4.6.5. Total Time and Cost versus CPU Plot for c5n.9xlarge
  - 4.6.6. Total Time and Cost versus CPU Plot for both c5n.18xlarge and c5n.9xlarge
  - 4.6.7. Total Time and Cost versus CPU Plot for hpc6a.48xlarge
- 4.7. Cost Information
- 4.8. Recommended Workflow for extending to annual run
- 4.9. Side by Side Comparison of the information in the log files for 12x9 pe run compared to 9x12 pe run.