CDES: An approach to HPC workload modelling

John Brennan, Ibad Kureshi, Violeta Holmes

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

4 Citations (Scopus)

Abstract

Computational science and complex system administration rely on being able to model user interactions. When managing HPC, HTC and grid systems, user workloads (their job submission behaviour) are an important metric when designing systems or scheduling algorithms. Most simulators are either inflexible or tied to proprietary scheduling systems. For system administrators, being able to model how a scheduling algorithm behaves, or how modifying system configurations can affect job completion rates, is critical. Within computer science research, many algorithms are presented with no real description or verification of behaviour. In this paper we present the Cluster Discrete Event Simulator (CDES) as a strong candidate for HPC workload simulation. Built around an open framework, CDES can take system definitions and multi-platform real usage logs, and can be interfaced with any scheduling algorithm through an API. CDES has been tested against 3 years of usage logs from a production-level HPC system and verified to greater than 95% accuracy.
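
The general idea of a discrete event simulator with a pluggable scheduler can be sketched as follows. This is a minimal illustration only, not the CDES implementation or its API: the names Job, fcfs and simulate, the single pool of cores, and the first-come-first-served policy are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Job:
    """One record from a (hypothetical) usage log."""
    job_id: int
    submit: float   # submission time in seconds
    cores: int      # cores requested
    runtime: float  # recorded run time in seconds


# Assumed scheduler interface: given the waiting queue and the number of
# free cores, return the job to start now, or None to wait.
Scheduler = Callable[[List[Job], int], Optional[Job]]


def fcfs(queue: List[Job], free_cores: int) -> Optional[Job]:
    """First-come-first-served: start the head of the queue if it fits."""
    if queue and queue[0].cores <= free_cores:
        return queue[0]
    return None


def simulate(jobs: List[Job], total_cores: int,
             scheduler: Scheduler) -> Dict[int, Tuple[float, float]]:
    """Replay the jobs and return {job_id: (start_time, end_time)}."""
    FINISH, SUBMIT = 0, 1  # finishes sort first at equal timestamps
    events = [(j.submit, SUBMIT, j.job_id) for j in jobs]
    heapq.heapify(events)
    by_id = {j.job_id: j for j in jobs}
    queue: List[Job] = []
    free = total_cores
    times: Dict[int, Tuple[float, float]] = {}

    while events:
        now, kind, job_id = heapq.heappop(events)
        job = by_id[job_id]
        if kind == SUBMIT:
            queue.append(job)
        else:
            free += job.cores  # a finished job releases its cores

        # Ask the scheduler repeatedly until it has nothing to start.
        while (nxt := scheduler(queue, free)) is not None:
            queue.remove(nxt)
            free -= nxt.cores
            end = now + nxt.runtime
            times[nxt.job_id] = (now, end)
            heapq.heappush(events, (end, FINISH, nxt.job_id))
    return times


if __name__ == "__main__":
    log = [Job(1, 0, 4, 10), Job(2, 1, 4, 5), Job(3, 2, 2, 3)]
    for job_id, (start, end) in sorted(simulate(log, 4, fcfs).items()):
        print(job_id, start, end)
```

Because the scheduler is passed in as a plain function, a different policy can be tried against the same log without touching the event loop, which is the kind of separation the abstract describes. Comparing the simulated start and end times with those recorded in a real log would be one way to estimate accuracy.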

Language: English
Title of host publication: Proceedings - IEEE International Symposium on Distributed Simulation and Real-Time Applications, DS-RT
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 47-54
Number of pages: 8
ISBN (Electronic): 9781479961443
DOIs: https://doi.org/10.1109/DS-RT.2014.15
Publication status: Published - 13 Nov 2014
Event: 18th IEEE/ACM International Symposium on Distributed Simulations and Real Time Applications - Aeronautics and Space Institute, Toulouse, France
Duration: 1 Oct 2014 - 3 Oct 2014
Conference number: 18
http://ds-rt.com/2014/ (Link to Conference Website)

Conference

Conference: 18th IEEE/ACM International Symposium on Distributed Simulations and Real Time Applications
Abbreviated title: DS-RT 2014
Country: France
City: Toulouse
Period: 1/10/14 - 3/10/14
Internet address: http://ds-rt.com/2014/

Fingerprint

Simulators
Scheduling algorithms
Application programming interfaces (API)
Computer science
Large scale systems
Scheduling

Cite this

Brennan, J., Kureshi, I., & Holmes, V. (2014). CDES: An approach to HPC workload modelling. In Proceedings - IEEE International Symposium on Distributed Simulation and Real-Time Applications, DS-RT (pp. 47-54). [6957176] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/DS-RT.2014.15
@inproceedings{16850404869c4646b50761a3906f917c,
title = "CDES: An approach to HPC workload modelling",
abstract = "Computational science and complex system administration rely on being able to model user interactions. When managing HPC, HTC and grid systems, user workloads (their job submission behaviour) are an important metric when designing systems or scheduling algorithms. Most simulators are either inflexible or tied to proprietary scheduling systems. For system administrators, being able to model how a scheduling algorithm behaves, or how modifying system configurations can affect job completion rates, is critical. Within computer science research, many algorithms are presented with no real description or verification of behaviour. In this paper we present the Cluster Discrete Event Simulator (CDES) as a strong candidate for HPC workload simulation. Built around an open framework, CDES can take system definitions and multi-platform real usage logs, and can be interfaced with any scheduling algorithm through an API. CDES has been tested against 3 years of usage logs from a production-level HPC system and verified to greater than 95{\%} accuracy.",
keywords = "HPC, HPC simulator, scheduler, WMS, workload modelling",
author = "John Brennan and Ibad Kureshi and Violeta Holmes",
year = "2014",
month = "11",
day = "13",
doi = "10.1109/DS-RT.2014.15",
language = "English",
pages = "47--54",
booktitle = "Proceedings - IEEE International Symposium on Distributed Simulation and Real-Time Applications, DS-RT",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}


