
The following tables compare general and technical information for notable computer cluster software. This software can be roughly separated into four categories: job scheduler, node management, node installation, and integrated stack (all of the above).


General information

Table explanation
* ''Software'': The name of the application that is described


Technical information

{| class="wikitable sortable" style="font-size: 80%; text-align: center; width: auto;"
! Software !! Implementation Language !! Authentication !! Encryption !! Integrity !! Global File System !! Global File System + Kerberos !! Heterogeneous/Homogeneous exec node !! Jobs priority !! Group priority !! Queue type !! SMP aware !! Max exec node !! Max job submitted !! CPU scavenging !! Parallel job !! Job checkpointing !! Python interface
|-
! class="table-rh" | Enduro/X
| C/C++ || OS Authentication || GPG, AES-128, SHA1 || None || Any cluster POSIX FS (gfs, gpfs, ocfs, etc.) || Any cluster POSIX FS (gfs, gpfs, ocfs, etc.) || Heterogeneous || OS Nice level || OS Nice level || SOA Queues, FIFO || Yes || OS Limits || OS Limits || Yes || Yes || No || No
|-
! class="table-rh" | HTCondor
| C++ || GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous || None, Triple DES, BLOWFISH || None, MD5 || None, NFS, AFS || Not official, hack with ACL and NFS4 || Heterogeneous || Yes || Yes || Fair-share with some programmability || basic (hard separation into different node) || tested ~10000? || tested ~100000? || Yes || MPI, OpenMP, PVM || Yes || Yes
|-
! class="table-rh" | PBS Pro
| C/Python || OS Authentication, Munge || || || Any, e.g., NFS, Lustre, GPFS, AFS || Limited availability || Heterogeneous || Yes || Yes || Fully configurable || Yes || tested ~50,000 || Millions || Yes || MPI, OpenMP || Yes || Yes
|-
! class="table-rh" | OpenLava
| C/C++ || OS authentication || None || || NFS || || Heterogeneous Linux || Yes || Yes || Configurable || Yes || || || Yes, supports preemption based on priority || Yes || Yes || No
|-
! class="table-rh" | Slurm
| C || Munge, None, Kerberos || || || || || Heterogeneous || Yes || Yes || Multifactor Fair-share || Yes || tested 120k || tested 100k || No || Yes || Yes || PySlurm
|-
! class="table-rh" | Spectrum LSF
| C/C++ || Multiple - OS Authentication/Kerberos || Optional || Optional || Any - GPFS/Spectrum Scale, NFS, SMB || Any - GPFS/Spectrum Scale, NFS, SMB || Heterogeneous - HW and OS agnostic (AIX, Linux or Windows) || Policy based - no queue to compute node binding || Policy based - no queue to compute group binding || Batch, interactive, checkpointing, parallel and combinations || Yes, and GPU aware (GPU license free) || > 9,000 compute hosts || > 4 million jobs a day || Yes, supports preemption based on priority, supports checkpointing/resume || Yes, e.g. parallel submissions for job collaboration over MPI || Yes, with support for user, kernel or library level checkpointing environments || Yes
|-
! Torque
| C || SSH, munge || || || None, any || || Heterogeneous || Yes || Yes || Programmable || Yes || tested || tested || Yes || Yes || Yes || Yes
|-
! class="table-rh" | Altair Grid Engine
| C || OS Authentication/Kerberos/OAuth2 || Certificate Based || Integrity || Arbitrary, e.g. NFS, Lustre, HDFS, AFS || AFS || Fully heterogeneous || Yes; automatically policy controlled (e.g. fair-share, deadline, resource dependent) or manual || Yes; can be dependent on user groups as well as projects and is governed by policies || Batch, interactive, checkpointing, parallel and combinations || Yes, with core binding, GPU and Intel Xeon Phi support || commercial deployments with many tens of thousands of hosts || >300K tested in commercial deployments || Yes; can suspend job on interactive usage || Yes, with support of arbitrary parallel environments such as OpenMPI, MPICH 1/2, MVAPICH 1/2, LAM, etc. || Yes, with support for user, kernel or library level checkpointing environments || drmaa2
|}

Table explanation
* ''Software'': The name of the application that is described
* ''SMP aware'':
** basic: hard split into multiple virtual hosts
** basic+: hard split into multiple virtual hosts with some minimal/incomplete communication between virtual hosts on the same computer
** dynamic: split the resources of the computer (CPU/RAM) on demand
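
The difference between the ''basic'' and ''dynamic'' values of ''SMP aware'' can be illustrated with a small Python sketch. This is a toy model, not the behaviour of any scheduler listed in the table: the class names `BasicNode` and `DynamicNode`, the method `try_start`, and the 8-core machine are all hypothetical and chosen only for illustration. It assumes that a ''basic'' node is divided into equally sized virtual hosts and that a job must fit inside one of them, while a ''dynamic'' node hands out cores on demand.

```python
class BasicNode:
    """'basic' mode (assumed model): the machine is hard-split into
    fixed-size virtual hosts, and a job must fit inside one of them."""

    def __init__(self, total_cores, virtual_hosts):
        self.slot_cores = total_cores // virtual_hosts  # cores per virtual host
        self.slot_busy = [False] * virtual_hosts

    def try_start(self, cores_needed):
        # A job larger than one virtual host can never start here,
        # even when the whole machine is idle.
        if cores_needed > self.slot_cores:
            return False
        for i, busy in enumerate(self.slot_busy):
            if not busy:
                self.slot_busy[i] = True
                return True
        return False


class DynamicNode:
    """'dynamic' mode (assumed model): cores are carved out of the
    machine on demand, so any job that fits the free capacity can start."""

    def __init__(self, total_cores):
        self.free_cores = total_cores

    def try_start(self, cores_needed):
        if cores_needed > self.free_cores:
            return False
        self.free_cores -= cores_needed
        return True


if __name__ == "__main__":
    basic = BasicNode(total_cores=8, virtual_hosts=4)  # four 2-core virtual hosts
    dynamic = DynamicNode(total_cores=8)
    print(basic.try_start(4))    # False: 4 cores do not fit in a 2-core virtual host
    print(dynamic.try_start(4))  # True: 4 of the 8 cores are allocated on demand
```

In this model the same idle 8-core machine rejects a 4-core job under the hard split but accepts it under on-demand allocation, which is the practical distinction the table's ''SMP aware'' column is drawing.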


See also

* List of volunteer computing projects
* List of cluster management software
* Computer cluster
* Grid computing
* World Community Grid
* Distributed computing
* Distributed resource management
* High-Throughput Computing
* Job Processing Cycle
* Batch processing
* Fallacies of Distributed Computing