Listing 1 - 4 of 4
Funding a revolution : government support for computing research
Author:
ISBN: 0309062780 9786612081910 1282081918 0309525012 0585142734 9780585142739 9780309062787 0305062780 0309173523 9780309173520 9781282081918 6612081910 9780309525015
Year: 1999
Publisher: Washington, D.C. : National Academy Press


Book
Tools for high performance computing 2012
Authors: ---
ISBN: 3642373488 3642373496
Year: 2013
Publisher: Heidelberg, Germany : Springer

Abstract

The latest advances in high-performance computing hardware have significantly raised the level of available compute performance. At the same time, the growing hardware capabilities of modern supercomputing architectures have made parallel application development increasingly complex. Despite numerous efforts to improve and simplify parallel programming, a great deal of manual debugging and tuning work is still required. This process is supported by special software tools that facilitate debugging, performance analysis, and optimization, and thus make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools that were presented and discussed at the 6th International Parallel Tools Workshop, held in Stuttgart, Germany, 25-26 September 2012.


Book
Fault-Tolerance Techniques for High-Performance Computing
Authors: ---
ISBN: 9783319209432 3319209426 9783319209425 3319209434
Year: 2015
Publisher: Cham : Springer International Publishing : Imprint: Springer

Abstract

This timely text/reference presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correction, together with some application-specific techniques such as algorithm-based fault tolerance. Emphasis is placed on analytical performance models. This is then followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models.

Topics and features:
- Includes self-contained contributions from an international selection of preeminent experts
- Provides a survey of resilience methods and performance models
- Examines the various sources of errors and faults in large-scale systems, detailing their characteristics, with a focus on modeling, detection and prediction
- Reviews the spectrum of techniques that can be applied to design a fault-tolerant message passing interface
- Investigates different approaches to replication, comparing these to the traditional checkpoint-recovery approach
- Discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems, proposing a methodology to estimate such energy consumption

This authoritative volume is essential reading for all researchers and graduate students involved in high-performance computing.

Dr. Thomas Herault is a Research Scientist in the Innovative Computing Laboratory (ICL) at the University of Tennessee, Knoxville, TN, USA. Dr. Yves Robert is a Professor in the Laboratory of Parallel Computing at the Ecole Normale Supérieure de Lyon, France, and a Visiting Research Scholar in the ICL.


Book
Tools for high performance computing 2011 : proceedings of the 5th International Workshop on Parallel Tools for High Performance Computing, September 2011, ZIH, Dresden
Authors: ---
ISBN: 3642314759 3642439853 1283640597 3642314767
Year: 2012
Publisher: New York : Springer

Abstract

The proceedings of the 5th International Workshop on Parallel Tools for High Performance Computing provide an overview of supportive software tools and environments in the fields of System Management, Parallel Debugging and Performance Analysis. In pursuit of sustained exponential growth in the performance of high-performance computers, the HPC community is currently targeting Exascale systems. The initial planning for Exascale had already started when the first Petaflop system was delivered. Many challenges need to be addressed to reach the necessary performance. Scalability, energy efficiency and fault tolerance need to be increased by orders of magnitude. The goal can only be achieved when advanced hardware is combined with a suitable software stack. In fact, the importance of software is rapidly growing. As a result, many international projects focus on the necessary software.

Keywords

Cloud computing -- Research. --- Computational grids (Computer systems). --- Computer science. --- Computer software. --- Computer system performance. --- Data mining. --- High performance computing -- Research. --- Operating systems (Computers). --- Engineering & Applied Sciences --- Computer Science --- High performance computing --- Parallel programming (Computer science) --- Computer software --- Computer system failures. --- Programming languages (Electronic computers). --- Algorithms. --- Application software. --- Computer Science. --- System Performance and Evaluation. --- Performance and Reliability. --- Programming Languages, Compilers, Interpreters. --- Algorithm Analysis and Problem Complexity. --- Data Mining and Knowledge Discovery. --- Computer Applications. --- Reusability. --- Algorithmic knowledge discovery --- Factual data analysis --- KDD (Information retrieval) --- Knowledge discovery in data --- Knowledge discovery in databases --- Mining, Data --- Database searching --- Software, Computer --- Computer systems --- Computer operating systems --- Computers --- Disk operating systems --- Systems software --- Informatics --- Science --- Operating systems --- Computer software—Reusability. --- Application computer programs --- Application computer software --- Applications software --- Apps (Computer software) --- Algorism --- Algebra --- Arithmetic --- Computer languages --- Computer program languages --- Computer programming languages --- Machine language --- Electronic data processing --- Languages, Artificial --- Computer failures --- Computer malfunctions --- Failure of computer systems --- System failures (Engineering) --- Fault-tolerant computing --- Foundations --- Failures --- Reusability of software --- Reusable code (Computer programs) --- Software reusability --- Software reengineering --- Generic programming (Computer science)
