Listing 1 - 4 of 4
High performance computing - Research grants - United States. --- Computer science --- High performance computing --- Engineering & Applied Sciences --- Computer Science --- Research grants --- Informatics --- HPC (Computer science) --- Science --- Electronic data processing --- Cyberinfrastructure --- Supercomputers
Recent advances in High Performance Computing hardware have significantly raised the level of available compute performance. At the same time, the growing hardware capabilities of modern supercomputing architectures have made parallel application development increasingly complex. Despite numerous efforts to improve and simplify parallel programming, a great deal of manual debugging and tuning work is still required. This process is supported by special software tools that facilitate debugging, performance analysis, and optimization, and thus make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools that were presented and discussed at the 6th International Parallel Tools Workshop, held in Stuttgart, Germany, 25-26 September 2012.
Cloud computing -- Research. --- Computational grids (Computer systems). --- High performance computing -- Research. --- High performance computing --- Mathematics --- Physical Sciences & Mathematics --- Mathematics - General --- Mathematics. --- Operating systems (Computers) --- Computer science. --- Informatics --- Computer operating systems --- Computers --- Disk operating systems --- Math --- Operating systems --- Computer software --- Computer programming. --- Application software. --- Computer mathematics. --- Computational Science and Engineering. --- Performance and Reliability. --- Programming Techniques. --- Computer Applications. --- Reusability. --- Science --- Systems software --- Operating systems (Computers). --- Computer software—Reusability. --- Application computer programs --- Application computer software --- Applications software --- Apps (Computer software) --- Electronic computer programming --- Electronic data processing --- Electronic digital computers --- Programming (Electronic computers) --- Coding theory --- Computer mathematics --- Programming --- Reusability of software --- Reusable code (Computer programs) --- Software reusability --- Software reengineering --- Generic programming (Computer science)
This timely text/reference presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, and silent error detection and correction, together with some application-specific techniques such as algorithm-based fault tolerance. Emphasis is placed on analytical performance models. This is followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models. Topics and features: includes self-contained contributions from an international selection of preeminent experts; provides a survey of resilience methods and performance models; examines the various sources of errors and faults in large-scale systems, detailing their characteristics, with a focus on modeling, detection and prediction; reviews the spectrum of techniques that can be applied to design a fault-tolerant message passing interface; investigates different approaches to replication, comparing these to the traditional checkpoint-recovery approach; discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems, proposing a methodology to estimate such energy consumption. This authoritative volume is essential reading for all researchers and graduate students involved in high-performance computing. Dr. Thomas Herault is a Research Scientist in the Innovative Computing Laboratory (ICL) at the University of Tennessee, Knoxville, TN, USA. Dr. Yves Robert is a Professor in the Laboratory of Parallel Computing at the Ecole Normale Supérieure de Lyon, France, and a Visiting Research Scholar in the ICL.
Computer Science. --- System Performance and Evaluation. --- Performance and Reliability. --- Numeric Computing. --- Computer science. --- Operating systems (Computers). --- Computer system performance. --- Electronic data processing. --- Informatique --- Systèmes d'exploitation (Ordinateurs) --- Fault-tolerant computing -- Equipment and supplies. --- High performance computing -- Research. --- Engineering & Applied Sciences --- Computer Science --- Fault-tolerant computing --- High performance computing --- Equipment and supplies. --- Research. --- Computing, Fault-tolerant --- Computer software --- Computer system failures. --- Numerical analysis. --- Reusability. --- Electronic data processing --- Electronic digital computers --- Fault tolerance (Engineering) --- Computer system failures --- Reliability --- ADP (Data processing) --- Automatic data processing --- Data processing --- EDP (Data processing) --- IDP (Data processing) --- Integrated data processing --- Computers --- Office practice --- Computer operating systems --- Disk operating systems --- Systems software --- Automation --- Operating systems --- Computer software—Reusability. --- Mathematical analysis --- Computer failures --- Computer malfunctions --- Computer systems --- Failure of computer systems --- System failures (Engineering) --- Failures
The proceedings of the 5th International Workshop on Parallel Tools for High Performance Computing provide an overview of supportive software tools and environments in the fields of System Management, Parallel Debugging and Performance Analysis. In an effort to maintain exponential growth in the performance of high performance computers, the HPC community is currently targeting Exascale Systems. Initial planning for Exascale began when the first Petaflop system was delivered. Many challenges need to be addressed to reach the necessary performance. Scalability, energy efficiency and fault tolerance need to be increased by orders of magnitude. The goal can only be achieved when advanced hardware is combined with a suitable software stack. In fact, the importance of software is rapidly growing. As a result, many international projects focus on the necessary software.
Cloud computing -- Research. --- Computational grids (Computer systems). --- Computer science. --- Computer software. --- Computer system performance. --- Data mining. --- High performance computing -- Research. --- Operating systems (Computers). --- Engineering & Applied Sciences --- Computer Science --- High performance computing --- Parallel programming (Computer science) --- Computer software --- Computer system failures. --- Programming languages (Electronic computers). --- Algorithms. --- Application software. --- Computer Science. --- System Performance and Evaluation. --- Performance and Reliability. --- Programming Languages, Compilers, Interpreters. --- Algorithm Analysis and Problem Complexity. --- Data Mining and Knowledge Discovery. --- Computer Applications. --- Reusability. --- Algorithmic knowledge discovery --- Factual data analysis --- KDD (Information retrieval) --- Knowledge discovery in data --- Knowledge discovery in databases --- Mining, Data --- Database searching --- Software, Computer --- Computer systems --- Computer operating systems --- Computers --- Disk operating systems --- Systems software --- Informatics --- Science --- Operating systems --- Computer software—Reusability. --- Application computer programs --- Application computer software --- Applications software --- Apps (Computer software) --- Algorism --- Algebra --- Arithmetic --- Computer languages --- Computer program languages --- Computer programming languages --- Machine language --- Electronic data processing --- Languages, Artificial --- Computer failures --- Computer malfunctions --- Failure of computer systems --- System failures (Engineering) --- Fault-tolerant computing --- Foundations --- Failures --- Reusability of software --- Reusable code (Computer programs) --- Software reusability --- Software reengineering --- Generic programming (Computer science)