Energy savings and performance improvements with SSDs in the Hadoop Distributed File System

Polato, Ivanilton

Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística 2016-08-29

Online access. The library also holds print copies.

  • Title:
    Energy savings and performance improvements with SSDs in the Hadoop Distributed File System
  • Author: Polato, Ivanilton
  • Advisor: Kon, Fabio
  • Subjects: Hybrid Storage; Parallel File Systems; Distributed File Systems; HDFS; SSDs; Hadoop; Energy Efficiency; Green Computing; Solid-State Disks
  • Notes: Thesis (Doctorate)
  • Description: Energy issues have attracted strong attention over the past decade and now reach IT data processing infrastructures. These infrastructures must cope with that responsibility by adjusting existing platforms to reach acceptable performance while reducing energy consumption. As the de facto platform for Big Data, Apache Hadoop has evolved significantly over the last years, with more than 60 releases bringing new features. By implementing the MapReduce programming paradigm and leveraging HDFS, its distributed file system, Hadoop has become a reliable and fault-tolerant middleware for parallel and distributed computing over large datasets. Nevertheless, Hadoop may struggle under certain workloads, resulting in poor performance and high energy consumption. Users increasingly demand that high-performance computing solutions address sustainability and limit energy consumption. In this thesis, we introduce HDFSH, a hybrid storage mechanism for HDFS that combines hard disks (HDs) and solid-state disks (SSDs) to achieve higher performance while saving power in Hadoop computations. HDFSH brings to the middleware the best of HDs (affordable cost per GB and high storage capacity) and SSDs (high throughput and low energy consumption) in a configurable fashion, using dedicated storage zones for each storage device type. We implemented our mechanism as a block placement policy for HDFS and assessed it over six recent releases of Hadoop with different architectural properties. Results indicate that our approach increases overall job performance while decreasing energy consumption under most of the hybrid configurations evaluated. Our results also show that, in many cases, storing only part of the data in SSDs yields significant energy savings and execution speedups.
  • DOI: 10.11606/T.45.2016.tde-31102016-155908
  • Publisher: Biblioteca Digital de Teses e Dissertações da USP; Universidade de São Paulo; Instituto de Matemática e Estatística
  • Creation/publication date: 2016-08-29
  • Format: Adobe PDF
  • Language: English
