99 Results for: hadoop

Hadoop 2

Apache Hadoop 2 is the second iteration of the Hadoop framework for distributed data processing. Hadoop 2 adds support for running non-batch applications, along with new features that improve system availability.

SQL-on-Hadoop

SQL-on-Hadoop is a class of analytical application tools that combine established SQL-style querying with newer Hadoop data framework elements.

Hadoop as a service (HaaS)

Hadoop as a service (HaaS), also known as Hadoop in the cloud, is a big data analytics framework that stores and analyzes data in the cloud using Hadoop.

Hadoop data lake

A Hadoop data lake is a data management platform comprising one or more Hadoop clusters.

Hadoop

Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications running in clustered systems.

Apache Hadoop YARN

Apache Hadoop YARN is the resource management and job scheduling technology in the open source Hadoop distributed processing framework.

Hadoop cluster

A Hadoop cluster is a special type of computational cluster designed specifically for storing and analyzing huge amounts of unstructured data in a distributed computing environment.

Hadoop Distributed File System (HDFS)

The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications.

MapReduce

MapReduce is a core component of the Apache Hadoop software framework.

Apache Parquet

Apache Parquet is a column-oriented storage format for Hadoop.
