98 Results for: hadoop

Hadoop 2

Apache Hadoop 2 is the second iteration of the Hadoop framework for distributed data processing. Hadoop 2 adds support for running non-batch applications as well as new features to improve system availability.

SQL-on-Hadoop

SQL-on-Hadoop is a class of analytical application tools that combine established SQL-style querying with newer Hadoop data framework elements.

Hadoop data lake

A Hadoop data lake is a data management platform comprising one or more Hadoop clusters.

Hadoop

Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications running in clustered systems.

Apache Hadoop YARN

Apache Hadoop YARN is the resource management and job scheduling technology in the open source Hadoop distributed processing framework.

Hadoop as a service (HaaS)

Hadoop as a service provides organizations with big data analytics capabilities that are deployed and managed through a third party. This lets organizations without in-house expertise use the complicated Hadoop ...

Hadoop cluster

A Hadoop cluster is a special type of computational cluster designed specifically for storing and analyzing huge amounts of unstructured data in a distributed computing environment.

Hadoop Distributed File System (HDFS)

The Hadoop Distributed File System (HDFS) is the primary data storage system used by Hadoop applications.

MapReduce

MapReduce is a core component of the Apache Hadoop software framework.

Apache Parquet

Apache Parquet is a column-oriented storage format for Hadoop.
