What Is Apache Hadoop?

http://hadoop.apache.org/

The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, delivering a highly available service on top of a cluster of computers, each of which may be prone to failure.
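
To make the idea of handling failures at the application layer more concrete, here is a minimal sketch in plain Python. It is not Hadoop code and does not reflect how Hadoop schedules work internally; the names `WorkerFailure`, `run_with_failover`, `healthy_worker`, and `broken_worker` are hypothetical and exist only for this illustration. The assumption is simply that a task which fails on one machine can be retried on another.

```python
class WorkerFailure(Exception):
    """Raised by a simulated worker that has failed (hypothetical, for illustration)."""


def healthy_worker(task):
    # A worker that completes its task: here, summing a list of numbers.
    return sum(task)


def broken_worker(task):
    # A worker that always fails, standing in for a crashed machine.
    raise WorkerFailure("worker is down")


def run_with_failover(task, workers):
    """Try the task on each (name, worker) pair in order.

    Returns (name, result) from the first worker that succeeds.
    Raises RuntimeError if every worker fails.
    """
    for name, worker in workers:
        try:
            return name, worker(task)
        except WorkerFailure:
            # The failure is detected in software; move on to the next worker.
            continue
    raise RuntimeError("all workers failed")


if __name__ == "__main__":
    cluster = [("node-1", broken_worker), ("node-2", healthy_worker)]
    print(run_with_failover([1, 2, 3], cluster))
```

The point of the sketch is only that the retry logic lives in software, so no individual machine has to be reliable for the overall job to finish.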

The project includes these subprojects:

Hadoop Common: The common utilities that support the other Hadoop subprojects.
Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
Hadoop MapReduce: A software framework for distributed processing of large data sets on compute clusters.
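
The MapReduce programming model can be illustrated with a word count, sketched here in plain Python. This is an in-memory, single-process illustration of the map, shuffle, and reduce steps, not the Hadoop MapReduce API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical. In a real cluster the map and reduce calls would run in parallel on many machines.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit a (word, 1) pair for every word in one input document.
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    # Shuffle: group all emitted values by their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Reduce: combine all values for one key into a single result.
    return key, sum(values)


def word_count(documents):
    # Run the three steps in sequence over a list of documents.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

Because each map call looks at only one document and each reduce call looks at only one key, the work can in principle be split across machines without the functions themselves changing.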

Other Hadoop-related projects at Apache include:

Avro™: A data serialization system.
Cassandra™: A scalable multi-master database with no single points of failure.
Chukwa™: A data collection system for managing large distributed systems.
HBase™: A scalable, distributed database that supports structured data storage for large tables.
Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
Mahout™: A scalable machine learning and data mining library.
Pig™: A high-level data-flow language and execution framework for parallel computation.
ZooKeeper™: A high-performance coordination service for distributed applications.

Who Uses Hadoop?

A wide variety of companies and organizations use Hadoop for both research and production. Users are encouraged to add themselves to the Hadoop PoweredBy wiki page.
