What Is Apache Hadoop?
The Apache™ Hadoop™ project develops open-source software for reliable, scalable, distributed computing.
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
The project includes these subprojects:
- Hadoop Common: The common utilities that support the other Hadoop subprojects.
- Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
- Hadoop MapReduce: A software framework for distributed processing of large data sets on compute clusters.
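The "simple programming model" behind Hadoop MapReduce can be illustrated with a word count. The sketch below is not the Hadoop API: it is a single-process Python simulation, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for illustration. On a real cluster, the map and reduce steps would run in parallel on many machines and the framework would handle the grouping step between them.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all counts for one word into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over an iterable of text lines."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    lines = [
        "Hadoop stores data in HDFS",
        "Hadoop processes data with MapReduce",
    ]
    for word, count in sorted(word_count(lines).items()):
        print(word, count)
```

Because each call to map_phase depends only on its own input line, and each call to reduce_phase depends only on one key's values, both steps can be spread across machines without coordination. That independence is what lets the model scale from one server to many.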
Other Hadoop-related projects at Apache include:
- Avro™: A data serialization system.
- Cassandra™: A scalable multi-master database with no single points of failure.
- Chukwa™: A data collection system for managing large distributed systems.
- HBase™: A scalable, distributed database that supports structured data storage for large tables.
- Hive™: A data warehouse infrastructure that provides data summarization and ad hoc querying.
- Mahout™: A scalable machine learning and data mining library.
- Pig™: A high-level data-flow language and execution framework for parallel computation.
- ZooKeeper™: A high-performance coordination service for distributed applications.
Who Uses Hadoop?