
Bigdata Videos

HBase Install and Basic Commands [Hadoop video lecture]


This post walks through an HBase installation, step by step.   [column-oriented DB]


Prerequisite: a running HDFS cluster








[Referred to the sites below and downloaded the HBase tar.gz file]


http://hbase.apache.org/

http://www.apache.org/dyn/closer.cgi/hbase/

http://hbase.apache.org/book/quickstart.html



# vi /etc/profile 


export HBASE_HOME=/home/hadoop/hbase

export PATH=$PATH:$JAVA_HOME/bin:$ANT_HOME:$HADOOP_HOME/bin:$HBASE_HOME/bin


# source /etc/profile   

# cat hbase-env.sh 

# The java implementation to use.  Java 1.6 required.


 export JAVA_HOME=/usr/local/java

 export HBASE_CLASSPATH=/home/hadoop/hbase/conf

 export HBASE_MANAGES_ZK=true


# The maximum amount of heap to use, in MB. Default is 1000.

# export HBASE_HEAPSIZE=1000


# Extra Java runtime options.

# Below are what we set by default.  May only work with SUN JVM.

# For more on why as well as other possible settings,

# see http://wiki.apache.org/hadoop/PerformanceTuning


export HBASE_OPTS="-XX:+UseConcMarkSweepGC"


# Uncomment one of the below three options to enable java garbage collection logging for the server-side processes.


# This enables basic gc logging to the .out file.

# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"


# This enables basic gc logging to its own file.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .

# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"


# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .

# export SERVER_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"


# Uncomment one of the below three options to enable java garbage collection logging for the client processes.


# This enables basic gc logging to the .out file.

# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"


# This enables basic gc logging to its own file.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .

# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH>"


# This enables basic GC logging to its own file with automatic log rolling. Only applies to jdk 1.6.0_34+ and 1.7.0_2+.

# If FILE-PATH is not replaced, the log file(.gc) would still be generated in the HBASE_LOG_DIR .

# export CLIENT_GC_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<FILE-PATH> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1 -XX:GCLogFileSize=512M"


# Uncomment below if you intend to use the EXPERIMENTAL off heap cache.

# export HBASE_OPTS="$HBASE_OPTS -XX:MaxDirectMemorySize="

# Set hbase.offheapcache.percentage in hbase-site.xml to a nonzero value.



# Uncomment and adjust to enable JMX exporting

# See jmxremote.password and jmxremote.access in $JRE_HOME/lib/management to configure remote password access.

# More details at: http://java.sun.com/javase/6/docs/technotes/guides/management/agent.html

#

# export HBASE_JMX_BASE="-Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false"

# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10101"

# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10102"

# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10103"

# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS $HBASE_JMX_BASE -Dcom.sun.management.jmxremote.port=10104"


# File naming hosts on which HRegionServers will run.  $HBASE_HOME/conf/regionservers by default.

# export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers


# File naming hosts on which backup HMaster will run.  $HBASE_HOME/conf/backup-masters by default.

# export HBASE_BACKUP_MASTERS=${HBASE_HOME}/conf/backup-masters


# Extra ssh options.  Empty by default.

# export HBASE_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HBASE_CONF_DIR"


# Where log files are stored.  $HBASE_HOME/logs by default.

# export HBASE_LOG_DIR=${HBASE_HOME}/logs


# Enable remote JDWP debugging of major HBase processes. Meant for Core Developers 

# export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8070"

# export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8071"

# export HBASE_THRIFT_OPTS="$HBASE_THRIFT_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8072"

# export HBASE_ZOOKEEPER_OPTS="$HBASE_ZOOKEEPER_OPTS -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=8073"


# A string representing this instance of hbase. $USER by default.

# export HBASE_IDENT_STRING=$USER


# The scheduling priority for daemon processes.  See 'man nice'.

# export HBASE_NICENESS=10


# The directory where pid files are stored. /tmp by default.

# export HBASE_PID_DIR=/var/hadoop/pids


# Seconds to sleep between slave commands.  Unset by default.  This

# can be useful in large clusters, where, e.g., slave rsyncs can

# otherwise arrive faster than the master can service them.

# export HBASE_SLAVE_SLEEP=0.1


# Tell HBase whether it should manage its own instance of Zookeeper or not.

# export HBASE_MANAGES_ZK=true



[hadoop@h001 conf]$ vi hbase-site.xml 

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>


<configuration>


 <property> 

  <name>hbase.rootdir</name>

  <value>hdfs://192.168.73.71:9000/hbase</value> 

 </property>


 <property> 

  <name>hbase.master</name> 

  <value>192.168.73.71:60000</value> 

 </property> 


 <property> 

  <name>hbase.zookeeper.quorum</name> 

  <value>192.168.73.71,192.168.73.72,192.168.73.73,192.168.73.74</value> 

 </property> 


 <property> 

  <name>hbase.zookeeper.property.dataDir</name> 

  <value>/home/hadoop/zk_data</value> 

 </property> 


 <property> 

  <name>hbase.cluster.distributed</name> 

  <value>true</value> 

 </property> 


 <property> 

  <name>dfs.support.append</name> 

  <value>true</value> 

 </property> 


 <property> 

  <name>dfs.datanode.max.xcievers</name> 

  <value>4096</value> 

 </property>


</configuration>
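Before starting the cluster, it can be worth sanity-checking that hbase-site.xml parses and contains the properties you expect. A minimal sketch using Python's standard library (the embedded XML mirrors the configuration above; in practice you would read $HBASE_HOME/conf/hbase-site.xml instead):

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the hbase-site.xml above; read the real file in practice.
conf_xml = """
<configuration>
 <property>
  <name>hbase.rootdir</name>
  <value>hdfs://192.168.73.71:9000/hbase</value>
 </property>
 <property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
 </property>
</configuration>
"""

def load_hbase_conf(xml_text):
    """Return {name: value} for every <property> element."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

conf = load_hbase_conf(conf_xml)
print(conf["hbase.rootdir"])   # hdfs://192.168.73.71:9000/hbase
```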


[hadoop@h001 conf]$ cat regionservers 

192.168.73.72
192.168.73.73
192.168.73.74


$ tar cvzf hbase.tar.gz hbase          # pack the configured HBase directory
$ scp hbase.tar.gz hadoop@h002:.       # copy it to a region server node
$ ssh h002 tar xvzf hbase.tar.gz       # unpack remotely (repeat for the other nodes)

$ ./bin/start-hbase.sh


$ ./bin/hbase shell


[hadoop@h001 hbase]$ jps

3855 HQuorumPeer  <-- ZooKeeper

3935 HMaster      <-- HBase master

4134 Jps

3137 NameNode

3333 JobTracker


http://h001:60010

http://h001:50070/dfshealth.jsp

http://h001:50030/jobtracker.jsp


[hadoop@h001 hbase]$ ./bin/hbase shell

HBase Shell; enter 'help<RETURN>' for list of supported commands.

Type "exit<RETURN>" to leave the HBase Shell

Version 0.94.7, r1471806, Wed Apr 24 18:48:26 PDT 2013



hbase(main):001:0> help

HBase Shell, version 0.94.7, r1471806, Wed Apr 24 18:48:26 PDT 2013

Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.

Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.


COMMAND GROUPS:

  Group name: general

  Commands: status, version, whoami


  Group name: ddl

  Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, is_disabled, is_enabled, list, show_filters


  Group name: dml

  Commands: count, delete, deleteall, get, get_counter, incr, put, scan, truncate


  Group name: tools

  Commands: assign, balance_switch, balancer, close_region, compact, flush, hlog_roll, major_compact, move, split, unassign, zk_dump


  Group name: replication

  Commands: add_peer, disable_peer, enable_peer, list_peers, remove_peer, start_replication, stop_replication


  Group name: snapshot

  Commands: clone_snapshot, delete_snapshot, list_snapshots, restore_snapshot, snapshot


  Group name: security

  Commands: grant, revoke, user_permission


SHELL USAGE:

Quote all names in HBase Shell such as table and column names.  Commas delimit

command parameters.  Type <RETURN> after entering a command to run it.

Dictionaries of configuration used in the creation and alteration of tables are

Ruby Hashes. They look like this:


  {'key1' => 'value1', 'key2' => 'value2', ...}


and are opened and closed with curley-braces.  Key/values are delimited by the

'=>' character combination.  Usually keys are predefined constants such as

NAME, VERSIONS, COMPRESSION, etc.  Constants do not need to be quoted.  Type

'Object.constants' to see a (messy) list of all constants in the environment.


If you are using binary keys or values and need to enter them in the shell, use

double-quote'd hexadecimal representation. For example:


  hbase> get 't1', "key\x03\x3f\xcd"

  hbase> get 't1', "key\003\023\011"

  hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"


The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.

For more on the HBase Shell, see http://hbase.apache.org/docs/current/book.html



hbase(main):002:0> help 'create'
Create table; pass table name, a dictionary of specifications per
column family, and optionally a dictionary of table configuration.
Dictionaries are described below in the GENERAL NOTES section.
Examples:

  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', 'f1', {SPLITS => ['10', '20', '30', '40']}
  hbase> create 't1', 'f1', {SPLITS_FILE => 'splits.txt'}
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}

hbase(main):003:0> 

hbase(main):039:0> create 'ta1', 'cf1'


hbase(main):035:0> describe 'ta1'


DESCRIPTION                                                      ENABLED

 'ta1', {NAME => 'cf1', DATA_BLOCK_ENCODING => 'NONE',           false
 BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0',
 VERSIONS => '3', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
 TTL => '2147483647', KEEP_DELETED_CELLS => 'false',
 BLOCKSIZE => '65536', IN_MEMORY => 'false',
 ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}

1 row(s) in 0.0660 seconds
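Note the VERSIONS => '3' attribute in the describe output: each cell in this column family retains up to three timestamped versions, and older ones are evicted. How that retention behaves can be modeled with a small illustrative sketch (not the HBase API; the VersionedCell class is hypothetical):

```python
# Illustrative model of a cell in a column family with VERSIONS => '3':
# at most the 3 newest (timestamp, value) pairs are retained per cell.
class VersionedCell:
    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.versions = []  # (timestamp, value) pairs, newest first

    def put(self, timestamp, value):
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        del self.versions[self.max_versions:]  # evict versions beyond the limit

    def get(self):
        """Latest value, like a plain 'get' in the shell."""
        return self.versions[0][1] if self.versions else None

cell = VersionedCell(max_versions=3)
for ts in (1, 2, 3, 4):
    cell.put(ts, "v%d" % ts)
print(cell.get())            # v4
print(len(cell.versions))    # 3
```

After four puts only three versions survive; the timestamp-1 version has been evicted.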



hbase(main):039:0> disable 'ta1'
hbase(main):036:0> is_enabled 'ta1'

false                                                                           

0 row(s) in 0.0160 seconds



hbase(main):037:0> enable 'ta1'

0 row(s) in 2.1680 seconds



hbase(main):038:0> is_enabled 'ta1'

true                                                                            

0 row(s) in 0.0170 seconds

hbase(main):044:0> drop 'ta1'


ERROR: Table ta1 is enabled. Disable it first.


Here is some help for this command:

Drop the named table. Table must first be disabled: e.g. "hbase> drop 't1'"




hbase(main):045:0> disable 'ta1'

0 row(s) in 2.1170 seconds



hbase(main):046:0> drop 'ta1'

0 row(s) in 1.1630 seconds


==> On drop, the /hbase/ta1 directory that had been created in HDFS is deleted as well
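The disable-then-drop requirement seen above can be thought of as a small state machine: a table must be disabled before it can be dropped, which is exactly why the first drop failed. An illustrative Python sketch (not the HBase API; the Table class here is hypothetical):

```python
# Hypothetical model of the shell's guard: drop is refused while enabled.
class Table:
    def __init__(self, name):
        self.name = name
        self.enabled = True    # tables are created enabled
        self.dropped = False

    def disable(self):
        self.enabled = False

    def enable(self):
        self.enabled = True

    def drop(self):
        if self.enabled:
            raise RuntimeError(
                "Table %s is enabled. Disable it first." % self.name)
        self.dropped = True  # in HBase this also removes /hbase/<name> in HDFS

t = Table("ta1")
try:
    t.drop()
except RuntimeError as e:
    print(e)        # Table ta1 is enabled. Disable it first.
t.disable()
t.drop()
print(t.dropped)    # True
```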


hbase(main):047:0> list

TABLE                                                                           

hbase009                                                                        

regionsplit_table                                                               

test                                                                            

textxx                                                                          

4 row(s) in 0.0570 seconds



hbase(main):048:0> create 'table01', 'cf'
0 row(s) in 1.1100 seconds


hbase(main):049:0> put 'table01', 'row001', 'cf:a', 'value i wanna put'
0 row(s) in 0.1260 seconds


hbase(main):050:0> put 'table01', 'row001', 'cf:b', 'value b i wanna put'
0 row(s) in 0.0200 seconds


hbase(main):051:0> put 'table01', 'row001', 'cf:c', 'value c i wanna put'
0 row(s) in 0.0150 seconds


hbase(main):052:0> put 'table01', 'row002', 'cf:2a', 'value 2a i wanna put'
0 row(s) in 0.0120 seconds


hbase(main):053:0> put 'table01', 'row003', 'cf:3a', 'value 3a i wanna put'
0 row(s) in 0.0260 seconds


hbase(main):054:0> scan 'table01'

ROW                   COLUMN+CELL                                               
 row001               column=cf:a, timestamp=1372602441690, value=value i wanna put
 row001               column=cf:b, timestamp=1372602450824, value=value b i wanna put
 row001               column=cf:c, timestamp=1372602456583, value=value c i wanna put                                                     
 row002               column=cf:2a, timestamp=1372602470758, value=value 2a i wanna put                                                   
 row003               column=cf:3a, timestamp=1372602481567, value=value 3a i wanna put                                                   
3 row(s) in 0.1360 seconds
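The scan output reflects HBase's logical data model: a table is a sorted map from row key to columns (family:qualifier), each holding a timestamped value. A scan walks rows in key order; a get looks up one row. An illustrative sketch of that model in Python (not the HBase client API; the scan/get helpers are hypothetical):

```python
# Logical model: row key -> {"family:qualifier": (timestamp, value)},
# mirroring the table01 contents built up with the put commands above.
table01 = {
    "row001": {"cf:a": (1372602441690, "value i wanna put"),
               "cf:b": (1372602450824, "value b i wanna put"),
               "cf:c": (1372602456583, "value c i wanna put")},
    "row002": {"cf:2a": (1372602470758, "value 2a i wanna put")},
    "row003": {"cf:3a": (1372602481567, "value 3a i wanna put")},
}

def scan(table):
    """Yield (row, column, timestamp, value) in row-key order, like 'scan'."""
    for row in sorted(table):
        for col, (ts, val) in sorted(table[row].items()):
            yield row, col, ts, val

def get(table, row, columns=None):
    """Return one row's cells, optionally limited to columns, like 'get'."""
    cells = table.get(row, {})
    if columns is not None:
        cells = {c: v for c, v in cells.items() if c in columns}
    return cells

print(len(list(scan(table01))))   # 5 cells across 3 rows
print(sorted(get(table01, "row001")))   # ['cf:a', 'cf:b', 'cf:c']
```

This matches the shell: the scan reported 5 cells but "3 row(s)", because the row count is over distinct row keys, not cells.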



hbase(main):067:0> get 'table01', 'row001'


COLUMN                            CELL                                                                                           
 cf:a                             timestamp=1372602441690, value=value i wanna put                                               
 cf:b                             timestamp=1372602450824, value=value b i wanna put                                             
 cf:c                             timestamp=1372602456583, value=value c i wanna put                                             
3 row(s) in 0.0240 seconds

hbase(main):068:0> get 'table01', 'row001', 'cf:a'


COLUMN                            CELL                                                                                           
 cf:a                             timestamp=1372602441690, value=value i wanna put                                               
1 row(s) in 0.0220 seconds

hbase(main):070:0> get 'table01', 'row001', 'cf:a', 'cf:b'


COLUMN                            CELL                                                                                           
 cf:a                             timestamp=1372602441690, value=value i wanna put                                               
 cf:b                             timestamp=1372602450824, value=value b i wanna put                                             
2 row(s) in 0.0270 seconds

hbase(main):071:0> get 'table01', 'row001', ['cf:a', 'cf:b']

COLUMN                            CELL                                                                                           
 cf:a                             timestamp=1372602441690, value=value i wanna put                                               
 cf:b                             timestamp=1372602450824, value=value b i wanna put                                             
2 row(s) in 0.0330 seconds



hbase(main):075:0> import java.util.Date 

=> Java::JavaUtil::Date


hbase(main):076:0> Date.new(1372602456583).toString()

=> "Sun Jun 30 23:27:36 KST 2013"
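The same conversion works outside the shell, since HBase cell timestamps are simply milliseconds since the Unix epoch; the JRuby Date.new call above is equivalent to this Python sketch:

```python
from datetime import datetime, timezone

# HBase timestamps are epoch milliseconds; divide by 1000 for seconds.
ts_ms = 1372602456583
dt = datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc)
print(dt.strftime("%Y-%m-%d %H:%M:%S %Z"))  # 2013-06-30 14:27:36 UTC
```

14:27:36 UTC is 23:27:36 KST (UTC+9), matching the shell output above.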




A table can also be created pre-split from the command line (here test_table, 3 regions, column family f1, using the HexStringSplit algorithm):

$ hbase org.apache.hadoop.hbase.util.RegionSplitter test_table HexStringSplit -c 3 -f f1
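HexStringSplit places region boundaries uniformly over the hexadecimal key space. Roughly what it computes can be sketched as follows (an illustrative approximation, not the RegionSplitter source; the real tool works on the same principle over the 8-hex-digit key space):

```python
# Approximate HexStringSplit: divide the 32-bit hex key space
# [00000000, ffffffff] into n uniform regions and return the
# n-1 boundary keys that separate them.
def hex_string_split(n_regions):
    space = 2 ** 32
    step = space // n_regions
    return [format(i * step, "08x") for i in range(1, n_regions)]

print(hex_string_split(3))   # ['55555555', 'aaaaaaaa']
```

So -c 3 yields two split points, giving three regions of roughly equal key range.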




[hadoop@h001 hbase]$ ps auxk -rss | less                      # processes sorted by resident memory

[hadoop@h001 hbase]$ jmap -heap <pid>                         # JVM heap summary for a given pid
[hadoop@h001 hbase]$ lsof -u hadoop | wc -l                   # open file descriptors held by user hadoop
[hadoop@h001 hbase]$ ps -o pid,comm,user,thcount -u hadoop    # thread count per process for user hadoop



[Referred to http://hortonworks.com/blog/apache-hbase-region-splitting-and-merging/]

[Referred to the book 'HBase 클러스터 구축과 관리' (HBase Cluster Construction and Management)]