

Checking the effect of changing the Map Output Key: UniqueCarrier and additional logic added to Year, Month


When only Year and Month are used as the Map Output Key



13/07/11 18:30:42 INFO mapred.JobClient: Job complete: job_201307111807_0001

13/07/11 18:30:43 INFO mapred.JobClient: Counters: 30

13/07/11 18:30:43 INFO mapred.JobClient:   Job Counters 

13/07/11 18:30:43 INFO mapred.JobClient:     Launched reduce tasks=1

13/07/11 18:30:43 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2531434

13/07/11 18:30:43 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

13/07/11 18:30:43 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

13/07/11 18:30:43 INFO mapred.JobClient:     Rack-local map tasks=6

13/07/11 18:30:43 INFO mapred.JobClient:     Launched map tasks=118

13/07/11 18:30:43 INFO mapred.JobClient:     Data-local map tasks=112

13/07/11 18:30:43 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=442589

13/07/11 18:30:43 INFO mapred.JobClient:   File Output Format Counters 

13/07/11 18:30:43 INFO mapred.JobClient:     Bytes Written=369

13/07/11 18:30:43 INFO mapred.JobClient:   FileSystemCounters

13/07/11 18:30:43 INFO mapred.JobClient:     FILE_BYTES_READ=69043637

13/07/11 18:30:43 INFO mapred.JobClient:     HDFS_BYTES_READ=1204389180

13/07/11 18:30:43 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=140614063

13/07/11 18:30:43 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=369

13/07/11 18:30:43 INFO mapred.JobClient:   File Input Format Counters 

13/07/11 18:30:43 INFO mapred.JobClient:     Bytes Read=1204376530

13/07/11 18:30:43 INFO mapred.JobClient:   Map-Reduce Framework

13/07/11 18:30:43 INFO mapred.JobClient:     Map output materialized bytes=69044321

13/07/11 18:30:43 INFO mapred.JobClient:     Map input records=12655313

13/07/11 18:30:43 INFO mapred.JobClient:     Reduce shuffle bytes=69044321

13/07/11 18:30:43 INFO mapred.JobClient:     Spilled Records=10425302

13/07/11 18:30:43 INFO mapred.JobClient:     Map output bytes=58618329

13/07/11 18:30:43 INFO mapred.JobClient:     Total committed heap usage (bytes)=18297933824

13/07/11 18:30:43 INFO mapred.JobClient:     CPU time spent (ms)=491740

13/07/11 18:30:43 INFO mapred.JobClient:     Combine input records=0

13/07/11 18:30:43 INFO mapred.JobClient:     SPLIT_RAW_BYTES=12650

13/07/11 18:30:43 INFO mapred.JobClient:     Reduce input records=5212651

13/07/11 18:30:43 INFO mapred.JobClient:     Reduce input groups=24

13/07/11 18:30:43 INFO mapred.JobClient:     Combine output records=0

13/07/11 18:30:43 INFO mapred.JobClient:     Physical memory (bytes) snapshot=24657891328

13/07/11 18:30:43 INFO mapred.JobClient:     Reduce output records=24

13/07/11 18:30:43 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=83286425600

13/07/11 18:30:43 INFO mapred.JobClient:     Map output records=5212651
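`Reduce input groups=24` in the log above is the number of distinct map output keys that reached the reducer. The following is a minimal sketch, in plain Python rather than as a Hadoop job, of how that group count depends on which fields go into the key. The sample rows are made up for illustration and `group_counts` is a hypothetical helper, not part of the job that produced this log.

```python
from collections import defaultdict

# Hypothetical sample rows: (Year, Month, UniqueCarrier).
# These values are invented for illustration; they are not taken from the job input.
records = [
    (2007, 1, "AA"),
    (2007, 1, "UA"),
    (2007, 2, "AA"),
    (2008, 1, "AA"),
    (2008, 1, "DL"),
]


def group_counts(rows, key_fn):
    """Count rows per key, mimicking how the shuffle groups map output by key."""
    groups = defaultdict(int)
    for row in rows:
        groups[key_fn(row)] += 1
    return dict(groups)


# Key built from Year and Month only.
by_year_month = group_counts(records, lambda r: (r[0], r[1]))

# Key built from Year, Month, and UniqueCarrier.
by_year_month_carrier = group_counts(records, lambda r: (r[0], r[1], r[2]))

print("distinct (Year, Month) keys:", len(by_year_month))
print("distinct (Year, Month, UniqueCarrier) keys:", len(by_year_month_carrier))
```

Adding a field to the key can only keep or increase the number of distinct keys, so the reducer sees at least as many groups, each holding fewer records.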






When Year, Month, and UniqueCarrier are used as the Map Output Key



13/07/11 19:26:37 INFO mapred.JobClient: Job complete: job_201307111807_0007

13/07/11 19:26:37 INFO mapred.JobClient: Counters: 30

13/07/11 19:26:37 INFO mapred.JobClient:   Job Counters 

13/07/11 19:26:37 INFO mapred.JobClient:     Launched reduce tasks=1

13/07/11 19:26:37 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=2633868

13/07/11 19:26:37 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0

13/07/11 19:26:37 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0

13/07/11 19:26:37 INFO mapred.JobClient:     Rack-local map tasks=4

13/07/11 19:26:37 INFO mapred.JobClient:     Launched map tasks=118

13/07/11 19:26:37 INFO mapred.JobClient:     Data-local map tasks=114

13/07/11 19:26:37 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=450543

13/07/11 19:26:37 INFO mapred.JobClient:   File Output Format Counters 

13/07/11 19:26:37 INFO mapred.JobClient:     Bytes Written=6861

13/07/11 19:26:37 INFO mapred.JobClient:   FileSystemCounters

13/07/11 19:26:37 INFO mapred.JobClient:     FILE_BYTES_READ=47084767

13/07/11 19:26:37 INFO mapred.JobClient:     HDFS_BYTES_READ=1204389180

13/07/11 19:26:37 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=96696671

13/07/11 19:26:37 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=6861

13/07/11 19:26:37 INFO mapred.JobClient:   File Input Format Counters 

13/07/11 19:26:37 INFO mapred.JobClient:     Bytes Read=1204376530

13/07/11 19:26:37 INFO mapred.JobClient:   Map-Reduce Framework

13/07/11 19:26:37 INFO mapred.JobClient:     Map output materialized bytes=47085451

13/07/11 19:26:37 INFO mapred.JobClient:     Map input records=  12,655,313

13/07/11 19:26:37 INFO mapred.JobClient:     Reduce shuffle bytes=47085451

13/07/11 19:26:37 INFO mapred.JobClient:     Spilled Records=5790528

13/07/11 19:26:37 INFO mapred.JobClient:     Map output bytes=41294233

13/07/11 19:26:37 INFO mapred.JobClient:     Total committed heap usage (bytes)=18417872896

13/07/11 19:26:37 INFO mapred.JobClient:     CPU time spent (ms)=528320

13/07/11 19:26:37 INFO mapred.JobClient:     Combine input records=0

13/07/11 19:26:37 INFO mapred.JobClient:     SPLIT_RAW_BYTES=12650

13/07/11 19:26:37 INFO mapred.JobClient:     Reduce input records=2895264

13/07/11 19:26:37 INFO mapred.JobClient:     Reduce input groups=400

13/07/11 19:26:37 INFO mapred.JobClient:     Combine output records=0

13/07/11 19:26:37 INFO mapred.JobClient:     Physical memory (bytes) snapshot=24987451392

13/07/11 19:26:37 INFO mapred.JobClient:     Reduce output records=400

13/07/11 19:26:37 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=83288907776

13/07/11 19:26:37 INFO mapred.JobClient:     Map output records=    2,895,264  <---- out of 12,655,313 map input records
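To compare the two runs without reading the dumps side by side, the counters can be pulled out of the log text and diffed. This is a sketch under assumptions: `parse_counters` and `relative_change` are hypothetical helpers written for this note, and the regular expression only targets the `name=value` lines of the JobClient output shown here, tolerating the thousands separators added by hand. The log lines inside the strings are copied from the two runs above.

```python
import re

# Matches the counter part of a JobClient log line, for example
# "13/07/11 19:26:37 INFO mapred.JobClient:     Reduce input groups=400".
COUNTER_RE = re.compile(
    r"mapred\.JobClient:\s+(?P<name>[^=]+?)=\s*(?P<value>\d[\d,]*)"
)


def parse_counters(log_text):
    """Return {counter name: integer value} for every name=value log line."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            value = int(match.group("value").replace(",", ""))
            counters[match.group("name").strip()] = value
    return counters


def relative_change(before, after):
    """Fractional change from before to after (for example -0.25 means -25%)."""
    return (after - before) / before


# A few lines copied from the Year, Month run.
run_a = parse_counters("""
13/07/11 18:30:43 INFO mapred.JobClient:     Reduce shuffle bytes=69044321
13/07/11 18:30:43 INFO mapred.JobClient:     Reduce input groups=24
13/07/11 18:30:43 INFO mapred.JobClient:     Map output records=5212651
""")

# The same counters copied from the Year, Month, UniqueCarrier run.
run_b = parse_counters("""
13/07/11 19:26:37 INFO mapred.JobClient:     Reduce shuffle bytes=47085451
13/07/11 19:26:37 INFO mapred.JobClient:     Reduce input groups=400
13/07/11 19:26:37 INFO mapred.JobClient:     Map output records=    2,895,264  <---- 12,655,313
""")

for name in ("Map output records", "Reduce input groups", "Reduce shuffle bytes"):
    before, after = run_a[name], run_b[name]
    print(f"{name}: {before} -> {after} ({relative_change(before, after):+.1%})")
```

Lines without a `name=value` pair, such as `Job complete: ...` or the section titles, are skipped by the pattern.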

