
Data Analytics

[Book] Data Analysis Using SQL and Excel

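A close acquaintance who knows SQL well recommended this book to me.
For readers who are comfortable and fluent with SQL,
I am posting a passage from one chapter as a brief introduction.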

CH.3  How Different Is Different?

The word "statistics" itself is often misunderstood. It is the plural of "statistic", 
and a statistic is just a measurement, such as the averages, medians, and modes calculated in the previous chapter.
A big challenge in statistics is generalizing from results on a small group to a larger group.

This chapter introduces the statistics used for addressing the question "how different is different", 
with an emphasis on the application of the ideas rather than their theoretical derivation. Throughout, examples use Excel and SQL to illustrate the concepts.
Key statistical concepts, such as confidence and the normal distribution, are applied to the most common statistic of all, the average value.
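
As a rough illustration of applying confidence and the normal distribution to an average, here is a minimal sketch that lets SQL do the aggregation and then builds a 95% confidence interval around the mean. The `orders` table, its `amount` column, and the five values are hypothetical and are not taken from the book. Python's built-in sqlite3 module is used only so the SQL can run as-is. With a sample this small the normal approximation is crude, so treat the numbers as a demonstration of the mechanics only.

```python
import math
import sqlite3

# Hypothetical toy data (not from the book): five order amounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(v,) for v in (10.0, 12.0, 14.0, 16.0, 18.0)],
)

# SQL computes the count, the average, and the average of the squares.
# Square roots are taken in Python because not every SQLite build
# includes the optional math functions.
n, avg, avg_sq = conn.execute(
    "SELECT COUNT(*), AVG(amount), AVG(amount * amount) FROM orders"
).fetchone()

# Sample variance (with the n - 1 correction) and standard error of the mean.
variance = (avg_sq - avg * avg) * n / (n - 1)
std_error = math.sqrt(variance / n)

# 1.96 is the z-value of the normal distribution for 95% confidence.
z = 1.96
lower, upper = avg - z * std_error, avg + z * std_error
print(f"average={avg:.2f}, 95% CI=({lower:.2f}, {upper:.2f})")
```

The interval gets narrower as the number of rows grows, which is why an average from a larger group supports firmer conclusions than the same average from a small one.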

Two other statistical techniques are also introduced.
One is the difference of proportions, which is often used for comparing the response rates between groups of customers.
The other is the chi-square test, which is also used to compare results among different groups of customers and determine whether the groups are essentially the same or fundamentally different.
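
Both techniques can be sketched on a tiny made-up example. The `responses` table, the groups A and B, and the response counts below are assumptions for illustration, not data from the book. SQL produces the per-group counts; the z-score for the difference of proportions (using the unpooled standard error) and the chi-square value are then computed from those counts.

```python
import math
import sqlite3

# Hypothetical campaign data: 1,000 customers per group,
# 50 responders in group A and 80 responders in group B.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (grp TEXT, responded INTEGER)")
rows = (
    [("A", 1)] * 50 + [("A", 0)] * 950
    + [("B", 1)] * 80 + [("B", 0)] * 920
)
conn.executemany("INSERT INTO responses VALUES (?, ?)", rows)

# SQL summarizes each group: number of customers and number of responders.
counts = conn.execute(
    "SELECT grp, COUNT(*), SUM(responded) FROM responses "
    "GROUP BY grp ORDER BY grp"
).fetchall()
(_, n1, r1), (_, n2, r2) = counts

# Difference of proportions: how many standard errors apart are the rates?
p1, p2 = r1 / n1, r2 / n2
std_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_score = (p2 - p1) / std_error

# Chi-square: compare observed counts with the counts expected if both
# groups shared the overall response rate (the null hypothesis).
overall = (r1 + r2) / (n1 + n2)
chi_square = 0.0
for n, r in ((n1, r1), (n2, r2)):
    for observed, expected in ((r, n * overall), (n - r, n * (1 - overall))):
        chi_square += (observed - expected) ** 2 / expected

print(f"z-score={z_score:.2f}, chi-square={chi_square:.2f}")
```

For a two-by-two table like this one (one degree of freedom), a chi-square value above roughly 3.84 corresponds to the 5% significance level, and a z-score beyond about 1.96 in absolute value plays the same role for the difference of proportions.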

The chapter has simple examples with small amounts of data to illustrate the ideas. Larger examples using the purchase and subscriptions databases illustrate the application of the ideas to real datasets stored in databases.

[Tip] Perhaps the most important lesson from statistics is skepticism and the willingness to ask questions.
      The default assumption should be that differences are due to chance;
      data analysis has to demonstrate that this assumption is highly unlikely.

Basic Statistical Concepts
 - The Null Hypothesis
 - Confidence (versus probability)
 - Normal Distribution

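The book works through all of the concepts and steps (?) above with data, using "SQL". (Wow)
It explains statistical concepts in an easy-to-follow way for people who already work with SQL, and solves them in familiar SQL ... 
a new world of SQL use ...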