Sunday, 13 April 2008

CLUSTER ANALYSIS

Cluster analysis is a technique for grouping observations that share the same characteristics into one cluster. There are several types of cluster analysis:

K-Means Cluster

In K-Means clustering, the number of clusters is decided in advance. The variables used in K-Means clustering must be numerical. To do it using SPSS, follow these steps :

  • on the main menu, click 'analyze' -> 'descriptive statistics' -> 'descriptives'.
  • put the variables into 'variables'.
  • activate 'save standardized values as variables' -> click 'OK'.
  • click 'analyze' -> 'classify' -> 'k-means cluster'.
  • put the standardized variables into 'variables' -> fill in the number of clusters that you want in 'number of clusters'.
  • click 'save' -> activate 'cluster membership' and 'distance from cluster center'.
  • click 'options' -> activate 'ANOVA table' -> click 'OK'.
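
To see what these steps compute, here is a minimal Python sketch (it is not SPSS syntax). It standardizes two variables to z-scores, runs a plain k-means loop, and then derives the cluster membership and the distance from the cluster center for each observation. The data, the function names and the choice of starting centers are assumptions made up for illustration; SPSS picks its own initial centers.

```python
import math
import statistics

def zscores(values):
    """Standardize values to mean 0 and sample standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def kmeans(points, centers, iterations=20):
    """Plain k-means: assign points to the nearest center, then move the centers."""
    labels = []
    for _ in range(iterations):
        labels = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            labels.append(dists.index(min(dists)))
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(statistics.mean(col) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center where it was
        centers = new_centers
    return labels, centers

# Hypothetical data: two numerical variables for six observations.
income = [10.0, 12.0, 11.0, 40.0, 42.0, 41.0]
age = [20.0, 22.0, 21.0, 50.0, 52.0, 51.0]

points = list(zip(zscores(income), zscores(age)))
# Assumption: start from the first and fourth observation as initial centers.
labels, centers = kmeans(points, centers=[points[0], points[3]])
distances = [math.dist(p, centers[lab]) for p, lab in zip(points, labels)]

for lab, d in zip(labels, distances):
    print(f"cluster {lab + 1}, distance from center {d:.3f}")
```

Standardizing first matters because k-means works on distances, so a variable measured on a larger scale would otherwise dominate the grouping.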

Hierarchical Cluster

Hierarchical clustering is used when the number of clusters is unknown and the number of observations is at most 200. The variables used in hierarchical clustering must be numerical. To do it using SPSS, follow these steps :

  • click 'analyze' -> 'classify' -> 'hierarchical cluster'.
  • put the variables into 'variables'.
  • click 'statistics', choose 'agglomeration schedule' and fill in 'range of solutions'.
  • click 'plots' -> activate 'dendrogram'.
  • click 'method', in 'cluster method' choose 'Ward's method'.
  • in 'standardize' choose 'Z scores' -> click 'continue' -> click 'OK'.
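
The idea behind Ward's method can be sketched in a few lines of Python (again, not SPSS syntax). At each stage the two clusters whose merger increases the within-cluster sum of squares the least are joined, and the list of merges plays the role of an agglomeration schedule. The data and function names are hypothetical, standardization is left out to keep the sketch short, and the cost shown is the increase at each stage, so the numbers are not expected to match the coefficients SPSS prints.

```python
def centroid(members):
    """Mean of each coordinate over the members of a cluster."""
    return [sum(col) / len(members) for col in zip(*members)]

def merge_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance

def ward_schedule(points):
    """Return the merge order as a list of (cluster_a, cluster_b, cost)."""
    clusters = {i: [p] for i, p in enumerate(points)}
    next_id = len(points)
    schedule = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for i, a in enumerate(ids):
            for b in ids[i + 1:]:
                cost = merge_cost(clusters[a], clusters[b])
                if best is None or cost < best[2]:
                    best = (a, b, cost)
        a, b, cost = best
        # The merged cluster gets a new id, like a new row in the schedule.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        schedule.append((a, b, cost))
        next_id += 1
    return schedule

# Hypothetical data: four observations on one numerical variable.
points = [(1.0,), (2.0,), (9.0,), (10.0,)]
schedule = ward_schedule(points)

for stage, (a, b, cost) in enumerate(schedule, start=1):
    print(f"stage {stage}: merge {a} and {b}, cost {cost:.2f}")
```

A large jump in the cost from one stage to the next suggests stopping before that merge, which is how the schedule and the dendrogram help to choose the number of clusters.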

Friday, 11 April 2008

CROSSTAB

Crosstab is one of the statistical tools that can be used to examine the relationship between two categorical variables. To do a crosstab using SPSS, you can follow these steps :

  1. click ‘analyze’ -> ‘descriptive statistics’ -> ‘crosstabs’
  2. enter the variables into ‘rows’ and ‘columns’
  3. click ‘statistics’ -> choose ‘chi-square’ -> click ‘continue’ -> click ‘OK’
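
As a rough illustration of what the crosstab and the chi-square option calculate, here is a small Python sketch (not SPSS syntax). It counts the observations in each combination of categories and then computes the Pearson chi-square statistic with its degrees of freedom. The variable names and data are made up, and the p-value is left out because it needs a chi-square distribution table or a statistics library.

```python
from collections import Counter

def crosstab(rows, cols):
    """Count observations for every combination of row and column category."""
    counts = Counter(zip(rows, cols))
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    table = [[counts[(r, c)] for c in col_levels] for r in row_levels]
    return row_levels, col_levels, table

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical data: two categorical variables for eight respondents.
gender = ["m", "m", "m", "m", "f", "f", "f", "f"]
answer = ["yes", "yes", "yes", "no", "yes", "no", "no", "no"]

row_levels, col_levels, table = crosstab(gender, answer)
stat, dof = chi_square(table)
print(row_levels, col_levels, table)
print(f"chi-square = {stat:.3f}, df = {dof}")
```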

UNIVARIATE NORMALITY TEST

To investigate whether a single variable is normally distributed, you can follow these steps :

  1. click ‘graphs’ -> ‘P-P’.
  2. enter the variable whose normality you want to check.
  3. choose ‘test distribution’ : normal -> click ‘OK’.
  4. if the points follow a straight line, the data are approximately normal.
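
The points of a P-P plot can be computed by hand, which shows why a straight line indicates normality. The Python sketch below (not SPSS syntax) pairs the observed cumulative proportion of each sorted value with the cumulative probability that a normal distribution with the same mean and standard deviation would give. The data are made up, and the plotting position (i - 0.5) / n is an assumption chosen for simplicity; SPSS may use a different formula.

```python
from statistics import NormalDist, mean, stdev

def pp_points(values):
    """Return (observed proportion, expected normal probability) pairs."""
    dist = NormalDist(mean(values), stdev(values))
    ordered = sorted(values)
    n = len(ordered)
    return [((i - 0.5) / n, dist.cdf(x)) for i, x in enumerate(ordered, start=1)]

# Hypothetical data: one numerical variable.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
points = pp_points(values)

# Points close to the diagonal (observed equal to expected) suggest normality.
largest_gap = max(abs(observed - expected) for observed, expected in points)
for observed, expected in points:
    print(f"observed {observed:.3f}  expected {expected:.3f}")
print(f"largest gap from the diagonal: {largest_gap:.3f}")
```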

ONE WAY ANOVA

ANOVA is one of the statistical tools that can be used to examine the relationship between a numerical variable and a categorical variable, where the dependent variable is numerical and the independent variable is categorical. The purpose of ANOVA is to compare the means of several populations. To do ANOVA using SPSS, you can follow these steps :

  • click ‘analyze’ -> ‘compare means’ -> ‘one-way ANOVA’.
  • enter the numerical variable into ‘dependent list’ and enter the categorical variable into ‘factor’.
  • to see which groups differ significantly from each other, click ‘post hoc’ -> ‘LSD’ -> click ‘continue’.
  • click ‘options’ -> ‘homogeneity of variance test’ -> click ‘continue’ -> click ‘OK’.
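
The F statistic that appears in the ANOVA table compares the variation between the group means with the variation inside the groups. Here is a minimal Python sketch of that calculation (not SPSS syntax). The three groups are hypothetical data, and the p-value, the LSD comparisons and the homogeneity test are left out.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the F statistic with its between and within degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: a numerical variable measured in three categories.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F value means that the group means differ more than the spread inside the groups would explain.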

REGRESSION ANALYSIS

Regression analysis is one of the most popular statistical tools for investigating the relationship among numerical variables. To do regression analysis using SPSS, you can follow these steps :

  1. On the main menu, click ‘analyze’ -> ‘regression’ -> ‘linear’.
  2. Enter the dependent variable into the ‘dependent’ box and the independent variables into the ‘independent’ box.
  3. Click ‘statistics’ and choose the output that you need :

- ‘estimates’ : gives the estimated regression coefficients.
- ‘confidence intervals’ : gives confidence intervals for the regression coefficients.
- ‘model fit’ : shows the coefficient of determination.
- ‘descriptives’ : gives the mean, standard deviation and number of observations used.
- ‘Durbin-Watson’ : gives the Durbin-Watson statistic for examining autocorrelation in the residuals.
- ‘casewise diagnostics’ : helps to identify outliers.

  4. Click ‘continue’.
  5. To check the assumption that the residuals are normal, click ‘plots’ and, under ‘standardized residual plots’, choose ‘normal probability plot’.
  6. Click ‘continue’ -> ‘OK’.
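
For the simplest case of one independent variable, the main numbers in the regression output can be reproduced with a short Python sketch (not SPSS syntax). It estimates the coefficients by least squares and then computes the coefficient of determination and the Durbin-Watson statistic from the residuals. The data and the function name are hypothetical; confidence intervals and casewise diagnostics are not covered here.

```python
from statistics import mean

def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x with basic diagnostics."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((b - my) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot
    # Durbin-Watson: values near 2 suggest no first-order autocorrelation.
    dw = sum((residuals[i] - residuals[i - 1]) ** 2
             for i in range(1, len(residuals))) / ss_res
    return slope, intercept, r_squared, dw

# Hypothetical data: one independent and one dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r_squared, dw = simple_regression(x, y)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"R squared = {r_squared:.4f}, Durbin-Watson = {dw:.3f}")
```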

DESCRIBING DATA

After you collect information about a population by sampling and enter the data into the SPSS worksheet, you should describe the data to summarize the information that you have.

There are several methods to describe data in statistics: frequency table, pie chart, bar chart, stacked bar, histogram, stem-and-leaf diagram, and sequence chart. I’ll tell you when you should use each of these methods and how to do it using SPSS.

1. Frequency Table

It can be used to see the frequency of each value of a variable that you use in data analysis, whether that variable is numerical or categorical. To do it using SPSS, you can follow these steps :

1. On the main menu click ‘analyze’ -> ‘descriptive statistics’ -> ‘frequencies’.

2. Enter the variables whose frequencies you want to know into the ‘variables’ box -> click ‘OK’.

3. You will get a frequency table as output.
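
To make clear what a frequency table contains, here is a small Python sketch (not SPSS syntax) that counts each value and adds the percent and cumulative percent columns. The data and the function name are made up for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100.0 * counts[value] / n
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows

# Hypothetical data: one categorical variable for eight respondents.
education = ["a", "b", "a", "c", "a", "b", "a", "b"]
rows = frequency_table(education)

for value, frequency, percent, cumulative in rows:
    print(f"{value}: {frequency}  {percent:.1f}%  {cumulative:.1f}%")
```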

2. Pie Chart

When the variable that you have is a categorical variable, a pie chart is a good choice for describing the data. To do it using SPSS, you can follow these steps :

1. Click ‘graphs’ -> ‘pie’ -> ‘define’.

2. Enter the variable into ‘define slices by’ -> click ‘OK’.

If the variable that you have is a numerical variable, you cannot use a pie chart to describe the data.

3. Bar Chart

A bar chart is used to describe data when the variable is categorical. To do it using SPSS, you can follow these steps :

1. Click ‘graphs’ -> ‘bar’ -> ‘define’.

2. Enter the variable into ‘category axis’.

3. In ‘bars represent’ you can choose ‘% of cases’ to compare the categories of that variable -> click ‘OK’.

4. Stacked Bar

A stacked bar can be used to examine the relationship between two categorical variables. To do it using SPSS, you can follow these steps :

1. Click ‘graphs’ -> ‘bar’ -> ‘stacked’ -> ‘define’.

2. Enter the first variable into ‘category axis’ and enter the second variable into ‘define stacks by’.

3. In ‘bars represent’ you can choose ‘% of cases’ to compare the categories of that variable -> click ‘OK’.

A stacked bar can’t be used to examine the significance of the relationship between those variables. If you want to know whether the relationship is significant, you can use crosstab, which will be discussed in the next topic.

5. Histogram

A histogram is used to describe numerical data. With a histogram, you can see the mean and standard deviation of the data, and you can also investigate the shape of its distribution. To do it using SPSS, you can follow these steps :

1. Click ‘graphs’ -> ‘histogram’.

2. Enter the variable into ‘variable’ -> click ‘OK’.
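
A histogram is simply a count of how many observations fall into each interval. The Python sketch below (not SPSS syntax) bins a numerical variable, prints a text version of the bars, and reports the mean and standard deviation that are shown next to the chart. The data, the bin width and the starting value are assumptions chosen for illustration; SPSS chooses its own intervals.

```python
from statistics import mean, stdev

def histogram(values, bin_width, start):
    """Return (lower edge of interval, count) pairs, including empty intervals."""
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)
        counts[index] = counts.get(index, 0) + 1
    first, last = min(counts), max(counts)
    return [(start + i * bin_width, counts.get(i, 0)) for i in range(first, last + 1)]

# Hypothetical data: one numerical variable.
values = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0]
bins = histogram(values, bin_width=1.0, start=1.0)

for lower, count in bins:
    print(f"{lower:4.1f} | {'#' * count}")
print(f"mean = {mean(values):.2f}, std. dev. = {stdev(values):.2f}, n = {len(values)}")
```

In this made-up example the bars rise to a peak in the middle and fall symmetrically, which is the bell shape you look for when checking whether data are roughly normal.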

DATA ENTRY

The first step of data analysis is data entry. There are two ways to enter your data in SPSS :

1. Enter the data directly in the SPSS worksheet.

2. Enter your data with other software (e.g. Microsoft Excel) and then copy it to the SPSS worksheet.

If you choose the first way, you can follow these steps to enter your data :

1. Click ‘file’ on the main menu -> ‘new’ -> ‘data’.

2. Click ‘variable view’ at the bottom of the worksheet. In the ‘name’ column, give the name of each variable that you use; in the next column you can define the type of the variable.

3. After defining your variables, click ‘data view’ and then you can enter your data into the worksheet.