Cross Tabulations

When you want to know how respondents answered two or more questions at the same time, you need to run a cross-tabulation. To do so, you must first determine which is your independent variable and which is your dependent variable, since the former is traditionally used for the column headings and the latter for the rows.
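As a rough illustration of what a cross-tabulation does, the sketch below counts how many respondents gave each combination of two answers, with the independent variable in the columns and the dependent variable in the rows. The response data and the `crosstab` helper are hypothetical and only meant to show the idea; a statistics package such as SPSS does the same counting for you.

```python
from collections import Counter

# Hypothetical responses: one (education, satisfaction) pair per respondent.
responses = [
    ("High school or less", "Very"),
    ("High school or less", "Somewhat"),
    ("Graduated university or more", "Very"),
    ("Graduated university or more", "Very"),
    ("Some advanced education", "Somewhat"),
    ("Some advanced education", "Very"),
]

def crosstab(pairs):
    """Count respondents for each (column, row) combination."""
    counts = Counter(pairs)
    columns = sorted({col for col, _ in pairs})  # independent variable
    rows = sorted({row for _, row in pairs})     # dependent variable
    # table[row][column] -> count (0 when nobody gave that combination)
    return {r: {c: counts[(c, r)] for c in columns} for r in rows}

table = crosstab(responses)
for row, cells in table.items():
    print(row, cells)
```

Each printed line is one row of the cross-tabulation, with one count per column category.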

Independent variables explain or predict a response or an outcome, which is the dependent variable under study. As a basic rule, demographic information is usually considered independent, since characteristics such as gender, age and education will normally influence the responses we give. If the variables being studied are not demographic, then the independent variable is determined by the study’s objectives. For instance, if the objective is to determine whether the level of satisfaction with a past holiday at a destination influences the likelihood of return, then the level of satisfaction is the independent variable and the likelihood of return is the dependent one.

This is the typical output of a simple cross-tabulation (of education levels and overall satisfaction with a holiday) as produced by SPSS when we also ask for column percentages to be calculated. Note that the title gives the two variables, with the dependent one first, separated by an asterisk (*). When presenting this information in a table, we would reword the title to read "Overall holiday satisfaction by highest level of education completed" (see Table 1), removing all extraneous information and phrasing it as a statement, not a question.

Overall satisfaction with your holiday * What is the highest level of education completed Crosstabulation


Obviously, you would not be able to use this table as is in a report. It requires ‘cleaning’. Your first consideration is whether you want to keep all of the categories in your independent and dependent variables. This depends, of course, on what you are trying to illustrate and on the number of responses in each cell. First of all, very few people have less than a high school degree, so we could collapse the first two categories into ‘high school or less’. But that still leaves us with five categories, which is more detail than we would probably need. So we could collapse the categories ‘graduated from technical or vocational school’ and ‘some college/university’ into ‘some advanced education’, and the last two into ‘graduated from university or more’. Similarly, we notice that the level of satisfaction with the holiday is very high. As a rule, any row whose cells contain less than 5% of respondents should be collapsed. At the very least, we should have only one category, ‘not at all or not very satisfied’. This collapsing of categories is known as recoding, and it is a way of changing existing variables or creating new variables based on existing data, as explained by John Urbik, the Technical Marketing Specialist for SPSS.
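Recoding can be thought of as a lookup from old categories to new ones. The following is a minimal sketch of that idea, not the SPSS procedure itself: the `recode` helper and the sample respondents are hypothetical, and the category labels are taken from the SPSS output. The text does not say where "Graduated from high school" belongs, so it is deliberately left out of the map and passes through unchanged.

```python
# Old category -> collapsed category, following the groupings described above.
# "Graduated from high school" is not mapped because the text does not place it.
EDUCATION_RECODE = {
    "Primary school (grade 1-7)": "High school or less",
    "Some high school": "High school or less",
    "Graduated from a technical or vocational school": "Some advanced education",
    "Some college or university": "Some advanced education",
    "Graduated from university": "Graduated from university or more",
    "Graduated with an advanced degree": "Graduated from university or more",
}

SATISFACTION_RECODE = {
    "Not at all satisfied": "Not at all or not very satisfied",
    "Not very satisfied": "Not at all or not very satisfied",
}

def recode(value, mapping):
    """Return the collapsed category, or the value itself if it is not remapped."""
    return mapping.get(value, value)

# Hypothetical respondents as (education, satisfaction) pairs.
respondents = [
    ("Some high school", "Not very satisfied"),
    ("Some college or university", "Very satisfied"),
    ("Graduated with an advanced degree", "Not at all satisfied"),
]

# Build new variables from the existing ones without altering the originals.
recoded = [
    (recode(edu, EDUCATION_RECODE), recode(sat, SATISFACTION_RECODE))
    for edu, sat in respondents
]
print(recoded)
```

Keeping the original values and writing the collapsed ones to a new list mirrors the option of creating a new variable rather than overwriting the existing one.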

The resultant cross-tabulation would look like this:

overall satisfaction with holiday * highest level of education Crosstabulation

Columns: What is the highest level of education completed

| Overall satisfaction with your holiday | | Primary school (grade 1-7) | Some high school | Graduated from high school | Graduated from a technical or vocational school | Some college or university | Graduated from university | Graduated with an advanced degree | Total |
|---|---|---|---|---|---|---|---|---|---|
| Not at all satisfied | Count | | 2 | 1 | 1 | | | 1 | 5 |
| | % within What is the highest level of education completed | | 0.60% | 0.40% | 0.50% | | | 1.10% | 0.40% |
| Not very satisfied | Count | 2 | 6 | 8 | 8 | 2 | 1 | 2 | 29 |
| | % within What is the highest level of education completed | 6.70% | 1.80% | 3.00% | 4.00% | 1.40% | 0.40% | 2.20% | 2.20% |
| Somewhat satisfied | Count | 3 | 40 | 37 | 25 | 30 | 42 | 12 | 189 |
| | % within What is the highest level of education completed | 10.00% | 11.80% | 13.70% | 12.60% | 21.30% | 15.80% | 13.00% | 14.20% |
| Very satisfied | Count | 25 | 290 | 224 | 164 | 109 | 223 | 77 | 1112 |
| | % within What is the highest level of education completed | 83.30% | 85.80% | 83.00% | 82.80% | 77.30% | 83.80% | 83.70% | 83.30% |
| Total | Count | 30 | 338 | 270 | 198 | 141 | 266 | 92 | 1335 |
| | % within What is the highest level of education completed | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% | 100.00% |
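The "% within" rows are column percentages: each count divided by the total of its column. As a sketch of that arithmetic, the snippet below recomputes them from the counts in the cross-tabulation above. Treating the blank cells as zero is an assumption, although it agrees with the printed column totals; the `column_percentages` helper is illustrative only.

```python
# Education categories, in the column order of the cross-tabulation above.
EDUCATION = [
    "Primary school (grade 1-7)",
    "Some high school",
    "Graduated from high school",
    "Graduated from a technical or vocational school",
    "Some college or university",
    "Graduated from university",
    "Graduated with an advanced degree",
]

# Counts per satisfaction level; blank cells are assumed to be 0.
COUNTS = {
    "Not at all satisfied": [0, 2, 1, 1, 0, 0, 1],
    "Not very satisfied": [2, 6, 8, 8, 2, 1, 2],
    "Somewhat satisfied": [3, 40, 37, 25, 30, 42, 12],
    "Very satisfied": [25, 290, 224, 164, 109, 223, 77],
}

def column_percentages(counts):
    """Express each cell as a percentage of its column total, to one decimal."""
    totals = [sum(col) for col in zip(*counts.values())]
    return {
        row: [round(100 * n / total, 1) for n, total in zip(cells, totals)]
        for row, cells in counts.items()
    }

column_totals = [sum(col) for col in zip(*COUNTS.values())]
percentages = column_percentages(COUNTS)

for row, cells in percentages.items():
    print(row, dict(zip(EDUCATION, cells)))
```

Because every column sums to 100%, the education groups can be compared with each other even though they differ greatly in size.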

We can now proceed to present this information in a more pleasing table format by giving it the appropriate table number and title, indicating the total number of respondents who answered this question, and cleaning the table, as follows:

Table 1: Overall holiday satisfaction by highest level of education

n=1243

Columns: Level of education

| Degree of satisfaction | High school or less | Some advanced education | Graduated university or more |
|---|---|---|---|
| Not at all or not very | 10 (2.7%) | 18 (3.8%) | 3 (0.7%) |
| Somewhat | 43 (11.7%) | 62 (13.2%) | 72 (17.7%) |
| Very | 315 (85.6%) | 388 (82.9%) | 332 (81.6%) |
| Total | 368 (100%) | 468 (100%) | 407 (100%) |

Graphically, we would follow very similar rules: the graph is numbered (Figure 1) with the same title and the number of respondents indicated; the independent variable identifies the columns since we want to compare the satisfaction level of each of the three education categories. It is the column percentage that is used for comparison purposes. The type of graph below is called a clustered bar chart.

[Figure 1: Clustered bar chart titled "Overall holiday satisfaction by highest level of education"]
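To show how a clustered bar chart groups the data, here is a plain-text approximation built from the column percentages in Table 1: one cluster per education category, one bar per satisfaction level. The `clustered_bars` helper and the choice of 40 characters for a full-width bar are assumptions of this sketch; a real report would use a charting tool.

```python
# Column percentages from Table 1, keyed by education level (the independent variable).
TABLE_1 = {
    "High school or less": {
        "Not at all or not very": 2.7, "Somewhat": 11.7, "Very": 85.6,
    },
    "Some advanced education": {
        "Not at all or not very": 3.8, "Somewhat": 13.2, "Very": 82.9,
    },
    "Graduated university or more": {
        "Not at all or not very": 0.7, "Somewhat": 17.7, "Very": 81.6,
    },
}

def clustered_bars(data, width=40):
    """Return text lines: a heading per cluster, then one bar per category."""
    lines = []
    for group, values in data.items():
        lines.append(group)
        for label, pct in values.items():
            bar = "#" * round(pct / 100 * width)  # bar length proportional to %
            lines.append(f"  {label:<24}{bar} {pct}%")
        lines.append("")  # blank line between clusters
    return lines

chart = clustered_bars(TABLE_1)
print("\n".join(chart))
```

Placing the three satisfaction bars side by side within each education group makes the column percentages directly comparable across groups.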