Learning to use transformation commands in SPSS

For many users this will be, technically, the most difficult part of learning SPSS: not only are the commands somewhat more complex, but, with the exception of a few basic tasks, their use requires careful planning as well as skill in troubleshooting problems (both technical and logical). As transformation commands do not produce any output (they change the data matrix), you need to understand what is going on “behind the scenes”. Newly created variables also need to be documented (labels and variable properties).

With the exception of simple tasks, like recoding a single variable or selecting observations for analysis, it is worthwhile to learn the SPSS command language and use it instead of the equivalent menu commands. You will frequently need a sequence of transformation commands. If something goes wrong while you are using the menus, you have to repeat a possibly tedious sequence of menu selections; if you use the command language, you simply correct the problem and rerun the command sequence. You can also easily keep the command sequence for future use by saving it to a file.
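
As a minimal sketch of such a command sequence, the syntax below builds a simple index from three items. The variable names q1, q2, q3 and idx_T are hypothetical and chosen only for illustration. Note that SPSS normally holds transformations as pending until a procedure or EXECUTE reads the data, which is one reason nothing seems to happen when a transformation is run on its own.

```spss
* Hypothetical items q1, q2, q3 (assumed names, not from a real data set).
* Mean of the three items, computed only if at least 2 answers are valid.
COMPUTE idx_T = MEAN.2(q1, q2, q3).
VARIABLE LABELS idx_T 'Mean of q1 to q3 (derived index)'.
* Force the pending transformations to be carried out.
EXECUTE.
```

If the result turns out to be wrong, you would edit the COMPUTE line and rerun the whole block. Saved as a syntax file, the same sequence can be reused later.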

    • Carefully plan what you wish to do before starting to write SPSS commands. Analyze the problem at hand at a purely logical level.
      Before writing down the SPSS commands, sketch out the logic of what you intend to do. A full example can be found in the Inglehart Postmaterialism Scale, where you see how the logic of the scale is analyzed and formalized and only then translated into SPSS commands (or, if you prefer, into commands of another statistical software package).
    • Do not modify the original variable(s); create new variables instead. If a problem occurs you still have the original variable, without the need to reopen the data set and without the risk of losing the original variable when you save the current data set.
    • Always check the result when you create a new variable: produce a frequency table and ask whether it looks plausible. Is it what you intended? Are there too many missing values?
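
The last two points can be sketched together as follows. This is an illustration under assumptions: a numeric variable Age measured in whole years, a derived variable named Age_r, and cut-offs that are invented for the example.

```spss
* Age_r is the new, derived variable; the original Age is left untouched.
* The cut-offs are illustrative assumptions; Age is assumed to be in whole years.
* Missing values are handled first so that later ranges cannot pick them up.
RECODE Age
  (MISSING = SYSMIS)
  (LOWEST THRU 29 = 1)
  (30 THRU 44 = 2)
  (45 THRU 64 = 3)
  (65 THRU HIGHEST = 4)
  INTO Age_r.
* Check the result: compare the original with the derived variable.
FREQUENCIES VARIABLES = Age Age_r.
```

In the two tables, the number of missing cases in Age_r should match that of Age, and Age_r should show exactly four valid values.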

  • Does the table show what you expect? If your intention is to create a variable that has four distinct values, it should have four distinct values. If you added value labels, all values should have labels. Carefully examine the number of missing values: assume that you are combining two variables that have 10 missing values each; if the variable you create has more than 20 missing values, something went wrong (check your commands and logic).
  • Even if you have a scale variable with many values, FREQUENCIES is the only command that shows all the different types of missing values (system missing, user missing).
  • Document the newly created variable: variable names, labels, missing values…
    • Clearly mark the new variable as being a derived variable, created from some original variable. Adopt a personal style that reflects your working style and analysis needs. This could mean systematically appending e.g. “_R” or “_T” to the variable name. A recoded version of the variable Age could be named Age_r, with e.g. “Four age groups” added to the variable label.
    • Variable labels can be used to store information about what the variable is for.
    • For a more complex variable, e.g. a scale that has been built using a lengthy sequence of commands, it is worthwhile to keep the command sequence in a file and document what you did, e.g. using comments within the SPSS command file.
    • Caveat: Be careful and take enough time. Incorrectly labelled variables and values are worse than undocumented variables!
  • Do not save undocumented new variables. Remove variables that were meant for a specific one-time use. This avoids frustration, when you use your data file in the future and it is littered with variables you no longer understand, or even worse, you are no longer able to distinguish original variables (data that have been collected) from derived variables.
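
The documentation and clean-up steps above might look like this in syntax. It is a sketch only: it assumes that Age_r already exists as a four-group recode of Age, the value labels are invented, and tmp_flag is a hypothetical one-time helper variable.

```spss
* Assumes Age_r was created earlier as a four-group recode of Age.
VARIABLE LABELS Age_r 'Four age groups (recoded from Age)'.
VALUE LABELS Age_r
  1 'Under 30'
  2 '30 to 44'
  3 '45 to 64'
  4 '65 and over'.
* Confirm that every value shown in the table carries a label.
FREQUENCIES VARIABLES = Age_r.
* tmp_flag is a hypothetical helper variable that is no longer needed.
* EXECUTE clears pending transformations before variables are deleted.
EXECUTE.
DELETE VARIABLES tmp_flag.
```

Keeping these commands in the same syntax file as the recode itself means the file documents both how the variable was built and what its values mean.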

resources : unige.oh