# Variables and variable types

Before we study the types of variables, let’s first understand what we mean by a variable.

Data can be of two types:

- Constant: data that is not supposed to change over time, e.g., the value of pi or the Boltzmann constant.
- Variable: data that is expected to change, e.g., the age of an individual or monthly spending on groceries.

In data analytics, a variable denotes any column or attribute in the data. These variables may or may not be measurable.

For example, your age is quantifiable, but your temper is not.

Let’s understand the different variable types:

### Numerical Variables

Numerical variables are the ones that can be measured, for example, your age, salary, height, or the humidity level. Numerical variables are further subdivided into the following categories:

##### Continuous Variables

Continuous variables are numerical variables that can take any value within a specified range, for example, age, height, or salary.

The “any value” mentioned above depends on the precision of the measuring instrument. For example, a measuring tape marked only in centimeters cannot resolve anything smaller than a centimeter.
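To make the point about precision concrete, here is a minimal Python sketch. The height value and the `record` helper are hypothetical, invented for illustration; the sketch assumes the instrument's resolution can be expressed as a number of decimal places and uses the standard library `decimal` module.

```python
from decimal import Decimal


def record(true_value: str, resolution: str) -> Decimal:
    """Round a reading to the same number of decimal places as `resolution`.

    Both arguments are decimal strings, e.g. "172.3456" and "0.1".
    (Hypothetical helper, for illustration only.)
    """
    return Decimal(true_value).quantize(Decimal(resolution))


height_cm = "172.3456"  # made-up 'true' height in centimeters

whole_cm = record(height_cm, "1")      # tape marked in whole centimeters
tenth_cm = record(height_cm, "0.1")    # tape marked in millimeters

print(whole_cm)
print(tenth_cm)
```

The underlying quantity is the same in both cases; only the recorded value changes with the resolution of the scale.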

Continuous variables are further subdivided into the following two categories:

**Interval** variables are continuous variables measured along a scale on which equal distances represent equal differences. E.g., the difference between 20°C and 25°C is the same as the difference between 40°C and 45°C.

**Ratio** variables are similar to interval variables, with the extra condition that zero (0) on the scale indicates the absence of the measured quantity. For example, temperature measured in Celsius is not a ratio variable because 0°C does not mean that there is no temperature. Temperature measured in Kelvin, on the other hand, is a ratio variable because 0 K is absolute zero. Distance is another example of a ratio variable. The name “ratio” tells us that ratios of measurements are meaningful, i.e., a distance of 20 meters is double a distance of 10 meters.
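The difference between the two scales can be checked with a little arithmetic. The following Python sketch is only an illustration; the helper name `celsius_to_kelvin` is an assumption introduced here, and the temperatures are the ones used in the text plus a made-up pair (10°C and 20°C).

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius reading to Kelvin (hypothetical helper)."""
    return celsius + 273.15


# Interval property: equal gaps mean equal differences, anywhere on the scale.
diff_low = 25 - 20
diff_high = 45 - 40
print(diff_low == diff_high)

# Ratios of Celsius readings are not meaningful: 20°C is not "twice as hot"
# as 10°C. The same two temperatures on the Kelvin scale give the real ratio.
naive_ratio = 20 / 10
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)
print(naive_ratio, round(kelvin_ratio, 3))

# Distance is a ratio variable, so this ratio can be read literally.
distance_ratio = 20 / 10  # 20 meters is double 10 meters
print(distance_ratio)
```

The Celsius ratio suggests a doubling, while the Kelvin ratio shows the actual increase is only a few percent. This is why ratios should be computed only on scales with a true zero.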

##### Discrete Variables

Discrete variables can take values only from a set of allowed numerical levels, for example, the number of students in a class or the population of your city.

Discrete variables cannot take values between two adjacent allowed levels, i.e., you can’t have 1.2 students in your class or 2.4 girlfriends.

### Categorical Variables

Categorical variables are the ones that can’t be measured, for example, the color of your hair or customer satisfaction with a Walmart store.

Categorical variables are divided into the following two categories:

##### Nominal Variables

Nominal variables have two or more categories, but there is no intrinsic order among them. For example, hair color can take the values Black, White, Brown, Gray, etc., but we can’t put them in any meaningful order.

Other examples of nominal variables include gender and names.

Some sources also discuss dichotomous variables, which are simply nominal variables with only two categories, e.g., “Yes” or “No”, “Male” or “Female”.
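Since nominal categories have no order, about the only things we can do with them are test for equality and count. A minimal Python sketch of this, using made-up sample data and the standard library `collections.Counter`:

```python
from collections import Counter

# Made-up sample of a nominal variable (hair color).
hair_colors = ["Black", "Brown", "Black", "Gray", "Brown", "Black"]

# Counting how often each category occurs is a valid summary.
counts = Counter(hair_colors)
mode, mode_count = counts.most_common(1)[0]  # most frequent category
print(counts)
print(mode, mode_count)

# A dichotomous variable is a nominal variable with exactly two categories.
answers = ["Yes", "No", "Yes", "Yes"]  # made-up survey answers
is_dichotomous = len(set(answers)) == 2
print(is_dichotomous)
```

Note that sorting `hair_colors` would only arrange the labels alphabetically; that order carries no meaning for the variable itself.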

##### Ordinal Variables

Ordinal variables are categorical variables whose categories have an inherent order, so you can rank them. Examples include Net Promoter Score survey groups (Promoter, Passive, Detractor) and your level of happiness.

While dealing with ordinal variables, keep in mind that although they can be ordered, the categories cannot be assigned a meaningful numerical value. E.g., let’s say we have three categories in our survey: “Not-satisfied”, “Neutral”, and “Satisfied”.

We can certainly say that Satisfied is better than Neutral and Neutral is better than Not-satisfied, but we can’t say that Neutral customers are half as satisfied as Satisfied customers.
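The survey example can be sketched in Python as follows. The responses are made up, and the names `LEVELS` and `RANK` are assumptions introduced for illustration. The integer ranks record position only, so they are used here for comparing, sorting, and finding a median, never for arithmetic such as ratios.

```python
from statistics import median_low

# Categories listed from worst to best; the index is only a rank.
LEVELS = ["Not-satisfied", "Neutral", "Satisfied"]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Made-up survey responses.
responses = ["Satisfied", "Neutral", "Not-satisfied", "Satisfied", "Neutral"]

# Ordering is meaningful: we can compare and sort by rank.
satisfied_beats_neutral = RANK["Satisfied"] > RANK["Neutral"]
ordered = sorted(responses, key=RANK.get)

# The median is a valid summary because it relies only on order.
median_rank = median_low([RANK[r] for r in responses])
median_response = LEVELS[median_rank]

print(satisfied_beats_neutral)
print(ordered)
print(median_response)

# Not meaningful: RANK["Satisfied"] / RANK["Neutral"] == 2.0 does NOT mean
# that Satisfied customers are "twice as satisfied" as Neutral ones.
```

`median_low` is used instead of `median` so that the result is always one of the actual ranks, even with an even number of responses.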

The four types of data scales were defined by Stanley Stevens in his seminal 1946 paper, “On the Theory of Scales of Measurement”. The paper is important for data science because it outlines not only the types but also the permissible math and logic for each scale. This kind of knowledge is essential for choosing an algorithmic approach (which depends on the type of the target variable) as well as for deciding how to handle predictors (non-target variables) during data preparation.

#### analyticsfreak
