# Introduction to statistics

## What is statistics?

Statistics can be described as the science which studies the collection, organization, analysis and interpretation of data.

All aspects of the treatment of data fall within statistics: designing the survey, collecting the data, and finally analyzing and interpreting them.

Since large amounts of information are available in almost every area of knowledge, statistics can be applied in most of them, for example medicine, sociology, marketing and finance.

In business, statistics can be used to quantify risk so that decisions are based on evidence.

## Stages of a statistical analysis

Population

The population is the complete set of items that are of interest for the research. The population size is usually denoted N, and it can be very large or even infinite.

A census is a data set that contains the information of interest for every item in the population.

Sample

A sample is a subset of the population whose information is known to the researcher. The sample size is usually denoted n, and the information for the sample is obtained by means of a survey.

The sample should represent the whole population properly. To avoid a lack of representativeness in the sample (bias), a suitable methodology for collecting the data must be chosen. All common data collection methodologies involve randomness.

The main objective is not to obtain information about the sample but about the population.
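The relationship between population and sample can be sketched with a small simulation. This is only an illustration under stated assumptions: the population of incomes is simulated (it is not real data), the figures 2000 and 500 are arbitrary, and simple random sampling without replacement stands in for whichever random methodology a real survey would use.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: N simulated monthly incomes (assumed values, not real data)
N = 10_000
population = [random.gauss(2000, 500) for _ in range(N)]

# Simple random sample of size n, drawn without replacement
n = 200
sample = random.sample(population, n)

# In practice the population mean is unknown; it is computed here only for comparison
population_mean = statistics.mean(population)
# The sample mean is the information actually available to the researcher
sample_mean = statistics.mean(sample)

print(f"N = {N}, n = {n}")
print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

Because the sample is drawn at random, its mean tends to be close to the population mean even though only a small fraction of the items is observed.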

## Variables

In statistics, a variable is a characteristic of the analyzed item. Characteristics are divided into variables and attributes.

Variable: Its nature is quantitative, so it refers to measurable facts. There are two types:

• Continuous: Variable which can take any value within an interval, so the set of possible values is uncountably infinite.
• Discrete: Variable which takes values from a finite or countable set.

Attribute: Its nature is qualitative, so it cannot be measured numerically. It describes a quality or category of the item.

Data may be expressed on different scales:

Nominal scale: Information can be classified into mutually exclusive, non-numerical categories with no order relation among them. This scale corresponds to attributes.

Ordinal scale: Information can be classified into non-numerical categories with a specific order among them. This scale corresponds to attributes.

Interval scale: It is a quantitative scale. Observations are expressed in a specific unit of measure and it is possible to quantify the distance between two observations. Zero is an arbitrary value, so the ratio between two data points is not meaningful. This scale corresponds to variables.

Ratio scale: It is a quantitative scale. Observations are expressed in a specific unit of measure and it is possible to quantify the distance between two observations. Moreover, the scale has a natural origin that marks the absolute zero, so ratios between data points are meaningful. This scale corresponds to variables.
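The difference between interval and ratio scales can be illustrated with temperature, an example that does not appear in the text above: degrees Celsius form an interval scale (zero is arbitrary), while kelvin form a ratio scale (zero is absolute). The sketch below, with arbitrarily chosen values, shows that distances agree on both scales but ratios do not.

```python
# Two temperatures on the Celsius scale (interval scale: zero is arbitrary)
a_celsius, b_celsius = 10.0, 20.0

# The same temperatures in kelvin (ratio scale: zero is an absolute origin)
a_kelvin = a_celsius + 273.15
b_kelvin = b_celsius + 273.15

# Distances between observations are meaningful on both scales and coincide
diff_celsius = b_celsius - a_celsius
diff_kelvin = b_kelvin - a_kelvin

# Ratios depend on where zero is placed, so only the kelvin ratio is meaningful
ratio_celsius = b_celsius / a_celsius  # numerically 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_kelvin = b_kelvin / a_kelvin     # about 1.035

print(f"difference: {diff_celsius:.2f} °C vs {diff_kelvin:.2f} K")
print(f"ratio:      {ratio_celsius:.3f} (Celsius) vs {ratio_kelvin:.3f} (kelvin)")
```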

Depending on the scale, we choose the most suitable statistical method to apply. Economic data are usually expressed on interval or ratio scales.
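As a rough sketch of how the scale guides the choice of method, the example below picks a measure of central tendency for each scale. The pairing used here (mode for nominal, median for ordinal, mean for interval and ratio data) is a common convention rather than something stated in this text, and all data values and names such as `SATISFACTION_ORDER` are hypothetical.

```python
import statistics

# Nominal scale: categories without order, so only the most frequent category (mode) applies
colors = ["red", "blue", "red", "green"]
typical_color = statistics.mode(colors)

# Ordinal scale: categories with an order, so a median category can be found via ranks
SATISFACTION_ORDER = ["low", "medium", "high"]  # hypothetical ordered categories
answers = ["medium", "high", "low", "medium", "high"]
ranks = [SATISFACTION_ORDER.index(answer) for answer in answers]
# median_low always returns an observed rank, which can be mapped back to a category
median_answer = SATISFACTION_ORDER[statistics.median_low(ranks)]

# Interval or ratio scale: distances are meaningful, so the arithmetic mean applies
prices = [10.0, 12.0, 14.0]
mean_price = statistics.mean(prices)

print(typical_color, median_answer, mean_price)
```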