Variables and Attributes in Statistics | Statistical Data | Editing Data

VARIABLES AND ATTRIBUTES

1. Variable

A variable is a symbol, such as X, Y, or Z, which can assume any of a prescribed set of values. The weight of an individual, the number of children in a class, and weekly wages are examples of variables. A variable is sometimes known as a variate, which is defined as a measurable quantity that varies from one value to another. A variable can be either discrete or continuous.

Continuous Variable

A variable which can theoretically assume any value between two given values is called a continuous variable. Examples are the weight of a boy and monthly income.

Discrete or Discontinuous Variable

A discrete variable is a variable which cannot theoretically assume every value between two given values; it takes only isolated values. Examples of discrete variables are the number of accidents in a month, the number of children in a family, and the number of students in a class.

2. Attributes

Qualitative observations cannot be measured; they can only be described. Non-measurable characteristics are called attributes.

For example, beauty, honesty, the color of eyes, and poverty are attributes.

STATISTICAL DATA

The numbers recorded as a result of counting or measuring are called data. The recorded information in its original collected form is called raw data. Data which can be described by a discrete variable are called discrete data, and those which can be described by a continuous variable are called continuous data. The number of children in each of 25 families is an example of discrete data, while the heights of 100 college students are an example of continuous data.
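As a rough illustration of the distinction, the sketch below stores the two kinds of raw data in Python lists. The figures are invented, and the helper `is_count_data` is a hypothetical name introduced here. It only checks whether every recorded value is a non-negative whole number, which is a hint, not proof, that the underlying variable is discrete.

```python
# Invented raw data for illustration; not taken from any real survey.
children_per_family = [2, 0, 3, 1, 4, 2, 2, 5, 1, 3]      # counted: discrete data
heights_cm = [162.4, 170.1, 158.9, 175.3, 168.0, 181.6]   # measured: continuous data


def is_count_data(values):
    """Return True if every value is a non-negative whole number.

    This is only a heuristic: measurements rounded to whole units
    would also pass, although the underlying variable is continuous.
    """
    return all(float(v).is_integer() and v >= 0 for v in values)


print(is_count_data(children_per_family))  # True
print(is_count_data(heights_cm))           # False
```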

COLLECTION OF STATISTICAL DATA

Without data there is no ground for statistics. Data are the core of the science of statistics. In order to apply statistical methods to any type of inquiry, it is necessary that statistical data be collected, because no statistical analysis is possible in the absence of quantitative data. The first step, therefore, is to collect facts and figures. Statistical data are of two types:

1) Primary data

2) Secondary data

Primary Data

Primary data are the data collected from primary sources, i.e. the data collected by the investigator himself. Primary data are also called original data, since they are collected for the first time by the person who is going to use them. Primary data are always collected by the investigator in the field and from the original sources. To quote Secrist, "data which are gathered originally for a certain purpose are known as primary data".

Secondary Data

The facts and figures that have already been collected are called secondary data. In other words, secondary data are those which have gone through statistical treatment at least once. Thus secondary data are those which have already been compiled by government, semi-government or non-official bodies and are available in office records, publications, reports, books, journals, etc. To be precise, the difference between primary and secondary data is one of degree only. Data which are primary in the hands of one agency are secondary in the hands of another.

EDITING OF DATA

After the data are collected, the next step is editing. Editing means the examination or scrutiny of the collected data to discover any errors, omissions and mistakes. It has to be decided beforehand what degree of accuracy is wanted, how far mere approximations will be enough, and what extent of error can be tolerated in the investigation.
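A minimal sketch of such scrutiny is given below, assuming the records are plain numbers, an omission is stored as `None`, and the tolerated range has been fixed before the investigation. The function name `edit_records`, the range and the wage figures are all hypothetical.

```python
def edit_records(records, low, high):
    """Split raw records into accepted values and flagged problems.

    records: list of raw values, where None marks an omission.
    low, high: tolerated range agreed on before the investigation.
    """
    accepted, flagged = [], []
    for position, value in enumerate(records):
        if value is None:
            flagged.append((position, value, "omission"))
        elif not (low <= value <= high):
            flagged.append((position, value, "outside tolerated range"))
        else:
            accepted.append(value)
    return accepted, flagged


# Invented weekly wages; None stands for a blank entry on a schedule.
raw_wages = [420, 455, None, 4300, 390, -15, 510]
accepted, flagged = edit_records(raw_wages, low=0, high=1000)

print(accepted)
for item in flagged:
    print(item)
```

Flagged entries are only reported, not corrected; whether to re-collect, estimate or discard them is a decision for the investigator.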

CLASSIFICATION AND TABULATION

Collected data are usually available in a form which is not easy to comprehend. It is difficult to get a proper and clear-cut impression of such data. In order to bring them into an intelligible form, the data have to be condensed. This can be achieved by classification and tabulation.

Classification:

According to L. R. Connor, "Classification is the process of arranging things (either actually or notionally) in groups or classes according to their resemblances and affinities, and gives expression to the unity of attributes that may subsist among a diversity of individuals". Thus by classification we arrange the heterogeneous elements of the collected data into homogeneous sub-groups. In classification, the elements which possess the same characteristics are grouped in one class. When the whole data are divided into a number of classes, it does not mean that they are now fit for comparison or interpretation, or that inferences can be drawn from them at a glance.

Types of Classification

The types of classification depend upon the characteristics of the statistical data.

There may be two characteristics of the statistical data, descriptive and numerical.

1. Descriptive Classification

The classification is said to be descriptive when it is according to attributes. Examples of descriptive characteristics are friendship, love, poverty, liberty and sex. It is of two kinds:

(a) Simple classification

(b) Manifold classification
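The text names the two kinds without defining them. In the usual textbook sense (an assumption here), simple classification divides the items by a single attribute into two classes, while manifold classification considers more than one attribute at a time. The sketch below illustrates that reading on invented records.

```python
from collections import Counter

# Invented records: (literacy, employment) for eight persons.
persons = [
    ("literate", "employed"),
    ("literate", "unemployed"),
    ("illiterate", "employed"),
    ("literate", "employed"),
    ("illiterate", "unemployed"),
    ("literate", "employed"),
    ("illiterate", "employed"),
    ("literate", "unemployed"),
]

# Simple classification: one attribute only (literacy), two classes.
simple = Counter(literacy for literacy, _ in persons)

# Manifold classification: two attributes considered together.
manifold = Counter(persons)

print(dict(simple))
print(dict(manifold))
```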

2. Numerical Classification

This type of classification applies to data for which quantitative measurement is possible. The classification is made on the basis of class intervals. Characteristics like height, weight, income, and the number of leaves on a certain tree are called numerical, as they are capable of quantitative measurement.

Thus, according to the nature of the characteristics possessed by the items of data, there are two main types of classification, namely classification by attributes and classification by magnitude. We shall consider classification by magnitude here.

(a) One-Way Classification

When classification is done considering only one variable, it is called one-way classification.

(b) Two-Way Classification

When classification is done according to two variables, it is called two-way classification.

The following terms are used in the classification of numerical data.

(i) Class Limits

The class limits are the highest and lowest values that can be included in the class. The highest value is known as the upper limit and the lowest value as the lower limit. For example, if the class interval is 30-40, the lower limit is 30 and the upper limit is 40.

(ii) Magnitude of Class Intervals

The difference between the upper limit and the lower limit of the class interval is known as the magnitude of the class interval. For example, for the class interval above, with lower limit 30 and upper limit 40, the magnitude is 10.

(iii) Mid Value

The average of the upper limit and the lower limit is called the mid value. It is the value in the middle of the class limits and is calculated in the following manner:

Mid Value = (Upper limit of class + Lower limit of class) / 2
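The two quantities defined so far, magnitude and mid value, can be checked with a short sketch for the class with lower limit 30 and upper limit 40. The function names are chosen here for illustration only.

```python
def class_magnitude(lower, upper):
    """Magnitude of a class interval: upper limit minus lower limit."""
    return upper - lower


def mid_value(lower, upper):
    """Mid value of a class: average of its upper and lower limits."""
    return (upper + lower) / 2


lower_limit, upper_limit = 30, 40   # the class interval 30-40

print(class_magnitude(lower_limit, upper_limit))  # 10
print(mid_value(lower_limit, upper_limit))        # 35.0
```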

(iv) Class Frequency

The number of items falling within a class interval is called the class frequency.
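Class frequencies can be obtained by counting, as in the sketch below. It assumes equal class intervals and that a value equal to an upper limit belongs to the next class; the text does not state a convention for such boundary values, so this is only one possible choice. The marks are invented and `class_frequencies` is a hypothetical helper.

```python
def class_frequencies(values, lower, width, n_classes):
    """Count how many values fall in each class interval.

    Assumption: the lower limit is included in a class and the
    upper limit is excluded, so 30 is counted in 30-40, not 20-30.
    Values outside all classes are ignored.
    """
    classes = [(lower + i * width, lower + (i + 1) * width)
               for i in range(n_classes)]
    table = {c: 0 for c in classes}
    for v in values:
        for lo, hi in classes:
            if lo <= v < hi:
                table[(lo, hi)] += 1
                break
    return table


# Invented marks of 12 students.
marks = [12, 35, 27, 40, 18, 33, 22, 9, 38, 30, 25, 41]
frequencies = class_frequencies(marks, lower=0, width=10, n_classes=5)

for (lo, hi), f in frequencies.items():
    print(f"{lo}-{hi}: {f}")
```

With these marks the class frequencies are 1, 2, 3, 4 and 2 for the classes 0-10 to 40-50, which add up to the 12 items.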

Tabulation

According to L. R. Connor, "Tabulation involves the orderly and systematic presentation of numerical data in a form designed to elucidate the problem under consideration". It is the next step after classification in the process of statistical investigation. More precisely, tabulation is an orderly arrangement of data in columns and rows. Tabulation is the final stage in the collection and compilation of data, and it enables the analysis and interpretation of figures. The importance of proper tabulation is great, because if the tabulation of data is not satisfactory, their analysis will be not only difficult but also defective. Data are tabulated so that they are simple and convenient to process, to interpret, and to draw inferences from.

Types of Tabulation

A statistical table is a systematic arrangement of numerical data presented in columns and rows for the purpose of comparison. The layout of a table depends on the information to be presented and the purpose for which it is prepared. In general terms, tabulation can be distinguished as simple and complex. Simple tabulation gives information regarding one or more independent questions, while complex tabulation gives information regarding two mutually dependent questions. Statistical tables are of the following major types:

(i) Simple and complex tables

(ii) One-way tables or single tables

(iii) Two-way tables or double tables

(iv) Higher order tables or manifold tables
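To make the difference between a one-way and a two-way table concrete, the sketch below tabulates invented records of ten students. The data, the class labels and the helper `format_two_way` are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Invented records: (sex, marks class) for ten students.
students = [
    ("male", "0-50"), ("female", "50-100"), ("male", "50-100"),
    ("female", "50-100"), ("male", "0-50"), ("female", "0-50"),
    ("male", "50-100"), ("female", "50-100"), ("male", "50-100"),
    ("female", "0-50"),
]

# One-way table: frequencies by a single characteristic (marks class).
one_way = Counter(marks for _, marks in students)

# Two-way table: frequencies by two characteristics taken together.
two_way = Counter(students)

rows = ["male", "female"]
cols = ["0-50", "50-100"]


def format_two_way(table, rows, cols):
    """Return the two-way table as text lines with row and column totals."""
    lines = [f"{'':8}" + "".join(f"{c:>8}" for c in cols) + f"{'Total':>8}"]
    for r in rows:
        counts = [table[(r, c)] for c in cols]
        lines.append(f"{r:8}" + "".join(f"{n:>8}" for n in counts)
                     + f"{sum(counts):>8}")
    totals = [sum(table[(r, c)] for r in rows) for c in cols]
    lines.append(f"{'Total':8}" + "".join(f"{n:>8}" for n in totals)
                 + f"{sum(totals):>8}")
    return lines


print(dict(one_way))
print("\n".join(format_two_way(two_way, rows, cols)))
```

A higher order table would extend the same idea by adding a third characteristic to each record.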