Research Methods Resources

What are data?

Look at the picture on the right. Do you think these are data? If you say yes, can you explain why? If you say no, can you explain why not?

After thinking about these questions, click here to see a slightly different version of the same picture. Are you still sure of your answer?

Now, especially for social scientists, what about the picture of the young lady on the left? Do you think these are data? If you say yes, can you explain why? If you say no, can you explain why not?

After thinking again, click here to see a slightly different version of the same picture. Are you still sure of your answer?

Finally, look at the MS Excel spreadsheet on the right. Are these data?

Now click here to answer some questions about the data in the spreadsheet.

Data are...

...more than just numbers and text: you have to be able to put them in their correct context. And data management is more than just entering data in a spreadsheet.

The life cycle of a research project can be viewed as a chain of data transformations.

Going through the cycle, we can imagine formulating (fm) the research objectives to address particular problems. Once we have the research objectives, we develop (dv) the protocol, from which we design (ds) the observation units. We can then go into the field to collect (coll) the data. These are compiled (cm) into a well-structured dataset, which we can query (qy) to select subsets of the data to analyse (as). The results of our analyses should be published (pb), leading to knowledge.

This is essentially the same structure as the flowchart of a research project, and as the way the resources are organised on this website. Here we look at it from a slightly different angle.

This section on handling data mainly covers the processes from the collection of data and the production of the data sheets through to the selection of data for analysis. We will, however, consider the entire life cycle, because the processes and transformations are linked in such a way that a weak point anywhere in the chain can cause the entire process to collapse. Each point in the chain needs sufficient, clear and unambiguous documentation.
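The chain idea can be made concrete with a small sketch. The Python below is only an illustration, not part of the original resources: the Stage class, the LIFE_CYCLE list and the two helper functions are hypothetical names, and "documentation" is reduced to a free-text note per stage. It lists the eight transformations named above and reports which links in the chain are still undocumented.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One link in the research life cycle (hypothetical structure)."""
    code: str                # abbreviation used in the text, e.g. "coll"
    action: str              # what happens at this stage
    documentation: str = ""  # free-text note describing what was done


# The eight transformations named in the text, in order.
LIFE_CYCLE = [
    Stage("fm", "formulate the research objectives"),
    Stage("dv", "develop the protocol"),
    Stage("ds", "design the observation units"),
    Stage("coll", "collect the data"),
    Stage("cm", "compile a well-structured dataset"),
    Stage("qy", "query and select subsets"),
    Stage("as", "analyse"),
    Stage("pb", "publish"),
]


def undocumented(stages):
    """Return the codes of all stages that have no documentation note."""
    return [s.code for s in stages if not s.documentation.strip()]


def chain_is_sound(stages):
    """The chain counts as sound only if every stage is documented."""
    return not undocumented(stages)
```

With every note left empty, undocumented(LIFE_CYCLE) returns all eight codes. A single missing note is enough for chain_is_sound to return False, which mirrors the point that one weak link endangers the whole process.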

The MS Excel spreadsheet shown above is an extreme example. Numbers and text are entered in a spreadsheet but nobody except the person who entered them knows what they are about. And even that person will have forgotten after a few months.
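One simple safeguard against this kind of loss is to keep a codebook (a description of every column) next to the data and to check the two against each other. The sketch below assumes the spreadsheet has been exported as CSV text; the column names, the codebook entries and the sample row are all invented for illustration and do not come from the spreadsheet shown above.

```python
import csv
import io

# Hypothetical codebook: one entry per column, giving its meaning and unit.
CODEBOOK = {
    "plot_id": "Unique identifier of the field plot",
    "date": "Date of measurement (YYYY-MM-DD)",
    "height_cm": "Plant height in centimetres",
}


def undocumented_columns(csv_text, codebook):
    """Return the column names in the data that the codebook does not describe."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # the first row holds the column names
    return [name.strip() for name in header if name.strip() not in codebook]


# Invented sample: the column "x2" has no entry in the codebook.
SAMPLE = "plot_id,date,height_cm,x2\nA1,2009-03-14,41.5,7\n"

print(undocumented_columns(SAMPLE, CODEBOOK))  # prints ['x2']
```

A column such as "x2" is exactly the kind of entry that only the person who typed it can interpret, and only for a while.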

 

Data entry on a computer can be equal to total data loss

This leads to the apparent contradiction that better technology can lead to worse data management. Before personal computers became standard equipment for researchers (the IBM PC only came on the market in 1981), research stations, and especially agricultural research stations, used to have strict rules on how to collect, record and file data: measurements were recorded in field notebooks, neatly copied onto paper sheets, checked by supervisors, endorsed by the biometrics unit, and filed according to programmes and research projects.

In many research stations nowadays, everyone enters data using their own system and preferences.

Within this section on handling data, we will cover how to enter data in a spreadsheet and how to check for errors, but the concept of data management will also be expanded to the organisational level: you cannot seriously call yourself a research organisation if you don't have a data management strategy and several policies and procedures to implement it.

But first, we show some common data management problems that arise when using MS Excel spreadsheets.
