![]() |
Research Methods ResourcesCommon problems
|
||||
| Home | |||||
|
Common problems using MS Excel spreadsheets |
|||||
|
In the section What are data, we showed that entering data on a computer can be equal to a total data loss. Even if you add clear and unambiguous documentation to your data, there still exist a range of pitfalls you should avoid. Many researchers use MS Excel for data entry. We first give some example of data management problems we encountered before showing how to design an MS Excel spreadsheet for research data. |
|||||
|
In the example on the right, a researcher asked a secretary to enter the data of an experiment on progressive terracing (rows of two leguminous shrubs were planted on a slope to avoid soil erosion and fix nitrogen) in an Excel spreadsheet. He gave her a pile of field notes and left. The secretary had followed an MS Excel course in a technical college but didn't know very well what the experiment was about. She did her best and came up with this nice looking spreadsheet, something between a photocopy of the field notes and an aerial photo of the slopes where shrubs were planted. In terms of documentation, this isn't bad. You know the measurements units, location of the trees etc. (we just show a small fraction of the spreadsheet). The main problem however is that it will need extremely hard and cumbersome work to use this spreadsheet for further analysis. Think for instance how you would calculate the average height of Leucaena on all farms. Or how can you make a graph that shows the relationship between the diameter and height of Calliandra on each farm. The lesson learned here, apart from finding a better way to design the spreadsheet, is that people who enter the data should be aware of its background and further use. The researcher from the example should not blame the secretary. He should have been there to explain and tell exactly what is needed.
|
Click on the picture above to enlarge it in a new window. |
||||
|
The example on the left is a simple typing mistake. Probably still easy to spot on sight if your dataset is rather small. But imagine you're entering data from a survey among 3,500 households. There exist no person in the world who is able to enter big amounts of data without making mistakes. While there is not much difference between Chege and Chegee for a human, on a computer they are two completely different objects. So the mean and other calculations will be wrong. See further in the section on data validation.
|
|||||
|
In the example on the right, the total number of planted timber trees (#plantb/ha) plus the total number of fruit trees (#fr/ha) plus the natural regeneration (#natreg/ha) should equal all trees (#trees/ha). However, two cells of this small screenshot give a wrong result. The lesson learned is to let the computer work for you. Don't calculate sums by head or using a calculator. Use the correct MS Excel functions instead.
|
Click on the picture above to enlarge it in a new window. |
||||
|
Generally, MS Excel is too flexible. You can enter what you want and how you want it. The only solution is to use MS Excel in a very disciplined way. We show some resources in Data entry and validation.
|
|||||
| Home |
|
|
|||