MICROSOFT EXCEL 2000 AND 2003 FAULTS, PROBLEMS, WORKAROUNDS AND FIXES

DESCRIPTION

This is an extended analysis of Microsoft's Excel spreadsheet program with respect to statistical applications using statistical functions and routines. The earlier Excel versions are covered under Excel 2000, and the current version under Excel 2003.

New material has been added on the free Excel add-ins from the "Foxes" for optimization and non-linear equation fits. Also added is the extended precision computation package (xnumbers) for Excel. These extend the basic Excel computational capabilities.

This paper reviews some of the previous literature on errors and faults with Excel. The paper includes many more detailed tests made on Excel functions and routines by the author. The use of Excel in teaching introductory statistics is also reviewed and commented on.

There are 18 sections covering tests and evaluations of both the Excel 2000 and 2003 versions on graphics, univariate statistics, ANOVA, covariance, linear and non-linear regression, the statistical distributions, significance tests, random number generators and add-in packages. A review of the inherent problems with Excel's computing and display "machinery". The Excel statistical distributions are extensively analyzed. This paper also includes 26 notes covering more extended discussions on complaints and faults, suggested improvements in Excel, guides to the use of Excel and on other issues.

Also included is an Excel file (newchart.xls) that provides graphics sheets for histograms, dot plots, box&whisker, Pareto, pie, scatter, trend and regression normal residual charts.

The sections and notes are frequently changed, reflecting added tests, addition of new material, correction of errors, changing of text and modifying results.

A viewer may have problems in reading the files, since Internet Explorer (IE) may not recognize *.pdf files. Although pdf file readers are free, IE does not have a direct link to the Adobe PDF file reader in your computer. I have no good solution to this problem.

CONTENTS

I. INTRODUCTION

II. GENERAL PROBLEMS WITH EXCEL

III. EXCEL COMPUTATION AND DISPLAY ISSUES

IV. THE TESTING PROGRAM FOR ACCURACY

V. UNIVARIATE ANALYSIS

VI. ANALYSIS OF VARIANCE (ANOVA)

VII. COVARIANCE AND CORRELATION

VIII. LINEAR AND POLYNOMIAL REGRESSION

IX. OPTIMIZATION AND NON-LINEAR REGRESSION

X. STATISTICAL DISTRIBUTIONS AND RELATED FUNCTIONS

XI. TESTING FOR ACCURACY AND RELIABILITY OF STATISTICAL DISTRIBUTIONS

XII-1. RESULTS OF NEW TESTS ON STATISTICAL DISTRIBUTIONS, DISCRETES

XII-2. RESULTS OF NEW TESTS ON STATISTICAL DISTRIBUTIONS, CONTINUOUS FUNCTIONS

XII-3. RESULTS OF NEW TESTS ON STATISTICAL DISTRIBUTIONS, CONTINUOUS, CUMULATIVE

XII-4. RESULTS OF NEW TESTS ON STATISTICAL DISTRIBUTIONS, CONTINUOUS INVERSE

XIII. STATISTICAL TESTS, TESTS OF SIGNIFICANCE AND TESTS OF A HYPOTHESIS

XIV. RANDOM NUMBER GENERATION

XV. EXCEL ADD-IN PACKAGES

XVI. BIBLIOGRAPHY

XVII. CLOSING REMARKS

INDEX ON NOTES

EXCEL WORKSHEETS WITH DESCRIPTIVE CHARTS

ACKNOWLEDGEMENTS

I would like to thank Dr. B.D. McCullough, Department of Decisions Sciences
and Department of Economics, Drexel University, Philadelphia, PA. for
his help in making comments and corrections to the paper. I have tried
to incorporate all his comments and corrections, and the current version
is the result.

I would also like to thank Jerry W. Lewis, for his assistance in tests on
statistical distributions. He also steered me to Smith's set of accurate
probability distributions.

I thank the Journal "Computational Statistics and Data Analysis" (Elsevier)
for including a short note in an upcomming issue on this website.

Thanks to Leonardo Volpi for information on some problems in Excel
and on his XNUMBERS extended precision add-in and the optimization and non-linear equation add-ins.

COMMENTS AND REMARKS

This effort to test Excel started in 2001, based on McCullough's articles in "The American Statistician" in 1998.

The major problem has always been to find a way to publish the results. Most of the published Excel evaluations tended to be incomplete, primarily to get the publication down to a size that would be accepted for publication. Any complete evaluation of a statistical software package will be long, if it is comprehensive.

There is much to be said for BLOGS for getting out information.

CORRECTIONS AND CHANGES

If you have any comments, or have noted some errors or faults with Excel or my findings, please send it by email to daheiser@gvn.net.

1. Corrected section12 file name errors, noted by John Kim June 2, 2005.

2. Put in missing XLS file, June 2, 2005.

3. Corrected minor error in noteC and some major errors in section13, June 19, 2005.

4. First rewrite on section13. No errors, just expansions, July 13, 2005.

5. Changes to notec, notek, section2, section8 and a major rewrite of section13, July 13, 2005.

6. Additioanl tests on multicolinearity and on missing data, section7 and section8, July 23, 2005.

7. Expanded ordinal data, section3, added a test on CHITEST, corrected section13 table numbers and references, added to the bibleography, July 29, 2005.

8. Added discussion on RANDBETWEEN to section14, August 2, 2005.

9. Rewrote note h based on a discussion on EDSTAT on quartiles, August 30, 2005.

10. Put in new RAND period tests in section14. Added some comments on future statistical capabilities in section, November 10, 2005.

11. This page was changed, December 7, 2005.

12. Changes to Note A regarding EXCEL updates and fixes and changes to Section 8 in regard to Microsoft's reporting (Nov 7, 2005) of an accuracy problem with LINEST. December 27, 2005.

13. Changes to section 2 and to section 16 to reflect EUSPRIG 2005 issues regarding worksheet accuracy. Secember 29, 2005.

14. David Winsemius on 1/4/06 described an error in the HIST sheet in NEWCHART.XLS in regard to cell boundaries. This was fixed, and the change entered on 1/11/06. The original would not properly work for non-integer data.

15. Leonardo Volpi noted that the URL for a listing of add-ins in section 15 was wrong. The correced URL in section 15 was posted 2/10/06.

16. Robert de Levie raised some important questions on the results of tests on the ERF function in Section 12-3. A revision of this section was done and posted on 2/15/06.

17. The Introduction section was modified to include some new information. 2/27/06.

18. Revisions and additions to Section 1, Section 2, Section 3, Section 9, Section 16 and Note B. The revisions correspond to new information on Excel faults and fixes. Note B was changed to reflect the Feb 2006 current newslists. 3/2/06.

19. Note AA on the xnumbers add-in was added. 3/3/06.

20. Section 14 rewritten and the left-out routine "sampling" added. Rewriten material based on a recent "sci.stat.math" dialog. 3/19/06.

21. Section 3 added to, and section 9 expanded to include optimization and non-linear equation capabilities. Sections 1, 2, 3, 4, 5, 9, 11, 12-3, 14, 15 and 16 which had various changes prior to 4/11/2006 were converted to *.pdf and downloaded to this web site to restore currency. 4/11/06

22. Section 2 and the Bibleography were changed to include Jon Peltier's recommendations on getting contrasts between points, showing error bars and making box-and-whisker plots. 5/17/2006

Some major changes/additions are in the planning stage and have not yet been posted.