Sir Francis Galton's first Anthropometric Laboratory
Psychological Science: Mathematical Argument and the Quest for Scientific Respectability – Part 1 (Norman Costa)
By Norman Costa, Ph.D.
Mathematical Argument
We are reminded by Carl Sagan in his book, Cosmos, that the underpinning of modern science with mathematics goes back to Pythagoras. In the search for truths in nature, however, we no longer look for them in Pythagoras' mystical, even magical, power of numbers. Today, mathematics is indispensable for science as method, and science as content. We count, measure, perform basic operations (add, subtract, multiply, and divide), compute values, solve equations, use visual displays to communicate quantitative information, conduct statistical tests, and represent things and ideas with symbols and relationships.
The history of psychological science, even to the present day, has been a quest for scientific respectability. Few things have been as important to this quest as the development of mathematical argument for the science of psychology. Nothing has been more important, or as far reaching, for mathematical argument in psychology, than the development of the correlation coefficient. Because much of psychology (and the social sciences in general) has been the examination of individual differences, it was inevitable that tools be developed to express relationships and dependencies among different traits, capabilities, and just about anything that could be measured and recorded about people.
The rapid-fire discoveries in the 19th century of fundamental laws of nature in physics, chemistry, and the life sciences created an air of expectation, pride, and optimism. Some held the view that the discovery of all laws of physical nature would be concluded in the early part of the new century. Psychology envisioned its own role in this great leap forward in knowledge and science. The development of mathematical argument was about to elevate psychology to a level on a par with the more successful physical and life sciences – or so it was hoped.
It is difficult to appreciate, today, how exciting a time the late 19th and early 20th centuries were for scientific psychology. The development of the correlation coefficient became the Royal Road to scientific respectability, at least in the minds of the pioneers of psychological science. Statistical correlation formulas provided powerful tools that could be applied to a myriad of problems in the budding social and economic sciences. The correlation coefficient led to the development of other powerful tools like multiple correlation, canonical correlation, regression, and factor analysis. It gave impetus and support to the development of other tools for mathematical argument, particularly the concept of true score, and statistical tests.
The Correlation Coefficient, r
The idea of using mathematics to demonstrate a dependency or association between two factors began with the work of Auguste Bravais (23 August 1811, Annonay, Ardèche – 30 March 1863, Le Chesnay, France). One is more likely to read that Sir Francis Galton FRS (16 February 1822, Birmingham, England – 17 January 1911, Haslemere, Surrey, England) was the originator of mathematical correlation. This is not true, though Galton, considered the father of modern psychometrics, was a genius developer of the descriptive statistics that we use to this day: standard deviation, regression analysis, and the properties of the bivariate normal distribution. He saw a need to quantify the relationship between different variables in biometric studies, census and population data, psycho-physical data, and in hereditary and eugenic research. He wanted to express relation and degree of relation in his research.
Visual display of the concept of Regression to the Mean
Galton turned to the work of Auguste Bravais. Building upon Bravais' work, he developed some approaches and early indexes of association. Galton did not invent the correlation coefficient, but he was the first person to apply correlation to data that he collected in the field. Famously, Galton was the first person to give the correlation coefficient a single symbol, r. Karl Pearson FRS (27 March 1857, Islington, London, England – 27 April 1936, Coldharbour, Surrey, England), a student of Galton, wrote of the contribution of his mentor, and of the origin of mathematical correlation, in Philosophical Transactions of the Royal Society of London, 1897, Vol. 187.
“The fundamental theorems of correlation were for the first time and almost exhaustively discussed by [Auguste] Bravais ('Analyse mathématique sur les probabilités des erreurs de situation d'un point.' Mémoires par divers Savans, T. IX., Paris, 1846, pp. 255-332) nearly half a century ago. He deals completely with the correlation of two and three variables. Forty years later Mr. J. D. Hamilton Dickson ('Proc. Roy. Soc.,' 1886, p. 63) dealt with a special problem proposed to him by Mr. Galton, and reached on a somewhat narrow basis* some of Bravais' results for correlation of two variables. Mr. Galton at the same time introduced an improved notation which may be summed up in the 'Galton function' or 'coefficient of correlation.' This indeed appears in Bravais' work, but a single symbol is not used for it.”
There was great enthusiasm for measuring association, but none of the early approaches and indexes was wholly satisfactory. Not until Galton took Karl Pearson under his wing as a protégé did the mathematics and statistics of association become firmly developed and grounded.
Karl Pearson was a brilliant mathematician and mathematical statistician who contributed to the work of Galton and his students in developing statistical tools to measure association (what we know as relatedness or correlation). Pearson's single most important statistic, especially for the development of the modern theory of psychological and educational testing, was the Product Moment Correlation Coefficient.
Formula for Pearson's Product Moment Correlation Coefficient, r
Visual display of a correlation between Body Surface Area and Weight
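For readers who want to see the arithmetic behind the coefficient, here is a minimal sketch in Python of the standard textbook form of Pearson's r: the sum of the products of paired deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the weight and body surface area numbers are hypothetical, made up for illustration; they are not taken from the figure above.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Deviations of each score from its own mean
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sum_xy = sum(a * b for a, b in zip(dx, dy))  # sum of cross-products
    sum_xx = sum(a * a for a in dx)              # sum of squares for x
    sum_yy = sum(b * b for b in dy)              # sum of squares for y
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when one variable has no variation")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical data: weight in kg and body surface area in square meters
weight = [50, 60, 70, 80, 90]
body_surface_area = [1.50, 1.65, 1.80, 1.92, 2.05]

r = pearson_r(weight, body_surface_area)
print(f"r = {r:.3f}")
```

A value near +1 means that people who are heavier than average also tend to have a larger than average body surface area; a value near 0 would mean no linear relationship between the two.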
Karl Pearson's work overlapped with that of Galton's other students, notably Charles Spearman FRS (10 September 1863, London, England – 17 September 1945, London, England) in the United Kingdom. Spearman was to develop his own measure of relationship, which paralleled Pearson's in concept, but used a different computational approach. We know it as Spearman's Rank Order Correlation Coefficient. In time, Karl Pearson and Charles Spearman would have differences on various aspects and uses of correlation formulas. Spearman was the most successful in finding applications for statistical computations of association. He was the most articulate and insightful statistician and psychologist when it came to applying correlation analysis to the premier research problem of the day – the study of human intelligence and its composition. He developed the techniques of factor analysis – derived and extended from correlation analysis – that would be directed toward answering questions about the elemental nature of human intelligence.
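One common modern way to describe the rank order coefficient is as Pearson's r computed on ranks rather than on raw scores. The sketch below follows that description; it is not a reconstruction of Spearman's own computational procedure. The function names and the data are hypothetical, and tied scores are given the average of the ranks they span, which is an assumption of this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank order correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the rank order of x and y is identical, the rank coefficient is 1 even though the raw scores do not fall on a straight line and Pearson's r is slightly below 1.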
Edward L. Thorndike (August 31, 1874, Williamsburg, Massachusetts, U.S. – August 9, 1949, Montrose, New York, U.S.) was pivotal in introducing Spearman's concepts to American psychologists, because he, too, was looking for measures of association, and a way to measure test reliability. Spearman's two important publications in 1904 had to be published in America since the British Journal of Psychology had not yet been inaugurated. They were: “‘General intelligence,’ objectively determined and measured.” American Journal of Psychology, 15, 201-293; and “The proof and measurement of association between two things.” American Journal of Psychology, 15, 72-101. This was fortuitously helpful to Thorndike in the development of his own views on mental and social measurement. Working from Columbia University in New York City, he put Spearman's work into his very influential text, An Introduction to the Theory of Mental and Social Measurement (New York: Teachers College, Columbia University; 1904, 1913, 1922).
The period of development of concepts, and statistical formulations, for measures of association – what we call correlation – was an exciting time among early psychologists and psychometricians. Psychology, virtually alone, was in the forefront of applied statistical development when it came to measures of association. They believed that it would not only do wonders for applied problems in psychology and education, but would also bolster the image and credentials of psychology as a science, and lead to recognition of psychological science by their contemporaries in other, more successful sciences.
The True Score
Spearman was the first person to articulate the concepts of true score and error score (true score minus observed score), and the idea of errors of measurement. The invention of the true score, a mathematical construct, is one of the most important events in the history and development of psychological and educational testing. It is literally true that from this single invention was spawned a worldwide, multibillion-dollar industry, as ubiquitous and powerful today as at any time in its history.
Optical Mark Reader Answer Sheet
When it came to test theory, Spearman was in the forefront, as well. After developing his concept of true score, and applying it to the study of tests and testing, he came to propose correlation as a measure of the reliability of a test. This was another extremely exciting moment for psychology and education. It is impossible to overstate the professional pride and collective sense of achievement among psychologists in America and Europe (England in particular) in what they perceived to be the elevation of psychology to the status of legitimate science.
The rationale for the use of the correlation coefficient to measure reliability proceeded in the following way. First, the principal researchers in the field had been querying their colleagues in the more successful sciences about the nature of measurement and scientific instrumentation. They saw direct parallels to measurement and tests in psychology. The scientific psychologists wanted to emulate the same processes, and develop analogous concepts that would be recognized and understood by their non-psychologist contemporaries.
Engineering Measurement Caliper
In the manner of anecdote, in varied notes and writings, they talked about the counsel received from their scientific colleagues in other disciplines. They learned, so they thought, what was regarded as an essential element in scientific observation and measurement. A measurement instrument was only as good as it was reliable. It had to produce the same measurement score or results every time it was used, all things being equal. If repeated measurements gave different scores, when the same results were expected, then the measurement instrument, or test, was unreliable and of no use to science. These informal accounts of encounters with scientists in more successful fields never mention any extended discussion of the definition of reliability, or any examples of measurement. The only thing they report is that reliability is consistency of observed scores. There is no indication that they questioned their colleagues in other fields with any desire to understand the concept of reliability further.
Volt-Watt Meter
Spearman, and all others in the early years of scientific psychology, interpreted this counsel from non-psychological scientists in a very literal manner. Reliability was consistency of measurements. This literal translation of the concept of reliability would be a huge mistake. I shall cover this in a later article.
Minnesota Multiphasic Personality Inventory (MMPI) Profiles
Second, Spearman believed that correlation analysis would measure consistency of scores. The technique of correlation was straightforward. Administer a test, and then re-administer the same test to the same sample of people. If the test was reliable, and would produce consistent scores upon re-administration, then one should observe the same relative standing of people on both test administrations. The people who tended to be higher than the others on the first administration should tend to be higher than the others on the second administration. The same would be true for those who tended to score lower than others on the first administration. They would tend, also, to be lower on the second administration. Correlation coefficients, then as now, if they do anything, indicate whether relative standing on one test is related to relative standing on another test.
The question never answered, because it was never asked, is why consistency of score should be inferred from rank order of scores. Nothing in the definition of reliability says anything about absolute values of individual scores, or mean performance of the samples. This is the origin of the inability to distinguish, unambiguously, between reliability and validity. I shall cover this in a later article.
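A small numerical illustration may help show why relative standing and consistency of scores are not the same thing. In the hypothetical sketch below, every person scores exactly 10 points higher on the second administration. The scores and the function name pearson_r are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


# Hypothetical test-retest scores for five people
first_administration = [82, 75, 91, 68, 88]
# On retest, every single score is 10 points higher
second_administration = [score + 10 for score in first_administration]

r = pearson_r(first_administration, second_administration)
mean_shift = (
    sum(second_administration) / len(second_administration)
    - sum(first_administration) / len(first_administration)
)

print(f"test-retest r = {r:.3f}")
print(f"mean change   = {mean_shift:.1f} points")
```

No individual obtained the same score twice, yet the correlation is at its maximum, because everyone kept the same relative standing. The coefficient is blind to the 10-point change in absolute values.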
Myers-Briggs Type Indicator used in premarital counseling - HIS profile
Myers-Briggs Type Indicator used in premarital counseling - HER profile
Finally, from Spearman's conceptualization of true score, he reasoned that he could parse the relative size of true variance from total observed variance, and use a correlation coefficient as a means to estimate the ratio of one to the other. The ratio would be a measure of consistency of scores, or reliability. The excitement over progress in the field of mental and social measurement was validated for those pioneering psychologists by Spearman's rationale for the use of the correlation coefficient for measuring reliability. After all, the correlation coefficient was one of the great inventions of the new sciences of statistics and psychology. I will discuss in a later article how reliability as consistency of measurement scores was an incomplete interpretation of what they heard from their non-psychologist colleagues. Also, the concept of true score was a hypothetical abstraction, and an assumption at that, though it was treated as an axiom. We are going to find that this led to a fundamental mistake. The error was in treating the abstract concept of true score as if it were an actual reality.
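Spearman's reasoning about the ratio of true variance to observed variance can be illustrated with a simulation, which is the one setting where a true score can be known because we invent it. The sketch below rests on assumptions that are not spelled out in this article: errors are random, average zero, are unrelated to true scores, and are unrelated from one administration to the next. All names and numbers are hypothetical. The code follows the sign convention used above (error = true score minus observed score); the variances are the same under the opposite convention.

```python
import random
from math import sqrt
from statistics import pvariance


def pearson_r(x, y):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    return sum(a * b for a, b in zip(dx, dy)) / sqrt(
        sum(a * a for a in dx) * sum(b * b for b in dy)
    )


rng = random.Random(0)  # fixed seed so the run is repeatable
n_people = 20000
true_sd, error_sd = 10.0, 5.0  # assumed spreads of true scores and errors

# Invented "true" scores: knowable here only because this is a simulation
true_scores = [rng.gauss(100, true_sd) for _ in range(n_people)]
# Two administrations: same true score, fresh independent error each time
test_1 = [t - rng.gauss(0, error_sd) for t in true_scores]
test_2 = [t - rng.gauss(0, error_sd) for t in true_scores]

variance_ratio = pvariance(true_scores) / pvariance(test_1)
retest_r = pearson_r(test_1, test_2)
expected = true_sd**2 / (true_sd**2 + error_sd**2)  # 100 / 125 = 0.8

print(f"true variance / observed variance = {variance_ratio:.3f}")
print(f"test-retest correlation           = {retest_r:.3f}")
print(f"value implied by the assumptions  = {expected:.3f}")
```

Under these assumptions the test-retest correlation lands close to the variance ratio. The agreement depends entirely on the assumptions built into the simulation, in particular on the true score being something that exists and stays fixed between administrations.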
Please come back for Part 2 of Psychological Science: Mathematical Argument and the Quest for Scientific Respectability.
I remember hearing something about this paper a while back, and I am too math-challenged to make an intelligent response to it now that you have begun to write it. There are mathier people in the AB Universe, thank heaven, and I hope they will comment. I'm very glad you are at work on it.
Posted by: Elatia Harris | January 03, 2012 at 12:56 AM
@ Elatia:
Thank you for taking notice. There is a plan afoot. In time, all will be revealed. Correction, in short time, all will be revealed.
Posted by: Norman Costa | January 03, 2012 at 01:16 AM
Norman: I know that it often feels like singing in the wilderness when one writes a substantive post and there are no comments. But that doesn't necessarily mean that no one took notice. I for example, read the article soon after you published it. But same as Elatia, I couldn't think of anything particularly interesting to say or ask - I took it as an informational essay. If this is part of a paper in the making, best of luck with it. And thanks for posting your thoughts on psychological measurements here.
Posted by: Ruchira | January 03, 2012 at 09:12 AM
@ Ruchira:
Thanks. I predict that more will become clear, and hesitancy to comment will be abated.
Posted by: Norman Costa | January 03, 2012 at 04:15 PM
I'm looking forward to part 2, Norm. The "true = observed + error" score formula does seem a little bit like cheating. Suppose I proclaim Barack Obama a turtle. By that, I mean he is the President of the United States, leaving some margin for error in my representation.
Posted by: Dean C. Rowan | January 04, 2012 at 11:34 AM
@ Dean:
You are more on target than you think. Onward to Part 2.
Posted by: Norman Costa | January 04, 2012 at 02:43 PM