24 April 2006

๐Ÿ–️Schuyler W Huck - Collected Quotes

"Distributional shape is an important attribute of data, regardless of whether scores are analyzed descriptively or inferentially. Because the degree of skewness can be summarized by means of a single number, and because computers have no difficulty providing such measures (or estimates) of skewness, those who prepare research reports should include a numerical index of skewness every time they provide measures of central tendency and variability." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"If a researcher checks the normality assumption by visually inspecting each sample’s data (for example, by looking at a frequency distribution or a histogram), that researcher might incorrectly think that the data are nonnormal because the distribution appears to be too tall and skinny or too flat and squatty. As a result of this misdiagnosis, the researcher might unnecessarily abandon his or her initial plan to use a parametric statistical test in favor of a different procedure, perhaps one that is thought to be distribution-free." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"If data are normally distributed, certain things are known about the group and individual scores in the group. For example, the three most frequently used measures of central tendency - the arithmetic mean, median, and mode - all have the same numerical value in a normal distribution. Moreover, if a distribution is normal, we can determine a person’s percentile if we know his or her z-score or T-score." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"It is best to think of the various kinds of central tendency indices as falling into three categories based on the computational procedures one uses to summarize the data. One category deals with means, with techniques put into this category if scores are added together and then divided by the number of scores that are summed. The second category involves different kinds of medians, with various techniques grouped here if the goal is to find some sort of midpoint. The third category contains different kinds of modes, with these techniques focused on the frequency with which scores appear in the data." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"It is dangerous to think that standard scores, such as z and T, form a normal distribution because (1) they don’t have to and (2) they often won’t. If you mistakenly presume that a set of standard scores are normally distributed (when they’re not), your conversion of z-scores (or T-scores) into percentiles can lead to great inaccuracies." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"It should be noted that any finite data set cannot “follow” the normal curve exactly. That’s because a normal curve’s two 'tails' extend out to positive and negative infinity. The curved line that forms a normal curve gets closer and closer to the baseline as the curved line moves further and further away from its middle section; however, the curved line never actually touches the abscissa." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"[…] kurtosis is influenced by the variability of the data. This fact leads to two surprising characteristics of kurtosis. First, not all rectangular distributions have the same amount of kurtosis. Second, certain distributions that are not rectangular are more platykurtic than are rectangular distributions!" (Schuyler W Huck, "Statistical Misconceptions", 2008)

"The shape of a normal curve is influenced by two things: (1) the distance between the baseline and the curve’s apex, and (2) the length, on the baseline, that’s set equal to one standard deviation. The arbitrary values chosen for these distances by the person drawing the normal curve determine the appearance of the resulting picture." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"The concept of kurtosis is often thought to deal with the 'peakedness' of a distribution. Compared to a normal distribution (which is said to have a moderate peak), distributions that have taller peaks are referred to as being leptokurtic, while those with smaller peaks are referred to as being platykurtic. Regarding the second of these terms, authors and instructors often suggest that the word flat (which rhymes with the first syllable of platykurtic) is a good mnemonic device for remembering that platykurtic distributions tend to be flatter than normal." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"[...] the term statistical misconception refers to any of several widely held but incorrect notions about statistical concepts, about procedures for analyzing data and about the meaning of results produced by such analyses. To illustrate, many people think that (1) normal curves are bell shaped, (2) a correlation coeffi cient should never be used to address questios of causality, and (3) the level of signifi cance dictates the probability of a Type I error. Some people, of course, have only one or two (rather than all three) of these misconceptions, and a few individuals realize that all three of those beliefs are false."(Schuyler W Huck, "Statistical Misconceptions", 2008)

"The second surprising feature of kurtosis is that rectangular distributions, which are flat, are not maximally platykurtic. Bimodal distributions can yield lower kurtosis values than rectangular distributions, even in those situations where the number of scores and score variability are held constant." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"There are degrees to which a distribution can deviate from normality in terms of peakedness. A platykurtic distribution, for instance, might be slightly less peaked than a normal distribution, moderately less peaked than normal, or totally lacking in any peak. One is tempted to think that any perfectly rectangular distribution, being ultraflat in its shape, would be maximally platykurtic. However, this is not the case." (Schuyler W Huck, "Statistical Misconceptions", 2008)

"Various measures of central tendency have been invented because the proper notion of the 'average' score can vary from study to study. Depending on the kind of data collected, the degree of skewness in the data, and the possible existence of outliers, it may be that the most appropriate measure of central tendency is found by doing something other than (1) dividing the sum of the scores by the number of scores (to get the mean), (2) calculating the midpoint in the distribution (to get the median), or (3) determining the most frequently observed score (to get the mode)." (Schuyler W Huck, "Statistical Misconceptions", 2008)

