All the M&Ms You Want

Teachers of statistics know that a bag of M&M candies is often the best resource available for discussing practically any topic in the curriculum. In honor of “no sweets” week here at school, I created a virtual bag of M&Ms. You can set the size of each bag (any number of candies under 100) and you’ll get an assortment of blue, orange, green, yellow, red, and brown candies that follows the officially stated distribution for milk chocolate M&Ms!
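For anyone who wants to fill a bag of their own, here is a minimal sketch in Python. The percentages are an assumption on my part: they are the figures commonly cited for milk chocolate M&Ms, so check them against the current official numbers. The names `MM_PERCENT` and `make_bag` are made up for this example.

```python
import random
from collections import Counter

# Assumed color percentages (commonly cited for milk chocolate M&Ms);
# swap in the current official figures if they differ.
MM_PERCENT = {
    "blue": 24, "orange": 20, "green": 16,
    "yellow": 14, "red": 13, "brown": 13,
}

def make_bag(size, seed=None):
    """Draw `size` candies independently according to MM_PERCENT."""
    if not 0 < size < 100:
        raise ValueError("bag size must be between 1 and 99")
    rng = random.Random(seed)  # a seed makes a bag reproducible
    colors = list(MM_PERCENT)
    weights = list(MM_PERCENT.values())
    return rng.choices(colors, weights=weights, k=size)

bag = make_bag(55, seed=2011)
print(Counter(bag))
```

Each candy is drawn independently, so any single bag will usually wander away from the stated percentages, which is exactly what makes it useful in a statistics class.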

High Temperature

When the current temperature is greater than the predicted high temperature for the day, why doesn’t this report get updated? Is there a benefit in comparing the current temperature to the predicted high? Later in the day, wouldn’t it also be beneficial to know what the actual high for the day was?

Word Count

One of the neat WordPress plug-ins I use is Word Stats. Prior to this post, I had apparently written over 25,000 words for this blog. That’s halfway to a NaNoWriMo novel! My favorite word seems to be “that”, followed by “this”. The only “math” words to crack the top 20 are “number”, “numbers”, and “triangle”.

I wonder how the Word Stats plug-in handles the mathematical notation I often use. For example, is the phrase “a + b” considered to be three words?
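I don’t know what rule Word Stats actually uses, but the answer clearly depends on the counting rule. Here is a small sketch comparing two plausible rules, both of which are my own guesses rather than anything taken from the plug-in:

```python
import re

def count_by_whitespace(text):
    """Count every whitespace-separated token as a word."""
    return len(text.split())

def count_letter_runs(text):
    """Count only runs of letters (and apostrophes) as words."""
    return len(re.findall(r"[A-Za-z']+", text))

phrase = "a + b"
print(count_by_whitespace(phrase))  # 3: 'a', '+', 'b'
print(count_letter_runs(phrase))    # 2: 'a', 'b'
```

Under the first rule the plus sign is a word; under the second it is ignored.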

Starting the New Year Running

As I have done the past two years, tonight at midnight I’ll be running in the Emerald Nuts Midnight Run in Central Park. The first time I participated in this 4-mile run, I finished in 30 minutes. Last year, I managed just under 40. By extrapolation, I must resign myself to finishing in 50 minutes this year.

Relatedly, I will be running the NYC Half Marathon on March 18, 2012. This will be my second half marathon!

Pattern Matching

Inspired by last night’s #mathchat, I started exploring patterns. Specifically, I tried to create some patterns that could be “continued” in more than one way. For example, the sequence:

3, 5, 7, …

can have 9 as the next term (if we are listing odd numbers) or 11 (if we are listing odd prime numbers). Similarly, the sequence:

1, 11, 121, 1331, 14641 …

can have 161051 as the next term (if we are listing powers of 11) or 15101051 (if we are listing smashed together terms that form lines of Pascal’s Triangle).
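The two rules agree for the first five terms and split at the sixth, which is easy to check with a short sketch (the function names are made up for this example):

```python
from math import comb

def power_of_11(n):
    """The n-th power of 11."""
    return 11 ** n

def smashed_pascal_row(n):
    """Row n of Pascal's Triangle with its entries written side by side."""
    return int("".join(str(comb(n, k)) for k in range(n + 1)))

for n in range(6):
    print(n, power_of_11(n), smashed_pascal_row(n))
```

The rules part ways at row 5 because that is the first row with a two-digit entry (10), so the digits can no longer be read off as a power of 11 without carrying.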

But this only got me thinking: given any finite sequence of numbers, no matter how long, there are infinitely many choices for the next number in the sequence, and all of them could fit into a plausible pattern! For example, you might think it’s obvious what the next number here is:

2, 4, 6, 8, 10, …

But I could just as easily claim the next number is 983, because the actual sequence I had in mind was:

2, 4, 6, 8, 10, 983, 985, 987, 989, 1001, 1974, 1976, 1978, 1980, 2953, …
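One standard way to back up the claim that any next term can be made to fit a rule is polynomial interpolation: for any six values there is exactly one polynomial of degree at most five passing through them. This is not the pattern behind my sequence above, just a sketch of one rule that always works. The helper name `lagrange_value` is made up for this example.

```python
from fractions import Fraction

def lagrange_value(ys, x):
    """Evaluate at x the unique polynomial of degree < len(ys)
    that passes through (1, ys[0]), (2, ys[1]), ..."""
    n = len(ys)
    total = Fraction(0)
    for i in range(n):
        term = Fraction(ys[i])
        for j in range(n):
            if j != i:
                # Lagrange basis factor (x - x_j) / (x_i - x_j), with x_k = k + 1
                term *= Fraction(x - (j + 1), i - j)
        total += term
    return total

start = [2, 4, 6, 8, 10]
for sixth in (12, 983, -7):
    ys = start + [sixth]
    # The polynomial reproduces all six given terms exactly.
    assert all(lagrange_value(ys, k + 1) == y for k, y in enumerate(ys))
    print(sixth, "is followed by", lagrange_value(ys, 7))
```

Whatever sixth term you pick, the polynomial matches every term given so far and then happily predicts a seventh.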

Here’s a much more compelling question though, and it has close ties to some of the theory used in statistics and hypothesis testing. Consider this sequence again, but think about plausible choices for the next two terms:

2, 4, 6, 8, 10, …, …

You might believe, prior to all of this discussion, that the most likely choices are 12 and 14. If I reveal that the term right after 10 is not 12, then that completely undermines your belief. But what if I told you that it is 12?

2, 4, 6, 8, 10, 12, …

How much stronger is your belief now that 14 is the next number? Or, to put it another way, how does this new information affect your previously held belief?
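To make the question concrete, here is a toy Bayesian update. Everything in it is hypothetical: the three candidate rules and their prior weights are invented purely for illustration, and a different set of rules or priors would give a different answer.

```python
from fractions import Fraction

# Hypothetical rules with made-up prior weights (illustration only).
# Each entry: rule name -> (prior, predicted 6th term, predicted 7th term)
HYPOTHESES = {
    "even numbers":                 (Fraction(90, 100), 12, 14),
    "blocks of five, then a jump":  (Fraction(5, 100), 983, 985),
    "evens up to 12, then restart": (Fraction(5, 100), 12, 2),
}

def belief_in_seventh(value, observed_sixth=None):
    """Probability that the 7th term equals `value`, optionally after
    learning the 6th term (rules that predicted otherwise are discarded)."""
    consistent = [
        (prior, seventh)
        for prior, sixth, seventh in HYPOTHESES.values()
        if observed_sixth is None or sixth == observed_sixth
    ]
    total = sum(prior for prior, _ in consistent)
    return sum(prior for prior, seventh in consistent if seventh == value) / total

print(belief_in_seventh(14))                     # before seeing the 6th term
print(belief_in_seventh(14, observed_sixth=12))  # after learning it is 12
```

With these invented numbers, learning that the sixth term is 12 nudges the belief in 14 upward, but not all the way to certainty, because at least one rival rule also predicted 12.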

Fantasy Football Recap

My fantasy football season is over. I finished with an 11-3 record and first place out of ten teams after the regular season. Alas, I lost in the first round of the playoffs. The finals will be played this coming weekend without me.

Looking back on this season, here are the players who were selected in the first round and their distribution of fantasy points scored over the first fifteen weeks of the season. All told, there are seven running backs, two wide receivers, and one quarterback.

And here are the ten players who currently have the highest average fantasy points per game. Among this group there are eight quarterbacks and two running backs.

In fact, only three of the top ten draft picks ended up in the top ten of average points scored per game!

Relative Salaries Again

When I showed different people the graphic I made comparing my salary with that of Albert Pujols over the next ten years, I noticed they tended to overestimate the amount of money I earned by a good margin. In fact, I earn less per year than the average male, 25-34 years old, with a bachelor’s degree or more.

I attribute this overestimation to the fact that I used the areas of circles to represent the two salaries, and in my experience people (myself included) are just generally bad at discerning relative areas. While making the graphic, I had to triple-check my arithmetic because the picture I saw didn’t match intuitively with the numbers I had in mind.

So here’s another take. The area of each figure still represents the salaries of myself and Albert Pujols, but now the figures are rectangles, and each rectangle has the same width. Therefore, to compare relative sizes, you only need to compare a single dimension (the height) rather than two.
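The arithmetic behind the two pictures is worth spelling out. When area encodes the salary, a circle’s diameter grows only with the square root of the salary ratio, while the height of a fixed-width rectangle grows with the ratio itself. A small sketch, using a hypothetical ratio of 100 rather than the real salaries:

```python
import math

def circle_diameter_ratio(salary_ratio):
    """How many times wider the larger circle is when area encodes salary."""
    return math.sqrt(salary_ratio)

def rectangle_height_ratio(salary_ratio):
    """How many times taller the larger rectangle is at equal width."""
    return salary_ratio

ratio = 100  # hypothetical: one salary is 100 times the other
print(circle_diameter_ratio(ratio))   # 10.0
print(rectangle_height_ratio(ratio))  # 100
```

A reader who judges the circles by their widths sees a factor of 10 where the data says 100, which would inflate the smaller salary in exactly the way I noticed.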

Smooth Curves

I’m not usually a fan of line graphs. I think connecting consecutive points on a scatter plot makes data that’s not continuous seem like it is. However, while plotting some data in Google Docs earlier, a particular line graph did raise an interesting question in my mind.

Here are twelve consecutive attempts to solve a Rubik’s Cube and the time each one took me, in seconds. I think I’m doing well, though it’ll be some time before I get my times down below 60 seconds.

Here’s the same data, but with the points connected:

What got me thinking was the next one. Google Docs lets you connect the points using smooth curves.

It’s very appealing visually, though somewhat inaccurate. Here’s what I’m wondering: how does Google Docs draw the smooth curves?

Though there’s only one way to connect two points with a straight line, there seem to be infinitely many (or at least more than one) ways to connect two points with a curve. So which one is most appropriate? Sounds like I might have a project ahead of me involving Bézier curves.
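As a starting point for that project, here is one common recipe, sketched in Python: a uniform Catmull-Rom spline, where each segment between two data points is a cubic Bézier curve whose control points are borrowed from the neighboring points. I have no idea whether this is what Google uses; it is simply one well-known way to pick a curve. The solve times below are hypothetical, not my actual data.

```python
def catmull_rom_to_bezier(p0, p1, p2, p3):
    """Control points of the cubic Bezier segment from p1 to p2
    in a uniform Catmull-Rom spline through p0, p1, p2, p3."""
    c1 = (p1[0] + (p2[0] - p0[0]) / 6, p1[1] + (p2[1] - p0[1]) / 6)
    c2 = (p2[0] - (p3[0] - p1[0]) / 6, p2[1] - (p3[1] - p1[1]) / 6)
    return p1, c1, c2, p2

def bezier_point(b0, b1, b2, b3, t):
    """Point on the cubic Bezier curve at parameter t in [0, 1]."""
    u = 1 - t
    return tuple(
        u**3 * a + 3 * u * u * t * b + 3 * u * t * t * c + t**3 * d
        for a, b, c, d in zip(b0, b1, b2, b3)
    )

# Hypothetical (attempt number, seconds) pairs, not the real data.
points = [(1, 95), (2, 88), (3, 102), (4, 90)]
segment = catmull_rom_to_bezier(*points)
for t in (0, 0.25, 0.5, 0.75, 1):
    print(bezier_point(*segment, t))
```

The curve passes through the data points themselves, but between them it can rise above or dip below both neighbors, which may be part of why the smooth version looks appealing yet somewhat inaccurate.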

Equilateralness This Year

A couple of months ago, my good friend and colleague Mr. Honner asked a simple question motivated by the dates Oct 10, 2011 and Oct 11, 2011 (10-10-11 and 10-11-11): which triangle is more equilateral?

In a follow up, he defined a new metric, “equilateralness”, which is given by the ratio of the triangle’s area to the area of an equilateral triangle with the same perimeter. As a triangle becomes more equilateral, this ratio approaches 1.

Today being another “close to equilateral date”, I decided to plot the equilateralness of every date in 2011. Obviously, we had one perfectly equilateral date (11-11-11), but did you know that today (12-12-11) is the second most equilateral at Eq ≈ 0.9953? And did you know that, among dates whose numbers form a proper triangle, January 11 and November 1 (1-11-11 and 11-1-11, respectively) share the lowest Eq this year?

You can explore the trends in the graph below (did you know you can publish charts from Google Docs!?). If a date failed the triangle inequality (and 223 of them did), I gave it an equilateralness of 0.
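If you would rather compute than click, here is a minimal sketch of the calculation in Python. It uses Heron’s formula for the area and assumes a date counts as failing whenever the two shorter sides do not strictly exceed the longest, so degenerate “flat” triangles fail too. The function names are made up for this example.

```python
import math
from datetime import date, timedelta

def equilateralness(a, b, c):
    """Area of the triangle divided by the area of the equilateral
    triangle with the same perimeter; 0 if no proper triangle exists."""
    x, y, z = sorted((a, b, c))
    if x + y <= z:  # fails the (strict) triangle inequality
        return 0.0
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))  # Heron's formula
    side = (a + b + c) / 3
    return area / (math.sqrt(3) / 4 * side**2)

def days_of_2011():
    day = date(2011, 1, 1)
    while day.year == 2011:
        yield day
        day += timedelta(days=1)

# Sides are month, day, and the two-digit year 11.
scores = {d: equilateralness(d.month, d.day, 11) for d in days_of_2011()}
failed = sum(1 for eq in scores.values() if eq == 0.0)

print(round(scores[date(2011, 12, 12)], 4))  # 0.9953
print(failed)                                # 223
```

Counting flat triangles as failures is what reproduces the 223 figure; a rule that let them through would give a smaller count.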