Archive for August, 2015

Building R 3.2.2 on Fedora 22

I started building R 3.2.2 on Fedora 22 today, and I got the dreaded

configure: WARNING: you cannot build info or HTML versions of the R manuals
configure: WARNING: you cannot build PDF versions of the R manuals
configure: WARNING: you cannot build info or HTML versions of vignettes and help pages

And Google turned up about 99 solutions telling me to go read the manual.

But hey, I figure I’ll read the manual so that you don’t have to.  Here is the installation procedure I followed.  It probably won’t match the exact magical incantations you’d need on another flavor of Linux.  But at least you can use this as a guide to get some additional ideas on what packages you have to have, and then create your own spells.

Downloading the Source

The source can be downloaded as a .tar.gz file from www.r-project.org.

Preparing the System to Build

First, if you don’t already have it, you’ll need to install a compiler.  In fact, you’ll need three: a C compiler, C++ compiler, and a Fortran 90 compiler.

sudo yum install gcc
sudo yum install gcc-c++
sudo yum install gcc-gfortran

Many R packages also depend on Java, so you may want to download the latest JRE from www.java.com

To install kernel headers and development libraries, yum has a nice group install feature.  You’ll need to install the following groups:

sudo yum groupinstall "Development tools"
sudo yum groupinstall "Development Libraries"
sudo yum groupinstall "X Software Development"

Installing Stuff That R Needs

A close reading of the R build and installation manual above reveals that a number of packages are required on Linux.  It would have been nice if the manual had provided them as a list, so here is a list of the things I needed to install.  (Some of these may already be installed on your system.)

sudo yum install zlib
sudo yum install lzma
sudo yum install curl
sudo yum install pcre
sudo yum install bzip2

And then there are the TeX libraries

sudo yum install texinfo
sudo yum install texinfo-tex
sudo yum install tex
sudo yum install texlive-scheme-basic    # A.K.A. LaTeX, lol!
sudo yum install texlive-inconsolata        #  A font needed by R 3.2.2 manuals

Configuring and Building

This is usually a two-step process: configure, then make.  There should be a script called configure in the R-3.2.2 directory.  Most people will want to build R as a shared library, especially if they want to use it with RStudio, so run configure with the --enable-R-shlib option.  If the configure step fails, it usually does so with a message that indicates a missing package.  If something is missing that is not covered above, you may have to google for the exact package name given the message.  Then you can make, make pdf, and make install.

./configure --enable-R-shlib
make
make pdf
sudo make install

That’s it!

Now maybe you don’t need to make pdf, but when I skipped that step and tried to make install, it complained that there was a missing “NEWS.pdf”.


Yamslam Odds

Yamslam is a dice game by Blue Orange Games that our family loves to play. The game is exciting, and we have a tradition that if one of us gets a Large Straight or a Yamslam (the highest roll), they slap the 50-point chip on their forehead, raise their arms in the air and shout “Yeaaaahhhh!”

In this blog post, I will calculate the odds of rolling various Yamslam rolls in one roll.

Possible Rolls

There are five dice and three chances to roll during each turn. The player chooses which dice to keep and which to re-roll to improve their lot.

The rolls in Yamslam roughly follow poker hands, listed here in order from highest to lowest: Yamslam (5 of a kind), Large Straight (5 in a row), Four of a Kind, Full House, Flush (all evens or all odds), Small Straight (4 in a row), Three of a Kind, and Two Pair. In addition to the scoring rolls, I will also calculate the odds of getting One Pair and our family’s unofficial “nothing” roll, Bupkiss.

Calculation

Since there are 5 dice and each die can take on one of 6 values, there are 6^5 = 7776 possible rolls. In order to come up with basic probabilities for the above Yamslam rolls, we will basically count the occurrences of each type of roll and divide by 7776.
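The count-and-divide approach is easy to check by brute force. Here is a minimal sketch in Python (chosen only for illustration; the names rolls, probability, and all_same are made up for this example) that lists every roll and computes the probability of one simple pattern.

```python
from itertools import product

# Every ordered roll of five six-sided dice.
rolls = list(product(range(1, 7), repeat=5))

def probability(count, total=len(rolls)):
    """Chance of a roll type, given how many of the rolls match it."""
    return count / total

# Example pattern: all five dice show the same face.
all_same = sum(1 for r in rolls if len(set(r)) == 1)
print(len(rolls), all_same, f"{probability(all_same):.2%}")  # 7776 6 0.08%
```

Each die position is treated as distinct here, which is why 1-1-1-1-2 and 2-1-1-1-1 count as two different rolls.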

For each roll, I will count both inclusively and exclusively. Inclusive means that the count will include higher scoring rolls that also match the given roll. For example, a Three of a Kind inclusive count would include Yamslam, Four of a Kind, and Full House as well as Three of a Kind because those rolls include a Three of a Kind.

Yamslam

The basic pattern of this roll is AAAAA. Of all 7776 possible rolls of 5 dice, exactly 6 of these have the Yamslam, or 5 of a kind, pattern. Since Yamslam is the highest possible roll, this count is both inclusive and exclusive.

Yamslam Inclusive Exclusive
Count 6 6
Probability 0.08% 0.08%

Large Straight

The pattern is ABCDE with successive numbers. For five dice, there are only two kinds of Large Straights possible: 1-2-3-4-5 and 2-3-4-5-6. But each die is distinct and can be arranged in any position, so the total count is 2 * 5! = 2 * (5 * 4 * 3 * 2 * 1) = 240.

If you don’t see where the 5! comes from, consider that for any sequence of 5 distinct elements, there is a choice of 5 places where the first element can go, times 4 remaining choices of where the second element can go, times 3 remaining choices of where the third element can go, times 2 remaining choices of where the fourth element can go, times 1 remaining spot for the last element.

This count is also both inclusive and exclusive since there is no way a Yamslam can masquerade as a Large Straight.

Large Straight Inclusive Exclusive
Count 240 240
Probability 3.09% 3.09%
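The 2 * 5! figure can be cross-checked in two ways: by listing the orderings of each straight, and by scanning all 7776 rolls. This is only a sketch, and the variable names are illustrative.

```python
from itertools import permutations, product
from math import factorial

straights = [(1, 2, 3, 4, 5), (2, 3, 4, 5, 6)]

# Each straight uses five distinct faces, so every ordering is a different roll.
by_formula = len(straights) * factorial(5)
by_listing = sum(len(set(permutations(s))) for s in straights)

# Cross-check against all rolls: the sorted faces must match one of the two runs.
by_enumeration = sum(
    1 for r in product(range(1, 7), repeat=5) if tuple(sorted(r)) in straights
)
print(by_formula, by_listing, by_enumeration)  # 240 240 240
```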

Four of a Kind

This pattern is AAAAB. To count up the possible Four of a Kind rolls, consider that there are 6 choices for A, and 5 remaining choices for die B. Also, the remaining die can be arranged in any one of 5 positions. So the exclusive count is 6 * 5 * 5 = 150. Now every Yamslam can also be considered a Four of a Kind, so the inclusive count should include the 6 possible Yamslams for a total of 156.

Four of a Kind Inclusive Exclusive
Count 156 150
Probability 2.01% 1.93%

Full House

The Full House pattern is AAABB where A and B are distinct numbers. This means that neither a Yamslam nor a Four of a Kind can ever be counted as a Full House, and certainly neither can a Large Straight. So the exclusive and inclusive counts will be the same. There are 6 choices for the A and 5 choices for the B for a total of 30 distinct AAABB number pairings, but how many arrangements are there? There are 5! ways to arrange 5 distinct dice, but the A’s can be rearranged amongst themselves in 3! ways and the two B’s can be rearranged in 2! ways, so the total count is 6 * 5 * 5! / (3! * 2!) = 300.

The quantity 5! / (2! * 3!) is also known as the combination of 5 objects taken 2 at a time and can be written as C(5,2). Another way to think about the foregoing calculation is to think of a bag containing the numbers 1 through 5 representing where the B’s would show up in a list of five slots, and you pull out 2 of those without replacement. How many different combinations do you come out with? 5! / (2! * 3!) = C(5,2).

Full House Inclusive Exclusive
Count 300 300
Probability 3.86% 3.86%
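To see that 5! / (3! * 2!) really is C(5,2), and that the Full House count comes out to 300, here is a small sketch. It assumes Python 3.8 or later for math.comb, and the variable names are made up for the example.

```python
from collections import Counter
from itertools import product
from math import comb, factorial

# Arrangements of AAABB: 5! orderings, divided by the 3! * 2! orderings that
# only shuffle identical dice among themselves. This equals C(5, 2).
arrangements = factorial(5) // (factorial(3) * factorial(2))
assert arrangements == comb(5, 2) == 10

by_formula = 6 * 5 * arrangements

# A Full House has exactly two distinct faces, occurring 3 and 2 times.
by_enumeration = sum(
    1
    for r in product(range(1, 7), repeat=5)
    if sorted(Counter(r).values()) == [2, 3]
)
print(by_formula, by_enumeration)  # 300 300
```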

Flush

The Flush is where it starts to get interesting. In Yamslam, the Flush is when all of the dice are odd or all are even. There’s no simple pattern for the Flush and it overlaps with many other rolls. But, for each set of evens or of odds there are 3 choices for each of 5 dice in the roll: 3^5 = 3*3*3*3*3 = 243, so the inclusive Flush count is 2 * 243 = 486.

To get the exclusive count of flushes we have to subtract off the cases where a Four of a Kind, a Full House, and a Yamslam are also flushes. (A Large Straight is never a flush.)

Obviously, all 6 Yamslams are also Flushes. How many Four of a Kind rolls are also Flushes? There are 6 choices for the four matching dice, and for each of those there are 2 choices for the fifth die that keep the roll a Flush (the other two numbers of the same parity), and 5 choices for where that die goes: 6 * 2 * 5 = 60 Four of a Kinds that are also Flushes. Similarly, in a Full House, for each of 6 choices for the triple, there are 2 choices for the pair to make a Flush, and there are C(5,2) = 5! / (3! * 2!) = 10 arrangements just like before, for a total of 6 * 2 * 5!/(3! * 2!) = 120 Full Houses that are also Flushes.

Therefore the exclusive number of Flushes is 486 – (6 + 60 + 120) = 300.

Flush Inclusive Exclusive
Count 486 300
Probability 6.25% 3.86%
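The Flush overlaps are the first place where it is easy to slip, so a brute-force check is worthwhile. The sketch below uses two illustrative helpers, is_flush and shape, which are not from any library; shape returns the sorted multiplicities of the faces in a roll.

```python
from collections import Counter
from itertools import product

rolls = list(product(range(1, 7), repeat=5))

def is_flush(roll):
    # All five dice share the same parity.
    return len({d % 2 for d in roll}) == 1

def shape(roll):
    # Sorted multiplicities, e.g. (1, 4) for AAAAB and (2, 3) for AAABB.
    return tuple(sorted(Counter(roll).values()))

flushes = [r for r in rolls if is_flush(r)]
# Flushes that are also a Yamslam, a Four of a Kind, or a Full House.
higher = Counter(shape(r) for r in flushes if shape(r) in {(5,), (1, 4), (2, 3)})

inclusive = len(flushes)
exclusive = inclusive - sum(higher.values())
print(inclusive, higher[(5,)], higher[(1, 4)], higher[(2, 3)], exclusive)
# 486 6 60 120 300
```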

Small Straight

There are three kinds of Small Straights in Yamslam: 1-2-3-4, 2-3-4-5, and 3-4-5-6. Counting the Small Straights is tricky because the combinatorics are different in the case where all of the dice are different versus the case where there is a Small Straight and a Pair. Consider the case where all of the dice are different: ABCDE. The distinct cases are 1-2-3-4-5, 1-2-3-4-6, 2-3-4-5-6, and 1-3-4-5-6. Since the dice are all distinct, there are 5! arrangements of each, for a total of 4 * 5! = 480 rolls.

Now let’s consider the case where there are pairs: AABCD. For each of the three Small Straights 1-2-3-4, 2-3-4-5, and 3-4-5-6, there are 4 choices for the pair. But how to arrange the results? There are C(5,2) ways that the pair can be distributed among the five available slots, and since the remaining 3 dice are distinct, we multiply by 3 choices for the first remaining die, 2 choices for the next and 1 for the last. That gives us a total of 3 * 4 * 5!/(2! * 3!) * 3! = 720 additional rolls, for a total of 480 + 720 = 1200 Small Straights, inclusive.

To get the exclusive figure, we need to count how many of these Small Straights include Large Straights. A careful observer will note that all 240 of the Large Straights were already included in the count. So then the exclusive figure for Small Straights is 1200 – 240 = 960.

Small Straight Inclusive Exclusive
Count 1200 960
Probability 15.43% 12.35%
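Because the Small Straight count is split into two cases, it helps to confirm each piece separately. This is a sketch with made-up names; has_small_straight simply tests whether the faces of a roll contain one of the three runs of four.

```python
from itertools import product

rolls = list(product(range(1, 7), repeat=5))
runs_of_four = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}]
large_straights = [(1, 2, 3, 4, 5), (2, 3, 4, 5, 6)]

def has_small_straight(roll):
    faces = set(roll)
    return any(run <= faces for run in runs_of_four)

small = [r for r in rolls if has_small_straight(r)]
all_distinct = sum(1 for r in small if len(set(r)) == 5)  # ABCDE case
with_pair = sum(1 for r in small if len(set(r)) == 4)     # AABCD case
large = sum(1 for r in small if tuple(sorted(r)) in large_straights)
print(all_distinct, with_pair, len(small), len(small) - large)
# 480 720 1200 960
```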

Three of a Kind

To get the figure for Three of a Kind, consider the pattern AAABC exclusive of Yamslams, Four of a Kinds, and Full Houses. There are 6 choices for A with C(5,3) arrangements of the 3 A’s among 5 slots. Since the remaining dice are distinct, there are 5 choices for B and 4 choices for C for a total of 6 * 5 * 4 * 5! / (3! * 2!) = 1200.

But, this is a partially inclusive figure because many of these are also Flushes. The number of Three of a Kind rolls above that are also Flushes is 6 choices for the A with 2 choices for B (once A is fixed) and then C is determined. The arrangements are the same, giving 6 * 2 * 5! / (3! * 2!) = 120 Three of a Kind rolls that are also Flushes. So the exclusive total count for Three of a Kind is 1200 – 120 = 1080.

To get the inclusive figure, we start with the 1200 above and simply add in the Yamslams, Four of a Kind, and Full Houses. So the inclusive figure is 1200 + 6 + 150 + 300 = 1656.

Three of a Kind Inclusive Exclusive
Count 1656 1080
Probability 21.30% 13.89%
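The Three of a Kind figures can be reproduced the same way. In this sketch (helper names are illustrative), aaabc is the strict AAABC pattern, and at_least_three adds in the Yamslams, Four of a Kinds, and Full Houses.

```python
from collections import Counter
from itertools import product

rolls = list(product(range(1, 7), repeat=5))

def shape(roll):
    # Sorted multiplicities of the faces, e.g. (1, 1, 3) for AAABC.
    return tuple(sorted(Counter(roll).values()))

def is_flush(roll):
    return len({d % 2 for d in roll}) == 1

aaabc = [r for r in rolls if shape(r) == (1, 1, 3)]
aaabc_flush = sum(1 for r in aaabc if is_flush(r))
at_least_three = sum(1 for r in rolls if max(Counter(r).values()) >= 3)
print(len(aaabc), aaabc_flush, len(aaabc) - aaabc_flush, at_least_three)
# 1200 120 1080 1656
```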

Two Pair

The basic Two Pair pattern is AABBC, exclusive of Yamslam, Four of a Kind, Full House, and Three of a Kind. There are 6 choices for A, 5 remaining choices for B, and 4 remaining choices for C. But we’ve overcounted because there are two pairs: for example the case with A = 1 and B = 2 is double-counted by the case where A = 2 and B = 1. Since the pairs AA and BB are indistinguishable as pairs, there are C(6,2) = 6! / (4! * 2!) = 6 * 5 / 2 = 15 distinct assignments to A and B. The first pair can be arranged in C(5,2) and the second pair in C(3,2) remaining slots. This gives a total of C(6,2) * C(5,2) * C(3,2) * 4 = 15 * 10 * 3 * 4 = 1800 cases.

But like the Three of a Kind case, some of these overlap with Flushes. For each type of Flush (even or odd) in the AABBC pattern, there are C(3,2) distinct assignments for A and B, C is determined, and the number of arrangements is the same as before, C(5,2) * C(3,2), for a total of 2 * C(3,2) * C(5,2) * C(3,2) = 2 * 3 * 10 * 3 = 180 overlaps with Flushes. So the exclusive count for Two Pair is 1800 – 180 = 1620.

To get the inclusive count, start with the 1800 figure and add in the Full Houses to get 1800 + 300 = 2100.

Two Pair Inclusive Exclusive
Count 2100 1620
Probability 27.01% 20.83%
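Here is the same kind of brute-force check for Two Pair. It is a sketch only; the helper names are made up, and the inclusive figure follows the text above in adding just the Full Houses.

```python
from collections import Counter
from itertools import product

rolls = list(product(range(1, 7), repeat=5))

def shape(roll):
    # Sorted multiplicities of the faces, e.g. (1, 2, 2) for AABBC.
    return tuple(sorted(Counter(roll).values()))

def is_flush(roll):
    return len({d % 2 for d in roll}) == 1

aabbc = [r for r in rolls if shape(r) == (1, 2, 2)]
aabbc_flush = sum(1 for r in aabbc if is_flush(r))
with_full_houses = len(aabbc) + sum(1 for r in rolls if shape(r) == (2, 3))
print(len(aabbc), aabbc_flush, len(aabbc) - aabbc_flush, with_full_houses)
# 1800 180 1620 2100
```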

One Pair and Bupkiss

The Two Pair roll is the lowest scoring roll in Yamslam. One Pair and Bupkiss (aka “nothing”) are both scored as zero. However, it is convenient to count them separately.

The One Pair pattern is AABCD. There are 6 choices for A, and there are C(5,2) arrangements of the pair in five slots. The remaining dice are distinct, and so there are 5 remaining choices for the next die, times 4 remaining choices for the next die, times 3 for the last die: 6 * C(5,2) * 5 * 4 * 3 = 3600.

Does this include any Flushes? Actually, since the four values A, B, C, and D are all distinct, and there are only 3 values of each parity available in a Flush, by the Pigeonhole Principle there are no Flushes with this pattern. But there are overlaps with Small Straights. We already counted these above, and there are 720 Small Straights that are also One Pairs. So the exclusive One Pair count is 3600 – 720 = 2880.

To get the inclusive figure, start with the 3600 figure and add in everything else that contains a pair: Two Pair, Three of a Kind, Full House, Four of a Kind, and Yamslam. As in the inclusive counts above, use the Two Pair and Three of a Kind figures from before the Flushes were subtracted (1800 and 1200), since a Flush that contains a pair still matches One Pair: 3600 + 1800 + 1200 + 300 + 150 + 6 = 7056. As a check, this is every roll except the 6 * 5 * 4 * 3 * 2 = 720 rolls with five different faces: 7776 – 720 = 7056.

Bupkiss is surprisingly easy. To count pure Bupkisses, one only needs to realize that there are only two distinct pure Bupkiss patterns: 1-2-3-5-6 and 1-2-4-5-6. Each of these can be arranged in 5! ways for an inclusive and exclusive total of 2 * 5! = 240.

Conclusion

In the above sections, we have counted all of the occurrences of different Yamslam rolls. These are collected in the table below. To check that all of the cases are accounted for, we sum up the exclusive counts and arrive at the expected 7776.

Roll              Inclusive Count  Inclusive Probability  Exclusive Count  Exclusive Probability
Yamslam                         6                  0.08%                6                  0.08%
Large Straight                240                  3.09%              240                  3.09%
Four of a Kind                156                  2.01%              150                  1.93%
Full House                    300                  3.86%              300                  3.86%
Flush                         486                  6.25%              300                  3.86%
Small Straight               1200                 15.43%              960                 12.35%
Three of a Kind              1656                 21.30%             1080                 13.89%
Two Pair                     2100                 27.01%             1620                 20.83%
One Pair                     7056                 90.74%             2880                 37.04%
Bupkiss                       240                  3.09%              240                  3.09%
Total                                                                7776                100.00%
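The exclusive column can be reproduced in one pass by assigning every roll to the highest-ranked category it matches, in the order listed at the top of this post. This is a sketch rather than the method used above; classify and the two run constants are names invented for the example.

```python
from collections import Counter
from itertools import product

RUNS_OF_FOUR = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}]
RUNS_OF_FIVE = [{1, 2, 3, 4, 5}, {2, 3, 4, 5, 6}]

def classify(roll):
    """Return the highest-ranked category that the roll matches."""
    faces = set(roll)
    shape = tuple(sorted(Counter(roll).values()))  # e.g. (2, 3) for AAABB
    if shape == (5,):
        return "Yamslam"
    if faces in RUNS_OF_FIVE:
        return "Large Straight"
    if shape == (1, 4):
        return "Four of a Kind"
    if shape == (2, 3):
        return "Full House"
    if len({d % 2 for d in roll}) == 1:
        return "Flush"
    if any(run <= faces for run in RUNS_OF_FOUR):
        return "Small Straight"
    if shape == (1, 1, 3):
        return "Three of a Kind"
    if shape == (1, 2, 2):
        return "Two Pair"
    if shape == (1, 1, 1, 2):
        return "One Pair"
    return "Bupkiss"

exclusive = Counter(classify(r) for r in product(range(1, 7), repeat=5))
for name, count in exclusive.items():
    print(f"{name:16}{count:5}{count / 7776:8.2%}")
print(sum(exclusive.values()))  # 7776
```

Because each roll lands in exactly one category, the counts are guaranteed to sum to 7776; the useful part of the check is that each individual count matches the exclusive column.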

That’s it!


Basic Probability Distributions in R

This has been done to death, but I wrote a brief introduction to Basic Probability Distributions in R on RPubs.  One thing that this introduction has going for it that I don’t see in many other places is that it brings together plots of each of the basic distribution functions in R along with some examples of how they are used.  The hope is that the reader will get both a feel for the shapes of the probability distributions and an understanding of the three standard kinds of distribution functions offered by R: the probability density, the cumulative distribution, and the quantile function. The following probability distributions are covered.

  • Normal Distribution: dnorm, pnorm, and qnorm
  • Poisson Distribution: dpois, ppois, and qpois
  • Binomial Distribution: dbinom, pbinom, and qbinom
  • Exponential Distribution: dexp, pexp, and qexp
  • Chi Square Distribution: dchisq, pchisq, and qchisq

In general, R has excellent online documentation, but it can be a little dry.  It can be tough to remember the differences between p-this and d-that and q-who, and I find that it helps me to remember these if I visualize the functions and work a couple of examples.

Anyway, I hope someone finds this useful!
