Saturday, April 16, 2011

Mac vs Windows and Safari vs IE - Are Firefox and Chrome more likely to run on Windows than Mac?

I don't have a lot of web stats, but here is what I do have. The following is a contingency table built from the oversimplified visitor stats offered by blogger.com.


OS \ Browser | Firefox + Chrome | Safari or IE
-----------------------------------------------
Mac OS X     | 1789             | 922
-----------------------------------------------
Windows      | 688              | 591


The Odds Ratio for Windows users to use a non-IE browser over Mac OS X users to use a non-Safari browser is about 1.7. To be sure, let's calculate the 95% confidence interval.


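C.I. = exp( ln(1.7) plus/minus 1.96 x sqrt( 1/1789 + 1/688 + 1/922 + 1/591) )


The square root term is the standard error of the log Odds Ratio, so the interval is built on the log scale and then exponentiated. The 95% CI for the Odds Ratio is approximately between 1.5 and 1.9, which excludes 1.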
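
For readers who want to check the arithmetic, here is a minimal Python sketch. It assumes the usual log-scale (Wald) interval for an odds ratio; the function name odds_ratio_ci and the cell labels a, b, c, d are mine and are not part of the blogger.com stats.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a/b)/(c/d) of a 2x2 table with a log-scale Wald interval.

    a, b are the two cells of the first row; c, d the cells of the second row.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(odds ratio), not of the odds ratio itself.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

# Cell counts in the order they are printed in the table above.
odds_ratio, ci_low, ci_high = odds_ratio_ci(1789, 922, 688, 591)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

The ratio is the odds in the first row divided by the odds in the second row, so swapping the rows gives the reciprocal.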
Conclusion: Mac users are more likely to use Safari than Windows users are to use IE.
But some of you already knew that.


For Stats readers, below are the Wald chi-square statistics from a saturated loglinear model. The highest-order term, the Windows*Thirdparty interaction, has a small p-value.



                 Analysis Of Maximum Likelihood
                      Parameter Estimates

                                    Wald
          Parameter           Chi-Square    Pr > ChiSq


          Intercept               980.95        <.0001
          Windows                   6.72        0.0095
          Thirdparty                0.78        0.3772
          Windows*Thirdparty        5.63        0.0177

Tuesday, April 12, 2011

Did Facebook's Zuckerberg send those emails to Ceglia?

Did Zuckerberg send those emails to Ceglia? 

One way to find out is to study the writing styles. 

With a quick search on the internet I found this email of Mark Zuckerberg's, which I will call "Auth", standing for "Authentic".

I then saved the text of all the emails alleged to have been sent by Zuckerberg to Ceglia, which I call "Acc", standing for "Accused". There are also emails from Ceglia himself to Zuckerberg. I compiled these into one text and called it "Ceglia".

A common method to study the difference in style between, say, Shakespeare and Jack London is to compare the frequencies of the "meaningless" words. By "meaningless" I mean words such as "such", "as", "you", "any", "and", etc. Researchers believe that the frequencies of these "meaningless" words are characteristic of each author.
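
Here is a minimal Python sketch of this kind of comparison. It is an illustration only: the post does not say which words or which distance measure were used, so the short word list, the mean squared difference in relative frequency, and the sample sentences below are all assumptions of mine.

```python
import re
from collections import Counter

# Hypothetical short list of function words; the post does not give its list.
FUNCTION_WORDS = ["such", "as", "you", "any", "and", "the", "of", "to"]

def relative_frequencies(text, words=FUNCTION_WORDS):
    """Share of all tokens in `text` taken up by each function word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {w: counts[w] / total for w in words}

def mean_squared_difference(text_a, text_b, words=FUNCTION_WORDS):
    """Average over words of the squared gap in relative frequency."""
    freq_a = relative_frequencies(text_a, words)
    freq_b = relative_frequencies(text_b, words)
    return sum((freq_a[w] - freq_b[w]) ** 2 for w in words) / len(words)

# Made-up sentences, only to show the call pattern.
sample_a = "You and I can do such things as any of the others."
sample_b = "The site is ready to go and the code is done."
print(mean_squared_difference(sample_a, sample_a))  # identical texts give 0.0
print(mean_squared_difference(sample_a, sample_b))
```

A smaller value means the two texts use these words at more similar rates. With real emails the word list would be longer and each text would need to be long enough for the frequencies to be stable.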

Contrary to my eyeballed conclusion, the "Accused" emails are closer to "Authentic" than to "Ceglia" in writing style. How much closer? By the means in the table below, they are about 17 times closer to Zuckerberg's authentic email than to Ceglia's emails.


Let's take a closer look at the data. The differences are represented by the numbers in the "Mean" column: the larger the number, the bigger the deviation between the two writing styles. The top "acc_zuck" row has the smallest number. In other words, the "Accused" and "Authentic" emails are the closest in writing style.


If you look at the "Mean" in the 2nd and 3rd rows, the two values are very similar and much larger than the one in the first row. In other words, "acc_ceglia", the difference between the "Accused" and "Ceglia" emails, is about as large as "zuck_ceglia", the difference between the "Authentic" and "Ceglia" emails.


All these numbers point in one direction: the alleged emails were from Zuckerberg, as far as the writing style is concerned.

How significant are the results? That depends on whether the sample size (meaning the number of emails) is considered large enough.


                       Summary Statistics                      1
                            Results  12:40 Friday, April 8, 2011

                      The MEANS Procedure

  Variable               Mean         Std Dev         Minimum         Maximum     N
  ---------------------------------------------------------------------------------
  acc_zuck        0.000066864     0.000139743    7.340421E-10     0.000823812    57
  acc_ceglia        0.0011288       0.0077861    4.709518E-10       0.0588287    57
  zuck_ceglia       0.0011347       0.0079904    2.0894707E-6       0.0603857    57
  ---------------------------------------------------------------------------------