library(nlme)
library("dygraphs")
R has statistics and data built into its DNA, so to speak.
In this sense, R is nearly unique among programming languages: it was built for statistics and designed for data.
This has advantages when you’re learning data science, because almost any statistical test or technique can be found somewhere within base R or one of its packages.
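As a small illustration of that claim, the sketch below runs two common analyses using only what ships with a default R installation. The data sets (`sleep`, `cars`) are R's built-in examples and are not part of the original material; they are chosen here only for convenience.

```r
# Built-in data set: extra hours of sleep under two drugs (10 subjects each).
data(sleep)

# Welch two-sample t-test from the 'stats' package, which is attached by default.
tt <- t.test(extra ~ group, data = sleep)

# Simple linear regression on another built-in data set (stopping distance vs. speed).
fit <- lm(dist ~ speed, data = cars)

print(tt$p.value)   # p-value of the t-test
print(coef(fit))    # intercept and slope of the regression
```

Neither call needs an extra package; more specialised techniques are usually one `install.packages()` call away.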
This is important. If you’re a beginner, and you’re just getting started in data science, you’ll have a lot to learn. To truly master data science, you’ll need to learn several sub-areas like probability, statistics, data visualization, data manipulation, and machine learning. All of these skill areas have theoretical foundations (which you’ll need to learn) but also practical techniques that you’ll need to execute by writing code.
R is in heavy use at several of the best companies that hire data scientists.
As Revolution Analytics recently noted, “R is also the tool of choice for data scientists at Microsoft, who apply machine learning to data from Bing, Azure, Office, and the Sales, Marketing and Finance departments.”
Beyond tech giants like Google, Facebook, and Microsoft, R is used at a wide range of companies, including Bank of America, Ford, TechCrunch, Uber, and Trulia.
“The term reproducible research refers to the idea that the ultimate product of academic research is the paper along with the full computational environment used to produce the results in the paper such as the code, data, etc. that can be used to reproduce the results and create new work based on the research”
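One way to read this in practice is to keep the code, the random seed, and a record of the software environment together with the result. The following is a minimal sketch under that reading; the seed value and the file names are hypothetical and chosen only for illustration.

```r
# Fix the random seed so the "analysis" gives the same numbers on every run.
set.seed(2024)                      # hypothetical seed; any fixed value works
x <- rnorm(100)
result <- mean(x)

# Store the result next to a description of the environment that produced it.
env_file    <- file.path(tempdir(), "session-info.txt")   # hypothetical file name
result_file <- file.path(tempdir(), "result.rds")         # hypothetical file name
writeLines(capture.output(sessionInfo()), env_file)
saveRDS(result, result_file)
```

Anyone rerunning the script with the same seed can compare their value against the stored one and check their R and package versions against the saved session information.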
Enough “hot air”
dygraph(Global.ts) %>% dyRangeSelector()
plot(Global.annual);grid()
Last35 <- window(Global.ts, start=c(1970, 1), end=c(2005, 12))
Last35Yrs <- time(Last35)
fitAD <- lm(Last35 ~ Last35Yrs)
printShortsummary(summary(fitAD),TableOnly=TRUE)
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -34.920409   1.164899  -29.98   <2e-16 ***
## Last35Yrs     0.017654   0.000586   30.13   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
par(mar=c(8,3,0,0));plot(Last35); abline(fitAD,col=2)
x.gls <- gls(Last35 ~ Last35Yrs, correlation = corAR1(0.8))
confint(x.gls)
##                    2.5 %       97.5 %
## (Intercept) -39.80571504 -28.49659109
## Last35Yrs     0.01442275   0.02011148
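To see why the trend is refitted with `gls()` and an AR(1) correlation structure, it can help to repeat the comparison on data where the error process is known. The sketch below is an illustration on synthetic data, not the temperature series used above: the trend coefficients, noise level, AR coefficient, and seed are all assumptions made for the example.

```r
library(nlme)

set.seed(1)                               # arbitrary seed, for repeatability
n   <- 432                                # 36 years of monthly values
yrs <- 1970 + (seq_len(n) - 1) / 12       # decimal years, similar to time() on a monthly ts

# Simulated AR(1) noise (assumed coefficient 0.7, innovation sd 0.1).
err <- as.numeric(arima.sim(list(ar = 0.7), n = n, sd = 0.1))

# Synthetic linear trend plus autocorrelated noise; NOT the real temperature data.
sim <- data.frame(y = -35 + 0.0177 * yrs + err, yrs = yrs)

fit_ols <- lm(y ~ yrs, data = sim)                                  # assumes independent errors
fit_gls <- gls(y ~ yrs, data = sim, correlation = corAR1(0.7))      # models AR(1) errors

# Lag-1 partial autocorrelation of the OLS residuals.
lag1_pacf <- pacf(residuals(fit_ols), lag.max = 10, plot = FALSE)$acf[1]

# 95% confidence intervals for the slope under each model.
ci_ols <- confint(fit_ols)["yrs", ]
ci_gls <- confint(fit_gls)["yrs", ]
print(lag1_pacf)
print(rbind(OLS = ci_ols, GLS = ci_gls))
```

When the residuals are positively autocorrelated, ordinary least squares tends to understate the uncertainty of the slope, so the GLS interval is expected to be the wider of the two.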
par(mar=c(7,3,1,1));
pacf(residuals(fitAD), lag.max = 10)
#demo(WorldBank);save(M,file="worldBank.rda")
#load("worldBank.rda")
#plot(M)
#print(M, file="figures/WorldBank.html")
This IEEE ranking system uses a set of 12 metrics, including Google search volume, Google Trends, Twitter hits, GitHub repositories, Hacker News posts, and more.
Keep in mind that the TIOBE index is structured to be “an indicator of the popularity of programming languages. The index is updated once a month. The ratings are based on the number of skilled engineers world-wide, courses and third party vendors. Popular search engines such as Google, Bing, Yahoo!, Wikipedia, Amazon, YouTube and Baidu are used to calculate the ratings.”
Another frequently cited language ranking system is the RedMonk Programming Language Rankings, which are derived from popularity on GitHub (lines of code) and popularity on Stack Overflow (number of tags).