Stats | Technology Tales

Some books and other forms of documentation on R

11th September 2021

The thrust of an exhortation from a computing handbook publisher comes to mind here: don’t just look things up on Google; read a book so that you really understand what you are doing. Words like those were used to sell an eBook on GitHub, but the same sentiment applies to R or any other computing language. Using a search engine will get you going or add to existing knowledge, but only a book or a training course will help to embed real competence.

In the case of R, there is a myriad of blogs out there that can be consulted, as well as function and package documentation on RDocumentation or rdrr.io. For the former, R-bloggers or R Weekly make good places to start, while the likes of Stats and R, Statistics Globe, STHDA, PSI’s VIS-SIG and anything from Posit (including their main blog as well as their AI one) can be worth consulting. Additionally, there are RStudio Education and the NHS-R Community, which have a GitHub repository together with a YouTube channel. Many packages have dedicated websites as well, so there is no lack of documentation; here is a selection:

Tidyverse

forcats

tidyr

Distill for R Markdown

To come to the real subject of this post, R is unusual in that books that you can buy also have companion websites that contain the same content with the same structure. Whatever funds this approach (and some appear to be supported by RStudio itself, by the looks of things), there certainly are a lot of books freely available online in HTML, as you will see from the list below, while a few do not have a print counterpart as far as I know:

Big Book of R

R Programming for Data Science

Hands-On Programming with R

Advanced R

Cookbook for R

R Graphics Cookbook

R Markdown: The Definitive Guide

R Markdown Cookbook

RMarkdown for Scientists

bookdown: Authoring Books and Technical Documents with R Markdown

blogdown: Creating Websites with R Markdown

pagedown: Create Paged HTML Documents for Printing from R Markdown

Dynamic Documents with R and knitr

Mastering Shiny

Engineering Production-Grade Shiny Apps

Outstanding User Interfaces with Shiny

R Packages

Mastering Spark with R

Happy Git and GitHub for the useR

JavaScript for R

HTTP Testing in R

The Shiny AWS Book

Many of the above have counterparts published by O’Reilly or Chapman & Hall, to name the two publishers that I have found so far. Aside from sharing these with you, there is also the personal motivation of having the collection of links somewhere so I can close tabs in my Firefox session. There are other web articles open in other tabs that I need to retain and share but these will need to do for now and I hope that you find them as useful as I do.

Creating a Data Set Containing Confidence Intervals Using PROC UNIVARIATE

5th September 2010

While you could generate data sets containing means and confidence intervals using PROC SUMMARY or PROC MEANS, curiosity and the need to verify a program using a different technique were what drove me to consider using PROC UNIVARIATE for the task. For the record, the PROC SUMMARY code is below, and the only difference between it and MEANS is that it doesn’t produce printed output by default, something that’s not needed in this case anyway. Quite why there are two SAS procedures doing exactly the same thing is beyond me, though I do wonder if the NOPRINT option was a later addition than these two procedures. The LCLM and UCLM keywords are what trigger the calculation of confidence limits, and the ALPHA option sets the confidence level used: 0.05 specifies a 95% interval, 0.1 a 90% one and so on.

proc summary data=sashelp.class mean lclm uclm alpha=0.05;
    var age;
    output out=sasuser.lims mean=mean lclm=lclm uclm=uclm;
run;

Given that I have had PROC UNIVARIATE producing statistics that MEANS/SUMMARY didn’t in previous versions of SAS (I believe that it was standard deviation that was absent from MEANS/SUMMARY), I might have expected the calculation and export of confidence limits to a data set to be straightforward. Sadly, it’s not a case of simply adding LCLM and UCLM keywords to the OUTPUT statement for the procedure; ODS OUTPUT is needed to create the data set instead. An ODS SELECT statement is needed to pick out the BasicIntervals output object (UNIVARIATE creates quite a few, it seems), which is created by specifying the CIBASIC and ALPHA options on the PROC UNIVARIATE statement; ALPHA performs the same role as it does for PROC MEANS/SUMMARY. The reason for the ODS LISTING and ODS RTF statements below is to stop output being sent to the output window in a standard SAS session. For some reason, it appears that output must be sent to one of the LISTING, HTML or RTF destinations or there will be no data in the data set; I met the same behaviour when using ODS PS, an ODS PRINTER destination. The data set will contain statistics for the mean, standard deviation and variance, which is why there is a WHERE clause on the ODS OUTPUT statement.

ods listing close;
ods rtf body="c:\temp\uni_eg.doc";
ods select BasicIntervals;
ods output BasicIntervals=sasuser.stats(where=(lowcase(parameter)="mean"));

proc univariate cibasic alpha=0.05 data=sashelp.class;
    var age;
run;

ods output close;
ods rtf close;
ods listing;
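
For reference, the limits that CIBASIC reports for the mean can be written out as follows. This is only a sketch, and it assumes the procedure's default two-sided interval and normally distributed data. The symbols are notation introduced here and do not come from the SAS code: the sample mean, the sample standard deviation s, the number of non-missing values n, and the Student's t quantile.

```latex
\[
  \bar{x} \;\pm\; t_{1-\alpha/2,\; n-1} \, \frac{s}{\sqrt{n}}
\]
```

With alpha=0.05, the t quantile is taken at 0.975 with n - 1 degrees of freedom, which gives the 95% interval.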

Suffering from neglect?

6th March 2009

There have been several recorded instances of Google acquiring something and then not developing it to its full potential. FeedBurner is yet another acquisition where this sort of thing has been suspected. Changeovers by monolithic edict and a lack of responsiveness from support fora are the sorts of things that breed resentment in some who share opinions on the web. Within the last month, I found that my FeedBurner feeds were not being updated as they should have been, and it would not accept a new blog feed when I tried adding it. The result of both of these was that I deactivated the FeedBurner FeedSmith plugin to take FeedBurner out of the way for my feed subscribers; those regulars on my hillwalking blog were greeted by a splurge of activity following something of a hiatus. There are alternatives such as RapidFeed and Pheedo, but I will stay away from the likes of these for a little while and take advantage of the newly added FeedStats plugin to keep tabs on how many come to see the feeds. The downside to this is that IE6 users will see the pure XML rather than a version with friendlier formatting.

An alternative use for Woopra

4th August 2008

Google Analytics is all very fine with its once-a-day reporting cycle, but the availability of real-time data does have its advantages. WordPress.com’s Stats plugin goes some way to serving the need, but Woopra trumps it in every way apart from a possible overkill in the amount of information that it makes available. The software may be in the beta phase, and it does crash from time to time, but its usefulness remains more than apparent.

One of its uses is seeing if there are people visiting your website at a time when you might be thinking of making a change like upgrading WordPress. Timing such activities to avoid a clash is a win-win situation: a better experience for your visitors and more reliable updates for you. After all, it’s very easy to make a poor impression, and an unreliable site will do that faster than anything else, so it’s paramount that your visitors are not on the receiving end of updates, even if they are all for the better.
