Technology Tales

Adventures & experiences in contemporary technology

Some books and other forms of documentation on R

11th September 2021

The thrust of an exhortation from a computing handbook publisher comes to mind here: don’t just look things up on Google, read a book so you really understand what you are doing. Something like those words was used to sell an eBook on GitHub but the same sentiment applies to R or any other computing language. Using a search engine will get you going or add to existing knowledge but only a book or a training course will help to embed real competence.

In the case of R, there is a myriad of blogs out there that can be consulted, as well as function and package documentation on RDocumentation or rdrr.io. For the former, R-bloggers or R Weekly can make good places to start, while ones like Stats and R, Statistics Globe, STHDA, PSI’s VIS-SIG and anything from Posit (including their main blog as well as their AI one) can be worth consulting. There is also RStudio Education and the NHS-R Community, which has a GitHub repository together with a YouTube channel. Many packages have dedicated websites as well, so there is no lack of documentation. Here is a selection:

Tidyverse

forcats

tidyr

Distill for R Markdown

Databases using R

RMariaDB

R Markdown

xaringanExtra

Shiny

formattable

reactable

DT

rhandsontable

thematic

bslib

plumber

ggforce

officeverse

officer

pharmaRTF

COVID-19 Data Hub

To come to the real subject of this post, R is unusual in that books that you can buy also have companion websites that contain the same content with the same structure. Whatever funds this approach (and some appear to be supported by RStudio itself by the looks of things), there certainly are a lot of books available freely online in HTML, as you will see from the list below, while a few do not have a print counterpart as far as I know:

Big Book of R

R Programming for Data Science

Hands-On Programming with R

Advanced R

Cookbook for R

R Graphics Cookbook

R Markdown: The Definitive Guide

R Markdown Cookbook

RMarkdown for Scientists

bookdown: Authoring Books and Technical Documents with R Markdown

blogdown: Creating Websites with R Markdown

pagedown: Create Paged HTML Documents for Printing from R Markdown

Dynamic Documents with R and knitr

Mastering Shiny

Engineering Production-Grade Shiny Apps

Outstanding User Interfaces with Shiny

R Packages

Mastering Spark with R

Happy Git and GitHub for the useR

JavaScript for R

HTTP Testing in R

The Shiny AWS Book

Many of the above have counterparts published by O’Reilly or Chapman & Hall, to name the two publishers that I have found so far. Aside from sharing these with you, there is also the personal motivation of having the collection of links somewhere so I can close tabs in my Firefox session. There are other web articles open in other tabs that I need to retain and share but these will need to do for now and I hope that you find them as useful as I do.

Online learning

18th April 2021

Recently, I shared my thoughts on learning new computing languages by oneself using books, online research and personal practice. As successful as that can be, there remains a place for getting some actual instruction as well. Maybe that is why so many turn to YouTube, where there is a multitude of video channels offering such possibilities without cost. What I have also discovered is that this is complemented by a host of other providers whose services attract a fee, and there will be a few of those mentioned later in this post. Paying for online courses does mean that you can get the benefit of curation and an added assurance of quality in what appears to be a growing market.

The variation in quality can dog the YouTube approach, and it also can be tricky to find something good, even if the platform does suggest new videos based on what you have been watching. Much of what is found there does take the form of webinars from the likes of the Why R? Foundation, Posit or the NHS-R Community. These can be useful, and there are shorter videos from such providers as the Association for Computing Machinery or SAS Users. These do help more if you already have some knowledge about the topic area being discussed, so they may not make the best starting points for someone who is starting from scratch.

Of course, working your way through a good book will help, and it is something that I have been known to do, but supplementing this with one or more video courses really adds to the experience and I have done a few of these on LinkedIn. That part of the professional platform came from the acquisition of Lynda.com and the topic areas range from soft skills like time management through to computing skills courses with R, SAS and Python seeing coverage among the data science portfolio. Even O’Reilly has ventured into the area in an expansion from the book publishing activities for which so many of us know the organisation.

The available online instructor community does not stop at the above since there are others like Degreed, Baeldung, Udacity, Programiz, Udemy, Business Science and Datanovia. Some of these tend towards online education provision that feels more like an online university course. Those are numerous as well, as you will find through Data Science Central or KDnuggets. Both of these earn income from advertising to pay for featured blog posts and newsletters, while the former also organises regular webinars and was my first port of call when I became curious about the world of data science during the autumn of 2017.

My point of approach into the world of online training has been as a freelance information professional needing to keep up to date with a rapidly changing field. The mix of content that is both free of charge and that which attracts a fee is one that can work. Both kinds do complement each other while possessing their unique advantages and disadvantages. The need to continually expand skills and knowledge never goes away, so it is well worth spending some time working out what you are after, since you need to be sure that any training always adds to your own knowledge and skill level.

Opening up Kindle for PC in a maximised window on Windows 10

18th March 2017

It has been a while since I scribbled anything on here but I now have a few things to relate, starting with this one. Amazon now promotes a different app for use when reading its eBooks on PCs and, with a certain reluctance, I have taken to using this because its page synchronisation is not as good as it should be.

Another irritation is that it does not open in a maximised window and it scarcely remembers your size settings from session to session. Finding solutions to this sizing issue is no easy task so I happened on one of my own that I previously used with Windows (or File) Explorer folder shortcuts.

The first step is to find the actual location of the Start Menu shortcut. Trying C:\Users\[User Name]\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Amazon\Amazon Kindle should do that.

Next, right click on the Kindle icon and choose Properties from the context menu that appears. In the dialogue box that this causes to appear on the screen, look for the “Run:” setting. By default, this appears as “Normal Window” but you can change this to “Maximised”, which is what I did before clicking on Apply and then on OK to dismiss the dialogue box.

If you have pinned the shortcut to the taskbar or elsewhere, you may need to unpin it and pin it again to carry over the change. After that, I found that the Kindle app opened up in a maximised window as I wanted.

With that done, I could get along better with the app and it does put a search box in a more obvious place than it was in the old one. You also can set up Collections to organise your books, so there is something new for a user. Other than that, it largely works as before, though you may need to hit the F5 key every now and again to synchronise reading progress across multiple devices.

A collection of legal BitTorrent sites

19th October 2014

It was an article in a magazine that revealed these legal BitTorrent download sites to me, so I thought that I’d keep them on file for future reference while also sharing them with others who might need them. As far as I am aware, they are all legal in that no copyrighted material is on there. If that changes, I am happy to know and make amendments as needed.

My own interest in torrents arises from their being a convenient way to download installation disk images for Linux distributions, and at least one of the entries is devoted to just that. However, the distribution method also lends itself to movies along with music and books, so that is reflected below too. With regard to downloading actual multimedia content, there is so much illegal downloading that a list like this is needed. That has blackened the reputation of BitTorrent too, which is unfortunate because it only ever was conceived as a means for distributing large files in a peer-to-peer manner without the use of a single server. Of course, any use can be found for a technology, and it never has to be legal or morally acceptable either.

Archive.org

Bt.etree.org

FrostClick

Linux Tracker

Public Domain Torrents

Deauthorising Adobe Digital Editions software

12th March 2011

My being partial to the occasional eBook has meant my encountering Adobe’s Digital Editions. While I wonder why the functionality cannot be included in the already quite bulky Adobe Reader, it does exist and some publishers use it to ensure that their books are not as easily pirated. In my case, it is a certain publisher of walking guidebooks that uses it and I must admit to being a sometime fan of their wares. At first, I was left wondering how they thought that Digital Editions was the delivery means that would ensure that they do not lose out from sharing of copies of eBooks but a recent episode has me seeing what they see.

One of the nice things that it allows is the sharing of eBooks between different computers using your Adobe account. Due to my own disorganisation, I admit to having more than one, though I am not entirely sure why I ended up doing that. The result was that I ended up entering the wrong credentials into the Digital Editions instance on my Toshiba laptop and I needed to get rid of them in order to enter the correct ones. It is when you try doing things like this that you come to realise how basic and slimmed down this software is. After a Google search, I encountered the very keyboard shortcut about which even the help didn’t seem to want to tell me: Control+Shift+D. That did the required deauthorisation for me to be able to read eBooks bought and downloaded onto another computer. Maybe Digital Editions does its job of lessening the chances of piracy after all. Of course, I cannot see the system being perfect or unbreakable but a lot of our security is there to deter the opportunists rather than the more determined.

On Making PROC REPORT Work Harder

1st September 2010

In the early years of my SAS programming career, there seemed to be just the one procedure to use if you wanted to create a summary table. That was TABULATE and it was great for generating columns according to the value of a variable such as the treatment received by a subject in a clinical study. To a point, it could generate statistics for you too and I often used it to sum frequency and percentage variables. Since then, it seems to have been enhanced a little and it surprised me with the statistics it could produce when I had a recent play. Here’s the code:

proc tabulate data=sashelp.class;
class sex;
var age;
table age*(n median*f=8. mean*f=8.1 std*f=8.1 min*f=8. max*f=8. lclm*f=8.1 uclm*f=8.1),sex / misstext="0";
run;

When you compare that with the idea of creating one variable per column and then defining them in PROC REPORT as many do, it has to look more elegant and the results aren’t bad either though they can be tweaked further from the quick example that I generated. That last comment brings me to the point that PROC REPORT seems to have taken over from TABULATE wherever I care to look these days and I do ask myself if it is the right tool for that for which it is being used or if it is being used in the best way.

Using Data Step to create one variable per column in a PROC REPORT output doesn’t strike me as the best way to write reusable code but there are ways to make REPORT do more for you. For example, by defining GROUP, ACROSS and ANALYSIS columns in an output, you can persuade the procedure to do the summarising for you and there’s some example code below with the comma nesting height under sex in the resulting table. Sums are created by default if you do this and forgoing an analysis column definition means that you get a frequency table, not at all a useless thing in many cases.

/* AGE forms the rows, SEX the columns and HEIGHT is averaged in each cell */
proc report data=sashelp.class nowd missing;
columns age sex,height;
define age / group "Age";
define sex / across "Sex";
define height / analysis mean f=8.1 "Mean Height";
run;
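
To make the GROUP, ACROSS and ANALYSIS roles more concrete, here is a rough Python sketch of the same kind of summary: rows are keyed by age, columns by sex and each cell holds the mean height. The handful of records and the `summarise` function name are made up for illustration and are not taken from sashelp.class; this is only an analogy for what the procedure produces, not a description of its internals.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records for illustration only (not the real sashelp.class data)
records = [
    {"age": 12, "sex": "F", "height": 56.0},
    {"age": 12, "sex": "M", "height": 58.0},
    {"age": 12, "sex": "M", "height": 60.0},
    {"age": 13, "sex": "F", "height": 62.0},
    {"age": 13, "sex": "F", "height": 64.0},
]

def summarise(rows, group, across, analysis):
    """Return {group value: {across value: mean of the analysis variable}}."""
    cells = defaultdict(lambda: defaultdict(list))
    for row in rows:
        cells[row[group]][row[across]].append(row[analysis])
    return {
        g: {a: mean(values) for a, values in sorted(by_across.items())}
        for g, by_across in sorted(cells.items())
    }

table = summarise(records, group="age", across="sex", analysis="height")
for age, by_sex in table.items():
    print(age, by_sex)
```

A combination with no records, such as 13-year-old males here, simply has no cell in this sketch.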

For those times when you need to create more heavily formatted statistics (summarising range as min-max rather than showing min and max separately, for example), you might feel that the GROUP/ACROSS set-up’s non-display of character values puts a stop to using that approach. However, I found that making every value combination unique and attaching a cell ID helps to work around the problem. Then, you can create a format control data set from the data like in the code below and create a format from that which you can apply to the cell IDs to display things as you need them. This method does make things more portable from situation to situation than adding or removing columns depending on the values of a classification variable.

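/* Build a format control data set with one row per unique cell ID */
proc sql noprint;
create table cntlin as
select distinct "fmtname" as fmtname, cellid as start, cellid as end, decode as label
from report;
quit;

/* Create the format from the control data set */
proc format lib=work cntlin=cntlin;
run;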
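
The same idea can be sketched outside SAS. In the rough Python analogy below, the cell IDs and decoded labels are hypothetical and a plain dictionary stands in for the format built from the control data set; it is not how PROC FORMAT works internally.

```python
# Hypothetical report rows: each unique cell carries an ID and the text to show
report = [
    {"cellid": 1, "decode": "11.0-14.0"},
    {"cellid": 2, "decode": "12.0-16.0"},
    {"cellid": 2, "decode": "12.0-16.0"},  # repeated row that SELECT DISTINCT would drop
]

# Build the lookup once (the counterpart of the control data set and the format)
cell_format = {row["cellid"]: row["decode"] for row in report}

def display(cellid):
    """Return the label for a cell ID, falling back to the ID itself."""
    return cell_format.get(cellid, str(cellid))

print([display(c) for c in (1, 2, 3)])
```

The fallback mirrors the way that a value without a matching format entry is shown as it is.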

Exploring AJAX

7th June 2007

My online photo gallery started out simply as a set of interlinked HTML pages. Over time, I discovered frames (yes, them!) and started to make use of JavaScript to make the slideshows slicker. In those days, I was working off free webspace provided by my ISP and client-side scripting was the only tool that I had for enhancing functionality. Having tired of the vagaries of client-side scripting (the browser wars were in full swing and incompatibilities reigned supreme), I went with paid hosting in order to get access to tools like Perl and PHP for server-side processing; their flexibility compared to JavaScript was a breath of fresh air to me and I am still a fan of the server-side approach.

The journey that I have just described is one that I now know was followed by a lot of website builders around the same time. Nevertheless, I have still held onto JavaScript for some things, particularly for updating the DOM as part of making the pages more responsive to user interaction. In the last few years, a hybrid approach has been gaining currency: AJAX. This offers the ability to modify parts of a page without needing to reload the whole thing and that has generated a considerable amount of interest among web application developers.

The world of AJAX is evidently a complex one, though the underlying principle can be explained in simple terms. The essential idea is that you use JavaScript to call a server-side script (PHP is as good an example as any) that returns either text or XML that can be used to update part of a web page in situ without the need to reload it as per the traditional way of working. It has opened up so many possibilities from the interface design point of view that AJAX became a hot topic that still receives much attention today. One bugbear is efficiency because I have seen an AJAX application lock up a PC with a little help from IE6. There will always remain times where server-side processing is the best route and that needs to be balanced against the client-side and vice versa.
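
As a rough illustration of that exchange, the sketch below builds a small XML response and then extracts the text that would be placed into the page. Python stands in for PHP here purely for illustration, and the element names (`response`, `message`) and function names are hypothetical. In a real AJAX set-up, the second step would be done by JavaScript in the browser.

```python
import xml.etree.ElementTree as ET

def build_response(message):
    """Server side: wrap a text message in a small XML document."""
    root = ET.Element("response")
    ET.SubElement(root, "message").text = message
    return ET.tostring(root, encoding="unicode")

def extract_message(xml_text):
    """Client-side stand-in: pull out the text used to update part of a page."""
    root = ET.fromstring(xml_text)
    return root.findtext("message")

payload = build_response("3 new photos")
print(payload)
print(extract_message(payload))
```

Only the small payload travels between server and browser, which is what allows one part of a page to change without the rest being reloaded.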

Like its forebear DHTML, AJAX is really a development approach using a number of different technologies in combination. The DHTML elements such as (X)HTML, CSS, DOM and JavaScript are very much part of the AJAX world but server-side elements such as HTTP, PHP, MySQL and XML are also very much part of the fabric of the landscape. In fact, while AJAX can use plain text as the transfer format, XML is the one implied by the AJAX acronym and XSLT is used to transform XML into HTML. However, AJAX is not limited to the aforementioned technologies; for instance, I cannot see why Perl cannot play a role in place of PHP and ASP can be used for the same things.

Even in these standards-compliant days, browser support for AJAX remains diverse, to say the least, and it is akin to having MSIE in one corner and the rest in the other. Mind you, Microsoft did introduce the tools in the first place but they used ActiveX and Mozilla created a new object type rather than continue this method of operation. Given that ActiveX is a Windows-only technology, I can see why Mozilla did what they did and it is a sensible decision. In fact, IE7 appears to have picked up the Mozilla way of doing things.

Even with the apparent convergence, there will continue to be a need for the AJAX JavaScript libraries that are currently out there. Incidentally, Adobe has included one called Spry with Dreamweaver CS3. Nevertheless, I still like to find out how things work at the basic level and feel somewhat obstructed when I cannot do this. I remember perusing Wrox’s Professional AJAX and found the constant references to the associated function library rather grating; the writing style didn’t help either.

My taking a more granular approach has got me reading SAMS Teach Yourself AJAX in 10 Minutes as a means for getting my foot in the door. As with their Teach Yourself … in 24 Hours series, the title is a little misleading since there are 22 lessons of 10 minutes in duration (the 24 Hours moniker refers to there being 24 lessons, each of one hour in length). Anything composed of 10-minute lessons, even 22 of them, is never going to be comprehensive but, as a means for getting started, I have to say that the approach seems effective on the basis of this volume. It has certainly whetted my appetite for giving AJAX a go and it’ll be interesting to see how things progress from here.

SAS books now on Safari

31st May 2007

Being a Safari subscriber, I found a pleasant surprise awaiting me in this month’s email newsletter: eBooks from SAS Books are now available on Safari. Having a quick look, I found a small but useful selection. Topics like the SQL procedure, the Macro language and Enterprise Guide caught my eye but there’s more than this on offer. It’ll be interesting to see where this leads…

Learning about Oracle

20th April 2007

My work in the last week has put me on something of a learning curve about Oracle. This is down to my needing to add file metadata to a database as part of an application that I am developing. The application is written in SAS but I am using SAS/ACCESS for Oracle to update the database using SQL pass-through statements written in Oracle SQL. I am used to SAS SQL and there is commonality between it and Oracle’s implementation, which is a big help. Nevertheless, there of course are things specific to the Oracle world about which I have needed to learn. My experiences have introduced me to concepts like triggers, sequences, constraints, primary keys, foreign keys and the like. In addition, I have also seen the results of database normalisation at first hand.

Using Oracle’s SQL Developer has been a great help in my endeavours thanks to its online help and the way that you can view database objects in an easy to use manner. It also runs SQL scripts, giving you a feel for how Oracle works, and anyone can download it for free upon registration on the Oracle website. Also useful is the Express edition of the Oracle 10g database that I now have at home for personal learning purposes. That is another free download from Oracle’s website.

My Safari bookshelf has been another invaluable resource, providing access to O’Reilly’s Oracle books. Of these, Mastering Oracle SQL has proved particularly useful and I made a journey to Manchester after work this evening (Waterstone’s on Deansgate is open until 21:00 on weekdays) to see if I could acquire a copy. That quest was to prove fruitless but I now have got the doorstop that is Oracle Database 10g: The Complete Reference from The Oracle Press, an imprint of Osborne and McGraw Hill. I needed a broader grounding in all things Oracle so this should help and it also covers SQL but the aforementioned O’Reilly volume could return to the wish list if that provision is insufficient.

The joys of eBooks

3rd April 2007

One of the nice things about eBooks is the saving that you can make on buying one instead of the dead tree edition. And if you get one from Apress, it is the full article that you get and they keep it available so that you can download another version if you need it. You can also print the thing off if you want to but a laser printer producing double-sided prints is an asset if you don’t want your space invaded by a horde of lever arch binders. Having a copious supply of inexpensive toner helps too, as does cheap paper. Otherwise, you could spend your savings on printing the thing yourself.

The ever pervasive Safari does things a little differently from the likes of Apress. Mind you, the emphasis there is on the library aspect of the operation and not eBook selling. The result is that you can only ever download chapters, so no index or overall table of contents. You still can buy all of the chapters for a particular book, though some publishers don’t seem to allow this for some reason, but finding anything in there after you have had a read becomes an issue, especially when it’s the hard copy that you are using. Take yesterday, for instance, when I was trying to relocate the formatting parameters for the UNIX date command. I eventually found them in the chapters of UNIX in a Nutshell that I have downloaded and printed off but I spent rather longer looking in Learning the Korn Shell than I should have done. I know that you can search in the PDFs themselves but that is more laborious when there is a number of files to search rather than just the one. I suppose that the likes of O’Reilly prefer you to buy paper copies of their books for more extensive use, and they have a point, but having the electronic version all in one file does make life so much easier.
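
For reference, the formatting parameters in question are the `%` conversion specifiers accepted by `date +FORMAT`. Python’s `strftime` accepts the same common specifiers, so a quick sketch can show a few of them; the timestamp is a made-up value chosen only for demonstration.

```python
from datetime import datetime

# Hypothetical timestamp used only to demonstrate the specifiers
moment = datetime(2007, 4, 2, 14, 5, 9)

# %Y year, %m month, %d day, %H hour (24-hour), %M minute, %S second
stamp = moment.strftime("%Y-%m-%d %H:%M:%S")

# %j day of the year, zero-padded to three digits
day_of_year = moment.strftime("%j")

print(stamp)        # 2007-04-02 14:05:09
print(day_of_year)  # 092
```

At a shell prompt, the counterpart of the first format would be `date +"%Y-%m-%d %H:%M:%S"`, which reports the current time rather than a fixed one.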

  • All the views that you find expressed on here in postings and articles are mine alone and not those of any organisation with which I have any association, through work or otherwise. As regards editorial policy, whatever appears here is entirely of my own choice and not that of any other person or organisation.

  • Please note that everything you find here is copyrighted material. The content may be available to read without charge and without advertising but it is not to be reproduced without attribution. As it happens, a number of the images are sourced from stock libraries like iStockPhoto so they certainly are not for abstraction.

  • With regard to any comments left on the site, I expect them to be civil in tone of voice and reserve the right to reject any that are either inappropriate or irrelevant. Comment review is subject to automated processing as well as manual inspection but whatever is said is the sole responsibility of the individual contributor.