Other sides to R beyond data analysis: Excel automation, Docker environments, shell commands and `Rscript`
The idea of using R purely as an interactive analysis tool understates what the language can do. Once a workflow is expressed in code, R can collect and transform data, write Excel workbooks, run inside a reproducible containerised environment, call shell scripts and execute unattended from the command line. Each of those capabilities is useful on its own, but together they describe something more ambitious: R as a general-purpose automation layer sitting at the centre of a reporting pipeline. This article draws on four sources to illustrate that picture, beginning with a Business Science tutorial on generating Excel workbooks with {openxlsx} and {tidyquant}, then expanding outwards to cover running RStudio inside Docker, calling shell commands from an interactive R session and executing R scripts from the command line with Rscript.
The Business Science Tutorial
A useful foundation for this workflow comes from a tutorial published on R-bloggers on the 6th of October 2020, contributed by Business Science as part of its R-Tips Weekly series. That tutorial demonstrates how to use the {openxlsx} and {tidyquant} packages together to automate the creation of an Excel workbook. The core idea is straightforward: gather financial data in R, transform it into a summary table, create a chart and then write those outputs into an Excel file programmatically.
The workbook described in the tutorial contains two main outputs. One is a pivot-style summary table showing stock returns broken down by year and symbol, and the other is a stock chart plotted over time. Rather than treating Excel as the place where all data manipulation begins and ends, the workflow shifts gathering and processing into R first, with Excel becoming the delivery format rather than the main engine of the analysis.
Collecting Data with {tidyquant}
The data collection stage in the tutorial uses {tidyquant}, a package designed for importing and working with financial data within the tidyverse. It wraps functionality from packages such as {quantmod}, {xts}, {zoo} and {TTR}, returning results as tidy tibbles that integrate cleanly with standard tidyverse tools. In the tutorial, financial market data are imported and then used to derive annual returns, which are reshaped into a pivot-table-like structure and plotted through time as a stock chart.
This keeps the entire preparation stage inside R before any results are written elsewhere. The tq_get() function serves as the main entry point for retrieving stock price data, accepting ticker symbols and returning results in a consistent tabular format. Keeping data collection in code, rather than relying on manual downloads, also makes the workflow straightforward to update or extend.
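As a rough sketch of that entry point, the collection and transformation stage might look like the following; the symbols, dates and column names are illustrative, and the tutorial's exact transformation may differ:

```r
library(tidyquant)

# Download daily prices for two tickers as a tidy tibble
prices <- tq_get(c("AAPL", "GOOG"),
                 get  = "stock.prices",
                 from = "2019-01-01")

# Derive yearly returns per symbol from the adjusted close
returns <- prices %>%
  group_by(symbol) %>%
  tq_transmute(select     = adjusted,
               mutate_fun = periodReturn,
               period     = "yearly",
               col_rename = "yearly_return")
```

A table like `returns` can then be reshaped into the year-by-symbol pivot layout the tutorial describes before anything is written to Excel.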
Writing Workbooks with {openxlsx}
Once the data and visualisation are prepared, {openxlsx} provides the Excel automation layer. The tutorial describes a six-step process: initialise a workbook, create a worksheet, add the stock plot, add the pivot table, save the workbook and then open it programmatically. That sequence reflects a common pattern in reporting automation, where code assembles an output file from several components before making it available.
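A minimal sketch of those six steps, using the {openxlsx} workbook API, might read as follows; `pivot_table` is a placeholder data frame, the sheet name and cell positions are illustrative, and step 3 assumes a chart has just been printed to the active graphics device:

```r
library(openxlsx)

wb <- createWorkbook()                                  # 1. initialise
addWorksheet(wb, sheetName = "stock_report")            # 2. worksheet
insertPlot(wb, sheet = "stock_report",                  # 3. add the chart
           startCol = "G", startRow = 2)
writeDataTable(wb, sheet = "stock_report",              # 4. add the table
               x = pivot_table)
saveWorkbook(wb, "stock_report.xlsx",                   # 5. save to disc
             overwrite = TRUE)
openXL("stock_report.xlsx")                             # 6. open the file
```

Because the workbook is assembled entirely in memory before step 5, the same pattern extends naturally to multiple sheets or additional tables.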
What makes {openxlsx} particularly convenient is that it works directly with Excel xlsx files, without requiring Excel itself to be open or installed during the creation process. In practical terms, this means an R script can generate a workbook as part of a larger task, whether run manually, scheduled on a machine or incorporated into a reporting pipeline. It is worth noting that {openxlsx} is no longer under active development; the package is maintained and CRAN warnings are fixed, but users starting new projects are encouraged to consider {openxlsx2} as a modern alternative.
The original tutorial also points readers towards a GitHub repository containing the full code and a YouTube walk-through showing the process step by step. Those references underline that the workflow is intended as a repeatable practical skill rather than a one-off demonstration. The tutorial forms part of a weekly series in which readers are invited to pull the latest code from the repository.
Running RStudio Inside Docker
Excel automation is one side of a wider theme in R workflows, namely integrating R with the surrounding operating environment. A good illustration of the broader approach is running RStudio inside Docker, which provides a reproducible computing environment that behaves consistently regardless of the host machine.
Docker needs to be installed first, after which the rocker/verse image can be used to launch a containerised RStudio session. As described in this Docker tutorial for R users, this image already has many useful R packages installed and allows RStudio Server to be accessed through a web browser. The launch command is as follows:
docker run --rm -p 8787:8787 -e PASSWORD=YOURNEWPASSWORD rocker/verse
The -p flag exposes port 8787 so that RStudio Server can be reached in a browser, while --rm ensures the container is deleted when it is shut down, preventing temporary containers from accumulating and consuming disc space. If Docker does not find the image locally, it will search Docker Hub and download it automatically.
Connecting to the running container depends on the operating system and Docker configuration. On Mac or Linux machines, pointing a browser to http://localhost:8787 should work. On Mac or Windows setups using Docker Quickstart Terminal, the IP address is shown at launch (for example http://192.168.99.100:8787). Should the error "Cannot connect to the Docker daemon" appear, running eval "$(docker-machine env default)" may resolve it. Once connected, log in using the username rstudio and the password set at launch.
Container File Systems and Volume Mounting
A key characteristic of Docker containers is that their file systems are temporary by default. Any files created inside a container launched with --rm will be lost when the container is shut down. The Docker tutorial illustrates this by having users create a script and a plot inside a running container, then restarting it with --rm to find them gone. This apparent limitation leads naturally into a more durable arrangement through volume mounting.
A local directory on the host machine can be linked to a directory inside the container using the -v flag, so that files written inside the container are stored on the host. Once that volume is linked, the user can open files from the mounted directory, set a working directory and load data stored on the host. The tutorial uses read.csv to load a CSV file, then loads {ggplot2}, creates a plot with qplot and saves the result with ggsave to the mounted directory. Files saved in this way persist after the container exits, separating the reproducible environment from persistent project data.
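As a sketch, the launch command with a mounted volume might look like this; the host path is illustrative, and /home/rstudio is the default user's home directory in rocker images:

```shell
# Link ~/projects/analysis on the host to a directory inside the
# container, so anything saved there survives container shutdown
docker run --rm -p 8787:8787 \
  -e PASSWORD=YOURNEWPASSWORD \
  -v ~/projects/analysis:/home/rstudio/analysis \
  rocker/verse
```

Files written to /home/rstudio/analysis from within RStudio then appear in ~/projects/analysis on the host, even after the --rm cleanup removes the container itself.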
This is an important practical consideration for anyone building automated R workflows. It demonstrates that a containerised environment can provide isolation and consistency without sacrificing continuity of project files, making Docker a useful complement to R-based automation where consistent execution environments matter.
Calling Shell Commands from R
Another route to automation is the use of system commands from an interactive R session. As described in this post by Jay on his Notes blog, base R provides the system() function for this purpose, allowing an R session to call out to the operating system to list files, launch scripts or trigger shell-based tools. A simple example is system("ls"), which lists files in the current working directory from within an R session.
The post illustrates a more practical use case with a shell script called show_notes.sh. That script accepts a source file and a marker string, extracts all lines containing the marker and writes them to a new file. Running system("show_notes.sh explore.R NOTE") from within R would search through explore.R for lines labelled with NOTE and save them to explore.R.NOTE, assuming both files are in the same directory or their paths are provided in full.
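The post does not reproduce the script itself, but based on its description, show_notes.sh might be reconstructed along these lines; the grep-based body and the demonstration file are hypothetical:

```shell
# A minimal reconstruction of what show_notes.sh might contain.
# Usage: show_notes.sh <source-file> <marker>
show_notes() {
  src="$1"; marker="$2"
  # Keep only lines containing the marker; write them to <src>.<marker>
  grep "$marker" "$src" > "$src.$marker"
}

# Demonstration with a throwaway file:
printf 'x <- 1\n# NOTE check units\ny <- 2\n# NOTE refactor later\n' > explore.R
show_notes explore.R NOTE
cat explore.R.NOTE
```

Running the demonstration leaves the two NOTE-tagged lines in explore.R.NOTE, which matches the behaviour the post describes.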
If this becomes a regular part of a workflow, the shell script can be bundled inside an R package. The script is placed under an inst/sh subdirectory, and a wrapper function in the package's R directory calls it via system.file() to locate the installed script. The wrapper function, named show_notes() in the example, constructs a command string, appends the source file and marker arguments and runs the command with system(). Once the package is installed and loaded, calling show_notes("explore.R") performs the extraction without the user needing to remember the script location. The post also notes that system2() and the {fs} package are alternatives for similar tasks, though the author had not yet tried them at the time of writing.
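A sketch of such a wrapper, suitable for the package's R/ directory, might look like the following; "mynotes" stands in for the real package name, which the post does not specify:

```r
# Hypothetical wrapper around the bundled shell script.
show_notes <- function(src, marker = "NOTE") {
  # Locate the installed copy of the script shipped under inst/sh/
  script <- system.file("sh", "show_notes.sh", package = "mynotes")
  # Build the command string and hand it to the operating system
  system(paste("sh", script, src, marker))
}
```

With the package installed, show_notes("explore.R") then performs the extraction described above without the user needing to know where the script lives on disc.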
Batch Execution with Rscript
For running R scripts outside an interactive session, Rscript provides a command-line front end. Its synopsis, as documented in the Rscript man page, is Rscript [options] [-e expr] file [args], allowing either an expression or a file of R code to be executed directly from a terminal. Additional arguments passed on the command line can then be accessed within the script using commandArgs().
Several flags influence how predictable a given run will be. The --vanilla flag suppresses saved workspaces, profile files and environment settings, helping ensure a script behaves consistently regardless of the local user environment. Other options include --verbose, --version, --help, --default-packages, --no-environ, --no-site-file and --no-init-file. Together, these controls make Rscript well suited to automated or scheduled execution where a clean, reproducible session is required.
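A brief sketch of both invocation styles, with an illustrative script name, might look like this:

```shell
# Evaluate a one-off expression without opening an interactive session:
Rscript -e 'cat(sum(1:10), "\n")'

# Run a script in a clean session, passing two trailing arguments
# (report.R is an illustrative file name):
#   Rscript --vanilla report.R 2020 AAPL
# Inside report.R the arguments would be retrieved with:
#   args <- commandArgs(trailingOnly = TRUE)
```

Setting trailingOnly = TRUE restricts commandArgs() to the user-supplied arguments, excluding the interpreter's own options.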
When considered alongside the Excel example, Rscript illustrates how a reporting workflow might be run unattended. An R script that gathers financial data with {tidyquant}, creates a chart, writes a workbook with {openxlsx} and saves it to disc could be launched from the command line with Rscript, and that invocation could itself be triggered by either a scheduler or a shell process. This closes the loop between interactive development and fully automated deployment.
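To make the scheduling side concrete, such an invocation could be wired into cron; the path and script name below are illustrative, not taken from any of the sources:

```shell
# crontab entry: rebuild the workbook every weekday at 07:00,
# in a clean session unaffected by saved workspaces or profiles
0 7 * * 1-5 Rscript --vanilla /home/user/reports/build_workbook.R
```

Any scheduler capable of launching a shell command, such as Windows Task Scheduler, could play the same role.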
Putting the Pieces Together for Reproducible R Reporting
Across these examples, a coherent picture emerges of R as an orchestration layer rather than simply an interactive analysis tool. It can collect and transform data, produce visualisations, write Excel workbooks, run inside a containerised environment, call shell commands and execute as a standalone script. Each piece serves a different purpose, but together they show how a scripted workflow can cover the entire journey from data collection to delivered report.
For those primarily interested in Excel automation, the Business Science tutorial on R-bloggers remains the clearest practical entry point. It demonstrates that an Excel workbook can be generated from R by importing financial data with {tidyquant}, building a table of stock returns by year and symbol, creating a stock chart and inserting both into a workbook with {openxlsx}. The surrounding material on Docker, shell commands and Rscript adds useful depth, showing that once a reporting task is expressed in code, there are several ways to run and maintain it: in a browser-based RStudio session inside a container, in combination with shell tools, or from the command line in a clean session.