TOPIC: FREE STATISTICAL SOFTWARE
Automating Positron and RStudio updates on Linux Mint 22
6th November 2025Elsewhere, I have written about avoiding manual updates with VSCode and VSCodium. Here, I come to IDE's produced by Posit, formerly RStudio, for data science and analytics uses. The first is a more recent innovation that works with both R and Python code natively, while the second has been around for much longer and focusses on native R code alone, though there are R packages allowing an interface of sorts with Python. Neither are released via a PPA, necessitating either manual downloading or the scripted approach taken here for a Linux system. Each software tool will be discussed in turn.
Positron
Now, we work through a script that automates the upgrade process for Positron. This starts with a shebang line calling the bash executable before moving to a line that adds safety to how the script works using a set statement. Here, the -e switch triggers exiting whenever there is an error, halting the script before it carries on to perform any undesirable actions. That is followed by the -u switch that causes errors when unset variables are called; normally these would be assigned a missing value, which is not desirable in all cases. Lastly, the -o pipefail switch causes a pipeline (cmd1 | cmd2 | cm3) to fail if any command in the pipeline produces an error, which can help debugging because the error is associated with the command that fails to complete.
#!/bin/bash
set -euo pipefail
The next step then is to determine the architecture of the system on which the script is running so that the correct download is selected.
ARCH=$(uname -m)
case "$ARCH" in
x86_64) POSIT_ARCH="x64" ;;
aarch64|arm64) POSIT_ARCH="arm64" ;;
*) echo "Unsupported arch: $ARCH"; exit 1 ;;
esac
Once that completes, we define the address of the web page to be interrogated and the path to the temporary file that is to be downloaded.
RELEASES_URL="https://github.com/posit-dev/positron/releases"
TMPFILE="/tmp/positron-latest.deb"
Now, we scrape the page to find the address of the latest DEB file that has been released.
echo "Finding latest Positron .deb for $POSIT_ARCH..."
DEB_URL=$(curl -fsSL "$RELEASES_URL" \
| grep -Eo "https://cdn\.posit\.co/[A-Za-z0-9/_\.-]+Positron-[0-9\.~-]+-${POSIT_ARCH}\.deb" \
| head -n 1)
If that were to fail, we get an error message produced before the script is aborted.
if [ -z "${DEB_URL:-}" ]; then
echo "Could not find a .deb link for ${POSIT_ARCH} on the releases page"
exit 1
fi
Should all go well thus far, we download the latest DEB file using curl.
echo "Downloading: $DEB_URL"
curl -fL "$DEB_URL" -o "$TMPFILE"
When the download completes, we try installing the package using apt, much like we do with a repo, apart from specifying an actual file path on our system.
echo "Installing Positron..."
sudo apt install -y "$TMPFILE"
Following that, we delete the installation file and issue a message informing the user of the task's successful completion.
echo "Cleaning up..."
rm -f "$TMPFILE"
echo "Done."
When I do this, I tend to find that the Python REPL console does not open straight away, causing me to shut down Positron and leaving things for a while before starting it again. There may be temporary files that need to be expunged and that needs its own time. Someone else might have a better explanation that I am happy to use if that makes more sense than what I am suggesting. Otherwise, all works well.
RStudio
A lot of the same processing happens during the script updating RStudio, so we will just cover the differences. The set -x statement ensures that every command is printed to the console for the debugging that was needed while this was being developed. Otherwise, much code, including architecture detection, is shared between the two apps.
#!/bin/bash
set -euo pipefail
set -x
# --- Detect architecture ---
ARCH=$(uname -m)
case "$ARCH" in
x86_64) RSTUDIO_ARCH="amd64" ;;
aarch64|arm64) RSTUDIO_ARCH="arm64" ;;
*) echo "Unsupported architecture: $ARCH"; exit 1 ;;
esac
Figuring out the distro version and the web page to scrape was where additional effort was needed, and that is reflected in some of the code that follows. Otherwise, many of the ideas applied with Positron also have a place here.
# --- Detect Ubuntu base ---
DISTRO=$(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || true)
[ -z "$DISTRO" ] && DISTRO="noble"
# --- Define paths ---
TMPFILE="/tmp/rstudio-latest.deb"
LOGFILE="/var/log/rstudio_update.log"
echo "Detected Ubuntu base: ${DISTRO}"
echo "Fetching latest version number from Posit..."
# --- Get version from Posit's official RStudio Desktop page ---
VERSION=$(curl -s https://posit.co/download/rstudio-desktop/ \
| grep -Eo 'rstudio-[0-9]+\.[0-9]+\.[0-9]+-[0-9]+' \
| head -n 1 \
| sed -E 's/rstudio-([0-9]+\.[0-9]+\.[0-9]+-[0-9]+)/\1/')
if [ -z "$VERSION" ]; then
echo "Error: Could not extract the latest RStudio version number from Posit's site."
exit 1
fi
echo "Latest RStudio version detected: ${VERSION}"
# --- Construct download URL (Jammy build for Noble until Noble builds exist) ---
BASE_DISTRO="jammy"
BASE_URL="https://download1.rstudio.org/electron/${BASE_DISTRO}/${RSTUDIO_ARCH}"
FULL_URL="${BASE_URL}/rstudio-${VERSION}-${RSTUDIO_ARCH}.deb"
echo "Downloading from:"
echo " ${FULL_URL}"
# --- Validate URL before downloading ---
if ! curl --head --silent --fail "$FULL_URL" >/dev/null; then
echo "Error: The expected RStudio package was not found at ${FULL_URL}"
exit 1
fi
# --- Download and install ---
curl -L "$FULL_URL" -o "$TMPFILE"
echo "Installing RStudio..."
sudo apt install -y "$TMPFILE" | tee -a "$LOGFILE"
# --- Clean up ---
rm -f "$TMPFILE"
echo "RStudio update to version ${VERSION} completed successfully." | tee -a "$LOGFILE"
When all ended, RStudio worked without a hitch, leaving me to move on to other things. The next time that I am prompted to upgrade the environment, this is the way I likely will go.
PandasGUI: A simple solution for Pandas DataFrame inspection from within VSCode
2nd September 2025One of the things that I miss about Spyder when running Python scripts is the ability to look at DataFrames easily. Recently, I was checking a VAT return only for tmux to truncate how much of the DataFrame I could see in output from the print function. While closing tmux might have been an idea, I sought the DataFrame windowing alternative. That led me to the pandasgui package, which did exactly what I needed, apart from pausing the script execution to show me the data. The installed was done using pip:
pip install pandasgui
Once that competed, I could use the following code construct to accomplish what I wanted:
import pandasgui
pandasgui.show(df)
In my case, there were several lines between the two lines above. Nevertheless, the first line made the pandasgui package available to the script, while the second one displayed the DataFrame in a GUI with scrollbars and cells, among other things. That was close enough to what I wanted to leave me able to complete the task that was needed of me.
Broadening data science horizons: Useful Python packages for working with data
14th October 2021My response to changes in the technology stack used in clinical research is to develop some familiarity with programming and scripting platforms that complement and compete with SAS, a system with which I have been programming since 2000. While one of these has been R, Python is another that has taken up my attention, and I now also have Julia in my sights as well. There may be others to assess in the fullness of time.
While I first started to explore the Data Science world in the autumn of 2017, it was in the autumn of 2019 that I began to complete LinkedIn training courses on the subject. Good though they were, I find that I need to actually use a tool to better understand it. At that time, I did get to hear about Python packages like Pandas, NumPy, SciPy, Scikit-learn, Matplotlib, Seaborn and Beautiful Soup though it took until of spring of this year for me to start gaining some hands-on experience with using any of these.
During the summer of 2020, I attended a BCS webinar on the CodeGrades initiative, a programming mentoring scheme inspired by the way classical musicianship is assessed. In fact, one of the main progenitors is a trained classical musician and teacher of classical music who turned to Python programming when starting a family to have a more stable income. The approach is that a student selects a project and works their way through it, with mentoring and periodic assessments carried out in a gentle and discursive manner. Of course, the project has to be engaging for the learning experience to stay the course, and that point came through in the webinar.
That is one lesson that resonates with me with subjects as diverse as web server performance and the ongoing pandemic supplying data, and there are other sources of public data to examine as well before looking through my own personal archive gathered over the decades. Though some subjects are uplifting while others are more foreboding, the key thing is that they sustain interest and offer opportunities for new learning. Without being able to dream up new things to try, my knowledge of R and Python would not be as extensive as it is, and I hope that it will help with learning Julia too.
In the main, my own learning has been a solo effort with consultation of documentation along with web searches that have brought me to the likes of Real Python, Stack Abuse, Data Viz with Python and R and others for longer tutorials as well as threads on Stack Overflow. Usually, the web searching begins when I need a steer on a particular or a way to resolve a particular error or warning message, but books are always worth reading even if that is the slower route. While those from the Dummies series or from O'Reilly have proved must useful so far, I do need to read them more completely than I already have; it is all too tempting to go with the try the "programming and search for solutions as you go" approach instead.
To get going, many choose the Anaconda distribution to get Jupyter notebook functionality, but I prefer a more traditional editor, so Spyder has been my tool of choice for Python programming and there are others like PyCharm as well. Because Spyder itself is written in Python, it can be installed using pip from PyPi like other Python packages. It has other dependencies like Pylint for code management activities, but these get installed behind the scenes.
The packages that I first met in 2019 may be the mainstays for doing data science, but I have discovered others since then. It also seems that there is porosity between the worlds of R and Python, so you get some Python packages aping R packages and R has the Reticulate package for executing Python code. There are Python counterparts to such Tidyverse stables as dplyr and ggplot2 in the form of Siuba and Plotnine, respectively. Though the syntax of these packages are not direct copies of what is executed in R, they are close enough for there to be enough familiarity for added user-friendliness compared to Pandas or Matplotlib. The interoperability does not stop there, for there is SQLAlchemy for connecting to MySQL and other databases (PyMySQL is needed as well) and there also is SASPy for interacting with SAS Viya.
While Python may not have the speed of Julia, there are plenty of packages for working with larger workloads. Of these, Dask, Modin and RAPIDS all have their uses for dealing with data volumes that make Pandas code crawl. As if to prove that there are plenty of libraries for various forms of data analytics, data science, artificial intelligence and machine learning, there also are the likes of Keras, TensorFlow and NetworkX. These are just a selection of what is available, and there is always the possibility of checking out others. It may be tempting to stick with the most popular packages all the time, especially when they do so much, but it never hurts to keep an open mind either.
Some books and other forms of documentation on R
11th September 2021The thrust of an exhortation from a computing handbook publisher comes to mind here: don't just look things up on Google, read a book so you really understand what you are doing. A form of words like that was used to sell an eBook on GitHub, but the same sentiment applies to R or any other computing language. While using a search engine will get you going or add to existing knowledge, only a book or a training course will help to embed real competence.
In the case of R, there is a myriad of blogs out there that can be consulted, as well as function and package documentation on RDocumentation or rrdr.io. For the former, R-bloggers or R Weekly can make good places to start, while ones like Stats and R, Statistics Globe, STHDA, PSI's VIS-SIG and anything from Posit (including their main blog as well as their AI one) can be worth consulting. Additionally, there is also RStudio Education and the NHS-R Community, which also have a GitHub repository together with a YouTube channel. Many packages have dedicated websites as well, so there is no lack of documentation with all of these, so here is a selection:
To come to the real subject of this post, R is unusual in that books that you can buy also have companion websites that contain the same content with the same structure. Whatever funds this approach (and some appear to be supported by RStudio itself by the looks of things), there certainly are many books available freely online in HTML as you will see from the list below, while a few do not have a print counterpart as far as I know:
R Programming for Data Science
R Markdown: The Definitive Guide
bookdown: Authoring Books and Technical Documents with R Markdown
blogdown: Creating Websites with R Markdown
pagedown: Create Paged HTML Documents for Printing from R Markdown
Dynamic Documents with R and knitr
Engineering Production-Grade Shiny Apps
Outstanding User Interfaces with Shiny
Happy Git and GitHub for the useR
Outstanding User Interfaces with Shiny
Engineering Production-Grade Shiny Apps
Many of the above have counterparts published by O'Reilly or Chapman & Hall, to name the two publishers that I have found so far. Aside from sharing these with you, there is also the personal motivation of having the collection of links somewhere so I can close tabs in my Firefox session. There are other web articles open in other tabs that I need to retain and share, but these will need to do for now, and I hope that you find them as useful as I do.