Technology Tales

Adventures & experiences in contemporary technology

Data Science Directory

The LinkBlog on here has caught many different things, so I have split off the links to Data Science material, and those are what you find here. At the beginning, it will be a case of moving things over from the other place, but new items will appear here too. Hopefully, this will stop things from becoming lost in a bigger pile.

15:21 March 14, 2024

7 GPTs to Help Improve Your Data Science Workflow

15:20 March 14, 2024

Data Science and the Go Programming Language

20:01 January 22, 2024

Flight, DataFusion, Arrow, and Parquet: Using the FDAP Architecture to build InfluxDB 3.0

16:04 January 22, 2024

Tugan.ai

11:58 December 6, 2023

Using the RStudio Terminal in the RStudio IDE

16:47 November 3, 2023

A SAS RTF Parser Macro with the Aid of an R Package

17:13 June 14, 2023

ChatGPT brings AI into popular culture

17:02 June 14, 2023

My general advice on getting an analytics job

17:01 June 14, 2023

I’m an R user: Quarto or R Markdown?

09:10 April 26, 2023

Shiny User Adoption Fails: 9 Reasons Why Nobody Uses Your App

15:26 March 19, 2023

Welcome to NIHPO’s Synthetic Health Data Platform

01:30 February 25, 2023

Making Pretty PDFs with Quarto

10:13 January 12, 2023

Open Graph protocol

16:27 November 30, 2022

What is Chebychev’s Theorem and How Does it Apply to Data Science?

18:22 November 19, 2022

The Complete Free PyTorch Course for Deep Learning

14:50 November 19, 2022

7 Techniques to Handle Imbalanced Data

22:46 November 18, 2022

15 More Free Machine Learning and Deep Learning Books

22:34 November 18, 2022

7 Tips To Produce Readable Data Science Code

14:58 October 26, 2022

Extraction of chemical structures from literature and patent documents using open access chemistry toolkits: a case study with PFAS

14:57 October 26, 2022

Alliance for Data Science Professionals

09:12 October 26, 2022

SingleStoreDB

09:25 October 24, 2022

What is data extraction? And how to automate the process

15:51 October 21, 2022

GBS Analytics

20:41 October 14, 2022

Quarto

Pandoc

11:57 October 12, 2022

How to reveal new connections in a knowledge graph with link prediction

11:52 October 12, 2022

Fathom

11:43 October 12, 2022

Metabase

11:42 October 12, 2022

LimeSurvey

11:41 October 12, 2022

Supabase

09:39 October 1, 2022

Sample 68620: Create user home directories from the identities service in SAS® Viya® 2020.x using a script

15:09 September 28, 2022

BlueSky Statistics

17:02 September 27, 2022

R for Clinical Study Reports and Submission

09:16 September 15, 2022

How to Hide a Worksheet in Excel (that can not be unhidden easily)

13:49 September 8, 2022

Excel Formula Generator

16:54 August 17, 2022

Six tips for better spreadsheets

17:50 July 21, 2022

Apache Superset

15:54 July 8, 2022

FOSS For Spectroscopy

14:32 April 28, 2022

The Book of OHDSI

14:31 April 28, 2022

Observational Health Data Sciences and Informatics

14:56 April 27, 2022

Katja Glass Consulting Open Source Portal

14:54 April 27, 2022

CDISC Open Source Alliance

14:54 April 27, 2022

OpenClinica

15:30 February 21, 2022

Best Data Science Books For Beginners

21:14 February 10, 2022

R Graphical User Interface Comparison

22:02 January 29, 2022

Plumber

22:02 January 29, 2022

OpenCPU

08:54 January 24, 2022

The High Paying Side Hustles for Data Scientists

21:09 January 16, 2022

Numba

21:03 January 16, 2022

Freeing the data scientist mind from the curse of vectoRization

14:15 January 16, 2022

Julia Academy

14:12 January 16, 2022

Why You Should Invest in Julia Now, as a Data Scientist

14:07 January 16, 2022

The Future of Machine Learning and why it looks a lot like Julia

18:05 January 13, 2022

7 top predictive analytics use cases: Enterprise examples

17:53 January 13, 2022

6 challenges of building predictive analytics models

09:43 January 13, 2022

How to Disable Read-Only in Excel

13:58 January 8, 2022

Plots -- powerful convenience for visualization in Julia

14:00 December 23, 2021

SciML Scientific Machine Learning Software

10:31 December 23, 2021

Pumas AI

18:33 December 10, 2021

SAS Institute has shared a few COVID-19 resources for data scientists and others, so I have collected links to them here as well:

United States Covid 19 Vaccination Data

8 terms you need to understand when assessing COVID-19 data

Track the spread of the pandemic

Vaccine Efficacy, Clinical Trials, and SAS: Part 4 of Biostats in the Time of Coronavirus

What matters now when it comes to COVID-19

13:38 December 7, 2021

Light Table

13:37 December 7, 2021

CoCalc

16:46 December 2, 2021

DataKind UK

16:49 October 21, 2021

Open Neural Network Exchange

16:21 October 9, 2021

No line in plot chart despite + geom_line()

Date Formats in R

How to Automate Excel with R

5 Ways to Subset a Data Frame in R

RKWard

R-IDE

Eclipse StatET: Tooling for the R language

StatET 4.2 -- Downloads

R Commander

Summary Table with dplyr

gtsummary

My favourite R package for: summarising data

Formatted Summary Statistics and Data Summary Tables with qwraps2

How to Easily Create Descriptive Summary Statistics Tables in R Studio – By Group

Saving plots to a file with pdf(), jpeg() and png()

YaRrr! The Pirate’s Guide to R

Convert Data Frame Column to Vector in R (3 Examples)

ggplot does not work if it is inside a for loop although it works outside of it

paste {base}

Center the title in ggplot

ggplot2

strptime: Date-time Conversion Functions to and from Character

Date Values

How to sum a variable by group in R?

How to write your own ggplot2 functions in R

Learn R Programming

The R Graph Gallery

Plotly

Cumulative sum or count in R

Data Cornering

Sorting Data

Rename Data Frame Columns in R

R Package Documentation

R Documentation

Reordering Data Frame Columns in R

How to calculate a rolling average in R

Remove Element from List in R (7 Example Codes) | How to Delete a List Component

Reshaping Data

Data Wrangling with R

Cowplot

sjPlot

Journal of Statistical Software

ggplot2: Elegant Graphics for Data Analysis

R Consortium

R Validation Hub

Tutorialspoint

Run system commands or shell scripts from an interactive R session

CTAN

The write2 function

R Markdown: The Definitive Guide

R Markdown

Pandoc

TinyTeX

Advanced R

Haven

Tables in R (And How to Export Them to Word)

How to Create Customized Tables in Displayr Using R

gt

Create Awesome HTML Table with knitr::kable and kableExtra

How to Make Beautiful Tables in R

Introduction to tableone

High Performance Computing in R

Speed Up Your Code: Parallel Processing with multidplyr

Using the {plyr} (1.2) package parallel processing backend with Windows

R Programming for Data Science

Rscript

Remove grid and background from plot (ggplot2)

Download, Tidy and Visualize Covid-19 Related Data

Data Visualization: A practical introduction

The Mathematics and Statistics of Infectious Disease Outbreaks

Launching RStudio in Docker

Extending R Markdown

Applied R Code

R for the Rest of Us

High-Performance and Parallel Computing with R

Huxtable

16:56 September 29, 2021

Compare the default definitions for sample quantiles in SAS, R, and Python

Quantile definitions in SAS

Sample quantiles: A comparison of 9 definitions

14:47 September 28, 2021

Data Sources in Power BI Desktop

10:34 September 28, 2021

Understanding the Parquet File Format

10:44 September 23, 2021

GxP Compliance in Pharma Made Easier: Good Documentation Practices with R Markdown and {officedown}

17:09 September 22, 2021

Azure Cognitive Services

20:32 September 13, 2021

Jedi SAS Tricks: Explicit SQL Pass-through in DS2

12:19 September 11, 2021

Parallel Processing in Python – A Practical Guide with Examples

Parallel Processing in Python

Multithreading in Python: Running Functions in Parallel

Python: run functions in parallel with a multiprocessing wrapper function

Python – Run same function in parallel with different parameters

Run Python Code In Parallel Using Multiprocessing

Asynchronous Parallel Programming in Python with Multiprocessing

Python Multithreading and Multiprocessing Tutorial

How to run multiple functions at the same time in Python

An Intro to Threading in Python

15:51 September 9, 2021

SAS FILENAME Statement: EMAIL (SMTP) Access Method

15:50 September 9, 2021

How to send email using SAS

13:55 September 2, 2021

A look at the DataOps engineer role and responsibilities

14:03 August 26, 2021

Text Mining Node in SAS Model Studio on SAS Viya

14:02 August 26, 2021

Natural Language Processing: An Introduction

11:13 August 25, 2021

JuliaHub

17:55 August 7, 2021

Top 5 Tips for RStudio Workbench and Desktop

14:09 August 4, 2021

SAS OnDemand for Academics

14:02 August 4, 2021

Data Science Experience | SAS

14:01 August 4, 2021

Authentication to SAS Viya: a couple of approaches

09:01 August 4, 2021

NumFOCUS: A Nonprofit Supporting Open Code for Better Science

09:03 July 5, 2021

10 Tips And Tricks For Data Scientists Vol.10

09:19 June 30, 2021

Common Format and MIME Type for Comma-Separated Values (CSV) Files

12:44 June 23, 2021

10 Tips And Tricks For Data Scientists Vol.9

10:11 June 23, 2021

Analytics Value Training -- SAS

16:21 June 21, 2021

Top 10 Data Science Projects for Beginners

16:20 June 21, 2021

5 Data Science Open-source Projects To Which You Should Consider Contributing

11:27 May 28, 2021

SANAITICS

Programiz

11:03 May 28, 2021

5 Tips for Writing Clean R Code – Leave Your Code Reviewer Commentless

11:02 May 28, 2021

Installing our R development environment on Ubuntu 20.04

10:02 May 28, 2021

SAS Analysis Explorers

09:58 May 28, 2021

SAS Curiosity

09:01 May 27, 2021

SingleStore

09:01 May 27, 2021

Teradata

08:59 May 27, 2021

Apache ORC

13:52 May 16, 2021

SAS Blogs

13:47 May 16, 2021

Machine Learning Operations

13:43 May 16, 2021

Python os.system() method

13:42 May 16, 2021

Pandas Split strings into two List/Columns using str.split()

13:41 May 16, 2021

SettingwithCopyWarning: How to Fix This Warning in Pandas

11:02 May 12, 2021

6 Life-Altering RStudio Keyboard Shortcuts

10:53 May 12, 2021

SAS Free Software Trials

10:52 May 12, 2021

Open Integration with SAS

10:50 May 12, 2021

SAS Data Science Resource Hub

10:49 May 12, 2021

SAS User Group UK & Ireland

10:47 May 12, 2021

Top YouTube Channels for Data Science

10:45 May 12, 2021

10 Tips And Tricks For Data Scientists Vol.6

17:12 May 10, 2021

Decisions in the Cloud from SAS

17:06 May 10, 2021

Ask the Expert -- Analytics Explorers from SAS

17:05 May 10, 2021

Pondering AI (transistor.fm)

17:02 May 10, 2021

Free SAS Training

17:02 May 10, 2021

SAS Viya

17:01 May 10, 2021

Microsoft Azure and SAS

12:48 April 15, 2021

Apache Arrow

10:49 April 9, 2021

Top 10 Python Libraries Data Scientists should know in 2021

18:42 March 2, 2021

Excel Tips

13:52 January 27, 2021

Combining Multiple Worksheets in Any Version of Excel

13:51 January 27, 2021

Combine Data From Multiple Worksheets into a Single Worksheet in Excel

15:21 December 10, 2020

LET() assigns names to calculations in Excel

The big difference between what Excel shows and what Excel knows

Excel is getting better Conditional Formatting with much needed improvements

Many Ordinal RANK() options in Excel with joint, equal rankings, words and more

15:18 December 10, 2020

Announcing LAMBDA: Turn Excel formulas into custom functions

LAMBDA function

20:47 December 6, 2020

Julia Computing

20:43 December 6, 2020

Top 5 IDE’s for Julia

15:12 November 18, 2020

Conda

09:20 November 9, 2020

Mockaroo

09:17 November 9, 2020

Code with Mu

09:15 November 9, 2020

Stencila

13:53 October 23, 2020

rOpenSci Packages: Development, Maintenance, and Peer Review

09:25 October 7, 2020

pxWorks

09:22 October 7, 2020

Learning Machines

09:21 October 7, 2020

rOpenSci

09:37 September 17, 2020

Big Data Ignite

09:15 September 17, 2020

Business Science University

14:10 July 1, 2020

23 sources of data bias for Machine Learning and Deep Learning

16:46 June 8, 2020

Summary of SAS Macro Quoting Functions and the Characters That They Mask

20:23 March 4, 2020

KDnuggets

20:22 March 4, 2020

Kaggle

20:22 March 4, 2020

R-bloggers

17:38 November 25, 2019

Elvis SAS Log Analyzer

14:54 November 9, 2019

The Definitive Guide to Conda Environments

16:54 October 31, 2019

Open Data Science Conference

09:19 October 31, 2019

Cloudera

09:18 October 31, 2019

JASP

Jamovi

Qlik

Tableau

Scala

MATLAB

22:02 October 23, 2019

What is eCOA and How Does it Improve Clinical Trial Data Quality?

16:23 October 18, 2019

Boemska Technology Solutions

15:08 October 1, 2019

What is a geometric mean?

15:56 December 5, 2018

A Second Look at the ODS Destination for PowerPoint

The Dynamic Duo: ODS Layout and the ODS Destination for PowerPoint

Square Peg, Square Hole—Getting Tables to Fit on Slides in the ODS Destination for PowerPoint

The ODS Destination for PowerPoint Tip Sheet

23:43 October 25, 2017

Revolutions

23:42 October 25, 2017

TIBCO

23:42 October 25, 2017

TIBCO Mashery

19:21 October 24, 2017

Nature Reviews Drug Discovery

23:26 October 22, 2017

Institute of Clinical Research

19:59 October 16, 2017

Moving and Accessing SAS® 9.4 Files

19:55 October 16, 2017

Here are a few SAS functions that are less well known to me:

CHOOSEC

CHOOSEN

IFC

IFN

10:21 October 14, 2017

The Data Incubator

22:42 October 12, 2017

RStudio

Jupyter

Impala

Amazon Redshift

Hadoop

13:51 October 4, 2017

Data Science 101

Data Science Central

Association for Computing Machinery

ImageNet

TensorFlow

WordNet

Alberto Cairo

Visualizing and Understanding Convolutional Networks

Hidden Technical Debt in Machine Learning Systems

Apache Spark

Anaconda

10:27 August 30, 2017

Prevent a SAS Session from Connecting to a SQL Database Until Required

16:50 March 10, 2017

12:36 May 19, 2016

Create conditional formulas in Excel

12:14 May 19, 2016

Split a String in Excel

12:10 February 8, 2016

Add business days to date in Excel

10:10 February 5, 2016

Optimising the Use of Data Standards

18:16 January 13, 2016

Finding an Asterisk in Excel

10:11 November 7, 2015

Metadata-Driven SDTM Creation

02:40 March 5, 2015

List of ISO 639-1 codes
