Technology Tales

Adventures in consumer and enterprise technology

TOPIC: SAS INSTITUTE

Launching SAS Analytics Pro on Viya with automated Docker image clean-up

28th November 2025

For my freelancing, I have a licensed version of SAS Analytics Pro running in a Docker container on my main Linux workstation. Every time there is a new release, a new Docker image is made available, which means that a few of them can accumulate on a system. Aside from taking up disk space that could have other uses, this also makes it tricky to automate the startup of the associated Docker container. Avoiding that means pruning the Docker images on the system, something that also needs automation.

To make things clearer, let me work through the launch script that I use; this is called by the script that checks for and then downloads any new image, should one be available. First up is the shebang, which uses the -e switch to exit the script in the event of an error. That puts a stop to any potentially destructive outcomes from later commands being executed without the input that they need.

#!/bin/bash -e

Next comes the command to shut down the existing container. A running container keeps a reference to its image, which would prevent the old image's removal once a new one gets instated. Doing the rest of the steps with an already running container would produce errors anyway.

if docker container ls -a --format '{{.Names}}' | grep -q '^sas-analytics-pro$'; then
    docker container stop sas-analytics-pro
fi

After that comes the step to find the latest image. Once, I did this by looping through image ages in days, weeks and months, hardly an elegant or robust approach. What follows is altogether more effective.

# Find latest SAS Analytics Pro image
IMAGE=$(docker image ls --format '{{.Repository}}:{{.Tag}} {{.CreatedAt}}' \
    | grep 'sas-analytics-pro' \
    | sort -k2,3r \
    | head -n 1 \
    | awk '{print $1}')

echo "Chosen image: $IMAGE"

Since there is quite a lot happening above, let us unpack the actions. The first part lists all Docker images, formatting each line to show the image name (repository:tag) followed by its creation timestamp: docker image ls --format '{{.Repository}}:{{.Tag}} {{.CreatedAt}}'. The next piece picks out all the images that are for SAS Analytics Pro: grep 'sas-analytics-pro'. The crucial step, sort -k2,3r, comes next; this sorts the results by the second and third fields (the creation date and time) in reverse order, so the newest images appear first. With that done, it is time to pick out the most recent entry using head -n 1. To extract just the image name, you need awk '{print $1}'. All of this is wrapped within IMAGE=$(...) to assign the result to a variable, which is then printed to the console using an echo statement.

With the image selected, you can spin up the container once the other parameters are specified, allowing some sleep time afterwards before proceeding to the clean-up steps:

run_args="
-e SASLOCKDOWN=0
--name=sas-analytics-pro
--rm
--detach
--hostname sas-analytics-pro
--env RUN_MODE=developer
--env SASLICENSEFILE=[Path to SAS licence file]
--publish 8080:80
--volume ${PWD}/sasinside:/sasinside
--volume ${PWD}/sasdemo:/data2
--volume [location of SAS files on the system]:/data
--cap-add AUDIT_WRITE
--cap-add SYS_ADMIN
--publish 8222:22
"

if ! docker run -u root ${run_args} "$IMAGE" "$@" > /dev/null 2>&1; then
    echo "Failed to run the image."
    exit 1
fi

sleep 5

With the new container in action, the subsequent step is to find the older images and delete those. Again, the docker image command is invoked, with its output fed to a selection command for SAS Analytics Pro images. Once the current image has been removed from the listing by the grep -v command, the list of images to be deleted is assigned to the IMAGES_TO_REMOVE variable.

IMAGES_TO_REMOVE=$(docker image ls --format '{{.Repository}}:{{.Tag}}' \
    | grep 'sas-analytics-pro' \
    | grep -v "^$IMAGE$")

echo "Will remove older images:"
echo "$IMAGES_TO_REMOVE"

After that has happened, iterating through the list of images using a for loop will remove them one at a time using the docker image rm command:

for OLD in $IMAGES_TO_REMOVE; do
    echo "Removing $OLD"
    docker image rm "$OLD" || echo "Could not remove $OLD"
done

That concludes the operation of spinning up a new SAS Analytics Pro Docker container while also removing any superseded Docker images. One last step is to capture the password needed for logging into the SAS Studio interface, which is available at localhost:8080 or whatever address and port is being used to serve the application:

docker logs sas-analytics-pro 2>&1 | grep "Password=" > pw.txt

Folding updating and housekeeping into the same activity as spinning up the Docker container means that I need not think about doing anything else. The extra moments that those steps take repay the effort: the latest version is always running in a tidy environment, without my having to remember to do any of this myself.

Using the LIKE operator in PROC SQL WHERE clauses in SAS

26th November 2025

Recently, I was working in SAS and decided to try picking out datasets and variables from its dictionary tables, eventually extracting the maximum length of a variable for assigning the length of a new one. This could have been done using a long-established technique:

proc sql;
    select distinct memname into :dsns separated by '#'
        from dictionary.tables
            where lowcase(libname) = 'work'
            and index(lowcase(memname), "r_") = 1
            and index(lowcase(memname), "visit") = 0;
quit;

The result is a macro variable containing a delimited list of work datasets whose names begin with r_ and do not contain the string visit. As well as using the index function to find the position of one string within another, I have seen the count function used for similar purposes, albeit without the positional specificity. Since the =: operator, which looks for a search string at the start of a larger one, does not work in WHERE clauses (data step IF statements are more than fine), you cannot do something like this:

proc print data = sashelp.vmember noobs;
    where lowcase(libname) = 'work'
    and lowcase(memname) =: "r_";
run;
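
By way of contrast, here is a minimal sketch of the truncated comparison working where it is supported, namely in a data step IF statement:

data _null_;
    set sashelp.vmember;
    where lowcase(libname) = 'work';
    /* =: compares only as many characters as the shorter operand holds */
    if lowcase(memname) =: "r_" then put memname=;
run;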

While the contains operator behaves like the count function in disregarding where the search text appears, yet another option is the like operator, which is shown in the example below:

proc sql;
    select distinct memname into :dsns separated by '#'
        from dictionary.tables
            where lowcase(libname) = 'work'
            and lowcase(memname) like 'r\_%' escape '\'
            and lowcase(memname) not like '%visit%';
quit;

Here, % and _ are placeholder characters, with the first matching zero or more characters and the second matching exactly one character. Thus, the underscore in r_ needs escaping to be matched literally (otherwise, the pattern will match the letter r at the start of a string followed by any single character), and a backslash character (\) covers that duty. To ensure that this does what you want, adding escape '\' after the expression tells SAS which character is doing the escaping.

Another thing to watch is that the percent (%) character needs a form of escaping from the SAS Macro language processor, and placing the search term in single quotes attends to that. That means that %visit% does not cause any errors when you are looking for visit within a dataset name and using negation (the not operator) to exclude any matches from the search results. Should you want to match visit only at the end of a name, '%visit' would be the pattern to use instead.
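
Returning to the original motivation of setting a new variable's length, the same like patterns work against dictionary.columns, the variable-level counterpart of dictionary.tables. The following is only a sketch, with avalc being a hypothetical variable name:

proc sql noprint;
    /* Find the longest definition of the named variable across the matching datasets */
    select max(length) into :maxlen trimmed
        from dictionary.columns
            where lowcase(libname) = 'work'
            and lowcase(memname) like 'r\_%' escape '\'
            and lowcase(name) = 'avalc';
quit;

%put Maximum length found: &maxlen;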

Should you wish to play around with the above to see what happens for your own learning, try using something like this to give you a few test datasets:

data r_test r_visit visit;
    set sashelp.class;
run;

Otherwise, feel free to add your own test cases to cement the ideas even further. All too often, we look up something, deploy it and then forget about it, especially when AI is involved. Nevertheless, the fastest way to write code can be to use what is embedded in your memory.

WARNING: Unsupported device 'SVG' for RTF destination. Using default device 'EMF'.

31st October 2025

When running SAS programs to create some outputs, I met with the above message because I was sending output to an RTF file, a format that does not support SVG graphics. This must have been a system default because there was nothing in the programs that should trigger the warning. Getting rid of the message uses one of two approaches:

goptions device=emf;

ods graphics / outputfmt=emf;

The first applies to legacy code generating graphics using SAS/GRAPH procedures, while the second is for the more modern ODS graphics procedures like SGPLOT. While the first one may work in all cases, that cannot be assumed without further testing.
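
For illustration, here is a minimal sketch of the second approach slotted into an RTF sandwich, with a hypothetical output file and a dataset shipped with SAS:

ods rtf file='example.rtf';
ods graphics / outputfmt=emf;

proc sgplot data=sashelp.class;
    scatter x=height y=weight;
run;

ods rtf close;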

There was another curiosity in this: the system setup should have taken care of things to stop the warning from appearing in the log. However, these were programs created from SAS logs to capture all code generated by any SAS macros that were called for regulatory submission purposes. Thus, they were self-contained and missed off the environment setup code, which is how I came to see what I did.

Platform-independent file deletion in SAS programming

11th September 2025

There are times when you need to delete a file from within a SAS program that is not one of its inbuilt object types, such as datasets or catalogues. Many choose to use X commands or the SYSTASK facility, even though these issue operating system-specific commands that obstruct portability. It should also be recalled that system administrators can make these facilities unavailable to users. Thus, it is better to use methods that are provided within SAS itself.

The first step is to define a file reference pointing to the file that you need to delete:

filename myfile '<full path of file to be removed>';

With that accomplished, you can move on to deleting the file in a null data step, one where no output dataset is generated. The FDELETE function does that in the code below, using the MYFILE file reference declared already. The step issues a return code used to provide feedback to the user, with 0 indicating success and non-zero values indicating failure.

data _null_;
  rc = fdelete('myfile');
  if rc = 0 then put 'File deleted successfully.';
  else put 'Error deleting file. RC=' rc;
run;

With the above out of the way, you then can clear the file reference to tidy up things afterwards:

filename myfile clear;
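
Should you prefer to keep everything within the data step, the FILENAME function can assign and clear the file reference there too. What follows is a sketch along those lines, with the placeholder path retained:

data _null_;
  /* Assign the file reference within the data step */
  rc = filename('myfile', '<full path of file to be removed>');
  /* Delete the file and report the outcome */
  rc = fdelete('myfile');
  if rc = 0 then put 'File deleted successfully.';
  else put 'Error deleting file. RC=' rc;
  /* Clear the file reference again */
  rc = filename('myfile', '');
run;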

Some may consider SAS programs to be single-use items that never move from one system to another. However, I have been involved in several such moves, particularly between Windows, UNIX and Linux, so I know the value of keeping things streamlined, and some SAS programs are multi-use anyway. Anything that cuts down on workload when there is much else to do cannot be discounted.

SAS Packages: Revolutionising code sharing in the SAS ecosystem

26th July 2025

In the world of statistical programming, SAS has long been the backbone of data analysis for countless organisations worldwide. Yet, for decades, one of the most significant challenges facing SAS practitioners has been the efficient sharing and reuse of code. Knowledge and expertise have often remained siloed within individual developers or teams, creating inefficiencies and missed opportunities for collaboration. Enter the SAS Packages Framework (SPF), a solution that changes how SAS professionals share, distribute and utilise code across their organisations and the broader community.

The Problem: Fragmented Knowledge and Complex Dependencies

Anyone who has worked extensively with SAS knows the frustration of trying to share complex macros or functions with colleagues. Traditional code sharing in SAS has been plagued by several issues:

  • Dependency nightmares: A single macro often relies on dozens of utility macros working behind the scenes, making it nearly impossible to share everything needed for the code to function properly
  • Version control chaos: Keeping track of which version of which macro works with which other components becomes an administrative burden
  • Platform compatibility issues: Code that works on Windows might fail on Linux systems and vice versa
  • Lack of documentation: Without proper documentation and help systems, even the most elegant code becomes unusable to others
  • Knowledge concentration: Valuable SAS expertise remains trapped within individuals rather than being shared with the broader community

These challenges have historically meant that SAS developers spend countless hours reinventing the wheel, recreating functionality that already exists elsewhere in their organisation or the wider SAS community.

The Solution: SAS Packages Framework

The SAS Packages Framework, developed by Bartosz Jabłoński, represents a paradigm shift in how SAS code is organised, shared and deployed. At its core, a SAS package is an automatically generated, single, standalone zip file containing organised and ordered code structures, extended with additional metadata and utility files. This solution addresses the fundamental challenges of SAS code sharing by providing:

  • Functionality over complexity: Instead of worrying about 73 utility macros working in the background, you simply share one file and tell your colleagues about the main functionality they need to use.
  • Complete self-containment: Everything needed for the code to function is bundled into one file, eliminating the "did I remember to include everything?" problem that has plagued SAS developers for years.
  • Automatic dependency management: The framework handles the loading order of code components and automatically updates system options like cmplib= and fmtsearch= for functions and formats.
  • Cross-platform compatibility: Packages work seamlessly across different operating systems, from Windows to Linux and UNIX environments.

Beyond Macros: A Spectrum of SAS Functionality

One of the most compelling aspects of the SAS Packages Framework is its versatility. While many code-sharing solutions focus solely on macros, SAS packages support a wide range of SAS functionality:

  • User-defined functions (both FCMP and CASL)
  • IML modules for matrix programming
  • PROC PROTO C routines for high-performance computing
  • Custom formats and informats
  • Libraries and datasets
  • PROC DS2 threads and packages
  • Data generation code
  • Additional content such as documentation PDF files

This comprehensive approach means that virtually any SAS functionality can be packaged and shared, making the framework suitable for everything from simple utility macros to complex analytical frameworks.

Real-World Applications: From Pharmaceutical Research to General Analytics

The adoption of SAS packages has been particularly notable in the pharmaceutical industry, where code quality, validation and sharing are critical concerns. The PharmaForest initiative, led by PHUSE Japan's Open-Source Technology Working Group, exemplifies how the framework is being used to revolutionise pharmaceutical SAS programming. PharmaForest offers a collaborative repository of SAS packages specifically designed for pharmaceutical applications, including:

  • OncoPlotter: A comprehensive package for creating figures commonly used in oncology studies
  • SAS FAKER: Tools for generating realistic test data while maintaining privacy
  • SASLogChecker: Automated log review and validation tools
  • rtfCreator: Streamlined RTF output generation

The initiative's philosophy captures perfectly the spirit of the SAS Packages Framework: "Through SAS packages, we want to actively encourage sharing of SAS know-how that has often stayed within individuals. By doing this, we aim to build up collective knowledge, boost productivity, ensure quality through standardisation and energise our community".

The SASPAC Archive: A Growing Ecosystem

The establishment of SASPAC (SAS Packages Archive) represents the maturation of the SAS packages ecosystem. This dedicated repository serves as the official home for SAS packages, with each package maintained as a separate repository complete with version history and documentation. Some notable packages available through SASPAC include:

  • BasePlus: Extends BASE SAS with functionality that many developers find themselves wishing was built into SAS itself. With 12 stars on GitHub, it's become one of the most popular packages in the archive.
  • MacroArray: Provides macro array functionality that simplifies complex macro programming tasks, addressing a long-standing gap in SAS's macro language capabilities.
  • SQLinDS: Enables SQL queries within data steps, bridging the gap between SAS's powerful data step processing and SQL's intuitive query syntax.
  • DFA (Dynamic Function Arrays): Offers advanced data structures that extend SAS's analytical capabilities.
  • GSM (Generate Secure Macros): Provides tools for protecting proprietary code while still enabling sharing and collaboration.

Getting Started: Surprisingly Simple

Despite these capabilities, getting started with SAS packages is fairly straightforward. The framework can be deployed in multiple ways, depending on your needs. For a quick test or one-time use, you can enable the framework directly from the web:

filename packages "%sysfunc(pathname(work))";
filename SPFinit url "https://raw.githubusercontent.com/yabwon/SAS_PACKAGES/main/SPF/SPFinit.sas";
%include SPFinit;

For permanent installation, you simply create a directory for your packages and install the framework:

filename packages "C:SAS_PACKAGES";
%installPackage(SPFinit)

Once installed, using packages becomes as simple as:

%installPackage(packageName)
%helpPackage(packageName)
%loadPackage(packageName)
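
To make that concrete, fetching, reading up on and loading the MacroArray package mentioned earlier would go something like this, assuming that the framework has been installed as above:

%installPackage(macroArray)
%helpPackage(macroArray)
%loadPackage(macroArray)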

Developer Benefits: Quality and Efficiency

For SAS developers, the framework offers numerous advantages that go beyond simple code sharing:

  • Enforced organisation: The package development process naturally encourages better code organisation and documentation practices.
  • Built-in testing: The framework includes testing capabilities that help ensure code quality and reliability.
  • Version management: Packages include metadata such as version numbers and generation timestamps, supporting modern DevOps practices.
  • Integrity verification: The framework provides tools to verify package authenticity and integrity, addressing security concerns in enterprise environments.
  • Cherry-picking: Users can load only specific components from a package, reducing memory usage and namespace pollution.

The Future of SAS Code Sharing

The growing adoption of SAS packages represents more than just a new tool; it signals a fundamental shift towards a more collaborative and efficient SAS ecosystem. The framework's MIT licensing and 100% open-source nature ensure that it remains accessible to all SAS users, from individual practitioners to large enterprise installations. This democratisation of advanced code-sharing capabilities levels the playing field and enables even small teams to benefit from enterprise-grade development practices.

As the ecosystem continues to grow, with contributions from pharmaceutical companies, academic institutions and individual developers worldwide, the SAS Packages Framework is proving that the future of SAS programming lies not in isolated development, but in collaborative, community-driven innovation.

For SAS practitioners looking to modernise their development practices, improve code quality and tap into the collective knowledge of the global SAS community, exploring SAS packages isn't just an option; it's becoming an essential step towards more efficient and effective statistical programming.

What SAS Innovate 2025 revealed about the future of enterprise analytics

21st June 2025

SAS Innovate 2025 comprised a global event in Orlando (6th-9th May) followed by regional editions on tour. This document provides observations from both the global event and the London stop (3rd-4th June), covering technical content, platform developments and thematic emphasis across the two occasions. The global event featured extensive recorded content covering platform capabilities, migration approaches and practical applications, whilst the London event incorporated these themes with additional local perspectives and a particular focus on governance and life sciences applications.

Global Event

Platform Expansion and New Capabilities

The global SAS Innovate 2025 event included content on SAS Clinical Acceleration, positioned as a SAS Viya equivalent to SAS LSAF. Whilst much appeared familiar from the predecessor platform, performance improvements and additional capabilities represented meaningful enhancements.

Two presentations, likely restricted to in-person attendees based on their absence from certain schedules, covered AI-powered SAS code generation. Shionogi presented on using AI for clinical studies and real-world evidence generation, with the significant detail that the AI capability existed within SAS Viya rather than depending on external large language models. Another session addressed interrogating and generating study protocols using SAS Viya, including functionality intended to support study planning in ways that could improve success probability.

These sessions collectively indicated a directional shift. The scope extends beyond conventional expectations of "SAS in clinical" contexts, moving into upstream and adjacent activities, including protocol development and more integrated automation.

Architectural Approaches and Data Movement

A significant theme across multiple sessions addressed fundamental shifts in data architecture. The traditional approach of moving massive datasets from various sources into a single centralised analytics engine is being challenged by a new paradigm: bringing analytics to the data. The integration of SAS Viya with SingleStore exemplifies this approach, where analytics processing occurs directly within the source database rather than requiring data extraction and loading. This architectural change can reduce infrastructure requirements for specific workloads by as much as 50 per cent, whilst eliminating the complexity and cost associated with constant data movement and duplication.

Trustworthy AI and Organisational Reflection

Keynote presentations addressed the relationship between AI systems and organisational practices. SAS Vice President of Data Ethics Practice Reggie Townsend articulated a perspective that reframes common concerns about AI bias. When AI produces biased results, the issue is not primarily technical failure, but rather a reflection of biases already embedded within cultural and organisational practices. This view positions AI as a diagnostic tool that surfaces systemic issues requiring organisational attention rather than merely technical remediation.

The focus on trustworthy AI extended beyond bias to encompass governance frameworks, transparency requirements and the persistent challenge that poor data quality leads to ineffective AI regardless of model sophistication. These considerations hold particular significance in probabilistic AI contexts, especially where SAS aims to incorporate deterministic elements into aspects of its AI offering.

Natural Language Interfaces and Accessibility

Content addressing SAS Viya Copilot demonstrated the platform's natural language capabilities, enabling users to interact with analytics through conversational queries rather than requiring technical syntax. This approach aims to democratise data access by allowing users with limited technical knowledge to directly engage with complex datasets. The Copilot functionality, built on Microsoft Azure OpenAI Service, supports code generation, model development assistance and natural language explanations of analytical outputs.

Cloud Migration and Infrastructure Considerations

A presentation on transforming clinical programming using SAS Clinical Acceleration was scheduled but not accessible at the global event. The closing session featured the CIO of Parexel discussing their transition to SAS managed cloud services. Characterised as a modernisation initiative, reported outcomes included reduced outage frequency. This aligns with observations from other multi-tenant systems, where maintaining stability and availability represents a fundamental requirement that often proves more complex than external perspectives might suggest.

Content addressing cloud-native strategies emphasised a fundamental psychological shift in resource management. Rather than the traditional capital expenditure mindset where physical servers run continuously, cloud environments enable strategic use of the capability to create and destroy computing resources on demand. Approaches include spinning up analytics environments at the start of the working day and shutting them down at the end, with more sophisticated implementations that automatically save and shut down environments after periods of inactivity. This dynamic approach ensures organisations pay only for actively used resources.

Presentations on organisational change management accompanying technical migrations emphasised that successful technology projects require attention to human factors alongside technical implementation. Strategies discussed included formal launch events to mark transitions, structured support mechanisms such as office hours for technical questions and community-building activities designed to foster relationships and maintain engagement during periods of change.

Platform Integration and Practical Applications

Content on SAS Viya Workbench covered availability through Azure and AWS, Python integration, R compatibility and interfacing with SAS Enterprise Guide, with demonstrations of several features. As SAS expands support for open-source languages, the presentation illustrated how these capabilities can provide a unified platform for different technical communities.

A presentation on retrieval augmented generation with unstructured data (such as system manuals), combined with agentic AI for diagnosing manufacturing system problems, offered a concrete use case. Given the tendency for these subjects to become abstract, the connected example provided practical insight into how components can function together in operational settings.

Digital Twins and Immersive Simulation

A notable announcement at the global event involved the partnership between SAS and Epic Games to create enhanced digital twins using Unreal Engine. This collaboration applies the same photorealistic 3D rendering technology used in Fortnite to industrial applications. Georgia-Pacific piloted this technology at its Savannah River Mill, which manufactures napkins, paper towels and toilet tissue. The facility was captured using RealityScan, Epic's mobile application, to create photorealistic renderings imported into Unreal Engine.

The application focused on optimising automated guided vehicle deployment and routing strategies. Rather than testing scenarios in the physical environment with associated costs and safety risks, the digital twin enables simulation of complex factory floor operations including AGV navigation, proximity alerts, obstacles and rare adverse events. SAS CTO Bryan Harris emphasised that digital twins should not only function like the real world but also look like it, enabling more accessible decision-making for frontline workers, engineers and machine operators beyond traditional data scientist roles.

The collaboration extends beyond visual fidelity. SAS developed a plugin connecting Unreal Engine to SAS Viya, enabling real-time data from simulated environments to fuel AI models that analyse, optimise and test industrial operations. This approach allows organisations to explore "what-if" scenarios virtually before implementing changes in physical facilities, potentially delivering cost savings whilst improving safety and operational efficiency.

Marketing Intelligence and Customer Respect

Content on SAS Customer Intelligence 360 addressed the platform's marketing decisioning capabilities, including next-best-offer functionality and real-time personalisation across channels. A notable emphasis concerned contact policies and rules that enable marketers to limit communication frequency, reflecting a strategic choice to respect customer attention rather than maximise message volume. This approach recognises that in environments characterised by notification saturation, demonstrating restraint can build trust and ensure greater engagement when communications do occur.

Financial Crime and Integrated Analytics

Presentations on financial crime addressed the value of integrated platforms that connect traditionally siloed functions such as fraud detection, anti-money laundering and sanctions screening. Network analytics capabilities enable identification of patterns and relationships across these domains that might otherwise remain hidden. Examples illustrated how seemingly routine alerts, when analysed within a comprehensive view of connected data, can reveal connections to significant criminal networks, transforming tactical operational issues into sources of strategic intelligence.

Data Lineage and Transformation Planning

Content on data lineage reframed this capability from a purely technical concern to a strategic tool for transformation planning. For large-scale modernisation initiatives, comprehensive mapping of data flows, transformations and dependencies provides the foundation for accurate effort estimation, budgetary planning and risk assessment. This visibility enables organisations to proceed with complex changes whilst maintaining confidence that critical downstream processes will not be inadvertently affected.

Development Practices and Migration Approaches

Sessions included content on using Bitbucket with SAS Viya to support continuous integration and continuous deployment pipelines for SAS code. Git formed the foundation of the approach, with supporting tools such as JQ. Given the current state of manual validation processes, this content addressed a genuine need for more robust validation methods for SAS macros used across clinical portfolios, where these activities can require several weeks and efficiency improvements would represent substantial value.

Another session provided detailed coverage of migrating from SAS 9 to SAS Viya, focusing on assessment methods for determining what requires migration and techniques for locating existing assets. The content reflected the reality that the discovery phase often constitutes the primary work effort rather than a preliminary step.

A presentation on implementing SAS Viya on-premises under restrictive security requirements described a solution requiring sustained collaboration with SAS over multiple years to achieve necessary modifications. This illustrated how certain deployments are defined primarily by governance, controls and assurance requirements rather than by product features.

Technical Fundamentals and Persistent Challenges

A hands-on session on data-driven output programming with SAS macros provided practical content with life sciences examples. Control tables and CALL EXECUTE represented familiar approaches, whilst the data step RESOLVE function offered new functionality worth exploring, particularly given its capability to work with macro expressions rather than being limited to macro variables in the manner of SYMGET.

A recurring theme across multiple contexts emphasised that poor data quality leads to ineffective AI and consequently to flawed decision-making. The technological environment evolves, but fundamental challenges persist. This consideration holds particular significance in probabilistic AI contexts, especially where SAS aims to incorporate deterministic elements into aspects of its AI offering.

London Event

Overview and Core Themes

The London edition of SAS Innovate 2025 on Tour demonstrated the pervasive influence of AI across the programme. The event concluded with Michael Wooldridge from the University of Oxford providing an overview of different categories of AI, offering conceptual grounding for a day when terminology and ambition frequently extended beyond current practical adoption.

The opening session presented SAS' recent offerings, maintaining consistency with content from the global event in Orlando whilst incorporating local perspectives. Trustworthiness, responsibility and governance emerged as prominent themes, particularly relevant given the current industry emphasis on innovation. A panel discussion included a brief exchange regarding the term "digital workforce", reflecting an awareness of the human implications that can be absent from wider industry discussion.

Life Sciences Stream Content

The Life Sciences stream focused heavily on AI, with presentations from AWS, AstraZeneca and IQVIA addressing the subject, followed by a panel discussion continuing this direction. The scale of technological change represents a tangible shift affecting all parts of the ecosystem. A presentation from a healthcare professional provided context regarding the operational environment within which pharmaceutical companies function. SAS CTO Bryan Harris expressed appreciation for pharmaceutical research and development work, an acknowledgement that appeared both substantive and appropriate to the setting.

Observations from selected sessions of SAS Innovate 2024

21st June 2024

SAS Innovate 2024 provided insight into evolving approaches to analytics modernisation, platform development and applied data science across multiple industries. This document captures observations from sessions addressing strategic platform migrations, unified analytics environments, enterprise integration patterns and practical applications in regulated sectors. The content reflects a discipline transitioning from experimental implementations to production-grade, business-critical infrastructure.

Strategic Platform Modernisation

A presentation from DNB Bank detailed the organisation's migration from SAS 9.4 to SAS Viya on Microsoft Azure. The strategic approach proved counter-intuitive: whilst SAS Viya supports both SAS and Python code seamlessly, DNB deliberately chose to rewrite their legacy SAS code library into Python. The rationale combined two business objectives. First, expanding the addressable talent market by tapping into the global Python developer pool. Second, creating a viable exit strategy from their primary analytics vendor, ensuring compliance with financial regulatory requirements to demonstrate realistic vendor transition options within 30 to 90 days.

This decision represents a fundamental shift in enterprise software value propositions. Competitive advantage no longer derives from creating vendor lock-in, but from providing powerful, stable and governed environments that fully embrace open-source tools. The winning strategy involves convincing customers to remain because the platform delivers undeniable value, not because departure presents insurmountable difficulty. This is something that signals the maturing of a market, where value flows through partnership rather than proprietary constraints.

Unified Analytics Environments

A healthcare analytics presentation addressed the persistent debate between low-code/no-code interfaces for business users and professional coding environments for data scientists. Two analysts tackled identical problems (predicting diabetes risk factors using a public CDC dataset) using different approaches within the same platform.

The low-code user employed SAS Viya's Model Studio, a visual interface. This analyst assessed the model for statistical bias against variables such as age and gender by selecting a configuration option, whereupon the platform automatically generated fairness statistics and visualisations.

The professional coder used SAS Viya Workbench, a code-first environment similar to Visual Studio Code. This analyst manually wrote code to perform identical bias assessments. However, direct code access enabled fine-tuning of variable interactions (such as age and cholesterol), ultimately producing a logistic regression model with marginally superior performance compared to the low-code approach.

The demonstration illustrated that the debate presents a false dichotomy. The actual value resides in unified platforms, enabling both personas to achieve exceptional productivity. Citizen data scientists can rapidly build and validate baseline models, whilst expert coders can refine those same models with advanced techniques and deploy them, all within a single ecosystem. This unified approach characterises disciplinary development, where focus shifts from tribal tool debates to collective problem-solving.

Analytics as Enterprise Infrastructure

Multiple architectural demonstrations illustrated analytics platforms evolving beyond sophisticated workbenches for specialists into the central nervous system of enterprise operations. Three distinct patterns emerged:

The AI Assistant Architecture: A demonstration featured a customer-facing AI assistant built with Azure OpenAI. When users interacted with the chatbot regarding credit risk, requests routed through Azure Logic App not to the large language model for decisions but to a SAS Intelligent Decisioning engine. The SAS engine functioned as the trusted decision core, executing business rules and models to generate real-time risk assessments, which returned to the chatbot for customer delivery. SAS provided not the interface but the automated decision engine.

The Digital Twin Pattern: A pharmaceutical use case described using historical data from penicillin manufacturing batches to train machine learning models. These models became digital twins of physical bioreactors. Rather than conducting costly and time-consuming physical experiments, researchers executed thousands of in silico simulated experiments, adjusting parameters in the model to discover the optimal recipe for maximising yield (the "Golden Batch").

The Microsoft 365 Automation Hub: A workflow demonstration showed SAS programmes functioning as critical nodes in Microsoft 365 ecosystems. The automated process involved SAS code accessing SharePoint folders, retrieving Excel files, executing analyses, generating new reports as Excel files and delivering those reports directly into Microsoft Teams channels for business users.

These patterns mark profound evolution. Analytics platforms are moving beyond sophisticated calculators for experts, becoming foundational infrastructure: the connective tissue enabling intelligent automation and integrating disparate systems such as cloud office suites, AI interfaces and industrial hardware into cohesive business processes. This evolution from specialised tool to core infrastructure clearly indicates analytics' growing maturity within enterprise contexts.

Applied Data Science in High-Stakes Environments

Whilst much data science narrative focuses on e-commerce recommendations or marketing optimisation, compelling applications tackle intensely human, high-stakes operational challenges. Heather Hallett, a former ICU nurse and healthcare industry consultant at SAS, presented on improving hospital efficiency.

She described the challenge of staffing intensive care units, where having appropriate nurse numbers with correct skills proves critical. Staffing decisions constitute "life and death decisions". Her team uses forecasting models (such as ARIMA) to predict patient demand and optimisation algorithms (including mixed-integer programming) to create optimal nurse schedules. The optimisation addresses more than headcount; it matches nurses' specific skills, such as certifications for complex assistive devices like intra-aortic balloon pumps, to forecasted needs of the sickest patients.

A second use case applied identical operational rigour to community care. Using the classic "travelling salesman solver" from optimisation theory, the team planned efficient daily routes for mobile care vans serving maximum numbers of patients in their homes, delivering essential services to those unable to reach hospitals easily.

These applications ground abstract concepts of forecasting and optimisation in deeply tangible human contexts. They demonstrate that beyond driving revenue or reducing costs, the ultimate purpose of data science and operational analytics can be directly improving and even saving human lives. This application of sophisticated mathematics to life preservation marks data science evolution from commercial tool to critical component of human-centred operations.

Transparency as Competitive Advantage

In highly regulated industries such as pharmaceuticals, generating trustworthy research proves paramount. A presentation from Japanese pharmaceutical company Shionogi detailed how they transform the transparency challenge in Real-World Evidence (RWE) into competitive advantage.

The core problem with RWE studies, which analyse data from sources such as electronic health records and insurance claims, involves their historical lack of standardisation and transparency compared to randomised clinical trials, leading regulators and peers to question validity. Shionogi's solution is an internal system called "AI SAS for RWE", addressing the challenge through two approaches:

Standardisation: The system transforms disparate Real-World Data from various vendors into a Shionogi-defined common data model based on OMOP principles, ensuring consistency where direct conversion of Japanese RWD proves challenging.

Semi-Automation: It semi-automates the entire analysis workflow, from defining research concepts to generating final tables, figures and reports.

The most innovative aspect involves its foundation in radical transparency. The system automatically records every research process step: from the initial concept suite where analysis is defined, through specification documents, final analysis programmes and resulting reports, directly into Git. This creates a complete, immutable and auditable history of exactly how evidence was generated.

This represents more than a clever technical solution; it constitutes profound strategic positioning. By building transparent, reproducible and efficient systems for generating RWE, Shionogi directly addresses core industry challenges. They work to increase research quality and trustworthiness, effectively transforming regulatory burden into competitive edge built on integrity. This move toward provable, auditable results hallmarks a discipline transitioning from experimental art to industrial-grade science.

User Experience as Productivity Multiplier

In complex data tool contexts, user experience (UX) has evolved beyond "nice-to-have" aesthetic features into a central product strategy pillar, directly tied to user productivity and talent acquisition. A detailed examination of the upcoming complete rewrite of SAS Studio illustrated this point.

The motivation for the massive undertaking proved straightforward: the old architecture was slow and becoming a drag on user productivity. The primary goal for the new version involved making a web-based application "feel like a desktop application" regarding speed and responsiveness. To achieve this, the team focused on improvements directly boosting productivity for coders and analysts:

A Modern Editor: Integrating the Monaco editor used in the widely popular Visual Studio Code, providing familiar and powerful coding experiences.

Smarter Assistance: Improving code completion and syntax help to reduce errors and time spent consulting documentation.

Better Navigation: Adding features such as code "mini-maps" enabling programmers to navigate thousands of lines of code instantly.

For modern technical software, UX has become a fundamental competitive differentiator. Faster, more intuitive and less frustrating tools do not merely improve existing user satisfaction; they enhance productivity. In competitive markets for top data science and engineering talent, providing a best-in-class user experience represents a key strategy for attracting and retaining exceptional people. The next leap in team productivity might derive not from new algorithms but from superior interfaces.

Conclusion

These observations from SAS Innovate 2024 illustrate a discipline maturing. Data science is moving beyond isolated experiments and "science projects", becoming pragmatic, integrated, transparent and deeply human business functionality. Focus shifts from algorithmic novelty to real-world application value (whether enabling better user experiences, building regulatory trust or making life-or-death decisions on ICU floors).

As analytics becomes more integrated and accessible, the challenge involves identifying where it might unexpectedly transform core processes within organisations, moving from specialist concern to foundational infrastructure enabling intelligent, automated and human-centred operations.

Expanding the coding toolkit: Adding R and Python in a changing landscape

10th April 2021

Over the years, I have taught myself a number of computing languages, with some coming in useful for professional work while others came in handy for website development and maintenance. The collection has grown to include HTML, CSS, XML, Perl, PHP and UNIX Shell Scripting. The ongoing pandemic allowed me to add two more to the repertoire: R and Python.

My interest in these arose from my work as an information professional concerned with standardisation and automation of statistical results delivery. To date, the main focus has been on clinical study data, but ongoing changes in the life sciences sector could mean that I may need to look further afield, so having extra knowledge never hurts. Though I have been a SAS programmer for more than twenty years, its predominance in the clinical research field is not what it was, which causes me to rethink things.

As it happens, I would like to continue working with SAS, since it does so much, and thoughts of leaving it behind bring sadness. It also helps to know what the alternatives might be, and to reject some management hopes about any newcomers, especially regarding the amount of code being produced and the quality of graphs being created. Use cases need to be assessed dispassionately, even when emotions loom behind the scenes.

Since both R and Python bring large scripting ecosystems with active communities, the attraction of adopting them makes a good deal of sense. SAS is comparable in the scale of its own ecosystem, though there are considerable differences, and the platform is catching up when it comes to data science. While the aforementioned open-source languages may have had a head start, it appears that others are not standing still either. It is a time to have wider awareness, and online conference attendance helps with that.

The breadth of what is available for any programming language more than stymies any attempt to create a truly all-encompassing starting point, and I have abandoned thoughts of doing anything like that for R. Similarly, I will not even try such a thing for Python. Consequently, my sharing of anything learned will take the form of discrete postings from time to time, especially given how easy it is to collect numerous website links for sharing.

The learning has been facilitated by ongoing pandemic restrictions, though things are opening up a little now. The pandemic has also given us public data that can be used for practice, since much can be gained from having one's own project instead of completing exercises from a book. Having an interesting dataset with which to work is a must, and COVID-19 data carry a certain self-interest as well, while one remains mindful of the suffering and loss of life that has been happening since the pandemic first took hold.

Generating PNG files in SAS using ODS Graphics

21st December 2019

Recently, I had someone ask me how to create PNG files in SAS using ODS Graphics, so I sought out the answer for them. Normally, the suggestion would have been to create RTF or PDF files instead, but there was a specific requirement that called for a different approach. Adding something like the following lines before an SGPLOT, SGPANEL or SGRENDER procedure should do the needful:

ods listing gpath='E:\';
ods graphics / imagename="test" imagefmt=png;

Here, the ODS LISTING statement declares the destination for the desired graphics file, while the ODS GRAPHICS statement defines the file name and type. In the above example, the file test.png would be created in the root of the E drive of a Windows machine. However, this also works with Linux or UNIX directory paths.
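
Pulling everything together, a minimal sketch using a dataset shipped with SAS might look like the following, with the output directory being a hypothetical Linux path:

ods listing gpath='/home/user/graphs';
ods graphics / imagename="heights" imagefmt=png;

proc sgplot data=sashelp.class;
    vbar age;
run;

Running this should leave a file named heights.png in the specified directory.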

Using NOT IN operator type functionality in SAS Macro

9th November 2018

For as long as I have been programming with SAS, there has been the ability to test whether a variable does or does not have one value from a list of values, in data step IF clauses or in WHERE clauses in both the data step and most if not all procedures. It was only within the last decade that its macro language got similar functionality, with one caveat that I recently uncovered: you cannot have a NOT IN construct. To get that, you need to go about things differently.

In the example below, you see the NOT operator being placed before the IN operator component that is enclosed in parentheses. If this is not done, SAS produces the error messages that caused me to look at SAS Usage Note 31322. Once I followed that approach, I was able to do what I wanted without resorting to older, more long-winded coding practices.

options minoperator;

%macro inop(x);
    %if not (&x in (a b c)) %then %do;
        %put Value is not included;
    %end;
    %else %do;
        %put Value is included;
    %end;
%mend inop;

%inop(a);
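
Incidentally, the MINOPERATOR option can also be set on an individual macro definition rather than globally, as this sketch of a variant illustrates; MINDELIMITER= makes the list delimiter explicit, though a space is the default anyway:

%macro inop2(x) / minoperator mindelimiter=' ';
    %if not (&x in (a b c)) %then %do;
        %put Value is not included;
    %end;
    %else %do;
        %put Value is included;
    %end;
%mend inop2;

%inop2(d);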

While running the above code should produce a similar result to one featured in another post on here, the logic is reversed. There are times when such an approach is needed; one is where a few possibilities are to be excluded from a larger set. Since programming often calls for inventive thinking, this may be one of those occasions.
