TOPIC: DATA WAREHOUSING
SAS Packages: Revolutionising code sharing in the SAS ecosystem
26th July 2025In the world of statistical programming, SAS has long been the backbone of data analysis for countless organisations worldwide. Yet, for decades, one of the most significant challenges facing SAS practitioners has been the efficient sharing and reuse of code. Knowledge and expertise have often remained siloed within individual developers or teams, creating inefficiencies and missed opportunities for collaboration. Enter the SAS Packages Framework (SPF), a solution that changes how SAS professionals share, distribute and utilise code across their organisations and the broader community.
The Problem: Fragmented Knowledge and Complex Dependencies
Anyone who has worked extensively with SAS knows the frustration of trying to share complex macros or functions with colleagues. Traditional code sharing in SAS has been plagued by several issues:
- Dependency nightmares: A single macro often relies on dozens of utility macros working behind the scenes, making it nearly impossible to share everything needed for the code to function properly
- Version control chaos: Keeping track of which version of which macro works with which other components becomes an administrative burden
- Platform compatibility issues: Code that works on Windows might fail on Linux systems and vice versa
- Lack of documentation: Without proper documentation and help systems, even the most elegant code becomes unusable to others
- Knowledge concentration: Valuable SAS expertise remains trapped within individuals rather than being shared with the broader community
These challenges have historically meant that SAS developers spend countless hours reinventing the wheel, recreating functionality that already exists elsewhere in their organisation or the wider SAS community.
The Solution: SAS Packages Framework
The SAS Packages Framework, developed by Bartosz Jabłoński, represents a paradigm shift in how SAS code is organised, shared and deployed. At its core, a SAS package is an automatically generated, single, standalone zip file containing organised and ordered code structures, extended with additional metadata and utility files. This solution addresses the fundamental challenges of SAS code sharing by providing:
- Functionality over complexity: Instead of worrying about 73 utility macros working in the background, you simply share one file and tell your colleagues about the main functionality they need to use.
- Complete self-containment: Everything needed for the code to function is bundled into one file, eliminating the "did I remember to include everything?" problem that has plagued SAS developers for years.
- Automatic dependency management: The framework handles the loading order of code components and automatically updates system options like
cmplib=
andfmtsearch=
for functions and formats. - Cross-platform compatibility: Packages work seamlessly across different operating systems, from Windows to Linux and UNIX environments.
Beyond Macros: A Spectrum of SAS Functionality
One of the most compelling aspects of the SAS Packages Framework is its versatility. While many code-sharing solutions focus solely on macros, SAS packages support a wide range of SAS functionality:
- User-defined functions (both FCMP and CASL)
- IML modules for matrix programming
- PROC PROTO C routines for high-performance computing
- Custom formats and informats
- Libraries and datasets
- PROC DS2 threads and packages
- Data generation code
- Additional content such as documentation PDF's
This comprehensive approach means that virtually any SAS functionality can be packaged and shared, making the framework suitable for everything from simple utility macros to complex analytical frameworks.
Real-World Applications: From Pharmaceutical Research to General Analytics
The adoption of SAS packages has been particularly notable in the pharmaceutical industry, where code quality, validation and sharing are critical concerns. The PharmaForest initiative, led by PHUSE Japan's Open-Source Technology Working Group, exemplifies how the framework is being used to revolutionise pharmaceutical SAS programming. PharmaForest offers a collaborative repository of SAS packages specifically designed for pharmaceutical applications, including:
- OncoPlotter: A comprehensive package for creating figures commonly used in oncology studies
- SAS FAKER: Tools for generating realistic test data while maintaining privacy
- SASLogChecker: Automated log review and validation tools
- rtfCreator: Streamlined RTF output generation
The initiative's philosophy perfectly captures the spirit of the SAS Packages Framework: "Through SAS packages, we want to actively encourage sharing of SAS know-how that has often stayed within individuals. By doing this, we aim to build up collective knowledge, boost productivity, ensure quality through standardisation and energise our community".
The SASPAC Archive: A Growing Ecosystem
The establishment of SASPAC (SAS Packages Archive) represents the maturation of the SAS packages ecosystem. This dedicated repository serves as the official home for SAS packages, with each package maintained as a separate repository complete with version history and documentation. Some notable packages available through SASPAC include:
- BasePlus: Extends BASE SAS with functionality that many developers find themselves wishing was built into SAS itself. With 12 stars on GitHub, it's become one of the most popular packages in the archive.
- MacroArray: Provides macro array functionality that simplifies complex macro programming tasks, addressing a long-standing gap in SAS's macro language capabilities.
- SQLinDS: Enables SQL queries within data steps, bridging the gap between SAS's powerful data step processing and SQL's intuitive query syntax.
- DFA (Dynamic Function Arrays): Offers advanced data structures that extend SAS's analytical capabilities.
- GSM (Generate Secure Macros): Provides tools for protecting proprietary code while still enabling sharing and collaboration.
Getting Started: Surprisingly Simple
Despite the capabilities, getting started with SAS packages is fairly straightforward. The framework can be deployed in multiple ways, depending on your needs. For a quick test or one-time use, you can enable the framework directly from the web:
filename packages "%sysfunc(pathname(work))";
filename SPFinit url "https://raw.githubusercontent.com/yabwon/SAS_PACKAGES/main/SPF/SPFinit.sas";
%include SPFinit;
For permanent installation, you simply create a directory for your packages and install the framework:
filename packages "C:SAS_PACKAGES";
%installPackage(SPFinit)
Once installed, using packages becomes as simple as:
%installPackage(packageName)
%helpPackage(packageName)
%loadPackage(packageName)
Developer Benefits: Quality and Efficiency
For SAS developers, the framework offers numerous advantages that go beyond simple code sharing:
- Enforced organisation: The package development process naturally encourages better code organisation and documentation practices.
- Built-in testing: The framework includes testing capabilities that help ensure code quality and reliability.
- Version management: Packages include metadata such as version numbers and generation timestamps, supporting modern DevOps practices.
- Integrity verification: The framework provides tools to verify package authenticity and integrity, addressing security concerns in enterprise environments.
- Cherry-picking: Users can load only specific components from a package, reducing memory usage and namespace pollution.
The Future of SAS Code Sharing
The growing adoption of SAS packages represents more than just a new tool, it signals a fundamental shift towards a more collaborative and efficient SAS ecosystem. The framework's MIT licensing and 100% open-source nature ensure that it remains accessible to all SAS users, from individual practitioners to large enterprise installations. This democratisation of advanced code-sharing capabilities levels the playing field and enables even small teams to benefit from enterprise-grade development practices.
As the ecosystem continues to grow, with contributions from pharmaceutical companies, academic institutions and individual developers worldwide, the SAS Packages Framework is proving that the future of SAS programming lies not in isolated development, but in collaborative, community-driven innovation.
For SAS practitioners looking to modernise their development practices, improve code quality and tap into the collective knowledge of the global SAS community, exploring SAS packages isn't just an option, it's becoming an essential step towards more efficient and effective statistical programming.
Expanding the coding toolkit: Adding R and Python in a changing landscape
10th April 2021Over the years, I have taught myself a number of computing languages, with some coming in useful for professional work while others came in handy for website development and maintenance. The collection has grown to include HTML, CSS, XML, Perl, PHP and UNIX Shell Scripting. The ongoing pandemic allowed to me add two more to the repertoire: R and Python.
My interest in these arose from my work as an information professional concerned with standardisation and automation of statistical results delivery. To date, the main focus has been on clinical study data, but ongoing changes in the life sciences sector could mean that I may need to look further afield, so having extra knowledge never hurts. Though I have been a SAS programmer for more than twenty years, its predominance in the clinical research field is not what it was, which causes me to rethink things.
As it happens, I would like to continue working with SAS since it does so much and thoughts of leaving it after me bring sadness. It also helps to know what the alternatives might be and to reject some management hopes about any newcomers, especially regarding the amount of code being produced and the quality of graphs being created. Use cases need to be assessed dispassionately, even when emotions loom behind the scenes.
Since both R and Python bring large scripting ecosystems with active communities, the attraction of their adoption makes a deal of sense. SAS is comparable in the scale of its own ecosystem, though there are considerable differences and the platform is catching up when it comes to Data Science. While the aforementioned open-source languages may have had a head start, it appears that others are not standing still either. It is a time to have wider awareness, and online conference attendance helps with that.
The breadth of what is available for any programming language more than stymies any attempt to create a truly all encompassing starting point, and I have abandoned thoughts of doing anything like that for R. Similarly, I will not even try such a thing for Python. Consequently, this means that my sharing of anything learned will be in the form of discrete postings from time to time, especially given ho easy it is to collect numerous website links for sharing.
The learning has been facilitated by ongoing pandemic restrictions, though things are opening up a little now. The pandemic also has given us public data that can be used for practice, since much can be gained from having one's own project instead of completing exercises from a book. Having an interesting data set with which to work is a must, and COVID-19 data contain a certain self-interest as well, while one remains mindful of the suffering and loss of life that has been happening since the pandemic first took hold.
Generating PNG files in SAS using ODS Graphics
21st December 2019Recently, I had someone ask me how to create PNG files in SAS using ODS Graphics, so I sought out the answer for them. Normally, the suggestion would have been to create RTF or PDF files instead, but there was a specific need that needed a different approach. Adding something like the following lines before an SGPLOT
, SGPANEL
or SGRENDER
procedure should do the needful:
ods listing gpath='E:\';
ods graphics / imagename="test" imagefmt=png;
Here, the ODS LISTING
statement declares the destination for the desired graphics file, while the ODS GRAPHICS
statement defines the file name and type. In the above example, the file test.png would be created in the root of the E drive of a Windows machine. However, this also works with Linux or UNIX directory paths.
Using NOT IN operator type functionality in SAS Macro
9th November 2018For as long as I have been programming with SAS, there has been the ability to test if a variable does or does not have one value from a list of values in data step IF clauses or WHERE clauses in both data step and most if not all procedures. It was only within the last decade that its Macro language got similar functionality, with one caveat that I recently uncovered: you cannot have a NOT IN construct. To get that, you need to go about things differently.
In the example below, you see the NOT operator being placed before the IN operator component that is enclosed in parentheses. If this is not done, SAS produces the error messages that caused me to look at SAS Usage Note 31322. Once I followed that approach, I was able to do what I wanted without resorting to older, more long-winded coding practices.
options minoperator;
%macro inop(x);
%if not (&x in (a b c)) %then %do;
%put Value is not included;
%end;
%else %do;
%put Value is included;
%end;
%mend inop;
%inop(a);
While running the above code should produce a similar result to another featured on here in another post, the logic is reversed. There are times when such an approach is needed. One is where a few possibilities are to be excluded from a larger number of possibilities. Since programming often involves more inventive thinking, this may be one of those.
Presenting more than one plot on a page using SAS ODS PDF
12th November 2012If you had asked me about getting two or more graphs on a page using SAS/GRAPH procedures, I might have suggested PROC GREPLAY
as the means to achieve it. However, I recently came across another way to do the same thing by using ODS. It helped that the graphs were being produced using the PDF destination because I doubt that what follows will work with the RTF one.
For this three plots on a page example, I first set the orientation to landscape so that the plots can be arranged side by side in a single row:
options orientation=landscape;
Next, the PDF destination was opened with page breaks turned off for the required output file using the STARTPAGE
option:
ods pdf file="c:\test.pdf" startpage=off;
The listing destination was turned off as well since it is not needed:
ods listing close;
With that complete, a page or region break gets inserted. While this could have been repeated before every procedure to get it popped into the next region on the page, that is the default behaviour for any extra procedural step, so it wasn't needed.
ods pdf startpage=now;
Then, the ODS LAYOUT feature is started so that the layout can be defined on the page:
ods layout start;
For the first plot and the one at the left of the triptych, a region was defined absolutely (grid layouts are available, but I didn't make use of them here) using ODS REGION. Since all plots were to be of the same size, the width was defined as being a third of the page and the bottom left-hand corner of the region defined to be the same as that of the plot area on the page. Titles and footnotes usefully lie outside this region in the way that SAS arranges things, so there is no further messing. With the region defined, it's a matter of running the required SAS/GRAPH procedure. In my case, this was GPLOT
, but I am certain that others would work as well. The height was defined as the full possible plot height. This could have a use if I wanted more than one row of graphs on a page, with a trellis plot being an example of that sort of arrangement.
ods region x=0pct y=0pct width=33pct height=100pct;
<< SAS/GRAPH Procedure >>
For the middle plot, the starting position is moved a third of the way along the page, while the section area has the same dimensions as before. Using percentages in these definitions does make their usage easier.
ods region x=33pct y=0pct width=33pct height=100pct;
<< SAS/GRAPH Procedure >>
Lastly, the right-hand plot has a starting position two-thirds of the width of the page and the other dimensions are as per the other panels:
ods region x=66pct y=0pct width=33pct height=100pct;
<< SAS/GRAPH Procedure >>
With the last graph created, it is time to close ODS LAYOUT and the PDF destination. Then, the listing destination is reopened again.
ods layout end;
ods pdf close;
ods listing;
Update 2012-12-08: Since writing the above, I have learned that ODS LAYOUT and ODS REGION have yet to become production features of SAS with 9.3 as the latest version. However, I have encountered an alternative that uses the STARTPAGE=NEVER
ODS PDF option to turn off page breaks and GOPTIONS
statements to control the regions occupied by plots. It's Sample 48569 on the SAS website. Having a production equivalent is better, since pre-production features are best avoided in production code. If I had realised the status, I would have used PROC GREPLAY
to achieve what I needed to do.
WARNING: Engine XPORT does not support SORTEDBY operations. SORTEDBY information cannot be copied.
24th July 2012When recently creating a transport file using PROC COPY
and the XPORT
library engine, I found the above message in the log. The code used was similar to the following:
libname tran xport "c:\temp\tran.xpt";
proc copy in=data out=tran;
run;
When I went seeking out the cause on the web, I discovered this SAS Note that dates from before the release of SAS 6.12, putting the issue at more than ten years old. My take on its continuing existence is that we still to use a transport file format that was introduced in SAS 5.x for the sake of interoperability, both between SAS versions and across alternatives to the platform.
The SORTEDBY
flag in a dataset header holds the keys used to sort the data, and it isn't being copied into the XPORT
transport files, hence the warning. To get rid of it, you need to remove the information manually in the data step using the SORTEDBY
option on the DATA
statement or using PROC DATASETS
, which avoids rewriting the entire data set.
First up is the data step option:
data test(sortedby=_null_);
set sashelp.class;
run;
Then, there's PROC DATASETS:
proc datasets;
modify test(sortedby=_null_);
run;
quit;
It might seem counterproductive to exclude the information, but it makes no sense to keep what's being lost anyway. So long as the actual sort order is unchanged, and I believe that the code that that below will not alter it, we can live with its documentation in a specification until transport files created using PROC CPORT
are as portable as those from PROC COPY
.
ERROR: Ambiguous reference, column xx is in more than one table.
5th May 2012Sometimes, SAS messages are not all that they seem, and a number of them are issued from PROC SQL when something goes awry with your code. In fact, I got a message like the above when ordering the results of the join using a variable that didn't exist in either of the datasets that were joined. This type of thing has been around for a while (I have been using SAS since version 6.11, and it was there then) and it amazes me that we haven't seen a better message in more recent versions of SAS; it was SAS 9.2 where I saw it most recently.
proc sql noprint;
select a.yy, a.yyy, b.zz
from a left join b
on a.yy=b.yy
order by xx;
quit;
Creating placeholder graphics in SAS using PROC GSLIDE for when no data are available
18th March 2012Recently, I found myself with a plot to produce, but there were no data to be presented, so a placeholder output was needed. For a listing or a table, this is a matter of detecting if there are observations to be listed or summarised and then issuing a placeholder listing using PROC REPORT if there are no data available. Using SAS/GRAPH, something similar can be achieved using one of its curiosities.
In the case of SAS/GRAPH, PROC GSLIDE
looks like the tool to user for the same purpose. The procedure does get covered as part of a SAS Institute SAS/GRAPH training course, but they tend to gloss over it. After all, there is little reason to go creating presentations in SAS when PowerPoint and its kind offer far more functionality. However, it would make an interesting tale to tell how GSLIDE
became part of SAS/GRAPH in the first place. Its existence makes me wonder if it pre-exists the main slideshow production tools that we use today.
The code that uses PROC GSLIDE
to create a placeholder graphic is as follows (detection of the number of observations in a SAS dataset is another entry on here):
proc gslide;
note height=10;
note j=center "No data are available";
run;
quit;
PROC GSLIDE
is one of those run group procedures in SAS so a QUIT statement is needed to close it. The NOTE statements specify the text to be added to the graphic. The first of these creates a blank line of the required height for placing the main text in the middle of the graphic. It is the second one that adds the centred text that tells users of the generated output what has happened.
Dealing with variable length warnings in SAS 9.2
11th January 2012A habit of mine is to put a LENGTH
or ATTRIB
statement between DATA and SET statements in a SAS data step to reset variable lengths. By default, it appears that this triggers truncation warnings in SAS 9.2 or SAS 9.3 when it didn't in previous versions. SAS 9.1.3, for instance, allowed you to have something like the following for shortening a variable length without issuing any messages at all:
data b;
length x $100;
set a;
run;
In this case, x could have a length of 200 previously and SAS 9.1.3 wouldn't have complained. Now, SAS 9.2 and 9.3 will issue a warning if the new length is less than the old length. This can be useful to know, but it can be changed using the VARLENCHK
system option. Though the default value is WARN, it can be set to ERROR if you really want to ensure that there is no chance of truncation. Then, you get error messages and the program fails where it normally would run with warnings. Setting the value of the option to NOWARN
restores the type of behaviour seen in SAS 9.1.3 and versions before that.
The SAS documentation says that the ability to change VARLENCHK
can be restricted by an administrator, so you might need to deal with this situation in a more locked down environment. Then, one option would be to do something like the following:
data b;
drop x;
rename _x=x;
set a;
length _x $100;
_x=strip(x);
run;
While It's a bit more laborious than setting the VARLENCHK
option to NOWARN
, the idea is that you create a new variable of the right length and replace the old one with it. That gets rid of warnings or errors in the log and resets the variable length as needed. Of course, you have to ensure that there is no value truncation with either remedy. If any is found, then the dataset specification probably needs updating to accommodate the length of the values in the data. After all, there is no substitute for getting to know your data and doing your own checking should you decide to take matters into your hands.
There is a use for the default behaviour, though. If you use a specification to specify a shell for a dataset, then you will be warned when the shell shortens variable lengths. That allows you to either adjust the dataset or your program. Also, it provides additional information when you get variable length mismatch warnings when concatenating or merging datasets. There was a time when SAS wasn't so communicative in these situations and some investigation was needed to establish which variable was affected. Now, that has changed without leaving the option to work differently if you so do desire. Sometimes, what can seem like an added restriction can have its uses.
Setting VIEWTABLE to show column names in SAS
15th September 2011The following is the default behaviour in the DMS: Base SAS opens datasets from its Explorer using VIEWTABLE
and with variable labels in the column headings and not variable names. Because I have been fortunate to use systems with SAS/FSP both installed and licensed, I have taken to using FSVIEW
for browsing SAS datasets as a workaround and, though the interface may look old to some, it proves to be a very flexible tool that still has a few things to teach newer ones. With SAS Enterprise Guide, the dataset viewing functionality is different to both VIEWTABLE
and FSVIEW
, but I have been to make it work for me. While the SAS EG dataset viewing tool may appear like the former of these, it has a few tricks to teach its forbear.
Now that I find myself working again with the traditional SAS DMS interface and without SAS/FSP, I decided to see if there was a way to get VIEWTABLE
to display variable names instead of variable labels by default. As it happened, the answer was found in an internet forum discussion. From the SAS command line, you can achieve the result by issuing a command like the following:
VT SASHELP.VCOLUMN COLHEADING=NAMES
Above VT
is the VIEWTABLE
shortcut, while it is the COLHEADING=NAMES
option on the line that gets variable names shown in column headings. Taking it further, you can set this as the default setting for datasets opened using a mouse from Explorer panes using the following procedure:
- Click in or on the Explorer pane to highlight the Explorer window.
- Select Tools > Options > Explorer in the menus.
- Select the Members tab.
- Double-click on the TABLE icon.
- Double-click on the
&Open
action. - Set the Action command to:
VIEWTABLE %8b.'%s'.DATA COLHEADING=NAMES
. - Click on the Set Default button.
- Save changes and close the Explorer Options window.
Because the DMS looks similar across versions 8.0 through to 9.2, the above instructions should be relevant to all of those. While I have yet to get the opportunity to use SAS 9.3, I would be surprised to find that the traditional SAS interface has changed there too, even though much else has changed about SAS. In fact, the latest version of SAS has brought quite a few interesting new features for programmers, so it appears that you can do more through a familiar interface, not entirely a bad thing. It looks as if this VIEWTABLE
tweak could be useful for a while yet.