Until the paperback versions are published on Lulu.com, the PDF ebooks downloaded from my blog site will be discounted, but prices will be adjusted as soon as the paperbacks have been published. This will probably be in early July 2024, after the physical printed proofs have been approved.
The pandemic has impacted us all in many ways, and my presenting has definitely been severely curtailed. However, even though the pandemic has not yet been defeated, I was determined to make my voice heard and my face seen, so I’ve accepted 2 presentation opportunities in the next 2 months:
01Apr2021: SUGUKI have asked me to present “Writing Reusable Macros: Managing SAS Data Sets” at a lunchtime webinar from 1215-1245 (British Summer Time = GMT+1). More details can be found at https://www.meetup.com/SUGUKI, and the Zoom call is limited to 100, so early registration is recommended.
20May2021: Virtual SAS Global Forum 2021 will be including a Premium Session video presentation of “How Many Shades of Guide: SAS Enterprise Guide to 8.3 and SAS Studio to 3.81 with SAS 9.4: Part 1 – SAS Enterprise Guide”. The paper includes the history of both EG and SAS Studio, but time limits necessitated the paper be split into 2 presentations, and this one will be Part 1 only. Look out for Peedy!
I’m not yet certain when the video presentation will be made public, but I’ll be having a live Q&A chat on 20May2021. Keep an eye on the SAS Global Forum web site for more details about when and how to join me.
I’m hoping to publish the Part 2 video, which will look at the history of SAS Studio and a comparison with EG, later this year, probably on my blog site.
This is a project to read the daily Johns Hopkins COVID-19 data and visualise the national infection and fatality trends using Base SAS and SAS/STAT:
Download the GitHub Desktop software from https://desktop.github.com/ and install it on your computer where you will be running SAS Studio or SAS University Edition. For instructions on how to install SAS University Edition on your own computer please read my blog post “Are you learning about SAS?”.
Clone the Johns Hopkins COVID-19 data at https://github.com/CSSEGISandData/COVID-19, and then Pull the latest data, using GitHub Desktop. This will reduce the time needed to download all of the latest data each time you run the SAS Studio project, as a simple and quick Pull request in GitHub Desktop is all that is required each time.
Open the CPF project file in SAS Studio (requires Base SAS and SAS/STAT) or SAS University Edition (making certain you have first created the Shared Folder(s) pointing to where your GitHub files and CPF project file are stored).
Update the “run first” program to include your GitHub file folder in the &_dir macro variable assignment. The CSV files we will be using can be found in the /csse_covid_19_data/csse_covid_19_daily_reports folder.
Submit each program in the order given below (or submit all of the programs in the project’s flow together):
(1) “run first” assigns the location of the data to the &_dir macro variable.
(2) “Read CSV files” creates the SAS data sets in WORK by reading all of the CSV files in the csse_covid_19_daily_reports folder, and then summarises the records by Country_Region to remove the finer detail in the daily reports.
(3) “Calculate regression lines” generates the regression lines for confirmed cases between 100 and 10,000, and deaths between 10 and 1,000, to include on the graphs. The regression lines appear to be straight in the semi-log plots, but are actually exponential to match the initial growth of confirmed cases, so that “flattening” of the curves can be identified more easily.
(4) “Semi-log plots of confirmed vs deaths” generates the graphs for countries where COVID-19 has had more than 1,000 confirmed cases or more than 100 deaths.
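As a rough illustration of the idea behind step (3), the sketch below fits a straight line to log10 of the counts and then back-transforms it, which is why the fitted line looks straight on a semi-log plot but is exponential on the original scale. This is not the code from the project flow: the data are synthetic, and the data set and variable names (work.demo, work.window, work.fit, work.fitted, day, confirmed, fitted_confirmed) are made up for this example. PROC REG requires SAS/STAT.

```sas
/* Synthetic illustration only: counts that grow tenfold every 10 days. */
data work.demo;
  do day = 0 to 30;
    confirmed = round(100 * 10 ** (day / 10));
    output;
  end;
run;

/* Keep the 100 to 10,000 window and move to the log10 scale. */
data work.window;
  set work.demo;
  where 100 <= confirmed <= 10000;
  log10_confirmed = log10(confirmed);
run;

/* Straight-line fit on the log scale; OUTEST= stores the coefficients. */
proc reg data=work.window outest=work.fit noprint;
  model log10_confirmed = day;
run;
quit;

/* Back-transform the fitted line so it can be overlaid on the counts. */
data work.fitted;
  if _n_ = 1 then set work.fit(keep=Intercept day rename=(day=slope));
  set work.demo;
  fitted_confirmed = 10 ** (Intercept + slope * day);
run;

/* Semi-log plot: observed counts as points, fitted line as a series. */
proc sgplot data=work.fitted;
  scatter x=day y=confirmed;
  series x=day y=fitted_confirmed;
  yaxis type=log logbase=10;
run;
```

With real data the line is fitted only to the early growth window, so any later points falling below the extended line show where the curve is “flattening”.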
Some questions for you to answer:
(a) Where could my “Read CSV files” program be improved?
(b) Why is the US graph split at around 20Mar2020? Is this a problem with the data or my program?
(c) Are all of the cases being included?
This project is open to SAS programmers and to researchers. Follow the above instructions yourself, and then see if you can improve my SAS code by answering the questions.
Please send your saved SAS Studio flow containing your improved versions of the SAS programs to phil@hollandnumerics.org.uk. Anyone providing improvements that can be incorporated will be added to the credits for this project.
Kaggle are running a competition to develop a Python or R application to filter the vast collection of medical research papers that are being published every day.
The CORD-19 dataset represents the most extensive machine-readable coronavirus literature collection available for data mining to date. This allows the worldwide AI research community the opportunity to apply text and data mining approaches to find answers to questions within, and connect insights across, this content in support of the ongoing COVID-19 response efforts worldwide. There is a growing urgency for these approaches because of the rapid increase in coronavirus literature, making it difficult for the medical community to keep up.
Many of these questions are suitable for text mining, and they are encouraging researchers to develop text mining tools to provide insights on these questions.
This dataset was created by the Allen Institute for AI in partnership with the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft Research, and the National Library of Medicine – National Institutes of Health, in coordination with The White House Office of Science and Technology Policy.
I am not a Python or R programmer, but a SAS programmer, so I decided to make use of the freely available dataset and try to develop a simple data mining application in SAS instead, which I would like to publish to the benefit of the fight against COVID-19. I have now created a basic framework, which I am opening up to the SAS-programming community to test, improve and enhance as a saved SAS Studio flow (*.cpf), which can be imported into a Single-User SAS Studio installation, or into a SAS University Edition installed on a PC (see my blog post “Are you learning about SAS?” for details about how to install this version of SAS), as these installations can directly access files on your own computer.
The SAS programs in the SAS Studio flow are as follows:
run first:
Assigns the location of the downloaded Kaggle dataset into &_dir.
This macro variable will need to be edited to match the location of the folder where you have downloaded and extracted the CORD-19 dataset. SAS University Edition users will also need to assign a shared folder pointing to this location.
Includes %create_json_extract_script used in the Read json xxx programs below to read the JSON files containing the selected research papers into SAS, and then print out the contents of the paper.
If more non-printable characters are present that are not catered for in this macro, then additional TRANWRD() statements will need to be added here.
If anyone can devise a more elegant solution than using multiple TRANWRD() statements to convert unicode (\u9999) strings to printable 8-bit ASCII values, then I will welcome tested suggestions.
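One possible starting point, offered as a sketch to try rather than a drop-in replacement for the code in %create_json_extract_script: the UNICODE() function with the 'ESC' type converts \uXXXX escape sequences into the current SAS session encoding in a single call, and PRXCHANGE() can then substitute a placeholder for any escape sequences that are still present. The data set and variable names (work.demo_text, raw, clean) and the sample string are invented for illustration, and how characters outside the session encoding are handled is an assumption that should be checked on your own installation.

```sas
data work.demo_text;
  length raw clean $200;
  /* Made-up sample text containing \uXXXX escape sequences. */
  raw = 'caf\u00e9 na\u00efve \u00b5g';
  /* 'ESC' tells UNICODE() to look for \uXXXX escape sequences. */
  clean = unicode(raw, 'ESC');
  /* Replace any escape sequences still left with a visible placeholder. */
  clean = prxchange('s/\\u[0-9a-f]{4}/?/i', -1, clean);
  put raw= / clean=;
run;
```

The result depends on the session encoding, so the log output should be compared between, for example, a wlatin1 and a UTF-8 session before relying on it.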
Read metadata:
Reads the CSV file containing the metadata about the research papers, including the abstract, which can be searched, and the location(s) of the related paper(s). SAS data set created = work.metadata.
Filter metadata 31Dec19:
Filters the extracted metadata to only include papers published in or after December 2019. SAS data set created = work.Since_31Dec19.
Filter metadata xxx:
Filters the SAS data set (work.Since_31Dec19) created by Filter metadata 31Dec19 to select paper abstracts containing specific keywords. SAS data set created = work.Since_31Dec19_xxx:
xxx=infect: WHERE INDEX(lowcase(abstract), 'infect') AND INDEX(lowcase(abstract), 'rate') AND INDEX(lowcase(abstract), 'age') AND (INDEX(lowcase(abstract), 'hcov') OR INDEX(lowcase(abstract), '-cov') OR INDEX(lowcase(abstract), 'covid'));
xxx=cured: WHERE INDEX(lowcase(abstract), 'cured') AND INDEX(lowcase(abstract), 'rate') AND INDEX(lowcase(abstract), 'age') AND (INDEX(lowcase(abstract), 'hcov') OR INDEX(lowcase(abstract), '-cov') OR INDEX(lowcase(abstract), 'covid'));
xxx=fatal: WHERE INDEX(lowcase(abstract), 'fatal') AND INDEX(lowcase(abstract), 'rate') AND INDEX(lowcase(abstract), 'age') AND (INDEX(lowcase(abstract), 'hcov') OR INDEX(lowcase(abstract), '-cov') OR INDEX(lowcase(abstract), 'covid'));
xxx=recover: WHERE INDEX(lowcase(abstract), 'recover') AND INDEX(lowcase(abstract), 'rate') AND INDEX(lowcase(abstract), 'age') AND (INDEX(lowcase(abstract), 'hcov') OR INDEX(lowcase(abstract), '-cov') OR INDEX(lowcase(abstract), 'covid'));
Read json xxx:
Prints all of the papers in the filtered abstracts to HTML using the metadata in the SAS data set (work.Since_31Dec19_xxx) created by Filter metadata xxx.
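The four Filter metadata xxx programs differ only in their first keyword, so one way to reduce the maintenance would be to generate them from a single macro. The following is a minimal sketch, assuming work.Since_31Dec19 already exists and contains the abstract variable. The macro name %filter_metadata and its keyword parameter are hypothetical and are not part of the flow.

```sas
/* Hypothetical helper: builds work.Since_31Dec19_<keyword>.        */
/* The keyword must be lower case, as it is compared with lowcase(). */
%macro filter_metadata(keyword);
  data work.Since_31Dec19_&keyword;
    set work.Since_31Dec19;
    where index(lowcase(abstract), "&keyword")
      and index(lowcase(abstract), 'rate')
      and index(lowcase(abstract), 'age')
      and (index(lowcase(abstract), 'hcov')
        or index(lowcase(abstract), '-cov')
        or index(lowcase(abstract), 'covid'));
  run;
%mend filter_metadata;

%filter_metadata(infect)
%filter_metadata(cured)
%filter_metadata(fatal)
%filter_metadata(recover)
```

Adding a new search term then needs only one more macro call, instead of another copy of the whole WHERE clause.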
This project is open both to SAS programmers and to researchers. Please download the CORD-19 dataset and my SAS Studio flow. Try it out yourself, and then see if you can improve the performance, usability, flexibility or maintenance of my SAS code.
Please send your saved SAS Studio flow containing your improved versions of the SAS programs to phil@hollandnumerics.org.uk. Anyone providing improvements that can be incorporated will be added to the credits for this project.
The SAS course and the SAS Programming Forum continue to grow. I have just added some new course sections and topics about Data Steps, Base SAS Procedures, PROC SQL, SAS Macros, SAS Enterprise Guide and SAS Studio, and there are now 54 topics in 7 different sections:
A. SAS components – 2 topics
B. Data Steps – 14 topics (1 new topic)
C. Base SAS Procedures – 6 topics (new section)
F. PROC SQL – 15 topics (1 new topic)
G. SAS Macros – 15 topics (11 new topics)
N. SAS Enterprise Guide – 1 topic (new section)
O. SAS Studio – 1 topic (new section)
More topics and sections are being developed, so register for free now to be kept up-to-date about all of the news, so you can take advantage of the Programmer level when it suits you best!
I’m presenting “How Many Shades of Guide: SAS Enterprise Guide to 8.1 and SAS Studio to 3.8 with SAS 9.4” at SAS Global Forum 2020 in Washington DC from 12:30pm to 1:30pm on Wednesday 1st April.
I’ve been using EG since 2001 when the version was 1.1.1, so I thought it would be a good idea to gather together all my EG and SAS Studio conference presentations with their assorted screenshots, and try to explain why each application works the way it does. In fact my EG installation has been updated since my paper was accepted, so I’ll actually be talking about EG 8.2 too!
I know some of you who are planning to attend SASGF 2020 may be thinking about starting your journeys home before my presentation starts. I would like to see everyone in the audience, but I do understand the pressures of transportation around DC, so, for you and anyone who can’t attend, I’ll be in a live-streaming session. Hopefully my session will also be recorded, so anyone who wants to learn more about EG and SAS Studio can watch me later.
For those of you who are attending SASGF please look out for me in the Quad, and elsewhere in the conference, and say Hi! I’ll try to post a link to my session here after the conference finishes (assuming it is recorded, of course!).
I was considering attending PharmaSUG China at the end of August 2019, but I’ve been told that a seminar held previously in China on SAS programming efficiency had a low attendance, as programmers there are relatively young, so they like to learn techniques on their own, or take classes on topics that they cannot learn from the internet. However, they prefer challenging topics, which are hard to learn on their own.
I have now decided to delay what could be my one and only visit to China until 2020, and use the extra preparation time to find out a little more about what SAS programmers in China would be most interested in.
Therefore, please could you help me by answering this quick poll about the 1/2 day training sessions I currently provide. The answers will guide me to the best package to offer to PharmaSUG China 2020. Links to most of the training courses can be found below the poll.
If you have not yet voted and can view the poll results, but the Vote button is grey, your IP address may already have been used to vote on this poll. This is in fact quite common when viewing blog posts from a company PC, so I would therefore recommend that you try voting using your phone or your home PC instead.
Thank you in advance………..Phil
I'm planning to go to PharmaSUG China in 2020. Which 1/2 day training courses would you be interested in attending there? (max 2)
Efficient SAS Programming (28%, 7 Votes)
Defensive SAS Programming (24%, 6 Votes)
Introduction to ODS Graph Templates (20%, 5 Votes)
I’m presenting on SAS Studio and ODS Graphics at the London SUGUKI Meetup on 10Jan2019.
Attend the meeting for a chance to win an ebook in the draw! Note that you must register at http://www.meetup.com/SUGUKI to be allowed into the meetup, where you can find more details about this and other SUGUKI events.
The SAS course and the SAS Programming Forum continue to grow. I have just added 8 new course topics about PROC SQL, and there are now 33 topics in 4 different sections.
More topics and sections are being developed, so register for free now to be kept up-to-date about all of the news, so you can take advantage of the Programmer level when it suits you best!
There is now a new Training Course list for 2017Q4, which can be downloaded from here. The courses available in 2016 and early 2017 are still there, but I have added a new course to the list:
½ day Defensive SAS Programming training
I’m also developing some new SAS-related courses, based on the SAS course, which you can accelerate to production status by requesting them:
½ day SAS Data Step training
½ day SAS PROC SQL training
½ day SAS macros training
Your interest in any of these courses will result in them being developed as priority tasks!