When to Use or Not Use *ALLSEP
Deep Thought of the Month
“I have traveled the length and breadth of this country and talked with the best people, and I can assure you that data processing is a fad that won’t last out the year.”
– The editor in charge of business books for Prentice Hall, 1957.
When to Use or Not Use *ALLSEP
In Spring 2023, IBM added the *ALLSEP keyword as an enhancement to the %SPLIT built-in function. I’ve had questions from developers about this a few times, and they generally boil down to two things: 1) What does *ALLSEP do? and 2) How do I know when I need to use it?
Today in Scott’s Corner, I’ll provide clear answers to these questions.
What Does *ALLSEP Do?
By default, the %SPLIT BIF treats a run of consecutive separators as a single separator, so it never returns an empty value. *ALLSEP tells it to honor every separator, which means two separators side by side produce an empty value between them. To understand this better, let’s start by looking at a typical use case for %SPLIT.
For an example where *ALLSEP isn’t needed, consider the following data record:
Acme Industries|Harry Smith|123 Main Street
It consists of three fields: the company name, the contact name within the company, and the address. Each field is separated from the next by a pipe character, so the %SPLIT BIF makes it easy to read them. Just split the data based on the pipe. Here’s a simple subprocedure that breaks it up and returns the separate fields in a data structure:
dcl-ds record_t qualified template;
  Customer varchar(30);
  Contact  varchar(30);
  Address  varchar(30);
end-ds;
.
.
dcl-proc ProcessPipes;
  dcl-pi *n;
    data varchar(100) const;
    rec  likeds(record_t);
  end-pi;
  dcl-s fields varchar(100) dim(3);

  // Split on the pipe only. Every character in the second
  // parameter is a separator, so do not put blanks around it.
  fields = %split(data: '|');
  rec.Customer = fields(1);
  rec.Contact  = fields(2);
  rec.Address  = fields(3);
end-proc;
The result will be as follows:
rec.Customer = Acme Industries
rec.Contact = Harry Smith
rec.Address = 123 Main Street
This works because there aren’t any consecutive separators. In other words, there aren’t two pipes next to each other. The problem arises when you do end up with separators next to each other. The most common reason would be when a field is empty. For example, in the above example, what would happen if the contact name was unknown, and therefore left blank in the record?
Fudgy Factory||321 Sesame Street
The result would be:
rec.Customer = Fudgy Factory
rec.Contact = 321 Sesame Street
rec.Address =
Notice that the address is now in the rec.Contact field. That is because the consecutive pipe characters were ignored. To put it another way, multiple pipes such as || are treated the same as a single pipe |, so %SPLIT thinks “321 Sesame Street” is the second field rather than the third.
This is the problem that *ALLSEP solves. You can add it to the split just by adding one parameter to the %SPLIT BIF in the ProcessPipes subprocedure. That line of code will now look like this:
fields = %split(data: '|': *ALLSEP);
And the problem is solved:
rec.Customer = Fudgy Factory
rec.Contact =
rec.Address = 321 Sesame Street
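To see the difference side by side, here is a minimal standalone sketch that counts the values %SPLIT returns for the same record, first without and then with *ALLSEP. The variable names (data, field, count) are made up for illustration, and it assumes a compiler level that supports both FOR-EACH and *ALLSEP.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Hypothetical test record with an empty contact field
dcl-s data  varchar(100) inz('Fudgy Factory||321 Sesame Street');
dcl-s field varchar(100);
dcl-s count int(10);

// Default: the two adjacent pipes act as one separator
count = 0;
for-each field in %split(data: '|');
  count += 1;
endfor;
dsply ('Default: ' + %char(count) + ' fields');

// *ALLSEP: every pipe counts, so the empty field is returned too
count = 0;
for-each field in %split(data: '|': *allsep);
  count += 1;
endfor;
dsply ('*ALLSEP: ' + %char(count) + ' fields');

*inlr = *on;
return;
```

The first loop should report 2 fields and the second 3, matching the results shown above.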
When Shouldn't You Use It?
After reading what *ALLSEP does, you may be thinking you should always use it. That may be true for a CSV or pipe-separated file like the one above, but what about other scenarios?
For example, maybe you are splitting apart the names in an IFS path:
/home//sklement//ftpapi//src/rpg/FTPAPI_H.rpgleinc
In this case, the repeated // are probably mistakes. But if you type a path like that into an IFS command (for example, WRKLNK) or a QShell or PASE command, the repeated slashes will be ignored, just as they are with %SPLIT. Therefore, this is a scenario where you wouldn’t use *ALLSEP.
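As a rough sketch of the path scenario, the following splits that same path on the slash without *ALLSEP. The names path and part are hypothetical, and part is kept short only so that it fits in a DSPLY message.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Hypothetical path containing accidental double slashes
dcl-s path varchar(256)
      inz('/home//sklement//ftpapi//src/rpg/FTPAPI_H.rpgleinc');
dcl-s part varchar(50);

// No *ALLSEP: leading and repeated slashes are ignored,
// so only the non-empty names come back
for-each part in %split(path: '/');
  dsply part;
endfor;

*inlr = *on;
return;
```

This should display home, sklement, ftpapi, src, rpg, and FTPAPI_H.rpgleinc, with no empty entries for the doubled slashes.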
Another scenario might be splitting apart the words in a sentence, a command line, or an address: anywhere that a human may have accidentally typed two separators.
The quick brown fox jumps  over the lazy dog
In this case you’d use %SPLIT with a blank as the separator. But, it wouldn’t matter that there are two blanks between “jumps” and “over” – you’d want them treated as one.
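A small sketch of the sentence case follows. It relies on the fact that %SPLIT uses a blank as the separator when the second parameter is omitted. The names sentence, word, and count are invented for this example, and the literal deliberately has two blanks between “jumps” and “over”.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Hypothetical input with a doubled blank in the middle
dcl-s sentence varchar(100)
      inz('The quick brown fox jumps  over the lazy dog');
dcl-s word  varchar(50);
dcl-s count int(10) inz(0);

// Separator defaults to a blank; the doubled blank counts once
for-each word in %split(sentence);
  count += 1;
endfor;
dsply ('Words: ' + %char(count));

*inlr = *on;
return;
```

The count should come out as 9. With *ALLSEP, the doubled blank would add an empty “word” to the list, which is not what you want here.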
Likewise, if someone is typing in a person’s name, they could type it as follows:
Klement,, Scott
The extra comma is a typo, but it doesn’t matter. Your program knows there are only two pieces of data here, the last and first name. If there are multiple commas, you can easily treat them as one.
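One way to handle the name case is sketched below. It lists both the comma and the blank as separator characters, since %SPLIT treats each character in the second parameter as a separator. The names name and parts are hypothetical.

```rpgle
**free
ctl-opt dftactgrp(*no);

// Hypothetical input with an accidental extra comma
dcl-s name  varchar(50) inz('Klement,, Scott');
dcl-s parts varchar(50) dim(2);

// Comma and blank are both separators; without *ALLSEP the
// whole run ",, " is treated as a single break
parts = %split(name: ', ');
dsply ('Last:  ' + parts(1));
dsply ('First: ' + parts(2));

*inlr = *on;
return;
```

Including the blank in the separator list also keeps a leading blank out of the first name.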
The moral of the story is to think about the data and understand what is or isn’t allowed according to your business rules. The nice thing about %SPLIT (vs. similar tools like strtok or strsep) is that you have the choice to include or exclude the *ALLSEP keyword depending on your circumstances.
Scott Klement
Midrange Dynamics
Development & Solutions Architect
Scott Klement is an IT professional with a passion for both programming and mentoring. He joined Midrange Dynamics at the beginning of October 2022. He formerly was the Director of Product Development and Support at Profound Logic and the IT Manager and Senior Programmer at Klement’s Sausage Co., Inc. Scott also serves on the Board of Directors of COMMON, where he represents the Education, Innovation, and Certification teams. He is an IBM Champion for Power Systems.