Problem:
How do I use the RM/COBOL LIKE condition?
Resolution:
The LIKE condition
Using the LIKE condition is a relatively effortless way to match the content of any display data item against a pattern that can express just about any possible desired value. For example, suppose your program needs to verify that the user has entered a well-formed email address in a form field. Let's say a well-formed email address satisfies the following rules:
1. Leading and trailing spaces are ignored.
2. Preceding a single required "at sign" character ("@"), there are one or more characters from the set of "word" characters (digits and uppercase and lowercase letters), plus the period, hyphen, and underscore characters.
3. After the "at sign", there are one or more "word" characters, plus the hyphen and underscore characters, followed by a period.
4. Rule 3 can be repeated one or more times.
5. Finally, two to four uppercase or lowercase letters (the top-level domain name) must be appended.
For example, well-formed email addresses are similar to:
fname_lname@company.com
fname.lname@mail.usps.gov
Malformed email addresses are similar to:
fname_lname@xx..com (two consecutive periods after @)
fname.lname (missing required @)
fname.lname@@x.net (multiple @)
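The five rules above can be sketched as a regular expression. The following is a minimal illustration in Python rather than RM/COBOL (the article's own COBOL examples are in the attachment). The names EMAIL_PATTERN and is_well_formed and the exact pattern are assumptions made for this sketch, and Python's regular expression syntax is only similar to RM/COBOL pattern syntax, not identical to it.

```python
import re

# Hypothetical pattern assembled from rules 2-5; not taken from the attachment.
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._-]+"        # rule 2: word characters, period, hyphen, underscore
    r"@"                      # the single required at sign
    r"(?:[A-Za-z0-9_-]+\.)+"  # rules 3 and 4: one or more labels, each ending in a period
    r"[A-Za-z]{2,4}"          # rule 5: two to four letters
)


def is_well_formed(address: str) -> bool:
    """Return True when the whole address satisfies rules 1-5."""
    # Rule 1: leading and trailing spaces are ignored.
    return EMAIL_PATTERN.fullmatch(address.strip(" ")) is not None


if __name__ == "__main__":
    for candidate in (
        "fname_lname@company.com",
        "  fname.lname@mail.usps.gov  ",
        "fname_lname@xx..com",
        "fname.lname",
        "fname.lname@@x.net",
    ):
        print(repr(candidate), is_well_formed(candidate))
```

The first two candidates are accepted and the last three are rejected, matching the well-formed and malformed lists above.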
To check the email address rules, a COBOL program might have data that looks like Example1 (see examples.txt attachment below).
The procedure in Example2 would set the value of the well-formed-address condition-name to reflect whether or not the value of the email-address data item is a valid email address.
Performing the same validation, but using the LIKE feature of RM/COBOL, the data area would look like Example3.
Notice that no procedure division code would be required at all! In this example, the well-formed-address condition-name would always reflect whether or not email-address was well-formed. Unlike the procedure test-email-address, the comparison does not modify the value of the email-address data item.
The magic is in the relational operator called "LIKE", available in RM/COBOL version 7.5 and later. The LIKE operator allows any display data item to be compared not just to a single text value, but to a potentially infinite set of text values generated by a "pattern".
In this example, the pattern describes all possible text strings that are well-formed email addresses. Because the value of the email-address item is compared to this pattern using both the LIKE operator and another RM/COBOL extension to the VALUE clause for level-number 88 condition-name definitions, the COBOL program need only test the value of well-formed-address wherever the check is needed.
The pattern used by the LIKE operator is what is known as a regular expression (sometimes known as "regexp"). You are probably already familiar with regular expressions if you are a UNIX shell programmer. The ones used by RM/COBOL are very similar. For the rest of you, here is a quick introduction to this powerful tool.
A regular expression is a compact expression of a grammar that describes a class of text strings. Regular expressions can describe a very wide range of string sets, including infinite ones, although not every set a computer can recognize (classic regular expressions cannot, for example, check arbitrarily deep nesting of matched brackets). You can see how powerful this could be. Where is the catch? The only real one is that you have to come up with the correct regular expression that matches the strings you want to describe. For the kind of patterns you will probably want to use, this is not very difficult.
An RM/COBOL pattern (a kind of regular expression) is itself a text string. A pattern text string is made up of the alphanumeric characters and a handful of special characters with a particular meaning when used in a pattern. Suppose we want a pattern to recognize only the string "box". What would the pattern look like? The pattern would look like this:
box
The "box" pattern matches only a string consisting of "b" followed by "o" followed by "x". That's pretty simple (and more easily done with the ordinary equality relational operator). Let's suppose, however, that we wanted to match both "box" and "Box". We could do this by using the pattern:
Box|box
The "|" (pipe) is a special character that means "or" in a pattern. We could also accomplish the same thing with the pattern:
[bB]ox
The "[bB]ox" pattern uses the special characters "[" and "]" (open and close square brackets) to say that the first character may be either "B" or "b". Suppose we wanted to match the word anywhere in a string. We could match such strings with the pattern:
.*[bB]ox.*
The above pattern will match any character (the "." - period) repeated zero or more times (the "*" - star), followed by a "b" or a "B", followed by the characters "o" then "x", followed by any character repeated zero or more times. Suppose we only wanted to match the word when it followed the word "text" or "cat". That pattern could be written as:
.*\s(([tT]ext)|([cC]at))\s([bB]ox).*
The above pattern will match any character repeated zero or more times, followed by white space (the "\s" - backslash s), followed by either "text" or "Text" or "cat" or "Cat", followed by white space, followed by "box" or "Box", followed by any character repeated zero or more times.
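To experiment with the patterns discussed so far, here is a small sketch using Python's re module, whose syntax for these particular constructs is close to what is described above. It is an analogy only, not RM/COBOL code: the helper name matches is hypothetical, and re.fullmatch is used on the assumption that a pattern must describe the entire string, as the need for ".*" on both sides suggests.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


# The pattern that requires "text" or "cat" before the word.
CONTEXT = r".*\s(([tT]ext)|([cC]at))\s([bB]ox).*"

if __name__ == "__main__":
    print(matches(r"box", "box"))                      # True
    print(matches(r"box", "Box"))                      # False
    print(matches(r"Box|box", "Box"))                  # True
    print(matches(r"[bB]ox", "Box"))                   # True
    print(matches(r"[bB]ox", "a box here"))            # False
    print(matches(r".*[bB]ox.*", "a box here"))        # True
    print(matches(CONTEXT, "open the text box now"))   # True
    print(matches(CONTEXT, "text box"))                # False: no white space before "text"
    print(matches(CONTEXT, "a shoe box"))              # False
```

Note the "text box" case: because the pattern demands white space before "text" or "cat", a string that starts with the word is rejected.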
Suppose you wanted to recognize valid U.S. postal ZIP codes. You could do so with a pattern like this:
\d{5}(\s|(-\d{4}))
The above pattern matches five decimal digits (the "\d{5}" - backslash d open brace 5 close brace), followed either by white space or by a "-" (dash) followed by four decimal digits.
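The ZIP code pattern can be tried out the same way. This is again a Python sketch offered as an analogy, not RM/COBOL code; ZIP_PATTERN and is_zip are names invented for the illustration, and whole-string matching via re.fullmatch is an assumption about how the comparison behaves.

```python
import re

# Five digits, then either one white-space character or a dash and four digits.
ZIP_PATTERN = re.compile(r"\d{5}(\s|(-\d{4}))")


def is_zip(text: str) -> bool:
    """Return True when the whole of text matches the ZIP pattern."""
    return ZIP_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_zip("12345 "))      # True: five digits plus one white-space character
    print(is_zip("12345-6789"))  # True: five digits, dash, four digits
    print(is_zip("12345"))       # False: nothing follows the five digits
    print(is_zip("1234-5678"))   # False: only four leading digits
    print(is_zip("12345-678"))   # False: only three digits after the dash
```

As written, the pattern requires something after the five digits, so in this sketch a bare "12345" with no trailing white space is rejected.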
You get the idea. A complete description of the syntax and semantics of the RM/COBOL LIKE patterns can be found in the RM/COBOL Language Reference Manual.
So how can you use this marvel? Any way your imagination leads you! You can use the LIKE operator in simple literal conditions of PERFORM, EVALUATE, IF, and SEARCH statements, such as in Example4.
In this case, the TRIMMED modifier tells the runtime to make the comparison without regard to leading or trailing spaces in the email-address item. The same effect could be achieved with the equivalent sentence shown in Example5.
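The idea of ignoring leading and trailing spaces can be illustrated in two ways: strip the spaces before comparing, or widen the pattern so that it tolerates them. The Python sketch below shows both. It does not reproduce Example4 or Example5 from the attachment. The short pattern and the function names like_trimmed and like_with_padding are hypothetical, and the two approaches agree only for patterns that do not themselves match spaces at either end.

```python
import re

# Hypothetical pattern for illustration; it cannot match a leading or trailing space.
PATTERN = r"[bB]ox"


def like_trimmed(text: str) -> bool:
    """Compare after removing leading and trailing spaces from the text."""
    return re.fullmatch(PATTERN, text.strip(" ")) is not None


def like_with_padding(text: str) -> bool:
    """Compare the untouched text against a pattern that allows optional spaces."""
    return re.fullmatch(r" *(?:" + PATTERN + r") *", text) is not None


if __name__ == "__main__":
    for sample in ("box", "  Box   ", "b ox", " fox "):
        # Both approaches give the same answer for each sample.
        print(repr(sample), like_trimmed(sample), like_with_padding(sample))
```

Neither function changes the text it is given, which mirrors the point made earlier that the comparison leaves the data item unmodified.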
Alternatively, you can use the original example of the level-number 88 condition-name defined with the LIKE relational operator (see Example6).
The original example depends on a feature of level-number 88 condition-names that allows a relational operator to be specified in the VALUE clause, in order to generate the list of values against which the conditional variable will be compared. Since the main purpose of this solution is to introduce the LIKE condition, the LIKE relational operator was used. However, this level-number 88 feature can also be used with the older standard relational operators, as in Example7.
Thus, you can see that the LIKE condition and extended level-number 88 condition-names just add to COBOL's status as the programmer productivity leader for real business logic.