MODULE 8: String handling
UNSTRING
- It is used to split a string into several sub-strings
- As with STRING, the splitting is based on the delimiter specified
- Basic syntax:-
UNSTRING identifier-1
    [DELIMITED BY [ALL] {identifier-2/literal-1} [, OR [ALL] {identifier-3/literal-2}]…]
    INTO identifier-4 [, DELIMITER IN identifier-5] [, COUNT IN identifier-6]
        [, identifier-7 [, DELIMITER IN identifier-8] [, COUNT IN identifier-9]]…
    [TALLYING IN identifier-10]
    [; ON OVERFLOW <imperative-statement>]
- Data from identifier-1 is split and placed in multiple receiving fields, namely identifier-4, identifier-7, etc.
- The sending data item i.e. identifier-1 must be alphanumeric
- The receiving fields, i.e. identifier-4, identifier-7, etc., must have DISPLAY usage. They can be alphabetic, alphanumeric or numeric
- All literals must be non-numeric. When a literal is a figurative constant, it is treated as a single character
- identifier-6, identifier-9, identifier-10 must be elementary numeric integer data items
- If the DELIMITED BY phrase is not coded, data from the sending field is copied into the receiving fields in order, each one taking as many characters as its size allows, until either all data is transferred or all receiving fields are full
- DELIMITED BY identifier-2/literal-1 etc. specifies the character or sub-string in the sending string that terminates the transfer of data to the current receiving field. The data that follows the delimiter (the delimiter itself is not transferred) goes to the next receiving field, until a delimiter is encountered again. This continues until either all data is transferred from the sending field or all the receiving fields have been used. You can specify more than one delimiter with OR; in that case, whichever delimiter is encountered in the sending field ends the current transfer, and the remaining data goes to the next receiving field
- identifier-5, identifier-8 hold the delimiter that caused termination of data transfer to the respective receiving field
- identifier-6, identifier-9 hold the count of characters transferred into the respective receiving field
- identifier-10 holds the count of receiving fields acted upon by the UNSTRING verb. The count is added to the existing value of identifier-10, so initialize it before the UNSTRING
- When the ALL phrase is used in DELIMITED BY, contiguous delimiters are treated as one delimiter
- When the ALL phrase is not used in DELIMITED BY, contiguous delimiters act as separate delimiters, and the receiving field that falls between them is filled with spaces
- When the DELIMITED BY phrase is not coded, you cannot code DELIMITER IN or COUNT IN.
- The ON OVERFLOW phrase executes the imperative statement when an overflow occurs, i.e. when all receiving fields have been used but unexamined data remains in the sending field
- Example 1:-
In DATA DIVISION,
    01 WS-DD   PIC X(02).
    01 WS-MM   PIC X(02).
    01 WS-YYYY PIC X(04).
    01 WS-DATE PIC X(10) VALUE '28/08/1993'.
In PROCEDURE DIVISION,
    UNSTRING WS-DATE DELIMITED BY '/' INTO WS-DD, WS-MM, WS-YYYY
Result:- WS-DD will contain '28', WS-MM will contain '08', WS-YYYY will contain '1993'
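For contrast with Example 1, the rule about omitting DELIMITED BY can be sketched as a small complete program. This is only an illustration: the program name UNSTR1 and the field WS-RAW-DATE (a date stored without separators) are assumptions and are not part of the examples in this module.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTR1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical 8-character date with no separators (DDMMYYYY).
       01 WS-RAW-DATE PIC X(08) VALUE '28081993'.
       01 WS-DD       PIC X(02).
       01 WS-MM       PIC X(02).
       01 WS-YYYY     PIC X(04).
       PROCEDURE DIVISION.
      * No DELIMITED BY: each receiving field is filled to its size.
           UNSTRING WS-RAW-DATE INTO WS-DD WS-MM WS-YYYY
           END-UNSTRING
           DISPLAY 'DD=' WS-DD ' MM=' WS-MM ' YYYY=' WS-YYYY
           STOP RUN.
```

Because the split is driven purely by the sizes of the receiving fields (2, 2 and 4), the program should display DD=28 MM=08 YYYY=1993.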
- Example 2:-
In DATA DIVISION,
    01 WS-A PIC X(5).
    01 WS-B PIC X(5).
    01 WS-C PIC X(5).
    01 WS-D PIC X(16) VALUE 'PRACTICEATTITUDE'.
In PROCEDURE DIVISION,
    UNSTRING WS-D DELIMITED BY 'T' INTO WS-A, WS-B, WS-C
Result:- WS-A will contain 'PRACb', WS-B will contain 'ICEAb' and WS-C will contain 'bbbbb'. Note:- b indicates a space
Reason:- The first termination occurs after the string 'PRAC' because of the 'T' that follows it, so 'PRAC' is moved to WS-A. The second 'T' is encountered after 'ICEA', which causes termination and moves 'ICEA' to WS-B. The third 'T' is encountered immediately, with no characters in between, so nothing is moved to WS-C and it is filled with spaces. However, if ALL had been coded in the DELIMITED BY phrase, i.e. DELIMITED BY ALL 'T', the contiguous delimiters would be treated as one, and WS-C would contain 'Ibbbb'. WS-A = 'PRACb' and WS-B = 'ICEAb' would not change
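The difference that ALL makes in Example 2 can be checked with a small complete program along the following lines. It is a sketch: the program name UNSTR2 and the square brackets in the DISPLAY statements (added so that trailing spaces are visible) are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTR2.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-A PIC X(5).
       01 WS-B PIC X(5).
       01 WS-C PIC X(5).
       01 WS-D PIC X(16) VALUE 'PRACTICEATTITUDE'.
       PROCEDURE DIVISION.
      * Without ALL: the second 'T' of 'TT' is its own delimiter.
           UNSTRING WS-D DELIMITED BY 'T'
               INTO WS-A WS-B WS-C
           END-UNSTRING
           DISPLAY '[' WS-A '][' WS-B '][' WS-C ']'
      * With ALL: 'TT' counts as a single delimiter.
           UNSTRING WS-D DELIMITED BY ALL 'T'
               INTO WS-A WS-B WS-C
           END-UNSTRING
           DISPLAY '[' WS-A '][' WS-B '][' WS-C ']'
           STOP RUN.
```

The first DISPLAY should show WS-C as five spaces, and the second should show it starting with 'I', matching the results described above. WS-A and WS-B should be the same in both lines.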
- Example 3:-
In DATA DIVISION,
    01 WS-A           PIC X(04).
    01 WS-A-COUNT     PIC 9.
    01 WS-A-DELIMITER PIC X.
    01 WS-B           PIC X(02).
    01 WS-B-COUNT     PIC 9.
    01 WS-B-DELIMITER PIC X.
    01 WS-TALLY       PIC 9 VALUE 0.
    01 WS-C           PIC X(11) VALUE 'MAY,12 1993'.
In PROCEDURE DIVISION,
    UNSTRING WS-C DELIMITED BY ',' OR ' '
        INTO WS-A DELIMITER IN WS-A-DELIMITER COUNT IN WS-A-COUNT
             WS-B DELIMITER IN WS-B-DELIMITER COUNT IN WS-B-COUNT
        TALLYING IN WS-TALLY
        ON OVERFLOW PERFORM ERROR-PARA
Note:- The DELIMITER IN fields are declared PIC X because they receive the delimiter character, and WS-TALLY starts at 0 because TALLYING adds to its existing value.
Result:-
- WS-A = 'MAYb' (b indicates a space) because the ',' after MAY is specified as a delimiter in the above statement
- WS-A-DELIMITER = ',' because the ',' (comma) caused termination
- WS-A-COUNT = 3, because 3 characters are moved to WS-A
- WS-B = '12' because the blank space after 12 causes termination
- WS-B-DELIMITER = ' ' because the ' ' (blank space) caused termination
- WS-B-COUNT = 2, because 2 characters are moved to WS-B
- WS-TALLY = 2 because 2 receiving fields are affected
- At the end, since '1993' cannot be moved anywhere, an overflow occurs and ERROR-PARA is performed.
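One way to avoid the overflow in Example 3 is to supply a receiving field for the year. The following complete program is a sketch of that idea. The program name UNSTR3 and the fields WS-MON, WS-DAY and WS-YEAR are assumptions introduced for illustration, and the NOT ON OVERFLOW phrase and END-UNSTRING terminator are used here although they are not shown in the basic syntax above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTR3.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 WS-MON   PIC X(04).
       01 WS-DAY   PIC X(02).
      * WS-YEAR is a hypothetical third receiving field.
       01 WS-YEAR  PIC X(04).
      * TALLYING adds to the counter, so it starts at zero.
       01 WS-TALLY PIC 9 VALUE 0.
       01 WS-C     PIC X(11) VALUE 'MAY,12 1993'.
       PROCEDURE DIVISION.
           UNSTRING WS-C DELIMITED BY ',' OR ' '
               INTO WS-MON WS-DAY WS-YEAR
               TALLYING IN WS-TALLY
               ON OVERFLOW
                   DISPLAY 'OVERFLOW: DATA LEFT IN WS-C'
               NOT ON OVERFLOW
                   DISPLAY WS-MON '/' WS-DAY '/' WS-YEAR
                   DISPLAY 'FIELDS FILLED: ' WS-TALLY
           END-UNSTRING
           STOP RUN.
```

With three receiving fields the whole of WS-C is consumed, so the NOT ON OVERFLOW branch should run and WS-TALLY should be 3. Removing WS-YEAR from the INTO list should bring back the overflow described in Example 3.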