## Examples.

As a trivial problem, consider copying an input file while adding 3 to every positive number divisible by 7. Here is a suitable Lex source program

```
%%
	int k;
[0-9]+	{
	k = atoi(yytext);
	if (k%7 == 0)
		printf("%d", k+3);
	else
		printf("%d", k);
	}
```
to do just that. The rule [0-9]+ recognizes strings of digits; atoi converts the digits to binary and stores the result in k. The operator % (remainder) is used to check whether k is divisible by 7; if it is, it is incremented by 3 as it is written out. It may be objected that this program will alter such input items as 49.63 or X7. Furthermore, because the minus sign is not part of the match, it increments the absolute value of all negative numbers divisible by 7. To avoid this, just add a few more rules after the active one, as here:
```
%%
	int k;
-?[0-9]+	{
	k = atoi(yytext);
	printf("%d", k%7 == 0 ? k+3 : k);
	}
-?[0-9.]+	ECHO;
[A-Za-z][A-Za-z0-9]+	ECHO;
```
Numerical strings containing a "." or preceded by a letter will be picked up by one of the last two rules, and not changed. The if-else has been replaced by a C conditional expression to save space; the form a?b:c means "if a then b else c".
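
The effect of including the sign in the match can be checked outside Lex. The following is a minimal sketch, not part of the original program: the helper name adjust is hypothetical, and it simply repeats the action of the -?[0-9]+ rule on a string that stands in for yytext.

```c
#include <stdlib.h>

/* Hypothetical helper: mirrors the action of the -?[0-9]+ rule.
   token stands in for yytext, e.g. "49" or "-14". */
int adjust(const char *token)
{
	int k = atoi(token);

	/* a ? b : c means "if a then b else c" */
	return k % 7 == 0 ? k + 3 : k;
}
```

With this helper, adjust("49") yields 52 and adjust("-14") yields -11. The first program, by contrast, would copy the minus sign unchanged and turn the 14 into 17.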

For an example of statistics gathering, here is a program which histograms the lengths of words, where a word is defined as a string of letters.

```
	int lengs[100];
%%
[a-z]+	lengs[yyleng]++;
.	|
\n	;
%%
int yywrap()
{
	int i;
	printf("Length  No. words\n");
	for (i=0; i<100; i++)
		if (lengs[i] > 0)
			printf("%5d%10d\n", i, lengs[i]);
	return(1);
}
```
This program accumulates the histogram, while producing no output. At the end of the input it prints the table. The final statement return(1); indicates that Lex is to perform wrapup. If yywrap returns zero (false) it implies that further input is available and the program is to continue reading and processing. Providing a yywrap that never returns true causes an infinite loop.
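
For comparison, the same tally can be sketched in plain C without Lex. This is an illustration under stated assumptions: the names count_word_lengths and MAXLEN are hypothetical, a word is a run of lower-case letters as in the [a-z]+ rule, and the bounds check on the word length is an addition that the Lex program does not make.

```c
#define MAXLEN 100	/* same bound as the printing loop in yywrap */

/* Hypothetical helper: tally the lengths of runs of lower-case
   letters in text. lengs must have MAXLEN entries, zeroed by the caller. */
void count_word_lengths(const char *text, int lengs[MAXLEN])
{
	int len = 0;

	for (;; text++) {
		if (*text >= 'a' && *text <= 'z') {
			len++;
			continue;
		}
		/* any other character, or the end of the string, ends a word */
		if (len > 0 && len < MAXLEN)	/* added guard, not in the Lex version */
			lengs[len]++;
		len = 0;
		if (*text == '\0')
			break;
	}
}
```

The Lex version is shorter because the scanner finds the word boundaries and supplies the length in yyleng; the C version has to track both by hand.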

As a larger example, here are some parts of a program written by N. L. Schryer to convert double precision Fortran to single precision Fortran. Because Fortran does not distinguish upper and lower case letters, this routine begins by defining a set of classes including both cases of each letter:

```
a	[aA]
b	[bB]
c	[cC]
...
z	[zZ]
```
An additional class recognizes white space:
```
W	[ \t]*
```
The first rule changes "double precision" to "real", or "DOUBLE PRECISION" to "REAL".
```
{d}{o}{u}{b}{l}{e}{W}{p}{r}{e}{c}{i}{s}{i}{o}{n} {
	/* test the first matched character, not the array itself */
	printf(yytext[0]=='d'? "real" : "REAL");
	}
```
Care is taken throughout this program to preserve the case (upper or lower) of the original program. The conditional operator is used to select the proper form of the keyword. The next rule copies continuation card indications to avoid confusing them with constants:
```
^"     "[^ 0]	ECHO;
```
In the regular expression, the quotes surround the blanks. It is interpreted as "beginning of line, then five blanks, then anything but blank or zero." Note the two different meanings of ^. There follow some rules to change double precision constants to ordinary floating constants.
```
[0-9]+{W}{d}{W}[+-]?{W}[0-9]+     |
[0-9]+{W}"."{W}{d}{W}[+-]?{W}[0-9]+     |
"."{W}[0-9]+{W}{d}{W}[+-]?{W}[0-9]+     {
	/* convert constants; p is assumed to be a char * declared elsewhere */
	for (p=yytext; *p != 0; p++)
		if (*p == 'd' || *p == 'D')
			*p += 'e' - 'd';
	ECHO;
	}
```
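
The letter substitution in this action can be tried outside Lex. The following is a minimal sketch under stated assumptions: the helper name to_single_exponent is hypothetical, and the arithmetic relies on a character set such as ASCII, in which e directly follows d and E directly follows D.

```c
/* Hypothetical helper: change the exponent letter of a double
   precision constant to that of a single precision one, in place.
   Assumes ASCII-like ordering: 'd'+1 == 'e' and 'D'+1 == 'E'. */
void to_single_exponent(char *s)
{
	char *p;

	for (p = s; *p != 0; p++)
		if (*p == 'd' || *p == 'D')
			*p += 'e' - 'd';	/* 'd' -> 'e', 'D' -> 'E' */
}
```

Applied to a writable buffer holding 1.5d0, the helper leaves 1.5e0; applied to 3D+2, it leaves 3E+2, so the case of the original letter is kept.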
After the floating point constant is recognized, it is scanned by the for loop to find the letter d or D. The program then adds 'e'-'d', which converts it to the next letter of the alphabet. The modified constant, now single-precision, is written out again. There follows a series of names which must be respelled to remove their initial d. By using the array yytext the same action suffices for all the names (only a sample of a rather long list is given here).
```
{d}{s}{i}{n}	|
{d}{c}{o}{s}	|
{d}{s}{q}{r}{t}	|
{d}{a}{t}{a}{n}	|
...
{d}{f}{l}{o}{a}{t}	printf("%s",yytext+1);
```
Another list of names must have initial d changed to initial a:
```
{d}{l}{o}{g}	|
{d}{l}{o}{g}10	|
{d}{m}{i}{n}1	|
{d}{m}{a}{x}1	{
	/* change the first character only, preserving its case */
	yytext[0] += 'a' - 'd';
	ECHO;
	}
```
And one routine must have initial d changed to initial r:
```
{d}1{m}{a}{c}{h}	{
	yytext[0] += 'r' - 'd';
	ECHO;
	}
```
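
The three kinds of respelling used in these rules (dropping the initial d, or changing it to a or to r) reduce to two small operations on the matched text. The sketch below is illustrative only: the helper names respell_initial and drop_initial are hypothetical, and the arithmetic assumes an ASCII-like character set in which upper and lower case letters are laid out in the same order.

```c
/* Hypothetical helper: replace the first letter of name, keeping its case.
   to is the lower-case replacement for an initial d, e.g. 'a' or 'r'. */
void respell_initial(char *name, char to)
{
	name[0] += to - 'd';	/* 'd' -> to, 'D' -> upper-case form of to */
}

/* Hypothetical helper: dropping the initial letter needs no copying,
   only a pointer one character further on, as in yytext+1. */
const char *drop_initial(const char *name)
{
	return name + 1;
}
```

Because the same offset is added whether the first letter is d or D, dlog10 becomes alog10 and D1MACH becomes R1MACH, which is how the rules preserve the case of the original program.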
To avoid such names as dsinx being detected as instances of dsin, some final rules pick up longer words as identifiers and copy some surviving characters:
```
[A-Za-z][A-Za-z0-9]*	|
[0-9]+	|
\n	|
.	ECHO;
```
Note that this program is not complete; it does not deal with the spacing problems in Fortran or with the use of keywords as identifiers.
