help contents L4X Language Summary
An L4X script directs how L4X filters text into a copy of the worksheet containing the script. The L4X scripting language extends Lua 4.0 with reserved tables that define how text-strings may be selected, converted and conditioned before output to a formatted Excel worksheet. This document introduces the L4X language and also describes the following feature of Lua 4.0: tables and types, constants, operators, control structures and functions.


An L4X script specifies neither the name of the text-file nor the name of the workbook. These input and output files are opened automatically by the L4X filter when the script is run. The simplest possible L4X script directs the filter to read all (non-blank) lines from the input text-file and write them to successive cells of column A in the output worksheet:
-- An L4X script must begin with a comment (--) line
-- No other language statements are required!
Of course, L4X can perform more complex imports. The following script reads the first 10 text characters, from all lines of text in the input file, and writes them to successive cells of column A, beginning at row 1 of the output worksheet:
-- begin each script with a comment line
Column.A = {pos=1, len=10}
A detailed discussion of statements that import strings of text is provided in the next section of this document.
A script may specify that the text-strings are read only from selected lines within selected pages. The following script selects lines of text beginning at line 10 from every page of the input file, and writes text-strings read from the selected lines to successive cells of columns A and B, beginning at row 3 of the output worksheet:
-- another script
Column.Source = {fromline=10, toline=-1, frompage=1, topage=-1}
Column.A = {pos=1, len=10}
Column.B = {pos=21,len=20}
Sheet.Out.row = 3 -- the "first" row
The last statement in the script above assigns a value to field: row of the Sheet.Out table which specifies the first row of the worksheet where text read by Column.A and Column.B will be written.
The following script, provided for tradition's sake, writes "hello, world" to the output device:
-- all language manuals show how to do this!
print("hello, world")


To read one string of text and write it to one worksheet cell, L4X scripts must include a statement that specifies both the cell location in the worksheet where text is to be written and also the position and length of the text-string to be read. For instance:
Cell.A1 = {pos=1, len=15, line=1, page=1, fmt="number"}
This statement adds a new field named A1 (which specifies worksheet cell A1 as the write location) to the reserved table: Cell and assigns a new table (specifying the text-string to be read) to the new field: Cell.A1. Within the table constructor, the pos field specifies where a string starts within the line of text, the len field specifies how long it is, the line field specifies the line of text within the page and the page field specifies the page of text within the text-file. The fmt field specifies how the text-string is converted before it is written. A detailed analysis of a similar statement is provided in a following section of this document.
To read strings from successive lines of text and write them to successive cells of a worksheet column, a script must include statements like:
Column.B = {pos=20, len=30}
This statement adds a new field named: B (which specifies worksheet column B as the write location) to the reserved table: Column and assigns a new table (specifying the text-strings to be read) to the new field: Column.B. Within the table constructor, the pos field specifies where a string starts within the line of text and the len field specifies how long it is. The table constructor includes neither field: line nor field: page, because the range of text lines to be selected is specified by the Column.Source table. A detailed analysis of a similar statement is provided in a following section of this document.
When a script adds a new field to the Column table its name does not specify an output row within the worksheet column, instead the first row within the column is specified by field: row of the Sheet.Out table. The filter adds a new row to the output worksheet, beginning at this first row, for each line of text that is selected.


An L4X script can describe the import of one or more "delimited" or "comma-separated" values from within a single text-string. To do this, the table constructed and assigned to a new field of the Column table must assign a number to the value field. This number specifies the sequence-number of a single delimited value within a text-string, where 1 is the first delimited-value. When the value field is set but len is not, then the text-string containing the delimited values starts at the specified pos position and ends at the end of the text-line.
-- Select the first, second and third delimited values in a string
Column.C = {pos=1, value=1}     
Column.D = {pos=1, value=2}    
Column.E = {pos=1, value=3} 
The default delimiter , (comma) is specified in field: delimit of the Sheet.Pictures table. You can replace this default value, using a statement like:
-- Change the value delimiter to tab character
Sheet.Pictures.delimit = "*tab"     
Sheet.Pictures.quote   = "\""  -- " is the default quote character
Please note that, when changing the value of one field in the Sheet.Pictures table, a script need not construct and assign a new table. Once a table has been constructed and assigned a script can refer to one of its fields using the . index by name operator and assign a value directly to this field.


If a script writes text to to cells which are formatted as numbers or dates, these text-strings must be converted into numbers. For such cases the script should specify a value for the fmt field of Cell and Column tables, which describes how text is to be converted into a number.
Column.A = {pos=10, fmt="number"}   
When L4X converts text-strings to numbers or dates, it uses a corresponding "picture" to control the conversion. The reserved table Sheet.Pictures contains field names corresponding to each permitted value of the fmt field. Each of these "picture" fields contains a string that specifies the characters that stand for decimal-points, minus-signs, date and time separators and the ordering of date and time substrings. You may change the "picture" in order to localize the conversion, as in the following example:
-- uses picture-string in Sheet.Pictures.date 
Column.A = {pos=10, fmt="date"}   
Column.B = {pos=20, fmt="date"} 
Column.C = {pos=30, fmt="number"} 
-- changes the picture-string for all dates
Sheet.Pictures.date = "yy/mm/dd" 
Please note that, when changing the value of one field in the Sheet.Pictures table, a script need not construct and assign a new table. Once a table has been constructed and assigned script can refer to one of its fields using the . index by name operator and assign a value directly to this field.
If only one column needs to be converted using a particular "picture" then it may be assigned to the picture field of the table that specifies the string to be read:
-- overrides Sheet.Pictures.date for column A only
Column.A = {pos=10, fmt="date", picture="yy/mm/dd"} 


When fields A ... IV are added to the Column table by a script, this directs the L4X filter to write a worksheet cell for each (non-blank) line of text it reads. A script can select which lines of text are read by assigning values to fields of the Column.Source table. The following statement directs the filter to read only lines of text on or after line 10 in every page:
Column.Source = {fromline=10, toline=-1, frompage=1, topage=-1}
The negative value for field toline specifies a line number relative to the last line in the page, where -1 is the last line, -2 is the next to last and so on. The negative value for field topage specifies a page number relative to the last page in the file, where -1 is the last page, -2 is the next to last and so on.
Text-files sometimes "fold" a row of related text-strings into more than one line of text. In such cases, a script can specify a value for field linesperrow of the Column.Source table:
Column.Source.linesperrow = 4
Here, the number assigned to field linesperrow specifies that four lines of text will be read before a worksheet row is written. If a script requires the import of strings from the second or subsequent lines of text, it must assign an appropriate value to field lineofrow for each such import:
Column.A = {pos=1, len=10, lineofrow=2}
Column.B = {pos=1, len=10, lineofrow=3}
Column.C = {pos=1, len=10} -- default lineofrow value = 1 


The L4X filter always excludes blank lines of text. A script may also specify that, when a blank line is detected, all remaining lines in the current page are excluded or all remaining lines in the text-file are excluded.
 -- on blank line - skip to next line (Default).
Column.Source.onblankline = "ignore" 
-- on blank line - skip to next page.
Column.Source.onblankline = "endpage" 
-- on blank line - skip to end-of-file.
Column.Source.onblankline = "endtext" 
Please note that, when changing the value of one field in the Column.Source table, a script need not construct and assign a new table. Once a table has been constructed and assigned a script can refer to one of its fields using the . index by name operator and assign a value directly to this field.
To exclude a text line from within the range selected for reading, an L4X script should define the event function Event.OnOpenRow() and, within this function, assign a negative number to field: Column.Source.condition which excludes one or more lines of text:
function Event.OnOpenRow(Page, Source)
    -- exclude the current line
    if  (strsub(Page.text, 81, 85) == "-----") then
        Source.condition = -1
        end
    -- exclude the current line and next line
    if  (strsub(Page.text, 81, 85) == "Total") then
        Source.condition = -2
        end      
    -- exclude current and following lines on the current page
    if  (strsub(Page.text, 81, 85) == "=====") then
        Source.condition = -9999
        end
end 
The Lua strsub() function is described in the Lua Functions section of this document. Please see the following section for a detailed discussion of event functions.


When input text is not regularly formatted, a script may define functions for fields of the event table. An event function can inspect the current contents of the line and page of text which is being processed and control which columns import text. A script may define a function for field Event.OnOpenRow, which is called, automatically, by the L4X filter before each group of text-lines (selected for column processing) is processed. A script may also define a function for field Event.OnOpenPage, which is called, automatically, by the L4X filter, before each page of text is processed.
The first function parameter Page passed to event functions, is a reference to the Sheet.Page table. The second function parameter Source is a reference to the Column.Source table. These parameters are passed so that expressions in event functions can refer to fields of these tables by the short names Page and Source, aiding clarity and convenience. An event function definition in L4X looks like:
function Event.OnOpenRow(Page, Source)
-- this function does nothing !!
end
Because the examples of event functions in this section inspect various fields of the Sheet.Page table, a brief description of these fields is given below:
Page.text         -- text from line in process.
Page.texts[]      -- array of text lines in page.
If a script needs to exclude particular lines of text, depending on the contents of the text currently being processed, a script should define a function for field Event.OnOpenRow that performs the necessary tests. For example:
-- define your event function
function Event.OnOpenRow(Page, Source)
if  (strsub(Page.text, 81, 85) == "=====") then
    -- exclude this (and the following) line.
    Source.condition = -2 
    end
end
Here, the script tests a sub-string of the current line of text. If the current line of text contains the characters "======" at position 81 through 85, the script will set the value of field condition in table Column.Source to minus two. When the value of field condition is a negative number then that number of lines, starting at the current line, are excluded from column processing. The value of field Source.condition is always set to zero before event functions are called, so a function does not need to set the zero condition explicitly.
If a script needs to adjust the start line for column processing, a script should define a function for field Event.OnOpenPage. For instance, when a text-file has "header pages" with fewer detail lines than succeeding pages, a script can define a function like:
-- lines on the "header page".
Column.Source = {fromline=20, toline-1}
-- define a column
Column.A      = {pos=1, len=10}
-- define your event function
function Event.OnOpenPage(Page, Source)
if  (strsub(Page.texts[5], 1, 8) ~= "Library:") then
    -- This is not a "header" page.
    Source.fromline = 10 
    end
Here, the script tests a sub-string of the text in line 5 of the current page. If the sub-string "Library:" is not found, the range of lines selected for column processing is adjusted by assigning a different value to field Column.Source.fromline. Note the use of [ ], the indexing operator and ~=, the inequality operator. The fields Source.fromline and Source.toline are always reset to their initial values before event functions are called, so a function does not need to re-set them explicitly.


An L4X script can conditionally control the way in which text is imported. This control is made possible by the condition field of the tables: Column.Format, Column.Heading and Column.Total. These tables will only perform text import when the value of their specified condition field matches any one of the conditions set in the Event.OnOpenRow() function. If the event function executes the following statement:
Source.condition = 1
It sets the condition to 1. Please note that when a script assigns a non-zero number to this field, none of the locations specified by field names A ... IV of the Column table will be written to, nor will a new row be added to the worksheet.
If the event function executes the following statements:
Source.conditions[1] = 2
Source.conditions[9] = 3
The first of the statements above, assigns a number to array element 1 of field Source.conditions setting one of the conditions to 2. The second statement assigns a number to array element 9 of field Source.conditions setting another of the conditions to 3. Although this statements sets conditions, all of of the worksheet locations specified by field names A ... IV of the Column table will be written to, and a new row will be added to the worksheet.
If the text read for a worksheet column is already sorted, a script can set conditions when the value in a column changes, as follows:
Column.A = {pos=1, len=10, conditionset=3}
The statement above assigns a number to field Column.A.conditionset which specifies the conditions which are set when the value of the text-string selected for Column.A changes. Here, conditions 1, 2 and 3 are set when the value of the selected text-string is changed. Although this statements sets conditions, all of of the worksheet locations specified by field names A ... IV of the Column table will be written to, and a new row will be added to the worksheet.


The L4X filter can be directed to read a single text-string but write it to many worksheet cells. If a script requires L4X to read a text-string once per page but write it repeatedly to succeeding cells of a worksheet, a script may assign a new table to the appropriate field of the Column.Heading. table. A script could include a statement like:
Column.A         = {pos=1, len=30}
Column.Heading.B = {pos=1, len=30, line=5}
Here, the text-string written to column B is read from line 5 of every page, provided that line 5 is not within the lines specified by Column.Source.fromline and Column.Source.toline. This text-string is then written to successive cells of column B, whenever text is written to Column A and a new row is added to the worksheet.
If a script requires L4X to read a text-string from a line within the range of lines specified by the Column.Source table, but write it repeatedly to succeeding cells of a worksheet, a script may include a statement like:
Column.A         = {pos=1, len=30}
Column.Heading.B = {pos=10, len=30, fmt="text", condition=1}
Here, the text-string written to column B is read from position 10 of each line selected by the Column.Source table, provided that one of the conditions is set to 1. This text-string is repeatedly written to successive cells of column B, whenever text is written to Column A and a new row is added to the worksheet.
If a script requires L4X to read a text-string from a line within the range of lines specified by the Column.Source table, but write it repeatedly to the preceding cells of a worksheet, a script may assign a new table to the appropriate field of the Column.Total. table. A script could include a statement like the following:
Column.A       = {pos=1, len=30}
Column.Total.C = {pos=99, len=10, fmt="number", condition=2}
Here, the text-string written to column C is read from position 99 of each line selected by the Column.Source table, provided that one of the conditions is set to 2. This text-string is then written to any preceding cells of column C which have not previously been written.


An L4X script does not directly control the appearance and formatting of the output worksheet. However the formatting and formulae of the workbook containing the script are preserved in the copy of the "scripted" workbook output by the L4X filter. Any references in cell formulae, ranges and charts are adjusted to allow for the rows of text imported by the filter, as are the relative positions of notes, word-art and pictures. In particular the formats and formulae of all the cells in the first row of the worksheet are automatically copied and adjusted before each new row is imported.
Your script may also copy formats and formulae from one area of the worksheet to another by adding fields to the reserved table: Column.Format. This makes it possible to insert a row containing formulae that calculate (for instance) sub-totals. In the following example script:
-- define columns
Column.A = {pos=1,  len=10, conditionset=1}
Column.B = {pos=20, len=15, fmt="number"}
-- Insert formatting and formula
Column.Format.B = {condition=1, fromcell="Z1"}
When the condition is set to 1, L4X will write the contents of cell Z1 to column B of the worksheet. If cell Z1 contains a formula like =SUM(B1:B1) the formula will be suitably adjusted each time it is written. So, when the formula is written to row 8 it becomes: =SUM(B1:B8), when written to row 9 it becomes: =SUM(B1:B9), and so on.


Fields of the reserved table Sheet.Out allow a script to specify certain properties of the worksheet output by L4X. The L4X filter will, by default, protect the output worksheet. You can allow the output worksheet's contents to be edited, using statement:
Sheet.Out.protect = 0 -- allow editing
Please note that if the scripted worksheet is already protected, then the output worksheet cannot be un-protected.
L4X will also, by default, remove any VBA macros in the output workbook. VBA macros can be saved, by including the statement:
Sheet.Out.vbsave = 1 -- save VBA macros


The most common statements in an L4X script describe the text-string(s) to be read from an input text-file and the worksheet cell or cells where they are written. A typical statement looks like:
Column.B = {pos=1, len=15, fmt="number"}
All assignments to new fields of the reserved tables Cell and Column follow the same pattern. The field name added to the reserved table (on the left-hand side of the assignment operator = ) specifies the worksheet location where text is written, and the value assigned to this field (on the right-hand side of operator =) is a reference to a table which specifies which text-string(s) are read from the input file. Each element of this statement is described, in detail, below:
The "wizard buttons" in the L4X development environment can help you to build statements like the one described above.


The L4X filter automatically constructs reserved Lua tables that direct how text is imported. The names of these tables are chosen from the vocabulary of Excel so that L4X scripts can clearly express their purpose. The fields of reserved tables that select text-strings have names like: line, position and page for the same reason. The "wizard buttons" in the L4X development environment can help you to build statements that will assign new tables to the reserved tables described above.


Lua 4.0 is a dynamically typed language. This means that variables do not have fixed types, but the value assigned to a variable will have one of the Lua types: nil, number, string, function, userdata and table.
A value of type nil is returned by any undefined variable name, a false result from the relational and logical operations or may be returned by a call to a function as described in following sections of this document. You may test for a nil value as follows:
if  (not x) then
    -- x is nil
    end
A value of type table is a reference to a table object, constructed by the Lua operator { }. A Lua table can contain many fields, each field having a unique name and an associated value. Because a Lua table is an "associative array", the values contained in a table can be indexed by number or name. Lua tables are constructed and used as follows:
x = {}     -- use {} to construct a new empty table
x.pos = 1  -- adds field-name pos with value 1 to table x
x.len = 9  -- adds field-name len with value 9 to table x
x.tbl = {} -- adds field-name tbl referring to another empty table
For the sake of brevity, a table may be constructed and initialised in one expression, where each "field" name=value pair is delimited by a comma (,), as follows:
x = {pos=1, len=9} -- construct and initialise table
The value of an individual field of a table may also be referred to by its index, as follows:
x["pos"] = 1
x["len"] = 9
print(x.len) -- >> 9
An array is a table with numeric field names. The following example demonstrates assignment to elements of an array, using the index operation:
a = {}
a[1] = "element one"
a[2] = "element two"
j = 1
print(a[j]) -- >> "element one"


String constants are delimited by matching single or double quotes, and can contain the C-like escape sequences: A character in a string may also be specified by its numerical value, through the escape sequence `\ddd', where ddd is a sequence of up to three decimal digits. Strings in Lua may contain any 8-bit value, including embedded zeros, which can be specified as `\000'.
Literal strings can also be delimited by matching [[ ... ]]. Literals in this bracketed form may run for several lines, may contain nested [[ ... ]] pairs, and do not interpret escape sequences. This form is especially convenient for writing strings that contain program pieces or other quoted strings. As an example, in a system using ASCII, the following three literals are equivalent:
       1)   "alo\n123\""
       2)   '\97lo\10\04923"'
       3)   [[alo
            123"]]
Numerical constants may be written with an optional decimal part and an optional decimal exponent. Examples of valid numerical constants are
Comments start with -- (two hyphens) and continue until the end of the line. Also, if the first line of a file starts with a # (hash), then the first line is also a comment.


L4X scripts can use any of the operators defined in Lua 4.0.


L4X scripts can use any of the control structures defined in Lua 4.0. For convenience, if is defined below as:
if logical-expression then
   statement(s)
end

if logical-expression then
   statement(s)
else
   statement(s)
end
The evaluation of a "logical expression" may return any value. All non-nil values are considered true, only the nil value is considered false.

Lua also provides iterative control structures while and for. L4X scripts should not require iteration and iterative control structures should be used with extreme caution.


L4X scripts can use any of the functions defined in Lua 4.0 or in its standard libraries. The following functions from the Lua string manipulation library are documented here, for convenience, as they will often be used in event functions. The "wizard buttons" in the L4X development environment can help you create a valid code "snippets" using the functions strfind() or strsub() described below.
A script may, of course, define new functions:
function printhello() -- start function definition of "printhello"
    print("hello")
end                   -- end function definition
-- call function
printhello() -- >> "hello"


This document ©Jane Hearn 2008.
All Microsoft trademarks acknowledged.