## LuaMacro - a macro preprocessor for Lua

This is a library and driver script for preprocessing and evaluating Lua code.
Lexical macros can be defined, which may be simple C-preprocessor style macros or
macros that change their expansion depending on the context.

It is a new, rewritten version of the
[Luaforge](http://luaforge.net/projects/luamacro/) project of the same name, which
required the [token filter
patch](http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#tokenf) by Luiz Henrique de
Figueiredo. This patch allowed Lua scripts to filter the raw token stream before
the compiler stage. Within the limits imposed by the lexical filter approach this
worked pretty well. However, the token filter patch is unlikely to ever become
part of mainline Lua, either in its original or
[revised](http://lua-users.org/lists/lua-l/2010-02/msg00325.html) form. So the most
portable option becomes precompilation, but Lua bytecode is not designed to be
platform-independent and in any case changes faster than the surface syntax of the
language. So using LuaMacro with LuaJIT would have required re-applying the patch,
and would remain within the ghetto of specialized, experimental use.

This implementation uses a [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg.html)
lexical analyser originally by [Peter
Odding](http://lua-users.org/wiki/LpegRecipes) to tokenize Lua source, and builds
up a preprocessed string explicitly, which then can be loaded in the usual way.
This is not as efficient as the original, but it can be used by anyone with a Lua
interpreter, whether it is Lua 5.1, 5.2 or LuaJIT 2. An advantage of fully building
the output is that it becomes much easier to debug macros when you can actually see
the generated results.
(Another example of a LPeg-based Lua macro preprocessor is
[Luma](http://luaforge.net/projects/luma/))

It is not possible to discuss macros in Lua without mentioning Fabien Fleutot's
[Metalua](http://metalua.luaforge.net/) which is an alternative Lua compiler which
supports syntactical macros that can work on the AST (Abstract Syntax Tree) itself
of Lua. This is clearly a technically superior way to extend Lua syntax, but again
has the disadvantage of being a direct-to-bytecode compiler. (Perhaps it's also a
matter of taste, since I find it easier to think about extending Lua on the lexical
level.)

My renewed interest in Lua lexical macros came from some discussions on the Lua
mailing list about numerically optimal Lua code using LuaJIT. We have been spoiled
by modern optimizing C/C++ compilers, where hand-optimization is often discouraged,
but LuaJIT is new and requires some assistance. For instance, unrolling short loops
can make a dramatic difference, but Lua does not provide the key concept of
constant value to assist the compiler. So a very straightforward use of a macro
preprocessor is to provide named constants in the old-fashioned C way. Very
efficient code can be generated by generalizing the idea of 'varargs' into a
statically-compiled 'tuple' type.

    tuple(3) A,B

The assignment `A = B` is expanded as:

    A_1,A_2,A_3 = B_1,B_2,B_3

I will show how the expansion can be made context-sensitive, so that the
loop-unrolling macro `do_` changes this behaviour:

    do_(i,1,3,
        A = 0.5*B
    )

expands to:

    A_1 = 0.5*B_1
    A_2 = 0.5*B_2
    A_3 = 0.5*B_3

Another use is crafting DSLs, particularly for end-user scripting. For instance,
people may be more comfortable with `forall x in t do` rather than `for _,x in
ipairs(t) do`; there is less to explain in the first form and it translates
directly to the second form. Another example comes from this common pattern:

    some_action(function()
        ...
    end)

Using the following macro:

    def_ block (function() _END_CLOSE_

we can write:

    some_action block
        ...
    end

A criticism of traditional lexical macros is that they don't respect the scoping
rules of the language itself. Bad experiences with the C preprocessor lead many to
regard them as part of the prehistory of computing. The macros described here can
be lexically scoped, and can be as 'hygienic' as necessary, since their expansion
can be finely controlled with Lua itself.

For me, a more serious charge against 'macro magic' is that it can lead to a
private dialect of the language (the original Bourne shell was written in C
'skinned' to look like Algol 68.) This often indicates a programmer uncomfortable
with a language, who wants it to look like something more familiar. Relying on a
preprocessor may mean that programmers need to immerse themselves more in the idioms of
the new language.

That being said, macros can extend a language so that it can be more expressive for
a particular task, particularly if the users are not professional programmers.

### Basic Macro Substitution

To install LuaMacro, expand the archive and make a script or batch file that points
to `luam.lua`, for instance:

    lua /home/frodo/luamacro/luam.lua $*

(Or '%*' if on Windows.) Then put this file on your executable path.

Any Lua code loaded with `luam` goes through four distinct steps:

 * loading and defining macros
 * preprocessing
 * compilation
 * execution

The last two steps happen within Lua itself and always occur; the Lua compiler is
fast enough that we mostly do not bother to save the generated bytecode.

For example, consider this `hello.lua`:

    print(HELLO)

and `hello-def.lua`:

    local macro = require 'macro'
    macro.define 'HELLO "Hello, World!"'

To run the program:

    $> luam -lhello-def hello.lua
    Hello, World!
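
The four steps can be mimicked in a few lines of plain Lua. This is only a toy
sketch and not how LuaMacro is implemented: the names `define`, `preprocess` and
`run` are made up for illustration, and `string.gsub` stands in for the real LPeg
tokenizer (so it would also substitute inside string literals).

```lua
-- Toy illustration only: real LuaMacro tokenizes with LPeg; this uses gsub.
local macros = {}

-- Step 1: loading and defining macros (hypothetical helper, not macro.define)
local function define(name, subst)
  macros[name] = subst
end

-- Step 2: preprocessing; replace whole identifiers that name a macro
local function preprocess(source)
  return (source:gsub("[%a_][%w_]*", function(word)
    return macros[word]  -- nil leaves the identifier unchanged
  end))
end

-- Steps 3 and 4: compile the preprocessed text, then execute the chunk
local function run(source)
  local text = preprocess(source)
  local compile = loadstring or load  -- loadstring in 5.1/LuaJIT, load in 5.2+
  local chunk = assert(compile(text))
  return text, chunk()
end

define("HELLO", '"Hello, World!"')
local text, result = run("return HELLO")
print(text)    -- return "Hello, World!"
print(result)  -- Hello, World!
```

The point is the ordering: the macro table must be populated before the source
text is scanned, and the Lua compiler only ever sees the substituted text.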

So the module `hello-def.lua` is first loaded (compiled and executed, but not
preprocessed) and only then `hello.lua` can be preprocessed and then loaded.

Naturally, there are easier ways to use LuaMacro, but I want to emphasize the
sequence of macro loading, preprocessing and script loading. `luam` has a `-d`
flag, meaning 'dump', which is very useful when debugging the output of the
preprocessing step:

    $> luam -d -lhello-def hello.lua
    print("Hello, World!")

`hello2.lua` is a more sensible first program:

    require_ 'hello-def'
    print(HELLO)

You cannot use the Lua `require` function at this point, since `require` is only
executed when the program starts executing and we want the macro definitions to be
available during the current compilation. `require_` is the macro version, which
loads the file at compile-time.

New with 2.5 is the default @ shortcut available when using `luam`,
so `require_` can be written `@require`.
(`@` is itself a macro, so you can redefine it if needed.)

There is also `include_/@include`, which is analogous to `#include` in `cpp`. It takes a
file path in quotes, and directly inserts the contents of the file into the current
compilation. Although tempting to use, it will not work here because again the
macro definitions will not be available at compile-time.

`hello3.lua` fits much more into the C preprocessor paradigm, which uses the `def_`
macro:

    @def HELLO "Hello, World!"
    print(HELLO)

(Like `cpp`, such macro definitions end with the line; however, there is no
equivalent of `\` to extend the definition over multiple lines.)

With 2.1, an alternative syntax `def_ (name body)` is also available, which can be
embedded inside a macro expression:

    def_ OF_ def_ (of elseif _value ==)

Or even extend over several lines:

    def_ (complain(msg,n)
        for i = 1,n do
            print(msg)
        end
    )

`def_` works pretty much like `#define`, for instance, `def_ SQR(x) ((x)*(x))`.
A number of C-style favourites can be defined, like `assert_` using `_STR_`, which is
a predefined macro that 'stringifies' its argument.

    def_ assert_(condn) assert(condn,_STR_(condn))

`def_` macros are _lexically scoped_:

    local X = 1
    if something then
        def_ X 42
        assert(X == 42)
    end
    assert(X == 1)

LuaMacro keeps track of Lua block structure - in particular it knows when a
particular lexical scope has just been closed. This is how the `_END_CLOSE_`
built-in macro works:

    def_ block (function() _END_CLOSE_

    my_fun block
        do_something_later()
    end

When the current scope closes with `end`, LuaMacro appends the necessary ')' to
make this syntax valid.

A common use of macros in both C and Lua is to inline optimized code for a case.
The Lua function `assert()` always evaluates its second argument, which is not
always optimal:

    def_ ASSERT(condn,expr) if condn then else error(expr) end

    ASSERT(2 == 1,"damn! ".. 2 .." is not equal to ".. 1)

If the message expression is expensive to execute, then this can give better
performance at the price of some extra code. `ASSERT` is now a statement, not a
function, however.

### Conditional Compilation

For this to work consistently, you need to use the `@` shortcut:

    @include 'test.inc'
    @def A 10
    ...

This makes macro 'preprocessor' statements stand out more. Conditional compilation
works as you would expect from C:

    -- test-cond.lua
    @if A
    print 'A defined'
    @else
    print 'A not defined'
    @end
    @if os.getenv 'P'
    print 'Env P is defined'
    @end

Now, what is `A`? It is a Lua expression which is evaluated at _preprocessor_
time, and if it returns any value except `nil` or `false` it is true, using
the usual Lua rule. Assuming `A` is just a global variable, how can it be set?

    $ luam test-cond.lua
    A not defined
    $ luam -VA test-cond.lua
    A defined
    $ export P=1
    $ luam test-cond.lua
    A not defined
    Env P is defined

Although this looks very much like the standard C preprocessor, the implementation
is rather different - `@if` is a special macro which evaluates its argument
(everything on the rest of the line) as a _Lua expression_
and skips up to `@end` (or `@else` or `@elseif`) if that condition is false.


### Using macro.define

`macro.define` is less convenient than `def_` but much more powerful. The extended
form allows the substitution to be a _function_ which is called in-place at compile
time. These definitions must be loaded before they can be used,
either with `-l` or with `@require`.

    macro.define('DATE',function()
        return '"'..os.date('%c')..'"'
    end)

Any text which is returned will be tokenized and inserted into the output stream.
The explicit quoting here is needed to ensure that `DATE` will be replaced by the
string "04/30/11 09:57:53". ('%c' gives you the current locale's version of the
date; for a proper version of this macro, best to use `os.date` [with more explicit
formats](http://www.lua.org/pil/22.1.html).)

This function can also return nothing, which allows you to write macro code purely
for its _side-effects_.

Non-operator characters like `@`,`$`, etc can be used as macros. For example, say
you like shell-like notation `$HOME` for expanding environment variables in your
scripts.

    macro.define '$(x) os.getenv(_STR_(x))'

A script can now say `$(PATH)` and get the expected expansion, Make-style. But we
can do better and support `$PATH` directly:

    macro.define('$',function(get)
        local var = get:iden()
        return 'os.getenv("'..var..'")'
    end)

If a macro has no parameters, then the substitution function receives a 'getter'
object. This provides methods for extracting various token types from the input
stream.
Here the `$` macro must be immediately followed by an identifier.

We can do better, and define `$` so that something like `$(pwd)` has the same
meaning as the Unix shell:

    macro.define('$',function(get)
        local t,v = get()
        if t == 'iden' then
            return 'os.getenv("'..v..'")'
        elseif t == '(' then
            local rest = get:upto ')'
            return 'os.execute("'..tostring(rest)..'")'
        end
    end)

(The getter `get` is callable, and returns the type and value of the next token.)

It is probably a silly example, but it illustrates how a macro can be overloaded
based on its lexical context. Much of the expressive power of LuaMacro comes from
allowing macros to fetch their own parameters in this way. It allows us to define
new syntax and go beyond 'pseudo-functions', which is more important for a
conventional-syntax language like Lua, rather than Lisp where everything looks like
a function anyway. These kinds of macros are called 'reader' macros in the Lisp world,
since they temporarily take over reading code.

It is entirely possible for macros to create macros; that is what `def_` does.
Consider how to add the concept of `const` declarations to Lua:

    const N,M = 10,20

Here is one solution:

    macro.define ('const',function(get)
        get() -- skip the space
        local vars = get:idens '='
        local values = get:list '\n'
        for i,name in ipairs(vars) do
            macro.assert(values[i],'each constant must be assigned!')
            macro.define_scoped(name,tostring(values[i]))
        end
    end)

The key to making these constants well-behaved is `define_scoped`, which installs a
block handler which resets the macro to its original value, which is usually `nil`.
This test script shows how the scoping works:

    require_ 'const'
    do
        const N,M = 10,20
        do
            const N = 5
            assert(N == 5)
        end
        assert(N == 10 and M == 20)
    end
    assert(N == nil and M == nil)


If we were designing a DSL intended for non-technical users, then we cannot just
say to them 'learn the language properly - go read PiL!'. It would be easier to
explain:

    forall x in {10,20,30} do

than the equivalent generic `for` loop. `forall` can be implemented fairly simply
as a macro:

    macro.define('forall',function(get)
        local var = get:iden()
        local t,v = get:next() -- will be 'in'
        local rest = tostring(get:upto 'do')
        return ('for _,%s in ipairs(%s) do'):format(var,rest)
    end)

That is, first get the loop variable, skip `in`, grab everything up to `do` and
output the corresponding `for` statement.

Useful macros can often be built using these new forms. For instance, here is a
simple list comprehension macro:

    macro.define('L(expr,select) '..
        '(function() local res = {} '..
        ' forall select do res[#res+1] = expr end '..
        'return res end)()'
    )

For example, `L(x^2,x in t)` will make a list of the squares of all elements in `t`.

Why don't we use a long string here? Because we don't wish to insert any extra line
feeds in the output. `macro.forall` defines more sophisticated `forall` statements
and list comprehension expressions, but the principle is the same - see 'tests/test-forall.lua'.

There is a second argument passed to the substitution function, which is a 'putter'
object - an object for building token lists.
For example, a useful shortcut for anonymous functions:

    M.define ('\\',function(get,put)
        local args = get:idens('(')
        local body = get:list()
        return put:keyword 'function' '(' : idens(args) ')' :
            keyword 'return' : list(body) : space() : keyword 'end'
    end)

The `put` object has methods for appending particular kinds of tokens, such as
keywords and strings, and is also callable for operator tokens. These always return
the object itself, so the output can be built up with chaining.

Consider `\x,y(x+y)`: the `idens` getter grabs a comma-separated list of identifier
names up to the given token; the `list` getter grabs a general argument list. It
returns a list of token lists and by default stops at ')'. This 'lambda' notation
was suggested by Luiz Henrique de Figueiredo as something easily parsed by any
token-filtering approach - an alternative notation `|x,y| x+y` has been
[suggested](http://lua-users.org/lists/lua-l/2009-12/msg00071.html) but is
generally impossible to implement using a lexical scanner, since it would have to
parse the function body as an expression. The `\\` macro also has the advantage
that the operator precedence is explicit: in the case of `\\(42,'answer')` it is
immediately clear that this is a function of no arguments which returns two values.

I would not necessarily suggest that lambdas are a good thing in
production code, but they _can_ be useful in interactive exploration and within tests.

Macros with explicit parameters can define a substitution function, but this
function receives the values themselves, not the getter and putter objects.
These values are _token lists_ and must be converted into the expected types using the
token list methods:

    macro.define('test_(var,start,finish)',function(var,start,finish)
        var,start,finish = var:get_iden(),start:get_number(),finish:get_number()
        print(var,start,finish)
    end)


Since no `put` object is received, such macros need to construct their own:

    local put = M.Putter()
    ...
    return put

(They can of course still just return the substitution as text.)

### Dynamically controlling macro expansion

Consider this loop-unrolling macro:

    do_(i,1,3,
        y = y + i
    )

which will expand as

    y = y + 1
    y = y + 2
    y = y + 3

For each iteration, it needs to define a local macro `i` which expands to 1, 2 and 3
in turn.

    macro.define('do_(v,s,f,stat)',function(var,start,finish,statements)
        local put = macro.Putter()
        var,start,finish = var:get_iden(),start:get_number(),finish:get_number()
        macro.push_token_stack('do_',var)
        for i = start, finish do
            -- output `set_ <var> <value> `
            put:iden 'set_':iden(var):number(i):space()
            put:tokens(statements)
        end
        -- output `undef_ <var>`
        put:iden 'undef_':iden(var)
        -- output `_DROP_ 'do_'`
        put:iden '_DROP_':string 'do_'
        return put
    end)

Ignoring the macro stack manipulation for a moment, it works by inserting `set_`
macro assignments into the output. That is, the raw output looks like this:

    set_ i 1
    y = y + i
    set_ i 2
    y = y + i
    set_ i 3
    y = y + i
    undef_ i
    _DROP_ 'do_'

It's important here to understand that LuaMacro does not do _recursive_
substitution. Rather, the output of macros is pushed out to the stream which is
then further substituted, etc. So we do need these little helper macros to set the
loop variable at each point.

Using the macro stack allows macros to be aware that they are expanding inside a
`do_` macro invocation.
Consider `tuple`, which is another macro which creates
macros:

    tuple(3) A,B
    A = B

which would expand as

    local A_1,A_2,A_3,B_1,B_2,B_3
    A_1,A_2,A_3 = B_1,B_2,B_3

But we would like

    do_(i,1,3,
        A = B/2
    )

to expand as

    A_1 = B_1/2
    A_2 = B_2/2
    A_3 = B_3/2

And here is the definition:

    macro.define('tuple',function(get)
        get:expecting '('
        local N = get:number()
        get:expecting ')'
        get:expecting 'space'
        local names = get:idens '\n'
        for _,name in ipairs(names) do
            macro.define(name,function(get,put)
                local loop_var = macro.value_of_macro_stack 'do_'
                if loop_var then
                    local loop_idx = tonumber(macro.get_macro_value(loop_var))
                    return put:iden (name..'_'..loop_idx)
                else
                    local out = {}
                    for i = 1,N do
                        out[i] = name..'_'..i
                    end
                    return put:idens(out)
                end
            end)
        end
    end)

The first expansion case happens if we are not within a `do_` macro; a simple list
of names is outputted. Otherwise, we know what the loop variable is, and can
directly ask for its value.

### Operator Macros

You can of course define `@` to be a macro; a new feature allows you to add new
operator tokens:

    macro.define_tokens {'##','@-'}

which can then be used with `macro.define`, but also now with `def_`. It's now
possible to define a list comprehension syntax that reads more naturally, e.g.
`{|x^2| i=1,10}` by making `{|` into a new token.

Up to now, making a Lua operator token such as `.` into a macro was not so useful.
Such a macro may now return an extra value which indicates that the operator should
simply 'pass through' as is. Consider defining a `with` statement:

    with A do
        .x = 1
        .y = 2
    end

I've deliberately indicated the fields using a dot (a rare case of Visual Basic
syntax being superior to Delphi). So it is necessary to overload '.' and look at
the previous token: if it isn't a case like `name.` or `].` then we prepend the
table.
Otherwise, the operator must simply _pass through_, to prevent an
uncontrolled recursion.

    M.define('with',function(get,put)
        M.define_scoped('.',function()
            local lt,lv = get:peek(-1,true) -- peek before the period...
            if lt ~= 'iden' and lt ~= ']' then
                return '_var.'
            else
                return nil,true -- pass through
            end
        end)
        local expr = get:upto 'do'
        return 'do local _var = '..tostring(expr)..'; '
    end)

Again, scoping means that this behaviour is completely local to the with-block.

A more elaborate experiment is `cskin.lua` in the tests directory. This translates
a curly-bracket form into standard Lua, and at its heart is defining '{' and '}' as
macros. You have to keep a brace stack, because these tokens still have their old
meaning and the table constructor in this example must still work, while the
trailing brace must be converted to `end`.

    if (a > b) {
        t = {a,b}
    }

### Pass-Through Macros

Normally a macro replaces the name (plus any arguments) with the substitution. It
is sometimes useful to pass the name through, but not to push the name into the
token stream - otherwise we will get an endless expansion.

    macro.define('fred',function()
        print 'fred was found'
        return nil, true
    end)

This has absolutely no effect on the preprocessed text ('fred' remains 'fred'), but
has a side-effect. This happens if the substitution function returns a second
`true` value. You can look at the immediate lexical environment with `peek`:

    macro.define('fred',function(get)
        local t,v = get:peek(1)
        if t == 'string' then
            local str = get:string()
            return 'fred_'..str
        end
        return nil,true
    end)

Pass-through macros are useful when each macro corresponds to a Lua variable; they
allow such variables to have a dual role.

An example would be Python-style lists.
The [Penlight
List](http://stevedonovan.github.com/Penlight/api/modules/pl.List.html) class has
the same functionality as the built-in Python list, but does not have any
syntactical support:

    > List = require 'pl.List'
    > ls = List{10,20,30}
    > = ls:slice(1,2)
    {10,20}
    > ls:slice_assign(1,2,{10,11,20,21})
    > = ls
    {10,11,20,21,30}

It would be cool if we could add a little bit of custom syntax to make this more
natural. What we first need is a 'macro factory' which outputs the code to create
the lists, and also suitable macros with the same names.

    -- list <var-list> [ = <init-list> ]
    M.define ('list',function(get)
        get() -- skip space
        -- 'list' acts as a 'type' followed by a variable list, which may be
        -- followed by initial values
        local values
        local vars,endt = get:idens (function(t,v)
            return t == '=' or (t == 'space' and v:find '\n')
        end)
        -- there is an initialization list
        if endt[1] == '=' then
            values,endt = get:list '\n'
        else
            values = {}
        end
        -- build up the initialization list
        for i,name in ipairs(vars) do
            M.define_scoped(name,list_check)
            values[i] = 'List('..tostring(values[i] or '')..')'
        end
        local lcal = M._interactive and '' or 'local '
        return lcal..table.concat(vars,',')..' = '..table.concat(values,',')..tostring(endt)
    end)

Note that this is a fairly re-usable pattern; it requires the type constructor
(`List` in this case) and a type-specific macro function (`list_check`). The only
tricky bit is handling the two cases, so the `idens` method finds the end using a
function, not a simple token. `idens`, like `list`, returns the list and the token
that ended the list, so we can use `endt` to check.

    list a = {1,2,3}
    list b

becomes

    local a = List({1,2,3})
    local b = List()

unless we are in interactive mode, where `local` is not appropriate!

Each of these list macro/variables may be used in several ways:

 - directly `a` - no action!
 - `a[i]` - plain table index
 - `a[i:j]` - a list slice. Will be `a:slice(i,j)` normally, but must
   be `a:slice_assign(i,j,RHS)` if on the left-hand side of an assignment.

The substitution function checks these cases by appropriate look-ahead:

    function list_check (get,put)
        local t,v = get:peek(1)
        if t ~= '[' then return nil, true end -- pass-through; plain var reference
        get:expecting '['
        local args = get:list(']',':')
        -- it's just plain table access
        if #args == 1 then return '['..tostring(args[1])..']',true end

        -- two items separated by a colon; use sensible defaults
        M.assert(#args == 2, "slice has two arguments!")
        local start,finish = tostring(args[1]),tostring(args[2])
        if start == '' then start = '1' end
        if finish == '' then finish = '-1' end

        -- look ahead to see if we're on the left hand side of an assignment
        if get:peek(1) == '=' then
            get:next() -- skip '='
            local rest,eoln = get:upto '\n'
            rest,eoln = tostring(rest),tostring(eoln)
            return (':slice_assign(%s,%s,%s)%s'):format(start,finish,rest,eoln),true
        else
            return (':slice(%s,%s)'):format(start,finish),true
        end
    end

This can be used interactively, like so (it requires the Penlight list library.)

    $> luam -llist -i
    Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
    Lua Macro 2.3.0 Copyright (C) 2007-2011 Steve Donovan
    > list a = {'one','two'}
    > = a:map(\x(x:sub(1,1)))
    {o,t}
    > a:append 'three'
    > a:append 'four'
    > = a
    {one,two,three,four}
    > = a[2:3]
    {two,three}
    > = a[2:2] = {'zwei','twee'}
    {one,zwei,twee,three,four}
    > = a[1:2]..{'five'}
    {one,zwei,five}

### Preprocessing C

With the 2.2 release, LuaMacro can preprocess C files, by the inclusion of a C LPeg
lexer based on work by Peter Odding. This may seem a semi-insane pursuit, given
that C already has a preprocessor (which is widely considered a misfeature).
However, the macros we are talking about are clever, they can maintain state, and
can be scoped lexically.

One of the irritating things about C is the need to maintain separate include
files. It would be better if we could write a module like this:


    // dll.c
    #include "dll.h"

    export {
        typedef struct {
            int ival;
        } MyStruct;
    }

    export int one(MyStruct *ms) {
        return ms->ival + 1;
    }

    export int two(MyStruct *ms) {
        return 2*ms->ival;
    }

and have the preprocessor generate an appropriate header file:


    #ifndef DLL_H
    #define DLL_H
    typedef struct {
        int ival;
    } MyStruct;

    int one(MyStruct *ms) ;
    int two(MyStruct *ms) ;
    #endif

The macro `export` is straightforward:


    M.define('export',function(get)
        local t,v = get:next()
        local decl,out
        if v == '{' then
            decl = tostring(get:upto '}')
            decl = M.substitute_tostring(decl)
            f:write(decl,'\n')
        else
            decl = v .. ' ' .. tostring(get:upto '{')
            decl = M.substitute_tostring(decl)
            f:write(decl,';\n')
            out = decl .. '{'
        end
        return out
    end)

It looks ahead and if it finds a `{}` block it writes the block as text to a file
stream; otherwise writes out the function signature. `get:upto '}'` will do the
right thing here since it keeps track of brace level. To allow any other macro
expansions to take place, `substitute_tostring` is directly called.

`tests/cexport.lua` shows how this idea can be extended, so that the generated
header is only updated when it changes.

To preprocess C with `luam`, you need to specify the `-C` flag:

    luam -C -lcexport -o dll.c dll.lc

Have a look at [lc](modules/macro.lc.html) which defines a simplified way to write
Lua bindings in C.
Here is `tests/str.l.c`:

    // preprocess using luam -C -llc -o str.c str.l.c
    #include <string.h>

    module "str" {

        def at (Str s, Int i = 0) {
            lua_pushlstring(L,&s[i-1],1);
            return 1;
        }

        def upto (Str s, Str delim = " ") {
            lua_pushinteger(L, strcspn(s,delim) + 1);
            return 1;
        }

    }

The result looks like this:

    // preprocess using luam -C -llc -o str.c str.l.c
    #line 2 "str.lc"
    #include <string.h>

    #include <lua.h>
    #include <lauxlib.h>
    #include <lualib.h>
    #ifdef WIN32
    #define EXPORT __declspec(dllexport)
    #else
    #define EXPORT
    #endif
    typedef const char *Str;
    typedef const char *StrNil;
    typedef int Int;
    typedef double Number;
    typedef int Boolean;


    #line 6 "str.lc"
    static int l_at(lua_State *L) {
        const char *s = luaL_checklstring(L,1,NULL);
        int i = luaL_optinteger(L,2,0);

    #line 7 "str.lc"

        lua_pushlstring(L,&s[i-1],1);
        return 1;
    }

    static int l_upto(lua_State *L) {
        const char *s = luaL_checklstring(L,1,NULL);
        const char *delim = luaL_optlstring(L,2," ",NULL);

    #line 12 "str.lc"

        lua_pushinteger(L, strcspn(s,delim) + 1);
        return 1;
    }

    static const luaL_reg str_funs[] = {
        {"at",l_at},
        {"upto",l_upto},
        {NULL,NULL}
    };

    EXPORT int luaopen_str (lua_State *L) {
        luaL_register (L,"str",str_funs);

        return 1;
    }

Note the line directives; this makes working with macro-ized C code much easier
when the inevitable compile and run-time errors occur. `lc` takes away some
of the more irritating bookkeeping needed in writing C extensions
(here I only have to mention function names once).

`lc` was used for the [winapi](https://github.com/stevedonovan/winapi) project to
preprocess [this
file](https://github.com/stevedonovan/winapi/blob/master/winapi.l.c)
into [standard C](https://github.com/stevedonovan/winapi/blob/master/winapi.c).

This used an extended version of `lc` which handled the largely superficial
differences between the Lua 5.1 and 5.2 API.

(The curious thing is that `winapi` is my only project where I've leant on
LuaMacro, and it's all in C.)

### A Simple Test Framework

LuaMacro comes with yet another simple test framework - I apologize for this in
advance, because there are already quite enough. But consider it a demonstration
of how a little macro sugar can make tests more readable, even if you are
uncomfortable with them in production code (see `tests/test-test.lua`).

    require_ 'assert'
    assert_ 1 == 1
    assert_ "hello" matches "^hell"
    assert_ x.a throws 'attempt to index global'

The last line is more interesting, since it's transparently wrapping
the offending expression in an anonymous function. The expanded output looks
like this:

    T_ = require 'macro.lib.test'
    T_.assert_eq(1 ,1)
    T_.assert_match("hello" ,"^hell")
    T_.assert_match(T_.pcall_no(function() return x.a end),'attempt to index global')

(This is a generally useful pattern - use macros to provide a thin layer of sugar
over the underlying library. The `macro.assert` module is only 75 lines long, with
comments - its job is to format code to make using the implementation easier.)

Remember that the predefined meaning of @ is to convert `@name` into `name_`. So we
could just as easily say `@assert 1 == 1` and so forth.

Lua functions often return multiple values or tables:

    two = \(40,2)
    table2 = \({40,2})
    @assert two() == (40,2)
    @assert table2() == {40,2}

For a proper grown-up Lua testing framework
that uses LuaMacro, see [Specl](http://gvvaughan.github.io/specl).


### Implementation

It is not usually necessary to understand the underlying representation of token
lists, but I present it here as a guide to understanding the code.

#### Token Lists

The token list representation of the expression `x+1` is:

    {{'iden','x'},{'+','+'},{'number','1'}}

which is the form returned by the LPeg lexical analyser.
Please note that there are
also 'space' and 'comment' tokens in the stream, which is a big difference from the
token-filter standard.

The `TokenList` type defines `__tostring` and some helper methods for these lists.

The following macro is an example of the lower-level coding needed without the
usual helpers:

    local macro = require 'macro'
    macro.define('qw',function(get,put)
        local append = table.insert
        local t,v = get()
        local res = {{'{','{'}}
        t,v = get:next()
        while t ~= ')' do
            if t ~= ',' then
                append(res,{'string','"'..v..'"'})
                append(res,{',',','})
            end
            t,v = get:next()
        end
        append(res,{'}','}'})
        return res
    end)

We're using the getter `next` method to skip any whitespace, but building up the
substitution without a putter, just manipulating the raw token list. `qw` takes a
plain list of words, separated by spaces (and maybe commas) and makes it into a
list of strings. That is,

    qw(one two three)

becomes

    {'one','two','three'}

#### Program Structure

The main loop of `macro.substitute` (towards the end of `macro.lua`) summarizes the
operation of LuaMacro:

There are two macro tables, `imacro` for classic name macros, and `smacro` for
operator style macros. They contain macro tables, which must have a `subst` field
containing the substitution and may have a `parms` field, which means that they
must be followed by their arguments in parentheses.

A keywords table is chiefly used to track block scope, e.g.
`do`,`if`,`function`,etc means 'increase block level' and `end`,`until` means
'decrease block level'. At this point, any defined block handlers for this level
will be evaluated and removed. These may insert tokens into the stream, like
macros. This is how something like `_END_CLOSE_` is implemented: the `end` causes
the block level to decrease, which fires a block handler which passes `end` through
and inserts a closing `)`.
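
The block-level bookkeeping can be sketched in plain Lua. This is a simplified
model under my own assumptions, not the code in `macro.lua`: tokens are bare
strings rather than `{type,value}` pairs, and `expand`, `at_block_end` and the
toy `macros` table are names invented for this illustration.

```lua
-- Keywords that open or close a block; `for` and `while` are covered by their `do`
local opens  = { ["do"] = true, ["if"] = true, ["function"] = true, ["repeat"] = true }
local closes = { ["end"] = true, ["until"] = true }

local function expand(tokens, macros)
  local level, handlers, out = 0, {}, {}
  local pending = {}  -- input kept reversed, so the next token is popped off the end
  for i = #tokens, 1, -1 do pending[#pending + 1] = tokens[i] end

  -- register a handler that fires when the current block closes
  local function at_block_end(fn)
    handlers[level] = handlers[level] or {}
    table.insert(handlers[level], fn)
  end

  while #pending > 0 do
    local tok = table.remove(pending)
    local macro = macros[tok]
    if macro then
      -- macro output goes back onto the input, so it is scanned again
      local subst = macro(at_block_end)
      for i = #subst, 1, -1 do pending[#pending + 1] = subst[i] end
    else
      out[#out + 1] = tok                 -- keywords always pass through first
      if opens[tok] then
        level = level + 1
      elseif closes[tok] then
        for _, fn in ipairs(handlers[level] or {}) do
          out[#out + 1] = fn()            -- handlers may insert extra tokens
        end
        handlers[level] = nil             -- fired once, then removed
        level = level - 1
      end
    end
  end
  return table.concat(out, " ")
end

local macros = {
  block = function() return { "(", "function", "(", ")", "_END_CLOSE_" } end,
  _END_CLOSE_ = function(at_block_end)
    at_block_end(function() return ")" end)
    return {}
  end,
}

local result = expand({ "some_action", "block", "work", "(", ")", "end" }, macros)
print(result)  -- some_action ( function ( ) work ( ) end )
```

Because `_END_CLOSE_` is expanded after `function` has already raised the level,
its handler is attached to the right block, and an inner `if ... end` does not
trigger it.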

Any keyword may also have an associated keyword handler, which works rather like a
macro substitution, except that the keyword itself is always passed through first.
(Allowing keywords as regular macros would generally be a bad idea because of the
recursive substitution problem.)

The macro `subst` field may be a token list or a function. If it is a function then
that function is called, with the parameters as token lists if the macro defined
formal parameters, or with getter and putter objects if not. If the result is text
then it is parsed into a token list.
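
As a rough illustration of that last rule, here is how the dispatch on `subst`
might look in plain Lua. It is a sketch under stated assumptions rather than the
actual code: `tokenize`, `tokens_to_string` and `apply` are hypothetical helpers,
and the tiny tokenizer only knows identifiers, integers, whitespace and
single-character operators.

```lua
local unpack = table.unpack or unpack  -- Lua 5.2+ vs 5.1/LuaJIT

-- Minimal tokenizer producing {type, value} pairs (stand-in for the LPeg lexer)
local function tokenize(text)
  local tl, pos = {}, 1
  while pos <= #text do
    local kind = "iden"
    local s, e = text:find("^[%a_][%w_]*", pos)
    if not s then s, e = text:find("^%d+", pos); kind = "number" end
    if not s then s, e = text:find("^%s+", pos); kind = "space" end
    if not s then s, e = pos, pos; kind = text:sub(pos, pos) end  -- operator
    tl[#tl + 1] = { kind, text:sub(s, e) }
    pos = e + 1
  end
  return tl
end

local function tokens_to_string(tl)
  local parts = {}
  for i, tok in ipairs(tl) do parts[i] = tok[2] end
  return table.concat(parts)
end

-- Resolve a macro's `subst` field to a token list (or nil for side-effect macros)
local function apply(macro, args, get, put)
  local subst = macro.subst
  if type(subst) == "function" then
    if macro.parms then
      subst = subst(unpack(args))   -- formal parameters arrive as token lists
    else
      subst = subst(get, put)       -- otherwise the getter and putter objects
    end
  end
  if type(subst) == "string" then
    subst = tokenize(subst)         -- text results are parsed into tokens
  end
  return subst
end

local SQR = {
  parms = { "x" },
  subst = function(x)
    local s = tokens_to_string(x)
    return "((" .. s .. ")*(" .. s .. "))"
  end,
}

local out = apply(SQR, { tokenize("a+1") })
print(tokens_to_string(out))  -- ((a+1)*(a+1))
```

A `subst` that is already a token list falls straight through both checks and is
returned unchanged.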