## LuaMacro - a macro preprocessor for Lua
This is a library and driver script for preprocessing and evaluating Lua code.
Lexical macros can be defined, which may be simple C-preprocessor style macros or
macros that change their expansion depending on the context.
It is a new, rewritten version of the
[Luaforge](http://luaforge.net/projects/luamacro/) project of the same name, which
required the [token filter
patch](http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#tokenf) by Luiz Henrique de
Figueiredo. This patch allowed Lua scripts to filter the raw token stream before
the compiler stage. Within the limits imposed by the lexical filter approach this
worked pretty well. However, the token filter patch is unlikely to ever become
part of mainline Lua, either in its original or
[revised](http://lua-users.org/lists/lua-l/2010-02/msg00325.html) form. So the most
portable option becomes precompilation, but Lua bytecode is not designed to be
platform-independent and in any case changes faster than the surface syntax of the
language. So using LuaMacro with LuaJIT would have required re-applying the patch,
and would remain within the ghetto of specialized, experimental use.
This implementation uses a [LPeg](http://www.inf.puc-rio.br/~roberto/lpeg.html)
lexical analyser originally by [Peter
Odding](http://lua-users.org/wiki/LpegRecipes) to tokenize Lua source, and builds
up a preprocessed string explicitly, which then can be loaded in the usual way.
This is not as efficient as the original, but it can be used by anyone with a Lua
interpreter, whether it is Lua 5.1, 5.2 or LuaJIT 2. An advantage of fully building
the output is that it becomes much easier to debug macros when you can actually see
the generated results. (Another example of an LPeg-based Lua macro preprocessor is
[Luma](http://luaforge.net/projects/luma/).)
It is not possible to discuss macros in Lua without mentioning Fabien Fleutot's
[Metalua](metalua.luaforge.net/) which is an alternative Lua compiler which
supports syntactical macros that can work on the AST (Abstract Syntax Tree) itself
of Lua. This is clearly a technically superior way to extend Lua syntax, but again
has the disadvantage of being a direct-to-bytecode compiler. (Perhaps it's also a
matter of taste, since I find it easier to think about extending Lua on the lexical
level.)
My renewed interest in Lua lexical macros came from some discussions on the Lua
mailing list about numerically optimal Lua code using LuaJIT. We have been spoiled
by modern optimizing C/C++ compilers, where hand-optimization is often discouraged,
but LuaJIT is new and requires some assistance. For instance, unrolling short loops
can make a dramatic difference, but Lua does not provide the key concept of
constant value to assist the compiler. So a very straightforward use of a macro
preprocessor is to provide named constants in the old-fashioned C way. Very
efficient code can be generated by generalizing the idea of 'varargs' into a
statically-compiled 'tuple' type.
tuple(3) A,B
The assignment `A = B` is expanded as:
A_1,A_2,A_3 = B_1,B_2,B_3
I will show how the expansion can be made context-sensitive, so that the
loop-unrolling macro `do_` changes this behaviour:
do_(i,1,3,
A = 0.5*B
)
expands to:
A_1 = 0.5*B_1
A_2 = 0.5*B_2
A_3 = 0.5*B_3
Another use is crafting DSLs, particularly for end-user scripting. For instance,
people may be more comfortable with `forall x in t do` rather than `for _,x in
ipairs(t) do`; there is less to explain in the first form and it translates
directly to the second form. Another example comes from this common pattern:
some_action(function()
...
end)
Using the following macro:
def_ block (function() _END_CLOSE_
we can write:
some_action block
...
end
A criticism of traditional lexical macros is that they don't respect the scoping
rules of the language itself. Bad experiences with the C preprocessor lead many to
regard them as part of the prehistory of computing. The macros described here can
be lexically scoped, and can be as 'hygienic' as necessary, since their expansion
can be finely controlled with Lua itself.
For me, a more serious charge against 'macro magic' is that it can lead to a
private dialect of the language (the original Bourne shell was written in C
'skinned' to look like Algol 68.) This often indicates a programmer uncomfortable
with a language, who wants it to look like something more familiar. Relying on a
preprocessor may mean that programmers need to immerse themselves more in the idioms of
the new language.
That being said, macros can extend a language so that it can be more expressive for
a particular task, particularly if the users are not professional programmers.
### Basic Macro Substitution
To install LuaMacro, expand the archive and make a script or batch file that points
to `luam.lua`, for instance:
lua /home/frodo/luamacro/luam.lua $*
(Or '%*' if on Windows.) Then put this file on your executable path.
Any Lua code loaded with `luam` goes through four distinct steps:
* loading and defining macros
* preprocessing
* compilation
* execution
The last two steps happen within Lua itself and always occur; the Lua compiler is
fast enough that we mostly do not bother to save the generated bytecode.
For example, consider this `hello.lua`:
print(HELLO)
and `hello-def.lua`:
local macro = require 'macro'
macro.define 'HELLO "Hello, World!"'
To run the program:
$> luam -lhello-def hello.lua
Hello, World!
So the module `hello-def.lua` is first loaded (compiled and executed, but not
preprocessed) and only then `hello.lua` can be preprocessed and then loaded.
Naturally, there are easier ways to use LuaMacro, but I want to emphasize the
sequence of macro loading, preprocessing and script loading. `luam` has a `-d`
flag, meaning 'dump', which is very useful when debugging the output of the
preprocessing step:
$> luam -d -lhello-def hello.lua
print("Hello, World!")
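The ordering of these steps can be imitated in a few lines of plain Lua. This is only a toy model, not LuaMacro's implementation: the `macros` table and the `preprocess` function are assumptions made for illustration, and the naive `gsub` would also rewrite identifiers inside strings and comments, which a real lexer avoids.

```lua
-- Toy model of the luam pipeline; NOT LuaMacro's implementation.
-- Assumption: macros are a plain table of name -> replacement text.
local compile = loadstring or load   -- Lua 5.1/LuaJIT vs Lua 5.2+

-- step 1: load and define macros
local macros = { HELLO = '"Hello, World!"' }

-- step 2: preprocess the source text into a new string
local function preprocess(src)
  return (src:gsub('[%a_][%w_]*', function(name)
    return macros[name]   -- nil leaves the identifier untouched
  end))
end

local source = 'return HELLO'
local expanded = preprocess(source)   -- the text that a dump would show

-- steps 3 and 4: compile and execute the generated code
local chunk = assert(compile(expanded))
local result = chunk()
print(expanded)
print(result)
```

Because the preprocessed text exists as an ordinary string before compilation, it can be printed and inspected, which is the point of the `-d` flag.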
`hello2.lua` is a more sensible first program:
require_ 'hello-def'
print(HELLO)
You cannot use the Lua `require` function at this point, since `require` is only
executed when the program starts executing and we want the macro definitions to be
available during the current compilation. `require_` is the macro version, which
loads the file at compile-time.
New with 2.5 is the default @ shortcut available when using `luam`,
so `require_` can be written `@require`.
(`@` is itself a macro, so you can redefine it if needed.)
There is also `include_/@include`, which is analogous to `#include` in `cpp`. It takes a
file path in quotes, and directly inserts the contents of the file into the current
compilation. Although tempting to use, it will not work here because again the
macro definitions will not be available at compile-time.
`hello3.lua` fits much more into the C preprocessor paradigm, which uses the `def_`
macro:
@def HELLO "Hello, World!"
print(HELLO)
(Like `cpp`, such macro definitions end with the line; however, there is no
equivalent of `\` to extend the definition over multiple lines.)
With 2.1, an alternative syntax `def_ (name body)` is also available, which can be
embedded inside a macro expression:
def_ OF_ def_ (of elseif _value ==)
Or even extend over several lines:
def_ (complain(msg,n)
for i = 1,n do
print(msg)
end
)
`def_` works pretty much like `#define`, for instance, `def_ SQR(x) ((x)*(x))`. A
number of C-style favourites can be defined, like `assert_` using `_STR_`, which is
a predefined macro that 'stringifies' its argument.
def_ assert_(condn) assert(condn,_STR_(condn))
`def_` macros are _lexically scoped_:
local X = 1
if something then
def_ X 42
assert(X == 42)
end
assert(X == 1)
LuaMacro keeps track of Lua block structure - in particular it knows when a
particular lexical scope has just been closed. This is how the `_END_CLOSE_`
built-in macro works:
def_ block (function() _END_CLOSE_
my_fun block
do_something_later()
end
When the current scope closes with `end`, LuaMacro appends the necessary ')' to
make this syntax valid.
A common use of macros in both C and Lua is to inline optimized code for a case.
The Lua function `assert()` always evaluates its second argument, which is not
always optimal:
def_ ASSERT(condn,expr) if condn then else error(expr) end
ASSERT(2 == 1,"damn! ".. 2 .." is not equal to ".. 1)
If the message expression is expensive to execute, then this can give better
performance at the price of some extra code. `ASSERT` is now a statement, not a
function, however.
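The difference is easy to observe in plain Lua by counting how often the message expression runs. The second form below is written out by hand as the text that `ASSERT` would expand to; `expensive_message` is a hypothetical stand-in for a costly expression.

```lua
local calls = 0
local function expensive_message()
  calls = calls + 1
  return 'value mismatch'
end

-- function form: the message is built even though the condition holds
assert(1 == 1, expensive_message())
local calls_after_assert = calls

-- hand-written expansion of ASSERT(1 == 1, expensive_message()):
-- the message is only built when the condition fails
if 1 == 1 then else error(expensive_message()) end
local calls_after_macro = calls

print(calls_after_assert, calls_after_macro)
```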
### Conditional Compilation
For this to work consistently, you need to use the `@` shortcut:
@include 'test.inc'
@def A 10
...
This makes macro 'preprocessor' statements stand out more. Conditional compilation
works as you would expect from C:
-- test-cond.lua
@if A
print 'A defined'
@else
print 'A not defined'
@end
@if os.getenv 'P'
print 'Env P is defined'
@end
Now, what is `A`? It is a Lua expression which is evaluated at _preprocessor_
time, and if it returns any value except `nil` or `false` it is true, using
the usual Lua rule. Assuming `A` is just a global variable, how can it be set?
$ luam test-cond.lua
A not defined
$ luam -VA test-cond.lua
A defined
$ export P=1
$ luam test-cond.lua
A not defined
Env P is defined
Although this looks very much like the standard C preprocessor, the implementation
is rather different - `@if` is a special macro which evaluates its argument
(everything on the rest of the line) as a _Lua expression_
and skips up to `@end` (or `@else` or `@elseif`) if that condition is false.
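The idea of evaluating the condition as a Lua expression at preprocessing time can be sketched in plain Lua. This is a toy line-based model and not how LuaMacro implements `@if`: the `eval`, `preprocess` and `make_env` helpers are assumptions, and nesting and `@elseif` are left out to keep it short.

```lua
-- Evaluate `expr` as a Lua expression inside the environment `env`.
local function eval(expr, env)
  local chunk = 'return ' .. expr
  local f, err
  if setfenv then                      -- Lua 5.1 / LuaJIT
    f, err = loadstring(chunk)
    if f then setfenv(f, env) end
  else                                 -- Lua 5.2 and later
    f, err = load(chunk, 'expr', 't', env)
  end
  if not f then error(err) end
  return f()
end

-- Keep or drop lines depending on the most recent @if condition.
local function preprocess(lines, env)
  local out, emitting = {}, true
  for _, line in ipairs(lines) do
    local cond = line:match('^@if%s+(.+)$')
    if cond then
      emitting = eval(cond, env) and true or false
    elseif line == '@else' then
      emitting = not emitting
    elseif line == '@end' then
      emitting = true
    elseif emitting then
      out[#out + 1] = line
    end
  end
  return out
end

-- Preprocessor-time variables, falling back to the usual globals.
local function make_env(vars)
  return setmetatable(vars, { __index = _G })
end

local lines = { "@if A", "print 'A defined'", "@else", "print 'A not defined'", "@end" }
local with_a = preprocess(lines, make_env { A = true })
local without_a = preprocess(lines, make_env {})
print(with_a[1])
print(without_a[1])
```

Falling back to `_G` is what lets a condition such as `os.getenv 'P'` work in this sketch.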
### Using macro.define
`macro.define` is less convenient than `def_` but much more powerful. The extended
form allows the substitution to be a _function_ which is called in-place at compile
time. These definitions must be loaded before they can be used,
either with `-l` or with `@require`.
macro.define('DATE',function()
return '"'..os.date('%c')..'"'
end)
Any text which is returned will be tokenized and inserted into the output stream.
The explicit quoting here is needed to ensure that `DATE` will be replaced by the
string "04/30/11 09:57:53". ('%c' gives you the current locale's version of the
date; for a proper version of this macro, best to use `os.date` [with more explicit
formats](http://www.lua.org/pil/22.1.html) .)
This function can also return nothing, which allows you to write macro code purely
for its _side-effects_.
Non-operator characters like `@`,`$`, etc can be used as macros. For example, say
you like shell-like notation `$HOME` for expanding environment variables in your
scripts.
macro.define '$(x) os.getenv(_STR_(x))'
A script can now say `$(PATH)` and get the expected expansion, Make-style. But we
can do better and support `$PATH` directly:
macro.define('$',function(get)
local var = get:iden()
return 'os.getenv("'..var..'")'
end)
If a macro has no parameters, then the substitution function receives a 'getter'
object. This provides methods for extracting various token types from the input
stream. Here the `$` macro must be immediately followed by an identifier.
We can do better, and define `$` so that something like `$(pwd)` has the same
meaning as the Unix shell:
macro.define('$',function(get)
local t,v = get()
if t == 'iden' then
return 'os.getenv("'..v..'")'
elseif t == '(' then
local rest = get:upto ')'
return 'os.execute("'..tostring(rest)..'")'
end
end)
(The getter `get` is callable, and returns the type and value of the next token.)
It is probably a silly example, but it illustrates how a macro can be overloaded
based on its lexical context. Much of the expressive power of LuaMacro comes from
allowing macros to fetch their own parameters in this way. It allows us to define
new syntax and go beyond 'pseudo-functions', which is more important for a
conventional-syntax language like Lua, rather than Lisp where everything looks like
a function anyway. These kinds of macros are called 'reader' macros in the Lisp world,
since they temporarily take over reading code.
It is entirely possible for macros to create macros; that is what `def_` does.
Consider how to add the concept of `const` declarations to Lua:
const N,M = 10,20
Here is one solution:
macro.define ('const',function(get)
get() -- skip the space
local vars = get:idens '='
local values = get:list '\n'
for i,name in ipairs(vars) do
macro.assert(values[i],'each constant must be assigned!')
macro.define_scoped(name,tostring(values[i]))
end
end)
The key to making these constants well-behaved is `define_scoped`, which installs a
block handler which resets the macro to its original value, which is usually `nil`.
This test script shows how the scoping works:
require_ 'const'
do
const N,M = 10,20
do
const N = 5
assert(N == 5)
end
assert(N == 10 and M == 20)
end
assert(N == nil and M == nil)
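The save-and-restore behaviour behind this test can be modelled with a stack of scopes. The sketch below is plain Lua with hypothetical helpers (`open_scope`, `close_scope` and a local `define_scoped`); it illustrates the idea only and is not the library's block handler mechanism.

```lua
local macros = {}   -- name -> current replacement text
local scopes = {}   -- stack of { name -> { old = previous value } }

local function open_scope()
  scopes[#scopes + 1] = {}
end

local function define_scoped(name, value)
  local saved = scopes[#scopes]
  -- remember the value from before this scope only once
  if saved[name] == nil then saved[name] = { old = macros[name] } end
  macros[name] = value
end

local function close_scope()
  local saved = table.remove(scopes)
  for name, entry in pairs(saved) do macros[name] = entry.old end
end

open_scope()                       -- do
define_scoped('N', '10')
define_scoped('M', '20')
open_scope()                       --   do
define_scoped('N', '5')
local inner_N = macros.N
close_scope()                      --   end
local outer_N, outer_M = macros.N, macros.M
close_scope()                      -- end
local final_N, final_M = macros.N, macros.M
print(inner_N, outer_N, outer_M, final_N, final_M)
```

The previous value is wrapped in a table so that "was undefined before" (`nil`) can still be recorded and restored.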
If we were designing a DSL intended for non-technical users, then we cannot just
say to them 'learn the language properly - go read PiL!'. It would be easier to
explain:
forall x in {10,20,30} do
than the equivalent generic `for` loop. `forall` can be implemented fairly simply
as a macro:
macro.define('forall',function(get)
local var = get:iden()
local t,v = get:next() -- will be 'in'
local rest = tostring(get:upto 'do')
return ('for _,%s in ipairs(%s) do'):format(var,rest)
end)
That is, first get the loop variable, skip `in`, grab everything up to `do` and
output the corresponding `for` statement.
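The rewrite can be imitated at the string level to see exactly what text comes out. This is a sketch, assuming a single-line `forall` header; `expand_forall` is a hypothetical helper that uses pattern matching instead of the token getter.

```lua
local compile = loadstring or load   -- Lua 5.1/LuaJIT vs Lua 5.2+

-- Rewrite `forall <var> in <expr> do` into the generic `for` form.
local function expand_forall(line)
  local var, rest = line:match('^%s*forall%s+([%a_][%w_]*)%s+in%s+(.-)%s+do%s*$')
  if not var then return line end    -- not a forall header: leave it alone
  return ('for _,%s in ipairs(%s) do'):format(var, rest)
end

local header = expand_forall('forall x in {10,20,30} do')
print(header)

-- the expanded header is ordinary Lua, so the loop can be compiled and run
local code = 'local sum = 0 ' .. header .. ' sum = sum + x end return sum'
local total = assert(compile(code))()
print(total)
```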
Useful macros can often be built using these new forms. For instance, here is a
simple list comprehension macro:
macro.define('L(expr,select) '..
'(function() local res = {} '..
' forall select do res[#res+1] = expr end '..
'return res end)()'
)
For example, `L(x^2,x in t)` will make a list of the squares of all elements in `t`.
Why don't we use a long string here? Because we don't wish to insert any extra line
feeds in the output. `macro.forall` defines more sophisticated `forall` statements
and list comprehension expressions, but the principle is the same - see 'tests/test-forall.lua'.
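Written out by hand, and with the inner `forall` already rewritten to a generic `for`, the expansion of `L(x^2,x in t)` is an immediately-called anonymous function. The table `t` here is just sample data.

```lua
local t = { 1, 2, 3 }

-- hand-written expansion of L(x^2,x in t)
local squares = (function() local res = {} for _,x in ipairs(t) do res[#res+1] = x^2 end return res end)()

for i, v in ipairs(squares) do print(i, v) end
```

Everything stays on one line, so the line numbers of the generated code still match the original source.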
There is a second argument passed to the substitution function, which is a 'putter'
object - an object for building token lists. For example, a useful shortcut for
anonymous functions:
M.define ('\\',function(get,put)
local args = get:idens('(')
local body = get:list()
return put:keyword 'function' '(' : idens(args) ')' :
keyword 'return' : list(body) : space() : keyword 'end'
end)
The `put` object has methods for appending particular kinds of tokens, such as
keywords and strings, and is also callable for operator tokens. These always return
the object itself, so the output can be built up with chaining.
Consider `\x,y(x+y)`: the `idens` getter grabs a comma-separated list of identifier
names up to the given token; the `list` getter grabs a general argument list. It
returns a list of token lists and by default stops at ')'. This 'lambda' notation
was suggested by Luiz Henrique de Figueiredo as something easily parsed by any
token-filtering approach - an alternative notation `|x,y| x+y` has been
[suggested](http://lua-users.org/lists/lua-l/2009-12/msg00071.html) but is
generally impossible to implement using a lexical scanner, since it would have to
parse the function body as an expression. The `\\` macro also has the advantage
that the operator precedence is explicit: in the case of `\\(42,'answer')` it is
immediately clear that this is a function of no arguments which returns two values.
I would not necessarily suggest that lambdas are a good thing in
production code, but they _can_ be useful in interactive exploration and within tests.
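For reference, here are the two lambda examples from the text written out by hand as the plain Lua they stand for; the variable names are only for illustration.

```lua
-- \x,y(x+y)
local add = function(x,y) return x+y end

-- \(42,'answer'): no arguments, two return values
local answer = function() return 42,'answer' end

print(add(2, 3))
print(answer())
```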
Macros with explicit parameters can define a substitution function, but this
function receives the values themselves, not the getter and putter objects. These
values are _token lists_ and must be converted into the expected types using the
token list methods:
macro.define('test_(var,start,finish)',function(var,start,finish)
var,start,finish = var:get_iden(),start:get_number(),finish:get_number()
print(var,start,finish)
end)
Since no `put` object is received, such macros need to construct their own:
local put = M.Putter()
...
return put
(They can of course still just return the substitution as text.)
### Dynamically controlling macro expansion
Consider this loop-unrolling macro:
do_(i,1,3,
y = y + i
)
which will expand as
y = y + 1
y = y + 2
y = y + 3
For each iteration, it needs to define a local macro `i` which expands to 1,2 and 3.
macro.define('do_(v,s,f,stat)',function(var,start,finish,statements)
local put = macro.Putter()
var,start,finish = var:get_iden(),start:get_number(),finish:get_number()
macro.push_token_stack('do_',var)
for i = start, finish do
-- output `set_ <var> <value> `
put:iden 'set_':iden(var):number(i):space()
put:tokens(statements)
end
-- output `undef_ <var>`
put:iden 'undef_':iden(var)
-- output `_DROP_ 'do_'`
put:iden '_DROP_':string 'do_'
return put
end)
Ignoring the macro stack manipulation for a moment, it works by inserting `set_`
macro assignments into the output. That is, the raw output looks like this:
set_ i 1
y = y + i
set_ i 2
y = y + i
set_ i 3
y = y + i
undef_ i
_DROP_ 'do_'
It's important here to understand that LuaMacro does not do _recursive_
substitution. Rather, the output of macros is pushed out to the stream which is
then further substituted, etc. So we do need these little helper macros to set the
loop variable at each point.
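The "push the output back and scan again" behaviour can be modelled with a stack of pending tokens. This is a toy sketch in plain Lua, assuming whitespace-separated tokens; the `expand` function and the simplified `set_` and `undef_` helpers are hypothetical and much cruder than the real ones.

```lua
local macros = {}

-- Expand a whitespace-separated token string. Macro output goes back onto
-- the input, so it is scanned again before reaching the output.
local function expand(src)
  local tokens, input, output = {}, {}, {}
  for tok in src:gmatch('%S+') do tokens[#tokens + 1] = tok end
  for i = #tokens, 1, -1 do input[#input + 1] = tokens[i] end  -- next token on top
  while #input > 0 do
    local tok = table.remove(input)
    local macro = macros[tok]
    if macro then
      local result = macro(input) or {}
      for i = #result, 1, -1 do input[#input + 1] = result[i] end
    else
      output[#output + 1] = tok
    end
  end
  return table.concat(output, ' ')
end

-- `set_ <name> <value>` reads two raw tokens and (re)defines a macro
macros.set_ = function(input)
  local name = table.remove(input)
  local value = table.remove(input)
  macros[name] = function() return { value } end
end

-- `undef_ <name>` removes the macro again
macros.undef_ = function(input)
  local name = table.remove(input)
  macros[name] = nil
end

local out = expand('set_ i 1 y = y + i set_ i 2 y = y + i undef_ i')
print(out)
```

Note that `set_` reads its name as a raw token, so the second `set_ i 2` is not tripped up by `i` already being a macro.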
Using the macro stack allows macros to be aware that they are expanding inside a
`do_` macro invocation. Consider `tuple`, which is another macro which creates
macros:
tuple(3) A,B
A = B
which would expand as
local A_1,A_2,A_3,B_1,B_2,B_3
A_1,A_2,A_3 = B_1,B_2,B_3
But we would like
do_(i,1,3,
A = B/2
)
to expand as
A_1 = B_1/2
A_2 = B_2/2
A_3 = B_3/2
And here is the definition:
macro.define('tuple',function(get)
get:expecting '('
local N = get:number()
get:expecting ')'
get:expecting 'space'
local names = get:idens '\n'
for _,name in ipairs(names) do
macro.define(name,function(get,put)
local loop_var = macro.value_of_macro_stack 'do_'
if loop_var then
local loop_idx = tonumber(macro.get_macro_value(loop_var))
return put:iden (name..'_'..loop_idx)
else
local out = {}
for i = 1,N do
out[i] = name..'_'..i
end
return put:idens(out)
end
end)
end
end)
If we are not within a `do_` macro (the `else` branch), a simple list of names is
output. Otherwise, we know what the loop variable is, and can directly ask for its
value.
### Operator Macros
You can of course define `@` to be a macro; a new feature allows you to add new
operator tokens:
macro.define_tokens {'##','@-'}
which can then be used with `macro.define`, but also now with `def_`. It's now
possible to define a list comprehension syntax that reads more naturally, e.g.
`{|x^2| i=1,10}` by making `{|` into a new token.
Up to now, making a Lua operator token such as `.` into a macro was not so useful.
Such a macro may now return an extra value which indicates that the operator should
simply 'pass through' as is. Consider defining a `with` statement:
with A do
.x = 1
.y = 2
end
I've deliberately indicated the fields using a dot (a rare case of Visual Basic
syntax being superior to Delphi). So it is necessary to overload '.' and look at
the previous token: if it isn't a case like `name.` or `].` then we prepend the
table. Otherwise, the operator must simply _pass through_, to prevent an
uncontrolled recursion.
M.define('with',function(get,put)
M.define_scoped('.',function()
local lt,lv = get:peek(-1,true) -- peek before the period...
if lt ~= 'iden' and lt ~= ']' then
return '_var.'
else
return nil,true -- pass through
end
end)
local expr = get:upto 'do'
return 'do local _var = '..tostring(expr)..'; '
end)
Again, scoping means that this behaviour is completely local to the with-block.
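Going by the strings the macro returns, the `with A do` example should come out roughly as the following plain Lua (written by hand here, with `A` as an empty sample table):

```lua
local A = {}

-- hand-written expansion of:  with A do  .x = 1  .y = 2  end
do local _var = A;
  _var.x = 1
  _var.y = 2
end

print(A.x, A.y)
```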
A more elaborate experiment is `cskin.lua` in the tests directory. This translates
a curly-bracket form into standard Lua, and at its heart is defining '{' and '}' as
macros. You have to keep a brace stack, because these tokens still have their old
meaning and the table constructor in this example must still work, while the
trailing brace must be converted to `end`.
if (a > b) {
t = {a,b}
}
### Pass-Through Macros
Normally a macro replaces the name (plus any arguments) with the substitution. It
is sometimes useful to pass the name through, but not to push the name into the
token stream - otherwise we will get an endless expansion.
macro.define('fred',function()
print 'fred was found'
return nil, true
end)
This has absolutely no effect on the preprocessed text ('fred' remains 'fred'), but
has a side-effect. This happens if the substitution function returns a second
`true` value. You can look at the immediate lexical environment with `peek`:
macro.define('fred',function(get)
local t,v = get:peek(1)
if t == 'string' then
local str = get:string()
return 'fred_'..str
end
return nil,true
end)
Pass-through macros are useful when each macro corresponds to a Lua variable; they
allow such variables to have a dual role.
An example would be Python-style lists. The [Penlight
List](http://stevedonovan.github.com/Penlight/api/modules/pl.List.html) class has
the same functionality as the built-in Python list, but does not have any
syntactical support:
> List = require 'pl.List'
> ls = List{10,20,30}
> = ls:slice(1,2)
{10,20}
> ls:slice_assign(1,2,{10,11,20,21})
> = ls
{10,11,20,21,30}
It would be cool if we could add a little bit of custom syntax to make this more
natural. What we first need is a 'macro factory' which outputs the code to create
the lists, and also suitable macros with the same names.
-- list <var-list> [ = <init-list> ]
M.define ('list',function(get)
get() -- skip space
-- 'list' acts as a 'type' followed by a variable list, which may be
-- followed by initial values
local values
local vars,endt = get:idens (function(t,v)
return t == '=' or (t == 'space' and v:find '\n')
end)
-- there is an initialization list
if endt[1] == '=' then
values,endt = get:list '\n'
else
values = {}
end
-- build up the initialization list
for i,name in ipairs(vars) do
M.define_scoped(name,list_check)
values[i] = 'List('..tostring(values[i] or '')..')'
end
local lcal = M._interactive and '' or 'local '
return lcal..table.concat(vars,',')..' = '..table.concat(values,',')..tostring(endt)
end)
Note that this is a fairly re-usable pattern; it requires the type constructor
(`List` in this case) and a type-specific macro function (`list_check`). The only
tricky bit is handling the two cases, so the `idens` method finds the end using a
function, not a simple token. `idens`, like `list`, returns the list and the token
that ended the list, so we can use `endt` to check.
list a = {1,2,3}
list b
becomes
local a = List({1,2,3})
local b = List()
unless we are in interactive mode, where `local` is not appropriate!
Each of these list macro/variables may be used in several ways:
- directly `a` - no action!
- `a[i]` - plain table index
- `a[i:j]` - a list slice. Will be `a:slice(i,j)` normally, but must
be `a:slice_assign(i,j,RHS)` if on the left-hand side of an assignment.
The substitution function checks these cases by appropriate look-ahead:
function list_check (get,put)
local t,v = get:peek(1)
if t ~= '[' then return nil, true end -- pass-through; plain var reference
get:expecting '['
local args = get:list(']',':')
-- it's just plain table access
if #args == 1 then return '['..tostring(args[1])..']',true end
-- two items separated by a colon; use sensible defaults
M.assert(#args == 2, "slice has two arguments!")
local start,finish = tostring(args[1]),tostring(args[2])
if start == '' then start = '1' end
if finish == '' then finish = '-1' end
-- look ahead to see if we're on the left hand side of an assignment
if get:peek(1) == '=' then
get:next() -- skip '='
local rest,eoln = get:upto '\n'
rest,eoln = tostring(rest),tostring(eoln)
return (':slice_assign(%s,%s,%s)%s'):format(start,finish,rest,eoln),true
else
return (':slice(%s,%s)'):format(start,finish),true
end
end
This can be used interactively, like so (it requires the Penlight list library):
$> luam -llist -i
Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
Lua Macro 2.3.0 Copyright (C) 2007-2011 Steve Donovan
> list a = {'one','two'}
> = a:map(\x(x:sub(1,1)))
{o,t}
> a:append 'three'
> a:append 'four'
> = a
{one,two,three,four}
> = a[2:3]
{two,three}
> a[2:2] = {'zwei','twee'}
> = a
{one,zwei,twee,three,four}
> = a[1:2]..{'five'}
{one,zwei,five}
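To make the meaning of the two rewritten calls concrete without installing Penlight, here is a sketch of the slice semantics on plain tables. The `slice` and `slice_assign` functions below are hypothetical stand-ins written to reproduce the results shown in the sessions above; they are not Penlight's code.

```lua
-- Copy list[first..last]; negative indices count from the end.
local function slice(list, first, last)
  local n = #list
  if first < 0 then first = n + first + 1 end
  if last < 0 then last = n + last + 1 end
  local res = {}
  for i = first, last do res[#res + 1] = list[i] end
  return res
end

-- Replace list[first..last] with `values`, which may be longer or shorter.
local function slice_assign(list, first, last, values)
  local n = #list
  if first < 0 then first = n + first + 1 end
  if last < 0 then last = n + last + 1 end
  local tail = {}
  for i = last + 1, n do tail[#tail + 1] = list[i] end
  for i = n, first, -1 do list[i] = nil end          -- cut back to before `first`
  for _, v in ipairs(values) do list[#list + 1] = v end
  for _, v in ipairs(tail) do list[#list + 1] = v end
  return list
end

local a = { 'one', 'two', 'three', 'four' }
local middle = slice(a, 2, 3)                 -- a[2:3]
slice_assign(a, 2, 2, { 'zwei', 'twee' })     -- a[2:2] = {'zwei','twee'}
print(table.concat(middle, ','))
print(table.concat(a, ','))
```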
### Preprocessing C
With the 2.2 release, LuaMacro can preprocess C files, by the inclusion of a C LPeg
lexer based on work by Peter Odding. This may seem a semi-insane pursuit, given
that C already has a preprocessor (which is widely considered a misfeature).
However, the macros we are talking about are clever, they can maintain state, and
can be scoped lexically.
One of the irritating things about C is the need to maintain separate include
files. It would be better if we could write a module like this:
// dll.c
#include "dll.h"
export {
typedef struct {
int ival;
} MyStruct;
}
export int one(MyStruct *ms) {
return ms->ival + 1;
}
}
export int two(MyStruct *ms) {
return 2*ms->ival;
}
and have the preprocessor generate an appropriate header file:
#ifndef DLL_H
#define DLL_H
typedef struct {
int ival;
} MyStruct;
int one(MyStruct *ms) ;
int two(MyStruct *ms) ;
#endif
The macro `export` is straightforward:
M.define('export',function(get)
local t,v = get:next()
local decl,out
if v == '{' then
decl = tostring(get:upto '}')
decl = M.substitute_tostring(decl)
f:write(decl,'\n')
else
decl = v .. ' ' .. tostring(get:upto '{')
decl = M.substitute_tostring(decl)
f:write(decl,';\n')
out = decl .. '{'
end
return out
end)
It looks ahead and if it finds a `{}` block it writes the block as text to a file
stream; otherwise writes out the function signature. `get:upto '}'` will do the
right thing here since it keeps track of brace level. To allow any other macro
expansions to take place, `substitute_tostring` is directly called.
`tests/cexport.lua` shows how this idea can be extended, so that the generated
header is only updated when it changes.
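One simple way to get the "only update when changed" behaviour is to compare the new text with what is already on disk before writing. The sketch below is an assumption about how this could be done in plain Lua, not a copy of `tests/cexport.lua`; `write_if_changed` is a hypothetical helper.

```lua
-- Write `text` to `path` unless the file already holds exactly that text.
-- Returns true if the file was (re)written.
local function write_if_changed(path, text)
  local f = io.open(path, 'r')
  if f then
    local old = f:read('*a')
    f:close()
    if old == text then return false end   -- unchanged: leave the timestamp alone
  end
  f = assert(io.open(path, 'w'))
  f:write(text)
  f:close()
  return true
end

local path = os.tmpname()
local header = 'int one(MyStruct *ms) ;\nint two(MyStruct *ms) ;\n'
local first = write_if_changed(path, header)    -- new content, so it is written
local second = write_if_changed(path, header)   -- same content, so it is skipped
os.remove(path)
print(first, second)
```

Leaving an unchanged header untouched keeps its modification time stable, so build tools that depend on it do not recompile needlessly.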
To preprocess C with `luam`, you need to specify the `-C` flag:
luam -C -lcexport -o dll.c dll.lc
Have a look at [lc](modules/macro.lc.html) which defines a simplified way to write
Lua bindings in C. Here is `tests/str.l.c`:
// preprocess using luam -C -llc -o str.c str.l.c
#include <string.h>
module "str" {
def at (Str s, Int i = 0) {
lua_pushlstring(L,&s[i-1],1);
return 1;
}
def upto (Str s, Str delim = " ") {
lua_pushinteger(L, strcspn(s,delim) + 1);
return 1;
}
}
The result looks like this:
// preprocess using luam -C -llc -o str.c str.l.c
#line 2 "str.lc"
#include <string.h>
#include <lua.h>
#include <lauxlib.h>
#include <lualib.h>
#ifdef WIN32
#define EXPORT __declspec(dllexport)
#else
#define EXPORT
#endif
typedef const char *Str;
typedef const char *StrNil;
typedef int Int;
typedef double Number;
typedef int Boolean;
#line 6 "str.lc"
static int l_at(lua_State *L) {
const char *s = luaL_checklstring(L,1,NULL);
int i = luaL_optinteger(L,2,0);
#line 7 "str.lc"
lua_pushlstring(L,&s[i-1],1);
return 1;
}
static int l_upto(lua_State *L) {
const char *s = luaL_checklstring(L,1,NULL);
const char *delim = luaL_optlstring(L,2," ",NULL);
#line 12 "str.lc"
lua_pushinteger(L, strcspn(s,delim) + 1);
return 1;
}
static const luaL_reg str_funs[] = {
{"at",l_at},
{"upto",l_upto},
{NULL,NULL}
};
EXPORT int luaopen_str (lua_State *L) {
luaL_register (L,"str",str_funs);
return 1;
}
Note the line directives; this makes working with macro-ized C code much easier
when the inevitable compile and run-time errors occur. `lc` takes away some
of the more irritating bookkeeping needed in writing C extensions
(here I only have to mention function names once).
`lc` was used for the [winapi](https://github.com/stevedonovan/winapi) project to
preprocess [this
file](https://github.com/stevedonovan/winapi/blob/master/winapi.l.c)
into [standard C](https://github.com/stevedonovan/winapi/blob/master/winapi.c).
This used an extended version of `lc` which handled the largely superficial
differences between the Lua 5.1 and 5.2 API.
(The curious thing is that `winapi` is my only project where I've leant on
LuaMacro, and it's all in C.)
### A Simple Test Framework
LuaMacro comes with yet another simple test framework - I apologize for this in
advance, because there are already quite enough of them. But consider it a
demonstration of how a little macro sugar can make tests more readable, even if you
are uncomfortable with macros in production code (see `tests/test-test.lua`):
```lua
require_ 'assert'
assert_ 1 == 1
assert_ "hello" matches "^hell"
assert_ x.a throws 'attempt to index global'
```
The last line is more interesting, since it's transparently wrapping
the offending expression in an anonymous function. The expanded output looks
like this:
```lua
T_ = require 'macro.lib.test'
T_.assert_eq(1 ,1)
T_.assert_match("hello" ,"^hell")
T_.assert_match(T_.pcall_no(function() return x.a end),'attempt to index global')
```
(This is a generally useful pattern - use macros to provide a thin layer of sugar
over the underlying library. The `macro.assert` module is only 75 lines long, with
comments - its job is to format code to make using the implementation easier.)
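To make the expansion above more concrete, here is a rough plain-Lua sketch of what helpers such as `pcall_no` and `assert_match` could look like. The function bodies are assumptions made for illustration, not the actual code of `macro.lib.test`. The pattern is shortened to `'attempt to index'` because the exact wording of that error message differs between Lua versions.

```lua
-- Hypothetical stand-ins for helpers in macro.lib.test (assumed behaviour).
local T_ = {}

-- Call fn and return its error message; complain if fn did NOT fail.
function T_.pcall_no(fn)
    local ok, err = pcall(fn)
    if ok then
        error('expression was expected to throw an error', 2)
    end
    return tostring(err)
end

-- Assert that the string s matches the given Lua pattern.
function T_.assert_match(s, pattern)
    if not tostring(s):match(pattern) then
        error(("'%s' does not match '%s'"):format(tostring(s), pattern), 2)
    end
end

-- Roughly what `assert_ x.a throws '...'` turns into, written by hand.
-- The anonymous function delays evaluation of x.a until pcall can catch it.
T_.assert_match(T_.pcall_no(function() return x.a end), 'attempt to index')
```

The point of the wrapping is that `x.a` cannot be passed as an ordinary argument, since it would fail before the assertion function is even called.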
Remember that the predefined meaning of `@` is to convert `@name` into `name_`. So we
could just as easily say `@assert 1 == 1` and so forth.
Lua functions often return multiple values or tables, and both can be compared
directly:
```lua
two = \(40,2)
table2 = \({40,2})
@assert two() == (40,2)
@assert table2() == {40,2}
```
For a proper grown-up Lua testing framework
that uses LuaMacro, see [Specl](http://gvvaughan.github.io/specl).
### Implementation
It is not usually necessary to understand the underlying representation of token
lists, but I present it here as a guide to understanding the code.
#### Token Lists
The token list representation of the expression `x+1` is:
```lua
{{'iden','x'},{'+','+'},{'number','1'}}
```
which is the form returned by the LPeg lexical analyser. Please note that there are
also 'space' and 'comment' tokens in the stream, which is a big difference from the
token-filter standard.
The `TokenList` type defines `__tostring` and some helper methods for these lists.
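As an illustration of this representation, the following is a minimal sketch of a list type with a `__tostring` metamethod that joins the token values back into source text. It is a stand-in written for this example and not the real `TokenList` class; the names `TL` and `token_list` are hypothetical.

```lua
-- Minimal stand-in for a token list; NOT the real TokenList class.
local TL = {}
TL.__index = TL

-- Concatenate the value (second element) of every {type, value} token.
function TL.__tostring(self)
    local parts = {}
    for i, tok in ipairs(self) do
        parts[i] = tok[2]
    end
    return table.concat(parts)
end

-- Hypothetical constructor: attach the metatable to a plain array of tokens.
local function token_list(t)
    return setmetatable(t, TL)
end

-- 'space' tokens are part of the stream, so they survive the round trip.
local expr = token_list{
    {'iden','x'}, {'space',' '}, {'+','+'}, {'space',' '}, {'number','1'}
}
print(tostring(expr))   --> x + 1
```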
The following macro is an example of the lower-level coding needed without the
usual helpers:
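```lua
local macro = require 'macro'
macro.define('qw',function(get,put)
    local append = table.insert
    -- the first token after the macro name is the opening parenthesis
    local t,v = get()
    -- start the result with an opening brace
    local res = {{'{','{'}}
    t,v = get:next()
    while t ~= ')' do
        if t ~= ',' then
            -- quote each word and follow it with a comma
            append(res,{'string','"'..v..'"'})
            append(res,{',',','})
        end
        t,v = get:next()
    end
    append(res,{'}','}'})
    return res
end)
```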
We're using the getter `next` method to skip any whitespace, but building up the
substitution without a putter, just manipulating the raw token list. `qw` takes a
plain list of words, separated by spaces (and maybe commas) and makes it into a
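list of strings. That is,
```lua
qw(one two three)
```
becomes the equivalent of
```lua
{'one','two','three'}
```
(As written, the macro emits double-quoted strings and a trailing comma after the
last item, both of which Lua accepts.)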
#### Program Structure
The main loop of `macro.substitute` (towards the end of `macro.lua`) summarizes the
operation of LuaMacro.
There are two macro tables, `imacro` for classic name macros, and `smacro` for
operator style macros. They contain macro tables, which must have a `subst` field
containing the substitution and may have a `parms` field, which means that they
must be followed by their arguments in parentheses.
A keywords table is chiefly used to track block scope, e.g.
`do`, `if`, `function`, etc. mean 'increase block level' and `end`, `until` mean
'decrease block level'. When the level decreases, any defined block handlers for this level
will be evaluated and removed. These may insert tokens into the stream, like
macros. This is how something like `_END_CLOSE_` is implemented: the `end` causes
the block level to decrease, which fires a block handler which passes `end` through
and inserts a closing `)`.
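The block-level bookkeeping described above can be sketched in plain Lua over a flat list of token values. This is a simplified model and not the code in `macro.lua`: the `CLOSE_ON_END` marker, the `expand` function and the exact set of opening keywords (including `repeat`) are assumptions made for the example.

```lua
-- Assumed keyword classification, following the description above.
local opens  = { ['do']=true, ['if']=true, ['function']=true, ['repeat']=true }
local closes = { ['end']=true, ['until']=true }

local function expand(tokens)
    local out, level, handlers = {}, 0, {}
    for _, tok in ipairs(tokens) do
        if tok == 'CLOSE_ON_END' then
            -- hypothetical marker: register a handler for the current block level
            handlers[level] = handlers[level] or {}
            table.insert(handlers[level], function() return ')' end)
        else
            out[#out + 1] = tok          -- keywords are always passed through
            if opens[tok] then
                level = level + 1
            elseif closes[tok] then
                -- fire and remove the handlers registered for this level
                for _, handler in ipairs(handlers[level] or {}) do
                    out[#out + 1] = handler()
                end
                handlers[level] = nil
                level = level - 1
            end
        end
    end
    return out
end

local input = { 'f', '(', 'function', '(', ')', 'CLOSE_ON_END',
                'if', 'x', 'then', 'y', '(', ')', 'end', 'end' }
print(table.concat(expand(input), ' '))
--> f ( function ( ) if x then y ( ) end end )
```

The inner `end` closes the `if` block, where no handler is registered, so only the `end` that closes the function body triggers the inserted `)`.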
Any keyword may also have an associated keyword handler, which works rather like a
macro substitution, except that the keyword itself is always passed through first.
(Allowing keywords as regular macros would generally be a bad idea because of the
recursive substitution problem.)
The macro `subst` field may be a token list or a function. If it is a function then
that function is called, with the parameters as token lists if the macro defined
formal parameters, or with getter and putter objects if not. If the result is text
then it is parsed into a token list.
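The dispatch on the `subst` field can be modelled roughly as follows. This is a sketch under stated assumptions, not the actual implementation: `apply_macro`, `text_of` and the toy `tokenize` function are hypothetical, and the tokenizer only recognises names, integers, whitespace and single punctuation characters.

```lua
local unpack = table.unpack or unpack   -- Lua 5.1 and later

-- Toy tokenizer (assumption): just enough for this example.
local function tokenize(text)
    local tokens, pos = {}, 1
    while pos <= #text do
        local kind = 'iden'
        local s, e = text:find('^[%a_][%w_]*', pos)
        if not s then s, e = text:find('^%d+', pos); kind = 'number' end
        if not s then s, e = text:find('^%s+', pos); kind = 'space' end
        if not s then s, e = pos, pos; kind = text:sub(pos, pos) end
        tokens[#tokens + 1] = { kind, text:sub(s, e) }
        pos = e + 1
    end
    return tokens
end

-- Join token values back into text.
local function text_of(tokens)
    local parts = {}
    for i, tok in ipairs(tokens) do parts[i] = tok[2] end
    return table.concat(parts)
end

-- Resolve a macro table to a token list, following the rules described above.
local function apply_macro(m, args, get, put)
    local subst = m.subst
    if type(subst) == 'function' then
        if m.parms then
            subst = subst(unpack(args))   -- arguments arrive as token lists
        else
            subst = subst(get, put)       -- no formal parameters: getter and putter
        end
    end
    if type(subst) == 'string' then
        subst = tokenize(subst)           -- text results are parsed into tokens
    end
    return subst
end

-- A table shaped like the macro tables described above (contents are made up).
local imacro = {
    answer = { subst = { {'number', '42'} } },
    sqr = { parms = {'x'}, subst = function(x)
        local s = text_of(x)
        return '(' .. s .. ')*(' .. s .. ')'
    end },
}

print(text_of(apply_macro(imacro.answer)))                       --> 42
print(text_of(apply_macro(imacro.sqr, { tokenize('a+1') })))     --> (a+1)*(a+1)
```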