Diffstat (limited to 'Data/Libraries/Penlight/docs_topics')
-rw-r--r--  Data/Libraries/Penlight/docs_topics/01-introduction.md    621
-rw-r--r--  Data/Libraries/Penlight/docs_topics/02-arrays.md          649
-rw-r--r--  Data/Libraries/Penlight/docs_topics/03-strings.md         228
-rw-r--r--  Data/Libraries/Penlight/docs_topics/04-paths.md           170
-rw-r--r--  Data/Libraries/Penlight/docs_topics/05-dates.md           111
-rw-r--r--  Data/Libraries/Penlight/docs_topics/06-data.md           1262
-rw-r--r--  Data/Libraries/Penlight/docs_topics/07-functional.md      547
-rw-r--r--  Data/Libraries/Penlight/docs_topics/08-additional.md      600
-rw-r--r--  Data/Libraries/Penlight/docs_topics/09-discussion.md       91
9 files changed, 4279 insertions, 0 deletions
diff --git a/Data/Libraries/Penlight/docs_topics/01-introduction.md b/Data/Libraries/Penlight/docs_topics/01-introduction.md
new file mode 100644
index 0000000..a8bf26a
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/01-introduction.md
@@ -0,0 +1,621 @@
+## Introduction
+
+### Purpose
+
+It is often said of Lua that it does not include batteries. That is because the
+goal of Lua is to produce a lean expressive language that will be used on all
+sorts of machines (some of which don't even have hierarchical filesystems). The
+Lua language is the equivalent of an operating system kernel; the creators of Lua
+do not see it as their responsibility to create a full software ecosystem around
+the language. That is the role of the community.
+
+A principle of software design is to recognize common patterns and reuse them. If
+you find yourself writing things like `io.write(string.format('the answer is %d\n',42))`
+more than a few times, then it becomes useful just to define a
+function `printf`. This is good, not just because repeated code is harder to
+maintain, but because such code is easier to read, once people understand your
+libraries.
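+
+For instance, a minimal sketch of such a `printf` is just:
+
+    local function printf(fmt, ...)
+        io.write(string.format(fmt, ...))
+    end
+
+    printf("the answer is %d\n", 42)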
+
+Penlight captures many such code patterns, so that the intent of your code
+becomes clearer. For instance, a Lua idiom to copy a table is `{unpack(t)}`, but
+this will only work for 'small' tables (for a given value of 'small') so it is
+not very robust. Also, the intent is not clear. So `tablex.deepcopy` is provided,
+which will also copy nested tables and associated metatables, so it can be
+used to clone complex objects.
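+
+A short sketch of the difference from a plain copy:
+
+    local tablex = require 'pl.tablex'
+    local original = {name = 'config', options = {verbose = true}}
+    local copy = tablex.deepcopy(original)
+    copy.options.verbose = false
+    assert(original.options.verbose == true)  -- the nested table was copied too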
+
+The default error handling policy follows that of the Lua standard libraries: if
+an argument is the wrong type, then an error will be thrown, but otherwise we
+return `nil,message` if there is a problem. There are some exceptions; functions
+like `input.fields` default to shutting down the program immediately with a
+useful message. This is more appropriate behaviour for a _script_ than providing
+a stack trace. (However, this default can be changed.) The lexer functions always
+throw errors, to simplify coding, and so should be wrapped in `pcall`.
+
+If you are used to Python conventions, please note that all indices consistently
+start at 1.
+
+The Lua function `table.foreach` has been deprecated in favour of the `for in`
+statement, but such an operation becomes particularly useful with the
+higher-order function support in Penlight. Note that `tablex.foreach` reverses
+the order, so that the function is passed the value and then the key. Although
+perverse, this matches the intended use better.
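+
+For example (note the value-first argument order; a small sketch):
+
+    local tablex = require 'pl.tablex'
+    tablex.foreach({x = 1, y = 2}, function(value, key)
+        print(key, value)
+    end)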
+
+The only important external dependency of Penlight is
+[LuaFileSystem](http://keplerproject.github.com/luafilesystem/manual.html)
+(`lfs`), and if you want `dir.copyfile` to work cleanly on Windows, you will need
+either [alien](http://alien.luaforge.net/) or be using
+[LuaJIT](http://luajit.org) as well. (The fallback is to call the equivalent
+shell commands.)
+
+### To Inject or not to Inject?
+
+It was realized a long time ago that large programs needed a way to keep names
+distinct by putting them into tables (Lua), namespaces (C++) or modules
+(Python). It is obviously impossible to run a company where everyone is called
+'Bruce', except in Monty Python skits. These 'namespace clashes' are more of a
+problem in a simple language like Lua than in C++, because C++ does more
+complicated lookup over 'injected namespaces'. However, in a small group of
+friends, 'Bruce' is usually unique, so in particular situations it's useful to
+drop the formality and not use last names. It depends entirely on what kind of
+program you are writing, whether it is a ten line script or a ten thousand line
+program.
+
+So the Penlight library provides the formal way and the informal way, without
+imposing any preference. You can do it formally like:
+
+ local utils = require 'pl.utils'
+ utils.printf("%s\n","hello, world!")
+
+or informally like:
+
+ require 'pl'
+ utils.printf("%s\n","That feels better")
+
+`require 'pl'` makes all the separate Penlight modules available, without needing
+to require them each individually.
+
+Generally, the formal way is better when writing modules, since then there are no
+global side-effects and the dependencies of your module are made explicit.
+
+Andrew Starks has contributed another way, which balances nicely between the
+formal need to keep the global table uncluttered and the informal need for
+convenience. `require'pl.import_into'` returns a function, which accepts a table
+for injecting Penlight into, or if no table is given, it passes back a new one.
+
+ local pl = require'pl.import_into'()
+
+The table `pl` is a 'lazy table' which loads modules as needed, so we can then
+use `pl.utils.printf` and so forth, without an explicit `require` or harming any
+globals.
+
+If you are using `_ENV` with Lua 5.2 to define modules, then here is a way to
+make Penlight available within a module:
+
+ local _ENV,M = require 'pl.import_into' ()
+
+ function answer ()
+ -- all the Penlight modules are available!
+ return pretty.write(utils.split '10 20 30', '')
+ end
+
+ return M
+
+The default is to put Penlight into `_ENV`, which has the unintended effect of
+making it available from the module (much as `module(...,package.seeall)` does).
+To satisfy both convenience and safety, you may pass `true` to this function, and
+then the _module_ `M` is not the same as `_ENV`, but only contains the exported
+functions.
+
+Otherwise, Penlight will _not_ bring in functions into the global table, or
+clobber standard tables like `io`. `require 'pl'` will bring tables like
+`utils`, `tablex`, etc. into the global table _if they are used_. This
+'load-on-demand' strategy ensures that the whole kitchen sink is not loaded up
+front, so this method is as efficient as explicitly loading required modules.
+
+You have an option to bring the `pl.stringx` methods into the standard string
+table. All strings have a metatable that allows for automatic lookup in `string`,
+so we can say `s:upper()`. Importing `stringx` allows for its functions to also
+be called as methods: `s:strip()`, etc.:
+
+ require 'pl'
+ stringx.import()
+
+or, more explicitly:
+
+ require('pl.stringx').import()
+
+A more delicate operation is importing tables into the local environment. This is
+convenient when the context makes the meaning of a name very clear:
+
+ > require 'pl'
+ > utils.import(math)
+ > = sin(1.2)
+ 0.93203908596723
+
+`utils.import` can also be passed a module name as a string, which is first
+required and then imported. If used in a module, `import` will bring the symbols
+into the module context.
+
+Keeping the global scope simple is very necessary with dynamic languages. Using
+global variables in a big program is always asking for trouble, especially since
+you do not have the spell-checking provided by a compiler. The `pl.strict`
+module enforces a simple rule: globals must be 'declared'. This means that they
+must be assigned before use; assigning to `nil` is sufficient.
+
+ > require 'pl.strict'
+ > print(x)
+ stdin:1: variable 'x' is not declared
+ > x = nil
+ > print(x)
+ nil
+
+The `strict` module provided by Penlight is compatible with the 'load-on-demand'
+scheme used by `require 'pl'`.
+
+`strict` also disallows assignment to global variables, except in the main
+program. Generally, modules have no business messing with global scope; if you
+must do it, then use a call to `rawset`. Similarly, if you have to check for the
+existence of a global, use `rawget`.
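+
+A small sketch of both escape hatches inside a module running under `pl.strict`
+(the global names here are purely illustrative):
+
+    rawset(_G, 'APP_VERSION', '1.0')            -- deliberate assignment to a global
+    local debugging = rawget(_G, 'DEBUG_MODE')  -- existence check without an error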
+
+If you wish to enforce strictness globally, then just add `require 'pl.strict'`
+at the end of `pl/init.lua`, otherwise call it from your main program.
+
+As from 1.1.0, this module provides a `strict.module` function which creates (or
+modifies) modules so that accessing an unknown function or field causes an error.
+
+For example,
+
+ -- mymod.lua
+ local strict = require 'pl.strict'
+ local M = strict.module (...)
+
+ function M.answer ()
+ return 42
+ end
+
+ return M
+
+If you were to accidentally type `mymod.Answer()`, then you would get a runtime
+error: "variable 'Answer' is not declared in 'mymod'".
+
+This can be applied to existing modules. You may desire to have the same level
+of checking for the Lua standard libraries:
+
+ strict.make_all_strict(_G)
+
+Thereafter a typo such as `math.cosine` will give you an explicit error, rather
+than merely returning a `nil` that will cause problems later.
+
+### What are function arguments in Penlight?
+
+Many functions in Penlight themselves take function arguments, like `map` which
+applies a function to a list, element by element. You can use existing
+functions, like `math.max`, anonymous functions (like `function(x,y) return x > y
+end`), or operations by name (e.g. '*' or '..'). The module `pl.operator` exports
+all the standard Lua operations, like the Python module of the same name.
+Penlight allows these to be referred to by name, so `operator.gt` can be more
+concisely expressed as '>'.
+
+Note that the `map` functions pass any extra arguments to the function, so we can
+have `ls:filter('>',0)`, which is a shortcut for
+`ls:filter(function(x) return x > 0 end)`.
+
+Finally, `pl.func` supports _placeholder expressions_ in the Boost lambda style,
+so that an anonymous function to multiply the two arguments can be expressed as
+`_1*_2`.
+
+To use them directly, note that _all_ function arguments in Penlight go through
+`utils.function_arg`. `pl.func` registers itself with this function, so that you
+can directly use placeholder expressions with standard methods:
+
+ > _1 = func._1
+ > = List{10,20,30}:map(_1+1)
+ {11,21,31}
+
+Another option for short anonymous functions is provided by
+`utils.string_lambda`; this is invoked automatically:
+
+ > = List{10,20,30}:map '|x| x + 1'
+ {11,21,31}
+
+### Pros and Cons of Loopless Programming
+
+The standard loops-and-ifs 'imperative' style of programming is dominant, and
+often seems to be the 'natural' way of telling a machine what to do. It is in
+fact very much how the machine does things, but we need to take a step back and
+find ways of expressing solutions in a higher-level way. For instance, applying
+a function to all elements of a list is a common operation:
+
+ local res = {}
+ for i = 1,#ls do
+ res[i] = fun(ls[i])
+ end
+
+This can be efficiently and succinctly expressed as `ls:map(fun)`. Not only is
+there less typing but the intention of the code is clearer. If readers of your
+code spend too much time trying to guess your intention by analyzing your loops,
+then you have failed to express yourself clearly. Similarly, `ls:filter('>',0)`
+will give you all the values in a list greater than zero. (Of course, if you
+don't feel like using `List`, or have non-list-like tables, then `pl.tablex`
+offers the same facilities. In fact, the `List` methods are implemented using
+`tablex` functions.)
+
+A common observation is that loopless programming is less efficient, particularly
+in the way it uses memory. `ls1:map2('*',ls2):reduce '+'` will give you the dot
+product of two lists, but an unnecessary temporary list is created. But
+efficiency is relative to the actual situation: it may turn out to be _fast
+enough_, or the code may not appear in any crucial inner loops, etc.
+
+Writing loops is 'error-prone and tedious', as Stroustrup says. But any
+half-decent editor can be taught to do much of that typing for you. The question
+should actually be: is it tedious to _read_ loops? As with natural language,
+programmers tend to read chunks at a time. A for-loop causes no surprise, and
+probably little brain activity. One argument for loopless programming is that the
+loops you _do_ write stand out more, and signal 'something different
+happening here'. It should not be an all-or-nothing thing, since most programs
+require a mixture of idioms that suit the problem. Some languages (like APL) do
+nearly everything with map and reduce operations on arrays, and so solutions can
+sometimes seem forced. Wisdom is knowing when a particular idiom makes a
+particular problem easy to _solve_ and the solution easy to _explain_ afterwards.
+
+### Generally useful functions.
+
+The function `printf` discussed earlier is included in `pl.utils` because it
+makes properly formatted output easier. (There is an equivalent `fprintf` which
+also takes a file object parameter, just like the C function.)
+
+Splitting a string using a delimiter is a fairly common operation, hence `split`.
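+
+A one-line sketch of its use:
+
+    local utils = require 'pl.utils'
+    local parts = utils.split('one,two,three', ',')
+    -- parts[1] == 'one', parts[2] == 'two', parts[3] == 'three'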
+
+Utility functions like `is_type` help with identifying what
+kind of animal you are dealing with.
+The Lua `type` function handles the basic types, but can't distinguish between
+different kinds of objects, which are all tables. So `is_type` handles both
+cases, like `is_type(s,"string")` and `is_type(ls,List)`.
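+
+A small sketch of both cases:
+
+    local utils = require 'pl.utils'
+    local List = require 'pl.List'
+    print(utils.is_type('hello', 'string'))  --> true
+    print(utils.is_type(List{1,2}, List))    --> true
+    print(utils.is_type({1,2}, List))        --> false: a plain table, not a List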
+
+A common pattern when working with Lua varargs is capturing all the arguments in
+a table:
+
+ function t(...)
+ local args = {...}
+ ...
+ end
+
+But this will bite you someday when `nil` is one of the arguments, since this
+will put a 'hole' in your table. In particular, `#args` will only give you the size
+up to the first `nil` value. Hence the need for `table.pack` - this is a new Lua 5.2
+function which Penlight also defines for Lua 5.1.
+
+    function t(...)
+        local args = table.pack(...)
+        for i = 1,args.n do
+            ...
+        end
+    end
+
+The 'memoize' pattern occurs when you have a function which is expensive to call,
+but which will always return the same value for the same argument. `utils.memoize`
+is given a
+function, and returns another function. This calls the function the first time,
+saves the value for that argument, and thereafter for that argument returns the
+saved value. This is a more flexible alternative to building a table of values
+upfront, since in general you won't know what values are needed.
+
+ sum = utils.memoize(function(n)
+ local sum = 0
+ for i = 1,n do sum = sum + i end
+ return sum
+ end)
+ ...
+ s = sum(1e8) --takes time!
+ ...
+ s = sum(1e8) --returned saved value!
+
+Penlight is fully compatible with Lua 5.1, 5.2 and LuaJIT 2. To ensure this,
+`utils` also defines the global Lua 5.2
+[load](http://www.lua.org/work/doc/manual.html#pdf-load) function as `utils.load`, which takes:
+
+ * the input (either a string or a function)
+ * the source name used in debug information
+ * the mode, a string that can contain either or both of 'b' and 't', depending on
+whether the source is a binary chunk or text code (default is 'bt')
+ * the environment for the compiled chunk
+
+Using `utils.load` should reduce the need to call the deprecated function `setfenv`,
+and make your Lua 5.1 code 5.2-friendly.
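+
+A sketch of loading a chunk into a sandbox environment (the table `env` and the
+chunk name are illustrative):
+
+    local utils = require 'pl.utils'
+    local env = {print = print}
+    local chunk, err = utils.load('print(10 + 20)', 'sandbox-chunk', 't', env)
+    if chunk then chunk() else print('compile error: '..err) end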
+
+The `utils` module exports `getfenv` and `setfenv` for
+Lua 5.2 as well, based on code by Sergey Rozhenko. Note that these functions can fail
+for functions which don't access any globals.
+
+### Application Support
+
+`app.parse_args` is a simple command-line argument parser. If called without any
+arguments, it tries to use the global `arg` array. It returns the _flags_
+(options beginning with '-') as a table of name/value pairs, and the _arguments_
+as an array. It knows about long GNU-style flag names, e.g. `--value`, and
+groups of short flags are understood, so that `-ab` is short for `-a -b`. The
+flags result would then look like `{value=true,a=true,b=true}`.
+
+Flags may take values. The command-line `--value=open -n10` would result in
+`{value='open',n='10'}`; generally you can use '=' or ':' to separate the flag
+from its value, except in the special case where a short flag is followed by an
+integer. Or you may specify upfront that some flags have associated values, and
+then the values will follow the flag.
+
+ > require 'pl'
+ > flags,args = app.parse_args({'-o','fred','-n10','fred.txt'},{o=true})
+ > pretty.dump(flags)
+ {o='fred',n='10'}
+
+`parse_args` is not intelligent or psychic; it will not convert any flag values
+or arguments for you, or raise errors. For that, have a look at
+@{08-additional.md.Command_line_Programs_with_Lapp|Lapp}.
+
+An application which consists of several files usually cannot use `require` to
+load files in the same directory as the main script. `app.require_here()`
+ensures that the Lua module path is modified so that files found locally are
+found first. In the `examples` directory, `test-symbols.lua` uses this function
+to ensure that it can find `symbols.lua` even if it is not run from this directory.
+
+`app.appfile` will create a filename that your application can use to store its
+private data, based on the script name. For example, `app.appfile "test.txt"`
+from a script called `testapp.lua` produces the following file on my Windows
+machine:
+
+ @plain
+ C:\Documents and Settings\SJDonova\.testapp\test.txt
+
+and the equivalent on my Linux machine:
+
+ @plain
+ /home/sdonovan/.testapp/test.txt
+
+If `.testapp` does not exist, it will be created.
+
+Penlight makes it convenient to save application data in Lua format. You can use
+`pretty.dump(t,file)` to write a Lua table in a human-readable form to a file,
+and `pretty.read(file.read(file))` to generate the table again, using the
+`pretty` module.
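+
+A sketch of that round trip (the filename `settings.lua` is only illustrative):
+
+    local pretty = require 'pl.pretty'
+    local file = require 'pl.file'
+    local settings = {width = 80, colour = 'blue'}
+    pretty.dump(settings, 'settings.lua')
+    local reloaded = pretty.read(file.read 'settings.lua')
+    assert(reloaded.width == 80)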
+
+
+### Simplifying Object-Oriented Programming in Lua
+
+Lua is similar to JavaScript in that the concept of class is not directly
+supported by the language. In fact, Lua has a very general mechanism for
+extending the behaviour of tables which makes it straightforward to implement
+classes. A table's behaviour is controlled by its metatable. If that metatable
+has an `__index` function or table, this will handle looking up anything which is
+not found in the original table. A class is just a table with an `__index` key
+pointing to itself. Creating an object involves making a table and setting its
+metatable to the class; then when handling `obj.fun`, Lua first looks up `fun` in
+the table `obj`, and if not found it looks it up in the class. `obj:fun(a)` is
+just short for `obj.fun(obj,a)`. So with the metatable mechanism and this bit of
+syntactic sugar, it is straightforward to implement classic object orientation.
+
+ -- animal.lua
+
+ class = require 'pl.class'
+
+ class.Animal()
+
+ function Animal:_init(name)
+ self.name = name
+ end
+
+ function Animal:__tostring()
+ return self.name..': '..self:speak()
+ end
+
+ class.Dog(Animal)
+
+ function Dog:speak()
+ return 'bark'
+ end
+
+ class.Cat(Animal)
+
+ function Cat:_init(name,breed)
+ self:super(name) -- must init base!
+ self.breed = breed
+ end
+
+ function Cat:speak()
+ return 'meow'
+ end
+
+ class.Lion(Cat)
+
+ function Lion:speak()
+ return 'roar'
+ end
+
+ fido = Dog('Fido')
+ felix = Cat('Felix','Tabby')
+ leo = Lion('Leo','African')
+
+ $ lua -i animal.lua
+ > = fido,felix,leo
+ Fido: bark Felix: meow Leo: roar
+ > = leo:is_a(Animal)
+ true
+ > = leo:is_a(Dog)
+ false
+ > = leo:is_a(Cat)
+ true
+
+All Animal does is define `__tostring`, which Lua will use whenever a string
+representation is needed of the object. In turn, this relies on `speak`, which is
+not defined. So it's what C++ people would call an abstract base class; the
+specific derived classes like Dog define `speak`. Please note that _if_ derived
+classes have their own constructors, they must explicitly call the base
+constructor for their base class; this is conveniently available as the `super`
+method.
+
+Note that (as always) there are multiple ways to implement OOP in Lua; this method
+uses the classic 'a class is the `__index` of its objects' but does 'fat inheritance':
+methods from the base class are copied into the new class. The advantage of this is
+that you are not penalized for long inheritance chains, at the price of larger classes,
+but generally objects outnumber classes! (If not, something odd is going on with your design.)
+
+All such objects will have an `is_a` method, which looks up the inheritance chain
+to find a match. Another form is `class_of`, which can be safely called on all
+objects, so instead of `leo:is_a(Animal)` one can say `Animal:class_of(leo)`.
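+
+Continuing the interactive session above, a short sketch of the difference
+(`class_of` simply returns `false` for anything that is not an instance):
+
+    > = Animal:class_of(leo)
+    true
+    > = Animal:class_of({})
+    false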
+
+There are two ways to define a class, either `class.Name()` or `Name = class()`;
+both work identically, except that the first form will always put the class in
+the current environment (whether global or module); the second form provides more
+flexibility about where to store the class. The first form does _name_ the class
+by setting the `_name` field, which can be useful in identifying the objects of
+this type later. This session illustrates the usefulness of having named classes,
+if no `__tostring` method is explicitly defined.
+
+ > class.Fred()
+ > a = Fred()
+ > = a
+ Fred: 00459330
+ > Alice = class()
+ > b = Alice()
+ > = b
+ table: 00459AE8
+ > Alice._name = 'Alice'
+ > = b
+ Alice: 00459AE8
+
+So `Alice = class(); Alice._name = 'Alice'` is exactly the same as `class.Alice()`.
+
+This useful notation is borrowed from Hugo Etchegoyen's
+[classlib](http://lua-users.org/wiki/MultipleInheritanceClasses) which further
+extends this concept to allow for multiple inheritance. Notice that the
+more convenient form puts the class name in the _current environment_! That is,
+you may use it safely within modules using the old-fashioned `module()`
+or the new `_ENV` mechanism.
+
+There is always more than one way of doing things in Lua; some may prefer this
+style for creating classes:
+
+ local class = require 'pl.class'
+
+ class.Named {
+ _init = function(self,name)
+ self.name = name
+ end;
+
+ __tostring = function(self)
+ return 'boo '..self.name
+ end;
+ }
+
+ b = Named 'dog'
+ print(b)
+ --> boo dog
+
+Note that you have to explicitly declare `self` and end each function definition
+with a semi-colon or comma, since this is a Lua table. To inherit from a base class,
+set the special field `_base` to the class in this table.
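+
+For instance, a sketch of a derived class in the same table style (the name
+`Titled` is just for illustration; it inherits `__tostring` from `Named`):
+
+    class.Titled {
+        _base = Named;
+
+        _init = function(self,name,title)
+            self:super(name)      -- call the base class constructor
+            self.title = title
+        end;
+    }
+
+    t = Titled('dog','Sir')
+    print(t)
+    --> boo dog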
+
+Penlight provides a number of useful classes; there is `List`, which is a Lua
+clone of the standard Python list object, and `Set` which represents sets. There
+are three kinds of _map_ defined: `Map`, `MultiMap` (where a key may have
+multiple values) and `OrderedMap` (where the order of insertion is remembered).
+There is nothing special about these classes and you may inherit from them.
+
+A powerful thing about dynamic languages is that you can redefine existing classes
+and functions, which is often called 'monkey patching'. It's entertaining and convenient,
+but ultimately anti-social; you may modify `List` but then any other modules using
+this _shared_ resource can no longer be sure about its behaviour. (This is why you
+must say `stringx.import()` explicitly if you want the extended string methods - it
+would be a bad default.) Lua is particularly open to modification but the
+community is not as tolerant of monkey-patching as the Ruby community, say. Do you
+wish to add some new methods to `List`? Cool, but that's what subclassing is for.
+
+ class.Strings(List)
+
+ function Strings:my_method()
+ ...
+ end
+
+It's definitely more useful to define exactly how your objects behave
+in _unknown_ conditions. All classes have a `catch` method you can use to set
+a handler for unknown lookups; the function you pass looks exactly like the
+`__index` metamethod.
+
+ Strings:catch(function(self,name)
+ return function() error("no such method "..name,2) end
+ end)
+
+In this case we're just customizing the error message, but
+creative things can be done. Consider this code from `test-vector.lua`:
+
+ Strings:catch(List.default_map_with(string))
+
+ ls = Strings{'one','two','three'}
+ asserteq(ls:upper(),{'ONE','TWO','THREE'})
+ asserteq(ls:sub(1,2),{'on','tw','th'})
+
+So we've converted an unknown method invocation into a map using the function of
+that name found in `string`. So for a `Vector` (which is a specialization of `List`
+for numbers) it makes sense to make `math` the default map so that `v:sin()` makes
+sense.
+
+Note that `map` operations return an object of the same type - this is often called
+_covariance_. So `ls:upper()` itself returns a `Strings` object.
+
+This is not _always_ what you want, but objects can always be cast to the desired type.
+(`cast` doesn't create a new object, but returns the object passed.)
+
+ local sizes = ls:map '#'
+ asserteq(sizes, {3,3,5})
+ asserteq(utils.type(sizes),'Strings')
+ asserteq(sizes:is_a(Strings),true)
+ sizes = Vector:cast(sizes)
+ asserteq(utils.type(sizes),'Vector')
+ asserteq(sizes+1,{4,4,6})
+
+About `utils.type`: it can only return a string for a class type if that class does
+in fact have a `_name` field.
+
+
+_Properties_ are a useful object-oriented pattern. We wish to control access to a
+field, but don't wish to force the user of the class to say `obj:get_field()`
+etc. This excerpt from `tests/test-class.lua` shows how it is done:
+
+
+ local MyProps = class(class.properties)
+ local setted_a, got_b
+
+ function MyProps:_init ()
+ self._a = 1
+ self._b = 2
+ end
+
+ function MyProps:set_a (v)
+ setted_a = true
+ self._a = v
+ end
+
+ function MyProps:get_b ()
+ got_b = true
+ return self._b
+ end
+
+ local mp = MyProps()
+
+ mp.a = 10
+
+ asserteq(mp.a,10)
+ asserteq(mp.b,2)
+ asserteq(setted_a and got_b, true)
+
+The convention is that the internal field name is prefixed with an underscore;
+when reading `mp.a`, a check is first made for an explicit _getter_ `get_a`, and only
+then is `_a` looked up. Similarly, writing to `mp.a` causes the _setter_ `set_a` to be used.
+
+This is cool behaviour, but like much Lua metaprogramming, it is not free. Method
+lookup on such objects goes through `__index` as before, but now `__index` is a
+function which has to explicitly look up methods in the class, before doing any
+property indexing, which is not going to be as fast as field lookup. If however,
+your accessors actually do non-trivial things, then the extra overhead could be
+worth it.
+
+This is not really intended for _access control_ because external code can write
+to `mp._a` directly. It is possible to have this kind of control in Lua, but it
+again comes with run-time costs.
diff --git a/Data/Libraries/Penlight/docs_topics/02-arrays.md b/Data/Libraries/Penlight/docs_topics/02-arrays.md
new file mode 100644
index 0000000..9ee292f
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/02-arrays.md
@@ -0,0 +1,649 @@
+## Tables and Arrays
+
+<a id="list"/>
+
+### Python-style Lists
+
+One of the elegant things about Lua is that tables do the job of both lists and
+dicts (as called in Python) or vectors and maps (as called in C++), and they do
+it efficiently. However, if we are dealing with 'tables with numerical indices'
+we may as well call them lists and look for operations which particularly make
+sense for lists. The Penlight `List` class was originally written by Nick Trout
+for Lua 5.0, and translated to 5.1 and extended by myself. It seemed that
+borrowing from Python was a good idea, and this eventually grew into Penlight.
+
+Here is an example showing `List` in action; it redefines `__tostring`, so that
+it can print itself out more sensibly:
+
+ > List = require 'pl.List' --> automatic with require 'pl' <---
+ > l = List()
+ > l:append(10)
+ > l:append(20)
+ > = l
+ {10,20}
+ > l:extend {30,40}
+ {10,20,30,40}
+ > l:insert(1,5)
+ {5,10,20,30,40}
+ > = l:pop()
+ 40
+ > = l
+ {5,10,20,30}
+ > = l:index(30)
+ 4
+ > = l:contains(30)
+ true
+ > = l:reverse() ---> note: doesn't make a copy!
+ {30,20,10,5}
+
+Although methods like `sort` and `reverse` operate in-place and change the list,
+they do return the original list. This makes it possible to do _method chaining_,
+like `ls = ls:append(10):append(20):reverse():append(1)`. But (and this is an
+important but) no extra copy is made, so `ls` does not change identity. `List`
+objects (like tables) are _mutable_, unlike strings. If you want a copy of a
+list, then `List(ls)` will do the job, i.e. it acts like a copy constructor.
+However, if passed any other table, `List` will just set the metatable of the
+table and _not_ make a copy.
+
+A particular feature of Python lists is _slicing_. This is fully supported in
+this version of `List`, except we use 1-based indexing. So `List.slice` works
+rather like `string.sub`:
+
+ > l = List {10,20,30,40}
+ > = l:slice(1,1) ---> note: creates a new list!
+ {10}
+ > = l:slice(2,2)
+ {20}
+ > = l:slice(2,3)
+ {20,30}
+ > = l:slice(2,-2)
+ {20,30}
+ > = l:slice_assign(2,2,{21,22,23})
+ {10,21,22,23,30,40}
+ > = l:chop(1,1)
+ {21,22,23,30,40}
+
+Functions like `slice_assign` and `chop` modify the list; the first is equivalent
+to Python's `l[i1:i2] = seq` and the second to `del l[i1:i2]`.
+
+List objects are ultimately just Lua 'list-like' tables, but they have extra
+operations defined on them, such as equality and concatenation. For regular
+tables, equality is only true if the two tables are _identical objects_, whereas
+two lists are equal if they have the same contents, i.e. that `l1[i]==l2[i]` for
+all elements.
+
+ > l1 = List {1,2,3}
+ > l2 = List {1,2,3}
+ > = l1 == l2
+ true
+ > = l1..l2
+ {1,2,3,1,2,3}
+
+The `List` constructor can be passed a function. If so, it's assumed that this is
+an iterator function that can be repeatedly called to generate a sequence. One
+such function is `io.lines`; the following short, intense little script counts
+the number of lines in standard input:
+
+ -- linecount.lua
+ require 'pl'
+ ls = List(io.lines())
+ print(#ls)
+
+`List.iterate` captures what `List` considers a sequence. In particular, it can
+also iterate over all 'characters' in a string:
+
+ > for ch in List.iterate 'help' do io.write(ch,' ') end
+ h e l p >
+
+Since the function `iterate` is used internally by the `List` constructor,
+strings can be made into lists of character strings very easily.
+
+There are a number of operations that go beyond the standard Python methods. For
+instance, you can _partition_ a list into a table of sublists using a function.
+In the simplest form, you use a predicate (a function returning a boolean value)
+to partition the list into two lists, one of elements matching and another of
+elements not matching. But you can use any function; if we use `type` then the
+keys will be the standard Lua type names.
+
+ > ls = List{1,2,3,4}
+ > ops = require 'pl.operator'
+    > = ls:partition(function(x) return x > 2 end)
+    {false={1,2},true={3,4}}
+    > ls = List{'one',math.sin,List{1},10,20,List{1,2}}
+    > = ls:partition(type)
+    {function={function: 00369110},string={one},number={10,20},table={{1},{1,2}}}
+
+This is one `List` method which returns a table which is not a `List`. Bear in
+mind that you can always call a `List` method on a plain table argument, so
+`List.partition(t,type)` works as expected. But these functions will only operate
+on the array part of the table.
+
+The 'nominal' type of the returned table is `pl.MultiMap`, which describes a mapping
+between keys and multiple values. This does not mean that `pl.MultiMap` is automatically
+loaded whenever you use `partition` (or `List` for that matter); this is one of the
+standard metatables which are only filled out when the appropriate module is loaded.
+This allows tables to be tagged appropriately without causing excessive coupling.
+
+Stacks occur everywhere in computing. `List` supports stack-like operations;
+there is already `pop` (remove and return last value) and `append` acts like
+`push` (add a value to the end). `push` is provided as an alias for `append`, and
+the other stack operation (size) is simply the size operator `#`. Queues can
+also be implemented; you use `pop` to take values out of the queue, and `put` to
+insert a value at the beginning.
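+
+A small sketch of both usages:
+
+    local List = require 'pl.List'
+
+    local stack = List()
+    stack:push(1); stack:push(2)
+    assert(stack:pop() == 2)     -- last in, first out
+
+    local queue = List()
+    queue:put(1); queue:put(2)   -- put inserts at the beginning
+    assert(queue:pop() == 1)     -- pop removes from the end: first in, first out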
+
+You may derive classes from `List`, and since the list-returning methods
+are covariant, the result of `slice` etc will return lists of the derived type,
+not `List`. For instance, consider the specialization of a `List` type that contains
+numbers in `tests/test-list.lua`:
+
+ n1 = NA{10,20,30}
+ n2 = NA{1,2,3}
+ ns = n1 + 2*n2
+ asserteq(ns,{12,24,36})
+ min,max = ns:slice(1,2):minmax()
+ asserteq(T(min,max),T(12,24))
+ asserteq(n1:normalize():sum(),1,1e-8)
+
+
+### Map and Set classes
+
+The `Map` class exposes what Python would call a 'dict' interface, and accesses
+the hash part of the table. The name 'Map' is used to emphasize the interface,
+not the implementation; it is an object which maps keys onto values; `m['alice']`
+or the equivalent `m.alice` is the access operation. This class also provides
+explicit `set` and `get` methods, which are trivial for regular maps but get
+interesting when `Map` is subclassed. The other operation is `update`, which
+extends a map by copying the keys and values from another table, perhaps
+overwriting existing keys:
+
+ > Map = require 'pl.Map'
+ > m = Map{one=1,two=2}
+ > m:update {three=3,four=4,two=20}
+    > = m == Map{one=1,two=20,three=3,four=4}
+ true
+
+The method `values` returns a list of the values, and `keys` returns a list of
+the keys; there is no guarantee of order. `getvalues` is given a list of keys and
+returns a list of values associated with these keys:
+
+ > m = Map{one=1,two=2,three=3}
+ > = m:getvalues {'one','three'}
+ {1,3}
+ > = m:getvalues(m:keys()) == m:values()
+ true
+
+When querying the value of a `Map`, it is best to use the `get` method:
+
+ > print(m:get 'one', m:get 'two')
+ 1 2
+
+The reason is that `m[key]` can be ambiguous; due to the current implementation,
+`m["get"]` will always succeed, because if a value is not present in the map, it
+will be looked up in the `Map` metatable, which contains a method `get`. There is
+currently no simple solution to this annoying restriction.
+
+There are some useful classes which inherit from `Map`. An `OrderedMap` behaves
+like a `Map` but keeps its keys in order if you use its `set` method to add keys
+and values. Like all the 'container' classes in Penlight, it defines an `iter`
+method for iterating over its values; this will return the keys and values in the
+order of insertion; the `keys` and `values` methods likewise.
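+
+A sketch of the ordered behaviour:
+
+    local OrderedMap = require 'pl.OrderedMap'
+    local om = OrderedMap()
+    om:set('one',1)
+    om:set('two',2)
+    om:set('three',3)
+    for k,v in om:iter() do print(k,v) end
+    -- prints one 1, two 2, three 3 -- always in insertion order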
+
+A `MultiMap` allows multiple values to be associated with a given key. So `set`
+(as before) takes a key and a value, but calling it with the same key and a
+different value does not overwrite but adds a new value. `get` (or using `[]`)
+will return a list of values.
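+
+For example, a sketch:
+
+    local MultiMap = require 'pl.MultiMap'
+    local mm = MultiMap()
+    mm:set('dog','bark')
+    mm:set('dog','howl')
+    mm:set('cat','meow')
+    -- mm['dog'] is now the list {'bark','howl'}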
+
+A `Set` can be seen as a special kind of `Map`, where all the values are `true`,
+the keys are the values, and the order is not important. So in this case
+`Set.values` is defined to return a list of the keys. Sets can display
+themselves, and the basic operations like `union` (`+`) and `intersection` (`*`)
+are defined.
+
+ > Set = require 'pl.Set'
+ > = Set{'one','two'} == Set{'two','one'}
+ true
+ > fruit = Set{'apple','banana','orange'}
+ > = fruit['banana']
+ true
+ > = fruit['hazelnut']
+ nil
+ > = fruit:values()
+ {apple,orange,banana}
+ > colours = Set{'red','orange','green','blue'}
+ > = fruit,colours
+ [apple,orange,banana] [blue,green,orange,red]
+ > = fruit+colours
+ [blue,green,apple,red,orange,banana]
+ > = fruit*colours
+ [orange]
+
+There are also the functions `Set.difference` and `Set.symmetric_difference`. The
+first answers the question 'what fruits are not colours?' and the second 'what
+are fruits and colours but not both?'
+
+ > = fruit - colours
+ [apple,banana]
+ > = fruit ^ colours
+ [blue,green,apple,red,banana]
+
+Adding elements to a set is simply `fruit['peach'] = true` and removing is
+`fruit['apple'] = nil`. To make this simplicity work properly, the `Set` class has no
+methods - either you use the operator forms or explicitly use `Set.intersect`
+etc. In this way we avoid the ambiguity that plagues `Map`.
+
+
+(See `pl.Map` and `pl.Set`)
+
+### Useful Operations on Tables
+
+@lookup pl.tablex
+
+Some notes on terminology: Lua tables are usually _list-like_ (like an array) or
+_map-like_ (like an associative array or dict); they can of course have a
+list-like and a map-like part. Some of the table operations only make sense for
+list-like tables, and some only for map-like tables. (The usual Lua terminology
+is the array part and the hash part of the table, which reflects the actual
+implementation used; it is more accurate to say that a Lua table is an
+associative map which happens to be particularly efficient at acting like an
+array.)
+
+The functions provided in `table` provide all the basic manipulations on Lua
+tables, but as we saw with the `List` class, it is useful to build higher-level
+operations on top of those functions. For instance, to copy a table involves this
+kind of loop:
+
+ local res = {}
+ for k,v in pairs(T) do
+ res[k] = v
+ end
+ return res
+
+The `tablex` module provides this as `copy`, which does a _shallow_ copy of a
+table. There is also `deepcopy` which goes further than a simple loop in two
+ways; first, it also gives the copy the same metatable as the original (so it can
+copy objects like `List` above) and any nested tables will also be copied, to
+arbitrary depth. There is also `icopy` which operates on list-like tables, where
+you can optionally set the start index of the source and destination as well.
+It ensures that any left-over elements will be deleted:
+
+ asserteq(icopy({1,2,3,4,5,6},{20,30}),{20,30}) -- start at 1
+ asserteq(icopy({1,2,3,4,5,6},{20,30},2),{1,20,30}) -- start at 2
+ asserteq(icopy({1,2,3,4,5,6},{20,30},2,2),{1,30}) -- start at 2, copy from 2
+
+(This code from the `tablex` test module shows the use of `pl.test.asserteq`)
+
+Whereas, `move` overwrites but does not delete the rest of the destination:
+
+ asserteq(move({1,2,3,4,5,6},{20,30}),{20,30,3,4,5,6})
+ asserteq(move({1,2,3,4,5,6},{20,30},2),{1,20,30,4,5,6})
+ asserteq(move({1,2,3,4,5,6},{20,30},2,2),{1,30,3,4,5,6})
+
+(The difference is somewhat like that between C's `strcpy` and `memmove`.)
+
+To summarize, use `copy` or `deepcopy` to make a copy of an arbitrary table. To
+copy into a map-like table, use `update`; to copy into a list-like table use
+`icopy`, and `move` if you are updating a range in the destination.
+
+To complete this set of operations, there is `insertvalues` which works like
+`table.insert` except that one provides a table of values to be inserted, and
+`removevalues` which removes a range of values.
+
+ asserteq(insertvalues({1,2,3,4},2,{20,30}),{1,20,30,2,3,4})
+ asserteq(insertvalues({1,2},{3,4}),{1,2,3,4})
+
+Another example:
+
+ > T = require 'pl.tablex'
+ > t = {10,20,30,40}
+ > = T.removevalues(t,2,3)
+ {10,40}
+ > = T.insertvalues(t,2,{20,30})
+ {10,20,30,40}
+
+
+In a similar spirit to `deepcopy`, `deepcompare` will take two tables and return
+true only if they have exactly the same values and structure.
+
+ > t1 = {1,{2,3},4}
+ > t2 = deepcopy(t1)
+ > = t1 == t2
+ false
+ > = deepcompare(t1,t2)
+ true
+
+`find` will return the index of a given value in a list-like table. Note that
+like `string.find` you can specify an index to start searching, so that all
+instances can be found. There is an optional fourth argument, which makes the
+search start at the end and go backwards, so we could define `rfind` like so:
+
+ function rfind(t,val,istart)
+ return tablex.find(t,val,istart,true)
+ end
+
+`find` does a linear search, so it can slow down code that depends on it. If
+efficiency is required for large tables, consider using an _index map_.
+`index_map` will return a table where the keys are the original values of the
+list, and the associated values are the indices. (It is almost exactly the
+representation needed for a _set_.)
+
+ > t = {'one','two','three'}
+ > = tablex.find(t,'two')
+ 2
+ > = tablex.find(t,'four')
+ nil
+ > il = tablex.index_map(t)
+ > = il['two']
+ 2
+ > = il.two
+ 2
+
+A version of `index_map` called `makeset` is also provided, where the values are
+just `true`. This is useful because two such sets can be compared for equality
+using `deepcompare`:
+
+ > = deepcompare(makeset {1,2,3},makeset {2,1,3})
+ true
+
+Consider the problem of determining the new employees that have joined in a
+period. Assume we have two files of employee names:
+
+ (last-month.txt)
+ smith,john
+ brady,maureen
+ mongale,thabo
+
+ (this-month.txt)
+ smith,john
+ smit,johan
+ brady,maureen
+ mogale,thabo
+ van der Merwe,Piet
+
+To find out differences, just make the employee lists into sets, like so:
+
+ require 'pl'
+
+ function read_employees(file)
+ local ls = List(io.lines(file)) -- a list of employees
+ return tablex.makeset(ls)
+ end
+
+ last = read_employees 'last-month.txt'
+ this = read_employees 'this-month.txt'
+
+ -- who is in this but not in last?
+ diff = tablex.difference(this,last)
+
+ -- in a set, the keys are the values...
+ for e in pairs(diff) do print(e) end
+
+ -- *output*
+ -- van der Merwe,Piet
+ -- smit,johan
+
+The `difference` operation is easy to write and read:
+
+ for e in pairs(this) do
+ if not last[e] then
+ print(e)
+ end
+ end
+
+The point of using `difference` here is not that it is a tricky thing to code, but that
+you are stating your intentions clearly to other readers of your code. (And naturally
+to your future self, in six months' time.)
+
+`find_if` will search a table using a function. The optional third argument is a
+value which will be passed as a second argument to the function. `pl.operator`
+provides the Lua operators conveniently wrapped as functions, so the basic
+comparison functions are available:
+
+ > ops = require 'pl.operator'
+ > = tablex.find_if({10,20,30,40},ops.gt,20)
+ 3 true
+
+Note that `find_if` will also return the _actual value_ returned by the function,
+which of course is usually just `true` for a boolean function, but any value
+which is not `nil` and not `false` can be usefully passed back.
+
+`deepcompare` does a thorough recursive comparison, but otherwise using the
+default equality operator. `compare` allows you to specify exactly what function
+to use when comparing two list-like tables, and `compare_no_order` is true if
+they contain exactly the same elements. Do note that the latter does not need an
+explicit comparison function - in this case the implementation is actually to
+compare the two sets, as above:
+
+ > = compare_no_order({1,2,3},{2,1,3})
+ true
+ > = compare_no_order({1,2,3},{2,1,3},'==')
+ true
+
+(Note the special string '==' above; instead of saying `ops.gt` or `ops.eq` we
+can use the strings '>' or '==' respectively.)
+
+`sort` and `sortv` return iterators that will iterate through the
+sorted elements of a table. `sort` iterates by sorted key order, and
+`sortv` iterates by sorted value order. For example, given a table
+with names and ages, it is trivial to iterate over the elements:
+
+ > t = {john=27,jane=31,mary=24}
+ > for name,age in tablex.sort(t) do print(name,age) end
+ jane 31
+ john 27
+ mary 24
+ > for name,age in tablex.sortv(t) do print(name,age) end
+ mary 24
+ john 27
+ jane 31
+
+There are several ways to merge tables in PL. If they are list-like, then see the
+operations defined by `pl.List`, like concatenation. If they are map-like, then
+`merge` provides two basic operations. If the third arg is false, then the result
+only contains the keys that are in common between the two tables, and if true,
+then the result contains all the keys of both tables. These are in fact
+generalized set union and intersection operations:
+
+ > S1 = {john=27,jane=31,mary=24}
+ > S2 = {jane=31,jones=50}
+ > = tablex.merge(S1, S2, false)
+ {jane=31}
+ > = tablex.merge(S1, S2, true)
+ {mary=24,jane=31,john=27,jones=50}
+
+When working with tables, you will often find yourself writing loops like in the
+first example. Loops are second nature to programmers, but they are often not the
+most elegant and self-describing way of expressing an operation. Consider the
+`map` function, which creates a new table by applying a function to each element
+of the original:
+
+ > = map(math.sin, {1,2,3,4})
+ { 0.84, 0.91, 0.14, -0.76}
+ > = map(function(x) return x*x end, {1,2,3,4})
+ {1,4,9,16}
+
+`map` saves you from writing a loop, and the resulting code is often clearer, as
+well as being shorter. This is not to say that 'loops are bad' (although you will
+hear that from some extremists), just that it's good to capture standard
+patterns. Then the loops you do write will stand out and acquire more significance.
+
+`pairmap` is interesting, because the function works with both the key and the
+value.
+
+ > t = {fred=10,bonzo=20,alice=4}
+ > = pairmap(function(k,v) return v end, t)
+ {4,10,20}
+ > = pairmap(function(k,v) return k end, t)
+ {'alice','fred','bonzo'}
+
+(These are common enough operations that the first is defined as `values` and the
+second as `keys`.) If the function returns two values, then the _second_ value is
+considered to be the new key:
+
+    > = pairmap(function(k,v) return v+10, k:upper() end, t)
+ {BONZO=30,FRED=20,ALICE=14}
+
+`map2` applies a function to two tables:
+
+    > = map2(ops.add,{1,2},{10,20})
+    {11,22}
+    > = map2('*',{1,2},{10,20})
+    {10,40}
+
+The various map operations generate tables; `reduce` applies a function of two
+arguments over a table and returns the result as a scalar:
+
+    > = reduce('+', {1,2,3})
+    6
+    > = reduce('..', {'one','two','three'})
+    'onetwothree'
+
+Finally, `zip` sews different tables together:
+
+ > = zip({1,2,3},{10,20,30})
+ {{1,10},{2,20},{3,30}}
+
+Browsing through the documentation, you will find that `tablex` and `List` share
+methods. For instance, `tablex.imap` and `List.map` are basically the same
+function; they both operate over the array-part of the table and generate another
+table. This can also be expressed as a _list comprehension_ `C 'f(x) for x' (t)`
+which makes the operation more explicit. So why are there different ways to do
+the same thing? The main reason is that not all tables are Lists: the expression
+`ls:map('#')` will return a _list_ of the lengths of any elements of `ls`. A list
+is a thin wrapper around a table, provided by the metatable `List`. Sometimes you
+may wish to work with ordinary Lua tables; the `List` interface is not a
+compulsory way to use Penlight table operations.
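+
+As a sketch of that comprehension form (the `C` constructor comes from the
+`pl.comprehension` module):
+
+    local C = require('pl.comprehension').new()
+    local squares = C 'x^2 for x' {1,2,3,4}
+    -- squares is {1,4,9,16}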
+
+### Operations on two-dimensional tables
+
+@lookup pl.array2d
+
+Two-dimensional tables are of course easy to represent in Lua, for instance
+`{{1,2},{3,4}}` where we store rows as subtables and index like so `A[col][row]`.
+This is the common representation used by matrix libraries like
+[LuaMatrix](http://lua-users.org/wiki/LuaMatrix). `pl.array2d` does not provide
+matrix operations, since that is the job for a specialized library, but rather
+provides generalizations of the higher-level operations provided by `pl.tablex`
+for one-dimensional arrays.
+
+`iter` is a useful generalization of `ipairs`. (The extra parameter determines
+whether you want the indices as well.)
+
+ > a = {{1,2},{3,4}}
+ > for i,j,v in array2d.iter(a,true) do print(i,j,v) end
+ 1 1 1
+ 1 2 2
+ 2 1 3
+ 2 2 4
+
+Note that you can always convert an arbitrary 2D array into a 'list of lists'
+with `List(tablex.map(List,a))`.
+
+`map` will apply a function over all elements (notice that extra arguments can be
+provided, so this operation is in effect `function(x) return x-1 end`)
+
+    > = array2d.map('-',a,1)
+ {{0,1},{2,3}}
+
+2D arrays are stored as an array of rows, but columns can be extracted:
+
+    > = array2d.column(a,1)
+ {1,3}
+
+There are three equivalents to `tablex.reduce`. You can either reduce along the
+rows (which is the most efficient) or reduce along the columns. Either one will
+give you a 1D array. And `reduce2` will apply two operations: the first one
+reduces the rows, and the second reduces the result.
+
+
+    > = array2d.reduce_rows('+',a)
+    {3,7}
+    > = array2d.reduce_cols('+',a)
+    {4,6}
+    > -- same as tablex.reduce('*',array2d.reduce_rows('+',a))
+    > = array2d.reduce2('*','+',a)
+    21
+
+`tablex.map2` applies an operation to two tables, giving another table.
+`array2d.map2` does this for 2D arrays. Note that you have to provide the _rank_
+of the arrays involved, since it's hard to always correctly deduce this from the
+data:
+
+ > b = {{10,20},{30,40}}
+ > a = {{1,2},{3,4}}
+ > = array2d.map2('+',2,2,a,b) -- two 2D arrays
+ {{11,22},{33,44}}
+ > = array2d.map2('+',1,2,{10,100},a) -- 1D, 2D
+ {{11,102},{13,104}}
+ > = array2d.map2('*',2,1,a,{1,-1}) -- 2D, 1D
+ {{1,-2},{3,-4}}
+
+Of course, you are not limited to simple arithmetic. Say we have a 2D array of
+strings, and wish to print it out with proper right justification. The first step
+is to create all the string lengths by mapping `string.len` over the array, the
+second is to reduce this along the columns using `math.max` to get maximum column
+widths, and last, apply `stringx.rjust` with these widths.
+
+ maxlens = reduce_cols(math.max,map('#',lines))
+ lines = map2(stringx.rjust,2,1,lines,maxlens)
+
+There is `product` which returns the _Cartesian product_ of two 1D arrays. The
+result is a 2D array formed from applying the function to all possible pairs from
+the two arrays.
+
+
+    > = array2d.product('{}',{1,2},{'a','b'})
+ {{{1,'b'},{2,'a'}},{{1,'a'},{2,'b'}}}
+
+There is a set of operations which work in-place on 2D arrays. You can
+`swap_rows` and `swap_cols`; the first really is a simple one-liner, but the idea
+here is to give the operation a name. `remove_row` and `remove_col` are
+generalizations of `table.remove`. Likewise, `extract_rows` and `extract_cols`
+are given arrays of indices and discard anything else. So, for instance,
+`extract_cols(A,{2,4})` will leave just columns 2 and 4 in the array.
+
+`List.slice` is often useful on 1D arrays; `slice` does the same thing, but is
+generally given a start (row,column) and an end (row,column).
+
+ > A = {{1,2,3},{4,5,6},{7,8,9}}
+ > B = slice(A,1,1,2,2)
+ > write(B)
+ 1 2
+ 4 5
+ > B = slice(A,2,2)
+ > write(B,nil,'%4.1f')
+ 5.0 6.0
+ 8.0 9.0
+
+Here `write` is used to print out an array nicely; the second parameter is `nil`,
+which is the default (stdout) but can be any file object and the third parameter
+is an optional format (as used in `string.format`).
+
+`parse_range` will take a spreadsheet range like 'A1:B2' or 'R1C1:R2C2' and
+return the range as four numbers, which can be passed to `slice`. The rule is
+that `slice` will return an array of the appropriate shape depending on the
+range; if a range represents a row or a column, the result is 1D, otherwise 2D.
+
+This applies to `iter` as well, which can also optionally be given a range:
+
+
+ > for i,j,v in iter(A,true,2,2) do print(i,j,v) end
+ 2 2 5
+ 2 3 6
+ 3 2 8
+ 3 3 9
+
+`new` will construct a new 2D array with the given dimensions. You provide an
+initial value for the elements, which is interpreted as a function if it's
+callable. With `L` being `utils.string_lambda` we then have the following way to
+make an _identity matrix_:
+
+ asserteq(
+ array.new(3,3,L'|i,j| i==j and 1 or 0'),
+ {{1,0,0},{0,1,0},{0,0,1}}
+ )
+
+Please note that most functions in `array2d` are _covariant_, that is, they
+return an array of the same type as they receive. In particular, any objects
+created with `data.new` or `matrix.new` will remain data or matrix objects when
+reshaped or sliced, etc. Data objects have the `array2d` functions available as
+methods.
+
+
diff --git a/Data/Libraries/Penlight/docs_topics/03-strings.md b/Data/Libraries/Penlight/docs_topics/03-strings.md
new file mode 100644
index 0000000..3808175
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/03-strings.md
@@ -0,0 +1,228 @@
+## Strings. Higher-level operations on strings.
+
+### Extra String Methods
+
+@lookup pl.stringx
+
+These are convenient borrowings from Python, as described in 3.6.1 of the Python
+reference, but note that indices in Lua always begin at one. `stringx` defines
+functions like `isalpha` and `isdigit`, which return `true` if `s` is composed only
+of letters or digits respectively. `startswith` and `endswith` are convenient
+ways to find substrings. (`endswith` works as in Python 2.5, so that `f:endswith
+{'.bat','.exe','.cmd'}` will be true for any filename which ends with these
+extensions.) There are justify methods and whitespace trimming functions like
+`strip`.
+
+ > stringx.import()
+ > ('bonzo.dog'):endswith {'.dog','.cat'}
+ true
+ > ('bonzo.txt'):endswith {'.dog','.cat'}
+ false
+ > ('bonzo.cat'):endswith {'.dog','.cat'}
+ true
+ > (' stuff'):ljust(20,'+')
+ '++++++++++++++ stuff'
+ > (' stuff '):lstrip()
+ 'stuff '
+ > (' stuff '):rstrip()
+ ' stuff'
+ > (' stuff '):strip()
+ 'stuff'
+ > for s in ('one\ntwo\nthree\n'):lines() do print(s) end
+ one
+ two
+ three
+
+Most of these can be fairly easily implemented using the Lua string library,
+which is more general and powerful. But they are convenient operations to have
+easily at hand. Note that they can be injected into the `string` table if you use
+`stringx.import`, but a simple alias like `local stringx = require 'pl.stringx'`
+is preferable. This is the recommended practice when writing modules for
+consumption by other people, since it is bad manners to change the global state
+of the rest of the system. Magic may be used for convenience, but there is always
+a price.
+
+
+### String Templates
+
+@lookup pl.text
+
+Another borrowing from Python, string templates allow you to substitute values
+looked up in a table:
+
+ local Template = require ('pl.text').Template
+ t = Template('${here} is the $answer')
+ print(t:substitute {here = 'Lua', answer = 'best'})
+ ==>
+ Lua is the best
+
+'$ variables' can optionally have curly braces; this form is useful if you are
+gluing text together to make variables, e.g. `${prefix}_name_${postfix}`. The
+`substitute` method will throw an error if a $ variable is not found in the
+table, and the `safe_substitute` method will not.
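+
+A quick sketch of the difference (with `safe_substitute`, the unknown `$greeting`
+is left in place rather than raising an error):
+
+    local Template = require('pl.text').Template
+    local t = Template '$greeting, ${name}!'
+    print(t:safe_substitute {name = 'world'})
+    --> $greeting, world!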
+
+The Lua implementation has an extra method, `indent_substitute` which is very
+useful for inserting blocks of text, because it adjusts indentation. Consider
+this example:
+
+ -- testtemplate.lua
+ local Template = require ('pl.text').Template
+
+ t = Template [[
+ for i = 1,#$t do
+ $body
+ end
+ ]]
+
+ body = Template [[
+ local row = $t[i]
+ for j = 1,#row do
+ fun(row[j])
+ end
+ ]]
+
+ print(t:indent_substitute {body=body,t='tbl'})
+
+And the output is:
+
+ for i = 1,#tbl do
+ local row = tbl[i]
+ for j = 1,#row do
+ fun(row[j])
+ end
+ end
+
+`indent_substitute` can substitute templates, in which case they themselves
+will be substituted using the given table. So in this case, `$t` was substituted
+twice.
+
+`pl.text` also has a number of useful functions like `dedent`, which strips all
+the initial indentation from a multiline string. As in Python, this is useful for
+preprocessing multiline strings if you like indenting them with your code. The
+function `wrap` is passed a long string (a _paragraph_) and returns a list of
+lines that fit into a desired line width. As an extension, there is also `indent`
+for indenting multiline strings.
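+
+A sketch of `wrap` and `indent` working together:
+
+    local text = require 'pl.text'
+    local para = "The quick brown fox jumped over the lazy dog and then ran away."
+    local lines = text.wrap(para, 20)
+    -- lines is a list of strings, each no wider than about 20 characters
+    print(text.indent(table.concat(lines,'\n'), 4))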
+
+New in Penlight with the 0.9 series is `text.format_operator`. Calling this
+enables Python-style string formatting using the modulo operator `%`:
+
+ > text.format_operator()
+ > = '%s[%d]' % {'dog',1}
+ dog[1]
+
+So in its simplest form it saves the typing involved with `string.format`; it
+will also expand `$` variables using named fields:
+
+ > = '$animal[$num]' % {animal='dog',num=1}
+ dog[1]
+
+As with `stringx.import` you have to do this explicitly, since all strings share the same
+metatable. But in your own scripts you can feel free to do this.
+
+### Another Style of Template
+
+A new module is `template`, which is a version of Rici Lake's [Lua
+Preprocessor](http://lua-users.org/wiki/SlightlyLessSimpleLuaPreprocessor). This
+allows you to mix Lua code with your templates in a straightforward way. There
+are only two rules:
+
+ - Lines beginning with `#` are Lua
+ - Otherwise, anything inside `$()` is a Lua expression.
+
+So a template generating an HTML list would look like this:
+
+ <ul>
+ # for i,val in ipairs(T) do
+ <li>$(i) = $(val:upper())</li>
+ # end
+ </ul>
+
+Assuming the text is stored in `tmpl`, the template can be expanded using:
+
+ local template = require 'pl.template'
+ local my_env = {
+ ipairs = ipairs,
+ T = {'one','two','three'}
+ }
+ res = template.substitute(tmpl, my_env)
+
+and we get
+
+ <ul>
+ <li>1 = ONE</li>
+ <li>2 = TWO</li>
+ <li>3 = THREE</li>
+ </ul>
+
+There is a single function, `template.substitute` which is passed a template
+string and an environment table. This table may contain some special fields,
+like `_parent` which can be set to a table representing a 'fallback' environment
+in case a symbol was not found. `_brackets` is usually '()' and `_escape` is
+usually '#' but it's sometimes necessary to redefine these if the defaults
+interfere with the target language - for instance, `$(V)` has another meaning in
+Make, and `#` means a preprocessor line in C/C++.
+
+Finally, if something goes wrong, setting `_debug` will cause the intermediate
+Lua code to be dumped so you can see what was generated.
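+
+As an illustrative sketch (the names `base` and `name` here are made up), a
+fallback environment lets a template use helpers that are not in its immediate
+table:
+
+    local template = require 'pl.template'
+    local base = { upper = string.upper }   -- fallback environment
+    local res = template.substitute('$(upper(name))', {
+        name = 'frodo',
+        _parent = base,
+    })
+    print(res)  --> FRODO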
+
+Here is a C code generation example; something that could easily be extended to
+be a minimal Lua extension skeleton generator.
+
+ local subst = require 'pl.template'.substitute
+
+ local templ = [[
+ #include <lua.h>
+ #include <lauxlib.h>
+ #include <lualib.h>
+
+ > for _,f in ipairs(mod) do
+ static int l_$(f.name) (lua_State *L) {
+
+ }
+ > end
+
+ static const luaL_reg $(mod.name)[] = {
+ > for _,f in ipairs(mod) do
+ {"$(f.name)",l_$(f.name)},
+ > end
+ {NULL,NULL}
+ };
+
+    int luaopen_$(mod.name) (lua_State *L) {
+ luaL_register (L, "$(mod.name)", $(mod.name));
+ return 1;
+ }
+ ]]
+
+ print(subst(templ,{
+ _escape = '>',
+ ipairs = ipairs,
+ mod = {
+ name = 'baggins';
+ {name='frodo'},
+ {name='bilbo'}
+ }
+ }))
+
+
+### File-style I/O on Strings
+
+`pl.stringio` provides just three functions; `stringio.open` is passed a string,
+and returns a file-like object for reading. It supports a `read` method, which
+takes the same arguments as standard file objects:
+
+ > f = stringio.open 'first line\n10 20 30\n'
+ > = f:read()
+ first line
+ > = f:read('*n','*n','*n')
+ 10 20 30
+
+`lines` and `seek` are also supported.
+
+`stringio.lines` is a useful short-cut for iterating over all the lines in a
+string.
+
+`stringio.create` creates a writeable file-like object. You then `write` to
+this stream, and finally extract the built string using `value`. This 'string
+builder' pattern is useful for efficiently creating large strings.
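+
+A small sketch of this builder pattern:
+
+    local stringio = require 'pl.stringio'
+    local out = stringio.create()
+    out:write('first line\n')
+    out:write('second line\n')
+    print(out:value())
+    --> first line
+    --> second line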
+
diff --git a/Data/Libraries/Penlight/docs_topics/04-paths.md b/Data/Libraries/Penlight/docs_topics/04-paths.md
new file mode 100644
index 0000000..4367fe6
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/04-paths.md
@@ -0,0 +1,170 @@
+## Paths and Directories
+
+### Working with Paths
+
+Programs should not depend on quirks of your operating system. They will be
+harder to read, and need to be ported for other systems. The worst of course is
+hardcoding paths like 'c:\\' in programs, and wondering why Vista complains so
+much. But even something like `dir..'\\'..file` is a problem, since Unix can't
+understand backslashes in this way. `dir..'/'..file` is _usually_ portable, but
+it's best to put this all into a simple function, `path.join`. If you
+consistently use `path.join`, then it's much easier to write cross-platform code,
+since it handles the directory separator for you.
+
+`pl.path` provides the same functionality as Python's `os.path` module (11.1).
+
+ > p = 'c:\\bonzo\\DOG.txt'
+ > = path.normcase (p) ---> only makes sense on Windows
+ c:\bonzo\dog.txt
+ > = path.splitext (p)
+ c:\bonzo\DOG .txt
+ > = path.extension (p)
+ .txt
+ > = path.basename (p)
+ DOG.txt
+ > = path.exists(p)
+ false
+ > = path.join ('fred','alice.txt')
+ fred\alice.txt
+ > = path.exists 'pretty.lua'
+ true
+ > = path.getsize 'pretty.lua'
+ 2125
+ > = path.isfile 'pretty.lua'
+ true
+ > = path.isdir 'pretty.lua'
+ false
+
+
+It is very important for all programmers, not just on Unix, to only write to
+where they are allowed to write. `path.expanduser` will expand '~' (tilde) into
+the home directory. Depending on your OS, this will be a guaranteed place where
+you can create files:
+
+ > = path.expanduser '~/mydata.txt'
+ 'C:\Documents and Settings\SJDonova/mydata.txt'
+
+ > = path.expanduser '~/mydata.txt'
+ /home/sdonovan/mydata.txt
+
+Under Windows, `os.tmpname` returns a path in the root of your drive, which then
+fills up with temporary files. (And increasingly, you do not have access to this root folder.)
+This is corrected by `path.tmpname`, which uses the environment variable TMP:
+
+ > os.tmpname() -- not a good place to put temporary files!
+ '\s25g.'
+ > path.tmpname()
+ 'C:\DOCUME~1\SJDonova\LOCALS~1\Temp\s25g.1'
+
+
+A useful extra function is `pl.path.package_path`, which will tell you the path
+of a particular Lua module. So on my system, `package_path('pl.path')` returns
+'C:\Program Files\Lua\5.1\lualibs\pl\path.lua', and `package_path('lfs')` returns
+'C:\Program Files\Lua\5.1\clibs\lfs.dll'. It is implemented in terms of
+`package.searchpath`, which is a new function in Lua 5.2 which has been
+implemented for Lua 5.1 in Penlight.
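+
+For example (the path printed will naturally depend on your own installation):
+
+    local path = require 'pl.path'
+    print(path.package_path 'pl.tablex')
+    -- e.g. /usr/local/share/lua/5.1/pl/tablex.lua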
+
+### File Operations
+
+`pl.file` is a new module that provides more sensible names for common file
+operations. For instance, `file.read` and `file.write` are aliases for
+`utils.readfile` and `utils.writefile`.
+
+Smaller files can be efficiently read and written in one operation. `file.read`
+is passed a filename and returns the contents as a string, if successful; if not,
+then it returns `nil` and the actual error message. There is an optional boolean
+parameter if you want the file to be read in binary mode (this makes no
+difference on Unix but remains important with Windows.)
+
+In previous versions of Penlight, `utils.readfile` would read standard input if
+the file was not specified, but this can lead to nasty bugs; use `io.read '*a'`
+to grab all of standard input.
+
+Similarly, `file.write` takes a filename and a string which will be written to
+that file.
+
+For example, this little script converts a file into upper case:
+
+ require 'pl'
+ assert(#arg == 2, 'supply two filenames')
+ text = assert(file.read(arg[1]))
+ assert(file.write(arg[2],text:upper()))
+
+Copying files is surprisingly tricky. `file.copy` and `file.move` attempt to use
+the best implementation possible. On Windows, they link to the API functions
+`CopyFile` and `MoveFile`, but only if the `alien` package is installed (this is
+true for Lua for Windows.) Otherwise, the system copy command is used. This can
+be ugly when writing Windows GUI applications, because of the dreaded flashing
+black-box problem with launching processes.
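+
+A minimal sketch, with made-up file names:
+
+    local file = require 'pl.file'
+    assert(file.copy('settings.conf', 'settings.conf.bak'))
+    -- file.move works the same way, but removes the original:
+    -- assert(file.move('settings.conf.bak', 'old/settings.conf'))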
+
+### Directory Operations
+
+`pl.dir` provides some useful functions for working with directories. `fnmatch`
+will match a filename against a shell pattern, and `filter` will return any files
+in the supplied list which match the given pattern; these correspond to the
+functions in the Python `fnmatch` module. `getdirectories` will return all
+directories contained in a directory, and `getfiles` will return all files in a
+directory which match a shell pattern. These functions return the files as a
+table, unlike `lfs.dir` which returns an iterator.
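+
+For instance (the paths here are only illustrative):
+
+    local dir = require 'pl.dir'
+    local scripts = dir.getfiles('.', '*.lua')   -- a table of matching files
+    local subdirs = dir.getdirectories('.')      -- a table of directories
+    print(dir.fnmatch('README.txt', '*.txt'))    --> true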
+
+`dir.makepath` can create a full path, creating subdirectories as necessary;
+`rmtree` is the Nuclear Option of file deleting functions, since it will
+recursively clear out and delete all directories found beginning at a path (there
+is a similar function with this name in the Python `shutil` module.)
+
+ > = dir.makepath 't\\temp\\bonzo'
+ > = path.isdir 't\\temp\\bonzo'
+ true
+ > = dir.rmtree 't'
+
+`dir.rmtree` depends on `dir.walk`, which is a powerful tool for scanning a whole
+directory tree. Here is the implementation of `dir.rmtree`:
+
+ --- remove a whole directory tree.
+ -- @param path A directory path
+ function dir.rmtree(fullpath)
+ for root,dirs,files in dir.walk(fullpath) do
+ for i,f in ipairs(files) do
+ os.remove(path.join(root,f))
+ end
+ lfs.rmdir(root)
+ end
+ end
+
+
+`dir.clonetree` clones directory trees. The first argument is a path that must
+exist, and the second path is the path to be cloned. (Note that this path cannot
+be _inside_ the first path, since this leads to madness.) By default, it will
+then just recreate the directory structure. You can in addition provide a
+function, which will be applied for all files found.
+
+ -- make a copy of my libs folder
+ require 'pl'
+ p1 = [[d:\dev\lua\libs]]
+ p2 = [[D:\dev\lua\libs\..\tests]]
+ dir.clonetree(p1,p2,dir.copyfile)
+
+A more sophisticated version, which only copies files which have been modified:
+
+ -- p1 and p2 as before, or from arg[1] and arg[2]
+ dir.clonetree(p1,p2,function(f1,f2)
+ local res
+ local t1,t2 = path.getmtime(f1),path.getmtime(f2)
+ -- f2 might not exist, so be careful about t2
+ if not t2 or t1 > t2 then
+ res = dir.copyfile(f1,f2)
+ end
+ return res -- indicates successful operation
+ end)
+
+`dir.clonetree` uses `path.common_prefix`. With `p1` and `p2` defined above, the
+common path is 'd:\dev\lua'. So 'd:\dev\lua\libs\testfunc.lua' is copied to
+'d:\dev\lua\tests\testfunc.lua', etc.
+
+If you need to find the common path of a list of files, then `tablex.reduce` will
+do the job:
+
+ > p3 = [[d:\dev]]
+ > = tablex.reduce(path.common_prefix,{p1,p2,p3})
+ 'd:\dev'
+
diff --git a/Data/Libraries/Penlight/docs_topics/05-dates.md b/Data/Libraries/Penlight/docs_topics/05-dates.md
new file mode 100644
index 0000000..32c42f7
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/05-dates.md
@@ -0,0 +1,111 @@
+## Date and Time
+
+<a id="date"></a>
+
+NOTE: the Date module is deprecated
+
+### Creating and Displaying Dates
+
+The `Date` class provides a simplified way to work with [date and
+time](http://www.lua.org/pil/22.1.html) in Lua; it leans heavily on the functions
+`os.date` and `os.time`.
+
+A `Date` object can be constructed from a table, just like with `os.time`.
+Methods are provided to get and set the various parts of the date.
+
+ > d = Date {year = 2011, month = 3, day = 2 }
+ > = d
+ 2011-03-02 12:00:00
+ > = d:month(),d:year(),d:day()
+ 3 2011 2
+ > d:month(4)
+ > = d
+ 2011-04-02 12:00:00
+ > d:add {day=1}
+ > = d
+ 2011-04-03 12:00:00
+
+`add` takes a table containing one of the date table fields.
+
+ > = d:weekday_name()
+ Sun
+ > = d:last_day()
+ 2011-04-30 12:00:00
+ > = d:month_name(true)
+ April
+
+There is a default conversion to text for date objects, but `Date.Format` gives
+you full control of the format for both parsing and displaying dates:
+
+ > iso = Date.Format 'yyyy-mm-dd'
+ > d = iso:parse '2010-04-10'
+ > amer = Date.Format 'mm/dd/yyyy'
+ > = amer:tostring(d)
+ 04/10/2010
+
+With the 0.9.7 release, the `Date` constructor has become more flexible. You may
+omit any of the 'year', 'month' or 'day' fields:
+
+ > = Date { year = 2008 }
+ 2008-01-01 12:00:00
+ > = Date { month = 3 }
+ 2011-03-01 12:00:00
+ > = Date { day = 20 }
+ 2011-10-20 12:00:00
+ > = Date { hour = 14, min = 30 }
+ 2011-10-13 14:30:00
+
+If 'year' is omitted, then the current year is assumed, and likewise for 'month'.
+
+To set the time on such a partial date, you can use the fact that the 'setter'
+methods return the date object and so you can 'chain' these methods.
+
+ > d = Date { day = 03 }
+ > = d:hour(18):min(30)
+ 2011-10-03 18:30:00
+
+Finally, `Date` also now accepts positional arguments:
+
+ > = Date(2011,10,3)
+ 2011-10-03 12:00:00
+ > = Date(2011,10,3,18,30,23)
+ 2011-10-03 18:30:23
+
+`Date.Format` has been extended. If you construct an instance without a pattern,
+then it will try to match against a set of known formats. This is useful for
+human-input dates since keeping to a strict format is not one of the strong
+points of users. It assumes that there will be a date, and then optionally a time.
+
+ > df = Date.Format()
+ > = df:parse '5.30pm'
+ 2011-10-13 17:30:00
+ > = df:parse '1730'
+ nil day out of range: 1730 is not between 1 and 31
+ > = df:parse '17.30'
+ 2011-10-13 17:30:00
+ > = df:parse 'mar'
+ 2011-03-01 12:00:00
+ > = df:parse '3 March'
+ 2011-03-03 12:00:00
+ > = df:parse '15 March'
+ 2011-03-15 12:00:00
+ > = df:parse '15 March 2008'
+ 2008-03-15 12:00:00
+ > = df:parse '15 March 2008 1.30pm'
+ 2008-03-15 13:30:00
+ > = df:parse '2008-10-03 15:30:23'
+ 2008-10-03 15:30:23
+
+ISO date format is of course a good idea if you need to deal with users from
+different countries. Here is the default behaviour for 'short' dates:
+
+ > = df:parse '24/02/12'
+ 2012-02-24 12:00:00
+
+That's not what Americans expect! It's tricky to work out in a cross-platform way
+exactly what the expected format is, so there is an explicit flag:
+
+ > df:US_order(true)
+ > = df:parse '9/11/01'
+ 2001-11-09 12:00:00
+
diff --git a/Data/Libraries/Penlight/docs_topics/06-data.md b/Data/Libraries/Penlight/docs_topics/06-data.md
new file mode 100644
index 0000000..e067b6b
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/06-data.md
@@ -0,0 +1,1262 @@
+## Data
+
+### Reading Data Files
+
+The first thing to consider is this: do you actually need to write a custom file
+reader? And if the answer is yes, the next question is: can you write the reader
+in as clear a way as possible? Correctness, Robustness, and Speed; pick the first
+two and the third can be sorted out later, _if necessary_.
+
+A common sort of data file is the configuration file format commonly used on Unix
+systems. This format is often called a _property_ file in the Java world.
+
+ # Read timeout in seconds
+ read.timeout=10
+
+ # Write timeout in seconds
+ write.timeout=10
+
+Here is a simple Lua implementation:
+
+ -- property file parsing with Lua string patterns
+    props = {}
+ for line in io.lines() do
+ if line:find('#',1,true) ~= 1 and not line:find('^%s*$') then
+ local var,value = line:match('([^=]+)=(.*)')
+ props[var] = value
+ end
+ end
+
+Very compact, but it suffers from the same disease as equivalent Perl programs;
+it uses odd string patterns which are 'lexically noisy'. Noisy code like this
+slows the casual reader down. (For an even more direct way of doing this, see the
+next section, 'Reading Configuration Files')
+
+Another implementation, using the Penlight libraries:
+
+ -- property file parsing with extended string functions
+ require 'pl'
+ stringx.import()
+    props = {}
+ for line in io.lines() do
+ if not line:startswith('#') and not line:isspace() then
+ local var,value = line:splitv('=')
+ props[var] = value
+ end
+ end
+
+This is more self-documenting; it is generally better to make the code express
+the _intention_, rather than having to scatter comments everywhere - comments are
+necessary, of course, but mostly to give the higher view of your intention that
+cannot be expressed in code. It is slightly slower, true, but in practice the
+speed of this script is determined by I/O, so further optimization is unnecessary.
+
+### Reading Unstructured Text Data
+
+Text data is sometimes unstructured, for example a file containing words. The
+`pl.input` module has a number of functions which makes processing such files
+easier. For example, a script to count the number of words in standard input
+using `input.words`:
+
+ -- countwords.lua
+ require 'pl'
+    local k = 0
+ for w in input.words(io.stdin) do
+ k = k + 1
+ end
+ print('count',k)
+
+Or this script to calculate the average of a set of numbers using `input.numbers`:
+
+ -- average.lua
+ require 'pl'
+    local k = 0
+ local sum = 0
+ for n in input.numbers(io.stdin) do
+ sum = sum + n
+ k = k + 1
+ end
+ print('average',sum/k)
+
+These scripts can be improved further by _eliminating loops_. In the last case,
+there is a perfectly good function `seq.sum` which takes a sequence of numbers
+and returns both the sum and the count for us:
+
+ -- average2.lua
+ require 'pl'
+ local total,n = seq.sum(input.numbers())
+ print('average',total/n)
+
+A further simplification here is that if `numbers` or `words` are not passed an
+argument, they will grab their input from standard input. The first script can
+be rewritten:
+
+ -- countwords2.lua
+ require 'pl'
+ print('count',seq.count(input.words()))
+
+A useful feature of a sequence generator like `numbers` is that it can read from
+a string source. Here is a script to calculate the sums of the numbers on each
+line in a file:
+
+ -- sums.lua
+ for line in io.lines() do
+        print(seq.sum(input.numbers(line)))
+ end
+
+### Reading Columnar Data
+
+It is very common to find data in columnar form, either space or comma-separated,
+perhaps with an initial set of column headers. Here is a typical example:
+
+ EventID Magnitude LocationX LocationY LocationZ
+ 981124001 2.0 18988.4 10047.1 4149.7
+ 981125001 0.8 19104.0 9970.4 5088.7
+ 981127003 0.5 19012.5 9946.9 3831.2
+ ...
+
+`input.fields` is designed to extract several columns, given some delimiter
+(defaulting to whitespace). Here is a script to calculate the average X location of
+all the events:
+
+ -- avg-x.lua
+ require 'pl'
+ io.read() -- skip the header line
+ local sum,count = seq.sum(input.fields {3})
+ print(sum/count)
+
+`input.fields` is passed either a field count, or a list of column indices,
+starting at one as usual. So in this case we're only interested in column 3. If
+you pass it a field count, then you get every field up to that count:
+
+ for id,mag,locX,locY,locZ in input.fields (5) do
+ ....
+ end
+
+`input.fields` by default tries to convert each field to a number. It will skip
+lines which clearly don't match the pattern, but will abort the script if there
+are any fields which cannot be converted to numbers.
+
+The second parameter is a delimiter, by default spaces. ' ' is understood to mean
+'any number of spaces', i.e. '%s+'. Any Lua string pattern can be used.
+
+The third parameter is a _data source_, by default standard input (defined by
+`input.create_getter`.) It assumes that the data source has a `read` method which
+brings in the next line, i.e. it is a 'file-like' object. As a special case, a
+string will be split into its lines:
+
+ > for x,y in input.fields(2,' ','10 20\n30 40\n') do print(x,y) end
+ 10 20
+ 30 40
+
+Note the default behaviour for bad fields, which is to show the offending line
+number:
+
+ > for x,y in input.fields(2,' ','10 20\n30 40x\n') do print(x,y) end
+ 10 20
+ line 2: cannot convert '40x' to number
+
+This behaviour of `input.fields` is appropriate for a script which you want to
+fail immediately with an appropriate _user_ error message if conversion fails.
+The fourth optional parameter is an options table: `{no_fail=true}` means that
+conversion is attempted but if it fails it just returns the string, rather as AWK
+would operate. You are then responsible for checking the type of the returned
+field. `{no_convert=true}` switches off conversion altogether and all fields are
+returned as strings.
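+
+A quick sketch of the difference, using an inline string as the data source as
+above:
+
+    -- with no_fail, '40x' comes back as a string instead of aborting
+    for x,y in input.fields(2,' ','10 20\n30 40x\n',{no_fail=true}) do
+        print(type(x), type(y))
+    end
+    --> number  number
+    --> number  string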
+
+@lookup pl.data
+
+Sometimes it is useful to bring a whole dataset into memory, for operations such
+as extracting columns. Penlight provides a flexible reader specifically for
+reading this kind of data, using the `data` module. Given a file looking like this:
+
+ x,y
+ 10,20
+ 2,5
+ 40,50
+
+Then `data.read` will create a table like this, with each row represented by a
+sublist:
+
+ > t = data.read 'test.txt'
+ > pretty.dump(t)
+ {{10,20},{2,5},{40,50},fieldnames={'x','y'},delim=','}
+
+You can now analyze this returned table using the supplied methods. For instance,
+the method `column_by_name` returns a table of all the values of that column.
+
+ -- testdata.lua
+ require 'pl'
+ d = data.read('fev.txt')
+ for _,name in ipairs(d.fieldnames) do
+ local col = d:column_by_name(name)
+ if type(col[1]) == 'number' then
+ local total,n = seq.sum(col)
+ utils.printf("Average for %s is %f\n",name,total/n)
+ end
+ end
+
+`data.read` tries to be clever when given data; by default it expects a first
+line of column names, unless any of them are numbers. It tries to deduce the
+column delimiter by looking at the first line. Sometimes it guesses wrong; these
+things can be specified explicitly. The second optional parameter is an options
+table: it can override `delim` (a string pattern), `fieldnames` (a list or
+comma-separated string), specify `no_convert` (the default is to convert),
+`numfields` (indices of columns known to be numbers, as a list) and
+`thousands_dot` (for when the thousands separator in Excel CSV is '.').
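+
+For example, a sketch of overriding the defaults (the file name and field names
+are made up):
+
+    local data = require 'pl.data'
+    local t = data.read('events.dat', {
+        delim = '%s+',   -- whitespace-separated
+        fieldnames = 'EventID,Magnitude,LocationX,LocationY,LocationZ',
+        numfields = {1,2,3,4,5},
+    })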
+
+A very powerful feature is a way to execute SQL-like queries on such data:
+
+ -- queries on tabular data
+ require 'pl'
+ local d = data.read('xyz.txt')
+ local q = d:select('x,y,z where x > 3 and z < 2 sort by y')
+ for x,y,z in q do
+ print(x,y,z)
+ end
+
+Please note that the format of queries is restricted to the following syntax:
+
+ FIELDLIST [ 'where' CONDITION ] [ 'sort by' FIELD [asc|desc]]
+
+Any valid Lua code can appear in `CONDITION`; remember it is _not_ SQL and you
+have to use `==` (this warning comes from experience.)
+
+For this to work, _field names must be Lua identifiers_. So `read` will massage
+fieldnames so that all non-alphanumeric chars are replaced with underscores.
+However, the `original_fieldnames` field always contains the original un-massaged
+fieldnames.
+
+`read` can handle standard CSV files fine, although it doesn't try to be a
+full-blown CSV parser. With the `csv=true` option, it's possible to have
+double-quoted fields, which may contain commas; then trailing commas become
+significant as well.
+
+Spreadsheet programs are not always the best tool to
+process such data, strange as this might seem to some people. This is a toy CSV
+file; to appreciate the problem, imagine thousands of rows and dozens of columns
+like this:
+
+ Department Name,Employee ID,Project,Hours Booked
+ sales,1231,overhead,4
+ sales,1255,overhead,3
+ engineering,1501,development,5
+ engineering,1501,maintenance,3
+ engineering,1433,maintenance,10
+
+The task is to reduce the dataset to a relevant set of rows and columns, perhaps
+do some processing on row data, and write the result out to a new CSV file. The
+`write_row` method uses the delimiter to write the row to a file;
+`Data.select_row` is like `Data.select`, except it iterates over _rows_, not
+fields; this is necessary if we are dealing with a lot of columns!
+
+ names = {[1501]='don',[1433]='dilbert'}
+ keepcols = {'Employee_ID','Hours_Booked'}
+ t:write_row (outf,{'Employee','Hours_Booked'})
+ q = t:select_row {
+ fields=keepcols,
+ where=function(row) return row[1]=='engineering' end
+ }
+ for row in q do
+ row[1] = names[row[1]]
+ t:write_row(outf,row)
+ end
+
+`Data.select_row` and `Data.select` can be passed a table specifying the query; a
+list of field names, a function defining the condition and an optional parameter
+`sort_by`. It isn't really necessary here, but if we had a more complicated row
+condition (such as belonging to a specified set) then it is not generally
+possible to express such a condition as a query string, without resorting to
+hackery such as global variables.
+
+With 1.0.3, you can specify explicit conversion functions for selected columns.
+For instance, this is a log file with a Unix date stamp:
+
+ Time Message
+ 1266840760 +# EE7C0600006F0D00C00F06010302054000000308010A00002B00407B00
+ 1266840760 closure data 0.000000 1972 1972 0
+ 1266840760 ++ 1266840760 EE 1
+ 1266840760 +# EE7C0600006F0D00C00F06010302054000000408020A00002B00407B00
+ 1266840764 closure data 0.000000 1972 1972 0
+
+We would like the first column as an actual date object, so the `convert`
+field sets an explicit conversion for column 1. (Note that we have to explicitly
+convert the string to a number first.)
+
+ Date = require 'pl.Date'
+
+ function date_convert (ds)
+ return Date(tonumber(ds))
+ end
+
+ d = data.read(f,{convert={[1]=date_convert},last_field_collect=true})
+
+This gives us a two-column dataset, where the first column contains `Date` objects
+and the second column contains the rest of the line. Queries can then easily
+pick out events on a day of the week:
+
+ q = d:select "Time,Message where Time:weekday_name()=='Sun'"
+
+Data does not have to come from files, nor does it necessarily come from the lab
+or the accounts department. On Linux, `ps aux` gives you a full listing of all
+processes running on your machine. It is straightforward to feed the output of
+this command into `data.read` and perform useful queries on it. Notice that
+non-identifier characters like '%' get converted into underscores:
+
+ require 'pl'
+ f = io.popen 'ps aux'
+ s = data.read (f,{last_field_collect=true})
+ f:close()
+ print(s.fieldnames)
+ print(s:column_by_name 'USER')
+ qs = 'COMMAND,_MEM where _MEM > 5 and USER=="steve"'
+ for name,mem in s:select(qs) do
+ print(mem,name)
+ end
+
+I've always been an admirer of the AWK programming language; with `filter` you
+can get Lua programs which are just as compact:
+
+ -- printxy.lua
+ require 'pl'
+ data.filter 'x,y where x > 3'
+
+It is common enough to have data files without a header row of field names.
+`data.read` makes a special exception for such files if all fields are numeric.
+Since there are no column names to use in query expressions, you can use AWK-like
+column indexes, e.g. '$1,$2 where $1 > 3'. I have a little executable script on
+my system called `lf` which looks like this:
+
+ #!/usr/bin/env lua
+ require 'pl.data'.filter(arg[1])
+
+And it can be used generally as a filter command to extract columns from data.
+(The column specifications may be expressions or even constants.)
+
+ $ lf '$1,$5/10' < test.dat
+
+(As with AWK, please note the single-quotes used in this command; this prevents
+the shell trying to expand the column indexes. If you are on Windows, then you
+must quote the expression in double-quotes so
+it is passed as one argument to your batch file.)
+
+As a tutorial resource, have a look at `test-data.lua` in the PL tests directory
+for other examples of use, plus comments.
+
+The data returned by `read` or constructed by `Data.copy_select` from a query is
+basically just an array of rows: `{{1,2},{3,4}}`. So you may use `read` to pull
+in any array-like dataset, and process it with any function that expects such an
+implementation. In particular, the functions in `array2d` will work fine with
+this data. In fact, these functions are available as methods; e.g.
+`array2d.flatten` can be called directly like so to give us a one-dimensional list:
+
+ v = data.read('dat.txt'):flatten()
+
+The data is also in exactly the right shape to be treated as matrices by
+[LuaMatrix](http://lua-users.org/wiki/LuaMatrix):
+
+ > matrix = require 'matrix'
+ > m = matrix(data.read 'mat.txt')
+ > = m
+ 1 0.2 0.3
+ 0.2 1 0.1
+ 0.1 0.2 1
+ > = m^2 -- same as m*m
+ 1.07 0.46 0.62
+ 0.41 1.06 0.26
+ 0.24 0.42 1.05
+
+`write` will write matrices back to files for you.
+
+Finally, for the curious, the global variable `_DEBUG` can be used to print out
+the actual iterator function which a query generates and dynamically compiles. By
+using code generation, we can get pretty much optimal performance out of
+arbitrary queries.
+
+ > lua -lpl -e "_DEBUG=true" -e "data.filter 'x,y where x > 4 sort by x'" < test.txt
+ return function (t)
+ local i = 0
+ local v
+ local ls = {}
+ for i,v in ipairs(t) do
+ if v[1] > 4 then
+ ls[#ls+1] = v
+ end
+ end
+ table.sort(ls,function(v1,v2)
+ return v1[1] < v2[1]
+ end)
+ local n = #ls
+ return function()
+ i = i + 1
+ v = ls[i]
+ if i > n then return end
+ return v[1],v[2]
+ end
+ end
+
+ 10,20
+ 40,50
+
+### Reading Configuration Files
+
+The `config` module provides a simple way to convert several kinds of
+configuration files into a Lua table. Consider the simple example:
+
+ # test.config
+ # Read timeout in seconds
+ read.timeout=10
+
+ # Write timeout in seconds
+ write.timeout=5
+
+ #acceptable ports
+ ports = 1002,1003,1004
+
+This can be easily brought in using `config.read` and the result shown using
+`pretty.write`:
+
+ -- readconfig.lua
+ local config = require 'pl.config'
+ local pretty= require 'pl.pretty'
+
+ local t = config.read(arg[1])
+ print(pretty.write(t))
+
+and the output of `lua readconfig.lua test.config` is:
+
+ {
+ ports = {
+ 1002,
+ 1003,
+ 1004
+ },
+ write_timeout = 5,
+ read_timeout = 10
+ }
+
+That is, `config.read` will bring in all key/value pairs, ignore # comments, and
+ensure that the key names are proper Lua identifiers by replacing non-identifier
+characters with '_'. If the values are numbers, then they will be converted. (So
+the value of `t.write_timeout` is the number 5). In addition, any values which
+are separated by commas will be converted likewise into an array.
+
+Any line can be continued with a backslash. So this will all be considered one
+line:
+
+ names=one,two,three, \
+ four,five,six,seven, \
+ eight,nine,ten
+
+
+Windows-style INI files are also supported. The section structure of INI files
+translates naturally to nested tables in Lua:
+
+ ; test.ini
+ [timeouts]
+ read=10 ; Read timeout in seconds
+ write=5 ; Write timeout in seconds
+ [portinfo]
+ ports = 1002,1003,1004
+
+The output is:
+
+ {
+ portinfo = {
+ ports = {
+ 1002,
+ 1003,
+ 1004
+ }
+ },
+ timeouts = {
+ write = 5,
+ read = 10
+ }
+ }
+
+You can now refer to the write timeout as `t.timeouts.write`.
+
+As a final example of the flexibility of `config.read`, if passed this simple
+comma-delimited file
+
+ one,two,three
+ 10,20,30
+ 40,50,60
+ 1,2,3
+
+it will produce the following table:
+
+ {
+ { "one", "two", "three" },
+ { 10, 20, 30 },
+ { 40, 50, 60 },
+ { 1, 2, 3 }
+ }
+
+`config.read` isn't designed to read all CSV files in general, but is intended to
+support some Unix configuration files not structured as key-value pairs, such as
+'/etc/passwd'.
+
+This function is intended to be a Swiss Army Knife of configuration readers, but
+it does have to make assumptions, and you may not like them. So there is an
+optional extra parameter which allows some control: a table that may have
+the following fields:
+
+ {
+ variablilize = true,
+ convert_numbers = tonumber,
+ trim_space = true,
+ list_delim = ',',
+ trim_quotes = true,
+ ignore_assign = false,
+ keysep = '=',
+ smart = false,
+ }
+
+`variablilize` is the option that converted `write.timeout` in the first example
+to the valid Lua identifier `write_timeout`. If `convert_numbers` is true, then
+an attempt is made to convert any string that starts like a number. You can
+specify your own function (say one that will convert a string like '5224 kb' into
+a number.)
+
+`trim_space` ensures that there is no starting or trailing whitespace with
+values, and `list_delim` is the character that will be used to decide whether to
+split a value up into a list (it may be a Lua string pattern such as '%s+'.)
+
+For instance, the password file in Unix is colon-delimited:
+
+ t = config.read('/etc/passwd',{list_delim=':'})
+
+This produces the following output on my system (only last two lines shown):
+
+ {
+ ...
+ {
+ "user",
+ "x",
+ "1000",
+ "1000",
+ "user,,,",
+ "/home/user",
+ "/bin/bash"
+ },
+ {
+ "sdonovan",
+ "x",
+ "1001",
+ "1001",
+ "steve donovan,28,,",
+ "/home/sdonovan",
+ "/bin/bash"
+ }
+ }
+
+You can get this into a more sensible format, where the usernames are the keys,
+with this (the `tablex.pairmap` function must return value, key!)
+
+ t = tablex.pairmap(function(k,v) return v,v[1] end,t)
+
+and you get:
+
+ { ...
+ sdonovan = {
+ "sdonovan",
+ "x",
+ "1001",
+ "1001",
+ "steve donovan,28,,",
+ "/home/sdonovan",
+ "/bin/bash"
+ }
+ ...
+ }
+
+Many common Unix configuration files can be read by tweaking these parameters.
+For `/etc/fstab`, the options `{list_delim='%s+',ignore_assign=true}` will
+correctly separate the columns. It's common to find 'KEY VALUE' assignments in
+files such as `/etc/ssh/ssh_config`; the option `{keysep=' '}` makes
+`config.read` return a table mapping each KEY to its VALUE.
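+
+A sketch for such a 'KEY VALUE' style file:
+
+    local config = require 'pl.config'
+    -- each KEY in the file becomes a field of t holding its VALUE
+    local t = config.read('/etc/ssh/ssh_config', {keysep = ' '})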
+
+Files in the Linux `procfs` usually use `:` as the field delimiter:
+
+ > t = config.read('/proc/meminfo',{keysep=':'})
+ > = t.MemFree
+ 220140 kB
+
+That result is a string, since `tonumber` doesn't like it, but defining the
+`convert_numbers` option as `function(s) return tonumber((s:gsub(' kB$','')))
+end` will get the memory figures as actual numbers in the result. (The extra
+parentheses are necessary so that `tonumber` only gets the first result from
+`gsub`). From `tests/test-config.lua`:
+
+ testconfig([[
+ MemTotal: 1024748 kB
+ MemFree: 220292 kB
+ ]],
+ { MemTotal = 1024748, MemFree = 220292 },
+ {
+ keysep = ':',
+ convert_numbers = function(s)
+ s = s:gsub(' kB$','')
+ return tonumber(s)
+ end
+ }
+ )
+
+
+The `smart` option lets `config.read` make a reasonable guess for you; there
+are examples in `tests/test-config.lua`, but basically these common file
+formats (and those following the same pattern) can be processed directly in
+smart mode: '/etc/fstab', '/proc/XXXX/status', 'ssh_config' and 'updatedb.conf'.
+
+Please note that `config.read` can be passed a _file-like object_; if it's not a
+string and supports the `read` method, then that will be used. For instance, to
+read a configuration from a string, use `stringio.open`.
+
+
+<a id="lexer"/>
+
+### Lexical Scanning
+
+Although Lua's string pattern matching is very powerful, there are times when
+something more powerful is needed. `pl.lexer.scan` provides lexical scanners
+which _tokenize_ a string, classifying tokens into numbers, strings, etc.
+
+ > lua -lpl
+ Lua 5.1.4 Copyright (C) 1994-2008 Lua.org, PUC-Rio
+ > tok = lexer.scan 'alpha = sin(1.5)'
+ > = tok()
+ iden alpha
+ > = tok()
+ = =
+ > = tok()
+ iden sin
+ > = tok()
+ ( (
+ > = tok()
+ number 1.5
+ > = tok()
+ ) )
+ > = tok()
+ (nil)
+
+The scanner is a function, which is repeatedly called and returns the _type_ and
+_value_ of the token. Recognized basic types are 'iden','string','number', and
+'space', and everything else is represented by itself. Note that by default the
+scanner will skip any 'space' tokens.
+
+'comment' and 'keyword' aren't applicable to the plain scanner, which is not
+language-specific, but a scanner which understands Lua is available. It
+recognizes the Lua keywords, and understands both short and long comments and
+strings.
+
+ > for t,v in lexer.lua 'for i=1,n do' do print(t,v) end
+ keyword for
+ iden i
+ = =
+ number 1
+ , ,
+ iden n
+ keyword do
+
+A lexical scanner is useful where you have highly-structured data which is not
+nicely delimited by newlines. For example, here is a snippet of an in-house file
+format which it was my task to maintain:
+
+ points
+ (818344.1,-20389.7,-0.1),(818337.9,-20389.3,-0.1),(818332.5,-20387.8,-0.1)
+ ,(818327.4,-20388,-0.1),(818322,-20387.7,-0.1),(818316.3,-20388.6,-0.1)
+ ,(818309.7,-20389.4,-0.1),(818303.5,-20390.6,-0.1),(818295.8,-20388.3,-0.1)
+ ,(818290.5,-20386.9,-0.1),(818285.2,-20386.1,-0.1),(818279.3,-20383.6,-0.1)
+ ,(818274,-20381.2,-0.1),(818274,-20380.7,-0.1);
+
+Here is code to extract the points using `pl.lexer`:
+
+ -- assume 's' contains the text above...
+ local lexer = require 'pl.lexer'
+ local expecting = lexer.expecting
+ local append = table.insert
+
+ local tok = lexer.scan(s)
+
+ local points = {}
+ local t,v = tok() -- should be 'iden','points'
+
+ while t ~= ';' do
+        local c = {}
+ expecting(tok,'(')
+ c.x = expecting(tok,'number')
+ expecting(tok,',')
+ c.y = expecting(tok,'number')
+ expecting(tok,',')
+ c.z = expecting(tok,'number')
+ expecting(tok,')')
+ t,v = tok() -- either ',' or ';'
+ append(points,c)
+ end
+
+The `expecting` function grabs the next token and if the type doesn't match, it
+throws an error. (`pl.lexer`, unlike other PL libraries, raises errors if
+something goes wrong, so you should wrap your code in `pcall` to catch the error
+gracefully.)
+
+The scanners all have a second optional argument, which is a table which controls
+whether you want to exclude spaces and/or comments. The default for `lexer.lua`
+is `{space=true,comments=true}`. There is a third optional argument which
+determines how string and number tokens are to be processed.
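+
+For example, a sketch of keeping the tokens that are normally filtered out:
+
+    local lexer = require 'pl.lexer'
+    -- pass false to keep 'space' and 'comment' tokens in the stream
+    for t,v in lexer.lua('x = 1 -- set x', {space=false, comments=false}) do
+        print(t,v)
+    end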
+
+The ultimate highly-structured data is, of course, program source. Here is a
+snippet from 'text-lexer.lua':
+
+ require 'pl'
+
+ lines = [[
+ for k,v in pairs(t) do
+ if type(k) == 'number' then
+ print(v) -- array-like case
+ else
+ print(k,v)
+ end
+ end
+ ]]
+
+ ls = List()
+ for tp,val in lexer.lua(lines,{space=true,comments=true}) do
+ assert(tp ~= 'space' and tp ~= 'comment')
+ if tp == 'keyword' then ls:append(val) end
+ end
+ test.asserteq(ls,List{'for','in','do','if','then','else','end','end'})
+
+Here is a useful little utility that identifies all common global variables found
+in a Lua module (ignoring those declared locally for the moment):
+
+ -- testglobal.lua
+ require 'pl'
+
+ local txt,err = utils.readfile(arg[1])
+ if not txt then return print(err) end
+
+ local globals = List()
+ for t,v in lexer.lua(txt) do
+ if t == 'iden' and _G[v] then
+ globals:append(v)
+ end
+ end
+ pretty.dump(seq.count_map(globals))
+
+Rather than dumping the whole list, with its duplicates, we pass it through
+`seq.count_map` which turns the list into a table where the keys are the values,
+and the associated values are the number of times those values occur in the
+sequence. Typical output looks like this:
+
+ {
+ type = 2,
+ pairs = 2,
+ table = 2,
+ print = 3,
+ tostring = 2,
+ require = 1,
+ ipairs = 4
+ }
+
+You could further pass this through `tablex.keys` to get a unique list of
+symbols. This can be useful when writing 'strict' Lua modules, where all global
+symbols must be defined as locals at the top of the file.
+
+For a more detailed use of `lexer.scan`, please look at `testxml.lua` in the
+examples directory.
+
+### XML
+
+New in the 0.9.7 release is some support for XML. This is a large topic, and
+Penlight does not provide a full XML stack, which is properly the task of a more
+specialized library.
+
+#### Parsing and Pretty-Printing
+
+The semi-standard XML parser in the Lua universe is [lua-expat](http://matthewwild.co.uk/projects/luaexpat/).
+In particular,
+it has a function called `lxp.lom.parse` which will parse XML into the Lua Object
+Model (LOM) format. However, it does not provide a way to convert this data back
+into XML text. `xml.parse` will use this function, _if_ `lua-expat` is
+available, and otherwise switches back to a pure Lua parser originally written by
+Roberto Ierusalimschy.
+
+The resulting document object knows how to render itself as a string, which is
+useful for debugging:
+
+ > d = xml.parse "<nodes><node id='1'>alice</node></nodes>"
+ > = d
+ <nodes><node id='1'>alice</node></nodes>
+ > pretty.dump (d)
+ {
+ {
+ "alice",
+ attr = {
+ "id",
+ id = "1"
+ },
+ tag = "node"
+ },
+ attr = {
+ },
+ tag = "nodes"
+ }
+
+Looking at the actual shape of the data reveals the structure of LOM:
+
+ * every element has a `tag` field with its name
+ * plus an `attr` field which is a table containing the attributes as fields, and
+also as an array. It is always present.
+ * the children of the element are the array part of the element, so `d[1]` is
+the first child of `d`, etc.
+
+It could be argued that having attributes also as the array part of `attr` is not
+essential (you cannot depend on attribute order in XML) but that's how
+it goes with this standard.
+
+`lua-expat` is another _soft dependency_ of Penlight; generally, the fallback
+parser is good enough for straightforward XML as is commonly found in
+configuration files, etc. `doc.basic_parse` is not intended to be a proper
+conforming parser (it's only sixty lines) but it handles simple kinds of
+documents that do not have comments or DTD directives. It is intelligent enough
+to ignore the `<?xml` directive and that is about it.
+
+You can get pretty-printing by explicitly calling `xml.tostring` and passing it
+the initial indent and the per-element indent:
+
+ > = xml.tostring(d,'',' ')
+
+ <nodes>
+ <node id='1'>alice</node>
+ </nodes>
+
+There is a fourth argument which is the _attribute indent_:
+
+ > a = xml.parse "<frodo name='baggins' age='50' type='hobbit'/>"
+ > = xml.tostring(a,'',' ',' ')
+
+ <frodo
+ type='hobbit'
+ name='baggins'
+ age='50'
+ />
+
+#### Parsing and Working with Configuration Files
+
+It's common to find configurations expressed with XML these days. It's
+straightforward to 'walk' the [LOM](http://matthewwild.co.uk/projects/luaexpat/lom.html)
+data and extract the data in the form you want:
+
+ require 'pl'
+
+ local config = [[
+ <config>
+ <alpha>1.3</alpha>
+ <beta>10</beta>
+ <name>bozo</name>
+ </config>
+ ]]
+ local d,err = xml.parse(config)
+
+ local t = {}
+ for item in d:childtags() do
+ t[item.tag] = item[1]
+ end
+
+ pretty.dump(t)
+ --->
+ {
+ beta = "10",
+ alpha = "1.3",
+ name = "bozo"
+ }
+
+The only gotcha is that here we must use the `Doc:childtags` method, which will
+skip over any text elements.
+
+A more involved example is this excerpt from `serviceproviders.xml`, which is
+usually found at `/usr/share/mobile-broadband-provider-info/serviceproviders.xml`
+on Debian/Ubuntu Linux systems.
+
+ d = xml.parse [[
+ <serviceproviders format="2.0">
+ ...
+ <country code="za">
+ <provider>
+ <name>Cell-c</name>
+ <gsm>
+ <network-id mcc="655" mnc="07"/>
+ <apn value="internet">
+ <username>Cellcis</username>
+ <dns>196.7.0.138</dns>
+ <dns>196.7.142.132</dns>
+ </apn>
+ </gsm>
+ </provider>
+ <provider>
+ <name>MTN</name>
+ <gsm>
+ <network-id mcc="655" mnc="10"/>
+ <apn value="internet">
+ <dns>196.11.240.241</dns>
+ <dns>209.212.97.1</dns>
+ </apn>
+ </gsm>
+ </provider>
+ <provider>
+ <name>Vodacom</name>
+ <gsm>
+ <network-id mcc="655" mnc="01"/>
+ <apn value="internet">
+ <dns>196.207.40.165</dns>
+ <dns>196.43.46.190</dns>
+ </apn>
+ <apn value="unrestricted">
+ <name>Unrestricted</name>
+ <dns>196.207.32.69</dns>
+ <dns>196.43.45.190</dns>
+ </apn>
+ </gsm>
+ </provider>
+ <provider>
+ <name>Virgin Mobile</name>
+ <gsm>
+ <apn value="vdata">
+ <dns>196.7.0.138</dns>
+ <dns>196.7.142.132</dns>
+ </apn>
+ </gsm>
+ </provider>
+ </country>
+ ....
+ </serviceproviders>
+ ]]
+
+Getting the names of the providers per-country is straightforward:
+
+ local t = {}
+ for country in d:childtags() do
+ local providers = {}
+ t[country.attr.code] = providers
+ for provider in country:childtags() do
+ table.insert(providers,provider:child_with_name('name'):get_text())
+ end
+ end
+
+ pretty.dump(t)
+ -->
+ {
+ za = {
+ "Cell-c",
+ "MTN",
+ "Vodacom",
+ "Virgin Mobile"
+ }
+ ....
+ }
+
+#### Generating XML with 'xmlification'
+
+This feature is inspired by the `htmlify` function used by
+[Orbit](http://keplerproject.github.com/orbit/) to simplify HTML generation,
+except that no function environment magic is used; the `tags` function returns a
+set of _constructors_ for elements of the given tag names.
+
+ > nodes, node = xml.tags 'nodes, node'
+ > = node 'alice'
+ <node>alice</node>
+ > = nodes { node {id='1','alice'}}
+ <nodes><node id='1'>alice</node></nodes>
+
+The flexibility of Lua tables is very useful here, since both the attributes and
+the children of an element can be encoded naturally. The argument to these tag
+constructors is either a single value (like a string) or a table where the
+attributes are the named keys and the children are the array values.
+
+#### Generating XML using Templates
+
+A template is a little XML document which contains dollar-variables. The `subst`
+method on a document is fed an array of tables containing values for these
+variables. Note how the parent tag name is specified:
+
+ > templ = xml.parse "<node id='$id'>$name</node>"
+ > = templ:subst {tag='nodes', {id=1,name='alice'},{id=2,name='john'}}
+ <nodes><node id='1'>alice</node><node id='2'>john</node></nodes>
+
+Substitution is closely related to _filtering_ documents. One of the annoying things
+about XML is that it is a document markup language first, and a data language
+second. Standard parsers will assume you really care about all those extra
+text elements. Consider this fragment, which has been changed by a five-year old:
+
+ T = [[
+ <weather>
+ boops!
+ <current_conditions>
+ <condition data='$condition'/>
+ <temp_c data='$temp'/>
+ <bo>whoops!</bo>
+ </current_conditions>
+ </weather>
+ ]]
+
+Conformant parsers will give you text elements containing the line feed after
+`<current_conditions>`, even though this makes handling the data more irritating.
+
+ local function parse (str)
+ return xml.parse(str,false,true)
+ end
+
+The second argument means 'string, not file' and the third argument means 'use the
+built-in Lua parser' (instead of LuaExpat, if available), which _by default_ is not interested in
+keeping such strings.
+
+How to remove the string `boops!`? `clone` (also called `filter` when called as a
+method) copies a LOM document. It can be passed a filter function, which is applied
+to each string found. The powerful thing about this is that this function receives
+structural information - the parent node, and whether this was a tag name, a text
+element or an attribute name:
+
+ d = parse (T)
+ c = d:filter(function(s,kind,parent)
+ print(stringx.strip(s),kind,parent and parent.tag or '?')
+ if kind == '*TEXT' and #parent > 1 then return nil end
+ return s
+ end)
+ --->
+ weather *TAG ?
+ boops! *TEXT weather
+ current_conditions *TAG weather
+ condition *TAG current_conditions
+ $condition data condition
+ temp_c *TAG current_conditions
+ $temp data temp_c
+ bo *TAG current_conditions
+ whoops! *TEXT bo
+
+We can pull out 'boops' and not 'whoops' by discarding text elements which are not
+the single child of an element.
+
+
+
+#### Extracting Data using Templates
+
+Matching goes in the opposite direction. We have a document, and would like to
+extract values from it using a pattern.
+
+A common use of this is parsing the XML result of API queries. The
+[(undocumented and subsequently discontinued) Google Weather
+API](http://blog.programmableweb.com/2010/02/08/googles-secret-weather-api/) is a
+good example. Grabbing the result of
+`http://www.google.com/ig/api?weather=Johannesburg,ZA` we get something like
+this, after pretty-printing:
+
+ <xml_api_reply version='1'>
+ <weather module_id='0' tab_id='0' mobile_zipped='1' section='0' row='0'
+mobile_row='0'>
+ <forecast_information>
+ <city data='Johannesburg, Gauteng'/>
+ <postal_code data='Johannesburg,ZA'/>
+ <latitude_e6 data=''/>
+ <longitude_e6 data=''/>
+ <forecast_date data='2010-10-02'/>
+ <current_date_time data='2010-10-02 18:30:00 +0000'/>
+ <unit_system data='US'/>
+ </forecast_information>
+ <current_conditions>
+ <condition data='Clear'/>
+ <temp_f data='75'/>
+ <temp_c data='24'/>
+ <humidity data='Humidity: 19%'/>
+ <icon data='/ig/images/weather/sunny.gif'/>
+ <wind_condition data='Wind: NW at 7 mph'/>
+ </current_conditions>
+ <forecast_conditions>
+ <day_of_week data='Sat'/>
+ <low data='60'/>
+ <high data='89'/>
+ <icon data='/ig/images/weather/sunny.gif'/>
+ <condition data='Clear'/>
+ </forecast_conditions>
+ ....
+ </weather>
+ </xml_api_reply>
+
+Assume that the above XML has been read into `google`. The idea is to write a
+pattern looking like a template, and use it to extract some values of interest:
+
+ t = [[
+ <weather>
+ <current_conditions>
+ <condition data='$condition'/>
+ <temp_c data='$temp'/>
+ </current_conditions>
+ </weather>
+ ]]
+
+ local res, ret = google:match(t)
+ pretty.dump(res)
+
+And the output is:
+
+ {
+ condition = "Clear",
+ temp = "24"
+ }
+
+The `match` method can be passed a LOM document or some text, which will be
+parsed first.
+
+But what if we need to extract values from repeated elements? Match templates may
+contain 'array matches' which are enclosed in '{{..}}':
+
+ <weather>
+ {{<forecast_conditions>
+ <day_of_week data='$day'/>
+ <low data='$low'/>
+ <high data='$high'/>
+ <condition data='$condition'/>
+ </forecast_conditions>}}
+ </weather>
+
+And the match result is:
+
+ {
+ {
+ low = "60",
+ high = "89",
+ day = "Sat",
+ condition = "Clear",
+ },
+ {
+ low = "53",
+ high = "86",
+ day = "Sun",
+ condition = "Clear",
+ },
+ {
+ low = "57",
+ high = "87",
+ day = "Mon",
+ condition = "Clear",
+ },
+ {
+ low = "60",
+ high = "84",
+ day = "Tue",
+ condition = "Clear",
+ }
+ }
+
+With this array of tables, you can use `tablex` or `List`
+to reshape into the desired form, if you choose. Just as with reading a Unix password
+file with `config`, you can make the array into a map of days to conditions using:
+
+    tablex.pairmap('|k,v| v,v.day',conditions)
+
+(Here using the alternative string lambda option)
+
+However, XML matches can shape the structure of the output. By replacing the `day_of_week`
+line of the template with `<day_of_week data='$_'/>` we get the same effect; `$_` is
+a special symbol that means that this captured value (or simply _capture_) becomes the key.
+
+Note that `$NUMBER` means a numerical index, so
+that `$1` is the first element of the resulting array, and so forth. You can mix
+numbered and named captures, but it's strongly advised to make the numbered captures
+form a proper array sequence (everything from `1` to `n` inclusive). `$0` has a
+special meaning; if it is the only capture (`{[0]='foo'}`) then the table is
+collapsed into 'foo'.
+
+ <weather>
+ {{<forecast_conditions>
+ <day_of_week data='$_'/>
+ <low data='$1'/>
+ <high data='$2'/>
+ <condition data='$3'/>
+ </forecast_conditions>}}
+ </weather>
+
+Now the result is:
+
+ {
+ Tue = {
+ "60",
+ "84",
+ "Clear"
+ },
+ Sun = {
+ "53",
+ "86",
+ "Clear"
+ },
+ Sat = {
+ "60",
+ "89",
+ "Clear"
+ },
+ Mon = {
+ "57",
+ "87",
+ "Clear"
+ }
+ }
+
+Applying matches to this config file poses another problem, because the actual
+tags matched are themselves meaningful.
+
+ <config>
+ <alpha>1.3</alpha>
+ <beta>10</beta>
+ <name>bozo</name>
+ </config>
+
+So there are tag 'wildcards' which are element names ending with a hyphen.
+
+ <config>
+ {{<key->$value</key->}}
+ </config>
+
+You will then get `{{alpha='1.3'},...}`. The most convenient format would be
+returned by this (note that `_-` behaves just like `$_`):
+
+ <config>
+ {{<_->$0</_->}}
+ </config>
+
+which would return `{alpha='1.3',beta='10',name='bozo'}`.
+
+We could play this game endlessly, and encode ways of converting captures, but
+the scheme is complex enough, and it's easy to do the conversion later
+
+ local numbers = {alpha=true,beta=true}
+ for k,v in pairs(res) do
+        if numbers[k] then res[k] = tonumber(v) end
+ end
+
+
+#### HTML Parsing
+
+HTML is an unusually degenerate form of XML, and Dennis Schridde has contributed
+a feature which makes parsing it easier. For instance, from the tests:
+
+ doc = xml.parsehtml [[
+ <BODY>
+ Hello dolly<br>
+ HTML is <b>slack</b><br>
+ </BODY>
+ ]]
+
+ asserteq(xml.tostring(doc),[[
+ <body>
+ Hello dolly<br/>
+ HTML is <b>slack</b><br/></body>]])
+
+That is, all tags are converted to lowercase, and empty HTML elements like `br`
+are properly closed; attributes do not need to be quoted.
+
+Also, DOCTYPE directives and comments are skipped. For truly badly formed HTML,
+this is not the tool for you!
+
+
+
diff --git a/Data/Libraries/Penlight/docs_topics/07-functional.md b/Data/Libraries/Penlight/docs_topics/07-functional.md
new file mode 100644
index 0000000..5921a3d
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/07-functional.md
@@ -0,0 +1,547 @@
+## Functional Programming
+
+### Sequences
+
+@lookup pl.seq
+
+A Lua iterator (in its simplest form) is a function which can be repeatedly
+called to return a set of one or more values. The `for in` statement understands
+these iterators, and loops until the function returns `nil`. There are standard
+sequence adapters for tables in Lua (`ipairs` and `pairs`), and `io.lines`
+returns an iterator over all the lines in a file. In the Penlight libraries, such
+iterators are also called _sequences_. A sequence of single values (say from
+`io.lines`) is called _single-valued_, whereas the sequence defined by `pairs` is
+_double-valued_.
+
+`pl.seq` provides a number of useful iterators, and some functions which operate
+on sequences. At first sight this example looks like an attempt to write Python
+in Lua (with the range being inclusive):
+
+ > for i in seq.range(1,4) do print(i) end
+ 1
+ 2
+ 3
+ 4
+
+But `range` is actually equivalent to Python's `xrange`, since it generates a
+sequence, not a list. To get a list, use `seq.copy(seq.range(1,10))`, which
+takes any single-valued sequence and makes a table from the result. `seq.list` is
+like `ipairs` except that it does not give you the index, just the value.
+
+ > for x in seq.list {1,2,3} do print(x) end
+ 1
+ 2
+ 3
+
+`enum` takes a sequence and turns it into a double-valued sequence consisting of
+a sequence number and the value, so `enum(list(ls))` is actually equivalent to
+`ipairs`. A more interesting example prints out a file with line numbers:
+
+ for i,v in seq.enum(io.lines(fname)) do print(i..' '..v) end
+
+Sequences can be _combined_, either by 'zipping' them or by concatenating them.
+
+ > for x,y in seq.zip(l1,l2) do print(x,y) end
+ 10 1
+ 20 2
+ 30 3
+ > for x in seq.splice(l1,l2) do print(x) end
+ 10
+ 20
+ 30
+ 1
+ 2
+ 3
+
+`seq.printall` is useful for printing out single-valued sequences, and provides
+some finer control over formatting, such as a delimiter, the number of fields per
+line, and a format string to use (see `string.format`).
+
+ > seq.printall(seq.random(10))
+ 0.0012512588885159 0.56358531449324 0.19330423902097 ....
+ > seq.printall(seq.random(10), ',', 4, '%4.2f')
+ 0.17,0.86,0.71,0.51
+ 0.30,0.01,0.09,0.36
+ 0.15,0.17,
+
+`map` will apply a function to a sequence.
+
+ > seq.printall(seq.map(string.upper, {'one','two'}))
+ ONE TWO
+ > seq.printall(seq.map('+', {10,20,30}, 1))
+ 11 21 31
+
+`filter` will filter a sequence using a boolean function (often called a
+_predicate_). For instance, this code only prints lines in a file which are
+composed of digits:
+
+ for l in seq.filter(io.lines(file), stringx.isdigit) do print(l) end
+
+The following returns a table consisting of all the positive values in the
+original table (equivalent to `tablex.filter(ls, '>', 0)`)
+
+ ls = seq.copy(seq.filter(ls, '>', 0))
+
+We've already encountered `seq.sum` when discussing `input.numbers`. This can also
+be expressed with `seq.reduce`:
+
+    > = seq.reduce(function(x,y) return x + y end, seq.list{1,2,3,4})
+ 10
+
+`seq.reduce` applies a binary function in a recursive fashion, so that:
+
+    reduce(op,{1,2,3}) => op(1,reduce(op,{2,3})) => op(1,op(2,3))
+
+It's now possible to easily generate other cumulative operations; the standard
+operations declared in `pl.operator` are useful here:
+
+ > ops = require 'pl.operator'
+ > -- can also say '*' instead of ops.mul
+ > = seq.reduce(ops.mul,input.numbers '1 2 3 4')
+ 24
+
+There are functions to extract statistics from a sequence of numbers:
+
+ > l1 = List {10,20,30}
+ > l2 = List {1,2,3}
+ > = seq.minmax(l1)
+ 10 30
+ > = seq.sum(l1)
+ 60 3
+
+It is common to get sequences where values are repeated, say the words in a file.
+`count_map` will take such a sequence and count the values, returning a table
+where the _keys_ are the unique values, and the value associated with each key is
+the number of times they occurred:
+
+ > t = seq.count_map {'one','fred','two','one','two','two'}
+ > = t
+ {one=2,fred=1,two=3}
+
+This will also work on numerical sequences, but you cannot expect the result to
+be a proper list, i.e. having no 'holes'. Instead, you always need to use `pairs`
+to iterate over the result - note that there is a hole at index 5:
+
+ > t = seq.count_map {1,2,4,2,2,3,4,2,6}
+ > for k,v in pairs(t) do print(k,v) end
+ 1 1
+ 2 4
+ 3 1
+ 4 2
+ 6 1
+
+`unique` uses `count_map` to return a list of the unique values, that is, just
+the keys of the resulting table.
+
+`last` turns a single-valued sequence into a double-valued sequence with the
+current value and the last value:
+
+ > for current,last in seq.last {10,20,30,40} do print (current,last) end
+ 20 10
+ 30 20
+ 40 30
+
+This makes it easy to do things like identify repeated lines in a file, or
+construct differences between values. `filter` can handle double-valued sequences
+as well, so one could filter such a sequence to only return cases where the
+current value is less than the last value by using `operator.lt` or just '<'.
+This code then copies the resulting sequence into a table.
+
+    > s = {10,9,10,3}
+ > = seq.copy(seq.filter(seq.last(s),'<'))
+ {9,3}
+
+
+### Sequence Wrappers
+
+The functions in `pl.seq` cover the common patterns when dealing with sequences,
+but chaining these functions together can lead to ugly code. Consider the last
+example of the previous section; `seq` is repeated three times and the resulting
+expression has to be read right-to-left. The first issue can be helped by local
+aliases, so that the expression becomes `copy(filter(last(s),'<'))` but the
+second issue refers to the somewhat unnatural order of functional application.
+We tend to prefer reading operations from left to right, which is one reason why
+object-oriented notation has become popular. Sequence adapters allow this
+expression to be written like so:
+
+ seq(s):last():filter('<'):copy()
+
+With this notation, the operation becomes a chain of method calls running from
+left to right.
+
+'Sequence' is not a basic Lua type; sequences are generally functions or callable
+objects. The expression `seq(s)` wraps a sequence in a _sequence wrapper_, which
+is an object which understands all the functions in `pl.seq` as methods. This
+object then explicitly represents sequences.
+
+As a special case, the constructor (which is when you call the table `seq`) will
+make a wrapper for a plain list-like table. Here we apply the length operator to
+a sequence of strings, and print them out.
+
+ > seq{'one','tw','t'} :map '#' :printall()
+ 3 2 1
+
+As a convenience, there is a function `seq.lines` which behaves just like
+`io.lines` except it wraps the result as an explicit sequence type. This takes
+the first 10 lines from standard input, makes them uppercase, turns them into a
+sequence with a count and the value, glues these together with the concatenation
+operator, and finally prints out the sequence delimited by a newline.
+
+ seq.lines():take(10):upper():enum():map('..'):printall '\n'
+
+Note the method `upper`, which is not a `seq` function. If an unknown method is
+called, sequence wrappers apply that method to all the values in the sequence
+(this is an implicit use of `mapmethod`).
+
+It is straightforward to create custom sequences that can be used in this way. On
+Unix, `/dev/random` gives you an _endless_ sequence of random bytes, so we use
+`take` to limit the sequence, and then `map` to scale the result into the desired
+range. The key step is to use `seq` to wrap the iterator function:
+
+ -- random.lua
+ local seq = require 'pl.seq'
+
+ function dev_random()
+ local f = io.open('/dev/random')
+ local byte = string.byte
+ return seq(function()
+ -- read two bytes into a string and convert into a 16-bit number
+ local s = f:read(2)
+ return byte(s,1) + 256*byte(s,2)
+ end)
+ end
+
+ -- print 10 random numbers from 0 to 1 !
+ dev_random():take(10):map('%',100):map('/',100):printall ','
+
+
+Another Linux one-liner depends on the `/proc` filesystem and makes a list of the
+IDs of all currently running processes:
+
+ pids = seq(lfs.dir '/proc'):filter(stringx.isdigit):map(tonumber):copy()
+
+This version of Penlight has an experimental feature which relies on the fact
+that _all_ Lua types can have metatables, including functions. This makes
+_implicit sequence wrapping_ possible:
+
+ > seq.import()
+ > seq.random(5):printall(',',5,'%4.1f')
+ 0.0, 0.1, 0.4, 0.1, 0.2
+
+This avoids the awkward `seq(seq.random(5))` construction. Or the iterator can
+come from somewhere else completely:
+
+ > ('one two three'):gfind('%a+'):printall(',')
+ one,two,three,
+
+After `seq.import`, it is no longer necessary to explicitly wrap sequence
+functions.
+
+But there is a price to pay for this convenience. _Every_ function is affected,
+so that any function can be used, appropriate or not:
+
+ > math.sin:printall()
+ ..seq.lua:287: bad argument #1 to '(for generator)' (number expected, got nil)
+ > a = tostring
+ > = a:find(' ')
+ function: 0042C920
+
+What function is returned? It's almost certain to be something that makes no
+sense in the current context. So implicit sequences may make certain kinds of
+programming mistakes harder to catch - they are best used for interactive
+exploration and small scripts.
+
+<a id="comprehensions"/>
+
+### List Comprehensions
+
+List comprehensions are a compact way to create tables by specifying their
+elements. In Python, you can say this:
+
+ ls = [x for x in range(5)] # == [0,1,2,3,4]
+
+In Lua, using `pl.comprehension`:
+
+ > C = require('pl.comprehension').new()
+ > = C ('x for x=1,10') ()
+ {1,2,3,4,5,6,7,8,9,10}
+
+`C` is a function which compiles a list comprehension _string_ into a _function_.
+In this case, the function has no arguments. The parentheses are redundant for a
+function taking a string argument, so this works as well:
+
+ > = C 'x^2 for x=1,4' ()
+ {1,4,9,16}
+ > = C '{x,x^2} for x=1,4' ()
+ {{1,1},{2,4},{3,9},{4,16}}
+
+Note that the expression can be _any_ function of the variable `x`!
+
+The basic syntax so far is `<expr> for <set>`, where `<set>` can be anything that
+the Lua `for` statement understands. `<set>` can also just be the variable, in
+which case the values will come from the _argument_ of the comprehension. Here
+I'm emphasizing that a comprehension is a function which can take a list argument:
+
+ > = C '2*x for x' {1,2,3}
+ {2,4,6}
+ > dbl = C '2*x for x'
+ > = dbl {10,20,30}
+ {20,40,60}
+
+Here is a somewhat more explicit way of saying the same thing; `_1` is a
+_placeholder_ referring to the _first_ argument passed to the comprehension:
+
+ > = C '2*x for _,x in pairs(_1)' {10,20,30}
+ {20,40,60}
+ > = C '_1(x) for x'(tostring,{1,2,3,4})
+ {'1','2','3','4'}
+
+This extended syntax is useful when you wish to collect the result of some
+iterator, such as `io.lines`. This comprehension creates a function which
+collects all the lines in a file into a table:
+
+ > f = io.open('array.lua')
+ > lines = C 'line for line in _1:lines()' (f)
+ > = #lines
+ 118
+
+There are a number of functions that may be applied to the result of a
+comprehension:
+
+ > = C 'min(x for x)' {1,44,0}
+ 0
+ > = C 'max(x for x)' {1,44,0}
+ 44
+ > = C 'sum(x for x)' {1,44,0}
+ 45
+
+(These are equivalent to a reduce operation on a list.)
+
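+For instance, the `sum` case corresponds to the earlier `seq.reduce` example
+(assuming `seq` and `ops = require 'pl.operator'` from the earlier session):
+
+    > = seq.reduce(ops.add, seq.list{1,44,0})
+    45
+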
+After the `for` part, there may be a condition, which filters the output. This
+comprehension collects the even numbers from a list:
+
+ > = C 'x for x if x % 2 == 0' {1,2,3,4,5}
+ {2,4}
+
+There may be a number of `for` parts:
+
+ > = C '{x,y} for x = 1,2 for y = 1,2' ()
+ {{1,1},{1,2},{2,1},{2,2}}
+ > = C '{x,y} for x for y' ({1,2},{10,20})
+ {{1,10},{1,20},{2,10},{2,20}}
+
+These comprehensions are useful when dealing with functions of more than one
+variable, and are not so easily achieved with the other Penlight functional forms.
+
+<a id="func"/>
+
+### Creating Functions from Functions
+
+@lookup pl.func
+
+Lua functions may be treated like any other value, although of course you cannot
+multiply or add them. One operation that makes sense is _function composition_,
+which chains function calls (so `(f * g)(x)` is `f(g(x))`).
+
+ > func = require 'pl.func'
+ > printf = func.compose(io.write,string.format)
+ > printf("hello %s\n",'world')
+ hello world
+ true
+
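+Under the hood, composition is just a closure over the functions involved. A
+minimal two-function sketch of the idea (illustrative only; `compose2` is a
+made-up name, not Penlight's actual implementation):
+
+    -- compose two functions: compose2(f,g)(x) == f(g(x))
+    local function compose2(f, g)
+        return function(...) return f(g(...)) end
+    end
+
+    local printf = compose2(io.write, string.format)
+    printf('hello %s\n', 'world')   --> hello world
+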
+Many functions require you to pass a function as an argument, say to apply to all
+values of a sequence or to use as a callback. Often a useful function has the
+wrong number of arguments for the job, so there is a need to construct a function
+of one argument from a function of two arguments, _binding_ the extra argument to
+a given value.
+
+_Partial application_ takes a function of n arguments and returns a function of n-1
+arguments where the first argument is bound to some value:
+
+ > p2 = func.bind1(print,'start>')
+ > p2('hello',2)
+ start> hello 2
+ > ops = require 'pl.operator'
+    > = tablex.filter({1,-2,10,-1,2},func.bind1(ops.gt,0))
+    {-2,-1}
+    > = tablex.filter({1,-2,10,-1,2},func.bind1(ops.le,0))
+    {1,10,2}
+
+The last example unfortunately reads backwards, because `bind1` always binds the
+first argument! Also unfortunately, in my youth I confused 'currying' with
+'partial application', so the old name for `bind1` is `curry` - this alias still exists.
+
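+What `bind1` produces is roughly this closure (a sketch only; the real function
+generates and compiles the equivalent code at runtime, as the `_DEBUG` output
+further down illustrates for `bind`):
+
+    -- bind the first argument of a function to a fixed value
+    local function bind1(f, a)
+        return function(...) return f(a, ...) end
+    end
+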
+This is a specialized form of function argument binding. Here is another way
+to say the `print` example:
+
+ > p2 = func.bind(print,'start>',func._1,func._2)
+ > p2('hello',2)
+ start> hello 2
+
+where `_1` and `_2` are _placeholder variables_, corresponding to the first and
+second argument respectively.
+
+Having `func` all over the place is distracting, so it's useful to pull all of
+`pl.func` into the local context. Here is the filter example, this time the right
+way around:
+
+ > utils.import 'pl.func'
+    > = tablex.filter({1,-2,10,-1,2},bind(ops.gt, _1, 0))
+ {1,10,2}
+
+`tablex.merge` does a general merge of two tables. This example shows the
+usefulness of binding the last argument of a function.
+
+ > S1 = {john=27, jane=31, mary=24}
+ > S2 = {jane=31, jones=50}
+ > intersection = bind(tablex.merge, _1, _2, false)
+ > union = bind(tablex.merge, _1, _2, true)
+ > = intersection(S1,S2)
+ {jane=31}
+ > = union(S1,S2)
+ {mary=24,jane=31,john=27,jones=50}
+
+When using `bind` with `print`, we got a function of precisely two arguments,
+whereas we really want our function to use varargs like `print`. This is the role
+of `_0`:
+
+ > _DEBUG = true
+ > p = bind(print,'start>', _0)
+ return function (fn,_v1)
+ return function(...) return fn(_v1,...) end
+ end
+
+ > p(1,2,3,4,5)
+ start> 1 2 3 4 5
+
+I've turned on the global `_DEBUG` flag, so that the function generated is
+printed out. It is actually a function which _generates_ the required function;
+the first call _binds the value_ of `_v1` to 'start>'.
+
+### Placeholder Expressions
+
+A common pattern in Penlight is a function which applies another function to all
+elements in a table or a sequence, such as `tablex.map` or `seq.filter`. Lua does
+anonymous functions well, although they can be a bit tedious to type:
+
+ > = tablex.map(function(x) return x*x end, {1,2,3,4})
+ {1,4,9,16}
+
+`pl.func` allows you to define _placeholder expressions_, which can cut down on
+the typing required, and also make your intent clearer. First, we bring contents
+of `pl.func` into our context, and then supply an expression using placeholder
+variables, such as `_1`,`_2`,etc. (C++ programmers will recognize this from the
+Boost libraries.)
+
+ > utils.import 'pl.func'
+ > = tablex.map(_1*_1, {1,2,3,4})
+ {1,4,9,16}
+
+Functions of up to 5 arguments can be generated.
+
+ > = tablex.map2(_1+_2,{1,2,3}, {10,20,30})
+ {11,22,33}
+
+These expressions can use arbitrary functions, although they must first be
+registered with the functional library. `func.register` brings in a single
+function, and `func.import` brings in a whole table of functions, such as `math`.
+
+ > sin = register(math.sin)
+ > = tablex.map(sin(_1), {1,2,3,4})
+ {0.8414709848079,0.90929742682568,0.14112000805987,-0.75680249530793}
+ > import 'math'
+ > = tablex.map(cos(2*_1),{1,2,3,4})
+ {-0.41614683654714,-0.65364362086361,0.96017028665037,-0.14550003380861}
+
+A common operation is calling a method of a set of objects:
+
+ > = tablex.map(_1:sub(1,1), {'one','four','x'})
+ {'o','f','x'}
+
+There are some restrictions on what operators can be used in PEs. For instance,
+because the `__len` metamethod cannot be overridden by plain Lua tables, we need
+to define a special function to express `#_1`:
+
+ > = tablex.map(Len(_1), {'one','four','x'})
+ {3,4,1}
+
+Likewise for comparison operators, which cannot be overloaded for _different_
+types, and thus also have to be expressed as a special function:
+
+ > = tablex.filter(Gt(_1,0), {1,-1,2,4,-3})
+ {1,2,4}
+
+It is useful to express the fact that a function returns multiple values. For
+instance, `tablex.pairmap` expects a function that will be called with the key
+and the value, and returns the new value and the key, in that order.
+
+ > = pairmap(Args(_2,_1:upper()),{fred=1,alice=2})
+ {ALICE=2,FRED=1}
+
+PEs cannot contain `nil` values, since PE function arguments are represented as
+an array. Instead, a special value called `Nil` is provided. So say
+`_1:f(Nil,1)` instead of `_1:f(nil,1)`.
+
+A placeholder expression cannot be automatically used as a Lua function. The
+technical reason is that the call operator must be overloaded to construct
+function calls like `_1(1)`. If you want to force a PE to return a function, use
+`func.I`.
+
+ > = tablex.map(_1(10),{I(2*_1),I(_1*_1),I(_1+2)})
+ {20,100,12}
+
+Here we make a table of functions taking a single argument, and then call them
+all with a value of 10.
+
+The essential idea with PEs is to 'quote' an expression so that it is not
+immediately evaluated, but instead turned into a function that can be applied
+later to some arguments. The basic mechanism is to wrap values and placeholders
+so that the usual Lua operators have the effect of building up an _expression
+tree_. (It turns out that you can do _symbolic algebra_ using PEs, see
+`symbols.lua` in the examples directory, and its test runner `testsym.lua`, which
+demonstrates symbolic differentiation.)
+
+The rule is that if any operator has a PE operand, the result will be quoted.
+Sometimes we need to quote things explicitly. For instance, say we want to pass a
+function to a filter that must return true if the element value is in a set.
+`set[_1]` is the obvious expression, but it does not give the desired result,
+since it evaluates directly, giving `nil`. Indexing works differently from a
+binary operation like addition (`set+_1` _is_ properly quoted), so there is a need
+for an explicit quoting or wrapping operation. This is the job of the `_`
+function; the PE in this case should be `_(set)[_1]`. The same trick works for
+functions, as a convenient alternative to registering them: `_(math.sin)(_1)`.
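+
+For example, using a hypothetical `nums` set (this assumes `tablex.filter`
+accepts PEs through `function_arg`, just as `tablex.map` does above):
+
+    > nums = {one=true, two=true, three=true}
+    > = tablex.filter({'one','four','two','x'}, _(nums)[_1])
+    {'one','two'}
+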
+The `_` wrapping also lets a sequence be driven by any method call; the
+following, where `f` is an open file, is equivalent to using its `lines` method:
+
+ for line in I(_(f):read()) do print(line) end
+
+Now this will work for _any_ 'file-like' object which has a `read` method
+returning the next line. If you had a LuaSocket client which was being 'pushed'
+by lines sent from a server, then `_(s):receive '*l'` would create an iterator
+for accepting input. These forms can be convenient for adapting your data flow so
+that it can be passed to the sequence functions in `pl.seq`.
+
+Placeholder expressions can be mixed with sequence wrapper expressions.
+`lexer.lua` will give us a double-valued sequence of tokens, where the first
+value is a type, and the second is a value. We filter out only the values where
+the type is 'iden', extract the actual value using `map`, get the unique values
+and finally copy to a list.
+
+ > str = 'for i=1,10 do for j = 1,10 do print(i,j) end end'
+ > = seq(lexer.lua(str)):filter('==','iden'):map(_2):unique():copy()
+ {i,print,j}
+
+This is a particularly intense line (and I don't always suggest making everything
+a one-liner!); the key is the behaviour of `map`, which will take both values of
+the sequence, so `_2` returns the value part. (Since `filter` here takes extra
+arguments, it only operates on the type values.)
+
+There are some performance considerations to using placeholder expressions.
+Instantiating a PE requires constructing and compiling a function, which is not
+such a fast operation. So to get the best performance, factor out PEs from loops
+like this:
+
+ local fn = I(_1:f() + _2:g())
+ for i = 1,n do
+ res[i] = tablex.map2(fn,first[i],second[i])
+ end
+
+
diff --git a/Data/Libraries/Penlight/docs_topics/08-additional.md b/Data/Libraries/Penlight/docs_topics/08-additional.md
new file mode 100644
index 0000000..2c99497
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/08-additional.md
@@ -0,0 +1,600 @@
+## Additional Libraries
+
+Libraries in this section are no longer considered to be part of the Penlight
+core, but still provide specialized functionality when needed.
+
+<a id="sip"/>
+
+### Simple Input Patterns
+
+Lua string pattern matching is very powerful, and usually you will not need a
+traditional regular expression library. Even so, sometimes Lua code ends up
+looking like Perl, which happens because string patterns are not always the
+easiest things to read, especially for the casual reader. Here is a program
+which needs to understand three distinct date formats:
+
+ -- parsing dates using Lua string patterns
+ months={Jan=1,Feb=2,Mar=3,Apr=4,May=5,Jun=6,
+ Jul=7,Aug=8,Sep=9,Oct=10,Nov=11,Dec=12}
+
+ function check_and_process(d,m,y)
+ d = tonumber(d)
+ m = tonumber(m)
+ y = tonumber(y)
+ ....
+ end
+
+ for line in f:lines() do
+ -- ordinary (English) date format
+ local d,m,y = line:match('(%d+)/(%d+)/(%d+)')
+ if d then
+ check_and_process(d,m,y)
+ else -- ISO date??
+ y,m,d = line:match('(%d+)%-(%d+)%-(%d+)')
+ if y then
+ check_and_process(d,m,y)
+ else -- <day> <month-name> <year>?
+                d,mm,y = line:match('(%d+)%s+(%a+)%s+(%d+)')
+ m = months[mm]
+ check_and_process(d,m,y)
+ end
+ end
+ end
+
+These aren't particularly difficult patterns, but already typical issues are
+appearing, such as having to escape '-'. Also, `string.match` returns its
+captures, so that we're forced to use a slightly awkward nested if-statement.
+
+Verification issues will further cloud the picture, since regular expression
+people try to enforce constraints (like year cannot be more than four digits)
+using regular expressions, on the usual grounds that you shouldn't stop using a
+hammer when you are enjoying yourself.
+
+`pl.sip` provides a simple, intuitive way to detect patterns in strings and
+extract relevant parts.
+
+ > sip = require 'pl.sip'
+ > dump = require('pl.pretty').dump
+ > res = {}
+ > c = sip.compile 'ref=$S{file}:$d{line}'
+ > = c('ref=hello.c:10',res)
+ true
+ > dump(res)
+ {
+ line = 10,
+ file = "hello.c"
+ }
+ > = c('ref=long name, no line',res)
+ false
+
+`sip.compile` creates a pattern matcher function, which takes a string and a
+table as arguments. If the string matches the pattern, then `true` is returned
+and the table is populated according to the captures within the pattern.
+
+Here is another version of the date parser:
+
+ -- using SIP patterns
+ function check(t)
+ check_and_process(t.day,t.month,t.year)
+ end
+
+ shortdate = sip.compile('$d{day}/$d{month}/$d{year}')
+ longdate = sip.compile('$d{day} $v{mon} $d{year}')
+ isodate = sip.compile('$d{year}-$d{month}-$d{day}')
+
+ for line in f:lines() do
+ local res = {}
+        if shortdate(line,res) then
+            check(res)
+        elseif isodate(line,res) then
+            check(res)
+        elseif longdate(line,res) then
+ res.month = months[res.mon]
+ check(res)
+ end
+ end
+
+SIP captures start with '$', then a one-character type, and then an
+optional variable name in curly braces.
+
+ Type Meaning
+ v identifier
+ i possibly signed integer
+ f floating-point number
+ r rest of line
+ q quoted string (quoted using either ' or ")
+ p a path name
+ ( anything inside balanced parentheses
+ [ anything inside balanced brackets
+ { anything inside balanced curly brackets
+ < anything inside balanced angle brackets
+
+If a type is not one of the above, then it's assumed to be one of the standard
+Lua character classes, and will match one or more repetitions of that class.
+Any spaces you leave in your pattern will match any number of spaces, including
+zero, unless the spaces are between two identifier characters or patterns
+matching them; in that case, at least one space will be matched.
+
+SIP captures (like `$v{mon}`) do not have to be named. You can use just `$v`, but
+you have to be consistent; if a pattern contains unnamed captures, then all
+captures must be unnamed. In this case, the result table is a simple list of
+values.
+
+`sip.match` is a useful shortcut if you want to compile and match in one call,
+without saving the compiled pattern. It caches the result, so it is not much
+slower than explicitly using `sip.compile`.
+
+ > sip.match('($q{first},$q{second})','("john","smith")',res)
+ true
+ > res
+ {second='smith',first='john'}
+ > res = {}
+ > sip.match('($q,$q)','("jan","smit")',res) -- unnamed captures
+ true
+ > res
+ {'jan','smit'}
+ > sip.match('($q,$q)','("jan", "smit")',res)
+ false ---> oops! Can't handle extra space!
+ > sip.match('( $q , $q )','("jan", "smit")',res)
+ true
+
+As a general rule, allow for whitespace in your patterns.
+
+Finally, putting a '$' at the end of a pattern means 'capture the rest of the
+line, starting at the first non-space'. It is a shortcut for '$r{rest}',
+or just '$r' if no named captures are used.
+
+ > sip.match('( $q , $q ) $','("jan", "smit") and a string',res)
+ true
+ > res
+ {'jan','smit','and a string'}
+ > res = {}
+ > sip.match('( $q{first} , $q{last} ) $','("jan", "smit") and a string',res)
+ true
+ > res
+ {first='jan',rest='and a string',last='smit'}
+
+
+<a id="lapp"/>
+
+### Command-line Programs with Lapp
+
+`pl.lapp` is a small and focused Lua module which aims to make standard
+command-line parsing easier and more intuitive. It implements the standard GNU style,
+i.e. short flags with one letter start with '-', and there may be an additional
+long flag which starts with '--'. Generally options which take an argument expect
+to find it as the next parameter (e.g. 'gcc test.c -o test') but single short
+options taking a value can dispense with the space (e.g. 'head -n4
+test.c' or `gcc -I/usr/include/lua/5.1 ...`).
+
+As far as possible, Lapp will convert parameters into their equivalent Lua types,
+i.e. convert numbers and convert filenames into file objects. If any conversion
+fails, or a required parameter is missing, an error will be issued and the usage
+text will be written out. So there are two necessary tasks: supplying the flag
+and option names, and associating them with a type.
+
+For any non-trivial script, even for personal consumption, it's necessary to
+supply usage text. The novelty of Lapp is that it starts from that point and
+defines a loose format for usage strings which can specify the names and types of
+the parameters.
+
+An example will make this clearer:
+
+ -- scale.lua
+ lapp = require 'pl.lapp'
+ local args = lapp [[
+ Does some calculations
+ -o,--offset (default 0.0) Offset to add to scaled number
+ -s,--scale (number) Scaling factor
+ <number> (number) Number to be scaled
+ ]]
+
+ print(args.offset + args.scale * args.number)
+
+Here is a command-line session using this script:
+
+ $ lua scale.lua
+ scale.lua:missing required parameter: scale
+
+ Does some calculations
+ -o,--offset (default 0.0) Offset to add to scaled number
+ -s,--scale (number) Scaling factor
+ <number> (number ) Number to be scaled
+
+ $ lua scale.lua -s 2.2 10
+ 22
+
+ $ lua scale.lua -s 2.2 x10
+ scale.lua:unable to convert to number: x10
+
+ ....(usage as before)
+
+There are two kinds of meaningful lines in Lapp usage strings: option
+and parameter lines. An option line gives the short option, optionally followed
+by the corresponding long option. A type specifier in parentheses may follow.
+Similarly, a parameter line starts with '<NAME>', followed by a type
+specifier.
+
+Type specifiers usually start with a type name: one of 'boolean', 'string',
+'number', 'file-in' or 'file-out'. You may leave this out, but then you _must_
+say 'default' followed by a value.
+If a flag or parameter has a default, it is not _required_ and is set to the default. The actual
+type is deduced from this value (number, string, file or boolean) if not provided directly.
+'Deduce' is a fancy word for 'guess' and it can be wrong, e.g. '(default 1)'
+will always be a number. You can say '(string default 1)' to override the guess.
+There are file values for the predefined console streams: stdin, stdout, stderr.
+
+The boolean type is the default for flags. Not providing the type specifier is
+equivalent to '(boolean default false)'. If the flag is meant to be 'turned off'
+then either the full '(boolean default true)' or the shortcut '(default true)'
+will work.
+
+An alternative to `default` is `optional`:
+
+ local lapp = require 'pl.lapp'
+ local args = lapp [[
+ --cmd (optional string) Command to run.
+ ]]
+
+ if args.cmd then
+ os.execute(args.cmd)
+ end
+
+Here we're implying that `cmd` need not be specified (just as with `default`) but if not
+present, then `args.cmd` is `nil`, which will always test false.
+
+The rest of the line is ignored and can be used for explanatory text.
+
+This script shows the relation between the specified parameter names and the
+fields in the output table.
+
+ -- simple.lua
+ local args = require ('pl.lapp') [[
+ Various flags and option types
+ -p A simple optional flag, defaults to false
+ -q,--quiet A simple flag with long name
+ -o (string) A required option with argument
+ -s (default 'save') Optional string with default 'save' (single quotes ignored)
+ -n (default 1) Optional numerical flag with default 1
+ -b (string default 1) Optional string flag with default '1' (type explicit)
+ <input> (default stdin) Optional input file parameter, reads from stdin
+ ]]
+
+ for k,v in pairs(args) do
+ print(k,v)
+ end
+
+I've just dumped out all values of the args table; note that args.quiet has
+become true, because it's specified; args.p defaults to false. If there is a long
+name for an option, that will be used in preference as a field name. A type or
+default specifier is not necessary for simple flags, since the default type is
+boolean.
+
+ $ simple -o test -q simple.lua
+ p false
+ input file (781C1BD8)
+ quiet true
+ o test
+ input_name simple.lua
+    $ simple -o test simple.lua one two three
+ 1 one
+ 2 two
+ 3 three
+ p false
+ quiet false
+ input file (781C1BD8)
+ o test
+ input_name simple.lua
+
+The parameter input has been set to an open read-only file object - we know it
+must be a read-only file since that is the type of the default value. The field
+input_name is automatically generated, since it's often useful to have access to
+the original filename.
+
+Notice that any extra parameters supplied will be put in the result table with
+integer indices, i.e. args[i] where i goes from 1 to #args.
+
+Files don't really have to be closed explicitly for short scripts with a quick
+well-defined mission, since the result of garbage-collecting file objects is to
+close them.
+
+#### Enforcing a Range and Enumerations
+
+The type specifier can also be of the form '(' MIN '..' MAX ')' or a set of strings
+separated by '|'.
+
+ local lapp = require 'pl.lapp'
+ local args = lapp [[
+ Setting ranges
+ <x> (1..10) A number from 1 to 10
+ <y> (-5..1e6) Bigger range
+ <z> (slow|medium|fast)
+ ]]
+
+ print(args.x,args.y)
+
+Here the meaning of ranges is that the value must be greater than or equal to MIN
+and less than or equal to MAX.
+An 'enum' is a _string_ that can only have values from a specified set.
+
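+Running the script above (saved as, say, 'range.lua') might look like this; an
+out-of-range or unlisted value produces a Lapp error followed by the usage text,
+just as with a missing parameter:
+
+    $ lua range.lua 5 100 medium
+    5       100
+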
+#### Custom Types
+
+There is no built-in way to force a parameter to be a whole number, but
+you may define a custom type that does this:
+
+ lapp = require ('pl.lapp')
+
+ lapp.add_type('integer','number',
+ function(x)
+ lapp.assert(math.ceil(x) == x, 'not an integer!')
+ end
+ )
+
+ local args = lapp [[
+ <ival> (integer) Process PID
+ ]]
+
+ print(args.ival)
+
+`lapp.add_type` takes three parameters: a type name, a converter and a constraint
+function. The constraint function is expected to throw an assertion if some
+condition is not true; we use `lapp.assert` because it fails in the standard way
+for a command-line script. The converter argument can either be a type name known
+to Lapp, or a function which takes a string and generates a value.
+
+Here's a useful custom type that allows dates to be input as @{pl.Date} values:
+
+    -- needs pl.Date for date parsing (and pl.lapp, if not already required)
+    local lapp = require 'pl.lapp'
+    local Date = require 'pl.Date'
+
+    local df = Date.Format()
+
+    lapp.add_type('date',
+        function(s)
+            local d,e = df:parse(s)
+            lapp.assert(d,e)
+            return d
+        end
+    )
+
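+A hypothetical script could then declare a date-valued parameter directly (the
+`<start>` name here is made up for illustration):
+
+    local args = lapp [[
+        <start> (date) Starting date
+    ]]
+    print(args.start)   -- args.start is now a pl.Date object
+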
+#### 'varargs' Parameter Arrays
+
+ lapp = require 'pl.lapp'
+ local args = lapp [[
+ Summing numbers
+ <numbers...> (number) A list of numbers to be summed
+ ]]
+
+ local sum = 0
+ for i,x in ipairs(args.numbers) do
+ sum = sum + x
+ end
+ print ('sum is '..sum)
+
+The parameter `numbers` has a trailing '...', which indicates that it is a
+'varargs' parameter. It must be the last parameter, and `args.numbers` will be an
+array.
+
+Consider this implementation of the head utility from Mac OS X:
+
+ -- implements a BSD-style head
+ -- (see http://www.manpagez.com/man/1/head/osx-10.3.php)
+
+ lapp = require ('pl.lapp')
+
+ local args = lapp [[
+ Print the first few lines of specified files
+ -n (default 10) Number of lines to print
+ <files...> (default stdin) Files to print
+ ]]
+
+ -- by default, lapp converts file arguments to an actual Lua file object.
+ -- But the actual filename is always available as <file>_name.
+ -- In this case, 'files' is a varargs array, so that 'files_name' is
+ -- also an array.
+ local nline = args.n
+ local nfile = #args.files
+ for i = 1,nfile do
+ local file = args.files[i]
+ if nfile > 1 then
+ print('==> '..args.files_name[i]..' <==')
+ end
+ local n = 0
+ for line in file:lines() do
+ print(line)
+ n = n + 1
+ if n == nline then break end
+ end
+ end
+
+Note how we have access to all the filenames, because the auto-generated field
+`files_name` is also an array!
+
+(This is probably not a very considerate script, since Lapp will open all the
+files provided, and only close them at the end of the script. See the `xhead.lua`
+example for another implementation.)
+
+Flags and options may also be declared as vararg arrays, and can occur anywhere.
+If there is both a short and a long form, then the trailing "..." must come after
+the long form, for example "-x,--network... (string)".
+
+Bear in mind that short options can be combined (like 'tar -xzf'), so it's
+perfectly legal to have '-vvv'. But normally the value of args.v is just a simple
+`true` value.
+
+ local args = require ('pl.lapp') [[
+ -v... Verbosity level; can be -v, -vv or -vvv
+ ]]
+ vlevel = not args.v[1] and 0 or #args.v
+ print(vlevel)
+
+The vlevel assignment is a bit of Lua voodoo, so consider the cases:
+
+ * No -v flag, v is just { false }
+ * One -v flag, v is { true }
+ * Two -v flags, v is { true, true }
+ * Three -v flags, v is { true, true, true }
+
+#### Defining a Parameter Callback
+
+If a script implements `lapp.callback`, then Lapp will call it after each
+argument is parsed. The callback is passed the parameter name, the raw unparsed
+value, and the result table. It is called immediately after assignment of the
+value, so the corresponding field is available.
+
+ lapp = require ('pl.lapp')
+
+ function lapp.callback(parm,arg,args)
+ print('+',parm,arg)
+ end
+
+ local args = lapp [[
+ Testing parameter handling
+ -p Plain flag (defaults to false)
+ -q,--quiet Plain flag with GNU-style optional long name
+ -o (string) Required string option
+ -n (number) Required number option
+ -s (default 1.0) Option that takes a number, but will default
+ <start> (number) Required number argument
+ <input> (default stdin) A parameter which is an input file
+ <output> (default stdout) One that is an output file
+ ]]
+ print 'args'
+ for k,v in pairs(args) do
+ print(k,v)
+ end
+
+This produces the following output:
+
+ $ args -o name -n 2 10 args.lua
+ + o name
+ + n 2
+ + start 10
+ + input args.lua
+ args
+ p false
+ s 1
+ input_name args.lua
+ quiet false
+ output file (781C1B98)
+ start 10
+ input file (781C1BD8)
+ o name
+ n 2
+
+Callbacks are needed when you want to take action immediately on parsing an
+argument.
+
+#### Slack Mode
+
+If you'd like to use a multi-letter 'short' parameter you need to set
+the `lapp.slack` variable to `true`.
+
+In the following example we also see how default `false` and default `true` flags
+can be used, and how to override the default `-h` help flag (`--help` still works
+fine) - this applies to non-slack mode as well.
+
+ -- Parsing the command line ----------------------------------------------------
+ -- test.lua
+ local lapp = require 'pl.lapp'
+ local pretty = require 'pl.pretty'
+ lapp.slack = true
+ local args = lapp [[
+ Does some calculations
+ -v, --video (string) Specify input video
+ -w, --width (default 256) Width of the video
+ -h, --height (default 144) Height of the video
+ -t, --time (default 10) Seconds of video to process
+ -sk,--seek (default 0) Seek number of seconds
+ -f1,--flag1 A false flag
+ -f2,--flag2 A false flag
+ -f3,--flag3 (default true) A true flag
+ -f4,--flag4 (default true) A true flag
+ ]]
+
+ pretty.dump(args)
+
+And here we can see the output of `test.lua`:
+
+ $> lua test.lua -v abc --time 40 -h 20 -sk 15 --flag1 -f3
+ ---->
+ {
+ width = 256,
+ flag1 = true,
+ flag3 = false,
+ seek = 15,
+ flag2 = false,
+      video = "abc",
+ time = 40,
+ height = 20,
+ flag4 = true
+ }
+
+### Simple Test Framework
+
+`pl.test` was originally developed for the sole purpose of testing Penlight itself,
+but you may find it useful for your own applications. ([There are many other options](http://lua-users.org/wiki/UnitTesting).)
+
+Most of the goodness is in `test.asserteq`. It uses `tablex.deepcompare` on its two
+arguments, and if they do not match it quits the test application by default with a
+non-zero exit code and an informative message printed to stderr:
+
+ local test = require 'pl.test'
+
+ test.asserteq({10,20,30},{10,20,30.1})
+
+ --~ test-test.lua:3: assertion failed
+ --~ got: {
+ --~ [1] = 10,
+ --~ [2] = 20,
+ --~ [3] = 30
+ --~ }
+ --~ needed: {
+ --~ [1] = 10,
+ --~ [2] = 20,
+ --~ [3] = 30.1
+ --~ }
+ --~ these values were not equal
+
+This covers most cases, but it's also useful to compare strings using `string.match`:
+
+ -- must start with bonzo the dog
+ test.assertmatch ('bonzo the dog is here','^bonzo the dog')
+ -- must end with an integer
+ test.assertmatch ('hello 42','%d+$')
+
+Since Lua errors are usually strings, this matching strategy is used to test 'exceptions':
+
+ test.assertraise(function()
+ local t = nil
+ print(t.bonzo)
+ end,'nil value')
+
+(Some care is needed to match the essential part of the thrown error if you care
+for portability, since in Lua 5.2
+the exact error is "attempt to index local 't' (a nil value)" and in Lua 5.3 the error
+is "attempt to index a nil value (local 't')")
+
+There is an extra optional argument to these test functions, which is helpful when writing
+test helper functions. There you want to highlight the failed line, not the actual call
+to `asserteq` or `assertmatch` - line 33 here is the call to `is_iden`:
+
+ function is_iden(str)
+ test.assertmatch(str,'^[%a_][%w_]*$',1)
+ end
+
+ is_iden 'alpha_dog'
+ is_iden '$dollars'
+
+ --~ test-test.lua:33: assertion failed
+ --~ got: "$dollars"
+ --~ needed: "^[%a_][%w_]*$"
+ --~ these strings did not match
+
+Useful Lua functions often return multiple values, and `test.tuple` is a convenient way to
+capture these values, whether they contain nils or not.
+
+ T = test.tuple
+
+ --- common error pattern
+ function failing()
+ return nil,'failed'
+ end
+
+ test.asserteq(T(failing()),T(nil,'failed'))
+
diff --git a/Data/Libraries/Penlight/docs_topics/09-discussion.md b/Data/Libraries/Penlight/docs_topics/09-discussion.md
new file mode 100644
index 0000000..7942d95
--- /dev/null
+++ b/Data/Libraries/Penlight/docs_topics/09-discussion.md
@@ -0,0 +1,91 @@
+## Technical Choices
+
+### Modularity and Granularity
+
+In an ideal world, a program should only load the libraries it needs. Penlight is
+intended to work in situations where an extra 100Kb of bytecode could be a
+problem. It is straightforward but tedious to load exactly what you need:
+
+ local data = require 'pl.data'
+ local List = require 'pl.List'
+ local array2d = require 'pl.array2d'
+ local seq = require 'pl.seq'
+ local utils = require 'pl.utils'
+
+This is the style that I follow in Penlight itself, so that modules don't mess
+with the global environment; also, `stringx.import()` is not used because it will
+update the global `string` table.
+
+But `require 'pl'` is more convenient in scripts; the question is how to ensure
+that one doesn't load the whole kitchen sink as the price of convenience. The
+strategy is to only load modules when they are referenced. In 'init.lua' (which
+is loaded by `require 'pl'`) a metatable is attached to the global table with an
+`__index` metamethod. Any unknown name is looked up in the list of modules, and
+if found, we require it and make that module globally available. So when
+`tablex.deepcompare` is encountered, looking up `tablex` causes 'pl.tablex' to be
+required.
+
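+A minimal sketch of the mechanism (not the actual 'init.lua', which also installs
+a 'not found' hook and handles more cases):
+
+    -- lazily require Penlight modules on first global reference
+    local modules = { tablex = true, seq = true, stringx = true }  -- and so on
+    setmetatable(_G, {
+        __index = function(t, name)
+            if modules[name] then
+                local m = require('pl.' .. name)
+                rawset(t, name, m)   -- make the module globally available
+                return m
+            end
+        end
+    })
+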
+Modifying the behaviour of the global table has consequences. For instance, there
+is the famous module `strict` which comes with Lua itself (perhaps the only
+standard Lua module written in Lua itself) which also does this modification so
+that global variables must be defined before use. So the implementation in
+'init.lua' allows for a 'not found' hook, which 'pl.strict.lua' uses. Other
+libraries may install their own metatables for `_G`, but Penlight will now
+forward any unknown name to the `__index` defined by the original metatable.
+
+But the strategy is worth the effort: the old 'kitchen sink' 'init.lua' would
+pull in about 260K of bytecode, whereas now typical programs use about 100K less,
+and short scripts do even better - for instance, if they only need the
+functionality in `utils`.
+
+There are some functions which mark their output table with a special metatable,
+when it seems particularly appropriate. For instance, `tablex.makeset` creates a
+`Set`, and `seq.copy` creates a `List`. But this does not automatically result in
+the loading of `pl.Set` and `pl.List`; that only happens when you try to access
+any of their methods. In 'utils.lua', there is an exported table called `stdmt`:
+
+ stdmt = { List = {}, Map = {}, Set = {}, MultiMap = {} }
+
+If you go through 'init.lua', then these plain little 'identity' tables get an
+`__index` metamethod which forces the loading of the full functionality. Here is
+the code from 'list.lua' which starts the ball rolling for lists:
+
+ List = utils.stdmt.List
+ List.__index = List
+ List._name = "List"
+ List._class = List
+
+The 'load-on-demand' strategy helps to modularize the library. Especially for
+more casual use, `require 'pl'` is a good compromise between convenience and
+modularity.
+
+In this current version, I have generally reduced the amount of trickery
+involved. Previously, `Map` was defined in `pl.class`; now it is sensibly defined
+in `pl.Map`; `pl.class` only contains the basic class mechanism (and returns that
+function). For consistency, `List` is returned directly by `require 'pl.List'`
+(note the uppercase 'L'). Also, the number of module dependencies in the
+non-core libraries like `pl.config` has been reduced.
+
+### Defining what is Callable
+
+'utils.lua' exports `function_arg` which is used extensively throughout Penlight.
+It defines what is meant by 'callable'. Obviously true functions are immediately
+passed back. But what about strings? The first option is that it represents an
+operator in 'operator.lua', so that '<' is just an alias for `operator.lt`.
+
+We then check whether there is a _function factory_ defined for the metatable of
+the value.
+
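+In rough outline (a sketch of the resolution order just described, not the actual
+`utils.function_arg`, and `callable_of` is a made-up name):
+
+    local operator = require 'pl.operator'
+
+    local function callable_of(f)
+        if type(f) == 'function' then return f end
+        if type(f) == 'string' and operator.optable[f] then
+            return operator.optable[f]   -- e.g. '<' becomes operator.lt
+        end
+        -- otherwise, consult any function factory registered for
+        -- getmetatable(f), as pl.func does for placeholder expressions
+    end
+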
+(It is true that strings can be made callable, but in practice this turns out to
+be a cute but dubious idea, since _all_ strings share the same metatable. A
+common programming error is to pass the wrong kind of object to a function, and
+it's better to get a nice clean 'attempting to call a string' message rather than
+some obscure trace from the bowels of your library.)
+
+The other module that registers a function factory is `pl.func`. Placeholder
+expressions cannot be directly callable, and so need to be instantiated and
+cached in as efficient a way as possible.
+
+(An inconsistency is that `utils.is_callable` does not do this thorough check.)
+
+