author | chai <chaifix@163.com> | 2021-10-30 11:32:16 +0800 |
---|---|---|
committer | chai <chaifix@163.com> | 2021-10-30 11:32:16 +0800 |
commit | 42ec7286b2d36a9ba22925f816a17cb1cc2aa5ce (patch) | |
tree | 24bc7009457a8d7500f264e89946dc20d069294f /Data/Libraries/Penlight/docs/manual | |
parent | 164885fd98d48703bd771f802d79557b7db97431 (diff) |
+ Penlight
Diffstat (limited to 'Data/Libraries/Penlight/docs/manual')
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/01-introduction.md.html | 843 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/02-arrays.md.html | 914 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/03-strings.md.html | 397 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/04-paths.md.html | 329 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/05-dates.md.html | 269 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/06-data.md.html | 1633 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/07-functional.md.html | 834 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/08-additional.md.html | 815 | ||||
-rw-r--r-- | Data/Libraries/Penlight/docs/manual/09-discussion.md.html | 233 |
9 files changed, 6267 insertions, 0 deletions
diff --git a/Data/Libraries/Penlight/docs/manual/01-introduction.md.html b/Data/Libraries/Penlight/docs/manual/01-introduction.md.html new file mode 100644 index 0000000..fe42256 --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/01-introduction.md.html @@ -0,0 +1,843 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Purpose">Purpose </a></li> +<li><a href="#To_Inject_or_not_to_Inject_">To Inject or not to Inject? </a></li> +<li><a href="#What_are_function_arguments_in_Penlight_">What are function arguments in Penlight? </a></li> +<li><a href="#Pros_and_Cons_of_Loopless_Programming">Pros and Cons of Loopless Programming </a></li> +<li><a href="#Generally_useful_functions">Generally useful functions </a></li> +<li><a href="#Application_Support">Application Support </a></li> +<li><a href="#Simplifying_Object_Oriented_Programming_in_Lua">Simplifying Object-Oriented Programming in Lua </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><strong>Introduction</strong></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><a href="../manual/03-strings.md.html">Strings. 
Higher-level operations on strings.</a></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a 
href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a 
href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Introduction</h2> + +<p><a name="Purpose"></a></p> +<h3>Purpose</h3> + +<p>It is often said of Lua that it does not include batteries. That is because the +goal of Lua is to produce a lean expressive language that will be used on all +sorts of machines (some of which don't even have hierarchical filesystems). The +Lua language is the equivalent of an operating system kernel; the creators of Lua +do not see it as their responsibility to create a full software ecosystem around +the language. That is the role of the community.</p> + +<p>A principle of software design is to recognize common patterns and reuse them. If +you find yourself writing things like <code>io.write(string.format('the answer is %d +',42))</code> more than a few times then it becomes useful just to define a +function <code>printf</code>. This is good, not just because repeated code is harder to +maintain, but because such code is easier to read, once people understand your +libraries.</p> + +<p>Penlight captures many such code patterns, so that the intent of your code +becomes clearer. For instance, a Lua idiom to copy a table is <code>{unpack(t)}</code>, but +this will only work for 'small' tables (for a given value of 'small') so it is +not very robust. Also, the intent is not clear. So <a href="../libraries/pl.tablex.html#deepcopy">tablex.deepcopy</a> is provided, +which will also copy nested tables and associated metatables, so it can be +used to clone complex objects.</p> + +<p>The default error handling policy follows that of the Lua standard libraries: if +an argument is the wrong type, then an error will be thrown, but otherwise we +return <code>nil,message</code> if there is a problem. 
There are some exceptions; functions +like <a href="../libraries/pl.input.html#fields">input.fields</a> default to shutting down the program immediately with a +useful message. This is more appropriate behaviour for a <em>script</em> than providing +a stack trace. (However, this default can be changed.) The lexer functions always +throw errors, to simplify coding, and so should be wrapped in <a href="https://www.lua.org/manual/5.1/manual.html#pdf-pcall">pcall</a>.</p> + +<p>If you are used to Python conventions, please note that all indices consistently +start at 1.</p> + +<p>The Lua function <a href="https://www.lua.org/manual/5.1/manual.html#pdf-table.foreach">table.foreach</a> has been deprecated in favour of the <code>for in</code> +statement, but such an operation becomes particularly useful with the +higher-order function support in Penlight. Note that <a href="../libraries/pl.tablex.html#foreach">tablex.foreach</a> reverses +the order, so that the function is passed the value and then the key. Although +perverse, this matches the intended use better.</p> + +<p>The only important external dependence of Penlight is +<a href="http://keplerproject.github.com/luafilesystem/manual.html">LuaFileSystem</a> +(<a href="http://stevedonovan.github.io/lua-stdlibs/modules/lfs.html">lfs</a>), and if you want <a href="../libraries/pl.dir.html#copyfile">dir.copyfile</a> to work cleanly on Windows, you will need +either <a href="http://alien.luaforge.net/">alien</a> or be using +<a href="http://luajit.org">LuaJIT</a> as well. (The fallback is to call the equivalent +shell commands.)</p> + +<p><a name="To_Inject_or_not_to_Inject_"></a></p> +<h3>To Inject or not to Inject?</h3> + +<p>It was realized a long time ago that large programs needed a way to keep names +distinct by putting them into tables (Lua), namespaces (C++) or modules +(Python). It is obviously impossible to run a company where everyone is called +'Bruce', except in Monty Python skits. 
These 'namespace clashes' are more of a +problem in a simple language like Lua than in C++, because C++ does more +complicated lookup over 'injected namespaces'. However, in a small group of +friends, 'Bruce' is usually unique, so in particular situations it's useful to +drop the formality and not use last names. It depends entirely on what kind of +program you are writing, whether it is a ten line script or a ten thousand line +program.</p> + +<p>So the Penlight library provides the formal way and the informal way, without +imposing any preference. You can do it formally like:</p> + + +<pre> +<span class="keyword">local</span> utils = <span class="global">require</span> <span class="string">'pl.utils'</span> +utils.printf(<span class="string">"%s\n"</span>,<span class="string">"hello, world!"</span>) +</pre> + +<p>or informally like:</p> + + +<pre> +<span class="global">require</span> <span class="string">'pl'</span> +utils.printf(<span class="string">"%s\n"</span>,<span class="string">"That feels better"</span>) +</pre> + +<p><code>require 'pl'</code> makes all the separate Penlight modules available, without needing +to require them each individually.</p> + +<p>Generally, the formal way is better when writing modules, since then there are no +global side-effects and the dependencies of your module are made explicit.</p> + +<p>Andrew Starks has contributed another way, which balances nicely between the +formal need to keep the global table uncluttered and the informal need for +convenience. 
<code>require'pl.import_into'</code> returns a function, which accepts a table +for injecting Penlight into, or if no table is given, it passes back a new one.</p> + + +<pre> +<span class="keyword">local</span> pl = <span class="global">require</span><span class="string">'pl.import_into'</span>() +</pre> + +<p>The table <a href="../libraries/pl.html#">pl</a> is a 'lazy table' which loads modules as needed, so we can then +use <a href="../libraries/pl.utils.html#printf">pl.utils.printf</a> and so forth, without an explicit <code>require</code> or harming any +globals.</p> + +<p>If you are using <code>_ENV</code> with Lua 5.2 to define modules, then here is a way to +make Penlight available within a module:</p> + + +<pre> +<span class="keyword">local</span> _ENV,M = <span class="global">require</span> <span class="string">'pl.import_into'</span> () + +<span class="keyword">function</span> answer () + <span class="comment">-- all the Penlight modules are available! +</span> <span class="keyword">return</span> pretty.write(utils.split <span class="string">'10 20 30'</span>, <span class="string">''</span>) +<span class="keyword">end</span> + +<span class="keyword">return</span> M +</pre> + +<p>The default is to put Penlight into <code>_ENV</code>, which has the unintended effect of +making it available from the module (much as <code>module(...,package.seeall)</code> does). +To satisfy both convenience and safety, you may pass <code>true</code> to this function, and +then the <em>module</em> <code>M</code> is not the same as <code>_ENV</code>, but only contains the exported +functions.</p> + +<p>Otherwise, Penlight will <em>not</em> bring functions into the global table, or +clobber standard tables like 'io'. <code>require('pl')</code> will bring tables like +'utils','tablex',etc into the global table <em>if they are used</em>. 
This +'load-on-demand' strategy ensures that the whole kitchen sink is not loaded up +front, so this method is as efficient as explicitly loading required modules.</p> + +<p>You have an option to bring the <a href="../libraries/pl.stringx.html#">pl.stringx</a> methods into the standard string +table. All strings have a metatable that allows for automatic lookup in <a href="https://www.lua.org/manual/5.1/manual.html#5.4">string</a>, +so we can say <code>s:upper()</code>. Importing <a href="../libraries/pl.stringx.html#">stringx</a> allows for its functions to also +be called as methods: <code>s:strip()</code>,etc:</p> + + +<pre> +<span class="global">require</span> <span class="string">'pl'</span> +stringx.import() +</pre> + +<p>or, more explicitly:</p> + + +<pre> +<span class="global">require</span>(<span class="string">'pl.stringx'</span>).import() +</pre> + +<p>A more delicate operation is importing tables into the local environment. This is +convenient when the context makes the meaning of a name very clear:</p> + + +<pre> +> <span class="global">require</span> <span class="string">'pl'</span> +> utils.import(<span class="global">math</span>) +> = sin(<span class="number">1.2</span>) +<span class="number">0.93203908596723</span> +</pre> + +<p><a href="../libraries/pl.utils.html#import">utils.import</a> can also be passed a module name as a string, which is first +required and then imported. If used in a module, <code>import</code> will bring the symbols +into the module context.</p> + +<p>Keeping the global scope simple is very necessary with dynamic languages. Using +global variables in a big program is always asking for trouble, especially since +you do not have the spell-checking provided by a compiler. The <a href="../libraries/pl.strict.html#">pl.strict</a> +module enforces a simple rule: globals must be 'declared'. 
This means that they +must be assigned before use; assigning to <code>nil</code> is sufficient.</p> + + +<pre> +> <span class="global">require</span> <span class="string">'pl.strict'</span> +> <span class="global">print</span>(x) +stdin:<span class="number">1</span>: variable <span class="string">'x'</span> is <span class="keyword">not</span> declared +> x = <span class="keyword">nil</span> +> <span class="global">print</span>(x) +<span class="keyword">nil</span> +</pre> + +<p>The <a href="../libraries/pl.strict.html#">strict</a> module provided by Penlight is compatible with the 'load-on-demand' +scheme used by <code>require 'pl'</code>.</p> + +<p><a href="../libraries/pl.strict.html#">strict</a> also disallows assignment to global variables, except in the main +program. Generally, modules have no business messing with global scope; if you +must do it, then use a call to <a href="https://www.lua.org/manual/5.1/manual.html#pdf-rawset">rawset</a>. Similarly, if you have to check for the +existence of a global, use <a href="https://www.lua.org/manual/5.1/manual.html#pdf-rawget">rawget</a>.</p> + +<p>If you wish to enforce strictness globally, then just add <code>require 'pl.strict'</code> +at the end of <code>pl/init.lua</code>; otherwise call it from your main program.</p> + +<p>As of 1.1.0, this module provides a <a href="../libraries/pl.strict.html#module">strict.module</a> function which creates (or +modifies) modules so that accessing an unknown function or field causes an error.</p> + +<p>For example,</p> + + +<pre> +<span class="comment">-- mymod.lua +</span><span class="keyword">local</span> strict = <span class="global">require</span> <span class="string">'pl.strict'</span> +<span class="keyword">local</span> M = strict.<span class="global">module</span> (...) 
+ +<span class="keyword">function</span> M.answer () + <span class="keyword">return</span> <span class="number">42</span> +<span class="keyword">end</span> + +<span class="keyword">return</span> M +</pre> + +<p>If you were to accidentally type <code>mymod.Answer()</code>, then you would get a runtime +error: "variable 'Answer' is not declared in 'mymod'".</p> + +<p>This can be applied to existing modules. You may desire to have the same level +of checking for the Lua standard libraries:</p> + + +<pre> +strict.make_all_strict(_G) +</pre> + +<p>Thereafter a typo such as <code>math.cosine</code> will give you an explicit error, rather +than merely returning a <code>nil</code> that will cause problems later.</p> + +<p><a name="What_are_function_arguments_in_Penlight_"></a></p> +<h3>What are function arguments in Penlight?</h3> + +<p>Many functions in Penlight themselves take function arguments, like <code>map</code> which +applies a function to a list, element by element. You can use existing +functions, like <a href="https://www.lua.org/manual/5.1/manual.html#pdf-math.max">math.max</a>, anonymous functions (like <code>function(x,y) return x > y +end</code>), or operations by name (e.g. '*' or '..'). The module <code>pl.operator</code> exports +all the standard Lua operations, like the Python module of the same name. 
+Penlight allows these to be referred to by name, so <a href="../libraries/pl.operator.html#gt">operator.gt</a> can be more +concisely expressed as '>'.</p> + +<p>Note that the <code>map</code> functions pass any extra arguments to the function, so we can +have <code>ls:filter('>',0)</code>, which is a shortcut for +<code>ls:filter(function(x) return x > 0 end)</code>.</p> + +<p>Finally, <a href="../libraries/pl.func.html#">pl.func</a> supports <em>placeholder expressions</em> in the Boost lambda style, +so that an anonymous function to multiply the two arguments can be expressed as +<code>_1*_2</code>.</p> + +<p>To use them directly, note that <em>all</em> function arguments in Penlight go through +<a href="../libraries/pl.utils.html#function_arg">utils.function_arg</a>. <a href="../libraries/pl.func.html#">pl.func</a> registers itself with this function, so that you +can directly use placeholder expressions with standard methods:</p> + + +<pre> +> _1 = func._1 +> = List{<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>}:map(_1+<span class="number">1</span>) +{<span class="number">11</span>,<span class="number">21</span>,<span class="number">31</span>} +</pre> + +<p>Another option for short anonymous functions is provided by +<a href="../libraries/pl.utils.html#string_lambda">utils.string_lambda</a>; this is invoked automatically:</p> + + +<pre> +> = List{<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>}:map <span class="string">'|x| x + 1'</span> +{<span class="number">11</span>,<span class="number">21</span>,<span class="number">31</span>} +</pre> + +<p><a name="Pros_and_Cons_of_Loopless_Programming"></a></p> +<h3>Pros and Cons of Loopless Programming</h3> + +<p>The standard loops-and-ifs 'imperative' style of programming is dominant, and +often seems to be the 'natural' way of telling a machine what to do. 
It is in +fact very much how the machine does things, but we need to take a step back and +find ways of expressing solutions in a higher-level way. For instance, applying +a function to all elements of a list is a common operation:</p> + + +<pre> +<span class="keyword">local</span> res = {} +<span class="keyword">for</span> i = <span class="number">1</span>,#ls <span class="keyword">do</span> + res[i] = fun(ls[i]) +<span class="keyword">end</span> +</pre> + +<p>This can be efficiently and succinctly expressed as <code>ls:map(fun)</code>. Not only is +there less typing but the intention of the code is clearer. If readers of your +code spend too much time trying to guess your intention by analyzing your loops, +then you have failed to express yourself clearly. Similarly, <code>ls:filter('>',0)</code> +will give you all the values in a list greater than zero. (Of course, if you +don't feel like using <a href="../classes/pl.List.html#">List</a>, or have non-list-like tables, then <a href="../libraries/pl.tablex.html#">pl.tablex</a> +offers the same facilities. In fact, the <a href="../classes/pl.List.html#">List</a> methods are implemented using +<a href="../libraries/pl.tablex.html#">tablex</a> functions.)</p> + +<p>A common observation is that loopless programming is less efficient, particularly +in the way it uses memory. <code>ls1:map2('*',ls2):reduce '+'</code> will give you the dot +product of two lists, but an unnecessary temporary list is created. But +efficiency is relative to the actual situation; it may turn out to be <em>fast +enough</em>, or may not appear in any crucial inner loops, etc.</p> + +<p>Writing loops is 'error-prone and tedious', as Stroustrup says. But any +half-decent editor can be taught to do much of that typing for you. The question +should actually be: is it tedious to <em>read</em> loops? As with natural language, +programmers tend to read chunks at a time. A for-loop causes no surprise, and +probably little brain activity. 
One argument for loopless programming is that the +loops that you <em>do</em> write stand out more, and signal 'something different +happening here'. It should not be an all-or-nothing thing, since most programs +require a mixture of idioms that suit the problem. Some languages (like APL) do +nearly everything with map and reduce operations on arrays, and so solutions can +sometimes seem forced. Wisdom is knowing when a particular idiom makes a +particular problem easy to <em>solve</em> and the solution easy to <em>explain</em> afterwards.</p> + +<p><a name="Generally_useful_functions_"></a></p> +<h3>Generally useful functions.</h3> + +<p>The function <code>printf</code> discussed earlier is included in <a href="../libraries/pl.utils.html#">pl.utils</a> because it +makes properly formatted output easier. (There is an equivalent <code>fprintf</code> which +also takes a file object parameter, just like the C function.)</p> + +<p>Splitting a string using a delimiter is a fairly common operation, hence <code>split</code>.</p> + +<p>Utility functions like <code>is_type</code> help with identifying what +kind of animal you are dealing with. +The Lua <a href="https://www.lua.org/manual/5.1/manual.html#pdf-type">type</a> function handles the basic types, but can't distinguish between +different kinds of objects, which are all tables. So <code>is_type</code> handles both +cases, like <code>is_type(s,"string")</code> and <code>is_type(ls,List)</code>.</p> + +<p>A common pattern when working with Lua varargs is capturing all the arguments in +a table:</p> + + +<pre> +<span class="keyword">function</span> t(...) + <span class="keyword">local</span> args = {...} + ... +<span class="keyword">end</span> +</pre> + +<p>But this will bite you someday when <code>nil</code> is one of the arguments, since this +will put a 'hole' in your table. In particular, <code>#args</code> will only give you the size +up to the <code>nil</code> value. 
Hence the need for <a href="https://www.lua.org/manual/5.1/manual.html#pdf-table.pack">table.pack</a> - this is a new Lua 5.2 +function which Penlight defines also for Lua 5.1. It returns a single table whose <code>n</code> field holds the true argument count.</p> + + +<pre> +<span class="keyword">function</span> t(...) + <span class="keyword">local</span> args = <span class="global">table</span>.pack(...) + <span class="keyword">for</span> i = <span class="number">1</span>,args.n <span class="keyword">do</span> + ... + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p>The 'memoize' pattern occurs when you have a function which is expensive to call, +but will always return the same value subsequently. <a href="../libraries/pl.utils.html#memoize">utils.memoize</a> is given a +function, and returns another function. This calls the function the first time, +saves the value for that argument, and thereafter for that argument returns the +saved value. This is a more flexible alternative to building a table of values +upfront, since in general you won't know what values are needed.</p> + + +<pre> +sum = utils.memoize(<span class="keyword">function</span>(n) + <span class="keyword">local</span> sum = <span class="number">0</span> + <span class="keyword">for</span> i = <span class="number">1</span>,n <span class="keyword">do</span> sum = sum + i <span class="keyword">end</span> + <span class="keyword">return</span> sum +<span class="keyword">end</span>) +... +s = sum(<span class="number">1e8</span>) <span class="comment">--takes time! +</span>... +s = sum(<span class="number">1e8</span>) <span class="comment">--returned saved value!</span> +</pre> + +<p>Penlight is fully compatible with Lua 5.1, 5.2 and LuaJIT 2. 
To ensure this, +<a href="../libraries/pl.utils.html#">utils</a> also defines the global Lua 5.2 +<a href="http://www.lua.org/work/doc/manual.html#pdf-load">load</a> function as <code>utils.load</code>, which takes the following arguments:</p> + +<ul> + <li>the input (either a string or a function)</li> + <li>the source name used in debug information</li> + <li>the mode is a string that can have either or both of 'b' or 't', depending on + whether the source is a binary chunk or text code (default is 'bt')</li> + <li>the environment for the compiled chunk</li> +</ul> + +<p>Using <code>utils.load</code> should reduce the need to call the deprecated function <a href="https://www.lua.org/manual/5.1/manual.html#pdf-setfenv">setfenv</a>, +and make your Lua 5.1 code 5.2-friendly.</p> + +<p>The <a href="../libraries/pl.utils.html#">utils</a> module exports <a href="https://www.lua.org/manual/5.1/manual.html#pdf-getfenv">getfenv</a> and <a href="https://www.lua.org/manual/5.1/manual.html#pdf-setfenv">setfenv</a> for +Lua 5.2 as well, based on code by Sergey Rozhenko. Note that these functions can fail +for functions which don't access any globals.</p> + +<p><a name="Application_Support"></a></p> +<h3>Application Support</h3> + +<p><a href="../libraries/pl.app.html#parse_args">app.parse_args</a> is a simple command-line argument parser. If called without any +arguments, it tries to use the global <code>arg</code> array. It returns the <em>flags</em> +(options beginning with '-') as a table of name/value pairs, and the <em>arguments</em> +as an array. It knows about long GNU-style flag names, e.g. <code>--value</code>, and +groups of short flags are understood, so that <code>-ab</code> is short for <code>-a -b</code>. The +flags result would then look like <code>{value=true,a=true,b=true}</code>.</p> + +<p>Flags may take values. 
The command-line <code>--value=open -n10</code> would result in +<code>{value='open',n='10'}</code>; generally you can use '=' or ':' to separate the flag +from its value, except in the special case where a short flag is followed by an +integer. Or you may specify upfront that some flags have associated values, and +then the values will follow the flag.</p> + + +<pre> +> <span class="global">require</span> <span class="string">'pl'</span> +> flags,args = app.parse_args({<span class="string">'-o'</span>,<span class="string">'fred'</span>,<span class="string">'-n10'</span>,<span class="string">'fred.txt'</span>},{o=<span class="keyword">true</span>}) +> pretty.dump(flags) +{o=<span class="string">'fred'</span>,n=<span class="string">'10'</span>} +</pre> + +<p><code>parse_args</code> is not intelligent or psychic; it will not convert any flag values +or arguments for you, or raise errors. For that, have a look at +<a href="../manual/08-additional.md.html#Command_line_Programs_with_Lapp">Lapp</a>.</p> + +<p>An application which consists of several files usually cannot use <a href="https://www.lua.org/manual/5.1/manual.html#pdf-require">require</a> to +load files in the same directory as the main script. <code>app.require_here()</code> +ensures that the Lua module path is modified so that files found locally are +found first. In the <code>examples</code> directory, <a href="../examples/test-symbols.lua.html#">test-symbols.lua</a> uses this function +to ensure that it can find <a href="../examples/symbols.lua.html#">symbols.lua</a> even if it is not run from this directory.</p> + +<p><a href="../libraries/pl.app.html#appfile">app.appfile</a> will create a filename that your application can use to store its +private data, based on the script name. 
For example, <code>app.appfile "test.txt"</code> +from a script called <code>testapp.lua</code> produces the following file on my Windows +machine:</p> + +<pre><code>C:\Documents and Settings\SJDonova\.testapp\test.txt +</code></pre> + + +<p>and the equivalent on my Linux machine:</p> + +<pre><code>/home/sdonovan/.testapp/test.txt +</code></pre> + + +<p>If <code>.testapp</code> does not exist, it will be created.</p> + +<p>Penlight makes it convenient to save application data in Lua format. You can use +<code>pretty.dump(t,file)</code> to write a Lua table in a human-readable form to a file, +and <code>pretty.read(file.read(file))</code> to generate the table again, using the +<a href="../libraries/pl.pretty.html#">pretty</a> module.</p> + + +<p><a name="Simplifying_Object_Oriented_Programming_in_Lua"></a></p> +<h3>Simplifying Object-Oriented Programming in Lua</h3> + +<p>Lua is similar to JavaScript in that the concept of class is not directly +supported by the language. In fact, Lua has a very general mechanism for +extending the behaviour of tables which makes it straightforward to implement +classes. A table's behaviour is controlled by its metatable. If that metatable +has a <code>__index</code> function or table, this will handle looking up anything which is +not found in the original table. A class is just a table with an <code>__index</code> key +pointing to itself. Creating an object involves making a table and setting its +metatable to the class; then when handling <code>obj.fun</code>, Lua first looks up <code>fun</code> in +the table <code>obj</code>, and if not found it looks it up in the class. <code>obj:fun(a)</code> is +just short for <code>obj.fun(obj,a)</code>. 
So with the metatable mechanism and this bit of +syntactic sugar, it is straightforward to implement classic object orientation.</p> + + +<pre> +<span class="comment">-- animal.lua +</span> +class = <span class="global">require</span> <span class="string">'pl.class'</span> + +class.Animal() + +<span class="keyword">function</span> Animal:_init(name) + self.name = name +<span class="keyword">end</span> + +<span class="keyword">function</span> Animal:__tostring() + <span class="keyword">return</span> self.name..<span class="string">': '</span>..self:speak() +<span class="keyword">end</span> + +class.Dog(Animal) + +<span class="keyword">function</span> Dog:speak() + <span class="keyword">return</span> <span class="string">'bark'</span> +<span class="keyword">end</span> + +class.Cat(Animal) + +<span class="keyword">function</span> Cat:_init(name,breed) + self:super(name) <span class="comment">-- must init base! +</span> self.breed = breed +<span class="keyword">end</span> + +<span class="keyword">function</span> Cat:speak() + <span class="keyword">return</span> <span class="string">'meow'</span> +<span class="keyword">end</span> + +class.Lion(Cat) + +<span class="keyword">function</span> Lion:speak() + <span class="keyword">return</span> <span class="string">'roar'</span> +<span class="keyword">end</span> + +fido = Dog(<span class="string">'Fido'</span>) +felix = Cat(<span class="string">'Felix'</span>,<span class="string">'Tabby'</span>) +leo = Lion(<span class="string">'Leo'</span>,<span class="string">'African'</span>) + +$ lua -i animal.lua +> = fido,felix,leo +Fido: bark Felix: meow Leo: roar +> = leo:is_a(Animal) +<span class="keyword">true</span> +> = leo:is_a(Dog) +<span class="keyword">false</span> +> = leo:is_a(Cat) +<span class="keyword">true</span> +</pre> + +<p>All Animal does is define <code>__tostring</code>, which Lua will use whenever a string +representation is needed of the object. 
In turn, this relies on <code>speak</code>, which is +not defined. So it's what C++ people would call an abstract base class; the +specific derived classes like Dog define <code>speak</code>. Please note that <em>if</em> derived +classes have their own constructors, they must explicitly call the base +constructor for their base class; this is conveniently available as the <code>super</code> +method.</p> + +<p>Note that (as always) there are multiple ways to implement OOP in Lua; this method +uses the classic 'a class is the __index of its objects' but does 'fat inheritance'; +methods from the base class are copied into the new class. The advantage of this is +that you are not penalized for long inheritance chains, for the price of larger classes, +but generally objects outnumber classes! (If not, something odd is going on with your design.)</p> + +<p>All such objects will have an <code>is_a</code> method, which looks up the inheritance chain +to find a match. Another form is <code>class_of</code>, which can be safely called on all +objects, so instead of <code>leo:is_a(Animal)</code> one can say <code>Animal:class_of(leo)</code>.</p> + +<p>There are two ways to define a class, either <code>class.Name()</code> or <code>Name = class()</code>; +both work identically, except that the first form will always put the class in +the current environment (whether global or module); the second form provides more +flexibility about where to store the class. The first form does <em>name</em> the class +by setting the <code>_name</code> field, which can be useful in identifying the objects of +this type later. 
This session illustrates the usefulness of having named classes, +if no <code>__tostring</code> method is explicitly defined.</p> + + +<pre> +> class.Fred() +> a = Fred() +> = a +Fred: <span class="number">00459330</span> +> Alice = class() +> b = Alice() +> = b +<span class="global">table</span>: <span class="number">00459</span>AE8 +> Alice._name = <span class="string">'Alice'</span> +> = b +Alice: <span class="number">00459</span>AE8 +</pre> + +<p>So <code>Alice = class(); Alice._name = 'Alice'</code> is exactly the same as <code>class.Alice()</code>.</p> + +<p>This useful notation is borrowed from Hugo Etchegoyen's +<a href="http://lua-users.org/wiki/MultipleInheritanceClasses">classlib</a> which further +extends this concept to allow for multiple inheritance. Notice that the +more convenient form puts the class name in the <em>current environment</em>! That is, +you may use it safely within modules using the old-fashioned <code>module()</code> +or the new <code>_ENV</code> mechanism.</p> + +<p>There is always more than one way of doing things in Lua; some may prefer this +style for creating classes:</p> + + +<pre> +<span class="keyword">local</span> class = <span class="global">require</span> <span class="string">'pl.class'</span> + +class.Named { + _init = <span class="keyword">function</span>(self,name) + self.name = name + <span class="keyword">end</span>; + + __tostring = <span class="keyword">function</span>(self) + <span class="keyword">return</span> <span class="string">'boo '</span>..self.name + <span class="keyword">end</span>; +} + +b = Named <span class="string">'dog'</span> +<span class="global">print</span>(b) +<span class="comment">--> boo dog</span> +</pre> + +<p>Note that you have to explicitly declare <code>self</code> and end each function definition +with a semi-colon or comma, since this is a Lua table. 
To inherit from a base class, +set the special field <code>_base</code> to the class in this table.</p> + +<p>Penlight provides a number of useful classes; there is <a href="../classes/pl.List.html#">List</a>, which is a Lua +clone of the standard Python list object, and <a href="../classes/pl.Set.html#">Set</a> which represents sets. There +are three kinds of <em>map</em> defined: <a href="../classes/pl.Map.html#">Map</a>, <a href="../classes/pl.MultiMap.html#">MultiMap</a> (where a key may have +multiple values) and <a href="../classes/pl.OrderedMap.html#">OrderedMap</a> (where the order of insertion is remembered). +There is nothing special about these classes and you may inherit from them.</p> + +<p>A powerful thing about dynamic languages is that you can redefine existing classes +and functions, which is often called 'monkey patching'. It's entertaining and convenient, +but ultimately anti-social; you may modify <a href="../classes/pl.List.html#">List</a> but then any other modules using +this <em>shared</em> resource can no longer be sure about its behaviour. (This is why you +must say <code>stringx.import()</code> explicitly if you want the extended string methods - it +would be a bad default.) Lua is particularly open to modification but the +community is not as tolerant of monkey-patching as the Ruby community, say. You may +wish to add some new methods to <a href="../classes/pl.List.html#">List</a>? Cool, but that's what subclassing is for.</p> + + +<pre> +class.Strings(List) + +<span class="keyword">function</span> Strings:my_method() +... +<span class="keyword">end</span> +</pre> + +<p>It's definitely more useful to define exactly how your objects behave +in <em>unknown</em> conditions. 
All classes have a <code>catch</code> method you can use to set +a handler for unknown lookups; the function you pass looks exactly like the +<code>__index</code> metamethod.</p> + + +<pre> +Strings:catch(<span class="keyword">function</span>(self,name) + <span class="keyword">return</span> <span class="keyword">function</span>() <span class="global">error</span>(<span class="string">"no such method "</span>..name,<span class="number">2</span>) <span class="keyword">end</span> +<span class="keyword">end</span>) +</pre> + +<p>In this case we're just customizing the error message, but +creative things can be done. Consider this code from <code>test-vector.lua</code>:</p> + + +<pre> +Strings:catch(List.default_map_with(<span class="global">string</span>)) + +ls = Strings{<span class="string">'one'</span>,<span class="string">'two'</span>,<span class="string">'three'</span>} +asserteq(ls:upper(),{<span class="string">'ONE'</span>,<span class="string">'TWO'</span>,<span class="string">'THREE'</span>}) +asserteq(ls:sub(<span class="number">1</span>,<span class="number">2</span>),{<span class="string">'on'</span>,<span class="string">'tw'</span>,<span class="string">'th'</span>}) +</pre> + +<p>So we've converted an unknown method invocation into a map using the function of +that name found in <a href="https://www.lua.org/manual/5.1/manual.html#5.4">string</a>. So for a <code>Vector</code> (which is a specialization of <a href="../classes/pl.List.html#">List</a> +for numbers) it makes sense to make <a href="https://www.lua.org/manual/5.1/manual.html#5.6">math</a> the default map so that <code>v:sin()</code> makes +sense.</p> + +<p>Note that <code>map</code> operations return an object of the same type - this is often called +<em>covariance</em>. So <code>ls:upper()</code> itself returns a <code>Strings</code> object.</p> + +<p>This is not <em>always</em> what you want, but objects can always be cast to the desired type. 
+(<code>cast</code> doesn't create a new object, but returns the object passed.)</p> + + +<pre> +<span class="keyword">local</span> sizes = ls:map <span class="string">'#'</span> +asserteq(sizes, {<span class="number">3</span>,<span class="number">3</span>,<span class="number">5</span>}) +asserteq(utils.<span class="global">type</span>(sizes),<span class="string">'Strings'</span>) +asserteq(sizes:is_a(Strings),<span class="keyword">true</span>) +sizes = Vector:cast(sizes) +asserteq(utils.<span class="global">type</span>(sizes),<span class="string">'Vector'</span>) +asserteq(sizes+<span class="number">1</span>,{<span class="number">4</span>,<span class="number">4</span>,<span class="number">6</span>}) +</pre> + +<p>About <code>utils.type</code>: it can only return a string for a class type if that class does +in fact have a <code>_name</code> field.</p> + + +<p><em>Properties</em> are a useful object-oriented pattern. We wish to control access to a +field, but don't wish to force the user of the class to say <code>obj:get_field()</code> +etc. 
This excerpt from <code>tests/test-class.lua</code> shows how it is done:</p> + + + +<pre> +<span class="keyword">local</span> MyProps = class(class.properties) +<span class="keyword">local</span> setted_a, got_b + +<span class="keyword">function</span> MyProps:_init () + self._a = <span class="number">1</span> + self._b = <span class="number">2</span> +<span class="keyword">end</span> + +<span class="keyword">function</span> MyProps:set_a (v) + setted_a = <span class="keyword">true</span> + self._a = v +<span class="keyword">end</span> + +<span class="keyword">function</span> MyProps:get_b () + got_b = <span class="keyword">true</span> + <span class="keyword">return</span> self._b +<span class="keyword">end</span> + +<span class="keyword">local</span> mp = MyProps() + +mp.a = <span class="number">10</span> + +asserteq(mp.a,<span class="number">10</span>) +asserteq(mp.b,<span class="number">2</span>) +asserteq(setted_a <span class="keyword">and</span> got_b, <span class="keyword">true</span>) +</pre> + +<p>The convention is that the internal field name is prefixed with an underscore; +when reading <code>mp.a</code>, the class first checks for an explicit <em>getter</em> <code>get_a</code> and only then +looks for <code>_a</code>. Similarly, writing <code>mp.a</code> causes the <em>setter</em> <code>set_a</code> to be used.</p> + +<p>This is cool behaviour, but like much Lua metaprogramming, it is not free. Method +lookup on such objects goes through <code>__index</code> as before, but now <code>__index</code> is a +function which has to explicitly look up methods in the class, before doing any +property indexing, which is not going to be as fast as field lookup. If, however, +your accessors actually do non-trivial things, then the extra overhead could be +worth it.</p> + +<p>This is not really intended for <em>access control</em> because external code can write +to <code>mp._a</code> directly. 
It is possible to have this kind of control in Lua, but it +again comes with run-time costs.</p> + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/02-arrays.md.html b/Data/Libraries/Penlight/docs/manual/02-arrays.md.html new file mode 100644 index 0000000..28dc6a2 --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/02-arrays.md.html @@ -0,0 +1,914 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Python_style_Lists">Python-style Lists </a></li> +<li><a href="#Map_and_Set_classes">Map and Set classes </a></li> +<li><a href="#Useful_Operations_on_Tables">Useful Operations on Tables </a></li> +<li><a href="#Operations_on_two_dimensional_tables">Operations on two-dimensional tables </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><strong>Tables and Arrays</strong></li> + <li><a href="../manual/03-strings.md.html">Strings. 
Higher-level operations on strings.</a></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a 
href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a 
href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Tables and Arrays</h2> + +<p><a id="list"/></p> + +<p><a name="Python_style_Lists"></a></p> +<h3>Python-style Lists</h3> + +<p>One of the elegant things about Lua is that tables do the job of both lists and +dicts (as called in Python) or vectors and maps, (as called in C++), and they do +it efficiently. However, if we are dealing with 'tables with numerical indices' +we may as well call them lists and look for operations which particularly make +sense for lists. The Penlight <a href="../classes/pl.List.html#">List</a> class was originally written by Nick Trout +for Lua 5.0, and translated to 5.1 and extended by myself. It seemed that +borrowing from Python was a good idea, and this eventually grew into Penlight.</p> + +<p>Here is an example showing <a href="../classes/pl.List.html#">List</a> in action; it redefines <code>__tostring</code>, so that +it can print itself out more sensibly:</p> + + +<pre> +> List = <span class="global">require</span> <span class="string">'pl.List'</span> <span class="comment">--> automatic with require 'pl' <--- +</span>> l = List() +> l:append(<span class="number">10</span>) +> l:append(<span class="number">20</span>) +> = l +{<span class="number">10</span>,<span class="number">20</span>} +> l:extend {<span class="number">30</span>,<span class="number">40</span>} +{<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">40</span>} +> l:insert(<span class="number">1</span>,<span class="number">5</span>) +{<span class="number">5</span>,<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">40</span>} +> = l:pop() +<span class="number">40</span> +> = l +{<span 
class="number">5</span>,<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>} +> = l:index(<span class="number">30</span>) +<span class="number">4</span> +> = l:contains(<span class="number">30</span>) +<span class="keyword">true</span> +> = l:reverse() <span class="comment">---> note: doesn't make a copy! +</span>{<span class="number">30</span>,<span class="number">20</span>,<span class="number">10</span>,<span class="number">5</span>} +</pre> + +<p>Although methods like <code>sort</code> and <code>reverse</code> operate in-place and change the list, +they do return the original list. This makes it possible to do <em>method chaining</em>, +like <code>ls = ls:append(10):append(20):reverse():append(1)</code>. But (and this is an +important but) no extra copy is made, so <code>ls</code> does not change identity. <a href="../classes/pl.List.html#">List</a> +objects (like tables) are <em>mutable</em>, unlike strings. If you want a copy of a +list, then <code>List(ls)</code> will do the job, i.e. it acts like a copy constructor. +However, if passed any other table, <a href="../classes/pl.List.html#">List</a> will just set the metatable of the +table and <em>not</em> make a copy.</p> + +<p>A particular feature of Python lists is <em>slicing</em>. This is fully supported in +this version of <a href="../classes/pl.List.html#">List</a>, except we use 1-based indexing. So <a href="../classes/pl.List.html#List:slice">List.slice</a> works +rather like <a href="https://www.lua.org/manual/5.1/manual.html#pdf-string.sub">string.sub</a>:</p> + + +<pre> +> l = List {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">40</span>} +> = l:slice(<span class="number">1</span>,<span class="number">1</span>) <span class="comment">---> note: creates a new list! 
+</span>{<span class="number">10</span>} +> = l:slice(<span class="number">2</span>,<span class="number">2</span>) +{<span class="number">20</span>} +> = l:slice(<span class="number">2</span>,<span class="number">3</span>) +{<span class="number">20</span>,<span class="number">30</span>} +> = l:slice(<span class="number">2</span>,-<span class="number">2</span>) +{<span class="number">20</span>,<span class="number">30</span>} +> = l:slice_assign(<span class="number">2</span>,<span class="number">2</span>,{<span class="number">21</span>,<span class="number">22</span>,<span class="number">23</span>}) +{<span class="number">10</span>,<span class="number">21</span>,<span class="number">22</span>,<span class="number">23</span>,<span class="number">30</span>,<span class="number">40</span>} +> = l:chop(<span class="number">1</span>,<span class="number">1</span>) +{<span class="number">21</span>,<span class="number">22</span>,<span class="number">23</span>,<span class="number">30</span>,<span class="number">40</span>} +</pre> + +<p>Functions like <code>slice_assign</code> and <code>chop</code> modify the list; the first is equivalent +to Python's <code>l[i1:i2] = seq</code> and the second to <code>del l[i1:i2]</code>.</p> + +<p>List objects are ultimately just Lua 'list-like' tables, but they have extra +operations defined on them, such as equality and concatenation. For regular +tables, equality is only true if the two tables are <em>identical objects</em>, whereas +two lists are equal if they have the same contents, i.e. 
that <code>l1[i]==l2[i]</code> for +all elements.</p> + + +<pre> +> l1 = List {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>} +> l2 = List {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>} +> = l1 == l2 +<span class="keyword">true</span> +> = l1..l2 +{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>} +</pre> + +<p>The <a href="../classes/pl.List.html#">List</a> constructor can be passed a function. If so, it's assumed that this is +an iterator function that can be repeatedly called to generate a sequence. One +such function is <a href="https://www.lua.org/manual/5.1/manual.html#pdf-io.lines">io.lines</a>; the following short, intense little script counts +the number of lines in standard input:</p> + + +<pre> +<span class="comment">-- linecount.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +ls = List(<span class="global">io</span>.lines()) +<span class="global">print</span>(#ls) +</pre> + +<p><a href="../classes/pl.List.html#List.iterate">List.iterate</a> captures what <a href="../classes/pl.List.html#">List</a> considers a sequence. In particular, it can +also iterate over all 'characters' in a string:</p> + + +<pre> +> <span class="keyword">for</span> ch <span class="keyword">in</span> List.iterate <span class="string">'help'</span> <span class="keyword">do</span> <span class="global">io</span>.write(ch,<span class="string">' '</span>) <span class="keyword">end</span> +h e l p > +</pre> + +<p>Since the function <code>iterate</code> is used internally by the <a href="../classes/pl.List.html#">List</a> constructor, +strings can be made into lists of character strings very easily.</p> + +<p>There are a number of operations that go beyond the standard Python methods. 
For +instance, you can <em>partition</em> a list into a table of sublists using a function. +In the simplest form, you use a predicate (a function returning a boolean value) +to partition the list into two lists, one of elements matching and another of +elements not matching. But you can use any function; if we use <a href="https://www.lua.org/manual/5.1/manual.html#pdf-type">type</a> then the +keys will be the standard Lua type names.</p> + + +<pre> +> ls = List{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>} +> ops = <span class="global">require</span> <span class="string">'pl.operator'</span> +> ls:partition(<span class="keyword">function</span>(x) <span class="keyword">return</span> x > <span class="number">2</span> <span class="keyword">end</span>) +{<span class="keyword">false</span>={<span class="number">1</span>,<span class="number">2</span>},<span class="keyword">true</span>={<span class="number">3</span>,<span class="number">4</span>}} +> ls = List{<span class="string">'one'</span>,<span class="global">math</span>.sin,List{<span class="number">1</span>},<span class="number">10</span>,<span class="number">20</span>,List{<span class="number">1</span>,<span class="number">2</span>}} +> ls:partition(<span class="global">type</span>) +{<span class="keyword">function</span>={<span class="keyword">function</span>: <span class="number">00369110</span>},<span class="global">string</span>={one},number={<span class="number">10</span>,<span class="number">20</span>},<span class="global">table</span>={{<span class="number">1</span>},{<span class="number">1</span>,<span class="number">2</span>}}} +</pre> + +<p>This is one <a href="../classes/pl.List.html#">List</a> method which returns a table which is not a <a href="../classes/pl.List.html#">List</a>. 
Bear in +mind that you can always call a <a href="../classes/pl.List.html#">List</a> method on a plain table argument, so +<code>List.partition(t,type)</code> works as expected. But these functions will only operate +on the array part of the table.</p> + +<p>The 'nominal' type of the returned table is <code>pl.Multimap</code>, which describes a mapping +between keys and multiple values. This does not mean that <code>pl.Multimap</code> is automatically +loaded whenever you use <code>partition</code> (or <a href="../classes/pl.List.html#">List</a> for that matter); this is one of the +standard metatables which are only filled out when the appropriate module is loaded. +This allows tables to be tagged appropriately without causing excessive coupling.</p> + +<p>Stacks occur everywhere in computing. <a href="../classes/pl.List.html#">List</a> supports stack-like operations; +there is already <code>pop</code> (remove and return last value) and <code>append</code> acts like +<code>push</code> (add a value to the end). <code>push</code> is provided as an alias for <code>append</code>, and +the other stack operation (size) is simply the size operator <code>#</code>. Queues can +also be implemented; you use <code>pop</code> to take values out of the queue, and <code>put</code> to +insert a value at the beginning.</p> + +<p>You may derive classes from <a href="../classes/pl.List.html#">List</a>, and since the list-returning methods +are covariant, <code>slice</code> etc. will return lists of the derived type, +not <a href="../classes/pl.List.html#">List</a>. 
For instance, consider the specialization of a <a href="../classes/pl.List.html#">List</a> type that contains +numbers in <code>tests/test-list.lua</code>:</p> + + +<pre> +n1 = NA{<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>} +n2 = NA{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>} +ns = n1 + <span class="number">2</span>*n2 +asserteq(ns,{<span class="number">12</span>,<span class="number">24</span>,<span class="number">36</span>}) +min,max = ns:slice(<span class="number">1</span>,<span class="number">2</span>):minmax() +asserteq(T(min,max),T(<span class="number">12</span>,<span class="number">24</span>)) +asserteq(n1:normalize():sum(),<span class="number">1</span>,<span class="number">1e-8</span>) +</pre> + +<p><a name="Map_and_Set_classes"></a></p> +<h3>Map and Set classes</h3> + +<p>The <a href="../classes/pl.Map.html#">Map</a> class exposes what Python would call a 'dict' interface, and accesses +the hash part of the table. The name 'Map' is used to emphasize the interface, +not the implementation; it is an object which maps keys onto values; <code>m['alice']</code> +or the equivalent <code>m.alice</code> is the access operation. This class also provides +explicit <code>set</code> and <code>get</code> methods, which are trivial for regular maps but get +interesting when <a href="../classes/pl.Map.html#">Map</a> is subclassed. 
The other operation is <code>update</code>, which +extends a map by copying the keys and values from another table, perhaps +overwriting existing keys:</p> + + +<pre> +> Map = <span class="global">require</span> <span class="string">'pl.Map'</span> +> m = Map{one=<span class="number">1</span>,two=<span class="number">2</span>} +> m:update {three=<span class="number">3</span>,four=<span class="number">4</span>,two=<span class="number">20</span>} +> = m == Map{one=<span class="number">1</span>,two=<span class="number">20</span>,three=<span class="number">3</span>,four=<span class="number">4</span>} +<span class="keyword">true</span> +</pre> + +<p>The method <code>values</code> returns a list of the values, and <code>keys</code> returns a list of +the keys; there is no guarantee of order. <code>getvalues</code> is given a list of keys and +returns a list of values associated with these keys:</p> + + +<pre> +> m = Map{one=<span class="number">1</span>,two=<span class="number">2</span>,three=<span class="number">3</span>} +> = m:getvalues {<span class="string">'one'</span>,<span class="string">'three'</span>} +{<span class="number">1</span>,<span class="number">3</span>} +> = m:getvalues(m:keys()) == m:values() +<span class="keyword">true</span> +</pre> + +<p>When querying the value of a <a href="../classes/pl.Map.html#">Map</a>, it is best to use the <code>get</code> method:</p> + + +<pre> +> <span class="global">print</span>(m:get <span class="string">'one'</span>, m:get <span class="string">'two'</span>) +<span class="number">1</span> <span class="number">2</span> +</pre> + +<p>The reason is that <code>m[key]</code> can be ambiguous; due to the current implementation, +<code>m["get"]</code> will always succeed, because if a value is not present in the map, it +will be looked up in the <a href="../classes/pl.Map.html#">Map</a> metatable, which contains a method <code>get</code>. 
There is +currently no simple solution to this annoying restriction.</p> + +<p>There are some useful classes which inherit from <a href="../classes/pl.Map.html#">Map</a>. An <a href="../classes/pl.OrderedMap.html#">OrderedMap</a> behaves +like a <a href="../classes/pl.Map.html#">Map</a> but keeps its keys in order if you use its <code>set</code> method to add keys +and values. Like all the 'container' classes in Penlight, it defines an <code>iter</code> +method for iterating over its values; this will return the keys and values in the +order of insertion; the <code>keys</code> and <code>values</code> methods likewise.</p> + +<p>A <a href="../classes/pl.MultiMap.html#">MultiMap</a> allows multiple values to be associated with a given key. So <code>set</code> +(as before) takes a key and a value, but calling it with the same key and a +different value does not overwrite but adds a new value. <code>get</code> (or using <code>[]</code>) +will return a list of values.</p> + +<p>A <a href="../classes/pl.Set.html#">Set</a> can be seen as a special kind of <a href="../classes/pl.Map.html#">Map</a>, where all the values are <code>true</code>, +the keys are the values, and the order is not important. So in this case +<a href="../classes/pl.Set.html#pl.Set:values">Set.values</a> is defined to return a list of the keys. 
Sets can display +themselves, and the basic operations like <code>union</code> (<code>+</code>) and <code>intersection</code> (<code>*</code>) +are defined.</p> + + +<pre> +> Set = <span class="global">require</span> <span class="string">'pl.Set'</span> +> = Set{<span class="string">'one'</span>,<span class="string">'two'</span>} == Set{<span class="string">'two'</span>,<span class="string">'one'</span>} +<span class="keyword">true</span> +> fruit = Set{<span class="string">'apple'</span>,<span class="string">'banana'</span>,<span class="string">'orange'</span>} +> = fruit[<span class="string">'banana'</span>] +<span class="keyword">true</span> +> = fruit[<span class="string">'hazelnut'</span>] +<span class="keyword">nil</span> +> = fruit:values() +{apple,orange,banana} +> colours = Set{<span class="string">'red'</span>,<span class="string">'orange'</span>,<span class="string">'green'</span>,<span class="string">'blue'</span>} +> = fruit,colours +[apple,orange,banana] [blue,green,orange,red] +> = fruit+colours +[blue,green,apple,red,orange,banana] +> = fruit*colours +[orange] +</pre> + +<p>There are also the functions <a href="../classes/pl.Set.html#pl.Set:difference">Set.difference</a> and <code>Set.symmetric_difference</code>. The +first answers the question 'what fruits are not colours?' and the second 'what +are fruits and colours but not both?'</p> + + +<pre> +> = fruit - colours +[apple,banana] +> = fruit ^ colours +[blue,green,apple,red,banana] +</pre> + +<p>Adding elements to a set is simply <code>fruit['peach'] = true</code> and removing is +<code>fruit['apple'] = nil</code>. To make this simplicity work properly, the <a href="../classes/pl.Set.html#">Set</a> class has no +methods - either you use the operator forms or explicitly use <code>Set.intersection</code> +etc. 
In this way we avoid the ambiguity that plagues <a href="../classes/pl.Map.html#">Map</a>.</p> + + +<p>(See <a href="../classes/pl.Map.html#">pl.Map</a> and <a href="../classes/pl.Set.html#">pl.Set</a>)</p> + +<p><a name="Useful_Operations_on_Tables"></a></p> +<h3>Useful Operations on Tables</h3> + + +<p>Some notes on terminology: Lua tables are usually <em>list-like</em> (like an array) or +<em>map-like</em> (like an associative array or dict); they can of course have a +list-like and a map-like part. Some of the table operations only make sense for +list-like tables, and some only for map-like tables. (The usual Lua terminology +is the array part and the hash part of the table, which reflects the actual +implementation used; it is more accurate to say that a Lua table is an +associative map which happens to be particularly efficient at acting like an +array.)</p> + +<p>The functions provided in <a href="https://www.lua.org/manual/5.1/manual.html#5.5">table</a> provide all the basic manipulations on Lua +tables, but as we saw with the <a href="../classes/pl.List.html#">List</a> class, it is useful to build higher-level +operations on top of those functions. For instance, to copy a table involves this +kind of loop:</p> + + +<pre> +<span class="keyword">local</span> res = {} +<span class="keyword">for</span> k,v <span class="keyword">in</span> <span class="global">pairs</span>(T) <span class="keyword">do</span> + res[k] = v +<span class="keyword">end</span> +<span class="keyword">return</span> res +</pre> + +<p>The <a href="../libraries/pl.tablex.html#">tablex</a> module provides this as <a href="../libraries/pl.tablex.html#copy">copy</a>, which does a <em>shallow</em> copy of a +table. 
There is also <a href="../libraries/pl.tablex.html#deepcopy">deepcopy</a> which goes further than a simple loop in two +ways: first, it gives the copy the same metatable as the original (so it can +copy objects like <a href="../classes/pl.List.html#">List</a> above); second, any nested tables are also copied, to +arbitrary depth. There is also <a href="../libraries/pl.tablex.html#icopy">icopy</a> which operates on list-like tables, where +you can optionally set the start index of the source and destination as well. +It ensures that any left-over elements will be deleted:</p> + + +<pre> +asserteq(icopy({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>},{<span class="number">20</span>,<span class="number">30</span>}),{<span class="number">20</span>,<span class="number">30</span>}) <span class="comment">-- start at 1 +</span>asserteq(icopy({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>},{<span class="number">20</span>,<span class="number">30</span>},<span class="number">2</span>),{<span class="number">1</span>,<span class="number">20</span>,<span class="number">30</span>}) <span class="comment">-- start at 2 +</span>asserteq(icopy({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>},{<span class="number">20</span>,<span class="number">30</span>},<span class="number">2</span>,<span class="number">2</span>),{<span class="number">1</span>,<span class="number">30</span>}) <span class="comment">-- start at 2, copy from 2</span> +</pre> + +<p>(This code from the <a href="../libraries/pl.tablex.html#">tablex</a> test module shows the use of <a 
href="../libraries/pl.test.html#asserteq">pl.test.asserteq</a>)</p> + +<p>By contrast, <a href="../libraries/pl.tablex.html#move">move</a> overwrites but does not delete the rest of the destination:</p> + + +<pre> +asserteq(move({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>},{<span class="number">20</span>,<span class="number">30</span>}),{<span class="number">20</span>,<span class="number">30</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>}) +asserteq(move({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>},{<span class="number">20</span>,<span class="number">30</span>},<span class="number">2</span>),{<span class="number">1</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>}) +asserteq(move({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>},{<span class="number">20</span>,<span class="number">30</span>},<span class="number">2</span>,<span class="number">2</span>),{<span class="number">1</span>,<span class="number">30</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>}) +</pre> + +<p>(The difference is somewhat like that between C's <code>strcpy</code> and <code>memmove</code>.)</p> + +<p>To summarize, use <a href="../libraries/pl.tablex.html#copy">copy</a> or <a href="../libraries/pl.tablex.html#deepcopy">deepcopy</a> to make a copy of an arbitrary table. 
To +copy into a map-like table, use <a href="../libraries/pl.tablex.html#update">update</a>; to copy into a list-like table use +<a href="../libraries/pl.tablex.html#icopy">icopy</a>, and <a href="../libraries/pl.tablex.html#move">move</a> if you are updating a range in the destination.</p> + +<p>To complete this set of operations, there is <a href="../libraries/pl.tablex.html#insertvalues">insertvalues</a> which works like +<a href="https://www.lua.org/manual/5.1/manual.html#pdf-table.insert">table.insert</a> except that one provides a table of values to be inserted, and +<a href="../libraries/pl.tablex.html#removevalues">removevalues</a> which removes a range of values.</p> + + +<pre> +asserteq(insertvalues({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>},<span class="number">2</span>,{<span class="number">20</span>,<span class="number">30</span>}),{<span class="number">1</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +asserteq(insertvalues({<span class="number">1</span>,<span class="number">2</span>},{<span class="number">3</span>,<span class="number">4</span>}),{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +</pre> + +<p>Another example:</p> + + +<pre> +> T = <span class="global">require</span> <span class="string">'pl.tablex'</span> +> t = {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">40</span>} +> = T.removevalues(t,<span class="number">2</span>,<span class="number">3</span>) +{<span class="number">10</span>,<span class="number">40</span>} +> = T.insertvalues(t,<span class="number">2</span>,{<span class="number">20</span>,<span class="number">30</span>}) +{<span class="number">10</span>,<span class="number">20</span>,<span 
class="number">30</span>,<span class="number">40</span>} +</pre> + +<p>In a similar spirit to <a href="../libraries/pl.tablex.html#deepcopy">deepcopy</a>, <a href="../libraries/pl.tablex.html#deepcompare">deepcompare</a> will take two tables and return +true only if they have exactly the same values and structure.</p> + + +<pre> +> t1 = {<span class="number">1</span>,{<span class="number">2</span>,<span class="number">3</span>},<span class="number">4</span>} +> t2 = deepcopy(t1) +> = t1 == t2 +<span class="keyword">false</span> +> = deepcompare(t1,t2) +<span class="keyword">true</span> +</pre> + +<p><a href="../libraries/pl.tablex.html#find">find</a> will return the index of a given value in a list-like table. Note that +like <a href="https://www.lua.org/manual/5.1/manual.html#pdf-string.find">string.find</a> you can specify an index to start searching, so that all +instances can be found. There is an optional fourth argument, which makes the +search start at the end and go backwards, so we could define <a href="../libraries/pl.tablex.html#rfind">rfind</a> like so:</p> + + +<pre> +<span class="keyword">function</span> rfind(t,val,istart) + <span class="keyword">return</span> tablex.find(t,val,istart,<span class="keyword">true</span>) +<span class="keyword">end</span> +</pre> + +<p><a href="../libraries/pl.tablex.html#find">find</a> does a linear search, so it can slow down code that depends on it. If +efficiency is required for large tables, consider using an <em>index map</em>. +<a href="../libraries/pl.tablex.html#index_map">index_map</a> will return a table where the keys are the original values of the +list, and the associated values are the indices. 
(It is almost exactly the +representation needed for a <em>set</em>.)</p> + + +<pre> +> t = {<span class="string">'one'</span>,<span class="string">'two'</span>,<span class="string">'three'</span>} +> = tablex.find(t,<span class="string">'two'</span>) +<span class="number">2</span> +> = tablex.find(t,<span class="string">'four'</span>) +<span class="keyword">nil</span> +> il = tablex.index_map(t) +> = il[<span class="string">'two'</span>] +<span class="number">2</span> +> = il.two +<span class="number">2</span> +</pre> + +<p>A version of <a href="../libraries/pl.tablex.html#index_map">index_map</a> called <a href="../libraries/pl.tablex.html#makeset">makeset</a> is also provided, where the values are +just <code>true</code>. This is useful because two such sets can be compared for equality +using <a href="../libraries/pl.tablex.html#deepcompare">deepcompare</a>:</p> + + +<pre> +> = deepcompare(makeset {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>},makeset {<span class="number">2</span>,<span class="number">1</span>,<span class="number">3</span>}) +<span class="keyword">true</span> +</pre> + +<p>Consider the problem of determining the new employees that have joined in a +period. 
Assume we have two files of employee names:</p> + + +<pre> +(last-month.txt) +smith,john +brady,maureen +mogale,thabo + +(this-month.txt) +smith,john +smit,johan +brady,maureen +mogale,thabo +van der Merwe,Piet +</pre> + +<p>To find the differences, just make the employee lists into sets, like so:</p> + + +<pre> +<span class="global">require</span> <span class="string">'pl'</span> + +<span class="keyword">function</span> read_employees(file) + <span class="keyword">local</span> ls = List(<span class="global">io</span>.lines(file)) <span class="comment">-- a list of employees +</span> <span class="keyword">return</span> tablex.makeset(ls) +<span class="keyword">end</span> + +last = read_employees <span class="string">'last-month.txt'</span> +this = read_employees <span class="string">'this-month.txt'</span> + +<span class="comment">-- who is in this but not in last? +</span>diff = tablex.difference(this,last) + +<span class="comment">-- in a set, the keys are the values... +</span><span class="keyword">for</span> e <span class="keyword">in</span> <span class="global">pairs</span>(diff) <span class="keyword">do</span> <span class="global">print</span>(e) <span class="keyword">end</span> + +<span class="comment">-- *output* +</span><span class="comment">-- van der Merwe,Piet +</span><span class="comment">-- smit,johan</span> +</pre> + +<p>The <a href="../libraries/pl.tablex.html#difference">difference</a> operation is easy to write and read:</p> + + +<pre> +<span class="keyword">for</span> e <span class="keyword">in</span> <span class="global">pairs</span>(this) <span class="keyword">do</span> + <span class="keyword">if</span> <span class="keyword">not</span> last[e] <span class="keyword">then</span> + <span class="global">print</span>(e) + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p>The point of using <a href="../libraries/pl.tablex.html#difference">difference</a> here is not that it is a tricky thing to code; it is that you +are stating your 
intentions clearly to other readers of your code. (And naturally +to your future self, in six months' time.)</p> + +<p><a href="../libraries/pl.tablex.html#find_if">find_if</a> will search a table using a function. The optional third argument is a +value which will be passed as a second argument to the function. <a href="../libraries/pl.operator.html#">pl.operator</a> +provides the Lua operators conveniently wrapped as functions, so the basic +comparison functions are available:</p> + + +<pre> +> ops = <span class="global">require</span> <span class="string">'pl.operator'</span> +> = tablex.find_if({<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">40</span>},ops.gt,<span class="number">20</span>) +<span class="number">3</span> <span class="keyword">true</span> +</pre> + +<p>Note that <a href="../libraries/pl.tablex.html#find_if">find_if</a> will also return the <em>actual value</em> returned by the function, +which of course is usually just <code>true</code> for a boolean function, but any value +which is not <code>nil</code> and not <code>false</code> can be usefully passed back.</p> + +<p><a href="../libraries/pl.tablex.html#deepcompare">deepcompare</a> does a thorough recursive comparison, but otherwise uses the +default equality operator. <a href="../libraries/pl.tablex.html#compare">compare</a> allows you to specify exactly what function +to use when comparing two list-like tables, and <a href="../libraries/pl.tablex.html#compare_no_order">compare_no_order</a> is true if +they contain exactly the same elements. 
Do note that the latter does not need an +explicit comparison function - in this case the implementation is actually to +compare the two sets, as above:</p> + + +<pre> +> = compare_no_order({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>},{<span class="number">2</span>,<span class="number">1</span>,<span class="number">3</span>}) +<span class="keyword">true</span> +> = compare_no_order({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>},{<span class="number">2</span>,<span class="number">1</span>,<span class="number">3</span>},<span class="string">'=='</span>) +<span class="keyword">true</span> +</pre> + +<p>(Note the special string '==' above; instead of saying <code>ops.gt</code> or <code>ops.eq</code> we +can use the strings '>' or '==' respectively.)</p> + +<p><a href="../libraries/pl.tablex.html#sort">sort</a> and <a href="../libraries/pl.tablex.html#sortv">sortv</a> return iterators that will iterate through the +sorted elements of a table. <a href="../libraries/pl.tablex.html#sort">sort</a> iterates by sorted key order, and +<a href="../libraries/pl.tablex.html#sortv">sortv</a> iterates by sorted value order. 
For example, given a table +with names and ages, it is trivial to iterate over the elements:</p> + + +<pre> +> t = {john=<span class="number">27</span>,jane=<span class="number">31</span>,mary=<span class="number">24</span>} +> <span class="keyword">for</span> name,age <span class="keyword">in</span> tablex.sort(t) <span class="keyword">do</span> <span class="global">print</span>(name,age) <span class="keyword">end</span> +jane <span class="number">31</span> +john <span class="number">27</span> +mary <span class="number">24</span> +> <span class="keyword">for</span> name,age <span class="keyword">in</span> tablex.sortv(t) <span class="keyword">do</span> <span class="global">print</span>(name,age) <span class="keyword">end</span> +mary <span class="number">24</span> +john <span class="number">27</span> +jane <span class="number">31</span> +</pre> + +<p>There are several ways to merge tables in PL. If they are list-like, then see the +operations defined by <a href="../classes/pl.List.html#">pl.List</a>, like concatenation. If they are map-like, then +<a href="../libraries/pl.tablex.html#merge">merge</a> provides two basic operations. If the third arg is false, then the result +only contains the keys that are in common between the two tables, and if true, +then the result contains all the keys of both tables. 
These are in fact +generalized set union and intersection operations:</p> + + +<pre> +> S1 = {john=<span class="number">27</span>,jane=<span class="number">31</span>,mary=<span class="number">24</span>} +> S2 = {jane=<span class="number">31</span>,jones=<span class="number">50</span>} +> = tablex.merge(S1, S2, <span class="keyword">false</span>) +{jane=<span class="number">31</span>} +> = tablex.merge(S1, S2, <span class="keyword">true</span>) +{mary=<span class="number">24</span>,jane=<span class="number">31</span>,john=<span class="number">27</span>,jones=<span class="number">50</span>} +</pre> + +<p>When working with tables, you will often find yourself writing loops like in the +first example. Loops are second nature to programmers, but they are often not the +most elegant and self-describing way of expressing an operation. Consider the +<a href="../libraries/pl.tablex.html#map">map</a> function, which creates a new table by applying a function to each element +of the original:</p> + + +<pre> +> = map(<span class="global">math</span>.sin, {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +{ <span class="number">0.84</span>, <span class="number">0.91</span>, <span class="number">0.14</span>, -<span class="number">0.76</span>} +> = map(<span class="keyword">function</span>(x) <span class="keyword">return</span> x*x <span class="keyword">end</span>, {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +{<span class="number">1</span>,<span class="number">4</span>,<span class="number">9</span>,<span class="number">16</span>} +</pre> + +<p><a href="../libraries/pl.tablex.html#map">map</a> saves you from writing a loop, and the resulting code is often clearer, as +well as being shorter. This is not to say that 'loops are bad' (although you will +hear that from some extremists), just that it's good to capture standard +patterns. 
Then the loops you do write will stand out and acquire more significance.</p> + +<p><a href="../libraries/pl.tablex.html#pairmap">pairmap</a> is interesting, because the function works with both the key and the +value.</p> + + +<pre> +> t = {fred=<span class="number">10</span>,bonzo=<span class="number">20</span>,alice=<span class="number">4</span>} +> = pairmap(<span class="keyword">function</span>(k,v) <span class="keyword">return</span> v <span class="keyword">end</span>, t) +{<span class="number">4</span>,<span class="number">10</span>,<span class="number">20</span>} +> = pairmap(<span class="keyword">function</span>(k,v) <span class="keyword">return</span> k <span class="keyword">end</span>, t) +{<span class="string">'alice'</span>,<span class="string">'fred'</span>,<span class="string">'bonzo'</span>} +</pre> + +<p>(These are common enough operations that the first is defined as <a href="../libraries/pl.tablex.html#values">values</a> and the +second as <a href="../libraries/pl.tablex.html#keys">keys</a>.) 
If the function returns two values, then the <em>second</em> value is +considered to be the new key:</p> + + +<pre> +> = pairmap(<span class="keyword">function</span>(k,v) <span class="keyword">return</span> v+<span class="number">10</span>, k:upper() <span class="keyword">end</span>, t) +{BONZO=<span class="number">30</span>,FRED=<span class="number">20</span>,ALICE=<span class="number">14</span>} +</pre> + +<p><a href="../libraries/pl.tablex.html#map2">map2</a> applies a function to two tables:</p> + + +<pre> +> map2(ops.add,{<span class="number">1</span>,<span class="number">2</span>},{<span class="number">10</span>,<span class="number">20</span>}) +{<span class="number">11</span>,<span class="number">22</span>} +> map2(<span class="string">'*'</span>,{<span class="number">1</span>,<span class="number">2</span>},{<span class="number">10</span>,<span class="number">20</span>}) +{<span class="number">10</span>,<span class="number">40</span>} +</pre> + +<p>The various map operations generate tables; <a href="../libraries/pl.tablex.html#reduce">reduce</a> applies a function of two +arguments over a table and returns the result as a scalar:</p> + + +<pre> +> reduce (<span class="string">'+'</span>, {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>}) +<span class="number">6</span> +> reduce (<span class="string">'..'</span>, {<span class="string">'one'</span>,<span class="string">'two'</span>,<span class="string">'three'</span>}) +<span class="string">'onetwothree'</span> +</pre> + +<p>Finally, <a href="../libraries/pl.tablex.html#zip">zip</a> sews different tables together:</p> + + +<pre> +> = zip({<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>},{<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>}) +{{<span class="number">1</span>,<span class="number">10</span>},{<span class="number">2</span>,<span class="number">20</span>},{<span 
class="number">3</span>,<span class="number">30</span>}} +</pre> + +<p>Browsing through the documentation, you will find that <a href="../libraries/pl.tablex.html#">tablex</a> and <a href="../classes/pl.List.html#">List</a> share +methods. For instance, <a href="../libraries/pl.tablex.html#imap">tablex.imap</a> and <a href="../classes/pl.List.html#List:map">List.map</a> are basically the same +function; they both operate over the array-part of the table and generate another +table. This can also be expressed as a <em>list comprehension</em> <code>C 'f(x) for x' (t)</code> +which makes the operation more explicit. So why are there different ways to do +the same thing? The main reason is that not all tables are Lists: the expression +<code>ls:map('#')</code> will return a <em>list</em> of the lengths of any elements of <code>ls</code>. A list +is a thin wrapper around a table, provided by the metatable <a href="../classes/pl.List.html#">List</a>. Sometimes you +may wish to work with ordinary Lua tables; the <a href="../classes/pl.List.html#">List</a> interface is not a +compulsory way to use Penlight table operations.</p> + +<p><a name="Operations_on_two_dimensional_tables"></a></p> +<h3>Operations on two-dimensional tables</h3> + + +<p>Two-dimensional tables are of course easy to represent in Lua, for instance +<code>{{1,2},{3,4}}</code> where we store rows as subtables and index like so <code>A[row][col]</code>. +This is the common representation used by matrix libraries like +<a href="http://lua-users.org/wiki/LuaMatrix">LuaMatrix</a>. 
<a href="../libraries/pl.array2d.html#">pl.array2d</a> does not provide +matrix operations, since that is the job for a specialized library, but rather +provides generalizations of the higher-level operations provided by <a href="../libraries/pl.tablex.html#">pl.tablex</a> +for one-dimensional arrays.</p> + +<p><a href="../libraries/pl.array2d.html#iter">iter</a> is a useful generalization of <a href="https://www.lua.org/manual/5.1/manual.html#pdf-ipairs">ipairs</a>. (The extra parameter determines +whether you want the indices as well.)</p> + + +<pre> +> a = {{<span class="number">1</span>,<span class="number">2</span>},{<span class="number">3</span>,<span class="number">4</span>}} +> <span class="keyword">for</span> i,j,v <span class="keyword">in</span> array2d.iter(a,<span class="keyword">true</span>) <span class="keyword">do</span> <span class="global">print</span>(i,j,v) <span class="keyword">end</span> +<span class="number">1</span> <span class="number">1</span> <span class="number">1</span> +<span class="number">1</span> <span class="number">2</span> <span class="number">2</span> +<span class="number">2</span> <span class="number">1</span> <span class="number">3</span> +<span class="number">2</span> <span class="number">2</span> <span class="number">4</span> +</pre> + +<p>Note that you can always convert an arbitrary 2D array into a 'list of lists' +with <code>List(tablex.map(List,a))</code></p> + +<p><a href="../libraries/pl.array2d.html#map">map</a> will apply a function over all elements (notice that extra arguments can be +provided, so this operation is in effect <code>function(x) return x-1 end</code>)</p> + + +<pre> +> array2d.map(<span class="string">'-'</span>,a,<span class="number">1</span>) +{{<span class="number">0</span>,<span class="number">1</span>},{<span class="number">2</span>,<span class="number">3</span>}} +</pre> + +<p>2D arrays are stored as an array of rows, but columns can be extracted:</p> + + +<pre> +> array2d.column(a,<span 
class="number">1</span>) +{<span class="number">1</span>,<span class="number">3</span>} +</pre> + +<p>There are three equivalents to <a href="../libraries/pl.tablex.html#reduce">tablex.reduce</a>. You can either reduce along the +rows (which is the most efficient) or reduce along the columns. Either one will +give you a 1D array. And <a href="../libraries/pl.array2d.html#reduce2">reduce2</a> will apply two operations: the first one +reduces the rows, and the second reduces the result.</p> + + +<pre> +> array2d.reduce_rows(<span class="string">'+'</span>,a) +{<span class="number">3</span>,<span class="number">7</span>} +> array2d.reduce_cols(<span class="string">'+'</span>,a) +{<span class="number">4</span>,<span class="number">6</span>} +> <span class="comment">-- same as tablex.reduce('*',array2d.reduce_rows('+',a)) +</span>> array2d.reduce2(<span class="string">'*'</span>,<span class="string">'+'</span>,a) +<span class="number">21</span> +</pre> + +<p><a href="../libraries/pl.tablex.html#map2">tablex.map2</a> applies an operation to two tables, giving another table. +<a href="../libraries/pl.array2d.html#map2">array2d.map2</a> does this for 2D arrays. 
Note that you have to provide the <em>rank</em> +of the arrays involved, since it's hard to always correctly deduce this from the +data:</p> + + +<pre> +> b = {{<span class="number">10</span>,<span class="number">20</span>},{<span class="number">30</span>,<span class="number">40</span>}} +> a = {{<span class="number">1</span>,<span class="number">2</span>},{<span class="number">3</span>,<span class="number">4</span>}} +> = array2d.map2(<span class="string">'+'</span>,<span class="number">2</span>,<span class="number">2</span>,a,b) <span class="comment">-- two 2D arrays +</span>{{<span class="number">11</span>,<span class="number">22</span>},{<span class="number">33</span>,<span class="number">44</span>}} +> = array2d.map2(<span class="string">'+'</span>,<span class="number">1</span>,<span class="number">2</span>,{<span class="number">10</span>,<span class="number">100</span>},a) <span class="comment">-- 1D, 2D +</span>{{<span class="number">11</span>,<span class="number">102</span>},{<span class="number">13</span>,<span class="number">104</span>}} +> = array2d.map2(<span class="string">'*'</span>,<span class="number">2</span>,<span class="number">1</span>,a,{<span class="number">1</span>,-<span class="number">1</span>}) <span class="comment">-- 2D, 1D +</span>{{<span class="number">1</span>,-<span class="number">2</span>},{<span class="number">3</span>,-<span class="number">4</span>}} +</pre> + +<p>Of course, you are not limited to simple arithmetic. Say we have a 2D array of +strings, and wish to print it out with proper right justification. 
The first step +is to compute all the string lengths by mapping <a href="https://www.lua.org/manual/5.1/manual.html#pdf-string.len">string.len</a> over the array, the +second is to reduce this along the columns using <a href="https://www.lua.org/manual/5.1/manual.html#pdf-math.max">math.max</a> to get maximum column +widths, and last, apply <a href="../libraries/pl.stringx.html#rjust">stringx.rjust</a> with these widths.</p> + + +<pre> +maxlens = reduce_cols(<span class="global">math</span>.max,map(<span class="string">'#'</span>,lines)) +lines = map2(stringx.rjust,<span class="number">2</span>,<span class="number">1</span>,lines,maxlens) +</pre> + +<p>There is <a href="../libraries/pl.array2d.html#product">product</a> which returns the <em>Cartesian product</em> of two 1D arrays. The +result is a 2D array formed from applying the function to all possible pairs from +the two arrays.</p> + + +<pre> +> array2d.product(<span class="string">'{}'</span>,{<span class="number">1</span>,<span class="number">2</span>},{<span class="string">'a'</span>,<span class="string">'b'</span>}) +{{{<span class="number">1</span>,<span class="string">'a'</span>},{<span class="number">2</span>,<span class="string">'a'</span>}},{{<span class="number">1</span>,<span class="string">'b'</span>},{<span class="number">2</span>,<span class="string">'b'</span>}}} +</pre> + +<p>There is a set of operations which work in-place on 2D arrays. You can +<a href="../libraries/pl.array2d.html#swap_rows">swap_rows</a> and <a href="../libraries/pl.array2d.html#swap_cols">swap_cols</a>; the first really is a simple one-liner, but the idea +here is to give the operation a name. <a href="../libraries/pl.array2d.html#remove_row">remove_row</a> and <a href="../libraries/pl.array2d.html#remove_col">remove_col</a> are +generalizations of <a href="https://www.lua.org/manual/5.1/manual.html#pdf-table.remove">table.remove</a>. 
Likewise, <a href="../libraries/pl.array2d.html#extract_rows">extract_rows</a> and <a href="../libraries/pl.array2d.html#extract_cols">extract_cols</a> +are given arrays of indices and discard anything else. So, for instance, +<code>extract_cols(A,{2,4})</code> will leave just columns 2 and 4 in the array.</p> + +<p><a href="../classes/pl.List.html#List:slice">List.slice</a> is often useful on 1D arrays; <a href="../libraries/pl.array2d.html#slice">slice</a> does the same thing, but is +generally given a start (row,column) and an end (row,column).</p> + + +<pre> +> A = {{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>},{<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>},{<span class="number">7</span>,<span class="number">8</span>,<span class="number">9</span>}} +> B = slice(A,<span class="number">1</span>,<span class="number">1</span>,<span class="number">2</span>,<span class="number">2</span>) +> write(B) + <span class="number">1</span> <span class="number">2</span> + <span class="number">4</span> <span class="number">5</span> +> B = slice(A,<span class="number">2</span>,<span class="number">2</span>) +> write(B,<span class="keyword">nil</span>,<span class="string">'%4.1f'</span>) + <span class="number">5.0</span> <span class="number">6.0</span> + <span class="number">8.0</span> <span class="number">9.0</span> +</pre> + +<p>Here <a href="../libraries/pl.array2d.html#write">write</a> is used to print out an array nicely; the second parameter is <code>nil</code>, +which selects the default (stdout) but can be any file object, and the third parameter +is an optional format (as used in <a href="https://www.lua.org/manual/5.1/manual.html#pdf-string.format">string.format</a>).</p> + +<p><a href="../libraries/pl.array2d.html#parse_range">parse_range</a> will take a spreadsheet range like 'A1:B2' or 'R1C1:R2C2' and +return the range as four numbers, which can be passed to <a
href="../libraries/pl.array2d.html#slice">slice</a>. The rule is +that <a href="../libraries/pl.array2d.html#slice">slice</a> will return an array of the appropriate shape depending on the +range; if a range represents a row or a column, the result is 1D, otherwise 2D.</p> + +<p>This applies to <a href="../libraries/pl.array2d.html#iter">iter</a> as well, which can also optionally be given a range:</p> + + + +<pre> +> <span class="keyword">for</span> i,j,v <span class="keyword">in</span> iter(A,<span class="keyword">true</span>,<span class="number">2</span>,<span class="number">2</span>) <span class="keyword">do</span> <span class="global">print</span>(i,j,v) <span class="keyword">end</span> +<span class="number">2</span> <span class="number">2</span> <span class="number">5</span> +<span class="number">2</span> <span class="number">3</span> <span class="number">6</span> +<span class="number">3</span> <span class="number">2</span> <span class="number">8</span> +<span class="number">3</span> <span class="number">3</span> <span class="number">9</span> +</pre> + +<p><a href="../libraries/pl.array2d.html#new">new</a> will construct a new 2D array with the given dimensions. You provide an +initial value for the elements, which is interpreted as a function if it's +callable. 
With <code>L</code> being <a href="../libraries/pl.utils.html#string_lambda">utils.string_lambda</a> we then have the following way to +make an <em>identity matrix</em>:</p> + + +<pre> +asserteq( + array2d.new(<span class="number">3</span>,<span class="number">3</span>,L<span class="string">'|i,j| i==j and 1 or 0'</span>), + {{<span class="number">1</span>,<span class="number">0</span>,<span class="number">0</span>},{<span class="number">0</span>,<span class="number">1</span>,<span class="number">0</span>},{<span class="number">0</span>,<span class="number">0</span>,<span class="number">1</span>}} +) +</pre> + +<p>Please note that most functions in <a href="../libraries/pl.array2d.html#">array2d</a> are <em>covariant</em>, that is, they +return an array of the same type as they receive. In particular, any objects +created with <a href="../libraries/pl.data.html#new">data.new</a> or <code>matrix.new</code> will remain data or matrix objects when +reshaped or sliced, etc. Data objects have the <a href="../libraries/pl.array2d.html#">array2d</a> functions available as +methods.</p> + + + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/03-strings.md.html b/Data/Libraries/Penlight/docs/manual/03-strings.md.html new file mode 100644 index 0000000..a629192 --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/03-strings.md.html @@ -0,0 +1,397 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div
id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Extra_String_Methods">Extra String Methods </a></li> +<li><a href="#String_Templates">String Templates </a></li> +<li><a href="#Another_Style_of_Template">Another Style of Template </a></li> +<li><a href="#File_style_I_O_on_Strings">File-style I/O on Strings </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><strong>Strings. Higher-level operations on strings.</strong></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a 
href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a 
href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Strings. Higher-level operations on strings.</h2> + +<p><a name="Extra_String_Methods"></a></p> +<h3>Extra String Methods</h3> + + +<p>These are convenient borrowings from Python, as described in 3.6.1 of the Python +reference, but note that indices in Lua always begin at one. <a href="../libraries/pl.stringx.html#">stringx</a> defines +functions like <a href="../libraries/pl.stringx.html#isalpha">isalpha</a> and <a href="../libraries/pl.stringx.html#isdigit">isdigit</a>, which return <code>true</code> if s is only composed +of letters or digits respectively. <a href="../libraries/pl.stringx.html#startswith">startswith</a> and <a href="../libraries/pl.stringx.html#endswith">endswith</a> are convenient +ways to find substrings. 
(<a href="../libraries/pl.stringx.html#endswith">endswith</a> works as in Python 2.5, so that <code>f:endswith +{'.bat','.exe','.cmd'}</code> will be true for any filename which ends with one of these +extensions.) There are justify methods and whitespace trimming functions like +<a href="../libraries/pl.stringx.html#strip">strip</a>.</p> + + +<pre> +> stringx.import() +> (<span class="string">'bonzo.dog'</span>):endswith {<span class="string">'.dog'</span>,<span class="string">'.cat'</span>} +<span class="keyword">true</span> +> (<span class="string">'bonzo.txt'</span>):endswith {<span class="string">'.dog'</span>,<span class="string">'.cat'</span>} +<span class="keyword">false</span> +> (<span class="string">'bonzo.cat'</span>):endswith {<span class="string">'.dog'</span>,<span class="string">'.cat'</span>} +<span class="keyword">true</span> +> (<span class="string">' stuff'</span>):rjust(<span class="number">20</span>,<span class="string">'+'</span>) +<span class="string">'++++++++++++++ stuff'</span> +> (<span class="string">' stuff '</span>):lstrip() +<span class="string">'stuff '</span> +> (<span class="string">' stuff '</span>):rstrip() +<span class="string">' stuff'</span> +> (<span class="string">' stuff '</span>):strip() +<span class="string">'stuff'</span> +> <span class="keyword">for</span> s <span class="keyword">in</span> (<span class="string">'one\ntwo\nthree\n'</span>):lines() <span class="keyword">do</span> <span class="global">print</span>(s) <span class="keyword">end</span> +one +two +three +</pre> + +<p>Most of these can be fairly easily implemented using the Lua string library, +which is more general and powerful. But they are convenient operations to have +easily at hand. Note that they can be injected into the <a href="https://www.lua.org/manual/5.1/manual.html#5.4">string</a> table if you use +<code>stringx.import</code>, but a simple alias like <code>local stringx = require 'pl.stringx'</code> +is preferable.
This is the recommended practice when writing modules for +consumption by other people, since it is bad manners to change the global state +of the rest of the system. Magic may be used for convenience, but there is always +a price.</p> + + +<p><a name="String_Templates"></a></p> +<h3>String Templates</h3> + + +<p>Another borrowing from Python, string templates allow you to substitute values +looked up in a table:</p> + + +<pre> +<span class="keyword">local</span> Template = <span class="global">require</span> (<span class="string">'pl.text'</span>).Template +t = Template(<span class="string">'${here} is the $answer'</span>) +<span class="global">print</span>(t:substitute {here = <span class="string">'Lua'</span>, answer = <span class="string">'best'</span>}) +==> +Lua is the best +</pre> + +<p>'$ variables' can optionally have curly braces; this form is useful if you are +gluing text together to make variables, e.g. <code>${prefix}_name_${postfix}</code>. The +<a href="../libraries/pl.text.html#Template:substitute">substitute</a> method will throw an error if a $ variable is not found in the +table, and the <a href="../libraries/pl.text.html#Template:safe_substitute">safe_substitute</a> method will not.</p> + +<p>The Lua implementation has an extra method, <a href="../libraries/pl.text.html#Template:indent_substitute">indent_substitute</a>, which is very +useful for inserting blocks of text, because it adjusts indentation.
Consider +this example:</p> + + +<pre> +<span class="comment">-- testtemplate.lua +</span><span class="keyword">local</span> Template = <span class="global">require</span> (<span class="string">'pl.text'</span>).Template + +t = Template <span class="string">[[ + for i = 1,#$t do + $body + end +]]</span> + +body = Template <span class="string">[[ +local row = $t[i] +for j = 1,#row do + fun(row[j]) +end +]]</span> + +<span class="global">print</span>(t:indent_substitute {body=body,t=<span class="string">'tbl'</span>}) +</pre> + +<p>And the output is:</p> + + +<pre> +<span class="keyword">for</span> i = <span class="number">1</span>,#tbl <span class="keyword">do</span> + <span class="keyword">local</span> row = tbl[i] + <span class="keyword">for</span> j = <span class="number">1</span>,#row <span class="keyword">do</span> + fun(row[j]) + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p><a href="../libraries/pl.text.html#Template:indent_substitute">indent_substitute</a> can substitute templates, in which case they themselves +will be substituted using the given table. So in this case, <code>$t</code> was substituted +twice.</p> + +<p><a href="../libraries/pl.text.html#">pl.text</a> also has a number of useful functions like <a href="../libraries/pl.text.html#dedent">dedent</a>, which strips all +the initial indentation from a multiline string. As in Python, this is useful for +preprocessing multiline strings if you like indenting them with your code. The +function <a href="../libraries/pl.text.html#wrap">wrap</a> is passed a long string (a <em>paragraph</em>) and returns a list of +lines that fit into a desired line width. As an extension, there is also <a href="../libraries/pl.text.html#indent">indent</a> +for indenting multiline strings.</p> + +<p>New in Penlight with the 0.9 series is <code>text.format_operator</code>.
Calling this +enables Python-style string formatting using the modulo operator <code>%</code>:</p> + + +<pre> +> text.format_operator() +> = <span class="string">'%s[%d]'</span> % {<span class="string">'dog'</span>,<span class="number">1</span>} +dog[<span class="number">1</span>] +</pre> + +<p>So in its simplest form it saves the typing involved with <a href="https://www.lua.org/manual/5.1/manual.html#pdf-string.format">string.format</a>; it +will also expand <code>$</code> variables using named fields:</p> + + +<pre> +> = <span class="string">'$animal[$num]'</span> % {animal=<span class="string">'dog'</span>,num=<span class="number">1</span>} +dog[<span class="number">1</span>] +</pre> + +<p>As with <code>stringx.import</code> you have to do this explicitly, since all strings share the same +metatable. But in your own scripts you can feel free to do this.</p> + +<p><a name="Another_Style_of_Template"></a></p> +<h3>Another Style of Template</h3> + +<p>A new module is <a href="../libraries/pl.template.html#">template</a>, which is a version of Rici Lake's <a href="http://lua-users.org/wiki/SlightlyLessSimpleLuaPreprocessor">Lua +Preprocessor</a>. This +allows you to mix Lua code with your templates in a straightforward way.
There +are only two rules:</p> + +<ul> + <li>Lines beginning with <code>#</code> are Lua code.</li> + <li>Otherwise, anything inside <code>$()</code> is a Lua expression.</li> +</ul> + +<p>So a template generating an HTML list would look like this:</p> + + +<pre> +<ul> +# <span class="keyword">for</span> i,val <span class="keyword">in</span> <span class="global">ipairs</span>(T) <span class="keyword">do</span> +<li>$(i) = $(val:upper())</li> +# <span class="keyword">end</span> +</ul> +</pre> + +<p>Assuming the text is in <code>tmpl</code>, the template can be expanded using:</p> + + +<pre> +<span class="keyword">local</span> template = <span class="global">require</span> <span class="string">'pl.template'</span> +<span class="keyword">local</span> my_env = { + <span class="global">ipairs</span> = <span class="global">ipairs</span>, + T = {<span class="string">'one'</span>,<span class="string">'two'</span>,<span class="string">'three'</span>} +} +res = template.substitute(tmpl, my_env) +</pre> + +<p>and we get</p> + + +<pre> +<ul> +<li><span class="number">1</span> = ONE</li> +<li><span class="number">2</span> = TWO</li> +<li><span class="number">3</span> = THREE</li> +</ul> +</pre> + +<p>There is a single function, <a href="../libraries/pl.template.html#substitute">template.substitute</a>, which is passed a template +string and an environment table. This table may contain some special fields, +like <code>_parent</code>, which can be set to a table representing a 'fallback' environment +in case a symbol is not found.
<code>_brackets</code> is usually '()' and <code>_escape</code> is +usually '#' but it's sometimes necessary to redefine these if the defaults +interfere with the target language - for instance, <code>$(V)</code> has another meaning in +Make, and <code>#</code> means a preprocessor line in C/C++.</p> + +<p>Finally, passing <code>_debug</code> will cause the intermediate +Lua code to be dumped if something goes wrong.</p> + +<p>Here is a C code generation example; something that could easily be extended to +be a minimal Lua extension skeleton generator.</p> + + +<pre> +<span class="keyword">local</span> subst = <span class="global">require</span> <span class="string">'pl.template'</span>.substitute + +<span class="keyword">local</span> templ = <span class="string">[[ +#include <lua.h> +#include <lauxlib.h> +#include <lualib.h> + +> for _,f in ipairs(mod) do +static int l_$(f.name) (lua_State *L) { + +} +> end + +static const luaL_reg $(mod.name)[] = { +> for _,f in ipairs(mod) do + {"$(f.name)",l_$(f.name)}, +> end + {NULL,NULL} +}; + +int luaopen_$(mod.name) (lua_State *L) { + luaL_register (L, "$(mod.name)", $(mod.name)); + return 1; +} +]]</span> + +<span class="global">print</span>(subst(templ,{ + _escape = <span class="string">'>'</span>, + <span class="global">ipairs</span> = <span class="global">ipairs</span>, + mod = { + name = <span class="string">'baggins'</span>; + {name=<span class="string">'frodo'</span>}, + {name=<span class="string">'bilbo'</span>} + } +})) +</pre> + +<p><a name="File_style_I_O_on_Strings"></a></p> +<h3>File-style I/O on Strings</h3> + +<p><a href="../libraries/pl.stringio.html#">pl.stringio</a> provides just three functions; <a href="../libraries/pl.stringio.html#open">stringio.open</a> is passed a string, +and returns a file-like object for reading.
It supports a <code>read</code> method, which +takes the same arguments as standard file objects:</p> + + +<pre> +> f = stringio.open <span class="string">'first line\n10 20 30\n'</span> +> = f:read() +first line +> = f:read(<span class="string">'*n'</span>,<span class="string">'*n'</span>,<span class="string">'*n'</span>) +<span class="number">10</span> <span class="number">20</span> <span class="number">30</span> +</pre> + +<p><code>lines</code> and <code>seek</code> are also supported.</p> + +<p><code>stringio.lines</code> is a useful short-cut for iterating over all the lines in a +string.</p> + +<p><a href="../libraries/pl.stringio.html#create">stringio.create</a> creates a writeable file-like object. You then <code>write</code> to +this stream, and finally extract the built string using <code>value</code>. This 'string +builder' pattern is useful for efficiently creating large strings.</p> + + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/04-paths.md.html b/Data/Libraries/Penlight/docs/manual/04-paths.md.html new file mode 100644 index 0000000..070a3ea --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/04-paths.md.html @@ -0,0 +1,329 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation">
+<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Working_with_Paths">Working with Paths </a></li> +<li><a href="#File_Operations">File Operations </a></li> +<li><a href="#Directory_Operations">Directory Operations </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><a href="../manual/03-strings.md.html">Strings. Higher-level operations on strings.</a></li> + <li><strong>Paths and Directories</strong></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + 
<li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a 
href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Paths and Directories</h2> + +<p><a name="Working_with_Paths"></a></p> +<h3>Working with Paths</h3> + +<p>Programs should not depend on quirks of your operating system. They will be +harder to read, and need to be ported for other systems. The worst of course is +hardcoding paths like 'c:\' in programs, and wondering why Vista complains so +much. But even something like <code>dir..'\'..file</code> is a problem, since Unix can't +understand backslashes in this way. <code>dir..'/'..file</code> is <em>usually</em> portable, but +it's best to put this all into a simple function, <a href="../libraries/pl.path.html#join">path.join</a>. 
If you +consistently use <a href="../libraries/pl.path.html#join">path.join</a>, then it's much easier to write cross-platform code, +since it handles the directory separator for you.</p> + +<p><a href="../libraries/pl.path.html#">pl.path</a> provides the same functionality as Python's <code>os.path</code> module (11.1).</p> + + +<pre> +> p = <span class="string">'c:\\bonzo\\DOG.txt'</span> +> = path.normcase (p) <span class="comment">---> only makes sense on Windows +</span>c:\bonzo\dog.txt +> = path.splitext (p) +c:\bonzo\DOG .txt +> = path.extension (p) +.txt +> = path.basename (p) +DOG.txt +> = path.exists(p) +<span class="keyword">false</span> +> = path.join (<span class="string">'fred'</span>,<span class="string">'alice.txt'</span>) +fred\alice.txt +> = path.exists <span class="string">'pretty.lua'</span> +<span class="keyword">true</span> +> = path.getsize <span class="string">'pretty.lua'</span> +<span class="number">2125</span> +> = path.isfile <span class="string">'pretty.lua'</span> +<span class="keyword">true</span> +> = path.isdir <span class="string">'pretty.lua'</span> +<span class="keyword">false</span> +</pre> + +<p>It is very important for all programmers, not just on Unix, to only write to +where they are allowed to write. <a href="../libraries/pl.path.html#expanduser">path.expanduser</a> will expand '~' (tilde) into +the home directory. Depending on your OS, this will be a guaranteed place where +you can create files:</p> + + +<pre> +> = path.expanduser <span class="string">'~/mydata.txt'</span> +<span class="string">'C:\Documents and Settings\SJDonova/mydata.txt'</span> + +> = path.expanduser <span class="string">'~/mydata.txt'</span> +/home/sdonovan/mydata.txt +</pre> + +<p>Under Windows, <a href="https://www.lua.org/manual/5.1/manual.html#pdf-os.tmpname">os.tmpname</a> returns a path which leads to your drive root full of +temporary files. (And increasingly, you do not have access to this root folder.) 
+This is corrected by <a href="../libraries/pl.path.html#tmpname">path.tmpname</a>, which uses the environment variable TMP:</p> + + +<pre> +> <span class="global">os</span>.tmpname() <span class="comment">-- not a good place to put temporary files! +</span><span class="string">'\s25g.'</span> +> path.tmpname() +<span class="string">'C:\DOCUME~1\SJDonova\LOCALS~1\Temp\s25g.1'</span> +</pre> + +<p>A useful extra function is <a href="../libraries/pl.path.html#package_path">pl.path.package_path</a>, which will tell you the path +of a particular Lua module. So on my system, <code>package_path('pl.path')</code> returns +'C:\Program Files\Lua\5.1\lualibs\pl\path.lua', and <code>package_path('lfs')</code> returns +'C:\Program Files\Lua\5.1\clibs\lfs.dll'. It is implemented in terms of +<a href="https://www.lua.org/manual/5.1/manual.html#pdf-package.searchpath">package.searchpath</a>, a function new in Lua 5.2 that Penlight also +implements for Lua 5.1.</p> + +<p><a name="File_Operations"></a></p> +<h3>File Operations</h3> + +<p><a href="../libraries/pl.file.html#">pl.file</a> is a new module that provides more sensible names for common file +operations. For instance, <a href="https://www.lua.org/manual/5.1/manual.html#pdf-file:read">file.read</a> and <a href="https://www.lua.org/manual/5.1/manual.html#pdf-file:write">file.write</a> are aliases for +<a href="../libraries/pl.utils.html#readfile">utils.readfile</a> and <a href="../libraries/pl.utils.html#writefile">utils.writefile</a>.</p> + +<p>Smaller files can be efficiently read and written in one operation. <a href="https://www.lua.org/manual/5.1/manual.html#pdf-file:read">file.read</a> +is passed a filename and returns the contents as a string, if successful; if not, +then it returns <code>nil</code> and the actual error message.
There is an optional boolean +parameter if you want the file to be read in binary mode (this makes no +difference on Unix but remains important on Windows).</p> + +<p>In previous versions of Penlight, <a href="../libraries/pl.utils.html#readfile">utils.readfile</a> would read standard input if +the file was not specified, but this can lead to nasty bugs; use <code>io.read '*a'</code> +to grab all of standard input.</p> + +<p>Similarly, <a href="https://www.lua.org/manual/5.1/manual.html#pdf-file:write">file.write</a> takes a filename and a string which will be written to +that file.</p> + +<p>For example, this little script converts a file into upper case:</p> + + +<pre> +<span class="global">require</span> <span class="string">'pl'</span> +<span class="global">assert</span>(#arg == <span class="number">2</span>, <span class="string">'supply two filenames'</span>) +text = <span class="global">assert</span>(file.read(arg[<span class="number">1</span>])) +<span class="global">assert</span>(file.write(arg[<span class="number">2</span>],text:upper())) +</pre> + +<p>Copying files is surprisingly tricky. <a href="../libraries/pl.file.html#copy">file.copy</a> and <a href="../libraries/pl.file.html#move">file.move</a> attempt to use +the best implementation possible. On Windows, they link to the API functions +<code>CopyFile</code> and <code>MoveFile</code>, but only if the <code>alien</code> package is installed (this is +true for Lua for Windows). Otherwise, the system copy command is used. This can +be ugly when writing Windows GUI applications, because of the dreaded flashing +black-box problem with launching processes.</p> + +<p><a name="Directory_Operations"></a></p> +<h3>Directory Operations</h3> + +<p><a href="../libraries/pl.dir.html#">pl.dir</a> provides some useful functions for working with directories. 
<code>fnmatch</code> +will match a filename against a shell pattern, and <code>filter</code> will return any files +in the supplied list which match the given pattern; these correspond to the +functions in the Python <code>fnmatch</code> module. <code>getdirectories</code> will return all +directories contained in a directory, and <code>getfiles</code> will return all files in a +directory which match a shell pattern. (These functions return the files as a +table, unlike <a href="http://stevedonovan.github.io/lua-stdlibs/modules/lfs.html#dir">lfs.dir</a> which returns an iterator.)</p> + +<p><a href="../libraries/pl.dir.html#makepath">dir.makepath</a> can create a full path, creating subdirectories as necessary; +<code>rmtree</code> is the Nuclear Option of file deleting functions, since it will +recursively clear out and delete all directories found beginning at a path (there +is a similar function with this name in the Python <code>shutil</code> module).</p> + + +<pre> +> = dir.makepath <span class="string">'t\\temp\\bonzo'</span> +> = path.isdir <span class="string">'t\\temp\\bonzo'</span> +<span class="keyword">true</span> +> = dir.rmtree <span class="string">'t'</span> +</pre> + +<p><a href="../libraries/pl.dir.html#rmtree">dir.rmtree</a> depends on <a href="../libraries/pl.dir.html#walk">dir.walk</a>, which is a powerful tool for scanning a whole +directory tree. Here is the implementation of <a href="../libraries/pl.dir.html#rmtree">dir.rmtree</a>:</p> + + +<pre> +<span class="comment">--- remove a whole directory tree. 
+</span><span class="comment">-- @param fullpath A directory path +</span><span class="keyword">function</span> dir.rmtree(fullpath) + <span class="comment">-- walk bottom-up, so each directory is already empty when it is removed +</span> <span class="keyword">for</span> root,dirs,files <span class="keyword">in</span> dir.walk(fullpath,<span class="keyword">true</span>) <span class="keyword">do</span> + <span class="keyword">for</span> i,f <span class="keyword">in</span> <span class="global">ipairs</span>(files) <span class="keyword">do</span> + <span class="global">os</span>.remove(path.join(root,f)) + <span class="keyword">end</span> + lfs.rmdir(root) + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p><a href="../libraries/pl.dir.html#clonetree">dir.clonetree</a> clones directory trees. The first argument is a path that must +exist, and the second is the path where the clone is created. (Note that this path cannot +be <em>inside</em> the first path, since this leads to madness.) By default, it will +then just recreate the directory structure. 
You can in addition provide a +function, which will be applied for all files found.</p> + + +<pre> +<span class="comment">-- make a copy of my libs folder +</span><span class="global">require</span> <span class="string">'pl'</span> +p1 = <span class="string">[[d:\dev\lua\libs]]</span> +p2 = <span class="string">[[D:\dev\lua\libs\..\tests]]</span> +dir.clonetree(p1,p2,dir.copyfile) +</pre> + +<p>A more sophisticated version, which only copies files which have been modified:</p> + + +<pre> +<span class="comment">-- p1 and p2 as before, or from arg[1] and arg[2] +</span>dir.clonetree(p1,p2,<span class="keyword">function</span>(f1,f2) + <span class="keyword">local</span> res + <span class="keyword">local</span> t1,t2 = path.getmtime(f1),path.getmtime(f2) + <span class="comment">-- f2 might not exist, so be careful about t2 +</span> <span class="keyword">if</span> <span class="keyword">not</span> t2 <span class="keyword">or</span> t1 > t2 <span class="keyword">then</span> + res = dir.copyfile(f1,f2) + <span class="keyword">end</span> + <span 
class="keyword">return</span> res <span class="comment">-- indicates successful operation +</span><span class="keyword">end</span>) +</pre> + +<p><a href="../libraries/pl.dir.html#clonetree">dir.clonetree</a> uses <a href="../libraries/pl.path.html#common_prefix">path.common_prefix</a>. With <code>p1</code> and <code>p2</code> defined above, the +common path is 'd:\dev\lua'. So 'd:\dev\lua\libs\testfunc.lua' is copied to +'d:\dev\lua\tests\testfunc.lua', etc.</p> + +<p>If you need to find the common path of a list of files, then <a href="../libraries/pl.tablex.html#reduce">tablex.reduce</a> will +do the job:</p> + + +<pre> +> p3 = <span class="string">[[d:\dev]]</span> +> = tablex.reduce(path.common_prefix,{p1,p2,p3}) +<span class="string">'d:\dev'</span> +</pre> + + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/05-dates.md.html b/Data/Libraries/Penlight/docs/manual/05-dates.md.html new file mode 100644 index 0000000..c04b036 --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/05-dates.md.html @@ -0,0 +1,269 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a 
href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Creating_and_Displaying_Dates">Creating and Displaying Dates </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><a href="../manual/03-strings.md.html">Strings. Higher-level operations on strings.</a></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><strong>Date and Time</strong></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a 
href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + 
<li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Date and Time</h2> + +<p><a id="date"></a></p> + +<p>NOTE: the Date module is deprecated</p> + +<p><a name="Creating_and_Displaying_Dates"></a></p> +<h3>Creating and Displaying Dates</h3> + +<p>The <a href="../classes/pl.Date.html#">Date</a> class provides a simplified way to work with <a href="http://www.lua.org/pil/22.1.html">date and +time</a> in Lua; it leans heavily on the functions +<a href="https://www.lua.org/manual/5.1/manual.html#pdf-os.date">os.date</a> and <a href="https://www.lua.org/manual/5.1/manual.html#pdf-os.time">os.time</a>.</p> + +<p>A <a href="../classes/pl.Date.html#">Date</a> object can be constructed from a table, just like with <a href="https://www.lua.org/manual/5.1/manual.html#pdf-os.time">os.time</a>. 
+Methods are provided to get and set the various parts of the date.</p> + + +<pre> +> d = Date {year = <span class="number">2011</span>, month = <span class="number">3</span>, day = <span class="number">2</span> } +> = d +<span class="number">2011</span>-<span class="number">03</span>-<span class="number">02</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = d:month(),d:year(),d:day() +<span class="number">3</span> <span class="number">2011</span> <span class="number">2</span> +> d:month(<span class="number">4</span>) +> = d +<span class="number">2011</span>-<span class="number">04</span>-<span class="number">02</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> d:add {day=<span class="number">1</span>} +> = d +<span class="number">2011</span>-<span class="number">04</span>-<span class="number">03</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +</pre> + +<p><code>add</code> takes a table containing one of the date table fields.</p> + + +<pre> +> = d:weekday_name() +Sun +> = d:last_day() +<span class="number">2011</span>-<span class="number">04</span>-<span class="number">30</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = d:month_name(<span class="keyword">true</span>) +April +</pre> + +<p>There is a default conversion to text for date objects, but <a href="../classes/pl.Date.html#Date.Format">Date.Format</a> gives +you full control of the format for both parsing and displaying dates:</p> + + +<pre> +> iso = Date.Format <span class="string">'yyyy-mm-dd'</span> +> d = iso:parse <span class="string">'2010-04-10'</span> +> amer = Date.Format <span class="string">'mm/dd/yyyy'</span> +> = amer:<span class="global">tostring</span>(d) +<span class="number">04</span>/<span class="number">10</span>/<span class="number">2010</span> +</pre> + +<p>With the 0.9.7 
release, the <a href="../classes/pl.Date.html#">Date</a> constructor has become more flexible. You may +omit any of the 'year', 'month' or 'day' fields:</p> + + +<pre> +> = Date { year = <span class="number">2008</span> } +<span class="number">2008</span>-<span class="number">01</span>-<span class="number">01</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = Date { month = <span class="number">3</span> } +<span class="number">2011</span>-<span class="number">03</span>-<span class="number">01</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = Date { day = <span class="number">20</span> } +<span class="number">2011</span>-<span class="number">10</span>-<span class="number">20</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = Date { hour = <span class="number">14</span>, min = <span class="number">30</span> } +<span class="number">2011</span>-<span class="number">10</span>-<span class="number">13</span> <span class="number">14</span>:<span class="number">30</span>:<span class="number">00</span> +</pre> + +<p>If 'year' is omitted, then the current year is assumed, and likewise for 'month'.</p> + +<p>To set the time on such a partial date, you can use the fact that the 'setter' +methods return the date object and so you can 'chain' these methods.</p> + + +<pre> +> d = Date { day = <span class="number">03</span> } +> = d:hour(<span class="number">18</span>):min(<span class="number">30</span>) +<span class="number">2011</span>-<span class="number">10</span>-<span class="number">03</span> <span class="number">18</span>:<span class="number">30</span>:<span class="number">00</span> +</pre> + +<p>Finally, <a href="../classes/pl.Date.html#">Date</a> also now accepts positional arguments:</p> + + +<pre> +> = Date(<span class="number">2011</span>,<span class="number">10</span>,<span class="number">3</span>) +<span 
class="number">2011</span>-<span class="number">10</span>-<span class="number">03</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = Date(<span class="number">2011</span>,<span class="number">10</span>,<span class="number">3</span>,<span class="number">18</span>,<span class="number">30</span>,<span class="number">23</span>) +<span class="number">2011</span>-<span class="number">10</span>-<span class="number">03</span> <span class="number">18</span>:<span class="number">30</span>:<span class="number">23</span> +</pre> + +<p><code>Date.Format</code> has been extended. If you construct an instance without a pattern, +then it will try to match against a set of known formats. This is useful for +human-input dates since keeping to a strict format is not one of the strong +points of users. It assumes that there will be a date, and then a time.</p> + + +<pre> +> df = Date.Format() +> = df:parse <span class="string">'5.30pm'</span> +<span class="number">2011</span>-<span class="number">10</span>-<span class="number">13</span> <span class="number">17</span>:<span class="number">30</span>:<span class="number">00</span> +> = df:parse <span class="string">'1730'</span> +<span class="keyword">nil</span> day out of range: <span class="number">1730</span> is <span class="keyword">not</span> between <span class="number">1</span> <span class="keyword">and</span> <span class="number">31</span> +> = df:parse <span class="string">'17.30'</span> +<span class="number">2011</span>-<span class="number">10</span>-<span class="number">13</span> <span class="number">17</span>:<span class="number">30</span>:<span class="number">00</span> +> = df:parse <span class="string">'mar'</span> +<span class="number">2011</span>-<span class="number">03</span>-<span class="number">01</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = df:parse <span class="string">'3 March'</span> +<span 
class="number">2011</span>-<span class="number">03</span>-<span class="number">03</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = df:parse <span class="string">'15 March'</span> +<span class="number">2011</span>-<span class="number">03</span>-<span class="number">15</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = df:parse <span class="string">'15 March 2008'</span> +<span class="number">2008</span>-<span class="number">03</span>-<span class="number">15</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +> = df:parse <span class="string">'15 March 2008 1.30pm'</span> +<span class="number">2008</span>-<span class="number">03</span>-<span class="number">15</span> <span class="number">13</span>:<span class="number">30</span>:<span class="number">00</span> +> = df:parse <span class="string">'2008-10-03 15:30:23'</span> +<span class="number">2008</span>-<span class="number">10</span>-<span class="number">03</span> <span class="number">15</span>:<span class="number">30</span>:<span class="number">23</span> +</pre> + +<p>ISO date format is of course a good idea if you need to deal with users from +different countries. Here is the default behaviour for 'short' dates:</p> + + +<pre> +> = df:parse <span class="string">'24/02/12'</span> +<span class="number">2012</span>-<span class="number">02</span>-<span class="number">24</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +</pre> + +<p>That's not what Americans expect! 
It's tricky to work out in a cross-platform way +exactly what the expected format is, so there is an explicit flag:</p> + + +<pre> +> df:US_order(<span class="keyword">true</span>) +> = df:parse <span class="string">'9/11/01'</span> +<span class="number">2001</span>-<span class="number">11</span>-<span class="number">09</span> <span class="number">12</span>:<span class="number">00</span>:<span class="number">00</span> +</pre> + + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/06-data.md.html b/Data/Libraries/Penlight/docs/manual/06-data.md.html new file mode 100644 index 0000000..585e23e --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/06-data.md.html @@ -0,0 +1,1633 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Reading_Data_Files">Reading Data Files </a></li> +<li><a href="#Reading_Unstructured_Text_Data">Reading Unstructured Text Data </a></li> +<li><a href="#Reading_Columnar_Data">Reading Columnar Data </a></li> +<li><a href="#Reading_Configuration_Files">Reading 
Configuration Files </a></li> +<li><a href="#Lexical_Scanning">Lexical Scanning </a></li> +<li><a href="#XML">XML </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><a href="../manual/03-strings.md.html">Strings. Higher-level operations on strings.</a></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><strong>Data</strong></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a 
href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a 
href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Data</h2> + +<p><a name="Reading_Data_Files"></a></p> +<h3>Reading Data Files</h3> + +<p>The first thing to consider is this: do you actually need to write a custom file +reader? And if the answer is yes, the next question is: can you write the reader +in as clear a way as possible? Correctness, Robustness, and Speed; pick the first +two and the third can be sorted out later, <em>if necessary</em>.</p> + +<p>A common sort of data file is the configuration file format commonly used on Unix +systems. 
This format is often called a <em>property</em> file in the Java world.</p> + + +<pre> +# Read timeout <span class="keyword">in</span> seconds +read.timeout=<span class="number">10</span> + +# Write timeout <span class="keyword">in</span> seconds +write.timeout=<span class="number">10</span> +</pre> + +<p>Here is a simple Lua implementation:</p> + + +<pre> +<span class="comment">-- property file parsing with Lua string patterns +</span>props = {} +<span class="keyword">for</span> line <span class="keyword">in</span> <span class="global">io</span>.lines() <span class="keyword">do</span> + <span class="keyword">if</span> line:find(<span class="string">'#'</span>,<span class="number">1</span>,<span class="keyword">true</span>) ~= <span class="number">1</span> <span class="keyword">and</span> <span class="keyword">not</span> line:find(<span class="string">'^%s*$'</span>) <span class="keyword">then</span> + <span class="keyword">local</span> var,value = line:match(<span class="string">'([^=]+)=(.*)'</span>) + props[var] = value + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p>Very compact, but it suffers from the same disease as equivalent Perl programs; +it uses odd string patterns which are 'lexically noisy'. Noisy code like this +slows the casual reader down. 
(For an even more direct way of doing this, see the +later section, 'Reading Configuration Files'.)</p> + +<p>Another implementation, using the Penlight libraries:</p> + + +<pre> +<span class="comment">-- property file parsing with extended string functions +</span><span class="global">require</span> <span class="string">'pl'</span> +stringx.import() +props = {} +<span class="keyword">for</span> line <span class="keyword">in</span> <span class="global">io</span>.lines() <span class="keyword">do</span> + <span class="keyword">if</span> <span class="keyword">not</span> line:startswith(<span class="string">'#'</span>) <span class="keyword">and</span> <span class="keyword">not</span> line:isspace() <span class="keyword">then</span> + <span class="keyword">local</span> var,value = line:splitv(<span class="string">'='</span>) + props[var] = value + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p>This is more self-documenting; it is generally better to make the code express +the <em>intention</em>, rather than having to scatter comments everywhere - comments are +necessary, of course, but mostly to give the higher view of your intention that +cannot be expressed in code. It is slightly slower, true, but in practice the +speed of this script is determined by I/O, so further optimization is unnecessary.</p> + +<p><a name="Reading_Unstructured_Text_Data"></a></p> +<h3>Reading Unstructured Text Data</h3> + +<p>Text data is sometimes unstructured, for example a file containing words. The +<a href="../libraries/pl.input.html#">pl.input</a> module has a number of functions which make processing such files +easier. 
For example, a script to count the number of words in standard input +using <code>input.words</code>:</p> + + +<pre> +<span class="comment">-- countwords.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +<span class="keyword">local</span> k = <span class="number">0</span> +<span class="keyword">for</span> w <span class="keyword">in</span> input.words(<span class="global">io</span>.stdin) <span class="keyword">do</span> + k = k + <span class="number">1</span> +<span class="keyword">end</span> +<span class="global">print</span>(<span class="string">'count'</span>,k) +</pre> + +<p>Or this script to calculate the average of a set of numbers using <a href="../libraries/pl.input.html#numbers">input.numbers</a>:</p> + + +<pre> +<span class="comment">-- average.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +<span class="keyword">local</span> k = <span class="number">0</span> +<span class="keyword">local</span> sum = <span class="number">0</span> +<span class="keyword">for</span> n <span class="keyword">in</span> input.numbers(<span class="global">io</span>.stdin) <span class="keyword">do</span> + sum = sum + n + k = k + <span class="number">1</span> +<span class="keyword">end</span> +<span class="global">print</span>(<span class="string">'average'</span>,sum/k) +</pre> + +<p>These scripts can be improved further by <em>eliminating loops</em>. In the last case, +there is a perfectly good function <a href="../libraries/pl.seq.html#sum">seq.sum</a> which can already take a sequence of +numbers and calculate the total and the count for us:</p> + + +<pre> +<span class="comment">-- average2.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +<span class="keyword">local</span> total,n = seq.sum(input.numbers()) +<span class="global">print</span>(<span class="string">'average'</span>,total/n) +</pre> + +<p>A further simplification here is that if <code>numbers</code> or <code>words</code> are 
not passed an +argument, they will read from standard input. The first script can +be rewritten:</p> + + +<pre> +<span class="comment">-- countwords2.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +<span class="global">print</span>(<span class="string">'count'</span>,seq.count(input.words())) +</pre> + +<p>A useful feature of a sequence generator like <code>numbers</code> is that it can read from +a string source. Here is a script to calculate the sums of the numbers on each +line in a file:</p> + + +<pre> +<span class="comment">-- sums.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +<span class="keyword">for</span> line <span class="keyword">in</span> <span class="global">io</span>.lines() <span class="keyword">do</span> + <span class="global">print</span>(seq.sum(input.numbers(line))) +<span class="keyword">end</span> +</pre> + +<p><a name="Reading_Columnar_Data"></a></p> +<h3>Reading Columnar Data</h3> + +<p>It is very common to find data in columnar form, either space or comma-separated, +perhaps with an initial set of column headers. Here is a typical example:</p> + + +<pre> +EventID Magnitude LocationX LocationY LocationZ +<span class="number">981124001</span> <span class="number">2.0</span> <span class="number">18988.4</span> <span class="number">10047.1</span> <span class="number">4149.7</span> +<span class="number">981125001</span> <span class="number">0.8</span> <span class="number">19104.0</span> <span class="number">9970.4</span> <span class="number">5088.7</span> +<span class="number">981127003</span> <span class="number">0.5</span> <span class="number">19012.5</span> <span class="number">9946.9</span> <span class="number">3831.2</span> +... +</pre> + +<p><a href="../libraries/pl.input.html#fields">input.fields</a> is designed to extract several columns, given some delimiter +(defaulting to whitespace). 
Here is a script to calculate the average X location of +all the events:</p> + + +<pre> +<span class="comment">-- avg-x.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +<span class="global">io</span>.read() <span class="comment">-- skip the header line +</span><span class="keyword">local</span> sum,count = seq.sum(input.fields {<span class="number">3</span>}) +<span class="global">print</span>(sum/count) +</pre> + +<p><a href="../libraries/pl.input.html#fields">input.fields</a> is passed either a field count, or a list of column indices, +starting at one as usual. So in this case we're only interested in column 3. If +you pass it a field count, then you get every field up to that count:</p> + + +<pre> +<span class="keyword">for</span> id,mag,locX,locY,locZ <span class="keyword">in</span> input.fields (<span class="number">5</span>) <span class="keyword">do</span> +.... +<span class="keyword">end</span> +</pre> + +<p><a href="../libraries/pl.input.html#fields">input.fields</a> by default tries to convert each field to a number. It will skip +lines which clearly don't match the pattern, but will abort the script if there +are any fields which cannot be converted to numbers.</p> + +<p>The second parameter is a delimiter, by default spaces. ' ' is understood to mean +'any number of spaces', i.e. '%s+'. Any Lua string pattern can be used.</p> + +<p>The third parameter is a <em>data source</em>, by default standard input (defined by +<a href="../libraries/pl.input.html#create_getter">input.create_getter</a>.) It assumes that the data source has a <code>read</code> method which +brings in the next line, i.e. it is a 'file-like' object. 
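</p>

<p>To make 'file-like' concrete, here is a minimal sketch in plain Lua of an object with a <code>read</code> method that hands back one line per call. The names <code>line_source</code> and <code>src</code> are invented for illustration and are not part of Penlight; the assumption is that an object of this shape could stand in for a file as the data source.</p>

```lua
-- Hypothetical helper (not part of Penlight): wraps a list of strings
-- in an object whose read method returns the next line, then nil.
local function line_source(lines)
  local i = 0
  local src = {}
  function src:read()
    i = i + 1
    return lines[i]   -- nil once the list is exhausted
  end
  return src
end

local src = line_source {'10 20', '30 40'}
local first = src:read()
local second = src:read()
local third = src:read()
print(first, second, third)
```

<p>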
As a special case, a +string will be split into its lines:</p> + + +<pre> +> <span class="keyword">for</span> x,y <span class="keyword">in</span> input.fields(<span class="number">2</span>,<span class="string">' '</span>,<span class="string">'10 20\n30 40\n'</span>) <span class="keyword">do</span> <span class="global">print</span>(x,y) <span class="keyword">end</span> +<span class="number">10</span> <span class="number">20</span> +<span class="number">30</span> <span class="number">40</span> +</pre> + +<p>Note the default behaviour for bad fields, which is to show the offending line +number:</p> + + +<pre> +> <span class="keyword">for</span> x,y <span class="keyword">in</span> input.fields(<span class="number">2</span>,<span class="string">' '</span>,<span class="string">'10 20\n30 40x\n'</span>) <span class="keyword">do</span> <span class="global">print</span>(x,y) <span class="keyword">end</span> +<span class="number">10</span> <span class="number">20</span> +line <span class="number">2</span>: cannot convert <span class="string">'40x'</span> to number +</pre> + +<p>This behaviour of <a href="../libraries/pl.input.html#fields">input.fields</a> is appropriate for a script which you want to +fail immediately with an appropriate <em>user</em> error message if conversion fails. +The fourth optional parameter is an options table: <code>{no_fail=true}</code> means that +conversion is attempted but if it fails it just returns the string, rather as AWK +would operate. You are then responsible for checking the type of the returned +field. <code>{no_convert=true}</code> switches off conversion altogether and all fields are +returned as strings.</p> + + +<p>Sometimes it is useful to bring a whole dataset into memory, for operations such +as extracting columns. Penlight provides a flexible reader specifically for +reading this kind of data, using the <a href="../libraries/pl.data.html#">data</a> module. 
Given a file looking like this:</p> + + +<pre> +x,y +<span class="number">10</span>,<span class="number">20</span> +<span class="number">2</span>,<span class="number">5</span> +<span class="number">40</span>,<span class="number">50</span> +</pre> + +<p>Then <a href="../libraries/pl.data.html#read">data.read</a> will create a table like this, with each row represented by a +sublist:</p> + + +<pre> +> t = data.read <span class="string">'test.txt'</span> +> pretty.dump(t) +{{<span class="number">10</span>,<span class="number">20</span>},{<span class="number">2</span>,<span class="number">5</span>},{<span class="number">40</span>,<span class="number">50</span>},fieldnames={<span class="string">'x'</span>,<span class="string">'y'</span>},delim=<span class="string">','</span>} +</pre> + +<p>You can now analyze this returned table using the supplied methods. For instance, +the method <a href="../libraries/pl.data.html#Data.column_by_name">column_by_name</a> returns a table of all the values of that column.</p> + + +<pre> +<span class="comment">-- testdata.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +d = data.read(<span class="string">'fev.txt'</span>) +<span class="keyword">for</span> _,name <span class="keyword">in</span> <span class="global">ipairs</span>(d.fieldnames) <span class="keyword">do</span> + <span class="keyword">local</span> col = d:column_by_name(name) + <span class="keyword">if</span> <span class="global">type</span>(col[<span class="number">1</span>]) == <span class="string">'number'</span> <span class="keyword">then</span> + <span class="keyword">local</span> total,n = seq.sum(col) + utils.printf(<span class="string">"Average for %s is %f\n"</span>,name,total/n) + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p><a href="../libraries/pl.data.html#read">data.read</a> tries to be clever when given data; by default it expects a first +line of column names, unless any of them are numbers. 
It tries to deduce the +column delimiter by looking at the first line. Sometimes it guesses wrong; these +things can be specified explicitly. The optional second parameter is an options +table, which can override <code>delim</code> (a string pattern) and <code>fieldnames</code> (a list or +comma-separated string), and can specify <code>no_convert</code> (the default is to convert), <code>numfields</code> +(indices of columns known to be numbers, as a list) and <code>thousands_dot</code> (when the +thousands separator in Excel CSV is '.').</p> + +<p>A very powerful feature is a way to execute SQL-like queries on such data:</p> + + +<pre> +<span class="comment">-- queries on tabular data +</span><span class="global">require</span> <span class="string">'pl'</span> +<span class="keyword">local</span> d = data.read(<span class="string">'xyz.txt'</span>) +<span class="keyword">local</span> q = d:<span class="global">select</span>(<span class="string">'x,y,z where x > 3 and z < 2 sort by y'</span>) +<span class="keyword">for</span> x,y,z <span class="keyword">in</span> q <span class="keyword">do</span> + <span class="global">print</span>(x,y,z) +<span class="keyword">end</span> +</pre> + +<p>Please note that the format of queries is restricted to the following syntax:</p> + + +<pre> +FIELDLIST [ <span class="string">'where'</span> CONDITION ] [ <span class="string">'sort by'</span> FIELD [asc|desc]] +</pre> + +<p>Any valid Lua code can appear in <code>CONDITION</code>; remember it is <em>not</em> SQL and you +have to use <code>==</code> (this warning comes from experience.)</p> + +<p>For this to work, <em>field names must be Lua identifiers</em>. So <a href="../libraries/pl.data.html#read">read</a> will massage +fieldnames so that all non-alphanumeric chars are replaced with underscores. 
+However, the <code>original_fieldnames</code> field always contains the original un-massaged +fieldnames.</p> + +<p><a href="../libraries/pl.data.html#read">read</a> can handle standard CSV files fine, although it doesn't try to be a +full-blown CSV parser. With the <code>csv=true</code> option, it's possible to have +double-quoted fields, which may contain commas; then trailing commas become +significant as well.</p> + +<p>Spreadsheet programs are not always the best tool to +process such data, strange as this might seem to some people. This is a toy CSV +file; to appreciate the problem, imagine thousands of rows and dozens of columns +like this:</p> + + +<pre> +Department Name,Employee ID,Project,Hours Booked +sales,<span class="number">1231</span>,overhead,<span class="number">4</span> +sales,<span class="number">1255</span>,overhead,<span class="number">3</span> +engineering,<span class="number">1501</span>,development,<span class="number">5</span> +engineering,<span class="number">1501</span>,maintenance,<span class="number">3</span> +engineering,<span class="number">1433</span>,maintenance,<span class="number">10</span> +</pre> + +<p>The task is to reduce the dataset to a relevant set of rows and columns, perhaps +do some processing on row data, and write the result out to a new CSV file. 
The +<a href="../libraries/pl.data.html#Data.write_row">write_row</a> method uses the delimiter to write the row to a file; +<code>Data.select_row</code> is like <code>Data.select</code>, except it iterates over <em>rows</em>, not +fields; this is necessary if we are dealing with a lot of columns!</p> + + +<pre> +names = {[<span class="number">1501</span>]=<span class="string">'don'</span>,[<span class="number">1433</span>]=<span class="string">'dilbert'</span>} +keepcols = {<span class="string">'Employee_ID'</span>,<span class="string">'Hours_Booked'</span>} +t:write_row (outf,{<span class="string">'Employee'</span>,<span class="string">'Hours_Booked'</span>}) +q = t:select_row { + fields=keepcols, + where=<span class="keyword">function</span>(row) <span class="keyword">return</span> row[<span class="number">1</span>]==<span class="string">'engineering'</span> <span class="keyword">end</span> +} +<span class="keyword">for</span> row <span class="keyword">in</span> q <span class="keyword">do</span> + row[<span class="number">1</span>] = names[row[<span class="number">1</span>]] + t:write_row(outf,row) +<span class="keyword">end</span> +</pre> + +<p><code>Data.select_row</code> and <code>Data.select</code> can be passed a table specifying the query; a +list of field names, a function defining the condition and an optional parameter +<code>sort_by</code>. It isn't really necessary here, but if we had a more complicated row +condition (such as belonging to a specified set) then it is not generally +possible to express such a condition as a query string, without resorting to +hackery such as global variables.</p> + +<p>With 1.0.3, you can specify explicit conversion functions for selected columns. 
+For instance, this is a log file with a Unix date stamp:</p> + + +<pre> +Time Message +<span class="number">1266840760</span> +# EE7C0600006F0D00C00F06010302054000000308010A00002B00407B00 +<span class="number">1266840760</span> closure data <span class="number">0.000000</span> <span class="number">1972</span> <span class="number">1972</span> <span class="number">0</span> +<span class="number">1266840760</span> ++ <span class="number">1266840760</span> EE <span class="number">1</span> +<span class="number">1266840760</span> +# EE7C0600006F0D00C00F06010302054000000408020A00002B00407B00 +<span class="number">1266840764</span> closure data <span class="number">0.000000</span> <span class="number">1972</span> <span class="number">1972</span> <span class="number">0</span> +</pre> + +<p>We would like the first column as an actual date object, so the <code>convert</code> +field sets an explicit conversion for column 1. (Note that we have to explicitly +convert the string to a number first.)</p> + + +<pre> +Date = <span class="global">require</span> <span class="string">'pl.Date'</span> + +<span class="keyword">function</span> date_convert (ds) + <span class="keyword">return</span> Date(<span class="global">tonumber</span>(ds)) +<span class="keyword">end</span> + +d = data.read(f,{convert={[<span class="number">1</span>]=date_convert},last_field_collect=<span class="keyword">true</span>}) +</pre> + +<p>This gives us a two-column dataset, where the first column contains <a href="../classes/pl.Date.html#">Date</a> objects +and the second column contains the rest of the line. Queries can then easily +pick out events on a day of the week:</p> + + +<pre> +q = d:<span class="global">select</span> <span class="string">"Time,Message where Time:weekday_name()=='Sun'"</span> +</pre> + +<p>Data does not have to come from files, nor does it necessarily come from the lab +or the accounts department. 
On Linux, <code>ps aux</code> gives you a full listing of all +processes running on your machine. It is straightforward to feed the output of +this command into <a href="../libraries/pl.data.html#read">data.read</a> and perform useful queries on it. Notice that +non-identifier characters like '%' get converted into underscores:</p> + + +<pre> +<span class="global">require</span> <span class="string">'pl'</span> +f = <span class="global">io</span>.popen <span class="string">'ps aux'</span> +s = data.read (f,{last_field_collect=<span class="keyword">true</span>}) +f:close() +<span class="global">print</span>(s.fieldnames) +<span class="global">print</span>(s:column_by_name <span class="string">'USER'</span>) +qs = <span class="string">'COMMAND,_MEM where _MEM > 5 and USER=="steve"'</span> +<span class="keyword">for</span> name,mem <span class="keyword">in</span> s:<span class="global">select</span>(qs) <span class="keyword">do</span> + <span class="global">print</span>(mem,name) +<span class="keyword">end</span> +</pre> + +<p>I've always been an admirer of the AWK programming language; with <a href="../libraries/pl.data.html#filter">filter</a> you +can get Lua programs which are just as compact:</p> + + +<pre> +<span class="comment">-- printxy.lua +</span><span class="global">require</span> <span class="string">'pl'</span> +data.filter <span class="string">'x,y where x > 3'</span> +</pre> + +<p>It is common enough to have data files without headers of field names. +<a href="../libraries/pl.data.html#read">data.read</a> makes a special exception for such files if all fields are numeric. +Since there are no column names to use in query expressions, you can use AWK-like +column indexes, e.g. '$1,$2 where $1 > 3'. 
I have a little executable script on +my system called <code>lf</code> which looks like this:</p> + + +<pre> +#!/usr/bin/env lua +<span class="global">require</span> <span class="string">'pl.data'</span>.filter(arg[<span class="number">1</span>]) +</pre> + +<p>And it can be used generally as a filter command to extract columns from data. +(The column specifications may be expressions or even constants.)</p> + + +<pre> +$ lf <span class="string">'$1,$5/10'</span> < test.dat +</pre> + +<p>(As with AWK, please note the single-quotes used in this command; this prevents +the shell trying to expand the column indexes. If you are on Windows, then you +must quote the expression in double-quotes so +it is passed as one argument to your batch file.)</p> + +<p>As a tutorial resource, have a look at <a href="../examples/test-data.lua.html#">test-data.lua</a> in the PL tests directory +for other examples of use, plus comments.</p> + +<p>The data returned by <a href="../libraries/pl.data.html#read">read</a> or constructed by <code>Data.copy_select</code> from a query is +basically just an array of rows: <code>{{1,2},{3,4}}</code>. So you may use <a href="../libraries/pl.data.html#read">read</a> to pull +in any array-like dataset, and process it with any function that expects such a +structure. In particular, the functions in <a href="../libraries/pl.array2d.html#">array2d</a> will work fine with +this data. In fact, these functions are available as methods; e.g. 
+<a href="../libraries/pl.array2d.html#flatten">array2d.flatten</a> can be called directly like so to give us a one-dimensional list:</p> + + +<pre> +v = data.read(<span class="string">'dat.txt'</span>):flatten() +</pre> + +<p>The data is also in exactly the right shape to be treated as matrices by +<a href="http://lua-users.org/wiki/LuaMatrix">LuaMatrix</a>:</p> + + +<pre> +> matrix = <span class="global">require</span> <span class="string">'matrix'</span> +> m = matrix(data.read <span class="string">'mat.txt'</span>) +> = m +<span class="number">1</span> <span class="number">0.2</span> <span class="number">0.3</span> +<span class="number">0.2</span> <span class="number">1</span> <span class="number">0.1</span> +<span class="number">0.1</span> <span class="number">0.2</span> <span class="number">1</span> +> = m^<span class="number">2</span> <span class="comment">-- same as m*m +</span><span class="number">1.07</span> <span class="number">0.46</span> <span class="number">0.62</span> +<span class="number">0.41</span> <span class="number">1.06</span> <span class="number">0.26</span> +<span class="number">0.24</span> <span class="number">0.42</span> <span class="number">1.05</span> +</pre> + +<p><a href="../libraries/pl.data.html#write">write</a> will write matrices back to files for you.</p> + +<p>Finally, for the curious, the global variable <code>_DEBUG</code> can be used to print out +the actual iterator function which a query generates and dynamically compiles. 
By +using code generation, we can get pretty much optimal performance out of +arbitrary queries.</p> + + +<pre> +> lua -lpl -e <span class="string">"_DEBUG=true"</span> -e <span class="string">"data.filter 'x,y where x > 4 sort by x'"</span> < test.txt +<span class="keyword">return</span> <span class="keyword">function</span> (t) + <span class="keyword">local</span> i = <span class="number">0</span> + <span class="keyword">local</span> v + <span class="keyword">local</span> ls = {} + <span class="keyword">for</span> i,v <span class="keyword">in</span> <span class="global">ipairs</span>(t) <span class="keyword">do</span> + <span class="keyword">if</span> v[<span class="number">1</span>] > <span class="number">4</span> <span class="keyword">then</span> + ls[#ls+<span class="number">1</span>] = v + <span class="keyword">end</span> + <span class="keyword">end</span> + <span class="global">table</span>.sort(ls,<span class="keyword">function</span>(v1,v2) + <span class="keyword">return</span> v1[<span class="number">1</span>] < v2[<span class="number">1</span>] + <span class="keyword">end</span>) + <span class="keyword">local</span> n = #ls + <span class="keyword">return</span> <span class="keyword">function</span>() + i = i + <span class="number">1</span> + v = ls[i] + <span class="keyword">if</span> i > n <span class="keyword">then</span> <span class="keyword">return</span> <span class="keyword">end</span> + <span class="keyword">return</span> v[<span class="number">1</span>],v[<span class="number">2</span>] + <span class="keyword">end</span> +<span class="keyword">end</span> + +<span class="number">10</span>,<span class="number">20</span> +<span class="number">40</span>,<span class="number">50</span> +</pre> + +<p><a name="Reading_Configuration_Files"></a></p> +<h3>Reading Configuration Files</h3> + +<p>The <a href="../libraries/pl.config.html#">config</a> module provides a simple way to convert several kinds of +configuration files into a Lua table. 
Consider the simple example:</p> + + +<pre> +# test.config +# Read timeout <span class="keyword">in</span> seconds +read.timeout=<span class="number">10</span> + +# Write timeout <span class="keyword">in</span> seconds +write.timeout=<span class="number">5</span> + +#acceptable ports +ports = <span class="number">1002</span>,<span class="number">1003</span>,<span class="number">1004</span> +</pre> + +<p>This can be easily brought in using <a href="../libraries/pl.config.html#read">config.read</a> and the result shown using +<a href="../libraries/pl.pretty.html#write">pretty.write</a>:</p> + + +<pre> +<span class="comment">-- readconfig.lua +</span><span class="keyword">local</span> config = <span class="global">require</span> <span class="string">'pl.config'</span> +<span class="keyword">local</span> pretty= <span class="global">require</span> <span class="string">'pl.pretty'</span> + +<span class="keyword">local</span> t = config.read(arg[<span class="number">1</span>]) +<span class="global">print</span>(pretty.write(t)) +</pre> + +<p>and the output of <code>lua readconfig.lua test.config</code> is:</p> + + +<pre> +{ + ports = { + <span class="number">1002</span>, + <span class="number">1003</span>, + <span class="number">1004</span> + }, + write_timeout = <span class="number">5</span>, + read_timeout = <span class="number">10</span> +} +</pre> + +<p>That is, <a href="../libraries/pl.config.html#read">config.read</a> will bring in all key/value pairs, ignore # comments, and +ensure that the key names are proper Lua identifiers by replacing non-identifier +characters with '_'. If the values are numbers, then they will be converted. (So +the value of <code>t.write_timeout</code> is the number 5). In addition, any values which +are separated by commas will be converted likewise into an array.</p> + +<p>Any line can be continued with a backslash. 
So this will all be considered one +line:</p> + + +<pre> +names=one,two,three, \ +four,five,six,seven, \ +eight,nine,ten +</pre> + +<p>Windows-style INI files are also supported. The section structure of INI files +translates naturally to nested tables in Lua:</p> + + +<pre> +; test.ini +[timeouts] +read=<span class="number">10</span> ; Read timeout <span class="keyword">in</span> seconds +write=<span class="number">5</span> ; Write timeout <span class="keyword">in</span> seconds +[portinfo] +ports = <span class="number">1002</span>,<span class="number">1003</span>,<span class="number">1004</span> +</pre> + +<p> The output is:</p> + + +<pre> +{ + portinfo = { + ports = { + <span class="number">1002</span>, + <span class="number">1003</span>, + <span class="number">1004</span> + } + }, + timeouts = { + write = <span class="number">5</span>, + read = <span class="number">10</span> + } +} +</pre> + +<p>You can now refer to the write timeout as <code>t.timeouts.write</code>.</p> + +<p>As a final example of the flexibility of <a href="../libraries/pl.config.html#read">config.read</a>, if passed this simple +comma-delimited file</p> + + +<pre> +one,two,three +<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span> +<span class="number">40</span>,<span class="number">50</span>,<span class="number">60</span> +<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span> +</pre> + +<p>it will produce the following table:</p> + + +<pre> +{ + { <span class="string">"one"</span>, <span class="string">"two"</span>, <span class="string">"three"</span> }, + { <span class="number">10</span>, <span class="number">20</span>, <span class="number">30</span> }, + { <span class="number">40</span>, <span class="number">50</span>, <span class="number">60</span> }, + { <span class="number">1</span>, <span class="number">2</span>, <span class="number">3</span> } +} +</pre> + +<p><a 
href="../libraries/pl.config.html#read">config.read</a> isn't designed to read all CSV files in general, but is intended to +support some Unix configuration files not structured as key-value pairs, such as +'/etc/passwd'.</p> + +<p>This function is intended to be a Swiss Army Knife of configuration readers, but +it does have to make assumptions, and you may not like them. So there is an +optional extra parameter which allows some control: a table that may have +the following fields:</p> + + +<pre> +{ + variablilize = <span class="keyword">true</span>, + convert_numbers = <span class="global">tonumber</span>, + trim_space = <span class="keyword">true</span>, + list_delim = <span class="string">','</span>, + trim_quotes = <span class="keyword">true</span>, + ignore_assign = <span class="keyword">false</span>, + keysep = <span class="string">'='</span>, + smart = <span class="keyword">false</span>, +} +</pre> + +<p><code>variablilize</code> is the option that converted <code>write.timeout</code> in the first example +to the valid Lua identifier <code>write_timeout</code>. If <code>convert_numbers</code> is true, then +an attempt is made to convert any string that starts like a number. You can also +specify your own function (say one that will convert a string like '5224 kb' into +a number.)</p> + +<p><code>trim_space</code> ensures that there is no leading or trailing whitespace around +values, and <code>list_delim</code> is the character that will be used to decide whether to +split a value up into a list (it may be a Lua string pattern such as '%s+'.)</p> + +<p>For instance, the password file in Unix is colon-delimited:</p> + + +<pre> +t = config.read(<span class="string">'/etc/passwd'</span>,{list_delim=<span class="string">':'</span>}) +</pre> + +<p>This produces the following output on my system (only last two lines shown):</p> + + +<pre> +{ + ... 
+ { + <span class="string">"user"</span>, + <span class="string">"x"</span>, + <span class="string">"1000"</span>, + <span class="string">"1000"</span>, + <span class="string">"user,,,"</span>, + <span class="string">"/home/user"</span>, + <span class="string">"/bin/bash"</span> + }, + { + <span class="string">"sdonovan"</span>, + <span class="string">"x"</span>, + <span class="string">"1001"</span>, + <span class="string">"1001"</span>, + <span class="string">"steve donovan,28,,"</span>, + <span class="string">"/home/sdonovan"</span>, + <span class="string">"/bin/bash"</span> + } +} +</pre> + +<p>You can get this into a more sensible format, where the usernames are the keys, +with this (the <a href="../libraries/pl.tablex.html#pairmap">tablex.pairmap</a> function must return value, key!)</p> + + +<pre> +t = tablex.pairmap(<span class="keyword">function</span>(k,v) <span class="keyword">return</span> v,v[<span class="number">1</span>] <span class="keyword">end</span>,t) +</pre> + +<p>and you get:</p> + + +<pre> +{ ... + sdonovan = { + <span class="string">"sdonovan"</span>, + <span class="string">"x"</span>, + <span class="string">"1001"</span>, + <span class="string">"1001"</span>, + <span class="string">"steve donovan,28,,"</span>, + <span class="string">"/home/sdonovan"</span>, + <span class="string">"/bin/bash"</span> + } +... +} +</pre> + +<p>Many common Unix configuration files can be read by tweaking these parameters. +For <code>/etc/fstab</code>, the options <code>{list_delim='%s+',ignore_assign=true}</code> will +correctly separate the columns. 
It's common to find 'KEY VALUE' assignments in +files such as <code>/etc/ssh/ssh_config</code>; the options <code>{keysep=' '}</code> make +<a href="../libraries/pl.config.html#read">config.read</a> return a table where each KEY has a value VALUE.</p> + +<p>Files in the Linux <code>procfs</code> usually use ':' as the field delimiter:</p> + + +<pre> +> t = config.read(<span class="string">'/proc/meminfo'</span>,{keysep=<span class="string">':'</span>}) +> = t.MemFree +<span class="number">220140</span> kB +</pre> + +<p>That result is a string, since <a href="https://www.lua.org/manual/5.1/manual.html#pdf-tonumber">tonumber</a> cannot convert it, but defining the +<code>convert_numbers</code> option as <code>function(s) return tonumber((s:gsub(' kB$',''))) +end</code> will get the memory figures as actual numbers in the result. (The extra +parentheses are necessary so that <a href="https://www.lua.org/manual/5.1/manual.html#pdf-tonumber">tonumber</a> only gets the first result from +<code>gsub</code>). 
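</p>

<p>The effect of the extra parentheses can be checked in plain Lua, without Penlight. In this sketch the sample string and the variable names are made up for illustration; <code>gsub</code> returns the new string followed by the number of substitutions, and a parenthesised call keeps only the first value.</p>

```lua
local s = '220140 kB'   -- sample value, chosen for illustration

-- gsub returns two values: the new string and the substitution count
local stripped, count = s:gsub(' kB$', '')
print(stripped, count)

-- the extra parentheses discard the count, so tonumber sees one argument
local n = tonumber((s:gsub(' kB$', '')))
print(n, type(n))

-- without them the count would be passed to tonumber as its base argument;
-- pcall is used here because that call is expected to fail
local ok = pcall(function() return tonumber(s:gsub(' kB$', '')) end)
print(ok)
```

<p>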
From <code>tests/test-config.lua</code>:</p> + + +<pre> +testconfig(<span class="string">[[ +MemTotal: 1024748 kB +MemFree: 220292 kB +]]</span>, +{ MemTotal = <span class="number">1024748</span>, MemFree = <span class="number">220292</span> }, +{ + keysep = <span class="string">':'</span>, + convert_numbers = <span class="keyword">function</span>(s) + s = s:gsub(<span class="string">' kB$'</span>,<span class="string">''</span>) + <span class="keyword">return</span> <span class="global">tonumber</span>(s) + <span class="keyword">end</span> + } +) +</pre> + +<p>The <code>smart</code> option lets <a href="../libraries/pl.config.html#read">config.read</a> make a reasonable guess for you; there +are examples in <code>tests/test-config.lua</code>, but basically these common file +formats (and those following the same pattern) can be processed directly in +smart mode: '/etc/fstab', '/proc/XXXX/status', 'ssh_config' and 'updatedb.conf'.</p> + +<p>Please note that <a href="../libraries/pl.config.html#read">config.read</a> can be passed a <em>file-like object</em>; if it's not a +string and supports the <a href="../libraries/pl.data.html#read">read</a> method, then that will be used. For instance, to +read a configuration from a string, use <a href="../libraries/pl.stringio.html#open">stringio.open</a>.</p> + + +<p><a id="lexer"/></p> + +<p><a name="Lexical_Scanning"></a></p> +<h3>Lexical Scanning</h3> + +<p>Although Lua's string pattern matching is very powerful, there are times when +something more powerful is needed. 
<a href="../libraries/pl.lexer.html#scan">pl.lexer.scan</a> provides lexical scanners +which <em>tokenize</em> a string, classifying tokens into numbers, strings, etc.</p> + + +<pre> +> lua -lpl +Lua <span class="number">5.1</span>.<span class="number">4</span> Copyright (C) <span class="number">1994</span>-<span class="number">2008</span> Lua.org, PUC-Rio +> tok = lexer.scan <span class="string">'alpha = sin(1.5)'</span> +> = tok() +iden alpha +> = tok() += = +> = tok() +iden sin +> = tok() +( ( +> = tok() +number <span class="number">1.5</span> +> = tok() +) ) +> = tok() +(<span class="keyword">nil</span>) +</pre> + +<p>The scanner is a function, which is repeatedly called and returns the <em>type</em> and +<em>value</em> of the token. Recognized basic types are 'iden', 'string', 'number', and +'space'; everything else is represented by itself. Note that by default the +scanner will skip any 'space' tokens.</p> + +<p>'comment' and 'keyword' aren't applicable to the plain scanner, which is not +language-specific, but a scanner which understands Lua is available. It +recognizes the Lua keywords, and understands both short and long comments and +strings.</p> + + +<pre> +> <span class="keyword">for</span> t,v <span class="keyword">in</span> lexer.lua <span class="string">'for i=1,n do'</span> <span class="keyword">do</span> <span class="global">print</span>(t,v) <span class="keyword">end</span> +keyword <span class="keyword">for</span> +iden i += = +number <span class="number">1</span> +, , +iden n +keyword <span class="keyword">do</span> +</pre> + +<p>A lexical scanner is useful where you have highly-structured data which is not +nicely delimited by newlines. 
For example, here is a snippet of an in-house file +format which it was my task to maintain:</p> + + +<pre> +points + (<span class="number">818344.1</span>,-<span class="number">20389.7</span>,-<span class="number">0.1</span>),(<span class="number">818337.9</span>,-<span class="number">20389.3</span>,-<span class="number">0.1</span>),(<span class="number">818332.5</span>,-<span class="number">20387.8</span>,-<span class="number">0.1</span>) + ,(<span class="number">818327.4</span>,-<span class="number">20388</span>,-<span class="number">0.1</span>),(<span class="number">818322</span>,-<span class="number">20387.7</span>,-<span class="number">0.1</span>),(<span class="number">818316.3</span>,-<span class="number">20388.6</span>,-<span class="number">0.1</span>) + ,(<span class="number">818309.7</span>,-<span class="number">20389.4</span>,-<span class="number">0.1</span>),(<span class="number">818303.5</span>,-<span class="number">20390.6</span>,-<span class="number">0.1</span>),(<span class="number">818295.8</span>,-<span class="number">20388.3</span>,-<span class="number">0.1</span>) + ,(<span class="number">818290.5</span>,-<span class="number">20386.9</span>,-<span class="number">0.1</span>),(<span class="number">818285.2</span>,-<span class="number">20386.1</span>,-<span class="number">0.1</span>),(<span class="number">818279.3</span>,-<span class="number">20383.6</span>,-<span class="number">0.1</span>) + ,(<span class="number">818274</span>,-<span class="number">20381.2</span>,-<span class="number">0.1</span>),(<span class="number">818274</span>,-<span class="number">20380.7</span>,-<span class="number">0.1</span>); +</pre> + +<p>Here is code to extract the points using <a href="../libraries/pl.lexer.html#">pl.lexer</a>:</p> + + +<pre> +<span class="comment">-- assume 's' contains the text above... 
+</span><span class="keyword">local</span> lexer = <span class="global">require</span> <span class="string">'pl.lexer'</span> +<span class="keyword">local</span> expecting = lexer.expecting +<span class="keyword">local</span> append = <span class="global">table</span>.insert + +<span class="keyword">local</span> tok = lexer.scan(s) + +<span class="keyword">local</span> points = {} +<span class="keyword">local</span> t,v = tok() <span class="comment">-- should be 'iden','points' +</span> +<span class="keyword">while</span> t ~= <span class="string">';'</span> <span class="keyword">do</span> + <span class="keyword">local</span> c = {} + expecting(tok,<span class="string">'('</span>) + c.x = expecting(tok,<span class="string">'number'</span>) + expecting(tok,<span class="string">','</span>) + c.y = expecting(tok,<span class="string">'number'</span>) + expecting(tok,<span class="string">','</span>) + c.z = expecting(tok,<span class="string">'number'</span>) + expecting(tok,<span class="string">')'</span>) + t,v = tok() <span class="comment">-- either ',' or ';' +</span> append(points,c) +<span class="keyword">end</span> +</pre> + +<p>The <code>expecting</code> function grabs the next token and if the type doesn't match, it +throws an error. (<a href="../libraries/pl.lexer.html#">pl.lexer</a>, unlike other PL libraries, raises errors if +something goes wrong, so you should wrap your code in <a href="https://www.lua.org/manual/5.1/manual.html#pdf-pcall">pcall</a> to catch the error +gracefully.)</p> + +<p>The scanners all have a second optional argument, which is a table which controls +whether you want to exclude spaces and/or comments. The default for <a href="../libraries/pl.lexer.html#lua">lexer.lua</a> +is <code>{space=true,comments=true}</code>. There is a third optional argument which +determines how string and number tokens are to be processed.</p> + +<p>The ultimate highly-structured data is, of course, program source. 
Here is a +snippet from 'test-lexer.lua':</p> + + +<pre> +<span class="global">require</span> <span class="string">'pl'</span> + +lines = <span class="string">[[ +for k,v in pairs(t) do + if type(k) == 'number' then + print(v) -- array-like case + else + print(k,v) + end +end +]]</span> + +ls = List() +<span class="keyword">for</span> tp,val <span class="keyword">in</span> lexer.lua(lines,{space=<span class="keyword">true</span>,comments=<span class="keyword">true</span>}) <span class="keyword">do</span> + <span class="global">assert</span>(tp ~= <span class="string">'space'</span> <span class="keyword">and</span> tp ~= <span class="string">'comment'</span>) + <span class="keyword">if</span> tp == <span class="string">'keyword'</span> <span class="keyword">then</span> ls:append(val) <span class="keyword">end</span> +<span class="keyword">end</span> +test.asserteq(ls,List{<span class="string">'for'</span>,<span class="string">'in'</span>,<span class="string">'do'</span>,<span class="string">'if'</span>,<span class="string">'then'</span>,<span class="string">'else'</span>,<span class="string">'end'</span>,<span class="string">'end'</span>}) +</pre> + +<p>Here is a useful little utility that identifies all common global variables found +in a Lua module (ignoring those declared locally for the moment):</p> + + +<pre> +<span class="comment">-- testglobal.lua +</span><span class="global">require</span> <span class="string">'pl'</span> + +<span class="keyword">local</span> txt,err = utils.readfile(arg[<span class="number">1</span>]) +<span class="keyword">if</span> <span class="keyword">not</span> txt <span class="keyword">then</span> <span class="keyword">return</span> <span class="global">print</span>(err) <span class="keyword">end</span> + +<span class="keyword">local</span> globals = List() +<span class="keyword">for</span> t,v <span class="keyword">in</span> lexer.lua(txt) <span class="keyword">do</span> + <span class="keyword">if</span> t == <span 
class="string">'iden'</span> <span class="keyword">and</span> _G[v] <span class="keyword">then</span> + globals:append(v) + <span class="keyword">end</span> +<span class="keyword">end</span> +pretty.dump(seq.count_map(globals)) +</pre> + +<p>Rather than dumping the whole list, with its duplicates, we pass it through +<a href="../libraries/pl.seq.html#count_map">seq.count_map</a>, which turns the list into a table where the keys are the list's values, +and the associated values are the number of times those values occur in the +sequence. Typical output looks like this:</p> + + +<pre> +{ + <span class="global">type</span> = <span class="number">2</span>, + <span class="global">pairs</span> = <span class="number">2</span>, + <span class="global">table</span> = <span class="number">2</span>, + <span class="global">print</span> = <span class="number">3</span>, + <span class="global">tostring</span> = <span class="number">2</span>, + <span class="global">require</span> = <span class="number">1</span>, + <span class="global">ipairs</span> = <span class="number">4</span> +} +</pre> + +<p>You could further pass this through <a href="../libraries/pl.tablex.html#keys">tablex.keys</a> to get a unique list of +symbols. This can be useful when writing 'strict' Lua modules, where all global +symbols must be defined as locals at the top of the file.</p> + +<p>For a more detailed use of <a href="../libraries/pl.lexer.html#scan">lexer.scan</a>, please look at <a href="../examples/testxml.lua.html#">testxml.lua</a> in the +examples directory.</p> + +<p><a name="XML"></a></p> +<h3>XML</h3> + +<p>New in the 0.9.7 release is some support for XML. This is a large topic, and +Penlight does not provide a full XML stack, which is properly the task of a more +specialized library.</p> + +<h4>Parsing and Pretty-Printing</h4> + +<p>The semi-standard XML parser in the Lua universe is <a href="http://matthewwild.co.uk/projects/luaexpat/">lua-expat</a>. 
+In particular, +it has a function called <code>lxp.lom.parse</code> which will parse XML into the Lua Object +Model (LOM) format. However, it does not provide a way to convert this data back +into XML text. <a href="../libraries/pl.xml.html#parse">xml.parse</a> will use this function, <em>if</em> <code>lua-expat</code> is +available, and otherwise falls back to a pure Lua parser originally written by +Roberto Ierusalimschy.</p> + +<p>The resulting document object knows how to render itself as a string, which is +useful for debugging:</p> + + +<pre> +> d = xml.parse <span class="string">"<nodes><node id='1'>alice</node></nodes>"</span> +> = d +<nodes><node id=<span class="string">'1'</span>>alice</node></nodes> +> pretty.dump (d) +{ + { + <span class="string">"alice"</span>, + attr = { + <span class="string">"id"</span>, + id = <span class="string">"1"</span> + }, + tag = <span class="string">"node"</span> + }, + attr = { + }, + tag = <span class="string">"nodes"</span> +} +</pre> + +<p>Looking at the actual shape of the data reveals the structure of LOM:</p> + +<ul> + <li>every element has a <code>tag</code> field with its name</li> + <li>plus an <code>attr</code> field which is a table containing the attributes as fields, and + also as an array. It is always present.</li> + <li>the children of the element are the array part of the element, so <code>d[1]</code> is + the first child of <code>d</code>, etc.</li> +</ul> + +<p>It could be argued that having attributes also as the array part of <code>attr</code> is not +essential (you cannot depend on attribute order in XML) but that's how +it goes with this standard.</p> + +<p><code>lua-expat</code> is another <em>soft dependency</em> of Penlight; generally, the fallback +parser is good enough for straightforward XML as is commonly found in +configuration files, etc. 
<code>doc.basic_parse</code> is not intended to be a proper +conforming parser (it's only sixty lines) but it handles simple kinds of +documents that do not have comments or DTD directives. It is intelligent enough +to ignore the <code><?xml</code> directive and that is about it.</p> + +<p>You can get pretty-printing by explicitly calling <a href="../libraries/pl.xml.html#tostring">xml.tostring</a> and passing it +the initial indent and the per-element indent:</p> + + +<pre> +> = xml.<span class="global">tostring</span>(d,<span class="string">''</span>,<span class="string">' '</span>) + +<nodes> + <node id=<span class="string">'1'</span>>alice</node> +</nodes> +</pre> + +<p>There is a fourth argument which is the <em>attribute indent</em>:</p> + + +<pre> +> a = xml.parse <span class="string">"<frodo name='baggins' age='50' type='hobbit'/>"</span> +> = xml.<span class="global">tostring</span>(a,<span class="string">''</span>,<span class="string">' '</span>,<span class="string">' '</span>) + +<frodo + <span class="global">type</span>=<span class="string">'hobbit'</span> + name=<span class="string">'baggins'</span> + age=<span class="string">'50'</span> +/> +</pre> + +<h4>Parsing and Working with Configuration Files</h4> + +<p>It's common to find configurations expressed with XML these days. 
It's +straightforward to 'walk' the <a href="http://matthewwild.co.uk/projects/luaexpat/lom.html">LOM</a> +data and extract the data in the form you want:</p> + + +<pre> +<span class="global">require</span> <span class="string">'pl'</span> + +<span class="keyword">local</span> config = <span class="string">[[ +<config> + <alpha>1.3</alpha> + <beta>10</beta> + <name>bozo</name> +</config> +]]</span> +<span class="keyword">local</span> d,err = xml.parse(config) + +<span class="keyword">local</span> t = {} +<span class="keyword">for</span> item <span class="keyword">in</span> d:childtags() <span class="keyword">do</span> + t[item.tag] = item[<span class="number">1</span>] +<span class="keyword">end</span> + +pretty.dump(t) +<span class="comment">---> +</span>{ + beta = <span class="string">"10"</span>, + alpha = <span class="string">"1.3"</span>, + name = <span class="string">"bozo"</span> +} +</pre> + +<p>The only gotcha is that here we must use the <code>Doc:childtags</code> method, which will +skip over any text elements.</p> + +<p>A more involved example is this excerpt from <code>serviceproviders.xml</code>, which is +usually found at <code>/usr/share/mobile-broadband-provider-info/serviceproviders.xml</code> +on Debian/Ubuntu Linux systems.</p> + + +<pre> +d = xml.parse <span class="string">[[ +<serviceproviders format="2.0"> +... 
+<country code="za"> + <provider> + <name>Cell-c</name> + <gsm> + <network-id mcc="655" mnc="07"/> + <apn value="internet"> + <username>Cellcis</username> + <dns>196.7.0.138</dns> + <dns>196.7.142.132</dns> + </apn> + </gsm> + </provider> + <provider> + <name>MTN</name> + <gsm> + <network-id mcc="655" mnc="10"/> + <apn value="internet"> + <dns>196.11.240.241</dns> + <dns>209.212.97.1</dns> + </apn> + </gsm> + </provider> + <provider> + <name>Vodacom</name> + <gsm> + <network-id mcc="655" mnc="01"/> + <apn value="internet"> + <dns>196.207.40.165</dns> + <dns>196.43.46.190</dns> + </apn> + <apn value="unrestricted"> + <name>Unrestricted</name> + <dns>196.207.32.69</dns> + <dns>196.43.45.190</dns> + </apn> + </gsm> + </provider> + <provider> + <name>Virgin Mobile</name> + <gsm> + <apn value="vdata"> + <dns>196.7.0.138</dns> + <dns>196.7.142.132</dns> + </apn> + </gsm> + </provider> +</country> +.... +</serviceproviders> +]]</span> +</pre> + +<p>Getting the names of the providers per-country is straightforward:</p> + + +<pre> +<span class="keyword">local</span> t = {} +<span class="keyword">for</span> country <span class="keyword">in</span> d:childtags() <span class="keyword">do</span> + <span class="keyword">local</span> providers = {} + t[country.attr.code] = providers + <span class="keyword">for</span> provider <span class="keyword">in</span> country:childtags() <span class="keyword">do</span> + <span class="global">table</span>.insert(providers,provider:child_with_name(<span class="string">'name'</span>):get_text()) + <span class="keyword">end</span> +<span class="keyword">end</span> + +pretty.dump(t) +<span class="comment">--> +</span>{ + za = { + <span class="string">"Cell-c"</span>, + <span class="string">"MTN"</span>, + <span class="string">"Vodacom"</span>, + <span class="string">"Virgin Mobile"</span> + } + .... 
+} +</pre> + +<h4>Generating XML with 'xmlification'</h4> + +<p>This feature is inspired by the <code>htmlify</code> function used by +<a href="http://keplerproject.github.com/orbit/">Orbit</a> to simplify HTML generation, +except that no function environment magic is used; the <code>tags</code> function returns a +set of <em>constructors</em> for elements of the given tag names.</p> + + +<pre> +> nodes, node = xml.tags <span class="string">'nodes, node'</span> +> = node <span class="string">'alice'</span> +<node>alice</node> +> = nodes { node {id=<span class="string">'1'</span>,<span class="string">'alice'</span>}} +<nodes><node id=<span class="string">'1'</span>>alice</node></nodes> +</pre> + +<p>The flexibility of Lua tables is very useful here, since both the attributes and +the children of an element can be encoded naturally. The argument to these tag +constructors is either a single value (like a string) or a table where the +attributes are the named keys and the children are the array values.</p> + +<h4>Generating XML using Templates</h4> + +<p>A template is a little XML document which contains dollar-variables. The <code>subst</code> +method on a document is fed an array of tables containing values for these +variables. Note how the parent tag name is specified:</p> + + +<pre> +> templ = xml.parse <span class="string">"<node id='$id'>$name</node>"</span> +> = templ:subst {tag=<span class="string">'nodes'</span>, {id=<span class="number">1</span>,name=<span class="string">'alice'</span>},{id=<span class="number">2</span>,name=<span class="string">'john'</span>}} +<nodes><node id=<span class="string">'1'</span>>alice</node><node id=<span class="string">'2'</span>>john</node></nodes> +</pre> + +<p>Substitution is closely related to <em>filtering</em> documents. One of the annoying things +about XML is that it is a document markup language first, and a data language +second. Standard parsers will assume you really care about all those extra +text elements. 
Consider this fragment, which has been changed by a five-year-old:</p> + + +<pre> +T = <span class="string">[[ + <weather> + boops! + <current_conditions> + <condition data='$condition'/> + <temp_c data='$temp'/> + <bo>whoops!</bo> + </current_conditions> + </weather> +]]</span> +</pre> + +<p>Conformant parsers will give you text elements containing the line feed after <code><current_conditions></code>, +even though this makes handling the data more irritating.</p> + + +<pre> +<span class="keyword">local</span> <span class="keyword">function</span> parse (str) + <span class="keyword">return</span> xml.parse(str,<span class="keyword">false</span>,<span class="keyword">true</span>) +<span class="keyword">end</span> +</pre> + +<p>The second argument means 'string, not file', and the third argument means 'use the built-in +Lua parser' (instead of LuaExpat, if available), which <em>by default</em> is not interested in +keeping such strings.</p> + +<p>How do we remove the string <code>boops!</code>? <code>clone</code> (also called <a href="../libraries/pl.data.html#filter">filter</a> when called as a +method) copies a LOM document. It can be passed a filter function, which is applied +to each string found. 
The powerful thing about this is that this function receives +structural information - the parent node, and whether this was a tag name, a text +element or an attribute name:</p> + + +<pre> +d = parse (T) +c = d:filter(<span class="keyword">function</span>(s,kind,parent) + <span class="global">print</span>(stringx.strip(s),kind,parent <span class="keyword">and</span> parent.tag <span class="keyword">or</span> <span class="string">'?'</span>) + <span class="keyword">if</span> kind == <span class="string">'*TEXT'</span> <span class="keyword">and</span> #parent > <span class="number">1</span> <span class="keyword">then</span> <span class="keyword">return</span> <span class="keyword">nil</span> <span class="keyword">end</span> + <span class="keyword">return</span> s +<span class="keyword">end</span>) +<span class="comment">---> +</span>weather *TAG ? +boops! *TEXT weather +current_conditions *TAG weather +condition *TAG current_conditions +$condition data condition +temp_c *TAG current_conditions +$temp data temp_c +bo *TAG current_conditions +whoops! *TEXT bo +</pre> + +<p>We can remove 'boops!' while keeping 'whoops!' by discarding text elements which are not +the only child of their parent element.</p> + + + +<h4>Extracting Data using Templates</h4> + +<p>Matching goes in the opposite direction. We have a document, and would like to +extract values from it using a pattern.</p> + +<p>A common use of this is parsing the XML result of API queries. The +<a href="http://blog.programmableweb.com/2010/02/08/googles-secret-weather-api/">(undocumented and subsequently discontinued) Google Weather +API</a> is a +good example. 
Grabbing the result of +<code>http://www.google.com/ig/api?weather=Johannesburg,ZA</code> we get something like +this, after pretty-printing:</p> + + +<pre> +<xml_api_reply version=<span class="string">'1'</span>> + <weather module_id=<span class="string">'0'</span> tab_id=<span class="string">'0'</span> mobile_zipped=<span class="string">'1'</span> section=<span class="string">'0'</span> row=<span class="string">'0'</span> mobile_row=<span class="string">'0'</span>> +<forecast_information> + <city data=<span class="string">'Johannesburg, Gauteng'</span>/> + <postal_code data=<span class="string">'Johannesburg,ZA'</span>/> + <latitude_e6 data=<span class="string">''</span>/> + <longitude_e6 data=<span class="string">''</span>/> + <forecast_date data=<span class="string">'2010-10-02'</span>/> + <current_date_time data=<span class="string">'2010-10-02 18:30:00 +0000'</span>/> + <unit_system data=<span class="string">'US'</span>/> +</forecast_information> +<current_conditions> + <condition data=<span class="string">'Clear'</span>/> + <temp_f data=<span class="string">'75'</span>/> + <temp_c data=<span class="string">'24'</span>/> + <humidity data=<span class="string">'Humidity: 19%'</span>/> + <icon data=<span class="string">'/ig/images/weather/sunny.gif'</span>/> + <wind_condition data=<span class="string">'Wind: NW at 7 mph'</span>/> +</current_conditions> +<forecast_conditions> + <day_of_week data=<span class="string">'Sat'</span>/> + <low data=<span class="string">'60'</span>/> + <high data=<span class="string">'89'</span>/> + <icon data=<span class="string">'/ig/images/weather/sunny.gif'</span>/> + <condition data=<span class="string">'Clear'</span>/> +</forecast_conditions> +.... +</weather> +</xml_api_reply> +</pre> + +<p>Assume that the above XML has been read into <code>google</code>. 
The idea is to write a +pattern looking like a template, and use it to extract some values of interest:</p> + + +<pre> +t = <span class="string">[[ + <weather> + <current_conditions> + <condition data='$condition'/> + <temp_c data='$temp'/> + </current_conditions> + </weather> +]]</span> + +<span class="keyword">local</span> res, ret = google:match(t) +pretty.dump(res) +</pre> + +<p>And the output is:</p> + + +<pre> +{ + condition = <span class="string">"Clear"</span>, + temp = <span class="string">"24"</span> +} +</pre> + +<p>The <code>match</code> method can be passed a LOM document or some text, which will be +parsed first.</p> + +<p>But what if we need to extract values from repeated elements? Match templates may +contain 'array matches' which are enclosed in '{{..}}':</p> + + +<pre> +<weather> + {{<forecast_conditions> + <day_of_week data=<span class="string">'$day'</span>/> + <low data=<span class="string">'$low'</span>/> + <high data=<span class="string">'$high'</span>/> + <condition data=<span class="string">'$condition'</span>/> + </forecast_conditions>}} +</weather> +</pre> + +<p>And the match result is:</p> + + +<pre> +{ + { + low = <span class="string">"60"</span>, + high = <span class="string">"89"</span>, + day = <span class="string">"Sat"</span>, + condition = <span class="string">"Clear"</span>, + }, + { + low = <span class="string">"53"</span>, + high = <span class="string">"86"</span>, + day = <span class="string">"Sun"</span>, + condition = <span class="string">"Clear"</span>, + }, + { + low = <span class="string">"57"</span>, + high = <span class="string">"87"</span>, + day = <span class="string">"Mon"</span>, + condition = <span class="string">"Clear"</span>, + }, + { + low = <span class="string">"60"</span>, + high = <span class="string">"84"</span>, + day = <span class="string">"Tue"</span>, + condition = <span class="string">"Clear"</span>, + } +} +</pre> + +<p>With this array of tables, you can use <a 
href="../libraries/pl.tablex.html#">tablex</a> or <a href="../classes/pl.List.html#">List</a> +to reshape it into the desired form, if you choose. Just as with reading a Unix password +file with <a href="../libraries/pl.config.html#">config</a>, you can make the array into a map of days to conditions using:</p> + + +<pre> +<span class="backtick"><a href="../libraries/pl.tablex.html#pairmap">tablex.pairmap</a></span>(<span class="string">'|k,v| v,v.day'</span>,conditions) +</pre> + +<p>(This uses the alternative string lambda option.)</p> + +<p>However, XML matches can shape the structure of the output. By replacing the <code>day_of_week</code> +line of the template with <code><day_of_week data='$_'/></code> we get the same effect; <code>$_</code> is +a special symbol that means that this captured value (or simply <em>capture</em>) becomes the key.</p> + +<p>Note that <code>$NUMBER</code> means a numerical index, so +that <code>$1</code> is the first element of the resulting array, and so forth. You can mix +numbered and named captures, but it's strongly advised to make the numbered captures +form a proper array sequence (everything from <code>1</code> to <code>n</code> inclusive). 
<code>$0</code> has a +special meaning; if it is the only capture (<code>{[0]='foo'}</code>) then the table is +collapsed into 'foo'.</p> + + +<pre> +<weather> + {{<forecast_conditions> + <day_of_week data=<span class="string">'$_'</span>/> + <low data=<span class="string">'$1'</span>/> + <high data=<span class="string">'$2'</span>/> + <condition data=<span class="string">'$3'</span>/> + </forecast_conditions>}} +</weather> +</pre> + +<p>Now the result is:</p> + + +<pre> +{ + Tue = { + <span class="string">"60"</span>, + <span class="string">"84"</span>, + <span class="string">"Clear"</span> + }, + Sun = { + <span class="string">"53"</span>, + <span class="string">"86"</span>, + <span class="string">"Clear"</span> + }, + Sat = { + <span class="string">"60"</span>, + <span class="string">"89"</span>, + <span class="string">"Clear"</span> + }, + Mon = { + <span class="string">"57"</span>, + <span class="string">"87"</span>, + <span class="string">"Clear"</span> + } +} +</pre> + +<p>Applying matches to this config file poses another problem, because the actual +tags matched are themselves meaningful.</p> + + +<pre> +<config> + <alpha><span class="number">1.3</span></alpha> + <beta><span class="number">10</span></beta> + <name>bozo</name> +</config> +</pre> + +<p>So there are tag 'wildcards' which are element names ending with a hyphen.</p> + + +<pre> +<config> + {{<key->$value</key->}} +</config> +</pre> + +<p>You will then get <code>{{alpha='1.3'},...}</code>. 
The most convenient format would be +returned by this (note that <code>_-</code> behaves just like <code>$_</code>):</p> + + +<pre> +<config> + {{<_->$<span class="number">0</span></_->}} +</config> +</pre> + +<p>which would return <code>{alpha='1.3',beta='10',name='bozo'}</code>.</p> + +<p>We could play this game endlessly, and encode ways of converting captures, but +the scheme is complex enough already, and it's easy to do the conversion later:</p> + + +<pre> +<span class="keyword">local</span> numbers = {alpha=<span class="keyword">true</span>,beta=<span class="keyword">true</span>} +<span class="keyword">for</span> k,v <span class="keyword">in</span> <span class="global">pairs</span>(res) <span class="keyword">do</span> + <span class="keyword">if</span> numbers[k] <span class="keyword">then</span> res[k] = <span class="global">tonumber</span>(v) <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<h4>HTML Parsing</h4> + +<p>HTML is an unusually degenerate form of XML, and Dennis Schridde has contributed +a feature which makes parsing it easier. For instance, from the tests:</p> + + +<pre> +doc = xml.parsehtml <span class="string">[[ +<BODY> +Hello dolly<br> +HTML is <b>slack</b><br> +</BODY> +]]</span> + +asserteq(xml.<span class="global">tostring</span>(doc),<span class="string">[[ +<body> +Hello dolly<br/> +HTML is <b>slack</b><br/></body>]]</span>) +</pre> + +<p>That is, all tags are converted to lowercase, and empty HTML elements like <code>br</code> +are properly closed; attributes do not need to be quoted.</p> + +<p>Also, DOCTYPE directives and comments are skipped. 
For truly badly formed HTML, +this is not the tool for you!</p> + + + + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/07-functional.md.html b/Data/Libraries/Penlight/docs/manual/07-functional.md.html new file mode 100644 index 0000000..d4ca655 --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/07-functional.md.html @@ -0,0 +1,834 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Sequences">Sequences </a></li> +<li><a href="#Sequence_Wrappers">Sequence Wrappers </a></li> +<li><a href="#List_Comprehensions">List Comprehensions </a></li> +<li><a href="#Creating_Functions_from_Functions">Creating Functions from Functions </a></li> +<li><a href="#Placeholder_Expressions">Placeholder Expressions </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><a href="../manual/03-strings.md.html">Strings. 
Higher-level operations on strings.</a></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><strong>Functional Programming</strong></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a 
href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a 
href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Functional Programming</h2> + +<p><a name="Sequences"></a></p> +<h3>Sequences</h3> + + +<p>A Lua iterator (in its simplest form) is a function which can be repeatedly +called to return a set of one or more values. The <code>for in</code> statement understands +these iterators, and loops until the function returns <code>nil</code>. There are standard +sequence adapters for tables in Lua (<a href="https://www.lua.org/manual/5.1/manual.html#pdf-ipairs">ipairs</a> and <a href="https://www.lua.org/manual/5.1/manual.html#pdf-pairs">pairs</a>), and <a href="https://www.lua.org/manual/5.1/manual.html#pdf-io.lines">io.lines</a> +returns an iterator over all the lines in a file. In the Penlight libraries, such +iterators are also called <em>sequences</em>. A sequence of single values (say from +<a href="https://www.lua.org/manual/5.1/manual.html#pdf-io.lines">io.lines</a>) is called <em>single-valued</em>, whereas the sequence defined by <a href="https://www.lua.org/manual/5.1/manual.html#pdf-pairs">pairs</a> is +<em>double-valued</em>.</p> + +<p><a href="../libraries/pl.seq.html#">pl.seq</a> provides a number of useful iterators, and some functions which operate +on sequences. 
At first sight this example looks like an attempt to write Python +in Lua (with the sequence being inclusive):</p> + + +<pre> +> <span class="keyword">for</span> i <span class="keyword">in</span> seq.range(<span class="number">1</span>,<span class="number">4</span>) <span class="keyword">do</span> <span class="global">print</span>(i) <span class="keyword">end</span> +<span class="number">1</span> +<span class="number">2</span> +<span class="number">3</span> +<span class="number">4</span> +</pre> + +<p>But <a href="../libraries/pl.seq.html#range">range</a> is actually equivalent to Python 2's <code>xrange</code>, since it generates a +sequence, not a list. To get a list, use <code>seq.copy(seq.range(1,10))</code>, which +takes any single-valued sequence and makes a table from the result. <a href="../libraries/pl.seq.html#list">seq.list</a> is +like <a href="https://www.lua.org/manual/5.1/manual.html#pdf-ipairs">ipairs</a> except that it does not give you the index, just the value.</p> + + +<pre> +> <span class="keyword">for</span> x <span class="keyword">in</span> seq.list {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>} <span class="keyword">do</span> <span class="global">print</span>(x) <span class="keyword">end</span> +<span class="number">1</span> +<span class="number">2</span> +<span class="number">3</span> +</pre> + +<p><a href="../libraries/pl.seq.html#enum">enum</a> takes a sequence and turns it into a double-valued sequence consisting of +a sequence number and the value, so <code>enum(list(ls))</code> is actually equivalent to +<a href="https://www.lua.org/manual/5.1/manual.html#pdf-ipairs">ipairs</a>. 
A more interesting example prints out a file with line numbers:</p> + + +<pre> +<span class="keyword">for</span> i,v <span class="keyword">in</span> seq.enum(<span class="global">io</span>.lines(fname)) <span class="keyword">do</span> <span class="global">print</span>(i..<span class="string">' '</span>..v) <span class="keyword">end</span> +</pre> + +<p>Sequences can be <em>combined</em>, either by 'zipping' them or by concatenating them. +Here <code>l1</code> and <code>l2</code> are the lists <code>{10,20,30}</code> and <code>{1,2,3}</code>.</p> + + +<pre> +> <span class="keyword">for</span> x,y <span class="keyword">in</span> seq.zip(l1,l2) <span class="keyword">do</span> <span class="global">print</span>(x,y) <span class="keyword">end</span> +<span class="number">10</span> <span class="number">1</span> +<span class="number">20</span> <span class="number">2</span> +<span class="number">30</span> <span class="number">3</span> +> <span class="keyword">for</span> x <span class="keyword">in</span> seq.splice(l1,l2) <span class="keyword">do</span> <span class="global">print</span>(x) <span class="keyword">end</span> +<span class="number">10</span> +<span class="number">20</span> +<span class="number">30</span> +<span class="number">1</span> +<span class="number">2</span> +<span class="number">3</span> +</pre> + +<p><a href="../libraries/pl.seq.html#printall">seq.printall</a> is useful for printing out single-valued sequences, and provides +some finer control over formatting, such as a delimiter, the number of fields per +line, and a format string to use (see <code>string.format</code>).</p> + + +<pre> +> seq.printall(seq.random(<span class="number">10</span>)) +<span class="number">0.0012512588885159</span> <span class="number">0.56358531449324</span> <span class="number">0.19330423902097</span> .... 
+> seq.printall(seq.random(<span class="number">10</span>), <span class="string">','</span>, <span class="number">4</span>, <span class="string">'%4.2f'</span>) +<span class="number">0.17</span>,<span class="number">0.86</span>,<span class="number">0.71</span>,<span class="number">0.51</span> +<span class="number">0.30</span>,<span class="number">0.01</span>,<span class="number">0.09</span>,<span class="number">0.36</span> +<span class="number">0.15</span>,<span class="number">0.17</span>, +</pre> + +<p><a href="../libraries/pl.seq.html#map">map</a> will apply a function to a sequence.</p> + + +<pre> +> seq.printall(seq.map(<span class="global">string</span>.upper, {<span class="string">'one'</span>,<span class="string">'two'</span>})) +ONE TWO +> seq.printall(seq.map(<span class="string">'+'</span>, {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>}, <span class="number">1</span>)) +<span class="number">11</span> <span class="number">21</span> <span class="number">31</span> +</pre> + +<p><a href="../libraries/pl.seq.html#filter">filter</a> will filter a sequence using a boolean function (often called a +<em>predicate</em>). For instance, this code prints only the lines in a file which are +composed of digits:</p> + + +<pre> +<span class="keyword">for</span> l <span class="keyword">in</span> seq.filter(<span class="global">io</span>.lines(file), stringx.isdigit) <span class="keyword">do</span> <span class="global">print</span>(l) <span class="keyword">end</span> +</pre> + +<p>The following returns a table consisting of all the positive values in the +original table (equivalent to <code>tablex.filter(ls, '>', 0)</code>):</p> + + +<pre> +ls = seq.copy(seq.filter(ls, <span class="string">'>'</span>, <span class="number">0</span>)) +</pre> + +<p>We've already encountered <a href="../libraries/pl.seq.html#sum">seq.sum</a> when discussing <a href="../libraries/pl.input.html#numbers">input.numbers</a>. 
This can also +be expressed with <a href="../libraries/pl.seq.html#reduce">seq.reduce</a>:</p> + + +<pre> +> = seq.reduce(<span class="keyword">function</span>(x,y) <span class="keyword">return</span> x + y <span class="keyword">end</span>, seq.list{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +<span class="number">10</span> +</pre> + +<p><a href="../libraries/pl.seq.html#reduce">seq.reduce</a> applies a binary function in a recursive fashion, so that:</p> + + +<pre> +reduce(op,{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>}) => op(<span class="number">1</span>,reduce(op,{<span class="number">2</span>,<span class="number">3</span>})) => op(<span class="number">1</span>,op(<span class="number">2</span>,<span class="number">3</span>)) +</pre> + +<p>It's now possible to easily generate other cumulative operations; the standard +operations declared in <a href="../libraries/pl.operator.html#">pl.operator</a> are useful here:</p> + + +<pre> +> ops = <span class="global">require</span> <span class="string">'pl.operator'</span> +> <span class="comment">-- can also say '*' instead of ops.mul +</span>> = seq.reduce(ops.mul,input.numbers <span class="string">'1 2 3 4'</span>) +<span class="number">24</span> +</pre> + +<p>There are functions to extract statistics from a sequence of numbers +(<code>seq.sum</code> returns both the sum and the number of items):</p> + + +<pre> +> l1 = List {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>} +> l2 = List {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>} +> = seq.minmax(l1) +<span class="number">10</span> <span class="number">30</span> +> = seq.sum(l1) +<span class="number">60</span> <span class="number">3</span> +</pre> + +<p>It is common to get sequences where values are repeated, say the words in a file. 
+<a href="../libraries/pl.seq.html#count_map">count_map</a> will take such a sequence and count the values, returning a table +where the <em>keys</em> are the unique values, and the value associated with each key is +the number of times they occurred:</p> + + +<pre> +> t = seq.count_map {<span class="string">'one'</span>,<span class="string">'fred'</span>,<span class="string">'two'</span>,<span class="string">'one'</span>,<span class="string">'two'</span>,<span class="string">'two'</span>} +> = t +{one=<span class="number">2</span>,fred=<span class="number">1</span>,two=<span class="number">3</span>} +</pre> + +<p>This will also work on numerical sequences, but you cannot expect the result to +be a proper list, i.e. having no 'holes'. Instead, you always need to use <a href="https://www.lua.org/manual/5.1/manual.html#pdf-pairs">pairs</a> +to iterate over the result - note that there is a hole at index 5:</p> + + +<pre> +> t = seq.count_map {<span class="number">1</span>,<span class="number">2</span>,<span class="number">4</span>,<span class="number">2</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">2</span>,<span class="number">6</span>} +> <span class="keyword">for</span> k,v <span class="keyword">in</span> <span class="global">pairs</span>(t) <span class="keyword">do</span> <span class="global">print</span>(k,v) <span class="keyword">end</span> +<span class="number">1</span> <span class="number">1</span> +<span class="number">2</span> <span class="number">4</span> +<span class="number">3</span> <span class="number">1</span> +<span class="number">4</span> <span class="number">2</span> +<span class="number">6</span> <span class="number">1</span> +</pre> + +<p><code>unique</code> uses <a href="../libraries/pl.seq.html#count_map">count_map</a> to return a list of the unique values, that is, just +the keys of the resulting table.</p> + +<p><a href="../libraries/pl.seq.html#last">last</a> turns a 
single-valued sequence into a double-valued sequence with the +current value and the last (that is, previous) value:</p> + + +<pre> +> <span class="keyword">for</span> current,last <span class="keyword">in</span> seq.last {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>,<span class="number">40</span>} <span class="keyword">do</span> <span class="global">print</span> (current,last) <span class="keyword">end</span> +<span class="number">20</span> <span class="number">10</span> +<span class="number">30</span> <span class="number">20</span> +<span class="number">40</span> <span class="number">30</span> +</pre> + +<p>This makes it easy to do things like identify repeated lines in a file, or +construct differences between values. <a href="../libraries/pl.seq.html#filter">filter</a> can handle double-valued sequences +as well, so one could filter such a sequence to return only the cases where the +current value is less than the last value by using <a href="../libraries/pl.operator.html#lt">operator.lt</a> or just '<'. +This code then copies the resulting sequence into a table.</p> + + +<pre> +> s = {<span class="number">10</span>,<span class="number">9</span>,<span class="number">10</span>,<span class="number">3</span>} +> = seq.copy(seq.filter(seq.last(s),<span class="string">'<'</span>)) +{<span class="number">9</span>,<span class="number">3</span>} +</pre> + +<p><a name="Sequence_Wrappers"></a></p> +<h3>Sequence Wrappers</h3> + +<p>The functions in <a href="../libraries/pl.seq.html#">pl.seq</a> cover the common patterns when dealing with sequences, +but chaining these functions together can lead to ugly code. Consider the last +example of the previous section; <a href="../libraries/pl.seq.html#">seq</a> is repeated three times and the resulting +expression has to be read right-to-left. 
The first issue can be helped by local +aliases, so that the expression becomes <code>copy(filter(last(s),'<'))</code>, but the +second issue is the somewhat unnatural order of function application. +We tend to prefer reading operations from left to right, which is one reason why +object-oriented notation has become popular. Sequence wrappers allow this +expression to be written like so:</p> + + +<pre> +seq(s):last():filter(<span class="string">'<'</span>):copy() +</pre> + +<p>With this notation, the operation becomes a chain of method calls running from +left to right.</p> + +<p>'Sequence' is not a basic Lua type; sequences are generally functions or callable +objects. The expression <code>seq(s)</code> wraps a sequence in a <em>sequence wrapper</em>, which +is an object which understands all the functions in <a href="../libraries/pl.seq.html#">pl.seq</a> as methods. This +object then explicitly represents sequences.</p> + +<p>As a special case, the constructor (invoked by calling the table <a href="../libraries/pl.seq.html#">seq</a>) will +make a wrapper for a plain list-like table. Here we apply the length operator to +a sequence of strings, and print out the results.</p> + + +<pre> +> seq{<span class="string">'one'</span>,<span class="string">'tw'</span>,<span class="string">'t'</span>} :map <span class="string">'#'</span> :printall() +<span class="number">3</span> <span class="number">2</span> <span class="number">1</span> +</pre> + +<p>As a convenience, there is a function <a href="../libraries/pl.seq.html#lines">seq.lines</a> which behaves just like +<a href="https://www.lua.org/manual/5.1/manual.html#pdf-io.lines">io.lines</a> except it wraps the result as an explicit sequence type. 
The following takes +the first 10 lines from standard input, makes them uppercase, turns them into a +sequence with a count and the value, glues these together with the concatenation +operator, and finally prints out the sequence delimited by a newline.</p> + + +<pre> +seq.lines():take(<span class="number">10</span>):upper():enum():map(<span class="string">'..'</span>):printall <span class="string">'\n'</span> +</pre> + +<p>Note the method <code>upper</code>, which is not a <a href="../libraries/pl.seq.html#">seq</a> function. If an unknown method is +called, sequence wrappers apply that method to all the values in the sequence +(this is an implicit use of <a href="../libraries/pl.seq.html#mapmethod">mapmethod</a>).</p> + +<p>It is straightforward to create custom sequences that can be used in this way. On +Unix, <code>/dev/random</code> gives you an <em>endless</em> sequence of random bytes, so we use +<a href="../libraries/pl.seq.html#take">take</a> to limit the sequence, and then <a href="../libraries/pl.seq.html#map">map</a> to scale the result into the desired +range. 
The key step is to use <a href="../libraries/pl.seq.html#">seq</a> to wrap the iterator function:</p> + + +<pre> +<span class="comment">-- random.lua +</span><span class="keyword">local</span> seq = <span class="global">require</span> <span class="string">'pl.seq'</span> + +<span class="keyword">function</span> dev_random() + <span class="keyword">local</span> f = <span class="global">io</span>.open(<span class="string">'/dev/random'</span>) + <span class="keyword">local</span> byte = <span class="global">string</span>.byte + <span class="keyword">return</span> seq(<span class="keyword">function</span>() + <span class="comment">-- read two bytes into a string and convert into a 16-bit number +</span> <span class="keyword">local</span> s = f:read(<span class="number">2</span>) + <span class="keyword">return</span> byte(s,<span class="number">1</span>) + <span class="number">256</span>*byte(s,<span class="number">2</span>) + <span class="keyword">end</span>) +<span class="keyword">end</span> + +<span class="comment">-- print 10 random numbers from 0 to 1 ! +</span>dev_random():take(<span class="number">10</span>):map(<span class="string">'%'</span>,<span class="number">100</span>):map(<span class="string">'/'</span>,<span class="number">100</span>):printall <span class="string">','</span> +</pre> + +<p>Another Linux one-liner depends on the <code>/proc</code> filesystem and makes a list of all +the currently running processes:</p> + + +<pre> +pids = seq(lfs.dir <span class="string">'/proc'</span>):filter(stringx.isdigit):map(<span class="global">tonumber</span>):copy() +</pre> + +<p>This version of Penlight has an experimental feature which relies on the fact +that <em>all</em> Lua types can have metatables, including functions. 
This makes +<em>implicit sequence wrapping</em> possible:</p> + + +<pre> +> seq.import() +> seq.random(<span class="number">5</span>):printall(<span class="string">','</span>,<span class="number">5</span>,<span class="string">'%4.1f'</span>) + <span class="number">0.0</span>, <span class="number">0.1</span>, <span class="number">0.4</span>, <span class="number">0.1</span>, <span class="number">0.2</span> +</pre> + +<p>This avoids the awkward <code>seq(seq.random(5))</code> construction. Or the iterator can +come from somewhere else completely:</p> + + +<pre> +> (<span class="string">'one two three'</span>):gmatch(<span class="string">'%a+'</span>):printall(<span class="string">','</span>) +one,two,three, +</pre> + +<p>After <code>seq.import</code>, it is no longer necessary to explicitly wrap sequence +functions.</p> + +<p>But there is a price to pay for this convenience. <em>Every</em> function is affected, +so that any function can be used, appropriate or not:</p> + + +<pre> +> <span class="global">math</span>.sin:printall() +..seq.lua:<span class="number">287</span>: bad argument #<span class="number">1</span> to <span class="string">'(for generator)'</span> (number expected, got <span class="keyword">nil</span>) +> a = <span class="global">tostring</span> +> = a:find(<span class="string">' '</span>) +<span class="keyword">function</span>: <span class="number">0042</span>C920 +</pre> + +<p>What function is returned? It's almost certain to be something that makes no +sense in the current context. So implicit sequences may make certain kinds of +programming mistakes harder to catch - they are best used for interactive +exploration and small scripts.</p> + +<p><a id="comprehensions"/></p> + +<p><a name="List_Comprehensions"></a></p> +<h3>List Comprehensions</h3> + +<p>List comprehensions are a compact way to create tables by specifying their +elements. 
In Python, you can say this:</p> + + +<pre> +ls = [x <span class="keyword">for</span> x <span class="keyword">in</span> range(<span class="number">5</span>)] # == [<span class="number">0</span>,<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>] +</pre> + +<p>In Lua, using <a href="../libraries/pl.comprehension.html#">pl.comprehension</a>:</p> + + +<pre> +> C = <span class="global">require</span>(<span class="string">'pl.comprehension'</span>).new() +> = C (<span class="string">'x for x=1,10'</span>) () +{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>,<span class="number">6</span>,<span class="number">7</span>,<span class="number">8</span>,<span class="number">9</span>,<span class="number">10</span>} +</pre> + +<p><code>C</code> is a function which compiles a list comprehension <em>string</em> into a <em>function</em>. +In this case, the function has no arguments. The parentheses are redundant for a +function taking a string argument, so this works as well:</p> + + +<pre> +> = C <span class="string">'x^2 for x=1,4'</span> () +{<span class="number">1</span>,<span class="number">4</span>,<span class="number">9</span>,<span class="number">16</span>} +> = C <span class="string">'{x,x^2} for x=1,4'</span> () +{{<span class="number">1</span>,<span class="number">1</span>},{<span class="number">2</span>,<span class="number">4</span>},{<span class="number">3</span>,<span class="number">9</span>},{<span class="number">4</span>,<span class="number">16</span>}} +</pre> + +<p>Note that the expression can be <em>any</em> function of the variable <code>x</code>!</p> + +<p>The basic syntax so far is <code><expr> for <set></code>, where <code><set></code> can be anything that +the Lua <code>for</code> statement understands. 
<code><set></code> can also just be the variable, in +which case the values will come from the <em>argument</em> of the comprehension. Here +I'm emphasizing that a comprehension is a function which can take a list argument:</p> + + +<pre> +> = C <span class="string">'2*x for x'</span> {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>} +{<span class="number">2</span>,<span class="number">4</span>,<span class="number">6</span>} +> dbl = C <span class="string">'2*x for x'</span> +> = dbl {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>} +{<span class="number">20</span>,<span class="number">40</span>,<span class="number">60</span>} +</pre> + +<p>Here is a somewhat more explicit way of saying the same thing; <code>_1</code> is a +<em>placeholder</em> referring to the <em>first</em> argument passed to the comprehension.</p> + + +<pre> +> = C <span class="string">'2*x for _,x in pairs(_1)'</span> {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>} +{<span class="number">20</span>,<span class="number">40</span>,<span class="number">60</span>} +> = C <span class="string">'_1(x) for x'</span>(<span class="global">tostring</span>,{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +{<span class="string">'1'</span>,<span class="string">'2'</span>,<span class="string">'3'</span>,<span class="string">'4'</span>} +</pre> + +<p>This extended syntax is useful when you wish to collect the result of some +iterator, such as <a href="https://www.lua.org/manual/5.1/manual.html#pdf-io.lines">io.lines</a>. 
This comprehension creates a function which creates +a table of all the lines in a file:</p> + + +<pre> +> f = <span class="global">io</span>.open(<span class="string">'array.lua'</span>) +> lines = C <span class="string">'line for line in _1:lines()'</span> (f) +> = #lines +<span class="number">118</span> +</pre> + +<p>There are a number of functions that may be applied to the result of a +comprehension:</p> + + +<pre> +> = C <span class="string">'min(x for x)'</span> {<span class="number">1</span>,<span class="number">44</span>,<span class="number">0</span>} +<span class="number">0</span> +> = C <span class="string">'max(x for x)'</span> {<span class="number">1</span>,<span class="number">44</span>,<span class="number">0</span>} +<span class="number">44</span> +> = C <span class="string">'sum(x for x)'</span> {<span class="number">1</span>,<span class="number">44</span>,<span class="number">0</span>} +<span class="number">45</span> +</pre> + +<p>(These are equivalent to a reduce operation on a list.)</p> + +<p>After the <code>for</code> part, there may be a condition, which filters the output. 
This +comprehension collects the even numbers from a list:</p> + + +<pre> +> = C <span class="string">'x for x if x % 2 == 0'</span> {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>} +{<span class="number">2</span>,<span class="number">4</span>} +</pre> + +<p>There may be a number of <code>for</code> parts:</p> + + +<pre> +> = C <span class="string">'{x,y} for x = 1,2 for y = 1,2'</span> () +{{<span class="number">1</span>,<span class="number">1</span>},{<span class="number">1</span>,<span class="number">2</span>},{<span class="number">2</span>,<span class="number">1</span>},{<span class="number">2</span>,<span class="number">2</span>}} +> = C <span class="string">'{x,y} for x for y'</span> ({<span class="number">1</span>,<span class="number">2</span>},{<span class="number">10</span>,<span class="number">20</span>}) +{{<span class="number">1</span>,<span class="number">10</span>},{<span class="number">1</span>,<span class="number">20</span>},{<span class="number">2</span>,<span class="number">10</span>},{<span class="number">2</span>,<span class="number">20</span>}} +</pre> + +<p>These comprehensions are useful when dealing with functions of more than one +variable, and are not so easily achieved with the other Penlight functional forms.</p> + +<p><a id="func"/></p> + +<p><a name="Creating_Functions_from_Functions"></a></p> +<h3>Creating Functions from Functions</h3> + + +<p>Lua functions may be treated like any other value, although of course you cannot +multiply or add them. 
One operation that makes sense is <em>function composition</em>, +which chains function calls (so <code>(f * g)(x)</code> is <code>f(g(x))</code>.)</p> + + +<pre> +> func = <span class="global">require</span> <span class="string">'pl.func'</span> +> printf = func.compose(<span class="global">io</span>.write,<span class="global">string</span>.format) +> printf(<span class="string">"hello %s\n"</span>,<span class="string">'world'</span>) +hello world +<span class="keyword">true</span> +</pre> + +<p>Many functions require you to pass a function as an argument, say to apply to all +values of a sequence or as a callback. Often useful functions have the wrong +number of arguments. So there is a need to construct a function of one argument +from one of two arguments, <em>binding</em> the extra argument to a given value.</p> + +<p><em>Partial application</em> takes a function of n arguments and returns a function of n-1 +arguments where the first argument is bound to some value:</p> + + +<pre> +> p2 = func.bind1(<span class="global">print</span>,<span class="string">'start>'</span>) +> p2(<span class="string">'hello'</span>,<span class="number">2</span>) +start> hello <span class="number">2</span> +> ops = <span class="global">require</span> <span class="string">'pl.operator'</span> +> = tablex.filter({<span class="number">1</span>,-<span class="number">2</span>,<span class="number">10</span>,-<span class="number">1</span>,<span class="number">2</span>},func.bind1(ops.gt,<span class="number">0</span>)) +{-<span class="number">2</span>,-<span class="number">1</span>} +> = tablex.filter({<span class="number">1</span>,-<span class="number">2</span>,<span class="number">10</span>,-<span class="number">1</span>,<span class="number">2</span>},func.bind1(ops.le,<span class="number">0</span>)) +{<span class="number">1</span>,<span class="number">10</span>,<span class="number">2</span>} +</pre> + +<p>The last example unfortunately reads backwards, because <a 
href="../libraries/pl.func.html#bind1">bind1</a> always binds the +first argument! Also unfortunately, in my youth I confused 'currying' with +'partial application', so the old name for <a href="../libraries/pl.func.html#bind1">bind1</a> is <code>curry</code> - this alias still exists.</p> + +<p><code>bind1</code> is a specialized form of function argument binding. Here is another way +to say the <a href="https://www.lua.org/manual/5.1/manual.html#pdf-print">print</a> example:</p> + + +<pre> +> p2 = func.bind(<span class="global">print</span>,<span class="string">'start>'</span>,func._1,func._2) +> p2(<span class="string">'hello'</span>,<span class="number">2</span>) +start> hello <span class="number">2</span> +</pre> + +<p>where <code>_1</code> and <code>_2</code> are <em>placeholder variables</em>, corresponding to the first and +second arguments respectively.</p> + +<p>Having <a href="../libraries/pl.func.html#">func</a> all over the place is distracting, so it's useful to pull all of +<a href="../libraries/pl.func.html#">pl.func</a> into the local context. Here is the filter example, this time the right +way around:</p> + + +<pre> +> utils.import <span class="string">'pl.func'</span> +> = tablex.filter({<span class="number">1</span>,-<span class="number">2</span>,<span class="number">10</span>,-<span class="number">1</span>,<span class="number">2</span>},bind(ops.gt, _1, <span class="number">0</span>)) +{<span class="number">1</span>,<span class="number">10</span>,<span class="number">2</span>} +</pre> + +<p><a href="../libraries/pl.tablex.html#merge">tablex.merge</a> does a general merge of two tables. 
This example shows the +usefulness of binding the last argument of a function.</p> + + +<pre> +> S1 = {john=<span class="number">27</span>, jane=<span class="number">31</span>, mary=<span class="number">24</span>} +> S2 = {jane=<span class="number">31</span>, jones=<span class="number">50</span>} +> intersection = bind(tablex.merge, _1, _2, <span class="keyword">false</span>) +> union = bind(tablex.merge, _1, _2, <span class="keyword">true</span>) +> = intersection(S1,S2) +{jane=<span class="number">31</span>} +> = union(S1,S2) +{mary=<span class="number">24</span>,jane=<span class="number">31</span>,john=<span class="number">27</span>,jones=<span class="number">50</span>} +</pre> + +<p>When using <a href="../libraries/pl.func.html#bind">bind</a> with <a href="https://www.lua.org/manual/5.1/manual.html#pdf-print">print</a>, we got a function of precisely two arguments, +whereas we really want our function to use varargs like <a href="https://www.lua.org/manual/5.1/manual.html#pdf-print">print</a>. This is the role +of <code>_0</code>:</p> + + +<pre> +> _DEBUG = <span class="keyword">true</span> +> p = bind(<span class="global">print</span>,<span class="string">'start>'</span>, _0) +<span class="keyword">return</span> <span class="keyword">function</span> (fn,_v1) + <span class="keyword">return</span> <span class="keyword">function</span>(...) <span class="keyword">return</span> fn(_v1,...) <span class="keyword">end</span> +<span class="keyword">end</span> + +> p(<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>,<span class="number">5</span>) +start> <span class="number">1</span> <span class="number">2</span> <span class="number">3</span> <span class="number">4</span> <span class="number">5</span> +</pre> + +<p>I've turned on the global <code>_DEBUG</code> flag, so that the function generated is +printed out. 
It is actually a function which <em>generates</em> the required function; +the first call <em>binds the value</em> of <code>_v1</code> to <code>'start>'</code>.</p> + +<p><a name="Placeholder_Expressions"></a></p> +<h3>Placeholder Expressions</h3> + +<p>A common pattern in Penlight is a function which applies another function to all +elements in a table or a sequence, such as <a href="../libraries/pl.tablex.html#map">tablex.map</a> or <a href="../libraries/pl.seq.html#filter">seq.filter</a>. Lua does +anonymous functions well, although they can be a bit tedious to type:</p> + + +<pre> +> = tablex.map(<span class="keyword">function</span>(x) <span class="keyword">return</span> x*x <span class="keyword">end</span>, {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +{<span class="number">1</span>,<span class="number">4</span>,<span class="number">9</span>,<span class="number">16</span>} +</pre> + +<p><a href="../libraries/pl.func.html#">pl.func</a> allows you to define <em>placeholder expressions</em>, which can cut down on +the typing required, and also make your intent clearer. First, we bring the contents +of <a href="../libraries/pl.func.html#">pl.func</a> into our context, and then supply an expression using placeholder +variables, such as <code>_1</code>, <code>_2</code>, etc. 
(C++ programmers will recognize this from the +Boost libraries.)</p> + + +<pre> +> utils.import <span class="string">'pl.func'</span> +> = tablex.map(_1*_1, {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +{<span class="number">1</span>,<span class="number">4</span>,<span class="number">9</span>,<span class="number">16</span>} +</pre> + +<p>Functions of up to 5 arguments can be generated.</p> + + +<pre> +> = tablex.map2(_1+_2,{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>}, {<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>}) +{<span class="number">11</span>,<span class="number">22</span>,<span class="number">33</span>} +</pre> + +<p>These expressions can use arbitrary functions, although they must first be +registered with the functional library. <a href="../libraries/pl.func.html#register">func.register</a> brings in a single +function, and <a href="../libraries/pl.func.html#import">func.import</a> brings in a whole table of functions, such as <a href="https://www.lua.org/manual/5.1/manual.html#5.6">math</a>.</p> + + +<pre> +> sin = register(<span class="global">math</span>.sin) +> = tablex.map(sin(_1), {<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +{<span class="number">0.8414709848079</span>,<span class="number">0.90929742682568</span>,<span class="number">0.14112000805987</span>,-<span class="number">0.75680249530793</span>} +> import <span class="string">'math'</span> +> = tablex.map(cos(<span class="number">2</span>*_1),{<span class="number">1</span>,<span class="number">2</span>,<span class="number">3</span>,<span class="number">4</span>}) +{-<span class="number">0.41614683654714</span>,-<span class="number">0.65364362086361</span>,<span class="number">0.96017028665037</span>,-<span class="number">0.14550003380861</span>} 
+</pre> + +<p>A common operation is calling a method of a set of objects:</p> + + +<pre> +> = tablex.map(_1:sub(<span class="number">1</span>,<span class="number">1</span>), {<span class="string">'one'</span>,<span class="string">'four'</span>,<span class="string">'x'</span>}) +{<span class="string">'o'</span>,<span class="string">'f'</span>,<span class="string">'x'</span>} +</pre> + +<p>There are some restrictions on what operators can be used in PEs. For instance, +because the <code>__len</code> metamethod cannot be overridden by plain Lua tables, we need +to define a special function to express <code>#_1</code>:</p> + + +<pre> +> = tablex.map(Len(_1), {<span class="string">'one'</span>,<span class="string">'four'</span>,<span class="string">'x'</span>}) +{<span class="number">3</span>,<span class="number">4</span>,<span class="number">1</span>} +</pre> + +<p>Likewise for comparison operators, which cannot be overloaded for <em>different</em> +types, and thus also have to be expressed as a special function:</p> + + +<pre> +> = tablex.filter(Gt(_1,<span class="number">0</span>), {<span class="number">1</span>,-<span class="number">1</span>,<span class="number">2</span>,<span class="number">4</span>,-<span class="number">3</span>}) +{<span class="number">1</span>,<span class="number">2</span>,<span class="number">4</span>} +</pre> + +<p>It is useful to express the fact that a function returns multiple values. For +instance, <a href="../libraries/pl.tablex.html#pairmap">tablex.pairmap</a> expects a function that will be called with the key +and the value, and returns the new value and the key, in that order.</p> + + +<pre> +> = pairmap(Args(_2,_1:upper()),{fred=<span class="number">1</span>,alice=<span class="number">2</span>}) +{ALICE=<span class="number">2</span>,FRED=<span class="number">1</span>} +</pre> + +<p>PEs cannot contain <code>nil</code> values, since PE function arguments are represented as +an array. Instead, a special value called <code>Nil</code> is provided. 
So say +<code>_1:f(Nil,1)</code> instead of <code>_1:f(nil,1)</code>.</p> + +<p>A placeholder expression cannot be automatically used as a Lua function. The +technical reason is that the call operator must be overloaded to construct +function calls like <code>_1(1)</code>. If you want to force a PE to return a function, use +<a href="../libraries/pl.func.html#I">func.I</a>.</p> + + +<pre> +> = tablex.map(_1(<span class="number">10</span>),{I(<span class="number">2</span>*_1),I(_1*_1),I(_1+<span class="number">2</span>)}) +{<span class="number">20</span>,<span class="number">100</span>,<span class="number">12</span>} +</pre> + +<p>Here we make a table of functions taking a single argument, and then call them +all with a value of 10.</p> + +<p>The essential idea with PEs is to 'quote' an expression so that it is not +immediately evaluated, but instead turned into a function that can be applied +later to some arguments. The basic mechanism is to wrap values and placeholders +so that the usual Lua operators have the effect of building up an <em>expression +tree</em>. (It turns out that you can do <em>symbolic algebra</em> using PEs, see +<a href="../examples/symbols.lua.html#">symbols.lua</a> in the examples directory, and its test runner <code>testsym.lua</code>, which +demonstrates symbolic differentiation.)</p> + +<p>The rule is that if any operator has a PE operand, the result will be quoted. +Sometimes we need to quote things explicitly. For instance, say we want to pass a +function to a filter that must return true if the element value is in a set. +<code>set[_1]</code> is the obvious expression, but it does not give the desired result, +since it evaluates directly, giving <code>nil</code>. Indexing works differently than a +binary operation like addition (set+_1 <em>is</em> properly quoted) so there is a need +for an explicit quoting or wrapping operation. This is the job of the <code>_</code> +function; the PE in this case should be <code>_(set)[_1]</code>. 
This works for functions +as well, as a convenient alternative to registering functions: <code>_(math.sin)(_1)</code>. +Quoting also works for method calls; the following loop is equivalent to using the <code>lines</code> method of <code>f</code>:</p> + + +<pre> +<span class="keyword">for</span> line <span class="keyword">in</span> I(_(f):read()) <span class="keyword">do</span> <span class="global">print</span>(line) <span class="keyword">end</span> +</pre> + +<p>Now this will work for <em>any</em> 'file-like' object which has a <code>read</code> method +returning the next line. If you had a LuaSocket client which was being 'pushed' +by lines sent from a server, then <code>_(s):receive '*l'</code> would create an iterator +for accepting input. These forms can be convenient for adapting your data flow so +that it can be passed to the sequence functions in <code>pl.seq</code>.</p> + +<p>Placeholder expressions can be mixed with sequence wrapper expressions. +<a href="../libraries/pl.lexer.html#lua">lexer.lua</a> will give us a double-valued sequence of tokens, where the first +value is a type, and the second is a value. We keep only the values where +the type is 'iden', extract the actual value using <code>map</code>, get the unique values +and finally copy to a list.</p> + + +<pre> +> str = <span class="string">'for i=1,10 do for j = 1,10 do print(i,j) end end'</span> +> = seq(lexer.lua(str)):filter(<span class="string">'=='</span>,<span class="string">'iden'</span>):map(_2):unique():copy() +{i,<span class="global">print</span>,j} +</pre> + +<p>This is a particularly intense line (and I don't always suggest making everything +a one-liner!); the key is the behaviour of <code>map</code>, which will take both values of +the sequence, so <code>_2</code> returns the value part. (Since <code>filter</code> here takes extra +arguments, it only operates on the type values.)</p> + +<p>There are some performance considerations to using placeholder expressions. 
+Instantiating a PE requires constructing and compiling a function, which is not +such a fast operation. So to get best performance, factor out PEs from loops like +this:</p> + + +<pre> +<span class="keyword">local</span> fn = I(_1:f() + _2:g()) +<span class="keyword">for</span> i = <span class="number">1</span>,n <span class="keyword">do</span> + res[i] = tablex.map2(fn,first[i],second[i]) +<span class="keyword">end</span> +</pre> + + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/08-additional.md.html b/Data/Libraries/Penlight/docs/manual/08-additional.md.html new file mode 100644 index 0000000..d13ac6e --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/08-additional.md.html @@ -0,0 +1,815 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Simple_Input_Patterns">Simple Input Patterns </a></li> +<li><a href="#Command_line_Programs_with_Lapp">Command-line Programs with Lapp </a></li> +<li><a href="#Simple_Test_Framework">Simple Test Framework </a></li> +</ul> + + +<h2>Manual</h2> +<ul 
class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><a href="../manual/03-strings.md.html">Strings. Higher-level operations on strings.</a></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><strong>Additional Libraries</strong></li> + <li><a href="../manual/09-discussion.md.html">Technical Choices</a></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a 
href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a 
href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Additional Libraries</h2> + +<p>Libraries in this section are no longer considered to be part of the Penlight +core, but still provide specialized functionality when needed.</p> + +<p><a id="sip"/></p> + +<p><a name="Simple_Input_Patterns"></a></p> +<h3>Simple Input Patterns</h3> + +<p>Lua string pattern matching is very powerful, and usually you will not need a +traditional regular expression library. Even so, sometimes Lua code ends up +looking like Perl, which happens because string patterns are not always the +easiest things to read, especially for the casual reader. Here is a program +which needs to understand three distinct date formats:</p> + + +<pre> +<span class="comment">-- parsing dates using Lua string patterns +</span>months={Jan=<span class="number">1</span>,Feb=<span class="number">2</span>,Mar=<span class="number">3</span>,Apr=<span class="number">4</span>,May=<span class="number">5</span>,Jun=<span class="number">6</span>, +Jul=<span class="number">7</span>,Aug=<span class="number">8</span>,Sep=<span class="number">9</span>,Oct=<span class="number">10</span>,Nov=<span class="number">11</span>,Dec=<span class="number">12</span>} + +<span class="keyword">function</span> check_and_process(d,m,y) + d = <span class="global">tonumber</span>(d) + m = <span class="global">tonumber</span>(m) + y = <span class="global">tonumber</span>(y) + .... 
+<span class="keyword">end</span> + +<span class="keyword">for</span> line <span class="keyword">in</span> f:lines() <span class="keyword">do</span> + <span class="comment">-- ordinary (English) date format +</span> <span class="keyword">local</span> d,m,y = line:match(<span class="string">'(%d+)/(%d+)/(%d+)'</span>) + <span class="keyword">if</span> d <span class="keyword">then</span> + check_and_process(d,m,y) + <span class="keyword">else</span> <span class="comment">-- ISO date?? +</span> y,m,d = line:match(<span class="string">'(%d+)%-(%d+)%-(%d+)'</span>) + <span class="keyword">if</span> y <span class="keyword">then</span> + check_and_process(d,m,y) + <span class="keyword">else</span> <span class="comment">-- <day> <month-name> <year>? +</span> d,mm,y = line:match(<span class="string">'(%d+)%s+(%a+)%s+(%d+)'</span>) + m = months[mm] + check_and_process(d,m,y) + <span class="keyword">end</span> + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p>These aren't particularly difficult patterns, but already typical issues are +appearing, such as having to escape '-'. 
Also, <a href="https://www.lua.org/manual/5.1/manual.html#pdf-string.match">string.match</a> returns its +captures, so that we're forced to use a slightly awkward nested if-statement.</p> + +<p>Verification issues will further cloud the picture, since regular expression +people try to enforce constraints (like year cannot be more than four digits) +using regular expressions, on the usual grounds that you shouldn't stop using a +hammer when you are enjoying yourself.</p> + +<p><a href="../libraries/pl.sip.html#">pl.sip</a> provides a simple, intuitive way to detect patterns in strings and +extract relevant parts.</p> + + +<pre> +> sip = <span class="global">require</span> <span class="string">'pl.sip'</span> +> dump = <span class="global">require</span>(<span class="string">'pl.pretty'</span>).dump +> res = {} +> c = sip.compile <span class="string">'ref=$S{file}:$d{line}'</span> +> = c(<span class="string">'ref=hello.c:10'</span>,res) +<span class="keyword">true</span> +> dump(res) +{ + line = <span class="number">10</span>, + file = <span class="string">"hello.c"</span> +} +> = c(<span class="string">'ref=long name, no line'</span>,res) +<span class="keyword">false</span> +</pre> + +<p><a href="../libraries/pl.sip.html#compile">sip.compile</a> creates a pattern matcher function, which takes a string and a +table as arguments. 
If the string matches the pattern, then <code>true</code> is returned +and the table is populated according to the captures within the pattern.</p> + +<p>Here is another version of the date parser:</p> + + +<pre> +<span class="comment">-- using SIP patterns +</span><span class="keyword">function</span> check(t) + check_and_process(t.day,t.month,t.year) +<span class="keyword">end</span> + +shortdate = sip.compile(<span class="string">'$d{day}/$d{month}/$d{year}'</span>) +longdate = sip.compile(<span class="string">'$d{day} $v{mon} $d{year}'</span>) +isodate = sip.compile(<span class="string">'$d{year}-$d{month}-$d{day}'</span>) + +<span class="keyword">for</span> line <span class="keyword">in</span> f:lines() <span class="keyword">do</span> + <span class="keyword">local</span> res = {} + <span class="keyword">if</span> shortdate(line,res) <span class="keyword">then</span> + check(res) + <span class="keyword">elseif</span> isodate(line,res) <span class="keyword">then</span> + check(res) + <span class="keyword">elseif</span> longdate(line,res) <span class="keyword">then</span> + res.month = months[res.mon] + check(res) + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p>SIP captures start with '$', then a one-character type, and then an +optional variable name in curly braces.</p> + + +<pre> +Type Meaning +v identifier +i possibly signed integer +f floating-point number +r rest of line +q quoted <span class="global">string</span> (quoted using either ' <span class="keyword">or</span> ") +p a path name +( anything inside balanced parentheses +[ anything inside balanced brackets +{ anything inside balanced curly brackets +< anything inside balanced angle brackets +</pre> + +<p>If a type is not one of the above, then it's assumed to be one of the standard +Lua character classes, and will match one or more repetitions of that class. 
+Any spaces you leave in your pattern will match any number of spaces, including +zero, unless the spaces are between two identifier characters or patterns +matching them; in that case, at least one space will be matched.</p> + +<p>SIP captures (like <code>$v{mon}</code>) do not have to be named. You can use just <code>$v</code>, but +you have to be consistent; if a pattern contains unnamed captures, then all +captures must be unnamed. In this case, the result table is a simple list of +values.</p> + +<p><a href="../libraries/pl.sip.html#match">sip.match</a> is a useful shortcut if you want to compile and match in one call, +without saving the compiled pattern. It caches the result, so it is not much +slower than explicitly using <a href="../libraries/pl.sip.html#compile">sip.compile</a>.</p> + + +<pre> +> sip.match(<span class="string">'($q{first},$q{second})'</span>,<span class="string">'("john","smith")'</span>,res) +<span class="keyword">true</span> +> res +{second=<span class="string">'smith'</span>,first=<span class="string">'john'</span>} +> res = {} +> sip.match(<span class="string">'($q,$q)'</span>,<span class="string">'("jan","smit")'</span>,res) <span class="comment">-- unnamed captures +</span><span class="keyword">true</span> +> res +{<span class="string">'jan'</span>,<span class="string">'smit'</span>} +> sip.match(<span class="string">'($q,$q)'</span>,<span class="string">'("jan", "smit")'</span>,res) +<span class="keyword">false</span> <span class="comment">---> oops! Can't handle extra space! +</span>> sip.match(<span class="string">'( $q , $q )'</span>,<span class="string">'("jan", "smit")'</span>,res) +<span class="keyword">true</span> +</pre> + +<p>As a general rule, allow for whitespace in your patterns.</p> + +<p>Finally, putting a '$' at the end of a pattern means 'capture the rest of the +line, starting at the first non-space'. 
It is a shortcut for '$r{rest}', +or just '$r' if no named captures are used.</p> + + +<pre> +> sip.match(<span class="string">'( $q , $q ) $'</span>,<span class="string">'("jan", "smit") and a string'</span>,res) +<span class="keyword">true</span> +> res +{<span class="string">'jan'</span>,<span class="string">'smit'</span>,<span class="string">'and a string'</span>} +> res = {} +> sip.match(<span class="string">'( $q{first} , $q{last} ) $'</span>,<span class="string">'("jan", "smit") and a string'</span>,res) +<span class="keyword">true</span> +> res +{first=<span class="string">'jan'</span>,rest=<span class="string">'and a string'</span>,last=<span class="string">'smit'</span>} +</pre> + +<p><a id="lapp"/></p> + +<p><a name="Command_line_Programs_with_Lapp"></a></p> +<h3>Command-line Programs with Lapp</h3> + +<p><a href="../libraries/pl.lapp.html#">pl.lapp</a> is a small and focused Lua module which aims to make standard +command-line parsing easier and more intuitive. It implements the standard GNU style, +i.e. short flags with one letter start with '-', and there may be an additional +long flag which starts with '--'. Generally options which take an argument expect +to find it as the next parameter (e.g. 'gcc test.c -o test') but single short +options taking a value can dispense with the space (e.g. 'head -n4 +test.c' or <code>gcc -I/usr/include/lua/5.1 ...</code>).</p> + +<p>As far as possible, Lapp will convert parameters into their equivalent Lua types, +i.e. convert numbers and convert filenames into file objects. If any conversion +fails, or a required parameter is missing, an error will be issued and the usage +text will be written out. So there are two necessary tasks: supplying the flag +and option names, and associating them with a type.</p> + +<p>For any non-trivial script, even for personal consumption, it's necessary to +supply usage text. 
The novelty of Lapp is that it starts from that point and +defines a loose format for usage strings which can specify the names and types of +the parameters.</p> + +<p>An example will make this clearer:</p> + + +<pre> +<span class="comment">-- scale.lua +</span> lapp = <span class="global">require</span> <span class="string">'pl.lapp'</span> + <span class="keyword">local</span> args = lapp <span class="string">[[ + Does some calculations + -o,--offset (default 0.0) Offset to add to scaled number + -s,--scale (number) Scaling factor + <number> (number) Number to be scaled + ]]</span> + + <span class="global">print</span>(args.offset + args.scale * args.number) +</pre> + +<p>Here is a command-line session using this script:</p> + + +<pre> +$ lua scale.lua +scale.lua:missing required parameter: scale + +Does some calculations + -o,<span class="comment">--offset (default 0.0) Offset to add to scaled number +</span> -s,<span class="comment">--scale (number) Scaling factor +</span> <number> (number ) Number to be scaled + +$ lua scale.lua -s <span class="number">2.2</span> <span class="number">10</span> +<span class="number">22</span> + +$ lua scale.lua -s <span class="number">2.2</span> x10 +scale.lua:unable to convert to number: x10 + +....(usage as before) +</pre> + +<p>There are two kinds of lines in Lapp usage strings which are meaningful; option +and parameter lines. An option line gives the short option, optionally followed +by the corresponding long option. A type specifier in parentheses may follow. +Similarly, a parameter line starts with '<NAME>', followed by a type +specifier.</p> + +<p>Type specifiers usually start with a type name: one of 'boolean', 'string','number','file-in' or +'file-out'. You may leave this out, but then <em>must</em> say 'default' followed by a value. +If a flag or parameter has a default, it is not <em>required</em> and is set to the default. 
The actual +type is deduced from this value (number, string, file or boolean) if not provided directly. +'Deduce' is a fancy word for 'guess' and it can be wrong, e.g. '(default 1)' +will always be a number. You can say '(string default 1)' to override the guess. +There are file values for the predefined console streams: stdin, stdout, stderr.</p> + +<p>The boolean type is the default for flags. Not providing the type specifier is equivalent to +'(boolean default false)'. If the flag is meant to be 'turned off' then either the full +'(boolean default true)' or the shortcut '(default true)' will work.</p> + +<p>An alternative to <code>default</code> is <code>optional</code>:</p> + + +<pre> +<span class="keyword">local</span> lapp = <span class="global">require</span> <span class="string">'pl.lapp'</span> +<span class="keyword">local</span> args = lapp <span class="string">[[ + --cmd (optional string) Command to run. +]]</span> + +<span class="keyword">if</span> args.cmd <span class="keyword">then</span> + <span class="global">os</span>.execute(args.cmd) +<span class="keyword">end</span> +</pre> + +<p>Here we're implying that <code>cmd</code> need not be specified (just as with <code>default</code>) but if not +present, then <code>args.cmd</code> is <code>nil</code>, which will always test false.</p> + +<p>The rest of the line is ignored and can be used for explanatory text.</p> + +<p>This script shows the relation between the specified parameter names and the +fields in the output table.</p> + + +<pre> +<span class="comment">-- simple.lua +</span><span class="keyword">local</span> args = <span class="global">require</span> (<span class="string">'pl.lapp'</span>) <span class="string">[[ +Various flags and option types + -p A simple optional flag, defaults to false + -q,--quiet A simple flag with long name + -o (string) A required option with argument + -s (default 'save') Optional string with default 'save' (single quotes ignored) + -n (default 1) Optional numerical 
flag with default 1 + -b (string default 1) Optional string flag with default '1' (type explicit) + <input> (default stdin) Optional input file parameter, reads from stdin +]]</span> + +<span class="keyword">for</span> k,v <span class="keyword">in</span> <span class="global">pairs</span>(args) <span class="keyword">do</span> + <span class="global">print</span>(k,v) +<span class="keyword">end</span> +</pre> + +<p>I've just dumped out all values of the args table; note that args.quiet has +become true, because it's specified; args.p defaults to false. If there is a long +name for an option, that will be used in preference as a field name. A type or +default specifier is not necessary for simple flags, since the default type is +boolean.</p> + + +<pre> +$ simple -o test -q simple.lua +p <span class="keyword">false</span> +input file (<span class="number">781</span>C1BD8) +quiet <span class="keyword">true</span> +o test +input_name simple.lua +D:\dev\lua\lapp>simple -o test simple.lua one two three +<span class="number">1</span> one +<span class="number">2</span> two +<span class="number">3</span> three +p <span class="keyword">false</span> +quiet <span class="keyword">false</span> +input file (<span class="number">781</span>C1BD8) +o test +input_name simple.lua +</pre> + +<p>The parameter input has been set to an open read-only file object - we know it +must be a read-only file since that is the type of the default value. The field +input_name is automatically generated, since it's often useful to have access to +the original filename.</p> + +<p>Notice that any extra parameters supplied will be put in the result table with +integer indices, i.e. 
args[i] where i goes from 1 to #args.</p> + +<p>Files don't really have to be closed explicitly for short scripts with a quick +well-defined mission, since the result of garbage-collecting file objects is to +close them.</p> + +<h4>Enforcing a Range and Enumerations</h4> + +<p>The type specifier can also be of the form '(' MIN '..' MAX ')' or a set of strings +separated by '|'.</p> + + +<pre> +<span class="keyword">local</span> lapp = <span class="global">require</span> <span class="string">'pl.lapp'</span> +<span class="keyword">local</span> args = lapp <span class="string">[[ + Setting ranges + <x> (1..10) A number from 1 to 10 + <y> (-5..1e6) Bigger range + <z> (slow|medium|fast) +]]</span> + +<span class="global">print</span>(args.x,args.y) +</pre> + +<p>Here the meaning of ranges is that the value is greater than or equal to MIN and less than or equal +to MAX. +An 'enum' is a <em>string</em> that can only have values from a specified set.</p> + +<h4>Custom Types</h4> + +<p>There is no built-in way to force a parameter to be a whole number, but +you may define a custom type that does this:</p> + + +<pre> +lapp = <span class="global">require</span> (<span class="string">'pl.lapp'</span>) + +lapp.add_type(<span class="string">'integer'</span>,<span class="string">'number'</span>, + <span class="keyword">function</span>(x) + lapp.<span class="global">assert</span>(<span class="global">math</span>.ceil(x) == x, <span class="string">'not an integer!'</span>) + <span class="keyword">end</span> +) + +<span class="keyword">local</span> args = lapp <span class="string">[[ + <ival> (integer) Process PID +]]</span> + +<span class="global">print</span>(args.ival) +</pre> + +<p><a href="../libraries/pl.lapp.html#add_type">lapp.add_type</a> takes three parameters, a type name, a converter and a constraint +function. 
The constraint function is expected to throw an assertion if some +condition is not true; we use <a href="../libraries/pl.lapp.html#assert">lapp.assert</a> because it fails in the standard way +for a command-line script. The converter argument can either be a type name known +to Lapp, or a function which takes a string and generates a value.</p> + +<p>Here's a useful custom type that allows dates to be input as <a href="../classes/pl.Date.html#">pl.Date</a> values:</p> + + +<pre> +<span class="keyword">local</span> df = Date.Format() + +lapp.add_type(<span class="string">'date'</span>, + <span class="keyword">function</span>(s) + <span class="keyword">local</span> d,e = df:parse(s) + lapp.<span class="global">assert</span>(d,e) + <span class="keyword">return</span> d + <span class="keyword">end</span> +) +</pre> + +<h4>'varargs' Parameter Arrays</h4> + + +<pre> +lapp = <span class="global">require</span> <span class="string">'pl.lapp'</span> +<span class="keyword">local</span> args = lapp <span class="string">[[ +Summing numbers + <numbers...> (number) A list of numbers to be summed +]]</span> + +<span class="keyword">local</span> sum = <span class="number">0</span> +<span class="keyword">for</span> i,x <span class="keyword">in</span> <span class="global">ipairs</span>(args.numbers) <span class="keyword">do</span> + sum = sum + x +<span class="keyword">end</span> +<span class="global">print</span> (<span class="string">'sum is '</span>..sum) +</pre> + +<p>The parameter <code>numbers</code> has a trailing '...', which indicates that this parameter is +a 'varargs' parameter. 
It must be the last parameter, and args.numbers will be an +array.</p> + +<p>Consider this implementation of the head utility from Mac OS X:</p> + + +<pre> +<span class="comment">-- implements a BSD-style head +</span><span class="comment">-- (see http://www.manpagez.com/man/1/head/osx-10.3.php) +</span> +lapp = <span class="global">require</span> (<span class="string">'pl.lapp'</span>) + +<span class="keyword">local</span> args = lapp <span class="string">[[ +Print the first few lines of specified files + -n (default 10) Number of lines to print + <files...> (default stdin) Files to print +]]</span> + +<span class="comment">-- by default, lapp converts file arguments to an actual Lua file object. +</span><span class="comment">-- But the actual filename is always available as <file>_name. +</span><span class="comment">-- In this case, 'files' is a varargs array, so that 'files_name' is +</span><span class="comment">-- also an array. +</span><span class="keyword">local</span> nline = args.n +<span class="keyword">local</span> nfile = #args.files +<span class="keyword">for</span> i = <span class="number">1</span>,nfile <span class="keyword">do</span> + <span class="keyword">local</span> file = args.files[i] + <span class="keyword">if</span> nfile > <span class="number">1</span> <span class="keyword">then</span> + <span class="global">print</span>(<span class="string">'==> '</span>..args.files_name[i]..<span class="string">' <=='</span>) + <span class="keyword">end</span> + <span class="keyword">local</span> n = <span class="number">0</span> + <span class="keyword">for</span> line <span class="keyword">in</span> file:lines() <span class="keyword">do</span> + <span class="global">print</span>(line) + n = n + <span class="number">1</span> + <span class="keyword">if</span> n == nline <span class="keyword">then</span> <span class="keyword">break</span> <span class="keyword">end</span> + <span class="keyword">end</span> +<span class="keyword">end</span> +</pre> + +<p>Note 
how we have access to all the filenames, because the auto-generated field +<code>files_name</code> is also an array!</p> + +<p>(This is probably not a very considerate script, since Lapp will open all the +files provided, and only close them at the end of the script. See the <code>xhead.lua</code> +example for another implementation.)</p> + +<p>Flags and options may also be declared as vararg arrays, and can occur anywhere. +If there is both a short and a long form, then the trailing "..." must come after the long form, +for example "-x,--network... (string)...".</p> + +<p>Bear in mind that short options can be combined (like 'tar -xzf'), so it's +perfectly legal to have '-vvv'. But normally the value of args.v is just a simple +<code>true</code> value.</p> + + +<pre> +<span class="keyword">local</span> args = <span class="global">require</span> (<span class="string">'pl.lapp'</span>) <span class="string">[[ + -v... Verbosity level; can be -v, -vv or -vvv +]]</span> +vlevel = <span class="keyword">not</span> args.v[<span class="number">1</span>] <span class="keyword">and</span> <span class="number">0</span> <span class="keyword">or</span> #args.v +<span class="global">print</span>(vlevel) +</pre> + +<p>The vlevel assignment is a bit of Lua voodoo, so consider the cases:</p> + + +<pre> +* No -v flag, v is just { <span class="keyword">false</span> } +* One -v flag, v is { <span class="keyword">true</span> } +* Two -v flags, v is { <span class="keyword">true</span>, <span class="keyword">true</span> } +* Three -v flags, v is { <span class="keyword">true</span>, <span class="keyword">true</span>, <span class="keyword">true</span> } +</pre> + +<h4>Defining a Parameter Callback</h4> + +<p>If a script implements <code>lapp.callback</code>, then Lapp will call it after each +argument is parsed. The callback is passed the parameter name, the raw unparsed +value, and the result table. 
It is called immediately after assignment of the +value, so the corresponding field is available.</p> + + +<pre> +lapp = <span class="global">require</span> (<span class="string">'pl.lapp'</span>) + +<span class="keyword">function</span> lapp.callback(parm,arg,args) + <span class="global">print</span>(<span class="string">'+'</span>,parm,arg) +<span class="keyword">end</span> + +<span class="keyword">local</span> args = lapp <span class="string">[[ +Testing parameter handling + -p Plain flag (defaults to false) + -q,--quiet Plain flag with GNU-style optional long name + -o (string) Required string option + -n (number) Required number option + -s (default 1.0) Option that takes a number, but will default + <start> (number) Required number argument + <input> (default stdin) A parameter which is an input file + <output> (default stdout) One that is an output file +]]</span> +<span class="global">print</span> <span class="string">'args'</span> +<span class="keyword">for</span> k,v <span class="keyword">in</span> <span class="global">pairs</span>(args) <span class="keyword">do</span> + <span class="global">print</span>(k,v) +<span class="keyword">end</span> +</pre> + +<p>This produces the following output:</p> + + +<pre> +$ args -o name -n <span class="number">2</span> <span class="number">10</span> args.lua ++ o name ++ n <span class="number">2</span> ++ start <span class="number">10</span> ++ input args.lua +args +p <span class="keyword">false</span> +s <span class="number">1</span> +input_name args.lua +quiet <span class="keyword">false</span> +output file (<span class="number">781</span>C1B98) +start <span class="number">10</span> +input file (<span class="number">781</span>C1BD8) +o name +n <span class="number">2</span> +</pre> + +<p>Callbacks are needed when you want to take action immediately on parsing an +argument.</p> + +<h4>Slack Mode</h4> + +<p>If you'd like to use a multi-letter 'short' parameter you need to set +the <code>lapp.slack</code> variable to 
<code>true</code>.</p> + +<p>In the following example we also see how default <code>false</code> and default <code>true</code> flags can be used +and how to overwrite the default <code>-h</code> help flag (<code>--help</code> still works fine) - this applies +to non-slack mode as well.</p> + + +<pre> +<span class="comment">-- Parsing the command line ---------------------------------------------------- +</span><span class="comment">-- test.lua +</span><span class="keyword">local</span> lapp = <span class="global">require</span> <span class="string">'pl.lapp'</span> +<span class="keyword">local</span> pretty = <span class="global">require</span> <span class="string">'pl.pretty'</span> +lapp.slack = <span class="keyword">true</span> +<span class="keyword">local</span> args = lapp <span class="string">[[ +Does some calculations + -v, --video (string) Specify input video + -w, --width (default 256) Width of the video + -h, --height (default 144) Height of the video + -t, --time (default 10) Seconds of video to process + -sk,--seek (default 0) Seek number of seconds + -f1,--flag1 A false flag + -f2,--flag2 A false flag + -f3,--flag3 (default true) A true flag + -f4,--flag4 (default true) A true flag +]]</span> + +pretty.dump(args) +</pre> + +<p>And here we can see the output of <code>test.lua</code>:</p> + + +<pre> +$> lua test.lua -v abc <span class="comment">--time 40 -h 20 -sk 15 --flag1 -f3 +</span><span class="comment">----> +</span>{ + width = <span class="number">256</span>, + flag1 = <span class="keyword">true</span>, + flag3 = <span class="keyword">false</span>, + seek = <span class="number">15</span>, + flag2 = <span class="keyword">false</span>, + video = abc, + time = <span class="number">40</span>, + height = <span class="number">20</span>, + flag4 = <span class="keyword">true</span> +} +</pre> + +<p><a name="Simple_Test_Framework"></a></p> +<h3>Simple Test Framework</h3> + +<p><a href="../libraries/pl.test.html#">pl.test</a> was originally developed for 
the sole purpose of testing Penlight itself, +but you may find it useful for your own applications. (<a href="http://lua-users.org/wiki/UnitTesting">There are many other options</a>.)</p> + +<p>Most of the goodness is in <a href="../libraries/pl.test.html#asserteq">test.asserteq</a>. It uses <a href="../libraries/pl.tablex.html#deepcompare">tablex.deepcompare</a> on its two arguments, +and by default quits the test application with a non-zero exit code, and an informative +message printed to stderr:</p> + + +<pre> +<span class="keyword">local</span> test = <span class="global">require</span> <span class="string">'pl.test'</span> + +test.asserteq({<span class="number">10</span>,<span class="number">20</span>,<span class="number">30</span>},{<span class="number">10</span>,<span class="number">20</span>,<span class="number">30.1</span>}) + +<span class="comment">--~ test-test.lua:3: assertion failed +</span><span class="comment">--~ got: { +</span><span class="comment">--~ [1] = 10, +</span><span class="comment">--~ [2] = 20, +</span><span class="comment">--~ [3] = 30 +</span><span class="comment">--~ } +</span><span class="comment">--~ needed: { +</span><span class="comment">--~ [1] = 10, +</span><span class="comment">--~ [2] = 20, +</span><span class="comment">--~ [3] = 30.1 +</span><span class="comment">--~ } +</span><span class="comment">--~ these values were not equal</span> +</pre> + +<p>This covers most cases but it's also useful to compare strings using <a href="https://www.lua.org/manual/5.1/manual.html#pdf-string.match">string.match</a></p> + + +<pre> +<span class="comment">-- must start with bonzo the dog +</span>test.assertmatch (<span class="string">'bonzo the dog is here'</span>,<span class="string">'^bonzo the dog'</span>) +<span class="comment">-- must end with an integer +</span>test.assertmatch (<span class="string">'hello 42'</span>,<span class="string">'%d+$'</span>) +</pre> + +<p>Since Lua errors are usually strings, this matching strategy is used 
to test 'exceptions':</p> + + +<pre> +test.assertraise(<span class="keyword">function</span>() + <span class="keyword">local</span> t = <span class="keyword">nil</span> + <span class="global">print</span>(t.bonzo) +<span class="keyword">end</span>,<span class="string">'nil value'</span>) +</pre> + +<p>(Some care is needed to match the essential part of the thrown error if you care +for portability, since in Lua 5.2 +the exact error is "attempt to index local 't' (a nil value)" and in Lua 5.3 the error +is "attempt to index a nil value (local 't')")</p> + +<p>There is an extra optional argument to these test functions, which is helpful when writing +test helper functions. There you want to highlight the failed line, not the actual call +to <code>asserteq</code> or <code>assertmatch</code> - line 33 here is the call to <code>is_iden</code></p> + + +<pre> +<span class="keyword">function</span> is_iden(str) + test.assertmatch(str,<span class="string">'^[%a_][%w_]*$'</span>,<span class="number">1</span>) +<span class="keyword">end</span> + +is_iden <span class="string">'alpha_dog'</span> +is_iden <span class="string">'$dollars'</span> + +<span class="comment">--~ test-test.lua:33: assertion failed +</span><span class="comment">--~ got: "$dollars" +</span><span class="comment">--~ needed: "^[%a_][%w_]*$" +</span><span class="comment">--~ these strings did not match</span> +</pre> + +<p>Useful Lua functions often return multiple values, and <a href="../libraries/pl.test.html#tuple">test.tuple</a> is a convenient way to +capture these values, whether they contain nils or not.</p> + + +<pre> +T = test.tuple + +<span class="comment">--- common error pattern +</span><span class="keyword">function</span> failing() + <span class="keyword">return</span> <span class="keyword">nil</span>,<span class="string">'failed'</span> +<span class="keyword">end</span> + +test.asserteq(T(failing()),T(<span class="keyword">nil</span>,<span class="string">'failed'</span>)) +</pre> + + + +</div> 
<!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> diff --git a/Data/Libraries/Penlight/docs/manual/09-discussion.md.html b/Data/Libraries/Penlight/docs/manual/09-discussion.md.html new file mode 100644 index 0000000..4e7dd69 --- /dev/null +++ b/Data/Libraries/Penlight/docs/manual/09-discussion.md.html @@ -0,0 +1,233 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" + "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> +<html> +<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> +<head> + <title>Penlight Documentation</title> + <link rel="stylesheet" href="../ldoc_fixed.css" type="text/css" /> +</head> +<body> + +<div id="container"> + +<div id="product"> + <div id="product_logo"></div> + <div id="product_name"><big><b></b></big></div> + <div id="product_description"></div> +</div> <!-- id="product" --> + + +<div id="main"> + + +<!-- Menu --> + +<div id="navigation"> +<br/> +<h1>Penlight</h1> + +<ul> + <li><a href="https://github.com/lunarmodules/Penlight">GitHub Project</a></li> + <li><a href="../index.html">Documentation</a></li> +</ul> + +<h2>Contents</h2> +<ul> +<li><a href="#Modularity_and_Granularity">Modularity and Granularity </a></li> +<li><a href="#Defining_what_is_Callable">Defining what is Callable </a></li> +</ul> + + +<h2>Manual</h2> +<ul class="nowrap"> + <li><a href="../manual/01-introduction.md.html">Introduction</a></li> + <li><a href="../manual/02-arrays.md.html">Tables and Arrays</a></li> + <li><a href="../manual/03-strings.md.html">Strings. 
Higher-level operations on strings.</a></li> + <li><a href="../manual/04-paths.md.html">Paths and Directories</a></li> + <li><a href="../manual/05-dates.md.html">Date and Time</a></li> + <li><a href="../manual/06-data.md.html">Data</a></li> + <li><a href="../manual/07-functional.md.html">Functional Programming</a></li> + <li><a href="../manual/08-additional.md.html">Additional Libraries</a></li> + <li><strong>Technical Choices</strong></li> +</ul> +<h2>Libraries</h2> +<ul class="nowrap"> + <li><a href="../libraries/pl.html">pl</a></li> + <li><a href="../libraries/pl.app.html">pl.app</a></li> + <li><a href="../libraries/pl.array2d.html">pl.array2d</a></li> + <li><a href="../libraries/pl.class.html">pl.class</a></li> + <li><a href="../libraries/pl.compat.html">pl.compat</a></li> + <li><a href="../libraries/pl.comprehension.html">pl.comprehension</a></li> + <li><a href="../libraries/pl.config.html">pl.config</a></li> + <li><a href="../libraries/pl.data.html">pl.data</a></li> + <li><a href="../libraries/pl.dir.html">pl.dir</a></li> + <li><a href="../libraries/pl.file.html">pl.file</a></li> + <li><a href="../libraries/pl.func.html">pl.func</a></li> + <li><a href="../libraries/pl.import_into.html">pl.import_into</a></li> + <li><a href="../libraries/pl.input.html">pl.input</a></li> + <li><a href="../libraries/pl.lapp.html">pl.lapp</a></li> + <li><a href="../libraries/pl.lexer.html">pl.lexer</a></li> + <li><a href="../libraries/pl.luabalanced.html">pl.luabalanced</a></li> + <li><a href="../libraries/pl.operator.html">pl.operator</a></li> + <li><a href="../libraries/pl.path.html">pl.path</a></li> + <li><a href="../libraries/pl.permute.html">pl.permute</a></li> + <li><a href="../libraries/pl.pretty.html">pl.pretty</a></li> + <li><a href="../libraries/pl.seq.html">pl.seq</a></li> + <li><a href="../libraries/pl.sip.html">pl.sip</a></li> + <li><a href="../libraries/pl.strict.html">pl.strict</a></li> + <li><a href="../libraries/pl.stringio.html">pl.stringio</a></li> + <li><a 
href="../libraries/pl.stringx.html">pl.stringx</a></li> + <li><a href="../libraries/pl.tablex.html">pl.tablex</a></li> + <li><a href="../libraries/pl.template.html">pl.template</a></li> + <li><a href="../libraries/pl.test.html">pl.test</a></li> + <li><a href="../libraries/pl.text.html">pl.text</a></li> + <li><a href="../libraries/pl.types.html">pl.types</a></li> + <li><a href="../libraries/pl.url.html">pl.url</a></li> + <li><a href="../libraries/pl.utils.html">pl.utils</a></li> + <li><a href="../libraries/pl.xml.html">pl.xml</a></li> +</ul> +<h2>Classes</h2> +<ul class="nowrap"> + <li><a href="../classes/pl.Date.html">pl.Date</a></li> + <li><a href="../classes/pl.List.html">pl.List</a></li> + <li><a href="../classes/pl.Map.html">pl.Map</a></li> + <li><a href="../classes/pl.MultiMap.html">pl.MultiMap</a></li> + <li><a href="../classes/pl.OrderedMap.html">pl.OrderedMap</a></li> + <li><a href="../classes/pl.Set.html">pl.Set</a></li> +</ul> +<h2>Examples</h2> +<ul class="nowrap"> + <li><a href="../examples/seesubst.lua.html">seesubst.lua</a></li> + <li><a href="../examples/sipscan.lua.html">sipscan.lua</a></li> + <li><a href="../examples/symbols.lua.html">symbols.lua</a></li> + <li><a href="../examples/test-cmp.lua.html">test-cmp.lua</a></li> + <li><a href="../examples/test-data.lua.html">test-data.lua</a></li> + <li><a href="../examples/test-listcallbacks.lua.html">test-listcallbacks.lua</a></li> + <li><a href="../examples/test-pretty.lua.html">test-pretty.lua</a></li> + <li><a href="../examples/test-symbols.lua.html">test-symbols.lua</a></li> + <li><a href="../examples/testclone.lua.html">testclone.lua</a></li> + <li><a href="../examples/testconfig.lua.html">testconfig.lua</a></li> + <li><a href="../examples/testglobal.lua.html">testglobal.lua</a></li> + <li><a href="../examples/testinputfields.lua.html">testinputfields.lua</a></li> + <li><a href="../examples/testinputfields2.lua.html">testinputfields2.lua</a></li> + <li><a 
href="../examples/testxml.lua.html">testxml.lua</a></li> + <li><a href="../examples/which.lua.html">which.lua</a></li> +</ul> + +</div> + +<div id="content"> + + +<h2>Technical Choices</h2> + +<p><a name="Modularity_and_Granularity"></a></p> +<h3>Modularity and Granularity</h3> + +<p>In an ideal world, a program should only load the libraries it needs. Penlight is +intended to work in situations where an extra 100Kb of bytecode could be a +problem. It is straightforward but tedious to load exactly what you need:</p> + + +<pre> +<span class="keyword">local</span> data = <span class="global">require</span> <span class="string">'pl.data'</span> +<span class="keyword">local</span> List = <span class="global">require</span> <span class="string">'pl.List'</span> +<span class="keyword">local</span> array2d = <span class="global">require</span> <span class="string">'pl.array2d'</span> +<span class="keyword">local</span> seq = <span class="global">require</span> <span class="string">'pl.seq'</span> +<span class="keyword">local</span> utils = <span class="global">require</span> <span class="string">'pl.utils'</span> +</pre> + +<p>This is the style that I follow in Penlight itself, so that modules don't mess +with the global environment; also, <code>stringx.import()</code> is not used because it will +update the global <a href="https://www.lua.org/manual/5.1/manual.html#5.4">string</a> table.</p> + +<p>But <code>require 'pl'</code> is more convenient in scripts; the question is how to ensure +that one doesn't load the whole kitchen sink as the price of convenience. The +strategy is to only load modules when they are referenced. In 'init.lua' (which +is loaded by <code>require 'pl'</code>) a metatable is attached to the global table with an +<code>__index</code> metamethod. Any unknown name is looked up in the list of modules, and +if found, we require it and make that module globally available. 
So when +<a href="../libraries/pl.tablex.html#deepcompare">tablex.deepcompare</a> is encountered, looking up <a href="../libraries/pl.tablex.html#">tablex</a> causes 'pl.tablex' to be +required.</p> + +<p>Modifying the behaviour of the global table has consequences. For instance, there +is the famous module <a href="../libraries/pl.strict.html#">strict</a> which comes with Lua itself (perhaps the only +standard Lua module written in Lua itself) which also does this modification so +that global variables must be defined before use. So the implementation in +'init.lua' allows for a 'not found' hook, which 'pl.strict.lua' uses. Other +libraries may install their own metatables for <code>_G</code>, but Penlight will now +forward any unknown name to the <code>__index</code> defined by the original metatable.</p> + +<p>But the strategy is worth the effort: the old 'kitchen sink' 'init.lua' would +pull in about 260K of bytecode, whereas now typical programs use about 100K less, +and short scripts do even better, for instance if they only need +functionality in <a href="../libraries/pl.utils.html#">utils</a>.</p> + +<p>There are some functions which mark their output table with a special metatable, +when it seems particularly appropriate. For instance, <a href="../libraries/pl.tablex.html#makeset">tablex.makeset</a> creates a +<a href="../classes/pl.Set.html#">Set</a>, and <a href="../libraries/pl.seq.html#copy">seq.copy</a> creates a <a href="../classes/pl.List.html#">List</a>. But this does not automatically result in +the loading of <a href="../classes/pl.Set.html#">pl.Set</a> and <a href="../classes/pl.List.html#">pl.List</a>; that happens only if you try to access any of their +methods. 
In 'utils.lua', there is an exported table called <code>stdmt</code>:</p> + + +<pre> +stdmt = { List = {}, Map = {}, Set = {}, MultiMap = {} } +</pre> + +<p>If you go through 'init.lua', then these plain little 'identity' tables get an +<code>__index</code> metamethod which forces the loading of the full functionality. Here is +the code from 'list.lua' which starts the ball rolling for lists:</p> + + +<pre> +List = utils.stdmt.List +List.__index = List +List._name = <span class="string">"List"</span> +List._class = List +</pre> + +<p>The 'load-on-demand' strategy helps to modularize the library. Especially for +more casual use, <code>require 'pl'</code> is a good compromise between convenience and +modularity.</p> + +<p>In this current version, I have generally reduced the amount of trickery +involved. Previously, <a href="../classes/pl.Map.html#">Map</a> was defined in <a href="../libraries/pl.class.html#">pl.class</a>; now it is sensibly defined +in <a href="../classes/pl.Map.html#">pl.Map</a>; <a href="../libraries/pl.class.html#">pl.class</a> only contains the basic class mechanism (and returns that +function). For consistency, <a href="../classes/pl.List.html#">List</a> is returned directly by <code>require 'pl.List'</code> +(note the uppercase 'L'). Also, the number of module dependencies in the +non-core libraries like <a href="../libraries/pl.config.html#">pl.config</a> has been reduced.</p> + +<p><a name="Defining_what_is_Callable"></a></p> +<h3>Defining what is Callable</h3> + +<p>'utils.lua' exports <code>function_arg</code> which is used extensively throughout Penlight. +It defines what is meant by 'callable'. Obviously true functions are immediately +passed back. But what about strings? 
The first option is that the string represents an +operator in 'operator.lua', so that '<' is just an alias for <a href="../libraries/pl.operator.html#lt">operator.lt</a>.</p> + +<p>We then check whether there is a <em>function factory</em> defined for the metatable of +the value.</p> + +<p>(It is true that strings can be made callable, but in practice this turns out to +be a cute but dubious idea, since <em>all</em> strings share the same metatable. A +common programming error is to pass the wrong kind of object to a function, and +it's better to get a nice clean 'attempting to call a string' message rather than +some obscure trace from the bowels of your library.)</p> + +<p>The other module that registers a function factory is <a href="../libraries/pl.func.html#">pl.func</a>. Placeholder +expressions cannot be directly callable, and so need to be instantiated and +cached in as efficient a way as possible.</p> + +<p>(An inconsistency is that <code>utils.is_callable</code> does not do this thorough check.)</p> + + + + +</div> <!-- id="content" --> +</div> <!-- id="main" --> +<div id="about"> +<i>generated by <a href="http://github.com/stevedonovan/LDoc">LDoc 1.4.6</a></i> +</div> <!-- id="about" --> +</div> <!-- id="container" --> +</body> +</html> |