Sano Collections Framework
Paul Chiusano paul.chiusano@gmail.com Last updated: February 28, 2006
Contents:
1. About the framework
1.1 Getting Started
1.2 Basics
1.3 More Examples
2. API Documentation
3. Download
4. Repository access
5. Getting Help
6. Roadmap
1. About the framework
The Sano framework contains pureLua implementations of several common (and a few notsocommon) data structures and related algorithms. It runs under both Lua 5.0 and 5.1, and is MIT licensed.
Among the highlights are: Several sequence types, including
Tuple (a lightweight, hashable table wrapper),
Vector (slightly more
industrial strength table wrapper, including negative indexing),
QueueVector
(random access + efficient inserts at head OR tail),
a basic LinkedList,
and SkipVector (a skiplist based random access
sequence with efficient inserts or removals at ANY position).
The library also includes several map implementations: among them the
basic HashMap (a tablebased map with plenty of convenience methods
and support for userdefined hashing functions), an ordered,
mergeable, spliceable SkipMap, and a LinkedMap (iteration order same as
insertion order). For each of these map types, Sano also has
the parallel set type  HashSet, SkipSet, LinkedSet.
In addition, the library contains a PairingHeap (constanttime
mergeable heap), Multimap (implements one to many mappings), Multiset
(ordered or unordered), and numerous algorithms and utility functions
for working with all the data structures in the library.
All the data structures in the library are at the very least lightly unittested; some I've used extensively for my own projects and are probably pretty stable. Bug reports are appreciated, as are suggestions and comments.
1.1 Getting Started
Sano is packaged as a single 200KB Lua file with all modules and documentation embedded. To get started with Sano, download the sano.lua file linked to on the Sano Luaforge page and place it somewhere Lua's 'require' mechanism knows to look. With that done, you can access the Sano library in the obvious way:
local sano = require 'sano'
...
local v = sano.Vector:new()
One minor caveat is that the root module returned via the call require 'sano' doesn't add anything to the global namespace; you'll need to store the result in a local (or global!) variable.
1.2 Basics
To start off, we'll look at two data structures which are not much more than lightweight wrappers around vanilla Lua tables: HashMap and Vector .
A Vector is not much more than a Lua table with a few handy metamethods. Vectors support negative indexing and can handle nil elements (Vector explicitly tracks the size of the table).
> Vector = sano.Vector
> iter = sano.iter
> vec = Vector:new()
> vec:addAll(iter.count(5))  see below
> print(vec)
[1, 2, 3, 4, 5]
All Sano data structures have two construction methods: new and make . The make method is a value constructor; the new method always creates an empty object, and for some classes, new accepts parameters which affect the behavior of the class (for instance, PairingHeap accepts an ordering function which determines which objects will appear at the top of the heap). We could have created the above Vector a little more succinctly using the make method:
> vec = Vector:make(iter.count(5))
> print(vec)
[1, 2, 3, 4, 5]
The make method, and most all Sano functions which handle plaural objects (Lua tables, iterators, strings, and other Sano collections), will work without trouble for any type of plaural object. In the above example, we constructed a Vector by passing in iter.count(5) , which just returns an iterator that counts off the integers 1 through 5:
> for int in iter.count(5) do print(int) end
1
2
3
4
5
We could just as easily have supplied a Lua table or some Sano collection (including, but not limited to, Vector) as the argument to make. For example:
> v1 = Vector:make(iter.count(5))
> v2 = Vector:make{1,2,3,4,5}
> v3 = Vector:make(v1)
> print(v1==v2, v2==v3)
true true
As the last line shows, the __eq metamethod for Vector performs a pairwise comparison of elements to determine equality. All the Sano data structures do something sensible for their equality comparison, and where it makes sense, most data structures have metamethods for other operators (for instance, set union can be expressed via + , and so on).
Here are some more examples of Vector:
> v = Vector:make(1,2,3,4,5)
> print(v)
[1, 2, 3, 4, 5]
> v:shuffle(); print(v)
[1, 5, 3, 2, 4]
> v:sort(); print(v)
[1, 2, 3, 4, 5]
> print(v:get(1), v:get(1), v(1))  v:get(x)==v(x)
1 5 5
> for e in v:iter() do print(e) end
1
2
3
4
5
> vSub = Vector:make(v:iter(2,1)); print(vSub)
[2, 3, 4, 5]
> vSub:add(1,0); vSub:add(6); vSub:add(2.2,2)
> print(vSub)
[1, 2, 2.2, 3, 4, 5, 6]
> print(vSub:remove(3), vSub)
2.2 [1, 2, 3, 4, 5, 6]
HashMap is the basic Sano map/table implementation. The table below shows the mapping between Lua table syntax and Sano HashMap syntax:
table

HashMap

t = {}

t = HashMap:new()

t[k]

t:get(k), t(k)

t[k] = v

t:add(k,v)

t[k] = nil

t:remove(k)

HashMap supports userdefined hash functions. Thus, it's possible to use objects other than numbers and strings as keys in a HashMap without resorting to memoization or converting the keys to strings. For instance,
the Tuple class overrides its __hash and __eq metamethods, so HashMap can
be used as a multikey map by making the keys tuples:
> h = HashMap:new()
> tuple = sano.makers.tuple  see explanation below
> h:add(tuple(1,1,2), tuple(3,4))
...
> print( h:get(tuple(1,1,2)) )
(3, 4)
If h were a regular Lua table, the two instances of tuple(1,1,2) would not be recognized as being equivalent:
> h = {}
> h[tuple(1,1,2)] = tuple(3,4)
...
> print( h[tuple(1,1,2)] )
nil
As it can get a little verbose typing ClassName:make , especially for commonly used data structures, Sano has functions in the sano.makers table for constructing Sano data structures with less typing, for example:
> vector = sano.makers.vector  Vector maker
> list = sano.makers.list  SkipVector maker
> map = sano.makers.map  HashMap maker
> omap = sano.makers.omap  SkipMap maker
> set = sano.makers.set  HashSet maker
> oset = sano.makers.oset  SkipSet maker
> heap = sano.makers.heap  PairingHeap maker
> tuple = sano.makers.tuple  Tuple maker
...
> v = vector(5,4,3,2,1); print(v)
[5, 4, 3, 2, 1]
> l = list(5,4,3,2,1); print(l)
[5, 4, 3, 2, 1]
> m = map("abcde",{1,2,3,4,5}); print(m)
{a=1, c=3, b=2, e=5, d=4}
> om = omap("abcde",{1,2,3,4,5}); print(om)
{a=1, b=2, c=3, d=4, e=5}
> s = set(1,1,2,3,4); print(s)
{1, 2, 3, 4}
> s = set("abcdee"); print(s)
{"a", "c", "b", "e", "d"}
> so = oset("abcdee"); print(so)
{"a", "b", "c", "d", "e"}
> h = heap(l:shuffle()); print(h)
/1, 2, 3, 5, 4\
1.3 More Examples
> h = map{a=1,b=2,c=3,d=4}  h = HashMap:make{...}
> print(h)
{a=1, d=4, c=3, b=2}
> for k,v in h:iter() do print(k,v) end
a 1
d 4
c 3
b 2
>  iterate over the entries in sorted order
> for k,v in h:orderedIter() do print(k,v) end
a 1
b 2
c 3
d 4
SkipMap is an ordered map implementation:
> SkipMap = sano.SkipMap
> s = SkipMap:make(h); print(s)
{a=1, b=2, c=3, d=4}
>  iterate over a sorted range of entries
> for k,v in s:iter('b','c') do print(k,v) end
b 2
c 3
>  remove a sorted range of entries
> cut = s:remove('b','c'); print(cut)
{b=2, c=3}
> print(s)
{a=1, d=4}
>  merge the two maps back together
> s:merge(cut); print(s)
{a=1, b=2, c=3, d=4}
>  create another SkipMap with the reverse ordering
> srev = SkipMap:new(sano.ordering.DECREASING)
> print(srev:addMappings(s))
{d=4, c=3, b=2, a=1}
Here are some example uses of different set implementations:
> HashSet, SkipSet = sano.HashSet, sano.SkipSet
> s1 = HashSet:make(iter.count(10))
> s2 = set(iter.count(10))
> print(s1 == s2)
true
> print(set(s1))
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
> print(s1  set(iter.count(5,10)))
{1, 2, 3, 4}
> so1 = SkipSet:make("ace"); so2 = oset("ace")
> print(so1 == so2)
true
> so3 = oset("bd"); print(so1, so3)
{"a", "c", "e"} {"b", "d"}
>  efficiently merge two ordered sets:
> so1:merge(so3); print(so1)
{"a", "b", "c", "d", "e"}
> print(vector(so1:iter('b','d')))
["b", "c", "d"]
Multiset is a collection in which duplicate elements are not explicitly stored:
> m = Multiset:make(iter.repeatCall(100, math.random, 1, 5)); print(m)
{1^27, 2^14, 3^22, 4^17, 5^20}
> m2 = mset(m); print(m == m2)
true
> m3 = mset(iter.repeatCall(100, math.random, 1, 5)); print(m3)
{1^18, 2^28, 3^24, 4^14, 5^16}
>  computing set operations is efficient
> m:addAll(m3); print(m)
{1^45, 2^42, 3^46, 4^31, 5^36}
Multimap is handy for working with onetomany mappings. When Multimap:add(key,val) is called for some key which already exists in the map, val is added to a collection stored against key .
> Multimap = sano.Multimap
> m = Multimap:new()
> m:add('a',1); m:add('a',2); m:add('b',3); print(m)
{a=[1, 2], b=[3]}
> m:addValues('c',{1,2,3,4,5}); print(m)
{a=[1, 2], c=[1, 2, 3, 4, 5], b=[3]}
When a onetomany mapping is built up incrementally, Multimap is often more convenient than inserting checks such as:
local vals = mmap:get(k)
if vals then
vals:add(val)
else
vals = Vector:new()
vals:add(val)
end
mmap:add(k, vals)
PairingHeap is a basic multipass pairingheap implementation:
> h = PairingHeap:make(iter.repeatCall(10, math.random, 1, 5))
> print(h)
/1, 1, 4, 5, 4, 1, 2, 2, 2, 3\
> v = vector(h:extractFirstK(6))  partial sort
> print(v)
[1, 1, 1, 2, 2, 2]
> print(h)
/3, 4, 4, 5\
> h:merge(heap(iter.count(10))); print(h)
/1, 3, 10, 9, 8, 7, 6, 5, 4, 3, 2, 4, 4, 5\
> print(h:get())
1
> print(h:remove(), h)
1 /2, 4, 3, 3, 8, 10, 4, 4, 5, 9, 6, 5, 7\
> print(vector(h:extractFirstK()))  full heapsort
[2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 9, 10]
The sano.ordering module contains utility functions for working with ordered data. Here are just a few example uses:
> v = list(iter.count(10)):shuffle()
> print(v)
[5, 4, 10, 3, 7, 8, 6, 2, 9, 1]
> print(ordering.first(v))
1
> print(ordering.last(v))
10
> print(ordering.first(v, ordering.DECREASING))
10
> kth = ordering.kth(v, 6); print(kth)  runs in linear time
6
> vectors = vector(vector(1,4,2), vector(1,1), vector{1}, vector(1,2,3,4))
> print(vectors)
[[1, 4, 2], [1, 1], [1], [1, 2, 3, 4]]
>  order these vectors by size
> vectors:sort(ordering.byTransform(Vector.size))
> print(vectors)
[[1], [1, 1], [1, 4, 2], [1, 2, 3, 4]]
2. API Documentation
Online API documentation is available here. The same documentation can also be accessed from Lua: to get help on any object, function, or module,
sano.summary(obj) prints a brief one sentance
description of obj, and, if obj is an object or module, a brief one
sentance description of each of the methods of the object or functions
of the module. sano.doc(obj) prints full
documentation for obj.
For example, sano.summary(sano.PairingHeap) prints:
PairingHeap  Heap implementation with constanttime merge and getfirst
operations.
PairingHeap:new(orderFn)  Constructs a new PairingHeap using the
supplied ordering function (default is natural increasing order).
collections.make(self, ...)  Creates and returns a new instance of self
containing the elements passed in as vararg params.
PairingHeap:__eq(otherheap)  Returns true if both heaps have the same
size and both self:iter() and otherheap:iter() return the same
sequence of values.
PairingHeap:add(val)  Adds val to this PairingHeap.
PairingHeap:addNode(node)  Adds this node to the heap and returns it.
PairingHeap:get()  Returns the first/topmost element in the heap.
PairingHeap:get()  Returns the first/topmost element in the heap.
PairingHeap:iter(node)  Returns an iterator over the values in the
heap, starting from node, which defaults to the root.
PairingHeap:merge(pheap)  Merges another PairingHeap, pheap, inplace
into this heap.
PairingHeap:nodeIter(node)  Returns an iterator over the NODES in the
heap, starting from node (default is root).
PairingHeap:remove(node)  Removes the node from this PairingHeap and
returns the value that was stored at that node.
PairingHeap:set(node, newVal)  Adjusts the value stored at node to be
newVal and reestablishes the heap property.
PairingHeap:size()  Returns the number of elements in the PairingHeap.
PairingHeap:test()  Unit test.
PairingHeap:update(node)  Removes and reinserts node.
heaps.toString(heap)  Returns a string representation of heap.
collections.addAll(collection, iterable)  Calls collection:add(e) for
each e in iter(iterable).
heaps.extractFirstK(heap [, k])  Returns an iterator which
destructively removes and returns the first k elements from this
heap.
Likewise, calling sano.doc(PairingHeap.merge) prints:
PairingHeap:merge(pheap)  Merges another PairingHeap, pheap, inplace
into this heap.
This heap is modified and pheap is invalidated as a result of this call.
returns: this PairingHeap (to allow chaining)
time complexity: O(1)
Note that a lot of modules "mixin" methods from other modules so it's not uncommon to see, for example, collections.addAll in the summary for PairingHeap or SkipSet .
3. Download
The latest release can be downloaded from the Luaforge project page.
4. Repository access
The development version can be accessed via the Darcs repository:
darcs get http://www.cs.brandeis.edu/~pchiusano/repos/sano
5. Getting Help
I'll be monitoring the Luaforge forum (all messages get routed to my inbox), so if you have questions, comments, suggestions, or want help, you can post there and I'll respond as soon as I can. Of course, you can also email me directly.
6. Roadmap
Where is this all going? Well, I'm not sure. It really depends on what other people are interested in seeing. If you like the framework and have some idea for where you think it should go, or some addition you'd like to see, or if you'd like to contribute some new data structure or algorithm, post to the forum or email me.
