Experiments in implementing data structures in the dependently typed programming language Idris.
Notably missing:
- Instances of
Eq,Ord,Functor,Applicative,Monad,Traversable,Show,DecEq - Efficient dictionaries (these are a generalization of sets)
- Benchmarks
For now these are left as a proverbial 'exercise for the reader'.
The code in this repository requires Idris 1.3.x (last tested on Idris 1.3.1). Given the experimental nature of this project, an Idris 2 port is not planned for.
OrderedVect is a linked list type that is always sorted. The element count and order criteria are embedded in the type. There's also a merge operation that will merge two OrderedVects.
Available operations: merge, head, tail, orderedVectToList
In Idris, Elem 2 [1, 2, 3, 4] is satisfied by a proof that 2 is an element of [1, 2, 3, 4]. There is only one such proof:
Idris> the (Elem 2 [1,2,3,4]) (There Here)
There Here : Elem 2 [1, 2, 3, 4]
The module VectRankedElem contains a type RankedElem that is similar but also encodes the index where the element is found:
Idris> the (RankedElem 2 [1,2,3,4] 1) (There Here)
There Here : RankedElem 2 [1, 2, 3, 4] 1
The module also contains a few functions that use RankedElem to prove interesting properties:
cons_ x xsreturns(x::xs)as well as a proof that for everyRankedElem el xs ithere is aRankedElem el (x::xs) (S i)and a proof ofRankedElem x (x::xs) Zconcat_ xs ysreturnsxs ++ ysas well as a proof that for everyRankedElem el xs ithere is aRankedElem el (xs ++ ys) iand a proof that for everyRankedElem el ys ithere is aRankedElem el (length xs + i) irev_ xsreturnsreverse xsas well as a proof that for everyRankedElem el xs ithere is aRankedElem el (reverse xs) (length xs - i)
The module also contains functions cons, concat, rev that call their _ counterpart and return only data. These functions behave entirely like ::, ++, reverse.
The proofs returned by cons_, concat_, rev_ can be composed to prove properties on more complicated data structures.
The data structures in this section are inspired by Chris Okasaki's book Purely Functional Data Structures. This book is based on Okasaki's PhD Thesis, which is available as a free PDF.
Using dependent types I was able to prove the following:
LeftistHeapis always sortedLeftistHeaprespects the 'leftist property'- The result of merging a
LeftistHeapof lengthmand a heap of lengthnis a heap of lengthm + n insertanddeleteMinyield aLeftistHeapthat has one element more or less (respectively) than the input heap
Available operations: findMin, merge, insert, deleteMin.
A supplementary CountedLeftistHeap data structure is also available.
MergeList is a dependently typed 'bottom up merge sort' data structure. A merge list is a linked list of OrderedVects, strictly ordered by size, and where each OrderedVect has a size expressible as 2ⁿ. For example, [[10], [3, 6, 9, 12]] and [[10], [5, 8], [3, 6, 9, 12]] are valid MergeLists.
Available operations: insert, mergeListToOrderedVect
By using dependent types we get the following guarantees:
- The
OrderedVects have sizes that are expresible as 2ⁿ and are strictly ordered by size - A
MergeListof a given size can be flattened into anOrderedVectof the same size - When inserting
nelements, the resulting merge list hasnelements more than the input merge list
LazyPairingHeap is another ordered list data structure. We have the following guarantees:
- Elements are ordered
- Inserting and removing elements changes the size of the heap as expected
Available operations: findMin, merge, insert, deleteMin
The type Queue is a functional queue data struture. It consist of a 'front list' f and a 'reverse list' r that together represent the logical list f ++ reverse r. Moreover, for reasons of efficiency, it is an invariant that r is never larger than f.
Available operations: head, snoc_, snoc, tail_, tail
Similar to what was done in VectRankedElem (see above), snoc_ and tail_ return functions that map proofs about the original queue onto proofs about the new queue.
We have proofs for the following properties:
- The reverse list is never larger than the front list
snoc_preserves the existence and the index of all elements and it adds the new element to the end of the queuetail_drops the first element and shifts the indices of the remaining elements by 1
snoc and tail are simple wrappers around snoc_ and tail_ that return the new queue, but not the proofs.
BinarySearchTree is an implementation of functional binary trees.
Available operations: decElem, elem, insert
The above operations are based on a higher order function named find. find takes a needle and a function for arguments. find efficiently recurses over the tree trying to find the needle. At every step it returns the result of calling f with (a) contextual information about the search process and (b) the result of the previous step. The search context includes a verified upper and lower bound for the current node, by which find can prove to f that it's traversing the tree correctly.
RandomAccessList is a list structure that provides efficient prepend and access to element at arbitrary indices. Informally you can think of it as a persistent array.
The namespace Data.RandomAccessList contains RandomAccessList and CountedRandomAccessList types. They allow the following operations:
consfor prepending an elementtailfor removing the first elementindexfor accessing an element at a specific indexupdatefor updating an element at a specific index
These operations take O(logn) time. See Okasaki's book for techniques to reduce the time complexity to constant time.
By construction all accesses and updates to a RandomAccessList are proved to be safe.
Usually arrays allow efficient append and prepend is expensive. In RandomAccessList, prepend is cheap and append is expensive. The performance of these operations can be swapped by translating all indices. That is, cons can be interpreted as an append and access to indices can be translated as follows: realIndex(virtualIndex) = size - 1 - virtualIndex.