As a programmer, my job is to write the function to transform inputs to outputs.
+Usually the inputs and outputs are complex data that represent something in the real world. +Abstractly though, I tend to think about the inputs and outputs as data shapes. +Squares go in, circles come out. +Defining functions is my fundamental form of expression.
+
+As a mathematician, equations are my fundamental form of expression. +Equations define valid rules for transforming one expression into another. +My job is to solve problems using symbolic manipulation.
+Functions and equations are both fine ways to specify transformations. +So why only use functions for programming? +What does it look like to program data transformation like equations instead of functions? +Today we will look at some examples of an equation base approach to specifying data transformations. +Before diving into the examples, let’s first establish some definitions
+Definitions
+| +Term rewriting + | +
+ Replacing terms with other terms + |
+
| +Term + | +
+ A variable or function application: \(a\), \((a + b)\), \((a + b)^2\) + |
+
| +Variable + | +
+ \(a,b\) can be assigned a value + |
+
| +Function + | +
+
|
+
| +Rule + | +
+ A pair of matching terms and substitution terms: \((a + b)^2 => a^2 + 2ab + b^2\) + |
+
| +Matching + | +
+ \((1 + 3)^2\) can be made equal to \((a + b)^2\) when \(a=1,b=3\) + |
+
| +Substitution + | +
+ Creation of new terms from old ones \((1 + 3)^2 => 1^2 + 2\cdot1\cdot3 + 2^2\) + |
+
| + + | +
+
+
+Term rewriting allows any transformation. +Math equations are rules that strictly preserve equivalence. + |
+
Introducing Meander
+Meander is a term rewriting system packaged as a Clojure library for data transformation.
+[meander/epsilon "0.0.512"]
+(ns meander.examples
+ (:require [meander.epsilon :as m]))
+I prefer to call it a structural pattern syntax for data transformation.
+Rewrite
+The m/rewrite macro is the swiss army knife of Meander.
Let’s do our first transformation from a vector to a map:
+; Meander!
+(m/rewrite [:a 1 :c 2] ; input
+ [?a ?b ?c ?d] ; a match pattern
+ {?a ?d}) ; a substitution pattern
+
+;=> {:a 2 :c 1} ; output
+-
+
-
+
The match pattern is a data literal with the shape of the input
+
+ -
+
The substitution pattern is a data literals with the shape of the output
+
+ -
+
+?ais called a logic variable
+ -
+
You can also use
+_for unused terms
+
| + + | +
+
+
+
|
+
Logic
+(m/rewrite [1 2] ; input
+ [?a ?a] ; a match pattern
+ {?a ?a}) ; a substitute pattern
+
+;=> nil
+No match.
+?a cannot match both 1 and 2.
(m/rewrite [1 1] ; input
+ [?a ?a] ; a match
+ {?a ?a}) ; a substitution pattern
+
+;=> {1 1}
+Matched ?a with 1
| + + | +
+
+
+Unification means resolving logical matches. + |
+
Repeating terms
+(m/rewrite [:a 1 :c 2 :d 3]
+ [!a ...]
+ [!a ...])
+
+;=> [:a 1 :c 2 :d 3]
+!a is a memory variable: an array that can collect many values
+… indicates 0 or more repeated occurrences
| + + | +
+
+
+The match pattern is symmetrical with the substitution pattern. +We got out exactly what we put in. + |
+
Repeat scope
+(m/rewrite [1 :a 2 :b 3 :c]
+ [!x !y ...] ; match pairs
+ [!y !x ...]) ; substitute flipped pairs
+
+;=> [:a 1 :b 2 :c 3]
+All terms preceeding … repeat.
Separator
+Limits the scope of repeats.
+(m/rewrite [:hello 1 2 3]
+ [:hello . !v ...]
+ [:world . !v ...])
+
+;=> [:world 1 2 3]
+Only the terms between . and … repeat.
Structurally explicit
+Most Clojure functions will coerce their inputs to seqs. +Meander does not do this.
+
+[!x …]
+ |
+
+ will only match vectors + |
+
+(!x …)
+ |
+
+ will only match lists and seqs + |
+
+{:k ?v}
+ |
+
+ will only match maps + |
+
+#{1}
+ |
+
+ will only match sets + |
+
+(m/seqable !x …)
+ |
+
+ will match maps, vectors, strings, etc… + |
+
I like that it will only match the same types unless you choose to be more permissive with m/seqable.
Nesting
+As you might expect, match and substitute patterns may be nested.
+(m/rewrite {:xs [1 2 3 4 5]
+ :ys [6 7 8 9 10]}
+ {:xs [!xs ...]
+ :ys [!ys ...]}
+ [!xs ... . !ys ...])
+
+;=> [1 2 3 4 5 6 7 8 9 10]
+And Meander does the rearranging for you.
+Pop Quiz
+(m/rewrite [:hello 1 2 3]
+ [?k . !v ...]
+ [?k !v ...])
+
+;=> [:hello 1 :hello 2 :hello 3]
+Why?
+| + + | +
+
+
+When matching, we expect a single value, then repeated values. +The substituting pattern has no separator, so we are substituting pairs. + |
+
Congratulations! +Enjoy these macarons as your reward.
+
+Meander has a rich feature set. +I have only shown you the core syntax. +We are now going to shift gears and look at some examples in the wild. +I’ll introduce a few bits of unexplained but fairly obvious syntax as we go.
+| + + | +
+
+
+For a comprehensive treatment of Meander features, see Meander on GitHub which links to documentation, a cookbook, blog posts and a Strange Loop talk. + |
+
Examples
+Rearranging logic expressions
+ +(def rearrange-logic
+ (bottom-up
+ ;; nested commutative logic is raised
+ ((m/pred #{'and 'or} ?op) . !before ... (?op . !clauses ...) . !after ...)
+ (?op . !before ... !clauses ... !after ...)
+
+ ;; moves `or` to the outside,
+ ;; and `and` to the inside to match Datalog rule convention
+ (and . !before ... (or . !clauses ...) . !after ...)
+ (or . (and . ~@!before . !clauses . ~@!after) ...)
+
+ ;; identity logic expressions are flattened
+ ((m/pred #{'and 'or} ?op) ?body)
+ ?body
+
+ ;; double negatives are removed
+ (not (not ?body))
+ ?body
+
+ ;; moves `not` inside to match Datalog rule convention
+ (not (or . !clauses ...))
+ (and . (not !clauses) ...)
+
+ (not (and . !clauses ...))
+ (or . (not !clauses) ...)))
+Web scraping
+This example comes from the Chartit (an ETL/analytics utility for tracking work)
+(def extract-employees
+ (s/search
+ (m/$
+ [:div {:class "directory-tables"} &
+ (m/scan
+ ;; heading/table pairs
+ [:h3 {} ?department & _]
+ _
+ [:table &
+ (m/scan
+ [:tbody &
+ (m/scan
+ [:tr &
+ (m/separated
+ [:td & (m/scan [:a {} ?name & _])]
+ [:td {} ?title & _]
+ [:td & (m/scan [:a {:href ?mailto} & _])])])])])])
+ ;;=>
+ {:department (str/trim ?department)
+ :name ?name
+ :title ?title
+ :email (subs ?mailto 7)}))
+Parsing defn like forms
+Here is another example from Justice.
+where it was desirable to define a macro that behaves like defn.
+This example can be directly compared with the Clojure.spec approach.
(defn wrap-defn
+ "Returns a function that will parse a form according to `defn` semantics.
+ Takes a function which will convert fn-spec forms."
+ [rewrite-fn-spec]
+ (m/rewrite
+ (m/and ((pred simple-symbol? ?name) .
+ (pred string? !?docstring) ...
+ (pred map? !?attr-map) ...
+ !tail ...)
+ (m/guard (<= (count !?docstring) 1))
+ (m/guard (<= (count !?attr-map) 1))
+ (let
+ (or (([(m/pred simple-symbol? !params) ... :as !param-list] . !forms ... :as !fn-specs) ..1)
+ ([(m/pred simple-symbol? !params) ... :as !param-list] . !forms ... :as !fn-specs))
+ (list* !tail))
+ (m/guard (apply distinct? (map count !param-list))))
+ ;;>
+ (defn ?name . !?docstring ... !?attr-map ...
+ ~@(map rewrite-fn-spec !fn-specs))))
+Schema to code
+This example shows the core functionality of HappyGAPI (a library that exposes Google APIs by generating code from schemas).
+(defn summarize-schema [schema request depth]
+ "Given a json-schema of type definitions,
+ and a request that is a $ref to one of those types,
+ resolves $ref(s) to a depth of 3,
+ discards the distracting information,
+ and returns a pattern for constructing the required input."
+ (m/rewrite request
+ {:type "object"
+ :id ?id
+ :additionalProperties ?ap
+ :properties (m/seqable [!property !item] ...)}
+ ;;>
+ {& ([!property (m/app #(summarize-schema schema % depth) !item)] ...)}
+
+ {:type "array"
+ :items ?item}
+ ;;>
+ [~(summarize-schema schema ?item depth)]
+
+ {:type (m/pred string? ?type)}
+ ;;>
+ (m/app symbol ?type)
+
+ {:$ref (m/pred string? ?ref)}
+ ;;>
+ ~(if (> depth 2)
+ (symbol ?ref)
+ (summarize-schema schema (get schema (keyword ?ref)) (inc depth)))))
+AST manipulation
+This example comes from Kalai, a transpiler from Clojure to Java and Rust.
+(def propagate-types-from-bindings-to-locals
+ "We propagate type information which is stored in metadata
+ from the the place where they are declared on a symbol
+ to all future usages of that symbol in scope."
+ (s/rewrite
+ {:op :local
+ :form ?symbol
+ :env {:locals {?symbol {:form ?symbol-with-meta
+ :init ?init}}
+ :as ?env}
+ & ?more
+ :as ?ast}
+ ;;>
+ {:op :local
+ :form ~(propagate-ast-type ?init ?symbol-with-meta ?ast)
+ :env ?env
+ & ?more}
+
+ ;; otherwise leave the ast as is
+ ?else
+ ?else))
+Reading patterns vs functions
+Those were all pretty complicated transformations.
+Could you follow them?
+I bet it was easier than most code.
+Is that normal?
+I claim that term rewriting expressions are much easier to read than functional transformations.
Next let’s compare what is involved in writing functions vs patterns.
+Writing functions vs patterns
+Functions are like recipes
+Functions are all about what to do with inputs.
+Clearly the big difference between functions and rewrite rules is what to do with the inputs. +To write a function is to specify in verbiage the tasks in order to convert the ingredients into the desired meal. +This is a verbal description of how to transform inputs to outputs in written form.
+
+Recipe for a function
+1. Prepare meat and potatoes |
+(example inputs) |
+
2. Mix ingredients in a pot |
+(function definition) |
+
3. Boil |
+(iterate, make it work) |
+
4. Garnish with documentation |
+(docstring or types) |
+
5. Simmer |
+(tests) |
+
| + + | +
+
+
+Easy to forget why it tastes good |
+
Patterns are more like pictures
+Rewrite rules are the inputs and outputs; the what to do part is completely missing.
+
+Picturing a pattern
+1. Prepare example inputs |
+
|
+
2. Parameterize |
+
|
+
3. Recombine as outputs |
+
|
+
4. Create a test |
+
|
+
5. Verify the output |
+
|
+
| + + | +
+
+
+Examples are the best kind of documentation. +Writing Meander feels like writing examples. +It gives me confidence I solved the right problem, and it’s easy to see what it does coming back to it later. + |
+
Data transformation
+When should we use term rewriting instead of functions?
+Term rewriting is suitable for largish data reshaping where you would otherwise do destructuring, updates, restructuring, and pipelines.
+Why isn’t this the default way of writing code everywhere?
+-
+
-
+
Term rewriting systems are hard to implement, and rarely integrated with programming languages.
+
+ -
+
Functions are far more familiar in the programming community.
+
+ -
+
Limitations
+
+
What’s the catch?
+Limitations
+Extension
+The syntax available in data literals is limited by the host language.
+Meander syntax is partially extensible through defsyntax to define custom operators.
+Execution extension is available by function application (see m/app).
Failure to match is opaque
+Similar to regular expressions, it can be difficult to figure out why a match isn’t made for a given input.
+Simplifying and decomposing can help.
+Function application can be used (m/app #(doto % prn) ?x) to spy on terms.
Reduction
+Meander has recursion, so you can reduce. +But recursion has no syntactic advantage. +The next branch of Meander (zeta) contains a neater way to express reduction.
+Memory variable correspondence
+Nested memory variables in a single expression can get confusing. +There is potential for a syntactic simplification issue#129. +The next branch of Meander (zeta) has flexibility in how variables behave, which may solve this.
+Performance
+On par with hand rolled functions. +Meander is a high quality library: fast and reliable.
+Decomposition
+The current options for creating sub-patterns are m/with and m/app.
+This is equivalent to functional decomposition.
+I wish for a more syntactically elegant approach.
Comparing Meander
+How does Meander compare to other data transformation approaches in Clojure?
+| +Clojure.core + | +
+ Clojure is concise and powerful.
+Many functions defined on few primary data structures.
+We operate on data but do not specify what the data is.
+So it is common to be looking at a function that operates on |
+
| +Destructuring + | +
+ Does one job well. +No logic expressions or substitution. + |
+
| +Core.match, Core.logic + | +
+ Independently work great. There is no convenient syntax for substitution. + |
+
| +Clojure.spec + | +
+ Defines the shape of |
+
| +Specter + | +
+ Takes a navigator/action approach. +Improves data manipulation. +Does not address shape. + |
+
Conclusion
+The mathematical style term rewriting approach to data transformation has several merits:
+-
+
-
+
When reading expressions, inputs and outputs are instantly recognizable
+
+ -
+
When writing expressions, examples can be converted to patterns
+
+ -
+
The declarative style is symmetrically pleasing
+
+ -
+
Visual expression is more effective than verbal expression
+
+
Meander provides term rewriting as a convenient library for Clojure. +I hope you will give it a try and find it as useful as I have.
+Thank you for reading.
+