Getting Started
Bloblang is the most advanced mapping language that you’ll learn from this walkthrough (probably). It is designed for readability, the power to shape even the most outrageous input documents, and to easily make erratic schemas bend to your will. Bloblang is the native mapping language of Wombat, but it has been designed as a general purpose technology ready to be adopted by other tools.
In this walkthrough you’ll learn how to make new friends by mapping their documents, and lose old friends as they grow
jealous and bitter of your mapping abilities. There are a few ways to execute Bloblang but the way we’ll do it in this
guide is to pull a Wombat docker image and run the command wombat blobl server
, which opens up an interactive Bloblang
editor:
Next, open your browser at http://localhost:4195
and you should see an app with three panels, the top-left is where
you paste an input document, the bottom is your Bloblang mapping and on the top-right is the output.
Your first assignment
The primary goal of a Bloblang mapping is to construct a brand new document by using an input document as a reference, which we achieve through a series of assignments. Bloblang is traditionally used to map JSON documents and that’s mostly what we’ll be doing in this walkthrough. The first mapping you’ll see when you open the editor is a single assignment:
On the left-hand side of the assignment is our assignment target, where root
is a keyword referring to the root of the
new document being constructed. On the right-hand side is a query which determines the value to be assigned, wherethis
is a keyword that refers to the context of the mapping which begins as the root of the input document.
As you can see the input document in the editor begins as a JSON object {"message":"hello world"}
, and the output
panel should show the result as:
Which is a (neatly formatted) replica of the input document. This is the result of our mapping because we assigned the entire input document to the root of our new thing. However, you won’t get far in life by trapping yourself in the past, let’s create a brand new document by assigning a fresh object to the root:
Bloblang supports a bunch of literal types, and the first line of this mapping assigns an empty object
literal to the root. The second line then creates a new field foo
on that object by assigning it the value ofmessage
from the input document. You should see that our output has changed to:
In Bloblang, when the path that we assign to contains fields that are themselves unset then they are created as empty
objects. This rule also applies to root
itself, which means the mapping:
Will automatically create the objects required to produce the output document:
Also note that we can use quotes in order to express path segments that contain symbols or whitespace. Great, let’s move on quick before our self-satisfaction gets in the way of progress.
Basic Methods and Functions
Nothing is ever good enough for you, why should the input document be any different? Usually in our mappings it’s
necessary to mutate values whilst we map them over, this is almost always done with methods, of
which there are many. To demonstrate we’re going to change our mapping
to uppercase the field message
from our input document:
As you can see the syntax for a method is similar to many languages, simply add a dot on the target value followed by the method name and arguments within brackets. With this method added our output document should look like this:
Since the result of any Bloblang query is a value you can use methods on anything, including other methods. For example,
we could expand our mapping of message
to also replace WORLD
with EARTH
using the
replace_all
method:
As you can see this method required some arguments. Methods support both nameless (like above) and named arguments, which are often literal values but can also be queries themselves. For example try out the following mapping using both named style and a dynamic argument:
Woah, I think that’s the plot to Inception, let’s move onto functions. Functions are just boring methods that don’t have a target, and there are plenty of them as well. Functions are often used to extract information unrelated to the input document, such as environment variables, or to generate data such as timestamps or UUIDs.
Since we’re completionists let’s add one to our mapping:
Now I can’t tell you what the output looks like since it will be different each time it’s mapped, how fun!
Deletions
Everything in Bloblang is an expression to be assigned, including deletions, which is a function
deleted()
. To illustrate let’s create a field we want to delete by changing our input to the
following:
If we wanted a full copy of this document without the field name
then we can assign deleted()
to it:
And it won’t be included in the output:
An alternative way to delete fields is the method without
, our above example could be
rewritten as a single assignment root = this.without("name")
. However, deleted()
is generally more powerful and will
come into play more later on.
Variables
Sometimes it’s necessary to capture a value for later, but we might not want it to be added to the resulting document.
In Bloblang we can achieve this with variables which are created using the let
keyword, and can be referenced within
subsequent queries with a dollar sign prefix:
Variables can be assigned any value type, including objects and arrays.
Unstructured and Binary Data
So far in all of our examples both the input document and our newly mapped document are structured, but this does not
need to be so. Try assigning some literal value types directly to the root
, such as a string root = "hello world"
,
or a number root = 5
.
You should notice that when a value type is assigned to the root the output is the raw value, and therefore strings are not quoted. This is what makes it possible to output data of any format, including encrypted, encoded or otherwise binary data.
Unstructured mapping is not limited to the output. Rather than referencing the input document with this
, where it must
be structured, it is possible to reference it as a binary string with the function content
,
try changing your mapping to:
And then put any old gibberish in the input panel, the output panel should be the same gibberish but all uppercase.
Conditionals
In order to play around with conditionals let’s set our input to something structured:
In Bloblang all conditionals are expressions, this is a core principal of Bloblang and will be important later on when we’re mapping deeply nested structures.
If Expression
The simplest conditional is the if
expression, where the boolean condition does not need to be in parentheses. Let’s
create a map that modifies the number of treats our pet receives based on a field:
Try that mapping out and you should see the number of treats in the output increased to 15. Now try changing the input
field pet.is_cute
to false
and the output treats count should go back to the original 5.
When a conditional expression doesn’t have a branch to execute then the assignment is skipped entirely, which means when
the pet is not cute the value of pet.treats
is unchanged (and remains the value set in the root = this
assignment).
We can add an else
block to our if
expression to remove treats entirely when the pet is not cute:
This is possible because field deletions are expressed as assigned values created with the deleted()
function. This is
cool but also in poor taste, treats should be allocated based on need, not cuteness!
If Statement
The if
keyword can also be used as a statement in order to conditionally apply a series of mapping assignments, the
previous example can be rewritten as:
Converting this mapping to use a statement has resulted in a more verbose mapping as we had to specify root.pet.treats
multiple times as an assignment target. However, using if
as a statement can be beneficial when multiple assignments
rely on the same logic:
More treats and more toys! Lucky Spot!
Match Expression
Another conditional expression is match
which allows you to list many branches consisting of a condition and a query
to execute separated with =>
, where the first condition to pass is the one that is executed:
Try executing that mapping with different values for pet.type
and pet.treats
. Match expressions can also specify a
new context for the keyword this
which can help reduce some of the boilerplate in your boolean conditions. The
following mapping is equivalent to the previous:
Your boolean conditions can also be expressed as value types, in which case the context being matched will be compared to the value:
Error Handling
Are you feeling relaxed? Well don’t, because in the world of mapping anything can happen, at ANY TIME, and there are plenty of ways that a mapping can fail due to variations in the input data. Are you feeling stressed? Well don’t, because Bloblang makes handling errors easy.
First, let’s take a look at what happens when errors aren’t handled, change your input to the following:
And change your mapping to something simple like a number comparison:
Uh oh! It looks like our canvasser was too lazy and our angry_peasants
count was incorrectly set for this document.
You should see an error in the output window that mentions something like
cannot compare types string (from field this.angry_peasants) and number (from field this.palace_guards)
, which means
the mapping was abandoned.
So what if we want to try and map something, but don’t care if it fails? In this case if we are unable to compare our angry peasants with palace guards then I would still consider us in trouble just to be safe.
For that we have a special [method catch
][blobl.methods.catch], which if we add to any query allows us to specify an
argument to be returned when an error occurs. Since methods can be added to any query we can surround our arithmetic
with brackets and catch the whole thing:
Now instead of an error we should see an output with in_trouble
set to true
. Try changing to value of
angry_peasants
to a few different values, including some numbers.
One of the powerful features of catch
is that when it is added at the end of a series of expressions and methods it
will capture errors at any part of the series, allowing you to capture errors at any granularity. For example, the
mapping:
Will catch errors caused by:
this.mission.type
not being a stringthis.user.motives
not being an arraythis.mission.difficulty
not being a number
But will always return false
if any of those errors occur. Try it out with this input and play around by breaking some
of the fields:
Now try out this mapping:
This version is more granular and will capture each of the errors individually, with each error given a unique true
or
false
fallback.
Validation
I’m worried that I’ve turned you into some sort of error hating thug, hell-bent on eliminating all errors from existence. However, sometimes errors are what we want. Failing a mapping with an error allows us to handle the bad document in other ways, such as routing it to a dead-letter queue or filtering it entirely.
You can read about common Wombat error handling patterns for bad data in the error handling guide, but the first step is to create the error. Luckily, Bloblang has a range of ways of creating errors under certain circumstances, which can be used in order to validate the data being mapped.
There are a few helper methods that make validating and coercing fields nice and easy, try this mapping out:
With some of these sample inputs:
However, these methods don’t cover all use cases. The general purpose error throwing technique is the
throw
function, which takes an argument string that describes the error. When it’s called it
will throw a mapping error that abandons the mapping (unless it’s caught, psych!)
For example, we can check the type of a field with the method type
, and then throw an error if
it’s not the type we expected:
Try this mapping out with a few sample inputs:
Context
In Bloblang, when we refer to the context we’re talking about the value returned with the keyword this
. At the
beginning of a mapping the context starts off as a reference to the root of a structured input document, which is why
the mapping root = this
will result in the same document coming out as you put in.
However, in Bloblang there are mechanisms whereby the context might change, we’ve already seen how this can happen
within a match
expression. Another useful way to change the context is by adding a bracketed query expression as a
method to a query, which looks like this:
Within the bracketed query expression the context becomes the result of the query that it’s a method of, so within the
brackets in the above mapping the value of this
points to the result of this.foo.bar
, and the mapping is therefore
equivalent to:
With this handy trick the throw
mapping from the validation section above could be rewritten as:
Naming the Context
Shadowing the keyword this
with new contexts can look confusing in your mappings, and it also limits you to only being
able to reference one context at any given time. As an alternative, Bloblang supports context capture expressions that
look similar to lambda functions from other languages, where you can name the new context with the syntax
<context name> -> <query>
, which looks like this:
Within the brackets we now have a new field thing
, which returns the context that would have otherwise been captured
as this
. This also means the value returned from this
hasn’t changed and will continue to return the root of the
input document.
Coalescing
Being able to open up bracketed query expressions on fields leads us onto another cool trick in Bloblang referred to as coalescing. It’s very common in the world of document mapping that due to structural deviations a value that we wish to obtain could come from one of multiple possible paths.
To illustrate this problem change the input document to the following:
Let’s say we wish to flatten this structure with the following mapping:
But articles are only one of many document types we expect to receive, where the field contents
remains the same but
the field article
could instead be comment
or share
. In this case we could expand our map of contents
to use a
match
expression where we check for the existence of article
, comment
, etc in the input document.
However, a much cleaner way of approaching this is with the pipe operator (|
), which in Bloblang can be used to join
multiple queries, where the first to yield a non-null result is selected. Change your mapping to the following:
And now try changing the field article
in your input document to comment
. You should see that the value ofcontents
remains as Some people did some stuff
in the output document.
Now, rather than write out the full path prefix this.thing
each time we can use a bracketed query expression to change
the context, giving us more space for adding other fields:
And by the way, the keyword this
within queries can be omitted and made implicit, which allows us to reduce this even
further:
Finally, we can also add a pipe operator at the end to fallback to a literal value when none of our candidates exists:
Neat.
Advanced Methods
Congratulations for making it this far, but if you take your current level of knowledge to a map-off you’ll be laughed off the stage. What happens when you need to map all of the elements of an array? Or filter the keys of an object by their values? What if the fellowship just used the eagles to fly to mount doom?
Bloblang offers a bunch of advanced methods for manipulating structured data types, let’s take a quick tour of some of the cooler ones. Set your input document to this list of things:
Let’s say we wanted to reduce the things
in our input document to only those that are cool and where we have enough of
them to share with our friends. We can do this with a filter
method:
Try running that mapping and you’ll see that the output is reduced. What is happening here is that the filter
method
takes an argument that is a query, and that query will be mapped for each individual element of the array (where the
context is changed to the element itself). We have captured the context into a field thing
which allows us to continue
referencing the root of the input with this
.
The filter
method requires the query parameter to resolve to a boolean true
or false
, and if it resolves to true
the element will be present in the resulting array, otherwise it is removed.
Being able to express a query argument to be applied to a range in this way is one of the more powerful features of Bloblang, and when mapping complex structured data these advanced methods will likely be a common tool that you’ll reach for.
Another such method is map_each
, which allows you to mutate each element of an array, or
each value of an object. Change your input document to the following:
Here we have an array of talking heads, where each element is a string containing an identifer, a colon, and a comma separated list of their opinions. We wish to map each string into a structured object, which we can do with the following mapping:
The argument to map_each
is a query where the context is the element, which we capture into the field raw
. The
result of the query argument will become the value of the element in the resulting array, and in this case we return an
object literal.
In order to separate the identifier from opinions we perform a split
by colon on the raw string element and get the
first substring with the index
method. We then do the split again and extract the remainder, and split that by comma
in order to extract all of the opinions to an array field.
However, one problem with this mapping is that the split by colon is written out twice and executed twice. A more efficient way of performing the same thing is with the bracketed query expressions we’ve played with before:
:::note Challenge! Try updating that map so that only opinions that mention Pokemon are kept :::
Cool. To find more methods for manipulating structured data types check out the methods page.
Reusable Mappings
Bloblang has cool methods, sure, but there’s nothing cooler than methods you’ve made yourself. When the going gets tough
in the mapping world the best solution is often to create a named mapping, which you can do with the keyword map
:
The body of a named map, encapsulated with squiggly brackets, is a totally isolated mapping where root
now refers to a
new value being created for each invocation of the map, and this
refers to the root of the context provided to the
map.
Named maps are executed with the method apply
, which has a string parameter identifying the map
to execute, this means it’s possible to dynamically select the target map.
As you can see in the above example we were able to use a custom map in order to create our talking head objects without the object literal. Within a named map we can also create variables that exist only within the scope of the map.
A cool feature of named mappings is that they can invoke themselves recursively, allowing you to define mappings that walk deeply nested structures. The following mapping will scrub all values from a document that contain the word ” Voldemort” (case insensitive):
Try running that mapping with the following input document:
Charlie will be upset but at least we’ll be safe.
Unit Testing
You are truly a champion of mappings, and you’re probably feeling pretty confident right now. Maybe you even have a mapping that you’re particularly proud of. Well, I’m sorry to inform you that your mapping is DOOMED, as a mapping without unit tests is like a Twitter session, with the progression of time it will inevitably descend into madness.
However, if you act now there is still time to spare your mapping from this fate, as Wombat has it’s
own unit testing capabilities that you can also use for your mappings. To start with save
a mapping into a file called something like naughty_man.blobl
, we can use the example above from the reusable mappings
section:
Next, we can define our unit tests in an accompanying YAML file in the same directory, let’s call this
naughty_man_test.yaml
:
As you can see we’ve defined a single test, where we point to our mapping file which will be executed in our test. We then specify an input message which is a reduced version of the document we tried out before, and finally we specify output predicates, which is a JSON comparison against the output document.
We can execute these tests with wombat test ./naughty_man_test.yaml
, Wombat will also automatically find our tests if
you simply run wombat test ./...
. You should see an output something like:
Because in actual fact our expected output is wrong, I’ll leave it to you to spot the error. Once the test is fixed you should see:
And now our mapping, should we need to expand it in the future, is better protected against regressions. You can read more about the Wombat unit test specification, including alternative output predicates, in this document.
Final Words
That’s it for this walkthrough, if you’re hungry for more then I suggest you re-evaluate your priorities in life. If you have feedback then please get in touch, despite being terrible people the Wombat community are very welcoming.