Lua is a programmable scripting language that Omniata uses for data processing. While relatively simple to code, Lua offers a large amount of flexibility for data analysis and can help create insights that would be very difficult or impossible with a solely SQL-based solution.
Lua is somewhat similar to Python. Detailed information about Lua can be found at http://www.lua.org. One of the more advanced features of Omniata’s platform is the ability to use Lua to create custom data transformations from the raw events sent to Omniata, the data in a user’s user state, or other Custom User Field values.
There are three types of Omniata-specific Lua programming conventions that are important to know when writing custom Lua. All relate to the source of the data that is to be transformed.
To use the values from the raw data sent to Omniata, enclose the name of the key from the key-value pair (KVP) in straight single quotes (curly apostrophes such as ’ will cause an error), wrap that in square brackets, and prepend “event_kvp”. To reference the event type KVP within an event, you would write event_kvp['om_event_type'].
To reference the values in a user’s user state, the format is the same, except that “user_state” is prepended. To get a user’s gender information, for example, the name of the user state key that holds the gender goes inside the quotes and brackets.
Finally, to reference Custom User Fields, “user_vars” is prepended, and the CUF’s machine name goes within the brackets and quotes, for example user_vars['invites_sent'].
Please note that Omniata’s Lua conventions must be followed exactly, or the field will not function.
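As a rough illustration, the three reference styles can be tried outside Omniata by declaring stand-in tables. On the platform these tables are supplied for you; the 'gender' key and all sample values below are hypothetical and only serve to make the snippet runnable.

```lua
-- Stand-in tables so the references can be tried outside Omniata.
-- The key 'gender' and every sample value here are assumptions.
local event_kvp  = { om_event_type = 'fb_share' }  -- raw event key-value pairs
local user_state = { gender = 'f' }                -- user state
local user_vars  = { invites_sent = '3' }          -- Custom User Fields

-- Each reference is: table name, square brackets, key in straight single quotes.
local event_type = event_kvp['om_event_type']
local gender     = user_state['gender']
local invites    = user_vars['invites_sent']

print(event_type, gender, invites)
```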
When using Lua to create custom data transformations, there are several things to consider regarding the actual Lua code. Some are relevant to any application of Lua, and some are idiosyncratic to Omniata’s platform. One of the most important considerations when writing custom Lua code is how the expression handles null values and data types.
Null values are an especially important consideration when writing Lua code. While there may be narrow use cases in which a null value is desired, in nearly all cases null values will lead to data processing errors. The reason for this is that a null value does not have a value nor does it have a data type. Any type of operation, whether logical, mathematical, or involving strings, such as concatenation, will not function reliably if a null value is involved. These are some rough guidelines to remember:
1 + Null = Null
Any mathematical operation that contains a null value will evaluate to null, causing an error
IF(Null = Null, 1, 0) returns 0
Null logical operations will not work reliably, and will likely cause an error
CONCAT(Null, “Test”) = Null
Null values cannot be used as text; the result is a null value and an error
In all of these cases, the result will be that the data cannot be processed. To avoid errors, there are two methodologies that can be used. First, in the case that a number may not have a numeric data type, you can wrap the Data Field reference in tonumber(), for example tonumber(user_vars['invites_sent']).
Wrapping the Data Field reference within tonumber() ensures that a numeric string is returned as a number. However, this will not prevent errors should the value be missing. Imagine the case that your application prompts users to send invitations to their friends, and upon sending the invitation, sends an event to Omniata signifying this. This data is stored in a Custom User Field referenced via user_vars['invites_sent']. It would be beneficial to wrap any occurrence of this within tonumber(). On its own, though, this does not solve the issue of missing values, since tonumber() of a null value is still null and will lead to an error. To avoid this problem, use the following syntax:
tonumber(user_vars['invites_sent'] or 0)
By adding the “or” operator, in the case that the CUF value is null, it will default to zero. This ensures that a missing value will not generate an error. In the case that the CUF is a string, such as a user’s first name, a similar method will ensure that null values do not cause errors:
user_vars['first_name'] or ''
In this case, any null values will be replaced by an empty string, which will work with Omniata’s data processing. By utilizing these two techniques, most data type and null-based errors will be avoided.
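The two defaults can be checked in plain Lua, where a null value is written nil. This is a minimal sketch, assuming an empty stand-in user_vars table to simulate a user with no stored values; it is not how the platform itself supplies the data.

```lua
-- Empty stand-in table: simulates a user with no stored CUF values.
local user_vars = {}

-- Arithmetic on a missing (nil) value raises an error in plain Lua;
-- pcall captures the failure instead of stopping the script.
local ok = pcall(function() return user_vars['invites_sent'] + 1 end)
print(ok)  -- false

-- Numeric default: nil falls back to 0 before tonumber() is applied.
local invites = tonumber(user_vars['invites_sent'] or 0)

-- String default: nil falls back to the empty string.
local name = user_vars['first_name'] or ''

print(invites + 1, 'Hello ' .. name)
```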
One final thing to remember about using Lua is that whenever possible, Lua-based Data Fields should be written as expressions, not as functions. Consider the following example of Lua code:
(function() if user_vars['invites_sent'] then return user_vars['invites_sent'] else return 0 end end)()
This function handles a missing value in the same way as the expression below (although it does not convert the value to a number), but will take far more time to process:
tonumber(user_vars['invites_sent'] or 0)
Functions will always take significantly longer to process than expressions, so it is advised to always use expressions. There are advanced cases in which an expression is not suited for the data transformation, but these are very rare.
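To compare the two forms side by side, the sketch below wraps each one in a helper that takes a stand-in user_vars table. The helper names as_function and as_expression are made up for this illustration and are not part of Omniata.

```lua
-- Hypothetical helpers wrapping the two forms from the text.
local function as_function(user_vars)
  return (function() if user_vars['invites_sent'] then return user_vars['invites_sent'] else return 0 end end)()
end

local function as_expression(user_vars)
  return tonumber(user_vars['invites_sent'] or 0)
end

-- Three sample users: no value, a numeric value, a numeric string.
local samples = { {}, { invites_sent = 4 }, { invites_sent = '4' } }
for _, vars in ipairs(samples) do
  local f, e = as_function(vars), as_expression(vars)
  print(type(f), f, type(e), e)
end
```

For the third sample the function form returns the string '4' while the expression returns the number 4, which is one more reason to prefer the expression.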
A simple use case for Lua would be to count the number of times a user shares content on Facebook. In this case, we’ll be assuming that users send an event whose om_event_type is “fb_share”.
Omniata natively supports the creation of fields that count the number of times this event is sent in aggregate. However, if you wanted to see whether or not recent changes have led to more sharing, you would want to see if individual users were sending more share events. Looking at the number in aggregate could be impacted by user acquisition and other factors.
The first step is to create a Custom User Field with a descriptive title, such as “User Facebook Shares”. You can create this field by clicking on the “Data Model” drop-down in a Custom Metrics Data Application, and then clicking “Fields”:
Once you’re on the “All Fields” page, click the blue “New…” button in the top right corner and select “Custom User Field”:
In the “New Custom User Field” page, name the new field, select the appropriate data type, and select “Event Formula” as the Data Source:
Selecting “Event Formula” will open up a text box in which you can write your Lua code. To count the number of times this event occurs, you’ll use the following code:
(user_vars['cuf_user_facebook_shares'] or 0) + 1
In this code, user_vars['cuf_user_facebook_shares'] refers to the Custom User Field that is being created, so its machine name must be cuf_user_facebook_shares. This CUF will store the generated value for each individual user.
The parenthetical expression (user_vars['cuf_user_facebook_shares'] or 0) is necessary here and in most cases as a form of error handling. If a user has never sent a “Shared on Facebook” event, then their value will not be zero; it will be null. This is an extremely important distinction for any Lua code, as null values do not behave like zeroes, and will almost always cause errors. Enclosing the reference to the CUF in parentheses and adding or 0 ensures that should the value be null, the calculation will substitute a zero. This methodology should be utilized in all Lua-based fields.
Finally, the entire expression evaluates as “increment the CUF by one should the user have an existing value for the CUF; should the user not have an existing value, increment zero by one”.
In addition, you'll want to either select "fb_share" in the "For the following events" section, or add the code below to accomplish the same thing:
event_kvp['om_event_type'] == 'fb_share'
This ensures that the metric will only increment for the desired event, "fb_share". Omitting this step would mean any event could increment the user’s count, which is not the goal here. You can then scroll all the way to the end and click the blue “Save” button, and the Custom User Field will be created:
This CUF can now be used in the creation of Segments in the Engager, as well as in a Reporting Table for use in analyzing users.
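The counting behaviour can be simulated in plain Lua by feeding a short list of events through the same formula. This is only a sketch: the stand-in tables and the 'om_load' event type are assumptions made for the example, and on the platform the result of the formula is stored in the CUF for you.

```lua
-- Stand-in for one user's Custom User Fields; starts empty (null count).
local user_vars = {}

-- Hypothetical event stream; 'om_load' stands for any unrelated event.
local events = { 'fb_share', 'om_load', 'fb_share', 'fb_share' }

for _, event_type in ipairs(events) do
  local event_kvp = { om_event_type = event_type }
  -- Event scoping: only fb_share events are allowed to update the field.
  if event_kvp['om_event_type'] == 'fb_share' then
    -- The formula from the text; "or 0" covers the first share.
    user_vars['cuf_user_facebook_shares'] = (user_vars['cuf_user_facebook_shares'] or 0) + 1
  end
end

print(user_vars['cuf_user_facebook_shares'])  -- 3
```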
The most useful aspect of using custom Lua code is for custom processing that is unique to your specific application. For instance, imagine a website in which users can create wishlists, and events are sent that indicate when the user adds or removes an item. Now imagine you would like to do an analysis on the number of items your users keep in their wishlists.
There are a few things to consider when creating a field to track this. First, the Lua will need to recognize that receiving one event should increment the value, while a different event should decrement it. Second, it will need to know that a wishlist cannot have a negative value, since it wouldn't make sense for a user to have -2 items in their wishlist.
This is the code that would accomplish this:
math.max( tonumber( (event_kvp['om_event_type'] == 'addItem' and ((user_vars['cuf_user_wishlist_count'] or 0) + 1) or (event_kvp['om_event_type'] == 'removeItem' and ((user_vars['cuf_user_wishlist_count'] or 0) - 1))) or 0), 0)
Going from the inside out, we start with this:
(event_kvp['om_event_type'] == 'addItem' and ((user_vars['cuf_user_wishlist_count'] or 0) + 1))
(event_kvp['om_event_type'] == 'removeItem' and ((user_vars['cuf_user_wishlist_count'] or 0) - 1))
This is the logical core of the expression, which boils down to this logic:
If the event type is “addItem”, add one to the “cuf_user_wishlist_count” CUF
If the event type is “removeItem”, remove one from the “cuf_user_wishlist_count” CUF
It’s also important to note the continued use of or 0 in this code. It is necessary here for the same reason as before: if the value for cuf_user_wishlist_count is null, it won’t break the processing.
Moving one level up, there are two wrappers for the entire logical statement:
math.max(tonumber(<logical expression> or 0),0)
The inner wrapper, tonumber(<logical expression> or 0), is again used to force the result to be numeric, and to be zero if the logical expression does not evaluate to a number. This is necessary to avoid incompatibility with the next layer up, math.max(<inner expression>,0), which returns the larger of zero and the result of the logical expression. A user could technically send enough removeItem events to push the value below zero, but math.max forces the result to never be negative.
As with the previous example, the machine name for the CUF needs to be consistent between the CUF and the Lua code. In addition, it is best practice to limit the events that this will be applied to by navigating to the “For the following events” section of the Event Builder and adding the addItem and removeItem events. This matters here because for any other event type the logical expression is false, so the whole expression evaluates to zero.
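To see how the expression behaves over a sequence of events, it can be wrapped in a helper and run in plain Lua. The helper name wishlist_count and the stand-in tables are assumptions for this sketch; the expression inside is the one from the text.

```lua
-- Hypothetical helper: evaluates the wishlist expression for one event.
local function wishlist_count(event_kvp, user_vars)
  return math.max( tonumber( (event_kvp['om_event_type'] == 'addItem' and ((user_vars['cuf_user_wishlist_count'] or 0) + 1) or (event_kvp['om_event_type'] == 'removeItem' and ((user_vars['cuf_user_wishlist_count'] or 0) - 1))) or 0), 0)
end

-- Simulate a user who adds two items and then sends three remove events.
local user_vars = {}
local stream = { 'addItem', 'addItem', 'removeItem', 'removeItem', 'removeItem' }
for _, event_type in ipairs(stream) do
  user_vars['cuf_user_wishlist_count'] = wishlist_count({ om_event_type = event_type }, user_vars)
  print(event_type, user_vars['cuf_user_wishlist_count'])
end
```

The count goes 1, 2, 1, 0, 0: the third remove event would give -1, and math.max clamps it back to zero.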
There are some cases in which a metric needs to be evaluated in real time, rather than at the time that Omniata's nightly processing starts. A good example of this would be calculating the number of days since an event has occurred.
First, a CUF that stores the last timestamp when a specific event was sent needs to be created. In this case, we'll assume the machine name of the CUF is "cuf_last_revenue_timestamp", and it's a CUF with the following Lua code:
tonumber(event_meta['timestamp']) or 0
This CUF will also need to be scoped to the particular event, in this case om_revenue. Once this is created, you'll need to create a System User Field to perform the comparison. A SUF needs to be used instead of a CUF because CUFs evaluate data at the time of processing, whereas SUFs evaluate data in real time. To create a SUF, click on the "New..." button as before, but select "System User Field". Under "Data Source" you'll select "Formula", and then use the following code:
((tonumber(user_vars['cuf_last_revenue_timestamp']) or 0) > 0) and math.floor((os.time() - (tonumber(user_vars['cuf_last_revenue_timestamp']) or 0)) / 86400) or -1
Working from the inside out, this code gets the value of the CUF and converts it to a number, or defaults to zero if it is unavailable. Then, the function "os.time()" is used, which returns the current Unix timestamp. The time of the event (or zero) is then subtracted from the current time, and this difference is divided by 86,400, which is the number of seconds in a day, to convert the value in seconds to days. The subtraction must be enclosed in its own parentheses, because division otherwise takes precedence over subtraction. This expression is then wrapped in math.floor(), which will round the value down to the nearest whole number. So for instance, if the expression evaluates to 3.5, the math.floor() function will convert it to three. Finally, the first part of the expression checks to confirm that the timestamp is greater than zero, and skips the timestamp comparison if not. It will then return negative one as the value. This is done so that if a user has not sent the event, their SUF value will be negative one, and when a Segment is built off this, they won't be able to receive the Content. If the value for "cuf_last_revenue_timestamp" is greater than zero, then the comparison will take place, and the number of days since the user sent the event will be returned.
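The day calculation described above can be sketched as a small plain-Lua helper. The name days_since and the explicit now parameter are assumptions added so the logic can be tested with fixed values; the SUF formula itself calls os.time() directly.

```lua
local SECONDS_PER_DAY = 86400

-- Hypothetical helper: whole days between a stored timestamp and "now",
-- or -1 when no usable timestamp exists.
local function days_since(last_timestamp, now)
  local ts = tonumber(last_timestamp) or 0
  -- Subtract first, then divide; the inner parentheses are required
  -- because division binds tighter than subtraction.
  return (ts > 0) and math.floor((now - ts) / SECONDS_PER_DAY) or -1
end

local now = os.time()
local three_and_a_half_days_ago = now - 3 * SECONDS_PER_DAY - 43200

print(days_since(three_and_a_half_days_ago, now))  -- 3
print(days_since(nil, now))                        -- -1
```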
There are cases where a Lua expression is not enough, and something more flexible is needed. In those cases, you can write functions. The biggest downside to functions is that they place a higher load on the system when processing data, so whenever possible, expressions should be used. For very complex transformations though, a function may be necessary.
As an example, consider an application that is cross-platform, with users able to play on Facebook, Android, Amazon, and iOS. A basic question may be how many platforms users play on, but a more advanced question would be in what order they adopt those platforms. This would answer questions such as what platform users play on after installing on iOS, how many different platforms users play on, and what platforms are best for enticing a user to connect Facebook.
This requires the creation of a Data Field that will tell you not only the platforms that a user has used, but also the order that the user has adopted the different platforms. The Data Field also needs to be smart enough to not add duplicate information, and to stop after a user has used all the platforms.
The code below generates a text field which contains the order in which an individual user uses the four platforms:
(function() if tonumber(string.len(user_vars['cuf_platform_order'] or '')) == 11 then return user_vars['cuf_platform_order'] or '' elseif (tonumber(string.len(user_vars['cuf_platform_order'] or '')) < 11) and ((string.find(user_vars['cuf_platform_order'] or '',"Fb") or 0) < 1) and (event_kvp['played_fb'] == 'true') then user_vars['cuf_platform_order'] = (user_vars['cuf_platform_order'] or '') .. 'Fb' return user_vars['cuf_platform_order'] or '' elseif (tonumber(string.len(user_vars['cuf_platform_order'] or '')) < 11) and ((string.find(user_vars['cuf_platform_order'] or '',"Amz") or 0) < 1) and (event_kvp['played_amz'] == 'true') then user_vars['cuf_platform_order'] = (user_vars['cuf_platform_order'] or '') .. 'Amz' return user_vars['cuf_platform_order'] or '' elseif (tonumber(string.len(user_vars['cuf_platform_order'] or '')) < 11) and ((string.find(user_vars['cuf_platform_order'] or '',"And") or 0) < 1) and (event_kvp['played_and'] == 'true') then user_vars['cuf_platform_order'] = (user_vars['cuf_platform_order'] or '') .. 'And' return user_vars['cuf_platform_order'] or '' elseif (tonumber(string.len(user_vars['cuf_platform_order'] or '')) < 11) and ((string.find(user_vars['cuf_platform_order'] or '',"iOS") or 0) < 1) and (event_kvp['played_ios'] == 'true') then user_vars['cuf_platform_order'] = (user_vars['cuf_platform_order'] or '') .. 'iOS' return user_vars['cuf_platform_order'] or '' else return user_vars['cuf_platform_order'] or '' end end)()
Going through the code, the logic is as follows:
If the length of the field cuf_platform_order is 11, then return the value of cuf_platform_order
The maximum length of the cuf_platform_order is 11, because three of the platforms have a three letter designation, and one has a two letter designation, so any user that plays on all four platforms will have a cuf_platform_order value that is 11 characters long, such as “FbAmzAndiOS”
If the length of the field cuf_platform_order is less than 11, the value does not already contain “Fb”, and the event KVP “played_fb” is equal to “true”, then append “Fb” to the value for cuf_platform_order
If the length of the field cuf_platform_order is less than 11, the value does not already contain “Amz”, and the event KVP “played_amz” is equal to “true”, then append “Amz” to the value for cuf_platform_order
If the length of the field cuf_platform_order is less than 11, the value does not already contain “And”, and the event KVP “played_and” is equal to “true”, then append “And” to the value for cuf_platform_order
If the length of the field cuf_platform_order is less than 11, the value does not already contain “iOS”, and the event KVP “played_ios” is equal to “true”, then append “iOS” to the value for cuf_platform_order
The code also exhibits many examples of or 0 and or '', which are used to deal with null numeric and null string values, respectively. The end result is a field that will produce values such as “FbiOS” or “AmzAndFb”, depending on how many platforms each individual user uses and in what order they use them. As before, the CUF machine name needs to match the name used in the field here, cuf_platform_order.
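For experimenting with the logic outside Omniata, the same rules can be written more compactly as a table-driven helper. This is a sketch rather than a drop-in replacement for the Data Field: the PLATFORMS table, the next_platform_order name, and the stand-in tables are all made up for the example, and it assumes that an event matching no new platform should leave the stored value unchanged.

```lua
-- Hypothetical lookup of event KVP names and their short designations.
local PLATFORMS = {
  { key = 'played_fb',  tag = 'Fb'  },
  { key = 'played_amz', tag = 'Amz' },
  { key = 'played_and', tag = 'And' },
  { key = 'played_ios', tag = 'iOS' },
}

-- Returns the new platform order for one event.
local function next_platform_order(event_kvp, user_vars)
  local order = user_vars['cuf_platform_order'] or ''
  -- All four platforms recorded: nothing left to append.
  if string.len(order) >= 11 then return order end
  for _, p in ipairs(PLATFORMS) do
    -- Plain-text find (4th argument true) guards against duplicates.
    if event_kvp[p.key] == 'true' and not string.find(order, p.tag, 1, true) then
      return order .. p.tag
    end
  end
  return order  -- assumption: keep the existing value when nothing is new
end

-- Simulate a user who plays on iOS, Facebook, iOS again, then Android.
local user_vars = {}
for _, key in ipairs({ 'played_ios', 'played_fb', 'played_ios', 'played_and' }) do
  user_vars['cuf_platform_order'] = next_platform_order({ [key] = 'true' }, user_vars)
  print(key, user_vars['cuf_platform_order'])
end
```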
When creating more advanced Lua code for Data Fields, it is sometimes desirable to be able to test the code quicker than running an on-demand job for a Reporting Table that contains the Data Field(s). In that case, Omniata recommends using the Lua.org website’s testing page, located here:
When testing, there are a few things to keep in mind. First, you’ll need to replace all Omniata-specific references to fields with plain text strings, which will function as variables. Second, you’ll need to set the values for the variables. If we consider the code in the screenshot below, it will generate an error in the Lua console:
The error is caused by the user_vars['invites_sent'] text, because the user_vars table does not exist outside Omniata. To successfully test it, only a few simple changes need to be made:
A new line was added before the io.write() statement, which declared the variable invites_sent and set the value to one. Within the io.write(), the syntax was changed to match the variable that was declared, and the output now shows the correct value for the expression.
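Since the screenshots are not reproduced here, the adjusted test code described above would look roughly like the following. This is a reconstruction under the assumption that the expression being tested is the invites_sent default from earlier in this article.

```lua
-- Declare a plain variable in place of user_vars['invites_sent'].
local invites_sent = 1

-- The expression under test, with the Omniata reference swapped out.
io.write(tonumber(invites_sent or 0), "\n")

-- Setting the variable to nil exercises the "or 0" fallback.
local missing = nil
io.write(tonumber(missing or 0), "\n")
```

Changing the value assigned to invites_sent is a quick way to check how the expression reacts to different inputs before putting it into a Data Field.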