Calculated Fields

From Explore Analytics: The Wiki
Jump to navigation Jump to search

Introduction

A calculated field is calculated from other fields in the list. The calculated field itself does not exist in the data source. Its value for each row is calculated from the values of the other fields.

The calculated value does not have to be numeric. In fact you can calculate a date, a time duration, or perform some text manipulation to create the value.

Adding a Calculated Field

To see how to add a calculated field, let's see an example. We'll start with this view showing the 10 hottest days in NYC. The temperatures are shown in Celsius.

10 hottest days.png

To also show the temperatures in Fahrenheit, we'll add a calculated field by clicking the calculator button (see arrow in the above picture).

Using the Calculated Field dialog, we can build a calculation that can include arithmetic as well as a selection of functions. For this example, we only need arithmetic.

Calculated field.png

The following picture shows the list view with the calculated field:

10 hottest days fahrenheit.png

Syntax for fields, literal numbers, and literal text

  • fields - are entered in double quotes, for example, "Temperature in C"
  • literal numbers - are entered without quote, for example, -3.5
  • literal text - is entered in single quotes, for example, '2010-01-01'

Note that literal function arguments such as format, date/time, and constant values that are not a field or a literal number, are entered as literal text in single quotes

Performing Arithmetic Calculation

Use the multiplication, subtraction, division, and addition operators as well as parenthesis to perform numeric calculation. In a numeric calculation, the operands and the results are numeric. To perform more complex calculation, use numeric functions such as ABS, ROUND, etc.

Logical Calculations - Comparisons

Use the equals, not-equals, greater than, less than, greater or equals, or less or equal operators to compare values. In these comparisons, the operands may be numbers, dates, or text. The result is always 'true' or 'false'. For example, when comparing two dates, the comparison will be a date/time comparison and the result would be true or false.

You can use such logical calculations in functions such as IF, NOT, OR, AND to perform more complex conditional logic in calculations. See examples below in the description of those functions.

Adding a Function to the Calculation

To demonstrate the use of functions in a calculation, let's continue with our example, and calculate the number of years ago when we saw that temperature.

Years ago.png

The calculation extract the year part of the date and subtract it from the year part of today's date. The TODAY() function returns today's date. The YEAR() function returns the year part of the date as a number.

Next is the list view with the "Years Ago" field.

10 hottest days years ago.png

Functions

The table below shows all the functions that are available for field calculations.

  • ABS(value) - Absolute value. The value must be numeric and the returned value is based on the type of the value.

ABS("Amount")

  • AND(logical expression,logical expression,...) - evaluated to 'true' if all the arguments evaluate to 'true'.

IF(AND("Status" = 'Resolved', "Closed" <= NOW()), 'Closed', 'Pending')

  • CATEGORIZE(text, item1, ..., category) - if text matches any of the items, it's replaced by category, otherwise text is returned unchanged.

CATEGORIZE("State",'canceled','aborted','timeout', 'error', 'Unsuccessful')

  • CEILING(value) - the smallest integer greater than or equal to the value
  • CONCAT(text,text,...) - concatenates two or more text values returning the concatenated text

CONCAT("State", ' ', "Zip")

  • CONVERT_TIME_ZONE - Convert the date and time value from the "from" timezone to the "to" timezone. To specify a timezone, enclose it in single quotes. Timezone are taken from this list.

CONVERT_TIME_ZONE("Opened", 'America/New_York', 'Asia/Manila')

  • CSV_SEARCH(csv,item1,item2,item3) - Search a comma separated text for an item matching item1. If found, return item 2, otherwise return item3
  • CSV_SELECT(csv,n) - Select the nth item in a comma-separated text. Items are numbered from 1 (first). To get items from the end, use negative numbers where -1 is first from the end (the last), -2 is the second from the end (the second to last), etc.
  • DATE_ADD(date,duration) - adds the duration (number of seconds) to the date and returns the new date
  • DATE_ADD(date,interval,count) - adds count intervals to the date. Count is the number of intervals and the interval can be YEAR, QUARTER, MONTH, WEEK, DAY, HOUR, MINUTE, or SECOND
  • DATE_SUB(date,duration) - subtract the duration (number of seconds) from the date and returns the new date
  • DATE_SUB(date,interval,count) - subtracts count intervals from the date. Count is the number of intervals and the interval can be YEAR, QUARTER, MONTH, WEEK, DAY, HOUR, MINUTE, or SECOND
  • DAY(date) - the day of the month - a number between 1 and 31
  • DECODE(text,item1,result1,...,default) - Decodes text. If it matches an item, returns the corresponding result, if no item is matches, the default is returned. If a default is not provided, then null is returned.

DECODE("Risk", '1', 'High', '2', 'Medium', '3', 'Low', 'Unknown')

  • DELIMITER_SEARCH(delimiter,text,item1,item2,item3) - Search a delimited text for an item matching item1. If found, return item 2, otherwise return item3
  • DELIMITER_SELECT(delimiter,text,n) - Select the nth item in a delimited text. Items are numbered from 1 (first). To get items from the end, use negative numbers where -1 is first from the end (the last), -2 is the second from the end (the second to last), etc.

DELIMITER_SELECT('/', "Path", 1)

  • DISPLAY(field) - Certain fields such as fields of type choice have an interval value (e.g. '1') and a display value (e.g., 'Pending'). Normally, the calculation operates on the value, not the display value, but if you'd like to use the display value, then use this function.

IF(DISPLAY("State") = 'New', 'New', 'Existing')

  • DURATION(begin,end) - Difference in seconds between the begin and end. The returned value is of type duration.
  • FLOOR(value) - the largest integer less than or equal to the value
  • FORMAT(value,format) - if value is date/time, the format will be a date/time format. The format is specified according to the Java Date format specifications. For example, 'MM/dd/yyyy HH:mm' would render: 12/06/2014 17:46. If value is numeric, the format will be a number format. When entering a number format, you can enter formats based on the Java Decimal Format specifications.

FORMAT("ORDER_DATE",'ddMMyyyy')

The following will display the name of the day of the week (e.g., Wednesday)

FORMAT("ORDER_DATE",'EEEE')

The following will format a dollar amount:

FORMAT("Amount", '$#,##0.00')

  • FORMAT_DURATION(value) - given a numeric value (number of seconds), format it as a duration. The result is text. Please note that if you have a numeric duration value, you can cause it to be formatted as duration simply by setting the Calculation Data Type to "duration".
  • HOUR(date) - the hour of the day - a number between 0 and 23
  • IF(logical expression,true value,false value) - If the logical expression evaluate to 'true', the result is the true value, otherwise it's the false value. The logical expression can use a comparison operator, but could also perform any calculation that results in 'true' or 'false'.

IF("Opened Date" = "Resolved Date", 'first call resolution','normal')

  • IFNULL(value,replacement) - If the input is null or empty, replace it with the replacement value. The type of the replacement must agree with the type of the text
  • INSTR(text,item,start) - the position of item in text. If start is specified, the search start at the specified start position. The position of the first character is 1. If not found, the position is 0
  • IS_NUMBER(text) - checks if text is a valid number. Returns "true" if it is or "false" if it isn't. If text is empty, the result false. The true/false result can then be used in a logical expression, for example in an IF function.
  • LEFT(text,count) - extracts the first count (leftmost) characters of the text. If count is negative, the result is empty.
  • LEN(text) - returns the number of characters in text. If text is null, returns null. Not recommended for getting the length of a numeric or date/time field. In that case, it returns the length of the unformatted value.
  • LOOKUP(value,data source,table,key field,value field) - lookup value in a lookup table. The table is specified by data source and table. The lookup value is matched against the key field. If there's a match, it returns the value of the value field of the lookup table. Lookup is case insensitive. If there's no match, the returned value is empty (null). When specifying the data source, table, and fields, you can use their name or their label.

Please note that when entering the data source, table, key field, and value field, you must enclose those names in single quotes and not double quotes. For example, to lookup a sales_target table in the explore data source, the LOOKUP function might look like this:

LOOKUP("Department", 'explore', 'sales_target', 'department_id', 'target')

  • LOOKUP_AGGREGATE(value,data source,table,key field,value field,aggregate) - lookup an aggregate value in a lookup table. The table is specified by data source and table. The lookup value is matched against the key field. If there's a match, it returns the aggregate value (e.g., sum) of the value field of the lookup table. Lookup is case insensitive. If there's no match, the returned value is empty (null). When specifying the data source, table, and fields, you can use their name or their label.

Please note that when entering the data source, table, key field, and value field, you must enclose those names in single quotes and not double quotes.

The aggregate must be one of the following: 'sum', 'average', 'min', 'max', 'count rows', 'count non-empty values', 'count distinct values'. Unless the aggregate is a count, the value field must be numeric or duration.

For example, to lookup the total quota in a sales_quota table in the explore data source, the LOOKUP_AGGREGATE function might look like this:

LOOKUP_AGGREGATE("Department", 'explore', 'sales_quota', 'department_id', 'quota', 'sum')

  • MID(text,start,count) - extracts count characters from the text starting at the start position. The position of the first character is 1. If count is not specified, MID extracts all the characters until the end of the text
  • MINUTE(date) - the minute of the hour - a number between 0 and 59
  • MONTH(date) - the month of the year - a number between 1 and 12
  • NOT(logical expression) - if the logical expression evaluated to 'true', then NOT returns 'false', otherwise it returns 'true'.

IF(NOT("Has Error"),'Passed','Failed')

  • NOW() - The current date and time (at the data source of this list view).
  • OR(logical expression,logical expression,...) - evaluates to 'true' of at least one of its arguments is 'true'

IF(OR("Status" = 'FCR', "Opened Date" = "Resolved Date"), 'first call resolution', 'normal')

  • QUARTER(date) - the quarter of the year - a number between 1 and 4
  • RANK(category,...) - Return the number of times that the list of categories have been seen. Returns an integer starting with 1

For example, the following will number the rows within each region. Given that the list is sorted, this can allow you to then select the top items for each region

RANK("Region")

To simply number the rows in your list:

RANK()

Note: RANK works based on the sorting of the rows that happens before fields are calculated. If the list is later sorted based on a calculated field, or if the list is sorted in the browser, the RANK order will still reflect the original order.

  • REPLACE(text,item,replacement) - Replaces all occurrences of text item with replacement. The result is text
  • RIGHT(text,count) - extracts the last count (rightmost) characters of the text
  • ROUND(value) - rounds a numeric value to an integer
  • SEARCH(text,item1,item2,item3) - Search the text for item1. If found, return item 2, otherwise return item3
  • SECOND(date) - the second - a number between 0 and 59
  • TO_DATE(text, format) - converts a formatted date to a standard Explore Analytics date (and time). If format is not specified, the text can be in the standard Explore Analytics format (e.g. '2015-10-31 14:01:03') or in mm/dd/yyyy format (e.g., '10/31/2015 14:01:03'). The date may omit the time part. If format is specified, it follows the Java simple date format specifications. Format can also be 'epoch' and it interprets the text to be a number of seconds since 01/01/1970 (known as Unix timestamp) and will convert it to a date in UTC.
  • TO_NUMBER(text) - validate that text is a number and return it as a numeric value that can be used in arithmetic. If text is not a valid number, an error message will be issued and the view will not be produced
  • TODAY() - The current date. Same as NOW, but with the time part truncated.
  • TRUNC(value) - if value is numeric, the fraction part is removed. If value is datetime, the time-of-day part is removed
  • TRUNC(date,interval) - truncate the date down to the specified interval. For example, if interval is YEAR, the result is the top of the year (January 1 at 00:00:00). Similarly, QUARTER is top of the quarter, MONTH is top of month, WEEK_MONDAY is the start time of the week (Monday), WEEK_SUNDAY is the beginning of the week (Sunday), DAY is the date without the time part, HOUR is the date and time truncated to the beginning of the hour

TRUNC("ORDER_DATE", 'week_sunday')

  • WEEK(date,first_day,min_days) - the week number (week of the year) where week starts on first_day (Sunday or Monday) and week 1 is the first week of the year that has at least min_days (1, 4, or 7) in the year.

WEEK("ORDER_DATE", 'Sunday', 4)

The example above, calculated the week of the year (a number from 1 to 52 or 53), where week 1 starts on Sunday and it's the first week of the year to have at least 4 days in the year (in other words, contains January 4)

  • YEAR(date) - the year part of a datetime value returned as a number (e.g. 2015)

Deprecated Functions

the COMPARE and TEXT_COMPARE are there for compatibility. For new calculations, we recommend that you'll use the IF function and logical calculations. In this newer method, we support the operators like greater-than and there's no longer a need to put such operators in quotes.

  • COMPARE(item1, operator, item2, trueValue, falseValue) - compares numeric item1 to item2. The operator is one of '=' '!=' '>' '>=' '<' '<=' (enter the operator with the single quotes). If the comparison is true, the result is the value 'true', or if a trueValue is specified, the result is trueValue. If the comparison fails, the result is the value 'false', or if falseValue is specified the result is falseValue

COMPARE(TO_NUMBER("Result"), '>=', 3, 'Pass', 'Fail')

The same can now be written as:

IF(TO_NUMBER("Result") >= 3, 'Pass', 'Fail')

  • TEXT_COMPARE(item1, operator, item2, trueValue, falseValue) - compares text item1 to item2. The operator is one of = != > >= < <= . The comparison is alphanumeric. If the comparison is true, the result is the value 'true', or if a trueValue is specified, the result is trueValue. If the comparison fails, the result is the value 'false', or if falseValue is specified the result is falseValue