Operators

select

This is the functional version of the selection operator. Selection is a row-filtering operation. It takes a predicate as the filtering criterion and returns a function. The returned function is called with the DataModel instance on which the action needs to be performed.

SelectionPredicate is a function which returns a boolean value. For the selection operation, the predicate is called for each row of the DataModel instance with the current row passed as an argument.

After executing SelectionPredicate, each row is labeled as an entry of either the selection set or the rejection set.

FilteringMode operates on the selection and rejection sets to determine which one is reflected in the resultant DataModel.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the DataModel variable.
 const select = DataModel.Operators.select;
 const usaCarsFn = select(fields => fields.Origin.value === 'USA');
 const outputDM = usaCarsFn(dm);
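
The selection and rejection sets can be pictured with a small plain-JavaScript sketch. It does not use DataModel: the `rows` array, the `partition` and `selectRows` helpers, and the mode strings `'normal'` and `'inverse'` are assumptions made for illustration only. Each row mimics the `fields.<name>.value` shape used in the predicate above.

```javascript
// Hypothetical rows; each field exposes a `value`, as in the predicate above.
const rows = [
    { Name: { value: 'chevrolet chevelle' }, Origin: { value: 'USA' } },
    { Name: { value: 'toyota corona' }, Origin: { value: 'Japan' } },
    { Name: { value: 'ford torino' }, Origin: { value: 'USA' } }
];

// Label every row as an entry of the selection set or the rejection set.
function partition(rows, predicate) {
    const selection = [];
    const rejection = [];
    rows.forEach((row, i) => (predicate(row, i) ? selection : rejection).push(row));
    return { selection, rejection };
}

// Illustrative stand-in for a filtering mode: 'normal' keeps the selection
// set, 'inverse' keeps the rejection set.
function selectRows(predicate, mode = 'normal') {
    return (rows) => {
        const { selection, rejection } = partition(rows, predicate);
        return mode === 'inverse' ? rejection : selection;
    };
}

const isFromUSA = fields => fields.Origin.value === 'USA';
const usaRows = selectRows(isFromUSA)(rows);
const nonUsaRows = selectRows(isFromUSA, 'inverse')(rows);
console.log(usaRows.length, nonUsaRows.length); // 2 1
```

As with the operator, `selectRows` returns a function, so the same predicate can be applied to any number of inputs.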

Parameters:

Name | Type | Description
selectFn | SelectionPredicate | Predicate function which is called for each row with the current row: function (row, i) { ... }
config | Object | The configuration object that controls the inclusion or exclusion of a row in the resultant DataModel instance
config.mode | FilteringMode | The mode of the selection

Note

The selection and rejection sets are only a logical idea, used to explain the concept.

Returns:

PreparatorFunction: Function which expects an instance of DataModel on which the operator needs to be applied.

project

This is the functional version of the projection operator. Projection is a column (field) filtering operation. It expects a list of field names and either includes or excludes those fields in the resultant DataModel, based on FilteringMode. It returns a function which is called with the DataModel instance on which the action needs to be performed.

Projection expects an array of field names, based on which it creates the selection and rejection sets. All fields whose names are present in the array go into the selection set, and the rest of the fields go into the rejection set.

FilteringMode operates on the selection and rejection sets to determine which one is reflected in the resultant DataModel.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the DataModel variable.
 const project = DataModel.Operators.project;
 const projectFn = project(['Name'], { mode: DataModel.FilteringMode.INVERSE });
 const outputDM = projectFn(dm);
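
A rough plain-JavaScript sketch of the same idea for fields follows. It is not DataModel code: the `projectRows` helper, the sample rows and the mode strings `'normal'` and `'inverse'` are hypothetical. Names may be given as strings or regular expressions, as described in the parameters below.

```javascript
// Hypothetical field names and rows, for illustration only.
const rows = [
    { Name: 'ford torino', Origin: 'USA', Horsepower: 140 },
    { Name: 'toyota corona', Origin: 'Japan', Horsepower: 95 }
];

// Fields matched by projFields form the selection set; the rest form the
// rejection set. An entry may be a string (exact match) or a RegExp.
function projectRows(projFields, mode = 'normal') {
    const matches = name =>
        projFields.some(f => (f instanceof RegExp ? f.test(name) : f === name));
    return rows => rows.map((row) => {
        const out = {};
        Object.keys(row).forEach((name) => {
            const selected = matches(name);
            // 'inverse' keeps the rejection set instead of the selection set.
            if (mode === 'inverse' ? !selected : selected) { out[name] = row[name]; }
        });
        return out;
    });
}

const onlyName = projectRows(['Name'])(rows);
const withoutName = projectRows(['Name'], 'inverse')(rows);
console.log(Object.keys(onlyName[0]).join(', '));    // Name
console.log(Object.keys(withoutName[0]).join(', ')); // Origin, Horsepower
```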

Parameters:

Name | Type | Description
projField | Array<(string|Regexp)> | An array of column names, given as strings or regular expressions.
config | Object | An optional config to control the creation of the new DataModel
config.mode | FilteringMode | Mode of the projection

Note

The selection and rejection sets are only a logical idea, used to explain the concept.

Returns:

PreparatorFunction: Function which expects an instance of DataModel on which the operator needs to be applied.

groupBy

This is the functional version of the groupBy operator. This operator groups the data by particular dimensions and reduces the measures. It expects a list of dimensions, using which it projects the DataModel and performs aggregations to reduce the duplicate tuples. Refer to this document for the intuition behind groupBy.

DataModel provides definitions of a few Reducers by default. User-defined reducers can also be registered.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the DataModel variable.

 const groupBy = DataModel.Operators.groupBy;
 const groupedFn = groupBy(['Year'], { Horsepower: 'max' });
 const outputDM = groupedFn(dm);
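
To make the grouping and reducing steps concrete, here is a minimal sketch on plain arrays. It does not use DataModel: the `groupRows` helper, the sample rows and the two entries of the `reducers` map are assumptions for illustration only.

```javascript
// Hypothetical rows and reducers, for illustration only.
const rows = [
    { Year: '1970', Horsepower: 130 },
    { Year: '1970', Horsepower: 165 },
    { Year: '1971', Horsepower: 90 }
];
const reducers = {
    max: values => Math.max(...values),
    sum: values => values.reduce((a, b) => a + b, 0)
};

// Group rows by the given dimensions, then reduce each measure per group.
function groupRows(dimensions, measureReducers) {
    return (rows) => {
        const groups = new Map();
        rows.forEach((row) => {
            const key = JSON.stringify(dimensions.map(d => row[d]));
            if (!groups.has(key)) { groups.set(key, []); }
            groups.get(key).push(row);
        });
        return [...groups.values()].map((members) => {
            const out = {};
            dimensions.forEach((d) => { out[d] = members[0][d]; });
            Object.keys(measureReducers).forEach((m) => {
                out[m] = reducers[measureReducers[m]](members.map(r => r[m]));
            });
            return out;
        });
    };
}

const maxHpByYear = groupRows(['Year'], { Horsepower: 'max' })(rows);
console.log(JSON.stringify(maxHpByYear));
// [{"Year":"1970","Horsepower":165},{"Year":"1971","Horsepower":90}]
```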

Parameters:

Name | Type | Description
fieldsArr | Array of string | Array containing the names of the dimensions
reducers | Object | A map whose key is the variable name and whose value is the name of the reducer. If it is not passed, or if any variable is omitted from the object, the default aggregation function from the schema of the variable is used.

Returns:

PreparatorFunction: Function which expects an instance of DataModel on which the operator needs to be applied.

compose

It enables you to create a new operator by composing existing operators. The newly created operator is used like any other operator. The operations provided are executed serially, i.e. the result of one operation is the input for the next operation (like the pipe operator in Unix).

Compose has added benefits which chaining does not provide. For example, if a group of operators is involved in transforming data, chaining would create intermediate DataModel instances. If compose is used, no intermediate DataModels are created.

Supported operators in compose are:

  • select
  • project
  • groupBy
  • bin
  • Any operator created using compose

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the DataModel variable.
 const compose = DataModel.Operators.compose;
 const select = DataModel.Operators.select;
 const project = DataModel.Operators.project;

 const lowCylCarsFromUSADM = compose(
     select(fields => fields.Origin.value === 'USA' && fields.Cylinders.value === '4' ),
     project(['Origin', 'Cylinders'], { mode: DataModel.FilteringMode.INVERSE })
 );

 const outputDM = lowCylCarsFromUSADM(dm);
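
The serial, pipe-like execution can be sketched in a few lines of plain JavaScript. This is only an illustration of the piping order: the `compose`, `select` and `dropFields` functions below are hypothetical stand-ins working on arrays, and the sketch does not reproduce the way DataModel avoids intermediate instances.

```javascript
// Each operator is a function from rows to rows; compose pipes them left to
// right, so the output of one operator is the input of the next.
const compose = (...operators) => input =>
    operators.reduce((acc, op) => op(acc), input);

// Hypothetical operators working on plain arrays of objects.
const select = predicate => rows => rows.filter(predicate);
const dropFields = names => rows => rows.map((row) => {
    const out = { ...row };
    names.forEach((n) => { delete out[n]; });
    return out;
});

const rows = [
    { Name: 'ford pinto', Origin: 'USA', Cylinders: 4 },
    { Name: 'ford torino', Origin: 'USA', Cylinders: 8 },
    { Name: 'toyota corona', Origin: 'Japan', Cylinders: 4 }
];

const lowCylCarsFromUSA = compose(
    select(row => row.Origin === 'USA' && row.Cylinders === 4),
    dropFields(['Origin', 'Cylinders'])
);
console.log(JSON.stringify(lowCylCarsFromUSA(rows))); // [{"Name":"ford pinto"}]
```

Because the composed result has the same shape as its parts, it can itself be passed to `compose`, which mirrors the last entry in the list of supported operators.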

Parameters:

Name | Type | Description
operators | Array of Operators | An array of operations that will be applied on the DataModel.

Returns:

PreparatorFunction: Function which expects an instance of DataModel on which the operator needs to be applied.

difference

The difference operator is written as (A - B), where A and B are instances of DataModel. The result of the difference is a DataModel instance which includes the tuples that are present in A and not in B. For difference to work, the schemas of both DataModel instances have to be the same.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the variable DataModel.

 // Creates a DataModel instance only including USA. Using chained version for conciseness.
 const usaMakerDM = dm.select(fields => fields.Origin.value === 'USA');

 const difference = DataModel.Operators.difference;
 const outputDM = difference(dm, usaMakerDM);
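
The (A - B) semantics can be sketched on plain arrays as follows. The `difference` and `rowKey` helpers and the sample rows are hypothetical and only illustrate the set operation; comparing rows by their serialized field values is an assumption of this sketch.

```javascript
// Serialize a row so that rows with equal field values compare equal.
const rowKey = row => JSON.stringify(Object.keys(row).sort().map(k => [k, row[k]]));

// Keep the rows of the left operand that do not occur in the right operand.
function difference(leftRows, rightRows) {
    const rightKeys = new Set(rightRows.map(rowKey));
    return leftRows.filter(row => !rightKeys.has(rowKey(row)));
}

const allCars = [
    { Name: 'ford torino', Origin: 'USA' },
    { Name: 'toyota corona', Origin: 'Japan' },
    { Name: 'saab 99e', Origin: 'Europe' }
];
const usaCars = allCars.filter(row => row.Origin === 'USA');

const nonUsaCars = difference(allCars, usaCars);
console.log(nonUsaCars.map(row => row.Name).join(', ')); // toyota corona, saab 99e
```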

Parameters:

Name | Type | Description
leftDM | DataModel | Instance of DataModel from which the difference will be calculated. For the notation (A - B), A is the leftDM.
rightDM | DataModel | Instance of DataModel which will be used to calculate the difference from the leftDM. For the notation (A - B), B is the rightDM.

Returns:

DataModel: New DataModel instance with the result of the operation.

join

Performs a cross product between two DataModel instances, with an optional predicate which determines which tuples should be included, and returns a new DataModel instance containing the results. This operation is also called theta join.

A cross product takes two sets and creates one set in which each value of one set is paired with each value of the other set.

This method takes an optional predicate which filters the generated result rows. The predicate is called for every tuple. If the predicate returns true, the combined row is included in the resultant table.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the variable DataModel.

 // Creates two small DataModel instances from the original DataModel instance, which will be joined.
 let makerDM = dm.groupBy(['Origin', 'Maker']).project(['Origin', 'Maker']);
 let nameDM = dm.project(['Name','Miles_per_Gallon']);

 const join = DataModel.Operators.join;
 let outputDM = join(makerDM, nameDM,
     (makerDM, nameDM) => makerDM.Maker.value === nameDM.Name.value.split(/\s/)[0]);
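
The relationship between the cross product and the predicate can be sketched on plain arrays. The `thetaJoin` helper and the sample rows are hypothetical; the sketch only shows that every left row is paired with every right row and that the predicate then decides which pairs survive.

```javascript
// Pair every left row with every right row; keep the pairs accepted by the
// predicate. Without a predicate the result is the full cross product.
function thetaJoin(leftRows, rightRows, predicate = () => true) {
    const result = [];
    leftRows.forEach((left) => {
        rightRows.forEach((right) => {
            if (predicate(left, right)) { result.push({ ...left, ...right }); }
        });
    });
    return result;
}

const makers = [
    { Origin: 'USA', Maker: 'ford' },
    { Origin: 'Japan', Maker: 'toyota' }
];
const names = [
    { Name: 'ford torino', Miles_per_Gallon: 17 },
    { Name: 'toyota corona', Miles_per_Gallon: 24 },
    { Name: 'ford pinto', Miles_per_Gallon: 25 }
];

const crossProduct = thetaJoin(makers, names); // 2 x 3 pairs
const joined = thetaJoin(makers, names,
    (maker, name) => maker.Maker === name.Name.split(/\s/)[0]);
console.log(crossProduct.length, joined.length); // 6 3
```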

Parameters:

Name | Type | Description
leftDM | DataModel | Instance of DataModel
rightDM | DataModel | Instance of DataModel
filterFn | SelectionPredicate | The predicate function that will filter the result of the cross product.

Returns:

DataModel: New DataModel instance created after joining.

naturalJoin

Natural join is a special kind of join in which the filtering of rows is performed internally by resolving the fields common to both tables; the rows with common values are included.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the variable DataModel.

 // Creates two small DataModel instances from the original DataModel instance, which will be joined. Used chained
 // operators for conciseness. Maker is kept in both instances so that there is a common field to join on.
 const makerDM = dm.groupBy(['Origin', 'Maker']).project(['Origin', 'Maker']);
 const nameDM = dm.project(['Maker', 'Name', 'Miles_per_Gallon']);

 const naturalJoin = DataModel.Operators.naturalJoin;
 const outputDM = naturalJoin(makerDM, nameDM);
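
How the common fields drive the join can be sketched on plain arrays. The `naturalJoin` helper below is a hypothetical stand-in, and reading the field names off the first row of each input is an assumption of this sketch.

```javascript
// Join rows whose values agree on every field present in both inputs.
function naturalJoin(leftRows, rightRows) {
    if (!leftRows.length || !rightRows.length) { return []; }
    // Field names are read from the first row of each input.
    const common = Object.keys(leftRows[0]).filter(k => k in rightRows[0]);
    const result = [];
    leftRows.forEach((left) => {
        rightRows.forEach((right) => {
            if (common.every(k => left[k] === right[k])) {
                result.push({ ...left, ...right });
            }
        });
    });
    return result;
}

const makers = [
    { Origin: 'USA', Maker: 'ford' },
    { Origin: 'Japan', Maker: 'toyota' }
];
const cars = [
    { Maker: 'ford', Name: 'ford torino' },
    { Maker: 'toyota', Name: 'toyota corona' },
    { Maker: 'ford', Name: 'ford pinto' }
];

const joined = naturalJoin(makers, cars); // Maker is the only common field
console.log(joined.map(row => `${row.Origin}: ${row.Name}`).join(', '));
// USA: ford torino, USA: ford pinto, Japan: toyota corona
```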

Parameters:

Name | Type | Description
leftDM | DataModel | Instance of DataModel
rightDM | DataModel | Instance of DataModel

Returns:

DataModel: New DataModel instance with joined data

leftOuterJoin

Left outer join between two DataModel instances is a kind of join that ensures that all the tuples from the left DataModel are present in the resultant DataModel. This operator takes a predicate which gets called for every combination of tuples (created by the Cartesian product). Based on the value of the predicate, equality is established between the two DataModel instances.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the variable DataModel.

 // Creates two small DataModel instances from the original DataModel instance, which will be joined using left outer
 // join.
 let makerDM = dm.groupBy(['Origin', 'Maker']).project(['Origin', 'Maker']);
 let nameDM = dm.project(['Name','Miles_per_Gallon']);

 const leftOuterJoin = DataModel.Operators.leftOuterJoin;
 let outputDM = leftOuterJoin(makerDM, nameDM,
     (makerDM, nameDM) => makerDM.Maker.value === nameDM.Name.value.split(/\s/)[0]);
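
The guarantee that every left tuple survives can be sketched on plain arrays. The `leftOuterJoin` helper here is hypothetical, and padding the right-hand fields of an unmatched row with `null` is an assumption of this sketch, not a statement about how DataModel represents missing values.

```javascript
// Keep every left row: matched rows are combined with each matching right
// row, unmatched rows are padded with null for the right-hand fields.
function leftOuterJoin(leftRows, rightRows, predicate) {
    const rightFields = rightRows.length ? Object.keys(rightRows[0]) : [];
    const result = [];
    leftRows.forEach((left) => {
        const matches = rightRows.filter(right => predicate(left, right));
        if (matches.length) {
            matches.forEach(right => result.push({ ...left, ...right }));
        } else {
            const padding = {};
            rightFields.forEach((f) => { padding[f] = null; });
            result.push({ ...left, ...padding });
        }
    });
    return result;
}

const makers = [
    { Origin: 'USA', Maker: 'ford' },
    { Origin: 'Japan', Maker: 'toyota' },
    { Origin: 'Europe', Maker: 'saab' }
];
const names = [{ Name: 'ford torino' }, { Name: 'toyota corona' }];

const joined = leftOuterJoin(makers, names,
    (maker, name) => maker.Maker === name.Name.split(/\s/)[0]);
console.log(joined.length, joined[2].Maker, joined[2].Name); // 3 saab null
```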

Parameters:

Name | Type | Description
leftDm | DataModel | Instance of DataModel
rightDm | DataModel | Instance of DataModel
filterFn | SelectionPredicate | The predicate function that will filter the result of the cross product.

Returns:

DataModel: New DataModel instance created after the left outer join operation.

rightOuterJoin

Right outer join between two DataModel instances is a kind of join that ensures that all the tuples from the right DataModel are present in the resultant DataModel. This operator takes a predicate which gets called for every combination of tuples (created by the Cartesian product). Based on the value of the predicate, equality is established between the two DataModel instances.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the variable DataModel.

 // Creates two small DataModel instances from the original DataModel instance, which will be joined using right outer
 // join.
 let makerDM = dm.groupBy(['Origin', 'Maker']).project(['Origin', 'Maker']);
 let nameDM = dm.project(['Name','Miles_per_Gallon']);

 const rightOuterJoin = DataModel.Operators.rightOuterJoin;
 let outputDM = rightOuterJoin(makerDM, nameDM,
     (makerDM, nameDM) => makerDM.Maker.value === nameDM.Name.value.split(/\s/)[0]);

Parameters:

Name | Type | Description
leftDm | DataModel | Instance of DataModel
rightDm | DataModel | Instance of DataModel
filterFn | SelectionPredicate | The predicate function that will filter the result of the cross product.

Returns:

DataModel: New DataModel instance created after the right outer join operation.

fullOuterJoin

Full outer join between two DataModel instances is a kind of join that ensures that all the tuples from both the left DataModel and the right DataModel are present in the resultant DataModel. This operator takes a predicate which gets called for every combination of tuples (created by the Cartesian product). Based on the value of the predicate, equality is established between the two DataModel instances.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the variable DataModel.

 // Creates two small DataModel instances from the original DataModel instance, which will be joined using full outer
 // join.
 let makerDM = dm.groupBy(['Origin', 'Maker']).project(['Origin', 'Maker']);
 let nameDM = dm.project(['Name','Miles_per_Gallon']);

 const fullOuterJoin = DataModel.Operators.fullOuterJoin;
 let outputDM = fullOuterJoin(makerDM, nameDM,
     (makerDM, nameDM) => makerDM.Maker.value === nameDM.Name.value.split(/\s/)[0]);
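
A full outer join additionally keeps the right-hand rows that no left row matched. The sketch below works on plain arrays; the `fullOuterJoin` and `blank` helpers are hypothetical, and filling the missing side with `null` is an assumption of this sketch.

```javascript
// Build an object with every field of the given rows set to null.
const blank = rows =>
    Object.fromEntries(Object.keys(rows[0] || {}).map(k => [k, null]));

// Keep matched pairs, unmatched left rows and unmatched right rows.
function fullOuterJoin(leftRows, rightRows, predicate) {
    const matchedRight = new Set();
    const result = [];
    leftRows.forEach((left) => {
        let matched = false;
        rightRows.forEach((right, i) => {
            if (predicate(left, right)) {
                matched = true;
                matchedRight.add(i);
                result.push({ ...left, ...right });
            }
        });
        if (!matched) { result.push({ ...left, ...blank(rightRows) }); }
    });
    rightRows.forEach((right, i) => {
        if (!matchedRight.has(i)) { result.push({ ...blank(leftRows), ...right }); }
    });
    return result;
}

const makers = [{ Origin: 'USA', Maker: 'ford' }, { Origin: 'Europe', Maker: 'saab' }];
const names = [{ Name: 'ford torino' }, { Name: 'toyota corona' }];

const joined = fullOuterJoin(makers, names,
    (maker, name) => maker.Maker === name.Name.split(/\s/)[0]);
console.log(JSON.stringify(joined[2]));
// {"Origin":null,"Maker":null,"Name":"toyota corona"}
```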

Parameters:

Name | Type | Description
leftDm | DataModel | Instance of DataModel
rightDm | DataModel | Instance of DataModel
filterFn | SelectionPredicate | The predicate function that will filter the result of the cross product.

Returns:

DataModel: New DataModel instance created after the full outer join operation.

calculateVariable

Creates a new variable calculated from existing variables. This method expects the definition of the newly created variable and a function which resolves the value of the new variable from existing variables.

This operator is not supported in compose.

Creates a new measure based on existing variables

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the DataModel variable.

 const calculateVariable = DataModel.Operators.calculateVariable;
 const creatorFn = calculateVariable({
     name: 'powerToWeight',
     type: 'measure' // Schema of variable
 }, ['Horsepower', 'Weight_in_lbs', (hp, weight) => hp / weight ]);
 const outputDM = creatorFn(dm);

Creates a new dimension based on existing variables

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the DataModel variable.

 const calculateVariable = DataModel.Operators.calculateVariable;
 const creatorFn = calculateVariable({
     name: 'Efficiency',
     type: 'dimension'
 }, ['Horsepower', (hp) => {
     if (hp < 80) { return 'low'; }
     else if (hp < 120) { return 'moderate'; }
     else { return 'high'; }
 }]);
 const outputDM = creatorFn(dm);
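
The resolver format used in both examples (dependency names followed by a function) can be sketched on plain arrays as follows. The `calculateVariable` function below is a hypothetical stand-in, not the DataModel implementation, and it ignores everything in the schema except `name`.

```javascript
// The resolver lists the names of the existing variables, followed by a
// function that receives their values and returns the new value.
function calculateVariable(schema, resolver) {
    const resolve = resolver[resolver.length - 1];
    const dependencies = resolver.slice(0, -1);
    return rows => rows.map(row => ({
        ...row,
        [schema.name]: resolve(...dependencies.map(d => row[d]))
    }));
}

// Hypothetical rows, for illustration only.
const rows = [
    { Horsepower: 100, Weight_in_lbs: 2000 },
    { Horsepower: 150, Weight_in_lbs: 2500 }
];

const withRatio = calculateVariable(
    { name: 'powerToWeight', type: 'measure' },
    ['Horsepower', 'Weight_in_lbs', (hp, weight) => hp / weight]
)(rows);
console.log(withRatio.map(row => row.powerToWeight).join(', ')); // 0.05, 0.06
```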

Parameters:

Name | Type | Description
schema | Schema | Schema of the newly defined variable
resolver | VariableResolver | Resolver format to resolve the current variable

Returns:

PreparatorFunction: Function which expects an instance of DataModel on which the operator needs to be applied.

sort

Performs sorting according to the specified sorting details. Like every other operator, it doesn't mutate the DataModel instance on which it is called; instead, it returns a new DataModel instance containing the sorted data.

This operator supports multi-level sorting by listing the variables by which sorting needs to be performed and the type of sorting, ASC or DESC.

In the following example, data is sorted by the Origin field in DESC order at the first level, followed by another level of sorting by Acceleration in ASC order.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the DataModel variable.

 const sort = DataModel.Operators.sort;
 const preparatorFn = sort([
     ['Origin', 'DESC'],
     ['Acceleration'] // Default value is ASC
 ]);
 const outputDM = preparatorFn(dm);
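
In the following example, a comparator function is supplied along with the sorting details, so that the ordering is driven by the average of Acceleration.
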
 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted
 // from muze namespace and assigned to DataModel variable.
 const avg = DataModel.Stats.avg;
 const sort = DataModel.Operators.sort;
 const preparatorFn = sort([
     ['Origin', ['Acceleration', (a, b) => avg(a.Acceleration) - avg(b.Acceleration)]]
 ]);
 const outputDM = preparatorFn(dm);
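
Multi-level sorting with a per-level ASC or DESC order can be sketched on plain arrays. The `sortRows` helper and the sample rows are hypothetical; the sketch covers only the `[field, order]` form of the sorting details, not the comparator form.

```javascript
// Compare rows level by level; a later level is consulted only when all
// earlier levels are equal. The order defaults to ASC.
function sortRows(sortingDetails) {
    // Sort a copy so that the input is left untouched.
    return rows => [...rows].sort((a, b) => {
        for (const [field, order = 'ASC'] of sortingDetails) {
            if (a[field] !== b[field]) {
                const cmp = a[field] < b[field] ? -1 : 1;
                return order === 'DESC' ? -cmp : cmp;
            }
        }
        return 0;
    });
}

const rows = [
    { Origin: 'Japan', Acceleration: 14 },
    { Origin: 'USA', Acceleration: 12 },
    { Origin: 'USA', Acceleration: 10 }
];

const sorted = sortRows([['Origin', 'DESC'], ['Acceleration']])(rows);
console.log(sorted.map(row => `${row.Origin} ${row.Acceleration}`).join(', '));
// USA 10, USA 12, Japan 14
```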

Parameters:

Name | Type | Description
sortingDetails | Array of Array | Sorting details based on which the sorting will be performed.

Returns:

PreparatorFunction: Function which expects an instance of DataModel on which the operator needs to be applied.

union

Union can be described as the vertical stacking of all rows from both DataModel instances, provided that both DataModel instances have the same column names.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted from
 // muze namespace and assigned to the variable DataModel.

 // Creates two small DataModel instances from the original DataModel instance, one only for European cars,
 // another for cars from USA. Used the chain operation here for conciseness.
 const usaMakerDM = dm.select(fields => fields.Origin.value === 'USA');
 const euroMakerDM = dm.select(fields => fields.Origin.value === 'Europe');

 const union = DataModel.Operators.union;
 const outputDM = union(usaMakerDM, euroMakerDM);
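
The vertical stacking and the same-column-names requirement can be sketched on plain arrays. The `union` helper below is hypothetical; throwing on mismatched column names and keeping duplicate rows are assumptions of this sketch, not documented DataModel behaviour.

```javascript
// Stack the rows of both operands after checking that the column names agree.
function union(topRows, bottomRows) {
    const fieldsOf = rows => Object.keys(rows[0] || {}).sort().join(',');
    if (topRows.length && bottomRows.length && fieldsOf(topRows) !== fieldsOf(bottomRows)) {
        throw new Error('Both operands need the same column names');
    }
    return [...topRows, ...bottomRows];
}

const usaCars = [{ Name: 'ford torino', Origin: 'USA' }];
const euroCars = [{ Name: 'saab 99e', Origin: 'Europe' }];

const stacked = union(usaCars, euroCars);
console.log(stacked.map(row => row.Name).join(', ')); // ford torino, saab 99e
```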

Parameters:

Name | Type | Description
topDM | DataModel | One of the two operands of union. Instance of DataModel.
bottomDM | DataModel | The other operand of union. Instance of DataModel.

Returns:

DataModel: New DataModel instance with the result of the operation.