DataModel

DataModel is an in-browser representation of tabular data. It supports relational algebra operators as well as generic data processing operators. DataModel extends the Relation class, which defines all the relational algebra operators. DataModel itself defines the generic data processing operators, which are not relational algebra compliant but are needed for ease of use.

constructor

Creates a new DataModel instance from the provided data and schema. Data can be in any of the following forms:

  • Flat JSON
  • DSV String
  • 2D Array

By default, DataModel finds a suitable adapter to serialize the data. DataModel also expects a schema that identifies the variables present in the data.

const DataModel = muze.DataModel; // Retrieves reference to DataModel from muze namespace
 const data = [
     { Name:'chevrolet chevelle malibu', Miles_per_Gallon:18, Cylinders:8, Horsepower:130, Year:'1970' },
     { Name:'ford fiesta', Miles_per_Gallon:36.1, Cylinders:4, Horsepower:66, Year:'1978' },
     { Name:'bmw 320i', Miles_per_Gallon:21.5, Cylinders:4, Horsepower:110, Year:'1977' },
     { Name:'chevrolet chevelle malibu', Miles_per_Gallon:18, Cylinders:8, Horsepower:130, Year:'1970' },
     { Name:'ford fiesta', Miles_per_Gallon:36.1, Cylinders:4, Horsepower:66, Year:'1978' },
     { Name:'bmw 320i', Miles_per_Gallon:21.5, Cylinders:4, Horsepower:110, Year:'1977' }
 ];
 const schema = [
     { name: 'Name', type: 'dimension' },
     { name: 'Miles_per_Gallon', type: 'measure', unit : 'gallon', numberformat: val => `${val}G`},
     { name: 'Cylinders', type: 'dimension' },
     { name: 'Horsepower', type: 'measure' },
     { name: 'Year', type: 'dimension', subtype: 'datetime', format: '%Y' }
];
const dm = new DataModel(data, schema, { name: 'Cars' });
printDM(dm); // internal function to print datamodel, available only in this interface

Parameters:

  • data (Array of Object | string | Array of Array): Input data in any of the formats mentioned above. Check out this example for a practical demonstration of how to feed the different data formats.
  • schema (Array of Schema): Definition of the variables. The order of the variables in the data and in the schema has to be the same.
  • options (object): Optional arguments that specify further settings for the creation of the instance.
      • name (string): Name of the DataModel instance. If no name is given, an auto-generated name is assigned to the instance.
      • fieldSeparator (string): Field separator to use when the data is a DSV string.
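To make the three input formats concrete, the sketch below expresses the same two records as flat JSON, a DSV string, and a 2D array. The helpers `dsvTo2dArray` and `array2dToFlatJson` are hypothetical and are not DataModel's adapters; they assume that the first row holds the field names, that no field is quoted, and that values stay as strings so the three forms compare equal.

```javascript
// The same two records in the three input formats listed above.
const flatJson = [
    { Name: 'ford fiesta', Horsepower: '66' },
    { Name: 'bmw 320i', Horsepower: '110' }
];
const dsvString = 'Name,Horsepower\nford fiesta,66\nbmw 320i,110';
const array2d = [
    ['Name', 'Horsepower'],
    ['ford fiesta', '66'],
    ['bmw 320i', '110']
];

// Hypothetical helper (not part of DataModel): splits a DSV string into a 2D array.
// Assumes no quoted fields and no escaped separators.
function dsvTo2dArray(dsv, fieldSeparator = ',') {
    return dsv.split('\n').map(line => line.split(fieldSeparator));
}

// Hypothetical helper (not part of DataModel): turns a 2D array whose first row
// is a header into flat JSON.
function array2dToFlatJson(rows) {
    const [header, ...body] = rows;
    return body.map(row =>
        Object.fromEntries(header.map((name, i) => [name, row[i]]))
    );
}

console.log(dsvTo2dArray(dsvString));
console.log(array2dToFlatJson(array2d));
```

The second argument of `dsvTo2dArray` plays the role that the `fieldSeparator` option plays for a DSV string input.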

static Reducers

Reducers are simple functions that reduce an array of numbers to a single number representative of the set. For example, the array [10, 20, 5, 15] reduces to 12.5 when the average (mean) reducer is applied. Every measure field in a DataModel (a variable in the data) needs a reducer to handle aggregation.

Returns:

ReducerStore: Singleton instance of ReducerStore.
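As a minimal sketch of the idea, the functions below are standalone stand-ins written for illustration; they are not the reducers that ship with DataModel. Each takes an array of numbers and returns one representative number. Note that this `avg` returns NaN for an empty array.

```javascript
// A reducer maps an array of numbers to one representative number.
const avg = values => values.reduce((total, v) => total + v, 0) / values.length;
const max = values => Math.max(...values);

console.log(avg([10, 20, 5, 15])); // 12.5, the example from the text
console.log(max([10, 20, 5, 15])); // 20
```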

getData

Retrieves the data attached to an instance in JSON format.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm.
 const serializedData = dm.getData({
     order: 'column',
     formatter: {
         Origin: (val) => val === 'European Union' ? 'EU' : val
     }
 });
 console.log(serializedData);

Parameters:

  • options (Object): Options to control how the raw data is returned.
      • order (string): Defines whether data is retrieved in row order or column order. Possible values are 'row' and 'column'.
      • formatter (Object): Formats the output data. This expects an object whose keys are the names of the variables to be formatted and whose values are formatter functions. The formatter function is called for each row, with the value of the cell for that row passed as an argument. The formatter is a function of the form

            (value, rowId, schema) => { ... }

        Learn more about Formatter.

Returns:

Array: Returns a multidimensional array of the data along with the schema. The return format looks like

 {
     data,
     schema
 }
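The formatter option described above can be pictured with the following sketch. It is not DataModel's implementation: `applyFormatter` is a hypothetical helper, the rows are illustrative sample data in row order, and it assumes that `rowId` is the row's index and that the third argument is the full schema array.

```javascript
// Hypothetical helper (not DataModel's implementation): applies per-variable
// formatter functions to row-ordered data. rowId is assumed to be the row index.
function applyFormatter(rows, schema, formatter = {}) {
    return rows.map((row, rowId) =>
        row.map((value, col) => {
            const format = formatter[schema[col].name];
            // Cells of variables without a formatter are returned unchanged.
            return format ? format(value, rowId, schema) : value;
        })
    );
}

// Illustrative sample data, not taken from cars.json.
const schema = [
    { name: 'Name', type: 'dimension' },
    { name: 'Origin', type: 'dimension' }
];
const rows = [
    ['ford fiesta', 'European Union'],
    ['bmw 320i', 'European Union'],
    ['chevrolet chevelle malibu', 'USA']
];

const formatted = applyFormatter(rows, schema, {
    Origin: val => (val === 'European Union' ? 'EU' : val)
});
console.log(formatted);
```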

groupBy

Groups the data using particular dimensions by reducing measures. It expects a list of dimensions, by which it projects the DataModel and performs aggregations to reduce the duplicate tuples. Refer to this document to understand the intuition behind groupBy.

By default, DataModel provides definitions of a few Reducers for reducing a measure when aggregation is required for groupBy. User-defined reducers can also be registered.

This is the chained implementation of groupBy. groupBy also supports composability.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm.
 const outputDM = dm.groupBy(['Year'], { Horsepower: 'max' });

Parameters:

  • fieldsArr (Array of string): Array containing the names of the dimensions by which the grouping should happen.
  • reducers (Object): A simple key-value pair whose key is the variable name and whose value is the name of the reducer. If it is not passed, or if any variable is omitted from the object, the default aggregation function from the schema of the variable is used.

Returns:

DataModel: Returns a new DataModel instance after performing the groupBy.
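The grouping described above (project on the chosen dimensions, then reduce the measures of rows that collapse into the same tuple) can be sketched in plain JavaScript. This is an illustration under stated assumptions, not DataModel's implementation: `groupRows` and the `reducers` table are hypothetical, and the rows are a subset of the fields used in the constructor example.

```javascript
// Hypothetical stand-ins for reducers: each maps an array of numbers to one number.
const reducers = {
    sum: values => values.reduce((total, v) => total + v, 0),
    max: values => Math.max(...values)
};

// Hypothetical helper (not DataModel's implementation): groups flat JSON rows by
// the given dimensions and reduces each listed measure with the named reducer.
function groupRows(rows, dimensions, measureReducers) {
    const groups = new Map();
    for (const row of rows) {
        // Rows with the same dimension values share a key and land in the same group.
        const key = JSON.stringify(dimensions.map(d => row[d]));
        if (!groups.has(key)) { groups.set(key, []); }
        groups.get(key).push(row);
    }
    return [...groups.values()].map(members => {
        const out = {};
        dimensions.forEach(d => { out[d] = members[0][d]; });
        Object.entries(measureReducers).forEach(([measure, reducerName]) => {
            out[measure] = reducers[reducerName](members.map(m => m[measure]));
        });
        return out;
    });
}

const cars = [
    { Name: 'chevrolet chevelle malibu', Horsepower: 130, Year: '1970' },
    { Name: 'ford fiesta', Horsepower: 66, Year: '1978' },
    { Name: 'bmw 320i', Horsepower: 110, Year: '1977' },
    { Name: 'chevrolet chevelle malibu', Horsepower: 130, Year: '1970' }
];
console.log(groupRows(cars, ['Year'], { Horsepower: 'max' }));
```

Only the grouping dimensions and the reduced measures appear in the output, so the two 1970 rows collapse into a single tuple.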

sort

Performs sorting according to the specified sorting details. Like every other operator, it doesn't mutate the DataModel instance on which it was called; instead, it returns a new DataModel instance containing the sorted data.

DataModel supports multi-level sorting: list the variables by which sorting needs to be performed, each with its sort order, ASC or DESC.

In the following example, data is sorted by the Origin field in DESC order at the first level, followed by a second level of sorting by Acceleration in ASC order.

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm.
 let outputDM = dm.sort([
     ['Origin', 'DESC'],
     ['Acceleration'] // Default value is ASC
 ]);

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm. DataModel is extracted
 // from muze namespace and assigned to DataModel variable.
 const avg = DataModel.Stats.avg;
 const outputDM = dm.sort([
     ['Origin', ['Acceleration', (a, b) => avg(a.Acceleration) - avg(b.Acceleration)]]
 ]);

Parameters:

  • sortingDetails (Array of Array): Sorting details based on which the sorting will be performed.

Returns:

DataModel: Returns a new instance of DataModel with sorted data.
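Multi-level sorting without mutation, as described above, can be sketched on plain rows. This is an illustration, not DataModel's implementation: `sortRows` is a hypothetical helper, the rows are made-up sample values, and only the `[fieldName, order]` form of the sorting details is handled.

```javascript
// Hypothetical helper (not DataModel's implementation): returns a sorted copy of
// flat JSON rows. Each detail is [fieldName, 'ASC' | 'DESC']; ASC is the default.
function sortRows(rows, sortingDetails) {
    const compare = (a, b) => {
        for (const [field, order = 'ASC'] of sortingDetails) {
            if (a[field] === b[field]) { continue; } // tie: defer to the next level
            const result = a[field] < b[field] ? -1 : 1;
            return order.toUpperCase() === 'DESC' ? -result : result;
        }
        return 0;
    };
    // Copy first so the input array is left untouched.
    return [...rows].sort(compare);
}

// Made-up sample rows for illustration.
const rows = [
    { Origin: 'USA', Acceleration: 12 },
    { Origin: 'Japan', Acceleration: 15 },
    { Origin: 'USA', Acceleration: 10 },
    { Origin: 'Japan', Acceleration: 14 }
];
const sorted = sortRows(rows, [['Origin', 'DESC'], ['Acceleration']]);
console.log(sorted);
```

The second level only decides the order among rows that tie on the first level, which is why the comparator walks the sorting details in sequence.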

calculateVariable

Creates a new variable calculated from existing variables. This method expects the definition of the newly created variable and a function that resolves the value of the new variable from the existing variables.

Creates a new measure based on existing variables

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm.
 const outputDM = dm.calculateVariable({
     name: 'powerToWeight',
     type: 'measure' // Schema of variable
 }, ['Horsepower', 'Weight_in_lbs', (hp, weight) => hp / weight ]);

 // DataModel instance is created from https://www.charts.com/static/cars.json data,
 // https://www.charts.com/static/cars-schema.json schema and assigned to variable dm.
 const outputDM = dm.calculateVariable({
     name: 'Efficiency',
     type: 'dimension'
 }, ['Horsepower', (hp) => {
     if (hp < 80) { return 'low'; }
     else if (hp < 120) { return 'moderate'; }
     else { return 'high'; }
 }]);

Parameters:

  • schema (Schema): Schema of the newly defined variable.
  • resolver (VariableResolver): Resolver format used to resolve the value of the new variable.

Returns:

DataModel: Instance of DataModel with the new field.
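The resolver shape used in the examples above (input variable names followed by a function) can be sketched on plain rows. This is an illustration, not DataModel's implementation: `addCalculatedField` is a hypothetical helper that works on flat JSON rows and uses only the `name` of the new variable's schema.

```javascript
// Hypothetical helper (not DataModel's implementation): adds a calculated field to
// flat JSON rows. The resolver lists the input field names followed by a function.
function addCalculatedField(rows, schema, resolver) {
    const fieldNames = resolver.slice(0, -1);
    const resolve = resolver[resolver.length - 1];
    // Build new row objects so the input rows are not mutated.
    return rows.map(row => ({
        ...row,
        [schema.name]: resolve(...fieldNames.map(name => row[name]))
    }));
}

const rows = [
    { Name: 'ford fiesta', Horsepower: 66 },
    { Name: 'bmw 320i', Horsepower: 110 },
    { Name: 'chevrolet chevelle malibu', Horsepower: 130 }
];
const withEfficiency = addCalculatedField(
    rows,
    { name: 'Efficiency', type: 'dimension' },
    ['Horsepower', hp => (hp < 80 ? 'low' : hp < 120 ? 'moderate' : 'high')]
);
console.log(withEfficiency.map(row => row.Efficiency)); // [ 'low', 'moderate', 'high' ]
```

The values of the listed fields are passed to the function in the order in which the fields appear in the resolver array.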