Machine Learning models are configured according to the task they are to perform.
Configuring Machine Learning Models
Configuring a model for training involves specifying the data dimension fields, a label field if necessary, any required transformers, and a learner. To open the model configuration page, find the model on the Machine Learning List Models page, click the model's actions button, and select Edit.
Selecting Dimension and Label Fields
Dimension fields are chosen from the list of fields in the data source tracker, which is shown as a multiselect list. Click a field to select it. To select multiple fields, hold down the Ctrl key while clicking each field.
The chosen dimension fields are the data attributes that the model will be trained on. Tiki leaves out all unselected fields.
The label field is the data attribute that contains the target to be predicted. A label field is required if the chosen learner is a classifier.
Some regression-based learners, such as Gradient Boost, also require a label field. In that case, the data attribute chosen as the label field is usually expected to be numeric.
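As a rough illustration of what selecting dimension and label fields means for the training data, the Python sketch below keeps only the chosen fields of each tracker item and pulls the label out separately. The function name `split_samples` and the example field names are hypothetical; this is not Tiki's internal code.

```python
# Illustrative sketch only: split_samples and the field names are hypothetical.
def split_samples(items, dimension_fields, label_field=None):
    """Keep only the selected dimension fields and extract the label, if any."""
    # Unselected fields are simply left out of each sample.
    samples = [[item[field] for field in dimension_fields] for item in items]
    # A label is only collected when a label field has been chosen.
    labels = [item[label_field] for item in items] if label_field else None
    return samples, labels


items = [
    {"age": 34, "plan": "pro", "visits": 12, "churned": "no"},
    {"age": 51, "plan": "free", "visits": 3, "churned": "yes"},
]

# "plan" is not selected, so it does not appear in the samples.
samples, labels = split_samples(items, ["age", "visits"], "churned")
```

Here `samples` holds the values of the two selected fields for each item, and `labels` holds the target that a classifier would learn to predict.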
Handling Empty Data Values
Before a sample is used for training, Tiki by default replaces empty numeric fields with 0. Empty categorical fields remain empty strings. If you do not want this behaviour, you can make Tiki ignore samples with empty fields by checking the Ignore items with empty values option.
With this option checked, Tiki skips any item that contains an empty field, so that item is not used to train the model.
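The two behaviours described above can be sketched in Python as follows. This is a minimal model of the rules as stated in this section, not Tiki's implementation; `prepare_items`, `numeric_fields`, and `ignore_empty` are hypothetical names, and treating both `None` and `""` as "empty" is an assumption.

```python
# Illustrative sketch only: names are hypothetical, and "empty" is assumed
# to mean None or an empty string.
def is_empty(value):
    return value is None or value == ""


def prepare_items(items, numeric_fields, ignore_empty=False):
    """Apply the default empty-value handling, or skip incomplete items."""
    prepared = []
    for item in items:
        # With the option checked, any item with an empty field is skipped.
        if ignore_empty and any(is_empty(v) for v in item.values()):
            continue
        row = {}
        for field, value in item.items():
            if is_empty(value):
                # Default: empty numeric fields become 0,
                # empty categorical fields stay empty strings.
                row[field] = 0 if field in numeric_fields else ""
            else:
                row[field] = value
        prepared.append(row)
    return prepared


items = [
    {"age": "", "plan": "pro"},
    {"age": 40, "plan": ""},
    {"age": 25, "plan": "free"},
]

default_result = prepare_items(items, numeric_fields={"age"})
strict_result = prepare_items(items, numeric_fields={"age"}, ignore_empty=True)
```

With the default handling all three items are kept; with `ignore_empty=True` only the last, complete item remains.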
Adding Transformers and Learners
You use transformers to preprocess data before model training. A learner is the machine learning algorithm on which the model is based. The transformers and learner you choose depend on the structure and format of the training data and on the type of target that you want to predict.
Add a transformer or a learner by selecting it from the dropdown list and clicking the Enter Arguments button.
A popup is displayed for you to enter the argument values that control the transformer or learning algorithm. Tiki fills in any omitted parameters with default values.
Add transformers in the order in which you want the data processed; you can add as many as you need. By convention, the learner is added last, and only one learner is required. Adding multiple learners might result in unexpected behaviour.
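To show why the order of transformers matters and why the learner comes last, here is a minimal Python sketch of an ordered pipeline. It does not reproduce the Rubix ML API or Tiki's code; the classes `FillMissing`, `MinMaxScale`, `NearestNeighbour`, and `Pipeline` are hypothetical stand-ins written for this example.

```python
# Illustrative sketch only: all classes here are hypothetical stand-ins.
class FillMissing:
    """Replace missing (None) values with 0.0."""
    def fit(self, samples):
        pass  # nothing to learn

    def transform(self, samples):
        return [[0.0 if v is None else v for v in row] for row in samples]


class MinMaxScale:
    """Scale each column to the 0..1 range seen during training."""
    def fit(self, samples):
        columns = list(zip(*samples))
        self.lo = [min(c) for c in columns]
        self.hi = [max(c) for c in columns]

    def transform(self, samples):
        return [
            [(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, self.lo, self.hi)]
            for row in samples
        ]


class NearestNeighbour:
    """Predict the label of the closest training sample."""
    def train(self, samples, labels):
        self.samples, self.labels = samples, labels

    def predict(self, samples):
        def closest(row):
            distances = [
                sum((a - b) ** 2 for a, b in zip(row, known))
                for known in self.samples
            ]
            return self.labels[distances.index(min(distances))]
        return [closest(row) for row in samples]


class Pipeline:
    """Apply transformers in the order given, then hand over to one learner."""
    def __init__(self, transformers, learner):
        self.transformers = transformers
        self.learner = learner

    def train(self, samples, labels):
        for transformer in self.transformers:
            transformer.fit(samples)
            samples = transformer.transform(samples)
        self.learner.train(samples, labels)

    def predict(self, samples):
        for transformer in self.transformers:
            samples = transformer.transform(samples)
        return self.learner.predict(samples)


# FillMissing must run before MinMaxScale, which cannot handle None values.
pipeline = Pipeline([FillMissing(), MinMaxScale()], NearestNeighbour())
pipeline.train(
    [[0.0, 100.0], [10.0, 200.0], [None, 150.0]],
    ["a", "b", "a"],
)
predictions = pipeline.predict([[9.0, 190.0], [None, 120.0]])
```

Swapping the two transformers would make the scaling step fail on the missing value, which is the kind of problem that adding transformers in processing order avoids.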
Tiki internally uses Rubix ML for its Machine Learning functionality, so only transformers and learners available in Rubix ML are supported by Tiki.
Because Tiki trackers define a type for each field, some data transformations might not be necessary. For example, Numeric String Converter converts numeric values that have been given as categorical (string) values to their equivalent integer or floating point types. Tiki handles this automatically if the values belong to a numeric field type in the source tracker. Applying as few transformers as possible helps reduce model latency.
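For readers unfamiliar with this kind of conversion, the Python sketch below shows the general idea of turning numeric strings into integers or floats while leaving other values untouched. It is an approximation of the behaviour described above, not the Rubix ML transformer itself; `convert_numeric_strings` is a hypothetical name, and Python's `int()` and `float()` are more permissive than a dedicated converter may be (for example, they accept surrounding whitespace).

```python
# Illustrative sketch only: approximates the idea, not the Rubix ML transformer.
def convert_value(value):
    """Convert a numeric string to int or float; leave anything else as is."""
    if not isinstance(value, str):
        return value
    try:
        return int(value)
    except ValueError:
        pass
    try:
        return float(value)
    except ValueError:
        return value  # a genuine categorical value


def convert_numeric_strings(samples):
    return [[convert_value(v) for v in row] for row in samples]


converted = convert_numeric_strings([["42", "3.14", "pro", 7]])
```

After conversion, the first two values are numbers, while the category `"pro"` and the already numeric `7` are unchanged.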