History: Configuring Machine Learning Models
Source of version: 6 (current)
^This page ((needs review))^

Machine Learning models are configured according to the task they are to perform.

! Configuring Machine Learning Models
Configuring a model for training involves specifying the data dimension fields, a label field if necessary, any required transformers, and a learner.

To open the model configuration page, find the model on the Machine Learning __List Models__ page, click the model's actions button, and select __Edit__.

{img src="display1902" link="display1902" width="400" rel="box[g]" imalign="center" desc="Find edit option in model's action menu" align="center" styleimage="border"}

{img src="display1903" link="display1903" width="400" rel="box[g]" imalign="center" desc="Model configuration page" align="center" styleimage="border"}

!! Selecting Dimension and Label Fields
Dimension fields are chosen from the list of fields in the data source tracker, which is shown as a multi-select list. Click a field to select it. To select multiple fields, hold down the -+Ctrl+- key while clicking each field.

{img src="display1904" link="display1904" width="400" rel="box[g]" imalign="center" desc="Select dimension fields from multi-select list" align="center" styleimage="border"}

The chosen dimension fields are the data attributes that the model will be trained on. Tiki leaves out all unselected fields.

The label field is the data attribute that contains the target to be predicted. A label field is required if the chosen learner is a classifier.

{img src="display1905" link="display1905" width="400" rel="box[g]" imalign="center" desc="Set label field if required by learner" align="center" styleimage="border"}

Some regression-based learners, such as [https://docs.rubixml.com/2.0/regressors/gradient-boost.html|Gradient Boost], also require a label field. In that case, the data attribute chosen as the label field is usually expected to be numeric.

!! Handling Empty Data Values
By default, before a sample is used for training, Tiki replaces empty numeric fields with 0. Empty categorical fields remain as empty strings. If you do not want this behaviour, you can make Tiki ignore samples with empty fields by checking the __Ignore items with empty values__ option.

{img src="display1906" link="display1906" width="400" rel="box[g]" imalign="center" desc="Check the box to ignore empty data values" align="center" styleimage="border"}

With this option checked, Tiki skips any item that contains empty fields, and that item is not used to train the model.

!! Adding Transformers and Learners
Transformers preprocess data before model training. A learner is the machine learning algorithm on which the model is based. The transformers and learner you choose depend on the structure and format of the training data and on the type of target that you want to predict.

{img src="display1907" link="display1907" width="400" rel="box[g]" imalign="center" desc="Choose a transformer or learner" align="center" styleimage="border"}

{img src="display1908" link="display1908" width="400" rel="box[g]" imalign="center" desc="Pick a transformer or learner from the list" align="center" styleimage="border"}

{img src="display1909" link="display1909" width="400" rel="box[g]" imalign="center" desc="Click Enter Arguments to show popup" align="center" styleimage="border"}

To add a transformer or a learner, select it from the dropdown list and click the __Enter Arguments__ button.
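
The empty-value rules described under Handling Empty Data Values can be illustrated with a short Python sketch. It only mirrors the documented behaviour and is not Tiki's actual code (Tiki itself is written in PHP). The function name -+prepare_sample+-, its parameters, and the sample field names are hypothetical.

```python
from typing import Optional


def prepare_sample(sample: dict, numeric_fields: set,
                   ignore_empty: bool = False) -> Optional[dict]:
    """Return a cleaned copy of the sample, or None if it should be skipped.

    Hypothetical helper that imitates the documented rules:
    - empty numeric fields become 0
    - empty categorical fields stay as empty strings
    - with ignore_empty=True, any sample with an empty field is skipped
    """
    def is_empty(value) -> bool:
        return value is None or value == ""

    # Mirrors the "Ignore items with empty values" option.
    if ignore_empty and any(is_empty(v) for v in sample.values()):
        return None

    cleaned = {}
    for field, value in sample.items():
        if is_empty(value):
            cleaned[field] = 0 if field in numeric_fields else ""
        else:
            cleaned[field] = value
    return cleaned


# Example with made-up field names.
sample = {"age": "", "city": "", "income": 1200}
default_result = prepare_sample(sample, numeric_fields={"age", "income"})
skipped_result = prepare_sample(sample, numeric_fields={"age", "income"},
                                ignore_empty=True)
```

Here -+default_result+- keeps the sample with -+age+- set to 0 and -+city+- left as an empty string, while -+skipped_result+- is -+None+- because the sample would be left out of training.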
{img src="display1910" link="display1910" width="400" rel="box[g]" imalign="center" desc="Enter arguments" align="center" styleimage="border"}

{img src="display1911" link="display1911" width="400" rel="box[g]" imalign="center" desc="Learner added" align="center" styleimage="border"}

A popup is displayed for you to enter the argument values that control the transformer or learning algorithm. Tiki fills in any omitted parameters with their default values.

{img src="display1912" link="display1912" width="400" rel="box[g]" imalign="center" desc="Fully configured" align="center" styleimage="border"}

{img src="display1913" link="display1913" width="400" rel="box[g]" imalign="center" desc="Success message after configuration" align="center" styleimage="border"}

Add transformers in the order in which you want the data to be processed. You can add as many transformers as you need. By convention, the learner is added last, and only one learner is required. Adding multiple learners might result in unexpected behaviour.

Tiki uses [https://docs.rubixml.com/|Rubix ML] internally for its Machine Learning functionality, so only the transformers and learners available in Rubix ML are supported.

Because tracker fields in Tiki are typed, some data transformations might not be necessary. For example, [https://docs.rubixml.com/2.0/transformers/numeric-string-converter.html|Numeric String Converter] converts numeric values that were supplied as categorical (string) values to their equivalent integer or floating point types. Tiki handles this automatically if the values belong to a numeric field type in the source tracker. Applying as few transformers as possible helps reduce model latency.

!! Related links
* ((Machine Learning))
* ((Preparing Machine Learning Dataset))
* ((Creating Machine Learning Models))
* ((Training Machine Learning Models))
* ((Using Machine Learning Models))
* [https://docs.rubixml.com/|Rubix ML]