Get started using Aito Command Line Interface
  • 22 Oct 2020
  • 10 Minutes To Read


Before you begin


Getting an Aito instance

If you want to follow along with this guide, you'll need your own Aito instance from the Aito Console.

  1. Sign in or create an account, if you haven't got one already.
  2. In the Aito Console go to the instances page and click the "Create an instance" button.
  3. Select the instance type you want to create and fill in the needed fields. Sandbox is the free instance type for testing and small projects; visit our pricing page to learn more about the Aito instance types.
  4. Click "Create instance" and wait a moment while your instance is created. You will receive an email once your instance is ready.

Getting the Aito Command Line Interface tool

The Aito Command Line Interface (CLI) tool is written in Python, so you need Python 3.6 or higher installed.

Install Aito CLI

To install the Aito CLI you can use pip:

pip install aitoai==0.4.0
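
If you prefer to keep the CLI isolated from your system Python packages, you can, for example, install it into a virtual environment first. This is an optional sketch for Unix-like shells; the environment name aito-env is just an example.

python3 -m venv aito-env          # create an isolated environment
source aito-env/bin/activate      # activate it for the current shell
pip install aitoai==0.4.0         # install the Aito CLI inside it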

Running the help command

You can use aito --help to get detailed information on the aito Command Line tool.

aito --help

To get more information on a specific operation, you can include the operation name. For example, if you want to know what quick-add-table does, you can use the following call.

aito quick-add-table --help

Configuring the CLI

To access your Aito instance, you will need the instance's API URL and the read/write API key from the Aito Console. To find them, follow these steps:

  1. Log in to the Aito console.
  2. Go to the Instances page and click on the instance you want to use.
  3. Select the Overview tab; there you can find the instance API URL and the read/write API key. The API key is revealed by pressing the eye icon.

(Screenshot: the instance API URL and API keys on the Overview tab of the Aito Console)

Then you can use the information to define the configuration for the CLI using the following command.

aito configure

The CLI will ask you for the instance URL and API key. Be sure to use the read/write API key in order to go through all of the steps in this guide. After you have given the URL and API key, the CLI creates a credentials file at $HOME/.config/aito/credentials (%UserProfile% on Windows), which it then uses when accessing the Aito instance.

Store the URL and API key as environment variables as well, so you can easily copy-paste the curl examples in this guide. Use the following environment variables:

Environment variable   Value
AITO_INSTANCE_URL      your-aito-instance-url
AITO_API_KEY           your-api-key

On Unix-based systems you can define the environment variables on the command line as follows (they last for the current session only).

export AITO_INSTANCE_URL=your-aito-instance-url
export AITO_API_KEY=your-api-key
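
To check that the credentials work, you can make a simple request with curl before moving on. Note that the /api/v1/schema path and the x-api-key header below are assumptions based on the Aito API documentation; if your instance differs, check the API docs.

# should return the database schema (empty on a fresh instance)
curl -H "x-api-key: $AITO_API_KEY" "$AITO_INSTANCE_URL/api/v1/schema"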

TL;DR


If you're in a hurry, all of the mentioned steps can be done with a single command. The downside is that you have no control over the created schema. If you don't mind this and want to skip the data handling steps (2-5) and get straight to predicting, you can use the following command.

aito quick-add-table --file-format csv --table-name Titanic train.csv

All of the commands described in this guide are as follows.

  1. Download the CSV (train.csv):
    https://www.kaggle.com/c/titanic/data
  2. Infer the schema:
aito infer-table-schema csv train.csv > titanic_schema.json
  3. Create the table:
aito create-table Titanic titanic_schema.json
  4. Convert the data:
aito convert csv -s titanic_schema.json --json train.csv > titanic_data.json
  5. Upload the data:
aito upload-entries Titanic < titanic_data.json
  6. Make a query:
aito predict '
  {
    "from": "Titanic",
    "where": {
      "Pclass": 1,
      "Sex": "female"
    },
    "predict": "Survived"
  }'
  7. Evaluate the results:
aito evaluate --use-job '
{
  "test": {
    "$index": {
      "$mod": [4, 0]
    }
  },
  "evaluate": {
    "from": "Titanic",
    "where": {
      "Pclass": {"$get": "Pclass"},
      "Sex": {"$get": "Sex"}
    },
    "predict": "Survived"
  },
  "select": ["trainSamples", "testSamples", "baseAccuracy", "accuracyGain", "accuracy", "error", "baseError"]
}'
  8. Clean up:
aito delete-database

Intro


This getting started guide uses the famous Titanic dataset as an example of how to make predictions with Aito. Titanic was a passenger liner (the biggest of her time) that collided with an iceberg on her maiden voyage and sank in the early hours of April 15, 1912.

The dataset and problem framing are quite simple but they demonstrate the steps of how to work with Aito, so you can go ahead and start making predictions with your own data and answer the questions you are curious about.

The problem


When starting to use Aito, you will want to frame the problem you're solving as a question, as it helps with creating the queries. In this guide, we want to answer the question "What kind of people were more likely to survive the accident?" using the Titanic passenger data (i.e. name, age, gender, socio-economic class, etc.).

Data


You can download the dataset from Kaggle: https://www.kaggle.com/c/titanic/data. The name of the data file is train.csv.

Aito needs data from the past in order to make predictions for the future. The Titanic dataset includes passenger details such as the class of the passenger, sex, age and so on. These details define the person we want to predict survival for. The value we want to predict also has to be encoded in the data as a column (or a feature in data science terms); in this case it is Survived.

Snapshot of the data

PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked
3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26 | 0 | 0 | STON/O2. 3101282 | 7.925 | | S
4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35 | 1 | 0 | 113803 | 53.1 | C123 | S
5 | 0 | 3 | Allen, Mr. William Henry | male | 35 | 0 | 0 | 373450 | 8.05 | | S
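
If you want to take a quick look at the raw file on the command line before going further, printing the first few rows is enough:

head -n 5 train.csv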

Table schema definition


Data lives in Aito as tables. The train.csv of the Titanic dataset will be put into Aito as one table which will be called Titanic. It is possible to use linked tables in Aito but in this example having just one table is enough.

In order to get data uploaded into Aito, you have to define a schema for the Titanic table. The schema tells Aito how to handle the different columns in the data, for example whether a column's values should be treated as integers or booleans. Aito accepts numeric (integer, decimal), boolean, string and text data types. The nullable property defines whether the column may contain empty values: nullable: true means the column can have empty values. Analyzers are used for text columns, i.e. columns whose values contain longer free-form text.

For the Titanic dataset, the table schema can for example be defined as follows.

{
    "columns": {
        "Age": {
            "nullable": true,
            "type": "Decimal"
        },
        "Cabin": {
            "analyzer": "en",
            "nullable": true,
            "type": "Text"
        },
        "Embarked": {
            "nullable": true,
            "type": "String"
        },
        "Fare": {
            "nullable": false,
            "type": "Decimal"
        },
        "Name": {
            "analyzer": "en",
            "nullable": false,
            "type": "Text"
        },
        "Parch": {
            "nullable": false,
            "type": "Int"
        },
        "PassengerId": {
            "nullable": false,
            "type": "Int"
        },
        "Pclass": {
            "nullable": false,
            "type": "Int"
        },
        "Sex": {
            "nullable": false,
            "type": "String"
        },
        "SibSp": {
            "nullable": false,
            "type": "Int"
        },
        "Survived": {
            "nullable": false,
            "type": "Int"
        },
        "Ticket": {
            "analyzer": "pt-br",
            "nullable": false,
            "type": "Text"
        }
    },
    "type": "table"
}

You can either copy-paste the above JSON into a file named titanic_schema.json or run the aito infer-table-schema command to create the table schema file, which you can then use to create the Titanic table in Aito.

Aito CLI Infer schema

To quickly infer the table schema from the data use the following Aito CLI command.

aito infer-table-schema csv train.csv > titanic_schema.json

Always check the created file to make sure the types look correct. After data has been uploaded into Aito, the schema is immutable: you can only change it by deleting all the data in Aito and re-uploading with a new schema.
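
One quick way to review the inferred types is with jq, assuming you have it installed; this is only a convenience and any text editor works as well.

# print a column-name -> type mapping from the inferred schema
jq '.columns | map_values(.type)' titanic_schema.json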

Aito CLI Create table

To create the Titanic table in Aito using the CLI, you can use the following command.

aito create-table Titanic titanic_schema.json

Upload data


The data has to be in JSON format in order to be uploaded into Aito. The data can be uploaded either entry by entry or as a whole file; in this guide we use the upload-entries functionality.

Convert into JSON

You can run the following command to format the CSV into the JSON format.

aito convert csv -s titanic_schema.json --json train.csv > titanic_data.json
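
You can sanity-check the converted file by printing the beginning of it:

head -c 400 titanic_data.json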

Upload entries

To upload the data to the created Titanic table you can use the following command.

aito upload-entries Titanic < titanic_data.json
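
To verify the upload, you can for example query the table directly over the API. The /api/v1/_query endpoint path and the x-api-key header are assumptions based on the Aito API documentation; adjust them if your version differs. The total field in the response should match the number of uploaded rows.

curl -X POST "$AITO_INSTANCE_URL/api/v1/_query" \
  -H "x-api-key: $AITO_API_KEY" \
  -H "content-type: application/json" \
  -d '{"from": "Titanic", "limit": 1}'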

Run a query


Aito query's generic syntax

An Aito query follows a syntax that is based on this rule:

From a given context (a specific table and what is known from that table), use an operation to find the known or the unknown.

{
  "from"            : define the initial context (table name),
  "where"           : more details of the context,
  "operation_name"  : operation to be perform,
  "orderBy"         : sort the result by some metric,
  "select"          : select specific attributes or parts of the result,
  "offset"          : define the number of rows in the result to be skipped,
  "limit"           : limit the number of rows to be shown in the result
}

Making the query

When the data is in Aito, you can start making queries against it. For example, if you want to answer the question "How likely was a first-class woman to survive the Titanic accident?", you can send the following query to the _predict endpoint.

aito predict '
  {
    "from": "Titanic",
    "where": {
      "Pclass": 1,
      "Sex": "female"
    },
    "predict": "Survived"
  }'

Aito's query language resembles SQL in that it has from and where clauses. In the query we state that we want to use the data in the Titanic table (from: "Titanic"), and the attributes of the passengers whose survival we want to predict are defined in where. The predict clause states which attribute we are predicting.
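
If you prefer raw HTTP, the same prediction can be made with curl using the environment variables defined earlier. The /api/v1/_predict path and the x-api-key header are again assumptions based on the Aito API documentation; verify them against your instance.

curl -X POST "$AITO_INSTANCE_URL/api/v1/_predict" \
  -H "x-api-key: $AITO_API_KEY" \
  -H "content-type: application/json" \
  -d '{
        "from": "Titanic",
        "where": { "Pclass": 1, "Sex": "female" },
        "predict": "Survived"
      }'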

Copy the command to your terminal and press enter. Aito will return the results right away.

Results


For the query, Aito will return the following result.

{
  "offset" : 0,
  "total" : 2,
  "hits" : [ {
    "$p" : 0.8853714713085219,
    "field" : "Survived",
    "feature" : 1
  }, {
    "$p" : 0.11462852869147827,
    "field" : "Survived",
    "feature" : 0
  } ]
}

In the result, $p is the probability of the field having the given feature. So, for example, a first-class female passenger survived the Titanic accident with roughly 89% probability. Aito also returns the probabilities of the field having other features; in the case of survival the result is binary, so the passenger either survived (= 1) or didn't (= 0) and Aito returns results for two features.

You can try out how different kinds of people survived the accident by changing the attributes defined in the query's where clause.
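
For example, to see the prediction for a third-class male passenger, only the values in the where clause need to change:

aito predict '
  {
    "from": "Titanic",
    "where": {
      "Pclass": 3,
      "Sex": "male"
    },
    "predict": "Survived"
  }'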

Evaluation of results


Result evaluation is an important step when calculating probabilities. It tells you how accurate your predictions are and gives you a metric to use for improving them. Evaluation is a built-in functionality of Aito.

Evaluation can be run for the previous query as follows.

aito evaluate --use-job '
{
  "test": {
    "$index": {
      "$mod": [4, 0]
    }
  },
  "evaluate": {
    "from": "Titanic",
    "where": {
      "Pclass": {"$get": "Pclass"},
      "Sex": {"$get": "Sex"}
    },
    "predict": "Survived"
  },
  "select": ["trainSamples", "testSamples", "baseAccuracy", "accuracyGain", "accuracy", "error", "baseError"]
}'

The test clause defines what data is used as the test data. Aito runs the evaluation by splitting the data in the database into a test set and a training set. The test set is treated as unknown data: Aito predicts its values using only the training set, and the predictions are then compared to the actual values in the test set to get the accuracy of the prediction query. In this example, the test set is every fourth row of the data, starting from index 0, as defined by $mod; changing the clause to {"$mod": [10, 0]}, for example, would use every tenth row as the test set. The evaluate clause defines the query we want to evaluate, which is the same one used in the query step. The $get operator picks the value of the given column for each test row. With the select clause you can restrict which fields the evaluate endpoint returns.

The request starts an evaluation job because the calculation can take some time: Aito runs possibly hundreds or thousands of predictions, depending on the size of your dataset.

Evaluation result

{
  "trainSamples": 668.0,
  "testSamples": 223,
  "baseAccuracy": 0.6367713004484304,
  "accuracyGain": 0.15246636771300448,
  "accuracy": 0.7892376681614349,
  "error": 0.21076233183856508,
  "baseError": 0.36322869955156956
}

The accuracy value shows the accuracy of Aito for the given query, and accuracyGain is the difference between accuracy and baseAccuracy (here 0.789 - 0.637 ≈ 0.152). The baseAccuracy is the accuracy that would be achieved just by using a Naive Bayesian algorithm for the prediction. For more about the response values, check our API documentation.

Deleting the data

If you want to start your project with a clean slate, you can delete the schema and all the data from the Aito instance with the following command.

aito delete-database