
Bringing the end-user into the AI picture! – AI Time Journal


There is a ton of information these days on every single phase involved in building AI algorithms, and this is great! 

This covers loading/preparing data, feature engineering, training, testing, hyperparameter tuning, validation, explainability, MLOps, and deployment. 

Overlooking End-Users in AI Applications

At the same time, I am puzzled by how little is said about the “end-user”: a business person with no AI background who interacts with the software. 

Even though AI has produced many fully “automated” applications (for instance, autonomous vehicles, trading bots, etc.), most companies need end-users to collaborate and interact with an AI engine. 

Vincent Gosselin

Let’s take two examples:

  1. QSR Store Sales Forecast
  2. A two-month Cash Flow Prediction for a large Franchised brand.

In Example 1, a McDonald’s store manager connects to the new forecasting software. Through an ergonomic GUI, she can generate next week’s sales forecast (created by the AI engine). Now suppose she discovered, just five minutes ago, that a competitor across the road is running a new promotion today. She may then opt to lower the generated forecast by 10% during peak hours. Here, the end-user needs to modify the output of the forecast engine.

In Example 2, the company treasurer wants to run the Cash Flow Prediction for the next two months. However, he wants to play with different inflation values and evaluate the impact on the forecast. Here, the end-user wants to control an input parameter (the inflation rate) to the AI Engine.

There are countless other examples where end-users need to modify an AI engine’s input or output. This is an integral part of the Decision Process.

Taipy’s Capabilities to enhance end-user interaction with AI

To address these situations, we (as part of the Taipy open source team) defined the concepts of “scenarios” and “data nodes”. A scenario is nothing more than the execution of your algorithm (pipeline) given a set of input information (input data nodes).

We have also implemented three essential capabilities:

1. Data Nodes

Ability to model pipelines as a sequence of Python tasks and data nodes (anything that can be an input or an output of a Python task). A data node can connect to any data format (SQL, NoSQL, CSV, JSON, etc.) or hold a parameter (a Python object, e.g., a date entered by the end-user through the graphical interface).

2. Scenarios

Ability to record each pipeline execution (inside a registry). We call such execution a ‘scenario’.

3. Scenario comparison

Ability to retrieve past/registered scenarios, compare them, track them, etc.

We decided to provide two options for defining your pipeline in Taipy: programmatically, or with a visual graph editor.

Let’s take an example

1. Create a pipeline

Let’s take a straightforward pipeline case with:

  • A single task, “predict”, calling the inference of an AI engine
  • Two input data nodes: “historical_temperature” and “date_to_forecast”

A single-task pipeline with two data nodes

To create this pipeline, with Taipy, we have two options:

Option 1: Programmatic Configuration  

We can dive into Python code. This script creates a scenario_cfg object:
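The script referenced above is not shown here, so the following is a sketch of what such a programmatic configuration looks like with the Taipy `Config` API. The `predict` function body, the `predictions` output data node, and the configuration ids are illustrative assumptions, and the exact `Config` calls may vary with the Taipy version:

```python
# Sketch of a programmatic pipeline configuration with Taipy.
# The ids and the "predictions" output node are assumptions for this example.
from taipy import Config


def predict(historical_temperature, date_to_forecast):
    # Placeholder for the AI engine's inference call.
    ...


# The two input data nodes from the diagram, plus one output node.
historical_temperature_cfg = Config.configure_data_node("historical_temperature")
date_to_forecast_cfg = Config.configure_data_node("date_to_forecast")
predictions_cfg = Config.configure_data_node("predictions")

# The "predict" task wires the data nodes to the Python function.
predict_cfg = Config.configure_task(
    "predict",
    function=predict,
    input=[historical_temperature_cfg, date_to_forecast_cfg],
    output=predictions_cfg,
)

# The scenario configuration: the pipeline we will instantiate and run.
scenario_cfg = Config.configure_scenario("scenario", task_configs=[predict_cfg])
```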

Option 2: Graphical Editor Configuration

Or, we can use Taipy Studio, a VS Code extension providing a pipeline/DAG graphical editor that simplifies pipeline creation.

Taipy Studio, the Pipeline/DAG Graphical Editor

The diagram is saved as a TOML file; the scenario_cfg object is then created by loading it.
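Loading a configuration exported by Taipy Studio can be sketched as follows (the file name "config.toml" and the scenario id "scenario" are assumptions for this example):

```python
# Sketch: load the TOML configuration produced by Taipy Studio.
from taipy import Config

Config.load("config.toml")

# Retrieve the scenario configuration by its id (assumed to be "scenario").
scenario_cfg = Config.scenarios["scenario"]
```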

2. Execute different scenarios

Scenarios are just instances of the previous pipeline configuration. Here:

  1. We create a scenario (an instance of the pipeline configuration above)
  2. We initialize its input data nodes
  3. We execute it (tp.submit())
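The three steps above can be sketched against the Taipy API as follows. The data values are illustrative, and depending on the Taipy version the orchestration service may need to be started first (e.g., with tp.Core().run()):

```python
# Sketch of creating, initializing, and submitting a scenario with Taipy.
# scenario_cfg is the configuration built earlier; the values are examples.
import datetime as dt
import taipy as tp

# 1. Create a scenario (an instance of the pipeline configuration).
scenario = tp.create_scenario(scenario_cfg)

# 2. Initialize its input data nodes.
scenario.historical_temperature.write([21.0, 19.5, 22.3])
scenario.date_to_forecast.write(dt.datetime(2023, 7, 14))

# 3. Execute it: Taipy runs the "predict" task and registers the execution.
tp.submit(scenario)
```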

Note that behind the scenes, the execution of a given scenario is registered, i.e., the information related to each data node used at the time of execution is automatically stored.

Benefits

The relatively “simple” scenario management process defined in this article allows for:

1. A rich set of user functionalities such as:

  • Easy retrieval of all scenarios over a given period, together with their associated input/output data nodes, which provides data lineage
  • Comparing two or more scenarios based on KPIs (the value of a given data node)
  • Tracking a given KPI over time
  • Re-executing a past scenario with new values (changing the value of a given data node)
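To make these functionalities concrete, here is a minimal conceptual sketch in plain Python (not the Taipy API) of a scenario registry supporting retrieval over a period, KPI comparison, and KPI tracking; the scenario names and the "mape" KPI are invented for illustration:

```python
# Conceptual sketch of a scenario registry (plain Python, not the Taipy API).
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Scenario:
    name: str
    run_date: date
    data_nodes: dict = field(default_factory=dict)  # data node id -> value


registry = [
    Scenario("week_28", date(2023, 7, 10), {"mape": 0.08}),
    Scenario("week_29", date(2023, 7, 17), {"mape": 0.05}),
    Scenario("week_30", date(2023, 7, 24), {"mape": 0.06}),
]

# Retrieve all scenarios over a given period (data lineage).
july = [s for s in registry if date(2023, 7, 1) <= s.run_date <= date(2023, 7, 31)]

# Compare scenarios on a KPI: the value of a given data node.
best = min(july, key=lambda s: s.data_nodes["mape"])

# Track the KPI over time.
history = [(s.run_date, s.data_nodes["mape"]) for s in registry]

print(best.name)  # → week_29
```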

2. Full Pipeline Versioning: Essential for Quality Project Management

Overall pipeline versioning is badly needed when new data nodes/sources are introduced, or when a new version of a given piece of Python code is deployed, to avoid incompatibilities with previously run scenarios.

3. Narrowing the gap between Data Scientists/Developers & End-users

By providing access to the entire repository of end-user scenarios, data scientists and Python devs can better understand how end-users use the software.  

And to go further

To help this process, we found it helpful to provide specific graphical objects to explore past scenarios visually, display their input and output data nodes, modify them, re-execute scenarios, etc.

For this purpose, we extended Taipy’s graphical library to provide a new set of graphical components for Scenario visualization.

Here’s an example of such a scenario ‘navigator’.

Scenario Navigator

Conclusion

This is our interpretation of scenario management. We hope this article will spark more interest and discussion on this crucial topic and lead to better AI software and, ultimately, better decisions. 

