
Predict vehicle fleet failure probability using Amazon SageMaker JumpStart


Predictive maintenance is critical in automotive industries because it can avoid unexpected mechanical failures and reactive maintenance activities that disrupt operations. By predicting vehicle failures and scheduling maintenance and repairs, you'll reduce downtime, improve safety, and boost productivity levels.

What if we could apply deep learning techniques to common areas that drive vehicle failures, unplanned downtime, and repair costs?

In this post, we show you how to train and deploy a model to predict vehicle fleet failure probability using Amazon SageMaker JumpStart. SageMaker JumpStart is the machine learning (ML) hub of Amazon SageMaker, providing pre-trained, publicly available models for a wide range of problem types to help you get started with ML. The solution outlined in this post is available on GitHub.

SageMaker JumpStart solution templates

SageMaker JumpStart provides one-click, end-to-end solutions for many common ML use cases.

The SageMaker JumpStart solution templates cover a variety of use cases, under each of which several different solution templates are offered (the solution in this post, Predictive Maintenance for Vehicle Fleets, is in the Solutions section). Choose the solution template that best fits your use case from the SageMaker JumpStart landing page. For more information on specific solutions under each use case and how to launch a SageMaker JumpStart solution, see Solution Templates.

Solution overview

The AWS predictive maintenance solution for automotive fleets applies deep learning techniques to common areas that drive vehicle failures, unplanned downtime, and repair costs. It serves as an initial building block for you to get to a proof of concept in a short period of time. This solution contains data preparation and visualization functionality within SageMaker and allows you to train and optimize the hyperparameters of deep learning models for your dataset. You can use your own data or try the solution with a synthetic dataset as part of this solution. This version processes vehicle sensor data over time. A subsequent version will process maintenance record data.

The following diagram demonstrates how you can use this solution with SageMaker components. As part of the solution, the following services are used:

  • Amazon S3 – We use Amazon Simple Storage Service (Amazon S3) to store datasets
  • SageMaker notebook – We use a notebook to preprocess and visualize the data, and to train the deep learning model
  • SageMaker endpoint – We use the endpoint to deploy the trained model

Solution overview

The workflow consists of the following steps:

  1. An extract of historical data is created from the Fleet Management System containing vehicle data and sensor logs.
  2. After the ML model is trained, the SageMaker model artifact is deployed.
  3. The connected vehicle sends sensor logs to AWS IoT Core (alternatively, via an HTTP interface).
  4. Sensor logs are persisted via Amazon Kinesis Data Firehose.
  5. Sensor logs are sent to AWS Lambda for querying against the model to make predictions.
  6. Lambda sends sensor logs to SageMaker model inference for predictions (see the sketch after this list).
  7. Predictions are persisted in Amazon Aurora.
  8. Aggregate results are displayed on an Amazon QuickSight dashboard.
  9. Real-time notifications on the predicted probability of failure are sent to Amazon Simple Notification Service (Amazon SNS).
  10. Amazon SNS sends notifications back to the connected vehicle.
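
For step 6, the following is a minimal sketch of what such a Lambda handler could look like. The environment variable, event shape, and response format are illustrative assumptions, not the solution's actual Lambda code:

import json
import os

import boto3

# SageMaker runtime client, reused across invocations
runtime = boto3.client("sagemaker-runtime")

# Hypothetical environment variable holding the deployed endpoint name
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "pred-maintenance-endpoint")

def handler(event, context):
    # Assume each event carries one window of [voltage, current] readings
    payload = json.dumps(event["sensor_window"])
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=payload,
    )
    # The model is assumed to return a JSON list of class probabilities
    out = json.loads(response["Body"].read().decode())
    return {"failure_probability": out}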

The solution consists of six notebooks:

  • 0_demo.ipynb – A quick preview of our solution
  • 1_introduction.ipynb – Introduction and solution overview
  • 2_data_preparation.ipynb – Prepare a sample dataset
  • 3_data_visualization.ipynb – Visualize our sample dataset
  • 4_model_training.ipynb – Train a model on our sample dataset to detect failures
  • 5_results_analysis.ipynb – Analyze the results from the model we trained

Prerequisites

Amazon SageMaker Studio is the integrated development environment (IDE) within SageMaker that provides us with all the ML features we need in a single pane of glass. Before we can run SageMaker JumpStart, we need to set up SageMaker Studio. You can skip this step if you already have your own version of SageMaker Studio running.

The first thing we need to do before we can use any AWS services is to make sure we have signed up for and created an AWS account. Then we create an administrative user and a group. For instructions on both steps, refer to Set Up Amazon SageMaker Prerequisites.

The next step is to create a SageMaker domain. A domain sets up all the storage and allows you to add users to access SageMaker. For more information, refer to Onboard to Amazon SageMaker Domain. This demo is created in the AWS Region us-east-1.

Finally, you launch SageMaker Studio. For this post, we recommend launching a user profile app. For instructions, refer to Launch Amazon SageMaker Studio.

To run this SageMaker JumpStart solution and have the infrastructure deployed to your AWS account, you must create an active SageMaker Studio instance (see Onboard to Amazon SageMaker Studio). When your instance is ready, use the instructions in SageMaker JumpStart to launch the solution. The solution artifacts are included in this GitHub repository for reference.

Launch the SageMaker JumpStart solution

To get started with the solution, complete the following steps:

  1. On the SageMaker Studio console, choose JumpStart.
    choose jumpstart
  2. On the Solutions tab, choose Predictive Maintenance for Vehicle Fleets.
    choose predictive maintenance
  3. Choose Launch.
    launch solution
    It takes a few minutes to deploy the solution.
  4. After the solution is deployed, choose Open Notebook.
    open notebook

If you're prompted to select a kernel, choose PyTorch 1.8 Python 3.6 for all notebooks in this solution.

Solution preview

We first work with the 0_demo.ipynb notebook. In this notebook, you can get a quick preview of what the outcome will look like when you complete the full notebook for this solution.

Choose Run and Run All Cells to run all cells in SageMaker Studio (or Cell and Run All in a SageMaker notebook instance). You can run all the cells in each notebook one after the other. Make sure all the cells finish processing before moving to the next notebook.

run all cells

This solution relies on a config file to run the provisioned AWS resources. We generate the file as follows:

import boto3
import os
import json

# Look up the provisioned product for this solution in AWS Service Catalog
client = boto3.client('servicecatalog')
cwd = os.getcwd().split('/')
i = cwd.index('S3Downloads')
pp_name = cwd[i + 1]
pp = client.describe_provisioned_product(Name=pp_name)
record_id = pp['ProvisionedProductDetail']['LastSuccessfulProvisioningRecordId']
record = client.describe_record(Id=record_id)

# Collect the CloudFormation stack outputs into a dictionary
keys = [x['OutputKey'] for x in record['RecordOutputs'] if 'OutputKey' in x and 'OutputValue' in x]
values = [x['OutputValue'] for x in record['RecordOutputs'] if 'OutputKey' in x and 'OutputValue' in x]
stack_output = dict(zip(keys, values))

with open(f'/root/S3Downloads/{pp_name}/stack_outputs.json', 'w') as f:
    json.dump(stack_output, f)

We have some sample time series input data consisting of a vehicle's battery voltage and battery current over time. Next, we load and visualize the sample data. As shown in the following screenshots, the voltage and current values are on the Y axis and the readings (19 readings recorded) are on the X axis.
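
The plots below are produced inside the notebook; the following is a minimal sketch of how you could reproduce them with pandas and Matplotlib. The file path and the voltage and current column names are assumptions for illustration:

import pandas as pd
import matplotlib.pyplot as plt

# Load the sample sensor readings (path and column names are assumed)
sample = pd.read_csv("data/processed/fleet_dataset.csv").head(19)

# Plot voltage and current against the reading index
fig, (ax_v, ax_c) = plt.subplots(2, 1, sharex=True)
ax_v.plot(sample.index, sample["voltage"])
ax_v.set_ylabel("voltage")
ax_c.plot(sample.index, sample["current"], color="orange")
ax_c.set_ylabel("current")
ax_c.set_xlabel("reading")
plt.show()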

volt

current

volt and current

We have previously trained a model on this voltage and current data that predicts the probability of vehicle failure, and have deployed the model as an endpoint in SageMaker. We'll call this endpoint with some sample data to determine the probability of failure in the next time period.

Given the sample input data, the predicted probability of failure is 45.73%.

To move to the next stage, choose Click here to continue.

next stage

Introduction and solution overview

The 1_introduction.ipynb notebook provides an overview of the solution and its stages, and a look into the configuration file that contains the content definition, data sampling period, train and test sample counts, parameters, locations, and column names for generated content.

After you review this notebook, you can move to the next stage.

Prepare a sample dataset

We prepare a sample dataset in the 2_data_preparation.ipynb notebook.

We first generate the configuration file for this solution:

import boto3
import os
import json

# Look up the provisioned product for this solution in AWS Service Catalog
client = boto3.client('servicecatalog')
cwd = os.getcwd().split('/')
i = cwd.index('S3Downloads')
pp_name = cwd[i + 1]
pp = client.describe_provisioned_product(Name=pp_name)
record_id = pp['ProvisionedProductDetail']['LastSuccessfulProvisioningRecordId']
record = client.describe_record(Id=record_id)

# Collect the CloudFormation stack outputs into a dictionary
keys = [x['OutputKey'] for x in record['RecordOutputs'] if 'OutputKey' in x and 'OutputValue' in x]
values = [x['OutputValue'] for x in record['RecordOutputs'] if 'OutputKey' in x and 'OutputValue' in x]
stack_output = dict(zip(keys, values))

with open(f'/root/S3Downloads/{pp_name}/stack_outputs.json', 'w') as f:
    json.dump(stack_output, f)

import os

from source.config import Config
from source.preprocessing import pivot_data, sample_dataset
from source.dataset import DatasetGenerator

config = Config(filename="config/config.yaml", fetch_sensor_headers=False)
config

The config properties are as follows:

fleet_info_fn=data/example_fleet_info.csv
fleet_sensor_logs_fn=data/example_fleet_sensor_logs.csv
vehicle_id_column=vehicle_id
timestamp_column=timestamp
target_column=target
period_ms=30000
dataset_size=25000
window_length=20
chunksize=10000
processing_chunksize=2500
fleet_dataset_fn=data/processed/fleet_dataset.csv
train_dataset_fn=data/processed/train_dataset.csv
test_dataset_fn=data/processed/test_dataset.csv
period_column=period_ms

You can define your own dataset or use our scripts to generate a sample dataset:

if should_generate_data:
    fleet_statistics_fn = "data/generation/fleet_statistics.csv"
    generator = DatasetGenerator(fleet_statistics_fn=fleet_statistics_fn,
                                 fleet_info_fn=config.fleet_info_fn,
                                 fleet_sensor_logs_fn=config.fleet_sensor_logs_fn,
                                 period_ms=config.period_ms,
                                 )
    generator.generate_dataset()

assert os.path.exists(config.fleet_info_fn), "Please copy your data to {}".format(config.fleet_info_fn)
assert os.path.exists(config.fleet_sensor_logs_fn), "Please copy your data to {}".format(config.fleet_sensor_logs_fn)

You can merge the sensor data and fleet vehicle data together:

pivot_data(config)
sample_dataset(config)
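
Conceptually, pivot_data joins each sensor log entry to its vehicle's static attributes and reshapes the readings into fixed-length windows (window_length=20 above). The following is a rough pandas sketch of the join step, using the file names from the config; the real implementation in source/preprocessing.py also handles chunked processing and the windowing:

import pandas as pd

# Join each sensor log row to the vehicle's static attributes (make, model, ...)
fleet_info = pd.read_csv("data/example_fleet_info.csv")
sensor_logs = pd.read_csv("data/example_fleet_sensor_logs.csv")
fleet_dataset = sensor_logs.merge(fleet_info, on="vehicle_id", how="left")

# Persist the merged dataset to the location the config expects
fleet_dataset.to_csv("data/processed/fleet_dataset.csv", index=False)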

We can now move to data visualization.

Visualize our sample dataset

We visualize our sample dataset in 3_data_visualization.ipynb. This solution relies on a config file to run the provisioned AWS resources. Let's generate the file as in the previous notebook.

The following screenshot shows our dataset.

dataset

Next, let's build the dataset:

train_ds = PMDataset_torch(
    config.train_dataset_fn,
    sensor_headers=config.sensor_headers,
    target_column=config.target_column,
    standardize=True)

properties = train_ds.vehicle_properties_headers.copy()
properties.remove('vehicle_id')
properties.remove('timestamp')
properties.remove('period_ms')

Now that the dataset is ready, let's visualize the data statistics. The following screenshot shows the data distribution based on vehicle make, engine type, vehicle class, and model.

visualize

Comparing the log data, let's look at an example of the mean voltage across different years for Make E and Make C (chosen at random).

The mean of voltage and current is on the Y axis and the number of readings is on the X axis.

  • Possible values for log_target: ['make', 'model', 'year', 'vehicle_class', 'engine_type']
    • Randomly assigned value for log_target: make
  • Possible values for log_target_value1: ['Make A', 'Make B', 'Make E', 'Make C', 'Make D']
    • Randomly assigned value for log_target_value1: Make B
  • Possible values for log_target_value2: ['Make A', 'Make B', 'Make E', 'Make C', 'Make D']
    • Randomly assigned value for log_target_value2: Make D

Based on the above, we assume log_target: make, log_target_value1: Make B, and log_target_value2: Make D.
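
A hedged sketch of how such a comparison could be produced with pandas follows; the make, year, and voltage column names are assumptions based on the generated dataset:

import pandas as pd
import matplotlib.pyplot as plt

fleet_dataset = pd.read_csv("data/processed/fleet_dataset.csv")

# Mean voltage per model year, one line per make (assumed column names)
for make in ["Make B", "Make D"]:
    subset = fleet_dataset[fleet_dataset["make"] == make]
    subset.groupby("year")["voltage"].mean().plot(label=make)

plt.xlabel("year")
plt.ylabel("mean voltage")
plt.legend()
plt.show()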

make b and d

The following graphs break down the mean of the log data.

engine g h e

The following graphs visualize an example of different sensor log values against voltage and current.

volt current 2

Train a model on our sample dataset to detect failures

In the 4_model_training.ipynb notebook, we train a model on our sample dataset to detect failures.

Let's generate the configuration file as in the previous notebook, and then proceed with the training configuration:

sage_session = sagemaker.session.Session()
s3_bucket = sagemaker_configs["S3Bucket"]
s3_output_path = "s3://{}/".format(s3_bucket)
print("S3 bucket path: {}".format(s3_output_path))

# run in local_mode on this machine, or as a SageMaker TrainingJob
local_mode = False

if local_mode:
    instance_type = "local"
else:
    instance_type = sagemaker_configs["SageMakerTrainingInstanceType"]

role = sagemaker.get_execution_role()
print("Using IAM role arn: {}".format(role))

# only run from SageMaker notebook instance
if local_mode:
    !/bin/bash ./setup.sh

cpu_or_gpu = 'gpu' if instance_type.startswith('ml.p') else 'cpu'
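
The estimator and tuner cells that follow reference several variables defined elsewhere in the notebook (job_name_prefix, metric_definitions, hyperparameter_ranges, training_data, testing_data, max_jobs, and max_parallel_jobs). The following is a sketch of plausible definitions; the names come from the code below, but the values, regex patterns, and S3 paths here are illustrative assumptions:

from sagemaker.tuner import ContinuousParameter, IntegerParameter

job_name_prefix = "pred-maintenance"
max_jobs = 4
max_parallel_jobs = 2

# Regex patterns that extract metrics from the training job logs
metric_definitions = [
    {"Name": "test_auc", "Regex": "test_auc: ([0-9\\.]+)"},
    {"Name": "test_accuracy", "Regex": "test_accuracy: ([0-9\\.]+)"},
]

# Search space explored by the hyperparameter tuner
hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(1e-5, 1e-2),
    "batch_size": IntegerParameter(32, 256),
}

# S3 input channels for the train and test datasets (paths are assumed)
training_data = "s3://{}/data/train".format(s3_bucket)
testing_data = "s3://{}/data/test".format(s3_bucket)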


We can now define the data and initiate hyperparameter optimization:

%%time

estimator = PyTorch(entry_point="train.py",
                    source_dir="source",
                    role=role,
                    dependencies=["source/dl_utils"],
                    instance_type=instance_type,
                    instance_count=1,
                    output_path=s3_output_path,
                    framework_version="1.5.0",
                    py_version='py3',
                    base_job_name=job_name_prefix,
                    metric_definitions=metric_definitions,
                    hyperparameters={
                        'epoch': 100,  # tune it according to your needs
                        'target_column': config.target_column,
                        'sensor_headers': json.dumps(config.sensor_headers),
                        'train_input_filename': os.path.basename(config.train_dataset_fn),
                        'test_input_filename': os.path.basename(config.test_dataset_fn),
                        }
                     )

if local_mode:
    estimator.fit({'train': training_data, 'test': testing_data})
%%time

tuner = HyperparameterTuner(estimator,
                            objective_metric_name="test_auc",
                            objective_type="Maximize",
                            hyperparameter_ranges=hyperparameter_ranges,
                            metric_definitions=metric_definitions,
                            max_jobs=max_jobs,
                            max_parallel_jobs=max_parallel_jobs,
                            base_tuning_job_name=job_name_prefix)
tuner.fit({'train': training_data, 'test': testing_data})

Analyze the results from the model we trained

In the 5_results_analysis.ipynb notebook, we get data from our hyperparameter tuning job, visualize the metrics of all the jobs to identify the best job, and build an endpoint for the best training job.

Let's generate the configuration file as in the previous notebook and visualize the metrics of all the jobs. The following plot visualizes test accuracy vs. epoch.

test accuracy
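
If you want to pull these per-job metrics yourself, a minimal sketch using the SageMaker Python SDK's tuning analytics follows; the tuning job name is a placeholder you would copy from the tuner launched in the previous notebook or from the SageMaker console:

import sagemaker

# Name of the tuning job from the previous notebook (placeholder value)
tuning_job_name = "pred-maintenance-tuning-job"

# Pull per-training-job metrics into a DataFrame
analytics = sagemaker.HyperparameterTuningJobAnalytics(tuning_job_name)
jobs_df = analytics.dataframe()

# Rank the jobs by the objective metric (test_auc) to find the best one
print(jobs_df.sort_values("FinalObjectiveValue", ascending=False).head())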

The following screenshot shows the hyperparameter tuning jobs we ran.

hyperparameter tuning jobs

You can now visualize data from the best training job (out of the four training jobs) based on the test accuracy (pink).

As we can see in the following screenshots, the test loss declines and the AUC and accuracy increase with epochs.

auc and accuracy

auc and accuracy 2

Based on the visualizations, we can now build an endpoint for the best training job:

%%time

role = sagemaker.get_execution_role()

model = PyTorchModel(model_data=model_artifact,
                     role=role,
                     entry_point="inference.py",
                     source_dir="source/dl_utils",
                     framework_version='1.5.0',
                     py_version='py3',
                     name=sagemaker_configs["SageMakerModelName"],
                     code_location="s3://{}/endpoint".format(s3_bucket)
                    )

endpoint_instance_type = sagemaker_configs["SageMakerInferenceInstanceType"]

predictor = model.deploy(initial_instance_count=1, instance_type=endpoint_instance_type, endpoint_name=sagemaker_configs["SageMakerEndpointName"])

# Serialize NumPy arrays as JSON for the endpoint, and decode the JSON response
def custom_np_serializer(data):
    return json.dumps(data.tolist())

def custom_np_deserializer(np_bytes, content_type='application/x-npy'):
    out = np.array(json.loads(np_bytes.read()))
    return out

predictor.serializer = custom_np_serializer
predictor.deserializer = custom_np_deserializer

After we build the endpoint, we can test the predictor by passing it sample sensor logs:

import botocore

config = botocore.config.Config(read_timeout=200)
runtime = boto3.client('runtime.sagemaker', config=config)

# One window of 20 readings for 2 sensors (voltage and current)
data = np.ones(shape=(1, 20, 2)).tolist()
payload = json.dumps(data)

response = runtime.invoke_endpoint(EndpointName=sagemaker_configs["SageMakerEndpointName"],
                                   ContentType='application/json',
                                   Body=payload)
out = json.loads(response['Body'].read().decode())[0]

print("Given the sample input data, the predicted probability of failure is {:0.2f}%".format(100*(1.0-out[0])))

Given the sample input data, the predicted probability of failure is 34.60%.

Clean up

When you've finished with this solution, make sure that you delete all unwanted AWS resources. On the Predictive Maintenance for Vehicle Fleets page, under Delete solution, choose Delete all resources to delete all the resources associated with the solution.

clean up

You need to manually delete any additional resources that you may have created in these notebooks. Some examples include extra S3 buckets (in addition to the solution's default bucket) and extra SageMaker endpoints (using a custom name).
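
For example, a leftover endpoint with a custom name can be removed with boto3; the endpoint and endpoint config names below are placeholders:

import boto3

sm = boto3.client("sagemaker")

# Delete an extra endpoint and its endpoint config (placeholder names)
sm.delete_endpoint(EndpointName="my-custom-endpoint")
sm.delete_endpoint_config(EndpointConfigName="my-custom-endpoint")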

Customize the solution

Our solution is easy to customize. To modify the input data visualizations, refer to sagemaker/3_data_visualization.ipynb. To customize the machine learning, refer to sagemaker/source/train.py and sagemaker/source/dl_utils/network.py. To customize the dataset processing, refer to sagemaker/1_introduction.ipynb for how to define the config file.

Additionally, you can change the configuration in the config file. The default configuration is as follows:

fleet_info_fn=data/example_fleet_info.csv
fleet_sensor_logs_fn=data/example_fleet_sensor_logs.csv
vehicle_id_column=vehicle_id
timestamp_column=timestamp
target_column=target
period_ms=30000
dataset_size=10000
window_length=20
chunksize=10000
processing_chunksize=1000
fleet_dataset_fn=data/processed/fleet_dataset.csv
train_dataset_fn=data/processed/train_dataset.csv
test_dataset_fn=data/processed/test_dataset.csv
period_column=period_ms

The config file has the following parameters:

  • fleet_info_fn, fleet_sensor_logs_fn, fleet_dataset_fn, train_dataset_fn, and test_dataset_fn define the location of dataset files
  • vehicle_id_column, timestamp_column, target_column, and period_column define the headers for columns
  • dataset_size, chunksize, processing_chunksize, period_ms, and window_length define the properties of the dataset
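
After editing config/config.yaml, you can reload it through the solution's Config class to pick up the new values, as in this short sketch; attribute access is assumed to mirror the keys listed above:

from source.config import Config

# Reload the configuration after editing config/config.yaml
config = Config(filename="config/config.yaml", fetch_sensor_headers=False)
print(config.dataset_size, config.window_length)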

Conclusion

In this post, we showed you how to train and deploy a model to predict vehicle fleet failure probability using SageMaker JumpStart. The solution is based on ML and deep learning models and allows a wide variety of input data, including any time-varying sensor data. Because every vehicle has different telemetry on it, you can fine-tune the provided model to the frequency and type of data that you have.

To learn more about what you can do with SageMaker JumpStart, refer to the SageMaker JumpStart resources.


About the Authors

Rajakumar Sampathkumar is a Principal Technical Account Manager at AWS, providing customers guidance on business-technology alignment and supporting the reinvention of their cloud operation models and processes. He is passionate about cloud and machine learning. Raj is also a machine learning specialist and works with AWS customers to design, deploy, and manage their AWS workloads and architectures.

