This feature analyzes violations that have been classified as Fix and Suppress in the Actions field of the Prioritization tab (see Assigning Actions to Violations) and builds a predictive model based on patterns it detects. After the model has been built, DTP will predict which violations in the build should be assigned the Fix action. A balanced set of at least 20 violations must be classified as Fix and Suppress to build the predictive model. The model gradually improves as you continue to review violations and assign them actions.

Using the Machine Learning Wizard

The wizard guides you through the following process:

Classifying violations: Classifying violations refers to setting a value for the Action field in the Prioritization tab (see Assigning Actions to Violations). The machine learning functionality requires at least 20 violations to be classified as Fix or Suppress to build a predictive model.
Training the model: DTP will analyze classified violations and build a predictive model based on several data points. This may take several minutes if a large number of violations (for example, 1000+) have been classified. The resulting model is assigned a health score. If the model has an unacceptable health score, the wizard will prompt you to review and classify additional violations.
Predicting actions: When the model has an acceptable health score (between Good and Excellent), you can apply the model to the unclassified violations in the filter. This significantly reduces the time required to review violations and enables the team to quickly begin working on remediating critical violations.
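
The three-step flow above can be summarized as a small decision function. This is only an illustrative sketch, not DTP's actual logic: the function name and return values are hypothetical, the health labels are the ones named on this page, and the balance requirement for Fix and Suppress is ignored for brevity.

```python
from typing import Optional

MIN_CLASSIFIED = 20                        # minimum stated in this document
ACCEPTABLE_HEALTH = {"Good", "Excellent"}  # "between Good and Excellent"


def next_wizard_step(classified_count: int, health: Optional[str]) -> str:
    """Return the wizard step a user would be directed to next.

    health is None when no model has been trained yet.
    """
    if classified_count < MIN_CLASSIFIED:
        return "classify"   # not enough Fix/Suppress classifications yet
    if health is None:
        return "train"      # enough data, but no model exists
    if health not in ACCEPTABLE_HEALTH:
        return "classify"   # review and classify additional violations
    return "predict"        # model is healthy enough to apply
```

For example, a filter with 25 classified violations and no trained model maps to "train", while the same filter with a Poor model maps back to "classify".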

Classifying Violations

Click the machine learning icon and choose Predict Violations to Fix to launch the wizard.
If this is your first time launching the wizard or if the existing model is not sufficient to predict actions, you will be prompted to train the model. Click Next.
You will need to classify at least 20 violations to build a predictive model. Choose Classify violations and click Next.
DTP will analyze the existing builds in the filter and prompt you to build the model based on violations that have already been classified or to classify violations manually.
Choose one of the following options and click Confirm:
- Enable Classify violations based on history and choose either All, 6 months, or 12 months from the Build history menu. DTP will search for violations that have already been fixed or suppressed and set the Action field to "Fix" or "Suppress" accordingly. See Expected Classification Outcomes for additional information.
- Enable Classify violations manually and DTP will present 20 violations from the current build for you to review and classify. See Assigning Actions to Violations for instructions on how to manually classify violations.
When an adequate number of violations have been classified, you will be prompted to return to the main page of the wizard so that you can train the model (see Training the Model).

Expected Classification Outcomes

If Classify violations based on history is enabled, DTP will classify violations in the database that have been fixed or suppressed, resulting in one of the following outcomes:

If an adequate number of violations have been classified, you will be prompted to return to the main page of the wizard so that you can train the model (see Training the Model).
If no violations from your history have been fixed or suppressed, then DTP will not be able to automatically assign actions. You will be prompted to return to the main page of the wizard and restart the classification process. Choose Classify violations manually when prompted. DTP will present 20 violations from the current build for you to review and classify. See Assigning Actions to Violations for instructions on how to manually classify violations.
DTP requires a balanced ratio of Fix and Suppress violations. If the classified violations are weighted too heavily toward either Fix or Suppress, you will be prompted to manually classify additional violations until a more balanced ratio is achieved. See Assigning Actions to Violations for instructions on how to manually classify violations.
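
The outcomes above can be sketched as a simple check on the list of assigned actions. This is a rough illustration only. The 20-violation minimum comes from this page, but the balance threshold (MAX_MAJORITY_SHARE) is a made-up value, because this page does not state what ratio DTP treats as balanced. The "classify more" branch for a non-empty but too-small set is likewise an assumption.

```python
from collections import Counter

MIN_CLASSIFIED = 20        # minimum stated in this document
MAX_MAJORITY_SHARE = 0.75  # hypothetical threshold, not a documented DTP value


def classification_outcome(actions):
    """Map a list of assigned actions to the outcome the wizard would report."""
    counts = Counter(a for a in actions if a in ("Fix", "Suppress"))
    total = counts["Fix"] + counts["Suppress"]
    if total == 0:
        return "classify manually"  # nothing in history was fixed or suppressed
    if total < MIN_CLASSIFIED:
        return "classify more"      # assumed outcome for a too-small set
    if max(counts["Fix"], counts["Suppress"]) / total > MAX_MAJORITY_SHARE:
        return "rebalance"          # too many of one action
    return "train"                  # adequate and balanced
```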

Training the Model

If the wizard is not already open, click the machine learning icon and choose Predict Violations to Fix to launch it.
Choose Train model and click Next.
If you have already classified at least 20 violations, you will be prompted to execute training based on those classifications (if you have not already classified at least 20 violations, see Classifying Violations). Enable Execute training and click Next.
Choose a training mode when prompted. See About Training Modes for additional information about which option to choose.
Click Confirm when prompted to begin training the prediction model on how to classify violations. This may take several minutes if a large number of violations (for example, 1000+) have been classified.
When the model is trained, you will be prompted to predict actions (see Predicting Actions).

A healthy prediction model is essential for DTP to make reliable predictions. If training produces a Poor or Moderate model, continue classifying violations until the model health improves. See About Prediction Health for additional information. If you used the Fast or Normal mode to train the model and are unsatisfied with the quality of the model, retrain the model and choose either the Deep or Deepest training mode.

About Training Modes

DTP uses the following algorithms to train the model:

k-NN
Naive Bayes
Adaptive Boost
Random Forest

The Fast and Deep training modes use only the k-NN and Naive Bayes algorithms, which have been shown to be the fastest algorithms for processing static analysis data. Because only two algorithms run, however, DTP has fewer candidates to choose from when determining the best training model.

The Normal and Deepest training modes use all algorithms, which will take longer than the other options to process the data. Using all algorithms, however, provides DTP with the most information for determining the best training model.

The Fast and Normal training mode options limit the number of violations used to train the model to 1000. This enables DTP to train faster, especially if you have a large set of classified violations. Reducing the data sample size, however, may affect the overall quality of the model.

The Deep and Deepest training modes use all available classified violations to train the model. As a result, these options have the potential to produce the best possible model. If your project contains a very large set of classified violations, however, allow several more minutes for the training process to complete.
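
The four modes combine two independent choices: which algorithms run, and whether the training sample is capped at 1000 violations. The following sketch restates that as a lookup table. The names TRAINING_MODES and training_plan are illustrative and are not part of DTP.

```python
FAST_ALGORITHMS = ("k-NN", "Naive Bayes")
ALL_ALGORITHMS = ("k-NN", "Naive Bayes", "Adaptive Boost", "Random Forest")
SAMPLE_CAP = 1000  # limit applied by the Fast and Normal modes

# mode -> (algorithms that run, cap on violations used; None means no cap)
TRAINING_MODES = {
    "Fast": (FAST_ALGORITHMS, SAMPLE_CAP),
    "Normal": (ALL_ALGORITHMS, SAMPLE_CAP),
    "Deep": (FAST_ALGORITHMS, None),
    "Deepest": (ALL_ALGORITHMS, None),
}


def training_plan(mode, classified_count):
    """Return (algorithms, number of violations used) for a training mode."""
    algorithms, cap = TRAINING_MODES[mode]
    used = classified_count if cap is None else min(classified_count, cap)
    return algorithms, used
```

With 5000 classified violations, Fast would train two algorithms on 1000 violations, while Deepest would train all four algorithms on all 5000.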

Predicting Actions

If the wizard is not already open, click the machine learning icon and choose Predict Violations to Fix to launch it.
Choose Predict actions and click Next.
- If Predict actions is not available, you will need to classify more violations. See Classifying Violations.
- If you would like to increase the predictor's health score, you will need to train the model more. See Training the Model.
Click Confirm when prompted. DTP will use the machine learning model to make predictions about how to classify static analysis violations for the current build in the filter. The Predicted Action field in the Prioritization tab will be updated. See Predicted Actions for additional information. A prediction results summary screen will also appear.
Review the actions predicted for the violations. You can either assign the predictions to the violations (see Assigning Actions to Violations) or ignore the predictions. See Predicted Actions for details about predicted actions.

Predicted Actions

After you have classified violations and trained and applied the predictive model, the Predicted Action field appears in the Prioritization tab. If DTP predicts that the violation should be fixed, a value of "Fix" will appear in the field, as well as a value indicating how confident DTP is that the predicted action is correct based on the model.

If the Fix action has not been predicted for the violation, the Predicted Action field shows Not available.

You can use the Predicted Action column in the search results table to sort violations according to DTP's recommendations.

You can drag the column to the position in the table that best suits your needs. Refer to the Navigating Explorer Views chapter for additional information about customizing the Violations Explorer view.
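
The effect of sorting by predicted action can be sketched as follows. The record layout is hypothetical: the predicted_action and confidence keys are invented for illustration, and confidence is assumed to be a number where higher means more confident. This page does not specify how DTP represents the confidence value.

```python
def sort_by_prediction(violations):
    """Order violations so predicted Fix actions come first, most confident first."""
    def key(violation):
        predicted_fix = violation.get("predicted_action") == "Fix"
        confidence = violation.get("confidence") or 0.0
        # False sorts before True, so negate predicted_fix; negate confidence
        # so that higher values come first.
        return (not predicted_fix, -confidence)

    return sorted(violations, key=key)
```

Violations without a prediction (shown as Not available in the UI) end up at the bottom of the list.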

About Prediction Health

The health of the predictions DTP makes depends on the number of violations you process and the quality of the data you provide. DTP requires at least 20 violations to be assigned an action, but processing more violations will improve prediction health. If you have not provided enough data, DTP will not be able to make a prediction. The model informing DTP predictions also depends on the accuracy and consistency of the data you provide. DTP uses advanced algorithms to analyze the data associated with each violation, but it can only make quality predictions based on quality inputs. Assigning too many of one type of action also affects the prediction results. You should assign a roughly equal number of Fix and Suppress actions to facilitate prediction health.

Providing Machine Learning Feedback to Parasoft

You can help Parasoft improve the system by using the API to get Machine Learning data and sending the data to your Parasoft representative. Sensitive information, such as username and project name, is obfuscated in the response. The following API endpoints are available to retrieve the Machine Learning data. Save the output from each endpoint to a file so that you can send it to Parasoft.

/classificationResultSets

This endpoint returns static analysis violations used to train the model and prediction results.

Authentication

Pass your username and password when sending the request. See Example.

URL

http://<HOST>:<PORT>/grs/api/v1.7/ml/staticAnalysis/classificationResultSets

Method

GET

Parameters

The following table describes the parameters available for this endpoint.

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| sortOrder | Specifies the order in which the data is returned by ID. Data is assigned IDs in sequential order, so newer data has a larger ID number. Valid values: asc (default), desc. | string | optional |
| limit | Specifies the maximum number of result sets to return. The default is no limit. | integer | optional |
| offset | Specifies the number of result sets to skip in the response. The default is 0. For example, if the database has three result sets and offset=1 is specified, the two most recent result sets are returned. If sortOrder=desc is also applied, the two oldest result sets (the first two by ID) are returned. | integer | optional |
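
The interaction of sortOrder, offset, and limit can be modeled on a plain list of IDs. This sketch only mirrors the behavior described in the table and makes no claim about how the server implements it. The function select_result_sets is hypothetical, and applying limit after offset is an assumption.

```python
def select_result_sets(ids, sort_order="asc", limit=None, offset=0):
    """Model which result-set IDs a request would return."""
    # Newer result sets have larger IDs, so "desc" puts the newest first.
    ordered = sorted(ids, reverse=(sort_order == "desc"))
    remaining = ordered[offset:]  # skip the first `offset` entries
    return remaining if limit is None else remaining[:limit]
```

With IDs 1, 2, and 3, offset=1 skips ID 1 and returns the two most recent result sets. Adding sortOrder=desc skips ID 3 instead and returns the two oldest.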

Example

curl -X GET -u username:password "http://dtp.mycompany.com:8443/grs/api/v1.7/ml/staticAnalysis/classificationResultSets?sortOrder=desc&limit=10" > myResultSets.json

/violationActionHistory

This endpoint returns actions assigned to the violations used to train the model.

URL

http://<HOST>:<PORT>/grs/api/v1.7/ml/staticAnalysis/violationActionHistory

Method

GET

Parameters

None.

Example

curl -X GET -u username:password "http://dtp.mycompany.com:8443/grs/api/v1.7/ml/staticAnalysis/violationActionHistory" > myViolationActionHistory.json
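
If you prefer a script to curl, the same requests can be sketched with the Python standard library. This is a minimal sketch under two assumptions: that the server accepts HTTP Basic authentication (which is what curl's -u option sends), and that the host, port, credentials, and file names are the placeholders from the examples above. The helper names are illustrative.

```python
import base64
import urllib.parse
import urllib.request

API_ROOT = "/grs/api/v1.7/ml/staticAnalysis"


def build_url(host, port, endpoint, **params):
    """Build the endpoint URL, appending query parameters if any are given."""
    url = f"http://{host}:{port}{API_ROOT}/{endpoint}"
    query = urllib.parse.urlencode(params)
    return f"{url}?{query}" if query else url


def basic_auth_header(username, password):
    """Return the HTTP Basic Authorization header that curl -u would send."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def save_endpoint(url, username, password, path):
    """GET the URL and write the response body to a file."""
    request = urllib.request.Request(url, headers=basic_auth_header(username, password))
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())


# Example usage (placeholders; requires a reachable DTP server):
# url = build_url("dtp.mycompany.com", 8443, "classificationResultSets",
#                 sortOrder="desc", limit=10)
# save_endpoint(url, "username", "password", "myResultSets.json")
# url = build_url("dtp.mycompany.com", 8443, "violationActionHistory")
# save_endpoint(url, "username", "password", "myViolationActionHistory.json")
```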
