Why Recommend Violations to Fix?

One of the first decisions that a developer has to make when assessing a violation is whether to fix the violation or suppress it. Although most violations detected may be true positives, there also may be instances in which suppressing a violation is the sensible thing to do, such as when the violation is a false positive or the team has decided to not fix certain kinds of violations.

DTP provides a recommendation whether to fix the violation based on historical data of whether similar violations have been fixed or suppressed. This recommendation is given as a fix percentage: a high percentage (for example, 80% or more) indicates that similar violations have been fixed in the past, while a low percentage (for example, less than 20%) means that similar violations have been suppressed in the past.
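
As a rough illustration of how a fix percentage might be read during triage, the sketch below maps a percentage to a suggested action. The 80 and 20 cutoffs are the example values mentioned above, not limits enforced by DTP, and the function name and the "Review" label are hypothetical.

```python
def suggest_action(fix_percentage: float, high: float = 80.0, low: float = 20.0) -> str:
    """Map a fix percentage (0-100) to a suggested triage action.

    The default cutoffs mirror the example values in the text; they are
    illustrative assumptions, not thresholds defined by DTP.
    """
    if not 0.0 <= fix_percentage <= 100.0:
        raise ValueError("fix_percentage must be between 0 and 100")
    if fix_percentage >= high:
        return "Fix"       # similar violations were mostly fixed in the past
    if fix_percentage < low:
        return "Suppress"  # similar violations were mostly suppressed in the past
    return "Review"        # no clear historical pattern; decide manually


for pct in (92.0, 50.0, 5.0):
    print(pct, suggest_action(pct))
```

Percentages between the two cutoffs carry little signal either way, which is why the sketch returns a separate label for them instead of forcing a Fix or Suppress decision.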

Developers can use the DTP recommendation as an aid when assessing and triaging violations.

Getting Recommendations

In order to get recommendations, a model must first be trained. Training requires violations that have been classified as Fix or Suppress in the Actions field of the Prioritization tab (see Assigning Actions to Violations).

If you have been fixing violations for some time, you can classify the violations based on history. This is the recommended approach. If you are a new user and have not fixed or suppressed any violations, you can classify violations manually. Both approaches are supported by the Machine Learning Wizard.

A balanced set of at least 20 violations, each classified as either Fix or Suppress, is required to train the model.

It is recommended to retrain the model from time to time as you continue to fix and suppress more violations.

Using the Machine Learning Wizard to Train the Model

The wizard guides you through the following process:

Classifying violations: Classifying violations refers to setting a value for the Action field in the Prioritization tab (see Assigning Actions to Violations). The machine learning functionality requires at least 20 violations to be classified as Fix or Suppress to build a recommendation model.
Training the model: DTP will analyze classified violations and build a recommendation model based on several data points. This may take several minutes if a large number of violations (for example, 1000+) have been classified. The resulting model is assigned a health score. If the model has an unacceptable health score, the wizard will prompt you to review and classify additional violations.

Classifying Violations

Click the machine learning icon and choose Recommend Violations to Fix to launch the wizard.
If this is your first time launching the wizard or if the existing model is not sufficient to recommend actions, you will be prompted to train the model. Click Next.
You will need to classify at least 20 violations to build a recommendation model. Choose Classify violations and click Next.
DTP will analyze the existing builds in the filter and prompt you to build the model based on violations that have already been classified or to classify violations manually.
Choose one of the following options and click Confirm:
- Enable Classify violations based on history and choose either All, 6 months, or 12 months from the Build history menu. DTP will search for violations that have already been fixed or suppressed and set the Action field to "Fix" or "Suppress" accordingly. See Expected Classification Outcomes for additional information.
- Enable Classify violations manually and DTP will present 20 violations from the current build for you to review and classify. See Assigning Actions to Violations for instructions on how to manually classify violations.
When an adequate number of violations have been classified, you will be prompted to return to the main page of the wizard so that you can train the model (see Training the Model).

Expected Classification Outcomes

If Classify violations based on history is enabled, DTP will classify violations in the database that have been fixed or suppressed, resulting in one of the following outcomes:

If an adequate number of violations have been classified, you will be prompted to return to the main page of the wizard so that you can train the model (see Training the Model).
If no violations from your history have been fixed or suppressed, then DTP will not be able to automatically assign actions. You will be prompted to return to the main page of the wizard and restart the classification process. Choose Classify violations manually when prompted. DTP will present 20 violations from the current build for you to review and classify. See Assigning Actions to Violations for instructions on how to manually classify violations.
DTP requires a balanced ratio of Fix and Suppress violations. If too many violations have been identified and classified as either Fix or Suppress, you will be prompted to manually classify additional violations until a more balanced ratio is achieved. See Assigning Actions to Violations for instructions on how to manually classify violations.
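
The three outcomes above can be sketched as a small decision function. This is only a model of the described behavior, not DTP's implementation: the minimum of 20 comes from the text, while the balance cutoff (`MAX_MAJORITY_SHARE`) and all names are hypothetical, since the exact ratio DTP requires is not stated here.

```python
from collections import Counter

MIN_CLASSIFIED = 20        # minimum number of classified violations stated in the text
MAX_MAJORITY_SHARE = 0.8   # hypothetical balance cutoff; DTP's actual ratio is not documented here


def classification_outcome(actions: list[str]) -> str:
    """Return the next wizard step for a list of violation actions."""
    counts = Counter(a for a in actions if a in ("Fix", "Suppress"))
    total = counts["Fix"] + counts["Suppress"]
    if total == 0:
        # Nothing in the history was fixed or suppressed.
        return "classify manually"
    if total < MIN_CLASSIFIED:
        return "classify more violations"
    majority_share = max(counts["Fix"], counts["Suppress"]) / total
    if majority_share > MAX_MAJORITY_SHARE:
        # Too many violations fall into one class.
        return "classify more violations to balance Fix and Suppress"
    return "ready to train"


print(classification_outcome(["Fix"] * 12 + ["Suppress"] * 10))
print(classification_outcome(["Fix"] * 30 + ["Suppress"] * 2))
print(classification_outcome([]))
```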

Training the Model

If the wizard is not already open, click the machine learning icon and choose Recommend Violations to Fix to launch it.
Choose Train model and click Next.
If you have already classified at least 20 violations, you will be prompted to execute training based on those classifications (if you have not already classified at least 20 violations, see Classifying Violations). Enable Execute training and click Next.
Choose a training mode when prompted. See About Training Modes for additional information about which option to choose.
Click Confirm when prompted to begin training the recommendation model on how to classify violations. This may take several minutes if a large number of violations (for example, 1000+) have been classified.
When the model is trained, you will be prompted to recommend actions (see Recommending Violations to Fix).

A healthy recommendation model is essential for DTP to make reliable recommendations. If the classification process results in a Poor or Moderate model, then you should continue classifying violations until the model health improves. See About Model Health for additional information. If you used the Fast or Normal mode to train the model and are unsatisfied with the quality of the model, retrain the model and choose either the Deep or Deepest training mode option.

About Training Modes

DTP uses the following algorithms to train the model:

k-NN
Naive Bayes
Adaptive Boost
Random Forest

The Fast and Deep training mode options use the k-NN and Naive Bayes algorithms, which have been shown to be the fastest algorithms for processing static analysis data. But because only two algorithms run, DTP will also have fewer options when determining the best training model.

The Normal and Deepest training modes use all algorithms, which will take longer than the other options to process the data. Using all algorithms, however, provides DTP with the most information for determining the best training model.

The Fast and Normal training mode options limit the number of violations used to train the model to 1000. This enables DTP to train faster, especially if you have a large set of classified violations. Reducing the data sample size, however, may affect the overall quality of the model.

The Deep and Deepest training modes use all classified violations available to train the model. As a result, these options have the potential to produce the best possible model. If your project contains a very large set of classified violations, however, allow several more minutes for the training process to complete.
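
The four modes differ along two axes: which algorithms run and how many classified violations are sampled. The sketch below summarizes that as a lookup table. The class and function names are hypothetical and exist only for illustration; the algorithm lists and the 1000-violation limit are taken from the description above.

```python
from dataclasses import dataclass
from typing import Optional

ALL_ALGORITHMS = ("k-NN", "Naive Bayes", "Adaptive Boost", "Random Forest")
FAST_ALGORITHMS = ("k-NN", "Naive Bayes")


@dataclass(frozen=True)
class TrainingMode:
    algorithms: tuple[str, ...]
    max_violations: Optional[int]  # None means all classified violations are used


TRAINING_MODES = {
    "Fast": TrainingMode(FAST_ALGORITHMS, 1000),
    "Normal": TrainingMode(ALL_ALGORITHMS, 1000),
    "Deep": TrainingMode(FAST_ALGORITHMS, None),
    "Deepest": TrainingMode(ALL_ALGORITHMS, None),
}


def sample_size(mode: str, classified: int) -> int:
    """Number of classified violations a mode would train on."""
    limit = TRAINING_MODES[mode].max_violations
    return classified if limit is None else min(classified, limit)


for name, mode in TRAINING_MODES.items():
    print(name, len(mode.algorithms), "algorithms,", sample_size(name, 2500), "violations")
```

With 2500 classified violations, for example, Fast and Normal would train on 1000 of them, while Deep and Deepest would train on all 2500.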

Recommending Violations to Fix

You can get recommendations on violations to fix from two places:

From the Prioritization Tab
Using the Machine Learning Wizard

From the Prioritization Tab

You can get DTP to recommend violations to fix by clicking Get Recommendations on the Prioritization tab.

[Screenshot: Prioritization tab in the Violations Explorer]

If you have trained the model, the recommendation will be added to the Recommendations section. If one or more prerequisite conditions have not been met, there will be an info icon to the left of the Recommendations label that you can hover over for details.

Using the Machine Learning Wizard

If the wizard is not already open, click the machine learning icon and choose Recommend Violations to Fix to launch it.
Enable Recommend violations to fix and click Next.
- If Recommend violations to fix is not available, you will need to classify more violations. See Classifying Violations.
- If you would like to increase the recommendation health score, you will need to train the model more. See Training the Model.
Click Confirm when prompted. DTP will use the machine learning model to make recommendations about how to classify static analysis violations for the current build in the filter. The Recommended Action field in the Prioritization tab will be updated, and a recommendations results summary screen will appear.
Review the actions recommended for the violations. You can either assign the recommendations to the violations (see Assigning Actions to Violations) or ignore the recommendations. See Viewing Recommendations for more information about viewing recommendations.

Viewing Recommendations

You can view the fix recommendations in two places: in the search results table and on the Prioritization tab.

In the search results table, look in the Recommended Action column. You can sort violations according to DTP's recommendations.

In the Prioritization tab, look in the Recommendations section.

[Screenshot: Prioritization tab with the fix recommendation highlighted]

About Model Health

The health of the model that DTP uses depends on the number of violations you process and the quality of the data you provide. DTP requires at least 20 violations to be classified, but classifying more violations will improve the model health. If you have not provided enough data, DTP will not be able to use the model to make a recommendation.

The model informing DTP recommendations also depends on the accuracy and consistency of the data you provide. DTP uses advanced algorithms to analyze the data associated with each violation, but it can only make quality recommendations based on quality inputs. If there is an imbalance between the number of fixed violations and the number of suppressed violations (or between the number of violations with the action set to Fix and the number with the action set to Suppress), DTP may indicate that it cannot properly train a model. In this case, you should assign an equal number of Fix and Suppress actions to facilitate model health.

Providing Machine Learning Feedback to Parasoft

You can help Parasoft improve the system by using the API to get Machine Learning data and sending the data to your Parasoft representative. Sensitive information, such as username and project name, is obfuscated in the response. The following API endpoints are available to retrieve the Machine Learning data; save the output from each endpoint to a file:

/classificationResultSets

This endpoint returns static analysis violations used to train the model and recommendation results.

Authentication

Pass your username and password when sending the request. See Example.

URL

http://<HOST>:<PORT>/grs/api/v1.7/ml/staticAnalysis/classificationResultSets

Method

GET

Parameters

The following table describes the parameters available for this endpoint.

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| sortOrder | Specifies the order in which the data is returned by ID. Data is assigned IDs in sequential order, so newer data has a larger ID number. You can specify the following values: asc (default), desc. | string | optional |
| limit | Specifies the maximum number of result sets to return. The default is no limit. | integer | optional |
| offset | Specifies the number of result sets to skip in the response. The default is 0. If the database has three result sets and offset=1 is specified, the two most recent result sets will be returned. If sortOrder=desc is also applied, then the two oldest result sets will be returned. | integer | optional |
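
The interaction between sortOrder, offset, and limit can be easier to follow with a small simulation. The function below is a hypothetical model of the behavior described for these parameters, not the server's implementation; it operates on a plain list of result set IDs.

```python
from typing import Optional


def page_result_sets(ids: list[int], sort_order: str = "asc",
                     offset: int = 0, limit: Optional[int] = None) -> list[int]:
    """Model how sortOrder, offset, and limit select result set IDs."""
    if sort_order not in ("asc", "desc"):
        raise ValueError("sort_order must be 'asc' or 'desc'")
    # Newer data has a larger ID, so "asc" lists the oldest result set first.
    ordered = sorted(ids, reverse=(sort_order == "desc"))
    end = None if limit is None else offset + limit
    return ordered[offset:end]


ids = [1, 2, 3]  # 3 is the most recent result set
print(page_result_sets(ids, offset=1))                     # [2, 3]
print(page_result_sets(ids, sort_order="desc", offset=1))  # [2, 1]
print(page_result_sets(ids, sort_order="desc", limit=1))   # [3]
```

With the default ascending order, offset=1 skips the oldest result set and leaves the two most recent; with sortOrder=desc, the same offset skips the newest and leaves the two oldest.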

Example

curl -X GET -u username:password "http://dtp.mycompany.com:8443/grs/api/v1.7/ml/staticAnalysis/classificationResultSets?sortOrder=desc&limit=10" > myResultSets.json

/violationActionHistory

This endpoint returns actions assigned to the violations used to train the model.

URL

http://<HOST>:<PORT>/grs/api/v1.7/ml/staticAnalysis/violationActionHistory

Method

GET

Parameters

None.

Example

curl -X GET -u username:password "http://dtp.mycompany.com:8443/grs/api/v1.7/ml/staticAnalysis/violationActionHistory" > myViolationActionHistory.json
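
If you prefer a script to curl, the same two requests can be sketched in Python with only the standard library. This is a minimal sketch that assumes the endpoints accept HTTP basic authentication, as the `-u` option in the curl examples implies; the helper names are hypothetical, and the host, port, and credentials are the placeholders from the examples above.

```python
import base64
import urllib.parse
import urllib.request

API_ROOT = "/grs/api/v1.7/ml/staticAnalysis"


def build_url(base: str, endpoint: str, **params) -> str:
    """Build the endpoint URL, appending only the parameters that are set."""
    query = urllib.parse.urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base.rstrip('/')}{API_ROOT}/{endpoint}"
    return f"{url}?{query}" if query else url


def basic_auth_header(username: str, password: str) -> dict:
    """Encode credentials the way curl -u does for basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def save_endpoint(base: str, endpoint: str, username: str, password: str,
                  out_path: str, **params) -> None:
    """Send a GET request and write the raw response body to a file."""
    request = urllib.request.Request(build_url(base, endpoint, **params),
                                     headers=basic_auth_header(username, password))
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())


def main() -> None:
    base = "http://dtp.mycompany.com:8443"  # placeholder host from the curl examples
    save_endpoint(base, "classificationResultSets", "username", "password",
                  "myResultSets.json", sortOrder="desc", limit=10)
    save_endpoint(base, "violationActionHistory", "username", "password",
                  "myViolationActionHistory.json")


# Printing the URLs makes no network call; call main() to fetch and save the data.
print(build_url("http://dtp.mycompany.com:8443", "classificationResultSets",
                sortOrder="desc", limit=10))
print(build_url("http://dtp.mycompany.com:8443", "violationActionHistory"))
```

Calling `main()` against a reachable DTP server would write the same two JSON files that the curl examples produce.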