# Pharma Engineering

For Engineer By Engineer

## Monday, 19 April 2021

Hi everyone! I took a long break to focus on my job and studies.

I hope everyone is safe from Covid-19; if not, I wish you a quick recovery.

Today I would like to give a short demo on using Minitab to determine process capability and performance, which are used to judge whether a process is in control and able to meet current / future business requirements.

In this post I will use some terms that may be new to some readers, but if you are familiar with Lean & Six Sigma topics, they will be easy to follow.

In any case, I'll cover the common terms first to make the rest of the post easier to understand.

What is VOC & VOP ?

VOC refers to Voice Of Customer & VOP refers to Voice Of Process.

What is CTQ ?

CTQ refers to Critical To Quality.

What do Cp and Cpk represent ?

In this post context, Cp doesn't mean Specific heat. Cp, Cpk are Process Capability indices.

Cp - Potential Process Capability;
Cpk - Actual Process Capability.

What do Pp and Ppk represent ?

Pp, Ppk are Process performance indices.

What is the difference between Cp / Cpk and Pp / Ppk ?

Many of you may find this confusing. Put simply, Cp / Cpk indicate the short-term capability of a process and Pp / Ppk indicate its long-term capability.

What are the formulae for calculating Cp, Cpk, Pp, Ppk ?

Cp and Pp have similar formulae, but note that they are not the same:
Cp = (USL - LSL) / (6 x 𝝈w); Pp = (USL - LSL) / (6 x 𝝈o);

USL - Upper Specification Limit; LSL - Lower Specification Limit;
𝝈w - Standard Deviation (within); 𝝈o - Standard Deviation (overall)

Cpk = min (Cpu, Cpl), i.e., the smaller of Cpu and Cpl is taken as Cpk;
Ppk = min (Ppu, Ppl), i.e., the smaller of Ppu and Ppl is taken as Ppk;

Cpu = (USL - Xb) / (3 x 𝝈w); Cpl = (Xb - LSL) / (3 x 𝝈w)
Ppu = (USL - Xb) / (3 x 𝝈o); Ppl = (Xb - LSL) / (3 x 𝝈o)

Xb - the average of the data (simply the mean)
𝝈w - calculated by dividing the data into sub-groups (within sub-group variation),
𝝈o - calculated for the total population, i.e., for the total data.
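The formulae above can be sketched in a few lines of Python. This is only an illustration, not Minitab's implementation: the function name and the yield values are made up, and for individual batch data (sub-group size 1) it assumes the within standard deviation is estimated from the average moving range divided by the constant d2 = 1.128.

```python
from statistics import mean, stdev

D2_MOVING_RANGE_2 = 1.128  # d2 constant for a moving range of 2 points


def capability_indices(data, lsl, usl):
    """Return Cp, Cpk, Pp and Ppk for individual (sub-group size 1) data."""
    xb = mean(data)

    # Within sigma (assumption): average moving range / d2.
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    sigma_w = mean(moving_ranges) / D2_MOVING_RANGE_2

    # Overall sigma: sample standard deviation of the total data.
    sigma_o = stdev(data)

    def indices(sigma):
        potential = (usl - lsl) / (6 * sigma)
        upper = (usl - xb) / (3 * sigma)
        lower = (xb - lsl) / (3 * sigma)
        return potential, min(upper, lower)

    cp, cpk = indices(sigma_w)
    pp, ppk = indices(sigma_o)
    return {"Cp": cp, "Cpk": cpk, "Pp": pp, "Ppk": ppk}


# Made-up yields (kg), only to show the call.
yields = [498.0, 502.0, 500.0, 504.0, 499.0, 503.0, 501.0, 497.0]
result = capability_indices(yields, lsl=480.0, usl=520.0)
print(result)
```

Minitab can apply unbiasing constants and other estimation options, so its reported values may differ slightly from this sketch.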

What are UCL & LCL ?

UCL represents the Upper Control Limit & LCL represents the Lower Control Limit.

What are the types of variations ?

Variations are of two types, i.e., common cause variation and special cause variation.

How to identify Common cause and special cause variations ?

Special cause variations are easy to identify because the results they produce show up as outliers. Common cause variations can result from variation in operating parameters that stay within their specified ranges.

How are the UCL & LCL values derived ?

UCL = mean of the given data + 3 times the standard deviation
LCL = mean of the given data - 3 times the standard deviation
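As a rough sketch of the mean ± 3 standard deviations rule for an I-MR chart, the snippet below computes the limits for both charts. It assumes, as is common for I-MR charts, that the standard deviation is estimated from the average moving range (divided by d2 = 1.128) and that the moving range chart uses D4 = 3.267 for its upper limit; the function name and data are hypothetical.

```python
from statistics import mean


def imr_limits(data):
    """Return (LCL, center line, UCL) for the Individuals and Moving Range charts."""
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    x_bar = mean(data)
    mr_bar = mean(moving_ranges)

    # Assumption: sigma is estimated as MR-bar / d2, with d2 = 1.128.
    sigma = mr_bar / 1.128

    return {
        "I": (x_bar - 3 * sigma, x_bar, x_bar + 3 * sigma),
        # Assumption: D4 = 3.267 and a lower limit of 0 for a moving range of 2.
        "MR": (0.0, mr_bar, 3.267 * mr_bar),
    }


limits = imr_limits([10.0, 12.0, 11.0, 13.0, 12.0])  # made-up values
print(limits)
```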

How to select the capability analysis for different data types ?

Data is usually classified into two types, i.e., discrete and continuous.
If the data is continuous, we first need to check its normality. If the data is normal, Capability Analysis (Normal) shall be selected.
If the data distribution is non-normal, we have to proceed with Weibull / Exponential analysis.

Alternatively, Minitab has an option to transform non-normal data into normal data using the Johnson transformation / Box-Cox transformation.

If the data is discrete, it can be classified further, e.g., as binary data or ordinal data.
If it is binary data (such as pass / fail), proceed with Binomial analysis; if it is ordinal / count data (such as the number of defects per unit), proceed with Poisson analysis.

What does Z represent ?

Z represents the sigma level of the process. As the sigma level increases, the variability / number of defects per unit number of batches / parts reduces.

How to calculate Z ?

The sigma level is calculated as (USL - Xb)/s (or) (Xb - LSL)/s, where s is the standard deviation.
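A minimal sketch of the Z calculation is given below. The function name and the data are made up, and reporting the smaller of the two values as the overall sigma level is an assumption made here by analogy with Cpk, not something stated above.

```python
from statistics import mean, stdev


def sigma_level(data, lsl, usl):
    """Return (Z upper, Z lower, smaller of the two) for the given data."""
    xb = mean(data)
    s = stdev(data)  # sample standard deviation of all the data
    z_upper = (usl - xb) / s
    z_lower = (xb - lsl) / s
    # Assumption: the smaller Z is the limiting one, as with Cpk.
    return z_upper, z_lower, min(z_upper, z_lower)


z_upper, z_lower, z_min = sigma_level([498.0, 500.0, 502.0], lsl=480.0, usl=520.0)
print(z_upper, z_lower, z_min)
```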

Now let's perform the capability analysis for the available data.

CAPABILITY ANALYSIS DEMO

To demonstrate the process of capability analysis, let me assume some data:
[The data considered here is purely assumed, not taken from anywhere]

I've assumed data for ~50 batches, for which the yield range would be 500 ± 20 kg; below is the data:

Step - 1 : Copy the data into a Minitab worksheet and plot a control chart (I-MR chart).
[Approach for plotting the I-MR chart: Stat --> Control Charts --> Variables Charts for Individuals --> I-MR]
The plot for the above data would look like below:

below the control chart, there will be a text showing where the test is failing:

For Individuals chart:
TEST 1. One point more than 3.00 standard deviations from center line.
Test Failed at points:  12, 37.

For Moving range chart:
TEST 1. One point more than 3.00 standard deviations from center line.
Test Failed at points:  12, 13, 38.

These points are called outliers, i.e., points that do not lie inside the control limits (UCL & LCL).

Inference from the above control chart: The outliers indicate that the process is not in control. The batches that produced the outliers shall therefore be investigated for the variation in yield, and the causes shall be addressed by implementing appropriate Corrective and Preventive Actions (CAPA). After the CAPA is implemented, the outliers shall be eliminated from the data set and the control charts re-plotted.

Re-plotted Control chart:

For Moving range chart:
TEST 1. One point more than 3.00 standard deviations from center line.
Test Failed at points:  2, 3, 38.

Inference from the above control chart: The outliers in the moving range chart indicate that the range between consecutive points lies outside the control limits, which might be attributed to some special cause variation. Again, the variation needs to be addressed through investigation and implementation of the CAPA.

After eliminating the outliers, the control chart is re-plotted:

For Individuals chart:
TEST 1. One point more than 3.00 standard deviations from center line.
Test Failed at points:  27

For Moving range chart:
TEST 1. One point more than 3.00 standard deviations from center line.
Test Failed at points:  2

Inference from the above control chart: The outliers in the moving range chart indicate that the range between consecutive points lies outside the control limits, which might be attributed to some special cause variation. Again, the variation needs to be addressed through investigation and implementation of the CAPA.

After eliminating the outliers, the control chart is re-plotted:

For Moving range chart:
TEST 1. One point more than 3.00 standard deviations from center line.
Test Failed at points:  16

Inference from the above control chart: The outliers in the moving range chart indicate that the range between consecutive points lies outside the control limits, which might be attributed to some special cause variation. Again, the variation needs to be addressed through investigation and implementation of the CAPA.

After eliminating the outliers, the control chart is re-plotted:

Inference from the above control chart: The process is free from outliers, so we can conclude that the process is now in control.

Now we can proceed to step - 2.

Step - 2: Check the normality of the data:
[Approach for checking the data normality: Stat --> Basic Statistics --> Normality Test]
It would appear like below:

Insert Yield (kg) in Variable and click OK [ensure the test for normality is selected as Anderson-Darling]

Inference: From the above normality test, the p-value is 0.422, which is greater than 0.05, hence the data can be considered normal.

The story behind comparing p-values:
The approach is based on hypothesis testing, in which there are two claims: the null hypothesis and the alternate hypothesis.
By default, the null hypothesis states that the data is normal, and the opposite claim, the alternate hypothesis, states that the data doesn't follow a normal distribution.

Ho: Data is normal,
Ha: Data is non-normal.

We have to compare the p-value with the level of significance (𝞪); by default, the level of significance considered by Minitab is 0.05.

In this case the p-value is 0.422, which is greater than 0.05 (𝞪), so the null hypothesis cannot be rejected and the data can be treated as normally distributed.

[If the p-value is less than 𝞪, the conclusion would be that the data is not normal and the approach for estimating the process capability would be different; I'll explain that at the end]
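For readers who want to see what sits behind the p-value, here is a rough pure-Python sketch of an Anderson-Darling normality test. The p-value uses the commonly published approximation based on the adjusted statistic A* = A² x (1 + 0.75/n + 2.25/n²); this is an assumption about the method and may not match Minitab's output exactly. The function name and both data sets are made up.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """Return (A-squared, approximate p-value) for an Anderson-Darling normality test."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]

    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n

    # Small-sample adjustment, then a piecewise p-value approximation.
    a_star = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    if a_star >= 0.6:
        p = math.exp(1.2937 - 5.709 * a_star + 0.0186 * a_star ** 2)
    elif a_star > 0.34:
        p = math.exp(0.9177 - 4.279 * a_star - 1.38 * a_star ** 2)
    elif a_star > 0.2:
        p = 1 - math.exp(-8.318 + 42.796 * a_star - 59.938 * a_star ** 2)
    else:
        p = 1 - math.exp(-13.436 + 101.14 * a_star - 223.73 * a_star ** 2)
    return a2, p


ALPHA = 0.05  # level of significance

# Made-up data: evenly spaced normal quantiles, and a strongly skewed set.
normal_like = [NormalDist(500, 5).inv_cdf((i - 0.5) / 50) for i in range(1, 51)]
skewed = [-math.log(1 - (i - 0.5) / 50) for i in range(1, 51)]

for name, values in (("normal-like", normal_like), ("skewed", skewed)):
    a2, p = anderson_darling_normal(values)
    verdict = "cannot reject Ho" if p > ALPHA else "reject Ho"
    print(name, round(a2, 3), verdict)
```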

Now, let's go to step - 3.

Step - 3: Performing Capability analysis:
[Approach for performing capability analysis would be Stat --> Quality tools --> Capability analysis --> Normal]

Inference: From the above capability report, we can conclude that the process is capable of meeting the specification limits, but the process is skewed towards the left because the mean is not centred. A Cpk of less than 1.00 indicates that the short-term capability is poor. As the short-term capability itself is poor, there is no need to check the overall capability, i.e., Ppk.

Hence the mean shall be moved towards the median, i.e., ~504, so the specification limits become 504 ± 20 kg. Now let's check the process capability with the revised specification limits:

Inference: From the above capability chart, we can conclude that the process capability has improved compared with the previous one (i.e., from 0.95 to 1.22), which is due to the shift of the mean towards the process median. Based on the overall process performance index (i.e., Ppk), it can be concluded that the process is marginal and there is scope for further improvement.
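Why centring helps can be seen from the Cpk formula: when the mean sits midway between the limits, Cpu and Cpl are equal and Cpk rises to the value of Cp. The sketch below uses hypothetical numbers (a mean of 504 and a within standard deviation of 5), not the data from this demo, so the results will not match 0.95 and 1.22.

```python
def cpk(mean_value, sigma, lsl, usl):
    """Cpk = min(Cpu, Cpl)."""
    return min((usl - mean_value) / (3 * sigma), (mean_value - lsl) / (3 * sigma))


# Hypothetical values, for illustration only.
process_mean = 504.0
sigma_within = 5.0

cpk_before = cpk(process_mean, sigma_within, lsl=480.0, usl=520.0)  # limits centred on 500
cpk_after = cpk(process_mean, sigma_within, lsl=484.0, usl=524.0)   # limits centred on 504
cp = (520.0 - 480.0) / (6 * sigma_within)                           # same width in both cases

print(round(cpk_before, 3), round(cpk_after, 3), round(cp, 3))
```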

Below is the classification for the Ppk values:

If Ppk < 1.00, the performance is bad,
If Ppk is between 1.00 and 1.33, the performance is typical,
If Ppk is between 1.33 and 1.67, the performance is average,
If Ppk is between 1.67 and 2.00, the performance is good,
If Ppk is above 2.00, the performance is world class.
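The classification above is easy to turn into a small helper. How the exact boundary values (1.00, 1.33, 1.67, 2.00) are assigned is an assumption in this sketch, since the bands above do not say which side a boundary value falls on.

```python
def classify_ppk(ppk):
    """Map a Ppk value to the performance band listed above."""
    if ppk < 1.00:
        return "bad"
    if ppk < 1.33:
        return "typical"
    if ppk < 1.67:
        return "average"
    if ppk <= 2.00:
        return "good"
    return "world class"


for value in (0.95, 1.2, 1.5, 1.8, 2.5):
    print(value, classify_ppk(value))
```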

Now, let's go back and suppose that our data is not normally distributed; then we have to transform the data using the Johnson transformation / Box-Cox transformation.

[Approach: Stat --> Quality Tools --> Capability Analysis --> Normal (Transform: Johnson transformation)].
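To give a feel for what a Box-Cox transformation does, here is a simplified sketch. It assumes the usual definition, (x^λ - 1) / λ with ln(x) at λ = 0, and picks λ from a coarse grid by maximising the normal log-likelihood; Minitab's own search, rounding and scaling may differ. The function names and the log-normal test data are made up.

```python
import math
from statistics import NormalDist, pvariance


def box_cox(values, lam):
    """Box-Cox transform of positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


def best_lambda(values, candidates=None):
    """Pick the lambda from a grid that maximises the normal log-likelihood."""
    if candidates is None:
        candidates = [k / 10 for k in range(-20, 21)]  # -2.0 to 2.0 in steps of 0.1
    n = len(values)
    log_sum = sum(math.log(v) for v in values)

    def log_likelihood(lam):
        transformed = box_cox(values, lam)
        return -n / 2 * math.log(pvariance(transformed)) + (lam - 1) * log_sum

    return max(candidates, key=log_likelihood)


# Made-up log-normal data: its logarithm is normal, so lambda should come out near 0.
lognormal = [math.exp(NormalDist().inv_cdf((i - 0.5) / 50)) for i in range(1, 51)]
lam = best_lambda(lognormal)
print("chosen lambda:", lam)
```

If the data is transformed this way, the specification limits have to be transformed with the same λ before the capability indices are calculated. The transformation only works for positive values.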

I hope you have understood how to perform capability analysis for the data.

Most readers will be satisfied by this point, but some of you may have a doubt like 'we have investigated and eliminated all the outliers in the process through appropriate CAPA, but the process performance is still only marginal; what can we do to make it good / world class ?'.

The answer is that the process can be made more robust by looking for other patterns in the control chart and investigating them further, using the additional control chart tests.
Note: While identifying the outliers we used only the 1st test, but to make the process more robust the first 4 tests should be performed. Any patterns that are highlighted may point to special cause variations, which need to be identified during investigation and addressed through appropriate CAPA. If no patterns are observed, it can be concluded that the variation in the yields is largely due to common cause variations, which need to be identified and addressed.
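As an illustration of the first 4 tests, the sketch below checks the Individuals data for the patterns as they are commonly defined: 1 point beyond 3 sigma, 9 points in a row on the same side of the center line, 6 points in a row all increasing or all decreasing, and 14 points in a row alternating up and down. These run lengths are assumed defaults, so check the settings in your own Minitab installation. The function name and data are hypothetical.

```python
from statistics import mean


def special_cause_tests(data):
    """Return {test number: [1-based point numbers where the test fails]}."""
    n = len(data)
    center = mean(data)
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    sigma = mean(moving_ranges) / 1.128  # assumption: MR-bar / d2
    fails = {1: [], 2: [], 3: [], 4: []}

    # Test 1: one point more than 3 sigma from the center line.
    fails[1] = [i + 1 for i, x in enumerate(data) if abs(x - center) > 3 * sigma]

    # Test 2: 9 points in a row on the same side of the center line.
    for end in range(8, n):
        window = data[end - 8:end + 1]
        if all(x > center for x in window) or all(x < center for x in window):
            fails[2].append(end + 1)

    # Test 3: 6 points in a row, all increasing or all decreasing.
    for end in range(5, n):
        window = data[end - 5:end + 1]
        steps = [b - a for a, b in zip(window, window[1:])]
        if all(s > 0 for s in steps) or all(s < 0 for s in steps):
            fails[3].append(end + 1)

    # Test 4: 14 points in a row, alternating up and down.
    for end in range(13, n):
        window = data[end - 13:end + 1]
        steps = [b - a for a, b in zip(window, window[1:])]
        if all(s1 * s2 < 0 for s1, s2 in zip(steps, steps[1:])):
            fails[4].append(end + 1)

    return fails


# Made-up examples of a drift, a zig-zag and a sustained shift.
drift = special_cause_tests([500.0, 501.0, 502.0, 503.0, 504.0, 505.0, 506.0])
zigzag = special_cause_tests([500.0, 502.0] * 7)
shift = special_cause_tests([505.0] * 3 + [499.0] * 9)
print(drift[3], zigzag[4], shift[2])
```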

Hi! I am Ajay Kumar Kalva, Currently serving as the CEO of this site, a tech geek by passion, and a chemical process engineer by profession, i'm interested in writing articles regarding technology, hacking and pharma technology.