A detailed description of the normal distribution of random errors for the processing of instrumentation data
1. Distribution curve
y: probability density, which indicates how likely a measured value is to appear near x. The larger y is, the more likely the value is to appear. x: measured value.
μ, population mean: the average of an infinite number of measurements. It corresponds to the abscissa of the highest point of the curve and represents the central tendency of the infinite data set. If there is no systematic error, μ is the true value.
σ, population standard deviation: the distance from the population mean to either of the two inflection points of the curve. It characterizes the dispersion of the data. When σ is small, the data are concentrated and the curve is tall and narrow; when σ is large, the data are scattered and the curve is low and wide.
x − μ: random error. If x − μ is taken as the abscissa, the highest point of the curve lies at 0.
μ and σ are the two parameters of the curve, so the curve is denoted N(μ, σ²). It can be expressed by a function.
2. Probability density function
$y = f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
3. Random error regularity
(1) The probability of occurrence of small errors is greater than that of large errors, and the probability of occurrence of particularly large errors is extremely small.
(2) The probability of occurrence of positive and negative errors is equal.
4. Standard normal distribution
The abscissa is represented by u, defined as:
$u = \frac{x - \mu}{\sigma}$
That is, the random error is expressed in units of σ.
The function becomes:
$y = \varphi(u) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{u^2}{2}}$
The shape of the curve is therefore independent of the value of σ, and the curves for different σ merge into one.
This curve is denoted N(0, 1).
Interval probability of random error
1. Definition
The probability that a random error falls within a given interval is represented by the area under the normal distribution curve over that interval.
The total area under the normal distribution curve represents the sum of the probabilities of all measured values, that is, 100%, or 1:
$\int_{-\infty}^{+\infty} \varphi(u)\,du = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-\frac{u^2}{2}}\,du = 1$
In practice, the area under the curve is calculated for different values of u and compiled into a probability integral table for direct lookup.
2. Calculation formula
$\text{Probability} = \text{area} = \frac{1}{\sqrt{2\pi}} \int_{0}^{u} e^{-\frac{u^2}{2}}\,du$
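The area that a probability integral table lists can also be computed directly. The following is a minimal sketch, assuming the table gives the one-sided area from 0 to u under N(0, 1); the function names are hypothetical and are not part of the original text.

```python
import math

def area_zero_to_u(u: float) -> float:
    """Area under the standard normal curve between 0 and u (u >= 0)."""
    # The error function erf is related to the normal integral by
    # integral_0^u phi(t) dt = 0.5 * erf(u / sqrt(2)).
    return 0.5 * math.erf(u / math.sqrt(2.0))

def prob_within(u: float) -> float:
    """Probability that the random error lies within +/- u standard deviations."""
    return 2.0 * area_zero_to_u(abs(u))

for u in (1.0, 2.0, 3.0):
    print(f"|u| <= {u:.0f}: {prob_within(u):.4f}")
```

Under these assumptions the sketch gives roughly 68.3%, 95.5% and 99.7% for random errors within ±1σ, ±2σ and ±3σ.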
Statistical processing of limited data
The distribution law of random errors provides the theoretical basis for data processing, but it applies to an infinite number of measurements. In practice only a limited number of measurements are made; they are treated as a random selection from an infinite population and are called a sample. The number of measurements in the sample is called the sample size, denoted n.
Trends in data → Representation of the central tendency of data
1. Arithmetic mean
The average of the n measured values:
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
It is the best estimate of the population mean. For a finite number of measurements, the measured values cluster around the arithmetic mean; for an infinite number of measurements, i.e. n → ∞, $\bar{x}$ → μ.
2. Median M
The data are arranged in order of magnitude, and the value in the middle is called the median M.
When n is odd, the median is the middle value; when n is even, it is the average of the two middle values.
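The two measures of central tendency can be sketched as follows. The function names and the data are hypothetical and serve only to illustrate the odd and even cases of the median.

```python
def arithmetic_mean(values):
    """Sum of the measured values divided by their number n."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical parallel measurements (n = 4, an even number)
data = [10.1, 10.3, 10.2, 10.6]
print(f"mean = {arithmetic_mean(data):.2f}, median = {median(data):.2f}")
```

For data containing one unusually large or small value, the median is affected less than the mean, which is why both are reported here.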
Trends in data → Representation of data dispersion
1. Range R (or full range): the difference between the largest value (x_max) and the smallest value (x_min) in a set of parallel measurements.
$R = x_{\max} - x_{\min}$
2. Average deviation: the average of the absolute values of the deviations of the measured values from the mean.
Absolute deviation: $d_i = x_i - \bar{x}$ (i = 1, 2, ..., n)
Average deviation: $\bar{d} = \frac{1}{n} \sum_{i=1}^{n} |d_i|$
Relative average deviation: $\frac{\bar{d}}{\bar{x}} \times 100\%$
3. Standard deviation S
Standard deviation: $S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
The relative standard deviation (RSD), also called the coefficient of variation (CV), is generally expressed as a percentage.
Relative standard deviation: $\text{RSD} = \frac{S}{\bar{x}} \times 100\%$
Degrees of freedom f: f = n − 1
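The dispersion measures above can be computed together. This is a minimal sketch with a hypothetical function name and hypothetical data; it uses n − 1 in the denominator of S, as is usual for a sample.

```python
import math

def dispersion(values):
    """Return range, average deviation, standard deviation S and RSD (%)."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]           # d_i = x_i - mean
    value_range = max(values) - min(values)            # R
    avg_dev = sum(abs(d) for d in deviations) / n       # average deviation
    s = math.sqrt(sum(d * d for d in deviations) / (n - 1))  # f = n - 1
    rsd = s / mean * 100                                # relative standard deviation
    return {"R": value_range, "avg_dev": avg_dev, "S": s, "RSD_percent": rsd}

# Hypothetical parallel measurements
result = dispersion([10.1, 10.3, 10.2, 10.6])
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

The standard deviation weights large deviations more heavily than the average deviation does, so it responds more strongly to a single scattered value.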
Confidence interval for the mean → Definition
1. Confidence
Confidence indicates the degree of certainty with which a judgment is made. Symbol: P.
We sometimes say "I am 80% sure of this". The "80% sure" here is the degree of confidence, which is in fact the probability that an event occurs.
Commonly used confidence levels: P = 0.90, P = 0.95; or P = 90%, P = 95%.
2. Confidence interval
The range, calculated from the t distribution and centered on an individual measured value, that contains the true value at a given confidence level is called the confidence interval of the individual measured value.
1. Definition of t
$t = \frac{x - \mu}{S}$, in contrast to $u = \frac{x - \mu}{\sigma}$.
2. t distribution curve
(1) t distribution curve: the ordinate is the probability density and the abscissa is t. For a limited number of measurements, the random error follows the t distribution rather than the normal distribution.
(2) Relationship with the normal distribution: the shape of the t distribution curve varies with the degrees of freedom f. As f → ∞, the t distribution curve becomes the normal distribution curve.
[Figure: t distribution curve]
[t distribution value table]
The table shows that as f → ∞, S → σ and t becomes u.
In practice, t and u are already very close when f = 20.
3. Confidence interval for the mean:
(1) Representation method:
$\mu = \bar{x} \pm \frac{tS}{\sqrt{n}}$
(2) Meaning: the interval, centered on the mean, that contains the population mean at a given confidence level.
(3) Calculation method:
1 Calculate $\bar{x}$, S and n from the measured values.
2 Look up t in the t distribution table according to the required confidence level and the value of f.
3 Substitute into the formula and calculate.
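The three steps can be sketched as a small function. This assumes the usual form of the interval, mean ± t·S/√n, and takes t as an input read from a t distribution table rather than computing it. The function name and the numbers are illustrative only.

```python
import math

def confidence_interval(mean, s, n, t_table):
    """Confidence interval of the mean: mean +/- t * S / sqrt(n).

    t_table must be looked up for the chosen confidence level and f = n - 1.
    """
    half_width = t_table * s / math.sqrt(n)
    return mean - half_width, mean + half_width

# Illustrative values: mean 30.51, S = 0.05, n = 6,
# with t = 2.57 for 95 % confidence and f = 5
low, high = confidence_interval(30.51, 0.05, 6, 2.57)
print(f"mu = 30.51 +/- {(high - low) / 2:.2f}  ->  [{low:.2f}, {high:.2f}]")
```

Because the half-width shrinks with √n, quadrupling the number of measurements only halves the width of the interval, apart from the accompanying change in t.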
Significance test → Comparison of average and standard value
There are two commonly used methods: t test and F test.
Two situations are often encountered in analysis: the mean measured for a standard sample disagrees with its standard value, or the means of two sets of measurements disagree. The first calls for a comparison of the mean with the standard value, the second for a comparison of the two means.
1. Comparison method
A standard sample is measured several times, and a t test is then used to check whether the mean of the results differs significantly from the standard value of the standard sample.
2. Calculation method
1 Calculate t_calc:
$t_{\text{calc}} = \frac{|\bar{x} - \mu|}{S} \sqrt{n}$
2 Look up t_table in the t distribution table according to the confidence level (usually 95%) and the degrees of freedom f.
3 Compare t_calc with t_table. If t_calc > t_table, the measured mean lies outside the 95% probability interval centered on the true value, so the mean differs significantly from the true value and a systematic error is considered to exist.
Example: The determination of CaO content in a laboratory gave the following results: $\bar{x}$ = 30.51%, S = 0.05, n = 6. The standard value of the CaO content of the standard sample is 30.43%. Is there a systematic error in this determination? (Confidence level 95%)
Solution: $t_{\text{calc}} = \frac{|30.51 - 30.43|}{0.05} \sqrt{6} = 3.92$
From the table: at 95% confidence and f = 5, t_table = 2.57. Hence t_calc > t_table.
Conclusion: there is a systematic error in this determination.
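The CaO example can be checked numerically. This is a minimal sketch, assuming the test statistic is |mean − μ|/S · √n; the function name is hypothetical and the table value 2.57 is the one quoted in the example.

```python
import math

def t_calculated(mean, standard_value, s, n):
    """t statistic for comparing a measured mean with a standard value."""
    return abs(mean - standard_value) / s * math.sqrt(n)

t_calc = t_calculated(mean=30.51, standard_value=30.43, s=0.05, n=6)
t_table = 2.57  # 95 % confidence, f = n - 1 = 5, as quoted in the example

print(f"t_calc = {t_calc:.2f}, t_table = {t_table}")
if t_calc > t_table:
    print("Significant difference: a systematic error is indicated.")
else:
    print("No significant difference at this confidence level.")
```

With these inputs the sketch reproduces the value 3.92 given in the solution.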
Significance test → Comparison of the two groups
1. Comparison method
A sample is measured by two methods, giving $\bar{x}_1$, $S_1$, $n_1$ and $\bar{x}_2$, $S_2$, $n_2$. The F test and then the t test are used to determine whether there is a significant difference between the two sets of data.
2. Calculation method
(1) Comparison of precision, the F test:
1 Calculate F_calc: $F_{\text{calc}} = \frac{S_{\text{large}}^2}{S_{\text{small}}^2} > 1$
2 Look up the corresponding F_table value in the F table according to the degrees of freedom of the two methods.
[Table 2-2 95% confidence level (α = 0.05) one-sided test F values (partial)]
3 If F_calc < F_table, the difference between $S_1$ and $S_2$ is not significant, and the t test is then used to check whether the means differ significantly. If F_calc > F_table, the difference between $S_1$ and $S_2$ is significant.
(2) Comparison of the means:
1 Calculate t_calc: $t_{\text{calc}} = \frac{|\bar{x}_1 - \bar{x}_2|}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
If there is no significant difference between $S_1$ and $S_2$, take $S_{\text{small}}$ as S.
2 Look up t_table in the t value table with degrees of freedom f = $n_1$ + $n_2$ − 2.
3 If t_calc > t_table, there is a significant difference between the two groups.
Example: Na₂CO₃ samples were measured by two methods with the following results:
Method 1: $\bar{x}_1$ = 42.34, $S_1$ = 0.10, $n_1$ = 5.
Method 2: $\bar{x}_2$ = 42.44, $S_2$ = 0.12, $n_2$ = 4.
Determine whether the two results differ significantly.
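The example can be worked through with a short sketch. Several things here are assumptions and not taken from the text: the function names, the use of the smaller standard deviation as the common S (a pooled standard deviation is another common choice), and the critical values 6.59 and 2.36, which are standard table values for 95% confidence.

```python
import math

def f_calculated(s1, s2):
    """F = S_large^2 / S_small^2, so F is never below 1."""
    s_large, s_small = max(s1, s2), min(s1, s2)
    return (s_large / s_small) ** 2

def t_two_means(mean1, n1, mean2, n2, s):
    """t statistic for comparing two means with a common standard deviation s."""
    return abs(mean1 - mean2) / s * math.sqrt(n1 * n2 / (n1 + n2))

mean1, s1, n1 = 42.34, 0.10, 5   # method 1
mean2, s2, n2 = 42.44, 0.12, 4   # method 2

F_TABLE = 6.59  # assumed table value: 95 %, f_large = 3, f_small = 4
T_TABLE = 2.36  # assumed table value: 95 %, f = n1 + n2 - 2 = 7

f_calc = f_calculated(s1, s2)
print(f"F_calc = {f_calc:.2f}, F_table = {F_TABLE}")

if f_calc < F_TABLE:
    # Precisions do not differ significantly, so the means may be compared.
    s = min(s1, s2)  # assumption: smaller S used as the common S
    t_calc = t_two_means(mean1, n1, mean2, n2, s)
    print(f"t_calc = {t_calc:.2f}, t_table = {T_TABLE}")
    print("significant difference" if t_calc > T_TABLE else "no significant difference")
else:
    print("Precisions differ significantly; this t test is not applied.")
```

Under these assumptions F_calc is about 1.44 and t_calc about 1.49, both below their critical values, so the two methods would not differ significantly.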
Outliers
1. Definition
In a set of parallel measurements, an individual value sometimes lies far from the others; such a value is called an outlier.
Whether a measured value is an outlier cannot be decided simply by looking at the data and noting that it is far from the rest; it must be determined by calculation and comparison. The method used here is the Q test.
2. Inspection method
(1) Calculate Q_calc: $Q_{\text{calc}} = \frac{|x_{\text{outlier}} - x_{\text{nearest}}|}{x_{\max} - x_{\min}}$
That is, the difference between the outlier and its nearest neighbor is divided by the range.
(2) Comparison: look up Q_table according to the number of measurements n and the confidence level. If Q_calc > Q_table, the outlier should be discarded; otherwise it is retained.
Table 2-3 Critical Q values at the 90% confidence level
Number of data (n)   3     4     5     6     7     8     9     10    ∞
Q90%                 0.94  0.76  0.64  0.56  0.51  0.47  0.44  0.41  0.00
Example: Determining the concentration of a solution gave the following results: 0.1014, 0.1012, 0.1016, 0.1025. Should 0.1025 be discarded (confidence level 90%)?
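The example can be evaluated with a short sketch of the Q test. The function name is hypothetical, and the critical value 0.76 is the Q90% entry for n = 4 from Table 2-3; it is passed in rather than looked up.

```python
def q_calculated(values, suspect):
    """Q = |suspect - nearest neighbor| / range.

    The suspect value must be the smallest or the largest of the data.
    """
    ordered = sorted(values)
    if suspect == ordered[-1]:
        nearest = ordered[-2]
    elif suspect == ordered[0]:
        nearest = ordered[1]
    else:
        raise ValueError("suspect must be the minimum or maximum value")
    return abs(suspect - nearest) / (ordered[-1] - ordered[0])

data = [0.1014, 0.1012, 0.1016, 0.1025]
q_table = 0.76  # Q90% for n = 4, from Table 2-3

q_calc = q_calculated(data, suspect=0.1025)
print(f"Q_calc = {q_calc:.2f}, Q_table = {q_table}")
print("discard 0.1025" if q_calc > q_table else "retain 0.1025")
```

Here Q_calc = 0.0009 / 0.0013, about 0.69, which is below 0.76, so on this reading 0.1025 would be retained.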
Method selection
The analytical method is chosen according to the composition of the sample and the content of the component to be determined.
Determination of major components: gravimetric and titrimetric methods. High accuracy but low sensitivity.
Determination of trace components: instrumental analysis. High sensitivity but lower accuracy.
Improving accuracy
1. Reduce measurement error
Mass and volume are measured during an analysis. To ensure the accuracy of the results, the measurement error must be kept small.
Example: Weighing is a critical step in gravimetric analysis, so the weighing error should be minimized.
Requirement: the relative error of weighing must not exceed 0.1%.
Generally, the reading error of an analytical balance is ±0.0001 g. Because a sample mass is obtained from two readings, the absolute error can reach ±0.0002 g, so the sample mass must be at least 0.2 g to keep the relative weighing error within 0.1%.
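The 0.2 g figure follows from a one-line calculation. The sketch below assumes that a sample mass is obtained from two balance readings, each uncertain by ±0.0001 g, so that the absolute error can reach ±0.0002 g; the function names are hypothetical.

```python
def relative_error_percent(absolute_error_g, mass_g):
    """Relative weighing error in percent."""
    return absolute_error_g / mass_g * 100

def minimum_mass_g(absolute_error_g, max_relative_error_percent):
    """Smallest sample mass that keeps the relative error within the limit."""
    return absolute_error_g / (max_relative_error_percent / 100)

reading_error_g = 0.0001
absolute_error_g = 2 * reading_error_g  # assumption: two readings per weighing

print(f"minimum mass = {minimum_mass_g(absolute_error_g, 0.1):.1f} g")
print(f"relative error at 0.2 g = {relative_error_percent(absolute_error_g, 0.2):.2f} %")
```

The same relation applies to volume measurements once the reading error of the burette or pipette is known.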
2. Increase the number of parallel measurements to reduce random errors
Increasing the number of parallel measurements reduces random error, but too many measurements bring little further benefit and only increase the workload. In general, 4 to 6 parallel measurements are sufficient.
3. Eliminate systematic errors during the measurement process
3.1 Inspection method: comparison method
(1) Control test: a standard sample whose composition is close to that of the sample is measured, and the result is compared statistically with the standard value to determine whether there is a systematic error.
(2) Comparative test: the same sample is measured by both a standard method and the selected method, and the results are tested statistically to determine whether there is a systematic error.
(3) Addition method: weigh out two equal portions of the sample, add a known amount of the component to be determined to one of them, and analyze both portions in parallel. Whether the added amount is recovered quantitatively shows whether there is a systematic error. This is also called a recovery experiment.
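The recovery in the addition method can be expressed as a percentage. The text does not give a formula, so the sketch below assumes the commonly used definition, recovery = (amount found with addition − amount found without addition) / amount added × 100%. The function name and the numbers are hypothetical.

```python
def recovery_percent(found_with_addition, found_without_addition, amount_added):
    """Percentage of the added component that is recovered by the analysis."""
    return (found_with_addition - found_without_addition) / amount_added * 100

# Hypothetical results, all in the same unit (for example mg)
recovery = recovery_percent(
    found_with_addition=1.46,
    found_without_addition=0.98,
    amount_added=0.50,
)
print(f"recovery = {recovery:.1f} %")
```

A recovery close to 100% suggests that no large systematic error is present; how close is close enough depends on the method and the content level.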
3.2 Elimination method
(1) Blank test: carry out the analysis by the same steps and under the same conditions but without the sample. The result is the blank value, which is subtracted from the sample result. This corrects for impurities introduced by reagents, distilled water and containers.
(2) Instrument calibration: calibrate weights, pipettes, etc. to eliminate systematic errors caused by the instruments.
(3) Correct the result by reference to other methods.
Http://news.chinawj.com.cn Editor: (Hardware Business Network Information Center) http://news.chinawj.com.cn