Sensors, Sensor Fusion are Keys to Defect Detection Design
By Tenner Lee for Mouser Electronics
Published August 2, 2021
Defect detection is the process of finding and identifying properties that do not fall within specified parameters, including time-dependent deviations, physical anomalies, and group-process deviations. Now more than ever, designing optimized defect-detection systems is imperative. For example, a poorly optimized defect-detection system that allows a 1-percent increase in product defects can easily cause millions of dollars in losses through product recalls, product remediation, or downstream system failures. Conversely, a defect-detection system that cannot maintain specificity generates excess false alarms, which can bog down production lines and constrain process flow.
Using a hypothetical case about detecting oil pipeline defects, the following explores considerations for designing and optimizing sensor systems and choosing sensor fusion approaches.
Defect Detection Scenario: Oil Pipelines
Detecting defects in oil pipelines is a complex, expensive, and demanding process that has steep requirements, encounters multiple environmental variables, and includes a broad spectrum of defects that can occur:
- Environmental and natural variables, such as geographic location, scale (think the expanse of Alaska and equipment sized well beyond humans), the corrosive nature of crude oil, and the inability to control the environment
- Flow requirements, such as volume, level, velocity, pressure, and temperature
- The number and types of defects that can occur, such as cracks, delamination, erosion, corrosion, vibration, utility flow, utility temperature, and pressure
The potential consequences of poor design are catastrophic, including toxic spills, production halts, health and safety issues, and environmental degradation. False positives also incur steep costs: remedying each suspected defect requires significant planning, process deviations or downtime, and multiple crews. Designing for sensor requirements, optimizing sensor design, and choosing the best sensor fusion processing approach are critical. Understanding the tradeoffs and asking the right questions can maximize defect-detection system efficiency and effectiveness.
Defect Detection Design
Designing for Sensor Requirements
In defect-detection systems, sensors provide either information redundancy or new, independent information. Therefore, sensor specifications must be considered within the framework of overall system requirements (a screening sketch in code follows the list):
- Resolution: Do sensors meet resolution and accuracy requirements?
  - Can sensors resolve defects to 0.5mm, if needed?
  - Can sensors resolve defect differences between 0.5mm and 0.6mm?
- Reliability: Are sensor results reliable enough for operational requirements?
  - Are sensors reliable for consistent acquisition across different environmental conditions?
  - Are sensor results consistent (low variance)?
- System: Do sensors provide the requisite data to meet system goals?
  - Can the system resolve the task given the information available from sensors?
- Hardware: Do sensors meet hardware requirements?
  - Power? Weight? Size? Timing?
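To make the checklist concrete, here is a minimal Python sketch that screens candidate sensors against resolution, repeatability, and hardware budgets. The `SensorSpec` fields, the candidate sensors, and all numeric limits are hypothetical values chosen for illustration, not from any real datasheet.

```python
from dataclasses import dataclass

@dataclass
class SensorSpec:
    """Hypothetical sensor datasheet values used to screen candidates."""
    name: str
    resolution_mm: float      # smallest resolvable feature
    repeatability_mm: float   # spread of repeated measurements (1-sigma)
    power_w: float
    weight_kg: float

# Illustrative system requirements matching the checklist above
MIN_RESOLUTION_MM = 0.5   # must resolve 0.5mm defects
MAX_POWER_W = 5.0
MAX_WEIGHT_KG = 1.0

def meets_requirements(s: SensorSpec) -> bool:
    """Screen a candidate sensor against resolution and hardware budgets.

    A sensor that barely resolves 0.5mm cannot reliably separate 0.5mm
    from 0.6mm defects, so we also demand margin via repeatability.
    """
    resolves = s.resolution_mm <= MIN_RESOLUTION_MM
    separates = s.repeatability_mm <= 0.1  # needed to tell 0.5mm from 0.6mm
    fits_budget = s.power_w <= MAX_POWER_W and s.weight_kg <= MAX_WEIGHT_KG
    return resolves and separates and fits_budget

candidates = [
    SensorSpec("optical-a", 0.3, 0.05, 2.0, 0.4),
    SensorSpec("ultrasonic-b", 0.6, 0.02, 1.0, 0.2),
]
for s in candidates:
    print(s.name, "OK" if meets_requirements(s) else "fails requirements")
```

In practice, each boolean check would come from the system requirements document rather than hard-coded constants, but the screening structure stays the same.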
Optimize Sensor Design
Optimizing sensor design also means balancing system complexity, redundancy needs, and system interactions. In most cases, minimizing the number of sensors within the system helps minimize system complexity and cost. Therefore, optimizing sensor design requires trade-offs between the number of sensors and the system's overall complexity (Figure 1). This trade-off might be necessary for several reasons beyond cost; for example, saturation, scalability, and system constraints all play a role.

Figure 1: Notional trade-off between the number of sensors and overall system complexity. Balance must be achieved among multiple parameters. Systems generally cannot scale indefinitely without degradation within a desired parameter space. In other words, systems cannot do everything. (Note: The function behavior shown depends on application and task.) (Source: Author)
Generally, trade-offs fall into three categories:
- Cost: Does the system meet cost-effectiveness metrics?
  - How often must the system be calibrated and maintained?
  - Can a simpler system meet the same performance threshold as the current system?
- Information: Does the system capture everything needed to assess performance accurately?
  - Is the information redundant?
  - Is the information accurate?
  - Is the information specific enough for my application?
- Deployment: How deployable is the system?
  - Is the system reliable?
  - Is the system easy to use? Does it have redundancies for proper resolution?
  - Can the system operate across its designed operating space? For example, is it deployable across Alaska?
Fuse Sensor Data
Sensor fusion is the combination of data from disparate sensors or sources of information. It is used to address redundancy issues and provide additional specificity. In our defect-detection system, sensor fusion would provide both redundancy and specificity. Consider these examples of detecting cracks and delamination in a section of the oil pipeline (a minimal decision-rule sketch follows these examples):
- Detecting cracks: The image received from an optical sensor can determine the size of a crack and its extent across the pipeline surface. However, the depth of the crack, which can be more indicative of a defect, cannot easily be determined from the image alone. In this case, it would be ideal to fuse sensor data from, say, ultrasonic pulse reflections to measure the crack depth and verify that a defect is present.
- Detecting delamination: The image received from an optical sensor can be degraded, making a fault determination unclear. Fusing data from another camera provides redundancy, improving our estimate and helping us classify the fault.
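Below is a minimal sketch of the two fusion roles described above, assuming hypothetical thresholds and sensor outputs: an AND rule that combines optical crack length with ultrasonic depth (specificity), and a simple average of two camera confidence scores (redundancy). The function names and threshold values are illustrative only.

```python
def fuse_crack_measurements(optical_length_mm: float,
                            ultrasonic_depth_mm: float,
                            length_threshold_mm: float = 0.5,
                            depth_threshold_mm: float = 0.2) -> bool:
    """Declare a defect only when both modalities agree.

    The optical sensor gives surface extent; the ultrasonic echo gives
    depth, which the image alone cannot provide. Requiring both reduces
    false alarms from surface scratches with negligible depth.
    """
    return (optical_length_mm >= length_threshold_mm
            and ultrasonic_depth_mm >= depth_threshold_mm)

def fuse_redundant_confidences(p_cam_a: float, p_cam_b: float) -> float:
    """Redundancy case: combine two independent camera confidence scores.

    Averaging is the simplest fusion; a product of likelihoods or a
    learned combiner could replace it.
    """
    return 0.5 * (p_cam_a + p_cam_b)

print(fuse_crack_measurements(0.7, 0.3))     # True: both modalities agree
print(fuse_redundant_confidences(0.6, 0.8))  # 0.7
```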
Because of sensor and algorithm limitations, a trade-off must be made in which a percentage of detections will be false positives (type 1 errors) or false negatives (type 2 errors) (Figure 2). This trade-off can be mitigated by fusing more data from our sensors or by using better algorithms.

Figure 2: Example of a system trade-off between false positives and false negatives using a single feature space (crack length). Including additional features from other sensors can provide separability and improve system performance. Thresholds are a key optimization parameter; using the Neyman-Pearson criterion or other techniques to find optimal thresholds is important but ultimately depends on system design and requirements. (Source: Author)
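The sketch below illustrates the Figure 2 trade-off on synthetic data. With crack length as the single feature, sweeping the decision threshold and picking the smallest one whose false-positive rate stays under a target bound is a Neyman-Pearson-style choice. The Gaussian class distributions and the 1-percent bound are assumptions for illustration, not measured values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic crack-length feature for the two classes in Figure 2
healthy = rng.normal(loc=0.3, scale=0.1, size=10_000)    # no defect
defective = rng.normal(loc=0.6, scale=0.1, size=10_000)  # defect

def rates(threshold: float):
    """False-positive and false-negative rates for 'defect if length >= t'."""
    fpr = np.mean(healthy >= threshold)
    fnr = np.mean(defective < threshold)
    return fpr, fnr

# Neyman-Pearson-style choice: maximize detection subject to FPR <= alpha.
# For a monotone rule this is the smallest threshold meeting the bound.
alpha = 0.01
thresholds = np.linspace(0.0, 1.0, 1001)
feasible = [t for t in thresholds if rates(t)[0] <= alpha]
t_star = min(feasible)
fpr, fnr = rates(t_star)
print(f"threshold={t_star:.3f}, fpr={fpr:.4f}, fnr={fnr:.4f}")
```

Note how bounding the false-positive rate forces a nonzero false-negative rate here; only better features or more sensors improve both at once.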
By fusing more sensor data, the amount of data available to the system increases. With the additional data, the dimensionality of our feature space grows, and we can use it to make a better decision. More information generally allows for more accurate decisions, but adding it is not always practical, for multiple reasons (an overfitting sketch follows this list):
- More information correlates to more sensors, which directly conflicts with the goal of minimizing hardware.
- The data and features within the system begin to overlap, limiting the number of usable features.
- Overfitting can occur when we use too many features, making the system brittle and not generalizable.
- Systems cannot scale indefinitely (Figure 1).
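The overfitting point can be demonstrated in a few lines: padding one informative feature with noise features makes training accuracy climb while held-out accuracy stalls or drops. The data here is synthetic, and logistic regression stands in for whatever classifier the system actually uses.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 200

# One informative feature plus progressively more noise features
signal = rng.normal(size=(n, 1))
y = (signal[:, 0] > 0).astype(int)  # label determined by the signal alone

for n_noise in (0, 10, 100):
    X = np.hstack([signal, rng.normal(size=(n, n_noise))])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5,
                                              random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    # With many features and few samples, the train/test gap widens
    print(f"{1 + n_noise:>4} features: train={clf.score(X_tr, y_tr):.2f}, "
          f"test={clf.score(X_te, y_te):.2f}")
```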
To address the sensor fusion constraint and complete the notional system, we can employ one of two methods: a standard approach or a deep-learning (DL) approach. The choice between them depends on the problem set and the resources available to develop the system:
- Problem complexity: DL addresses complex tasks and the fusion of data from many sources (more than 20). Standard approaches can also address complex tasks, but their complexity is usually limited because leakage occurs between processing steps (Table 1).
- Scale: DL can scale much better than standard methods.
- Requirements: Standard approaches are generally easier to use.
Standard Approaches
Standard approaches to optimizing sensor data include using feature selection and minimizing mutual information between feature sets. Similar to the constraint on the number of sensors, it is good practice to minimize the number of features (from sensor data) while maximizing the amount of information in the system.
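As one way to operationalize this, scikit-learn's `mutual_info_classif` scores each feature's mutual information with the defect label; near-duplicate features score almost identically, flagging redundancy. (Scoring mutual information between pairs of features, as described above, is a similar pairwise computation; the label-based score here is a common, simpler proxy.) The features and labels below are synthetic.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(1)
n = 2000

# Hypothetical fused features: crack length, ultrasonic depth, and a
# third feature that is nearly a copy of the first (redundant).
length = rng.normal(0.4, 0.15, n)
depth = rng.normal(0.2, 0.1, n)
length_copy = length + rng.normal(0, 0.01, n)  # redundant with 'length'
X = np.column_stack([length, depth, length_copy])
y = ((length > 0.5) & (depth > 0.25)).astype(int)  # synthetic defect label

# Redundant features carry nearly identical information about the label,
# so one of them adds little and is a candidate for removal.
mi = mutual_info_classif(X, y, random_state=0)
for name, score in zip(["length", "depth", "length_copy"], mi):
    print(f"{name}: MI with label = {score:.3f}")
```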
Determining a defect can be cast as a binary classification problem in which a defect is either detected or not (once the sensor data has been fused). Standard approaches to classification, including machine learning (ML), have pros and cons for each algorithm class. (Here, ML refers to the class of algorithms that do not use neural networks as function approximators.) Understanding these pros and cons is critical. If the decision space is one-dimensional, a simple threshold can suffice, biased such that no type-2 errors are present in the system or vice versa. In more complex cases such as ours, an algorithm must be trained to fit the fused data. In sufficiently complex systems, experts cannot determine decision boundaries by hand; boundary determination should be left to machine-learning algorithms that fit the data.
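Here is a hedged sketch of that workflow: a random forest is trained on synthetic fused features, and the decision threshold on its predicted probability is then biased low so that missed defects (type-2 errors) are traded for extra false alarms. The features, labels, and the 0.2 threshold are illustrative, not recommendations.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(2)
n = 5000

# Fused feature vector per inspection point: [crack length, depth, vibration]
X = rng.normal(size=(n, 3))
y = (0.8 * X[:, 0] + 1.2 * X[:, 1] + 0.3 * rng.normal(size=n) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Bias the operating point against type-2 errors (missed defects) by
# lowering the probability threshold for declaring a defect.
proba = clf.predict_proba(X_test)[:, 1]
y_pred = (proba >= 0.2).astype(int)  # conservative: prefer false alarms
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()
print(f"false positives: {fp}, false negatives (missed defects): {fn}")
```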
To properly design standard algorithms, specifically ML algorithms, use a process flow in which the system requirements determine whether the algorithm performs adequately. This process requires a data framework (Figure 3), which is not discussed here but is implicit in the process flow.

Figure 3: Developing sensor fusion algorithms using machine learning. Systematic processes define standard approaches with validation and testing phases to ensure that the algorithm meets requirement specifications. Each green box represents a set of tasks that should be iterated multiple times. (Source: Author)
Limitations to standard sensor fusion approaches include accounting for algorithm assumptions and the need for expert system design (Table 1).
Table 1: Limitations of standard sensor fusion approaches.

| Limitation | Details |
| --- | --- |
| Complexity | Models have finite complexity and can handle up to a fixed number of dimensions or features before degrading. |
| Algorithm assumptions | Many standard approaches use assumptions that do not reflect the biases or complexities of the task or data. Choose algorithm classes that fit the task at hand. |
| Scalability | Standard algorithms can be difficult to scale properly over complex tasks. |
| Expertise | Include experts during every process step. Expertise can mitigate model assumptions. If you do not understand why an algorithm works the way it does, you have a problem. |
| Hysteresis | Many tasks require hysteresis, which can be ad hoc to implement with standard approaches. |
| Leakage | Algorithms that require multiple steps (e.g., image processing) can suffer from disjoint optimizations during each step. Step 1 might not produce the outputs that step 2 requires for good optimization and generalization. |
Deep Learning Approaches
The choice of DL algorithm or network architecture depends on the data and its representation. Implicit biases of a network allow certain classes of architectures to learn faster and achieve better performance on matching data types. Using convolutional neural networks (CNNs) for images is a commonly cited example. In a CNN, convolution filters take advantage of the structural dependence of the image, where adjacent pixels provide information about the current pixel. Scaled over the entire image, CNNs can perform object detection, segmentation, and other complex image-processing tasks.
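Below is a minimal PyTorch sketch of such a CNN, assuming 64x64 RGB image patches and an untuned two-layer feature extractor; the layer sizes are placeholders, not a recommended architecture.

```python
import torch
import torch.nn as nn

class CrackCNN(nn.Module):
    """Minimal CNN sketch for classifying pipeline image patches as
    defect / no-defect. Layer sizes are illustrative, not tuned."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # local structure
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, 2)  # assumes 64x64 input

    def forward(self, x):
        x = self.features(x)  # convolutions exploit adjacent-pixel structure
        return self.classifier(x.flatten(1))

patch = torch.randn(1, 3, 64, 64)  # one fake 64x64 RGB patch
logits = CrackCNN()(patch)
print(logits.shape)                # torch.Size([1, 2])
```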
DL allows for joint optimization from data to task. Optimizing DL for a particular application relies on neural networks (parameterized functional modules, i.e., computational graphs). In sensor fusion for defect detection, DL can improve accuracy, scalability, and capability, such as detecting more types of defects or, under strict conditions, enabling new applications such as defect descriptions and fault correlations during analysis. Table 2 and Table 3 highlight DL pros and cons.
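One way to picture joint optimization is a two-branch network in which an image branch and a scalar-sensor branch are trained end-to-end through a shared classification head, so feature extraction and fusion are optimized together rather than in disjoint steps. This PyTorch sketch uses made-up input shapes and layer sizes.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    """Sketch of end-to-end sensor fusion: image and scalar-sensor
    branches feed one head, so gradients flow through the whole graph."""
    def __init__(self, n_scalar_sensors: int = 4):
        super().__init__()
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, 32), nn.ReLU(),
        )
        self.sensor_branch = nn.Sequential(
            nn.Linear(n_scalar_sensors, 16), nn.ReLU(),
        )
        self.head = nn.Linear(32 + 16, 2)  # defect / no defect

    def forward(self, image, sensors):
        fused = torch.cat([self.image_branch(image),
                           self.sensor_branch(sensors)], dim=1)
        return self.head(fused)

net = FusionNet()
logits = net(torch.randn(2, 3, 64, 64), torch.randn(2, 4))
print(logits.shape)  # torch.Size([2, 2])
```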
Table 2: DL advantages.

| Pro | Details |
| --- | --- |
| Scalability | DL algorithms have been shown to scale to incredibly complex problems. |
| Speed | Able to infer and compute results quickly. |
| Accuracy | State of the art in accuracy for many pivotal tasks, though not for every task. |
| Feature extraction | Extremely powerful and underappreciated; two defining features of DL are feature extraction and representation learning. |
| Joint optimization | Joint optimization from data to task. |
| Diversity | Able to learn and perform various tasks; DL has been hyped for a reason, but that hype might be overblown. |
Table 3: DL disadvantages.

| Con | Details |
| --- | --- |
| Black box | Inability to characterize and understand the model's behavior. |
| Limited applications | Limited to non-safety-critical systems; do not use for safety-critical systems, or do so only with many redundancies. |
| Data | “Well, duh! There is a reason why DL took over only once datasets became large enough.” –Yann LeCun |
| Compute and resources | Training requires large compute resources and infrastructure, and training is expensive. |
| Generalization | It can be difficult for DL frameworks to generalize properly; test, then test more, and don't stop testing. |
| Design | Implementing design specifications for DL algorithms is not straightforward. |
| Expertise | Must understand the quirks and techniques for dealing with data, as well as theoretical DL foundations. |
DL requires a rigorous workflow, with considerable training and validation steps to ensure proper performance and generalization (Figure 4).

Figure 4: DL requires a rigorous workflow, with considerable training and validation steps to ensure proper performance and generalization. (Source: Author)
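Here is a minimal sketch of one loop through that workflow: train on one split, track loss on a held-out split, and treat a growing train/validation gap as a signal to iterate. The tiny model and random data are placeholders for a real pipeline.

```python
import torch
import torch.nn as nn

# Placeholder model and data; a real workflow would use the fusion
# network and curated datasets discussed above.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

X_train, y_train = torch.randn(512, 4), torch.randint(0, 2, (512,))
X_val, y_val = torch.randn(128, 4), torch.randint(0, 2, (128,))

for epoch in range(10):
    model.train()
    opt.zero_grad()
    loss = loss_fn(model(X_train), y_train)
    loss.backward()
    opt.step()

    # Validation tracks generalization; a growing gap between training
    # and validation loss signals overfitting and triggers another
    # iteration of data collection or architecture changes (Figure 4).
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val)
    print(f"epoch {epoch}: train={loss.item():.3f}, val={val_loss.item():.3f}")
```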
Finally, the availability of data is a significant consideration in choosing a DL system. A lot of data is needed for DL algorithms to generalize and perform well during inference. If no data is available, a DL system will not be viable and should not be used. (For reference, 10,000 samples is not a lot of data.) Table 4 outlines DL data requirements; a quick label-audit sketch follows the table.
Table 4: DL data requirements.

| Requirement | Details |
| --- | --- |
| Representation | Data must be representative of the real world (e.g., minimal domain shift). Data must be accurate. Training data must reflect data at inference time. |
| Density | The number of samples or amount of data must be large. The more data the better, as long as it satisfies diversity and representation requirements. Tail distributions must be captured. |
| Diversity | Data must be diverse and account for all variations of possibilities. Data must be balanced, with proportional samples of each class. Known biases within the data should be removed or mitigated. |
| Adversarial examples | Adversarial examples should be included for robustness. |
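As a small illustration of the balance requirement, the label audit below flags under-represented classes. The 20-percent floor is an arbitrary illustrative threshold; real balance targets come from the task and the defect base rates.

```python
import numpy as np

def audit_labels(y: np.ndarray, min_fraction: float = 0.2) -> None:
    """Quick audit of a label array against a class-balance requirement.

    Thresholds are illustrative; severely skewed classes usually need
    resampling, reweighting, or more data collection before training.
    """
    classes, counts = np.unique(y, return_counts=True)
    fractions = counts / counts.sum()
    for c, f in zip(classes, fractions):
        flag = "" if f >= min_fraction else "  <-- under-represented"
        print(f"class {c}: {f:.1%} of {counts.sum()} samples{flag}")

audit_labels(np.array([0] * 9500 + [1] * 500))  # defects are rare in practice
```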
Conclusion
More than ever, optimized defect-detection systems are imperative because the consequences of failure are catastrophic and the costs to companies and the environment are steep. Choosing sensor hardware and optimizing sensor design are critical, as is choosing the best sensor fusion method. Understanding and weighing key trade-offs is equally important for meeting requirements and optimizing the defect-detection system. Careful consideration of these factors will lead to the design of effective and efficient systems.