
This course introduces the essential concepts and techniques of data mining in the context of semiconductor manufacturing and yield enhancement. We'll begin with data plotting methods such as histograms and with basic statistical definitions: population, sample, probability density function (PDF), and cumulative distribution function (CDF). You'll learn about normal and lognormal distributions, the Central Limit Theorem, and how to assess goodness of fit using the chi-square test. The course then covers two-parameter comparisons using scatter plots and correlation coefficients, followed by cumulative probability plots and box plots for visualizing data distributions and identifying outliers. We'll also discuss the process capability index (Cpk) and the software tools used for yield analysis. Finally, we'll examine common yield analysis techniques, including wafer maps, trend charts, Pareto charts, and tool commonality reports, and we'll look at how different types of data, such as metrology and electrical data, can be correlated to improve yield.
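
To give a flavor of the calculations involved, here is a minimal sketch of the process capability index. It assumes the commonly used definition Cpk = min(USL - mean, mean - LSL) / (3 * sigma), with sigma estimated by the sample standard deviation. The function name `cpk`, the measurement values, and the specification limits are all hypothetical and chosen only for illustration; they do not come from the course material.

```python
import statistics


def cpk(samples, lsl, usl):
    """Estimate Cpk from measurements and lower/upper spec limits.

    Assumes roughly normal data and uses the sample standard
    deviation (n - 1 denominator) as the estimate of sigma.
    """
    mean = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    # Distance from the mean to the nearer spec limit, in units of 3 sigma.
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Hypothetical film-thickness readings (nm) and spec limits, for illustration only.
measurements = [49.8, 50.1, 50.0, 49.9, 50.2, 50.0]
print(f"Cpk = {cpk(measurements, lsl=49.0, usl=51.0):.2f}")
```

A larger Cpk means the distribution sits further inside the specification limits relative to its spread. A process that is off center is penalized because only the nearer limit counts.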