Commit
Completed proofreading the notebook Alpha. May need user review. Will wait for a pull request post release
jaganadhg committed Jan 30, 2022
1 parent 1df3e78 commit e51cf32
Showing 1 changed file with 14 additions and 14 deletions.
28 changes: 14 additions & 14 deletions EGV_Data_exploreer.ipynb
@@ -12,15 +12,15 @@
"\n",
"This Data Set is one of the oldest Microfabrication/Semiconductor process-related open data. The data was released by Eigenvector, a niche industry analytics company. The information was used in a research study and paper published in 1999[1]. The first author Barry M. Wise is one of the company's founding members.\n",
"\n",
"The data was taken from LAM 9600 Metal Etcher[2]. Etching is used in microfabrication to chemically remove layers from the surface of a wafer during manufacturing[3]. The data was released in Matlab format, suitable for analyzing using the PLS Toolbox for Eigenvector. There are three .mat files; MACHINE_Data.mat, OES_DATA.mat, and RFM_DATA.mat. A detailed note on the data and attributes available in reference [1] and [2]. Since the data is in Matlab format, we created a python based parser to convert the data in pandas DataFrames. This parser should enable Data Miners and Data Scientists to play around with Open Source tools like Python or R. \n",
"The data was taken from LAM 9600 Metal Etcher[2]. Etching is used in microfabrication to chemically remove layers from the surface of a wafer during manufacturing[3]. The data was released in Matlab format, suitable for analyzing using the PLS Toolbox[9] developed by Eigenvector. There are three .mat files; MACHINE_Data.mat, OES_DATA.mat, and RFM_DATA.mat. A detailed note on the data and attributes available in reference [1] and [2]. Since the data is in Matlab format, we created a python based parser to convert the data in pandas DataFrames. This parser should enable Data Miners and Data Scientists to play around with data using Open Source tools like Python or R. \n",
"\n",
"### The Eigenvector Etch Data Parser\n",
"\n",
"The Eigenvector Etch Data Parse is developed to read Matlab data files. The parse reads each file and converts the calibration data (sensor data) and test data (sensor data) into a single DataFrame. The parse introduced an additional field in the data 'fault_name', which helps the user identify the normal/calibration wafers and test wafers(with defects). We tested the parser in Python3 environments only; if you are looking for Python2 compatibility, please test and create a bug/pull request as applicable. The source code is released under Apache 2.0 license and is available at https://github.com/jaganadhg/egvsemicon. \n",
"The Eigenvector Etch Data Parser is developed to read Matlab data files published by Eigenvector[2]. The data is from a LAM 9600 Metal Etching Machine and was collected in 1995's. The parser reads each file and converts the calibration data (sensor data) and test data (sensor data) into a single DataFrame. The parser introduced an additional field in the data 'fault_name', which helps the user identify the normal/calibration wafers and test wafers(with defects). We tested the parser in Python3 environments only; if you are looking for Python2 compatibility, please test and create a bug/pull request as applicable. The source code is released under Apache 2.0 license and is available at https://github.com/jaganadhg/egvsemicon.\n",
"\n",
"### Etch Data\n",
"\n",
"The Eigenvector Etch Data is provided in three Matlab vector files MACHINE_Data.mat, OES_DATA.mat, and RFM_DATA.mat. The MACHINE_Data.mat file consists of the engineering variables, time, and the etch recipe steps. The variable/feature categories are pressure, gas flow rate, and power (If you wonder why gas flow is here, it is part of the semiconductor chemistry process and a more significant topic beyond the note!). The OES_DATA.mat file consists of the optical emission spectroscopy (OES) of the plasma. OES description is available in Hitachi's reference pge[4]. The file RFM_DATA.mat contains radio-frequency monitoring (RFM) system to monitor the power and phase relationships of the plasma generator. \n",
"The Eigenvector Etch Data is provided in three Matlab vector files MACHINE_Data.mat, OES_DATA.mat, and RFM_DATA.mat. The MACHINE_Data.mat file consists of the engineering variables, time, and the etch recipe steps. The variable/feature categories are pressure, gas flow rate, and power (If you wonder why gas flow is here, it is part of the semiconductor chemistry process and a more significant topic beyond this note!). The OES_DATA.mat file consists of the optical emission spectroscopy (OES) of the plasma. OES description is available in Hitachi's reference pge[4]. The file RFM_DATA.mat contains radio-frequency monitoring (RFM) system to monitor the power and phase relationships of the plasma generator. \n",
"\n",
"The OES data do not come with sensor names, unlike the Machine Data/Engineering Variable and RFM Data. The data is a field wave_axis which represents wavelengths in nm of peaks. A special note from the data \"Note that this data consists of integrated peak areas at for peaks at 43 wavelengths but looking across the plasma in 3 different locations perpendicular to the overall gas flow in the system.\"[2]\n",
"\n",
@@ -335,7 +335,6 @@
"\n",
"12829 records span across 126 normal/calibration wafers and 20 test (defect induced) wafers. \n",
"\n",
"**Note: The column wafer_names is helpful to join other data sets such as OES and RFM.**\n",
"\n",
"#### OES Data"
],
@@ -844,21 +843,25 @@
{
"cell_type": "markdown",
"source": [
"If you are interested in reading about RFM, an interesting resource is \"RF Technology in Semiconductor Wafer Processing\" [6]. We have provided the sensor and unit mapping separately [7]. The data sets come with reference to the unit of each sensor value to understand the data better. The actual sensor names are masked, and the last two variables represent the wafer name and indicate test or calibration wafers. \n",
"If you are interested in reading about RFM, an interesting resource is \"RF Technology in Semiconductor Wafer Processing\" [6]. \n",
"\n",
"We have provided the sensor and unit mapping separately [7]. The data sets come with reference to the unit of each sensor value to understand the data better. The actual sensor names are masked, and the last two variables represent the wafer name and indicate test or calibration wafers. \n",
"\n",
"#### Combining Data Set\n",
"\n",
"The three DataFrames generated have various records and a caveat in identity column wafer_names. The RFM has 3519, OES has 4786, and Engineering Variables has 12829 records. Value in the identity column wafer_names; values are prefixed by l,s, and r for Engineering Variables, OES, and RFM data, respectively. A derived variable can be generated for identity by relacing the leading alphabet in the wafer_names. \n",
"The three DataFrames generated have various records and a caveat in identity column wafer_names. The RFM has 3519, OES has 4786, and Engineering Variables has 12829 records. Value in the identity column wafer_names; values are prefixed by l,s, and r for Engineering Variables, OES, and RFM data, respectively. A derived variable can be generated for identity by replacing the leading alphabet in the wafer_names. \n",
"\n",
"**Note : The idea of representing OES data as DataFrame may not be the best. We are working towards a better representation.**\n",
"\n",
"\n",
"The idea of representing OES data as DataFrame may not be the best. We are working towards a better representation.\n",
"\n",
"#### Data Mining/Data Science and Next Steps\n",
"\n",
"We are not venturing into any detailed analytics solution in the scope of current notes—the industry practices simple techniques from univariate analysis to employing Deep Learning to solve the problems. From the data description, one can infer the nature of data preprocessing and feature engineering techniques. In the same domain, understanding or active guidance from field processing engineers may benefit you in starting an exciting project. A good starting point will be the original paper [1]. \n",
"\n",
"#### Competing Interests\n",
"\n",
"This notebook is intended to introduce the Egionvector Metal Etch Data Parser[8] and the data [2]. The authors declare that they have no competing interests. The authors declare that no proprietary information related to the authors, affiliated company, or its approach, methodologies, and IPR is discussed in these notes."
"This notebook is intended to introduce the Egionvector Metal Etch Data Parser[8] and the data [2]. The authors declare that no proprietary information related to the authors, affiliated company, or its approach, methodologies, and IPR is discussed in these notes. The authors declare that they have no competing interests."
],
"metadata": {}
},
@@ -882,14 +885,11 @@
"\n",
"[7] https://github.com/jaganadhg/egvsemicon/blob/main/rfm_variable_unit_map.csv\n",
"\n",
"[8] Jaganadh Gopinadhan, “Eigenvector Metal Etch Data Parser - Python”. Zenodo, Jan. 29, 2022. doi: 10.5281/zenodo.5919197."
"[8] Jaganadh Gopinadhan, “Eigenvector Metal Etch Data Parser - Python”. Zenodo, Jan. 29, 2022. doi: 10.5281/zenodo.5919197.\n",
"\n",
"[9] https://eigenvector.com/software/pls-toolbox/"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [],
"metadata": {}
}
],
"metadata": {
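
The "Combining Data Set" cell in this diff says that wafer_names values carry a source-specific leading letter (l, s, r) and that a shared identity can be derived by replacing it. The following is a minimal sketch of that idea, not the parser's own code. The wafer names, the sensor columns (`pressure`, `peak_area`, `rf_power`), the helper `add_wafer_id`, and the per-wafer averaging are all made up for illustration; the real DataFrames come from the parser and have many more columns and rows.

```python
import pandas as pd

# Hypothetical stand-ins for the three parsed DataFrames. Names and values
# are invented; only the l/s/r prefix convention comes from the notebook text.
machine_df = pd.DataFrame(
    {"wafer_names": ["l2901", "l2901", "l2902"], "pressure": [1.0, 2.0, 3.0]}
)
oes_df = pd.DataFrame({"wafer_names": ["s2901", "s2902"], "peak_area": [10.0, 12.0]})
rfm_df = pd.DataFrame({"wafer_names": ["r2901", "r2902"], "rf_power": [5.0, 5.5]})


def add_wafer_id(df: pd.DataFrame, column: str = "wafer_names") -> pd.DataFrame:
    """Return a copy with a 'wafer_id' column: the wafer name minus its first character."""
    out = df.copy()
    out["wafer_id"] = out[column].str[1:]
    return out


machine_df, oes_df, rfm_df = (add_wafer_id(d) for d in (machine_df, oes_df, rfm_df))

# The sources have different record counts per wafer, so this sketch reduces
# the machine data to one row per wafer before joining (an assumption; other
# alignments, such as joining on time, may suit a real analysis better).
machine_per_wafer = machine_df.groupby("wafer_id", as_index=False)["pressure"].mean()

combined = (
    machine_per_wafer
    .merge(oes_df[["wafer_id", "peak_area"]], on="wafer_id")
    .merge(rfm_df[["wafer_id", "rf_power"]], on="wafer_id")
)
print(combined)
```

Stripping the first character keeps the rest of the name intact, so the same `wafer_id` appears in all three frames regardless of which prefix the source used.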
