This repository documents a 5-day workshop conducted by NASSCOM and VSD on SOC design and planning. It covers what was learned over all 5 days from the videos provided by VSD, together with the lab work on the OpenLANE tool provided by NASSCOM.
Author: Arudeep Nath
- Day-1: Inception of open-source EDA, OpenLANE and Sky130 PDK.
- Day-2: Good floorplan vs bad floorplan and introduction to library cells.
- Day-3: Design library cell using Magic Layout and ngspice characterization.
- Day-4: Pre-layout timing analysis and importance of good clock tree.
- Day-5: Final steps for RTL2GDS using tritonRoute and openSTA.
This is the Arduino Leonardo board, consisting of a processor/SOC and various other interconnecting devices and peripherals.
The highlighted part in the above figure is the one block around which the entire VLSI design flow revolves.
The picture below depicts the layout of the entire microcontroller board.
This picture shows the various interconnects, external chips and other devices present. A few features of this layout are:
- The centre part, the processor/SOC, is the layout of the chip that is highlighted on the Arduino board.
- It comprises various other devices such as SRAM, EEPROM and ADCs that are combined and placed to make a microcontroller board.
Our main objective, however, is to design a processor/SOC, so the picture below depicts a QFN-48 package with the chip in the middle, connected to the package by bond wires.
This package comprises a die, I/O pads and a core. Its layout is depicted below:
The core of this SOC consists of two major parts:
- Foundry IPs: A foundry is the factory that implements designs on silicon wafers. Blocks that need the foundry's own expertise to build are termed foundry Intellectual Property (IP).
- Macros: These are blocks of purely digital logic.
We use RISC-V, an Instruction Set Architecture (ISA), to design the SOC. The picture below depicts the flow from RTL to GDS.
Now, let's move on to how software applications connect to the hardware.
Shown below is the entire flow of how the software connects to the hardware.
We take the example of a stopwatch application, implemented using the RISC-V architecture.
Diving deeper into the flow, we observe one more step after the assembler: the use of an HDL (Hardware Description Language).
The HDL description specifies what function the hardware performs with the bit stream produced by the assembler.
The next step is to synthesize the RTL for physical design implementation.
Let's understand the standard RTL to GDSII Flow. This is described below using the flow diagram:
Now, let's take the first stage in the flow:
Synthesis: Converts the RTL to a circuit built from components of the standard cell library (SCL).
The next stage is:
Floor and Power Planning: Floor planning is divided into two categories:
- Chip-Floor Planning: Partition the chip die between different system building blocks and place the I/O Pads.
- Macro-Floor Planning: Dimensions, pin locations, rows definition.
Power Planning: The power network is connected through multiple vdd and vss (gnd) lines.
The next stage is:
Placement: Placing the cells on the floorplan rows, aligned with the sites.
Placement is divided into 2 steps: Global and Detailed Placement.
The next stage is:
The next stage is:
The next stage is:
Let's dive into the OpenLANE toolkit, the open-source software that implements the complete flow from RTL to GDSII.
First, we move into the working directory from which the OpenLANE flow is run.
The path of the working directory is:
vsduser@vsdsquadron:~/Desktop/work/tools/openlane_working_dir/openlane/
After entering the above directory, type the command
vsduser@vsdsquadron:~/Desktop/work/tools/openlane_working_dir/openlane$ docker
to access all the tools available in OpenLANE. The command to start the interactive session is:
bash-4.2$ flow.tcl -interactive
Meanwhile, we can also explore the openlane directory and the items present in it:
vsduser@vsdsquadron:~/Desktop/work/tools/openlane_working_dir/openlane$ cd designs/
Coming back to the OpenLANE interactive session, we prepare the design files for picorv32a. The command is:
% prep -design picorv32a
After this, to synthesize the design, we run the command:
% run_synthesis
Now switch to the run directory of the design inside the openlane directory, then move to the synthesis reports and open the statistics report:
vsduser@vsdsquadron:~/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/runs/15-03_19-57$ cd reports/synthesis/
less 1-yosys_4.stat.rpt
This prints the synthesis statistics.
To calculate the flop ratio:
Flop ratio = (Total number of D flip-flops) / (Total number of cells)
Flop ratio (%) = Flop ratio × 100
The synthesis statistics are shown below:
From these statistics, the flop ratio is:
Flop ratio = 1613 / 14876 ≈ 0.10843
Flop ratio (%) = 10.84%
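The same calculation can be checked with a few lines of Python. This is only a sketch of the arithmetic above; the two cell counts are the ones quoted from the synthesis statistics, and the variable names are made up for illustration.

```python
# Cell counts quoted from the synthesis statistics report above.
d_flip_flops = 1613
total_cells = 14876

# Flop ratio = number of D flip-flops / total number of cells.
flop_ratio = d_flip_flops / total_cells
flop_ratio_percent = flop_ratio * 100

print(f"Flop ratio     = {flop_ratio:.4f}")
print(f"Flop ratio (%) = {flop_ratio_percent:.2f}%")
```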
This concludes the Day 1 workshop using the OpenLANE toolkit.
- Chip Floor Planning considerations
- Library Binding and Placement
- Cell design and characterization flows
- General timing characterization parameters
This module begins with the basic concepts of floor planning, which include the netlist and defining the width and height of the core and die.
Let's begin by defining the netlist: it describes the connectivity between all the electronic components.
Below is a netlist of various flip-flops and logic gates.
To place the netlist in the core of the chip, we need to convert the logic gates into specified physical dimensions so that they can be placed correctly inside the core.
Below is the representation of the conversion of the netlist into physical dimensions.
To calculate the exact dimensions of the core and die, we first need to specify the dimensions of the standard cells (logic gates) and flip-flops.
Below are the measurements of a standard cell and a flip-flop used for the core calculation.
What is the core on the silicon wafer?
The core is the section of the chip where the fundamental logic of the design is placed.
What is the die on the silicon wafer?
The die contains the core; it is a small piece of semiconductor material on which the circuit is fabricated.
Let's calculate the two main parameters in the context of a core and die.
These are: a) utilization factor and b) aspect ratio.
Below is the mathematical representation of these parameters.
In the above example the core is 4 units × 2 units = 8 sq. units, while the netlist occupies 2 units × 2 units = 4 sq. units.
Thus, the utilization factor = 4/8 = 0.5 and the aspect ratio (height/width) = 2/4 = 0.5.
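As a quick check of the two parameters, here is a small Python sketch. It assumes the usual definitions (utilization factor = area occupied by the netlist / total core area, aspect ratio = core height / core width) and assumes the core in the example is 4 units wide and 2 units high, which is the orientation that gives the 0.5 quoted above.

```python
# Assumed orientation of the example core: 4 units wide, 2 units high.
core_width, core_height = 4.0, 2.0
# Area occupied by the netlist in the example: 2 units x 2 units.
netlist_area = 2.0 * 2.0

core_area = core_width * core_height

# Utilization factor: fraction of the core area taken up by the netlist.
utilization_factor = netlist_area / core_area
# Aspect ratio: core height divided by core width.
aspect_ratio = core_height / core_width

print("Utilization factor:", utilization_factor)
print("Aspect ratio:", aspect_ratio)
```

A utilization factor of 1 would mean the core is completely filled by the netlist, leaving no room for extra cells such as buffers, so in practice a value below 1 is chosen.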
Next, let's see how to define the locations of pre-placed cells. To do this, we cut the netlist into different parts and implement each block as an individual black box.
This is depicted below.
The two boxes are then treated as two different IPs or modules.
Other examples of IPs and modules are shown below.
Now, let's define some terminology:
1. Floorplanning: The arrangement of these IPs on a chip.
2. Pre-Placed Cells: These IPs/blocks have user-defined locations, and hence are placed on the chip before automated placement and routing.
The pre-placement is shown below.
The concept of the decoupling capacitor and its relevance to the floorplan and the circuit are explained below.
To keep the voltage seen by a circuit out of the undefined region of the noise margin, i.e. between VIL and VIH, we place decoupling capacitors.
The advantage of the decoupling capacitor is represented in the diagram below.
Thus, the final placement onto the core of the die is shown below.
Consider different macros/IPs connected to each other and to the power supply, with a connection established between a driver and a load as shown below.
Suppose the connection is a 16-bit bus feeding an inverter. It then runs into the problems of ground bounce and voltage droop, as shown below.
To overcome these drawbacks we use multiple vdd and vss rails, so that any logic circuit can draw the power it needs from the nearest branch.
Thus, the final pin placement on a core of a die will be done as shown.
Now, we take a design to understand placement and routing on silicon. The complete design is shown below.
This is the final placement of the input and output pins, together with the logical cell placement blockage, which keeps the automatic placement and routing tool out of that area, as shown below.
Now, we move to OpenLANE to understand floorplanning and automated placement and routing.
After the Day 1 synthesis and flop-ratio calculation, we now dive into floorplanning, placement and routing.
There are three configuration files at different locations; they are prioritized so that the values in a higher-priority file override those in the others.
This hierarchy is shown below.
Now, in the interactive window, run the command:
% run_floorplan
Then, cd to the directory below to look at the merged, final vmetal, hmetal and core utilisation values that result from the priority order.
cd openlane/designs/picorv32a/runs/18-03_17-40/logs/floorplan/
less io-Placer.log
Shown below is the final file.
To check the die area, cd to the directory below and open the DEF file.
cd openlane/designs/picorv32a/runs/18-03_17-40/results/floorplan/
less picorv32a.floorplan.def
The output will be as shown.
Now, to open the layout we use Magic.
The following command, run from the same directory as above, opens the layout editor.
magic -T /home/vsduser/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.floorplan.def &
Shown below is the layout of the die, with all the pre-placed cells and standard cells.
To see the details of a specific I/O port, select it and type what in the tkcon window.
To understand placement, let's consider a netlist consisting of pre-placed cells and standard cells.
For placement on the core, we give physical dimensions to each and every cell. The collection of these cells is termed the library; it consists of various pre-defined standard cells and pre-placed cells.
Thus, the complete setup of floorplan, netlist and library management is shown.
After placing all the cells from the library onto the core of the die without disturbing the pre-placed cells, we reach the stage where we estimate wire lengths and capacitances and, based on them, insert repeaters. This is termed optimised placement.
Now, let's check the optimised placement for each section.
The first section is placed as shown.
The second section is placed as shown.
The Third section is placed as shown.
The Last section is placed as shown.
In library characterization and modelling, these 5 steps play a very important role:
- Logic Synthesis.
- Floorplanning.
- Placement.
- CTS (Clock Tree Synthesis).
- Routing
One thing common to all these stages is gates, or cells.
Now, let's run the placement learned above using the OpenLANE toolkit.
Run the following command after synthesis and floorplanning.
% run_placement
Then move into the following directory to open the placement in Magic.
cd openlane/designs/picorv32a/runs/19-03_15-04/results/placement
To open Magic from the placement directory, we use the command below.
magic -T /home/vsduser/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.placement.def &
This shows the legal placement of standard cells.
After final placement and routing of the above netlist, the final die is as shown.
It consists of standard cells (the basic logic gates or cells) that are stored in the library.
The library also contains many varieties of standard cells, of different sizes and different functionality.
Now, let's break down the design of a single standard cell. The cell design flow is as follows.
The first step, inputs, is classified as shown below.
The second step is the design step, or circuit design step, which also includes an additional CDL (Circuit Description Language) step.
Now, let's move to the layout design step, which takes Euler's path into consideration. The layout of a CMOS inverter made in Magic is shown below.
The last stage, the output stage, consists of the GDSII file, the LEF file and the extracted SPICE netlist, which help in determining the timing, noise and power characteristics of the circuit.
The characterization flow is described in steps 1-8, which are then passed to the GUNA software; it writes the results to an output model file that characterizes the timing, noise and power of the circuit.
Now, let's analyse the timing characteristics for a stimulus applied to the input of a buffer. The timing analysis covers the input and output slew thresholds, the input and output rise and fall delay thresholds, and so on.
Now, let's calculate the propagation delay and the transition time.
A negative propagation delay shows that the circuit is not properly synchronised.
Positive Propagation delay:
Negative Propagation delay:
Input transition time:
Output transition time:
The final transition delay calculation.
- Labs for CMOS inverter ngspice simulations
- Inception of Layout and CMOS fabrication process
- Sky130 Tech File Labs
In this lab we first change the I/O port placement scheme using the IO placer.
After running the floorplan, cd to the given directory.
cd openlane/designs/picorv32a/runs/21-03_18-42/results/floorplan/
Run the following command from the above directory.
magic -T /home/vsduser/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.floorplan.def &
The I/O placement before changing the configuration:
Now, change to the following directory to find the variable that must be changed to alter the I/O settings.
cd openlane/configuration/
less floorplan.tcl
Then, enter the following commands in the interactive window and check the layout again using the procedure above.
% set ::env(FP_IO_MODE) 2
% run_floorplan
Now, we have a SPICE deck as shown below.
Now, let's clone the repository containing the inverter layout and run the post-layout simulations.
cd openlane/
Command to clone, run from the above directory:
git clone https://github.com/nickson-jose/vsdstdcelldesign.git
Now, copy the tech file from the directory below into the cloned directory.
cd openlane_working_dir/pdks/sky130A/libs.tech/magic/
cp sky130A.tech /home/vsduser/Desktop/work/tools/openlane_working_dir/openlane/vsdstdcelldesign/
Now, open the layout from the vsdstdcelldesign directory by using the following command.
magic -T sky130A.tech sky130_inv.mag &
The layout of the CMOS inverter is shown below.
This module deals with the fabrication process of a twin-well CMOS device.
Let's find out which layer is of which type, and extract the SPICE file to get all the parasitics.
Shown below are the different layers and their descriptions.
Now, create an extract file from the tkcon window.
% extract all
Thus, the extract file is created.
Now, create a SPICE file from the tkcon window.
% ext2spice cthresh 0 rthresh 0
% ext2spice
Thus, the spice file is created and shown below.
Now, open the spice file using the following command.
vim sky130_inv.spice
We then edit the extracted SPICE file in the vim editor to set it up for characterisation; the file after the changes is shown below.
Now, we run the ngspice simulation.
ngspice sky130_inv.spice
The following will be the output.
Now, we plot the output vs time curve to calculate the rise and fall transition time and the rise and fall delay time.
plot y vs time a
The output is:
The Rise transition data points at 20% and 80% are shown below.
Thus, Rise transition = 2.24025e-09 - 2.17995e-09 = 60.30ps
The fall transition data points at 80% and 20% are shown below.
Thus, Fall transition = 4.09369e-09 - 4.05075e-09 = 42.94ps
The Rise delay data points at 50% input and 50% output are shown below.
Thus, Rise delay = 2.20781e-09 - 2.15078e-09 = 57.03ps
The fall delay data points at 50% input and 50% output are shown below.
Thus, Fall delay = 4.07547e-09 - 4.05004e-09 = 25.43ps
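The four numbers above can be reproduced with a short Python sketch. It only redoes the subtraction on the data points read off the ngspice plot, taking each interval as the later time minus the earlier time; the helper name interval_ps is made up for this example.

```python
def interval_ps(t_start, t_end):
    """Time between two plot points, converted from seconds to picoseconds."""
    return (t_end - t_start) * 1e12

# Transition times: between the 20% and 80% points of the output waveform.
rise_transition = interval_ps(2.17995e-09, 2.24025e-09)
fall_transition = interval_ps(4.05075e-09, 4.09369e-09)

# Delays: from the 50% point of the input to the 50% point of the output.
rise_delay = interval_ps(2.15078e-09, 2.20781e-09)
fall_delay = interval_ps(4.05004e-09, 4.07547e-09)

for name, value in [("Rise transition", rise_transition),
                    ("Fall transition", fall_transition),
                    ("Rise delay", rise_delay),
                    ("Fall delay", fall_delay)]:
    print(f"{name}: {value:.2f} ps")
```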
Now, we take up the lab challenges: fix the poly.9 error, implement the poly resistor spacing to diff and tap, describe a DRC error as a geometrical construct, and finally find the missing and incorrect rules.
Shown below is the poly resistor challenge.
Shown below is the n_well missing rule challenge.
- Timing modelling using delay tables
- Timing analysis with ideal clocks using openSTA
- Clock tree synthesis TritonCTS and signal integrity
- Timing analysis with real clocks using openSTA
In this module we discuss timing modelling with delay tables, and also the conversion of grid info to track info.
We also convert the Magic layout to a standard cell LEF.
Now let's convert the grid info to track info.
Path to open the tracks.info file:
cd openlane_working_dir/pdks/sky130A/libs.tech/openlane/sky130_fd_sc_hd/
less tracks.info
Now, open the magic layout from the below directory.
cd openlane/vsdstdcelldesign/
Now the command to open Magic layout will be:
magic -T sky130A.tech sky130_inv.mag &
Now, convert the grid info to track info from the tkcon window.
The tracks.info file is shown below.
From the changed grid we observe that the input and output ports lie on the intersections of the X and Y pitches.
Now, let's convert the Magic layout to a standard cell LEF file. We first make a copy of the layout using the following command in the tkcon window.
% save sky130_vsdinv.mag
The layout of the copied cell is shown below.
Now, write this layout out as a LEF file using the following command.
% lef write
Both the .mag and .lef files are now generated; the LEF file is shown below.
Each port is given a class and a use with the following port class and port use commands, and the ports are written to the LEF file in order.
% port class input
% port use signal
Now, let's see the steps to include a new cell in synthesis.
Copy 4 files, the .lef file and the 3 library files for the fast, slow and typical corners, to the following picorv32a design directory.
/home/vsduser/Desktop/work/tools/openlane_working_dir/openlane/designs/picorv32a/src
The location of the lef file is:
File name: sky130_vsdinv.lef
cd openlane/vsdstdcelldesign/
The other 3 files are at the following location:
File names:
- sky130_fd_sc_hd__fast.lib
- sky130_fd_sc_hd__slow.lib
- sky130_fd_sc_hd__typical.lib
cd openlane/vsdstdcelldesign/libs/
Now, change the directory to designs/picorv32a again to edit the config.tcl file.
The updated config.tcl file should look like this.
Then, open the interactive window in a new tab to run synthesis with the new cell.
The following commands are to be run step by step, ending with the synthesis command.
docker
flow.tcl -interactive
package require openlane 0.9
prep -design picorv32a -tag 23-03_10-48 -overwrite
set lefs [glob $::env(DESIGN_DIR)/src/*.lef]
add_lefs -src $lefs
run_synthesis
After running synthesis we encounter a large slack violation; how to remove it is covered further below.
Now, let's understand the use of delay tables for timing modelling and power-aware clock tree synthesis (CTS).
Since the gates in the 2nd stage are all of the same type, the skew is 0. If this configuration changes, we get a non-zero skew, and at a large scale this leads to a large skew and output latency.
The power-aware CTS is shown below: only one of the 2nd-stage gates is kept on at any one time.
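To make the delay-table idea concrete, here is a rough Python sketch of a lookup indexed by input slew and output load, with linear interpolation between table entries. The table values, axis points and function names are all hypothetical and chosen only for illustration; they are not taken from the Sky130 libraries or from the workshop slides.

```python
from bisect import bisect_right

# Hypothetical delay table for one buffer type (not real library data).
# Rows are indexed by input slew (ps), columns by output load (fF), entries are delays (ps).
INPUT_SLEWS = [20.0, 40.0, 60.0]
OUTPUT_LOADS = [10.0, 30.0, 50.0]
DELAYS = [
    [30.0, 50.0, 70.0],
    [35.0, 55.0, 75.0],
    [40.0, 60.0, 80.0],
]

def _segment(axis, value):
    """Index i of the axis segment [axis[i], axis[i+1]] used for interpolation.

    Values outside the table fall back to the first or last segment."""
    i = bisect_right(axis, value) - 1
    return max(0, min(i, len(axis) - 2))

def lookup_delay(slew, load):
    """Bilinear interpolation of the delay for a given input slew and output load."""
    i = _segment(INPUT_SLEWS, slew)
    j = _segment(OUTPUT_LOADS, load)
    s0, s1 = INPUT_SLEWS[i], INPUT_SLEWS[i + 1]
    c0, c1 = OUTPUT_LOADS[j], OUTPUT_LOADS[j + 1]
    ts = (slew - s0) / (s1 - s0)
    tc = (load - c0) / (c1 - c0)
    low = DELAYS[i][j] * (1 - tc) + DELAYS[i][j + 1] * tc
    high = DELAYS[i + 1][j] * (1 - tc) + DELAYS[i + 1][j + 1] * tc
    return low * (1 - ts) + high * ts

# Two identical 2nd-stage buffers with the same slew and load: equal delay, zero skew.
balanced_skew = lookup_delay(40.0, 30.0) - lookup_delay(40.0, 30.0)
# Same buffers but one drives a heavier load: the delays differ, so skew appears.
unbalanced_skew = lookup_delay(40.0, 50.0) - lookup_delay(40.0, 30.0)

print("Balanced skew:", balanced_skew, "ps")
print("Unbalanced skew:", unbalanced_skew, "ps")
```

The point of the sketch is only that the delay of a cell depends on its input slew and output load, so two branches of a clock tree match only when both are the same.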
Now, the steps to configure the synthesis settings to fix the slack and to include the vsdinv cell.
Set all the parameters, run the synthesis, and then run the commands below in order.
set ::env(SYNTH_STRATEGY) 1
set ::env(SYNTH_BUFFERING) 1
set ::env(SYNTH_SIZING) 1
init_floorplan
place_io
global_placement_or
detailed_placement
tap_decap_or
detailed_placement
Then, change to the following directory to see the magic layout.
cd openlane/designs/picorv32a/runs/23-03_10-48/results/placement
Open the Magic layout from the above folder using the following command.
magic -T /home/vsduser/Desktop/work/tools/openlane_working_dir/pdks/sky130A/libs.tech/magic/sky130A.tech lef read ../../tmp/merged.lef def read picorv32a.placement.def &
Thus, the following output will be shown.
To get the detailed view, use the following command in the tkcon window.
expand
The sky130_vsdinv cell in the final layout is shown below.
Hence, we were able to fix the slack and successfully include the vsdinv cell.
In this set of modules we learn about setup timing analysis and analysing timing with ideal clocks.
The clock provided by a PLL shows a temporary variation in its period; this is termed jitter.
The final setup timing after accounting for jitter within the clock period is shown below.
The delay through a given set of logic is calculated as shown below; it must be less than the available time derived in the diagram above.
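The setup check with an ideal clock can be sketched in Python as follows. This assumes the commonly used form of the condition, logic delay < T - S - SU, where T is the clock period, S the setup time of the capture flop and SU the setup uncertainty that models jitter. The numbers are hypothetical and the function name is made up for this example.

```python
def setup_slack_ideal(clock_period, setup_time, setup_uncertainty, logic_delay):
    """Setup slack with an ideal clock: (T - S - SU) - logic delay.

    A positive result means the logic meets setup timing."""
    available_time = clock_period - setup_time - setup_uncertainty
    return available_time - logic_delay

# Hypothetical values in nanoseconds, chosen only for illustration.
slack_ok = setup_slack_ideal(clock_period=1.0, setup_time=0.01,
                             setup_uncertainty=0.09, logic_delay=0.8)
slack_violated = setup_slack_ideal(clock_period=1.0, setup_time=0.01,
                                   setup_uncertainty=0.09, logic_delay=1.0)

print(f"Slack with 0.8 ns of logic: {slack_ok:.2f} ns")
print(f"Slack with 1.0 ns of logic: {slack_violated:.2f} ns")
```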
Now, let's see this in practice on OpenLANE by creating 2 files at different locations.
File name: my_base.sdc
Path:
cd openlane/designs/picorv32a/src/
File name: pre_sta.conf
Path:
cd openlane/
To run the pre_sta.conf file, use the command below from the above directory.
sta pre_sta.conf
Now, the slack in 1st iteration is:
Now, the slack after improvement is:
In this set of modules we learnt about clock tree routing and buffering using the H-tree algorithm. Shown below is the H-tree algorithm used in clock tree routing.
The buffering in clock tree routing is done as shown below.
We also looked at the problem of crosstalk in routing, which increases the delay in the circuit. To overcome this we use clock net shielding.
Now, to run the clock tree synthesis we use the following command from the interactive window.
run_cts
Thus, the CTS is completed and shown below.
Thus, the CTS RUNS are verified and shown below.
In this module we study setup and hold timing analysis using real clocks.
Thus, the setup timing analysis and the slack definition is shown below.
Also, the Hold timing analysis and the slack definition in the case of hold time is shown below.
Thus, on the chip level the delay time for the Launch Flop is shown below.
Also, the delay time for the Capture Flop and the Skew value is shown below.
Thus, the value for hold timing and setup timing is shown below.
Thus, all this timing design is for a single clock.
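With real clocks, the clock network delays to the launch and capture flops enter both checks. The Python sketch below assumes the usual single-clock form of the conditions: for setup, (logic delay + launch clock delay) < (T + capture clock delay) - S - SU, and for hold, (logic delay + launch clock delay) > capture clock delay + H + HU. All numbers and function names are hypothetical and only illustrate the arithmetic.

```python
def setup_slack(period, launch_clk_delay, capture_clk_delay,
                logic_delay, setup_time, setup_uncertainty):
    """Setup slack = data required time - data arrival time."""
    arrival = launch_clk_delay + logic_delay
    required = period + capture_clk_delay - setup_time - setup_uncertainty
    return required - arrival

def hold_slack(launch_clk_delay, capture_clk_delay,
               logic_delay, hold_time, hold_uncertainty):
    """Hold slack = data arrival time - data required time."""
    arrival = launch_clk_delay + logic_delay
    required = capture_clk_delay + hold_time + hold_uncertainty
    return arrival - required

# Hypothetical values in nanoseconds.
launch, capture = 0.10, 0.15
skew = capture - launch

s_slack = setup_slack(period=1.0, launch_clk_delay=launch, capture_clk_delay=capture,
                      logic_delay=0.6, setup_time=0.01, setup_uncertainty=0.09)
h_slack = hold_slack(launch_clk_delay=launch, capture_clk_delay=capture,
                     logic_delay=0.6, hold_time=0.02, hold_uncertainty=0.05)

print(f"Skew        = {skew:.2f} ns")
print(f"Setup slack = {s_slack:.2f} ns")
print(f"Hold slack  = {h_slack:.2f} ns")
```

In both checks a positive or zero slack means the condition is met, and a negative slack means a violation.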
Now, to do timing analysis with real clocks we invoke OpenROAD.
openroad
After invoking it, we create a db file; to create it we read various files as shown below.
After checking the reports, we obtain the hold and setup slack.
Hold slack :
Setup Slack :
But, the process we are going with is incorrect.
Now, we read the created db file directly and obtain the hold and setup time slack.
Hold slack:
Setup Slack:
In this module we discuss the routing techniques used to route the standard cells and macros on the core of a die.
We learnt about the Maze Routing strategy also known as Lee's Algorithm.
This algorithm always takes two points, one as a source and one as a target, and chooses the shortest route with the minimum number of bends.
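The wave expansion and backtrace at the heart of Lee's algorithm can be sketched in Python as below. This is a minimal illustration on a small hypothetical grid, where 1 marks a blocked cell; it finds one shortest route but does not try to minimise the number of bends, and it is not the router used in OpenLANE.

```python
from collections import deque

def lee_route(grid, source, target):
    """Return one shortest path of (row, col) cells from source to target, or None.

    grid[r][c] == 1 marks a blocked cell; source and target are assumed free."""
    rows, cols = len(grid), len(grid[0])

    def neighbours(cell):
        r, c = cell
        return ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))

    # Wave expansion: label each reachable cell with its distance from the source.
    distance = {source: 0}
    frontier = deque([source])
    while frontier:
        cell = frontier.popleft()
        if cell == target:
            break
        for nr, nc in neighbours(cell):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in distance:
                distance[(nr, nc)] = distance[cell] + 1
                frontier.append((nr, nc))

    if target not in distance:
        return None  # the target is walled off from the source

    # Backtrace: walk from the target to the source along decreasing labels.
    path = [target]
    while path[-1] != source:
        current = path[-1]
        for nb in neighbours(current):
            if distance.get(nb) == distance[current] - 1:
                path.append(nb)
                break
    path.reverse()
    return path

# Hypothetical 3x4 routing grid with one row of blockages and a single gap.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
route = lee_route(grid, (0, 0), (2, 0))
print(route)
```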
After successful routing of all the cells we run the design rule check (DRC).
A few DRC violations are shown below.
The way to avoid signal shorting is to introduce another layer, connected through a via.
Now, the final stage is Parasitic Extraction.
Thus, the routing strategy on a chip is shown as below.
After synthesis, floorplanning, placement and CTS, we now come to the routing stage.
After CTS, run a few commands, starting from the launch of OpenLANE.
docker
flow.tcl -interactive
package require openlane 0.9
prep -design picorv32a -tag 23-03_10-48 ;# we do not overwrite it, as that would remove the cts results
echo $::env(CURRENT_DEF)
Now, to build the power distribution network we use the following command.
gen_pdn
Before running routing, set the TritonRoute strategy to 0; it will then use Triton13, as shown in the figure.
Then run the following command:
run_routing
Thus, we obtained the routing results as:
- The number of violations = 0
- Obtained the picorv32a.spef file
In this module we learnt various features of TritonRoute.
We also learnt the Routing Topology Algorithm that uses MST technique.
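To illustrate the MST idea, the Python sketch below builds a minimum spanning tree over the pins of one net using Prim's algorithm with Manhattan distance as the edge cost. The pin coordinates are hypothetical and the function names are made up; this is only a sketch of the technique, not TritonRoute's implementation.

```python
def manhattan(a, b):
    """Manhattan (rectilinear) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def minimum_spanning_tree(pins):
    """Prim's algorithm: return a list of (pin_a, pin_b) edges connecting all pins."""
    if not pins:
        return []
    connected = [pins[0]]
    remaining = list(pins[1:])
    edges = []
    while remaining:
        # Pick the cheapest edge from the connected set to any unconnected pin.
        a, b = min(((c, r) for c in connected for r in remaining),
                   key=lambda edge: manhattan(*edge))
        edges.append((a, b))
        connected.append(b)
        remaining.remove(b)
    return edges

# Hypothetical pin locations of a single net.
pins = [(0, 0), (4, 0), (4, 3), (0, 1)]
tree = minimum_spanning_tree(pins)
total_length = sum(manhattan(a, b) for a, b in tree)

print(tree)
print("Total length:", total_length)
```

A tree over n pins always has n - 1 edges, and each edge can then be routed as a two-point connection.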
Thus, we successfully obtained the SPEF file, and learnt to create our own design and run the entire PnR flow.