DREAMPlace

Deep learning toolkit-enabled VLSI placement. Building on the analogy between nonlinear VLSI placement and the deep learning training problem, this tool is developed with a deep learning toolkit for flexibility and efficiency. The tool runs on both CPU and GPU. Over 30X speedup over the CPU implementation (RePlAce) is achieved in global placement and legalization on ISPD 2005 contest benchmarks with an NVIDIA Tesla V100 GPU. DREAMPlace also integrates a GPU-accelerated detailed placer, ABCDPlace, which can achieve around 16X speedup on million-size benchmarks over the widely adopted sequential placer NTUPlace3 on CPU.

If DREAMPlace is installed on a machine without a GPU, only CPU support is enabled, with multi-threading.

<img src=/images/bigblue4-nofiller_SLD.gif width=250>
Figure: Bigblue4 placement animation, with panels for the Density Map, Electric Potential Map, and Electric Field Map.

<img src=images/DREAMPlace4.1_flow.png width=600>

Publications

Dependency

To pull the git submodules, run the following in the root directory.

git submodule init
git submodule update

Or alternatively, pull all the submodules when cloning the repository.

git clone --recursive https://github.com/limbo018/DREAMPlace.git

How to Install Python Dependency

Go to the root directory.

pip install -r requirements.txt

How to Build

Two options are provided for building: with and without Docker.

Build with Docker

You can use the Docker container to avoid building all the dependencies yourself.

  1. Install Docker on Windows, Mac or Linux.
  2. To enable the GPU features, install NVIDIA-docker; otherwise, skip this step.
  3. Navigate to the repository.
  4. Get the docker container with either of the following options.
    • Option 1: pull the prebuilt container.
    docker pull limbo018/dreamplace:cuda
    
    • Option 2: build the container.
    docker build . --file Dockerfile --tag your_name/dreamplace:cuda
    
  5. Enter the bash environment of the container. Replace limbo018 with your name if Option 2 was chosen in the previous step.

Run with GPU on Linux.

docker run --gpus 1 -it -v $(pwd):/DREAMPlace limbo018/dreamplace:cuda bash

Run with GPU on Windows.

docker run --gpus 1 -it -v /dreamplace limbo018/dreamplace:cuda bash

Run without GPU on Linux.

docker run -it -v $(pwd):/DREAMPlace limbo018/dreamplace:cuda bash

Run without GPU on Windows.

docker run -it -v /dreamplace limbo018/dreamplace:cuda bash
  6. Inside the container, run cd /DREAMPlace.
  7. Go to the next section to complete the build within the container.

Build without Docker

CMake is used as the build system. To build, go to the root directory.

mkdir build
cd build # we call this <build directory>
cmake .. -DCMAKE_INSTALL_PREFIX=<installation directory> -DPython_EXECUTABLE=$(which python)
make
make install

Here, <build directory> is the directory where you compile the code, and <installation directory> is the directory where you want to install DREAMPlace (e.g., ../install). Third-party submodules are built automatically, except for Boost.

To clean, go to the root directory.

rm -r build

<build directory> can be removed after installation if you do not need incremental compilation later.

Here are the available options for CMake.

How to Get Benchmarks

To get the ISPD 2005 and 2015 benchmarks, run the following script from the directory that contains the benchmarks folder.

python benchmarks/ispd2005_2015.py

How to Run

Before running, make sure the benchmarks have been downloaded and the Python dependencies have been installed. Go to the installation directory and run with a JSON configuration file for full placement.

cd <installation directory>
python dreamplace/Placer.py test/ispd2005/adaptec1.json
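
For batch runs, the command above can be wrapped in a small script. The following is a minimal sketch and not part of DREAMPlace: placer_command and run_configs are hypothetical helper names, and the script is assumed to be started from <installation directory> so that the relative paths resolve.

```python
import subprocess
import sys


def placer_command(config_path, python=sys.executable):
    """Build the argv list for one full-placement run (hypothetical helper)."""
    return [python, "dreamplace/Placer.py", config_path]


def run_configs(config_paths, dry_run=True):
    """Run the placer once per JSON configuration file, in order.

    With dry_run=True the commands are only printed, not executed.
    Returns the list of commands that were built.
    """
    commands = [placer_command(path) for path in config_paths]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the placer exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Only adaptec1.json is named in this README; add other configs as needed.
    run_configs(["test/ispd2005/adaptec1.json"], dry_run=True)
```

Keeping dry_run=True as the default lets you check the generated commands before committing to a long placement run.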

To test an individual PyTorch op, run its unit test.

cd <installation directory>
python unittest/ops/hpwl_unittest.py

Configurations

Descriptions of the options in the JSON configuration file can be displayed by running the following command.

cd <installation directory>
python dreamplace/Placer.py --help

The following list of options will be shown.

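<table>
<thead> <tr> <th>JSON Parameter</th> <th>Default</th> <th>Description</th> </tr> </thead>
<tbody>
<tr> <td>aux_input</td> <td>required for Bookshelf</td> <td>input .aux file</td> </tr>
<tr> <td>lef_input</td> <td>required for LEF/DEF</td> <td>input LEF file</td> </tr>
<tr> <td>def_input</td> <td>required for LEF/DEF</td> <td>input DEF file</td> </tr>
<tr> <td>verilog_input</td> <td>optional for LEF/DEF</td> <td>input VERILOG file, provide circuit netlist information if it is not included in DEF file</td> </tr>
<tr> <td>gpu</td> <td>1</td> <td>enable gpu or not</td> </tr>
</tbody>
</table>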

...
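
A configuration for a Bookshelf design can also be generated programmatically. The sketch below is illustrative only. It sets just the aux_input and gpu parameters listed above, make_bookshelf_config is a hypothetical helper name, the .aux path is a placeholder, and it assumes that setting gpu to 0 disables GPU execution.

```python
import json


def make_bookshelf_config(aux_path, use_gpu=True):
    """Return a minimal config dict for a Bookshelf (.aux) design.

    Only parameters listed in this README are set; everything else
    is left for the placer to fill in with its defaults.
    """
    return {
        "aux_input": aux_path,       # required for Bookshelf input
        "gpu": 1 if use_gpu else 0,  # assumption: 0 means CPU only
    }


if __name__ == "__main__":
    # Placeholder path (an assumption); point it at your own .aux file.
    config = make_bookshelf_config("path/to/design.aux", use_gpu=False)
    print(json.dumps(config, indent=4))
```

The printed JSON can be saved to a file and passed to dreamplace/Placer.py in place of the test configuration.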

Authors

Features

Reference Results for Macro Placement

Recently, many studies have chosen DREAMPlace for macro placement, e.g., [Cheng+, NeurIPS2021] and [Lai+, NeurIPS2023]. However, the results reported on the same benchmarks vary significantly from one work to another. For better comparison, we provide results collected on our GPU machine for reference. If your results deviate significantly (i.e., >5% longer HPWL) from the following numbers, something may be wrong. We recommend contacting us with logs for validation.
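
The 5% rule of thumb can be checked with a few lines of Python. This is a minimal sketch: hpwl_deviation and is_suspicious are hypothetical helper names, and the measured values in the usage example are made up for illustration. The reference value is the DREAMPlace 4.1.0 HPWL for adaptec1 from the ISPD2005 table.

```python
def hpwl_deviation(measured, reference):
    """Relative HPWL change versus the reference; positive means longer."""
    if reference <= 0:
        raise ValueError("reference HPWL must be positive")
    return (measured - reference) / reference


def is_suspicious(measured, reference, threshold=0.05):
    """True if the measured HPWL is more than `threshold` longer than the reference."""
    return hpwl_deviation(measured, reference) > threshold


if __name__ == "__main__":
    # Reference: adaptec1, DREAMPlace 4.1.0, ISPD2005 table (HPWL x10^6).
    reference = 68.2
    for measured in (69.0, 75.0):  # hypothetical measurements
        flag = "check your setup" if is_suspicious(measured, reference) else "ok"
        print(f"{measured}: {hpwl_deviation(measured, reference):+.1%} ({flag})")
```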

Note that DREAMPlace 4.1.0 only implements the BB step and 2-stage flow proposed in [Chen+, ICCAD2023].

ISPD2005 benchmark with all fixed macros and IO pads regarded as movable macros. It can be downloaded from here.

<table> <thead> <tr> <th></th> <th colspan="3">DREAMPlace 4.0</th> <th colspan="3">DREAMPlace 4.1.0</th> </tr> </thead> <tbody> <tr> <td></td> <td>Iterations</td> <td>HPWL(x10^6)</td> <td>Time(s)</td> <td>Iterations</td> <td>HPWL(x10^6)</td> <td>Time(s)</td> </tr> <tr> <td>adaptec1</td> <td>600</td> <td>101.3</td> <td>26.3</td> <td>748</td> <td>68.2 </td> <td>27.6</td> </tr> <tr> <td>adaptec2</td> <td>588*</td> <td>137.5*</td> <td>40.6*</td> <td>784</td> <td>86.3 </td> <td>40.1</td> </tr> <tr> <td>adaptec3</td> <td>765</td> <td>179.5</td> <td>54.1</td> <td>894</td> <td>144.0 </td> <td>56.1</td> </tr> <tr> <td>adaptec4</td> <td>876</td> <td>153.3</td> <td>48.9</td> <td>872</td> <td>140.8 </td> <td>57.3</td> </tr> <tr> <td>bigblue1</td> <td>699</td> <td>86.2</td> <td>23.5</td> <td>813</td> <td>82.0 </td> <td>25.5</td> </tr> <tr> <td>bigblue2</td> <td>1267*</td> <td>2426.7*</td> <td>679.4*</td> <td>869</td> <td>98.1 </td> <td>193.4</td> </tr> <tr> <td>bigblue3</td> <td>1207</td> <td>330.2</td> <td>115.4</td> <td>1307</td> <td>288.8 </td> <td>140.1</td> </tr> <tr> <td>bigblue4</td> <td>1581</td> <td>820.1</td> <td>239.6</td> <td>1875</td> <td>610.0 </td> <td>234.5</td> </tr> <tr> <td>average ratio</td> <td>0.937</td> <td>4.211</td> <td>1.258</td> <td>1.000</td> <td>1.000</td> <td>1.000</td> </tr> </tbody> </table>
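
The "average ratio" row is not defined in the text. The reported values appear consistent with an arithmetic mean of per-design ratios normalized to DREAMPlace 4.1.0, with the starred entries included. The sketch below recomputes the HPWL ratio under that assumption; average_ratio is an illustrative name.

```python
# HPWL (x10^6) from the ISPD2005 table above: (DREAMPlace 4.0, DREAMPlace 4.1.0).
hpwl = {
    "adaptec1": (101.3, 68.2),
    "adaptec2": (137.5, 86.3),
    "adaptec3": (179.5, 144.0),
    "adaptec4": (153.3, 140.8),
    "bigblue1": (86.2, 82.0),
    "bigblue2": (2426.7, 98.1),
    "bigblue3": (330.2, 288.8),
    "bigblue4": (820.1, 610.0),
}


def average_ratio(pairs):
    """Arithmetic mean of old/new ratios (assumed definition of 'average ratio')."""
    ratios = [old / new for old, new in pairs]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Compare the printed value against the 4.211 reported in the table.
    print(f"{average_ratio(hpwl.values()):.3f}")
```

Under this definition a single diverged run such as bigblue2 dominates the average, so per-design ratios are often more informative than the summary row.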

MMS benchmark (modified from ISPD2005 benchmarks with movable macros and fixed IO pads)

Our modified version can be downloaded from here.

<table> <thead> <tr> <th></th> <th colspan="3">DREAMPlace 4.0</th> <th colspan="3">DREAMPlace 4.1.0</th> </tr> </thead> <tbody> <tr> <td></td> <td>Iterations</td> <td>HPWL(x10^6)</td> <td>Time(s)</td> <td>Iterations</td> <td>HPWL(x10^6)</td> <td>Time(s)</td> </tr> <tr> <td>adaptec1</td> <td>607</td> <td>65.3 </td> <td>17.8 </td> <td>746</td> <td>64.7 </td> <td>25.8</td> </tr> <tr> <td>adaptec2</td> <td>569</td> <td>79.3 </td> <td>28.5 </td> <td>734</td> <td>75.8 </td> <td>35.8</td> </tr> <tr> <td>adaptec3</td> <td>659</td> <td>158.1 </td> <td>44.6 </td> <td>755</td> <td>153.3 </td> <td>38.9</td> </tr> <tr> <td>adaptec4</td> <td>735</td> <td>141.7 </td> <td>46.8 </td> <td>782</td> <td>142.4 </td> <td>47.5</td> </tr> <tr> <td>adaptec5</td> <td>1053</td> <td>326.3</td> <td>63.8</td> <td>1405</td> <td>337.6 </td> <td>78.4</td> </tr> <tr> <td>bigblue1</td> <td>646</td> <td>85.4 </td> <td>21.3 </td> <td>809</td> <td>85.3 </td> <td>28.9</td> </tr> <tr> <td>bigblue2</td> <td>638</td> <td>125.3 </td> <td>42.0 </td> <td>773</td> <td>125.4 </td> <td>48.4</td> </tr> <tr> <td>bigblue3</td> <td>911</td> <td>279.3 </td> <td>112.5 </td> <td>1097</td> <td>273.8 </td> <td>136.1</td> </tr> <tr> <td>bigblue4</td> <td>1189</td> <td>648.8 </td> <td>172.4 </td> <td>1515</td> <td>643.2 </td> <td>215.4</td> </tr> <tr> <td>newblue1</td> <td>574</td> <td>62.8 </td> <td>22.5 </td> <td>749</td> <td>62.0 </td> <td>30.4</td> </tr> <tr> <td>newblue2</td> <td>730</td> <td>155.5 </td> <td>34.8 </td> <td>861</td> <td>156.1 </td> <td>43.9</td> </tr> <tr> <td>newblue3</td> <td>1318*</td> <td>597.3*</td> <td>55.71*</td> <td>830</td> <td>270.6 </td> <td>72.8</td> </tr> <tr> <td>newblue4</td> <td>1009</td> <td>246.2</td> <td>52.6</td> <td>1274</td> <td>245.8 </td> <td>53.9</td> </tr> <tr> <td>newblue5</td> <td>1254</td> <td>444.2 </td> <td>99.4 </td> <td>1537</td> <td>446.4 </td> <td>134.9</td> </tr> <tr> <td>newblue6</td> <td>929</td> <td>410.6 </td> <td>96.1 </td> <td>1157</td> <td>409.3 </td> 
<td>115.1</td> </tr> <tr> <td>newblue7</td> <td>1077</td> <td>903.6 </td> <td>184.1 </td> <td>1578</td> <td>903.2 </td> <td>235.1</td> </tr> <tr> <td>average ratio</td> <td>0.855</td> <td>1.081</td> <td>0.830</td> <td>1.000</td> <td>1.000</td> <td>1.000</td> </tr> </tbody> </table>

`*` denotes divergence or legalization failure. Note that if you observe divergence or legalization errors in the log, the results may not be representative.