MacroPlacement

Simulated Annealing with Go With The Winners (GWTW)

This repository provides code to run Simulated Annealing (SA) wrapped in the Go With The Winners (GWTW) metaheuristic, which minimizes the proxy cost for a given clustered netlist.
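To make the idea concrete, here is a minimal Python sketch of SA wrapped in GWTW on a toy one-dimensional problem. It is not the repository's C++ implementation: `toy_cost`, the move generator, the cooling schedule, and every parameter name are hypothetical stand-ins chosen for illustration. Several SA walkers run independently for a stage; after each stage the worse half is replaced by copies of the better half.

```python
import math
import random


def toy_cost(x):
    """Hypothetical stand-in for proxy cost: a bumpy 1-D function."""
    return (x - 2.0) ** 2 + 0.5 * math.sin(8.0 * x)


def sa_stage(x, temperature, steps, rng):
    """Run `steps` Metropolis moves at a fixed temperature."""
    cost = toy_cost(x)
    for _ in range(steps):
        candidate = x + rng.uniform(-0.5, 0.5)
        candidate_cost = toy_cost(candidate)
        delta = candidate_cost - cost
        # Accept improvements always; accept uphill moves with
        # probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, cost = candidate, candidate_cost
    return x


def gwtw_sa(num_walkers=8, stages=30, steps_per_stage=50,
            t_start=1.0, cooling=0.85, seed=0):
    """SA walkers with a GWTW resampling step after every stage."""
    rng = random.Random(seed)  # single seeded RNG keeps the run repeatable
    walkers = [rng.uniform(-10.0, 10.0) for _ in range(num_walkers)]
    temperature = t_start
    for _ in range(stages):
        walkers = [sa_stage(x, temperature, steps_per_stage, rng)
                   for x in walkers]
        # GWTW step: rank by cost, then overwrite the worse half
        # with copies of the better half.
        walkers.sort(key=toy_cost)
        half = num_walkers // 2
        walkers[half:] = walkers[:num_walkers - half]
        temperature *= cooling
    best = min(walkers, key=toy_cost)
    return best, toy_cost(best)


best_x, best_cost = gwtw_sa()
print(f"best x = {best_x:.3f}, cost = {best_cost:.3f}")
```

Because the sketch draws all randomness from one seeded generator and processes walkers in a fixed order, two runs with the same seed return the same result, which mirrors the synchronous (IS_ASYNC=0) behaviour described later.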

Generating the Clustered Netlist

First, for a placed design, generate the protobuf netlist from Innovus using the script available in the TILOS/MacroPlacement repository.
Next, use the Circuit Training grouping code to generate the clustered netlist.

Code Overview

The SA code is implemented in C++.
Follow the steps below to build and test the code.

Here is a brief description of our SA implementation.

Prerequisites

Ensure the following dependencies are installed:

CMake: 3.25.1
GCC-Toolset: 11.2.0
Cereal: 1.3.2

Docker / Singularity Image

We provide recipe files to build a Docker/Singularity image for the code. Here are the steps to build the image:

cd tools
docker build --no-cache  --tag mp_sa:rocky8 -f ./docker_mp_sa .

If you do not have root access to your server, you can create a Singularity image from the Docker image on your local machine using the following command:

## The following code creates singularity image: mp_sa_rocky8.sif
singularity build mp_sa_rocky8.sif docker-daemon://mp_sa:rocky8

Our SA code is deterministic, meaning that if you run it twice with the same input (and IS_ASYNC=0), you will get the same output. We have tested our code in an identical environment on both an Intel(R) Xeon(R) Gold and an AMD EPYC 7763 CPU, and both produced the exact same results for our testcases.

Therefore, we expect that using a Singularity image or Docker image will ensure the reproducibility of our experimental results for the given testcases.
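One way to check the determinism claim on your own machine is to run the code twice with the same input and compare the resulting files byte for byte. The sketch below does this with SHA-256 digests. It assumes the output files carry no run-specific metadata such as timestamps, which the source does not confirm; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def runs_match(first_output, second_output):
    """True if two output files (e.g. final plc files) are byte-identical."""
    return file_digest(first_output) == file_digest(second_output)
```

Call `runs_match` with the paths of the plc files produced by two separate runs; a `False` result with IS_ASYNC=0 would point to an environment difference worth investigating.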

Steps to Build

Execute the following commands:

mkdir build
cd build
cmake ..
make -j 10

Testcases

Nine testcases and corresponding run scripts are available in the test directory. The testcases span three enablements: NG45, ASAP7, and GF12.

Note: GF12 testcases are not uploaded due to NDA restrictions.

Available Testcases:

Ariane: NG45, ASAP7
BlackParrot (Quad-Core): NG45, ASAP7
MemPoolGroup: NG45, ASAP7

Scaled Versions of Ariane (from Circuit Training):

For more details, refer to the RDF-2024 paper.

Steps to Run

Follow these steps to run the code:

## 1. Update the following parameters in ./util/run.sh:
##    - REPO_DIR: Path to the repository
##    - DESIGN: Design name (e.g., ariane, bp, mempool_group)
##       Please ensure the NETLIST and PLC file paths are correctly set.
##    - IS_ASYNC: Set to 1 for asynchronous mode (non-deterministic)
##    - ITER: Iteration count based on the design

## 2. Run the script:
REPO_DIR=<real_path_to_repo>
cp ${REPO_DIR}/util/run.sh .
chmod +x run.sh
./run.sh

## OR, to launch using the Singularity image
singularity exec -B ${REPO_DIR},/tmp mp_sa_rocky8.sif ./run.sh

## OR To launch using docker image (Ensure that the REPO_DIR is mounted and the current directory is the workdir)
## For this case in run.sh set REPO_DIR=/mp_sa
docker run -v $(pwd):/workdir -v ${REPO_DIR}:/mp_sa mp_sa:rocky8 /workdir/run.sh

Recommended Iterations

To ensure runtime stays within 12 hours, use the following iteration counts:

Ariane (18K), BlackParrot (5K), MemPoolGroup (4K), Ariane_X2 (6K), Ariane_X4 (4.5K)

Asynchronous Mode

Set IS_ASYNC=1 in the run script to enable asynchronous mode.

Note:

The SA code becomes non-deterministic in this mode.

It will run faster, allowing for more iterations.

Results of SA on Our Testcases

We run our SA with IS_ASYNC=0 on all testcases. In the table below, we provide two proxy cost values per testcase: (i) the proxy cost from our FD placer/evaluator that yields the best result when evaluated by the golden Circuit Training evaluator, and (ii) the golden proxy cost value from the Circuit Training evaluator. Our FD placer does not yield the same results as the Circuit Training FD placer. Although our evaluator computes proxy cost exactly as the Circuit Training evaluator does, the discrepancy between the two FD placers leads to different proxy cost outcomes.

SA columns: proxy cost based on the SA evaluator. CT columns: proxy cost based on the CT (golden) evaluator.
Design	SA WL	SA Den.	SA Cong.	SA Proxy	CT WL	CT Den.	CT Cong.	CT Proxy
Ariane-NG45	0.0879	0.5019	0.8917	0.7847	0.0898	0.5146	0.9068	0.8005
BlackParrot-NG45	0.0550	0.6992	0.9433	0.8763	0.0543	0.7114	0.9361	0.8781
MemPoolGroup-NG45	0.0604	1.2112	1.0540	1.1930	0.0616	1.1308	1.0948	1.1744
Ariane-ASAP7	0.1001	0.8168	0.7580	0.8875	0.1081	0.8169	0.8216	0.9274
BlackParrot-ASAP7	0.0533	0.7622	0.7466	0.8077	0.0529	0.7584	0.7505	0.8074
MemPoolGroup-ASAP7	0.0675	1.3271	0.8235	1.1429	0.0690	1.3050	0.8338	1.1384
CT-Ariane	0.0760	0.5048	0.8027	0.7297	0.0811	0.5246	0.8138	0.7503
CT-Ariane-X2	0.0681	0.4880	0.8272	0.7257	0.0672	0.4898	0.8341	0.7292
CT-Ariane-X4	0.0539	0.4635	0.8076	0.6895	0.0522	0.4668	0.8146	0.6929
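The Proxy columns in the table are consistent with a weighted sum of the other three columns, proxy = WL + 0.5 × Den. + 0.5 × Cong. The weights of 0.5 are an assumption inferred from the reported numbers (they match the usual Circuit Training setting) and are not stated in this document. The short check below recomputes a few rows; differences of about 1e-4 are expected because the table values are rounded to four decimals.

```python
def proxy_cost(wirelength, density, congestion,
               density_weight=0.5, congestion_weight=0.5):
    """Weighted-sum proxy cost; the 0.5 weights are an assumption."""
    return (wirelength
            + density_weight * density
            + congestion_weight * congestion)


# (WL, Den., Cong., reported Proxy) copied from the table above.
rows = {
    "Ariane-NG45 (SA evaluator)": (0.0879, 0.5019, 0.8917, 0.7847),
    "Ariane-NG45 (CT evaluator)": (0.0898, 0.5146, 0.9068, 0.8005),
    "MemPoolGroup-ASAP7 (SA evaluator)": (0.0675, 1.3271, 0.8235, 1.1429),
    "CT-Ariane-X4 (CT evaluator)": (0.0522, 0.4668, 0.8146, 0.6929),
}

worst_residual = 0.0
for name, (wl, den, cong, reported) in rows.items():
    residual = abs(proxy_cost(wl, den, cong) - reported)
    worst_residual = max(worst_residual, residual)
    print(f"{name}: recomputed {proxy_cost(wl, den, cong):.4f}, "
          f"reported {reported:.4f}")
```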

Running Evaluation

Once you have the final plc files from the SA run, you can evaluate them with the Circuit Training plc_client, which places the soft macros using the CT-FD placer and reports the proxy cost value. The evaluation code is located in the util directory.

Prerequisites:

Download the Circuit Training repository.
Download plc_wrapper_main.
Set up the Python environment.

Running the Evaluation:

python ./util/golden_eval.py <design> <netlist> <plc_file> <output_dir>

Evaluation Output:

Places soft macros using the CT-FD placer.
Reports the proxy cost value.
Generates:
- A new .plc file with updated soft macro locations.
- A .tcl file for placing macros in Innovus.
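To evaluate several plc files in one go, the command shown under "Running the Evaluation" can be wrapped in a small helper. The sketch below only assembles and launches that same command with its four positional arguments; the helper names are hypothetical, and the script path assumes you run it from the repository root.

```python
import subprocess
import sys


def golden_eval_command(design, netlist, plc_file, output_dir,
                        script="./util/golden_eval.py"):
    """Build the argument list for one golden_eval.py invocation."""
    return [sys.executable, script, design,
            str(netlist), str(plc_file), str(output_dir)]


def run_golden_eval(design, netlist, plc_file, output_dir):
    """Run the evaluation and raise CalledProcessError if it fails."""
    command = golden_eval_command(design, netlist, plc_file, output_dir)
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.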
