This repository provides code to run Simulated Annealing (SA) wrapped with the Go With The Winners (GWTW) metaheuristic to minimize the proxy cost for a given clustered netlist.
Here is a brief description of our SA implementation.
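As a generic illustration of the SA-plus-GWTW idea (not this repository's implementation), the sketch below runs several SA walkers on a toy one-dimensional cost function and applies a GWTW step after each round: the walkers are ranked by cost, and the worse half is overwritten with copies of the better half. The cost function, move operator, cooling schedule, and all names and parameter values are assumptions chosen for illustration.

```python
import math
import random


def toy_cost(x):
    """Hypothetical stand-in for the proxy cost: a bumpy 1-D function."""
    return (x - 3.0) ** 2 + 2.0 * math.sin(5.0 * x)


def anneal(x, temperature, steps, rng):
    """Run `steps` Metropolis moves at a fixed temperature; return the final state."""
    cost = toy_cost(x)
    for _ in range(steps):
        candidate = x + rng.uniform(-0.5, 0.5)
        candidate_cost = toy_cost(candidate)
        delta = candidate_cost - cost
        # Always accept improvements; accept uphill moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            x, cost = candidate, candidate_cost
    return x


def sa_with_gwtw(num_walkers=8, rounds=30, steps_per_round=50, seed=0):
    """Anneal several walkers; after each round, winners replace losers."""
    rng = random.Random(seed)  # fixed seed, so repeated calls give the same result
    walkers = [rng.uniform(-10.0, 10.0) for _ in range(num_walkers)]
    best = min(walkers, key=toy_cost)
    temperature = 5.0  # assumed starting temperature
    for _ in range(rounds):
        walkers = [anneal(x, temperature, steps_per_round, rng) for x in walkers]
        walkers.sort(key=toy_cost)
        if toy_cost(walkers[0]) < toy_cost(best):
            best = walkers[0]
        # GWTW step: overwrite the worse half with copies of the better half.
        half = num_walkers // 2
        walkers[half:] = walkers[:num_walkers - half]
        temperature *= 0.85  # assumed geometric cooling
    return best, toy_cost(best)
```

Calling `sa_with_gwtw()` returns the best state seen and its cost. Because the sketch uses a single seeded random generator and runs the walkers one after another, two calls with the same seed return the same result.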
Ensure the following dependencies are installed:
We provide recipe files to build a Docker/Singularity image for the code. Here are the steps to build the image:
cd tools
docker build --no-cache --tag mp_sa:rocky8 -f ./docker_mp_sa .
If you do not have root access to your server, you can create a Singularity image from the Docker image on your local machine using the following command:
## The following command creates the Singularity image: mp_sa_rocky8.sif
singularity build mp_sa_rocky8.sif docker-daemon://mp_sa:rocky8
Our SA code is deterministic, meaning that if you run it twice with the same
input (and IS_ASYNC=0), you will get the same output. We have tested our
code in an identical environment on both an Intel(R) Xeon(R) Gold and an AMD
EPYC 7763 CPU, and both produced the exact same results for our testcases.
Therefore, we expect that using a Singularity image or Docker image will ensure the reproducibility of our experimental results for the given testcases.
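One way to check this on your own machine is to run the same configuration twice with IS_ASYNC=0 and compare the resulting files byte for byte. The following is a minimal sketch; the helper names and the file paths in the usage note are hypothetical, and it assumes the output files contain no timestamps or other run-specific metadata.

```python
import hashlib
from pathlib import Path


def file_sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_match(first, second):
    """True if the two files have identical contents."""
    return file_sha256(first) == file_sha256(second)
```

For example, `outputs_match("run1/final.plc", "run2/final.plc")` (hypothetical paths) should return `True` for two deterministic runs with the same input.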
Execute the following commands:
mkdir build
cd build
cmake ..
make -j 10
Nine testcases and corresponding run scripts are available in the test directory. The testcases span three enablements: NG45, ASAP7, and GF12.
Note: GF12 testcases are not uploaded due to NDA restrictions.
For more details, refer to the RDF-2024 paper.
Follow these steps to run the code:
## 1. Update the following parameters in ./util/run.sh:
## - REPO_DIR: Path to the repository
## - DESIGN: Design name (e.g., ariane, bp, mempool_group)
## Please ensure the NETLIST and PLC file paths are correctly set.
## - IS_ASYNC: Set to 1 for asynchronous mode (non-deterministic)
## - ITER: Iteration count based on the design
## 2. Run the script:
REPO_DIR=<real_path_to_repo>
cp ${REPO_DIR}/util/run.sh .
chmod +x run.sh
./run.sh
## OR To launch using singularity image
singularity exec -B ${REPO_DIR},/tmp mp_sa_rocky8.sif ./run.sh
## OR To launch using docker image (Ensure that the REPO_DIR is mounted and the current directory is the workdir)
## For this case in run.sh set REPO_DIR=/mp_sa
docker run -v $(pwd):/workdir -v ${REPO_DIR}:/mp_sa mp_sa:rocky8 /workdir/run.sh
To ensure runtime stays within 12 hours, use the following iteration counts:
Set IS_ASYNC=1 in the run script to enable asynchronous mode.
Note:
- The SA code becomes non-deterministic in this mode.
- It will run faster, allowing for more iterations.
We run our SA with IS_ASYNC=0 on all testcases. In the table below, we
provide two proxy cost values per testcase: (i) the proxy cost from our FD
placer/evaluator that yields the best result when evaluated by the golden
Circuit Training evaluator, and (ii) the golden proxy cost value from the
Circuit Training evaluator. Our FD placer does not yield the same results
as the Circuit Training FD placer. Although our evaluator computes the proxy
cost exactly as the Circuit Training evaluator does, the differences between
the two FD placers lead to different proxy cost values.
Columns marked (SA) report the proxy cost based on the SA evaluator; columns marked (CT) report the proxy cost based on the CT (golden) evaluator.

| Design | WL (SA) | Den. (SA) | Cong. (SA) | Proxy (SA) | WL (CT) | Den. (CT) | Cong. (CT) | Proxy (CT) |
|---|---|---|---|---|---|---|---|---|
| Ariane-NG45 | 0.0879 | 0.5019 | 0.8917 | 0.7847 | 0.0898 | 0.5146 | 0.9068 | 0.8005 |
| BlackParrot-NG45 | 0.0550 | 0.6992 | 0.9433 | 0.8763 | 0.0543 | 0.7114 | 0.9361 | 0.8781 |
| MemPoolGroup-NG45 | 0.0604 | 1.2112 | 1.0540 | 1.1930 | 0.0616 | 1.1308 | 1.0948 | 1.1744 |
| Ariane-ASAP7 | 0.1001 | 0.8168 | 0.7580 | 0.8875 | 0.1081 | 0.8169 | 0.8216 | 0.9274 |
| BlackParrot-ASAP7 | 0.0533 | 0.7622 | 0.7466 | 0.8077 | 0.0529 | 0.7584 | 0.7505 | 0.8074 |
| MemPoolGroup-ASAP7 | 0.0675 | 1.3271 | 0.8235 | 1.1429 | 0.0690 | 1.3050 | 0.8338 | 1.1384 |
| CT-Ariane | 0.0760 | 0.5048 | 0.8027 | 0.7297 | 0.0811 | 0.5246 | 0.8138 | 0.7503 |
| CT-Ariane-X2 | 0.0681 | 0.4880 | 0.8272 | 0.7257 | 0.0672 | 0.4898 | 0.8341 | 0.7292 |
| CT-Ariane-X4 | 0.0539 | 0.4635 | 0.8076 | 0.6895 | 0.0522 | 0.4668 | 0.8146 | 0.6929 |
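The Proxy columns in the table are consistent with the usual Circuit Training weighting, proxy = WL + 0.5 × Den. + 0.5 × Cong. The following minimal sketch recomputes the SA-evaluator proxy cost for two rows of the table. The function name and the default weights are assumptions for illustration; the weights actually used in a run are whatever the evaluator is configured with.

```python
def proxy_cost(wirelength, density, congestion,
               density_weight=0.5, congestion_weight=0.5):
    """Weighted sum of the three cost components (weights assumed to be 0.5)."""
    return wirelength + density_weight * density + congestion_weight * congestion


# (WL, Den., Cong., reported Proxy) copied from the SA-evaluator columns.
rows = {
    "Ariane-NG45": (0.0879, 0.5019, 0.8917, 0.7847),
    "MemPoolGroup-NG45": (0.0604, 1.2112, 1.0540, 1.1930),
}

for design, (wl, den, cong, reported) in rows.items():
    recomputed = proxy_cost(wl, den, cong)
    # The table values are rounded to four decimals, so allow a small tolerance.
    print(f"{design}: recomputed={recomputed:.4f} reported={reported:.4f}")
```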
Once you have the final .plc files from the SA run, you can evaluate them using the Circuit Training plc_client. This step places the soft macros using the CT-FD placer and reports the proxy cost value. The evaluation code is located in the util directory.
python ./util/golden_eval.py <design> <netlist> <plc_file> <output_dir>
- .plc file with updated soft macro locations.
- .tcl file for placing macros in Innovus.
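When several designs need to be evaluated, the command above can be assembled in a loop. The sketch below is one possible wrapper: it only builds the argument list in the order shown above (design, netlist, plc_file, output_dir) and, unless `dry_run` is disabled, prints the commands instead of running them. The helper names and all paths in the usage note are hypothetical.

```python
import subprocess
import sys


def build_eval_command(design, netlist, plc_file, output_dir,
                       script="./util/golden_eval.py"):
    """Argument list for one evaluation run, in the documented order."""
    return [sys.executable, script, design, netlist, plc_file, output_dir]


def run_evaluations(jobs, dry_run=True):
    """Build one command per job; print them (dry run) or execute them."""
    commands = [build_eval_command(*job) for job in jobs]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError if the script exits non-zero.
            subprocess.run(command, check=True)
    return commands
```

For example, `run_evaluations([("ariane", "path/to/netlist", "path/to/final.plc", "eval_out")])` prints the command for a single design; the placeholder paths must be replaced with the real netlist and .plc locations.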