Has anyone tried optimizing SPEC CPU2006 with ZCC?

ZCC gives a 15-20% performance gain on CoreMark (baseline: gcc), but on SPEC2006 the gain stays small no matter how we configure it.

Hello, sadbread!

Could you describe the config files you use for gcc and zcc?

Hello, zcc and gcc use the same compile options:
-march=rv64g_zicond_zba_zbb -O3 -falign-functions=32

These are our tuned SPEC2006 config files for scalar and vector targets; you can give them a try.
Also, could you tell us which platform you are testing on? :smiley:

zcc-linux-rv64imafdc-4.1.1.cfg

# This file comes from https://www.francisz.cn/2021/10/04/spec2006-usage.
# This is a sample config file for CPU2006. It was tested with:
#
#      Compiler name/version:       [gcc, g++, gfortran 4.3.4]
#                                   [gcc, g++, gfortran 4.4.4]
#                                   [gcc, g++, gfortran 4.6.0]
#      Operating system version:    [SLES 11 SP1, 64-bit, gcc 4.3.4 and 4.6.0]
#                                   [RHEL 6, 64-bit, gcc 4.4.4]
#      Hardware:                    [Opteron]
#
# If your platform uses different versions, different
# hardware or operates in a different mode (for
# example, 32- vs. 64-bit mode), there is the possibility
# that this configuration file may not work as-is.
#
# Note that issues with compilation should be directed
# to the compiler vendor. Information about SPEC technical
# support can be found in the techsupport document in the
# Docs directory of your benchmark installation.
#
# Also note that this is a sample configuration. It
# is expected to work for the environment in which
# it was tested; it is not guaranteed that this is
# the config file that will provide the best performance.
#
# Note that you might find a more recent config file for
# your platform with the posted results at
# www.spec.org/cpu2006
####################################################################
# AMD64 (64-bit) gcc 4.3, 4.4 and 4.6 config file
# Sample Config file for CPU2006
#####################################################################

%ifndef %{specific_ext}
%   define  specific_ext     riscv-zcc
%endif

ignore_errors = no
tune          = base
size          = test
ext           = %{specific_ext}
output_format = asc, pdf, Screen
reportable    = 1
teeout        = yes
teerunout     = yes
hw_avail = Dec-9999
license_num = 9999
test_sponsor = SFC
prepared_by =
tester      =
#test_date = Dec-9999
default=default=default=default:
#####################################################################
#
# Compiler selection
#
#####################################################################
default:
#  EDIT: The parent directory for your compiler.
#        Do not include the trailing /bin/
#        Do not include a trailing slash
%ifndef %{TOOLCHAIN_PATH}
%   define  TOOLCHAIN_PATH        "/path/to/install-zcc"  # EDIT (see above)
%endif
# NOTE: The path may be different if you use the compiler from
#       the gnu site.

CC                 = %{TOOLCHAIN_PATH}/bin/zcc --target=riscv64-unknown-linux-gnu -fdelayed-template-parsing
CXX                = %{TOOLCHAIN_PATH}/bin/z++ --target=riscv64-unknown-linux-gnu -std=c++03 -fdelayed-template-parsing
FC                 = %{TOOLCHAIN_PATH}/bin/zfc --target=riscv64-unknown-linux-gnu

## HW config
hw_model     = EVB-V1
hw_cpu_name  = U74-V7100
hw_cpu_char  =
hw_cpu_mhz   = 1000
hw_fpu       = Integrated
hw_nchips    = 1
hw_ncores    = 1
hw_ncoresperchip   = 2
hw_nthreadspercore = 1
hw_ncpuorder = 1 chip
hw_pcache    = 32 KiB 8-way
hw_scache    = 128KiB 16-way
#hw_tcache    = None
#hw_ocache    = None
hw_memory     = 8 GB (DDR4)
hw_disk       = MMC
hw_vendor     = starfive

## SW config
sw_os        = linux (for RISCV)
sw_file      = ext3
sw_state     = runlevel 3
sw_compiler        = zcc, z++ & zfc 3.0
sw_avail = Dec-9999
sw_other = None
sw_auto_parallel = No
sw_base_ptrsize = 64-bit
sw_peak_ptrsize = 64-bit

#####################################################################
# Notes
#####################################################################
notes_os_000 ='ulimit -s unlimited'

#####################################################################
# Optimization
#####################################################################

ARCH=rv64gc

%ifndef %{COPTIMIZE}
%   define  COPTIMIZE        -static -g -O3 -flto -march=$[ARCH] -mllvm --no-unsigned-wrap=true
%endif

%ifndef %{CXXOPTIMIZE}
%   define  CXXOPTIMIZE        -static -g -O3 -flto -march=$[ARCH] -mllvm --no-unsigned-wrap=true
%endif

%ifndef %{FOPTIMIZE}
%   define  FOPTIMIZE        -static -g -O3 -flto -march=$[ARCH]
%endif

default=base=default=default:
COPTIMIZE    = %{COPTIMIZE}
CXXOPTIMIZE  = %{CXXOPTIMIZE}
FOPTIMIZE    = %{FOPTIMIZE}

notes0100= C base flags: $[COPTIMIZE]
notes0110= C++ base flags: $[CXXOPTIMIZE]
notes0120= Fortran base flags: $[FOPTIMIZE]

#####################################################################
# Optimization
#####################################################################
##peak
default=peak=default=default:
COPTIMIZE    = %{COPTIMIZE}
CXXOPTIMIZE  = %{CXXOPTIMIZE}
FOPTIMIZE    = %{FOPTIMIZE}

notes0100= C peak flags: $[COPTIMIZE]
notes0110= C++ peak flags: $[CXXOPTIMIZE]
notes0120= Fortran peak flags: $[FOPTIMIZE]

400.perlbench=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

462.libquantum=peak=default=default:
EXTRA_OPTIMIZE = -funroll-loops

410.bwaves=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

434.zeusmp=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

436.cactusADM=peak=default=default:
EXTRA_OPTIMIZE = -ffinite-math-only

444.namd=peak=default=default:
EXTRA_OPTIMIZE = -fnon-call-exceptions

459.GemsFDTD=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

465.tonto=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

#####################################################################
# 32/64 bit Portability Flags - all
#####################################################################

default=base=default=default:
notes25= PORTABILITY=-DSPEC_CPU_LP64 is applied to all benchmarks in base.
PORTABILITY    = -DSPEC_CPU_LP64

#####################################################################
# Portability Flags
#####################################################################

400.perlbench=default=default=default:
notes35    = 400.perlbench: -DSPEC_CPU_LINUX_X64
CPORTABILITY= -DSPEC_CPU_LINUX_X64 -std=gnu89

401.bzip2=default=default=default:
CPORTABILITY   = -Wno-int-conversion

456.hmmer=default=default=default:
CPORTABILITY   = -std=gnu89

462.libquantum=default=default=default:
notes60= 462.libquantum: -DSPEC_CPU_LINUX
CPORTABILITY= -DSPEC_CPU_LINUX

464.h264ref=default=default=default:
CPORTABILITY= -fsigned-char

483.xalancbmk=default=default=default:
CXXPORTABILITY= -DSPEC_CPU_LINUX -include cstring

#####################################################################
# Portability Flags - FP
#####################################################################
410.bwaves=default=default=default:
FPORTABILITY=

416.gamess=default=default=default:
FPORTABILITY=

433.milc=default=default=default:

434.zeusmp=default=default=default:
FPORTABILITY=

435.gromacs=default=default=default:

436.cactusADM=default=default=default:

437.leslie3d=default=default=default:
FPORTABILITY=

444.namd=default=default=default:

447.dealII=default=default=default:
CXXPORTABILITY=  -fpermissive -include cstring

450.soplex=default=default=default:
CXXPORTABILITY= -std=c++98

454.calculix=default=default=default:
CPORTABILITY= -Wno-int-conversion

459.GemsFDTD=default=default=default:
FPORTABILITY=

465.tonto=default=default=default:
FPORTABILITY=

470.lbm=default=default=default:

481.wrf=default=default=default:
CPORTABILITY = -DSPEC_CPU_CASE_FLAG -DSPEC_CPU_LINUX -std=gnu89
FPORTABILITY=

482.sphinx3=default=default=default:
CPORTABILITY= -fsigned-char

zcc-linux-rv64imafdcv-4.1.1.cfg

# This file comes from https://www.francisz.cn/2021/10/04/spec2006-usage.
# This is a sample config file for CPU2006. It was tested with:
#
#      Compiler name/version:       [gcc, g++, gfortran 4.3.4]
#                                   [gcc, g++, gfortran 4.4.4]
#                                   [gcc, g++, gfortran 4.6.0]
#      Operating system version:    [SLES 11 SP1, 64-bit, gcc 4.3.4 and 4.6.0]
#                                   [RHEL 6, 64-bit, gcc 4.4.4]
#      Hardware:                    [Opteron]
#
# If your platform uses different versions, different
# hardware or operates in a different mode (for
# example, 32- vs. 64-bit mode), there is the possibility
# that this configuration file may not work as-is.
#
# Note that issues with compilation should be directed
# to the compiler vendor. Information about SPEC technical
# support can be found in the techsupport document in the
# Docs directory of your benchmark installation.
#
# Also note that this is a sample configuration. It
# is expected to work for the environment in which
# it was tested; it is not guaranteed that this is
# the config file that will provide the best performance.
#
# Note that you might find a more recent config file for
# your platform with the posted results at
# www.spec.org/cpu2006
####################################################################
# AMD64 (64-bit) gcc 4.3, 4.4 and 4.6 config file
# Sample Config file for CPU2006
#####################################################################

%ifndef %{specific_ext}
%   define  specific_ext     riscv-v-zcc
%endif

ignore_errors = no
tune          = base
size          = test
ext           = %{specific_ext}
output_format = asc, pdf, Screen
reportable    = 1
teeout        = yes
teerunout     = yes
hw_avail = Dec-9999
license_num = 9999
test_sponsor = SFC
prepared_by =
tester      =

#test_date = Dec-9999
default=default=default=default:
#####################################################################
#
# Compiler selection
#
#####################################################################
default:
#  EDIT: The parent directory for your compiler.
#        Do not include the trailing /bin/
#        Do not include a trailing slash
%ifndef %{TOOLCHAIN_PATH}
%   define  TOOLCHAIN_PATH        "/path/to/install-zcc"  # EDIT (see above)
%endif
# NOTE: The path may be different if you use the compiler from
#       the gnu site.
CC                 = %{TOOLCHAIN_PATH}/bin/zcc --target=riscv64-unknown-linux-gnu -fdelayed-template-parsing
CXX                = %{TOOLCHAIN_PATH}/bin/z++ --target=riscv64-unknown-linux-gnu -std=c++03 -fdelayed-template-parsing
FC                 = %{TOOLCHAIN_PATH}/bin/zfc --target=riscv64-unknown-linux-gnu

## HW config
hw_model     = EVB-V1
hw_cpu_name  = U74-V7100
hw_cpu_char  =
hw_cpu_mhz   = 1000
hw_fpu       = Integrated
hw_nchips    = 1
hw_ncores    = 1
hw_ncoresperchip   = 2
hw_nthreadspercore = 1
hw_ncpuorder = 1 chip
hw_pcache    = 32 KiB 8-way
hw_scache    = 128KiB 16-way
#hw_tcache    = None
#hw_ocache    = None
hw_memory     = 8 GB (DDR4)
hw_disk       = MMC
hw_vendor     = starfive

## SW config
sw_os        = linux (for RISCV)
sw_file      = ext3
sw_state     = runlevel 3
sw_compiler        = zcc, z++ & zfc 3.0
sw_avail = Dec-9999
sw_other = None
sw_auto_parallel = No
sw_base_ptrsize = 64-bit
sw_peak_ptrsize = 64-bit

#####################################################################
# Notes
#####################################################################
notes_os_000 ='ulimit -s unlimited'

#####################################################################
# Optimization
#####################################################################

ARCH=rv64gcv

%ifndef %{COPTIMIZE}
%   define  COPTIMIZE        -static -g -O3 -flto -march=$[ARCH] -mllvm --no-unsigned-wrap=true -mllvm --binop-rhs-first=false -mllvm --enable-loop-distribute=true -mllvm --loop-distribute-non-if-convertible=true -mllvm --riscv-inline-libcall=false -Wl,-mllvm,--riscv-inline-libcall=false
%endif

%ifndef %{CXXOPTIMIZE}
%   define  CXXOPTIMIZE        -static -g -O3 -flto -march=$[ARCH] -mllvm --no-unsigned-wrap=true -mllvm --binop-rhs-first=false -mllvm --enable-loop-distribute=true -mllvm --loop-distribute-non-if-convertible=true -mllvm --riscv-inline-libcall=false -Wl,-mllvm,--riscv-inline-libcall=false
%endif

%ifndef %{FOPTIMIZE}
%   define  FOPTIMIZE        -static -g -O3 -flto -march=$[ARCH] -mllvm --enable-loop-distribute=true -mllvm --loop-distribute-non-if-convertible=true -mllvm --riscv-inline-libcall=false -Wl,-mllvm,--riscv-inline-libcall=false
%endif

default=base=default=default:
COPTIMIZE    = %{COPTIMIZE}
CXXOPTIMIZE  = %{CXXOPTIMIZE}
FOPTIMIZE    = %{FOPTIMIZE}

notes0100= C base flags: $[COPTIMIZE]
notes0110= C++ base flags: $[CXXOPTIMIZE]
notes0120= Fortran base flags: $[FOPTIMIZE]

#####################################################################
# Optimization
#####################################################################
##peak
default=peak=default=default:
COPTIMIZE    = %{COPTIMIZE}
CXXOPTIMIZE  = %{CXXOPTIMIZE}
FOPTIMIZE    = %{FOPTIMIZE}

notes0100= C peak flags: $[COPTIMIZE]
notes0110= C++ peak flags: $[CXXOPTIMIZE]
notes0120= Fortran peak flags: $[FOPTIMIZE]

400.perlbench=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

462.libquantum=peak=default=default:
EXTRA_OPTIMIZE = -funroll-loops

410.bwaves=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

434.zeusmp=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

436.cactusADM=peak=default=default:
EXTRA_OPTIMIZE = -ffinite-math-only

444.namd=peak=default=default:
EXTRA_OPTIMIZE = -fnon-call-exceptions

459.GemsFDTD=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

465.tonto=peak=default=default:
EXTRA_OPTIMIZE = -fpeel-loops -funroll-loops -ffast-math -ftree-vectorize

#####################################################################
# 32/64 bit Portability Flags - all
#####################################################################

default=base=default=default:
notes25= PORTABILITY=-DSPEC_CPU_LP64 is applied to all benchmarks in base.
PORTABILITY    = -DSPEC_CPU_LP64

#####################################################################
# Portability Flags
#####################################################################

400.perlbench=default=default=default:
notes35    = 400.perlbench: -DSPEC_CPU_LINUX_X64
CPORTABILITY= -DSPEC_CPU_LINUX_X64 -std=gnu89

401.bzip2=default=default=default:
CPORTABILITY   = -Wno-int-conversion

456.hmmer=default=default=default:
CPORTABILITY   = -std=gnu89

462.libquantum=default=default=default:
notes60= 462.libquantum: -DSPEC_CPU_LINUX
CPORTABILITY= -DSPEC_CPU_LINUX

464.h264ref=default=default=default:
CPORTABILITY= -fsigned-char

483.xalancbmk=default=default=default:
CXXPORTABILITY= -DSPEC_CPU_LINUX -include cstring

#####################################################################
# Portability Flags - FP
#####################################################################
410.bwaves=default=default=default:
FPORTABILITY=

416.gamess=default=default=default:
FPORTABILITY=

433.milc=default=default=default:

434.zeusmp=default=default=default:
FPORTABILITY=

435.gromacs=default=default=default:

436.cactusADM=default=default=default:

437.leslie3d=default=default=default:
FPORTABILITY=

444.namd=default=default=default:

447.dealII=default=default=default:
CXXPORTABILITY=  -fpermissive -include cstring

450.soplex=default=default=default:
CXXPORTABILITY= -std=c++98

454.calculix=default=default=default:
CPORTABILITY= -Wno-int-conversion

459.GemsFDTD=default=default=default:
FPORTABILITY=

465.tonto=default=default=default:
FPORTABILITY=

470.lbm=default=default=default:

481.wrf=default=default=default:
CPORTABILITY = -DSPEC_CPU_CASE_FLAG -DSPEC_CPU_LINUX -std=gnu89
FPORTABILITY=

482.sphinx3=default=default=default:
CPORTABILITY= -fsigned-char

These config files are used as follows:

Introduction

This directory provides the following ZCC config files for SPEC CPU2006:

zcc-linux-rv64imafdc-4.1.1.cfg
zcc-linux-rv64imafdcv-4.1.1.cfg

Every cpu2006 project directory contains a config directory. Copy the ZCC config file you need into that directory, then select it the way the official SPEC CPU documentation describes.
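The copy step can be sketched as a small shell helper. This is only an illustration: the function name install_zcc_cfgs and both directory arguments are made up for this example, and it assumes the SPEC tree has the usual config directory directly under its install root.

```shell
# Hypothetical helper (not part of ZCC or SPEC): copy both ZCC config
# files into the config directory of a SPEC CPU2006 installation.
#   $1 = directory that holds the downloaded .cfg files
#   $2 = SPEC CPU2006 install root (assumed to contain config/)
install_zcc_cfgs() {
    src_dir=$1
    spec_root=$2
    if [ ! -d "$spec_root/config" ]; then
        echo "no config directory under $spec_root" >&2
        return 1
    fi
    for cfg in zcc-linux-rv64imafdc-4.1.1.cfg zcc-linux-rv64imafdcv-4.1.1.cfg; do
        cp "$src_dir/$cfg" "$spec_root/config/" || return 1
    done
}
```

For example, install_zcc_cfgs ~/Downloads /opt/cpu2006 (both paths are placeholders) would leave the two files under /opt/cpu2006/config.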

SPEC CPU2006

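For building and running SPEC CPU2006, see: Rules SPEC CPU2006

After preparing the build environment by following the steps on that page, specify the ZCC config file when building. For example, to build the int and fp suites with zcc-linux-rv64imafdc-4.1.1.cfg: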

source shrc
runspec --config zcc-linux-rv64imafdc-4.1.1 --action build int
runspec --config zcc-linux-rv64imafdc-4.1.1 --action build fp
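Beyond building, a quick single-benchmark check may be useful before a full run. The sketch below only prints a runspec command line and does not execute it. It relies on two things visible in the config files: TOOLCHAIN_PATH is wrapped in %ifndef, so it can be supplied with --define, and reportable = 1 is set, which as far as I know does not allow picking individual benchmarks, hence --noreportable. The function name zcc_smoke_cmd and the /opt/zcc prefix are assumptions made for this example.

```shell
# Sketch: print (do not execute) a runspec command for a quick,
# non-reportable check with the test workload.
#   $1 = config name without the .cfg suffix
#   $2 = ZCC install prefix (hypothetical path, no trailing /bin)
#   remaining arguments = benchmarks or suites to run
zcc_smoke_cmd() {
    cfg=$1
    toolchain=$2
    shift 2
    echo "runspec --config $cfg --define TOOLCHAIN_PATH=$toolchain" \
         "--noreportable --tune base --size test --iterations 1 $*"
}
```

After source shrc, calling zcc_smoke_cmd zcc-linux-rv64imafdc-4.1.1 /opt/zcc 462.libquantum prints a line that can be reviewed and then pasted into the shell.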

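NOTE: The three cases 410.bwaves, 434.zeusmp, and 481.wrf allocate dynamic Fortran arrays on the stack, which causes a stack overflow. Running ulimit -s unlimited to raise the operating system's stack size limit resolves this.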
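One way to apply the stack limit without changing the interactive shell is a small wrapper, sketched here. The name with_unlimited_stack is invented for this example. The wrapper warns and continues if the limit cannot be raised, which is a design choice of this sketch and not something the note above prescribes.

```shell
# Sketch: run one command with the stack limit raised. The subshell
# keeps the new limit local, so the calling shell is left unchanged.
with_unlimited_stack() {
    (
        if ! ulimit -s unlimited 2>/dev/null; then
            echo "warning: could not raise the stack limit; continuing" >&2
        fi
        "$@"
    )
}
```

A run of the affected cases would then look like with_unlimited_stack runspec --config zcc-linux-rv64imafdc-4.1.1 --noreportable 410.bwaves 434.zeusmp 481.wrf, where the runspec arguments are only an example.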

Thank you very much. Our test platforms mainly include our in-house IP core, Kunminghu v2, and the Milk-V Jupiter.

I am interested in this as well. Have you measured the SPEC06 performance gain of zcc on XiangShan Kunminghu v3?