    Add online compilation for dynamic kernels (#37) · 1685048a
    Qianfeng authored
    * Add online-compiling facility
    
    * Synchronize from fwd-v4r5 and implement host interfaces to call conv-fwd v4r4/v4r5 using on-line compiling method
    
    * Tiny adjustment to time reporting
    
    * Use object assignment to replace explicit bytes copying in the first kernel of v4r4/v4r5
    
    * Use single thread to assign descriptor object to device memory
    
    * Adjust to the workload assignment of the two kernels of v4r4 (experimental)
    
    * Revert "Adjust to the workload assignment of the two kernels of v4r4 (experimental)"
    
    This reverts commit eb384614.
    
    * Update to make constexpr for generating descriptor types in kernel 2 of dynamic conv-fwd v4r4
    
    * Update to dynamic conv-fwd v4r4 online-compiling
    
    * Update to dynamic conv-fwd v4r5 online-compiling (result not accurate)
    
    * Tiny update to driver/CMakeLists.txt
    
    * clang-format
    
    * Tiny comments change
    
    * Add env OLC_DUMP_SAVE_TMP_DIR to support saving the temporary directory
    
    * Fwd v4r5 olc perf (#39)
    
    * added hip-clang flags that fix perf issue of online compilation
    
    * fix bug for olc fwd-v4r5-nchw
    
    * Move constexpr and type reference statements out of the function body in conv-fwd v4r4/v4r5 kernel wrapper
    
    * Remove printing in hip_build_utils.cpp
    
    * Update to root CMakeLists.txt
    
    * Revert "Move constexpr and type reference statements out of the function body in conv-fwd v4r4/v4r5 kernel wrapper"
    
    This reverts commit 3d2c5d8e.

    Co-authored-by: Chao Liu <chao.liu2@amd.com>
    Co-authored-by: Chao Liu <lc.roy86@gmail.com>
    Co-authored-by: root <root@dc-smc-18.amd.com>
TargetFlags.cmake 1.66 KB