Commit graph (date axis runs from 31 Aug back to 7 Jul). Commit messages and branch labels, in the order shown:

fix host bias+gelu bug
attention kernel proper granular padding for all 4 dims
shrink input value range for attention kernel validation to avoid occasional error by 1e-3
add adhoc padding test for atten
refactor attention padding to better fit in unit tests
add TNTT gemm_gemm + atten kernel instances
add gemm spec in kernel name
change input parameter for ckProfiler
ckprofiler finish code
trim unnecessary check
modify comment
add conv+conv example, 1x1 only
conv_conv_v2
conv_conv_v2
Add debug info to DeviceBatchedGemmXdl and instances to batched_gemm
Minor fix
Add debug info to DeviceGemmXdl_CShuffle and instances to gemm_add_add_fastgelu
refactor conv
update test
conv_conv
conv_conv
Add debug info to DeviceGemmXdl_CShuffle
Add debug info to DeviceGemmXdl
Gemm reduce examples int4/int8/fp32/bf16 (#368)
Padding for attention: bmm+scale+softmax+bmm kernel (#385)
start add to ckprofiler
add instance
gelu change to relu and GetElementSpaceSize bug
Merge branch 'tensor_permutation' of github.com:ROCmSoftwarePlatform/composable_kernel into tensor_permutation
changed deviceelementwise parameters for outscalar
Fix code-comment mismatch
Remove macro PP_DEFINE_LAYOUT_TYPE()
add comments for usages of padding bmm+scale+softmax+bmm
att-with-mask
att-with-mask
Add gemm instances
Use different initialization method for examples
Add 'final' specifier to utility classes
Group same-dim-layouts together in 'LayoutSetting<>'
Add check for the 'RLayout' type argument
Use same A/B data type for host Conv in int4 example
Remove debug messages
Use named variables to replace magic numbers
Merge remote-tracking branch 'origin/develop' into conv_conv
Merge branch 'develop' into feature/add-convnd-fwd-reduce-examples
Try to workaround flaky GemmSoftmaxGemm tests (#386)