Commit graph (date axis runs from 7 Sep back to 25 Jul). Commit messages, in the order shown:

Add stderr to QA logfiles, process splitK and ONNX gemm kernels (#402)
Add 'BlockSize' parameter to 'DevicePermute'
make device/grid level code
gemm_splitk_bias
gemm_splitk_bias
Add missing include directive
Check tensor descriptor dimensions in 'GridwiseElementwise_1D'
change profiler and instance to mutiple d0
add dynamic d0_element_op
mutiple d0
Rename 'GridwisePermute' to 'GridwiseCopy'
Add N/H/WPerBlock template parameter to 'DevicePermute'
Add comment to indicate template argument location
add some code
Explicitly use ck::math::sqrt in batchnorm-forward kernels
Merge remote-tracking branch 'origin/develop' into aosewski/softmax_ut
Add debug code the verify result
Test non innermost dim for fp32 and int8
Fix syntax.
Test cases when reduced dim is not innermost axis.
Renaming in the kernel arguments
host softmax: handle all reduce
Transform descriptor into 3 dimensions
Tiny correction and remove un-used file under example/34_batchnorm
Change problem description for 'DevicePermute'
Remove never-entered-if-clause
Remove no-longer used method
Check if input/output shape meet the requirement
init version
fix example; make padding on by default in example; fix argument checks
Merge branch 'develop' into tensor_permutation
added elementwise permute example
Merge branch 'develop' into feature/add-permute-device-op
Remove no-longer used type argument
Add 'GridwisePermute' kernel
Fused attention instances & padding tests (#395)
GemmGemm TNNT instances (#399)
fix format
Use more reasonable return value for Invoker::Run()
Passing 'axes' to 'DevicePermute'
Softmax client example (#396)
remove useless code