perf_counter v1.9.8
A dedicated performance counter for Cortex-M Systick. It shares the SysTick with users' original SysTick function(s) without interfering with it.
perf_counter (v1.9.8)

A dedicated performance counter for Cortex-M Systick. It shares the SysTick with users' original SysTick function(s) without interfering with it. This library will bring new functionalities, such as performance counter,delay_us and clock() service defined in time.h.

Features:

  • Measure CPU cycles for specified code segment
  • [NEW] Enhanced measurement services for RTOS
    • Measures RAW / True cycles used for specified code segment inside a thread, i.e. scheduling cost are removed.
    • Measure RAW/True cycles used for a data-process-path across multiple threads.
  • Easy to use
  • Support ALL Cortex-M processors
    • Including Cortex-M85 and Star-MC1
  • Provide Free Services
    • Do NOT interfering with existing SysTick based applications
  • Support ALL arm compilers
    • Arm Compiler 5 (armcc), Arm Compiler 6 (armclang)
    • arm gcc
    • LLVM
    • IAR
  • Simplified Deployment
    • Drag-and-Drop deployment for Arm Compiler 5 and Arm Compiler 6.
    • CMSIS-Pack is available
    • RT-Thread package is avaialble
  • Time based services
  • Support both RTOS and bare-metal environments
  • Utilities for C language enhancement
    • Macros to detect compilers, e.g. __IS_COMPILER_ARM_COMPILER_6__, __IS_COMPILER_LLVM__ etc.
    • Macro to create atomicity for specified code block, i.e. __IRQ_SAFE{...}
    • Helper macros for C language extension:
      • VB like with()
      • foreach(), dimof(), CONNECT()
      • C# like using()
      • simple overload feature of OOPC made out of ANSI-C99, __PLOOC_VA_NUM_ARGS()
      • ...

1. How To Use

1.1 Measure CPU cycles for specified code segment

You can measure specified code segment with a macro helper __cycleof__(), it is a wrapper of get_system_ticks().

Syntax:

__cycleof__(<Description String for the target>, [User Code, see ref 1]) {
...
}

Here, [ref 1] is a small user code to read the measurement result via a local variable __cycle_count__ for perl lovers, you can also use "`_`" to read the result. This User Code is optional. If you don't put anything here, the measured result will be shown with a printf().

<strong>Example 1:</strong> Simple measurement with printf

__cycleof__() {
foreach(example_lv0_t, s_tItem, ptItem) {
printf("Processing item with ID = %d\r\n", _->chID);
}
}

You will see the measured result in console:

image-20220509004258020

<strong>Example 2:</strong> Read measured result via <tt>__cycle_counter__</tt>

int32_t iCycleResult = 0;
/* measure cycles and store it in a dedicated variable without printf */
__cycleof__("delay_us(1000ul)",
/* insert code to __cycleof__ body, "{}" can be omitted */
{
iCycleResult = __cycle_count__; /*< "__cycle_count__" stores the result */
}) {
delay_us(1000ul);
}
printf("\r\n delay_us(1000ul) takes %d cycles\r\n", (int)iCycleResult);
void delay_us(int32_t nUs)
delay specified time in microsecond
Definition: perf_counter.c:300

The result is read out from __cycle_count__and used in other place:

image-20220509004714845

1.2 Timestamp

You can get the system timestamp (since the initialization of perf_counter service) via function get_system_ticks() and get_system_ms().

NOTE: The get_system_ms() is NOT a wrapper of the function get_system_ticks().

There are various way to take advantage of those functions.

Example 3: Use <tt>get_system_ms()</tt> as random seed

#include <stdio.h>
#include <stdlib.h>
#include "perf_counter.h"
int main (void)
{
int i, n;
n = 5;
/* Intializes random number generator */
srand((unsigned) get_system_ticks());
/* Print 5 random numbers from 0 to 1024 */
for( i = 0 ; i < n ; i++ ) {
printf("%d\n", rand() & 0x3FF);
}
return(0);
}
int64_t get_system_ticks(void)
get the elapsed cycles since perf_counter is initialised
Definition: perf_counter.c:373

Example 4: Measure CPU cycles

do {
int64_t tStart = get_system_ticks();
__IRQ_SAFE {
printf("no interrupt \r\n");
}
printf("used clock cycle: %d", (int32_t)(get_system_ticks() - tStart));
} while(0);

This example shows how to use the delta value of get_system_ticks() to measure the CPU cycles used by specified code segment. In fact, the __cycleof__() is implemented in the same way:

#define __cycleof__(__STR, ...) \
using(int64_t _ = get_system_ticks(), __cycle_count__ = _, \
_=_, { \
_ = get_system_ticks() - _; \
__cycle_count__ = _; \
if (__PLOOC_VA_NUM_ARGS(__VA_ARGS__) == 0) { \
printf("\r\n"); \
printf("-[Cycle Report]"); \
printf("--------------------------------------------\r\n"); \
printf(__STR " total cycle count: %d [%08x]\r\n", \
(int)_, (int)_); \
} else { \
__VA_ARGS__ \
}; \
})

2. How To Deploy

2.1 Generic(Default) method for all compilers

2.1.1 For Bare-metal:

  1. Clone the code to your local with following command lines:
git clone https://github.com/GorgonMeducer/perf_counter.git
  1. Add including path for perf_counter folder
  2. Add perf_counter.c to your compilation
  3. Include perf_counter.h in corresponding c source file:
#include "perf_counter.h"
  1. Make sure your system contains the CMSIS (with a version 5.7.0 or above) as perf_counter.h includes cmsis_compiler.h.
  2. Call the function user_code_insert_to_systick_handler() in your SysTick_Handler()
void SysTick_Handler(void)
{
...
...
}
void user_code_insert_to_systick_handler(void)
a system timer handler inserted to the SysTick_Handler
Definition: perf_counter.c:196
  1. Make sure the SystemCoreClock is updated with the same value as CPU frequency.
  2. IMPORTANT: Make sure the SysTick_CTRL_CLKSOURCE_Msk bit ( bit 2) of SysTick->CTRL register is 1 that means SysTick runs with the same clock source as the target Cortex-M processor.
  3. Initialize the perf_counter with boolean value that indicates whether the user applications and/or RTOS have already occupied the SysTick.
void main(void)
{
SystemCoreClockUpdate();
...
while(1) {
...
}
}
void init_cycle_counter(bool bIsSysTickOccupied)
initialise cycle counter service
Definition: perf_counter.c:226
  1. IMPORTANT: Please enable GNU extension in your compiler. For GCC and CLANG, it is --std=gnu99 or --std=gnu11, and for other compilers, please check the user manual first. Failed to do so, you will not only trigger the warning in perf_counter.h, but also lose the function correctness of __cycleof__() and __super_loop_monitor__(), because __PLOOC_VA_NUM_ARGS() isn't report 0 when passed with no argument.
#if __PLOOC_VA_NUM_ARGS() != 0
#warning Please enable GNC extensions, it is required by __cycleof__() and \
__super_loop_monitor__()
#endif
  1. It is nice to add macro definition __PERF_COUNTER__ to your project GLOBALLY. It helps other module to detect the existence of perf_counter. For Example, LVGL lv_conf_cmsis.h use this macro to detect perf_counter and uses get_system_ms() to implement lv_tick_get().

Enjoy !

2.2 Use cmsis-pack in MDK

  1. Download the cmsis-pack from cmsis-pack folder. It is a file with name GorgonMeducer.perf_counter.<version>.pack, for example GorgonMeducer.perf_counter.1.9.4.pack
  2. Double click it to install this cmsis-pack. Once finished, you can find it in your Pack-Installer:

    In the future, you can pull the latest version of perf_counter from the menu Packs->Check For Updates as shown below:

    image-20220509011327392

  3. Open the RTE management window, find the Utilities and select the Core inside perf_counter as shown below:

  1. Include perf_counter.h in corresponding c source file:
#include "perf_counter.h"
  1. Make sure your system contains the CMSIS (with a version 5.7.0 or above) as perf_counter.h includes cmsis_compiler.h. Usually, you should do this with RTE as shown below:

image-20220509012432408

  1. Make sure the SystemCoreClock is updated with the same value as CPU frequency.
  2. IMPORTANT: Make sure the SysTick_CTRL_CLKSOURCE_Msk bit ( bit 2) of SysTick->CTRL register is 1 that means SysTick runs with the same clock source as the target Cortex-M processor.
  3. Initialize the perf_counter with boolean value that indicates whether the user applications and/or RTOS have already occupied the SysTick.
void main(void)
{
SystemCoreClockUpdate();
...
while(1) {
...
}
}
  1. IMPORTANT: Please enable GNU extension in your compiler.

    For Arm Compiler 5, please select both C99 mode and GNU extensions in the Option for target dialog as shown below:

image-20220509012752097

For Arm Compiler 6, please select gnu99 or gnu11 in Language C drop-list as shown below:

image-20220509012944724

Failed to do so, you will not only trigger the warning in perf_counter.h, but also lose the function correctness of __cycleof__() and __super_loop_monitor__(), because __PLOOC_VA_NUM_ARGS() isn't report 0 when passed with no argument.

#if __PLOOC_VA_NUM_ARGS() != 0
#warning Please enable GNC extensions, it is required by __cycleof__() and \
__super_loop_monitor__()
#endif

2.3 Use perf_counter in RT-Thread RTOS

perf_counter has registered as one of the RT-Thread software packages, which locats in system category. In ENV or RT-Thread Studio, you just need to simply enable cputime framework. RT-Thread will automatically enable perf_counter if you are using Cortex-M architecture.

rt-thread-settings

Enjoy !

3. FAQ

3.1 Why I see <tt>Undefined symbol $Super$$SysTick_Handler</tt>

This error usually pops-up in Arm Compiler 5 and Arm Compiler 6, it is because you haven't implement any non-weak Systick_Handler(). Please provide an EMPTY one in any c source file to solve this problem:

void SysTick_Handler(void)
{
}

NOTE: If you deploy perf_counter using cmsis-pack and encounter this issue, please DO NOT call function user_code_insert_to_systick_handler() in this should-be-empty SysTick_Handler().

4. License

Performance Counter for Cortex-M, a.k.a. perf_counter is under Apache 2.0 license.