A dedicated performance counter for Cortex-M SysTick. It shares the SysTick with the user's original SysTick function(s) without interfering with them. This library brings new functionality, such as a performance counter, `delay_us` and the `clock()` service defined in `time.h`.
Here, [**ref 1**] is a small piece of user code that reads the measurement result via a local variable `__cycle_count__`. This user code is optional. If you don't put anything here, the measured result will be shown with `__perf_counter_printf__`.
For both bare-metal and OS environment, you can measure the CPU Usage with macro `__cpu_usage__()` for a given code segment as long as it is executed repeatedly.
**Syntax**
```c
__cpu_usage__(<Iteration Count before getting an average result>, [User Code, see ref 1]) {
    //! target code segment of measurement
    ...
}
```
Here, [**ref 1**] is a small piece of user code that reads the measurement result via a local variable `__usage__`. This user code is optional. If you don't put anything here, the measured result will be shown with `__perf_counter_printf__`.
##### Example 1: The following code will show 30% CPU usage
```c
void main(void)
{
    ...
    while (1) {
        __cpu_usage__(10) {
            delay_us(30000);
        }
        delay_us(70000);
    }
    ...
}
```
##### Example 2: Read measurement result via `__usage__`
```c
void main(void)
{
    ...
    while (1) {
        float fUsage = 0.0f;
        __cpu_usage__(10, {
            fUsage = __usage__; /*< "__usage__" stores the result */
        }) {
            delay_us(30000);
        }
        printf("task 1 cpu usage %3.2f %%\r\n", (double)fUsage);
        delay_us(70000);
    }
    ...
}
```
NOTE: `__usage__` stores the CPU usage as a percentage.
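The 30% figure in Example 1 follows from simple arithmetic: the measured segment is busy for 30000 us out of every 100000 us loop period. The host-side sketch below reproduces that calculation. `cpu_usage_percent()` is a hypothetical helper written for illustration only. It is not part of perf_counter, and the way the library averages over the iteration count may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (NOT a perf_counter API): share of the loop period
 * that is spent inside the measured code segment, in percent. */
float cpu_usage_percent(int64_t busy_us, int64_t period_us)
{
    assert(period_us > 0);      /* guard against division by zero */
    return 100.0f * (float)busy_us / (float)period_us;
}
```

For Example 1, `cpu_usage_percent(30000, 30000 + 70000)` gives the expected 30%.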
#### 1.2.2 Cycle per Instruction and L1 DCache Miss Rate
For **Armv8.1-m** processors that implement the **PMU**, it is easy to measure the **CPI** (Cycle per Instruction) and **L1 DCache miss rate** with the macro `__cpu_perf__()`.
**Syntax**:
```c
__cpu_perf__(<Description String for the target>, [User Code, see ref 1]) {
    //! target code segment of measurement
    ...
}
```
Here, [**ref 1**] is a small piece of user code that reads the measurement result via a local **struct** variable `__PERF_INFO__`. This user code is optional. If you don't put anything here, the measured result will be shown with `__perf_counter_printf__`. The definition of `__PERF_INFO__` is shown below:
```c
struct {
    uint64_t dwNoInstr;             /* number of instructions executed */
    uint64_t dwNoMemAccess;         /* number of memory accesses */
    uint64_t dwNoL1DCacheRefill;    /* number of L1 DCache refills */
    int64_t  lCycles;               /* number of CPU cycles */
    uint32_t wInstrCalib;
    uint32_t wMemAccessCalib;
    float    fCPI;                  /* Cycles per Instruction */
    float    fDCacheMissRate;       /* L1 DCache miss rate in percentage */
} __PERF_INFO__;
```
For example, when inserting user code, you can read the CPI from `__PERF_INFO__.fCPI`.
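To make the two derived fields easier to interpret, here is a small host-side sketch of how such ratios are conventionally computed from the raw counters. The formulas are an assumption based on the field comments (CPI as cycles divided by instructions, miss rate as refills divided by memory accesses, in percent). `perf_ratios_t` and `perf_ratios()` are hypothetical names, and the library may additionally apply the calibration fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type, for illustration only. */
typedef struct {
    float fCPI;             /* cycles per instruction */
    float fDCacheMissRate;  /* L1 DCache miss rate in percent */
} perf_ratios_t;

/* Assumed relationships (not taken from the library source):
 *   CPI       = cycles / instructions
 *   miss rate = L1 DCache refills / memory accesses * 100 */
perf_ratios_t perf_ratios(int64_t lCycles,
                          uint64_t dwNoInstr,
                          uint64_t dwNoMemAccess,
                          uint64_t dwNoL1DCacheRefill)
{
    perf_ratios_t tResult = {0.0f, 0.0f};

    assert(lCycles >= 0);   /* a negative cycle count is meaningless */

    if (dwNoInstr != 0) {
        tResult.fCPI = (float)lCycles / (float)dwNoInstr;
    }
    if (dwNoMemAccess != 0) {
        tResult.fDCacheMissRate =
            100.0f * (float)dwNoL1DCacheRefill / (float)dwNoMemAccess;
    }
    return tResult;
}
```

With 1500 cycles, 1000 instructions, 400 memory accesses and 20 refills, this yields a CPI of 1.5 and a miss rate of 5%.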
This example shows how to use the delta value of `get_system_ticks()` to measure the CPU cycles used by a specified code segment. In fact, `__cycleof__()` is implemented in the same way:
If you are using EventRecorder in MDK, once you have deployed `perf_counter`, it will provide the timer service for EventRecorder by implementing the following functions: `EventRecorderTimerSetup()`, `EventRecorderTimerGetFreq()` and `EventRecorderTimerGetCount()`.
If you have not modified anything in `EventRecorderConf.h`, **you don't have to**, and please keep the default configuration. If you see warnings like this:
**By using perf_counter as the reference clock, EventRecorder can have the highest clock resolution on the target system without worrying about the presence of DWT or any conflicting usage of SysTick.**
1. The `SystemCoreClock` has been updated with the new system frequency. Usually, the HAL will update the `SystemCoreClock` automatically, but in some rare cases where `SystemCoreClock` is not updated accordingly, you should do it yourself.
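The delta technique can be sketched on a host machine as follows. Because the real `get_system_ticks()` needs a Cortex-M target, the sketch substitutes a fake tick source. `fake_get_system_ticks()`, `fake_workload()` and `measure_cycles()` are hypothetical names introduced only for this illustration, and the `int64_t` return type is assumed to mirror the library's counter.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the tick counter so the sketch runs anywhere.
 * On a target, perf_counter's get_system_ticks() would be read instead. */
static int64_t s_lFakeTicks = 0;

int64_t fake_get_system_ticks(void)
{
    return s_lFakeTicks;
}

/* Hypothetical workload: pretends to consume `lCycles` CPU cycles. */
void fake_workload(int64_t lCycles)
{
    assert(lCycles >= 0);
    s_lFakeTicks += lCycles;
}

/* Delta measurement: sample the counter before and after the segment
 * and report the difference. */
int64_t measure_cycles(void (*fnSegment)(int64_t), int64_t lArg)
{
    int64_t lStart = fake_get_system_ticks();
    fnSegment(lArg);
    return fake_get_system_ticks() - lStart;
}
```

Calling `measure_cycles(fake_workload, 1234)` returns 1234, the number of ticks the fake workload added between the two reads.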
> **NOTE**: Please do **NOT** add any assembly source files of this `perf_counter` library to your compilation, i.e. `systick_wrapper_gcc.S`, `systick_wrapper_gnu.s` or `systick_wrapper_ual.s`.
7. Make sure the `SystemCoreClock` is updated with the same value as the CPU frequency.
8. **IMPORTANT**: Make sure the `SysTick_CTRL_CLKSOURCE_Msk` bit (bit 2) of the `SysTick->CTRL` register is `1`, which means SysTick runs with the same clock source as the target Cortex-M processor.
9. Initialize perf_counter with a boolean value that indicates whether the user application and/or RTOS already occupies the SysTick.
```c
void main(void)
{
    //! setup system clock

    /*! \brief Update SystemCoreClock with the latest CPU frequency
     *!        If the function doesn't exist or doesn't work correctly,
     *!        Please update SystemCoreClock directly with the correct
     *!        system frequency in Hz.
     *!
     *!        extern volatile uint32_t SystemCoreClock;
     */
    SystemCoreClockUpdate();

    /*! \brief initialize perf_counter() and pass true if SysTick is
     *!        occupied by user applications or RTOS, otherwise pass
     *!        false.
     */
    init_cycle_counter(true);

    ...
    while(1) {
        ...
    }
}
```
10. **IMPORTANT**: Please enable GNU extensions in your compiler. For **GCC** and **Clang**, use `--std=gnu99` or `--std=gnu11`; for other compilers, please check the user manual first. Failing to do so will not only trigger the warning in `perf_counter.h` but also break `__cycleof__()` and `__super_loop_monitor__()`, because `__PLOOC_VA_NUM_ARGS()` does not report `0` when passed no arguments.
```c
#if __PLOOC_VA_NUM_ARGS() != 0
#warning Please enable GNC extensions, it is required by __cycleof__() and \
__super_loop_monitor__()
#endif
```
11. It is recommended to add the macro definition `__PERF_COUNTER__` to your project GLOBALLY. It helps other modules detect the existence of perf_counter. For example, LVGL's [`lv_conf_cmsis.h`](https://github.com/lvgl/lvgl/blob/d367bb7cf17dc34863f4439bba9b66a820088951/env_support/cmsis-pack/lv_conf_cmsis.h#L81-L99) uses this macro to detect perf_counter and uses `get_system_ms()` to implement `lv_tick_get()`.
1. Download the cmsis-pack from the `cmsis-pack` folder. It is a file named `GorgonMeducer.perf_counter.<version>.pack`, for example `GorgonMeducer.perf_counter.2.2.0.pack`.
5. Make sure your system contains CMSIS (version 5.7.0 or above), as `perf_counter.h` includes `cmsis_compiler.h`. Usually, you should do this with RTE as shown below:
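The clock-source requirement in the steps above can be turned into a quick sanity check before `init_cycle_counter()` is called. The sketch below is host-buildable: it takes the register value as a parameter instead of reading `SysTick->CTRL` directly. `DEMO_SYSTICK_CLKSOURCE_MSK`, `systick_uses_core_clock()` and `require_core_clock()` are hypothetical names. On a real target, CMSIS already provides `SysTick_CTRL_CLKSOURCE_Msk`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of SysTick->CTRL, as stated in the step above. A local name is
 * used so the sketch also builds without the CMSIS headers. */
#define DEMO_SYSTICK_CLKSOURCE_MSK   (1UL << 2)

/* Hypothetical helper: true when SysTick runs from the processor clock. */
bool systick_uses_core_clock(uint32_t wCtrlValue)
{
    return (wCtrlValue & DEMO_SYSTICK_CLKSOURCE_MSK) != 0;
}

/* Hypothetical guard: stops in debug builds if the clock source is wrong. */
void require_core_clock(uint32_t wCtrlValue)
{
    assert(systick_uses_core_clock(wCtrlValue));
}
```

On a target, a call such as `require_core_clock(SysTick->CTRL);` placed just before `init_cycle_counter()` would catch a misconfigured clock source early.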
6. Make sure the `SystemCoreClock` is updated with the same value as the CPU frequency.
7. **IMPORTANT**: Make sure the `SysTick_CTRL_CLKSOURCE_Msk` bit (bit 2) of the `SysTick->CTRL` register is `1`, which means SysTick runs with the same clock source as the target Cortex-M processor.
8. Initialize perf_counter with a boolean value that indicates whether the user application and/or RTOS already occupies the SysTick.
```c
void main(void)
{
    //! setup system clock

    /*! \brief Update SystemCoreClock with the latest CPU frequency
     *!        If the function doesn't exist or doesn't work correctly,
     *!        Please update SystemCoreClock directly with the correct
     *!        system frequency in Hz.
     *!
     *!        extern volatile uint32_t SystemCoreClock;
     */
    SystemCoreClockUpdate();

    /*! \brief initialize perf_counter() and pass true if SysTick is
     *!        occupied by user applications or RTOS, otherwise pass
     *!        false.
     */
    init_cycle_counter(true);

    ...
    while(1) {
        ...
    }
}
```
9. **IMPORTANT**: Please enable GNU extensions in your compiler.
For Arm Compiler 5, please select both **C99 mode** and **GNU extensions** in the **Options for Target** dialog as shown below:
Failing to do so will not only trigger the warning in `perf_counter.h` but also break `__cycleof__()` and `__super_loop_monitor__()`, because `__PLOOC_VA_NUM_ARGS()` does not report `0` when passed no arguments.
```c
#if __PLOOC_VA_NUM_ARGS() != 0
#warning Please enable GNC extensions, it is required by __cycleof__() and \
__super_loop_monitor__()
#endif
```
perf_counter has been registered as one of the [RT-Thread software packages](https://packages.rt-thread.org/en/detail.html?package=perf_counter), located in the `system` category. In [ENV](https://www.rt-thread.io/download.html?download=Env) or [RT-Thread Studio](https://www.rt-thread.io/download.html?download=Studio), you simply need to enable the cputime framework. RT-Thread will automatically enable perf_counter if you are using the Cortex-M architecture.
This error usually pops up in **Arm Compiler 5** and **Arm Compiler 6**. It is because you haven't implemented any non-weak `SysTick_Handler()`. Please provide an EMPTY one in any C source file to solve this problem:
**NOTE**: If you deploy perf_counter using cmsis-pack and encounter this issue, please **DO NOT** call function `user_code_insert_to_systick_handler()` in this **should-be-empty** `SysTick_Handler()`.
In v2.1.0, I removed the unnecessary bundle feature from the cmsis-pack. If you have used an older version, you will encounter this issue. To solve this problem:
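A minimal sketch of such an empty handler is shown below. It assumes the standard CMSIS handler name `SysTick_Handler` and that no other non-weak definition exists elsewhere in the project.

```c
/* Intentionally empty: its only purpose is to give the linker a
 * non-weak SysTick_Handler() definition. */
void SysTick_Handler(void)
{
}
```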