mirror of
https://github.com/nodemcu/nodemcu-firmware.git
synced 2025-01-16 20:52:57 +08:00
e516a0e9a2
Squashed commit of the following: commit f1820af82bb5467d0c79c03290fca809b0273030 Author: philip <philip@gladstonefamily.net> Date: Sun Feb 21 15:08:31 2016 -0500 Now uses userdata commit 74a2298f5f2d2b07097a9501046efb8d4061ec5e Merge: 4ffab15 716e682 Author: philip <philip@gladstonefamily.net> Date: Sun Feb 21 13:54:40 2016 -0500 Merge remote-tracking branch 'upstream/dev' into performance Conflicts: app/platform/hw_timer.c app/platform/hw_timer.h commit 4ffab15a2a15e0c6b2d7e93611a02be47bafdc79 Author: philip <philip@gladstonefamily.net> Date: Fri Feb 12 17:36:12 2016 -0500 Simple low level performance monitoring tool Make it work with the new hw_timer code commit 944db2bdb8a2b725ba683c564b39f30f3b61e47f Author: philip <philip@gladstonefamily.net> Date: Sun Feb 14 10:32:41 2016 -0500 Initial version of the hw_timer as part of the platform Addressed review comments Add the binsize return
67 lines
2.4 KiB
Markdown
# perf Module

This module provides simple performance measurement for an application.
It samples the program counter roughly every 50 microseconds and builds
a histogram of the values that it finds. Since there is only a small amount
of memory to store the histogram, the user can specify which area of code
is of interest. The default is the entire flash which contains code. Once the hotspots are
identified, the run can be repeated with different areas and at different
resolutions to get as much information as required.

## perf.start()

Starts a performance monitoring session.

#### Syntax

`perf.start([start[, end[, nbins[, offset]]]])`

#### Parameters

- `start` (optional) The lowest PC address for the histogram. Default is 0x40000000.
- `end` (optional) The highest address for the histogram. Default is the end of the used space in the flash memory.
- `nbins` (optional) The number of bins in the histogram. Keep this reasonable, otherwise
you will run out of memory. Default is 1024.
- `offset` (very optional) The offset of the saved PC value
on the interrupt stack. It appears that 20 is the correct value.

Note that the number of bins is an upper limit. The size of each bin is set to the smallest power of two
such that the number of bins required is less than or equal to the provided number of bins.

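The bin-size rule in the note above can be sketched in plain Lua. This is an illustration only, not the firmware's code: the helper name `bin_size` is hypothetical, and it assumes the covered range is `last - first` bytes (whether the firmware treats the highest address as inclusive is not stated here).

```lua
-- Hypothetical helper: smallest power-of-two bin size such that the
-- number of bins needed to cover [first, last) does not exceed nbins.
-- Assumption: the range length is last - first bytes.
local function bin_size(first, last, nbins)
  local span = last - first
  local size = 1
  -- (span + size - 1) / size rounded down equals span / size rounded up,
  -- which also behaves correctly on integer-only Lua builds.
  while math.floor((span + size - 1) / size) > nbins do
    size = size * 2
  end
  return size
end

-- A 512 KiB range with the default 1024 bins yields 512-byte bins.
print(bin_size(0x40200000, 0x40280000, 1024))
```

Asking for 1000 bins over the same range would, under these assumptions, double the bin size to 1024 bytes and use only 512 bins, which is why `nbins` acts as an upper limit rather than an exact count.
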
#### Returns

Nothing

## perf.stop()

Terminates a performance monitoring session and returns the histogram.

#### Syntax

`total, outside, histogram, binsize = perf.stop()`

#### Returns

- `total` The total number of samples captured in this run
- `outside` The number of samples that were outside the histogram range
- `histogram` The histogram represented as a table indexed by address where the value is the number of samples. The address is the lowest address for the bin.
- `binsize` The number of bytes per histogram bin.

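One way to digest these return values is to rank the bins by sample count. The following is a minimal sketch; the helper name `hottest_bins` and the shape of its result rows are assumptions made for illustration and are not part of the module.

```lua
-- Hypothetical helper: return up to `limit` bins with the most samples.
-- `total` and `histogram` are the values returned by perf.stop();
-- `histogram` maps the lowest address of a bin to its sample count.
local function hottest_bins(total, histogram, limit)
  if total == 0 then return {} end
  local rows = {}
  for addr, count in pairs(histogram) do
    rows[#rows + 1] = { addr = addr, count = count }
  end
  -- Highest count first; ties are broken by the lower address.
  table.sort(rows, function(a, b)
    if a.count ~= b.count then return a.count > b.count end
    return a.addr < b.addr
  end)
  local top = {}
  for i = 1, math.min(limit, #rows) do
    local r = rows[i]
    top[i] = { addr = r.addr, count = r.count, percent = 100 * r.count / total }
  end
  return top
end
```

On the device this might be called as `hottest_bins(total, histogram, 5)` right after `perf.stop()`. The sketch builds a second table, so on a memory-constrained build it may be preferable to keep `limit` and the number of bins small.
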
### Example

    perf.start()

    for j = 0, 100 do
      str = "str"..j
    end

    tot, out, tbl, binsize = perf.stop()

    print(tot, out)
    local keyset = {}
    local n = 0
    for k,v in pairs(tbl) do
      n=n+1
      keyset[n]=k
    end
    table.sort(keyset)
    for kk,k in ipairs(keyset) do print(string.format("%x - %x",k, k + binsize - 1),tbl[k]) end

This runs a loop that creates a short string on each of its 101 iterations (`j` runs from 0 to 100)
and then prints the histogram, sorted by address.
The run takes around 2,500 samples and provides a good indication of where the CPU time is
being spent.
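
The module overview suggests repeating a run over a narrower area once a hotspot is known. A hedged sketch of that second pass follows: the helper `hottest_range` is hypothetical, the choice of 256 bins is arbitrary, and it assumes that `end` is the highest address of interest, so the top of the bin is taken as `addr + binsize - 1`.

```lua
-- Hypothetical helper: find the bin with the most samples and return
-- its lowest and highest address, or nil for an empty histogram.
local function hottest_range(histogram, binsize)
  local best_addr, best_count
  for addr, count in pairs(histogram) do
    if best_count == nil or count > best_count
       or (count == best_count and addr < best_addr) then
      best_addr, best_count = addr, count
    end
  end
  if best_addr == nil then return nil end
  return best_addr, best_addr + binsize - 1
end

-- Usage on the device (perf exists only in NodeMCU firmware builds
-- that include the module):
--   local total, outside, histogram, binsize = perf.stop()
--   local first, last = hottest_range(histogram, binsize)
--   if first then perf.start(first, last, 256) end
--   -- ...run the workload again, then call perf.stop() for the finer histogram
```

Because the bin size is always a power of two, the second run resolves the hotspot into smaller bins than the first, at the cost of counting everything else as `outside`.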