Squashed commit of the following: commit f1820af82bb5467d0c79c03290fca809b0273030 Author: philip <philip@gladstonefamily.net> Date: Sun Feb 21 15:08:31 2016 -0500 Now uses userdata commit 74a2298f5f2d2b07097a9501046efb8d4061ec5e Merge: 4ffab15 716e682 Author: philip <philip@gladstonefamily.net> Date: Sun Feb 21 13:54:40 2016 -0500 Merge remote-tracking branch 'upstream/dev' into performance Conflicts: app/platform/hw_timer.c app/platform/hw_timer.h commit 4ffab15a2a15e0c6b2d7e93611a02be47bafdc79 Author: philip <philip@gladstonefamily.net> Date: Fri Feb 12 17:36:12 2016 -0500 Simple low level performance monitoring tool Make it work with the new hw_timer code commit 944db2bdb8a2b725ba683c564b39f30f3b61e47f Author: philip <philip@gladstonefamily.net> Date: Sun Feb 14 10:32:41 2016 -0500 Initial version of the hw_timer as part of the platform Addressed review comments Add the binsize return
perf Module
This module provides simple performance measurement for an application. It samples the program counter roughly every 50 microseconds and builds a histogram of the values that it finds. Since there is only a small amount of memory to store the histogram, the user can specify which area of code is of interest. The default is the entire region of flash that contains code. Once the hotspots are identified, the run can be repeated with different areas and at different resolutions to get as much information as required.
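As a rough illustration of that zoom-in workflow, the sketch below picks the busiest bin from a histogram shaped like the one `perf.stop()` returns (described later in this document) and computes the address range for a follow-up run. The helper name `hottest_range` and the sample data are hypothetical; only `perf.start()` and `perf.stop()` come from this module.

```lua
-- Hypothetical helper: find the bin with the most samples and return
-- the address range it covers, suitable for a narrower follow-up run.
local function hottest_range(histogram, binsize)
  local best_addr, best_count = nil, -1
  for addr, count in pairs(histogram) do
    -- Break ties on the lower address so the result is deterministic
    if count > best_count or (count == best_count and addr < best_addr) then
      best_addr, best_count = addr, count
    end
  end
  if best_addr == nil then
    return nil -- empty histogram: nothing to zoom into
  end
  return best_addr, best_addr + binsize - 1, best_count
end

-- Made-up data standing in for the table returned by perf.stop()
local sample = { [0x40201000] = 12, [0x40202000] = 340, [0x40203000] = 57 }
local lo, hi, count = hottest_range(sample, 0x1000)
print(string.format("%x - %x: %d samples", lo, hi, count))

-- On the device, a second pass over just that range might look like:
--   perf.start(lo, hi)
--   ... run the workload again ...
--   total, outside, histogram, binsize = perf.stop()
```

Narrowing the range lets the same number of bins cover less code, so each bin is smaller and the hotspot can be located more precisely.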
perf.start()
Starts a performance monitoring session.
Syntax
perf.start([start[, end[, nbins[, offset]]]])
Parameters
start
(optional) The lowest PC address for the histogram. Default is 0x40000000.
end
(optional) The highest address for the histogram. Default is the end of the used space in the flash memory.
nbins
(optional) The number of bins in the histogram. Keep this reasonable otherwise you will run out of memory. Default is 1024.
offset
(Very optional) This specifies the offset of the saved PC value on the interrupt stack. It appears that 20 is the correct value.
Note that the number of bins is an upper limit. The size of each bin is set to be the smallest power of two such that the number of bins required is less than or equal to the provided number of bins.
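The bin sizing rule can be sketched in plain Lua as follows. This is a hypothetical re-implementation for illustration, not the firmware's code: the function name `bin_layout` is made up, and it assumes the end address is inclusive, which the text above does not state.

```lua
-- Hypothetical re-implementation of the bin sizing rule; the firmware's
-- exact rounding at the range boundaries may differ.
local function bin_layout(first, last, nbins)
  local span = last - first + 1 -- assumes `last` is inclusive
  local size = 1
  -- Ceiling division written with floor so it also works on integer builds
  while math.floor((span + size - 1) / size) > nbins do
    size = size * 2
  end
  -- Return the bin size in bytes and the number of bins actually needed
  return size, math.floor((span + size - 1) / size)
end

-- A 1 MiB range with a limit of 1024 bins fits exactly
print(bin_layout(0x40200000, 0x402fffff, 1024)) -- 1024  1024

-- Lowering the limit to 1000 doubles the bin size and halves the bin count
print(bin_layout(0x40200000, 0x402fffff, 1000)) -- 2048  512
```

The second call shows why the requested number of bins is only an upper limit: a limit just below a power-of-two boundary can leave almost half of the requested bins unused.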
Returns
Nothing
perf.stop()
Terminates a performance monitoring session and returns the histogram.
Syntax
total, outside, histogram, binsize = perf.stop()
Returns
total
The total number of samples captured in this run
outside
The number of samples that were outside the histogram range
histogram
The histogram represented as a table indexed by address where the value is the number of samples. The address is the lowest address for the bin.
binsize
The number of bytes per histogram bin.
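A few summary figures can be derived from the first two return values. The sketch below is illustrative only: the helper `summarize` and the input numbers are hypothetical, and the elapsed-time estimate assumes the nominal 50 microsecond sampling interval mentioned in the overview, which is only approximate.

```lua
-- Assumed nominal sampling interval, taken from the "roughly every 50
-- microseconds" figure; the real interval will vary.
local SAMPLE_INTERVAL_US = 50

-- Hypothetical helper: returns the in-range sample count, the percentage
-- of samples that fell inside the histogram range, and the approximate
-- sampled time in milliseconds.
local function summarize(total, outside)
  local inside = total - outside
  local pct_inside = 0
  if total > 0 then
    pct_inside = inside * 100 / total
  end
  return inside, pct_inside, total * SAMPLE_INTERVAL_US / 1000
end

-- Made-up figures standing in for the values returned by perf.stop()
local inside, pct, ms = summarize(2500, 125)
print(string.format("%d in range (%.1f%%), about %.0f ms sampled", inside, pct, ms))
```

A large `outside` count relative to `total` suggests the chosen address range misses much of the running code, so widening the range may be worthwhile before zooming in.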
Example
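-- Profile a short string-building loop
perf.start()
for j = 0, 100 do
  str = "str" .. j
end
tot, out, tbl, binsize = perf.stop()
print(tot, out)

-- Collect the bin start addresses and sort them
local keyset = {}
local n = 0
for k, v in pairs(tbl) do
  n = n + 1
  keyset[n] = k
end
table.sort(keyset)

-- Print each bin's address range and its sample count
for kk, k in ipairs(keyset) do
  print(string.format("%x - %x", k, k + binsize - 1), tbl[k])
end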
This runs a loop that creates 101 short strings (j runs from 0 to 100) and then prints the histogram sorted by address. It takes around 2,500 samples and gives a good indication of where the CPU time is being spent.
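When the histogram has many bins, listing them by sample count is often more useful than listing them by address. The following is a minimal sketch of that idea; the helper `top_bins` and the sample table are hypothetical and stand in for the table returned by `perf.stop()`.

```lua
-- Hypothetical helper: return up to `limit` bins sorted by sample count,
-- highest first, each with its share of the in-range samples.
local function top_bins(histogram, limit)
  local rows, in_range = {}, 0
  for addr, count in pairs(histogram) do
    rows[#rows + 1] = { addr = addr, count = count }
    in_range = in_range + count
  end
  table.sort(rows, function(a, b)
    if a.count ~= b.count then return a.count > b.count end
    return a.addr < b.addr -- deterministic order for equal counts
  end)
  local result = {}
  for i = 1, math.min(limit, #rows) do
    local row = rows[i]
    row.percent = row.count * 100 / in_range
    result[i] = row
  end
  return result
end

-- Made-up histogram standing in for the table from perf.stop()
local fake = { [0x40201000] = 50, [0x40202000] = 300, [0x40203000] = 150 }
for _, row in ipairs(top_bins(fake, 2)) do
  print(string.format("%x  %d samples  %.1f%%", row.addr, row.count, row.percent))
end
```

The percentages are relative to the samples that landed inside the histogram range, so they do not account for the `outside` count.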