Linux 4.4 (c229bf9dc179d2023e185c0f705bdf68484c1e73) added
the PERF_SAMPLE_BRANCH_CALL branch sample type. Confusingly,
it covers only direct calls, a subset of what
PERF_SAMPLE_BRANCH_ANY_CALL provides.
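
As a rough illustration, the choice between the two flags is made in
the branch_sample_type field of struct perf_event_attr. This is only
a sketch, assuming kernel headers from Linux 4.4 or later; the helper
name init_branch_attr(), the cycles event, and the sample period are
hypothetical choices made for the example.

```c
#include <linux/perf_event.h>
#include <string.h>

/* Hypothetical helper: set up an attr that samples the branch stack.
 * direct_only != 0 selects PERF_SAMPLE_BRANCH_CALL (direct calls only);
 * otherwise PERF_SAMPLE_BRANCH_ANY_CALL (any call branch) is used. */
void init_branch_attr(struct perf_event_attr *attr, int direct_only)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HARDWARE;
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->sample_period = 100000;   /* arbitrary example period */
    attr->sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
    attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
        (direct_only ? PERF_SAMPLE_BRANCH_CALL
                     : PERF_SAMPLE_BRANCH_ANY_CALL);
}
```

The filled-in attr would then be passed to perf_event_open() in the
usual way.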
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Linux 4.3 (b20112edeadf0b8a1416de061caa4beb11539902) improved
the accuracy of the clock/ns conversion routines. As a result,
the shift factor can now be 32. This value is exported
directly in the perf_event_open() mmap page, which can
break the sample code that shifts 1 left by the shift value:
shifting a plain int constant by 32 bits is undefined in C
when int is 32 bits wide.
Add a cast in the sample code so that a proper 64-bit value
results from the shift. This is the same change that was
made to the sample code in include/uapi/linux/perf_event.h
in Linux 4.4 (b9511cd761faafca7a1acc059e792c1399f9d7c6).
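
The fixed conversion can be sketched as a standalone function. This
follows the shape of the sample code in the header, but the function
name cyc_to_time() and its parameter list are hypothetical; in real
use the three time_* values would be read from the mmap page.

```c
#include <stdint.h>

/* Convert a cycle count to a time delta using the time_shift,
 * time_mult and time_offset values from the mmap page. */
uint64_t cyc_to_time(uint64_t cyc, uint16_t time_shift,
                     uint32_t time_mult, uint64_t time_offset)
{
    uint64_t quot = cyc >> time_shift;
    /* The cast is the fix: without it, "1 << 32" is evaluated as a
     * 32-bit int shift, which is undefined when time_shift is 32. */
    uint64_t rem = cyc & (((uint64_t)1 << time_shift) - 1);

    return time_offset + quot * time_mult +
           ((rem * time_mult) >> time_shift);
}
```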
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Linux 4.3 (71ef3c6b9d4665ee7afbbe4c208a98917dcfc32f)
added a cycles field to the PERF_SAMPLE_BRANCH_STACK
last branch records.
The kernel commit is a bit vague about the field, but a few
more details can be found in the Intel 64 and IA-32 Architectures
Software Developer's Manual, volume 3B. The field indicates the
number of core cycles elapsed since the previous update to the
LBR stack.
This feature is only found on Skylake and newer Intel chips,
as well as Intel Atom Goldmont chips. I'm not sure if it's
worth adding this info to the manpage, as it seems a bit
specific and will probably get rapidly out of date.
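
For reference, reading the new field might look like the sketch
below. It assumes kernel headers from Linux 4.3 or later, where
struct perf_branch_entry has a cycles bit-field; the helper name
total_branch_cycles() is hypothetical, and summing the per-entry
values is just one possible use of the data.

```c
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: add up the cycles field over the nr branch
 * records taken from one PERF_SAMPLE_BRANCH_STACK sample. */
uint64_t total_branch_cycles(const struct perf_branch_entry *lbr,
                             size_t nr)
{
    uint64_t total = 0;
    size_t i;

    for (i = 0; i < nr; i++)
        total += lbr[i].cycles;   /* cycles since previous LBR update */

    return total;
}
```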
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Linux 4.3 introduced two new record types for recording context
switches: PERF_RECORD_SWITCH and PERF_RECORD_SWITCH_CPU_WIDE.
The advantage over the existing tracepoint and software context
switch events is primarily that full switch in/out data can be
gathered even in the face of restrictive perf_event_paranoid
settings.
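
A minimal sketch of how a tool might request and classify these
records follows. It assumes kernel headers from Linux 4.3 or later
and relies on the context_switch attr bit and the
PERF_RECORD_MISC_SWITCH_OUT flag from the uapi header, neither of
which is described above. The helper names and the choice of a dummy
software event are hypothetical.

```c
#include <linux/perf_event.h>
#include <string.h>

/* Hypothetical helper: request context switch records on a dummy
 * software event that otherwise counts nothing. */
void init_switch_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_SOFTWARE;
    attr->config = PERF_COUNT_SW_DUMMY;
    attr->context_switch = 1;
}

/* Hypothetical helper: returns 1 for a switch out, 0 for a switch
 * in, and -1 if the record is not a context switch record. */
int switch_direction(const struct perf_event_header *hdr)
{
    if (hdr->type != PERF_RECORD_SWITCH &&
        hdr->type != PERF_RECORD_SWITCH_CPU_WIDE)
        return -1;

    return (hdr->misc & PERF_RECORD_MISC_SWITCH_OUT) ? 1 : 0;
}
```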
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Linux 4.2 reserved a new bit in the misc field of records in
the perf_event_open() mmap sample buffer:
PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT
Despite being defined in the public
include/uapi/linux/perf_event.h header file, this bit is never set
by the kernel. Rather, it is used internally by the user-space
"perf" utility to indicate that parsing the /proc/xxx/maps files
for the sample took too long, so the scan was aborted.
Signed-off-by: Vince Weaver <vincent.weaver@maine.edu>
Note that FALLOC_FL_UNSHARE may use CoW to unshare blocks to
guarantee that a disk write won't fail with ENOSPC.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
The signal behavior of pkeys is special compared to many other
processor and OS features. Add a special section to describe
the behavior.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>