Hacks Weekly #48 Introduction to Stackwalking

Welcome to another episode of CQURE Hacks Weekly. This time we will take you through the Stackwalking process. Stackwalking is the process of accessing, traversal and filtering of a stack trace.

A stack trace also called stack backtrace or stack traceback is a report of the active stack frames at a certain point in time during the execution of a program.the process of accessing, traversal and filtering of a stack trace.

Stackwalking is very useful function which could be enabled in a feature of the Event Tracking for Windows and the Windows Performance Analyzer. ETW is implemented in the Windows operating system. It provides us with a fast and reliable set of event tracing features. When you enable stackwalking for a kernel event, the kernel captures the call stack when the event is generated and saves it with the event.  

Firstly, to enable call stacks in Xperf, you need to choose what type of event you want to call stacks for. Make sure that event is being recorded and, next, enable call stacks for that type of event.

The sampling profile events from the kernel provider (interrupting the CPU every millisecond and recording what it is doing) is one of the main uses of call stacks. The profile events can be enabled with the “PROFILE” Kernel flag or by using some of the Kernel groups such as “Base” or “Latency”, which include the PROFILE flag.

For a list of all the Kernel flags and Kernel groups, we are using

xperf –providers k

In the video you can see the whole list of Kernel flags and Kernel groups.

In addition to enabling the sampler provider, we need to enable stackwalking on that event with the -stackwalk command. We also need to make sure that the PROC_THREAD and LOADER Kernel flags are enabled. Thanks to that the Xperf tools can identify what modules the code addresses are in.

Putting this all together, we are left with the command:

xperf -on PROC_THREAD+LOADER+PROFILE -stackwalk profile

And after a moment we can stop Xperf by using the following command:

xperf -d mytrace.etl

We have got Merged Etl:mytrace.etl, which means it has been captured.

Alternatively, you can use the Kernel group “Latency”, as it also includes the proc thread and loader kernel flags. We can use:

xperf -on Latency -stackwalk profile

and it will work the same way.

Getting back to our operation, we can stop the same command once again. Due to this basic method, we see how many events stackwalking could have enabled.

If you want to call stacks for context switches or registry operations, you can get them easily. For a complete list of events that can have call stacks enabled for them, use

xperf -help stackwalk

You will receive the complete list of kernel logs provider.

You can also record call stacks for manifest-based ETW events; however the syntax is quite different, and it only works on Windows 7 or above.

When you specify your ETW provider to xperf after “-on”, you can use extra parameters after the provider’s name separated by columns.

The syntax that you can see in the video is used with the latency set of Kernel flags. It enables stack walking on the profile events and starts a user session called “Browse”. It also enables the “Microsoft-IE” ETW provider for that session with call stacks recorded for all those events.

After pasting the command, you’ve got the whole command

xperf -on Latency -stackwalk profile -start browse -on Microsoft-IE:::'stack'

and here we’ve got stack in the columns. We can leave the flags blank and just specify the stack.

Now you can run it. After a moment we can stop it by using:

 xperf -stop browse -stop -d mytrace.etl

After you stopped tracing it, you can open the file mytrace.etl with WPA, which stands for Windows Performance Analyzer.

You can see a graph of everything that has happened. You can expand those categories. Let’s expand computation and you will use CPU usage. Click on it. After right-clicking you can add graph to Analysis view. And here you’ve got the detailed analysis, for example, for the CPU usage.

So Xperf, the Windows Performance Analyzer, summarizes multiple stack walks together, collapsing them down when they are identical.

And in the Count field here you can view how many stacks were collapsed together on each row.

The Weight command shows an estimate of how many milliseconds of CPU time were spent in that call stack. And the percentage Weight column shows us a percentage of total CPU time available.

Please let us know if you have any questions and we absolutely hope to see you in further episodes of the CQURE Hacks Weekly.

 

Comments