PIRA – a framework for iterative instrumentation refinement

The main software project I have been working on over the last weeks and months is PIRA, the Performance Instrumentation Refinement Automation framework. It is available at https://github.com/jplehr/pira. It is the first software project for which I have set up and used continuous integration. However, for historical reasons, its components are split across several repositories, and the release “process” used for the initial release is a mess.
(Hint: the currently available version doesn’t work, because I missed something when I released it.)

The next release, using a better release process, is scheduled for August 1st.

Anyway – What is PIRA?

The framework helps performance analysts and computer scientists discover performance characteristics of their own, or someone else’s, C and C++ software using Score-P. PIRA uses a combination of static and dynamic analysis to iteratively adapt an instrumentation configuration, i.e., the selection of functions that should be instrumented for measurement or analysis.

The main driver is written in Python 3. The analysis and instrumentation components are separated into an analysis tool and metric collectors built on top of Clang/LLVM. The final measurements are performed using the Score-P measurement infrastructure.

For those interested, there are two research papers available: (i) about the framework and (ii) a use case, in which we used PIRA to automatically reduce the number of functions passed to the empirical performance modeling tool Extra-P.

What is going to come?

In the next weeks I’ll write some notes about how to use PIRA for your own purposes and what I did when setting up my GitLab CI instances.

Next Release: August 1st

The next PIRA release is planned for August 1st. It includes new features, such as automatic MPI-function filtering, configurable rebuild intervals, and easier-to-use configuration files.

Name mangling in C++ with Clang and GCC

I recently came across the question of whether it is possible to use lists of mangled function names (generated with a Clang-based tool) in a GCC compiler plugin. I am aware that name mangling is compiler dependent and not standardized, yet I had hopes that this would be something I could achieve.

I started with a quick web search. That, however, did not lead to satisfying answers, as I was still unsure whether it could actually work. Most of the answers I found described the status when GCC 5 came out. I am now working with GCC 8, and things may have changed.

So, I continued by implementing very basic test cases to approach this experimentally, i.e., to get an idea of whether more reading about all this is worth my time. The test codes were as simple as the one shown below. The first name in each comment is the GCC mangled name (g++ 8.3) and the second name is the Clang mangled name (clang++ 9.0).

void foo() {} // _Z3foov == _Z3foov
void foo(int a) {} // _Z3fooi == _Z3fooi
double foo(double d) { return 0; } // _Z3food == _Z3food
void foo(int a, double d) {} // _Z3fooid == _Z3fooid
namespace test {
 void foo(int a) {} // _ZN4test3fooEi == _ZN4test3fooEi
 double foo(double d) { return 0; } // _ZN4test3fooEd == _ZN4test3fooEd
} // namespace test

So, at least for this small set of samples, there do not seem to be any differences. I did similar tiny-scale experiments for classes and templates, but all of them were too simple to reveal differences. Eventually, I applied both compilers to a basic C++ implementation of the Game of Life (sources) and filtered the disassembly to get a list of all function names in the resulting binary. I compiled at optimization level 0 so that the compiler would not inline functions or apply other optimizations. I’m sure listing all functions in an object can be done much more easily (e.g., using nm), but this is what I did (and accordingly for a version of the code compiled with g++):

objdump -d clang++90.GoL | grep ">:" | awk '{ print $2 }' | sed -e "s/://" | sort | uniq > clang++90_names

Inspecting both lists of generated function names, I found differences, in particular in the mangled name of the constructor of the GameOfLife class.

class GameOfLife {
    GameOfLife(int numX, int numY) : dimX(numX), 
    // other members are omitted

The constructor is mangled into _ZN10GameOfLifeC1Eii by GCC and into _ZN10GameOfLifeC2Eii by Clang. The difference is the C1 vs. the C2 in the name.
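Comparing the two name lists by eye gets tedious quickly, so here is a minimal sketch of how the comparison could be automated. It assumes both lists have already been read into vectors of strings (one name per entry); the function name namesInOnlyOneList is made up for illustration and is not part of any tool mentioned here.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Returns the names that occur in exactly one of the two lists, sorted.
// Duplicates within a list are ignored.
// (Hypothetical helper, only meant to illustrate the comparison step.)
std::vector<std::string> namesInOnlyOneList(const std::vector<std::string> &gccNames,
                                            const std::vector<std::string> &clangNames) {
  // std::set sorts and de-duplicates, which set_symmetric_difference requires.
  const std::set<std::string> gcc(gccNames.begin(), gccNames.end());
  const std::set<std::string> clang(clangNames.begin(), clangNames.end());

  std::vector<std::string> result;
  std::set_symmetric_difference(gcc.begin(), gcc.end(), clang.begin(), clang.end(),
                                std::back_inserter(result));
  assert(std::is_sorted(result.begin(), result.end()));
  return result;
}
```

Fed with the two lists from above, the result would contain, among others, both _ZN10GameOfLifeC1Eii and _ZN10GameOfLifeC2Eii, since each of them appears in only one of the lists.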

Now, I wondered: what is encoded by these C1 / C2 parts of the mangled name? I know that Clang mangles names according to the Itanium C++ ABI specification. A quick web search led me here, and so I searched for the respective section of the specification. I found that the specification lists the following under Constructors and Destructors.

  <ctor-dtor-name> ::= C1	# complete object constructor
		   ::= C2	# base object constructor
		   ::= C3	# complete object allocating constructor
		   ::= D0	# deleting destructor
		   ::= D1	# complete object destructor
		   ::= D2	# base object destructor

So, in the binary produced by GCC the constructor of the GameOfLife class shows up as a complete object constructor, whereas in the binary produced by Clang it shows up as a base object constructor.
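Both mangled names still refer to the same source-level constructor, which can be checked by demangling them. The following is a minimal sketch using abi::__cxa_demangle from cxxabi.h, which is provided by the C++ runtime libraries of both GCC and Clang; the helper names demangle and sameSourceEntity are my own and only serve as illustration.

```cpp
#include <cassert>  // only needed for the usage example below
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangles an Itanium C++ ABI name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const std::string &mangled) {
  int status = 0;
  // __cxa_demangle returns a malloc'ed buffer that the caller has to free.
  char *buffer = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  if (status != 0 || buffer == nullptr) {
    return mangled;
  }
  std::string result(buffer);
  std::free(buffer);
  return result;
}

// True if both mangled names demangle to the same source-level name.
// (Hypothetical helper; the constructor variant C1/C2 is lost when demangling.)
bool sameSourceEntity(const std::string &a, const std::string &b) {
  return demangle(a) == demangle(b);
}
```

For example, assert(sameSourceEntity("_ZN10GameOfLifeC1Eii", "_ZN10GameOfLifeC2Eii")) holds, as both names demangle to GameOfLife::GameOfLife(int, int). Note that this only helps when matching names at the source level: a lookup of the exact symbol name still fails if the binary does not contain that particular variant.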

At that point I did not dig deeper into why that is the case, i.e., I did not thoroughly read the definitions in the Itanium C++ ABI specification, as for me it is sufficient to know that differences in name mangling occur for features as fundamental as constructors. However, in case someone (or a future me) has the same question (again), I thought I would share this as a starting point for looking into more detail.

Finally, the overall result of this small investigation is that I will need to write an LLVM plugin to mimic the functionality of the respective GCC plugin I wanted to use in my toolchain. Nothing too bad, but I would have been happier if I could just use the already available GCC plugin.

Overleaf for collaborative writing

We started to use Overleaf for collaborative writing of our research papers. After a few papers and other documents, I decided to share my experiences with it.

First, what did we do before we started using Overleaf?

Well, we used a git repository for the paper to synchronize the changes between different authors. Everybody used their favorite text editor and we agreed on some code style. My experience was that the most important rule is: write one sentence per line, because it makes merging so much easier. Then we would send the PDF of the draft to whoever was doing the internal review. We would get back a paper copy with handwritten remarks to be included and iterate the whole process. This isn’t bad, and I totally understand if people prefer to read a printed copy.

Now, how did Overleaf change this?

Overleaf is a little bit like multiplayer LaTeX. What we found to be more important: it compiles the PDF version of the paper immediately. This is particularly helpful for the internal review, at least in our group. With its comment mechanism, people can simply annotate the respective parts of the document. If they only found a typo, they can also immediately fix it in the document. This makes the review easier and quicker to access. I would, however, agree that the introduction of Overleaf is not the most important thing that ever happened.

That’s all great! What can’t it do?

I found it somewhat annoying that the editor is an always-online solution! You cannot, or at least I did not see how to, make the document / editor available offline to, say, work on a document while flying across the Atlantic [Yes, I am assuming you do not want to use the WiFi on the plane]. If you dig a little bit into it, you find that the Overleaf document is actually a git repository behind the scenes. So let’s just clone it, work offline, and then push the changes. Unfortunately, this did not work for me. The first part, cloning the project’s git repository, worked smoothly. The second part, pushing changes to it, failed. At least I did not manage to add my credentials in a way that allowed me to push changes to the remote.

So what’s the conclusion?

My conclusion is: Overleaf provides a convenient way to collaborate with authors from other groups / institutions. It allows for nice and easy WYSIWYG reviewing, and you can export the final document as a git repository to store it on your local git server, if you want to do so, e.g., for archiving purposes. Should you mostly find time to work on documents while you don’t have Internet access, Overleaf may not be the best solution.

All this silence

I just realized how silent I have been on my website for almost 1.5 years. I decided that I should change that.

As a first follow-up on my article about the Opera tab manager: in some of the Opera versions after my write-up, the team actually included a way to search tabs in a much nicer way than the plugin allows.

Given my, maybe weird, control setup for tab handling – Left_Alt + one_of(h,j,k,l) – I was happy that they set the shortcut to Left_Alt + Space by default. This opens the Opera quick search.

The quick search is an overlay that allows you to (1) search Google, and, more importantly for me, (2) search your tabs! This is great news for me, and, I would assume, for everybody who constantly carries around a large bag full of open tabs. I really enjoy this feature as it blends nicely into my general browser setup. It also feels much more responsive and integrated into the browser compared to the plugin.

On another note: I have been following the Vivaldi development quite closely and think that it is a browser that you should follow and test every now and then. It does lack a little bit of performance from the UI compared to Opera or Chrome, but it has some nice features, like tab hibernation or tab stacks.

I also realize that I should write more articles here about the stuff that I do. And I will!

If you are interested you can also follow me on Twitter: @jplehr

Keyboard layout and key mappings

When I switched to i3wm as my main work-horse window manager, I decided to use the “Alt Gr” or “Alt_R” key as the modifier key for the i3 command shortcuts. The main reason for that comes from my habit of mainly working on workspaces 1 – 5. In order to decrease the stress on my left hand, the modifier should be a key that is controlled by the right hand.

First of all, I had to teach the US layout that the two Alt keys are actually different modifier keys. For this purpose I simply use the xmodmap command:

xmodmap -e 'remove Mod1 = Alt_R'
xmodmap -e 'add Mod3 = Alt_R'

In addition, I selected “Mod3” as the modifier key in the i3 config, of course. This worked fine as long as I used the US keyboard layout.

Since working at a German university requires some interaction in German from time to time, I also had to have a German keyboard layout available. A short web search led to the suggestion to use the simple setxkbmap command with the desired keyboard layout as follows:

setxkbmap de # For the German keyboard layout
setxkbmap us # For the US keyboard layout

So far so good. I changed to the German keyboard layout and received an error message on the console saying that it could not remove the binding of the “Alt_R” symbol as no such mapping exists. I did not pay too much attention to that until I realized that I was unable to control my window manager anymore.

After a little while I finally found out that the “Alt Gr” key is mapped to a keysym called “ISO_Level3_Shift” in certain keyboard mappings, e.g., the German keyboard mapping. After that I invested quite some time in trying to figure out how to resolve that problem and get a fully working German keyboard layout.

I then came across this great article, which explains (sufficiently detailed for me) how the key mapping is done. I was then able to adapt my simple commands for changing between keyboard layouts into somewhat more lengthy ones:

setxkbmap us; xmodmap -e 'remove Mod1 = Alt_R'; xmodmap -e 'add Mod3 = Alt_R' # For the US layout
setxkbmap de; xmodmap -e 'keycode 108 = Alt_R'; xmodmap -e 'add Mod3 = Alt_R'; xmodmap -e 'keycode 133 = ISO_Level3_Shift' # For the German layout

The additional command for the German keyboard maps the “super” or “Windows” key to the former “Alt Gr” functionality, that is, “ISO_Level3_Shift”. This is necessary, as on a German keyboard, for example, the ‘@’ sign is placed on the third level of its key. Thus, without this functionality available, typing would be a little less convenient.

Opera Tab-Manager

After looking for an extension to cope with many open tabs within Opera, I finally found Josh Perry’s TabManager for Chrome. While it allowed searching the title text of the tabs for specific words, it moved the found tab to a new window and performed a full-string search.

Since I think it is more useful if the found tab is focused instead of moved to a new window, I changed that behavior. In addition, I changed the search behavior so that all search words are matched as substrings of the tab title. I have now decided to fork the original repository and commit my changes, so that maybe other people can benefit from them or improve the extension further.
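The changed search behavior can be sketched as follows. The extension itself is written in JavaScript; this is a language-neutral illustration in C++, not the extension’s code. It assumes that the query is split at whitespace and that matching is case-insensitive, both of which are my assumptions, and the function names are made up.

```cpp
#include <algorithm>
#include <cassert>  // only needed for the usage example below
#include <cctype>
#include <sstream>
#include <string>

// Lower-cases an ASCII string (assumption: titles are compared case-insensitively).
std::string toLower(std::string text) {
  std::transform(text.begin(), text.end(), text.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return text;
}

// True if every whitespace-separated word of the query occurs as a substring
// of the title, in any order. An empty query matches every title.
bool titleMatchesAllWords(const std::string &title, const std::string &query) {
  const std::string haystack = toLower(title);
  std::istringstream words(toLower(query));
  std::string word;
  while (words >> word) {
    if (haystack.find(word) == std::string::npos) {
      return false;
    }
  }
  return true;
}
```

With this, assert(titleMatchesAllWords("Score-P User Manual", "manual score")) holds although the words appear in a different order in the title, whereas a full-string search for “manual score” would not find the tab.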

You can find the TabManager at my fork of the original repository.

papi-wrap now public

I took some time on my last day of vacation to finish the refactorings I wanted to do on the PAPI wrapper that I mentioned in a previous post. Although I am sure that there are lots of things to clean up in this rather small code base, I made it publicly available!

It was used to generate the measurement results in my paper about the influence of measurement infrastructures, available in the ACM digital library.

The library is intended as an easy-to-use PAPI interface for C++ codes. It can be integrated into your code as a library, or it can be used as an external measurement routine using libmonitor. I may continue to work on this library in my free time, as I do have some more ideas and want to integrate two features. One, implement a more structured way to output the measurement results. Two, have it not only count PAPI events, but also provide simple timer mechanisms.

If you are interested in this project, you can go to papi-wrap on my GitHub, download the source, build it, and play around with it.

Interactive shell with SLURM

I just discovered a half-broken blueprint script that was supposed to open an interactive bash session within a newly allocated SLURM job. I typically allocate interactive sessions when I want to test a specific benchmark configuration on a particular machine or type of machine.

I always forget the exact command, so here is a fixed line, i.e., one that works for me:

srun -n 1 --mem-per-cpu=100 -t 10:00 --pty bash -i

The line will have SLURM allocate a new resource with 1 task (-n 1) and 100 MB of memory per CPU (--mem-per-cpu=100). The job will live for 10 minutes (-t 10:00) and start an interactive bash within it. I frequently also add the SLURM flag for exclusivity (--exclusive).

Please be aware that if your compute center operates with compute quotas, the exclusivity will result in increased compute time consumed. Since you are practically allocating the whole machine for yourself, you also occupy all of its CPUs. As a result, independent of the number of CPUs your job actually uses, the whole machine will be accounted, i.e., number_of_cores * runtime_of_job.
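To get a feeling for the difference, here is a small worked sketch of the accounting. It assumes the simple model number_of_cores * runtime_of_job from above; the function names and the node size of 24 cores are made-up examples, and the actual accounting policy depends on your compute center.

```cpp
#include <cassert>

// Core-minutes charged when only the requested cores are accounted.
long sharedCoreMinutes(long requestedCores, long runtimeMinutes) {
  assert(requestedCores >= 0 && runtimeMinutes >= 0);
  return requestedCores * runtimeMinutes;
}

// Core-minutes charged for an exclusive job: every core of the node counts,
// independent of how many cores the job actually uses.
long exclusiveCoreMinutes(long coresPerNode, long runtimeMinutes) {
  assert(coresPerNode >= 0 && runtimeMinutes >= 0);
  return coresPerNode * runtimeMinutes;
}
```

For the 10-minute job from above, sharedCoreMinutes(1, 10) gives 10 core-minutes, while exclusiveCoreMinutes(24, 10) gives 240 core-minutes on a hypothetical 24-core node.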

Hello world!

Like pretty much every tutorial for every programming language, this website starts with a “Hello world!”, yet, I still need to create the content.

I plan to use this website mainly as my “useful notes” archive and write about day-to-day tasks and observations, like benchmarking or how I work with vim.