Published: Fri 25 March 2016
By Richard Barnes
In misc.
tags: [R debugging C C++]
You have some C/C++ code that you've used to build a package for R. Now you'd
like to profile that code. How do you do it?
You can compile and install your package with, e.g.:
R CMD INSTALL --no-multiarch --clean .
But you'll notice that R is quite insistent about including the -g -O2
compiler flags, and optimizing at -O2 can complicate debugging.
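You can check the flags R will apply with, for example:
R CMD config CXXFLAGS
(Substitute CFLAGS, CXX1XFLAGS, and so on for the language and standard your code uses.)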
Therefore, create the file $HOME/.R/Makevars and use it to override R's
optimization flags; a sketch of what it and your my_package/src/Makevars file
might contain follows below. Note that -Og is an optimization level that is ...
optimized for debugging purposes. Different flag lines (such as CXX1XFLAGS)
apply to different versions of the standard, so use whichever line matches the
standard that my_package/src/Makevars requests.
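For example, here is a minimal sketch, assuming the package requests the C++11
standard (adjust the names and flags to your own setup). my_package/src/Makevars
contains:
CXX_STD = CXX11
and $HOME/.R/Makevars then overrides the corresponding flag line:
CXX1XFLAGS = -g -Og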
You can then install oprofile using:
sudo apt-get install oprofile
Next, build a test script containing code that will invoke the functions/methods
you wish to profile, and run this script under oprofile's operf tool.
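For example, with a test script named tests.R (a placeholder name for your own script):
operf Rscript tests.R
By default, operf writes its samples to an oprofile_data directory under the
current working directory, where the reporting tools below will find them automatically.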
The summary results can then be viewed with oprofile's opreport tool.
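Called with no arguments, opreport prints a per-binary summary of the collected samples:
opreport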
And the time spent in your library, broken down by symbol, with:
opreport -l ~/R/x86_64-pc-linux-gnu-library/3.2/my_package/libs/my_package.so
Source files annotated with the samples per line can be generated with:
opannotate --source ~/R/x86_64-pc-linux-gnu-library/3.2/my_package/libs/my_package.so
The output appears as follows:
samples  %        image name     symbol name
2013892  66.7818  my_package.so  MyClass::hot_method(std::vector<double> const&)
 831090  27.5594  my_package.so  MyClass::another_method(unsigned long)
  69978   2.3205  my_package.so  void DoStuff(int,int,double)
  56868   1.8858  my_package.so  int AdjustThings(unsigned long, unsigned long)
   8845   0.2933  my_package.so  MyClass::cheerios(int)
Oprofile works by periodically sampling your program to see where the code is
spending its time. The first column above is the number of these samples that
fell in each function (named at right), while the second column is the
percentage of the total samples this represents.
The foregoing indicates that MyClass::hot_method()
is probably where
optimization efforts should be focused.
The annotated output appears as follows:
                : nearest_neighbors.push_back(which_lib[0]);
 118511  3.9299 : for(auto curr_lib: which_lib)
                : {
   3219  0.1067 :   curr_distance = dist[curr_lib];
1398851 46.3867 :   if(curr_distance <= dist[nearest_neighbors.back()])
                :   {
                :     i = nearest_neighbors.size();
 123725  4.1028 :     while((i > 0) && (curr_distance < dist[nearest_neighbors[i-1]]))
                :     {
                :       i--;
                :     }
                :     nearest_neighbors.insert(nearest_neighbors.begin()+i, curr_lib);
Here the first column is again the number of samples and the second column is
again the percentage of the total samples this represents. This gives a pretty
good idea of where the bottleneck is.