Tracing function calls
Quick refresher: generally, the entry point for matrix products is going to be PyArray_MatrixProduct2, which in turn (for everything we care about) will call cblas_matrixproduct and dispatch from there on out.
Also, since we are not really interested in debugging the CPython extension itself (some notes here), just the cblas_ calls, we will not require a debug variant of Python.
GDB Stepthrough
gdb --args python optest.py --test expected_outputs.json.gz
Here we are focusing on the syrk test within optest:
np.random.seed(128)
N = 7
C_float = np.random.rand(4, 4).astype(np.float32)
res = np.dot(C_float, C_float.T)
tgt = expected_outputs.syrk_float_un
assert_almost_equal(
    res, tgt, decimal=N, err_msg="syrk_float_un failed", verbose=True
)
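For orientation, the following is a minimal C sketch of what a syrk-style routine computes for a test like this one: the product of a matrix with its own transpose. Because that product is symmetric, only one triangle needs to be computed and the other can be filled in by copying. The function name syrk_upper_then_mirror and the naive loops are assumptions made purely for illustration; this is not the code NumPy or any BLAS library actually runs.

```c
#include <stddef.h>

/*
 * Illustrative only (hypothetical helper, not a NumPy or BLAS function).
 * Computes out = a * a^T for a row-major n x k matrix `a`.
 * `out` must have room for n * n floats.
 */
void syrk_upper_then_mirror(size_t n, size_t k, const float *a, float *out)
{
    /* Pass 1: upper triangle only (j >= i).
     * Entry (i, j) is the dot product of rows i and j of `a`. */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i; j < n; j++) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++) {
                acc += a[i * k + p] * a[j * k + p];
            }
            out[i * n + j] = acc;
        }
    }

    /* Pass 2: mirror the upper triangle into the lower triangle,
     * which is valid because a * a^T is symmetric. */
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            out[j * n + i] = out[i * n + j];
        }
    }
}
```

Calling it with n = 4 and k = 4 on a 4x4 float buffer mirrors the shape used in the test above. The two-pass structure is the point of the sketch: a triangular computation followed by a copy loop.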
From here we will start with a few breakpoints.
b cblas_matrixproduct
tui enable
run
It is easier to see where we are with the TUI enabled.
Now we want to focus on the syrk call, so:
b syrk
continue
# After another hit for cblas_matrixproduct it will reach the syrk breakpoint then
next
So far so good. Now we will step through a bit (next, and then Enter to repeat the previous command) to get to the actual call, which should be the NPY_FLOAT case since we have defined the dtype.
We need to step into this call; however, it is wrapped in a macro, so s doesn’t do what we want (it will skip to the for loop). The macro itself is defined in npy_cblas.h to be:
#define BLAS_FUNC_CONCAT(name,prefix,suffix,suffix2) prefix ## name ## suffix ## suffix2
#define BLAS_FUNC_EXPAND(name,prefix,suffix,suffix2) BLAS_FUNC_CONCAT(name,prefix,suffix,suffix2)
#define BLAS_FUNC(name) BLAS_FUNC_EXPAND(name,BLAS_SYMBOL_PREFIX,BLAS_SYMBOL_SUFFIX,BLAS_FORTRAN_SUFFIX)
#define CBLAS_FUNC(name) BLAS_FUNC_EXPAND(name,BLAS_SYMBOL_PREFIX,,BLAS_SYMBOL_SUFFIX)
Here some comments and the ILP64 naming branch are omitted. Since we’re in the business of compiling everything by hand right now, we probably already have macro information for gdb since we set -ggdb3 in CFLAGS (if not, recompile with it).
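To see how the token-pasting chain quoted above produces a symbol name, here is a small self-contained C sketch. It copies the BLAS_FUNC_CONCAT, BLAS_FUNC_EXPAND and CBLAS_FUNC definitions and then stringifies the result. The values given to BLAS_SYMBOL_PREFIX and BLAS_SYMBOL_SUFFIX (an empty prefix and a made-up _demo suffix) are assumptions chosen only for this demonstration, as are the STR helpers and the function expanded_symbol_name. The argument cblas_ssyrk is the standard CBLAS name for single-precision syrk; that it is the exact argument at the call site is likewise an assumption. The real values depend on how NumPy was built.

```c
/* Hypothetical build settings, chosen only for this demo.
 * A real build may leave both empty or use different values. */
#define BLAS_SYMBOL_PREFIX
#define BLAS_SYMBOL_SUFFIX _demo

/* Copied from the npy_cblas.h excerpt quoted in the text. */
#define BLAS_FUNC_CONCAT(name,prefix,suffix,suffix2) prefix ## name ## suffix ## suffix2
#define BLAS_FUNC_EXPAND(name,prefix,suffix,suffix2) BLAS_FUNC_CONCAT(name,prefix,suffix,suffix2)
#define CBLAS_FUNC(name) BLAS_FUNC_EXPAND(name,BLAS_SYMBOL_PREFIX,,BLAS_SYMBOL_SUFFIX)

/* Two-level stringification: the outer macro lets its argument be
 * fully macro-expanded before the inner one turns it into a string. */
#define STR_INNER(x) #x
#define STR(x) STR_INNER(x)

/* Returns the identifier that CBLAS_FUNC(cblas_ssyrk) expands to
 * under the demo settings above, as a string literal. */
const char *expanded_symbol_name(void)
{
    return STR(CBLAS_FUNC(cblas_ssyrk));
}
```

The extra BLAS_FUNC_EXPAND layer exists for the same reason as the two-level STR helper: the prefix and suffix macros have to be expanded before ## pastes them, otherwise their names rather than their values would end up in the symbol. With macro information available in the binary, gdb's macro expand command performs this expansion using the actual build settings.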
What we need is to set a breakpoint on the underlying call… Which we can tell is going to be..