Callgrind (профилировщик Valgrind) не дает полезных результатов

Я написал некоторый код C ++, который структурирован следующим образом:

double kernel(params)
{
//code
}

void optimize(params)
{
//some code
double x = kernel();
//some more code
}

int main()
{
//some code
optimize();
//some more code
}

Я попытался профилировать его с помощью callgrind, используя следующие команды:

g++ -O3 -g sgd.cpp
valgrind --tool=callgrind ./a.out commandline_args
callgrind_annotate callgrind.out.XXXX

Я получаю следующий вывод:

--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
12,916,968,785  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir  file:function
--------------------------------------------------------------------------------
5,862,783,191  /build/buildd/eglibc-2.15/string/../sysdeps/i386/i686/multiarch/memcpy-ssse3.S:__memmove_ssse3 [/lib/i386-linux-gnu/libc-2.15.so]
2,847,653,393  /build/buildd/eglibc-2.15/malloc/malloc.c:_int_malloc [/lib/i386-linux-gnu/libc-2.15.so]
1,327,109,692  /build/buildd/eglibc-2.15/malloc/malloc.c:_int_free [/lib/i386-linux-gnu/libc-2.15.so]
847,560,182  sgd.cpp:main [a.out]
503,022,767  /build/buildd/eglibc-2.15/malloc/malloc.c:malloc [/lib/i386-linux-gnu/libc-2.15.so]
235,458,068  /build/buildd/eglibc-2.15/malloc/malloc.c:free [/lib/i386-linux-gnu/libc-2.15.so]
213,580,120  /build/buildd/eglibc-2.15/math/../sysdeps/i386/fpu/e_exp.S:__ieee754_exp [/lib/i386-linux-gnu/libm-2.15.so]
203,349,602  ???:operator new(unsigned int) [/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16]
192,222,108  /build/buildd/eglibc-2.15/math/../sysdeps/ieee754/dbl-64/w_exp.c:exp [/lib/i386-linux-gnu/libm-2.15.so]
128,438,068  /build/buildd/eglibc-2.15/string/../sysdeps/i386/i686/multiarch/strcat.S:0x0012ac73 [/lib/i386-linux-gnu/libc-2.15.so]
128,431,176  ???:operator delete(void*) [/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16]
128,358,564  /usr/include/c++/4.6/ext/new_allocator.h:main
117,645,255  /usr/include/c++/4.6/bits/stl_vector.h:main
112,167,083  /usr/include/c++/4.6/bits/stl_algobase.h:main

За исключением main (), он не показывает, какие части исходного кода занимают большую часть времени. Я точно знаю, что большую часть времени тратится на функцию optimize (), и, в свою очередь, значительную часть этого времени на функцию kernel (), но я не вижу этого в выводе.
Как мне узнать подробности, чтобы ускорить мой код?

Если это поможет, я широко использую std :: vectors в коде. Некоторое время назад я реализовал подобный код, используя массивы, и тогда казалось, что callgrind работает нормально. Может ли это быть проблемой?

Если я отключу флаг O3, я получу следующий вывод:

--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
19,026,610,083  PROGRAM TOTALS

--------------------------------------------------------------------------------
Ir  file:function
--------------------------------------------------------------------------------
5,233,252,577  /build/buildd/eglibc-2.15/string/../sysdeps/i386/i686/multiarch/memcpy-ssse3.S:__memmove_ssse3 [/lib/i386-linux-gnu/libc-2.15.so]
2,542,000,057  /build/buildd/eglibc-2.15/malloc/malloc.c:_int_malloc [/lib/i386-linux-gnu/libc-2.15.so]
1,184,626,252  /build/buildd/eglibc-2.15/malloc/malloc.c:_int_free [/lib/i386-linux-gnu/libc-2.15.so]
983,472,430  sgd.cpp:optimize(std::vector<double, std::allocator<double> >, std::vector<int, std::allocator<int> >, std::vector<double, std::allocator<double> >) [a.out]
781,018,740  ???:std::vector<double, std::allocator<double> >::operator[](unsigned int) [a.out]
772,117,839  sgd.cpp:kernel(std::vector<double, std::allocator<double> >, int, int, double) [a.out]
476,616,742  ???:std::vector<double, std::allocator<double> >::vector(std::vector<double, std::allocator<double> > const&) [a.out]
449,016,969  /build/buildd/eglibc-2.15/malloc/malloc.c:malloc [/lib/i386-linux-gnu/libc-2.15.so]
324,200,916  ???:std::vector<double, std::allocator<double> >::size() const [a.out]
305,705,504  ???:std::_Vector_base<double, std::allocator<double> >::_Vector_base(unsigned int, std::allocator<double> const&) [a.out]
267,492,204  ???:std::_Vector_base<double, std::allocator<double> >::~_Vector_base() [a.out]
238,309,873  /usr/include/c++/4.6/bits/stl_algobase.h:double* std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<double>(double const*, double const*, double*) [a.out]
238,308,370  /usr/include/c++/4.6/bits/stl_algobase.h:double* std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*>(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*) [a.out]
228,776,040  /usr/include/c++/4.6/bits/stl_algobase.h:std::_Miter_base<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > > >::iterator_type std::__miter_base<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > > >(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >) [a.out]
228,776,038  /usr/include/c++/4.6/bits/stl_algobase.h:double* std::copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*>(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*) [a.out]
210,178,748  /build/buildd/eglibc-2.15/malloc/malloc.c:free [/lib/i386-linux-gnu/libc-2.15.so]
210,172,446  ???:std::vector<double, std::allocator<double> >::~vector() [a.out]
209,711,018  sgd.cpp:square(double) [a.out]
190,646,380  /build/buildd/eglibc-2.15/math/../sysdeps/i386/fpu/e_exp.S:__ieee754_exp [/lib/i386-linux-gnu/libm-2.15.so]
181,517,469  ???:operator new(unsigned int) [/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16]
171,582,030  /usr/include/c++/4.6/bits/stl_iterator_base_types.h:std::_Iter_base<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, true>::_S_base(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >) [a.out]
171,581,742  /build/buildd/eglibc-2.15/math/../sysdeps/ieee754/dbl-64/w_exp.c:exp [/lib/i386-linux-gnu/libm-2.15.so]
152,853,344  ???:__gnu_cxx::new_allocator<double>::allocate(unsigned int, void const*) [a.out]
152,852,752  ???:std::_Vector_base<double, std::allocator<double> >::_Vector_impl::_Vector_impl(std::allocator<double> const&) [a.out]
152,517,360  /usr/include/c++/4.6/bits/stl_algobase.h:std::_Niter_base<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > > >::iterator_type std::__niter_base<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > > >(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >) [a.out]
152,517,360  /usr/include/c++/4.6/bits/stl_iterator_base_types.h:std::_Iter_base<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, false>::_S_base(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >) [a.out]
152,517,360  /usr/include/c++/4.6/bits/stl_iterator.h:__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >::__normal_iterator(double const* const&) [a.out]
133,746,571  ???:std::_Vector_base<double, std::allocator<double> >::_M_deallocate(double*, unsigned int) [a.out]
133,452,690  ???:std::vector<double, std::allocator<double> >::end() const [a.out]
133,452,690  ???:std::vector<double, std::allocator<double> >::begin() const [a.out]
131,134,604  sgd.cpp:sign(double) [a.out]
123,920,353  /usr/include/c++/4.6/bits/stl_algobase.h:double* std::__copy_move_a<false, double const*, double*>(double const*, double const*, double*) [a.out]
121,192,848  ???:std::vector<int, std::allocator<int> >::operator[](unsigned int) [a.out]
114,649,360  /build/buildd/eglibc-2.15/string/../sysdeps/i386/i686/multiarch/strcat.S:0x0012ac73 [/lib/i386-linux-gnu/libc-2.15.so]
114,642,456  ???:operator delete(void*) [/usr/lib/i386-linux-gnu/libstdc++.so.6.0.16]
114,388,018  /usr/include/c++/4.6/bits/stl_uninitialized.h:double* std::__uninitialized_copy<true>::__uninit_copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*>(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*) [a.out]
114,388,018  /usr/include/c++/4.6/bits/stl_uninitialized.h:double* std::uninitialized_copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*>(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*) [a.out]
114,388,018  /usr/include/c++/4.6/bits/stl_uninitialized.h:double* std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*, double>(__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*, std::allocator<double>&) [a.out]
105,086,674  /usr/include/c++/4.6/bits/stl_vector.h:std::_Vector_base<double, std::allocator<double> >::_M_allocate(unsigned int) [a.out]
95,533,505  ???:std::_Vector_base<double, std::allocator<double> >::_M_get_Tp_allocator() [a.out]
95,533,300  /usr/include/c++/4.6/bits/stl_construct.h:void std::_Destroy<double*>(double*, double*) [a.out]
95,533,300  /usr/include/c++/4.6/bits/stl_construct.h:void std::_Destroy<double*, double>(double*, double*, std::allocator<double>&) [a.out]
95,532,970  /usr/include/c++/4.6/bits/allocator.h:std::allocator<double>::allocator(std::allocator<double> const&) [a.out]
95,323,350  /usr/include/c++/4.6/bits/stl_iterator.h:__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >::base() const [a.out]
76,594,040  /usr/include/c++/4.6/bits/allocator.h:std::allocator<double>::~allocator() [a.out]
76,428,152  /usr/include/c++/4.6/bits/stl_algobase.h:std::_Niter_base<double*>::iterator_type std::__niter_base<double*>(double*) [a.out]
76,426,584  /usr/include/c++/4.6/ext/new_allocator.h:__gnu_cxx::new_allocator<double>::deallocate(double*, unsigned int) [a.out]
76,426,344  ???:std::_Vector_base<double, std::allocator<double> >::_Vector_impl::~_Vector_impl() [a.out]
75,798,592  /usr/include/c++/4.6/bits/stl_algobase.h:__gnu_cxx::__enable_if<std::__is_scalar<double>::__value, double*>::__type std::__fill_n_a<double*, unsigned int, double>(double*, unsigned int, double const&) [a.out]
47,768,335  /usr/include/c++/4.6/bits/stl_iterator_base_types.h:std::_Iter_base<double*, false>::_S_base(double*) [a.out]
47,767,040  ???:__gnu_cxx::new_allocator<double>::max_size() const [a.out]
47,662,045  ???:std::_Vector_base<double, std::allocator<double> >::_M_get_Tp_allocator() const [a.out]
38,297,020  /usr/include/c++/4.6/ext/new_allocator.h:__gnu_cxx::new_allocator<double>::~new_allocator() [a.out]

В нем содержится больше информации, чем в предыдущем выводе, но есть еще две проблемы: во-первых, вывод неоптимизированного кода не помогает мне сделать оптимизированный код быстрее. Во-вторых, большую часть времени (~ 50%) занимают функции libc, которые я не использую напрямую в своем коде. Как узнать, какие части кода соответствуют этим вызовам?

1

Решение

Задача ещё не решена.

Другие решения


По вопросам рекламы [email protected]