Channel: Intel® oneAPI Threading Building Blocks & Intel® Threading Building Blocks

receive_or_steal_task

Is it "normal" or expected for `tbb::internal::custom_scheduler::receive_or_steal_task` to be the most expensive function of a given algorithm?

My guess is that, since this is recursive work and it isn't very deep, `receive_or_steal_task` is where the scheduler loops while waiting for more work. If it ranks highest in the profile, does that indicate a lack of work? Am I reading this right?


tbb malloc proxy with Python3

We are implementing an SDK with both C++ and Python interface.

For performance reasons we had to link them with tbb_malloc_proxy.lib

For the C++ interface the performance increased by 5x.

For the Python interface, however, it didn't make any difference. On further investigation with Intel Amplifier, we found that Python still uses ntdll.dll, not tbb_malloc_proxy.dll.

I did build the python interpreter against tbb_malloc_proxy.lib and now it works as fast as the C++ interface.

My question:

Can I inject tbb_malloc_proxy.dll into the standard Python interpreter on Windows 10?

I tried AppInit_DLLs, but it didn't work for me; maybe I did something wrong:

1. I copied the TBB DLLs to system32.

2. I set the registry key

HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs to the names of the DLLs (space separated).

3. I set the registry key HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows\LoadAppInit_DLLs to 1.

Any other suggestions will be appreciated.

Thanks

parallel_reduce performance - filter vector

Hi there,

I've been playing around with some performance measurements for removing elements from a vector. As I'm sure everyone reading this is aware, there are a number of methods, each with different performance characteristics, ranging from O(N²) to O(N). In my example, I have a bunch of entities and I'd like to remove all the ones that are no longer alive.

I wanted to see if it was possible to speed up one of these algorithms with parallel_reduce, but I've found the performance is always worse when attempting to use tbb::parallel_reduce as opposed to just using a serial algorithm.

I feel like I must be doing something wrong so wanted to ask for expert advice on how best to implement this using tbb.

A serial version of the algorithm might look like this:

void removeDeadEntities_remove() {
    entities_.erase(
        std::remove_if(
            entities_.begin(), entities_.end(),
            [](const Entity& entity) { return !entity.alive_; }),
        entities_.end());
}

This is the most efficient serial version I've tested.

Now suppose the number of entities could grow to be very large (say millions) and I'd like to use parallel_reduce to get decent performance in both medium and large cases.

I created a parallel_reduce function object that looks like this:

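struct EntityReduce
{
    std::vector<Entity> alive_;

    // A default constructor is needed because declaring the splitting
    // constructor suppresses the implicit one.
    EntityReduce() = default;
    // Splitting constructor: each split body starts with an empty result.
    EntityReduce(EntityReduce&, tbb::split) {}

    // Append the live entities of sub-range r to this body's result.
    void operator()(const tbb::blocked_range<std::vector<Entity>::iterator>& r) {
        std::vector<Entity> alive;
        alive.reserve(r.end() - r.begin());
        std::remove_copy_if(
            r.begin(), r.end(), std::back_inserter(alive),
            [](const Entity& entity){ return !entity.alive_; });
        alive_.insert(alive_.end(), alive.begin(), alive.end());
    }

    // Merge the result of the right-hand body into this one.
    void join(const EntityReduce& reduce) {
        alive_.insert(alive_.end(), reduce.alive_.begin(), reduce.alive_.end());
    }
};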

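And then I call it like this:

// The imperative form of parallel_reduce takes the body by non-const
// reference, so it has to be a named object rather than a temporary.
EntityReduce body;
tbb::parallel_reduce(
    tbb::blocked_range<std::vector<Entity>::iterator>(entities_.begin(), entities_.end()),
    body);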

Now, this isn't ideal, as I'm using O(n) additional memory compared with the serial version. But because of iterator invalidation I can't have two threads calling erase on the same vector at the same time. (I've also tried a version where I use the erase-remove idiom on a copy of the blocked_range passed in and then combine the result with the member alive_, but this didn't give great results either.)

Is there an optimum approach to making an algorithm such as this parallel? Do I need to use something other than parallel_reduce? Is there a way to reduce the amount of copying required to support a parallel version?
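
One direction that avoids the temporary vectors, sketched here with only the standard library (std::async) rather than TBB: compact each chunk in place concurrently, then slide the surviving runs together serially and erase once. The id_ field, the function name removeDeadEntitiesChunked, and the chunks parameter are assumptions made for illustration; only alive_ comes from the code above. Whether this beats the serial erase-remove idiom has not been measured here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for the Entity type; only alive_ is taken from the post.
struct Entity {
    int id_;
    bool alive_;
};

// Remove dead entities in two phases:
//  1. every chunk is compacted in place by its own task (std::remove_if only
//     touches elements inside its chunk, so tasks never overlap);
//  2. the surviving runs are moved to the front serially, followed by one erase.
void removeDeadEntitiesChunked(std::vector<Entity>& entities, std::size_t chunks) {
    assert(chunks > 0);
    const std::size_t n = entities.size();
    const std::size_t chunkSize = (n + chunks - 1) / chunks;
    if (chunkSize == 0) return;  // empty vector, nothing to do

    // Phase 1: one task per chunk, each returning its number of survivors.
    std::vector<std::future<std::size_t>> survivors;
    for (std::size_t begin = 0; begin < n; begin += chunkSize) {
        const std::size_t end = std::min(n, begin + chunkSize);
        survivors.push_back(std::async(std::launch::async, [&entities, begin, end] {
            auto first = entities.begin() + begin;
            auto last = entities.begin() + end;
            auto newEnd = std::remove_if(
                first, last, [](const Entity& e) { return !e.alive_; });
            return static_cast<std::size_t>(newEnd - first);
        }));
    }

    // Phase 2: slide each surviving run down to the write position `out`.
    auto out = entities.begin();
    std::size_t begin = 0;
    for (auto& result : survivors) {
        const std::size_t kept = result.get();
        auto first = entities.begin() + begin;
        if (out == first) {
            out += kept;  // run is already in place
        } else {
            out = std::move(first, first + kept, out);
        }
        begin += chunkSize;
    }
    entities.erase(out, entities.end());
}
```

The relative order of the live entities is preserved, and no element is copied into a second container. The second phase is still serial, so with a cheap predicate like this one the work is likely to remain memory-bound, and any gain would have to come from phase one.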

I'd be very interested to hear any advice on how best to approach this particular problem.

Thank you very much for your help.

 

Tom

tbb::parallel_pipeline, nested tbb::parallel_for

Dear Intel devs,

TBB is a wonderful library; thank you for all your work. I am currently experimenting with features of the library that I have never tested before, namely the pipeline. I am chaining a series of functors that can be "fat" in terms of computational resources and that are already parallelized with tbb::parallel_for. As far as I know, this is not an issue: the thread pool used by parallel_for and parallel_pipeline should be shared.

So is it reasonable to write a filter with a nested tbb::parallel_for inside?

Same question for the flow graph library, tbb::flow::graph (a tbb::parallel_for inside a node)?

Thank you 

++t

tbbmalloc memory usage statistics

It would be really useful if there were some scalable_mallinfo()-type API, similar to glibc's mallinfo(), so that one can get some statistics out of the allocator.

TCE Open Date: 

Thursday, November 21, 2019 - 06:27

ARM64 cross compiled tbb lib link errors

Hello,

I would like to use TBB on my dev board (Linux, ARM Cortex-A53 MPCore), and I am running into an issue linking the cross-compiled TBB for ARM64 with my application. A simple Hello World without TBB, cross-compiled with the same compiler, runs on the target Linux OS. If I build for the host platform with TBB compiled for the x86 host, it works fine.

My introduction to TBB was from this article: https://solarianprogrammer.com/2019/05/09/cpp-17-stl-parallel-algorithms...

I just followed it, and then for the cross-compiled build I switched to the cross compiler to create the TBB libs for ARM64. I have been able to use TBB on my Raspberry Pi 4 successfully with the latest gcc and TBB source, but built on the Pi itself.

Not sure if there is a flag I am missing or something simple. Checked all the paths and those point to correct libs based on the link above.

Dev Setup:

Ubuntu 16.04 VM using C++17 and utilizing std execution policy feature.

I have cross compiled TBB 2019_U9 with aarch64-linux-gnu-g++-9.

I just have a simple program that sorts for now just to test the std::execution policy.

example: std::sort(std::execution::par, curr_data.begin(), curr_data.end());

Any suggestions would be great, Thanks!

 

Build output:

*************

[100%] Linking CXX executable arm-exe
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: CMakeFiles/arm-exe.dir/main.cpp.o: in function `__pstl::__par_backend::__merge_task<double*, double*, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, __pstl::__par_backend::__serial_destroy, __pstl::__par_backend::__serial_move_merge<__pstl::__par_backend::__stable_sort_task<__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double*, std::less<double>, __pstl::__internal::__pattern_sort<__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false> >(__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false>, std::integral_constant<bool, true>, std::integral_constant<bool, true>)::{lambda()#1}::operator()() const::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}>::execute()::{lambda(double*, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >)#1}, {lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}::execute()::{lambda(double*, double*, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >)#2}> >::execute()':
main.cpp:(.text._ZN6__pstl13__par_backend12__merge_taskIPdS2_N9__gnu_cxx17__normal_iteratorIS2_St6vectorIdSaIdEEEESt4lessIdENS0_16__serial_destroyENS0_19__serial_move_mergeIZNS0_18__stable_sort_taskIS8_S2_SA_ZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SP_T1_T2_SL_IbLb1EESS_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEvEUlS2_S8_E_ZNSV_7executeEvEUlS2_S2_S8_E0_EEE7executeEv[_ZN6__pstl13__par_backend12__merge_taskIPdS2_N9__gnu_cxx17__normal_iteratorIS2_St6vectorIdSaIdEEEESt4lessIdENS0_16__serial_destroyENS0_19__serial_move_mergeIZNS0_18__stable_sort_taskIS8_S2_SA_ZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SP_T1_T2_SL_IbLb1EESS_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEvEUlS2_S8_E_ZNSV_7executeEvEUlS2_S2_S8_E0_EEE7executeEv]+0xd0): undefined reference to `tbb::internal::allocate_additional_child_of_proxy::allocate(unsigned long) const'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: CMakeFiles/arm-exe.dir/main.cpp.o: in function `__pstl::__par_backend::__merge_task<__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double*, std::less<double>, __pstl::__par_backend::__binary_no_op, __pstl::__par_backend::__serial_move_merge<__pstl::__par_backend::__stable_sort_task<__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double*, std::less<double>, __pstl::__internal::__pattern_sort<__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false> >(__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false>, std::integral_constant<bool, true>, std::integral_constant<bool, true>)::{lambda()#1}::operator()() const::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}>::execute()::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double*)#3}, {lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}::execute()::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double*)#4}> >::execute()':
main.cpp:(.text._ZN6__pstl13__par_backend12__merge_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES8_S4_St4lessIdENS0_14__binary_no_opENS0_19__serial_move_mergeIZNS0_18__stable_sort_taskIS8_S4_SA_ZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SP_T1_T2_SL_IbLb1EESS_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEvEUlS8_S4_E1_ZNSV_7executeEvEUlS8_S8_S4_E2_EEE7executeEv[_ZN6__pstl13__par_backend12__merge_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES8_S4_St4lessIdENS0_14__binary_no_opENS0_19__serial_move_mergeIZNS0_18__stable_sort_taskIS8_S4_SA_ZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SP_T1_T2_SL_IbLb1EESS_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEvEUlS8_S4_E1_ZNSV_7executeEvEUlS8_S8_S4_E2_EEE7executeEv]+0xd8): undefined reference to `tbb::internal::allocate_additional_child_of_proxy::allocate(unsigned long) const'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: CMakeFiles/arm-exe.dir/main.cpp.o: in function `__pstl::__par_backend::__merge_task<double*, double*, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, __pstl::__par_backend::__binary_no_op, __pstl::__par_backend::__serial_move_merge<__pstl::__par_backend::__stable_sort_task<__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double*, std::less<double>, __pstl::__internal::__pattern_sort<__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false> >(__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false>, std::integral_constant<bool, true>, std::integral_constant<bool, true>)::{lambda()#1}::operator()() const::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}>::execute()::{lambda(double*, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >)#1}, {lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}::execute()::{lambda(double*, double*, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >)#2}> >::execute()':
main.cpp:(.text._ZN6__pstl13__par_backend12__merge_taskIPdS2_N9__gnu_cxx17__normal_iteratorIS2_St6vectorIdSaIdEEEESt4lessIdENS0_14__binary_no_opENS0_19__serial_move_mergeIZNS0_18__stable_sort_taskIS8_S2_SA_ZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SP_T1_T2_SL_IbLb1EESS_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEvEUlS2_S8_E_ZNSV_7executeEvEUlS2_S2_S8_E0_EEE7executeEv[_ZN6__pstl13__par_backend12__merge_taskIPdS2_N9__gnu_cxx17__normal_iteratorIS2_St6vectorIdSaIdEEEESt4lessIdENS0_14__binary_no_opENS0_19__serial_move_mergeIZNS0_18__stable_sort_taskIS8_S2_SA_ZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SP_T1_T2_SL_IbLb1EESS_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEvEUlS2_S8_E_ZNSV_7executeEvEUlS2_S2_S8_E0_EEE7executeEv]+0xd0): undefined reference to `tbb::internal::allocate_additional_child_of_proxy::allocate(unsigned long) const'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: CMakeFiles/arm-exe.dir/main.cpp.o: in function `main':
main.cpp:(.text.startup+0xa0): undefined reference to `tbb::interface7::internal::isolate_within_arena(tbb::interface7::internal::delegate_base&, long)'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: CMakeFiles/arm-exe.dir/main.cpp.o: in function `__pstl::__par_backend::__stable_sort_task<__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, double*, std::less<double>, __pstl::__internal::__pattern_sort<__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false> >(__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false>, std::integral_constant<bool, true>, std::integral_constant<bool, true>)::{lambda()#1}::operator()() const::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}>::execute()':
main.cpp:(.text._ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv[_ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv]+0x78): undefined reference to `tbb::internal::allocate_continuation_proxy::allocate(unsigned long) const'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: main.cpp:(.text._ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv[_ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv]+0xb8): undefined reference to `tbb::internal::allocate_child_proxy::allocate(unsigned long) const'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: main.cpp:(.text._ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv[_ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv]+0x148): undefined reference to `tbb::internal::allocate_continuation_proxy::allocate(unsigned long) const'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: main.cpp:(.text._ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv[_ZN6__pstl13__par_backend18__stable_sort_taskIN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEES4_St4lessIdEZZNS_10__internal14__pattern_sortIRKNS_9execution2v115parallel_policyES8_SA_St17integral_constantIbLb0EEEEvOT_T0_SM_T1_T2_SI_IbLb1EESP_ENKUlvE_clEvEUlS8_S8_SA_E_E7executeEv]+0x178): undefined reference to `tbb::internal::allocate_continuation_proxy::allocate(unsigned long) const'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: CMakeFiles/arm-exe.dir/main.cpp.o: in function `tbb::interface7::internal::delegated_function<__pstl::__par_backend::__parallel_stable_sort<__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, __pstl::__internal::__pattern_sort<__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false> >(__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false>, std::integral_constant<bool, true>, std::integral_constant<bool, true>)::{lambda()#1}::operator()() const::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}>(__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, __pstl::__internal::__pattern_sort<__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false> >(__pstl::execution::v1::parallel_policy const&, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>, std::integral_constant<bool, false>, std::integral_constant<bool, true>, std::integral_constant<bool, 
true>)::{lambda()#1}::operator()() const::{lambda(__gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, __gnu_cxx::__normal_iterator<double*, std::vector<double, std::allocator<double> > >, std::less<double>)#1}, unsigned long)::{lambda()#1} const, void>::operator()() const':
main.cpp:(.text._ZNK3tbb10interface78internal18delegated_functionIKZN6__pstl13__par_backend22__parallel_stable_sortIRKNS3_9execution2v115parallel_policyEN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEESt4lessIdEZZNS3_10__internal14__pattern_sortISA_SH_SJ_St17integral_constantIbLb0EEEEvOT_T0_SQ_T1_T2_SM_IbLb1EEST_ENKUlvE_clEvEUlSH_SH_SJ_E_EEvSP_SQ_SQ_SR_SS_mEUlvE_vEclEv[_ZNK3tbb10interface78internal18delegated_functionIKZN6__pstl13__par_backend22__parallel_stable_sortIRKNS3_9execution2v115parallel_policyEN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEESt4lessIdEZZNS3_10__internal14__pattern_sortISA_SH_SJ_St17integral_constantIbLb0EEEEvOT_T0_SQ_T1_T2_SM_IbLb1EEST_ENKUlvE_clEvEUlSH_SH_SJ_E_EEvSP_SQ_SQ_SR_SS_mEUlvE_vEclEv]+0x6c): undefined reference to `tbb::internal::allocate_via_handler_v3(unsigned long)'
/usr/lib/gcc-cross/aarch64-linux-gnu/9/../../../../aarch64-linux-gnu/bin/ld: main.cpp:(.text._ZNK3tbb10interface78internal18delegated_functionIKZN6__pstl13__par_backend22__parallel_stable_sortIRKNS3_9execution2v115parallel_policyEN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEESt4lessIdEZZNS3_10__internal14__pattern_sortISA_SH_SJ_St17integral_constantIbLb0EEEEvOT_T0_SQ_T1_T2_SM_IbLb1EEST_ENKUlvE_clEvEUlSH_SH_SJ_E_EEvSP_SQ_SQ_SR_SS_mEUlvE_vEclEv[_ZNK3tbb10interface78internal18delegated_functionIKZN6__pstl13__par_backend22__parallel_stable_sortIRKNS3_9execution2v115parallel_policyEN9__gnu_cxx17__normal_iteratorIPdSt6vectorIdSaIdEEEESt4lessIdEZZNS3_10__internal14__pattern_sortISA_SH_SJ_St17integral_constantIbLb0EEEEvOT_T0_SQ_T1_T2_SM_IbLb1EEST_ENKUlvE_clEvEUlSH_SH_SJ_E_EEvSP_SQ_SQ_SR_SS_mEUlvE_vEclEv]+0x7c): undefined reference to `tbb::internal::allocate_root_proxy::allocate(unsigned long)'
collect2: error: ld returned 1 exit status
CMakeFiles/arm-exe.dir/build.make:83: recipe for target 'arm-exe' failed
make[2]: *** [arm-exe] Error 1
CMakeFiles/Makefile2:72: recipe for target 'CMakeFiles/arm-exe.dir/all' failed
make[1]: *** [CMakeFiles/arm-exe.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

************

 

 

 

TCE Open Date: 

Wednesday, December 4, 2019 - 08:20

Cross Compile TBB for Aarch64

Hi,

I have a CMake project which uses TBB. I have compiled TBB natively and there is no problem there. I have an NVIDIA Jetson TX2 Developer Kit, which runs Linux 18.04 on an aarch64 system, so I want to cross-compile my CMake project for it.

My pc uses:

  • Linux 18.04,
  • gcc-linaro-7.3.1-2018.05-x86_64_aarch64-linux-gnu compiler, (L4T Toolchain)
  • tbb_2019 version

I have looked at TBB's official installation document but couldn't find anything about cross-compiling. I am new to TBB, so can you tell me how to cross-compile for aarch64 (if possible), step by step?

Thanks in advance.

TCE Open Date: 

Tuesday, December 17, 2019 - 19:57

tbb::zip_iterator and std::sort

Hello all,

I have been waiting for zip_iterator for a long time, and I discovered that it is now part of TBB. Unfortunately, I have an issue with GCC 7.4 (although it works with clang on my Mac).

The following example works on my Mac with the latest version of Apple clang (Apple clang version 11.0.0 (clang-1100.0.33.16)), but it does not compile with GCC 7.4 on my Ubuntu workstation. I get the following error:

/usr/include/c++/7/bits/predefined_ops.h:215:11: note: candidate: bool (*)(const std::tuple<float&, float&>&, const std::tuple<float&, float&>&) <conversion>
/usr/include/c++/7/bits/predefined_ops.h:215:11: note:   candidate expects 3 arguments, 3 provided
main.cpp:30:102: note: candidate: main()::<lambda(const std::tuple<float&, float&>&, const std::tuple<float&, float&>&)>
     std::sort(start, end, [](const std::tuple<float&, float&>& v, const std::tuple<float&, float&>& w) {
                                                                                                      ^
#include <algorithm>
#include <iostream>
#include <tuple>
#include <vector>
#include <random>
#include <tbb/iterators.h>

int main() {
    const int N = 10;
    std::vector<float> a(N), b(N);

    // First create an instance of an engine.
    std::random_device rnd_device;
    // Specify the engine and distribution.
    std::mt19937 mersenne_engine {rnd_device()};  // Generates random integers
    std::uniform_int_distribution<int> dist {1, 52};

    auto gen = [&dist, &mersenne_engine](){
                   return dist(mersenne_engine);
               };

    generate(begin(a), end(a), gen);
    generate(begin(b), end(b), gen);


    auto start = tbb::make_zip_iterator(a.begin(), b.begin());
    auto end   = tbb::make_zip_iterator(a.end(), b.end());

    std::sort(start, end, [](const std::tuple<float&, float&>& v, const std::tuple<float&, float&>& w) {
          return std::get<1>(v) < std::get<1>(w);
    });

    for (int i=0; i< N; ++i)
        std::cout << a[i] << " " << b[i] << std::endl;


    return 0;
}

Any suggestions?
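
One possible workaround that does not depend on how std::sort handles the tuple-of-references type, sketched under the assumption that temporary copies of both vectors are acceptable: sort a vector of indices by the key vector, then apply that permutation to both vectors. The function name sortPairedByB is hypothetical, and no TBB types are involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Reorder `a` and `b` together so that `b` ends up in ascending order,
// without using a zip (proxy) iterator.
void sortPairedByB(std::vector<float>& a, std::vector<float>& b) {
    assert(a.size() == b.size());

    // order[i] will hold the index of the element that belongs at position i.
    std::vector<std::size_t> order(b.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::sort(order.begin(), order.end(),
              [&b](std::size_t i, std::size_t j) { return b[i] < b[j]; });

    // Apply the permutation through temporary copies of both vectors.
    std::vector<float> sortedA(a.size());
    std::vector<float> sortedB(b.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        sortedA[i] = a[order[i]];
        sortedB[i] = b[order[i]];
    }
    a.swap(sortedA);
    b.swap(sortedB);
}
```

In the example above, this would replace the two make_zip_iterator calls and the std::sort call with sortPairedByB(a, b). The cost is one extra index vector plus one copy of each input.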

All the best

++T

TCE Open Date: 

Tuesday, December 17, 2019 - 22:17

windows 10 static/shared builds - which libs to use

Hi,

I'm on Windows 10, command line using nvcc to link objects to a 64 bit .exe

I see there's various dlls & libs for windows 10, 64 bit:

\compilers_and_libraries_2020.0.166\windows\ipp\lib\intel64_win has .libs,

as does subdirectories \ipp\intel64_win\threaded has .libs

& \ipp\intel64_win\tl\ has

\ipp\intel64_win\tl\openmp\ with _tl.dll & _tl.lib

&

\ipp\intel64_win\tl\tbb\ with _tl.dll & _tl.lib

If I'm using TBB in my app, and linking to the TBB libs (tbb & tbbmalloc), which TBB (vc_mt or vc14) & static IPP .libs  do I link to? I thought the TBB libs were dynamic/shared, but I don't see any TBB .dlls, only TBB .libs

Here are the include contents of TBB app:

#include <base_ipp.h>

#include "ippcore.h"
#include "ipps.h"
#include "ippi.h"

#ifdef USE_TBB
#define TBB_PREVIEW_MEMORY_POOL 1
#include "task_scheduler_init.h"
#include "parallel_for.h"
#include "blocked_range2d.h"
#include "memory_pool.h"
using namespace tbb;
#endif

Thanks

Ian

 

TCE Open Date: 

Monday, December 30, 2019 - 18:52

How to know that a function is run inside a TBB task

#define __TBB_VERSION_STRINGS(N) \
#N": BUILD_HOST        imbeu025 (x86_64)" ENDL \
#N": BUILD_OS        Ubuntu 18.04.2 LTS" ENDL \
#N": BUILD_KERNEL    Linux 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019" ENDL \
#N": BUILD_GCC        g++ (GCC) 7.4.0" ENDL \
#N": BUILD_LIBC    2.27" ENDL \
#N": BUILD_LD        " ENDL \
#N": BUILD_TARGET    intel64 on cc7.4.0_libc2.27_kernel4.15.0" ENDL \

Hello.

I want to use tbb::task::suspend/resume if I'm inside a TBB task and a regular semaphore otherwise. How can I check if the current function is run from inside a TBB task? Is the condition "tbb::task::self().state() == tbb::task::executing" good enough?

TCE Open Date: 

Thursday, January 2, 2020 - 05:45

getting error trying to use tbb on windows with mingw and clion

Here is my main function:

#include <tbb/tbb.h>

#include <iostream>

using namespace std;

using namespace tbb;

int main() {  return 0; }

And I got these errors:

D:/mingw64_9.1.0/mingw64/include/c++/9.1.1/bits/hashtable_policy.h: In member function 'std::__detail::_Map_base<_Key, _Pair, _Alloc, std::__detail::_Select1st, _Equal, _H1, _H2, _Hash, _RehashPolicy, _Traits, true>::mapped_type& std::__detail::_Map_base<_Key, _Pair, _Alloc, std::__detail::_Select1st, _Equal, _H1, _H2, _Hash, _RehashPolicy, _Traits, true>::operator[](std::__detail::_Map_base<_Key, _Pair, _Alloc, std::__detail::_Select1st, _Equal, _H1, _H2, _Hash, _RehashPolicy, _Traits, true>::key_type&&)':
D:/mingw64_9.1.0/mingw64/include/c++/9.1.1/bits/hashtable_policy.h:726:16: error: 'forward_as_tuple' is not a member of 'std'

 

My platform is Windows 10 (64-bit) + CLion + MinGW 9.1.0.

First I downloaded the zip of "https://github.com/intel/tbb.git". Second, I ran the command "cmake -G 'MinGW Makefiles'" from cmd, then ran "mingw32-make compiler=gcc arch=intel64 runtime=mingw", and finally "mingw32-make install".

After all the steps above, I got "C:\Program Files (x86)\tbb/bin", "C:\Program Files (x86)\tbb/include", and "C:\Program Files (x86)\tbb/lib".

I copied those three folders to "E:\CLionProjects\MYtoolTest\bin\tbb" (containing all the .lib and .dll files) and "E:\CLionProjects\MYtoolTest\include\tbb" (containing all the headers).

Then in my project's CMakeLists.txt I wrote this:

cmake_minimum_required(VERSION 3.14)

project(MYtoolTest)

set(CMAKE_CXX_STANDARD 17)

set(CMAKE_EXE_LINKER_FLAGS "-static-libgcc -static-libstdc++")

include_directories(include)
include_directories(E:/CLionProjects/MYtoolTest/MyTool/include)
include_directories(D:/ProgramData/Anaconda3/include)
include_directories(E:/CLionProjects/MYtoolTest/MyTool/include)
include_directories(E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl)
include_directories(E:/CLionProjects/MYtoolTest/include/tbb/internal)
include_directories(E:/CLionProjects/MYtoolTest/include/armadillo_9.800.3)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand)
include_directories(E:/CLionProjects/MYtoolTest/include/fftw-3.3.5-dll64)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens)
include_directories(E:/CLionProjects/MYtoolTest/include/tbb)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks)
include_directories(E:/CLionProjects/MYtoolTest/include/nolhmann)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/internal_fns)
include_directories(E:/CLionProjects/MYtoolTest/include/tbb/compat)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl)
include_directories(E:/CLionProjects/MYtoolTest/include/tbb/machine)
include_directories(E:/CLionProjects/MYtoolTest/include/openBLAS)
include_directories(E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc)
include_directories(E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/quadrature)
include_directories(E:/CLionProjects/MYtoolTest/include/gcem-master)
include_directories(E:/CLionProjects/MYtoolTest/include/armadillo_9.800.3/armadillo_bits)

link_directories(E:/CLionProjects/MYtoolTest/bin/tbb D:/ProgramData/Anaconda3/libs bin)

include_directories(E:/CLionProjects/MYtoolTest/bin/openBLAS)
include_directories(E:/CLionProjects/MYtoolTest/bin/tbb)
include_directories(E:/CLionProjects/MYtoolTest/bin/fftw-3.3.5-dll64)

add_executable(MYtoolTest E:/CLionProjects/MYtoolTest/MyTool/src/strStuff.cpp E:/CLionProjects/MYtoolTest/MyTool/src/glob.cpp E:/CLionProjects/MYtoolTest/MyTool/src/ioStuff.cpp E:/CLionProjects/MYtoolTest/MyTool/src/ioStuff.cpp E:/CLionProjects/MYtoolTest/MyTool/src/glob.cpp E:/CLionProjects/MYtoolTest/MyTool/src/strStuff.cpp E:/CLionProjects/MYtoolTest/src/main.cpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/cout.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/norm.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pt.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qexp.hpp E:/CLionProjects/MYtoolTest/include/tbb/scalable_allocator.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/pow.hpp E:/CLionProjects/MYtoolTest/MyTool/include/strStuff.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rwish.hpp E:/CLionProjects/MYtoolTest/include/tbb/index.html E:/CLionProjects/MYtoolTest/include/tbb/aligned_space.h E:/CLionProjects/MYtoolTest/include/tbb/global_control.h E:/CLionProjects/MYtoolTest/include/tbb/machine/windows_ia32.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/find_fraction.hpp E:/CLionProjects/MYtoolTest/include/tbb/parallel_scan.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/log.hpp E:/CLionProjects/MYtoolTest/include/tbb/parallel_for.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dnorm.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qunif.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dens.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/windows_api.h E:/CLionProjects/MYtoolTest/include/openBLAS/f77blas.h E:/CLionProjects/MYtoolTest/include/tbb/enumerable_thread_specific.h E:/CLionProjects/MYtoolTest/include/tbb/iterators.h 
E:/CLionProjects/MYtoolTest/MyTool/include/MySetStuff.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/log_binomial_coef.hpp E:/CLionProjects/MYtoolTest/include/tbb/parallel_while.h E:/CLionProjects/MYtoolTest/include/tbb/queuing_rw_mutex.h E:/CLionProjects/MYtoolTest/include/tbb/spin_mutex.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_async_msg_impl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qf.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/prob.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rand.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dgamma.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/chol.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rbinom.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/statslib_options.hpp E:/CLionProjects/MYtoolTest/include/tbb/concurrent_unordered_map.h E:/CLionProjects/MYtoolTest/include/tbb/machine/linux_ia32.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/spacing.hpp E:/CLionProjects/MYtoolTest/MyTool/include/matplotlibcpp.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/chisq.hpp E:/CLionProjects/MYtoolTest/MyTool/include/MyTimeStuff.hpp E:/CLionProjects/MYtoolTest/include/tbb/concurrent_lru_cache.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_aggregator_impl.h E:/CLionProjects/MYtoolTest/include/tbb/tick_count.h E:/CLionProjects/MYtoolTest/include/tbb/queuing_mutex.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dchisq.hpp E:/CLionProjects/MYtoolTest/include/tbb/compat/ppl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/repmat.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pchisq.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/is_odd.hpp 
E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/tan.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dbinom.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/plaplace.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_allocator_traits.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/t.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_types_impl.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/erf.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_tbb_windef.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_x86_eliding_mutex_impl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dlnorm.hpp E:/CLionProjects/MYtoolTest/MyTool/include/MySlice.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/macos_common.h E:/CLionProjects/MYtoolTest/include/tbb/memory_pool.h E:/CLionProjects/MYtoolTest/include/tbb/machine/msvc_armv7.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/quad_form.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pbeta.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/lnorm.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/atan2.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/n_elem.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/exp.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pgamma.hpp E:/CLionProjects/MYtoolTest/include/tbb/task_group.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/eye.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/get_mem_ptr.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_concurrent_queue_impl.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/incomplete_beta_inv.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/weibull.hpp 
E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rinvwish.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/solve.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/is_inf.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_item_buffer_impl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/get_row.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qlaplace.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/ceil.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/acos.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/neg_zero.hpp E:/CLionProjects/MYtoolTest/include/tbb/combinable.h E:/CLionProjects/MYtoolTest/MyTool/include/MyMapReduce.hpp E:/CLionProjects/MYtoolTest/include/openBLAS/lapacke.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dlaplace.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pnorm.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rmultinom.hpp E:/CLionProjects/MYtoolTest/include/tbb/concurrent_vector.h E:/CLionProjects/MYtoolTest/include/tbb/runtime_loader.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/zeros.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qt.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/gcc_itsx.h E:/CLionProjects/MYtoolTest/MyTool/include/dictTrans.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/df.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dinvwish.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rlaplace.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rt.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rchisq.hpp E:/CLionProjects/MYtoolTest/include/tbb/blocked_range3d.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/cosh.hpp 
E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qbeta.hpp E:/CLionProjects/MYtoolTest/include/tbb/parallel_do.h E:/CLionProjects/MYtoolTest/include/tbb/pipeline.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rnorm.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qbinom.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/icc_generic.h E:/CLionProjects/MYtoolTest/include/tbb/machine/linux_common.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_body_impl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dpois.hpp E:/CLionProjects/MYtoolTest/MyTool/include/ioStuff.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/plogis.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rmvnorm.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/sum_absdiff.hpp E:/CLionProjects/MYtoolTest/include/tbb/tbb_thread.h E:/CLionProjects/MYtoolTest/include/tbb/machine/gcc_arm.h E:/CLionProjects/MYtoolTest/include/tbb/tbb_stddef.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/acosh.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qinvgamma.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/beta.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qgamma.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/gcd.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/misc.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_impl.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/lbeta.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/internal_fns/statslib_defs.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rcauchy.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/linux_intel64.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pbern.hpp 
E:/CLionProjects/MYtoolTest/include/tbb/task_scheduler_observer.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/mean.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/cauchy.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/sqrt.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/unif.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/plnorm.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/binomial_coef.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/sunos_sparc.h E:/CLionProjects/MYtoolTest/include/tbb/machine/msvc_ia32_common.h E:/CLionProjects/MYtoolTest/include/tbb/reader_writer_lock.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/find_whole.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/trace.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/gcem_options.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/cumsum.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/beta.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/is_even.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rgamma.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pcauchy.hpp E:/CLionProjects/MYtoolTest/include/tbb/concurrent_priority_queue.h E:/CLionProjects/MYtoolTest/include/tbb/tbb_exception.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/n_cols.hpp E:/CLionProjects/MYtoolTest/include/tbb/flow_graph_opencl_node.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/bern.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rlnorm.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/expm1.hpp 
E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_indexer_impl.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_range_iterator.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/incomplete_beta.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/min.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/internal_fns/internal_fns.hpp E:/CLionProjects/MYtoolTest/include/tbb/tbb_allocator.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/cerr.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/pow_integral.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/binom.hpp E:/CLionProjects/MYtoolTest/MyTool/include/MyQueue.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/erf_inv.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/laplace.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/tanh.hpp E:/CLionProjects/MYtoolTest/include/tbb/flow_graph_abstractions.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qbern.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/gcc_ia32_common.h E:/CLionProjects/MYtoolTest/include/tbb/machine/gcc_generic.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/matrix_ops.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/ppois.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/resize.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qchisq.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/pois.hpp E:/CLionProjects/MYtoolTest/include/tbb/parallel_reduce.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_streaming_node.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/find_exponent.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/punif.hpp 
E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/mantissa.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dunif.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_cache_impl.h E:/CLionProjects/MYtoolTest/include/tbb/machine/ibm_aix51.h E:/CLionProjects/MYtoolTest/include/tbb/parallel_invoke.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/trans.hpp E:/CLionProjects/MYtoolTest/include/tbb/tbb_machine.h E:/CLionProjects/MYtoolTest/include/tbb/tbb_profiling.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/trunc.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dwish.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pexp.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dlogis.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/internal_fns/log_if.hpp E:/CLionProjects/MYtoolTest/include/tbb/blocked_range.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/atan.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/factorial.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dinvgamma.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/cos.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/var.hpp E:/CLionProjects/MYtoolTest/MyTool/include/jsonStuff.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qcauchy.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/abs.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/sin.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rexp.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/lgamma.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/round.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/quant.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_node_impl.h 
E:/CLionProjects/MYtoolTest/include/openBLAS/cblas.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_template_helpers.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/max.hpp E:/CLionProjects/MYtoolTest/include/tbb/parallel_for_each.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/inv.hpp E:/CLionProjects/MYtoolTest/include/tbb/aggregator.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rf.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qweibull.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/quadrature/gauss_legendre_50.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/is_nan.hpp E:/CLionProjects/MYtoolTest/include/openBLAS/lapacke_utils.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dcauchy.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dbeta.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dmvnorm.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pinvgamma.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/linux_ia64.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/sinh.hpp E:/CLionProjects/MYtoolTest/include/tbb/machine/mic_common.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/incomplete_gamma.hpp E:/CLionProjects/MYtoolTest/MyTool/include/glob.h E:/CLionProjects/MYtoolTest/include/tbb/concurrent_unordered_set.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_trace_impl.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_tbb_trace_impl.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/log1p.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pweibull.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qlogis.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dt.hpp E:/CLionProjects/MYtoolTest/MyTool/include/MySTLStuff.hpp 
E:/CLionProjects/MYtoolTest/include/tbb/blocked_range2d.h E:/CLionProjects/MYtoolTest/include/tbb/task_scheduler_init.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/runif.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_join_impl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/sanity_checks.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rinvgamma.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_x86_rtm_rw_mutex_impl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dweibull.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/internal_fns/exp_if.hpp E:/CLionProjects/MYtoolTest/include/tbb/mutex.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/fill.hpp E:/CLionProjects/MYtoolTest/include/tbb/parallel_sort.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/prob_val.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/sgn.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/asin.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_concurrent_unordered_impl.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rpois.hpp E:/CLionProjects/MYtoolTest/include/tbb/spin_rw_mutex.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rweibull.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rlogis.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/floor.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pf.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rbern.hpp E:/CLionProjects/MYtoolTest/include/tbb/flow_graph.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_tbb_hash_compare_impl.h E:/CLionProjects/MYtoolTest/include/tbb/null_rw_mutex.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/gamma.hpp 
E:/CLionProjects/MYtoolTest/include/tbb/partitioner.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dbern.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/logis.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/tgamma.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qnorm.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/prob/pbinom.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/asinh.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/n_rows.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_tbb_strings.h E:/CLionProjects/MYtoolTest/include/tbb/machine/mac_ppc.h E:/CLionProjects/MYtoolTest/include/tbb/internal/_flow_graph_tagged_buffer_impl.h E:/CLionProjects/MYtoolTest/include/tbb/tbb_config.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/lmgamma.hpp E:/CLionProjects/MYtoolTest/include/tbb/null_mutex.h E:/CLionProjects/MYtoolTest/MyTool/include/MySort.hpp E:/CLionProjects/MYtoolTest/include/tbb/internal/_mutex_padding.h E:/CLionProjects/MYtoolTest/include/tbb/critical_section.h E:/CLionProjects/MYtoolTest/include/tbb/recursive_mutex.h E:/CLionProjects/MYtoolTest/include/tbb/task_arena.h E:/CLionProjects/MYtoolTest/include/nolhmann/json.hpp E:/CLionProjects/MYtoolTest/include/tbb/concurrent_hash_map.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/log_det.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qpois.hpp E:/CLionProjects/MYtoolTest/include/tbb/atomic.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/incomplete_gamma_inv.hpp E:/CLionProjects/MYtoolTest/include/tbb/task.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/dens/dexp.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/det.hpp E:/CLionProjects/MYtoolTest/include/tbb/concurrent_queue.h 
E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/invgamma.hpp E:/CLionProjects/MYtoolTest/include/tbb/tbb_disable_exceptions.h E:/CLionProjects/MYtoolTest/MyTool/include/myStats.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/accu.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/f.hpp E:/CLionProjects/MYtoolTest/include/openBLAS/lapacke_mangling.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/quant/qlnorm.hpp E:/CLionProjects/MYtoolTest/include/tbb/tbb.h E:/CLionProjects/MYtoolTest/include/tbb/tbbmalloc_proxy.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/atanh.hpp E:/CLionProjects/MYtoolTest/include/tbb/blocked_rangeNd.h E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/sanity_checks/exp.hpp E:/CLionProjects/MYtoolTest/include/openBLAS/openblas_config.h E:/CLionProjects/MYtoolTest/MyTool/include/MultiThreadFrame.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/is_finite.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/quadrature/gauss_legendre_30.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/rand/rbeta.hpp E:/CLionProjects/MYtoolTest/include/stats-master/stats_incl/misc/matrix_ops/exp.hpp E:/CLionProjects/MYtoolTest/include/tbb/cache_aligned_allocator.h E:/CLionProjects/MYtoolTest/include/tbb/machine/windows_intel64.h E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/log.hpp E:/CLionProjects/MYtoolTest/include/gcem-master/gcem_incl/lcm.hpp E:/CLionProjects/MYtoolTest/include/openBLAS/lapacke_config.h E:/CLionProjects/MYtoolTest/include/fftw-3.3.5-dll64/fftw3.h D:/ProgramData/Anaconda3/include/Python.h )

TARGET_LINK_LIBRARIES(MYtoolTest E:/CLionProjects/MYtoolTest/bin/openBLAS/libopenblas.a E:/CLionProjects/MYtoolTest/bin/fftw-3.3.5-dll64/libfftw3f-3.lib E:/CLionProjects/MYtoolTest/bin/tbb/libtbbmalloc_proxy.dll.a E:/CLionProjects/MYtoolTest/bin/tbb/libtbbmalloc_static.a E:/CLionProjects/MYtoolTest/bin/tbb/libtbb_static.a E:/CLionProjects/MYtoolTest/bin/fftw-3.3.5-dll64/libfftw3-3.lib E:/CLionProjects/MYtoolTest/bin/tbb/libtbbmalloc_proxy_static.a E:/CLionProjects/MYtoolTest/bin/fftw-3.3.5-dll64/libfftw3l-3.lib E:/CLionProjects/MYtoolTest/bin/openBLAS/libopenblas.dll.a E:/CLionProjects/MYtoolTest/bin/tbb/libtbb.dll.a E:/CLionProjects/MYtoolTest/bin/openBLAS/libopenblas.lib E:/CLionProjects/MYtoolTest/bin/tbb/libtbbmalloc.dll.a python36.lib imagehlp.lib )

TARGET_LINK_LIBRARIES(MYtoolTest E:/CLionProjects/MYtoolTest/bin/openBLAS/libopenblas.dll E:/CLionProjects/MYtoolTest/bin/libdlltest.dll E:/CLionProjects/MYtoolTest/bin/openBLAS/libgcc_s_seh-1.dll E:/CLionProjects/MYtoolTest/bin/tbb/libtbbmalloc.dll E:/CLionProjects/MYtoolTest/bin/tbb/libtbb.dll E:/CLionProjects/MYtoolTest/bin/tbb/libtbbmalloc_proxy.dll E:/CLionProjects/MYtoolTest/bin/openBLAS/libgfortran-3.dll E:/CLionProjects/MYtoolTest/bin/fftw-3.3.5-dll64/libfftw3l-3.dll E:/CLionProjects/MYtoolTest/bin/fftw-3.3.5-dll64/libfftw3-3.dll E:/CLionProjects/MYtoolTest/bin/fftw-3.3.5-dll64/libfftw3f-3.dll )

I just don't know what I'm missing. Thank you for your valuable time in reading this.

 

 

TCE Open Date: Monday, January 6, 2020 - 23:38

reference to gateway_t from async_node


In the following code snippet, copied from https://github.com/Apress/pro-TBB/blob/master/ch18/fig_18_03.cpp, I have a question about the validity of the function parameter gateway.

How can I guarantee that it is safe to access gateway inside the following thread function? My concern is that if the flow graph is destroyed because of an exception, accessing gateway becomes unsafe, and asyncThread has no way of knowing that.

 

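void run(int input, gateway_t& gateway) {
    // Tell the graph that asynchronous work is pending.
    gateway.reserve_wait();
    asyncThread = std::thread{
        // gateway is captured by reference, so it must outlive this thread.
        [&,input]() {
            std::cout << "World! Input: "<< input << '\n';
            int output = input + 1;
            gateway.try_put(output);
            // Tell the graph that the asynchronous work is done.
            gateway.release_wait();
        }
    };
}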
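
One way to reason about the lifetime concern is the following TBB-free sketch. It assumes the object that owns the thread joins it in its destructor and is destroyed before whatever owns the gateway; C++ destroys local objects in reverse order of declaration, including during stack unwinding after an exception. `FakeGateway` and `demo` are hypothetical names introduced only for illustration and are not part of TBB. Whether the same ordering can be arranged in the real flow graph code depends on how the node body captures the activity object.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for gateway_t so the sketch builds without TBB.
struct FakeGateway {
    std::atomic<int> last_put{0};
    void reserve_wait() {}
    bool try_put(int v) { last_put = v; return true; }
    void release_wait() {}
};

class AsyncActivity {
public:
    // Joining here guarantees the thread has finished with the gateway
    // before this object's destruction completes.
    ~AsyncActivity() {
        if (asyncThread.joinable()) asyncThread.join();
    }

    template <class Gateway>
    void run(int input, Gateway& gateway) {
        // Assigning to a joinable std::thread calls std::terminate,
        // so finish any previous activity first.
        if (asyncThread.joinable()) asyncThread.join();
        assert(!asyncThread.joinable());

        gateway.reserve_wait();
        asyncThread = std::thread{[&gateway, input]() {
            int output = input + 1;
            gateway.try_put(output);
            gateway.release_wait();
        }};
    }

private:
    std::thread asyncThread;
};

// Hypothetical driver: the gateway is declared first, so it is destroyed last.
int demo(int input) {
    FakeGateway gateway;
    {
        AsyncActivity activity;        // destroyed (and joined) before gateway
        activity.run(input, gateway);
    }
    // The thread has been joined here, so the gateway is no longer in use.
    return gateway.last_put.load();
}
```

Calling `demo(41)` returns 42, because the inner scope cannot end until the thread has delivered its output.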

 

chialun

TCE Open Date: Tuesday, January 7, 2020 - 21:30

tbbmalloc memory usage and mimalloc benchmark results


Have you guys seen mimalloc? https://github.com/microsoft/mimalloc has some interesting benchmark results for tbbmalloc for peak working set (last set of graphs on that page). When we first started using jemalloc on Linux and tbbmalloc on Windows, our experience was that the peak working set with tbbmalloc was much worse, and we attributed this to the fact that we allocate on one thread and free on another. To ameliorate this, we resorted to calling scalable_allocation_command(TBBMALLOC_CLEAN_ALL_BUFFERS) after every simulation time step. The peak working set benchmarks in mimalloc's readme.md seem to suggest that tbbmalloc actually holds its own against jemalloc for workloads that do this (see the larsonN and mstressN results). However, the redis benchmark shows tbbmalloc being much worse than jemalloc. It might be worth investigating the behaviour here.

 

PS. The benchmark is in a separate repo: https://github.com/daanx/mimalloc-bench

Intel TBB Version 2020 Warnings


At my workplace, we are upgrading the Intel TBB library to Version 2020. After integrating the library, we started seeing deprecation warnings on all platforms. On Windows (MS Visual Studio 2017), these warnings are treated as errors; on macOS (Xcode 10.14) and Linux (GCC 6.3), they are displayed as harmless pragma messages. For example, on Windows the warning looks like this:

foo.cpp
c:\program files (x86)\microsoft visual studio\2017\professional\vc\tools\msvc\14.11.25503\include\exception(375): warning C4996: 'tbb::captured_exception::~captured_exception': was declared deprecated
C:\path\tbb\include\tbb\tbb_exception.h(206): note: see declaration of 'tbb::captured_exception::~captured_exception'
C:\path\tbb\include\tbb\tbb_exception.h(345): note: see reference to function template instantiation 'std::exception_ptr std::make_exception_ptr<tbb::captured_exception>(_Ex) noexcept' being compiled
        with
        [
            _Ex=tbb::captured_exception
        ]

On Linux and macOS:

../path/tbb/include/tbb/task_scheduler_init.h:21:154: note: #pragma message: TBB Warning: tbb/task_scheduler_init.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.
 #pragma message("TBB Warning: tbb/task_scheduler_init.h is deprecated. For details, please see Deprecated Features appendix in the TBB reference manual.")

While Linux and macOS are OK for the time being (we will migrate to newer or standard library features as suggested in the warnings and on the Intel TBB web pages here and here), what concerns us is the warnings on the Windows platform mentioned above. There are more than 1200 instances of this particular warning, and all of them emanate from the Intel TBB header file tbb_exception.h.

The warning is highlighting:

1) MSVC's header file exception, since the deprecated classes in tbb_exception.h use classes from the exception header

2) Any header file that includes tbb_exception.h, e.g. concurrent_map.h

We are aware that we can suppress these warnings via the TBB_SUPPRESS_DEPRECATED_MESSAGES macro.

Having said all this, is the TBB development team aware of these warnings, and in particular of the one emanating from tbb_exception.h?

If yes, do they plan to get rid of these warnings in a future version of the TBB library?

Attachment: main.cpp (text/x-c++src, 2.39 KB)

TCE Open Date: Tuesday, February 4, 2020 - 15:25

concurrent_vector, shrink_to_fit(), compact()


Hi All.

1. concurrent_vector.h mentions a compact() method:

@par Changes since TBB 2.0
[cut]
    - Added compact() method to defragment first segments

But there's no such method.

2. The documentation at software.intel.com/en-us/node/506203 states: "The method shrink_to_fit() merges several smaller arrays into a single contiguous array, which may improve access time."
But I found that shrink_to_fit() doesn't create a single contiguous array, and I need a single contiguous array to use an older API that expects pointers.
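
If a copy is acceptable, one possible workaround is to copy the elements into a std::vector, which is guaranteed to be contiguous, and pass its data() pointer to the older API. The sketch below is only an illustration: `to_contiguous`, `legacy_sum` and `demo_sum` are hypothetical names, std::deque is used as a stand-in for a segmented container so the example builds without TBB, and it assumes no other thread is growing the container while it is being copied.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Copies any container that exposes value_type, begin() and end()
// into a std::vector, whose storage is contiguous.
template <class Container>
std::vector<typename Container::value_type> to_contiguous(const Container& c) {
    return std::vector<typename Container::value_type>(c.begin(), c.end());
}

// Hypothetical older API that expects a pointer and a length.
long legacy_sum(const int* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += data[i];
    return total;
}

// std::deque stands in for a segmented (non-contiguous) container here.
long demo_sum() {
    std::deque<int> segmented{1, 2, 3, 4};
    std::vector<int> flat = to_contiguous(segmented);
    return legacy_sum(flat.data(), flat.size());
}
```

The cost is one extra copy of the data, and the copy does not track later changes to the original container.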

version info:

// Marketing-driven product version
#define TBB_VERSION_MAJOR 2019
#define TBB_VERSION_MINOR 0

// Engineering-focused interface version
#define TBB_INTERFACE_VERSION 11008
#define TBB_INTERFACE_VERSION_MAJOR TBB_INTERFACE_VERSION/1000

Any insights to alleviate my confusion?

Regards,
Sergei.

TCE Open Date: Thursday, February 6, 2020 - 01:08

perf between tbb::concurrent_unordered_map and std::unordered_map


Hi experts,

I made a small update to the official perf tool at https://github.com/intel/tbb/blob/tbb_2020/src/perf/time_hash_map.cpp (my updated code is attached), aiming to compare the performance of tbb::concurrent_unordered_map and std::unordered_map.

The result I got looks a little odd to me: the average latency of a single find/update operation gets much worse once the table size grows beyond a certain value (around 2,000,000 on my side).

Is this result expected? Does anyone know why the performance becomes worse than std::unordered_map when the table size grows larger?

I tried pre-allocating the capacity of the hash table, but the result is still the same.

The x axis is the element count in the table; the y axis is the latency in nanoseconds.
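
For readers without the attachment, the kind of measurement described above can be sketched as follows for the std::unordered_map side only. This is not the attached tool: `FindStats` and `time_finds` are hypothetical names, the keys are sequential integers for simplicity, and a real comparison would run the same loop against tbb::concurrent_unordered_map with the same key pattern.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <unordered_map>

struct FindStats {
    double ns_per_find;   // average latency of one find, in nanoseconds
    std::size_t hits;     // number of keys found
};

// Fills a table with n keys, then times n lookups.
FindStats time_finds(std::size_t n) {
    std::unordered_map<std::size_t, std::size_t> table;
    table.reserve(n);  // pre-allocate buckets, as tried in the post
    for (std::size_t i = 0; i < n; ++i) table.emplace(i, i);

    std::size_t hits = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        if (table.find(i) != table.end()) ++hits;
    }
    const auto stop = std::chrono::steady_clock::now();
    assert(hits == n);  // every inserted key must be found

    const double total_ns =
        std::chrono::duration<double, std::nano>(stop - start).count();
    return FindStats{n == 0 ? 0.0 : total_ns / static_cast<double>(n), hits};
}
```

Calling `time_finds` over a range of sizes gives one latency value per element count, which corresponds to one point on the plot.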

Attachment: time_hash_map.cpp (text/x-c++src, 8.33 KB)

TCE Open Date: Tuesday, February 25, 2020 - 08:16

Trouble Building TBB in windows with CMake


Hi, 

 

I'm trying to use TBB in my project by building it from source. I'm doing this on a Windows 10 system with the MSVC toolchain that comes with Visual Studio 2019 Community edition. I'm using the standard approach of `git submodule add`, pointing it at the TBB GitHub source to get the code. I then added the following to my main CMakeLists.txt.

# TBB
include(tbb/cmake/TBBBuild.cmake)
tbb_build(TBB_ROOT ${PROJECT_SOURCE_DIR}/tbb/ CONFIG_DIR TBB_DIR MAKE_ARGS tbb_cpf=1)

 

When I run the cmake configuration step I get the following error.

"C:\Program Files\JetBrains\CLion 2019.2\bin\cmake\win\bin\cmake.exe" -DCMAKE_BUILD_TYPE=Debug -G "CodeBlocks - NMake Makefiles" C:\Users\kartik\Projects\cv\hello_cv
-- The C compiler identification is MSVC 19.24.28316.0
-- The CXX compiler identification is MSVC 19.24.28316.0
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx86/x86/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx86/x86/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx86/x86/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx86/x86/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Intel TBB can not be built: required make-tool (gmake) was not found
TBB Dir = TBB_DIR-NOTFOUND

 

Is the gmake referred to here GNU Make? If so, why is there a dependency on GNU Make when I'm using the MSVC toolchain? In this case I'm using my IDE (CLion, which generates NMake files on Windows by default); however, I can reproduce the problem by running cmake in a PowerShell terminal.

 

 

PS C:\Users\kartik\Projects\cv\hello_cv\build-out> cmake -DCMAKE_BUILD_TYPE=Debug -G "Visual Studio 16 2019" ..
-- The C compiler identification is MSVC 19.24.28316.0
-- The CXX compiler identification is MSVC 19.24.28316.0
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx64/x64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx64/x64/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx64/x64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.24.28314/bin/Hostx64/x64/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Intel TBB can not be built: required make-tool (gmake) was not found
TBB Dir = TBB_DIR-NOTFOUND

 

Is there a way I can work around this? I have a MinGW toolchain, which should have the GCC tools; however, when I use that toolchain I get the same error. Any help would be appreciated.

 

Thanks

TCE Open Date: 

Thursday, February 27, 2020 - 12:36

unsafe_bucket_size in concurrent_unordered_base is not const?

Hi,

I'm reading the source code of concurrent_unordered_map and found that its unsafe_bucket_size interface is declared as a non-const member function.

I'm just curious: is this by design? I guess that if somebody invokes this method from a const method, the compiler will not allow it.

https://github.com/intel/tbb/blob/tbb_2020/include/tbb/internal/_concurr...

 

// Bucket interface - for debugging
    size_type unsafe_bucket_count() const {
        return my_number_of_buckets;
    }

    size_type unsafe_max_bucket_count() const {
        return segment_size(pointers_per_table-1);
    }

    size_type unsafe_bucket_size(size_type bucket) { // ====> non const?
        size_type item_count = 0;
        if (is_initialized(bucket)) {
            raw_iterator it = get_bucket(bucket);
            ++it;
            for (; it != my_solist.raw_end() && !it.get_node_ptr()->is_dummy(); ++it)
                ++item_count;
        }
        return item_count;
    }

    size_type unsafe_bucket(const key_type& key) const {
        sokey_t order_key = (sokey_t) my_hash_compare(key);
        size_type bucket = order_key % my_number_of_buckets;
        return bucket;
    }
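
The compiler behavior asked about above can be reproduced with a small toy class. This is a sketch, not TBB code: `ToyTable` and `total_items` are hypothetical, and only the const-ness of the two members mirrors the quoted header. It also shows a const_cast workaround, which is an assumption about what a caller might do rather than a recommended fix.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy container; each element is the item count of one bucket.
class ToyTable {
public:
    explicit ToyTable(std::vector<std::size_t> sizes)
        : bucket_sizes_(std::move(sizes)) {}

    std::size_t unsafe_bucket_count() const { return bucket_sizes_.size(); }

    // Declared non-const, like unsafe_bucket_size in the quoted header.
    std::size_t unsafe_bucket_size(std::size_t bucket) {
        return bucket_sizes_[bucket];
    }

private:
    std::vector<std::size_t> bucket_sizes_;
};

// Through a const reference, a direct call does not compile:
//     t.unsafe_bucket_size(b);   // error: call discards the const qualifier
// Casting away const compiles. It is well-defined here only because
// the member does not actually modify the object.
inline std::size_t total_items(const ToyTable& t) {
    ToyTable& mutable_t = const_cast<ToyTable&>(t);
    std::size_t total = 0;
    for (std::size_t b = 0; b < t.unsafe_bucket_count(); ++b)
        total += mutable_t.unsafe_bucket_size(b);
    return total;
}
```

So the guess in the post holds for this toy case: the const member can be called from a const context, while the non-const one needs either a non-const object or a cast.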

 

TCE Open Date: 

Saturday, February 29, 2020 - 06:49

Flow in the Flow graph

Hello everyone,

I'm really new to TBB Flow Graph and I'm facing an issue in a real-time detection application.

I use several sources (8 custom threads that read pixels from cameras) that send the data with try_put to a broadcast_node. The broadcast_node sends to a function_node for preprocessing (with unlimited concurrency), and the preprocessing function sends its result to a queue_node. Finally, the queue is connected to the detection block, which is another function_node. Because the computation (on the GPU) is expensive, I need to restrict the number of detectors to 4, so I set the maximum concurrency to 4.

These detectors are slow (15 fps), but the final frame rate has to match the frame rate of the cameras. So when a detector is busy, I simply skip the detector's function_node and send the data to another buffer (the previous queue_node is connected to it). Finally, I reorder the frames with a sequencer_node.

In this configuration I get odd behavior: only 4 frames pass through the detector block and the rest are skipped. I decided to replace the detector function_node with a limiter_node paired with a multifunction_node (output 0 for the data and output 1 for the continue_msg), but got the same result.
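
The "use a detector if one is free, otherwise bypass" policy described above can be sketched independently of the flow graph API. The code below uses only standard C++ atomics; `DetectorSlots`, `Route` and `route_frame` are hypothetical names, and this is an illustration of the intended routing logic, not a reconstruction of the graph in the post.

```cpp
#include <atomic>
#include <cassert>

// Tracks how many detectors are in use and caps them at a fixed limit.
class DetectorSlots {
public:
    explicit DetectorSlots(int limit) : limit_(limit), busy_(0) {}

    // Tries to claim a slot without blocking. Returns false when every
    // detector is busy, in which case the caller bypasses detection.
    bool try_acquire() {
        int current = busy_.load();
        while (current < limit_) {
            // On failure, current is reloaded and the limit is rechecked.
            if (busy_.compare_exchange_weak(current, current + 1)) return true;
        }
        return false;
    }

    // Must be called exactly once after each successful try_acquire().
    void release() { busy_.fetch_sub(1); }

    int busy() const { return busy_.load(); }

private:
    const int limit_;
    std::atomic<int> busy_;
};

enum class Route { Detector, Bypass };

// Decides where a frame goes; the detector path must call release() when done.
inline Route route_frame(DetectorSlots& slots) {
    return slots.try_acquire() ? Route::Detector : Route::Bypass;
}
```

In this sketch, a detector path that never calls release() lets exactly four frames through and bypasses every later one. Whether something analogous happens in the graph from the post cannot be determined from the description alone.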

 

It would be very helpful if someone could give me some clues.

Thank you

Alex

TCE Open Date: 

Friday, March 6, 2020 - 04:50

Future of thread_bound_filter? Options?

Pro TBB, Chapter 16 states: "As we noted in our earlier discussions in this chapter, affinity hints are not supported by tbb::parallel_pipeline. We cannot express that we prefer that a particular filter execute on a specific thread. However, there is support for thread-bound filters if we use the older, thread-unsafe tbb::pipeline class! These thread-bound filters are not processed at all by TBB worker threads; instead, we need to explicitly process items in these filters by calling their process_item or try_process_item functions directly. Typically, a thread_bound_filter is not used to improve data locality, but instead it is used when a filter must be executed on a particular thread – perhaps because only that thread has the rights to access the resources required to complete the action implemented by the filter. Situations like this can arise in real applications when, for example, a communication or offload library requires that all communication happen from a particular thread."

Now that pipelines are deprecated as part of the move to oneAPI, what are the options for getting performance similar to thread_bound_filter for soft real-time requirements?
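
For reference, the property that a thread-bound filter provides (every item of one stage runs on one specific thread) can be sketched with plain C++ threads. This is not a TBB or oneTBB API, and it makes no claim about real-time performance: `BoundStage` is a hypothetical class that simply funnels jobs to a single dedicated thread in submission order.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Runs every submitted job on one dedicated thread, in submission order.
class BoundStage {
public:
    BoundStage() : done_(false), worker_([this] { run(); }) {}

    // Signals shutdown, then waits until all queued jobs have run.
    ~BoundStage() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    BoundStage(const BoundStage&) = delete;
    BoundStage& operator=(const BoundStage&) = delete;

    // May be called from any thread; the job itself runs on the bound thread.
    void submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        ready_.notify_one();
    }

    std::thread::id thread_id() const { return worker_.get_id(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return done_ || !jobs_.empty(); });
            if (jobs_.empty()) return;  // only reached once done_ is set
            std::function<void()> job = std::move(jobs_.front());
            jobs_.pop();
            lock.unlock();  // run the job without holding the lock
            job();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> jobs_;
    bool done_;
    std::thread worker_;  // declared last so the members above exist first
};
```

Other stages, whether TBB tasks or ordinary threads, would hand work to such a stage with `submit`. The design choice is a mutex-protected queue, which is simple but adds a lock and a wake-up per item, so its latency would need to be measured against the soft real-time budget.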

TCE Open Date: 

Tuesday, March 10, 2020 - 06:53