Looking for an explanation of why multicore is problematic for realtime music apps

MvKeinen Member Posts: 36 Member

I remember an excellent post by an NI official explaining why multicore is problematic for realtime music apps, probably in the old forum. Does anyone remember it? Or are there other resources on the net where it's explained?

I know DAWs can put different tracks on different cores. But if you have a big Reaktor ensemble, it can't be calculated on more than one core, AFAIK.

Is it the realtime aspect of the whole thing, where the synchronisation of different calculations on different cores would use too many resources?



  • Isotoxin
    Isotoxin Member Posts: 108 Helper

    Problem with multicore is more related to programming and not only to music production:

  • Big Gnome
    Big Gnome Member Posts: 14 Helper

    I do remember some of those discussions from a few years ago; they're lost to the old forum now, but I'll try to dig something up. For a start, there's a brief exchange where this is touched upon here-- https://web.archive.org/web/20220826042518/https://www.native-instruments.com/forum/threads/considered-harmful.262873/#post-1429921

  • colB
    colB Member Posts: 630 Guru
    edited September 19

    Is it the realtime aspect of the whole thing, where the synchronisation of different calculations on different cores would use too many resources?

    In a general, non-Reaktor context, I think there are multiple reasons (please note that I'm not an expert and the limited understanding I do have is very out of date :))

    e.g. for starters, to use multi core, you need to have multiple things that can be processed at the same time. For that to make sense, they cannot be dependent on each other. And ideally, they won't be using or manipulating the same resource(s) as each other.

    In a synth, you might have a chain of processes: an oscillator, an envelope, a filter, a reverb effect. It would be great to take each one of those elements and stick them on different cores... but wait, the filter needs the output of the oscillator, so it would have to wait for that before it could do any processing... and the reverb needs the output of the filter... oops.
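    The dependency described above can be sketched in a few lines. This is only an illustration, not how Reaktor or any particular synth is implemented; the function names, the block size and the simple gain stage standing in for the reverb are all assumptions made for the example.

```python
import math

SAMPLE_RATE = 48_000   # assumed sample rate
BLOCK_SIZE = 64        # assumed buffer size in samples


def oscillator(phase, freq=440.0):
    """Return one block of a sine wave and the phase to resume from."""
    step = 2.0 * math.pi * freq / SAMPLE_RATE
    block = [math.sin(phase + n * step) for n in range(BLOCK_SIZE)]
    return block, (phase + BLOCK_SIZE * step) % (2.0 * math.pi)


def lowpass(block, state, coeff=0.1):
    """One-pole low-pass filter. It cannot start without the oscillator block."""
    out = []
    for x in block:
        state += coeff * (x - state)
        out.append(state)
    return out, state


def gain(block, amount=0.5):
    """Stand-in for the last stage (e.g. an effect). Needs the filter output."""
    return [amount * x for x in block]


def render_block(phase, filter_state):
    # Each line consumes the result of the line before it, so the three
    # stages cannot run at the same time for the same block.
    osc_out, phase = oscillator(phase)
    filt_out, filter_state = lowpass(osc_out, filter_state)
    return gain(filt_out), phase, filter_state
```

    Putting `oscillator`, `lowpass` and `gain` on three cores would not help here: for any given block, two of the three cores would simply be waiting.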

    There are compromises that can be made, but they have to be made on a per-project basis. There is no one size fits all here, so it's something of an art that cannot be automated successfully without severely limiting what can be done.

    e.g. you could incorporate delays into the structure, so that the filter is processing an older oscillator output, so it doesn't need to wait for the current oscillator output... that can work just fine... unless (hehe) you want some system where there are multiple chains with varying numbers of elements that are later combined, then there could be phasing issues and whatever... and then there are problems too with feedback where you definitely want to minimise the delay ideally to a unit delay (a single sample rate clock tick).
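    A rough way to picture the delay trick: each stage reads what the previous stage wrote on the previous tick, so within one tick no stage depends on another and they could in principle run on separate cores. The sketch below simulates this sequentially; `run_pipeline` and the toy stages are hypothetical names for illustration, and the "blocks" are tiny lists rather than real audio buffers.

```python
def run_pipeline(stages, input_blocks):
    """Simulate a pipeline with a one-tick delay between neighbouring stages."""
    # mailboxes[i] holds the block that stage i will consume on this tick.
    mailboxes = [None] * len(stages)
    outputs = []
    for block in input_blocks:
        mailboxes[0] = block
        # Every stage only reads its own mailbox, so the calls in this list
        # are independent of each other within a single tick.
        results = [
            stage(inbox) if inbox is not None else None
            for stage, inbox in zip(stages, mailboxes)
        ]
        outputs.append(results[-1])
        # Hand each result to the next stage for the *next* tick.
        mailboxes[1:] = results[:-1]
    return outputs


def add_one(block):
    return [x + 1 for x in block]


def double(block):
    return [x * 2 for x in block]


def minus_three(block):
    return [x - 3 for x in block]


outputs = run_pipeline([add_one, double, minus_three], [[1], [2], [3], [4], [5]])
print(outputs)  # [None, None, [1], [3], [5]]
```

    The first real output appears two ticks late: a chain of N stages adds N-1 blocks of latency. A shorter chain running alongside would be delayed by a different amount, which is where the alignment problem with mixed chains comes from.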

    (it's not as simple as this in real life though, because the processing isn't done on a per sample basis, it is applied to buffer chunks)

    And then, as you suggested, there are synchronisation overheads. When the oscillator and filter and reverb have done their processing, there needs to be some mechanism by which they can share their inputs and outputs... They can't just bang on completely independent of each other, and they are not likely to take the same number of CPU cycles... I suppose that CPU time on the cores is managed by the operating system, and context switching (between your oscillator, your sound card driver, the web browser running in the background, the various Windows services etc...) is costly. So it's really important not to have trivially small processes running in this way, where the process itself uses less CPU than the cost of context switching...
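    To get a feel for the hand-off cost, the sketch below runs the same trivially small stage twice: once inline, and once by passing every block to a worker thread through a queue and waiting for the reply. It is an assumption-laden toy (the names are made up, and CPython threads do not run Python code on several cores at once), so it only illustrates the cost of handing work over, not any multicore gain. The timings will vary from machine to machine.

```python
import queue
import threading
import time


def tiny_stage(block):
    """A deliberately trivial amount of work per block."""
    return [0.5 * x for x in block]


def run_inline(blocks):
    start = time.perf_counter()
    out = [tiny_stage(b) for b in blocks]
    return out, time.perf_counter() - start


def run_with_handoff(blocks):
    requests, replies = queue.Queue(), queue.Queue()

    def worker():
        while True:
            block = requests.get()
            if block is None:      # sentinel: no more work
                return
            replies.put(tiny_stage(block))

    thread = threading.Thread(target=worker)
    thread.start()
    start = time.perf_counter()
    out = []
    for b in blocks:
        requests.put(b)            # hand the block over...
        out.append(replies.get())  # ...and wait for the result
    elapsed = time.perf_counter() - start
    requests.put(None)
    thread.join()
    return out, elapsed


blocks = [[float(n)] * 16 for n in range(2000)]
inline_out, inline_time = run_inline(blocks)
handoff_out, handoff_time = run_with_handoff(blocks)
print(f"inline:  {inline_time / len(blocks) * 1e6:.2f} us per block")
print(f"handoff: {handoff_time / len(blocks) * 1e6:.2f} us per block")
```

    Both versions produce the same output. When the per-block work is this small, the time spent on the hand-off itself is the thing being measured in the second version.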

    I dunno, it's just freakishly complex. I guess modern programming techniques and operating systems have a pretty good handle on things to make them as efficient as possible, and some audio apps definitely use these features, but only where it makes sense, like a DAW with many tracks that are basically independent of each other and, to the extent that they are not (inputs and outputs), completely predictable.

    Some processes naturally lend themselves to concurrency, stuff like additive synthesis, or maybe certain types of reverb algorithm, maybe heavy FFT based processing... but for some of these maybe parallelism via SIMD is more appropriate?
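    Additive synthesis is a handy example of why some algorithms split up easily: every partial can be rendered without knowing anything about the others, and only the final mix needs all of them. A minimal sketch, with an assumed block size and made-up function names; it uses a thread pool purely to show the structure, since CPython threads will not give a real multicore speed-up for this kind of work.

```python
import math
from concurrent.futures import ThreadPoolExecutor

SAMPLE_RATE = 48_000   # assumed sample rate
BLOCK_SIZE = 64        # assumed buffer size in samples


def render_partial(partial):
    """Render one sine partial. Depends only on its own (freq, amp) pair."""
    freq, amp = partial
    step = 2.0 * math.pi * freq / SAMPLE_RATE
    return [amp * math.sin(n * step) for n in range(BLOCK_SIZE)]


def mix(blocks):
    """Sum the partials sample by sample. The only step that needs them all."""
    return [sum(samples) for samples in zip(*blocks)]


def render_serial(partials):
    return mix([render_partial(p) for p in partials])


def render_concurrent(partials, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so the mix is identical.
        return mix(list(pool.map(render_partial, partials)))


# Eight harmonics of 110 Hz with 1/k amplitudes (a rough sawtooth).
partials = [(110.0 * k, 1.0 / k) for k in range(1, 9)]
```

    `render_serial(partials)` and `render_concurrent(partials)` return the same block. Compare this with the oscillator, filter, reverb chain, where no such split exists.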

    That's why multicore in Reaktor is so difficult. Whether and how multicore should be used is a per-project thing, and requires some significant engineering knowledge and experience. But Reaktor is a development environment for non-programmers. How do NI incorporate multicore mechanisms into Primary or Core that can be successfully used by non-programmers, and are still effective in dramatically improving CPU efficiency? I don't think that is possible.

    Alternatively it could be 'automatic' with no knowledge required, but then it could only really be a poly voice based thing, in which case it will be mostly useless for Blocks, or a per instrument chain type of thing in which case it forces certain types of use case, and is something of a waste as that's already possible using a DAW and multiple Reaktor plugin instances.

    ...we also have to remember that this discussion exists in a context where we are still waiting on core iteration and some form of code reuse mechanism. Both of these are achievable and would each likely provide at least as much or better efficiency boost than multicore support.

  • MvKeinen
    MvKeinen Member Posts: 36 Member

    Thanks a lot! @Big Gnome and @colB

    very helpful! Got to read this twice I guess :)

  • Kubrak
    Kubrak Member Posts: 2,453 Expert

    There is yet another aspect.....

    Even if an application uses just one thread... In most cases, unless the application developer takes care of it, the OS decides which core will be used. And not only at start: the OS may dynamically move a given thread from one core to another (to let the used one cool down... or so...).

    And some cores can do more work, others a bit less. And moving a thread takes some time, introducing latency...

    And with the introduction of e-cores by Intel and Apple, the situation becomes wilder still. Nothing is playing for a moment, so why not move the task to an e-core if it fits there? And later on, whoa, it suddenly needs a hell of a lot of CPU power... Move it to a p-core! But that takes a considerable amount of time. At least on Intel CPUs (12th and 13th gen.)...
