 |
Welcome to the Natami / Amiga Forum. This forum is for AMIGA fans interested in the new NATAMI platform.
Please read the forum usage manual.
|
Do you have ideas and feature wishes? Post them here and discuss your ideas. |
| Audio DSP for the NatAmi |
|
|---|
|
|---|
Marcel Verdaasdonk Netherlands
| | Posts 3976 09 Sep 2010 08:09
| Actually, Claudio, there is a limit to how many cores you can add before the system becomes laggy due to the added cores. But I am making an assumption here that the number of cores required for that is much higher than the number you'll be implementing. CPU time is finite in a given time frame. ;) PS: "cores" as in units the system can do computational operations on.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 09 Sep 2010 08:57
| Amiga Believer wrote:
| The reason why it would be faster is obvious, the m96k can do three arithmetic operations in parallel (versus 1 for the 68k), |
OK, let's first correct this number. The 68060 can do:
- 1 address calculation
- 2 integer calculations
- 1 program-flow operation (branch)
= 4 operations per clock in parallel.
The 68070 design will add 2 more integer calculations = 6 operations per clock in parallel. + 1 load + 1 store = 8 operations per clock in parallel. But this value is a peak and should not be taken as a real-world figure. Your proposed SIMD unit would add 4 parallel 32-bit operations on top of this: + 4 parallel operations = 12 operations per clock peak. (But again, mind that this is a peak and only reachable with artificial benchmark code.)
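The tally above is plain addition. As a minimal sketch (the dictionary labels are only names chosen for this illustration, not hardware terms), it can be written out like this:

```python
# Peak operations per clock, using the figures quoted in the post above.
# The labels are illustrative names only.
m68060 = {"address calculation": 1, "integer calculation": 2, "branch": 1}
design_070_extra = {"integer calculation": 2, "load": 1, "store": 1}
simd_extra = {"32-bit SIMD lane": 4}

peak_68060 = sum(m68060.values())
peak_070 = peak_68060 + sum(design_070_extra.values())
peak_070_simd = peak_070 + sum(simd_extra.values())

print(peak_68060, peak_070, peak_070_simd)
```

These are peak values only, as the post stresses; sustained throughput on real code would be lower.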
Amiga Believer wrote:
| These transfers can be done in parallel to the DSP unit doing other instructions...
|
The advantage of the DSP came mainly from two things:
a) Local SRAM. This is comparable to cache or lockable cache. There are also 68K CPUs, like the ColdFire, which have local SRAM.
b) The REAL super advantage, the reason a DSP beat all generic CPUs of its time, was that the DSP had several memory buses (3) and thus three times the memory bandwidth of a normal CPU.
The power of the DSP is NOT computing but memory bandwidth. The 3 memory buses were the real strength of the DSP. But 3 separate, exclusive buses do not match the AMIGA design, and implementing three buses today makes no sense at all. Please understand that a DSP without its local buses is NOT better than a generic CPU. A DSP makes only sense if you build the whole system design around it with several independent memory buses.
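To make the bandwidth argument concrete, here is a rough sketch. It assumes a multiply-accumulate (MAC) step that needs one instruction fetch plus two operand fetches (a sample and a coefficient), and it assumes each bus delivers exactly one access per cycle. Both are simplifications for illustration; caches are ignored, and the clock value is a made-up example.

```python
import math

def cycles_per_mac(buses, accesses_per_mac=3):
    """Cycles one MAC step needs if memory access is the only bottleneck.

    Assumption: every bus serves exactly one access per cycle and the
    accesses of one MAC can be spread evenly across the buses.
    """
    return math.ceil(accesses_per_mac / buses)

def macs_per_second(clock_hz, buses):
    """Bandwidth-bound MAC rate for a given clock and bus count."""
    return clock_hz // cycles_per_mac(buses)

clock_hz = 32_000_000  # hypothetical clock, chosen only for the example
for buses in (1, 3):
    print(buses, "bus(es):", macs_per_second(clock_hz, buses), "MAC/s")
```

Under this model, a single shared bus yields a third of the MAC rate at the same clock, which matches the point that the buses, not the arithmetic, were the DSP's strength.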
| |
Marcel Verdaasdonk Netherlands
| | Posts 3976 09 Sep 2010 11:37
| On an A500 you had 3 buses where you could have a memory expansion: chip mem, the trapdoor expansion and the (Zorro) sidecar expansion slot.
| |
Sascha B Germany
| | Posts 131 09 Sep 2010 12:10
| Virtually the trapdoor is attached to the same Fat Agnus bus as chip ram is. That's the reason why it's called slow RAM: the Agnus DMA still blocks CPU memory access to $C00000... Only ROM and the expansion area are freely accessible to the CPU.
| |
Loïc Dupuy France
| | Posts 253 09 Sep 2010 18:13
| Sascha B wrote:
| Virtually the trapdoor is attached to the same Fat Agnus bus as chip ram is.
|
What a pity that the first Agnus chips were only able to access 512 KiB of chip RAM, and legacy compatibility is a bitch. I've got an OCS Fat Agnus, and I still had 512 KiB chip and 512 KiB slow RAM in the trapdoor expansion. Even with that, some games (Ghosts 'n Goblins, for example) did not work with the expansion RAM because they did not like Fat Agnus; the crack included a patch that reduced memory to 512 KiB chip so the game could be used. I suppose that was for economic reasons, but damn, losing so much power to save $20 in production. I would gladly have paid an extra $30 for a non-emasculated chipset. The same goes for the Falcon: it would have been a much sexier computer with a real 32-bit memory bus (and a working MiNT; single-tasking TOS was a no-go for me).
| |
Ceti 331 United Kingdom
| | Posts 282 09 Sep 2010 20:32
| Gunnar von Boehn wrote:
| Please understand that a DSP without its local buses is NOT better than a generic CPU. A DSP makes only sense if you build the whole system design around it with several independent memory buses. |
Generally, this is why Intel (much to Wojtek's frustration...) went the way it did: a generic CPU with SIMD, a wide bus and burst transfers ended up making DSP accelerator cards in PCs obsolete. I think DSPs are still used in embedded systems (e.g. phones, alongside ARM) where their instruction set can be specialized for one repetitive task. Also, DSP instruction encoding can be different to a CPU, i.e. general code density is not so important, VLIW makes sense, etc. I kind of agree that the Natami/Amiga, really being a 'multimedia computer' (the original!), suits CPU+SIMD, which is what the modern world has. Rather than a specialized sound-only DSP, you want a powerful CPU+accelerator that can be turned to sound, image processing, ray tracing, whatever.
| |
Amiga Believer Canada
| | Posts 282 10 Sep 2010 06:58
| > The problems with using software DSP effects and software synthesizers on modern audio workstations is that they are multitasking together with every other program.
This is one of the very problems of using a software-based approach. A co-processor, regardless of its type (96k, 56k, 68k or other), should be dedicated to audio and should not be allowed to do other tasks. It isn't for nothing that audio professionals install DSP cards in their computers instead of using the CPU for audio effects. This enables realtime behavior. This is the same reason why all cards from EMU and some cards from M-Audio have an on-board DSP.
The Amiga is built around the concept of using specialized co-processors. If you want generic cores, get a multicore PC. Each core is a general-purpose brute-force monster. This is like the winmodem debate. Moreover, if we want realistic audio positioning with HRTF or Ambisonics, or a feature such as wavetracing, it requires a dedicated co-processor; this is why sound cards by Aureal had a good DSP on board (Motorola 56362).
> If we want to make Natami a powerful system, we first have to overcome the most obvious lack of Amiga nowadays: Generic computing power.
I strongly disagree. It is true that the Amiga lacks CPU power; however, if we hardware-accelerate everything, this lack of power, while still there, will no longer be an issue.
> Also DSP instruction encoding can be different to a CPU, i.e. general code density not so important.. VLIW making sense, etc..
This is another important point: VLIW. Finally, there is the on-board SRAM, which boosts performance; moreover, if the SRAM is dual-ported, the DSP can do transfers between the external memory and the SRAM while at the same time working on said SRAM.
If developing an m56k-compatible DSP is too much work (even though I find it useful to support programs using the Delfina m56002), here are two other proposed approaches.
First, there are DSPs available as VHDL code, which can be used as a starting point. A second option would be to use a stripped-down N68k as a starting point. Take the 68k and do the following:
- Remove most of the integer unit (keep only the address registers and the EA unit)
- Remove the FPU
- Keep the SIMD unit
- Add a dual-ported local SRAM
- Make sure the processor only sees the SRAM as its memory
- Add a DMA unit to transfer data to and from the SRAM in parallel with the SIMD unit using said SRAM
- Make sure said co-processor is used for audio only
This design would allow reusing the design of the 68k SIMD unit, while the first suggestion (use an existing DSP VHDL core) has the advantage of supporting VLIW. We seem to have a DSP calculation specialist on board (Denis Markovic); maybe we can have him determine which extra instructions are useful for sound processing.
| |
Cesare Di Mauro Italy
| | Posts 526 10 Sep 2010 09:04
| Audio professionals will just... compute and play music. So the new SIMD unit can be used for their purposes, which will be available entirely for that. So, again, there's no need for a separate, specialized DSP. The SIMD unit is general-purpose enough to handle music tasks too, if needed.
| |
Amiga Believer Canada
| | Posts 282 10 Sep 2010 18:58
| > Audio professionals will just... compute and play music. So the new SIMD unit can be used for their purposes, which will be available entirely for that.
No, the tasks on the SIMD unit will have to go through multitasking, which by design cannot guarantee the realtime behaviour necessary for flawless realtime audio processing (even more so since AmigaDOS is not a realtime operating system).
Audio processing should be done on a dedicated unit to make sure it will never be stopped by another process taking over. Moreover, if you are making a soundtrack for a movie, you have to be able to decode the video in realtime with passable quality while playing back audio tracks mixed in realtime with added effects. If you miss a video frame, the world will not end, but if you miss a single audio sample the world will end: it will result in an audible glitch. The DSPs on the M-Audio and EMU sound cards are there for a good reason. The same goes for the DSP cards in a Pro Tools HD system.
For gaming audio, things like HRTF and wavetracing cannot tolerate the uncertainties associated with multitasking, otherwise there will be glitches in the audio; moreover, they will use 100% of the power of a dedicated unit (they require enough processing power to justify a dedicated unit). The Aureal SQ3500 was a good example of this sort of gaming audio: it provided hardware-accelerated wavetracing and true HRTF.
Audio has always been the single weak point of the Amiga. Another thing to keep in mind: we cannot beat a modern computer and GPU for graphics. We can easily beat a modern computer system when it comes to sound. High-end audio capabilities do not need a high clock speed, and the total processing power needed is much lower than for high-end graphics. We should pursue this objective. As I said, if designing an m56k-compatible DSP is too much work, we can take an existing VHDL DSP and integrate it into the design (or make our own simple 68k-derived DSP).
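The "missed sample" argument can be put in numbers. The sketch below assumes a fixed output rate and a buffer that must be refilled before it drains. The sample rate, buffer size, processing time and scheduling delays are made-up example values, not measurements of any real system.

```python
def buffer_period_s(buffer_samples, sample_rate_hz):
    """Time the hardware takes to play one buffer, i.e. the refill deadline."""
    return buffer_samples / sample_rate_hz

def underruns(deadline_s, processing_s, scheduling_delay_s):
    """True if the refill finishes too late and a glitch becomes audible."""
    return processing_s + scheduling_delay_s > deadline_s

sample_rate_hz = 48_000   # assumed output rate
buffer_samples = 256      # assumed buffer size
deadline = buffer_period_s(buffer_samples, sample_rate_hz)

# Same processing time, two hypothetical scheduling delays.
for delay in (0.001, 0.004):
    late = underruns(deadline, processing_s=0.002, scheduling_delay_s=delay)
    print(f"delay {delay * 1000:.0f} ms -> {'glitch' if late else 'ok'}")
```

The processing load is identical in both cases; only the delay before the audio code gets to run decides whether the deadline is met, which is the concern raised about multitasking.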
| |
Cesare Di Mauro Italy
| | Posts 526 10 Sep 2010 21:36
| With AmigaOS you can own and lock the Blitter, so you can be sure you'll be the only one using it. I think the same can be done with the SIMD coprocessor.
| |
Amiga Believer Canada
| | Posts 282 10 Sep 2010 21:58
| The SIMD unit is not a coprocessor.
| |
Cesare Di Mauro Italy
| | Posts 526 11 Sep 2010 05:27
| I said "SIMD coprocessor", not "SIMD unit". I was referring to the "Robin" processor, which will be 68K-based with a SIMD unit added so that heavy calculations can be delegated to it. And it's a coprocessor in the Natami system, because the CPU is the N050/070.
| |
Amiga Believer Canada
| | Posts 282 11 Sep 2010 13:48
| Regardless of the approach used, there is one point which I maintain: audio should have its own co-processor which is not shared with anything else, regardless of the exact nature of said co-processor.
| |
André Jernung Sweden
| | (MX-Board Owner) Posts 988 11 Sep 2010 13:56
| Amiga Believer wrote:
| Regardless of the approach used, there is one point which I maintain: audio should have its own co-processor which is not shared with anything else, regardless of the exact nature of said co-processor.
|
I agree with you. For professional audio software, dedicated processors are needed to ensure time-critical audio handling. Even software on the most expensive audio workstations available today sometimes fails simply because it multitasks the audio tasks with the system or other software. The programmer could use one or more Natami 68k subcores entirely dedicated to processing audio if needed. For example, one subcore could be entirely dedicated to a software synthesizer. With the solution Claudio Wieland is proposing for the subcores, multitasking on an individual core is possible, but not required. So one or more 68k cores could become your 100% solid "DSP chip" :)
| |
Claudio Wieland Germany
| | (Natami Team) Posts 703 11 Sep 2010 16:42
| Regarding "multitasking": to avoid confusion with regular EXEC tasks, I'd rather call it "multijobbing". But otherwise, yes ^^ .
| |
Amiga Believer Canada
| | Posts 282 11 Sep 2010 21:22
| > The programmer could use one or more Natami 68k subcores entirely dedicated to processing audio if needed.
The concept of subcores is not the best idea; we are not trying to create a 68k-based equivalent of the IBM Cell. The use of co-processors should not be chosen by the programmer, it should be fixed by the operating system. Moreover, an audio co-processor should be connected to ChipRAM while a "subcore" would likely be connected to FastRAM, which will slow down the main processor by creating a bottleneck for FastRAM access.
An audio co-processor should be locked by AmigaDOS for audio only (unless a program does hardware banging, which is bad programming practice anyway). I would suggest having an audio filter creation API integrated in the operating system and having programs use the co-processor through said API. This is a clean approach. If we really want to let programmers use it directly (instead of through functions in a library), there should be a driver called dsp.device.
For game audio, there can be a wavetracing API and a positional audio API which switches automatically between HRTF when the user selects "headphones" and Ambisonics when the user selects "speakers" in the Workbench preferences. Having true HRTF support can also enable virtual surround on headphones for, say, a DTS movie soundtrack, but this requires a dedicated audio co-processor which does not undergo multitasking, as said before.
> So one or more 68k cores could become your 100% solid "DSP chip"
Only one powerful co-processor is needed for audio. Using the 68k as a starting point is not necessarily a bad approach; however, if the 68k is used as the base for an audio co-processor, it should not be a vanilla 68k. It should include two sets of modifications:
1 Addition of on-board dual-ported SRAM
- On-board SRAM would boost performance.
- The core should only see its SRAM, not the distant ChipRAM.
- A DMA unit should be added to do transfers in parallel between the SRAM and the NatAmi ChipRAM.
Without local RAM, the unit would have to work on the NatAmi ChipRAM, which would be slow and whose bus is shared with other components, so there would be a risk of a bottleneck. As others said in this forum topic, local SRAM is one of the things which give a DSP higher performance than a generic processor for DSP tasks.
2 Remove everything unnecessary for an audio DSP
This includes things like, for example, 64-bit floating-point instructions (32 bits are enough for audio).
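The SRAM-plus-DMA scheme described above is essentially double ("ping-pong") buffering. Here is a minimal software model of it, assuming two local buffers and a stand-in `dma_fetch` function. All names are hypothetical, and real DMA would run concurrently in hardware, whereas here it is simulated sequentially.

```python
def dma_fetch(chip_ram, start, length):
    """Stand-in for a DMA transfer from shared memory into local SRAM."""
    return chip_ram[start:start + length]

def process(block, gain):
    """Placeholder audio effect: a simple gain applied to each sample."""
    return [sample * gain for sample in block]

def run_ping_pong(chip_ram, block_len, gain):
    """Process chip_ram block by block using two alternating local buffers.

    While the core works on one buffer, the next block is already loaded
    into the other one, so the core never touches chip_ram directly.
    """
    sram = [dma_fetch(chip_ram, 0, block_len), []]   # buffer 0 preloaded
    out, active, pos = [], 0, 0
    while sram[active]:
        # "DMA" fills the idle buffer with the following block ...
        sram[1 - active] = dma_fetch(chip_ram, pos + block_len, block_len)
        # ... while the core processes the active one.
        out.extend(process(sram[active], gain))
        pos += block_len
        active = 1 - active
    return out

print(run_ping_pong([1, 2, 3, 4, 5], block_len=2, gain=2))
```

Whether this beats a data cache with outstanding loads is exactly the open question in the thread; the sketch only shows the access pattern, not its performance.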
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 11 Sep 2010 21:37
| The topic could have been good. It's a pity that, instead of discussing requirements and possible solutions, some proposals seem to be based on misunderstandings of so-called facts. Without a clear understanding of how the AMIGA works, how caches work, or knowledge of the instruction set of the 68K CPU, how can you make a good proposal?
Amiga Believer wrote:
| The concept of subcores is not the best idea, |
Well, this is just your opinion. Many will say that sub-cores are the most flexible, easy to program, and faster to develop than any other solution.
Amiga Believer wrote:
| Moreover, an audio co-processor should be connected to ChipRAM while a "subcore" would likely be connected to FastRAM, |
No one said that sub-cores are connected to FastRAM only. There is no reason why a sub-core should not be able to work on chip mem.
Amiga Believer wrote:
| It should include two sets of modifications: On-board SRAM would boost performance. |
A data cache gives the same performance and is so much more flexible and easier to code for. You need to explain to us where the benefit of "plain" SRAM would be.
Amiga Believer wrote:
| ... A DMA unit should be added to do transfers in parallel, |
Well, from the system perspective the whole sub-core will work like a DMA unit. Also, an extra DMA unit inside a sub-core will have the same net effect as parallel execution of outstanding loads. And adding outstanding loads to the 68K cores was already discussed a long time ago.
I assume that you want to solve problems with your ideas. But it's not really clear to me which "problems" are really there. I also get the impression that there are already solutions for some of those problems. What I'm missing is a clear definition of what performance is actually needed to meet the AUDIO requirement. What DSP model (at which MHz) is in your opinion sufficient for this? Was the DSP used in the Falcon sufficient? How many MIPS or MFLOPS are needed for these algorithms? Or, if someone used a dedicated 68030 core for this, what clock rate would the 68030 need for these routines?
| |
Marcel Verdaasdonk Netherlands
| | Posts 3976 12 Sep 2010 02:53
| Amiga Believer, Gunnar: both of you have a biased opinion on this. Audio is a question of jitter, latency, and several other things. I assume Claudio already looked into this; at least I hope he did. As it has been explained to me, the sub-cores are a sub-optimal solution for audio. Then again, it hasn't been explained that well, nor have I seen the further ideas on them. Dual-ported SRAM is absolutely not an economically viable idea. And Amiga Believer, read the SIMD thread: the SIMD unit would share components with the FPU, so you might want to rethink the stripped-down m68k.
| |
Claudio Wieland Germany
| | (Natami Team) Posts 703 12 Sep 2010 06:29
| The kind of multiprocessing which I intend to put to use in the future will probably allow for instruction-exact control over process run times. What you make of those abilities later on is, frankly, not my concern. Why should it be? It's your decision as a coder. If you want jitter- and latency-free audio DSP functionality, then it is up to your code to implement it. Gunnar's critique is sound: it does not help us to discuss this without hard numbers on how much CPU power is needed for certain applications. I agree, this is actually "hot air". If someone can deliver real cycle numbers for certain effects on common DSPs, we can go on discussing this subject. Cheers
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 12 Sep 2010 08:21
| Marcel Verdaasdonk wrote:
| Audio is a question of jitter, latency, and several other things.
|
What requirements do we really have:
- How many integer MIPS?
- How many MFLOPS?
- How much memory bandwidth?
We should also rethink whether this is just a software requirement. Maybe the real-time requirements could be met with the correct setting of task priority and/or using interrupts. Before we talk about dedicated units, we should validate whether the obviously simplest solution works.
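As a starting point for such numbers, a back-of-the-envelope estimate can be sketched. It assumes the workload is a direct-form FIR filter (one multiply-accumulate per tap per output sample) and that each MAC reads one sample and one coefficient from memory. The function name and all parameter values are illustrative assumptions, not measured requirements of any audio application.

```python
def fir_requirements(sample_rate_hz, channels, taps, bytes_per_value=4):
    """Estimate compute and memory traffic for a direct-form FIR filter.

    Each output sample needs `taps` multiply-accumulates, and each MAC
    reads one input sample and one coefficient (no cache reuse assumed).
    """
    macs_per_s = sample_rate_hz * channels * taps
    bytes_per_s = macs_per_s * 2 * bytes_per_value
    return {"mmacs": macs_per_s / 1e6, "mbytes_per_s": bytes_per_s / 1e6}

# Hypothetical workload: stereo, 48 kHz, 128-tap filter per channel.
print(fir_requirements(48_000, channels=2, taps=128))
```

Plugging in the effects people actually want to run would turn the three questions above into concrete figures that can be compared against a given core and clock rate.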
| |