|
|---|
Team Chaos Leader USA
| | (Natami Team Member) Posts 1199 15 Jan 2010 00:42
| Thomas Hirsch wrote:
| bitplanes The bitplane DMA has RAM access now. Each plane now uses a 32-byte buffer to store incoming data. The data-receiving path is now detached from the logic that initiates read accesses.
|
Hooray! :DancingBanana:
| |
Thomas Hirsch Germany
| | (Natami Team Member) Posts 233 22 Jan 2010 20:01
| Doublescan  This is almost the same picture as the last one, but with active scandoubling and four planes instead of two, so the circle is a circle again. Every plane draws the same circle at a different location. The green circle shows a bug which needs to be fixed. The next picture shows the very same setup, but in hires mode.  As the first value of the OSD register display shows, the four-plane hires display needs only 3% of the memory bandwidth, compared with the full bandwidth an OCS Amiga would need. When all 8 planes are turned on it takes about 5%. And just as a reminder: the LX board is just the evaluation hardware with *only* 16-bit RAM. The hires resolution here is an OCS screen with 640x256. It is scandoubled to a display of 800x600 50Hz SVGA resolution. This is the default behaviour for OCS screens. Writing to the ECS resolution registers drops the scandoubling. With that, any display and resolution can be set up, the only limitation being that the pixel frequency is still 28 MHz. For now this limitation does not matter, but of course it needs to be changed later on. For that, SAGA registers need to be defined that enable a maximum pixel frequency of about 120 to 160 MHz.
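A rough Python sketch of the raw bitplane fetch rate behind these figures. It assumes one bit per pixel per plane and that each of the 256 source lines is fetched once per frame (the post does not say how the scandoubler re-reads lines); the function name `bitplane_bytes_per_second` is made up for illustration. It only shows that 8 planes need twice the data of 4 planes, which is roughly in line with the step from about 3% to about 5% on the bus counter.

```python
def bitplane_bytes_per_second(width, height, planes, refresh_hz):
    """Raw bitplane data per second, assuming 1 bit per pixel per plane
    and every source line fetched exactly once per frame (assumption)."""
    bits_per_frame = width * height * planes
    return bits_per_frame // 8 * refresh_hz

# OCS hires screen from the post: 640x256 at 50 Hz
four_planes = bitplane_bytes_per_second(640, 256, 4, 50)
eight_planes = bitplane_bytes_per_second(640, 256, 8, 50)

print(four_planes)                 # 4096000 bytes/s
print(eight_planes)                # 8192000 bytes/s
print(eight_planes / four_planes)  # 2.0
```

The bus counter percentages also include whatever overhead the memory interface adds, so they need not scale exactly with these raw numbers.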
| |
Team Chaos Leader USA
| | (Natami Team Member) Posts 1199 22 Jan 2010 20:24
| When all 8 planes are turned on it takes about 5%.
|
On AGA that takes 50%. So the current Natami hardware is 10x the speed of AGA. Woohoo!!!!!!!!!! Imagine playing Total Chaos AGA at 10x the speed! That is faster than WinUAE can do it! :Excitement: :Shock:
| |
Ayodele Stephenson USA
| | Posts 57 22 Jan 2010 20:39
| Wow!! this is good news!!!
| |
Marcel Verdaasdonk Netherlands
| | Posts 2089 22 Jan 2010 20:45
| TCL, of course we do need the bug fixed, and during that phase perhaps an optimization can be found. 10x speed might just be a beginning here. I can't wait for the full board, since the LX still only has 16 bits. ;) Baby steps seem like leaps here. :P
| |
Bartek K United Kingdom
| | (Natami Team Member) Posts 1330 22 Jan 2010 21:24
| Hallo Thomas! You've made my day! :D This is what I call an update. Good luck and may the NatAmi force be with You!
| |
Bartek K United Kingdom
| | (Natami Team Member) Posts 1330 22 Jan 2010 22:21
| Team Chaos Leader wrote:
|
When all 8 planes are turned on it takes about 5%.
Woohoo!!!!!!!!!! Imagine playing Total Chaos AGA at 10x the speed! That is faster than WinUAE can do it! :Excitement: :Shock:
|
Imagine playing Total Chaos AGA at 20x speed with NatAmi 32bit! :D
| |
Guillaume Michalakakos France
| | (Natami Team Member) Posts 144 22 Jan 2010 22:45
| Wonderful work, Thomas! Respect.
| |
Alexander Moon Denmark
| | Posts 29 22 Jan 2010 22:49
| Awesome, simply awesome! Great work
| |
Mr Copland ;) United Kingdom
| | (Natami Team Member) Posts 452 22 Jan 2010 22:53
| Wow Thomas that is great!
| |
Channel Z
| | Posts 213 22 Jan 2010 23:05
| Seeing the chipset slowly coming alive step by step is - moving, to be honest. This is 100% true Amiga hardware. Wow.
| |
Gunnar von Boehn Germany
| | (Natami Team Member) Posts 3727 23 Jan 2010 07:43
| Thanks for the status update! It's nice to see the step-by-step bring-up progress.
| |
Gunnar von Boehn Germany
| | (Natami Team Member) Posts 3727 23 Jan 2010 07:57
| Team Chaos Leader wrote:
|
When all 8 planes are turned on it takes about 5%.
|
On AGA that takes 50%. So the current Natami hardware is 10x the speed of AGA. Woohoo!!!!!!!!!! Imagine playing Total Chaos AGA at 10x the speed! That is faster than WinUAE can do it! :Excitement: :Shock:
|
Did double-scanned 640x256 = 640x512 x 8 planes on AGA take 50% or 100%? My memory might be a bit rusty, but I seem to recall that this mode actually took 100% on AGA. Comparing with AGA speed is complex, as Video DMA was 4 times as fast on AGA as Blitter DMA. Therefore, if the LX is 10 times faster than AGA Video DMA, then this means the Blitter is actually 40 times faster than AGA, doesn't it? But whether the 16Lx bring-up board's Video DMA runs at 10 times or 20 times AGA speed as of today, and whether Blitter DMA runs 40 times faster than AGA, is not really important. Until Thomas says that the memory interface runs at full speed and that he has finished tweaking the performance, I assume that there is still room for performance improvement. ;-) I think the 10 times/40 times performance is just a baby step ...
| |
Erik Bauer Italy
| | Posts 227 23 Jan 2010 09:31
| Whew! This update made my day start so well! I'm with the others, please keep us updated this way, it is so exciting to know every little step of the birth of NatAmi!
| |
Team Chaos Leader USA
| | (Natami Team Member) Posts 1199 23 Jan 2010 12:20
| Gunnar von Boehn wrote:
|
Team Chaos Leader wrote:
| When all 8 planes are turned on it takes about 5%.
On AGA that takes 50%. So the current Natami hardware is 10x the speed of AGA. Woohoo!!!!!!!!!! Imagine playing Total Chaos AGA at 10x the speed! That is faster than WinUAE can do it! :Excitement: :Shock:
Did double-scanned 640x256 = 640x512 x 8 planes on AGA take 50% or 100%? My memory might be a bit rusty, but I seem to recall that this mode actually took 100% on AGA.
|
I wasn't really sure how to factor in Thomas' doublescanning, or whether the 5% was in 640x256 or 640x512 mode. So I just used a safe estimate. :) To use approximately 100% DMA power on AGA one must open a screen of 1280x512x8bpl @50Hz, 1280x400x8bpl @60Hz, 640x512x8bpl DBLSCAN @50Hz, or 640x400x8bpl DBLSCAN @60Hz. However, the 640x512x8bpl DBL modes are so absurdly slow that nobody in their right mind ever uses them. :loco: They either view that mode in interlace or they have a hardware flickerfixer/deinterlacer/scandoubler thingamajig. All of us "hardcore Amiga users"(tm) :) have a hardware flickerfixer, as it doubles the speed of the AGA chipset while allowing the use of standard cheap PC VGA/SVGA monitors. The latest hardware flickerfixer to be produced is the Indivision AGA (Hi Jens! :) and it is quite popular. So on my Amiga, 640x512x256 color mode 31kHz doublescan only uses 50% DMA bandwidth. There is an option to play Total Chaos AGA in the DBLPAL mode but nobody ever uses it, it's quite dreadful. So all the Total Chaos fans are playing it in a 50% DMA bandwidth mode. Another troublesome part of the calculation is that since the NATAMI SAGA chipset is doing the doublescanning, this means that on SAGA, 640x512 takes 2x the display bandwidth of 640x256. On AGA, 640x512 and 640x256 each take the same amount of bandwidth due to the "interlace trick".
Gunnar von Boehn wrote:
| Comparing with AGA speed is complex, as Video DMA was 4 times as fast on AGA as Blitter DMA. Therefore, if the LX is 10 times faster than AGA Video DMA, then this means the Blitter is actually 40 times faster than AGA, doesn't it?
|
Yes, I agree. The LX Blitter should currently cruise at 40x the speed of the AGA blitter. But my asm blitting routines have been up to 72x faster than the AGA blitter for over 10 years, so the 40x faster blitter doesn't help my Total Chaos game coding any. I expect my blit routines to go anywhere from 4x to 8x faster on Natami. So I expect Total Chaos blits to go around 300x faster than the AGA blitter could do them. Zoooooooom! Natami FTW! Of course the 40x Natami blitter will still greatly help out: Workbench apps that do their blitting with the blitter, badly coded AGA games, ALL OCS/ECS games, AMOS games, Blitz Basic games, etc. But it would need to be a 300x faster blitter to benefit Total Chaos players.
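The multipliers quoted in this exchange chain together as simple products. A small Python sketch of that arithmetic, using only the estimates given by the posters (none of them measured); the function and variable names are illustrative:

```python
def chained_speedup(*factors):
    """Multiply independent speedup estimates together."""
    result = 1
    for factor in factors:
        result *= factor
    return result

# Estimates taken from the thread, not measurements.
lx_video_dma_vs_aga = 10       # LX video DMA vs AGA video DMA
aga_video_vs_blitter_dma = 4   # AGA video DMA vs AGA blitter DMA
lx_blitter_vs_aga = chained_speedup(lx_video_dma_vs_aga, aga_video_vs_blitter_dma)

asm_vs_aga_blitter = 72        # asm blit routines vs AGA blitter ("up to")
natami_gain_range = (4, 8)     # expected gain of those routines on Natami
asm_on_natami = tuple(chained_speedup(asm_vs_aga_blitter, g) for g in natami_gain_range)

print(lx_blitter_vs_aga)  # 40
print(asm_on_natami)      # (288, 576)
```

The "around 300x" figure corresponds to the low end of that range.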
| |
Gone Gahgah Australia
| | Posts 224 23 Jan 2010 14:31
| Hey TCL. How will Total Chaos benefit from a faster blitter?
| |
Thomas Hirsch Germany
| | (Natami Team Member) Posts 233 23 Jan 2010 15:21
| I think it is too early for calculations now. This counter is only a rough estimate of how heavily the memory bus is used. With this counter you can tell the difference between 50% and 5%, but that's it. I mentioned it only to show that there is some "potential" in comparison to OCS/AGA. For real values to calculate with we need some performance tests. The second point is that the design itself is not fixed. Optimizations are still needed and still to be done. This counter also tells only the bus load on the NatAmi bus, which is completely different from, and absolutely not comparable (for calculations) to, the original Amiga bus. I wanted to explain the differences in the "deviation" thread but it got completely off-topic. So the only essence here is that this counter shows that the NatAmi is not at its limit with a common AGA display. The Blitter is a different thing. Currently I am thinking about the Blitter design concept. I could use the old "unbuffered" Blitter from the C-One design. I am sure this would work, with some minor issues which I never fixed. But it will not be much faster than an original Blitter. I am not sure if it would be faster at all, because there is at least a 22 clock wait penalty for every 16bit access. And there are 4 accesses necessary for one 16bit word. So I am thinking about a "buffered" Blitter which pipelines the accesses. The buffered Blitter is not as easy as the bitplanes because it has an ascending and a descending mode and supports self-modifying data. The Blitter has two memory interfaces, one for read, one for write. The buffered Blitter will be able to read and write "all the time" while it does the calculations pipelined in the background. So either the Blitter will be faster than the CPU or there is a serious flaw in the concept.
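To illustrate why the unbuffered design would be slow, here is a toy cycle count in Python. The 22-clock wait penalty and the four accesses per 16-bit word come from the post; everything about the buffered variant (the latency being paid once, then one access completing per clock) is an assumption made up for this sketch, not a description of the actual NatAmi design.

```python
WAIT_CLOCKS = 22        # minimum wait penalty per 16-bit access (from the post)
ACCESSES_PER_WORD = 4   # accesses needed for one 16-bit word (from the post)

def unbuffered_clocks(words):
    """Lower bound: every access pays the full wait penalty, nothing overlaps."""
    return words * ACCESSES_PER_WORD * WAIT_CLOCKS

def pipelined_clocks(words, clocks_per_access=1):
    """Hypothetical buffered model: latency is paid once, then accesses stream."""
    if words == 0:
        return 0
    return WAIT_CLOCKS + words * ACCESSES_PER_WORD * clocks_per_access

words = 1000
print(unbuffered_clocks(words))  # 88000
print(pipelined_clocks(words))   # 4022
```

Under these assumptions the whole gap comes from hiding the access latency, which is the point of the buffered concept. Real numbers would need the performance tests mentioned in the post.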
| |
Marcel Verdaasdonk Netherlands
| | Posts 2089 24 Jan 2010 03:02
| @Thomas Hirsch Thank you for the clarification. Just a question that still lingers: does the buffered blitter read and write ChipMem and read FastMem? Or did you really mean that it can only read one type of memory and write the other? I think I have already answered the question myself, but someone please reply.
| |
Gunnar von Boehn Germany
| | (Natami Team Member) Posts 3727 24 Jan 2010 21:00
| Team Chaos Leader wrote:
| Another troublesome part of the calculation is that since the NATAMI SAGA chipset is doing the doublescanning, this means that on SAGA, 640x512 takes 2x the display bandwidth of 640x256. On AGA, 640x512 and 640x256 each take the same amount of bandwidth due to the "interlace trick".
|
Interlace is dead and interlace looks ugly. You know that a full 640x512 screen looks MUCH better than a 640x512i screen!
| Team Chaos Leader wrote:
| But my asm blitting routines have been up to 72x faster than the AGA blitter for over 10 years, so the 40x faster blitter doesn't help my Total Chaos game coding any.
|
I think you/we are mixing throughput numbers with algorithm efficiency here. Even the fastest Cyberstorm 68060 board could at best reach 8 times the throughput of the old OCS Blitter. The NATAMI Blitter could reach a throughput far higher than a 68060 can. I know you are handling a very unusual blitting case for which the normal AMIGA Blitter operation is not optimal. I think what you are looking for is some "clever" blitter operation which is better suited to your special case. I thought about some blitter optimizations the other day when I was coding on 194x. I think those simple optimizations will increase the efficiency by 400% for your use cases too. I'll propose the idea on the "dark" side of the forum - we both only need to twist Thomas' arm to implement it.
| |
Team Chaos Leader USA
| | (Natami Team Member) Posts 1199 24 Jan 2010 21:38
| Thomas Hirsch wrote:
| The Blitter has two memory interfaces, one for read, one for write.
|
So do my 060 and 040. :)
Thomas Hirsch wrote:
| The buffered Blitter will be able to read and write "all the time"
|
Yes. But so do my asm blitting routines. Gotcha ;)
Thomas Hirsch wrote:
| while it does the calculations pipelined in the background. So either the Blitter will be faster than the CPU or there is a serious flaw in the concept.
|
Yes, there is a serious flaw in the concept of the Amiga blitter. It has always been so. The main reason I battled and fought and scratched and clawed my way onto the Natami team was to try to get you to fix the conceptual flaw. I know I will succeed once I have a working Natami and can do some timing tests to prove things to you. I tried to explain it to Gunnar in English but he absolutely positively would not believe me. So I wait.
| |
|