| End of Summer Status-Update |
|
|---|
Ingmar H Germany
| | Posts 46 02 Oct 2008 11:59
| Peter K. wrote:
| The board specifications, completed as design, are now: NEW: -Big Performance Plus: 2 independent 64bit busses with symmetrical 256MB CHIP and 256MB FAST DDR2 memory, both clocked with FPGA speed -68060 on CPU-card via PCI (optionally local SRAM) -Altera Cyclone 3 FPGA with internal SRAM for 3d texturemapper speed-up -DVI-output for flatscreens -usb1.1 for mouse/keyboard MAINTAINED: -2x dma-capable IDE -VGA output (with integrated scandoubler) -15kHz Video Out -floppy connector, supporting Amiga-diskettes -parallel and serial -ps2-connector for AMIGA-keyboard or PC ps2-keyboard (autodetect) |
DVI is wonderful. So it can be used as a multimedia box directly on flat TVs. GREAT. But... is Ethernet missing? Any chance to either add a 2nd PCI slot for Ethernet or bring it on board? Or to create a CPU socket (as the A4000 had...), put the CPU card there and free the PCI slot? Will a screenmode for LCD TVs be added, too? I mean, for a DVI-HDMI connector a 720p and 1080p screenmode would be great, even if 1080p will be slower, but hopefully fast enough for working with Workbench.
| |
Asaf Ayoub United Kingdom
| | Posts 332 02 Oct 2008 17:06
| Hi, this is great news! Thanks to the Natami team for their hard work. It's always good to review pricing in the BOM. The memory question: what speed increase can we expect over the old design? For the new FPGA, do you have a datasheet? What speed increase can we achieve for the N70 CPU? Future-proofing the design now is a good choice: the Sam board can use a PCI slot with a WiFi card! Having the same WiFi chip on the motherboard keeps compatibility and also tells people this is an up-to-date system. Small, standard-sized PCB: lowers cost. Cheap standard memory: high volume in circulation, lower cost. Cheap battery: this part can cost up to $10.
| |
Michael Ward USA
| | Posts 234 02 Oct 2008 20:42
| I appreciate the very good decision making in regards to the new specification. The more modular and universal the better, in my opinion. There are always trade-offs in a design process, but in reading the explanations you provided, everything looks as good as it can be. Nice work in utilizing FPGA internal SRAM to compensate for the use of another style of RAM. Nice work in reducing cost too. In regards to the Altera Cyclone III, which one did you end up going with? I see they get quite expensive for the larger ones. I am presuming you were able to fit everything into a smaller, less expensive version and that the future 070 would also fit on the same FPGA?
| |
H I T
| | Posts 68 02 Oct 2008 21:09
| i found a benchmark on altera's site: EXTERNAL LINK and an article that describes some possible applications for the chip: EXTERNAL LINK it seems to have native dsp capabilities too. but a question: can anyone tell me how many LEs will be used for the softcore (070, aga+, etc)? i'm clueless here. Edit: minimig is using the SPARTAN-3 XC3S400 which has 896 Logic Blocks/Elements. is that correct? Edit: what i forgot: implementing the 68060 on a pci card, does that mean you could sell the same board without 68060 as a first consumer version? i mean, that would make sense if the softcore is stable enough :)
| |
One Thousand USA
| | Posts 832 03 Oct 2008 00:44
| Which version of the Cyclone 3 is used? There is a huge difference between the smallest and biggest. Does the DVI port have the digital signal? Thanks.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 03 Oct 2008 08:46
| h i t wrote:
| but a question. can anyone tell me, how many LEs will be used for the softcore (070) |
The 070 is not finished, so we cannot give an exact number for its size. But there are other 68K softcores, and the 68k-like Coldfire softcore, which is available for the Altera Cyclone. The available 68K softcores all have a size between 3500 and 8000 LE. This is about 8-20% of the available LEs of our Altera. Cheers
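As a rough cross-check of these figures, the quoted size range and percentages imply an approximate total LE count for the device. This is only a sketch: it assumes the 8% belongs to the 3500 LE end and the 20% to the 8000 LE end, and the function name is made up for illustration.

```python
def implied_total_les(core_les: int, share: float) -> float:
    """Total logic elements implied if a core of `core_les` LEs fills `share` of the device."""
    return core_les / share

# Figures from the post: 68K softcores of 3500-8000 LE are about 8-20% of the device.
small_core_estimate = implied_total_les(3500, 0.08)  # lower size paired with lower share (assumption)
large_core_estimate = implied_total_les(8000, 0.20)  # upper size paired with upper share (assumption)

print(f"Implied device size: about {large_core_estimate:.0f} to {small_core_estimate:.0f} LE")
```

Both ends land in the low-40,000s, so the two numbers in the post are consistent with each other.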
| |
H I T
| | Posts 68 03 Oct 2008 10:06
| ah, thanks. so a 3C40 or 3C55 will be used (see my first link). and as stated above, it's not really a matter of weeks or months. time is nothing in the amiga world. just create something really valuable. that's what counts in the end :)
| |
Wawa Tk Germany
| | Posts 581 03 Oct 2008 15:53
| i welcome the idea of an exchangeable cpu module very much, but as gunnar has made a lot of statements about how pci latency would slow down memory performance, i am now a little bit unsure what to think about this new solution. i have the impression the idea of a pci cpu card arose because of all the talk about os 4.x and ppc pci cards that could be employed as an extension allowing it to run on natami. as aos4.1 is out for aone and sam but still no more powerful hardware is in sight, said approach could really address the demand of the amiga community's majority and also close the potential gap that further fragments the platform into aos3.x-compatible apart from aos4.x (not to mention aros and morphos). that said, i have to admit that i'm a 3.x fanboy up till now, and that is the system on amiga (i have 4.0 as well) that offers me the most functionality at the moment. i would like to see it improved along with the whole previously abandoned 68k branch. so maybe there would be some better possibility to fit an exchangeable processor module to the natami motherboard apart from pci, so as not to run into the pci protocol drawbacks, like this: EXTERNAL LINK i've stolen this link from this discussion on amiga.org: http://www.amiga.org/modules/newbb/viewtopic.php?topic_id=47283&forum=8#forumpost548700 best wawa
| |
Chris Sanz USA
| | Posts 122 03 Oct 2008 17:21
| @ Wawa Tk, That CPU module looks very interesting. It appears to have "the right stuff".
| |
Michael Ward USA
| | Posts 234 03 Oct 2008 17:54
| 'CPU module / FPGA / Memory' questions: From an architectural standpoint, what general component layout was decided? Chip and fast ram reside on main board? What resides on cpu card? (other than cpu), optional sram location on cpu card? When future 070 developed, no need for a cpu card? Sincerely, Michael
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 03 Oct 2008 20:23
| Michael Ward wrote:
| 'CPU module / FPGA / Memory' questions: From an architectural standpoint, what general component layout was decided? Chip and fast ram reside on main board? What resides on cpu card? (other than cpu), optional sram location on cpu card? When future 070 developed, no need for a cpu card? Sincerely, Michael |
The mainboard will have a) 256MB Chipmem (64bit) b) 256MB Fastmem (64bit). The SuperAGA chipset will always use the Chipmem as chipmem. The 070 will use the Fastmem as fastmem. Each CPU card (PPC, Coldfire or 68060) should have local fastmem on board (just like all the good Amiga turbo cards had). Local memory on the CPU card will of course have lower access latency for the CPU than the fast memory on the mainboard. Because of this, the Fast-memory bank can be "turned" into chip memory (with a switch) when you add a CPU card. So if you have a CPU card, a good setup is to use whatever amount of memory the CPU card has as fast memory, and to use the on-board memory as 128bit wide, 512MB chip memory. The CPU card may have 2nd level cache: - The PowerPC cards have 2nd level cache. - The Coldfire V5 has 2nd level cache. - And a 68060 card could be designed to have e.g. 1-2MB SRAM as 2nd level cache. If the card has 2nd level cache, then the latency to the mainboard memory is less of a factor. If we build the 2nd level cache ourselves, as on a 68060 card, then we can take the latency into account. We could do things like speculatively prefetching memory lines well ahead into the 2nd level cache. This would help a lot in hiding the latency. A higher-clocked 68060 with a few MB of 2nd level cache would be very powerful compared to original classic 68k systems. Of course, when our 070 design develops as we hope it does, then the system does not need a CPU card at all. Separating the (optional) CPU from the board has the advantage that the same board can be used for customers who want the board with the 070 or who prefer their board with the 060. This will lower the cost of the system for everybody. Cheers
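The two memory setups described above can be written down as a small sketch. The function and dictionary keys are purely illustrative (nothing here is an actual Natami interface), and the 128 MB card in the usage line is a hypothetical example size.

```python
from typing import Optional


def memory_layout(cpu_card_fast_mb: Optional[int] = None) -> dict:
    """Return the memory roles for a board without or with a CPU card.

    cpu_card_fast_mb: local memory on the CPU card in MB, or None if no card is fitted.
    """
    if cpu_card_fast_mb is None:
        # No CPU card: two independent 64-bit banks, one chip and one fast.
        return {"chip_mb": 256, "chip_bus_bits": 64,
                "fast_mb": 256, "fast_location": "mainboard"}
    # CPU card fitted: both on-board banks act as one 128-bit chip memory,
    # and the card's local memory serves as fast memory.
    return {"chip_mb": 512, "chip_bus_bits": 128,
            "fast_mb": cpu_card_fast_mb, "fast_location": "cpu card"}


print(memory_layout())        # board running the 070 softcore
print(memory_layout(128))     # hypothetical CPU card with 128 MB local memory
```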
| |
David Ferguson USA
| | Posts 34 03 Oct 2008 21:10
| OK, Captain Dummy here :) Way back in the day when I actually had an Amiga I never really got into the hardware that much. So, as a result I only understand about 30% of this, which is OK. My question is... how much of this do I need to know in order to be able to set up and use a new Natami board? Thanks, David aka Captain Dummy
| |
Marcel Verdaasdonk Netherlands
| | Posts 3979 03 Oct 2008 21:57
| COOOOOOL!!! That is all that struck my mind reading it all. As for the DDR2 Chipmem, you guys caved in to Thierry, didn't you? ;) And if I remember the PCI protocol correctly (or was it the ISA or EISA bus?), you can add a PCI board with a bus controller and add a few extra PCI or older slots. BTW, I have looked up the specs on PCI; are these correct? - It's a PCI bridge totally disconnecting the CPU from the bus (under normal situations ;) ). - A 32-bit-wide bus could under ideal conditions accomplish a transfer rate of 133MB/s. - Extension to 64 bits (bus width) could double the above-mentioned speed. - Burst mode can be just as long as one needs it to be. - Support for 5 and 3.3 volt feed. - Write posting and read prefetching (whatever that might mean). - Busmaster possibility (on the 68060 CPU board). - Working frequency between 0 and 33MHz (the biggest problem of overclocking a Dell PC ;) ). - Multiplexed address and data bus for an efficient number of connections. - Support for ISA/EISA/MCA busses. - Configuration by software and registers. - Processor independence (the reason the Natami is probably using it). Anyhow, if these specs are correct for the NatAmi, why wouldn't an expansion card be made with a bus controller on it? (For Thierry.) But more importantly, would the NatAmi use the 5V or the 3.3V variant of the PCI bus? I believe it is the 5V variant that is quite common, but then I have a 50% chance of being off. ;) And would we implement the 32 or the 64 bit wide bus? 32 has 124 connections, 4 of them used for defining the difference between the 2 variants, otherwise linked to GND (that makes 8 contacts in total). 64 needs 64 extra contacts, and makes the biggest difference between the 2 variants because of specific voltage connections. (The 32 also has this with 4 connections; the extension to the 64-bit bus has this on ALL volt feed connections.) The guy with the outdated books, Marcel
| |
Bartek "Banter" K. Poland
| | (Natami Team) Posts 2277 03 Oct 2008 21:57
| Hi Gunnar, I was wondering how you would describe the latency of the currently developed system compared to the previous version of Natami (the one with 16MB CHIP RAM and 68060 onboard)? I mean, let's say the previous version had (simplistically) 0% latency. How would you estimate the latency of the latest board revision? I would be glad to hear from you soon :) Take care.
| |
Marcel Verdaasdonk Netherlands
| | Posts 3979 03 Oct 2008 21:58
| David Ferguson wrote:
| OK, Captain Dummy here :) Way back in the day when I actually had an Amiga I never really got into the hardware that much. So, as a result I only understand about 30% of this, which is OK. My question is... how much of this do I need to know in order to be able to set up and use a new Natami board? Thanks, David aka Captain Dummy
|
I thought it was said before, but as I was told: Plug and Play, because the NatAmi would come pre-installed with OS 3.9.
| |
David Ferguson USA
| | Posts 34 03 Oct 2008 22:12
| Thanks, I just wanted to make sure. When I finally get one I will then start learning more about the hardware and how it all goes together and works. Looking forward to it. Dave
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 03 Oct 2008 22:23
| Regarding PCI: We are implementing 32bit. 64bit wide has some drawbacks. For example, you would need slots which are twice as long (too long for the Natami board). And there are only a few cards that support 64bit. If you want to increase performance, a much easier way is to double the bus clock from the default 33 MHz PCI clock rate to 66 MHz. If you go for 66 MHz the result will be somewhat like AGP. Cheers
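The width-versus-clock trade-off is simple arithmetic: theoretical peak bandwidth is bus width times clock rate. A minimal sketch, assuming one transfer per clock and the nominal 33.33 MHz and 66.66 MHz PCI clocks; real throughput is lower because of arbitration and address phases, and the function name is illustrative.

```python
def pci_peak_mb_per_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak in MB/s (10^6 bytes), assuming one transfer per clock."""
    return bus_width_bits / 8 * clock_mhz

for width, clock in [(32, 33.33), (64, 33.33), (32, 66.66)]:
    peak = pci_peak_mb_per_s(width, clock)
    print(f"{width}-bit @ {clock} MHz: about {peak:.0f} MB/s peak")
```

Doubling the clock on a 32-bit bus gives the same theoretical peak as widening the bus to 64 bits, without needing the longer slot.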
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 03 Oct 2008 23:11
| Bartek Kuchta wrote:
| Hi Gunnar, I was wondering how you would describe the latency of the currently developed system compared to the previous version of Natami (the one with 16MB CHIP RAM and 68060 onboard)? I mean, let's say the previous version had (simplistically) 0% latency. How would you estimate the latency of the latest board revision? Take care.
|
That's a good question, which will take long to answer. I'll try to summarize it real quick: The answer depends very much on the job that you are doing and whether the Video DMA, the Blitter, the 070, or the 060 is doing it. If you design your DMA channels or the Blitter yourself, you can have their pipelines designed in such a way that the latency is hidden. With DDR2 you can have several jobs in flight, in parallel. This means your blitter could load several jobs. For example it could load A, B, C in parallel and then store A, load D and E, store B ... If you interleave your jobs and your blitter is designed to work on longer lines and not on small words, then you can get very good blitter performance with DDR2 memory. It is easy for the video DMA to use DDR memory well. Video DMA by nature likes bursting lines. So DDR memory is good for it. 3D operations are not so easy. SRAM has the beauty of random access. And you need some random access for good 3D performance. Our solution is to redesign the 3D core and to rely on the usage of more internal memory. In short it means that if you blit a 3D object, the source texture map will be bursted into the 3D core. Inside the 3D core we have VERY fast SRAM with VERY wide busses. So internally the 3D core will do several operations on the internal SRAM in parallel, which will make it very fast. All access to the outside will be pipelined and will go in longer bursts. The pipelining is needed to hide the latency of the DRAM and the bursting is used to fully use the high bandwidth of the DDR2 memory. The latency that is visible to the CPU is more tricky. While you can use pipelining to hide the latency for the blitter, this does not work so well with the CPU. The solution for this (which everybody uses) is on-chip CPU cache. The 060 has 16 KB of on-chip cache, which usually hides the memory latency quite well. If you add some extra 2nd level cache this could be improved again. 1 MB of 2nd level cache would work wonders.
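To illustrate how keeping several requests in flight hides latency for a blitter or DMA engine, here is a toy timing model. It is only a sketch: the latency and burst length are made-up cycle counts, not Natami or DDR2 figures, and the model assumes one shared data bus that carries one burst at a time.

```python
def total_cycles(num_bursts: int, latency: int, burst_cycles: int, max_in_flight: int) -> int:
    """Cycles needed to read num_bursts bursts (num_bursts >= 1) with overlapping requests."""
    finish_times = []  # completion cycle of each burst, in issue order
    bus_free = 0       # first cycle at which the data bus is free again
    for i in range(num_bursts):
        # A request can only be issued once one of the in-flight slots is free.
        issue = 0 if i < max_in_flight else finish_times[i - max_in_flight]
        # Data arrives after the access latency, but must wait for the bus.
        data_start = max(issue + latency, bus_free)
        bus_free = data_start + burst_cycles
        finish_times.append(bus_free)
    return finish_times[-1]

# Illustrative numbers: 8 bursts, 10 cycles latency, 4 cycles per burst.
for in_flight in (1, 2, 4):
    print(in_flight, "in flight:", total_cycles(8, 10, 4, in_flight), "cycles")
```

With one request at a time every burst pays the full latency; with enough requests in flight the latency is paid only once and the bus stays busy.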
With the 070 the situation is easier. Having the 070 inside the main chipset gives some advantages. You could do several things to better compensate for the latency for the 070. One thing is to increase the cache line length. Or add logic which, on a miss, automatically fetches not one cache line but several, and which fetches lines automatically in advance. If you access one of the last words in a cache line you could speculatively fetch the next cache line from memory. To answer your question in numbers: The internal SRAM has a latency of 0. The internal SRAM also has a huge throughput, a lot more than any form of external memory. Pipelined external SRAM has a latency of about 3 clocks. But you can include this latency in your pipelines for the blitter; then for the blitter this latency is not there anymore. DRAM has a longer latency; it is about 10 times higher than SRAM. DDR2 memory can cure this somewhat, as you can have several operations in flight. So if you start several operations in parallel the throughput will go up and the latency will get mostly hidden. In special work scenarios like pointer chasing, DRAM will always be slower than SRAM. The problem with SRAM is its small size and its high price. SRAM chips are big, which means that 10 MB of SRAM takes up as much real estate on the board as 16 GB of DRAM. And SRAM is very expensive. By reworking the 3D core to buffer more data, to use longer bursts for memory access and to use more internal SRAM, we are able to save the external SRAM without losing performance. This will reduce cost by about $150 per board. I feel that reducing the price was quite important. And as a bonus we now have "nearly unlimited" 256 MB chip memory. Cheers
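The next-line prefetch idea mentioned for the 070 can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the 070 design: the cache is unbounded, a prefetched line is assumed to arrive before it is needed, and the 16-byte line and 4-byte word sizes are illustrative parameters.

```python
def count_stalls(addresses, line_bytes=16, word_bytes=4, prefetch_next=False):
    """Count accesses that have to wait for a cache line to be fetched from memory."""
    cached = set()  # line numbers currently in the (unbounded) cache
    stalls = 0
    for addr in addresses:
        line = addr // line_bytes
        if line not in cached:
            stalls += 1          # demand miss: the CPU waits for this line
            cached.add(line)
        # Touching the last word of a line triggers a speculative fetch of the next line.
        if prefetch_next and addr % line_bytes >= line_bytes - word_bytes:
            cached.add(line + 1)
    return stalls

sequential = list(range(0, 256, 4))  # walk 256 bytes word by word
print("without prefetch:", count_stalls(sequential))
print("with prefetch:   ", count_stalls(sequential, prefetch_next=True))
```

For a sequential walk only the very first line stalls once prefetching is on. Pointer chasing with scattered addresses would see little benefit, which matches the remark that DRAM stays slower than SRAM there.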
| |
H I T
| | Posts 68 03 Oct 2008 23:32
| Gunnar von Boehn wrote:
| Separating the (optional) CPU from the board has the advantage that the same board can be used for customers who want the board with the 070 or who prefer their board with the 060. This will lower the cost of the system for everybody. Cheers
|
good decision, much appreciated. :)
| |
Nick B.
| | Posts 48 04 Oct 2008 01:30
| I agree with you guys. You surely know what you are doing! I trust you and I believe that was the wise thing to do. Price always counts, and DDR2 is cheap and very fast, so it is a very good idea. $150 per board is a huge price cut.
| |
|