|
|---|
Thomas Hirsch Germany
| | (MX-Board Owner) Posts 647 17 Feb 2011 20:18
| Or a 64-bit data bus, since the processor is not fixed.
| |
Greg the Canuck Canada
| | Posts 35 18 Feb 2011 01:18
| C'mon, we really know it means 64K total system RAM.
| |
Dag Jacobsen Norway
| | Posts 78 18 Feb 2011 10:35
| 64 K oughta be enough for everyone ;-)
| |
Lord Aga
| | Posts 129 18 Feb 2011 10:45
| Because Amiga tech is 10 times more efficient than PC tech :)
| |
André Jernung Sweden
| | (MX-Board Owner) Posts 988 18 Feb 2011 11:23
| Could you guys try to minimize the unnecessary joke posts in news threads? It makes it really hard for new people to find information if the threads are a gazillion pages long because of nonsense posts. There is a "talk" section in the forum for stuff like that. I'd rather ask nicely than start deleting stuff.
| |
Marcel Verdaasdonk Netherlands
| | Posts 3979 18 Feb 2011 11:56
| Thomas Hirsch wrote:
| Or a 64-bit data bus, since the processor is not fixed.
|
Does this extend to the chipset too?
| |
Rune Stensland Norway
| | (MX-Board Owner) Posts 871 18 Feb 2011 15:08
| I would prefer to buy a Natami with an FPGA and not an ASIC, because then I can download the latest patches and upgrades. Much cheaper than buying a new box for every version of the chipset/CPU: Natami1, Natami2, Natami3, etc. Don't underestimate the 68050 CPU clocked at only 120 MHz with superfast RAM and caches. Most instructions run in one clock, and with instruction fusion it will get very powerful. It will be like a 030 clocked at 1.5 GHz.
| |
Wojtek P Poland
| | Posts 1597 18 Feb 2011 18:47
| S P wrote:
| I would prefer to buy a Natami with an FPGA and not an ASIC, because then I can download the latest patches and upgrades. Much cheaper than buying a new box for every version of the chipset/CPU: Natami1, Natami2, Natami3, etc. Don't underestimate the 68050 CPU clocked at only 120 MHz with superfast RAM and caches. Most instructions run in one clock, and with instruction fusion it will get very powerful. It will be like a 030 clocked at 1.5 GHz.
|
Not really, but more like 5 times faster per cycle than the 030. Maybe 10 times once they make it superscalar. Anyway, what's the problem? 120 MIPS is a lot if you know what you are doing and have hardware accelerators that run at the same speed, not A1200 speed.
| |
Rune Stensland Norway
| | (MX-Board Owner) Posts 871 18 Feb 2011 18:59
| MC68030 timings:
| muls.l EA,Dn (max) 2 0 44(0/0/0) 44(0/1/0)
| divs.l EA,Dn (max) 0 0 90(0/0/0) 90(0/1/0)
| :D A multiplication is 44 cycles. Division is 90 cycles. Remember that data cache reads on the MC68030 are far from 1 cycle. On the 68050 the muls is 1 cycle. The divs is 10? But hopefully they will clone it into a separate unit so it will run in 1 cycle (parallel).
| |
Wojtek P Poland
| | Posts 1597 18 Feb 2011 23:35
| S P wrote:
| MC68030 timings:
| muls.l EA,Dn (max) 2 0 44(0/0/0) 44(0/1/0)
| divs.l EA,Dn (max) 0 0 90(0/0/0) 90(0/1/0)
| :D A multiplication is 44 cycles. Division is 90 cycles. Remember that data cache reads on the MC68030 are far from 1 cycle. On the 68050 the muls is 1 cycle. The divs is 10? But hopefully they will clone it into a separate unit so it will run in 1 cycle (parallel).
|
You would be a genius if you could design a single-cycle divider...
| |
Team Chaos Leader USA
| | (Moderator) Posts 2094 18 Feb 2011 23:40
| Wojtek P wrote:
| You would be a genius if you could design a single-cycle divider...
|
Claudio cooked up a way to make a divider in 3 cycles. But I say 5.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 19 Feb 2011 09:15
| Wojtek P wrote:
|
S P wrote:
| I would prefer to buy a Natami with an FPGA and not an ASIC, because then I can download the latest patches and upgrades. Much cheaper than buying a new box for every version of the chipset/CPU: Natami1, Natami2, Natami3, etc. Don't underestimate the 68050 CPU clocked at only 120 MHz with superfast RAM and caches. Most instructions run in one clock, and with instruction fusion it will get very powerful. It will be like a 030 clocked at 1.5 GHz. |
Not really, but more like 5 times faster per cycle than the 030.
|
How much faster the 050 is than the 030 depends on the use case.
* Are only very simple instructions used, or more complex instructions which do a lot more work per clock? Only the simplest instructions are fast on the 030; medium or complex instructions are very slow on the 030, but very fast on the 050.
* Does the code fit in the very tiny cache of the 030? Only the smallest routines or loops fit in it. Most routines will run slowly on the 030 because they do not fit in the cache.
* Does the code work a lot with the stack or memory? The 030 has only a tiny write-through cache. Most code that updates variables on the stack will run significantly slower on the 030.
* Does the code have jumps or conditional branches? The 030 is not fast when changing code flow.
It very much depends on what code you are executing. I think 5 times more speed clock for clock is the minimum that the 050 should get. 8 to 10 times faster clock for clock is more likely to be seen in real life ...
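
As a rough back-of-the-envelope check on the figures quoted in this thread: if the 050 runs at 120 MHz and does N times more work per clock than a 030, it behaves roughly like a 030 clocked at 120 × N MHz. The helper name below is made up for illustration, and the model ignores everything except the per-clock factor.

```python
def equivalent_030_mhz(clock_mhz, per_clock_speedup):
    """Clock a 030 would need to match a CPU that is per_clock_speedup
    times faster per clock and runs at clock_mhz."""
    return clock_mhz * per_clock_speedup


# Per-clock factors mentioned in the thread: 5x (minimum), 8x to 10x (likely).
for factor in (5, 8, 10):
    mhz = equivalent_030_mhz(120, factor)
    print(f"{factor}x per clock at 120 MHz ~ 030 at {mhz} MHz")

# Per-clock factor that the "030 at 1.5 GHz" comparison would imply.
print("factor needed for 1.5 GHz:", 1500 / 120)
```

By this measure the "030 at 1.5 GHz" comparison would need a factor of 12.5 per clock, slightly above the 8 to 10 range given above.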
| |
Rune Stensland Norway
| | (MX-Board Owner) Posts 871 19 Feb 2011 11:05
| If the division unit is separated into its own unit, we can have a 1-cycle divs:
|
|     divs.l d0,d1     ;Start a 10 cycle division.
|     muls.l d2,d2     ;free
|     muls.l d3,d3     ;free
|     muls.l d4,d4     ;free
|     muls.l d5,d5     ;free
|     muls.l d6,d6     ;free
|     muls.l d7,d7     ;free
|     muls.l a2,a2     ;free
|     muls.l a3,a3     ;free
|     muls.l a4,a4     ;free
|     muls.l a5,a5     ;free
|     move.l d1,(a0)+  ;The division is finished and the move will only use 1 cycle.
|     ...
|     divs.l d0,d1     ;Start a 10 cycle division.
|     move.l d1,(a0)+  ;The division is not finished. The move will stall for 10 cycles.
| |
Megol .
| | Posts 680 19 Feb 2011 14:48
| Team Chaos Leader wrote:
|
Wojtek P wrote:
| You would be a genius if you could design a single-cycle divider... |
Claudio cooked up a way to make a divider in 3 cycles. But I say 5.
|
3 cycles? Sounds too good to be true. Has it been verified as functional? Intel uses a radix-16 divider (4 quotient bits/cycle) that was introduced in the 45 nm "Core" architecture. AMD, IIRC, uses a Goldschmidt divider (doubles quotient precision/cycle). To do a division in 3 clocks requires either a radix-1024 divider + fixup or a Goldschmidt divider with a first approximation with 8 significant bits. Both of these would require massive amounts of hardware, so what kind of algorithm do you use in the N68050?
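
The arithmetic behind these figures can be sketched as follows. This assumes a 32-bit quotient, that a radix-r digit-recurrence divider retires log2(r) quotient bits per cycle, and that each Goldschmidt step doubles the number of correct bits; the function names are made up for illustration.

```python
import math


def bits_per_cycle(radix):
    """Quotient bits retired per cycle by a radix-r divider (r a power of two)."""
    return radix.bit_length() - 1


def radix_divider_cycles(quotient_bits, radix):
    """Cycles needed to produce quotient_bits without any fixup step."""
    return math.ceil(quotient_bits / bits_per_cycle(radix))


def goldschmidt_iterations(quotient_bits, seed_bits):
    """Doubling steps needed to grow seed_bits of precision to quotient_bits."""
    iterations = 0
    precision = seed_bits
    while precision < quotient_bits:
        precision *= 2
        iterations += 1
    return iterations


print("radix 16, 32 bits:", radix_divider_cycles(32, 16), "cycles")
print("radix 1024, bits after 3 cycles:", 3 * bits_per_cycle(1024))
print("Goldschmidt, 8-bit seed, 32 bits:", goldschmidt_iterations(32, 8), "steps")
```

Three radix-1024 cycles yield 30 bits, which is consistent with the "+ fixup" remark, and an 8-bit seed needs two doubling steps to reach 32 bits. How those steps map onto clock cycles is not stated in the thread.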
| |
Claudio Wieland Germany
| | (Natami Team) Posts 706 19 Feb 2011 16:16
| Since I'm not on the team now, I only want to say this much: My solution is a special case for divisions by 1..255, which should occur quite often. Bigger numbers can also be divided, but you get more or less accurate approximations. Your guess with using Goldschmidt and approximation is good, and there is some room for improving the general div.
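
The post does not describe the method itself. One well-known way to make division by a small divisor cheap is to multiply by a precomputed reciprocal and shift; the sketch below shows that general technique for unsigned 32-bit dividends and divisors 1..255. The table, the 40-bit shift and all names are assumptions for illustration, not the N68050 design.

```python
SHIFT = 40  # 32 dividend bits + 8 divisor bits

# RECIPROCAL[d] = ceil(2**SHIFT / d) for d = 1..255 (index 0 is unused).
RECIPROCAL = [0] + [-(-(1 << SHIFT) // d) for d in range(1, 256)]


def div_small(n, d):
    """Unsigned n // d for 0 <= n < 2**32 and 1 <= d <= 255,
    computed with one table lookup, one multiply and one shift."""
    if not (0 <= n < (1 << 32) and 1 <= d <= 255):
        raise ValueError("operands outside the special-case range")
    return (n * RECIPROCAL[d]) >> SHIFT


print(div_small(1000, 7), 1000 // 7)
print(div_small(0xFFFFFFFF, 255), 0xFFFFFFFF // 255)
```

With a 40-bit shift the rounding error of the stored reciprocal is too small to change the result for any 32-bit dividend, so the quotient is exact in this range. For larger divisors the same trick with a fixed-width table would only give an approximation.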
| |
Richard Maudsley United Kingdom
| | Posts 821 19 Feb 2011 16:48
| Welcome to the club, Claudio! Can we talk in #natami?
| |
Megol .
| | Posts 680 19 Feb 2011 17:01
| Claudio Wieland wrote:
| Since I'm not on the team now, I only want to say this much: My solution is a special case for divisions by 1..255, which should occur quite often. Bigger numbers can also be divided, but you get more or less accurate approximations. Your guess with using Goldschmidt and approximation is good, and there is some room for improving the general div.
|
Yeah if the specialized cases are common and the general cases aren't slowed down it sounds like a good solution :)
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 19 Feb 2011 17:56
| S P wrote:
| If the division unit is separated into its own unit, we can have a 1-cycle divs:
|
|     divs.l d0,d1     ;Start a 10 cycle division.
|     muls.l d2,d2     ;free
|     muls.l d3,d3     ;free
|     muls.l d4,d4     ;free
|     muls.l d5,d5     ;free
|     muls.l d6,d6     ;free
|     muls.l d7,d7     ;free
|     muls.l a2,a2     ;free
|     muls.l a3,a3     ;free
|     muls.l a4,a4     ;free
|     muls.l a5,a5     ;free
|     move.l d1,(a0)+  ;The division is finished and the move will only use 1 cycle.
|     ...
|     divs.l d0,d1     ;Start a 10 cycle division.
|     move.l d1,(a0)+  ;The division is not finished. The move will stall for 10 cycles.
|
Yes, this is on our to-do list for the 050E or early 070. Running a parallel DIV uses the same concept as running a parallel LOAD that missed the cache. With support from the decoder and an extra register and cache write port, both can be executed in parallel with the normal continuous program flow. It is an optimal form of OOO, where OOO really makes sense.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 19 Feb 2011 18:42
| Megol . wrote:
| Team Chaos Leader wrote:
| Claudio cooked up a way to make a divider in 3 cycles. But I say 5. |
3 cycles? Sounds too good to be true, is it verified functional? |
To make this clear: a general-purpose DIV in 3 clock cycles is unrealistic, at least if you want a sensible clock rate. ;-D TCL's post is misleading here. TCL accidentally made it sound as if this idea is for general division and as if someone on the NATAMI team had verified it. Neither is the case! The fact is that anyone in the forum can propose a new idea. This happens all the time: Thierry, or you, or anyone else comes up with an idea. We should not mix up unverified ideas with the plans of the team! Such mix-ups lead to confusion, as people will often A) hope for features that were never planned, or B) take us for idiots because they think we want to work on impossible features.
| |
Claudio Wieland Germany
| | (Natami Team) Posts 706 19 Feb 2011 20:07
| Gunnar von Boehn wrote:
|
Megol . wrote:
| Team Chaos Leader wrote:
| Claudio cooked up a way to make a divider in 3 cycles. But I say 5. |
3 cycles? Sounds too good to be true, is it verified functional? |
To make this clear: a general-purpose DIV in 3 clock cycles is unrealistic, at least if you want a sensible clock rate. ;-D TCL's post is misleading here. TCL accidentally made it sound as if this idea is for general division and as if someone on the NATAMI team had verified it. Neither is the case! The fact is that anyone in the forum can propose a new idea. This happens all the time: Thierry, or you, or anyone else comes up with an idea. We should not mix up unverified ideas with the plans of the team! Such mix-ups lead to confusion, as people will often A) hope for features that were never planned, or B) take us for idiots because they think we want to work on impossible features.
|
TCL's post is actually misleading a bit here. My approach is not suited to generalized division, and just to correct you on this matter, Gunnar: When I was on the team, I verified it. You just never cared. That's all there is to it.
| |
|