
Welcome to the Natami / Amiga Forum

This forum is for AMIGA fans interested in the new NATAMI platform.
Please read the forum usage manual.



Do you have questions about the Natami?
Post it here and we will answer it!

Comparing the 68060 FPU to the 68050 FPU
Angel of Paradise
Germany

Posts 61
29 Mar 2011 12:21


Hi,

You posted about the new 68050 FPU.
How does it compare to the 68060 FPU?
Does it include all instructions of the 68060?
What precision does the 68050 FPU support?

Thanks in advance

Gunnar von Boehn
Germany
(Moderator)
Posts 5775
29 Mar 2011 15:17


Angel of Paradise wrote:

Hi,
 
  You posted about the new 68050 FPU.
  How does it compare to the 68060 FPU?

Both are similar.
First of all the 68060 FPU is done.
The 68050 is nearly done.
It will certainly take a few more weeks for it to get 100% finished.

Angel of Paradise wrote:

  Does it include all instructions of the 68060?

Yes.
Actually, we even plan to include instructions that the 68060 had only in software.

Angel of Paradise wrote:

  What precision does the 68050 FPU support?

Currently the FPU supports full 80-bit precision.
We are considering offering a "light" version with 64-bit precision.
Today it's common to have 64-bit FPU precision.
PowerPC and many other systems have 'only' 64-bit.

This means that for "normal" software this precision is fully sufficient.
The advantage of 64-bit precision would be a saving in chip real estate, which is at least interesting.

I wonder what the experienced FPU coders think about this.
Feedback?
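
To put the 80-bit versus 64-bit question in numbers, here is a small sketch. It assumes the usual formats: an IEEE 754 double with a 53-bit significand and the 68881/68882-style extended format with a 64-bit significand. The function names are made up for illustration and say nothing about how the 68050 FPU is implemented.

```python
import math

# Significand widths in bits, counting the leading integer bit.
# Assumption: "64-bit" means IEEE 754 double, "80-bit" means extended precision.
SIGNIFICAND_BITS = {"double (64-bit)": 53, "extended (80-bit)": 64}


def decimal_digits(significand_bits: int) -> float:
    """Approximate number of decimal digits a significand can carry."""
    return significand_bits * math.log10(2)


def machine_epsilon(significand_bits: int) -> float:
    """Gap between 1.0 and the next representable number."""
    return 2.0 ** -(significand_bits - 1)


if __name__ == "__main__":
    for name, bits in SIGNIFICAND_BITS.items():
        print(f"{name}: about {decimal_digits(bits):.1f} decimal digits, "
              f"epsilon = {machine_epsilon(bits):.3e}")
```

So the "light" version would give up roughly three decimal digits per operation, which mostly matters for long accumulations and badly conditioned problems.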



Gunnar von Boehn
Germany
(Moderator)
Posts 5775
31 Mar 2011 09:56


Looking at our current preliminary performance numbers, you can also start estimating a comparison to the very popular 68882 FPU.


                  68882    68050    68050 latency
                 (cycles) (cycles)    (cycles)
FMOVE  mem,reg      40        1
FADD                75        1          8
FSUB                75        1          8
FMUL                95        1          8

Because of the different sequential dependencies in different code,
the FPU performance ratio of a 68050 to a 68882 will vary.

With fully sequential code, the 68050 is only about 10 times faster than a 68030 with a 68882.

This means the 100 MHz 68050 then only reaches a performance comparable to a 68882 clocked at 1 GHz.

For code with few sequential dependencies, the performance ratio will go much higher.

The peak performance of the 68050 would be equal to a 68882 running at 13 GHz.

As the 68050 can fuse an FMOVE and an FMUL, it can do this combination in 1 cycle, while these two instructions would need 135 cycles on the 68882.
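
One way to arrive at figures like these from the table, as a rough sketch: the cycle counts are the ones posted above, while the helper name and the choice of FADD for the sequential case are assumptions made for illustration.

```python
# Cycle counts copied from the table in this post.
CYCLES_68882 = {"FMOVE": 40, "FADD": 75, "FSUB": 75, "FMUL": 95}
LATENCY_68050 = 8        # cycles before a dependent instruction can use a result
CLOCK_68050_MHZ = 100


def equivalent_68882_mhz(cycles_68882: float, cycles_68050: float,
                         clock_mhz: float = CLOCK_68050_MHZ) -> float:
    """Clock a 68882 would need to finish the same work in the same time."""
    return clock_mhz * cycles_68882 / cycles_68050


# Fully sequential code: every FADD waits for the previous result (8 cycles each).
sequential_mhz = equivalent_68882_mhz(CYCLES_68882["FADD"], LATENCY_68050)

# Peak: a fused FMOVE+FMUL pair retires in 1 cycle instead of 40 + 95 = 135.
peak_mhz = equivalent_68882_mhz(CYCLES_68882["FMOVE"] + CYCLES_68882["FMUL"], 1)

if __name__ == "__main__":
    print(f"sequential: about {sequential_mhz:.0f} MHz")  # roughly 1 GHz
    print(f"peak:       about {peak_mhz:.0f} MHz")        # roughly 13 GHz
```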

Interesting?

Przemek Tkaczyk
Poland

Posts 54
31 Mar 2011 10:19


O_o JUST WOW

Sergio Gabbiani
Italy

Posts 18
31 Mar 2011 10:40


Woww...

Great!! Simply great!! :)

Wawa Tk
Germany

Posts 581
31 Mar 2011 10:44


A comparison with the FPU of the 040 or 060 would maybe be telling. External FPUs had enormous extra lag (something like 30 clocks, IIRC). Has that been taken into account?

Megol .

Posts 695
31 Mar 2011 11:01


wawa tk wrote:

A comparison with the FPU of the 040 or 060 would maybe be telling. External FPUs had enormous extra lag (something like 30 clocks, IIRC). Has that been taken into account?

This^


Gunnar von Boehn
Germany
(Moderator)
Posts 5775
31 Mar 2011 11:23


wawa tk wrote:

  A comparison with the FPU of the 040 or 060 would maybe be telling. External FPUs had enormous extra lag (something like 30 clocks, IIRC). Has that been taken into account?

                  68040    68040      68050    68050      80486
                 (cycles)  latency   (cycles)  latency   (cycles)
FMOVE  mem,reg       3                   1                   3
FADD                 3        7          1        8        8-20
FSUB                 3        7          1        8        8-20
FMUL                 5        9          1        8         14
 

 
As you can see, the latencies of the 68050 and the 68040 are quite similar.
In terms of maximum throughput, the 68050 can fuse two instructions into one cycle that take 8 cycles on the 68040.

With normal FPU code the 68050 could probably score in the range of a 68040 @ 300 MHz.
At peak, a 68050 @ 100 MHz is equal to a 68040 @ 800 MHz.

In typical matrix operations the 68050 could score roughly like a 1.5 GHz 80486.
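
To see how much the dependency mix moves such estimates, here is a toy model. It assumes that a dependent instruction costs the full latency and an independent one costs only the issue cycles from the table; real pipelines are more subtle, and the function names are hypothetical.

```python
# (issue cycles, latency cycles) per instruction, copied from the table above.
TIMING_68040 = {"FADD": (3, 7), "FSUB": (3, 7), "FMUL": (5, 9)}
TIMING_68050 = {"FADD": (1, 8), "FSUB": (1, 8), "FMUL": (1, 8)}


def estimated_cycles(timing, op, count, dependent_fraction):
    """Toy model: dependent ops pay the latency, independent ops pay the issue cost."""
    issue, latency = timing[op]
    dependent = count * dependent_fraction
    independent = count - dependent
    return dependent * latency + independent * issue


def equivalent_68040_mhz(op, dependent_fraction, clock_mhz=100, count=1000):
    """Clock a 68040 would need to match a 68050 running at clock_mhz."""
    cycles_040 = estimated_cycles(TIMING_68040, op, count, dependent_fraction)
    cycles_050 = estimated_cycles(TIMING_68050, op, count, dependent_fraction)
    return clock_mhz * cycles_040 / cycles_050


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        mhz = equivalent_68040_mhz("FADD", fraction)
        print(f"FADD, {fraction:.0%} dependent: about {mhz:.0f} MHz 68040")
```

With independent FADDs this toy model lands at 300 MHz. With a fully dependent chain the advantage over the 68040 disappears, because the latencies are so similar.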
 
Not yet a PS3 Killer but not bad either. :-D

A proper test case like the Mandelbrot from SP or a matrix multiplication test case will IMHO make the most sense.
This will give realistic numbers.
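
As an illustration of what such a small test case could look like, here is a plain matrix multiplication that also counts its floating-point operations. This is only a sketch in Python, not the actual Natami test case; on real hardware the inner loop would be the FMUL/FADD sequence being measured.

```python
def matmul_counted(a, b):
    """Multiply two matrices (lists of rows) and count the FP operations used."""
    rows, inner, cols = len(a), len(b), len(b[0])
    result = [[0.0] * cols for _ in range(rows)]
    fmul_count = 0
    fadd_count = 0
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                # Each step is one multiply and one add; the adds into `acc`
                # form a dependency chain, the products are independent.
                acc += a[i][k] * b[k][j]
                fmul_count += 1
                fadd_count += 1
            result[i][j] = acc
    return result, fmul_count, fadd_count


if __name__ == "__main__":
    product, fmuls, fadds = matmul_counted([[1.0, 2.0], [3.0, 4.0]],
                                           [[5.0, 6.0], [7.0, 8.0]])
    print(product, fmuls, fadds)
```

Because the operation count is known exactly, measured cycles can be compared directly against the throughput and latency numbers above.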
 
 
  Cheers

Loïc Dupuy
France

Posts 253
31 Mar 2011 14:03


Gunnar von Boehn wrote:

In typical matrix operations the 68050 could score roughly like a 1.5 GHz 80486.

It means that we can have a VBL Quake I on the NATAMI (software rendering or GL rendering, GPL code EXTERNAL LINK ).
Quake I was the first game to make massive use of the x86 FPU for software rendering.
Every gamer who had a Cyrix 166 changed to the Pentium 133, because its FPU was two times faster.

I'm not fond of Quake, but it would be a good mixed benchmark (integer/memory/FPU) to see what the Natami has in its guts compared to a '96 PC without a 3D card (Quake II also has a software renderer).

Amiga port
http://planetquake.gamespy.com/View.php?view=Quake.Detail&id=326#Files
ClickBoom amiga commercial port
www.lemonamiga.com: EXTERNAL LINK 

video on youtube : EXTERNAL LINK  Amigas equipped with a full 68060@50, a fast gfx card and an AHI-compatible sound card would have seen a good 8-10 fps.
My own setup, AGA + 68040/25, would get around 4 fps with a postage-stamp-size window and 2x2 pixel mode ;) Game uses CD Audio for music (Not Recorded).


Gunnar von Boehn
Germany
(Moderator)
Posts 5775
31 Mar 2011 14:06


Loïc Dupuy wrote:

I'm not fond of Quake, but it would be a good mixed benchmark (integer/memory/FPU) to see what the Natami has in its guts compared to a '96 PC without a 3D card (Quake II also has a software renderer).

But it's too big a test case to learn anything from.
With such a big test case you will get a "score", but you will have no clue why your system achieved it.

A smaller test case allows you to analyse both the code and how the CPU reacts to it. Only by analysing can you learn from it and improve or evolve your CPU.

For me as a CPU developer, this means such a big test case is of little value.


Loïc Dupuy
France

Posts 253
31 Mar 2011 17:25


@Gunnar von Boehn
You are 100% right from a designer's point of view.

But from a user's point of view, and for "penis enlargement", knowing that the 133 MHz PCs of the time were doing 30-45 fps in 640x480 EXTERNAL LINK , it would be nice to know by which margin we beat and stomp over them :-D

It's neither urgent nor necessary beyond knowing the qualitative "enlargement" gain :-D

My point was that "Quake I" is FPU-bound in the software renderer. It will not help to design the FPU, but it will help to get an appreciation of its relative efficiency in an FPU-bound application.
