| Performance of the 68070 Pipeline |
|
|---|
|
|---|
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 28 Mar 2009 21:04
| People asked for some status updates. Here are some very preliminary performance numbers: I ran some tests from the CPU benchmark through the chip simulator of the 68070 CPU core. Please mind that the 68070 CPU is not finished: not everything is implemented yet and only parts are tested. The numbers below are a snapshot of the current pipeline behavior. Our long-term goal is to increase this performance (the goal is to double it). Before the CPU is enhanced we of course first need to finish it, and this will take quite some time. Our realistic near-term goal is to finish the 68070 core and bring it out with performance in the range of the results below. If the improved next version of the core is ever finished, it will probably get another name, such as 68080.
 The chart compares the 68030, 68040, 68060, and 68070 cores on 3 tests of the minibench CPU-Mark. These tests execute 4 different types of immediate operations operating on registers. The tests run 100% inside the CPU cache of each chip. They show the peak performance of part of the integer pipeline. For real-life performance the size of the CPU cache is also important. With each generation (030, 040, 060, 070) the cache size was increased, and application performance increased with it. I hope you like this. Please feel free to ask questions. Cheers Gunnar
| |
One Thousand USA
| | Posts 832 28 Mar 2009 22:02
| Thanks, this is good info. The 070 is looking good so far. A steady 1 instruction a clock is great for a 1-way CPU. My questions: Is it steady with 1 instruction a clock on dependent instructions? What are the stages in the pipeline?
| |
Fabian Nunez USA
| | Posts 312 28 Mar 2009 23:39
| It's interesting that ADDQ is a lot slower on a 070 than it is on an 060 (assuming linear performance with clock speed, a 50MHz 070 would score around 42 - roughly about on a par with 030 performance). Does this reveal some implementation bug, or is the design simply optimized for the other addressing modes?
| |
Samuel D Crow USA
| | (Natami Team) Posts 1295 28 Mar 2009 23:57
| I'd suspect it would have more to do with being made in an FPGA rather than a full-fledged custom die-layout.
| |
One Thousand USA
| | Posts 832 29 Mar 2009 00:05
| It looks like the 070@50MHz should be a score closer to 50, I think. But the reason why the 060 is faster per clock is because it is superscalar and can do 2 instructions at a time, but the 070 is only 1-way. A bottleneck of the 060 is how much it can fetch, so it takes a hit when the instruction is longer. He said this test is on immediate adds, so the addq is 16 bits, the addw is 32, the addl is 48. But the 070 (and 040) is steady through it all and does not have that problem. The 070 is doing great on this test.
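The instruction sizes quoted in this post follow from the 68K encoding: every instruction starts with a 16-bit opcode word, ADDQ packs its small immediate into that word, and word or long immediates follow as extension words. A minimal sketch of that arithmetic is below. The `fetch_bits` parameter is a hypothetical fetch width used only for illustration; it is not a figure given in this thread.

```python
OPCODE_BITS = 16  # every 68K instruction begins with one 16-bit opcode word

# Extra immediate bits that follow the opcode word for each test instruction.
IMMEDIATE_BITS = {
    "ADDQ": 0,         # the 3-bit immediate lives inside the opcode word
    "ADD.W #imm": 16,  # one extension word
    "ADD.L #imm": 32,  # two extension words
}


def encoded_bits(mnemonic):
    """Total encoded size of the instruction in bits."""
    return OPCODE_BITS + IMMEDIATE_BITS[mnemonic]


def whole_instructions_per_fetch(mnemonic, fetch_bits):
    """How many complete instructions fit into one fetch of fetch_bits.

    fetch_bits is an assumed, illustrative value, not a measured one.
    """
    return fetch_bits // encoded_bits(mnemonic)


if __name__ == "__main__":
    for name in IMMEDIATE_BITS:
        print(name, encoded_bits(name), whole_instructions_per_fetch(name, 32))
```

With the assumed 32-bit fetch, two ADDQ instructions fit into one fetch, but the 48-bit form does not fit even once, which is the kind of fetch limit the post describes for a 2-way CPU.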
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 29 Mar 2009 08:25
| One Thousand wrote:
| My questions: Is it steady with 1 instruction a clock on dependent instructions? What are the stages in the pipeline? |
Here is the result for dependent instructions, e.g.:
ADDQ.L #1,A0
ADDQ.L #1,A0
ADDQ.L #1,A0
ADDQ.L #1,A0
 The 68070 result shows how it will behave next week, when I've finished the forwarding work item that Jens gave me. As of today the 070 does not forward fully. The 68000, 68020, 68030, 68040, and 68070 are unaffected by dependent instructions. But the 68060, as a superscalar CPU, takes a performance hit if instructions depend on each other. The pipeline looks like this:
FETCH
0) DECODE
1) Reg-Load-ALU1
2) EA-CALC (ALU1)
3) Reg-Writeback-ALU1 / MEMLOAD / Reg-Load-ALU2
4) ALU (ALU2)
5) Reg-Writeback-ALU2 / MEMSTORE
The 070 has two ALUs and a LOAD unit arranged one behind the other. The 070 can do at most 2 ALU operations with 2 register updates plus a memory LOAD per clock.
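To illustrate why forwarding matters for a chain of dependent instructions, here is a toy timing model. It is only a sketch under stated assumptions and not the real 68070 logic: it assumes a single-issue, in-order pipeline, takes stage 3 (Reg-Load-ALU2) as the operand read and stage 5 (Reg-Writeback-ALU2) as the register write, and assumes that without forwarding a result can only be read in the cycle after it has been written back. The function names and the depth of 7 (FETCH plus stages 0 to 5) are chosen for this example.

```python
def issue_gap(dependent, forwarding, read_stage=3, writeback_stage=5):
    """Cycles between two back-to-back instructions in the toy model."""
    if dependent and not forwarding:
        # Assumption: the consumer must wait until the producer has passed
        # its writeback stage before it may enter its register-read stage.
        return writeback_stage - read_stage + 1
    # Independent instructions, or results forwarded ALU to ALU: one per clock.
    return 1


def cycles_for_chain(n, dependent, forwarding, depth=7):
    """Total cycles until n instructions have left a depth-stage pipeline."""
    return depth + (n - 1) * issue_gap(dependent, forwarding)


if __name__ == "__main__":
    print("forwarding:   ", cycles_for_chain(4, True, True))
    print("no forwarding:", cycles_for_chain(4, True, False))
```

In this model a chain of four dependent instructions needs 10 cycles with forwarding and 16 without it; the real stall counts depend on details of the core that are not given here.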
| |
Ayodele Stephenson USA
| | Posts 83 29 Mar 2009 17:55
| Thank You for the Update. Even non-technical people like myself have enjoyed reading the posts about the new N070 You have in development... This project continues to show the renewed "spirit" that this community needs. Thanks Again!!
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 29 Mar 2009 19:55
| Ayodele Stephenson wrote:
| This project continues to show the renewed "spirit" that this community needs. Thanks Again!!
|
Our pleasure :-) The 68070 is a very nice project. The 68070 is not finished yet, but we are making good progress and I really enjoy the work on it. Jens and I have learned quite a lot while working on the CPU design. Like all of us, I started programming the 68K CPUs over 20 years ago. Some of the things that I knew about the 68K CPU internals I just knew, but I never understood why some behaviors were designed the way they are. But now, after rebuilding a very fast 68K, I've learned a lot of the reasons behind them. I now understand why certain things inside the 68040 or 68060 were designed the way they were. It kind of all makes sense now. :-)
Some more info regarding the pipeline of the 68070. We tried to achieve a few design goals with the 68070:
- A 100% real 68K CPU. Support for all needed 68K integer instructions and all addressing modes. We wanted a real 68K CPU for maximum AMIGA-OS performance. A 100% 68K CPU and not a reduced Coldfire.
- Dropping MMU and 68K floating point for now. The reason was to speed up time to market and to speed up the integer pipeline. Our 68070 behaves like a Motorola 68EC0x0 CPU. The EC-68K CPUs were integer only and were used not only in the A1200 but in many other AMIGAs too.
- Optimized for high clock rates. Our design goal was to reach more than 100 MHz. Some people believed that this was impossible, but we proved that if you know what you are doing, you can create a high-clocked 68K CPU. Our 68070 currently runs at up to 133 MHz, and we might even reach a higher clock rate.
- The pipeline of the 68070 is designed to execute the majority of instructions in just 1 clock. With every new 68K CPU the needed clocks per instruction went down. The 68070 continues this and is so far the 68K CPU with the lowest number of clocks per instruction.
- Like the other top 68K CPUs, the 68070 is designed to do several "operations" in one instruction. The 68070 is designed to do 1 address calculation and 1 data-cache access for free per instruction in addition to 1 ALU operation. This separates the 68K and the 68070 from RISC CPUs. For what the 68070 can do in just 1 instruction, other CPUs (like PowerPC) often need 2 or 3 instructions.
- The 68070 executes a single 68K instruction per clock. For the next iteration (called 68070B or 68080) we target multiple 68K instructions per clock.
Cheers
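The point about one instruction covering an address calculation, a data-cache access, and an ALU operation can be sketched with a small model. The example below is illustrative only: it mimics what a 68K instruction such as `ADD.L (4,A0),D0` does in one step and compares it with a generic load/store sequence. The function names, the register and memory dictionaries, and the scratch register `T0` are assumptions made up for this sketch.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a long-word ADD


def add_mem_to_reg_68k(regs, mem, dst, base, offset):
    """One 68K-style instruction: dst += mem[base + offset]."""
    ea = regs[base] + offset                   # address calculation
    value = mem[ea]                            # data-cache load
    regs[dst] = (regs[dst] + value) & MASK32   # ALU operation
    return 1                                   # instructions executed


def add_mem_to_reg_load_store(regs, mem, dst, base, offset, tmp):
    """The same work on a load/store machine: a load, then an add."""
    regs[tmp] = mem[regs[base] + offset]           # instruction 1: load
    regs[dst] = (regs[dst] + regs[tmp]) & MASK32   # instruction 2: add
    return 2                                       # instructions executed


if __name__ == "__main__":
    memory = {0x1004: 7}
    a = {"A0": 0x1000, "D0": 5}
    b = {"A0": 0x1000, "D0": 5, "T0": 0}
    print(add_mem_to_reg_68k(a, memory, "D0", "A0", 4), a["D0"])
    print(add_mem_to_reg_load_store(b, memory, "D0", "A0", 4, "T0"), b["D0"])
```

Both paths leave the same value in the destination register; the difference is only in how many instructions carry the work. A memory destination would add a third instruction (a store) on the load/store side.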
| |
Marcel Verdaasdonk Netherlands
| | Posts 3976 29 Mar 2009 22:54
| Those results are quite impressive for an unfinished product. :P
| |
Mr. Derp USA
| | Posts 41 30 Mar 2009 10:03
| Two questions - What is the size of the cache of each CPU (030, 040, 060, 070)? Forgive my ignorance - is the 070 a new CPU you are developing? Is the intent for the team to design the chip and then forward that design to a chip manufacturer - like IBM or UMC or RealTek or something?
| |
Gio G. Germany
| | Posts 24 30 Mar 2009 10:23
| No, it's a "Soft CPU" which lives happily inside the FPGA. :)
| |
Bartek "Banter" K. Poland
| | (Natami Team) Posts 2277 30 Mar 2009 10:51
| Yes, and the best part is, it's probably one of the very first CPUs you can actually DOWNLOAD:) Take care.
| |
Team Chaos Leader USA
| | (Moderator) Posts 2094 30 Mar 2009 14:53
| 68020: 256 bytes instruction cache, 0 bytes data cache
68030: 256 bytes instruction cache, 256 bytes data cache
68040: 4096 bytes instruction cache, 4096 bytes data cache
68060: 8192 bytes instruction cache, 8192 bytes data cache
| |
One Thousand USA
| | Posts 832 30 Mar 2009 15:48
| Thanks for the answers. Things are looking good. I am also glad to see that the reckoning of each ALU is only one stage. This CPU work does look fun. On the next version, you are looking to have multiple instructions? Nice. I take it that is by adding ss/ooo?
| |
Wawa Tk Germany
| | Posts 581 30 Mar 2009 17:21
| @bartek: Bartek wrote:
| Yes, and the best part is, it's probably one of the very first CPUs you can actually DOWNLOAD:) |
as far as i know there are already softcores available to download.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 30 Mar 2009 20:12
| Gunnar von Boehn wrote:
| The 68070 result shows how it will behave next week - when I've finished the forwarding work item that Jens gave me. |
UPDATE: Forwarding is working now! The 68070 now executes the above test cases (the dependent instructions) just as shown in the bar chart! Kudos go to Jens for his "overnight" forwarding work. We are making some progress :-)
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 30 Mar 2009 20:14
| wawa tk wrote:
| as far as i know there are already softcores available to download. |
This is true, there are about half a dozen different 68K softcores. But the 68070 is designed to become by far the fastest.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 30 Mar 2009 20:44
| One Thousand wrote:
| Thanks, this is good info. The 070 is looking good so far. A steady 1 instruction a clock is great for a 1-way CPU.
|
Yes, I agree. Actually the ADD instruction shown in the bar chart is not a good indicator of CPU performance. ADD is a relatively simple instruction. Even the 68030 could execute the ADD instruction fast. Much more meaningful for performance will be instructions like SHIFT or MUL, which were slow and took many cycles on the 68020/68030/68040.
| |
One Thousand USA
| | Posts 832 30 Mar 2009 22:39
| That is great that forwarding was put in so swiftly. Good work, Jens. And thanks for the little update. I am tempted to join the team because of this excitement.
| |
Marcel Verdaasdonk Netherlands
| | Posts 3976 31 Mar 2009 09:48
| I am also willing to help, but being literate isn't such a great help. ;) Ah, schematics and code; life as a tester was easy: it worked or it didn't. Too bad I did the fixing part too. Perhaps in that manner I could help. But that's because I lack knowledge in some areas of electronics. Such is life, for one can't know it all, Marcel.
| |
|