Megol .
| | Posts 671 05 Aug 2012 21:48
| Marcel Verdaasdonk wrote:
|
Richard Maudsley wrote:
| 68K died because Motorola had their head up their ass. They jumped on the RISC bandwagon with the 88k, then realized they didn't have to do nearly as much work if they just joined the PPC alliance with apple. It's simple politics, and idiot businessmen trying to progress in their career by messing with things they don't understand. Nothing to do with technology. |
To be honest, if Moto had continued its 68K line we would have been in the same predicament as the x86 ISA from Intel. IMHO it is good the 68K died out, because a dead language doesn't change. This is neat because now we can enhance it under the hood without having to add special instructions and other shenanigans that are not required to make operations possible.
|
I don't understand your reasoning here. If you mean that a new 68k processor can skip a lot of the evolutionary changes in x86 then yes, that's true. If you mean the design of a top-of-the-line 68k processor would be significantly different from current x86 then no; most of the functionality is there because it makes sense.
| |
Marcel Verdaasdonk Netherlands
| | Posts 3974 05 Aug 2012 22:26
| We can skip all the dumb design decisions that have been proven not to work in x86 systems. That is the essence of what I said, Megol.
| |
Matt Hey USA
| | Posts 726 06 Aug 2012 03:53
| Megol . wrote:
| Motorola didn't think they would have enough market to improve the 68k line. They were most likely 100% correct in that respect. In comparison the PPC standard had the backing of IBM and others, so as it was, that choice was the better one. Remember even Microsoft supported PPC at the time and RISC seemed to be the future. |
Motorola was losing 68k market share in the desktop and workstation markets but they pretty much gave up on these. They didn't believe in their own product and were unwilling to invest in it (a self-fulfilling prophecy). If they were smart, they would have bought Apple or C= (for a song back then) and commoditized and licensed it like the PC. They still owned the embedded/military market, which they could have used for cash flow (they nearly killed that with such weak 68k microcontroller implementations as the early ColdFire). They bet on PPC instead (it looked like a safe bet at the time), which didn't perform as well as the theories predicted. The 68k and x86 were the opposite: in theory they should have been outdated, but in practice they performed very well (resources are limited, compilers and programmers can't be perfect and dirty programming is a necessity sometimes ;). The 68k was exceptional though (compared to x86) because it was easy to use and simpler. Joe Circello built a superb processor (68060) that was years ahead of its time with practically no enhancements to the 68020+ ISA, and the Motorola white collars tried to hide it (they anti-marketed their own product). They were truly clueless. I guess it pushed them to make faster PPC processors though. Apple had to add 68060 incompatibility to their OS to keep 3rd party solutions (Shapeshifter on a 68060 Amiga :) from being the fastest Macs at the time. Embarrassing for both Apple and Motorola. Thanks Joe Circello!
Megol . wrote:
| That x86 had enough market (=money) to survive the critical phase when ISA still had any impact on performance was something most people didn't account for at the time...
|
So you don't think the ISA matters any more? No doubt compatibility and developer tools are the driving force to a successful ISA. x86 and ARM own those. I agree that they are more important than a good ISA :/. We can make up a little bit of the difference with a superior and easier to use ISA but we are going to have to improve in that regard. We do have some allies like Frank Wille who has created an optimizing assembler that competes with the best x86 and ARM can provide. Compilers need some work, and we seem short on help there, but we will have to pick up steam as we get moving. If I'm good, I will have a few tricks up my sleeve. Don't count the Amiga/Natami/68k out yet ;). Marcel Verdaasdonk wrote:
| We can skip on all dumb design decision that don't work as has been proven in the X86 systems.
|
I agree, Marcel. x86 has the long byte instructions from hell with more outdated ISA kludges than they know what to do with. ARM made a mistake with Thumb 1 and started over for Thumb 2. It wouldn't be needed if their RISC encoding was good enough for the intended market (ours is!). ARM supports 3 different ISAs, but is supposed to reduce logic to save battery life? Motorola never made a powerful ISA after the 68020+. It's untouched, waiting for an artisan's hands to craft it into the best ISA ever ;). We start with one of the easiest to use and best code densities of any ISA. With a little bit of blessing, we will succeed!
| |
Olaf Schoenweiss Germany
| | Posts 782 06 Aug 2012 09:42
| Motorola was as clueless as Commodore, so they would have been a perfect combination :-)
| |
Nixus Minimax Germany
| | Posts 272 06 Aug 2012 11:43
| Matt Hey wrote:
| | Don't count the [...] 68k out yet ;). |
You are nuts.
| ARM made a mistake with Thumb 1 and started over for Thumb 2. It wouldn't be needed if their RISC encoding was good enough for the intended market (ours is!). ARM supports 3 different ISAs, but is supposed to reduce logic to save battery life? |
The thumb instructions map to ARM instructions and thus add very little circuit complexity. And I wouldn't call the original thumb a mistake either. Actually it was a very clever concept and I don't know of any other processor with a similar concept (did MIPS add something similar to their architecture?). And there is no real reason why an ARM processor would have to support thumb. The Cortex-M series only supports thumb2. Given the very scarce amount of thumb code outside of industrial applications, I guess that dropping support for thumb1 would not be any problem if it added too much circuit complexity.
| Motorola never made a powerful ISA after the 68020+. |
You don't consider PPC to be an ISA made by Motorola, right? This would be technically correct, of course. However, the PPC ISA is so much better than the pesky old 68k that in 1985 you probably would have dreamt about updating the 6502 ISA rather than using a 68000... With the PPC ISA there was no need for another ISA invented by Motorola. PPC failed. So why should Motorola have invented another ISA? They simply went out of processor business.
| It's untouched, waiting for an artisan's hands to craft it into the best ISA ever ;). We start with one of the easiest to use and best code densities of any ISA. |
Sorry, if you really consider the 68k to be a good ISA, you obviously don't value orthogonality very highly. And if a good ISA matters to you, you might check out the National Semiconductor 32x32. Similar to the 68k both in age and architecture but much better ISA. Nobody cares about the NS32x32 these days, of course, but that's yet another similarity to the 68k...
| |
Megol .
| | Posts 671 06 Aug 2012 12:46
| Matt Hey wrote:
| Megol . wrote:
| Motorola didn't think they would have enough market to improve the 68k line. They were most likely 100% correct in that respect. In comparison the PPC standard had the backing of IBM and others, so as it was, that choice was the better one. Remember even Microsoft supported PPC at the time and RISC seemed to be the future. |
Motorola was losing 68k market share in the desktop and workstation markets but they pretty much gave up on these. They didn't believe in their own product and were unwilling to invest in it (a self-fulfilling prophecy). If they were smart, they would have bought Apple or C= (for a song back then) and commoditized and licensed it like the PC. They still owned the embedded/military market, which they could have used for cash flow (they nearly killed that with such weak 68k microcontroller implementations as the early ColdFire). They bet on PPC instead (it looked like a safe bet at the time), which didn't perform as well as the theories predicted. The 68k and x86 were the opposite: in theory they should have been outdated, but in practice they performed very well (resources are limited, compilers and programmers can't be perfect and dirty programming is a necessity sometimes ;). The 68k was exceptional though (compared to x86) because it was easy to use and simpler. Joe Circello built a superb processor (68060) that was years ahead of its time with practically no enhancements to the 68020+ ISA, and the Motorola white collars tried to hide it (they anti-marketed their own product). They were truly clueless. I guess it pushed them to make faster PPC processors though. Apple had to add 68060 incompatibility to their OS to keep 3rd party solutions (Shapeshifter on a 68060 Amiga :) from being the fastest Macs at the time. Embarrassing for both Apple and Motorola. Thanks Joe Circello! |
Don't know if I can subscribe to that world view. PPC had a performance advantage for a while (at least in practice) and the future looked bright for that platform. At the same time many people thought CISC designs couldn't increase performance at the same rate as RISC designs and that included skilled technical people at Intel and Motorola. Megol . wrote:
| That x86 had enough market (=money) to survive the critical phase when ISA still had any impact on performance was something most people didn't account for at the time... |
So you don't think the ISA matters any more? No doubt compatibility and developer tools are the driving force to a successful ISA. x86 and ARM own those. I agree that they are more important than a good ISA :/. We can make up a little bit of the difference with a superior and easier to use ISA but we are going to have to improve in that regard. We do have some allies like Frank Wille who has created an optimizing assembler that competes with the best x86 and ARM can provide. Compilers need some work, and we seem short on help there, but we will have to pick up steam as we get moving. If I'm good, I will have a few tricks up my sleeve. Don't count the Amiga/Natami/68k out yet ;). |
ISA still matters but in practice the implementation matters a couple of magnitudes more. I remember when PA RISC was referred to as a memory chip with a processor. Guess what, even a cheap processor nowadays have much more cache transistors than core transistors. Branch prediction, cache hierarchy and prefetching makes the difference at this time IMHO. The x86 overhead is mostly in the decoding but still Intel have 4+1 instructions decoded per clock & core without any predecode information, AMD with predecode information can decode 4 instructions per two cores. If that isn't enough Intel can bypass the decoder in many cases effectively decoding even more instructions per clock (and that is intended as a power optimization). Compare this with any RISC of your choosing. As one example the IBM Power 6 used not only predecoding but pre-recoding of instructions and that is a RISC machine :) Marcel Verdaasdonk wrote:
| We can skip on all dumb design decision that don't work as has been proven in the X86 systems. |
I agree, Marcel. x86 has the long byte instructions from hell with more outdated ISA kludges than they know what to do with. ARM made a mistake with Thumb 1 and started over for Thumb 2. It wouldn't be needed if their RISC encoding was good enough for the intended market (ours is!). ARM supports 3 different ISAs, but is supposed to reduce logic to save battery life? Motorola never made a powerful ISA after the 68020+. It's untouched, waiting for an artisan's hands to craft it into the best ISA ever ;). We start with one of the easiest to use and best code densities of any ISA. With a little bit of blessing, we will succeed! |
Translating Thumb to ARM instructions is very easy so I can't see the problem?
| |
Matt Hey USA
| | Posts 726 06 Aug 2012 13:52
| Nixus Minimax wrote:
|
Matt Hey wrote:
| | Don't count the [...] 68k out yet ;). |
You are nuts.
|
They said the same things about Steve Jobs. He could make his own processor decisions and buy his own processor design and manufacturing company (P.A. Semi) only to have them switch processor designs. Ok, some people called him an innovative leader. What's the difference, besides money? Nixus Minimax wrote:
| The thumb instructions map to ARM instructions and thus add very little circuit complexity. And I wouldn't call the original thumb a mistake either. Actually it was a very clever concept and I don't know of any other processor with a similar concept (did MIPS add something similar to their architecture?). And there is no real reason why an ARM processor would have to support thumb. The Cortex-M series only supports thumb2. Given the very scarce amount of thumb code outside of industrial applications, I guess that dropping support for thumb1 would not be any problem if it added too much circuit complexity. |
A RISC processor nearly doubles the number of instructions, creates variable length instructions and has 3 modes of operation but somehow doesn't add much circuit complexity? That defies logic. Thumb 1 mode was severely handicapped (regular ARM instructions could not be used) and had a high overhead to switch to and from it. Dropping Thumb 1 support seems like the smart thing to do but fairly recent code used it. Are programmers expected to know what kind of code their compilers produce? I guess they can be expected to recompile their code too? Now the Cortex will drop the original ARM RISC instructions? Throw away your old compilers and libraries. Here's the standard for today. Now everybody support it. That's a really friendly processor. Nixus Minimax wrote:
| | Motorola never made a powerful ISA after the 68020+. |
You don't consider PPC to be an ISA made by Motorola, right? This would be technically correct, of course. However, the PPC ISA is so much better than the pesky old 68k that in 1985 you probably would have dreamt about updating the 6502 ISA rather than using a 68000... With the PPC ISA there was no need for another ISA invented by Motorola. PPC failed. So why should Motorola have invented another ISA? They simply went out of processor business. |
I should have said Motorola/Freescale (they are the same people). When you stop leading and innovating, you start following and that's what Motorola/Freescale have done for a long time. The PPC ISA is good as far as being powerful but is short on ease of use. It does make a difference when PPC assumes manual use of everything (like the caches) but compilers have trouble handling them. It's nearly impossible to look at PPC assembler code and figure out where a slow down or inefficiency is. I wouldn't want to write an assembler for PPC let alone a compiler. Nixus Minimax wrote:
| Sorry, if you really consider the 68k to be a good ISA, you obviously don't value orthogonality very highly. And if a good ISA matters to you, you might check out the National Semiconductor 32x32. Similar to the 68k both in age and architecture but much better ISA. Nobody cares about the NS32x32 these days, of course, but that's yet another similarity to the 68k...
|
I do consider the 68k a good ISA and I do value orthogonality :). Orthogonality is not the holy grail but it is easier to use and better for compilers. There is a trade off between code density and orthogonality. I have tried to improve the orthogonality with the 68kF ISA where reasonable and possible. Data registers are still better at generic data handling (byte sized operations and quick instructions) and address registers are still better at handling addresses (addressing mode operations) but some restrictions can be dropped. It improves ease of use (including for compilers) and code density to be able to mask an address register with an AND operation or LEA into a data register that will not be used for addressing. Thanks for the pointer on the NS32x32. I do like studying different ISAs.
| |
Megol .
| | Posts 671 06 Aug 2012 16:31
| Matt Hey wrote:
| A RISC processor nearly doubles the number of instructions, creates variable length instructions and has 3 modes of operation but somehow doesn't add much circuit complexity? That defies logic. Thumb 1 mode was severely handicapped (regular ARM instructions could not be used) and had a high overhead to switch to and from it. Dropping Thumb 1 support seems like the smart thing to do but fairly recent code used it. Are programmers expected to know what kind of code their compilers produce? I guess they can be expected to recompile their code too? Now the Cortex will drop the original ARM RISC instructions? Throw away your old compilers and libraries. Here's the standard for today. Now everybody support it. That's a really friendly processor.
| Encodings != instructions. Again: the logic needed for Thumb is small, just some lookup tables + fixup. If one doesn't run in Thumb mode the translation stage(s) are bypassed completely and clock gated. No problem. You have to see the ARM as a family of architectures where the lowest end Cortex variants are replacing microcontrollers and not microprocessors. Can't see why that would be a problem for programmers?
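To make the "lookup tables + fixup" point a bit more concrete, here is a minimal Python sketch of expanding a 16-bit Thumb encoding into the matching 32-bit ARM encoding. It is only an illustration under stated assumptions: the function name `thumb_to_arm` is made up, only two Thumb formats are handled, and a real core does this with a small amount of decode logic rather than software.

```python
def thumb_to_arm(halfword):
    """Expand one 16-bit Thumb encoding into an equivalent 32-bit ARM encoding.

    Toy model: handles only MOVS Rd,#imm8 and ADDS Rd,Rn,Rm.
    """
    if halfword >> 11 == 0b00100:
        # Thumb MOVS Rd, #imm8  ->  ARM MOVS Rd, #imm8 (condition = always)
        rd = (halfword >> 8) & 0x7
        imm8 = halfword & 0xFF
        return 0xE3B00000 | (rd << 12) | imm8
    if halfword >> 9 == 0b0001100:
        # Thumb ADDS Rd, Rn, Rm  ->  ARM ADDS Rd, Rn, Rm (condition = always)
        rm = (halfword >> 6) & 0x7
        rn = (halfword >> 3) & 0x7
        rd = halfword & 0x7
        return 0xE0900000 | (rn << 16) | (rd << 12) | rm
    raise NotImplementedError("encoding not covered by this sketch")


# Thumb "MOVS r1, #5" and "ADDS r0, r1, r2" expanded to their ARM forms.
print(hex(thumb_to_arm(0x2105)))
print(hex(thumb_to_arm(0x1888)))
```

The expansion is mostly moving register fields into wider slots and adding the fixed "always execute" condition bits, which is why it can be described as a table lookup plus fixup.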
| |
Matt Hey USA
| | Posts 726 07 Aug 2012 04:04
| Megol . wrote:
| Don't know if I can subscribe to that world view. PPC had a performance advantage for a while (at least in practice) and the future looked bright for that platform. At the same time many people thought CISC designs couldn't increase performance at the same rate as RISC designs and that included skilled technical people at Intel and Motorola. |
Did PPC really have an advantage? It was supposed to be superior but seemed to always under perform in real world comparisons. I would have thought that the engineers at Motorola would have taken a second look at why the 68060 with half the clock speed, caches, memory and memory bandwidth of early PPC processors often outperformed them. I guess they thought compilers and programmers would get better and solve all their problems but that never happened. It's more than likely that the clueless white shirts upstairs wouldn't have listened anyway. It would have been like Dave Haynie's suggestions at C=. He might have saved the company if he had made the technical management decisions. Megol . wrote:
| ISA still matters but in practice the implementation matters a couple of magnitudes more. I remember when PA RISC was referred to as a memory chip with a processor. Guess what, even a cheap processor nowadays have much more cache transistors than core transistors. Branch prediction, cache hierarchy and prefetching makes the difference at this time IMHO. |
True. A processor with a poor ISA design can hide the flaws with a longer pipeline and then better branch prediction to make up for it. A long latency doesn't seem to matter much as long as the memory bandwidth is fully utilized :/. Many modern processors end up more like the DSPs of the past. Well, if the ISA doesn't matter much for performance any more, why not make it for ease of use (including compiler support) and good code density? Megol . wrote:
| Translating Thumb to ARM instructions is very easy so I can't see the problem? |
Thumb decoding is easy and fast (1 pipeline slot?) but it still takes a fair amount of logic. Also, another simple decoder is needed for the shorter Thumb encodings which also uses logic. ARM with Thumb avoids some of the complexity of a variable length CISC decoder but also misses out on most of the advantages like full immediate support and more consistent support of powerful addressing modes. Megol . wrote:
| Encodings != instructions. Again: the logic needed for Thumb is small, just some lookup tables + fixup. If one doesn't run in Thumb mode the translation stage(s) are bypassed completely and clock gated. No problem. |
I don't see a distinction between an (instruction) encoding and an instruction. Each encoding requires some logic and processing (if it's used) even if it's translated into another instruction early in the decoder. If EXTB.L Dn was translated to MVS.B Dn,Dn in the N68050 decoder, would EXTB.L cease to be an instruction? Megol . wrote:
| You have to see the ARM as a family of architectures where the lowest end Cortex variants are replacing microcontrollers and not microprocessors. Can't see why that would be a problem for programmers? |
It's not. The needs of a microcontroller are small and backward compatibility is usually not required.
| |
Marcel Verdaasdonk Netherlands
| | Posts 3974 07 Aug 2012 11:30
| Matt Hey wrote:
| Megol . wrote:
| You have to see the ARM as a family of architectures where the lowest end Cortex variants are replacing microcontrollers and not microprocessors. Can't see why that would be a problem for programmers? |
It's not. The needs of a microcontroller are small and backward compatibility is usually not required.
|
| That might actually amaze you; there is a reason Microchip does so well in the embedded market. ;)
| |
Nixus Minimax Germany
| | Posts 272 07 Aug 2012 11:37
| Matt Hey wrote:
| | Did PPC really have an advantage? It was supposed to be superior but seemed to always under perform in real world comparisons. |
Who did those comparisons? The PPC ISA is far more powerful than the 68k ISA. You can save lots of unnecessary operations using the PPC ISA. You get 32 fully orthogonal registers. The implementation was very good, too. Each operation including rather complex ones completed in a single clock cycle. Multiplications took a maximum of nine cycles. The PPC featured a division operation (contrary to most other RISCs of the time). It had a very powerful FPU with lots of fast operations that could be used in complex operations like sin, cos, sqrt a.s.o. The 604 came with a whopping speed of 233 MHz and was three-way multiscalar. Sure, the 68k could have done better than the 50 MHz 68060. It could be made to be as fast as an Intel i7. (BTW, the PPC has actually been made as fast as a POWER7 with far less changes to the ISA than those that happened to the x86 between the Pentium and the i7...). It is just not going to happen. 68k died for good reasons. The only advantage of the 68k in comparison to the PPC was the better code density which provided two advantages: instruction cache size and instruction fetch bandwidth are utilized better on the 68k than on the PPC. Of course, the PPC came with much larger caches so that really wasn't a problem when comparing real-world implementations. I'm also sure that bandwidth was better in a proper PPC system than in anything we ever saw an 68060 in... I recently had a similar (ludicrous) discussion with someone who really thought that the 68000 of the Amiga was a bad choice and that they should have chosen the 6502. That guy also based his argument on the better code density of the 6502 and on the comparatively better use of the memory bus. He also claimed that the 6502 could outperform the 68000. Perhaps we should implement a new 6502? | Well, if the ISA doesn't matter much for performance any more, why not make it for ease of use (including compiler support) and good code density? |
Because compatibility rules. And because with clock speeds in the gigahertz range, you can saturate a lot of memory bandwidth even when you use a lower code density. If code is in the cache, the density doesn't matter much. If code is not in the cache, it will be loaded using bursts. Then the code density won't really matter that much either because in either case the CPU speed goes down to memory speed. The better code density will be as much faster as the code density is better but "faster" with regard to the crawling speed that an instruction fetch cache miss implies.
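The cache-miss argument above can be put into rough numbers. This is a back-of-the-envelope Python sketch, not a measurement: the average instruction sizes, the 32-byte line and the 40-cycle line fill are all made-up assumptions, and it assumes both encodings need the same number of instructions, which is not true in general.

```python
import math


def miss_fetch_cycles(instructions, avg_bytes_per_instr,
                      line_bytes=32, cycles_per_line=40):
    """Cycles spent fetching code when every cache line has to come from memory."""
    code_bytes = instructions * avg_bytes_per_instr
    lines = math.ceil(code_bytes / line_bytes)
    return lines * cycles_per_line


# Hypothetical denser variable-length encoding vs. a fixed 4-byte encoding.
dense = miss_fetch_cycles(1000, 3)
fixed = miss_fetch_cycles(1000, 4)
print(dense, fixed, dense / fixed)
```

Under these assumptions the denser code is fetched in about three quarters of the time, the same ratio as the code sizes, but both figures are dominated by the memory latency per line rather than by the CPU clock.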
| Thumb decoding is easy and fast (1 pipeline slot?) but it still takes a fair amount of logic. Also, another simple decoder is needed for the shorter Thumb encodings which also uses logic. ARM with Thumb avoids some of the complexity of a variable length CISC decoder but also misses out on most of the advantages like full immediate support and more consistent support of powerful addressing modes. |
Powerful addressing modes? I don't think that there is any real lack of addressing modes in the PPC or in ARM. Full immediate support is not a problem in ARM, you can always load a full immediate PC-relative to a register.
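For context on the immediate question: in classic 32-bit ARM mode a data-processing immediate is an 8-bit value rotated right by an even amount, so only some 32-bit constants fit in the instruction itself and the rest are typically loaded PC-relative from a literal pool. A small Python sketch of that check follows; the function name `is_arm_immediate` is illustrative, and Thumb-2 uses different immediate rules that are not modelled here.

```python
def is_arm_immediate(value):
    """True if value fits an ARM-mode data-processing immediate
    (an 8-bit constant rotated right by an even number of bits)."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by rot undoes a rotate-right by rot.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


for constant in (0xFF, 0xFF000000, 0xF000000F, 0x102, 0x12345678):
    kind = "inline immediate" if is_arm_immediate(constant) else "literal pool load"
    print(hex(constant), kind)
```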
| |
Megol .
| | Posts 671 07 Aug 2012 14:13
| Matt Hey wrote:
|
Megol . wrote:
| Don't know if I can subscribe to that world view. PPC had a performance advantage for a while (at least in practice) and the future looked bright for that platform. At the same time many people thought CISC designs couldn't increase performance at the same rate as RISC designs and that included skilled technical people at Intel and Motorola. |
Did PPC really have an advantage? It was supposed to be superior but seemed to always under perform in real world comparisons. I would have thought that the engineers at Motorola would have taken a second look at why the 68060 with half the clock speed, caches, memory and memory bandwidth of early PPC processors often outperformed them. I guess they thought compilers and programmers would get better and solve all their problems but that never happened. It's more than likely that the clueless white shirts upstairs wouldn't have listened anyway. It would have been like Dave Haynie's suggestions at C=. He might have saved the company if he had made the technical management decisions. Megol . wrote:
| ISA still matters but in practice the implementation matters a couple of magnitudes more. I remember when PA RISC was referred to as a memory chip with a processor. Guess what, even a cheap processor nowadays have much more cache transistors than core transistors. Branch prediction, cache hierarchy and prefetching makes the difference at this time IMHO. |
True. A processor with a poor ISA design can hide the flaws with a longer pipeline and then better branch prediction to make up for it. A long latency doesn't seem to matter much as long as the memory bandwidth is fully utilized :/. Many modern processors end up more like the DSPs of the past. Well, if the ISA doesn't matter much for performance any more, why not make it for ease of use (including compiler support) and good code density?
|
If the problems can be hidden with a longer pipeline I wouldn't call it a bad ISA. A bad ISA is something that limits the raw performance and/or requires heroic effort to execute well. Example: both 68k and x86 have a problem with partial register updates, which complicates renaming and more in an OoO implementation. x86 has a problem with "conditional" updates of flags/condition codes for some instructions, which requires some effort to execute well. On the other hand both x86 and 68k have some "warts" that can be seen as features if one has the extra hardware required (example: explicit stack pointer). Code density isn't a problem anymore and shouldn't be a primary goal when designing a modern processor. There are two reasons: even with 64-bit support there's no lack of memory (as the data will be much bigger anyway), and a reasonable instruction cache will fit the majority of code executed.
Megol . wrote:
| Translating Thumb to ARM instructions is very easy so I can't see the problem? |
Thumb decoding is easy and fast (1 pipeline slot?) but it still takes a fair amount of logic. Also, another simple decoder is needed for the shorter Thumb encodings which also uses logic. ARM with Thumb avoids some of the complexity of a variable length CISC decoder but also misses out on most of the advantages like full immediate support and more consistent support of powerful addressing modes. Megol . wrote:
| Encodings != instructions. Again: the logic needed for Thumb is small, just some lookup tables + fixup. If one doesn't run in Thumb mode the translation stage(s) are bypassed completely and clock gated. No problem. |
I don't see a distinction between an (instruction) encoding and an instruction. Each encoding requires some logic and processing (if it's used) even if it's translated into another instruction early in the decoder. If EXTB.L Dn was translated to MVS.B Dn,Dn in the N68050 decoder, would EXTB.L cease to be an instruction?
|
As Thumb is translated to the real ARM instruction set I see it as an encoding. One of the earliest RISC efforts had a similar short encoding with translation to the native, executable instruction set.
Megol . wrote:
| You have to see the ARM as a family of architectures where the lowest end Cortex variants are replacing microcontrollers and not microprocessors. Can't see why that would be a problem for programmers? |
It's not. The needs of a microcontroller are small and backward compatibility is usually not required.
|
Marcel Verdaasdonk wrote:
| That might actually amaze you; there is a reason Microchip does so well in the embedded market. ;)
|
Microchip and Atmel... But both MIPS and ARM have begun replacing traditional microcontrollers.
| |
Matt Hey USA
| | Posts 726 07 Aug 2012 14:38
| Nixus Minimax wrote:
| Who did those comparisons? The PPC ISA is far more powerful than the 68k ISA. You can save lots of unnecessary operations using the PPC ISA. You get 32 fully orthogonal registers. The implementation was very good, too. Each operation including rather complex ones completed in a single clock cycle. Multiplications took a maximum of nine cycles. The PPC featured a division operation (contrary to most other RISCs of the time). It had a very powerful FPU with lots of fast operations that could be used in complex operations like sin, cos, sqrt a.s.o. The 604 came with a whopping speed of 233 MHz and was three-way multiscalar. Sure, the 68k could have done better than the 50 MHz 68060. It could be made to be as fast as an Intel i7. (BTW, the PPC has actually been made as fast as a POWER7 with far less changes to the ISA than those that happened to the x86 between the Pentium and the i7...). It is just not going to happen. 68k died for good reasons. |
Computer magazines did the comparisons. The PPC specs usually looked better on paper too. 32 orthogonal registers makes 16 bit instructions nearly impossible (larger caches needed) and the additional registers need to be used often to have an advantage. The 68060 can do many complex instructions in 1 cycle also, including calculating an address and accessing caches/memory. A 2nd EA unit (planned for N68070 and Apollo) would have allowed almost all 68k instructions to operate in 1 cycle including 2 accesses to cache/memory. The PPC load/store architecture would need many instructions to do this. 32 bit multiplication only gives the upper or lower product with the 32 bit PPC ISA (both takes 2x as long). As I recall, division only returned the quotient and the remainder needed to be calculated. The 68k FPU had sin and cos fp instructions but they were removed (we still have sqrt). I find it interesting that Radeon GPUs also have fast sin and cos fp instructions. The PPC performs better with fp. The PPC is not a bad ISA but PPC processors have generally underperformed what they are supposed to be capable of. The 68060 outperforms what it is supposed to be capable of. Nixus Minimax wrote:
| The only advantage of the 68k in comparison to the PPC was the better code density which provided two advantages: instruction cache size and instruction fetch bandwidth are utilized better on the 68k than on the PPC. Of course, the PPC came with much larger caches so that really wasn't a problem when comparing real-world implementations. I'm also sure that bandwidth was better in a proper PPC system than in anything we ever saw an 68060 in... |
You need to take another look at the 68k then. The 68k has more powerful addressing modes that can be done with little overhead. ARM beats the PPC in addressing modes. Also, the 68k is easier to use and read despite its lack of orthogonality.
Nixus Minimax wrote:
| I recently had a similar (ludicrous) discussion with someone who really thought that the 68000 of the Amiga was a bad choice and that they should have chosen the 6502. That guy also based his argument on the better code density of the 6502 and on the comparatively better use of the memory bus. He also claimed that the 6502 could outperform the 68000. Perhaps we should implement a new 6502? |
No comparison. The 68k has 32 bit address registers with no bank switching. The 68k has more general purpose registers (or logically split data and address registers at least). The code density of a 68k should be better than the 6502 for modern complex programs. Nixus Minimax wrote:
| Powerful addressing modes? I don't think that there is any real lack of addressing modes in the PPC or in ARM. Full immediate support is not a problem in ARM, you can always load a full immediate PC-relative to a register. |
How many instructions and cycles does something like this take in PPC and ARM?
    move.l (a0)+,(8,a1)
    add.l (a0)+,($c,a1,d0.w*2)
Loading PC relative immediate data from caches/memory is not as good for the caches and not as readable. Megol . wrote:
| If the problems can be hidden with a longer pipeline I'd not call it a bad ISA. A bad ISA is something that limits the raw performance and/or requires heroic effort to execute well. |
I think general purpose CPU ISA implementations that force very long pipelines are one of the signs of a bad ISA. What logic can't be hidden (and conquered) with a long enough pipeline? One of the most important duties of a general purpose CPU is branching and conditional execution where a very long pipeline becomes a detriment in most cases. Excessively long pipelines waste resources and power also. Megol . wrote:
| Example: both 68k and x86 have a problem with partial register updates that complicates renaming++ in an OoO implementation. X86 has a problem with "conditional" updates of flags/condition codes for some instructions which requires some effort to execute well. On the other hand both x86 and 68k have some "warts" that can be seen as a feature if one has the extra hardware required (example: explicit stack pointer). |
Partial register updates also affect forwarding with superscalar processors. It's a trade-off. The ColdFire tried to do away with the 68k partial register updates in data registers (address registers have always been full update) which I think weakened the processors and hurt the code density. The 68k is simpler and more consistent than x86 with its condition handling and instructions. While the 68k doesn't easily allow branch speculation elimination (PPC with multiple CC units) or predication (ARM with conditional instructions), its system does support its powerful addressing modes (where the base register is updated) and it provides excellent code density. Many short branches could be replaced with new 68k conditional instructions we proposed like SELcc and SBcc in a simple and 68k way. Megol . wrote:
| Code density isn't a problem anymore and shouldn't be a primary goal when designing a modern processor. There are two reasons: even with 64 bit support there's no lack of memory (as the data will be much bigger anyway) and a reasonable instruction cache will fit the majority of code executed. |
Code density doesn't matter as much any more, especially on high end processors with large caches and lots of memory. Should we bother optimizing our code then? We can rely on users buying higher end computers with bigger caches and more memory. The data should then be in the cache, right? We would have poor code density processors and poorly optimized programs that do begin to make a more noticeable difference in performance. Code density is still important to the low to mid range processors where resources are more limited. That's where a modern 68k should be more competitive with its excellent code density. I don't have any delusions about the 68k competing with x86_64 (or PPC) on high end desktops and in servers.
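For reference, the two-instruction 68k sequence asked about earlier in this post can be spelled out as the separate steps a load/store machine would have to perform. This is only a rough Python sketch of the semantics, not a PPC or ARM listing and not a cycle count. The flat big-endian `mem` bytearray, the `regs` dict and all function names are assumptions made for illustration, and condition codes are ignored.

```python
import struct


def read_long(mem, addr):
    """Read a big-endian 32 bit value from the model memory."""
    return struct.unpack_from(">I", mem, addr)[0]


def write_long(mem, addr, value):
    """Write a big-endian 32 bit value, wrapping to 32 bits."""
    struct.pack_into(">I", mem, addr, value & 0xFFFFFFFF)


def sign_extend_word(value):
    """Treat the low 16 bits as a signed word, as d0.w indexing does."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_sequence(mem, regs):
    # move.l (a0)+,(8,a1)
    tmp = read_long(mem, regs["a0"])       # load from (a0)
    regs["a0"] += 4                        # post-increment a0
    write_long(mem, regs["a1"] + 8, tmp)   # store to a1 + 8

    # add.l (a0)+,($c,a1,d0.w*2)
    tmp = read_long(mem, regs["a0"])       # load from (a0)
    regs["a0"] += 4                        # post-increment a0
    index = sign_extend_word(regs["d0"]) * 2   # scale the word index
    ea = regs["a1"] + 0xC + index          # base + displacement + index
    total = read_long(mem, ea) + tmp       # load destination and add
    write_long(mem, ea, total)             # store the sum back
```

Each commented line is a step that the 68k folds into one of the two instructions. How many real instructions a given load/store ISA needs depends on which of these steps it can combine (for example update-form loads or scaled-index addressing), so the sketch only shows the work involved.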
| |
Megol .
| | Posts 671 09 Aug 2012 15:56
| Matt Hey wrote:
| Megol . wrote:
| If the problems can be hidden with a longer pipeline I'd not call it a bad ISA. A bad ISA is something that limits the raw performance and/or requires heroic effort to execute well. |
I think general purpose CPU ISA implementations that force very long pipelines are one of the signs of a bad ISA. What logic can't be hidden (and conquered) with a long enough pipeline? One of the most important duties of a general purpose CPU is branching and conditional execution where a very long pipeline becomes a detriment in most cases. Excessively long pipelines waste resources and power also.
|
Yes. I don't know of any such ISA though. Megol . wrote:
| Example: both 68k and x86 have a problem with partial register updates that complicates renaming++ in an OoO implementation. X86 has a problem with "conditional" updates of flags/condition codes for some instructions which requires some effort to execute well. On the other hand both x86 and 68k have some "warts" that can be seen as a feature if one has the extra hardware required (example: explicit stack pointer). |
Partial register updates also affect forwarding with superscalar processors. It's a trade-off. The ColdFire tried to do away with the 68k partial register updates in data registers (address registers have always been full update) which I think weakened the processors and hurt the code density. The 68k is simpler and more consistent than x86 with its condition handling and instructions. While the 68k doesn't easily allow branch speculation elimination (PPC with multiple CC units) or predication (ARM with conditional instructions), its system does support its powerful addressing modes (where the base register is updated) and it provides excellent code density. Many short branches could be replaced with new 68k conditional instructions we proposed like SELcc and SBcc in a simple and 68k way. Megol . wrote:
| Code density isn't a problem anymore and shouldn't be a primary goal when designing a modern processor. There are two reasons: even with 64 bit support there's no lack of memory (as the data will be much bigger anyway) and a reasonable instruction cache will fit the majority of code executed. |
Code density doesn't matter as much any more, especially on high end processors with large caches and lots of memory. Should we bother optimizing our code then? We can rely on users buying higher end computers with bigger caches and more memory. The data should then be in the cache, right? We would have poor code density processors and poorly optimized programs that do begin to make a more noticeable difference in performance. Code density is still important to the low to mid range processors where resources are more limited. That's where a modern 68k should be more competitive with its excellent code density. I don't have any delusions about the 68k competing with x86_64 (or PPC) on high end desktops and in servers.
|
What I meant is that code density isn't a primary goal in designing an instruction set, not that it doesn't matter at all. Removing e.g. MOVEM and MOVEQ from a 64 bit 68k mode (so that 100% compatibility isn't needed) wouldn't be a performance problem and wouldn't even be a size problem for most embedded uses. And that would significantly simplify decoding.
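To put a number on what MOVEQ buys in code size, here is a small Python sketch that encodes `moveq #imm,Dn` next to the equivalent `move.l #imm,Dn`. The opcode layouts are the standard 68k encodings (MOVEQ is `0111 rrr0 dddddddd`; MOVE.L with an immediate source and a data register destination is `0x203C` plus the register field, followed by a 32 bit immediate). The function names are hypothetical and exist only for this example.

```python
import struct


def encode_moveq(value, dreg):
    """Encode moveq #value,Dn: one 16 bit word, 8 bit signed immediate."""
    if not -128 <= value <= 127:
        raise ValueError("moveq only takes an 8 bit signed immediate")
    if not 0 <= dreg <= 7:
        raise ValueError("data register must be d0..d7")
    return struct.pack(">H", 0x7000 | (dreg << 9) | (value & 0xFF))


def encode_move_l_imm(value, dreg):
    """Encode move.l #value,Dn: opcode word plus a 32 bit immediate."""
    if not 0 <= dreg <= 7:
        raise ValueError("data register must be d0..d7")
    return struct.pack(">HI", 0x203C | (dreg << 9), value & 0xFFFFFFFF)


if __name__ == "__main__":
    short = encode_moveq(1, 0)
    long_form = encode_move_l_imm(1, 0)
    print(short.hex(), len(short), "bytes")
    print(long_form.hex(), len(long_form), "bytes")
```

So a small constant costs 2 bytes with MOVEQ and 6 bytes without it. Whether that difference matters for a given program depends on how often small constants are loaded, which is the point under discussion rather than something this sketch settles.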
| |
Marcel Verdaasdonk Netherlands
| | Posts 3974 09 Aug 2012 16:37
| The Pentium 4 had 28 stages in its pipeline; if that is not an example of how it should not be done, IDK. I suppose that's why they took a step back and used the Pentium 3 as the basis for the Core generation. Code density is very important, Megol. The irony remains that we do not require a bigger instruction word, and we can use a 64 bit data bus without extending the ISA. Problems will arise when we need to go for 64 bit execution of code, but at this time that is still not the goal; besides, with smart programming we are not limited at all in this department.
| |
Megol .
| | Posts 671 09 Aug 2012 18:43
| Marcel Verdaasdonk wrote:
| The Pentium 4 had 28 stages in its pipeline; if that is not an example of how it should not be done, IDK.
|
And? That had nothing to do with the ISA; the reason was the very aggressive design. It was a speed demon design with integer pipelines at 2x the already fast nominal clock rate. Look at other designs targeting similar clock rates and see how deep the pipeline goes.
| I suppose that's why they took a step back and used the Pentium 3 as the basis for the Core generation.
|
No, the reason was that they couldn't keep the power down.
| Code density is very important, Megol. The irony remains that we do not require a bigger instruction word, and we can use a 64 bit data bus without extending the ISA. Problems will arise when we need to go for 64 bit execution of code, but at this time that is still not the goal; besides, with smart programming we are not limited at all in this department.
|
Yeah... Or rather there's no market for >4 GiB memory support for a 68k machine. I don't know why you bring in the data bus size as it has nothing to do with what I wrote?!? It's easy to fit a 128 bit bus to an 8 bit processor.
| |
Thierry Atheist Canada
| | Posts 1828 09 Aug 2012 19:44
| Megol . wrote:
| Code density isn't a problem anymore and shouldn't be a primary goal when designing a modern processor. There are two reasons: even with 64 bit support there's no lack of memory (as the data will be much bigger anyway) and a reasonable instruction cache will fit the majority of code executed. |
LOL. Megol, why do you, Richard, and others keep defending the FORCE FED policies of the UTTERLY INCOMPETENT AND USELESS microsoft corporation? I don't have access to the byte sizes of AOS1.3 commands and AOS4.0 ones to compare, but that PUTRID ABOMINATION that PPC produces makes all commands 2, 3, and MORE OFTEN 7 to 10 (TEN) times bigger than their previous equivalents..... YOU defend OSes like windows 7 needing 70, yes, SEVENTY GIGABYTES of hard drive storage space to function!!!!!! Go enjoy your much appreciated horrid abomination(s). And as for cache space, I was for having really big data and instruction caches in the SuperAGA NatAmi FPGA until I realized that flushing 32K data and 32K instructions would produce SERIOUS overhead on a slow ~150 MHz CPU, and that flushing 12 MEGABYTES on a 3.4 GHz CPU would still be a HUGE burden to be doing constantly. We may be best off at 16K data and 16K instructions with the NatAmi FPGA. The only good thing that came from PPC is Altivec, which is SIMD, right? And that isn't really PPC related, I don't think. I'm not sure where it came from, but I think it was from graphics processing unit technology.
| |
Marcel Verdaasdonk Netherlands
| | Posts 3974 09 Aug 2012 20:39
| Megol . wrote:
| Marcel Verdaasdonk wrote:
| Code density is very important, Megol. The irony remains that we do not require a bigger instruction word, and we can use a 64 bit data bus without extending the ISA. Problems will arise when we need to go for 64 bit execution of code, but at this time that is still not the goal; besides, with smart programming we are not limited at all in this department. |
Yeah... Or rather there's no market for >4 GiB memory support for a 68k machine. I don't know why you bring in the data bus size as it has nothing to do with what I wrote?!? It's easy to fit a 128 bit bus to an 8 bit processor.
|
The data bus is very important to fetching data from main memory: the wider it is, the more data and instructions can be fetched in a single access. This is where a smaller instruction word benefits us; besides, it is quite common knowledge that a smaller instruction word requires less logic to decode it. As an example, a modern Intel CPU decodes 4+1 instructions each cycle and an AMD decodes 4 instructions. If we can defeat them cycle by cycle, meaning that we can do more operations per clock, that would be a feat alone. A 64 bit wide data bus could fetch 4 word-sized instructions; if we can keep up with the big boys in this respect, this would be a victory in and of itself. Extending the ISA to support 64 bit data thus has no priority, because we do not even currently keep up in the 32 bit theater. We do not need to take the lead, but the gap that the past 18 years have left us with since the release of the 68060 needs to be closed before we even consider making major changes to the ISA. The ISA as is suffices and even requires some instructions to be dropped to close that performance gap.
| |
Nixus Minimax Germany
| | Posts 272 10 Aug 2012 12:44
| Marcel Verdaasdonk wrote:
| | The data bus is very important to fetching data from main memory: the wider it is, the more data and instructions can be fetched in a single access. |
And you think he doesn't know that? His point was that the bus size is completely irrelevant when discussing an ISA. Even a 6502 could be implemented with a 128 bit bus and the precious 68k was actually implemented with an 8 bit bus.
| besides, it is quite common knowledge that a smaller instruction word requires less logic to decode it. |
This "common knowledge" only exists in the world of Marcel Verdaasdonk. One basic idea of RISC was that instructions aren't encoded at all in the classic sense but comprise all the control flags required for the ALU. Thus, in the (ideal) RISC world an instruction hardly needs any decoding at all which is one of the reasons the instruction words are on average larger than those in the CISC world.
| As an example, a modern Intel CPU decodes 4+1 instructions each cycle and an AMD decodes 4 instructions. If we can defeat them cycle by cycle, meaning that we can do more operations per clock, that would be a feat alone. |
What would be the point of such a comparison? In a RISC like the PPC you can decode an infinite number of instructions at the same time if you want to because they are all the same size. This limitation only exists in the CISC world with its multitude of instruction lengths. Since it can be done with the x86, the same or more must be possible for the 68k. But the decoding problem is precisely where the x86 has the largest complexity. The difficulty of decoding CISC instructions grows exponentially with the amount of data you are looking at in a single instance.
| A 64 bit wide data bus could fetch 4 word-sized instructions; if we can keep up with the big boys in this respect, this would be a victory in and of itself. |
Try decoding 64 bits worth of arbitrary 68k code. It could be anything from one partial to four complete instructions. When you have figured out how many possibilities there are, you may understand that there is a reason why nobody would build a new CISC CPU in 2012 unless they were forced to do so.
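The counting exercise suggested above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a decoder: it assumes the 64 bit window starts on an instruction boundary, that instruction lengths are whole 16 bit words, and that the longest instruction is 11 words (`max_len_words` is a parameter, so other limits can be tried). It ignores that a real decoder must also work out each length from the opcode and extension words. The function names are made up for this example.

```python
def window_layouts(window_words=4, max_len_words=11):
    """Return every sequence of instruction lengths (in 16 bit words)
    that covers the window. The last instruction in a sequence may
    extend past the end of the window (a partial instruction)."""
    layouts = []

    def extend(prefix, used):
        for length in range(1, max_len_words + 1):
            if used + length >= window_words:
                # This instruction reaches or crosses the window end.
                layouts.append(prefix + (length,))
            else:
                extend(prefix + (length,), used + length)

    extend((), 0)
    return layouts


def start_patterns(layouts):
    """Collapse layouts to the set of word offsets where instructions start."""
    patterns = set()
    for layout in layouts:
        offsets, pos = [], 0
        for length in layout:
            offsets.append(pos)
            pos += length
        patterns.add(tuple(offsets))
    return patterns


if __name__ == "__main__":
    layouts = window_layouts()
    print(len(layouts), "length layouts")
    print(len(start_patterns(layouts)), "start patterns")
```

Under these assumptions the four-word window has 81 possible length layouts, from a single instruction of four or more words up to four one-word instructions, and they collapse to 8 distinct patterns of instruction start offsets. Widening the window makes both numbers grow quickly, which is the scaling problem described in the post.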
| |
Lorenzo Lorenko Italy
| | Posts 63 10 Aug 2012 12:45
| Marcel Verdaasdonk wrote:
| | We do not need to take the lead, but the gap that the past 18 years have left us with since the release of the 68060 needs to be closed before we even consider making major changes to the ISA. The ISA as is suffices and even requires some instructions to be dropped to close that performance gap. |
Right !! But how?!? Can we make a petition to Freescale :)
| |