
Welcome to the Natami / Amiga Forum

This forum is for AMIGA fans interested in the new NATAMI platform.
Please read the forum usage manual.



Do you have ideas and feature wishes? Post them here and discuss your ideas.

New 68k ISA
Marcel Verdaasdonk
Netherlands

Posts 3978
30 Jul 2012 22:20


Knowledge, another set of eyes, your opinion.
 
  I mentioned you specifically, Samuel, because of your work on LLVM; a top-down view would be a better vantage point in this discussion than my bottom-up standpoint.
 
  What would be the best name for MOVEA if it applied to all relevant registers accessible from user space?
 
 
 
 

Samuel D Crow
USA
(Natami Team)
Posts 1295
31 Jul 2012 03:35


Since you bring up LLVM, it has its own virtual instruction set so you don't have to deal with actual machine language as often.  One of its commands is "ALIAS".  It lets you rename functions without actually adding an inline or anything.  For now we can leave it as is but later have another command name alias the original.

Matt Hey
USA

Posts 735
31 Jul 2012 09:30


Samuel D Crow wrote:

  I think we should be using high-level languages using advanced compilers.  I think ThoR would agree on that point.  That way AROS 68k isn't an island unto itself but one of several supported architectures.  That way we can focus on the center of the NatAmi Universe:  SuperAGA.
 

 
  I think that high level languages are very important today but they are built starting with the foundation blocks of the underlying hardware. Advanced compilers can hide a poor hardware implementation but only with additional complexity which requires more effort (time) and introduces more errors. We should strive for simplicity, consistency and readability at all levels of development.
 
  Let me give you a relevant example of an error created by an inconsistent and under-documented 68k behavior of the extended word (.W) with address registers. Vasm and PhxAss have been doing the optimization "CMPA.W #0,An -> TST.W An" for many years. I recently noticed that this is incorrect and reported it to Frank who verified and fixed the error in vasm. TST.W An only tests the low word of the address register. The optimization works most of the time with pointers but is more likely to cause problems if using an address register for general purpose use. Should an advanced compiler try to use a 68k address register for general purpose use? The free auto extension of address registers makes some signed operations the fastest method using address registers. Should an advanced compiler ignore this as 68k address registers are only meant for pointers? Does the "CMPA.W #0,An -> TST.W An" optimization error affect advanced compilers? Would errors like this be more difficult to find on a CPU with assembler that is more difficult to read? ThoR advocates high level languages but would he have been as prolific on the Amiga if the assembler was more difficult to read and debug?
 
  I find it interesting that you want to focus on the hardware support of a specific graphics standard like SAGA but ignore the hardware support of a specific CPU. It's easier and more powerful to put a PCI gfx card in the Natami than to replace the 68k CPU. The reasoning for keeping both is backward compatibility (does preference count?) but neither is necessary. I would like to see enhancements to both explored though :).
 
 
Samuel D Crow wrote:

  Since you bring up LLVM, it has its own virtual instruction set so you don't have to deal with actual machine language as often.  One of its commands is "ALIAS".  It lets you rename functions without actually adding an inline or anything.  For now we can leave it as is but later have another command name alias the original.
 

 
  Most languages provide aliases and renaming. It would be better to start with a good name in the first place if we could come to a consensus as to what that is. I would like to hide the .W sign extension of OPA.W <ea>,An instructions now that we added similar support for data registers which is transparent. It makes the ISA more consistent and readable. It would probably require renaming MOVEA.W to MVS.W (or whatever name we decide) also.
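
For readers following along, here is a minimal Python sketch of the two extensions being named. It assumes only the documented MOVEA.W behaviour (the word source is sign-extended to 32 bits); the function names are hypothetical and are not proposed mnemonics.

```python
def sign_extend_word(value):
    """Sign-extend the low 16 bits to 32 bits, as MOVEA.W does for An."""
    low = value & 0xFFFF
    return (low | 0xFFFF0000) if (low & 0x8000) else low


def zero_extend_word(value):
    """Zero-extend the low 16 bits to 32 bits (the unsigned counterpart)."""
    return value & 0xFFFF


# A word with the top bit set shows the difference between the two.
print(hex(sign_extend_word(0x8000)))  # 0xffff8000
print(hex(zero_extend_word(0x8000)))  # 0x8000
```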
 

Samuel D Crow
USA
(Natami Team)
Posts 1295
31 Jul 2012 16:40


Matt Hey wrote:

I find it interesting that you want to focus on the hardware support of a specific graphics standard like SAGA but ignore the hardware support of a specific CPU. It's easier and more powerful to put a PCI gfx card in the Natami than to replace the 68k CPU. The reasoning for keeping both is backward compatibility (does preference count?) but neither is necessary. I would like to see enhancements to both explored though :).

I doubt there will be enough bandwidth in the NatAmi's PCI card slot to provide adequate support for a GFX card.  Also, the SyncZorro slot allows replacing the soft-core with a real 68060 if, for example, an MMU is needed.  The point of the matter is running Amiga software.  No GFX card can emulate the Amiga chipsets well enough to run most of the software written for the early Amigas.

Back on the subject of naming, just pick something and be consistent.  If you need an alias or macro to be backward compatible then so be it.

Thomas Richter
Germany
(MX-Board Owner)
Posts 1425
01 Aug 2012 09:40


Matt Hey wrote:

  Let me give you a relevant example of an error created by an inconsistent and under-documented 68k behavior of the extended word (.W) with address registers. Vasm and PhxAss have been doing the optimization "CMPA.W #0,An -> TST.W An" for many years.

But that's incorrect for several reasons, and not a good optimization either. First, as you state correctly, CMPA always operates on all 32 bits, whereas TST.W does not. Second, TST An works only on the 68020 and up, and thus should probably be avoided. Third, TST.W does not set the carry and overflow bits, thus is hardly equivalent to the above.

But that's rather an argument against assembly.

Matt Hey wrote:

  Should an advanced compiler try to use a 68k address register for general purpose use?

An advanced compiler should evaluate all the register allocations possible and should pick the one that fits best to the compiler options, namely either make the code as small as possible, or make the code as fast as possible. Whether that implies that An is used as GP register is really up to the compiler.

Matt Hey wrote:

ThoR advocates high level languages but would he have been as prolific on the Amiga if the assembler was more difficult to read and debug?

Probably not. Because I was starting from the wrong end, I have to admit. Ok, not quite. The story goes a little bit different. I started with BASIC (as probably anyone else at this age) but found it too slow to be of much use. Thus, I learned assembly - that was back then on the 6502. There were no fast high level computer languages, no compilers, not a trace of a useable C compiler, and the other available languages were either scary or exotic. (Does anyone remember Action! ?). Thus, it seemed natural to use assembly on the Amiga as well.  Looking back, this was an unwise choice for several projects, and only the right choice for very few of them.

ViCNEd should have been in C. mmulib and COP are set right in assembly, but these are probably the only projects that are.

You become wiser when you get older.

Matt Hey wrote:

  Most languages provide aliases and renaming. It would be better to start with a good name in the first place if we could come to a consensus as to what that is. I would like to hide the .W sign extension of OPA.W <ea>,An instructions now that we added similar support for data registers which is transparent. It makes the ISA more consistent and readable. It would probably require renaming MOVEA.W to MVS.W (or whatever name we decide) also.

Heck, I don't know. I believe all the names are useless and do not express the intent correctly. Does anyone know the Analog Devices Blackfin DSPs? They have a very nice assembler language. Ported back to the 68K, move.l d0,d1 would there be written as d1 = d0;
and add.l d0,d4 simply as d4 += d0; Ditto for "lsl.l #2,d0", which translates to d0 <<= 2;

Simple, efficient and to the point. Avoid cryptic mnemonics when possible.
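
As a rough illustration of the idea, the toy Python translator below rewrites a few two-operand 68k lines into this algebraic style. It is only a sketch: the operator table is a hypothetical choice, size suffixes are simply dropped, and it follows the 68k operand order (source first, destination last).

```python
import re

# Hypothetical mapping from a few 68k mnemonics to C-style operators.
OPERATORS = {
    "move": "=", "add": "+=", "sub": "-=",
    "and": "&=", "or": "|=", "lsl": "<<=", "lsr": ">>=",
}

# mnemonic, optional size suffix, source (optionally immediate), destination
PATTERN = re.compile(r"\s*(\w+)(?:\.[bwl])?\s+#?(\w+)\s*,\s*(\w+)\s*")


def to_algebraic(line):
    """Translate e.g. 'add.l d0,d4' into 'd4 += d0;'."""
    match = PATTERN.fullmatch(line.lower())
    if match is None or match.group(1) not in OPERATORS:
        raise ValueError(f"unsupported instruction: {line!r}")
    mnemonic, source, dest = match.groups()
    return f"{dest} {OPERATORS[mnemonic]} {source};"


for text in ("move.l d0,d1", "add.l d0,d4", "lsl.l #2,d0"):
    print(to_algebraic(text))
```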

Greetings,
Thomas


Team Chaos Leader
USA
(Moderator)
Posts 2094
01 Aug 2012 15:18


@Marcel and Matt Hey

Using my knowledge of Linguistics, Mathematics, English, Marketing, Associative Memory, Mnemonics and possibly other things of dubious value I have calculated the only correct answer to be:

18) MOVESEX / MOVEZEX

Since the opcode performs a MOVE the first 4 letters (prefix) must be MOVE, in keeping with the English language, Legacy Motorola mnemonics and human memory. 

Using MOVE as the prefix also makes it easier for newbs to learn.  Newbs are our friends.  A Newb who keeps getting pwned all the time by intentionally cryptic mnemonics will get discouraged and quit.  Newbs need a helping hand from all us oldsk00l harc0re megac0derz.

So that only leaves deciding what should the suffix be.

Concatenating 2 words together always produces a final product that is easier to remember, easier to spell and easier to pronounce than simply connecting strange combinations of letters together.

So my scientific calculations produced the result of adding SEX / ZEX onto the end.

Everyone can remember that SEX = Sign EXtend and ZEX = Zero EXtend.

Everyone can read, spell and pronounce MOVESEX and MOVEZEX.

From a marketing standpoint, the 680x0 ISA now has more sex than other ISAs thus making it "the kewlest ISA evar!!!!111" :D

I did not "join the debate" earlier, as for me there is nothing to debate.

In my code the only correct mnemonics allowed will be MOVESEX / MOVEZEX.  You may use whatever mnemonic you want in your own code.  You can call them GRANDPA.L and GRANDMA.L in your own code for all I care, you'll just be doing it wrong. :)

If I have to issue my own 680x0 ISA document then I will do so.

I now formally ask my esteemed colleague Mr. Matt Hey to move us on to the next item that needs to be debated.



Megol .

Posts 678
01 Aug 2012 17:46


Team Chaos Leader wrote:

@Marcel and Matt Hey
 
  Using my knowledge of Linguistics, Mathematics, English, Marketing, Associative Memory, Mnemonics and possibly other things of dubious value I have calculated the only correct answer to be:
 
  18) MOVESEX / MOVEZEX
...

That's simply too long. One shouldn't need an editor with command completion in order to avoid carpal tunnel. Doing long, overly descriptive mnemonics is something best left to Intel x86 instruction extensions... :)


  I now formally ask my esteemed colleague Mr. Matt Hey to move us on to the next item that needs to be debated.

Agree.

Team Chaos Leader
USA
(Moderator)
Posts 2094
01 Aug 2012 18:58


As an expert on Carpal Tunnel I can tell you that the only way to avoid it is to either
A: Just rely on dumb luck of ur DNA, which u have no knowledge of.

or

B: Always type with ur wrists very straight.  Always mouse with straight wrists.  Typing with keyboard in ur lap is a VERY good idea because u don't want ur elbows bent 2 much.

Wear some loose wrist braces to always remind u to keep ur wrists straight.

NEVER use any made in China keyboard.  Or any other high impact keyboard.

ONLY use keyboards with spring switches.  I mean the kind where the springs absorb the shock of ur typing and there is no impact of key striking an immovable surface.  Some old Amiga keyboards had these.  They were called Cherry-Switch keyboards.

NEVER EVER do "finger curls" or "wrist curls".  (doing weightlifting exercises with ur fingers or wrists is an absolute NO NO.)



Matt Hey
USA

Posts 735
02 Aug 2012 04:08


Thomas Richter wrote:

 
Matt Hey wrote:

    Let me give you a relevant example of an error created by an inconsistent and under-documented 68k behavior of the extended word (.W) with address registers. Vasm and PhxAss have been doing the optimization "CMPA.W #0,An -> TST.W An" for many years.
 

  But that's incorrect for several reasons, and not a good optimization either. First, as you state correctly, CMPA always operates on all 32 bits, whereas TST.W does not. Second, TST An works only on the 68020 and up, and thus should probably be avoided. Third, TST.W does not set the carry and overflow bits, thus is hardly equivalent to the above.

 
  1) It was not a good optimization because it was incorrect. The CCR flags are set differently. Vasm does only "safe" optimizations by default. "Safe" refers to optimizations that not only give the same result but set the CCR flags in the same way.
  2) "CMPA.W #0,An -> TST.L An" is not done for the 68000 or 68010. Most optimizations do not apply to all processors of the 68k and ColdFire families. Limiting optimizations to ones that work on all related processors would very much limit the potential optimizations. Vasm does not currently allow a range of processors to be targeted.
  3) TST An always clears the CCR[C] and CCR[V] Bits. CMPA #0,An always clears the CCR[C] and CCR[V] Bits because a carry or overflow is not possible when subtracting zero which is the operation done by CMP. The CCR[N] and CCR[Z] flags are set according to the operand, An, for both instructions, if the same operation size is chosen. CCR[X] is unaffected for both instructions. Do you still see a problem with the CCR flags for this optimization?
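
The flag argument in point 3 can be checked with a small Python model. This is a sketch under the assumptions stated in the post: CMPA.W #0,An compares all 32 bits of An against the sign-extended zero, TST clears C and V, and X is ignored because neither instruction touches it. The function names are illustrative.

```python
def flags_cmpa_zero(an):
    """N, Z, V, C after CMPA.W #0,An: a full 32 bit compare against zero."""
    an &= 0xFFFFFFFF
    return {"N": bool(an & 0x80000000), "Z": an == 0, "V": False, "C": False}


def flags_tst(an, size_bits):
    """N, Z, V, C after TST.<size> An: only the low size_bits are examined."""
    value = an & ((1 << size_bits) - 1)
    return {"N": bool(value >> (size_bits - 1)), "Z": value == 0,
            "V": False, "C": False}


an = 0x00010000  # non-zero register whose low word happens to be zero
print(flags_cmpa_zero(an) == flags_tst(an, 32))  # True: TST.L is equivalent
print(flags_cmpa_zero(an) == flags_tst(an, 16))  # False: TST.W reports Z=1
```

With a value like 0x00010000 the word-sized test sees zero while the full compare does not, which is the case the old optimization got wrong.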
 
  This optimization is useful primarily in 2 cases:
 
  1) Sometimes there is no 68020+ version of link libraries available. The 68000 version can be used on a 68020+ with some improvement.
  2) Some compilers may produce only CMPA #0,An and expect the assembler to upgrade to TST An on machines with a 68020+. This simplifies compiler creation and size. Vbcc often generates basic instructions and expects optimizations to "quick" and short instructions. Unfortunately, assembler peephole type optimizations are not always possible because of different CCR flags being set. This can still work if the assembler programmer(s) notifies the compiler programmer(s) which optimizations are not possible in the assembler and the compiler generates the optimization for that case.
 
 
Thomas Richter wrote:

  But that's rather an argument against assembly.

 
  Compilers typically generate assembly code. How else can a high level language communicate with the hardware? Vasm is an assembler written in C that does peephole optimizations for compilers mostly written in C. The confusion came from the low level 68k ISA itself, without any code having been written in assembler, so how can this be a case against assembly?
 
 
Thomas Richter wrote:

 
Matt Hey wrote:

    Should an advanced compiler try to use a 68k address register for general purpose use?
 

  An advanced compiler should evaluate all the register allocations possible and should pick the one that fits best to the compiler options, namely either make the code as small as possible, or make the code as fast as possible. Whether that implies that An is used as GP register is really up to the compiler.
 

 
  I agree. The compiler should do reasonable optimizations so that assembly code isn't needed, if possible.
 
 
Thomas Richter wrote:

  ViCNEd should have been in C. mmulib and COP are set right in assembly, but these are probably the only projects that are.
 
  You become wiser when you get older.
 

 
  I hope we get wiser (and more mature) as we get older :). Frank Wille and I started out in assembler also but we both have programming projects in C now. I do still enjoy 68k assembler but it is tedious for some kinds of programs (like text handling). It's still a huge advantage to have a good understanding of the low level hardware and to have a readable ISA like the 68k. It helps tremendously with debugging, optimizing and patching.
 
 
Thomas Richter wrote:

  Heck, I don't know. I believe all the names are useless and do not express the intent correctly. Does anyone know the Analog Devices Blackfin DSPs? They have a very nice assembler language. Ported back to the 68K, move.l d0,d1 would there be written as d1 = d0;
  and add.l d0,d4 simply as d4 += d0; Ditto for "lsl.l #2,d0", which translates to d0 <<= 2;
 

 
  The Blackfin code is a kool concept. Most low level operations do have C style symbolic representations. The 2 op 68k could possibly simplify even further like d4+d0 and d0<<2 for your 2nd and 3rd examples. How about (8+a0+d0.l*4)=d1.w for MOVE.W D1,(8,A0,D0.l*4) and d0.l=3? for CMP.L #3,D0? I can see a few potential problems like calculations of variables and labels being confused with instructions and some assemblers having trouble parsing a more complex and unusual syntax. There is an advantage to using a more familiar and simpler syntax even if it's inferior in some ways.
 
 
Team Chaos Leader wrote:

  Everyone can remember that SEX = Sign EXtend and ZEX = Zero EXtend.
 
  Everyone can read, spell and pronounce MOVESEX and MOVEZEX.
 
  From a marketing standpoint, the 680x0 ISA now has more sex than other ISAs thus making it "the kewlest ISA evar!!!!111" :D

 
  Um, Yea. Sex does sell and that would be the sexiest ISA anyway :D. When your carpal tunnel flares up, you can tell everyone it's because of all the SEX you forced on yourself :P.
 
 
Team Chaos Leader wrote:

  I now formally ask my esteemed colleague Mr. Matt Hey to move us on to the next item that needs to be debated.
 

 
  I didn't intend to create a debate. The new ISA is more of a discussion and evaluation of ideas and improvements for the 68k ISA. It's more productive than debating the current status of the Natami project. Some of the ideas and wishes of the Amiga community should be usable and save time if an enhanced 68k CPU is ever created. Changing a name is rather simple and could be done later so it's not a big deal.
 

Marcel Verdaasdonk
Netherlands

Posts 3978
02 Aug 2012 08:14


What type of instructions should be in the A-line, and what should we do with the free space in lines 1, 2 and 3 (the MOVE lines)?

Nixus Minimax
Germany

Posts 273
02 Aug 2012 11:37


Megol . wrote:

Team Chaos Leader wrote:

  18) MOVESEX / MOVEZEX
  ...
 

 
  That's simply too long.

How about MO'SEX and MO'ZEX? Anyone should be able to remember that...


Marcel Verdaasdonk
Netherlands

Posts 3978
14 Aug 2012 20:52


Okay, I have been thinking and reading a lot, including the other threads. (Megol, Matt)

The A-line should use a 32 bit instruction word.
I would like to see 3 op instructions like saturation and such on this line.
Another would be 64 bit instructions with a slight twist.

This can be done in two ways: as Megol said, with a single multi-stage pipeline,
or as I said, by chaining the two pipelines of a superscalar CPU (single stage).

Each has its pros and cons; it is up to the implementer how to do it.

For most operations we should use two sources in this area, which do not get altered, and a single destination, which could be a 3rd source as well as the destination, or really just a destination register.

For things like saturation this means you can have a constant loaded in a register and used as one of the sources.

Megol .

Posts 678
15 Aug 2012 10:42


Marcel Verdaasdonk wrote:

Okay, I have been thinking and reading a lot, including the other threads. (Megol, Matt)
 
  The A-line should use a 32 bit instruction word.
  I would like to see 3 op instructions like saturation and such on this line.
  Another would be 64 bit instructions with a slight twist.
 
  This can be done in two ways: as Megol said, with a single multi-stage pipeline,
  or as I said, by chaining the two pipelines of a superscalar CPU (single stage).
 
  Each has its pros and cons; it is up to the implementer how to do it.
 
  For most operations we should use two sources in this area, which do not get altered, and a single destination, which could be a 3rd source as well as the destination, or really just a destination register.
 
  For things like saturation this means you can have a constant loaded in a register and used as one of the sources.

(Just some thoughts)
I wonder if a prefix scheme could be worth it anyway, at least as a complement. Just supporting one prefix should be relatively inexpensive to detect and decode in parallel with the following instruction.

0111REG1nnnnnnnn has at least 8 bits free for encoding extensions (11 bits if the register specifier isn't needed).
As an example:
0111REG1rxyzsnnn

Could support 3 operand instructions, increase the number of registers to 16 data and 16 address registers, support 64 bit instructions (s=size bit) and other things. It would allow future expansion and instruction specific hints too.

An instruction without prefix is treated as having a 0111DST100000000 prefix (where DST=destination comes from the instruction itself).

The rxyz bits are treated as the fourth bit of register number, the register field REG is extended to rREG and the rest are used to extend registers specified in the following instruction.

s=1 and operation size (from the following instruction) = long -> 64 bit operation (possibly trapping if not in 64 bit mode).
s=1 & opsize=word -> word size operation with zero filling of the upper 48 bit to avoid partial register problems.
s=1 & opsize=byte -> byte size operation with zero filling of the upper 56 bits.
s=0 & opsize=long always zero the upper 32 bits. This is also what long operations without a prefix would do.

Just like AMD64, this has the nice property of extending the available registers, which could make 64 bit capable code a bit faster and possibly counteract the impact of data size enlargement.
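
To make the bit layout concrete, here is a Python sketch that decodes the example prefix word 0111REG1rxyzsnnn and looks up the size rules listed above. It is an illustration only: the field names mirror the letters in the post, the byte case is assumed to mean an 8 bit operation, and the combinations the post does not mention (s=0 with word or byte) are returned as unspecified.

```python
def decode_prefix(word):
    """Split a 16 bit word of the form 0111 REG 1 r x y z s nnn into fields.

    Returns None when the word does not match the 0111xxx1 pattern.
    """
    if (word & 0xF100) != 0x7100:
        return None
    return {
        "reg": (word >> 9) & 0x7,  # REG: 3 bit register field
        "r": (word >> 7) & 1,      # fourth bit for REG itself
        "x": (word >> 6) & 1,      # extension bits for registers named
        "y": (word >> 5) & 1,      # in the following instruction
        "z": (word >> 4) & 1,
        "s": (word >> 3) & 1,      # size bit
        "n": word & 0x7,           # remaining free bits
    }


def extended_reg(fields):
    """Register number rREG: r becomes bit 3 of the 3 bit REG field."""
    return (fields["r"] << 3) | fields["reg"]


# (s, opsize) -> (operation width in bits, upper bits that are zero-filled)
SIZE_RULES = {
    (1, "long"): (64, 0),
    (1, "word"): (16, 48),
    (1, "byte"): (8, 56),   # assumption: a byte size operation is meant
    (0, "long"): (32, 32),
}


def effective_operation(s, opsize):
    """Look up the rule; None where the post leaves the behaviour open."""
    return SIZE_RULES.get((s, opsize))


fields = decode_prefix(0b0111_010_1_1000_1_000)
print(fields["reg"], extended_reg(fields), fields["s"])  # 2 10 1
print(effective_operation(fields["s"], "long"))          # (64, 0)
```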

Marcel Verdaasdonk
Netherlands

Posts 3978
15 Aug 2012 18:39


Well, I was thinking of something like this: the A-line tells the decoder to look at the second word, which should have the same structure as a regular instruction.
What should be in the A-line word is the multi-register part, which should be just plain and simple 8 bits or 6 bits.
This leaves us with 4 bits or 6 bits, respectively, to specify the adaptation of the second word.
I hope we can find a way to keep the second word compatible, so that if the A-line is not implemented it can be trapped in software with the first step of the instruction already done,
like the add part of a saturation, for which only the normalization would be left.

Or is this a complete brain fart on my behalf....?

Matt Hey
USA

Posts 735
15 Aug 2012 20:56


I think a single 3 op prefix (originally Gunnar's idea) is good. If used correctly (sparingly), it would improve speed and register utilization in some cases. I think it should, if possible, allow:

1) an <ea> for one of the sources to allow immediates
2) allow an address or data register for a source
3) have a bit to toggle on/off condition code updating

I think it should be limited to:

1) modifying 16 bit encodings only
2) using existing encoding space not including A-line or F-line
3) longword, word and byte sizes only
4) existing registers only

There is plenty of existing encoding space to do this without using A-line. I doubt that using A-line would offer a significant advantage. A saturation bit would probably be a good idea. It might allow SATS/SATU to be dropped in that case. I think it would add a fair amount more complexity, require a few more LEs and be significantly more work that Jens and other 68kF implementers would have to do :/. I would want hardware guys involved to make sure any encoding and features are optimal. I think it could be added on an evaluation basis but it would probably be a low priority.

I still do not like 64 bit in the integer units (SIMD or FPU are fine) or adding more registers in the integer units. I think these would increase the logic size (LEs) of our processor significantly without adding much usable performance and likely add incompatibilities. They would make it more difficult to comply with 68kF2. Even the prefix idea may turn off some implementers. There could be a 68kF3 but I would rather keep the highest standard conservative and simple until we have someone implementing/evaluating 68kF2.


Marcel Verdaasdonk
Netherlands

Posts 3978
15 Aug 2012 22:25


Matt, what I was thinking was to use a regular instruction as the second word, which keeps its normal operation (ADD.L, EOR.L, etc.).
  The A-line opcode only holds the first word within the given constraints; I would like this to be kept in mind for future expansion.
 
  So the first word is the 4 A-line bits, then 6 bits that define the deviation from the regular operation defined in the second word.
  Then we have two 3 bit fields that are either 2 registers or a mode and a register, depending on what the previous 6 bits are.
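
A minimal Python sketch of that first-word layout follows. The field order (4 A-line bits, 6 variant bits, then two 3 bit fields, from the most significant bit down) is an assumption read off the description, and the field names are hypothetical.

```python
def decode_a_line(word):
    """Split a 16 bit A-line first word into 4 + 6 + 3 + 3 bits.

    Assumed layout, most significant bit first:
      1010 | vvvvvv | aaa | bbb
    Returns None when the top nibble is not the A-line pattern 1010.
    """
    if (word >> 12) != 0xA:
        return None
    return {
        "variant": (word >> 6) & 0x3F,  # deviation from the second word's op
        "field_a": (word >> 3) & 0x7,   # register, or mode
        "field_b": word & 0x7,          # register
    }


# Variant 5 with fields 3 and 1 encodes as 0xA159 under this layout.
print(decode_a_line(0xA159))
```

Because both 3 bit fields sit at fixed positions, they can be extracted before the 6 variant bits are interpreted, which matches the decoding order described further down in this post.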
 
  My personal goals on 32bit operands are as follows.
  1.)We can have a wide range of operations within the constraints of the hardware, no real decoding overhead, just dependency on the second word.(Second word should be already defined in the 'current' ISA)
 
  2.)We have the possibility to add 32 variants to each sane instruction; this pertains not only to what we have currently but can also grow and be extended with future instructions.(Second word should already be part of the 'current' ISA, first word only tells what is different)
 
  3.)We have the option to have 4 op or 3 op instruction depending on the coding requirement.(For the time being let's keep to 3 op)
 
  4.)We should limit our operations to register only for obvious reasons; to some extent we could use immediates too if we don't want to pull some crazy stunts.(Dn, An, and Immediate can be used on the full instruction, this could be extended if deemed possible and required)
 
  5.)A-line instructions should not cause a fetch beyond what the second word requires.(first word should only contain Dn and An)
 
  6.)Never give the second word a different meaning than it already has in the 'current' ISA.(Prevent conflicting instructions)
 
  7.)Do not add operations that can be done with two ASM instructions in the same number of cycles; this is a waste of opcode space.
 
  8.)The first word should not affect the addressing modes of the second word.(This is very important for the decoder stage, and Deep Sub's current pipeline design)
 
  As far as I know, we can decode 2 or 3, perhaps 4, opcodes in a given cycle.
  The A-line word would be decoded at the same moment as the second word.
  After this the addressed registers can be decoded directly, since they are in a known place.
  Decoding the second word's addressing modes is a little more complicated and takes a little more time, during which more becomes known about the operation of the second word, which makes decoding the 6 bits possible.

Matt Hey
USA

Posts 735
16 Aug 2012 07:34


Marcel Verdaasdonk wrote:
 
    My personal goals on 32bit operands are as follows.
    1.)We can have a wide range of operations within the constraints of the hardware, no real decoding overhead, just dependency on the second word.(Second word should be already defined in the 'current' ISA)

 
  There is always decoding overhead, probably more than you think in this case. The N68050 is supposedly 3 op internally which should keep it to a minimum but there would be a lot of new functionality introduced. Would it be worth it if it made the N68k grow by, say, 10% in logic size (LEs)? Would the time and logic be better spent adding superscalar execution?
 
 
Marcel Verdaasdonk wrote:

    2.)We have the possibility to add 32 variants to each sane instruction; this pertains not only to what we have currently but can also grow and be extended with future instructions.(Second word should already be part of the 'current' ISA, first word only tells what is different)

 
  Extendable is nice but this is one path for using A-line which could be used differently and possibly better. There are more compatible encodings available for a prefix than using A-line.
 
 
Marcel Verdaasdonk wrote:

    3.)We have the option to have 4 op or 3 op instruction depending on the coding requirement.(For the time being let's keep to 3 op)
 

 
  How could 4 op work? More practical would be 3 op internally with 2 <ea> units. The prefix could provide 2 op to 3 op (e.g. AND.L <ea>,Dn -> AND.L <ea>,Rn,Dn) and 1 op to 2 op (e.g. NOT.L <ea> to NOT.L Rn,<ea>).
 
 
Marcel Verdaasdonk wrote:

    4.)We should limit our operations to register only for obvious reasons; to some extent we could use immediates too if we don't want to pull some crazy stunts.(Dn, An, and Immediate can be used on the full instruction, this could be extended if deemed possible and required)

 
  Register only very much limits the benefits of a powerful 68k processor. I think we should allow at least 1 <ea> and possibly 2 if we go to the trouble. Most of the 16 bit instructions allow 1 <ea>. There are 3 common types of 2 op instructions:
 
  1) op <ea>,Dn or op <ea>,An
  2) op Dn,<ea>
  3) opi #<data>,<ea>
 
  I think we could leave opt #3 out as it is a bit restrictive and would provide duplicate functionality. The 2 remaining could be modified with a register to:
 
  1) op <ea>,Rn,Dn or op <ea>,Rn,An
  2) op Dn,Rn,<ea>
 
  or more aggressively with an <ea> to:
 
  1) op <ea>,<ea>,Dn or op <ea>,<ea>,An
  2) op Dn,<ea>,<ea>
 
  There is room to add a 2nd full <ea> in the prefix which could be more powerful while maintaining the instructions encoding length of 32 bits. Extension words would be possible for the extra <ea> though. The decoder would be simpler allowing only a register in the prefix and it would leave more free bits for options.
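
To illustrate what the register-only prefix form would buy, here is a small Python model comparing the existing 2 op AND.L with the proposed 3 op form. The 3 op semantics (Dn = Rn AND <ea>, with Rn left unchanged) are an assumption based on the examples above, and the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def and_2op(regs, ea_value, dn):
    """AND.L <ea>,Dn: the destination is also a source, so Dn is overwritten."""
    regs[dn] = regs[dn] & ea_value & MASK32


def and_3op(regs, ea_value, rn, dn):
    """Assumed AND.L <ea>,Rn,Dn: Dn = Rn AND <ea>; Rn is not modified."""
    regs[dn] = regs[rn] & ea_value & MASK32


start = {"d0": 0x0000FFFF, "d1": 0}

# 2 op needs an extra MOVE.L d0,d1 to keep d0 intact.
two_op = dict(start)
two_op["d1"] = two_op["d0"]
and_2op(two_op, 0x00FF00FF, "d1")

# 3 op does the same work in one instruction.
three_op = dict(start)
and_3op(three_op, 0x00FF00FF, "d0", "d1")

print(two_op == three_op, hex(three_op["d1"]))  # True 0xff
```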

 
Marcel Verdaasdonk wrote:

    5.)A-line instructions should not cause a fetch beyond what the second word requires.(first word should only contain Dn and An)
 

 
  The prefix and the common 16 bit instruction would probably be fetched at the same time. I expect they would be treated as one 32 bit instruction.
 
 
Marcel Verdaasdonk wrote:

    6.)Never give the second word a different meaning than it already has in the 'current' ISA.(Prevent conflicting instructions)
 

 
  The prefix would no longer be a prefix but rather a new instruction in that case.
 
 
Marcel Verdaasdonk wrote:

    7.)Do not add operations that can be done with two ASM instructions in the same number of cycles; this is a waste of opcode space.

 
  It could still be useful when out of registers or when the condition codes are not wanted, for example. It would really be up to the programmer or compiler to choose the most advantageous instructions. It would be better to use 2 op when possible.
 
 
Marcel Verdaasdonk wrote:

    8.)The first word should not affect the addressing modes of the second word.(This is very important for the decoder stage, and Deep Sub's current pipeline design)

 
  This goes along with your option #6.
 

Marcel Verdaasdonk
Netherlands

Posts 3978
16 Aug 2012 10:46


Matt Hey wrote:

 
Marcel Verdaasdonk wrote:
 
      My personal goals on 32bit operands are as follows.
      1.)We can have a wide range of operations within the constraints of the hardware, no real decoding overhead, just dependency on the second word.(Second word should be already defined in the 'current' ISA)

   
    There is always decoding overhead, probably more than you think in this case. The N68050 is supposedly 3 op internally which should keep it to a minimum but there would be a lot of new functionality introduced. Would it be worth it if it made the N68k grow by, say, 10% in logic size (LEs)? Would the time and logic be better spent adding superscalar execution?
 

 
  By using the A-line instead of the other free space, we reduce most of the overhead by design.
  I assume that a 32 bit opcode takes at least twice as long to decode and takes up a slot that could be used differently. (This is slightly offset by the fact that the initial stages could be decoded in parallel.)
  I have no intention of asking Deep Sub to please add 32 bit opcodes to his design. I do, however, intend to keep it as simple as possible to minimize overhead and complexity. (K.I.S.S.)
 
 
Matt Hey wrote:

   
Marcel Verdaasdonk wrote:

      2.)We have the possibility to add 32bit variants to each sane instruction; this does not only pertain to what we have currently but can also grow and be extended with future instructions.(Second word should already be part of the 'current' ISA, first word only tells what is different)

   
    Extendable is nice but this is one path for using A-line which could be used differently and possibly better. There are more compatible encodings available for a prefix than using A-line.
 

 
  Placing them anywhere else creates more overhead than is required and spreads kludgy design around, just for the sake of complying with your ISA.
  No, we need 32bit operations to be clean, lean and fast!
 
  I do not see a better way than to use the A line only for 32bit instructions and keep the rest of the ISA 16 bits.
 
  I also said sane instructions because I do not think CAS, TAS and MOVEP require a 32bit variant.
 
 
Matt Hey wrote:
 
   
Marcel Verdaasdonk wrote:

      3.)We have the option to have 4 op or 3 op instruction depending on the coding requirement.(For the time being let's keep to 3 op)
   

   
    How could 4 op work? More practical would be 3 op internally with 2 <ea> units. The prefix could provide 2 op to 3 op (e.g. AND.L <ea>,Dn -> AND.L <ea>,Rn,Dn) and 1 op to 2 op (e.g. NOT.L <ea> to NOT.L Rn,<ea>).
 

 
  I would not add 4 op now, like FMA4; I would just conserve space so that in the near future we could.(4 op: the DST is not read, only written)
  3 op is where we should focus our attention.
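
  A minimal sketch (not from the thread) of what the "2 op to 3 op" idea quoted above would buy. It assumes that the hypothetical form AND.L <ea>,Rn,Dn means Dn = <ea> AND Rn with both sources left unchanged; that semantic, the helper names and the register values are illustrative assumptions only.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a .L operation

def move_l(regs, src, dst):
    """MOVE.L src,dst: copy src into dst."""
    regs[dst] = regs[src] & MASK32

def and_l_2op(regs, src, dst):
    """Existing 2 op form, AND.L src,dst: dst = dst AND src (dst is overwritten)."""
    regs[dst] = regs[dst] & regs[src] & MASK32

def and_l_3op(regs, src, rn, dst):
    """Hypothetical 3 op form, AND.L src,Rn,dst: dst = src AND Rn (assumed semantics)."""
    regs[dst] = regs[src] & regs[rn] & MASK32

# 2 op: a MOVE.L is needed first so that d1 is not destroyed.
two = {"d0": 0xFF00FF00, "d1": 0x0FF00FF0, "d2": 0}
move_l(two, "d1", "d2")
and_l_2op(two, "d0", "d2")

# 3 op: one instruction, both sources preserved.
three = {"d0": 0xFF00FF00, "d1": 0x0FF00FF0, "d2": 0}
and_l_3op(three, "d0", "d1", "d2")
```

  Both sequences leave the same register contents; the 3 op form only saves the extra MOVE.L, which is the case where it could pay off.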
 
 
Matt Hey wrote:
 
   
Marcel Verdaasdonk wrote:

      4.)We should limit our operations to register only for obvious reasons; to some extent we could use immediate too if we don't want to pull some crazy stunts.(Dn, An, and Immediate can be used on the full instruction, this could be extended if deemed possible and required)

   
    Register only very much limits the benefits of a powerful 68k processor. I think we should allow at least 1 <ea> and possibly 2 if we go to the trouble. Most of the 16 bit instructions allow 1 <ea>. There are 3 common types of 2 op instructions:
   
    1) op <ea>,Dn or op <ea>,An
    2) op Dn,<ea>
    3) opi #<data>,<ea>
   
    I think we could leave opt #3 out as it is a bit restrictive and would provide duplicate functionality. The 2 remaining could be modified with a register to:
   
    1) op <ea>,Rn,Dn or op <ea>,Rn,An
    2) op Dn,Rn,<ea>
   
    or more aggressively with an <ea> to:
   
    1) op <ea>,<ea>,Dn or op <ea>,<ea>,An
    2) op Dn,<ea>,<ea>
   
    There is room to add a 2nd full <ea> in the prefix which could be more powerful while maintaining the instructions encoding length of 32 bits. Extension words would be possible for the extra <ea> though. The decoder would be simpler allowing only a register in the prefix and it would leave more free bits for options.
 

 
  Matt, there are reasons I add these restraints: they add minimal overhead for decoding.
  Besides that, they can always be relaxed at a later date.
  But for now we should focus on instructions that follow this rather simple rule set.
 
 
Matt Hey wrote:

   
Marcel Verdaasdonk wrote:

      5.)A-line instructions should not cause a fetch beyond what the second word requires.(first word should only contain Dn and An)
   

   
    The prefix and the common 16 bit instruction would probably be fetched at the same time. I expect they would be treated as one 32 bit instruction.
 

 
  Up until the line is decoded the decoder sees them as separate instructions, and treats them as such.
  This changes when the Decoder reads that it's on the A-line.
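
  As a rough sketch of that behaviour (my illustration, not Deep Sub's actual decoder): a 68k opcode word is on the A-line when its top four bits are 1010, so a decoder can look at each 16 bit word separately and only pair it with the following word once it sees that nibble. The function names are made up, 0xA003 is an invented prefix value, and extension words are ignored to keep it short.

```python
A_LINE = 0xA  # bits 15-12 of a 68k A-line opcode word

def is_a_line(word):
    """True if the 16 bit opcode word falls on the A-line."""
    return (word >> 12) & 0xF == A_LINE

def group_words(words):
    """Group a stream of 16 bit words into (prefix, opcode) pairs.

    Plain 16 bit instructions get prefix None. Extension words are not
    handled; every word is assumed to be an opcode word.
    """
    out = []
    i = 0
    while i < len(words):
        word = words[i]
        if is_a_line(word) and i + 1 < len(words):
            # A-line word: treat it and the next word as one 32 bit instruction.
            out.append((word, words[i + 1]))
            i += 2
        else:
            # Ordinary word (or a dangling A-line word at the end of the stream).
            out.append((None, word))
            i += 1
    return out

# 0xC480 is AND.L D0,D2 and 0x4E71 is NOP.
stream = [0xA003, 0xC480, 0x4E71]
pairs = group_words(stream)
```

  Here the first two words come out as one pair and the NOP stays a plain 16 bit instruction.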
   
 
Matt Hey wrote:

   
Marcel Verdaasdonk wrote:

      6.)Never give the second word a different meaning than it already has in the 'current' ISA.(Prevent conflicting instructions)
   

   
    The prefix would no longer be a prefix but rather a new instruction in that case.
 

 
  I want the second word to be a valid 16 bit instruction on its own.
  This reduces the overhead, and cost to the decoder.
  So yes the prefix remains a prefix even if it's part of the 32bit instruction.(Don't fix it if it ain't broke)
  Even worse the prefix should help us decode the first word, This is where i expect there is overhead.
 
 
Matt Hey wrote:
 
   
Marcel Verdaasdonk wrote:

      7.)Do not add operations that can be done with two ASM instructions requiring the same number of cycles; this is a waste of opcode space.

   
    It could still be useful when out of registers or when the condition codes are not wanted, for example. It would really be up to the programmer or compiler to choose the most advantageous instructions. It would be better to use 2 op when possible.
 

 
  I expect
  ADD.l d0, d1
  AND.l d0, d2
 
  is just as fast as a single instruction we would make to do the same.
  So for the 3 ops we should look for something special instead.
 
 
Matt Hey wrote:
 
   
Marcel Verdaasdonk wrote:

      8.)The first word should not affect the addressing modes of the second word.(This is very important for the decoder stage, and Deep Sub's current pipeline design)

   
    This goes along with your option #6.
   
 

 
  I was aware of that, but i cannot stress it enough.
  The second word affects the decoding of first not the other way around.
  Then later the first affects how the CPU executes the second word's instruction and could add a second instruction altogether.
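
  To make that ordering concrete, here is a small sketch under loud assumptions: the second word is decoded exactly as the existing 16 bit instruction it already is (only ADD.L Dn,Dn and AND.L Dn,Dn are covered), and only afterwards is the first word read for one extra register. The prefix layout used here (bit 3 selects An or Dn, bits 2-0 give the register number) is invented for illustration and is not a proposed encoding.

```python
OPS = {0xC: "AND", 0xD: "ADD"}  # 68k opcode lines for AND and ADD

def decode_base(opcode):
    """Decode the second word as a normal 16 bit instruction (tiny subset)."""
    line = (opcode >> 12) & 0xF
    opmode = (opcode >> 6) & 0x7   # 0b010 = long, <ea>,Dn
    ea_mode = (opcode >> 3) & 0x7  # 0b000 = data register direct
    if line not in OPS or opmode != 0b010 or ea_mode != 0b000:
        raise ValueError(f"outside this sketch's subset: {opcode:#06x}")
    return {
        "op": OPS[line] + ".L",
        "src": f"D{opcode & 0x7}",
        "dst": f"D{(opcode >> 9) & 0x7}",
    }

def decode_pair(prefix, opcode):
    """Decode the second word first, then let the A-line word add a register."""
    if (prefix >> 12) & 0xF != 0xA:
        raise ValueError("first word is not on the A-line")
    insn = decode_base(opcode)           # second word keeps its usual meaning
    bank = "A" if prefix & 0x8 else "D"  # hypothetical prefix field layout
    insn["extra"] = f"{bank}{prefix & 0x7}"
    return insn

plain = decode_base(0xD280)             # ADD.L D0,D1 on its own
paired = decode_pair(0xA003, 0xC480)    # AND.L D0,D2 plus extra register D3
```

  The second word decodes the same with or without the prefix, so the addressing modes of the second word are never touched by the first.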
 
  I must say this, Matt: I do not make rules, I make guidelines. ;)
  Some of the ones I mentioned above can be seen as silly, such as only allowing Dn and An in the first word, but this is only to improve decoding speed and keep overhead minimal.
 
  Using only the A-line for this might seem like a waste, and perhaps it is, but for now let's retain this fond idea and see where it takes us.
 
  Also, only using an instruction that is already in the ISA as the second word stresses the idea that it is an extension, which it IS!
 
  Not using 'free' space elsewhere in the ISA might be considered stupid at this moment but, IMHO, using those spaces would be a non-clean implementation of 32 bit instructions and cause more overhead than is required for those instructions. Just take a peek at what is in the F line, it's a mess!

Matt Hey
USA

Posts 735
18 Sep 2012 00:55


I have a new version of the 68kF PRM with some addressing mode information. The PDF document is much smaller and probably the best way to view now.

OpenOffice Writer
EXTERNAL LINK   
 
PDF
EXTERNAL LINK   
 
html
EXTERNAL LINK 

I didn't add an addressing mode with extended shifts. This would be slower, especially in an FPGA. There would be several ways to implement it depending on how many shifts were added and what is best for performance. We can always make changes. I hope everything is more understandable than the original 68000PRM in regards to addressing modes. One potential misunderstanding is whether <ea> is an address to memory and only applies to indirect addressing modes (true addressing modes) or whether it's the data used by the <ea>. Any suggestions or ideas are welcome as usual.


Marcel Verdaasdonk
Netherlands

Posts 3978
18 Sep 2012 16:04


*Sighs

So Instructions need to be long word aligned now?
