 |
Welcome to the Natami / Amiga Forum. This forum is for AMIGA fans interested in the new NATAMI platform.
|
Do you have ideas and feature wishes? Post them here and discuss your ideas. |
|
|---|
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 08 Mar 2011 08:00
| Gunnar von Boehn wrote:
| Hi Matt, Thanks for your examples. Let me try to understand them. Example 1:
    moveq #%01010101,d0 ;clears Z flag in condition codes
    btst d0,#%11010111 ;should set Z flag
    bne .skip
    bchg.b #1,$bfe001 ;Toggle the power LED if Z flag set
.skip: rts
According to the "normal" definition of the BTST it should test bit 7 of the immediate value. As Bit 7 is zero, the Z flag should be set. Does this happen?
|
Gunnar, *NO*. Not again. *Sigh*. The code is *not* a
    btst #%11010111,d0
but a
    btst d0,#%11010111
which is a different instruction. The btst instruction uses the bit number encoded in the source (left operand, d0) and tests whether this bit is set in the target EA, which is now immediate. That is, Z is set if and only if d0 modulo 8 equals 3 or 5. Would you be less confused if the program looked like this?
    btst d0,Data(PC)
with
    Data: dc.b %11010111
See? That's doing exactly the same. Nothing super-hidden, super-cool, super-special. Again, btst is documented (and so is this variant) in the MC68K Programmers Reference manual, available online here: EXTERNAL LINK on page 166. Plain old stupid btst.
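The claim that Z is set only for bit numbers 3 and 5 can be checked with a small Python model. This is only a sketch of the documented behaviour for a byte-sized operand, not emulator code; the function name `z_after_btst` is made up for illustration.

```python
def z_after_btst(bit_number: int, operand_byte: int) -> bool:
    """Model of BTST with a memory/immediate operand: Z is set when the tested bit is 0."""
    bit = bit_number % 8                      # byte operand: bit number is taken modulo 8
    return ((operand_byte >> bit) & 1) == 0

IMM = 0b11010111                              # the immediate operand from the example
D0 = 0b01010101                               # value loaded by moveq; 85 % 8 == 5

# Bit numbers 0..7 for which the instruction would leave Z set
z_set_for = [n for n in range(8) if z_after_btst(n, IMM)]
print(z_set_for)
print(z_after_btst(D0, IMM))
```

With d0 = %01010101 the tested bit is bit 5, which is clear in the immediate, so Z is set, bne falls through and the LED toggle executes.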
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 08 Mar 2011 08:04
| Thomas Richter wrote:
| Gunnar von Boehn wrote:
| Hi Matt, Thanks for your examples. Let me try to understand them. Example 1:
    moveq #%01010101,d0 ;clears Z flag in condition codes
    btst d0,#%11010111 ;should set Z flag
    bne .skip
    bchg.b #1,$bfe001 ;Toggle the power LED if Z flag set
.skip: rts
According to the "normal" definition of the BTST it should test bit 7 of the immediate value. As Bit 7 is zero, the Z flag should be set. Does this happen? |
Gunnar, *NO*. Not again. *Sigh*. The code is *not* a btst #%11010111,d0 but a btst d0,#%11010111 which is a different instruction. The btst instruction uses the bit number encoded in the source (left operand, d0) and tests whether this bit is set in the target EA, which is now immediate. That is, Z is set if and only if d0 modulo 8 equals 3 or 5. Would you be less confused if the program looked like this? btst d0,Data(PC) with Data: dc.b %11010111 See? That's doing exactly the same. Nothing super-hidden, super-cool, super-special. Again, btst is documented (and so is this variant) in the MC68K Programmers Reference manual, available online here: EXTERNAL LINK on page 166. Plain old stupid btst. |
Thomas, yes, that makes sense. I had no coffee when I wrote this, this morning *yawn*. So this is the current marking of the code, right?
    moveq #%01010101,d0 ;clears Z flag in condition codes
    btst d0,#%11010111 ;should set Z flag
    bne .skip
    bchg.b #1,$bfe001 ;Toggle the power LED if Z flag set
.skip: rts
If this instruction behaves like this then everything is fine, as this is exactly how the 68050 works.
| |
Megol .
| | Posts 671 08 Mar 2011 08:06
| Gunnar von Boehn wrote:
| Hi Matt, Thanks for your examples. Let me try to understand them. Example 1:
    moveq #%01010101,d0 ;clears Z flag in condition codes
    btst d0,#%11010111 ;should set Z flag
    bne .skip
    bchg.b #1,$bfe001 ;Toggle the power LED if Z flag set
.skip: rts
According to the "normal" definition of the BTST it should test bit 7 of the immediate value. As Bit 7 is zero, the Z flag should be set. Does this happen?
|
The definition in the manual says that BTST D0,IMM is 8 bits wide (as the destination isn't a register):
"When a data register is the destination, any of the 32 bits can be specified by a modulo 32-bit number. When a memory location is the destination, the operation is a byte operation, and the bit number is modulo 8"
The bit number counts from the LSb as zero, i.e. (1<<(D0 & 0x7)):
"In all cases, bit zero refers to the least significant bit"
It sets the Z flag if the tested bit is zero, i.e. CC_Z=!((1<<(D0 & 0x7))&IMM):
"Z - set if the bit tested is zero; cleared otherwise"
I don't see where the confusion comes from?
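The two modulo rules quoted above can be put side by side in a few lines of Python. This is an illustrative sketch, assuming only what the quoted manual text says; `btst_sets_z` and its `dest_is_data_register` parameter are hypothetical names.

```python
def btst_sets_z(bit_number: int, operand: int, dest_is_data_register: bool) -> bool:
    """Return True when BTST would set Z, following the quoted manual text."""
    width = 32 if dest_is_data_register else 8   # long for Dn, byte for memory/immediate
    bit = bit_number % width                     # bit number wraps at the operand width
    return ((operand >> bit) & 1) == 0           # Z set when the tested bit is zero

# Bit number 13 selects bit 13 of a data register ...
print(btst_sets_z(13, 0x00002000, dest_is_data_register=True))
# ... but bit 13 % 8 == 5 of a byte operand
print(btst_sets_z(13, 0x20, dest_is_data_register=False))
print(btst_sets_z(13, 0xDF, dest_is_data_register=False))
```

The same bit number therefore selects different bits depending on whether the destination is a data register or a byte operand.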
| |
Matt Hey USA
| | Posts 726 08 Mar 2011 14:07
| @all Yes, as ThoR explains, the btst dn,#byte is only a variation of the BTST instruction. It is still useful and can test for a short mask of sorts without destroying a register but still is not quite like the x86 TEST. The 68000PRM does describe it I suppose. It's easily overlooked because there is an immediate form of the instruction already (static form with immediate) but an immediate is specified as allowed with the dynamic form in the destination also. This formatting is inconsistent with other 68k instructions. It is not difficult to support this variation in assemblers, disassemblers, processors, etc. but it is easy to miss too. If anyone still does not understand then they can play with the tests I created. Test1 and Test2 should both blink the power LED when run from cli.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 08 Mar 2011 16:59
| Regarding LEA to DN: How about this crazy proposal? We have one bit free in the Full Extension Word (bit 3). How about we use this bit to disable the memory fetch? Then you could do this:
    ADD.L !1234(A0,D0*2),D2
This instruction would calculate the EA and NOT do the memory fetch, but instead pass the address of the EA into the ALU, which would then calculate with it. Crazy enough? Flexible enough to create a new world of hacks?
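To make the difference concrete, here is a rough Python model of the proposal. It is a sketch under simplifying assumptions: memory is a dictionary of longwords keyed by address, registers are plain integers, and `suppress_fetch` stands in for the proposed extension-word bit. None of these names come from the actual core.

```python
MASK32 = 0xFFFFFFFF

def effective_address(disp: int, base: int, index: int, scale: int) -> int:
    """EA of disp(An,Dn*scale), truncated to 32 bits."""
    return (disp + base + index * scale) & MASK32

def add_l(memory: dict, disp: int, base: int, index: int, scale: int,
          dn: int, suppress_fetch: bool = False) -> int:
    """ADD.L <ea>,Dn; with suppress_fetch the EA itself becomes the ALU operand."""
    ea = effective_address(disp, base, index, scale)
    operand = ea if suppress_fetch else memory.get(ea, 0)
    return (dn + operand) & MASK32

a0, d0, d2 = 0x1000, 4, 1
memory = {effective_address(1234, a0, d0, 2): 99}    # one longword stored at the EA

normal = add_l(memory, 1234, a0, d0, 2, d2)                          # ADD.L 1234(A0,D0*2),D2
proposed = add_l(memory, 1234, a0, d0, 2, d2, suppress_fetch=True)   # ADD.L !1234(A0,D0*2),D2
print(normal, proposed)
```

In the normal form D2 receives 1 + 99; in the proposed form it receives 1 + (1234 + A0 + D0*2), i.e. a displacement add, a shifted index add and the ALU add in one instruction.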
| |
Megol .
| | Posts 671 08 Mar 2011 19:17
| Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks?
|
I could make use of it (no Wojtek, not for translating x86 code). However if one could change the address generator to allow two data registers for this special case it would be even more useful. I guess that it would be difficult though.
| |
Thierry Atheist Canada
| | Posts 1828 08 Mar 2011 21:22
| Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks?
|
Well, not that I understand any of this, but, if it's considered that no one is using bit 3 for some other purposes in their own personal coding schemes, the 680x0 architecture is abandoned, so you can ascribe any function you wish to "undefined segments set aside for future use". Effectively, TEAM NATAMI is now designing the rest of 680x0 technology for however long it's possible to advance!!!!!
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 09 Mar 2011 07:40
| Megol . wrote:
|
Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks? |
I could make use of it (no Wojtek, not for translating x86 code). However if one could change the address generator to allow two data registers for this special case it would be even more useful. I guess that it would be difficult though.
|
Well .... It's clear that making two data registers available would be nice from a programmer's point of view, but we need to explain this clearly. The ALU is several clocks below the EA unit in the pipeline. This means that if you use a DATA register in the EA, this DATA register must NOT have been written to by the ALU during the preceding several clocks, OR you will create a slow bubble in the pipeline. This means code like this:
    ADDQ #1,D0
    MOVE (A0,D0),D2
will run VERY slowly, as several clocks of bubbles will be forced between the two instructions.
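A toy model of this stall, for illustration only. It assumes one instruction issues per clock and a hypothetical ALU-to-EA distance of 3 clocks (the post only says "several"); `bubble_cycles` and its parameters are invented names, not part of any real design.

```python
def bubble_cycles(instructions_between: int, alu_to_ea_latency: int = 3) -> int:
    """Stall cycles before an EA calculation may read a register the ALU just wrote.

    instructions_between: independent instructions scheduled between the ALU
    write (e.g. ADDQ #1,D0) and the EA use (e.g. MOVE (A0,D0),D2).
    alu_to_ea_latency: assumed pipeline distance in clocks (hypothetical value).
    """
    return max(0, alu_to_ea_latency - instructions_between)

for gap in range(5):
    print(gap, bubble_cycles(gap))
```

Under these assumptions the back-to-back pair pays the full latency, while each independent instruction placed in between hides one clock of it.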
| |
Jakob Eriksson Sweden
| | (Moderator) Posts 1097 09 Mar 2011 08:05
| Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks?
|
I like it a lot. Maybe this idea can even be extended over time, to allow explicit control over cache as well as register? Just thinking...
| |
Matt Hey USA
| | Posts 726 09 Mar 2011 10:58
| Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks? |
I think this is a very innovative and powerful idea. I would definitely use it. It makes EA units more versatile and allows for more parallel operations. I do have a couple of suggestions though. I would not use bit 3 of the Full Extension Word. I would use bits 0-2 to create a new entry in the I/IS. It would use %100 representing "Direct (Addressing) Mode". This is how other addressing modes are represented and this is a new addressing mode. I would leave bit 3 for either base register update or larger shifts. This would allow the new "Direct Mode" to use whichever of these enhancements is decided upon. I would also recommend a different notation for the mode. I would suggest...
    add.l {1234,a0,d0*2},d2
The {} is generally (and mathematically) used to describe a set or group and that is what this is. On the 68k, () is used to describe indirect addressing which this is NOT. The ! is possibly misleading as well. I am open to hearing other possible notations as well.
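The two encodings under discussion can be compared with a small Python decoder. The field layout below follows the 68020 full-format extension word as described in the M68000 Family Programmer's Reference Manual (worth double-checking against the manual); the "direct mode" readings of bit 3 and of I/IS = %100 are only the proposals from this thread, and the function name is hypothetical.

```python
def decode_full_extension_word(word: int) -> dict:
    """Split a 16-bit full-format extension word into its fields."""
    if not word & 0x0100:
        raise ValueError("bit 8 must be 1 for the full format")
    return {
        "index_is_address_reg": bool(word & 0x8000),  # bit 15: D/A
        "index_reg": (word >> 12) & 0x7,              # bits 14-12
        "index_long": bool(word & 0x0800),            # bit 11: W/L
        "scale": 1 << ((word >> 9) & 0x3),            # bits 10-9: 1, 2, 4 or 8
        "base_suppress": bool(word & 0x0080),         # bit 7: BS
        "index_suppress": bool(word & 0x0040),        # bit 6: IS
        "bd_size": (word >> 4) & 0x3,                 # bits 5-4
        "bit3": (word >> 3) & 0x1,                    # bit 3: unused today
        "i_is": word & 0x7,                           # bits 2-0: I/IS
    }

plain = decode_full_extension_word(0x0B20)      # (bd.w,An,D0.L*2), no memory indirection
gunnar = decode_full_extension_word(0x0B28)     # same, with the proposed bit 3 set
matt = decode_full_extension_word(0x0B24)       # same, with the proposed I/IS = %100
print(plain["scale"], gunnar["bit3"], matt["i_is"])
```

Either variant fits in the existing word; the difference is whether bit 3 stays free for a later extension such as base register update.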
| |
Wojtek P Poland
| | Posts 1597 09 Mar 2011 11:06
| Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks?
|
VERY smart idea. The way it is written in assembler could be different, but that is a matter of the assembler program, not of the CPU design. This gives 2 additions and a small shift for FREE; being able to do e.g. 3 additions in one instruction IS USEFUL and very powerful. As it can be encoded, it is great.
| |
Wojtek P Poland
| | Posts 1597 09 Mar 2011 11:08
| Matt Hey wrote:
|
Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks? |
I think this is a very innovative and powerful idea. I would definitely use it.
|
Perfect example of PROPER instruction set and CPU design. Reusing hardware that IS THERE - giving more power.
| |
Megol .
| | Posts 671 09 Mar 2011 11:54
| Gunnar von Boehn wrote:
|
Megol . wrote:
| Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks? |
I could make use of it (no Wojtek, not for translating x86 code). However if one could change the address generator to allow two data registers for this special case it would be even more useful. I guess that it would be difficult though. |
Well .... It's clear that making two data registers available would be nice from a programmer's point of view, but we need to explain this clearly. The ALU is several clocks below the EA unit in the pipeline. This means that if you use a DATA register in the EA, this DATA register must NOT have been written to by the ALU during the preceding several clocks, OR you will create a slow bubble in the pipeline. This means code like this: ADDQ #1,D0 MOVE (A0,D0),D2 will run VERY slowly, as several clocks of bubbles will be forced between the two instructions.
|
I know that. However, in properly scheduled code (using e.g. software pipelining) it would still be useful. In theory a future 68k derivative could move the address generation stage for this special case (in what's now the cache access stage) to lessen this impact. If this is done, some extension word bits could be redefined for something more useful for this case. I like this proposal a lot BTW. :)
| |
Cesare Di Mauro Italy
| | Posts 526 10 Mar 2011 05:07
| Matt Hey wrote:
|
Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks? |
I think this is a very innovative and powerful idea. I would definitely use it. It makes EA units more versatile and allows for more parallel operations. I do have a couple of suggestions though. I would not use bit 3 of the Full Extension Word. I would use bits 0-2 to create a new entry in the I/IS. It would use %100 representing "Direct (Addressing) Mode". This is how other addressing modes are represented and this is a new addressing mode. I would leave bit 3 for either base register update or larger shifts. This would allow the new "Direct Mode" to use which ever of these enhancements is decided upon. I would also recommend a different notation for the mode. I would suggest... add.l {1234,a0,d0*2},d2 The {} is generally (and mathematically) used to describe a set or group and that is what this is. On the 68k, () is used to describe indirect addressing which this is NOT. The ! is possibly misleading as well. I am open to hearing other possible notations as well.
|
I fully agree with you. Updating the base (address) register is a more general and important feature. A new pattern in I/IS is a better way to introduce Gunnar's proposal (which is useful too!).
| |
Phil "meynaf" G. France
| | (Natami Team) Posts 393 10 Mar 2011 09:23
| Gunnar von Boehn wrote:
| Regarding LEA to DN How about this crazy proposal: We have one Bit free in the Full Extension Word (Bit3). How about we use this bit to disable the memory fetch. Then you could do this: ADD.L !1234(A0,D0*2),D2 This instruction would calculate the EA and NOT do the memory fetch but pass the address of the EA instead into the ALU - which would use it and calculate on it. Crazy enough? Flexible enough to create a new world of hacks?
|
And what do you intend to do in the cases of LEA, JMP, JSR that end up using this encoding ?
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 10 Mar 2011 09:41
| Phil G. wrote:
| And what do you intend to do in the cases of LEA, JMP, JSR that end up using this encoding ? |
But LEA and friends always work in this mode already. This mode would basically allow enabling the behavior of LEA on other instructions. But frankly, whether this mode is useful should be investigated beforehand. How useful do we think this mode is in real life? Can a compiler make proper use of it?
| |
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 10 Mar 2011 09:49
| Gunnar von Boehn wrote:
|
Phil G. wrote:
| And what do you intend to do in the cases of LEA, JMP, JSR that end up using this encoding ? |
But LEA and friends do always work in this mode already.
|
Yes, but that doesn't answer the question. (-; So, what do you do? Is this bit then a no-op? Or an unimplemented instruction?
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 10 Mar 2011 09:57
| Thomas Richter wrote:
| Gunnar von Boehn wrote:
| Phil G. wrote:
| And what do you intend to do in the cases of LEA, JMP, JSR that end up using this encoding ? |
But LEA and friends do always work in this mode already. |
Yes, but that doesn't answer the question. (-; So, what do you do? Is this bit then a no-op? Or an unimplemented instruction? |
Well, this can depend on the implementation. Currently LEA sets a "suppress-memory-access-flag". The proposed hack could most simply be done by setting the same flag. This makes adding this mode quite simple. The question for me is DOES IT MAKE SENSE? Can someone see a real benefit of this mode? We could then do stuff like this:
    AND.L {A0 + A1*2},D0
    MULS.L {12345678 + A0 + A1*4},D1
Is this just "Cool" or really useful?
| |
Phil "meynaf" G. France
| | (Natami Team) Posts 393 10 Mar 2011 10:25
| Thomas Richter wrote:
| Yes, but that doesn't answer the question. (-; So, what do you do? Is this bit then a no-op? Or an unimplemented instruction? |
It would be Just Another Wasted Bit Of Encoding ;-)
Gunnar von Boehn wrote:
| Well, this can depend on the implementation. Currently LEA sets a "suppress-memory-access-flag". The proposed hack could most simply be done by setting the same flag. This makes adding this mode quite simple. The question for me is DOES IT MAKE SENSE? Can someone see a real benefit of this mode? We could then do stuff like this: AND.L {A0 + A1*2},D0 MULS.L {12345678 + A0 + A1*4},D1 Is this just "Cool" or really useful?
|
Does it make sense? Is it cool or really useful? Well, to answer these, the ones who proposed them should write a whole routine, for a real-life case, showing how it's a saviour - and then, of course, post it here :) For me, neither version seems useful, but I may be wrong.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 10 Mar 2011 10:26
| We have some ideas floating around regarding the EA unit .... Maybe together we can summarize them and find out which of them are the most useful ones?
1) Allow access to address registers from the ALU. This allows stuff like: MUL A0,D0 or ANDI #121231,A0. The benefit of this mode is clear. It gives the CPU more registers to work with as data registers, and it makes the CPU easier to use in some ways as the registers are more flexible.
2) Use the above encoding to instead add 8 more DATA registers. This would give 16 data registers + 8 address registers. This can be encoded without instruction growth. This option would separate AN and DN more cleanly but increase the number of data registers, which many will find useful.
3) Use an unused bit encoding of the FULL EXTENSION WORD to enable address register update. This would enable this, e.g.: MOVE 1234(A0,D0*2)!,D0 -- would store the EA in A0. This would save certain memory operations the extra instruction needed to update the pointer. As the CPU has this path already, this is basically free to add.
4) Use an unused bit encoding of the FULL EXTENSION WORD to allow passing the EA to the ALU without doing a memory access. This would allow combining the result of a LEA with any ALU operation, e.g.: OR {A0,D0},D2. This is like a special-case optimization. The drawback of this mode is the latency dependency between the register banks, as their updates are done in different pipeline stages. This might make it difficult to support in a compiler.
5) Another proposed option would be adding more ADDRESS registers. This could relatively simply be done by using PREFIX words. Without using PREFIX words this is tricky. By using bit 3 in the FULL EXTENSION WORD someone could add a cheat mode which allows doubling the registers. This could be combined with the option to add 8 more data registers ...
6) PREFIX WORDS. We can also use PREFIX WORDS to make a single 68K instruction do more work. We partly do this already by FUSING certain combinations. E.g. right now we do:
    MOVEQ #5,D2
    ADD.L D1,D2
This fuses two operations into one more powerful operation. This could be enhanced by doing even more complex operations. This would make the code denser and enhance our possibilities to use all of the powerful 070 ALU per clock. A very "strong" instruction is:
    ADDM (An)+,(Am)+
This instruction does 2 memory loads, 1 memory update, 2 EA calculations, 2 address register updates and 1 ALU operation. The 070 design is tweaked to be able to do this in 1 single cycle. Therefore we have to have all units needed for this. By strengthening more instructions to do this much work we can improve the power per clock of our system.
I think we all see that there are some ideas and potential for strengthening the CORE.... Which benefits do you see?
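As an illustration of idea 3, here is a rough Python model of a load that also writes the computed EA back to the base register. It is a sketch only: the register file and memory are plain dictionaries, operand size is ignored, and the helper name and the `!` semantics are taken from the description above rather than from any implemented core.

```python
MASK32 = 0xFFFFFFFF

def move_with_base_update(memory: dict, regs: dict, disp: int,
                          base: str, index: str, scale: int, dest: str) -> None:
    """Model of MOVE disp(base,index*scale)!,dest: load from the EA, then base <- EA."""
    ea = (disp + regs[base] + regs[index] * scale) & MASK32   # EA uses the old register values
    value = memory.get(ea, 0)                                 # ordinary memory load
    regs[base] = ea                                           # proposed base register update
    regs[dest] = value                                        # normal destination write

regs = {"a0": 0x2000, "d0": 3}
memory = {1234 + 0x2000 + 3 * 2: 0xCAFE}

move_with_base_update(memory, regs, 1234, "a0", "d0", 2, "d0")   # MOVE 1234(A0,D0*2)!,D0
print(hex(regs["a0"]), hex(regs["d0"]))
```

Without the update bit, the same effect would need a separate LEA or ADDA to advance the pointer, which is the extra instruction the proposal aims to save.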
| |