Natami / Amiga Forum. This forum is for AMIGA fans interested in the new NATAMI platform.
| Thor Vs. Writeable (PC) Address Modes
Team Chaos Leader USA
| | (Moderator) Posts 2094 01 Jul 2011 23:13
| Thor, you had convinced me to be against writeable (PC) modes, so I tried to argue your reasons against them (as best I could remember them) with Phil, but I was immediately in over my head; all my arguments were blown out of the sky and I looked like an idiot :) In my defense, Phil cheated by using Logic against me. :D So I am turning this discussion over to you, Thor. If you are still opposed to such modes, this is the place to say so, or just give a link to an old thread where you already made your case. We are trying to identify all bad ideas so we can nuke them. Here is the discussion in progress, copied and pasted: Team Chaos Leader wrote:
| The trouble is that the mere existence of this proposed new mode bans us forever from adding "write-protected code hunks" into the OS. |
Phil "meynaf" G. wrote:
| No. Not at all. It doesn't ban us from anything. This proposed new mode is just a shortcut for something that can already be done otherwise. There is no conceptual difference between:
|         lea     rel(pc),a0
|         move.w  something,(a0)
| and:
|         move.w  something,rel(pc)
| ... apart from the latter being shorter, sparing a reg, and being free to add in HW.
| Team Chaos Leader wrote:
| The current method requires "manually" activating the write-protection on each exe. Wouldn't it be kewl if AmigaOS could automatically write-protect each code hunk? It's awesome for helping find bugs and prevent crashes from runaway programs. |
Phil "meynaf" G. wrote:
| Oh yeah, good. Now half of the programs will just crash. Kewl indeed :D You haven't disassembled too much code in your life, have you ? Code hunk protection is a good feature, but it must remain an optional one or you're running straight into the wall. Many old programs will simply crash if their code hunks get protected.
| Team Chaos Leader wrote:
| I assume you have never used Thor's "write-protected code hunks" ability before so you don't realize how gr00vy it is :) |
Phil "meynaf" G. wrote:
| I often put my data right after the code to access it via d(pc) modes - this spares a reg. And guess what, I also sometimes write to it. |
| |
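For reference, the two forms Phil compares above can be set side by side. This is only a sketch: `store_counter` and `counter` are hypothetical names, and the commented-out line is the proposed extension, which a stock 68k assembler rejects because PC-relative modes are not alterable and so cannot be used as a destination.

```asm
; Sketch only: `store_counter` and `counter` are hypothetical names.
store_counter:
        ; Existing way: take the address PC-relative, then write through An.
        lea     counter(pc),a0
        move.w  d0,(a0)

        ; Proposed shortcut (commented out: not a legal destination
        ; addressing mode on the stock 68k, so current assemblers reject it).
;       move.w  d0,counter(pc)
        rts

counter:
        dc.w    0               ; data placed right after the code
```

Both forms write to the same word inside the code hunk, so either one faults if that hunk is write-protected.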
Team Chaos Leader USA
| | (Moderator) Posts 2094 01 Jul 2011 23:22
| Phil G. wrote:
| You haven't disassembled too much code in your life, have you ? |
On the Amiga, I have not disassembled much. I have had Resource for countless years but never installed it. Mainly, the only disassembly I have done is using SASC OMD to disassemble my C code to look for bugs or stupidity in the C compiler. On the C64 I used to disassemble a lot of code and I could read asm in the raw C64 gfx charset. Ahhh the good ol' days :)
| |
Matt Hey USA
| | Posts 727 01 Jul 2011 23:59
| @TCL I think both of you make good points. I'm in favor of both allowing PC relative writes and enhancing the support for write protected hunks. One does not preclude the other. Both can exist at the same time even if both are not useful at the same time. Allowing PC relative writes should be free so there is no trade off. I would encourage write protecting hunks that are read only while documenting PC relative writes with a warning about self modifying code.
| |
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 03 Jul 2011 12:59
| Memory write protection and PC-relative write modes are indeed independent features, though the concepts are closely related due to the Harvard design of the 68K. This implies that a 68K program *should* consist of a constant "text" segment containing the binary opcodes and constant strings and data, all of which is addressed relative to the PC, and a modifiable data section addressed with absolute addressing or relative to an address register. The 68K does not enforce the model, in so far as you may cheat by loading the PC into an address register and then writing to the region, though the available addressing modes imply that this is not a recommended operation. Enforcement of the Harvard model can be done either by using the function codes to select the RAM hardware to address, or by the MMU write-protecting the text hunk. On the side of AmigaOS, the model is supported by having both code hunks and data hunks (which also include BSS hunks), and again, while supported, the model is not actively enforced. Code hunks are not write protected (a 68K couldn't do that anyhow), and, in fact, a binary starting with a data hunk would be run by Dos from the first hunk, even if this first hunk is not indicated as containing code. But the absence of enforcement is rather typical of AmigaOS throughout, i.e. the OS neither prohibits you from writing to an arbitrary memory cell, nor does it enforce that you release all the memory when the program dies. It is rather a matter of programming discipline not to do such things, and by the same token, I would discourage the use of d(PC) for writing data. Nothing enforces this, and with a simple additional "move" or "lea", the restriction can be circumvented. But it is nevertheless in contradiction to the 68K design. Data relative to the PC is text data, and thus constant.
| |
Phil "meynaf" G. France
| | (Natami Team) Posts 393 04 Jul 2011 10:05
| There are so many programs writing inside code sections - they're not writing code, but data intermixed with code - that write-protecting code hunks must remain optional. Since Deep Sub Micron said that it is easier (in HW) to allow writes to d(pc) than to trap them, why not allow them ? This is my point. PC is an extra register to point to data, when you're out of other An regs. And, ThoR, you're constantly out of An regs, aren't you ? Here's a free extra reg for you ;)
| |
Deep Sub Micron Germany
| | (MX-Board Owner) Posts 566 04 Jul 2011 11:19
| I think PC relative writes are only useful in very special cases. This is because such code will most likely not be reentrant. Not being reentrant means it is not allowed in shared code, such as libraries. Because such problems are often hard to reproduce, I think it is a coding style everyone should try to avoid. I think it is the job of an MMU to protect the code section, rather than the instruction set. We just won't have an MMU in n050 in the foreseeable future.
| |
Phil "meynaf" G. France
| | (Natami Team) Posts 393 04 Jul 2011 14:30
| Reentrance has a price you sometimes just don't want to pay. You'll be using the stack too much and risk an overflow. Better to have a handful of easy-to-access variables. A game, a demo, a utility (unless you really want to make it resident) do not need to be reentrant. But that's not really the point for me. While I may agree that writes to d(pc) wouldn't be a great loss if not kept, I would strongly disagree with spending resources to trap the case. If at the end DSM says : you can write to d(pc) but I discourage doing it, then fine with me :D
| |
Deep Sub Micron Germany
| | (MX-Board Owner) Posts 566 04 Jul 2011 16:03
| Phil "meynaf" G. wrote:
| Reentrance has a price you sometimes just don't want to pay. You'll be using the stack too much and risk an overflow. Better to have a handful of easy-to-access variables.
|
The price is as little as just not using global variables. But the variables don't all have to be on the stack instead. They can also live in a memory area you refer to by a single pointer on the stack or passed in a register.
Phil "meynaf" G. wrote:
| A game, a demo, a utility (unless you really want to make it resident) do not need to be reentrant.
|
Exactly. As I said, for a special case and a single task (not shared in some way) it might be OK.
Phil "meynaf" G. wrote:
| If at the end DSM says : you can write to d(pc) but I discourage doing it, then fine with me :D
|
Yes.
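The work-area approach DSM describes might be sketched as follows. The structure offsets and the routine name are hypothetical; the caller is assumed to pass a pointer to its own private block in a0, so nothing is written relative to the PC.

```asm
; Sketch only: offsets and names below are hypothetical.
wk_count  equ     0               ; long: number of calls
wk_total  equ     4               ; long: running sum
wk_SIZEOF equ     8

; in:  a0 = pointer to the caller's private work area (wk_SIZEOF bytes)
;      d0 = value to accumulate
accumulate:
        add.l   d0,wk_total(a0)   ; state lives in the caller's block,
        addq.l  #1,wk_count(a0)   ; not in a global next to the code
        rts
```

Each caller supplies its own block (allocated, or reserved on its stack), so two tasks can run the routine at the same time without sharing state. The cost is the address register holding the pointer, which is exactly the register Phil wants to save.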
| |
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 04 Jul 2011 16:48
| Phil "meynaf" G. wrote:
| There are so many programs writing inside code sections - they're not writing code, but data intermixed with code - that write-protecting code hunks must remain optional.
|
Could you give examples? The programs generated by compilers, SAS or Aztec C, do not write to the code section. In fact, when instructed to create re-entrant programs, the compilers keep the code section intact and only make a copy of the data section each time the program is called - as part of the startup code. Same goes for my assembler programs: CODE is CODE.
Phil "meynaf" G. wrote:
| Since Deep Sub Micron said that it is easier (in HW) to allow writes to d(pc) than to trap them, why not allow them ?
|
There are many things that could be allowed, but aren't, even though they might be remotely useful in some situations. For example, why do add and sub not operate on the PC? Instructions like "add d0,pc" are missing. That they could exist still doesn't mean they are suitable; there is "jmp d(PC,d0)" instead. Ditto for PC-relative writes. Just because they *might* be allowed and useful in some situations, it still doesn't mean that they *should* be allowed. d(PC) is a constant EA and thus operations that potentially modify it are intentionally missing in the instruction set. In the same vein, there is no multiplication with address registers, even though this *might* be potentially useful in some cases. The reason is, again, the use case of address registers: They hold pointers, not numbers, and as such multiplication has no meaning. Similarly, d(PC) is constant data and not supposed to be modified.
Phil "meynaf" G. wrote:
| This is my point. PC is an extra register to point on data, when you're out of other An regs. And, ThoR, you're constantly out of An regs, aren't you ? Here's a free reg extra for you ;)
|
Not really, because you cannot freely make the PC point to the data I need it to point to. IOW, it doesn't really help in the cases where I run out of address registers either. What I need is a generic pointer, and only address registers satisfy this need. How would you implement a strcpy with PC as pointer to the destination? (-: If I need modifiable data, I wouldn't use anything relative to the PC (not re-entrant), but would use something relative to SP (use the stack as overflow registers). That is, d(a7) is useful as destination (and source, of course). d(PC) is not. Greetings, Thomas
| |
Team Chaos Leader USA
| | (Moderator) Posts 2094 04 Jul 2011 17:05
| Phil "meynaf" G. wrote:
| Reentrance has a price you sometimes just don't want to pay. You'll be using the stack too much and risk an overflow.
|
This is not a problem. Just double your stacksize. See, I can use Logic too :) There is no such thing as "using the stack too much". The more the stack is used, the more benefits are accrued and the more stable the Amiga becomes. I just realized there is, in fact, no need for writeable (PC) modes! Phil tricked me by saying it freed a register. But all programmers already have the SP aka A7 register to point at variables. > But I want one more address register. Oh well, too bad. :) Either live with the Address registers provided or agree to my plan for supplying as many Address registers as we want. I find it silly to use a trick to get 1 effective extra Address reg when we can just add 24 more real address regs in a compatible manner. FPGA Power FTW!
| |
Rune Stensland Norway
| | (MX-Board Owner) Posts 871 04 Jul 2011 18:05
| By enabling PC relative writes we free a register that can be used to speed up inner loops. The DATA memory chunk can easily be separated from the code chunk by inserting a CNOP so that the data cache lines won't be shared with the instruction cache. Like this:
| Code:
|         move.l  d0,.label(PC)
|         (...)
| .loop   move.l  ......
|         (...)
|         dbf     d0,.loop
|         rts
|         CNOP    0,32
| .label: blk.b   16              ;this cacheline will only be mapped in the datacache
| |
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 04 Jul 2011 19:04
| Rune Stensland wrote:
| By enabling PC relative writes we free a register that can be used to speed up inner loops. The DATA memory chunk can easily be separated from the code chunk by inserting a CNOP so that the data cache lines won't be shared with the instruction cache. Like this:
| Code:
|         move.l  d0,.label(PC)
|         (...)
| .loop   move.l  ......
|         (...)
|         dbf     d0,.loop
|         rts
|         CNOP    0,32
| .label: blk.b   16              ;this cacheline will only be mapped in the datacache
|
Bad code. First, it is not re-entrant; second, it causes a conflict between code and data cache. You can replace the code above with
        move.l  d0,offset(a7)
which performs the same, is re-entrant, and doesn't require any modification of the 68K opcode set. Just the same, only nicer and more orthogonal. Greetings, Thomas
| |
Rune Stensland Norway
| | (MX-Board Owner) Posts 871 04 Jul 2011 19:48
| On the 060 it will not cause a conflict, since the CNOP 0,32 ensures that the data is aligned to a 32-byte boundary, so the instruction cache line will not touch the data cache line. But for future(?) CPUs with bigger cache lines it might cause problems. It depends on the cache implementation. The 060 Natami CPU card performs best with copyback mode turned off and 32-bit burst writes (to SRAM). A move.l d0,offset(a7) is nice, but it will require a sub.l #xxx,a7 to reserve space. In non-system-friendly code we can use a7 as a normal register and the PC as a stack. (Demo coding) PC relative writes are also perfect for generating fast self-modifying code ;)
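The stack-relative store under discussion, together with the reservation it needs, might look like the sketch below. The routine name and the 8-byte frame size are hypothetical, chosen only to show the extra instructions involved.

```asm
; Sketch only: `routine` and the 8-byte frame are hypothetical.
routine:
        subq.l  #8,a7           ; reserve 8 bytes of scratch space
        move.l  d0,4(a7)        ; the "move.l d0,offset(a7)" store
        ; ... work that needs d0 for something else ...
        move.l  4(a7),d0        ; reload the saved value
        addq.l  #8,a7           ; release the scratch space
        rts
```

The reservation costs one instruction on entry and one on exit, and any push inside the routine shifts the offsets, which is why compilers often use a frame pointer (link/unlk) instead.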
| |
Team Chaos Leader USA
| | (Moderator) Posts 2094 04 Jul 2011 19:56
| Rune Stensland wrote:
| In Non-system friendly code we can use a7 as a normal register and the PC as a stack. (Demo coding)
|
Using A7 as a normal register means disabling multitasking. And I am thinking it means disabling interrupts too. Yes?
| |
Rune Stensland Norway
| | (MX-Board Owner) Posts 871 04 Jul 2011 20:02
| Team Chaos Leader wrote:
| Using A7 as a normal register means disabling multitasking. And I am thinking it means disabling interrupts too. Yes?
|
Yes. Demos are like useless utilities. They run, they play, and they exit. But every cycle is squeezed out of the 20 year old CPU. Interrupts are not needed, we have the copper. :)))
| |
Team Chaos Leader USA
| | (Moderator) Posts 2094 05 Jul 2011 17:42
| Are you seriously saying that there are demos that disable interrupts for long periods of time and use A7 as a general purpose register?
| |
Deep Sub Micron Germany
| | (MX-Board Owner) Posts 566 05 Jul 2011 17:48
| Maybe these demos use the fact that there are two A7 registers: one for the supervisor stack and one for the user stack. So interrupts are still possible.
| |
Rune Stensland Norway
| | (MX-Board Owner) Posts 871 05 Jul 2011 18:53
| I used it in 1998. The Texture+Gouraud routine (Landscape) in this demo optimized for 030 EXTERNAL LINK
| Here is the beginning of the renderer:
| SBZ_RENDER:
|         moveq.l #0,d1
|         move.l  sbz_objekt,a0
|         move.l  #optsurfaces,SBZ_RENDER\.linjefaces
|         lea     sbz_txture,a6
|         move.l  a7,.stack       ;A7 is ready to be used by the masters of assembly
|         move.l  #visuallist,.visual
|         bra.w   .start
|         cnop    0,8
| .poly   addq.l  #4,a4
|         move.l  a4,-(a5)
|         lea     (a3,d3.w*8),a3
|         movem.l (a3)+,a1/a2/a6/a7
|         (....)
| Here is the inner loop (per pixel). It uses 10 registers.
| .indre  move.w  d0,d5
|         move.w  a4,d2
|         move.b  d1,d5
|         add.l   a3,a4
|         move.b  (a6,d5.l),d2
|         add.l   d6,d0
|         move.b  (a6,d2.w),(a1)+
|         addx.l  d4,d1
|         bcs.b   .indre
The outer loop of the polygon filler is mostly in registers. Only 4 cached memory reads per scanline. The inner loop is 5 060 cycles per pixel if the texture and the shade table are in the cache. A bumpmap inner loop with zbuffer will need more than 16 registers, since the routine will need to interpolate the zbuffer, the bumpmap, etc.
| |
Team Chaos Leader USA
| | (Moderator) Posts 2094 06 Jul 2011 04:49
| Wow, that's crazy man! :) I have always written multitasking code so using A7 as a regular register feels alien to me.
| |
Phil "meynaf" G. France
| | (Natami Team) Posts 393 07 Jul 2011 10:13
| Thomas Richter wrote:
| Could you give examples?
|
Disassemble asm games at random, and you'll see many examples. It may even be a lot more frequent in demos. The fact that neither most compilers nor your own code do something doesn't mean it never happens. As an example, the last game I've put my hands into (Death or Glory) did that. It is also used a lot in various sound players.
Thomas Richter wrote:
| The programs generated by compilers, SAS or Aztec C, do not write to the code section. In fact, when instructed to create re-entrant programs, the compilers keep the code-section intact and only make a copy of the data section each time the program is called - as part of the startup code. Same goes for my assembler programs: CODE is CODE.
|
This is right : compilers do not write to the code section. Instead, they prefer wasting a register to hold a pointer to their BSS section (or, worse, they generate heaps of relocs). Guess what : have all your data at the end of your code section, and use PC instead of that base register. You've freed a register. Code is code but it may be followed by data. You can put everything in a single section, by using code_bss hunks (supported since v37 IIRC). Now I'm not saying this is something you *should* do ;-)
Thomas Richter wrote:
| There are many things that could be allowed, but aren't, even though they might be remotely useful in some situations. For example, why do add and sub not operate on the PC? Instructions like "add d0,pc" are missing. That they could exist still doesn't mean they are suitable; there is "jmp d(PC,d0)" instead. Ditto for PC-relative writes. Just because they *might* be allowed and useful in some situations, it still doesn't mean that they *should* be allowed. d(PC) is a constant EA and thus operations that potentially modify it are intentionally missing in the instruction set.
|
I'm pretty sure some RISC cpus do allow adding to the PC because it's just another register there (ARM perhaps ?). Better to allow something not very useful than to waste valuable resources on trapping it - if it's harmless, and we all know it is in that case.
Thomas Richter wrote:
| In the same vein, there is no multiplication with address registers, even though this *might* be potentially useful in some cases. The reason is, again, the use case of address registers: They hold pointers, not numbers, and as such multiplication has no meaning. Similarly, d(PC) is constant data and not supposed to be modified.
|
Address registers can hold offsets, not only addresses, hence multiplication on them isn't totally meaningless - like, say, multiplying by the sizeof of a struct. Besides, it's not infrequent to make them hold data - when out of data regs. In a DCT for example, you have lots of data to handle, and lots of muls to perform - there it would make sense.
Thomas Richter wrote:
| Not really, because you cannot freely make the PC point to the data I need it to point to. IOW, it doesn't really help in the cases where I run out of address registers either. What I need is a generic pointer, and only address registers satisfy this need.
|
But what you'll get to use is an address register that gets freed by use of the PC. Anywhere you're using global variables, you probably have a basereg (usually A4 or A5). Put these variables at the end of your code section, and access them via d(PC). Now you have A4 or A5 free for whatever use you want. Aren't you using A4 as bss pointer, A5 as frame pointer, A6 as library pointer all the time ?
Thomas Richter wrote:
| How would you implement a strcpy with PC as pointer to the destination? (-:
|
Like this :
label   sub.l   #label+1,d0     ; d0 is dest addr
.loop   addq.l  #1,d0
        move.b  (a0)+,label(pc,d0.l)
        bne.s   .loop
... but there is little need of An regs for strcpy so I don't see it as a good example.
Thomas Richter wrote:
| If I need modifiable data, I wouldn't use anything relative to the PC (not re-entrant), but would use something relative to SP (use the stack as overflow registers). That is, d(a7) is useful as destination (and source, of course). d(PC) is not.
|
But using d(pc) has freed A4 from being your BSS pointer, so you can use A4... unless, of course, you insist on having everything reentrant (a mistake IMHO) and then you pay the price :-)
Team Chaos Leader wrote:
| This is not a problem. Just double your stacksize. See, I can use Logic too :)
|
Once the damage is done, it's too late to double the size. See, you won't win against me at that game :D
Team Chaos Leader wrote:
| There is no such thing as "using the stack too much". The more the stack is used, the more benefits are accrued and the more stable the Amiga becomes.
|
Actually it's the opposite. The more some code uses the stack, the less stable it becomes. Stack is Evil. Don't use it :p The stack has a big problem : you never know exactly how much space you have and may inadvertently use too much. Or, if you decide to go for a *large* stack, you'll waste memory. Either way, the more you use the stack, the more vulnerable to buffer overflow attacks you'll be. The stack is for small amounts of temporary things, not large arrays. Oh, and you may wonder why the fine library you've just written crashes when opened on other machines, while it works fine on yours ? Oh, yes, sorry. They don't run the patch to raise Ramlib's stack size, and the library in question uses quite a lot of it. Too bad for you :-D
Team Chaos Leader wrote:
| I just realized there is, in fact, no need for writeable (PC) modes! Phil tricked me by saying it freed a register. But all programmers already have the SP aka A7 register to point at variables.
|
There is no absolute need, but using LEs to trap it would be a great mistake.
Team Chaos Leader wrote:
| > But I want one more address register. Oh well, too bad. :) Either live with the Address registers provided or agree to my plan for supplying as many Address registers as we want.
|
You have one more address register. Use PC as BSS pointer and a code_bss section. A7 is for temporaries, PC for permanents.
Team Chaos Leader wrote:
| I find it silly to use a trick to get 1 effective extra Address reg when we can just add 24 more real address regs in a compatible manner.
|
Problem is : there is no encoding space to add more real address regs. They'll always have a drawback, even if it's just code size (not to mention the ugly encoding).
| |