 |
Welcome to the Natami / Amiga Forum
This forum is for AMIGA fans interested in the new NATAMI platform.
Please read the forum usage manual.
|
Do you have ideas and feature wishes? Post them here and discuss your ideas. |
|
|---|
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 18 Apr 2011 16:49
| Amir A wrote:
| Well, I know Thomas is very busy, but can someone please ask him if this FE133 chip has MMU and FPU or not.
|
As you can see in the "FE" thread below, this version *does not* have an MMU or an FPU. It is an "EC" version even though it is not branded as such.
Greetings, Thomas
| |
Steve Thomas United Kingdom
| | Posts 178 18 Apr 2011 17:40
| So, if these chips do not have an MMU, what use are they on a NatAmi? Maybe the team should design an Amiga accelerator card to get rid of these chips, if a lot of them have already been bought.
| |
Christian Kummerow Germany
| | Posts 314 18 Apr 2011 18:17
| Steve Thomas wrote:
| So, if these chips do not have an MMU, what use are they on a NatAmi? Maybe the team should design an Amiga accelerator card to get rid of these chips, if a lot of them have already been bought.
|
And who would buy a turbo board without an MMU? Retro gamers are happy with the existing 030 turbo boards. People who develop won't buy one. That leaves gamers who play newer games and neither develop nor use software that requires an MMU/FPU. Not a good idea. How many years have we already waited for Natami? And then Thomas should spend a lot of time on a turbo board?
| |
Wojtek P Poland
| | Posts 1597 18 Apr 2011 18:28
| Gunnar von Boehn wrote:
| Hallo Thomas, B) We implement a direct level Bitmap using the same page concept as previously.
|
Option B with 1 MB pages and 4 options per page (2 bits):
- read-only
- read/write, no cache
- read/write, cache
- undefined

This takes one kilobyte of on-chip SRAM. When debugging an add-on for a system function, it will suffice to allocate data for the tested program only from given pages. A side effect: true shared chip/fast RAM with megabyte granularity. All you need is to flag "chipram" as no-cache.
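The one-kilobyte figure follows from covering a 32-bit address space with 1 MB pages at 2 bits each. A minimal sketch of such a table, assuming a full 4 GB address space, chip RAM in the lowest 2 MB, and a default of "read/write, cache" for every page. All names and the numeric encoding of the four options are hypothetical, not part of the proposal:

```python
# Sketch of a 2-bit-per-page attribute table (names and encoding are assumptions).
PAGE_SHIFT = 20                          # 1 MB pages
NUM_PAGES = 1 << (32 - PAGE_SHIFT)       # 4096 pages cover a 32-bit address space
BITS_PER_PAGE = 2
TABLE_BYTES = NUM_PAGES * BITS_PER_PAGE // 8   # 4096 * 2 / 8 = 1024 bytes

READ_ONLY, RW_NOCACHE, RW_CACHE, UNDEFINED = range(4)

# 0xAA is four 2-bit slots of RW_CACHE (binary 10), so every page starts cached.
table = bytearray([0xAA]) * TABLE_BYTES

def set_page_attr(addr, attr):
    """Store the 2-bit attribute for the page containing addr."""
    byte, slot = divmod(addr >> PAGE_SHIFT, 4)   # four pages per table byte
    shift = slot * BITS_PER_PAGE
    table[byte] = (table[byte] & ~(0b11 << shift) & 0xFF) | (attr << shift)

def get_page_attr(addr):
    """Return the 2-bit attribute for the page containing addr."""
    byte, slot = divmod(addr >> PAGE_SHIFT, 4)
    return (table[byte] >> (slot * BITS_PER_PAGE)) & 0b11

# Flag the assumed 2 MB of chip RAM as no-cache, one 1 MB page at a time.
for base in (0x0000_0000, 0x0010_0000):
    set_page_attr(base, RW_NOCACHE)

print(TABLE_BYTES, get_page_attr(0x001F_FFFF), get_page_attr(0x0020_0000))
```

The lookup is a shift and a mask, which is what makes a direct bitmap cheap. The cost is that one attribute always applies to a whole megabyte.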
| |
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 18 Apr 2011 19:20
| Wojtek P wrote:
| This takes one kilobyte of on-chip SRAM. When debugging an add-on for a system function, it will suffice to allocate data for the tested program only from given pages. A side effect: true shared chip/fast RAM with megabyte granularity. All you need is to flag "chipram" as no-cache.
|
This is way, way, way, way too coarse for any reasonable debugging. Even a simple MuForce would eat one MB of RAM at least, let alone that one could no longer flag the "first page" of chip RAM invalid since it already contains graphics at this point, thus making the whole tool pointless. Even the 4K pages of the 68040/060 are a bit on the coarse side for debugging, but at least half-way acceptable.
So long, Thomas
| |
Marcel Verdaasdonk Netherlands
| | Posts 3991 18 Apr 2011 19:49
| This would make the 68060 board an interesting test board for MMU development for a later revision of the 68K family.
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 18 Apr 2011 20:26
| Thomas Richter wrote:
| If we can, I would suggest a prioritized set of at least eight register tuples, with the following possibilities: |
8 pairs, I see.
| Thomas Richter wrote:
| caching-flags: (full,write-through,cache-inhibit) - I don't think we need "imprecise" unless you want to support various exception models. |
Pardon my ignorance, but why do we need those caching flags at all? I assume that an elegant design (which we want) should not need them.
| Thomas Richter wrote:
| source-area: start address (resolution at least 4K, ideally down to a byte if this is feasible) and length. target-area: physical address the logical range maps to. |
Why do we want to do this translation? For debugging we need to "protect" certain areas. Translation should not be needed for that. Doing translation is technically a completely different story from doing "protection". It would be nice if we could avoid this Pandora's box. Once you start with translation, the idea of ever adding nice and clean cache coherence between chipset and CPU becomes an order of magnitude more complex.
| Thomas Richter wrote:
| access-flags: (invalid,super-only,user and supervisor) |
Why do we want to differentiate between supervisor and user mode?
| Thomas Richter wrote:
| write-flags: (read-write, read-only) |
I see the point of this.
| Thomas Richter wrote:
| used-flags: (a n-bit counter counting the access for page statistics) |
A 100 MHz CPU can make 100,000,000 counts per second. This means even 32-bit counters will overflow quickly. What purpose would those counters have?
| Thomas Richter wrote:
| dirty-flag: (a flag indicating whether the range has been written to) |
What would you want this for?
Cheers, Gunnar
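The overflow concern about the used-flags counters can be put into numbers. A small sketch, assuming the worst case of one counted access per clock cycle at 100 MHz (real access rates would be lower). The function name is made up for illustration:

```python
CPU_HZ = 100_000_000   # 100 MHz, assumed upper bound of one count per cycle

def seconds_until_overflow(counter_bits, counts_per_second=CPU_HZ):
    """Time until an n-bit counter wraps at the given count rate."""
    return (1 << counter_bits) / counts_per_second

for bits in (8, 16, 32):
    print(f"{bits}-bit counter wraps after {seconds_until_overflow(bits):.6f} s")
```

Under that assumption a 32-bit counter wraps in roughly 43 seconds, so any software relying on the counters would have to read and reset them periodically, or the hardware would have to saturate instead of wrapping.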
| |
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 19 Apr 2011 07:54
| Gunnar von Boehn wrote:
|
Thomas Richter wrote:
| If we can, I would suggest a prioritized set of at least eight register tuples, with the following possibilities: |
8 pairs, I see.
|
I need to verify this on my system (once it is running again), but at least this is the order of magnitude. To be on the safe side, 16 would surely be sufficient to keep the system running without having to swap range descriptors "on the fly" in non-debugging cases.
| Gunnar von Boehn wrote:
| Thomas Richter wrote:
| caching-flags: (full,write-through,cache-inhibit) - I don't think we need "imprecise" unless you want to support various exception models. |
Pardon my ignorance, but why do we need those caching flags at all? I assume that an elegant design (which we want) should not need them.
|
It's up to you. We *either* need a bus snooper or caching flags. Take your pick. If you want to offer DMA into all memory - and yes, you want that - then we need to set the caching modes depending on where the DMA is coming from. Or we need a bus snooper.
| Gunnar von Boehn wrote:
| Thomas Richter wrote:
| source-area: start address (resolution at least 4K, ideally down to a byte if this is feasible) and length. target-area: physical address the logical range maps to. |
Why do we want to do this translation?
|
For virtual machines, obviously. For mapping RAM into the high-mem area. Haven't we had this before? Yes, it is a feature you need if the system is to overcome the limitations of AmigaOs at some point.
| Gunnar von Boehn wrote:
| For debugging we need to "protect" certain areas. Translation should not be needed for that. Doing translation is technically a completely different story from doing "protection". It would be nice if we could avoid this Pandora's box.
|
There is no "box". It is exactly clear what to do, and why to do it.
Gunnar von Boehn wrote:
| Once you start with translation, the idea of ever adding nice and clean cache coherence between chipset and CPU becomes an order of magnitude more complex.
|
Why, no? The interface for that is there. It is "CachePreDMA()" and "CachePostDMA()", and, according to the interface definition, you need to call them anyhow.
| Gunnar von Boehn wrote:
| Thomas Richter wrote:
| access-flags: (invalid,super-only,user and supervisor) |
Why do we want to differentiate between supervisor and user mode?
|
Because you probably need to prevent regular users in a virtual machine from accessing shared memory in the first place, but then need supervisor access to the resources.
| Gunnar von Boehn wrote:
| Thomas Richter wrote:
| used-flags: (a n-bit counter counting the access for page statistics) |
A 100 MHz CPU can make 100,000,000 counts per second. This means even 32-bit counters will overflow quickly. What purpose would those counters have?
|
Page statistics. A single "U" bit is useful for a single page already, but once you cannot cover the memory with eight regions in total - and you won't in debugging scenarios - you need to swap page or "region" descriptors as regions are accessed. Obviously, you want to replace only the least-used region descriptor to minimize the number of "invalid" accesses. For that, you need some kind of statistics information.
| Gunnar von Boehn wrote:
| Thomas Richter wrote:
| dirty-flag: (a flag indicating whether the range has been written to) |
What would you want this for?
|
To see whether the region has been written to? A typical example is the MuEVD video driver for ShapeShifter, which is required to emulate the Mac 24bpp/32bpp "chunky" mode. No, Natami doesn't have this mode (only a, sorry to say, close to useless planar-chunky mode that is very exotic and not available on any other architecture, and thus not suitable for emulating any kind of "standard" system you find on the market. What we have now nicely combines the drawbacks of planar and chunky. ;-) ). The MuEVD driver maps in a memory region which is used as a "screen buffer" for the virtual Mac, and refreshes only the part of the screen that was touched by the Mac, keeping the refresh rate high. For that, it needs to detect which screen regions were written to, which is exactly what the dirty bits deliver. Amongst other things.
Greetings, Thomas
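The descriptor fields discussed in this exchange can be pictured as a small software model. This is only a sketch under stated assumptions: the class, field, and function names are hypothetical, list order stands in for descriptor priority, the "invalid" access flag is modelled as "no descriptor matches", and the caching flags are left out.

```python
from dataclasses import dataclass

class AccessFault(Exception):
    """Raised where real hardware would signal an access error."""

@dataclass
class Region:
    start: int                 # logical start address of the source area
    length: int                # size of the range in bytes
    target: int                # physical address the logical range maps to
    writable: bool = True      # write-flags: read-write or read-only
    super_only: bool = False   # access-flags: supervisor-only or both modes
    used: int = 0              # used-flags: access counter for statistics
    dirty: bool = False        # dirty-flag: set once the range is written

def translate(regions, addr, write=False, supervisor=False):
    """Return the physical address; the first matching region wins."""
    for region in regions:
        if region.start <= addr < region.start + region.length:
            if region.super_only and not supervisor:
                raise AccessFault(f"user access to {addr:#x}")
            if write and not region.writable:
                raise AccessFault(f"write to read-only {addr:#x}")
            region.used += 1
            region.dirty = region.dirty or write
            return region.target + (addr - region.start)
    raise AccessFault(f"no descriptor covers {addr:#x}")

def least_used(regions):
    """Pick the descriptor to swap out when a new range is needed."""
    return min(regions, key=lambda region: region.used)

# Hypothetical layout: a 4 KB emulated screen buffer and read-only low memory.
screen = Region(start=0x4000_0000, length=0x1000, target=0x0020_0000)
low_mem = Region(start=0x0000_0000, length=0x0020_0000, target=0, writable=False)
regions = [screen, low_mem]

physical = translate(regions, 0x4000_0010, write=True)
print(hex(physical), screen.dirty, least_used(regions) is low_mem)
```

A driver in the style described for MuEVD would check and clear `dirty` on each screen-buffer region before deciding what to refresh, and a debugging tool running short of descriptors would evict whatever `least_used` returns.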
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 19 Apr 2011 08:34
| Thomas,
Regarding the descriptor pairs, maybe it makes sense to explain a few things.
A) Programmable descriptors are about 10 times more expensive than fixed, hardcoded ones. This means that if we have certain "fixed" behaviours in the AMIGA, like:
- Chipmem always has cache behaviour XYY,
- CIA and DFF registers always have cache behaviour ABC,
- ROM always has cache behaviour DEF,
- and all Fast mem has a default cache behaviour GHI,

then it is optimal to hardcode those. Could you specify which regions you think we need, and with which behaviour?
Regarding the address translation: I'm not sure I understood the goal of this. For running AMIGA OS we will not need it, right? What do we need it for? Can you give a real example?
Regarding caching:
> Why, no? The interface for that is there. It is "CachePreDMA()"
> and "CachePostDMA()", and, according to the interface definition,
> you need to call them anyhow.
It's very simple: if you want to DMA into Fastmem (and you want this!) and you also want to support MMU address translation on the CPU side, then you need an extra, new DMA-MMU to translate all your Blitter or DMA transfer addresses.
Regarding 32bit: that I like 3x8 for certain games or applications because of simple performance advantages does NOT mean that we cannot display normal 32bpp. We both agreed before that displaying 32bpp is needed for compatibility with existing SDL or CyberGFX software.
| |
Jakob Eriksson Sweden
| | (Moderator) Posts 1097 19 Apr 2011 09:40
| Thomas, do you want to run Linux on the Natami, or a future AROS with memory protection? Is that why you want an MMU? Just an idea, could one of the simple MMUs be implemented, like the 68451 or the SUN 3 proprietary MMUs? These had good support in Linux.
| |
Thomas Richter Germany
| | (MX-Board Owner) Posts 1425 19 Apr 2011 13:00
| Gunnar von Boehn wrote:
| Thomas, Regarding the descriptor pairs. Maybe it makes sense to explain a few things. A) Programmable descriptors are about 10 times more expensive than fixed hardcoded ones.
|
And 100 times more flexible? (-: One way or another, you will have to face the challenge of enabling DMA into fast RAM sooner or later.
| Gunnar von Boehn wrote:
| This means if we have certain "fixed" behaviours in the AMIGA like: - Chipmem always has cache behaviour XYY, - CIA and DFF registers always have cache behaviour ABC, - ROM always has cache behaviour DEF, - and all Fast mem has a default cache behaviour GHI, then it is optimal to hardcode those.
|
Chipmem and fastmem separation is clearly not optimal - it's an anachronism that limits the flexibility of the system. For chips, I agree, but that's only a small part of the story.
| Gunnar von Boehn wrote:
| Could you specify which regions you think we need and which behaviour.
|
Of course not. (-: MuGA protects "unallocated" memory regions. I don't know what is "unallocated" in advance, of course. MuEVD requires the Mac screen buffer to be mapped. I don't know where this will end up; it is a matter of the memory allocation of the system. MuLink will protect the code region of programs. I don't know where these will end up. MuLoadModules will protect RAM modules that replace ROM modules. I don't know where they will be loaded to. See, it doesn't work this way - exec has, simple as it is, a memory allocator that allocates memory wherever it finds free memory, so no, I don't know where the memory will be.
| Gunnar von Boehn wrote:
| Regarding the address translations I'm not sure if I understood the goal of this. For running AMIGA OS we will not need this right? For what do we need this? Can you give a real example.
|
We will need this in the future. See, AmigaOs has a lot of limitations due to, uhm, "design errors", and if you want to resolve them, hopefully sooner rather than later, you need to confine it to a virtual machine. Sooner or later.
| Gunnar von Boehn wrote:
| That I like 3x8 for certain games or application because of simple performance advantages does NOT mean that we can not display normal 32bpp. We both agreed on this before that for compatibility to existing SDL or CyberGFX stuff displaying 32bpp is needed.
|
So what's the advantage of 8x3 again? I don't see any. In fact, you still need to blit thrice, and you cannot use the existing line drawer or graphics routines anyhow. (-;
Greetings, Thomas
| |
Gunnar von Boehn Germany
| | (Moderator) Posts 5775 19 Apr 2011 13:16
| Thomas Richter wrote:
| Gunnar von Boehn wrote:
| Thomas, Regarding the descriptor pairs. Maybe it makes sense to explain a few things. A) Programmable descriptors are about 10 times more expensive than fixed hardcoded ones. |
And 100 times more flexible? (-: One way or another, you will have to face the challenge of enabling DMA into fast RAM sooner or later. |
Easy - it's not my intention to argue about this. I wanted to explain the circumstances. My point is that fixed comparators have a cost of N, but a freely programmable comparator has a cost of about 10xN. This means that 10 freely programmable conditions have a HW cost of about 100. If you know that you always have 8 fixed conditions, then you can implement them for a cost of about 8. This means that for a cost of 100 you can implement 8 fixed conditions AND 9 freely programmable conditions. This is a better deal. As our FPGA is reprogrammable, we can change those fixed conditions as we please anyhow, so fixing them is not really a drawback for us.
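The trade-off above is plain arithmetic and can be checked quickly. A sketch using the rough 1:10 cost ratio from the post, with N normalised to 1 (the constant and function names are made up for illustration):

```python
FIXED_COST = 1           # cost N of one hard-wired comparator (normalised)
PROGRAMMABLE_COST = 10   # a freely programmable comparator costs about 10 x N

def total_cost(fixed, programmable):
    """Approximate hardware cost of a mix of comparators, in units of N."""
    return fixed * FIXED_COST + programmable * PROGRAMMABLE_COST

all_programmable = total_cost(fixed=0, programmable=10)
mixed = total_cost(fixed=8, programmable=9)
print(all_programmable, mixed)
```

With these figures, 8 fixed plus 9 programmable conditions come to 98 units, just under the 100 units that 10 programmable conditions alone would cost.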
| |
Wojtek P Poland
| | Posts 1597 20 Apr 2011 21:58
| Thomas Richter wrote:
|
Wojtek P wrote:
| This takes one kilobyte of on-chip SRAM. When debugging an add-on for a system function, it will suffice to allocate data for the tested program only from given pages. A side effect: true shared chip/fast RAM with megabyte granularity. All you need is to flag "chipram" as no-cache. |
This is way, way, way, way too coarse for any reasonable debugging. Even a simple MuForce would eat one MB of RAM at least, let alone that one could no longer flag the "first page" of chip RAM invalid since it already contains graphics at this point, thus making the whole tool pointless. Even the 4K of the 68040/060 are a bit on the coarse side for debug.
|
You do not have to be efficient while debugging, and this feature would solve more than just debugging: you could specify precisely which address ranges are cacheable. This way, not only could main memory be divided however you like, but every expansion card could have cacheable and uncacheable parts as well.
| |
Marcel Verdaasdonk Netherlands
| | Posts 3991 20 Apr 2011 22:50
| This reminds me that we talked about this in the past: a write-disable and read-disable bitmap. A dirty bit is usually needed in something like flash ROM; I don't see the need for it in RAM. I would prefer to use a bit for allocation, meaning that a page has been allocated to a program (it doesn't define which one; this should be handled by the OS). Personally, I am of the opinion that if we go for something like a bitmap, we should try to keep it to a maximum of 4 bits per page, with pages of 4 KB or 8 KB as, IIRC, in the 68K MMU line.
| |
Wojtek P Poland
| | Posts 1597 21 Apr 2011 12:09
| Jakob Eriksson wrote:
| Thomas, do you want to run Linux on the Natami, or a future AROS with memory protection? Is that why you want an MMU? Just an idea, could one of the simple MMUs be implemented, like the 68451 or the SUN 3 proprietary MMUs? These had good support in Linux.
|
Use a PC for running *nix: much cheaper, much faster. Even better, connect to an existing Unix server using any remote protocol like telnet, rsh, ssh, X11, vnc...
| |
Jakob Eriksson Sweden
| | (Moderator) Posts 1097 21 Apr 2011 13:18
| Wojtek, Debian is bringing m68k back to supported status. I think it's interesting.
| |