
Welcome to the Natami / Amiga Forum

This forum is for AMIGA fans interested in the new NATAMI platform.



Do you have questions about the Natami?
Post it here and we will answer it!

68060 + 68050/70 N Core
Thierry Atheist
Canada

Posts 1830
03 Sep 2010 07:18


Maybe software should designate which sections of code can, and which can NOT, run separately from the main body of the program on another CPU. The programmer would set a flag for each section, and the compiler would act on it in every case. Who knows, maybe there are times when he does not want a section worked on by a separate core?

Loïc Dupuy
France

Posts 253
03 Sep 2010 07:27


Marcel Verdaasdonk wrote:

And Loïc is indeed overthinking this; just let the compiler do the scheduling.

 
:-D
But that's my point: if we use a static scheduler (at compile time, for example), we can only make use of one extra core; otherwise the others will regularly be underused.

I can see the phenomenon on my dual-core Atom: under heavy load, I get about 230% of one normal core's usage. Only when I launch make -j4 can I reach 400%.
And there, a task can be run by any core, because tasks are not assigned to one core, whereas the sub-core model here ties a job to a sub-core.

Under normal load, or with a simple job list, we will see worse than this kind of degradation.

So to have more than one sub-core, we need a kind of SMP microkernel to manage them if we want optimal use of resources. Otherwise they will be underused (except by demo makers).
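
To make the static-versus-dynamic argument concrete, here is a small simulation sketch. Everything in it is an illustrative assumption rather than a measurement from real hardware: the job durations are made up, and the function names static_makespan and dynamic_makespan are invented for this example. The first pins each job to a sub-core in round-robin order before anything runs; the second hands each job to whichever core becomes idle first.

```python
import heapq

def static_makespan(durations, cores):
    """Jobs are pinned to cores round-robin before anything runs."""
    load = [0] * cores
    for index, duration in enumerate(durations):
        load[index % cores] += duration
    return max(load)            # the busiest core decides the total time

def dynamic_makespan(durations, cores):
    """Each job goes to whichever core becomes free first."""
    free_at = [0] * cores       # min-heap of times at which cores go idle
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + duration)
    return max(free_at)

# Hypothetical job list with uneven run times (arbitrary time units).
jobs = [8, 1, 8, 1, 8, 1, 8, 1]
print(static_makespan(jobs, 2), dynamic_makespan(jobs, 2))
```

With this made-up job list, the static assignment on two cores needs 32 time units because all the long jobs land on the same core, while the dynamic one needs 18. How large the gap is in practice depends entirely on the real job mix.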

I Immortal
Netherlands

Posts 67
03 Sep 2010 07:42


If a library or something like that handles the cores, you can just address it when you need to offload work. The lib would handle and queue workloads offered by different programs. This way, if extra or specialised cores are added, only the lib needs to be updated, and software using the lib would benefit from the new hardware. This will probably burn some extra cycles, but most likely no one will notice this in OS mode.
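
A minimal sketch of what such a library interface could look like, using Python's standard thread pool as a stand-in for the extra cores. The class name OffloadLibrary, its offload method and the checksum job are hypothetical names chosen for illustration; nothing like this exists for the Natami as far as this thread says.

```python
from concurrent.futures import ThreadPoolExecutor

class OffloadLibrary:
    """Hypothetical stand-in for the proposed library: programs hand over
    work, and the library alone decides which worker runs it."""

    def __init__(self, worker_count):
        # Only this line would change if more workers were added later.
        self._pool = ThreadPoolExecutor(max_workers=worker_count)

    def offload(self, func, *args):
        """Queue a job and return a handle to collect the result later."""
        return self._pool.submit(func, *args)

    def close(self):
        self._pool.shutdown(wait=True)

def checksum(data):
    """Example workload: a trivial 8-bit checksum."""
    return sum(data) % 256

lib = OffloadLibrary(worker_count=2)
pending = [lib.offload(checksum, bytes(range(n))) for n in (10, 20, 30)]
results = [job.result() for job in pending]   # blocks until each job is done
lib.close()
```

The calling program never names a core, which is the point of the proposal: the queueing policy and the number of workers stay hidden behind the library.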

Thomas Richter
Germany
(MX-Board Owner)
Posts 1425
03 Sep 2010 08:11


I was more thinking about a library that implements some kind of "pthread" type support for additional cores, offering *real* semaphores, mutexes and primitives like non-locking stacks and queues. This way, programs already using multiple cores could be easily ported and it would be made easy to use such cores.

Exec can, of course, only schedule one CPU, and threads would not be able to call Os functions. But well, a start is a start.

Greetings,
Thomas


Marcel Verdaasdonk
Netherlands

Posts 3979
03 Sep 2010 12:23


Loïc Dupuy wrote:

Marcel Verdaasdonk wrote:

  And Loïc is indeed overthinking this; just let the compiler do the scheduling.
 

   
  :-D
  But that's my point: if we use a static scheduler (at compile time, for example), we can only make use of one extra core; otherwise the others will regularly be underused.

  I can see the phenomenon on my dual-core Atom: under heavy load, I get about 230% of one normal core's usage. Only when I launch make -j4 can I reach 400%.
  And there, a task can be run by any core, because tasks are not assigned to one core, whereas the sub-core model here ties a job to a sub-core.

  Under normal load, or with a simple job list, we will see worse than this kind of degradation.

  So to have more than one sub-core, we need a kind of SMP microkernel to manage them if we want optimal use of resources. Otherwise they will be underused (except by demo makers).

Loïc, the Natami would initially only have identical hardware, so I don't have a problem with doing it statically at this time.

Loïc Dupuy
France

Posts 253
03 Sep 2010 12:31


Thomas Richter wrote:

I was more thinking about a library that implements some kind of "pthread" type support for additional cores,
Exec can, of course, only schedule one CPU, and threads would not be able to call Os functions. But well, a start is a start.

It means that we have to assign a memory range to the sub-core; otherwise we have to protect OS calls. But how do you restrict this access if, in assembly, you can directly call the address of an OS function?

Question for data passing:
- We launch a program that "malloc"s the needed memory, the data is loaded into this memory area, and the pthread works "in place" in it. But how do we enforce that the sub-core does not call a forbidden address, and how do we prevent access to registers, ...? By limiting sub-cores to fast RAM? And even then, if the OS is partially in fast RAM, can you still hit it?
- Some copy mechanism between 2 disjoint memory areas?

Parallelism and offload
For the moment, we have ideas for offloading batch processing from the main core. But all the parallelization has to be done by hand.
We no longer have deterministic timing: if I cut my trivially parallel job into N parts (N = number of sub-cores), then when there are not many processes I can assume that each part will be executed in parallel, but under heavy load it can be slower than running the whole job on one sub-core.

My point is that I cannot see full exploitation of resources with this model with more than one core in general computing usage.
Of course, for specific tasks we can always hand-tune to fully use the N cores, but these tasks are not the general case; they are more the exception.

How do we arbitrate between concurrent memory accesses by several (sub-)cores, consistency, register access, etc.?

Parallelism as an afterthought is tougher and less optimized than parallelism from the start (but AOS apparently does not let us take this path).

Finally, the proven PPC accelerator card method seems the best path for the moment: coders know the problems, and the model already exists and has been in production for several years, so we can draw on past experience here.

The question would be the same if we had N blitters instead of one for blitter jobs.

Claudio Wieland
Germany
(Natami Team)
Posts 706
03 Sep 2010 13:18


Coders will just have to think hard and choose adequate algorithms and approaches to avoid memory access conflicts and race conditions before starting to code anything. Since we are not a multi-billion dollar company with 100 dedicated engineers who spend *years* inventing silicon to support lazy programming paradigms and bad coders, I'm quite content with the suggested parallelism. Combined with a support library for intelligent distribution of jobs onto the cores by means of estimated run time and priority per job, this would work very nicely.
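
As a rough illustration of distributing jobs by estimated run time and priority, here is a simple greedy sketch. The policy itself is an assumption made for this example (highest priority first, longer jobs first within a priority, each job placed on the least-loaded core), and the function name distribute and the job names are invented.

```python
import heapq

def distribute(jobs, core_count):
    """jobs: list of (name, estimated_time, priority) tuples.
    Returns a dict mapping core number to the job names placed on it."""
    # Highest priority first; within one priority, longest estimate first.
    ordered = sorted(jobs, key=lambda job: (-job[2], -job[1]))
    loads = [(0, core) for core in range(core_count)]   # (load, core) min-heap
    heapq.heapify(loads)
    plan = {core: [] for core in range(core_count)}
    for name, estimate, _priority in ordered:
        load, core = heapq.heappop(loads)               # least-loaded core
        plan[core].append(name)
        heapq.heappush(loads, (load + estimate, core))
    return plan

# Hypothetical jobs: (name, estimated time units, priority).
jobs = [
    ("mix_audio", 4, 2),
    ("decode_jpeg", 6, 1),
    ("scale_bitmap", 3, 1),
    ("update_cursor", 1, 3),
]
plan = distribute(jobs, 2)
```

For this job list the two cores end up with 7 time units of estimated work each. The result is only as good as the estimates, so a real library would probably need to correct them at run time.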
 
Cheers

Zbigniew Stanislawiak
Poland

Posts 26
03 Sep 2010 13:43


Heh, heh. Put a PowerPC processor core in the FPGA, like Xilinx did in the Virtex-II Pro series, and start it with WarpUp. The AmigaOS 4.x and MorphOS crowd will love you. We could play Shogo, Freespace, Heretic II, Quake 2, etc.

Marcel Verdaasdonk
Netherlands

Posts 3979
03 Sep 2010 15:09


Now why would we want to do that?

Asaf Ayoub
United Kingdom

Posts 332
03 Sep 2010 16:08



Don't forget that the way Commodore wanted to add multiprocessing was through DOS.


The A5000 will incorporate the new Motorola 68060 plus two 68EC040 processors. The '060' is clocked at over 35MHz and the two '040' will be clocked at 25MHz giving the A5000 a total speed at over 60MHz. The '060' will sit on a separate card in the CPU slot (as in the A4000) and both the '040s' will sit on the motherboard. The '040' CPUs have been designed to help the '060'. This will be most evident at times of heavy multi-tasking. As a result of this configuration the A5000 will have a new Kickstart installed.

The A5000 will have Kickstart/Workbench 4.0 (the present beta version has version 3.2). This Kickstart is required to control the three processors, earlier Kickstarts will not be able to access the '040' chips. This new Kickstart will not be released for the older machines although Kickstart/Workbench 4.1 will be released as a modified option.

This new DOS will enable the '040' CPUs to be assigned to different tasks and as shipped one will handle all screen and sound processing and the other will handle all of the I/O devices. This Kickstart is a 1 Mb chip and will be shipped on the hard drive (to be confirmed). If it is released in chip form then the chip will be placed on its own card. This Kickstart will have a user-selectable Kickstart screen so the user can select which Kickstart to load (either in slot or on hard drive) and the A5000 has been tested with Kickstart 1.2 upwards so there will be no more compatibility problems.

Commodore have done it again in changing the chipset as there are several new chips. The A5000, with Workbench 4.0, is now capable of operating in all modes with a 512 colour palette. To maintain the speed required to operate in this mode one of the '040' CPUs can be assigned to the screen display. The maximum screen resolution is 4096 x 4096 with over 32 million colours. This new chipset will be able to detect which chipset it should use (original, ECS, super-ECS or AGA and super AGA) by detecting which Kickstart is currently running or which is selected at a cold boot.

As the new chipset has a higher resolution and more colours more Chip RAM is required and Commodore have responded by having 16Mb of Chip RAM on the motherboard (expandable to 64Mb) and 16Mb of Fast RAM (theoretically expandable to 1024Mb, tested to 256Mb). The Chip and Fast RAM have been organised on a 32-bit wide structure as in the A3000 + A4000.

-Amiga Mart Magazine, July 1993, Page 10-11.

Bartek "Banter" K.
Poland
(Natami Team)
Posts 2277
03 Sep 2010 16:14


So, it was possible back in the '90s! Awesome!

Thank you for that info, Asaf!

Cheers

Richard Maudsley
United Kingdom

Posts 821
03 Sep 2010 16:18


Sounds like bollocks to me:
 
  EXTERNAL LINK   
  Not the same barring the name, but it's also a work of fiction.

There was no replacement chipset anywhere near finished to my knowledge, and that article talks as if it's sitting there waiting for release. 16MB of Chip RAM fitted in 1993? I mean, come on, was it raining miracles that day?

Asaf Ayoub
United Kingdom

Posts 332
03 Sep 2010 17:06


We don't know if it's true; we don't know how many secret R&D projects Commodore was working on internationally.

Some hi-res scans of the mag:

EXTERNAL LINK  EXTERNAL LINK  EXTERNAL LINK 
I do know Commodore UK were doing really well with sales, and they only closed on 'orders' from US management.


Richard Maudsley
United Kingdom

Posts 821
03 Sep 2010 17:12


Commodore UK were still open after the Escom buyout and were bought by them a few months later. This is just a UK rag putting nonsense in its pages to get more sales (reminds me of the fake dual-screen PSPs in mags circa 2006). Hell, the cover image for this "world exclusive" is an Amiga 4000.

Megol .

Posts 680
03 Sep 2010 17:28


Asaf Ayoub wrote:

  We don't know if it's true; we don't know how many secret R&D projects Commodore was working on internationally.

  Some hi-res scans of the mag:

  EXTERNAL LINK  EXTERNAL LINK  EXTERNAL LINK 
  I do know Commodore UK were doing really well with sales, and they only closed on 'orders' from US management.

But we do know what projects were underway! Look up AAA and Hombre. That article is a complete fabrication: Commodore planned to use a PA-RISC processor, not a cluster of 68000-family chips.

Thomas Richter
Germany
(MX-Board Owner)
Posts 1425
03 Sep 2010 17:57


Loïc Dupuy wrote:

Thomas Richter wrote:

  I was more thinking about a library that implements some kind of "pthread" type support for additional cores,
  Exec can, of course, only schedule one CPU, and threads would not be able to call Os functions. But well, a start is a start.
 

 
  It means that we have to assign a memory range to the sub-core; otherwise we have to protect OS calls. But how do you restrict this access if, in assembly, you can directly call the address of an OS function?

You don't protect anything. It is - as was always the case for the Amiga - the responsibility of the programmer not to do something stupid. Just don't call them.

Loïc Dupuy wrote:

  Question for data passing:
  - We launch a program that "malloc"s the needed memory, the data is loaded into this memory area, and the pthread works "in place" in it. But how do we enforce that the sub-core does not call a forbidden address,

You don't.

Loïc Dupuy wrote:

how to prevent access to registers,

You don't.

Loïc Dupuy wrote:

  ..., by limiting sub-cores to fast ram ?

Why?

Loïc Dupuy wrote:

And even then, if the OS is partially in fast RAM, can you still hit it?

Sure. If you do, your fault. Same as it ever was.

Loïc Dupuy wrote:

  - Some copy mechanism between 2 disjoint memory areas?

Why? It's unified memory anyhow. There is nothing to copy.

Loïc Dupuy wrote:

  Parallelism and offload
  For the moment, we have ideas for offloading batch processing from the main core. But all the parallelization has to be done by hand.
  We no longer have deterministic timing: if I cut my trivially parallel job into N parts (N = number of sub-cores), then when there are not many processes I can assume that each part will be executed in parallel, but under heavy load it can be slower than running the whole job on one sub-core.

Yes, sure. It is - as it always is - the responsibility of the programmer to ensure suitable scheduling.

Loïc Dupuy wrote:

  My point is that I cannot see full exploitation of resources with this model with more than one core in general computing usage.
  Of course, for specific tasks we can always hand-tune to fully use the N cores, but these tasks are not the general case; they are more the exception.

Yes. And the point is? That this is not very useful? Well, I didn't start this nonsense. You cannot support SMP in AmigaOs.

Loïc Dupuy wrote:

How do we arbitrate between concurrent memory accesses by several (sub-)cores, consistency, register access, etc.?

You don't. It is the responsibility of the programmer to ensure that cores are synchronized. This is what semaphores and mutexes are good for - not any different under Linux or Windows.
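
For readers less familiar with the primitive, a minimal sketch of mutex-based synchronization follows. It uses Python threads as a stand-in for cores, and the function name run_workers is invented for this example; it says nothing about how such a primitive would be implemented on the Natami.

```python
import threading

def run_workers(worker_count, increments):
    """Several workers update one shared counter; the mutex makes each
    read-modify-write step exclusive so no update is lost."""
    total = 0
    lock = threading.Lock()

    def work():
        nonlocal total
        for _ in range(increments):
            with lock:          # only one worker at a time gets past here
                total += 1

    threads = [threading.Thread(target=work) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total

print(run_workers(4, 10_000))
```

The library provides the lock, but placing it around every shared access remains the programmer's job, which is the point being made above.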

Loïc Dupuy wrote:

  Parallelism as an afterthought is tougher and less optimized than parallelism from the start (but AOS apparently does not let us take this path).

  Finally, the proven PPC accelerator card method seems the best path for the moment: coders know the problems, and the model already exists and has been in production for several years, so we can draw on past experience here.

The only experience I took from this is: Do not use a PPC in an Amiga - it is not very useful.
 
Loïc Dupuy wrote:

  The question would be the same if we had N blitters instead of one for blitter jobs.

Not quite. The blitter has no "operating system" it depends on. Patching gfx to support multiple blitters is possible. But for exec, it is more than patching exec. The whole construction with Forbid/Permit locking is outright unsuitable for SMP. You don't construct an Os like this if you could envision that more than one CPU core might become relevant. Instead, you use the right synchronization primitives to begin with. In AmigaOs, semaphores are second-class citizens, and signals are first-class. That's just outright wrong.

So long,
Thomas


Thomas Richter
Germany
(MX-Board Owner)
Posts 1425
03 Sep 2010 18:02


Asaf Ayoub wrote:

  Don't forget that the way Commodore wanted to add multiprocessing was through DOS.
 
 

...
 

 
  -Amiga Mart Magazine, July 1993, Page 10-11.


Nothing but wet fantasies of a marketing guy. CBM had a couple of prototypes for the triple-A chipset, possibly for extending the system by a signal processor, but not extending the system towards SMP.

Besides - two 040 as add-ons do not make much sense. The 040 is a power-hog, runs hot and is at the same time slower than the 060. It really doesn't make much sense.

I guess they must have read some of the NeXT hardware specs; IIRC, these systems had something like a plain 68K for display acceleration as coprocessor, but I forgot the details. Nice system back then.

So long,
Thomas


Megol .

Posts 680
03 Sep 2010 18:12


Thomas Richter wrote:

 
Asaf Ayoub wrote:

 
    Don't forget that the way Commodore wanted to add multiprocessing was through DOS.
   
   

  ...
   

   
    -Amiga Mart Magazine, July 1993, Page 10-11.
 

  Nothing but wet fantasies of a marketing guy. CBM had a couple of prototypes for the triple-A chipset, possibly for extending the system by a signal processor, but not extending the system towards SMP.
 
  Besides - two 040 as add-ons do not make much sense. The 040 is a power-hog, runs hot and is at the same time slower than the 060. It really doesn't make much sense.
 
  I guess they must have read some of the NeXT hardware specs; IIRC, these systems had something like a plain 68K for display acceleration as coprocessor, but I forgot the details. Nice system back then.
 
  So long,
  Thomas
 

68040 main core with an Intel i860 RISC coprocessor in the NeXT color.
^ Sorry, should be NeXTdimension. Memory fail :)

Sean F S
United Kingdom

Posts 3
03 Sep 2010 21:26


Hi all,

Back in the day, I did program on the Amiga and still have an interest in it (esp Natami) and I've been reading with interest all the issues that have been raised regarding support for multiple cores.
 
I had a crazy idea on how an architecture might work (assuming that the Disable() and Forbid() problems can be overcome) - but I'm sure that there are lots of holes in it...

By my own admission, even if it works, it wouldn't be the fastest solution in the world - but humour me for a second...

Is it possible for a core to act as the "sole core" to deal with all calls to libraries/devices (which is where the other problems appear to lie?).  Processes that run on other cores will gain versions of libraries/devices that are actually 'proxies' to libraries/devices which are only accessed via the core dealing with O/S calls.

Any calls that occur actually cause a message to be posted into a queue and a wait for the core performing the calls to the library/device to respond (via a semaphore, or similar?). 

By doing this, any calls to such code will be atomic, so would only the message-passing code need to be SMP-safe?

What I'm wondering is whether it's possible to offer an environment where access to existing libraries/devices is possible instead of having to break programs into computationally intensive sections which don't access libraries/devices at all?

I guess one of the many issues is about code accessing items in library bases directly, but is something like this even remotely possible?!

Cheers,

Sean.
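
A rough sketch of the proxy idea, under several assumptions: one Python thread stands in for the single core that is allowed to call the OS, the class name OsCallServer is invented, and fake_alloc_mem is a hypothetical placeholder rather than a real library function. Callers post a message to a queue and block until the reply comes back.

```python
import queue
import threading

class OsCallServer:
    """One thread standing in for the sole core that may call the OS."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        while True:
            item = self._requests.get()
            if item is None:            # stop marker
                break
            func, args, reply = item
            try:
                reply.put((func(*args), None))   # runs only on this thread
            except Exception as exc:
                reply.put((None, exc))

    def call(self, func, *args):
        """Proxy entry point: post a message, then wait for the answer."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((func, args, reply))
        result, error = reply.get()
        if error is not None:
            raise error
        return result

    def stop(self):
        self._requests.put(None)
        self._thread.join()

def fake_alloc_mem(size):
    """Hypothetical placeholder for a real library function."""
    return {"size": size, "served_by": threading.current_thread().name}

server = OsCallServer()
block = server.call(fake_alloc_mem, 64)
server.stop()
```

In this sketch only the queue needs to be safe for concurrent use, since every call is serialized through one thread. It does not address code that reads library bases directly, and every caller waits behind the single server, so the throughput concern raised above still applies.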

Marcel Verdaasdonk
Netherlands

Posts 3979
03 Sep 2010 21:51


Disable() and Forbid() are functions that cannot be fixed without a kernel rewrite AFAIK.
But okay Sean let's assume they aren't a problem.

What you're proposing is symlinks and semaphores.
Sean, read up on the dining philosophers problem and you'll know where your idea would break down.
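
For anyone who has not met the dining philosophers problem: each philosopher needs two shared forks (locks), and if everyone grabs the left fork first, all of them can end up waiting forever for the right one. The sketch below shows one textbook way around it, taking the forks in a fixed global order. It is a generic illustration with invented names and makes no claim about how Sean's proxy scheme would actually behave.

```python
import threading

def dine(philosophers=5, meals=100):
    """Run the simulation and return how many meals each philosopher ate.
    Assumes at least 2 philosophers."""
    forks = [threading.Lock() for _ in range(philosophers)]
    eaten = [0] * philosophers

    def philosopher(i):
        left, right = i, (i + 1) % philosophers
        # Always take the lower-numbered fork first: with one global order
        # a circular wait, and therefore a deadlock, cannot form.
        first, second = min(left, right), max(left, right)
        for _ in range(meals):
            with forks[first]:
                with forks[second]:
                    eaten[i] += 1

    threads = [threading.Thread(target=philosopher, args=(i,))
               for i in range(philosophers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return eaten

print(dine())
```

The ordering rule only works when every participant follows it, which is hard to guarantee for existing code that was never written with several cores in mind.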
