
Welcome to the Natami / Amiga Forum

This forum is for AMIGA fans interested in the new NATAMI platform.
Please read the forum usage manual.



Welcome to the Natami lounge.
Meet new AMIGA friends here and enjoy having a friendly chit chat.

OK Teamers, Could Someone Show Us the Progress?
André Jernung
Sweden
(MX-Board Owner)
Posts 988
06 Feb 2012 20:37


There is only HAM8 right now. The discussion about HAM10 was interesting. But better work on more fundamental features for now :)

Wawa Tk
Germany

Posts 581
06 Feb 2012 23:39


@rune: cool demo.
@all: and how is softcore doing? since i see the 060 is a dead end at about 100mhz as expected.

Amiga Blitter
Italy

Posts 34
07 Feb 2012 09:20


Thank you.

Please continue to post pictures and so on.

Natami is cool.


Nixus Minimax
Germany

Posts 272
07 Feb 2012 09:56


Rune Stensland wrote:
Modern Amiga demos don't have a problem with slow chipmem. The MC68060 is able to work in parallel (cache/registers) with the chipmem bus writes. Sometimes the c2p conversion is done fastmem->fastmem, and the slow chipmem writes are pipelined into a matrix multiplier or a texture mapper.

 
  On the 68030 we had chunky2planar routines that reached copy speed, i.e. the c2p from fast to chipmem was as fast as copying from fast to chipmem. I wonder whether the mem bus was blocked if there was a write to chipmem. In this case you couldn't get any faster than copy speed. Now the 060 can do cache accesses while having data waiting for chipmem? I guess the best place to put c2p code would be code using slow ops like MUL and DIV. You could probably hide them altogether in the waits for the chipmem. But you will also need the bus for operand fetch between two chipmem writes.
 
 
 
With proper code the MC68060 can do cached FPU + CPU + data cache operations in parallel with chipmem writes. With superscalar mode and good knowledge of memory pipelining, a handwritten assembly loop gets a lot faster than what the C compiler produces.

 
  I used to write demo code in assembly language. I never had an 060, and back then there wasn't any point in writing code specifically for the 060. Hardly anyone had 040s or 060s in 1994/1995.
 
  So the idea is to intersperse normal code with c2p code, i.e. somehow do the c2p while computing the next frame? Like in a 3D demo, computing the point projections and doing a bit of c2p at the same time? Would that work on the 030 or 040, too? The point projection routine I did used all available registers to keep memory accesses to a minimum. There wouldn't have been space for doing c2p stuff, but then the puny data cache of the 030 seemed to have very little effect. On an 060 it may have been less important to keep frequently used constant data in registers all the time.
 
 
 
It renders realtime on a 640x200 Ham8 screen

 
  Almost all scenes have a strong tint. I bet they only use 6 bitplanes of the HAM8 and have the remaining two bitplanes constant (the two bitplanes that have the modify R, G, B, set commands).
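To illustrate the guess above, here is a minimal Python sketch of HAM8 decoding, assuming the usual encoding (control 0 = load a palette colour, 1 = modify blue, 2 = modify red, 3 = modify green, with the 6 data bits replacing the upper 6 bits of the component). The function names and the (control, data) pixel representation are made up for this illustration; it is not taken from the demo. If the two control planes hold a fixed repeating pattern, the CPU only has to produce the 6 data planes.

```python
def decode_ham8(pixels, palette, start=(0, 0, 0)):
    """Decode HAM8 pixels given as (control, data) pairs.

    control is 0..3, data is 0..63, palette is a list of 64 (r, g, b)
    tuples with 8-bit components. The encoding is an assumption, see
    the lead-in text.
    """
    r, g, b = start
    out = []
    for control, data in pixels:
        if control == 0:            # load colour from the 64-entry palette
            r, g, b = palette[data]
        elif control == 1:          # modify blue, keep its low 2 bits
            b = (data << 2) | (b & 3)
        elif control == 2:          # modify red, keep its low 2 bits
            r = (data << 2) | (r & 3)
        else:                       # modify green, keep its low 2 bits
            g = (data << 2) | (g & 3)
        out.append((r, g, b))
    return out


def with_constant_control(data_values, pattern=(2, 3, 1)):
    """Pair each 6-bit data value with a fixed, repeating control pattern.

    This models two control bitplanes that are written once and never
    touched again; only data_values would change per frame.
    The default pattern (red, green, blue) is hypothetical.
    """
    return [(pattern[i % len(pattern)], d) for i, d in enumerate(data_values)]


black_palette = [(0, 0, 0)] * 64
line = decode_ham8(with_constant_control([63, 32, 0]), black_palette)
```

With a fixed pattern like this, every pixel changes only one colour component, which limits the colours reachable from one pixel to the next.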
 
 

Christian Kummerow
Germany

Posts 314
07 Feb 2012 09:59


Rune Stensland wrote:

  My full rev 6 Mc68060(with fpu mmu) is running stable at 100mhz without a fan. I have managed to boot at 110mhz

Can you take the Temperature of it?
And tell us the passive cooler size?

Rune Stensland
Norway
(MX-Board Owner)
Posts 871
07 Feb 2012 11:42


 
  On the 68030 we had chunky2planar routines that reached copy speed, i.e. the c2p from fast to chipmem was as fast as copying from fast to chipmem. I wonder whether the mem bus was blocked if there was a write to chipmem. In this case you couldn't get any faster than copy speed. Now the 060 can do cache accesses while having data waiting for chipmem? I guess the best place to put c2p code would be code using slow ops like MUL and DIV. You could probably hide them altogether in the waits for the chipmem. But you will also need the bus for operand fetch between two chipmem writes.

On the 68030 we had c2p that reached copyspeed, but they required 1 blitter pass or scrambling of the chunky buffer. Datacache access stalled the chipmem writes, so it was very important to keep everything in registers. On the 060 the situation is different.
The fastest way to do c2p on Mc68060 is the rol merge:

 


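  ror.l #4,d1          ;rotate d1 so its high nibbles line up with d0's low nibbles
  move.l d0,d7
  eor.l d1,d7          ;d7 = bits that differ between d0 and the rotated d1
  and.l #$0f0f0f0f,d7  ;keep only the low nibble of each byte
  eor.l d7,d0          ;d0 low nibbles <- d1 high nibbles
  eor.l d7,d1          ;rotated d1 low nibbles <- d0 low nibbles
  rol.l #4,d1          ;rotate d1 back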
 

You can drop the last rol. Then you need a new set of transpositions and a few extra rotations in the code to unrotate the registers before writing to the planes.
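For readers who want to check what the single merge step does, here is a small Python model of it (with the final rol included). The function names are invented for this sketch, and Python integers are masked to 32 bits to imitate the data registers. It also compares the result against a plain shift-and-mask formulation of the same nibble swap.

```python
MASK32 = 0xFFFFFFFF


def ror32(x, n):
    """Rotate a 32-bit value right by n bits (0 < n < 32)."""
    return ((x >> n) | (x << (32 - n))) & MASK32


def rol32(x, n):
    """Rotate a 32-bit value left by n bits (0 < n < 32)."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def rol_merge4(d0, d1):
    """Model of the 7-instruction merge: returns the new (d0, d1)."""
    d1 = ror32(d1, 4)                 # ror.l #4,d1
    d7 = d0                           # move.l d0,d7
    d7 ^= d1                          # eor.l d1,d7
    d7 &= 0x0F0F0F0F                  # and.l #$0f0f0f0f,d7
    d0 ^= d7                          # eor.l d7,d0
    d1 ^= d7                          # eor.l d7,d1
    d1 = rol32(d1, 4)                 # rol.l #4,d1
    return d0, d1


def shift_merge4(d0, d1):
    """Same nibble swap written directly with shifts and masks."""
    new_d0 = (d0 & 0xF0F0F0F0) | ((d1 >> 4) & 0x0F0F0F0F)
    new_d1 = ((d0 << 4) & 0xF0F0F0F0) | (d1 & 0x0F0F0F0F)
    return new_d0, new_d1


result = rol_merge4(0x12345678, 0x9ABCDEF0)
```

In this model d0 keeps its high nibbles and receives the high nibbles of d1, while d1 keeps its low nibbles and receives the low nibbles of d0. For the example values, `result` is `(0x193B5D7F, 0x2A4C6E80)`.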

The 060 can execute 2 instructions per clock if they don't share registers. By pairing the merges we double the speed:

 


  ror.l #4,d1
  ror.l #4,d2
  move.l d0,d7
  move.l d3,d6
  eor.l d1,d7
  eor.l d2,d6
  and.l #$0f0f0f0f,d7
  and.l #$0f0f0f0f,d6
  eor.l d7,d0
  eor.l d6,d2
  eor.l d7,d1
  eor.l d6,d3

  (rol.l #4,d1)
  (rol.l #4,d2)
 

On an AGA Amiga in lores, a chipmem write takes around 26 cycles. On an MC68060 clocked at 50 MHz that leaves room for up to 52 free instructions (two superscalar instructions per cycle):

 

  move.l d1,(a0)+ ;chipwrite
  ror.l #4,d1  ;free
  ror.l #4,d2  ;free
  move.l d0,d7  ;free
  ..
  move.l instack(sp),d4  ;free if in cache.
 

By doing the c2p in fastmem first, we only need two or three registers later when we merge the chipram copy loop into another routine.

On the 060, 52 instructions are more than enough to do the c2p. On the 030 there is no point in doing this, because all the free instructions (13) are used up by the c2p itself.
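The instruction budget can be written out as simple arithmetic. This is only a back-of-the-envelope sketch using the figures from the post (26 cycles per chipmem write, two instructions per cycle on the 060); the helper name is invented, and the result is an upper bound that assumes every instruction pairs perfectly and hits the cache. The 030 figure of 13 is taken as stated and not derived here.

```python
def free_issue_slots(write_cycles, instructions_per_cycle):
    """Upper bound on instructions that fit in the shadow of one chipmem write."""
    return write_cycles * instructions_per_cycle


CHIP_WRITE_CYCLES = 26       # figure from the post: AGA lores, 060 at 50 MHz
MERGE_INSTRUCTIONS = 7       # length of one rol merge, including the final rol

slots_060 = free_issue_slots(CHIP_WRITE_CYCLES, 2)    # superscalar: 2 per cycle
merges_per_write = slots_060 // MERGE_INSTRUCTIONS     # whole merges that fit

print(slots_060, merges_per_write)
```

Under these assumptions, up to seven complete merges fit behind a single longword written to chipmem.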


  Almost all scenes have a strong tint. I bet they only use 6 bitplanes of the HAM8 and have the remaining two bitplanes constant (the two bitplanes that have the modify R, G, B, set commands).

That is probably correct.
...

This demo does realtime perspective correction with the FPU (every 16 pixels) and mipmapping (variable texture size) to improve cache hits, plus a fastmem-to-fastmem c2p to remove the chipmem bottleneck.

Massive by Skarla:

EXTERNAL LINK 
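
The "perspective correction every 16 pixels" idea mentioned above can be sketched as follows. This is a generic floating-point illustration in Python, not the demo's actual code: the function name, the span representation and the parameter names are assumptions made for the example. The exact texture coordinate is computed with a division only at every 16th pixel, and the pixels in between are interpolated linearly.

```python
def perspective_span(left, right, width, step=16):
    """Texture coordinates for one horizontal span of `width` pixels.

    left and right are (u, v, z) at the span ends, with z > 0.
    u/z, v/z and 1/z are linear in screen space, so the exact (u, v)
    needs one division per evaluation. That division is done only every
    `step` pixels; in between, u and v are interpolated linearly.
    """
    (u0, v0, z0), (u1, v1, z1) = left, right

    def exact(x):
        t = x / width
        inv_z = (1 - t) / z0 + t / z1
        u_over_z = (1 - t) * u0 / z0 + t * u1 / z1
        v_over_z = (1 - t) * v0 / z0 + t * v1 / z1
        return u_over_z / inv_z, v_over_z / inv_z

    texels = []
    x = 0
    ua, va = exact(0)
    while x < width:
        run = min(step, width - x)      # last run may be shorter
        ub, vb = exact(x + run)         # the only divisions for this run
        du = (ub - ua) / run
        dv = (vb - va) / run
        for i in range(run):
            texels.append((ua + i * du, va + i * dv))
        ua, va = ub, vb
        x += run
    return texels


flat = perspective_span((0, 0, 1), (64, 0, 1), 64)
tilted = perspective_span((0, 0, 1), (64, 0, 4), 32)
```

With constant depth (`flat`) the result is plain linear mapping. With varying depth (`tilted`) the coordinates are exact at every 16th pixel and only approximate in between, which is the trade-off that keeps the number of divisions low.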
In this posting I disassembled some inner loops of Doom clone games and optimized them using self-modifying code. Since you are a former demo coder, you might find them interesting.

EXTERNAL LINK

Wawa Tk
Germany

Posts 581
07 Feb 2012 12:16


ah, ermm, and what happened to that demo game you guys were developing? it was so much talk about it but no word anymore. has it been abandoned? is gunnar gone for good?

André Jernung
Sweden
(MX-Board Owner)
Posts 988
07 Feb 2012 12:16


Christian Kummerow wrote:
 
Can you take the Temperature of it?
And tell us the passive cooler size?

I measured this with my NAe60R. I don't have the exact measurements, but it uses a quite small heatsink with no active cooling.

Ambient temperature: 24C

Clock      Idle   Load
50 MHz     41C    46C
90 MHz     47C    54C
100 MHz    46C    55C

Nixus Minimax
Germany

Posts 272
07 Feb 2012 12:54


Rune Stensland wrote:
On the 68030 we had c2p that reached copyspeed, but they required 1 blitter pass or scrambling of the chunky buffer.

Yes, I personally wrote a c2p that used one blitter pass.


 

  ror.l #4,d1
  move.l d0,d7
  eor.l d1,d7
  and.l #$0f0f0f0f,d7
  eor.l d7,d0
  eor.l d7,d1
  rol.l #4,d1
 

Ha, I learned the 3-eor-trick from the guy who invented it (don't remember the name, I met him on usenet). I think I came up with the blitter pass on my own, though.

By doing the c2p in fastmem first, we only need 2/3 registers later when we merge the chipram copyloop into another method.

Yes, I already imagined that this was the idea of doing c2p in fastmem. You could do the copying with just two address registers and copy to chipmem while doing more interesting stuff. With self-modified code you probably could make that one address register with a large immediate offset.

This demo is doing realtime perspective correction with the fpu (for every 16pixel), mipmap(variable texture size) to improve the cachehits. Fastmem2fastmem c2p to remove the chipmem bottleneck

Nice demo, awful soundtrack. Having an FPU of course is a great advantage. I had a 50 MHz 68882 but hardly anyone else did. I wonder how many 060 Amigas there are...

In this posting I disassembled some inner loops of Doom clone games and optimized them using self-modifying code. Since you are a former demo coder, you might find them interesting.
 
  EXTERNAL LINK 

I will have a look at that when I have more time. BTW, I used self-modified code for my point-projection in a 3D engine. It was the only way to not move data between registers and RAM. I merely replaced some immediates with the values required for the next frame and flushed the instruction cache. With an FPU all of this could have happened in parallel to something more interesting... :)



Thierry Atheist
Canada

Posts 1828
07 Feb 2012 13:26


Nixus Minimax wrote:

I will have a look at that when I have more time. BTW, I used self-modified code for my point-projection in a 3D engine. It was the only way to not move data between registers and RAM. I merely replaced some immediates with the values required for the next frame and flushed the instruction cache. With an FPU all of this could have happened in parallel to something more interesting... :)

Allowing the use of self-modifying code is one of 3 reasons I'm against an MMU being put in a NatAmi.

Ajc ;)
United Kingdom

Posts 688
07 Feb 2012 13:27


wawa tk wrote:

ah, ermm, and what happened to that demo game you guys were developing? it was so much talk about it but no word anymore. has it been abandoned? is gunnar gone for good?

Which one? The V-scroller is mostly done.

I had to stop doing anything on the 3D-core because of various job problems (got made redundant, got another job - it was awful, etc etc) and still trying to get back on my feet now.

Hardly matters, Thomas is still marching onward to getting the Natami working :)

Thierry Atheist
Canada

Posts 1828
07 Feb 2012 13:28


wawa tk wrote:

is gunnar gone for good?

I didn't want to bring it up, but haven't seen a post from Gunnar for a long time, soooo.

Gunnar = ????

Ajc ;)
United Kingdom

Posts 688
07 Feb 2012 13:29


Thierry Atheist wrote:

Allowing the usage of self-modified code is one of 3 reasons I'm against a MMU being put in a NatAmi.

An MMU does not have anything to do with self modifying code. As most coders will tell you an MMU to use for whatever we want could actually be really fuckin' useful at times.

Nixus Minimax
Germany

Posts 272
07 Feb 2012 14:02


Thierry Atheist wrote:
Allowing the usage of self-modified code is one of 3 reasons I'm against a MMU being put in a NatAmi.

Aha. You can delete this reason then because it is not valid. Having an MMU does not imply that code pages are write-protected.



Louis Dias
USA

Posts 217
07 Feb 2012 15:33


Thierry Atheist wrote:

wawa tk wrote:

  is gunnar gone for good?
 

  I didn't want to bring it up, but haven't seen a post from Gunnar for a long time, soooo.
 
  Gunnar = ????

He posted in the team section as late as 18 Jan 11:35 with regards to the next demo...
Though I also miss his posts in the public area...

Nixus Minimax
Germany

Posts 272
07 Feb 2012 16:41


Rune Stensland wrote:
In this posting I disassembled some inner loops of Doom clone games and optimized them using self-modifying code. Since you are a former demo coder, you might find them interesting.
 
  EXTERNAL LINK 

I found your remark about AB3D2 interesting. I always had a feeling that the game was published knowing that it was unplayable, just trying to make some bucks before leaving the business. I was very disappointed at the time because I actually liked AB3D a lot and completed it. To me it had a very good atmosphere despite the fact that the copper chunky was horrible.

If I ever dig out my A1200, I might find that doom type tech demo by a Dutch coder that in my opinion was the best doom engine at the time. It was on Aminet, in case you are still interested in the subject. It had floor texturemapping and a very good frame rate.


Wawa Tk
Germany

Posts 581
07 Feb 2012 17:02


i think it's okay that there are no more of these lengthy discussions between gunnar and thierry. would be sad if he had left, but as it is now gunnar's time is likely better assigned.

Nixus Minimax
Germany

Posts 272
07 Feb 2012 17:12


Nixus Minimax wrote:
If I ever dig out my A1200, I might find that doom type tech demo by a Dutch coder that in my opinion was the best doom engine at the time. It was on Aminet, in case you are still interested in the subject. It had floor texturemapping and a very good frame rate.

Wow, the internet never forgets:

EXTERNAL LINK 
I found that engine very good at the time.



Dariusz Gac
Poland

Posts 3
08 Feb 2012 20:13


André Jernung wrote:

I took some screenshots of my Natami system...

  I see there is DOpus 4 there on your desktop.
  What about Directory Opus Magellan then ?
  Running smoothly? Or some issues like in AmigaOS4.x ?

 
 
   
 

André Jernung
Sweden
(MX-Board Owner)
Posts 988
09 Feb 2012 05:37


Dariusz Gac wrote:

André Jernung wrote:

I took some screenshots of my Natami system...

 
I see there is DOpus 4 there on your desktop.
What about Directory Opus Magellan then ?
Running smoothly? Or some issues like in AmigaOS4.x ?

Good idea. I did not try it yet. But I see no reason why it shouldn't work - it is compatible with OS3.1 after all :)
