.tHE .sIRIUS .cYBERNETICS .cORPORATION
|
|
--------------------------------------------------------------------------------
CHUNK 2 PLANAR GRAPHICS - THE WAY TO DO IT
--------------------------------------------------------------------------------
introduction
--------------
Hi freaks... I'm Ray, the new coder of .tSCc.
This is my first tutorial so let me explain some basic things:
First of all I've got to tell you that all the examples will be plain 68000 asm
code and sometimes it'll get quite machine-dependent. So my tutorials will only
be useful with the ST and its "brothers" - so don't even get the idea of taking
the following descriptions as valid on a TT or something, but I think that's
clear anyway.
Secondly, I'm expecting you to understand the basics of 68k programming
because I don't have the time to teach you 68k coding in all its sometimes
'bizarre' details, e.g. the different addressing modes (but watch
out 'cause I'm going to release a 68k-tutorial in one of the next magazines).
And if there's something you don't understand, just read it 2 or more times
until you've got it - if this doesn't help, just ask me.
So, if you wanna contact me don't hesitate:
1) e-mail : reimund.dratwa@freenet.de
2) homepage: http://rd-developments.de.gs (my page before I joined .tSCc.)
(don't forget to grab the newest tutorial!)
let's get it on
-----------------
Let me say a few words about the purpose of this tutorial. Today I'll
try to explain how a technique called 'chunk 2 planar graphics', which is the
secret of all those modern demos, can be realized on the ST.
Now you might ask, what the hell does chunk to planar mean?!?
Since I think you know how the ST's bitplane-screen works (if that isn't the
case read on in the next paragraph) you'll probably know how hard it is to set a
pixel and, even worse, how slow.
So chunk 2 planar 'conversion' means that the bitplanes' bits are set according
to byte-values in a so-called chunky-buffer where every byte represents the
color of one single plot on the screen (this buffer could be compared to the
PC's VGA-memory of a 320x200x256 screen - don't get me wrong, Intel really
suckz! - I just wanted to give an example).
First you set up your effect or whatever you want in that chunky-buffer by
simply moving the according bytes into it and in a further step the
chunky-buffer is converted to bitplane-data (planar simply means
bitplane-graphics). How this works exactly will be described later on...
a brief description of the bitplanes
--------------------------------------
Skip this paragraph if you already know how the ST's screen is arranged.
As already mentioned, it's hard to set pixels 'by hand' because the ST's screen
is split into bitplanes.
Ok, I'll be a bit more exact, but notice: I'm only covering ST low rez 320x200x16.
In this resolution the screen is divided into 4 so-called 'bitplanes'. Every four
'corresponding' bits of these bitplanes indicate in which color one pixel on the
screen should appear.
So let's take a look at the first 4 words of screen-memory.
The diagram should make it clear:
                     screen
      0                                       319
      -------------------------------------------
  0  -*                                         -
     -                                          -
     -                                          -
     -                                          -
     -                                          -
                          .
                          .
                          .
199  -                                          -
      -------------------------------------------
Assuming we wanted to set the pixel in the upper left corner of the screen
(0/0), marked with *, in color $D (=13, if you don't know what hexadecimal is
just stop reading) this would mean we would have to manipulate the highest bit
of the first 4 words in screen memory. Bitplane 1 takes the lowest bit of the
color number, bitplane 4 the highest one.
$D (hex) = %1101 (bin)
screen memory:
bit           15                               7                           0
--------------------------------------------------------------------------------
bitpl/word 1 | 1 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
--------------------------------------------------------------------------------
bitpl/word 2 | 0 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
--------------------------------------------------------------------------------
bitpl/word 3 | 1 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
--------------------------------------------------------------------------------
bitpl/word 4 | 1 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
--------------------------------------------------------------------------------
Got it? No? Then let's do some more examples:
Just imagine we wanted to set the pixel right beside the one before (ie. 1/0) in
the same color. Then we'd just manipulate bit 14 of the 4 words.
But now be careful, what if we wanted to set pixel 16/0 (the 17th on the screen)?
The answer is as simple as this: since every single one of the 4 words only
holds 16 bits (=16 pixels) we just have to skip over to the next 4 words in
memory (5,6,7,8) and start at bit 15 again.
I think you're getting the idea, right? But I think now you'll also see why
setting pixels is really slow on the ST. It's just because you have to
'bit set' 4 places in screen-memory just for one pixel - and I think you know
that bit operations are kinda hard for a byte/word/long aligned cpu.
Now guys, I think it's time for some code at the end of this paragraph (let's go
back to the first example where we wanted to set the upper left pixel with
color $D):
        move.l  screen,A0       * assume the screen addr is stored in 'screen'
        clr.w   D0              * start at offset 0
        ori.w   #%1000000000000000,0(A0,D0.w) * set bit 15 of the 1st word of bitpl 1
        andi.w  #%0111111111111111,2(A0,D0.w) * clear bit 15 of the 1st word of bitpl 2
        ori.w   #%1000000000000000,4(A0,D0.w) * set bit 15 of the 1st word of bitpl 3
        ori.w   #%1000000000000000,6(A0,D0.w) * set bit 15 of the 1st word of bitpl 4
Hey - we've just set the first pixel! Now try what happens if you e.g. move #8
into D0 instead of clearing it...
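To set any pixel, not just the first one, the offset and the bit number have to
be calculated from x and y. Here's a minimal sketch of such a routine. The
labels (PLOT, PLANE, CLEAR, NEXT) and the register usage are assumptions made up
for this example, and it assumes that bitplane 1 holds the lowest bit of the
color number, as it does in ST low rez.

```asm
* PLOT (hypothetical name)
* in:      D0.w = x (0-319), D1.w = y (0-199), D2.w = color (0-15)
*          A0   = screen base address
* trashes: D0-D4/A1
PLOT:   mulu    #160,D1         * D1.l = y * 160 (one scanline = 160 bytes)
        move.w  D0,D3
        lsr.w   #4,D3           * D3 = x / 16 = number of the 16 pixel group
        lsl.w   #3,D3           * every group takes 4 words = 8 bytes
        add.w   D3,D1           * D1 = byte offset of the group (max. 31992)
        lea     0(A0,D1.w),A1   * A1 points to the word of bitplane 1
        not.w   D0
        and.w   #15,D0          * D0 = 15 - (x and 15) = bit number
        moveq   #0,D3
        bset    D0,D3           * D3.w = mask with only that bit set
        move.w  D3,D4
        not.w   D4              * D4.w = inverted mask for clearing
        moveq   #4-1,D0         * 4 bitplanes
PLANE:  lsr.w   #1,D2           * next color bit -> carry (bit 0 first)
        bcc.s   CLEAR
        or.w    D3,(A1)+        * color bit = 1: set the pixel's bit
        bra.s   NEXT
CLEAR:  and.w   D4,(A1)+        * color bit = 0: clear the pixel's bit
NEXT:   dbra    D0,PLANE
        rts
```

Even in this compact form every single pixel costs a multiplication, a handful
of shifts and four read-modify-write accesses to screen memory, which is exactly
the overhead c2p is meant to avoid.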
the secret of c2p, how it works and why it's so quick
-------------------------------------------------------
The only significant thing I can mention at the top of this paragraph is that the
whole, let's call it, secret of the c2p conversion is the 68k's movep-instruction.
What movep actually does is move word or long values to every other byte in
memory by splitting them up into bytes, so let's take a look at a little example:
        move.l  #$FE23D2E0,D0
        movep.l D0,0(A0)        * this instruction kinda covers 64 bits, or in
                                * other words only the upper bytes of 4
                                * consecutive words in memory (already
                                * smelling what we'll do ? ;) )
after the instruction:
0(A0) = $FE
2(A0) = $23
4(A0) = $D2
6(A0) = $E0
or if we start at an odd addr. (assuming A0 holds an even one):
        move.l  #$FE23D2E0,D0
        movep.l D0,1(A0)        * now movep will use the lower 8 bits of the
                                * data-bus, or more simply: now it covers the
                                * lower bytes of 4 consecutive words
after the instruction:
1(A0) = $FE
3(A0) = $23
5(A0) = $D2
7(A0) = $E0
if we didn't have movep we'd have to do:
        move.l  #$FE23D2E0,D0
        move.b  D0,6(A0)
        lsr.l   #8,D0
        move.b  D0,4(A0)
        lsr.l   #8,D0
        move.b  D0,2(A0)
        lsr.l   #8,D0
        move.b  D0,(A0)
I think then we could forget our chunk 2 planar graphics - thanx motorola!
And now go on reading very carefully because now I'll describe the core of the
c2p conversion:
Well, what we need to do a c2p conv. is, as mentioned above, 1st a
'chunky-buffer' holding the chunky byte data and 2nd a c2p table which is
finally used for the conversion.
Don't be afraid, I'll give you some more detail... first of all let's talk about
the chunky buffer:
To get the conversion anywhere near reasonably quick we need to double the
pixels horizontally and vertically, which means that one 'chunky-byte' in fact
represents 2x2 pixels on the screen and that we'll get a virtual rez
of 160x100 'chunky-pixels'. Of course that's why the chunky-buffer is sized
160x100 bytes = 16000 bytes (notice: the size of the chunky buffer may be
variable but to keep it simple I'm talking of the fullscreen conversion).
Now some words on the c2p table and the actual conversion:
Let's assume the first 4 bytes of the chunky buffer
have the values $03,$04,$0D,$02 (these 4 values represent 4 chunky-pixels or 4
double pixels on the screen).
Because only the lower 4 bits of these values are used (16 colors!) we're now
able to 'pack' those 4 bytes into one word the following way:
        lea     chunkybuffer,A0 * A0 now points to the chunky buffer
        moveq   #0,D0           * clear D0.l
        move.w  (A0)+,D0        * D0 = $0304 - 1st 2 values out of the chunky buffer
        lsl.w   #4,D0           * D0 = $3040 - shift them up one nybble
        or.w    (A0)+,D0        * D0 = $3D42 - that's the four 1st values in one word
But stop, values 2 and 3 have been swapped somehow. Wouldn't it be simpler
if D0 contained $34D2 like in the chunkybuffer??
That brings us to the next element of the c2p conv. - the c2p table (which is,
apart from the low resolution of c2p graphics, a further disadvantage compared
to planar graphics, ie. the high memory consumption!)
The chunk 2 planar table holds all the possible bitplane-combinations of 4
double-pixels on the screen. And since every one of those pixels can have 16
colors it's 16 * 16 * 16 * 16 * 4 bytes = 256 kb huge! (* 4 because we get
longwords out of it).
But when we precalc this c2p-tbl we need to remember that pixels 2 and 3 of
that 4 pixel 'quad' are swapped - because now we're gonna use the value of D0 as
offset:
        lsl.l   #2,D0           * get the longword alignment
        lea     c2p,A1          * set up the address of the c2p tbl
        move.l  screen,A2       * A2 points to our screen
        move.l  0(A1,D0.l),D0   * move the needed value into D0
        movep.l D0,0(A2)        * set the first 4 doublepixels on the screen
        movep.l D0,160(A2)      * do the scanline below
Wow, this little program-block has set 4 double pixels on the screen! It has also
managed to skip the lower bytes of the bitplane words using the movep
instruction, meaning that the other byte of each bitplane word won't be
affected.
What you'll do now is get the next four double-pixels and convert them
to the next screenoffset and so on. But notice that you have to skip one
scanline when you've done 160 double-pixels because the pixels are also doubled
vertically. So, that's all. Did you get it? I hope so...
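The precalculation of the table isn't listed in this text, so here's one way it
could be done. It's only a sketch under assumptions: the labels MAKEC2P, SPREAD
and LOOP1-LOOP4 are made up, the 65536 longwords at 'c2p' have to be reserved
elsewhere, and bitplane 1 is taken to hold the lowest bit of the color number.
SPREAD holds, for every color, the 4 bitplane bytes of the leftmost doublepixel
(bits 7/6); the other three doublepixels are the same pattern shifted right by
2, 4 and 6 bits. The loops are nested in the order of the nybbles in the offset
word (pixel 1, 3, 2, 4), so the entries can simply be written one after another.

```asm
* MAKEC2P (hypothetical name) - fills the 256 kb table at 'c2p'
* trashes: D0-D7/A0-A1
MAKEC2P: lea    c2p,A1          * A1 = write pointer (65536 longs reserved)
         lea    SPREAD,A0
         moveq  #0,D0           * D0 = color of pixel 1 * 4
LOOP1:   move.l 0(A0,D0.w),D4   * pixel 1 -> bits 7/6 of every plane byte
         moveq  #0,D2           * D2 = color of pixel 3 * 4
LOOP3:   move.l 0(A0,D2.w),D5
         lsr.l  #4,D5           * pixel 3 -> bits 3/2
         or.l   D4,D5
         moveq  #0,D1           * D1 = color of pixel 2 * 4
LOOP2:   move.l 0(A0,D1.w),D6
         lsr.l  #2,D6           * pixel 2 -> bits 5/4
         or.l   D5,D6
         moveq  #0,D3           * D3 = color of pixel 4 * 4
LOOP4:   move.l 0(A0,D3.w),D7
         lsr.l  #6,D7           * pixel 4 -> bits 1/0
         or.l   D6,D7
         move.l D7,(A1)+        * store the finished entry
         addq.w #4,D3
         cmp.w  #16*4,D3
         bne.s  LOOP4
         addq.w #4,D1
         cmp.w  #16*4,D1
         bne.s  LOOP2
         addq.w #4,D2
         cmp.w  #16*4,D2
         bne.s  LOOP3
         addq.w #4,D0
         cmp.w  #16*4,D0
         bne.s  LOOP1
         rts

* highest byte = bitplane 1 ... lowest byte = bitplane 4
SPREAD:  DC.L   $00000000,$C0000000,$00C00000,$C0C00000
         DC.L   $0000C000,$C000C000,$00C0C000,$C0C0C000
         DC.L   $000000C0,$C00000C0,$00C000C0,$C0C000C0
         DC.L   $0000C0C0,$C000C0C0,$00C0C0C0,$C0C0C0C0
```

For the example above (colors $3,$4,$D,$2, offset word $3D42) this should give
the entry $CCC33C0C: %11001100 for bitplane 1, %11000011 for bitplane 2,
%00111100 for bitplane 3 and %00001100 for bitplane 4.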
And now I think it's time to say a few words about the demo.
What it does is:
1. setting up a pattern in the chunkybuffer
CHUNKYBUFFER: REPT 1000         * 16000 bytes
        DC.L    $00010203
        DC.L    $04050607
        DC.L    $08090A0B
        DC.L    $0C0D0E0F
        ENDR
2. doing the c2p conversion
(do this 100 times - one for each 2 scanlines)
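        move.w  #20-1,D7        * convert one scanline
SETPIXEL:                       * = 20 * 8 doublepixels
        moveq   #0,D0
        move.w  (A0)+,D0        * pack 4 chunky bytes into one word
        lsl.w   #4,D0
        or.w    (A0)+,D0
        lsl.l   #2,D0           * * 4 = offset into the c2p tbl
        move.l  0(A1,D0.l),D0
        movep.l D0,0(A2)        * upper bytes of the 4 bitplane words
        movep.l D0,160(A2)      * same for the scanline below
        moveq   #0,D0
        move.w  (A0)+,D0        * next 4 chunky bytes
        lsl.w   #4,D0
        or.w    (A0)+,D0
        lsl.l   #2,D0
        move.l  0(A1,D0.l),D0
        movep.l D0,1(A2)        * lower bytes of the 4 bitplane words
        movep.l D0,161(A2)
        addq.w  #8,A2           * next 4 bitplane words
        dbra    D7,SETPIXEL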
.
.
.
3. scrolling the chunkybuffer
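        lea     CHUNKYBUFFER,A3 * A3 = destination
        lea     CHUNKYBUFFER,A4
        addq.l  #1,A4           * A4 = source, one byte further
        move.w  #16000-1,D4     * note: the last pass reads the byte right
                                * behind the buffer
SCROLLLOOP: move.b (A4)+,(A3)+
        dbra    D4,SCROLLLOOP
4. going back to step 2 until space is pressed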
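Put together, step 2 for the whole screen could look like the following sketch.
It's not taken from the demo source: the label ROW and the use of D6 as row
counter are assumptions, and the table at 'c2p' is expected to be filled
already. The important detail is the extra 160 bytes added to A2 after every
chunky row, which skips the scanline that the 160(A2)/161(A2) writes have
already filled.

```asm
        lea     CHUNKYBUFFER,A0 * A0 = chunky source
        lea     c2p,A1          * A1 = c2p table
        move.l  screen,A2       * A2 = screen destination
        move.w  #100-1,D6       * 100 chunky rows = 200 scanlines
ROW:    move.w  #20-1,D7        * 20 * 8 doublepixels per row
SETPIXEL:
        moveq   #0,D0
        move.w  (A0)+,D0
        lsl.w   #4,D0
        or.w    (A0)+,D0
        lsl.l   #2,D0
        move.l  0(A1,D0.l),D0
        movep.l D0,0(A2)        * upper bytes, this scanline
        movep.l D0,160(A2)      * upper bytes, scanline below
        moveq   #0,D0
        move.w  (A0)+,D0
        lsl.w   #4,D0
        or.w    (A0)+,D0
        lsl.l   #2,D0
        move.l  0(A1,D0.l),D0
        movep.l D0,1(A2)        * lower bytes, this scanline
        movep.l D0,161(A2)      * lower bytes, scanline below
        addq.w  #8,A2           * next 4 bitplane words
        dbra    D7,SETPIXEL
        lea     160(A2),A2      * skip the doubled scanline
        dbra    D6,ROW
```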
The whole thing runs at about 8.5 fps on a plain ST - which is, I think, quite
fast for a fullscreen fine-scroller (imagine how you would code this with
planar graphics X( ). But keep in mind this demo is very basic and not optimized,
'cause I wanted to keep it simple.
last remarks
--------------
Some words on optimizing... One way to optimize this algo a little bit is to
preshift every color up 2 bits (I mean your texture or image colors or whatever)
before you store them in the chunky buffer. That saves the lsl.l #2,D0 when
calculating the offset in the c2p-tbl (the lsl.w #4,D0 has to become a
lsl.l #4,D0 then, or the two top bits of the offset get lost).
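As a sketch, the conversion of one quad with preshifted colors could look like
this. It assumes every chunky byte already holds color * 4 ($00,$04,...,$3C) and
uses the same registers as the listings above. Note that the first shift has to
work on the whole longword here: with preshifted values the offset needs 18
bits, so a word-sized shift would lose the two top bits.

```asm
        moveq   #0,D0           * clear D0.l
        move.w  (A0)+,D0        * e.g. $0C10 for the colors $3 and $4
        lsl.l   #4,D0           * D0 = $0000C100 - .l, not .w!
        or.w    (A0)+,D0        * or in $3408 (colors $D and $2): D0 = $F508
        move.l  0(A1,D0.l),D0   * D0 is the longword offset already
        movep.l D0,0(A2)
        movep.l D0,160(A2)
```

$F508 is just $3D42 * 4, ie. the same table entry as before, only one shift
cheaper per quad.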
There are even ways to get rid of the chunky-buffer completely. But believe
me, it shouldn't be that hard to figure it out yourself and by the way it could
be really good practice for you.
Ok, I'll give you a little hint. Just write a rout that directly sets a 4 double
pixel 'quad' without getting the data out of a chunky buffer...
Another way of speeding it up is to just shrink the chunky buffer when your
effect doesn't take up the whole screen.
Just keep coding, you'll get it ;)
What's for next time? Hmm... I'm thinking of doing something on raycasting, at
least on the basics, so keep looking for my tuts, he he...
hope to see you next time,
.tSCc. Ray
--------------------------------------------------------------------------------
|
|
|