Visit www.PlayBasic.com (http://www.playbasic.com)
NOTE: You'll find these articles and MANY more in the PlayBASIC HELP FILES under the ABOUT TAB
Economizing Image Blitting (drawing!)
I like Big Blits
If you've been programming for a while, you've perhaps noticed a strange anomaly when drawing (blitting) images: why does the blitter (the image drawer) seem to choke when drawing lots of small images, when it can throw around huge ones easily?
To answer this, we have to look at how Windows gives us control of the video hardware. It should be noted that we're primarily discussing the rendering of Video Images (those stored in your graphics card's video memory) and not those in system memory.
Your PC has what's termed a blitter. You can think of this as the part of the graphics chip solely designed to draw/copy rectangular graphic data quickly. Every time you create an image (in video memory) and render it, PB uses the blitter to transfer the pixel information for you. Now this is all lovely, but there's a catch! The blitter is a shared device and we're not the only ones using it (Windows and other programs are using it too).
Since the blitter is shared, when we want to draw (blit) something, we're forced to wait in line for it to become available (either a previous draw call is still running, or some other program/task is using it). Once it's free, we can send it our drawing job and continue on our way. The drawing will take place while our program continues running. This is called asynchronous rendering, which is a fancy way of saying the graphics chip is drawing while the main CPU continues on doing something else. (Note: Not all video cards support this!)
The issue is that if we call the blitter frequently, we're inevitably going to end up stalling our program in wait loops, while DX waits for the graphics card's blitter to become available to us. This is what occurs when we try to draw lots of small images with the blitter. While the drawing might be quick once the blitter is acquired, all this polling/waiting for the blitter equals one sure thing: lots of lost time, which can really impact performance. You've probably noticed this when drawing Maps with really small blocks, for example.
Let's demonstrate. In the example below we're going to render a tiled image to the screen and monitor the FPS. Obviously newer GPUs with faster blitters will cope better with this than older ones. This doesn't mean the issue has gone away, just that it impacts those users less.
In this example we'll draw a tiled image with 8*8, 16*16, 32*32, 64*64, 128*128 and 256*256 blocks, then monitor the average FPS at each size.
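To make the "lots of small blits" problem concrete, here is a toy cost model in Python (not PlayBASIC, and not how PB or DX work internally). It assumes every blit pays a fixed wait/setup cost plus a small per-pixel copy cost. Both constants are made-up illustrative values, not measurements, and the function names are hypothetical.

```python
# Toy cost model: each blit pays a fixed wait/setup cost plus a per-pixel cost.
# Both constants are invented for illustration only; they are not measurements.
WAIT_PER_BLIT_US = 10.0     # assumed time spent waiting for / setting up the blitter
COPY_PER_PIXEL_US = 0.002   # assumed time to copy one pixel once the blitter is ours


def blit_count(screen_w, screen_h, tile):
    """Number of tile x tile blits needed to cover the screen."""
    tiles_across = -(-screen_w // tile)   # ceiling division
    tiles_down = -(-screen_h // tile)
    return tiles_across * tiles_down


def frame_time_us(screen_w, screen_h, tile):
    """Estimated time (microseconds) to cover the screen with tile x tile blits."""
    blits = blit_count(screen_w, screen_h, tile)
    pixels = blits * tile * tile
    return blits * WAIT_PER_BLIT_US + pixels * COPY_PER_PIXEL_US


if __name__ == "__main__":
    for tile in (8, 16, 32, 64, 128, 256):
        print(tile, blit_count(800, 600, tile), round(frame_time_us(800, 600, tile)))
```

Under these assumptions the pixel count barely changes between tile sizes, but the number of blits (and so the total waiting) drops from thousands to about a dozen, which is the effect described above.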
[pbcode]
openscreen 800,600,32,2

Seconds=5       ; how long each block size is tested for
Maxtests=6      ; number of block sizes to test

Type tFrame
    Afps#
    Frames
    Size
endtype

dim Info(Maxtests) as tFrame

TestCount=0
BlockSize =8

repeat
    ; draw a shaded block and grab it as image 1
    Cls 0
    ShadeBox 0,0,BlockSize,BlockSize,rgb(255,0,0),rgb(255,255,255),255,rgb(0,255,0)
    getimage 1,0,0,BlockSize,BlockSize

    EndTime=timer()+Seconds*1000
    frames=0
    repeat
        inc frames
        ; tile the whole screen with the image, scrolling it a pixel per frame
        TileImage 1,Xpos,Xpos,false
        Xpos=mod(Xpos+1,BlockSize)
        Info(testCount).frames =frames
        Info(testCount).Size =BlockSize
        text 0,0,str$(fps())+" Block Size:"+str$(BlockSize)
        Sync
    until Timer()>EndTime

    ; double the block size for the next test
    BlockSize=BlockSize*2
    inc testCount
until TestCount>=Maxtests

; show the frame count and average fps for each block size
cls 0
print "Results"
;writefile "C:\BlitResults.txt",1
For lp=0 to Maxtests-1
    f#=info(lp).frames
    f$=digits$(f#,5)
    fr$=str$(f#/seconds)
    m$=digits$(lp,2)+" Frames:"+F$+" Fps:"+fr$+" Size:"+str$(info(lp).Size)
    print m$
    ; writestring 1, m$
next
;closefile 1
Sync
Waitkey
[/pbcode]
Results (Duron 800 & GeForce 2 MX200):
00 Frames:00050 Fps:10.0 Size:8
01 Frames:00193 Fps:38.6 Size:16
02 Frames:00572 Fps:114.4 Size:32
03 Frames:00721 Fps:144.2 Size:64
04 Frames:00746 Fps:149.2 Size:128
05 Frames:00755 Fps:151.0 Size:256
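One way to read these numbers is to convert FPS into blits per second. The Python sketch below does that arithmetic for the Duron figures above. It assumes each frame needs at least ceil(800/size) * ceil(600/size) tiles. TileImage's scroll offset probably adds an extra row and column, which is ignored here, so treat the output as a rough lower bound.

```python
import math

SCREEN_W, SCREEN_H = 800, 600

# Average FPS per block size, copied from the Duron 800 / GeForce 2 MX200 results above.
RESULTS = {8: 10.0, 16: 38.6, 32: 114.4, 64: 144.2, 128: 149.2, 256: 151.0}


def tiles_per_frame(size):
    """Minimum number of size x size tiles needed to cover the screen (assumption)."""
    return math.ceil(SCREEN_W / size) * math.ceil(SCREEN_H / size)


def blits_per_second(size, fps):
    """Tiles drawn per frame multiplied by frames per second."""
    return tiles_per_frame(size) * fps


if __name__ == "__main__":
    for size, fps in RESULTS.items():
        print(size, tiles_per_frame(size), round(blits_per_second(size, fps)))
```

On these figures the 8*8 and 16*16 runs both work out to roughly 73,000 to 75,000 blits per second, even though the 16*16 run moves about four times as many pixels per blit. That is what you'd expect if the per-blit wait, rather than the pixel copying, is the bottleneck at small sizes.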
Return To PlayBasic Tutorial Index (http://www.underwaredesign.com/forums/index.php?topic=2552.0)
These are my results, for information, based on an AMD 3500+ (Socket 939) and a GeForce 6600 GT...
00 Frames:00117 Fps:23.4 Size:8
01 Frames:00474 Fps:94.8 Size:16
02 Frames:01687 Fps:337.4 Size:32
03 Frames:04503 Fps:900.6 Size:64
04 Frames:06138 Fps:1227.6 Size:128
05 Frames:06696 Fps:1339.2 Size:256
Big C.
Crash Course in Image & Sprite Draw Modes
One of the most misunderstood aspects of PlayBasic (among other things) is how to design your program to take advantage of image & sprite draw modes. Draw modes give us control over how sprites are rendered. This is nice and all, but certain render modes can place additional stress on your computer if you're not careful, making them slower than need be. This most commonly occurs when using the Alpha Blended modes. In fact, it'll affect any draw mode that has to read from the destination surface.
Getting the best out of it comes down to how well you understand the different types of image buffers PlayBasic offers us. Today (PB1.63 and below) we have 2 primary image types (note: future editions of PlayBasic have more), which are:
Image Type #1 - Video Image
Video Images are images where the pixel data is stored in your computer's graphics card memory. These images can be copied/drawn to the screen (which is also in your video card's memory) very quickly, since they utilize your video card's blitter, which is the part of the GPU specially designed for this purpose.
While it's fast to transfer images around in video memory with the blitter (on most cards), the blitter has its fair share of limitations. Namely, it can basically only copy & fill rectangles of pixels, and that's about it.
So all other rendering is in the hands of the CPU. While the CPU can write data to images in video memory very quickly, it can't read from them at anything like the same speed. Effectively, reading from video memory is 20 to 30 times slower than writing! It's even worse on some systems. Most people no doubt assume this is a PlayBasic thing. It's not, it's a limitation of how the PC was designed. It doesn't matter what language you use, we just can't read from video memory at the same rate we can write to it.
Basically, Video Images are our best option if we want to draw loads of solid (no alpha), non-rotated sprites around the screen.
Image Type #2 - FX Image
FX images are a variation of normal images. The primary difference, however, is that the pixel data is stored in your computer's main memory. While there's no visible difference between the two, they do give us a different set of abilities.
Being stored in system memory grants us the freedom of fast access to the pixels in the image, so we can read & write pixels as fast as our CPU can manage. But there's a price: FX images are slower to draw to the screen, since the drawing has to be performed using the CPU. We can't use the video card's blitter, since it was designed to work with image data stored in video memory, not system memory.
Now since FX images are stored in system memory, this gives us the ability to draw them rotated/scaled in real time. This is possible as the software rotation code can read the pixels from the image as fast as the CPU / memory will allow. We couldn't do this as fast if the image was stored in video memory. You can try that yourself. LoadImage "myIMageNameHere",1, then try and draw it rotated... it'll be slow for the aforementioned reason. Then try it with an FX image.
Common Mistakes
OK, so we've established that if we want to rotate an image or read from it a lot, then FX images are probably our best bet at this time. But what about rendering translucent image effects such as Alpha Blending, Alpha Addition, Alpha Subtract, logical operations etc.?
This is where the most common mistakes are made! E.g. a newbie loads up an FX image and tries to alpha blend it to the screen. The first thing they notice is that it's nice and slow! To explain why, we'll examine the basic process that's being performed when we blend images together.
For each pixel we're doing the following:
Step #1 Read the Src pixel from the image we're drawing.
Step #2 Read the corresponding Dest pixel from the destination image (the one we're going to overwrite/blend with).
Step #3 Perform the blending operation.
Step #4 Output the newly blended pixel to the destination image.
Now, that seems simple enough, so what are we missing?
Notice how, in order to blend a pixel, we have to read the pixel from the destination image? Hmmm, well what if that destination is in video memory? Wouldn't all that reading from video memory slow our drawing routine down? Yep, it certainly will!
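The four steps can be sketched in a few lines of Python. This is only an illustration of the read/blend/write pattern, not PlayBasic's actual blending code. The blend formula used here (a simple weighted average controlled by `alpha`) is an assumption, and the function names are made up for the example.

```python
def blend_pixel(src, dest, alpha):
    """Step 3: blend one (r, g, b) pixel over another. alpha runs from 0.0 to 1.0.

    The weighted-average formula is an assumed, typical alpha blend.
    """
    return tuple(int(s * alpha + d * (1.0 - alpha)) for s, d in zip(src, dest))


def blend_image(src_rows, dest_rows, alpha):
    """Blend src over dest in place. Returns how many destination reads were needed."""
    dest_reads = 0
    for y, src_row in enumerate(src_rows):
        for x, src in enumerate(src_row):               # Step 1: read the source pixel
            dest = dest_rows[y][x]                      # Step 2: read the destination pixel
            dest_reads += 1
            blended = blend_pixel(src, dest, alpha)     # Step 3: blend
            dest_rows[y][x] = blended                   # Step 4: write the result back
    return dest_reads


if __name__ == "__main__":
    src = [[(255, 0, 0)] * 2 for _ in range(2)]
    dest = [[(0, 0, 255)] * 2 for _ in range(2)]
    print(blend_image(src, dest, 0.5), dest[0][0])
```

The point to take away is the `dest_reads` counter: there is one destination read for every pixel blended, so if the destination lives in video memory, every one of those reads pays the slow video read cost.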
While you can get away with blending small images directly to the screen or video images, the better approach, if you want heavy amounts of blending, would be to draw your screen to a screen sized FX image, do all your blending on that, then transfer (draw) that FX image to the screen. This will avoid video memory reading completely.
The upside of this approach is that we get much faster blending. However, we do lose the assistance of the video card's blitter while doing this. So commands like CLS/BOX & drawing solid images/maps will be slower. How slow depends on how fast your CPU/memory is, and don't forget we have to transfer the whole image to the screen. But even with this added burden, it's still way faster than attempting to render blend effects directly to video memory.
Example
This example shows the process of rendering sprites to an FX image, then drawing the image to the screen so we can see it. Those with a keen eye will notice it's a variation of the Alpha sprite example that comes in the PB example pack. The main difference is the scrolling backdrop isn't present in this version.
Anyway, I've provided the example so you can get an idea of how much FX image blitting your system can push around. I recommend experimenting with the sprite particle size and screen depth. Every system will have a balancing point.
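As a rough way to see where the trade-off sits, here is a small relative cost model in Python. It is only a sketch. The 25x video read cost is picked from the middle of the 20 to 30 times range mentioned earlier, the other costs are assumed to be equal, and the cost of clearing the buffer is ignored. The function names are hypothetical.

```python
# Relative per-pixel costs. Only VIDEO_READ_COST comes from the article's
# "20 to 30 times slower" estimate; the others are simplifying assumptions.
WRITE_COST = 1.0          # writing one pixel, to either system or video memory
SYS_READ_COST = 1.0       # assumed: system memory reads cost about the same as writes
VIDEO_READ_COST = 25.0    # reading one pixel back from video memory


def direct_to_video(blended_pixels):
    """Blend FX sprites straight onto a video memory destination."""
    # per pixel: read src (system), read dest (video), write dest (video)
    return blended_pixels * (SYS_READ_COST + VIDEO_READ_COST + WRITE_COST)


def via_fx_buffer(blended_pixels, screen_w, screen_h):
    """Blend onto a screen sized FX image, then copy the whole image to the screen."""
    blend = blended_pixels * (SYS_READ_COST + SYS_READ_COST + WRITE_COST)
    transfer = screen_w * screen_h * (SYS_READ_COST + WRITE_COST)
    return blend + transfer


if __name__ == "__main__":
    for sprites in (5, 25, 100, 336):
        pixels = sprites * 32 * 32
        print(sprites, direct_to_video(pixels), via_fx_buffer(pixels, 640, 480))
```

With these assumed costs, a 640*480 screen breaks even at 25,600 blended pixels per frame (about twenty five 32*32 sprites). Below that, blending directly to video memory is cheaper because we skip the full screen transfer. Above it, the FX buffer pulls ahead quickly. Real machines will have a different balancing point.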
[pbcode]
global Use_FX_Buffer = true

Constant Particle_Size =32 ; try 16, 24, 32, 48, 64, 96, 128
Constant RequiredFrameRate =30

OpenScreen 640,480,32,2
; OpenScreen 640,480,16,2

#include "BlitIMage"

MakeBitmapFont 1,$ffffff

sw=GetScreenWidth()
sh=GetScreenHeight()

; build the particle images the sprites will use
Dim ParticleImages(4)
Size=Particle_Size
ParticleImages(0)=MakeParticle(size,RGB(255,RndRange(100,200),Rnd(75)))
ParticleImages(1)=MakeParticle(size,RGB(255,RndRange(20,40),Rnd(15)))
ParticleImages(2)=MakeParticle(size,RndRGB())
ParticleImages(3)=MakeParticle(size,RndRGB())
ParticleImages(4)=MakeParticle(size,RndRGB())

Type tObject
    Status
    x#,y#
    xdir#,ydir#
    sprite
    rotspeed#
EndType

max=10
Dim Objects(max) As tobject
Gosub INit_Objects

; screen sized FX image used as the blending buffer
Screen=NewFXImage(sw,sh)

; ------------------------------------------------------------------
; Start of Main Loop
; ------------------------------------------------------------------
Do
    Gosub Update_Logic
    Gosub Render_Scene
    Sync
Loop
` *=----------------------------------------------------------------------=*
` >> Update Sprites Rebound Logic <<
` *=----------------------------------------------------------------------=*
Update_logic:
    For lp=0 To max
        If Objects(lp).status
            spr=Objects(lp).sprite
            MoveSprite spr,objects(lp).xdir#,objects(lp).ydir#
            ; bounce the sprite off the screen edges
            If SpriteInRegion(spr,0,0,sw,sh)=False
                If GetSpriteX(spr)<0
                    objects(lp).xdir#=objects(lp).xdir#*-1
                EndIf
                If GetSpriteX(spr)>sw
                    objects(lp).xdir#=objects(lp).xdir#*-1
                EndIf
                If GetSpriteY(spr)<0
                    objects(lp).ydir#=objects(lp).ydir#*-1
                EndIf
                If GetSpriteY(spr)>sh
                    objects(lp).ydir#=objects(lp).ydir#*-1
                EndIf
            EndIf
            TurnSprite spr,objects(lp).rotspeed#
        EndIf
    Next

    ; up/down keys add or remove 10 sprites
    If UpKey()
        max=max+10
        Gosub INit_Objects
    EndIf
    If DownKey() And max>10
        max=max-10
        Gosub INit_Objects
    EndIf
Return
` *=----------------------------------------------------------------------=*
` >> Render The Current Scene <<
` *=----------------------------------------------------------------------=*
Render_Scene:
    ClsColour=rgb(110,140,170)

    if Use_FX_Buffer=true
        RenderToImage screen
    else
        Cls ClsColour
    endif

    ; Draw the Sprites
    DrawAllSprites

    if Use_FX_Buffer=true
        RenderToScreen
        ; render the FX screen to the real screen
        BlitImageClear(Screen,0,0,ClsColour)
    endif

    if Use_FX_buffer=false
        t$="[Video Render] "
    else
        t$="[FX Render] "
    endif
    Text 0,0,t$+Str$(max)+" Sprites @ "+str$(CurrentFPS)+"fps"

    if enterkey()
        Use_FX_Buffer=1-Use_FX_Buffer
        Flushkeys
    endif

    if SpaceKey()
        SetCursor 0,20
        Pixels=Max*(Particle_Size*Particle_Size)
        Print " Blended Pixels :"+Str$(pixels)
        Print "Dots Per Second :"+Str$(Pixels*CurrentFPS)
    EndIF

    ; Check the fps, if it's over our target, then
    ; add more sprites to the scene
    CurrentFps=fps()
    if Timer()>CheckFpsTime
        CheckFpsTime=timer()+250
        if (CurrentFps-1)=>RequiredFrameRate
            max=max+2
            Gosub Init_Objects
        endif
    endif
Return
` *=----------------------------------------------------------------------=*
` >> INIT/CREATE Sprites <<
` *=----------------------------------------------------------------------=*
Init_Objects:
    ; delete any sprites beyond the current max
    For lp=max+1 To GetArrayElements(Objects().tobject,1)
        If Objects(lp).status
            Objects(lp).status=False
            DeleteSprite Objects(lp).sprite
        EndIf
    Next

    If max>GetArrayElements(Objects().tobject,1)
        ReDim Objects(max) As tobject
    EndIf

    ; create sprites for any unused slots
    For lp=0 To max
        If Objects(lp).status=False
            Objects(lp).status=True
            x=50+Rnd(sw-100)
            y=50+Rnd(sh-100)
            Angle# =Rnd(360)
            Speed# =RndRange(1,5)
            Objects(lp).xdir#=CosRadius(angle#,speed#)
            Objects(lp).ydir#=SinRadius(angle#,speed#)

            ; pick a random particle image and draw mode
            zz=Rnd(4)
            ; zz=2
            Select zz
                ;Rnd(4)
                Case 0
                    ThisImage=ParticleImages(0)
                    DrawMode=2+16
                    transparent=On
                Case 1
                    ThisImage=ParticleImages(1)
                    drawMode=2+32
                    transparent=Off
                Case 2
                    ThisImage=ParticleImages(2)
                    drawMode=2+16
                    transparent=Off
                Case 3
                    ThisImage=ParticleImages(3)
                    drawMode=2+8
                    transparent=On
                Case 4
                    ThisImage=ParticleImages(4)
                    drawMode=2+4
                    transparent=On
                    Alphalevel#=Rnd#(1)
            EndSelect

            spr=NewSprite(x,y,thisimage)
            AutoCenterSpriteHandle Spr,True
            SpriteDrawMode Spr,drawmode
            SpriteTransparent spr,transparent
            SpriteAlphaLevel spr,AlphaLevel# ; Only has effect when sprite is
                                             ; set to alpha draw mode
            Objects(lp).sprite =Spr
            objects(lp).rotspeed# =RndRange#(1,5)
        EndIf
    Next
Return
` *=----------------------------------------------------------------------=*
` >> Make particle Image <<
` *=----------------------------------------------------------------------=*
Function MakeParticle(Size,Col)
    ThisImage=NewFXImage(Size,Size)
    RenderPhongImage ThisImage,Size/2,Size/2,col,255,260/(size/2)
EndFunction ThisImage
[/pbcode]
Keys
Enter = Toggle between rendering to an FX image and rendering directly to video memory.
Space = See basic fill stats (blended pixels and dots per second) on your machine.
Results
All tests were conducted in 640*480 (full screen exclusive) display mode, in both 16 & 32 bit modes. The test tries to calculate how many sprites of size X can be on screen while holding 30fps on the machine. All sprites are rotating, and each is randomly assigned one of the alpha draw modes.
800mhz Duron & GF2 mx200
* 32 bit
Sprite Size [16] = 764
Sprite Size [32] = 336
Sprite Size [64] = 118
Sprite Size [128] = 38
* 16 bit
Sprite Size [16] = 908
Sprite Size [32] = 422
Sprite Size [64] = 166
Sprite Size [128] = 56
3gig AMD 64 & GF6600
* 32 bit
Sprite Size [16] = 3200
Sprite Size [32] = 1558
Sprite Size [64] = 630
Sprite Size [128] = 200
* 16 bit
Sprite Size [16] = 3418
Sprite Size [32] = 1690
Sprite Size [64] = 668
Sprite Size [128] = 244
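To compare these sprite counts across sizes, it helps to convert them into blended pixels per second, the same way the example's Space key readout does (sprites * size * size, multiplied by the frame rate). The Python sketch below does that arithmetic on the figures above. It assumes the frame rate is exactly the 30fps target and treats each sprite as an unrotated square, so it is only a rough estimate.

```python
TARGET_FPS = 30  # the frame rate the test tries to hold (RequiredFrameRate in the example)

# Sprite counts copied from the results above: {machine and depth: {sprite size: sprites}}
RESULTS = {
    "Duron 800 / GF2 MX200, 32 bit": {16: 764, 32: 336, 64: 118, 128: 38},
    "Duron 800 / GF2 MX200, 16 bit": {16: 908, 32: 422, 64: 166, 128: 56},
    "AMD 64 / GF6600, 32 bit": {16: 3200, 32: 1558, 64: 630, 128: 200},
    "AMD 64 / GF6600, 16 bit": {16: 3418, 32: 1690, 64: 668, 128: 244},
}


def blended_pixels(sprites, size):
    """Pixels blended per frame, treating every sprite as a size x size square."""
    return sprites * size * size


def pixels_per_second(sprites, size, fps=TARGET_FPS):
    """Blended pixels per frame multiplied by the frame rate."""
    return blended_pixels(sprites, size) * fps


if __name__ == "__main__":
    for machine, counts in RESULTS.items():
        for size, sprites in counts.items():
            print(machine, size, pixels_per_second(sprites, size))
```

On the Duron in 32 bit mode this gives roughly 5.9 million blended pixels per second at size 16, rising to about 18.7 million at size 128, so larger sprites push more pixels per second than many small ones in this test.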
That's a really interesting piece there, and it may well be very useful for my Sorcery game - when you destroy an enemy it releases a scaling, alpha blending rainbow explosion. When one is on screen it's fine, but with two it can slow down dramatically if both are quite large. I presume you've noticed this ;)
Very edifying! I found this solution myself last week, and now know why it was so slow lately. Thanks!
I really appreciate your tutorials and information - Good job!
But I am having some problems in understanding. I understand the 'concept' of having an image in either: (1) system memory (FXimage), or (2) in Video memory (on the video card). BUT, I don't see any commands that say specifically that they load an image into the Video memory - at least the help file doesn't make it clear. I would like to test the speed difference between the two different methods.
Also, related to these Video Images... you first say that Video Images are drawn SOLID with no alpha - which I am guessing means 'no transparency':
Quote: Image Type #1 - Video Image
...Video images are our best option if we want to draw loads of solid (no alpha), non-rotated sprites around the screen.
but then later you say that Video images can have transparency:
Quote: Image Type Key
VI = Video Image
FX = FX image
Performance Guide when Drawing fixed sized solid/transparent images
Copying VI to VI = FASTEST - These will generally be assisted by the Graphics Card Blitter.
Copying VI to FX = VERY SLOW - Requires reading from the video memory. Avoid if possible.
Copying FX to VI = GOOD - CPU driven.
Copying FX to FX = GOOD - CPU driven.
So, my questions are:
(A) Can a Video Image have any sort of transparency, either as a mask (like how 'Magic Pink', RGB(255,0,255), is used in some languages), or an Alpha mask (plot the pixel if the alpha = 255, any other alpha then don't plot), or full alpha transparency (where the src pixel is blended with the destination)?
(B) What commands are used to load/unload an image directly into Video memory (not to the screen yet, just memory on the video card)? What are the commands to Draw (copy to...) an image on the video card to the video screen?
For some reason this has been very confusing for me...
(C) I assume that all the specialized sprite commands, collision abilities, etc are based off of FX images, NOT images on the video card, right?
I tend to only use images with transparency (either a mask or, preferably, 0-255 full alpha) - I am never rotating/scaling/applying other special effects that FX images can have, so my thought is that I can get away with solely using video images. If Video images can actually be used with a version of transparency, then I have one more question:
(D) I assume that I can then make a screen sized Video Image, blit all my video images into it utilizing their transparency, and then blit that entire screen image onto the screen - without moving anything to system memory and utilizing the speedy video blitter - basically make a screen buffer (unless there already is such a thing on the video card...)?
UPDATE: OK, re-read some of the help info on images and did some testing, and I think I figured out a few of my questions:
Answer to A: yes, the video image can have a transparency - it appears any 'black' pixel rgb(0,0,0) is considered transparent when drawn with the 'transparent flag' set to '1' - speed test indicates that drawing an image from video is faster than an FXimage - but seems to ONLY be so when the transparent flag is set, seems the same speed when there's no transparency (need to test this more as it doesn't jibe with what you have said) (also just read about the whole 'mask color' thing which talks about the default transparent color being black... Pretty neat that you can specify the mask color! nice!)
Answer to B: looks like loadImage, loadNewImage, createImage all deal with Video Images. DrawImage will draw both Video Images and FX images...
and answering those two questions actually answers my remaining two questions!
silly me... I am using a low-end notebook.... video memory is apparently 'shared' system memory anyways... I guess video images probably don't offer much speed improvement, if any