Print Page - Tutorial - A Crash Course In BASIC program Optimization

(http://www.underwaredesign.com/PlayBasicSig.png)

Visit www.PlayBasic.com (https://www.playbasic.com)

A Crash Course In BASIC program Optimization

Optimization is often seen as some type dirty word in programming circles. So it's not that uncommon to run into the same old "Computers are so fast today, nobody optimizes their code anymore." catch phrase. If only that were true!

While today's machines are indeed many times faster than they used to be (even just a few years ago), the programs we use today require much more media & data to be manipulated than before. The more media and data programs need to man handle, the more muscle (computing power) it takes. So it's still a balancing act.

What is optimization ? - Optimization is the process where a programmer tries to improve their programs performance by restructuring the time critical parts (the sections of the program where computer spends the most time executing) into a more stream lined manner. Unfortunately, this tends be something that you learn through experience / education. So unfortunately, it's not uncommon for new programmers to make mistakes that can dramatically impact their programs performance, without even knowing it.

There's two basic forms of optimizations. We'll call these, Micro optimizations and Method optimizations. In this tutorial, we're going to kick around some ideas that may help give your programs a little more get up and go.

Micro optimizations are when we try and make subtle changes to the programs code, looking for 'faster' commands and slight logic tweaks to improve the overall performance of the action. So generally, we're not changing the entire approach taken, rather just stream lining key parts of the algorithms in our program. While useful, there's only so far you can go take this before you need to change approaches.

Method optimizations are when we completely change how we approach a key problem(s) in our program. Often this will give far greater performance improvement than any micro optimization you'll ever make. The trick is, coming up with alternative methods isn't always obvious. In particular, if we're new to this programming stuff.

For example, Imagine a person "Jim" wants to send a message to a friend in another city. Immediately, your probably already thinking of a few ways you'd approach this problem. Well, lets say the first thing "Jim" thought of, was sending his message via carrier pigeon. However, when "Bill" was confronted with the same situation, he decided to send an EMAIL. While Bill's email might find it's way into his friends inbox within a few seconds, Jim's pigeon takes a week. Now with more training, Jim could probably make Micro Optimizations to the pigeons flight path that would no doubt see it reach it's target quicker. But there's only so far he can go before the bird can't physically get there any faster. So in order to compete, Jim would have to optimize his sending method by changing to an entirely difference approach. Such as Email/SMS..

While not best the analogy, this hopefully demonstrates that the difference between Micro and Method optimizations.

Don't waste time Optimizing the Input Code

Be careful. Optimizing can sometimes be a trap, you can easily waste valuable time, effort and more importantly motivation optimizing aspects of your program, that are not that time critical. As it turns out, in the majority of projects, only a tiny percentage (10% or less) of your game code is really time critical. So if you're going to optimize, don't guess, analyze your programs behavior and see where it's really spends the bulk of it's time. In most games, the main time eaters will be the updating the character Logic (AI / Physics) ), Collisions and ultimately Drawing the stuff to the screen.

General Tips On Tweaking a Programs Performance

All too frequently the war is lost in the design phase, by attempting to make our programming tools do something they weren't really intended to do. No matter how many stripes you paint on an elephant, it'll never be a zebra! In other words, by designing our programs (games in our case) to take advantage of the feature set of the language they're written in, then this will generally put us on the right track.

So bellow, we're basically going to take a general overview of common issues and code optimizations. While the following examples are written in PlayBasic (http://playbasic.com) code, they're fairly universal. So you might just learn something useful, you might not too :)

* Removing Redundant Calculations (http://www.underwaredesign.com/forums/index.php?topic=2548.msg17250#msg17250)

* Smarter Drawing (http://www.underwaredesign.com/forums/index.php?topic=2548.msg17251#msg17251)

* Smarter Logic (http://www.underwaredesign.com/forums/index.php?topic=2548.msg17252#msg17252)

When Should I Optimize My Program ?

Generally, method optimizations are made during a programs design phase. Which occurs before even a single line of program code is written. Experienced programmers can do this as they're often familiar with the types of problems & solutions required within a particular game genre. Unfortunately, It's going to be very difficult for new programmers to do this up front. While this tutorial covers some of the basics, this is a huge subject...

Anyway, assuming you've some idea of how your game is going to work, then It's often best to prototype the design before making a serious commitment to this approach. It's here we generally find more of the big issues that we missed at design time. As such, you can refine whatever core methods the program is using and press onto the next phase. I should point out, that sometimes we'll need to revise our original game design ideas to fit into with our programming skills / languages.

Micro optimizations are generally best made at the very end of a programs development.

Now, you can optimize while you work. Which in some places makes perfect sense. Like for example, tweaking a heavily used subroutine / function to work more efficiently. However, If by optimizing a programs code you're going to end up with messy and or impossible to follow/update code, or the optimization limits the programs functionality is some negative way in the future... Then DON'T DO IT!

We optimize to make our programs more efficient, not to make unreadable spaghetti !

(http://www.underwaredesign.com/PlayBasicSig.png)

Visit www.PlayBasic.com (http://www.playbasic.com)

A Crash Course In Optimization Part#2 - Removing Redundant Calculations (Micro Opt's)

A redundant calculation is any program code that is being performed where when the result never changes. This is particularly important inside high frequency loops (Loops with lots of iterations).

Variables Faster Than Array/Types Accesses

In this example, we're looking at a situation that sometimes can occur when performance calculations upon structures (types/arrays). Every time you access an array/type PB has to resolve this access location to a where the item is in memory. While this isn't a massive overhead (considering what it does), it's a still far greater overhead than variables have. So it can often be beneficial to hold the calculation results in temp variables when doing calculations from a type/array.

[pbcode]
Tests=500

// Variables have the lowest access overhead, so it's often quicker
// to pull values out of arrays or type if your performing calculations
// on them.

Type tGameObject
x#
y#
SpeedX#
SpeedY#
EndType

Dim Object(10) as tGameObject

   ScreenWidth=GetScreenWidth()
   ScreenHeight=GetScreenHeight()


   Do
      cls 0
      inc frames

         // ===================================
         // Calling Functions In a Loop
         // ===================================
         X=300

         // ==========
         // Test #1
         // ==========

         T=timer()
         For LP=0 to Tests

            // Here we're constantly accessing the type fields. So PB has to resolve
            // this every time.

            // You'll notice PB has to access the X# and y# fields at least 4 times, possible 5
            // times. This adds up.
            for object=1 to 10
             Object(Objectlp).X#= Object(Objectlp).X#+Object(Objectlp).SpeedX#
             Object(Objectlp).Y#= Object(Objectlp).Y#+Object(Objectlp).SpeedY#

             if Object(Objectlp).X#>ScreenWidth
    Object(Objectlp).X#=ScreenWidth
                 endif
             if Object(Objectlp).Y#>ScreenHeight
    Object(Objectlp).Y#=ScreenHeight
                 endif
            next

         next
         tt1#=tt1#+(timer()-t)
         Print "Test #1 Average Time:"+Str$(tt1#/frames)

         // ==========
         // Test #2
         // ==========

         T=timer()
         // Since the screen width is never going to change
         // during the loop, if we want speed we should move
         // this calculate outside the loop
         ScreenWidth=GetScreenWidth()
         For LP=0 to Tests

            // Here we're constantly accessing the type fields. So PB has to resolve
            // this every time.
            for object=1 to 10

               // since we're going to compare it's new position, we'll store this in
               // a variables X# & Y#, rather than back in the type.
               // This way we're reduce the type accessof fields X# * y# to 2 accesses in total

             X#= Object(Objectlp).X#+Object(Objectlp).SpeedX#
             Y#= Object(Objectlp).Y#+Object(Objectlp).SpeedY#

             if X#>ScreenWidth
X#=ScreenWidth
                 endif
             if Y#>ScreenHeight
Y#=ScreenHeight
                 endif

               // Once we're done now we store it back in the type
               Object(Objectlp).X#=X#
               Object(Objectlp).Y#=Y#

            next

         next
         tt2#=tt2#+(timer()-t)
         Print "Test #2 Average Time:"+Str$(tt2#/frames)


      Sync
   loop

[/pbcode]

Calling Functions In Loops

This demonstration we've two simple loops. The first loop we're constantly comparing a variable X with the screenwidth. Each loop it calls the "GetScreenWidth()" function. The question is why ? - Will the screen ever change sizes inside the loop ? - Nope. So in the second version of the loop we precalculated the screen width by calling the GetScreenWidth and storing the value in a variable.

So if we call functions too much, they can eat up valuable time from our game. A millisecond improvement might not sound like much, but can make all the difference to your games performance.

[pbcode]
   Tests=10000

   Do
      cls 0
      inc frames

         // ===================================
         // Calling Functions In a Loop
         // ===================================
         X=300

         // ==========
         // Test #1
         // ==========

         T=timer()
         For LP=0 to Tests
            // Check if the X variable is large the Screen Width
            if X>GetScreenWidth()
               Print "X bigger than Screen edge"
            endif
         next
         tt1#=tt1#+(timer()-t)
         Print "Test #1 Average Time:"+Str$(tt1#/frames)

         // ==========
         // Test #2
         // ==========

         T=timer()
         // Since the screen width is never going to change
         // during the loop, if we want speed we should move
         // this calculate outside the loop
         ScreenWidth=GetScreenWidth()
         For LP=0 to Tests
            // Check if the X variable is large the Screen Width
            if X>ScreenWidth
               Print "X bigger than Screen edge"
            endif
         next
         tt2#=tt2#+(timer()-t)
         Print "Test #2 Average Time:"+Str$(tt2#/frames)


      Sync
   loop

[/pbcode]

Storing Calculation Results In Temp Variables

This (fairly bogus) example demonstrates how including unnecessary calculations inside heavy loops can cost us some performance.

[pbcode]
   Tests=10000

   Do
      cls 0
      inc frames

         // ===================================
         // Precalc Values Out Of Heavy Loops
         // ===================================

         A=100
         B=200

         // ==========
         // Test #1
         // ==========

         T=timer()
         For LP=0 to Tests
               Result=A*B*lp
         next
         tt1#=tt1#+(timer()-t)
         Print "Test #1 Average Time:"+Str$(tt1#/frames)

         // ==========
         // Test #2
         // ==========

         T=timer()
         // Since values of A and B never change inside the loop
         // we can pre calc the value outside before we enter the loop
         ATimesB   =A*B
         For LP=0 to Tests
               Result=ATimesB*lp
         next
         tt2#=tt2#+(timer()-t)
         Print "Test #2 Average Time:"+Str$(tt2#/frames)
      Sync
   loop
[/pbcode]

Short Circuiting Comparisons

This example demonstrates how we can win back some performance by separating combined comparisons. While will generally win us some performance, providing we place the more unlikely (to be true) compare first. Other wise, if the first compare is true frequently, this might cost us more, since we're falling through both comparisons to get a complete match.

[pbcode]
   Tests=10000

   Do
      cls 0
      inc frames

         // ===================================
         // Short Circuiting Comparisons
         // ===================================

         A=100
         B=200

         // ==========
         // Test #1
         // ==========

         T=timer()
         For LP=0 to Tests
            if A=12345 and B=12345
               print "Both Match"
            endif
         next
         tt1#=tt1#+(timer()-t)
         Print "Test #1 Average Time:"+Str$(tt1#/frames)

         // ==========
         // Test #2
         // ==========

         T=timer()
         For LP=0 to Tests
            if A=12345
               if B=12345
                  print "Both Match"
               endif
            endif
         next
         tt2#=tt2#+(timer()-t)
         Print "Test #2 Average Time:"+Str$(tt2#/frames)


      Sync
   loop
[/pbcode]

Dropping the True from Comparisons

Often it's common to compare the return value from a function call with a TRUE/FALSE within an IF statement. This extra comparison normally not needed. Since the IF statement assumes that value/calculation following it, is FALSE if the calc return/variable is zero. Anything else, and the IF will treat is as TRUE.

[pbcode]
   Tests=10000

   Do
      cls 0
      inc frames

         // ===================================
         // Dropping the True from Comparisons
         // ===================================

         // ==========
         // Test #1
         // ==========

         T=timer()
         For LP=0 to Tests
            if GetSpriteStatus(1)=true
               print "Sprite Exists"
            endif
         next
         tt1#=tt1#+(timer()-t)
         Print "Test #1 Average Time:"+Str$(tt1#/frames)

         // ==========
         // Test #2
         // ==========

         T=timer()
         // If we're checking for TRUE/FALSE return value from
         // a function we can drop extra =TRUE
         // in these situations. Since GetSpriteStatus(1) will
         // return a true/false value anyway.
         For LP=0 to Tests
            if GetSpriteStatus(1)
               print "Sprite Exists"
            endif
         next
         tt2#=tt2#+(timer()-t)
         Print "Test #2 Average Time:"+Str$(tt2#/frames)


      Sync
   loop
[/pbcode]

User Functions VS Projected Subroutines

[pbcode]
   Tests=10000

   Do
      cls 0
      inc frames

         // ===================================
         // USer Functions VS Projected Subroutines
         // ===================================

         // ==========
         // Test #1
         // ==========

         T=timer()
         For LP=0 to Tests
            result=SomeFunctionCalc(10,lp)
         next
         tt1#=tt1#+(timer()-t)
         Print "Test #1 Average Time:"+Str$(tt1#/frames)

         // ==========
         // Test #2
         // ==========

         T=timer()
         // Call the Psub function
         For LP=0 to Tests
            result=SomesubCalc(10,lp)
         next
         tt2#=tt2#+(timer()-t)
         Print "Test #2 Average Time:"+Str$(tt2#/frames)


      Sync
   loop

Function SomeFunctionCalc(A,B)
A=A*B
EndFunction A

Psub SomeSubCalc(A,B)
A=A*B
EndPsub A

[/pbcode]

String Thrashing

[pbcode]
   Tests=10000

   Do

      cls 0
      inc frames

         // ===================================
         // Using MID$() when comparing a Character within a String
         // ===================================

         TestString$="Hello World"


         // ==========
         // Test #1
         // ==========

         T=timer()
         For LP=0 to Tests
            if Mid$(TestString$,5,1)="a"
               print "Character 5 is A"
            endif
         next
         tt1#=tt1#+(timer()-t)
         Print "Test #1 Average Time:"+Str$(tt1#/frames)

         // ==========
         // Test #2
         // ==========

         T=timer()
         // MID() is variation of the MID$(), except it
         // returns the ASCII value as integer.
         // Integers are faster to compare than strings
         For LP=0 to Tests
            if Mid(TestString$,5)=asc("a")
               print "Character 5 is A"
            endif
         next
         tt2#=tt2#+(timer()-t)
         Print "Test #2 Average Time:"+Str$(tt2#/frames)


      Sync
   loop

[/pbcode]

Disc Thrashing

One cause of performance loss often occurs when we have to read or write large files. Sadly disc drives are notoriously slow beasts, so we have to work with them and not against them. What you might not know, is that because disc devices (and CD/DVD's also) have to physically spin up and seek out the position of the information we're requesting or writing. There's a latency between each time you try and read or write. This latency is particularly magnified when we try and read/write small pieces of data. Each time we ask it to read/write a small packet of data, the device has to local this area on the spinning disc. This really goes against how these devices are designed, which is they like to read/write big fat chunks of information in one hit. So knowing this, we can improve our disc performance by working with the devices strengths. If we have big file to work with, then rather than nibble away it with BYTE/WORD/Long access, it's generally more efficient to load (or save) a bigger chunk in memory and then work with that.

This example shows the worst possible case scenario, which is reading and writing files BYTE by BYTE.

[pbcode]
   Filename$=CurrentDir$()   +"TempFile.txt"

   // Alloc 100 K sized bank
   Bank=NewBank(100*1024)
   For lp=0 to size-1
      pokeBankByte Bank,lp,lp
   next

   repeat
      Cls rgb(10,20,40)

         inc frames

         // This test write the Bank contents BYTE by BYTE
         t=Timer()
            WriteBankAsBytes(Filename$,Bank)
         t1#=t1#+(timer()-t)
         print "Write Bank As Bytes:"+Str$(t1#/frames)

         // This test reads the file into a new bank, byte by byte
         // this is where you see just how much this can choke
         // your app's disc performance.
         t=Timer()
            ThisBank=ReadBankAsBytes(Filename$)
         t2#=t2#+(timer()-t)
         print "Read Bank As Bytes:"+Str$(t2#/frames)
         DeleteBank ThisBank

         // These versions read/write the same data, but they use
         // batch writing. Making them much quicker
         t=Timer()
            WriteBankAsBlock(Filename$,Bank)
         t3#=t3#+(timer()-t)
         print "Write Bank As Data Block:"+Str$(t3#/frames)

         t=Timer()
            ThisBank=ReadBankAsBlock(Filename$)
         t4#=t4#+(timer()-t)
         print "Read Bank As Block:"+Str$(t4#/frames)
         DeleteBank ThisBank

      Sync
   until Spacekey()

   // remove the temp file
   DeleteFIle Filename$

   end

Function WriteBankAsBytes(Filename$,ThisBank)

If FileExist(filename$) then deletefile Filename$

Fh=WriteNewFile(Filename$)
   if Fh
      For lp=0 to getbanksize(Thisbank)-1
            WriteByte Fh,PeekBankByte(ThisBank,lp)
      next
      CloseFile FH
   endif

EndFunction

Function ReadBankAsBytes(Filename$)

Fh=ReadNewFile(Filename$)
   if Fh
      Size=FileSize(Filename$)
      ThisBank=NewBank(size)
      For lp=0 to size-1
            POkeBankByte ThisBank,lp,ReadByte(Fh)
      next
      CloseFile FH
   endif

EndFunction ThisBank

Function WriteBankAsBlock(Filename$,ThisBank)
If FileExist(filename$) then deletefile Filename$
Fh=WriteNewFile(Filename$)
   if Fh
      Address=GEtBankPtr(ThisBank)
      WriteMemory fh,Address,Address+GetBankSize(ThisBank)
      CloseFile FH
   endif
EndFunction

Function ReadBankAsBlock(Filename$)

Fh=ReadNewFile(Filename$)
   if Fh
      Size=FileSize(Filename$)
      ThisBank=NewBank(size)
      readMemory Fh,GetBankPtr(ThisBank),Size
      CloseFile FH
   endif

EndFunction ThisBank

[/pbcode]

Examples,
Load Text Files To String Array (http://www.underwaredesign.com/forums/index.php?topic=1917.0)
Buffered file copy (copy data in blocks) (http://www.underwaredesign.com/forums/index.php?topic=3815.0)

(http://www.underwaredesign.com/PlayBasicSig.png)

Visit www.PlayBasic.com (http://www.playbasic.com)

A Crash Course In Optimization Part #3 - Smarter Drawing Techniques

In all facets of video game development, optimizing our programs drawing behavior is one of the most beneficial ways of improving our games overall performance. Now, I know what the little voice in back your head is saying, but doesn't my computers super fast video card make optimization redundant ? - Sadly No. Like everything in the computer they have a sweet spot. As such, they're designed to work in a certain way. While we can simply ignore this and press on regardless, this will often place more strain upon the drawing devices than need be. Making our programs slower than they could be.

How can I optimized my drawing ? - Well that's a good question, rather than getting into anything thats game or genre specific, we'll just try and cover some simple basic tips that can help us in all types of graphics programming. There's tips could be categorized as,

Tips

* When Not To Clear The Screen
* Avoid Rendering Dead Pixels (Drawing nothing sure takes a long time)
* Changing Display depths can improve performance ?
* Drawing Everything Real Time ?
* Selectively Refresh The Screen (Dirty Rectangles)

Note: The objective of optimizing our programs rendering, should be to improve the programs performance, without changing the appearance. We can generally do this by removing redundant drawing operations. For example, if we equated drawing each pixel on the screen as costing 1$ dollar per pixel. Then we'd want to make sure our program wasn't drawing the same pixels over and over again. As if we draw over a pixel twice then old one is gone (the user of our program never sees it), and cost of drawing that pixel has now doubled.

When Not To Clear The Screen

Clearing the screen (aka CLS) is more often than not one of the first commands you'll find our a programs main rendering loop. But is it really necessarily ? - Sometimes yes, but not always. We only need to clear the screen, if we're not going to be completely filling the screen with some backdrop. Such as a backdrop picture, gradient or perhaps a map even. It's in those situations that CLS is just a wasting some rendering time.

To demonstrate this has effect, the following example simulates the situation where we're drawing a backdrop image and also doing the unnecessary CLS. Press Space to toggle the CLS on/off between modes.

[pbcode]

   // Create a screen sized image. This is used to simulate a backdrop
   // picture you might use.
GameBackDrop=NewImage(GetScreenWidth(),GetSCreenHeight())

   //Fill the BackDrop image with a blue gradient
   RendertoImage GameBackDrop
   c1=rgb(50,70,200)
   c2=rgb(250,70,200)
   ShadeBox 0,0,GetScreenWidth(),GetSCreenHeight(),c1,c2,c2,c1
   rendertoscreen


   ClsEnabled=true


   // start of programs main loop
   Do

         //
         if ClsEnabled=true
            // Clear the Screen to a bright pinky/orange colour
            // We can't see it, since the backdrop is being completely covered
            // So the CLS is here for no reason
               cls Rgb(200,100,100)
         endif

         // Draw the gane image as we would in our game.
         // Since we're drawing it full screen, and we don't want
         // anything to show through it, we draw it solid
         Drawimage GameBackDrop,0,0,false

         // SHow FPS
         Text 10,10,"fps:"+str$(fps())

         if ClsEnabled=true
            t$="Cls Enabled"
         else
            t$="Cls Disabled"
         endif

         Text 10,30,t$

         // Hit the spacekey to toggle CLS
         if Spacekey()=true
               ClsEnabled=1-ClsEnabled
               FlushKeys
         endif

      Sync
   loop

[/pbcode]

This example, is an extension of the same subject more or less. This one draws a gradient backdrop and a some foreground mountains as two screen sized separate images. Visually, we get a sort of sunset effect sitting behind the scrolling mountain side (which is just some ellipses in this example). The thing is, since the backdrop gradient is none descriptive (no distinctive markings), what we could do in this case, is combine the gradient and the mountain picture together. This will effectively 1/2 the amount of drawing our program is doing and therefore give us a free speed boost. I should point out, this isn't always possible, but well worth it if you can!

Press Space to toggle between the two method in this demo

[pbcode]
   // =======================================
   // Create a Screen Sized Image. This is used to simulate a backdrop
   // picture you might use.
   // =======================================
GradientBackDropLayer=NewImage(GetScreenWidth(),GetSCreenHeight())

   //Fill the BackDrop image with a gradient
   RendertoImage GradientBackDropLayer
   c1=rgb(50,70,200)
   c2=rgb(250,70,200)
   ShadeBox 0,0,GetScreenWidth(),GetSCreenHeight(),c1,c1,c2,c2

   // =======================================
   // Create a second foreground layer
   // =======================================
ForegroundLayer   =NewImage(GetScreenWidth(),GetSCreenHeight())
   RendertoImage ForegroundLayer

   // Draw some ellipses to this surface to
   Radius=GetSCreenWidth()/3
   For Xlp=Radius/2 to GetScreenWidth()-1 step Radius
         Ellipsec Xlp,GetScreenHeight(),Radius/2,Radius*1.5,true,rgb(0,255,0)
   next

   // =======================================
   // Create version that is merged together
   // =======================================

MergedBackDrop=NewImage(GetScreenWidth(),GetSCreenHeight())
   RendertoImage MergedBackDrop
   Drawimage GradientBackDropLayer,0,0,false
   Drawimage ForegroundLayer,0,0,true
   rendertoscreen

   //
   DRawMergedBackDrop=false

   // start of programs main loop
   Do

         ScrollX=Mod(ScrollX-1,GetScreenWidth())

         if DRawMergedBackDrop=false

            // Draw the gane image as we would in our game.
            // Since we're drawing it full screen, and we don't want
            // anything to show through it, we draw it solid
            Drawimage GradientBackDropLayer,0,0,false

            // Draw the forreground layer over the gradient
            Tileimage ForegroundLayer,ScrollX,0,true

         else

            // Draw the pre merged version of the back drop
            // in place of the two seperate images
            TileImage MergedBackDrop,ScrollX,0,false

         Endif

         // SHow FPS
         Text 10,10,"fps:"+str$(fps())

         if DRawMergedBackDrop=false
            t$="Drawing Backdrops Layers Separately"
         else
            t$="Drawing Merged version"
         endif

         Text 10,30,t$

         // Hit the spacekey to toggle CLS
         if Spacekey()=true
               DRawMergedBackDrop=1-DRawMergedBackDrop
               FlushKeys
         endif

      Sync
   loop
[/pbcode]

Avoid Rendering Dead Pixels

What's a dead pixel ? - These are pixels in the image that we're drawing but are not visible to the end user.

Solid VS Transparent rendering

[pbcode]

   // Create a blank image 100*100 pixels in size
   BlankImage         =NewIMage(100,100)

   TransparentRender   =false

   // start of programs main loop
   Do

         // Clear the Screen
         Cls rgb (100,150,200)

         // Draw the Blank image to screen 25 times
         For lp=1 to 25
            Xpos=150+(lp*20)
            Ypos=100+(lp*15)
            Xpos=Xpos-GetIMageWidth(BlankImage)/2
            Ypos=Ypos-GetIMageHeight(BlankImage)/2
            DrawImage BlankImage,Xpos,Ypos,TransparentRender
         next

         // SHow FPS
         Text 10,10,"fps:"+str$(fps())

         if TransparentRender=False
            t$="Drawing Solid Images"
         else
            t$="Drawing Transparent Images"
         endif

         Text 10,30,t$

         // Hit the space key to toggle between transparent and solid rendering
         if Spacekey()=true
               TransparentRender=1-TransparentRender
               FlushKeys
         endif

      Sync
   loop
[/pbcode]

If you test (cut and paste the code into PlayBASIC) you'll possibly see something you didn't expect. While your intuition might be telling you that since we can't see anything being drawn, when rendering transparent pixels, then this isn't eating up our computing time. If that was the case, then it'd be much faster when rendering an all transparent image. However, the reality is quite different. It's actually a fraction slower. How much slower really depends on the system here.

Armed with little bit of knowledge, the next question is how can we best take advantage of it ? - Well, there's a few situations where we can optimize our images to smooth out our games rendering performance. Such as trimming backdrop images of unnecessary transparent sections and full screen HUD overlays for starters. But one place our games can eat up a lot of processing power is when we draw our characters to the screen. In particular character animations.

Why ? - Well animations are often drawn and laid out upon sprite sheets so that all frames are the same size. While this might make loading the animation relatively straight forward, it does mean that we're potentially going to have animation frames that are all the same size (width/height) regardless of the what's in each frame.

If you imagine a running animation for a second, then hopefully it's easy to visualize that certain frames during the animation are going to wider or higher than others. Such as when the character is fully extended for example. Moreover if we imagine an explosion animation, where the debris starts out centralized and then radiates out. Then in such explosion animations, the initial frames will generally be smaller than the later ones.

OK, you're probably falling asleep by now wondering why this is an issue. Well, It relates back to our discovery above about how rendering transparent pixels costs the same (or more) than drawing a visible pixel. So if all of our animation frames have extra unused (transparent) space around the visible graphics porttion, then that space is not only costing us a little bit of render performance, but it's also wasting memory.

Example

Lets have a look,

[pbcode]
   // Create two identical images. The only difference is that
   // one is larger than the other. The larger (oversized) one
   // is bit slower to render.
   OverSizedImage      =MakeBallImage(256,90)
   TrimmedImage      =MakeBallImage(180,90)

   CurrentImage=OverSizedImage

   //
   DrawTrimmedIMages=false

   // start of programs main loop
   Do

         // Clear the Screen
         Cls rgb (30,40,50)

         // Draw the current image to screen 25 times
         For lp=1 to 25
            Xpos=200+(lp*15)
            Ypos=200+(lp*10)
            Xpos=Xpos-GetIMageWidth(CurrentImage)/2
            Ypos=Ypos-GetIMageHeight(CurrentImage)/2
            DRawimage CurrentImage,Xpos,Ypos,true
         next

         // SHow FPS
         Text 10,10,"fps:"+str$(fps())

         if DrawTrimmedIMages=False
            t$="Drawing Oversized Images"
            CurrentImage=OverSizedImage
         else
            t$="Drawing Trimmed Images"
            CurrentImage=trimmedImage
         endif

         Text 10,30,t$

         // Hit the space key to toggle between trimmed and oversize images
         if Spacekey()=true
               DrawTrimmedIMages=1-DrawTrimmedIMages
               FlushKeys
         endif

      Sync
   loop

Function MakeBallImage(FrameSize,Radius)
   Index=NewImage(FrameSize,FrameSize)
   Rendertoimage Index
      For lp=Radius to 1 step -1
         Circlec frameSize/2,frameSize/2,lp,true,rgb(100,200-lp,255-(lp*2))
      next
   rendertoscreen
EndFunction index
[/pbcode]

Example #2 - Fixed Size Vs Trimmed Animations Size

To demonstrate, in this example we're creating two animation sequences. One is trimmed of any dead (transparent) pixels the other isn't. We're testing the performance difference by rendering these animations on screen, each in a different stage of it's animation sequence. You can probably surmise what's going to happen. The one where we're showing fixed size animation frames will get progressively slower (the more you have the slower it gets) than the one with the trimmed animations. So trimming helps us balance the load better.

[pbcode]

   FrameCount=25
   FrameSize=150

   // Make two Expanding Ball Type Animations.
   Dim FixedSizeImages(FrameCount*2)
   Dim TrimmedImages(FrameCount*2)
   MakeArray Animation()

   Radius=70
   For lp=0 to frameCount
      Radius2=Radius-(lp*2.5)

      // Make the Fixed size version
      img=MakeBallImage(FrameSize,Radius2,rgb(00,255,lp))
      FixedSizeImages(lp)=img
      FixedSizeImages(FrameCount*2-lp)=img

      // Make a 'rough' trimmed version anim (in a different colour)
      img=MakeBallImage((Radius2*2)-2,Radius2,rgb(255,155,lp))
      TrimmedImages(lp)=img
      TrimmedImages(FrameCount*2-lp)=img
   next

   // Set toggle which animation is being viewed
   DrawTrimmedImages=false

   // start of programs main loop
   Do

         // Clear the Screen
         Cls rgb (30,50,70)

         if DrawTrimmedImages=False
            t$="Fixed Size Animations"
            SetArray Animation(),GetArray(FixedSizeImages())
         else
            t$="Trimmed Animations"
            SetArray Animation(),GetArray(TrimmedImages())
         endif

         FrameIndeX=Mod(FrameIndex+1,FrameCount*2)

         // Draw a screen full of animations
            Xpos=100
            Ypos=50
            For lp=1 to 200
               ThisFrame=Mod(FrameIndex+lp,FrameCount*2)
               CurrentImage=Animation(Thisframe)
               Xpos=Xpos+20
               if Xpos>700
                  Xpos=100
                  Ypos=Ypos+100
               endif
               X=Xpos-GetIMageWidth(CurrentImage)/2
               Y=Ypos-GetIMageHeight(CurrentImage)/2
               DRawimage CurrentImage,X,Y,true
            next

         // DRaw the anim over the scene again.
            X=100-GetIMageWidth(CurrentImage)/2
            Y=200-GetIMageHeight(CurrentImage)/2
            DRawimage CurrentImage,X,Y,false
            box x,y,x+GetIMageWidth(CurrentImage),Y+GetIMageHeight(CurrentImage),false

         // SHow FPS
            Text 10,10,"fps:"+str$(fps())
            Text 10,30,t$

         // Hit the space key to toggle between trimmed and fixed size versions
            if Spacekey()=true
               DrawTrimmedImages=1-DrawTrimmedImages
               FlushKeys
            endif

      Sync
   loop

Function MakeBallImage(FrameSize,Radius,Colour)
   Index=NewImage(FrameSize,FrameSize)
   Rendertoimage Index
      For lp=Radius to 1 step -1
         Scale#=(Radius-lp)/Float(Radius)
         Circlec frameSize/2,frameSize/2,lp,true,RgbFade(Colour,Scale#*100)
      next
   rendertoscreen
EndFunction index

[/pbcode]

Changing Display depths can improve performance ?

While you might not think of it, the screen resolution (games display dimensions width & height) and the number of colours play a big part in final performance of our game. This is particularly important for those wanting to design programs that still perform well on older computers. While most of us take for granted that modern machines are capable of throwing high resolution images around the display fairly easily, the same can not be said for older systems. Which is clearly evident when we compare games from 2008 with games from 2004, or back in 2000, or 1980 for example.

The basic rule here is that the higher the resolution, the more pixels on screen, the more pixels on screen, the more power required to fill the screen with images at a reasonable speed.

For example,

If we use a screen size of 1048x * 768y with 32bit colour. That's (1024*768*4)=3,145,728 bytes for the screen. Yes, 3 meg !

Now lets say you CLS the screen and draw one layer of map to it. That means your GPU is roughly shifting 3 meg for the clear, and another 3 meg for the map render. So to draw the image costs approximately 6 meg of bandwidth per frame.

Now as we previously discovered, Every single pixel that is rendered (solid, transparent (mask colour) or translucent ) costs you performance! While Modern GPU's have a much higher fill rate (they can draw more pixels per second), older ones simply can't cope with too much graphics data. So we'll need a way to reduce the work load, if we hope to get reasonable performance on those systems.

We can reduce the work load in number of two ways.

1) Give the user the option of lowering the display depth from 24bit/32bit down to 16bit. This halves the amount of graphics data that has to be shifted, the trade of is that visually we lose some colour quality as well. It's also worth noting the many older GPU's were optimized for 16bit display modes over 24bit or 32bit modes. Really old systems often don't have 32bit mode at all. So on those systems there's no option but 16bit.

2) Change to smaller screen size. However, this is not always convenient when making 2D games. Since most of the artwork is drawn specifically for a certain display resolution. It is possible to scale the graphics media proportionally from within your program. However this will add a lot of extra processing work, and is unlikely to be very efficient or visually attractive. But possible if you really wanted. This approach is far easier in the 3D world.

3) Try to minimize the the amount of the wasted drawing the program might be doing. (The stuff mentioned in this very tutorial)

4) If you're doing things like rotation/image scaling then reducing the quality (the images dimensions) of the image can be very beneficial on old systems. Might not look as good, but this can keep the frame rate higher on those older systems.

5) Don't worry about supporting old clunkers anymore.. Optimizing is your choice, so you don't have to go out of your way support old systems.

Example

What example demonstrates how much the resolution can effect the performance of our program. It does this by rendering a set of boxes to a collection of the 'screen' sized images. You can cycle through the images and monitor the programs performance(). The boxes are drawn proportionally, so the scene looks the same regardless of the size. Moreover, It also displays the number of pixels drawn at each resolution. The bigger the display gets the more pixels that are being drawn, and therefore the slower the demo gets.

[pbcode]

   MaxCords=50

   // Create an array to hold
   Dim BoxCords(MaxCords,5)

   For lp=0 to MaxCords
      BoxCords(lp,1)=rnd(100)
      BoxCords(lp,2)=rnd(100)
      BoxCords(lp,3)=rnd(100)
      BoxCords(lp,4)=rnd(100)
      BoxCords(lp,5)=rndrgb()
   next

   Dim Screens(5)
   Screens(1)=NewIMage(320,240)
   Screens(2)=NewIMage(400,300)
   Screens(3)=NewIMage(640,480)
   Screens(4)=NewIMage(800,600)
   Screens(5)=NewIMage(1024,768)


   CurrentScreen=1

   // start of programs main loop
   Do

         // Clear the Screen
            Cls rgb (30,50,70)

            ThisImage=Screens(CurrentScreen)

            PixelsDrawn=DrawScene(ThisImage)

            Xpos=GetScreenWidth()/2-GetIMageWidth(ThisImage)/2
            Ypos=GetSCreenHeight()/2-GetIMageHeight(ThisImage)/2
            DRawimage ThisImage,Xpos,Ypos,false


            w=GetimageWidth(ThisImage)
            h=GetimageHeight(ThisImage)
            t$="Current Screen Size:"+str$(w)+"*"+str$(h)

         // SHow FPS
            Text 10,10,"fps:"+str$(fps())
            Text 10,30,t$
            Text 10,50,"Pixels Drawn:"+str$(PixelsDrawn)

         // Hit the space key to toggle between trimmed and fixed size versions
            if Spacekey()=true
               CurrentScreen =CurrentScreen+1
               if CurrentScreen>GetArrayElements(Screens(),1) then CurrentScreen=1
               FlushKeys
            endif

      Sync
   loop

Function DrawScene(ThisImage)

   rendertoimage Thisimage
   w=GetIMageWidth(ThisImage)
   h=GetIMageHeight(ThisImage)

   ScaleX#=w/100.0
   ScaleY#=h/100.0

   c1=rgb(200,100,10)
   c2=rgb(200,100,150)
   ShadeBox 0,0,w,h,c1,c1,c2,c2
   For lp=0 to GetArrayElements(BoxCords(),1)
      x1=BoxCords(lp,1)*Scalex#
      Y1=BoxCords(lp,2)*Scaley#
      x2=BoxCords(lp,3)*Scalex#
      y2=BoxCords(lp,4)*Scaley#
      swapifhigher x1,x2
      swapifhigher y1,y2
      boxc x1,y1,x2,y2,true,BoxCords(lp,5)
      PixelsDrawn=PixelsDrawn+((x2-x1)*(y2-y1))
   next
   rendertoscreen
EndFunction PixelsDrawn

[/pbcode]

Drawing Everything Real Time

Initially when we first start getting into game programming, one of the most common misconceptions, well, mistakes people make is they assume 'everything' needs to be drawn every update. While this is sometimes true, often we can selectively update and player won't really notice the difference. However, this does mean that it's not quite as cut and dry as just drawing everything all the time. But it's not rocket science either.

One the most common opportunities for selectively updating occurs in things refresh backdrop animations and foreground overlaps.

Scores and Health Bars

In this example all were doing is drawing a simple score and health bar bellow. It's the sort of thing we'd just slap into our game and forget, but we can actually improve the performance by selectively updating it. The same approach will work for all things HUD related in fact. So If it's not changing, don't render it!

[pbcode]

   LoadFont "Courier",1,24,0

   ScoreOverLayImage=NewImage(204,50)


   // start of programs main loop
   Do

      // Clear the Screen
         Cls rgb (30,50,70)

         scw=GetScreenWidth()/2
         sch=GetScreenHeight()*0.4

         if timer()>NextUpdate
            NextUpdate=timer()+100
            Score=Score+rnd(100)
            Health=mod(health+1,100)
         endif

         if DrawMode=0
            // DRaw the score and health directive to the screen, each update
            DrawScoreAndHealth(scw,sch,Score,Health)
         endif

         if DrawMode=1

            // Selectively cache the score& health to an image (since image rendering is quicker)

            RefreshScore=false
            if OldScore<>Score or OldHealth<>Health
               ; either the score or health have been changed, so we neeed to
               ; the score image cache
               OldScore=Score
               OldHealth=Health
               RefreshScore=true
            endif


            if   RefreshScore=true
               rendertoimage ScoreOverLayImage
               cls 0
               DrawScoreAndHealth(101,0,Score,Health)
               rendertoscreen
            endif
            drawimage ScoreOverLayImage,scw-101,sch,true

         endif

         if Drawmode=0
               t$="Drawing to the screen"
         else
               t$="Selectively updating"
         endif

         // SHow FPS
         Text 10,10,"fps:"+str$(fps())
         Text 10,30,t$

      // Hit the space key to toggle between draw mode
         if Spacekey()=true
            DrawMode =1-DrawMode
            FlushKeys
         endif

      Sync
   loop

Function DrawScoreAndHealth(Xpos,Ypos,Score,Health)

   s$="Score:"+Digits$(Score,6)
   CenterText Xpos,Ypos,s$

   th=GetTextHeight(s$)
   x1=xpos-102
   x2=xpos+102

   y1=ypos+th+5
   y2=y1+th

   c=rgb(20,30,40)
   boxc x1,y1,x2,y2,true,c
   c1=rgb(220,30,40)
   c2=rgb(20,40,240)
   ShadeBox x1+2,y1+2,x1+(Health*2),y2-2,c1,c2,c2,c1
EndFunction

[/pbcode]

Selectively Refreshing The Screen (Dirty Rectangles)

Often when we start writing games, it's common to see programmers just throwing everything at the graphics engine/hardware and let that deal with it. While this may work fine for some games on some computers, but not others. The reason for this is that graphics engines are generally designed to just do what you tell it to do. if you tell it draw the same image/circle/pixel 100 times, then that's what it'll do. It doesn't know what you're trying to achieve, or that you might be drawing something that isn't even on screen.

Selective refreshing is conceptually simple. Rather than just shoveling everything on screen every frame, we going to pay closer attention to what we're drawing and where. In this particular example, we're simply drawing some sprites (with alpha channel) moving around on a static backdrop. If we take the brute force approach, we've have a FX image the size of the screen. Copy the backdrop to this image, draw all the sprites over it, then copy the screen image to actual screen. This will work, but do we really need to refresh the entire backdrop every frame if the backdrop is static picture ? Nope. We only need to refresh the parts of the backdrop that are being overwritten.

There's a few ways of doing this, in this example we're simply going to treat the backdrop picture as if it was Tile Map. Using an array to signify when a tile (of rectangular portion of the backdrop image) needs to be refreshed to it's original state. During our sprite movement routines, we not only move the sprite, but we work out what tiles it's going to overwrite when it's drawn. These backdrop tiles are then flagged as needing to be refreshed when the backdrop is being restored.

So each loop we're basically doing this

Code Select


       Do
           * Refresh Backdrop (redraws any tile that was flagged as being overdrawn last refresh.  this restores the backdrop to it's original state)
           For TileY = 0 to TilesDown
             For TileX = 0 to TilesAcross
                   if TileRefresh(xlp,ylp)
                            ; redraw this portion of the backdrop,  since it was previously covered by a sprite 
                   endif
             next
         next

           For each character in game
                   .. Move character (AI , physics etc)
                   Get characters final screen coordinates and calc bounding rectangle
                   convert the  sprites screen coords to array coords   and fill this rectangle of tiles in the array with 
           next
           refresh the screen
       Loop

What this lets us do, is we're effectively removing the cost of the redrawing the full backdrop each update. That is assuming we dont have a full screen of sprites covering the backdrop, in that case it'll all get redrawn regardless.

See example: Selective Tile Map Refreshing (http://www.underwaredesign.com/forums/index.php?topic=2677.0)

See Example #2 Dirty Rectangles (Combined Video & Fx Rendering) (http://www.underwaredesign.com/forums/index.php?topic=3395.0)

(http://www.underwaredesign.com/PlayBasicSig.png)

Visit www.PlayBasic.com (http://www.playbasic.com)

A Crash Course In Optimization Part #4 - Smarter Program Logic

Becoming a good programmer (games or otherwise), goes far beyond the graphical mumbo jumbo of the day. What do I mean by that ? - Well, it doesn't matter what display technology/graphics libraries you use, if your programming & problem solving abilities aren't up to it. Then you'll most likely end up with some pretty visuals on screen, but be unable to actually get the game working. Which is incredibly common !

So here, we'll try and address some these stumbling blocks new/old programmers are likely to run into. Hopefully this will help improve your algorithmic knowledge (methods) and code design principals. Which might just help you shake the brass monkey!

There's tips could be categorized as,

Design Tips

* Clearer Code means faster development

Algorithm Tips

* Spacial Partitioning (Collision)

Note: Remember, improving our programming design and algorithmic knowledge(methods) has a two fold effect. While better method/algorithms will certainly help optimize our programs performance, the bigger gain can come improved code design/structure. Which will allow us to create larger more complex programs, with less errors at much faster rate.

Clearer Code means faster development

Optimization is the pursuit of efficient design. Conceptually this goes far beyond just improving the execution speed of a piece of program code. By learning to write clearer more structured code to begin with, we can optimize our very development and bug test progresses. Allowing us to develop larger more complex programs with less faults in less time. How do we go about this ?

The simple answer is that we need apply some structure, some rules to our own programming style. Thus forcing ourselves to think before acting. While this might not be the most intuitive thing to do, in particular for new programmers, all too often badly/ugly code is the result of an impulsive blast at the keyboard.

Just what is bad code ? - While highly subjective, we'll classify a bad or ugly code fragment as being something that fails to efficiently convey it's meaning to the reader in a well laid out manner. The key point here is that even though it might 'execute correctly', if it's not written in legible fashion, then this can make the code difficult to debug, very difficult to change and almost impossible to reuse.

So in no particular order this means,

* Use meaningful Variable, Array, Types & Function Names throughout our program.

Using meaningful names (of variables/functions/arrays etc) helps with the programs readability dramatically. As the author of a program, we can sometimes be forgiven for using names that don't really related to what they hold or do. This might not be a huge issue while we're developing a section of program code, as while we're writing it, what these variables/functions name represent will be fresh in our memory. It's when we take a break from programming, or show the code to somebody else. It's here that such decisions become more of a liability than anything else. Why ? - Well, beside from being error prone, it means that in order to understand what a section of code is doing, we have to re-learn it from scratch every time. The bigger the fragment, the more difficult and time consuming this becomes.

* Comment your code.

Comments might be used to describe the approached a section of code is using to solve a particular problem, outline any known limitations of the code fragment, code separators, to do lists anything really. So comments can help us future proof our own programs.

* Use the appropriate control structures for the algorithm/logic we're implementing.

* Use Functions & Subs to divide and conquer.

Related articles

Planning Your Project (http://www.underwaredesign.com/forums/index.php?topic=1018.msg8246#msg8246)
Thoughts on Planning / Code Design (http://www.underwaredesign.com/forums/index.php?topic=1018.msg13328#msg13328)
Choosing The Correct Variables (http://www.underwaredesign.com/forums/index.php?topic=2401.0)

Wiki articles

Given the size of the this particular subject, we can't possibly cover it in a few paragraphs, so feel free to could out some out.

http://en.wikipedia.org/wiki/Structured_programming (http://en.wikipedia.org/wiki/Structured_programming)
http://en.wikipedia.org/wiki/Procedural_programming (http://en.wikipedia.org/wiki/Procedural_programming)

Spacial Partitioning

One increasingly common problem in game programming, relates to the efficient detection of the collisions between the various characters in our game world. Effectively the more characters we have, the more collisions possibilities, and the more checks, the slower the program will operate.

Imagine this scenario, imagine we have a room full of people all walking around the room randomly trying not to bump into each other. So our programs AI will be required to predict collisions between the various people, so the characters can take evasive action. Which just means that each time a person moves, we'll have to make sure the person isn't blindly moving into another.

One way of doing this, is by checking each person against every other person in the room. So if there's 10 people. Person #1 is compared with persons 2,3,4,5,6,7,8,9,10. Person #2 is compared with persons 1,3,4,5,6,7,8,9,10 and so on for all 10 people. While this approach will work, it's doesn't scale up very well, which is it's main flaw. For 10 people we're only making (9*9)=81 collision checks. Which isn't too bad. But what if the room has 100 or 1000 people in it ? It's here the cracks start appearing.. For 100 characters we're performing (99*99)=9801 checks, for 1000 characters it's (999*999)=998,001 checks. If we think about, by comparing each person to every other person, we'll going to end up making lots of redundant collision checks. Meaning lots of wasted processing time and ultimately a slower running game.

So what's the alternative if comparing everything to everything is too much work for our game ? - Well, we need to start thinking of ways to reduce the number of necessary collision checks. This can be tackled in a while bunch of ways, But for now we'll look into some basic Spatial Partitioning

Spatial partitioning (as the name suggests) is where our program keeps track of the region or zone each character is currently within. So when it comes to detecting collisions between the characters, we only need to the compare our current character, with just the characters that share the same zone. This simple design change, will dramatically reduce the collision overhead.

If we use the example from above again, but this time lets imagine we have an overhead view of room with 1000 people randomly positioned on screen. This time, rather than comparing each person, with every other person, we'll first run though our list people are assign them one of two zones. Characters standing in the top half of the display will be in zone A, characters standing in the bottom part of the display will be in zone B. Characters that are standing in the middle of the screen, will be placed in both zones A & B. So for example sake, lets say that Zone A has 550 people in that half of the screen and Zone B has the remaining 450 people. This will give us an approximate collision test count of (550*550) + (450*450) = 505,000 checks. So we've halved the work load by simply splitting up the screen into two zones. What would happen if we split it into four zones, eight zones ? Simple, we get lower collision check count.

This example uses this very approach (vertically).

See Partition (inter object) Shape Collision (http://www.underwaredesign.com/forums/index.php?topic=1237.0)

UnderwareDESIGN

PlayBASIC => Resources => Tutorials => Topic started by: kevin on June 30, 2008, 01:46:38 PM