PlayBASIC V1.64N2 / V164N3 (Work In Progress) Gallery

Started by kevin, April 22, 2012, 12:07:12 AM

Previous topic - Next topic

kevin

#15
 PlayBASIC V1.64N2  Beta #7 - The Knock-On Effect

     Work continues on expanding the parser to be more 'type cache' aware.   Internal testing shows the proof of concept works; there's no disputing that.  The tricky part is that the parser has to detect situations where previously cached data has become invalid.  Imagine we have a typed high-score array called HighScore.   Now, if you manually insert new values into the high-score table, we often see a section of code that loops from the end down to the insertion point, reading each cell and copying it to the cell above.
 
      A bit like this,


   
     For copylp=ArraySize to InsertPosition step -1
             HighScore(Copylp+1).Score = HighScore(Copylp).Score
     next


   
      What's surprising about this very simple routine is that it's one where caching could easily break runtime execution.   The right-hand side reads HighScore(Copylp), caching that access point, but the write occurs on cell Copylp+1 in the same array.   The parser can detect the differing index patterns and fall back to a standard array write inside the expression.

      But here's where it gets rather interesting.   Now let's imagine a situation where we're reading data from a typed array, but this time we're manipulating the index between accesses on separate lines.
 

   
          print Array(Index).Value
          Index ++

          print Array(index).Value



    And bingo, this won't work.   The parser compares the pattern of each type access, and in both print expressions the pattern is identical.   But clearly the code doesn't want to access the same element in both prints; in this case it wants successive elements of the array.  Even though the situation is fairly contrived, it reveals a hidden complexity in caching array accesses across multiple expressions: a parameter variable might have been altered since the previous access, making any previously cached result invalid.

   While the above situation can indeed be handled, it's far from unique.   So rather than trying to run this project-wide, the next objective is to shore up the caching across self-contained expressions first, as seen in the HighScore example above.   I pretty much spent all Saturday working through the logic of that change.   Previously I'd mainly been testing linked typed variables, where user code tends to be set out in a fairly sequential manner, which creates more potential caching opportunities.   Implementing the cache isn't a lot of code; it's all about tweaking the logic so that what's generated is valid VM code, equivalent to what would normally be produced.

   Now, as with all things parser, you can get knock-on effects when making even tiny changes.  In PlayBASIC, parsing and code generation occur at the same time.  During an expression the parser does all the grunt work stepping through the user code, calling the operation generator to spit out the VM code as it goes.  In long expressions we often need to store temporary results, which end up in temp registers.    Interestingly, as a by-product of adding type caching, the parser can better detect when operations occur between temp registers.     That means we use less temp data, resulting in better memory fetching and fewer locals in user-defined functions.

   What does this mean?  It means some free function-calling speed from a completely unrelated change.   Below are the results of running the test code (below) in V1.64N and V1.64N2 Beta 7, which gives us a few free FPS back for absolutely nothing.



Tests=10000

Do
    cls 0

    inc frames

    // ===================================
    // User Functions VS Projected Subroutines
    // ===================================

    // ==========
    // Test #1
    // ==========

    T=timer()
    For LP=0 to Tests
        result=SomeFunctionCalc(10,lp)
    next
    tt1#=tt1#+(timer()-t)
    Print "Test #1 Average Time:"+Str$(tt1#/frames)

    // ==========
    // Test #2
    // ==========

    T=timer()
    // Call the Psub function
    For LP=0 to Tests
        result=SomesubCalc(10,lp)
    next
    tt2#=tt2#+(timer()-t)
    Print "Test #2 Average Time:"+Str$(tt2#/frames)

    // ==========
    // Test #3
    // ==========

    T=timer()
    For LP=0 to Tests
        result=SomeFunctionCalc(10,lp)
    next
    tt3#=tt3#+(timer()-t)
    Print "Test #3 Average Time:"+Str$(tt3#/frames)

    // ==========
    // Test #4
    // ==========

    T=timer()
    // Call the Psub function
    For LP=0 to Tests
        result=SomesubCalc(10,lp)
    next
    tt4#=tt4#+(timer()-t)
    Print "Test #4 Average Time:"+Str$(tt4#/frames)

    print ""
    print "Fps:"+Str$(fps())

    Sync
loop


Function SomeFunctionCalc(A,B)
    A=A*B
EndFunction A

Psub SomeSubCalc(A,B)
    A=A*B
EndPsub A





PlayBASIC V1.64N2  Beta #7b - Cached Type Reading

      Type caching has been expanded and activated across each expression.  There are still a few gotchas, but we're getting there.     Below (second pair of pictures) we're looking at the results of PB1.64N and PB1.64N2 Beta 7b running this snippet.    Allowing the parser to cache through whole expressions means that any code set out like   ME.X = ME.X + 1 can now benefit from the caching also.   But the really big winner in all this is when you access multi-dimensional typed arrays.    If you look at the results, the caching wins us back about 10 milliseconds in that part of the test, which I think demonstrates just how much work the runtime has to do to read/write safely from an array.




type Vector3D
    x#,y#,z#
EndType

type Stuff
    Name$
    X
    y
    z#
    Table(100)
    V3 as Vector3D
EndType

Dim Me as stuff
Dim test(10,10) as stuff
Dim cool as stuff list

me = new stuff

cool = new stuff

me.v3 = 0

max=25000

Do

    cls
    frames++

    tests=0
    t=timer()
    for lp=0 to max
        me.X+=1
        me.y+=2
        me.z+=3
    next
    tt1#+=timer()-t
    print "Math Short Cuts on Typed Variable"
    PrintTime(tt1#/frames)
    print Me.x
    print Me.y
    print Me.z
    Me.x=0
    Me.y=0
    Me.z=0
    tests++

    t=timer()
    for lp=0 to max
        me.X=me.X+1
        me.y=me.y+2
        me.z=me.z+3
    next
    tt2#+=timer()-t
    print "Long Hand Math Operators on Typed Variable"
    PrintTime(tt2#/frames)
    print Me.x
    print Me.y
    print Me.z
    Me.x=0
    Me.y=0
    Me.z=0
    tests++

    Index=6
    Index2=5

    test(Index,Index2) = New Stuff

    t=timer()
    for lp=0 to max
        test(Index,Index2).x=test(Index,Index2).x+1
        test(Index,Index2).y=test(Index,Index2).y+2
        test(Index,Index2).z=test(Index,Index2).z+3
    next
    tt3#+=timer()-t
    print "Long Hand Math Operators on 2D Typed Array"
    PrintTime(tt3#/frames)
    print test(Index,Index2).x
    print test(Index,Index2).y
    print test(Index,Index2).z
    tests++

    print "BandWidth:"+Str$(max*24*Tests*4)+" Bytes"
    print "FPS:"+Str$(fps())

    Sync
loop

Function PrintTime(T#)
    ink $ffff0000
    print "Time:"+Str$(t#)
    ink -1
EndFunction


   



Just got this hooked up and am testing it now; the second pair of pictures shows the results.




kevin

#16
PlayBASIC V1.64N2 BETA #7 (Retail Compiler Only Beta) - (Avail for Registered Users ONLY)

     PlayBASIC V1.64N2 Beta #7: this revision includes the type caching in the compiler/runtime.

     In the previous beta, types could only be cached within shorthand math operations, which was a nice safe way to introduce the feature; in this version the parser is able to cache type accesses across an entire expression.    This expansion means the compiler can now detect and simplify expressions styled like  Me.X = Me.X + 1  for the runtime.  In fact, if you stack multiple accesses within the one expression, you can take further advantage of caching.   So if we had  Me.X = Me.X + Me.Speed, the runtime pulls the ME type into the cache once, after which it's only accessing the cached version, so at least two of the ME accesses in that expression are cached.  Performance-wise, the more dimensions an array has, the better this works.
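As a rough sketch (the type and field names here are invented for the example, and I'm assuming caching has to be switched on via OptExpressions in these betas), the cached form is just ordinary PB code; the win comes from the compiler fetching the structure pointer once for the whole expression:

```playbasic
OptExpressions 3     ; assumed: type caching must be enabled in these betas

Type tShip
    X#, Speed#
EndType

Dim Me as tShip
Me = New tShip
Me.Speed = 2.5

; The two ME reads and the ME write below can share one cached
; structure fetch, since they all sit inside a single expression.
Me.X = Me.X + Me.Speed

Print Me.X
Sync
WaitKey
```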

     While the parser can cache within each expression, it doesn't support caching accesses across multiple expressions or lines of code.  The reason is that once we do that, we're in a bit of a minefield, because as mentioned above, there are any number of situations where an array index variable could be changed, a list pointer moved, or the flow of code can simply break it.    But we'll see.



Download


     old beta deleted


kevin

#17
 PlayBASIC V1.64N2  Beta #8 - Global Type Caching

     Moving on from the expression type caching of the previous betas, we head into the wild unknown wilderness that is global type caching.   In Beta 7, caching is activated during assignment-styled expressions and then turned off again, to avoid potential logical issues at runtime.   Most of the logical stuff is fairly predictable, where caching could result in runtime errors; it's the unforeseen stuff that worries me most about this feature.  As such, caching is disabled by default, which reverts the parser to its original behaviour.   To enable it, we set bit 1 in OptExpressions.  Bit 0 toggles the standard optimizations (on by default), so to enable both modes we set OptExpressions to an integer value of 3.
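In user code the toggles look something like this (bit 0 = standard optimizations, bit 1 = type caching, as described above):

```playbasic
OptExpressions 3    ; binary %11 : standard optimizations + type caching
; ... code compiled here gets both optimization modes ...

OptExpressions 1    ; binary %01 : back to the default behaviour
; ... code compiled here gets the standard optimizations only ...
```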

    Currently, I'm slowly picking through the parser, tagging the various situations where the cache should be flushed or disallowed completely.   I'd like to say it's all working wonderfully well, but it isn't.  Small stuff seems to work OK, but the bigger apps virtually all die at this point.   Thus we'll focus on the small stuff for now :)  - like, for example, the 'Cached Type Reading' benchmark from above.

    The post above shows the results for PB V1.64N and V1.64N2 Beta 7 running the benchmark; now below we've got today's results.    At first glance you might not see the big deal, but Beta 8 can run a complete fourth test in the time V1.64N takes to run three of them.    What's more interesting is that the fourth test is benching 3D typed array accesses.     Now granted, the test code is set out to cache well, but I didn't think it'd work that well.  To put the test in terms of bandwidth, the demo is pushing around half a gigabyte per second, not too bad for a 10-year-old runtime on almost 7-year-old hardware.



PlayBASIC V1.64N2  Beta #8b - Write Caching

     Previously the cache state was only enabled after a READ access of the typed structure; this has been expanded to support writes also.  This wasn't a big deal performance-wise, since generally if we're only writing to a type, we're probably initializing its fields prior to use in a create-object/character-styled function.   A few extra fetches in a routine that's not being called thousands of times isn't going to kill us, but it adds extra bloat to the runtime code.

     You can get a quick example of the style of routine I'm talking about from the 8-Way Layered Star Field / Asteroids Style example, where there's a function that creates the layers and is virtually all writes initializing the structure.

     This bit,

PlayBASIC Code:
Function CreateLayer(Size,depth#)
    Index=GetFreeCell(Layers())

    Layers(Index)=new tLayer
    Layers(Index).Size=Size
    Layers(Index).x=0
    Layers(Index).Y=0
    Layers(Index).Depth=depth#

    NumberOfDots=rndrange(50,100)
    Layers(Index).Shape=Make_Randomized_Dot_Shape(NumberOfDots,Size)

EndFunction




     So we've got a bunch of successive writes to the same type.  Obviously in this case it's caching the second, third and subsequent writes, but not all of them.  Expressions that include function calls flush the cache, so on the shape creation line, the write to Layers(Index).Shape is an absolute write rather than a cached one.    We can rearrange the code, though, to make sure all successive writes are cached, by pre-computing the function call into a variable and writing that later.
 
     Like this,

PlayBASIC Code:
Function CreateLayer(Size,depth#)
    Index=GetFreeCell(Layers())

    NumberOfDots=rndrange(50,100)
    ThisShape=Make_Randomized_Dot_Shape(NumberOfDots,Size)

    Layers(Index)=new tLayer
    Layers(Index).Size=Size
    Layers(Index).x=0
    Layers(Index).Y=0
    Layers(Index).Depth=depth#
    Layers(Index).Shape=ThisShape

EndFunction




     Both versions are logically the same, but the second one just lets the optimizer do a better job.


kevin

#18
 PlayBASIC V1.64N2  Beta #8c - Vector Stacking

     As with all upgrades, it's not all feature adding; there's an add --> test --> repeat-until-done cycle.  I've ironed out a number of type-cache-related issues.  It's far from 100%, but most of the big programs I was having problems with work again, which is a good thing.  So now it's time to dive deeper and find those nitty-gritty issues through usage, hence knocking up some tech demos.

     One demo I've been messing around with is basically a 3D particle explosion.  We have a linked list of 10,000 points and are plotting each point's path, which is clearly overkill, but it makes for a good test bed for brute-forcing the runtime.  But if we have 10,000 points and there are, say, 15 array accesses per point, then as much as caching helps out, it's not a silver bullet in this type of situation.

     So the pictures below compare the two methods side by side: the first is the native PB version of the particle code (unoptimized); the second uses a type of vector stack, where the user defines a list of operations to apply to a stack of vectors and then executes the list on demand.   While it's a little more work to set up, the results speak for themselves, with the stack version of the particle routine being around three times faster.
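I don't have the actual vector stack API written up here, so the function names below are purely invented for illustration, but the shape of the idea is: build the operation list once, then execute it over the whole stack each frame, instead of running thousands of individually interpreted PB expressions:

```playbasic
; HYPOTHETICAL sketch only - these function names are invented for illustration
CreateVectorStack 1, 10000               ; a stack of 10,000 vectors

; define the operation list once, outside the main loop
StackOp 1, "Position = Position + Velocity"
StackOp 1, "Velocity = Velocity * 0.98"

Do
    Cls 0
    ExecuteVectorStack 1                 ; run the whole list over all vectors
    Sync
Loop
```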

     

ATLUS


kevin


   yeah sometimes we win some, sometimes we lose some, but every gain is nice, no matter how big or small..   


kevin

#22
  It works pretty well, but I'm sure for many it's going to be a pretty abstract concept, as described here (login required).



kevin

#23
   PlayBASIC V1.64N2  Beta #9 - Disassembler

      Work continued on the upgrade over the weekend, but with a shift in focus geared more towards testing than anything else.    I guess the only new addition is a simple parser to convert string-based vector stack expressions into an operations list for the resolve-stack function, making the process a lot easier.    Unlike the PB compiler, vector stack operation lists can be processed at runtime, giving a bit of leeway in how an operation is solved.

      Anyway, in terms of testing, my focus has been on trying to validate where and when the compiler is seeing cacheable type operations.   For some reason, some programs with heavy type usage return little or no successful caches, even though the code is serialized.    After a lot of head scratching, I came to the conclusion that it'd be much easier to see exactly where and when it's failing if I could look at a disassembly of the byte code.     Much older versions of PB had something similar built in, but it was removed long ago.    Luckily I had most of the material just sitting around.

      The disassembler can currently decode about 50% of the PB core instruction set (I don't need all of it).  The initial version just showed the opcode and parameter data, so you didn't get anything even resembling PB code back.   But yesterday I added some 'clean up' routines that try to reconstruct the type of expression that created each opcode, making the output a lot more BASIC-looking and the logic easier to follow.   Some logic can basically be re-sourced with pretty high accuracy; some bits you can't, as the information just doesn't exist, or it would require multiple passes to try and guess the original logic.

       The tool is currently working well enough for me to get an overview of just what the compiler is producing in various situations, and it's already revealing some rather interesting results.  There seems to be the odd occasion where certain expressions generate extra MOVE operations where they could be short-cut.  The odd extra operation is no big deal in code that's executed infrequently, but that weight is magnified in brute-force situations.

       Another oddity appears when initializing the function scope, where it seems to be writing some bogus data; dunno what's up with that.   Perhaps the best thing about the tool so far is that it reveals a few situations where merging some common operations could reduce the VM overhead further.   Some of that, I think, could be done a lot more easily in a dedicated optimization pre-processing pass, but it's well worth putting on the ideas board.



  PlayBASIC V1.64N2  Beta #10 - Download

    Old beta deleted:

   Remember: type caching has to be enabled in these versions.  To do so, add the line  OptExpressions 3  to the start of your program.

   

kevin

#24
  PlayBASIC V1.64N2  Beta #10 - Inner Workings Of Type Caching

       Testing is going pretty well thus far; the more code I look over the disassembly outputs from, the more little potential tweaks appear.   I think most of the problems I've been having with caching relate to array indexes where the index has changed between accesses; the parser isn't smart enough to trap those situations as yet.  As a whole, the compiler takes a pretty nervous approach to caching, which you can see when picking over the code it's producing for various blocks.  If I suss out the problem areas, then it should be possible to relax it a bit more, which in turn will bring further improvements.

        The disassembler tool is working well.  There's still LOTS of stuff it doesn't have a clue about, but for what I'm interested in (the core logic) it works well enough now.   So much so, I've dressed it up a little more since yesterday.   Now it allows us to see reconstructed BASIC-styled versions of the instruction set.  For some stuff there's no obvious BASIC translation, but it's a lot more readable and easier to browse now than it was before.     Attached to this post you can see an example routine with caching enabled (it's one of the examples from earlier in this thread).  The tool colours key lines to help pick out the operations I'm interested in.   Not too pretty to look at, but much easier from a validation standpoint.
         
         

ATLUS

disassembler tool looking pretty good ^__^

monkeybot

what does OptExpressions  do?  i seem to have missed that one.

kevin

#27
  OptExpressions is a compile-time switch, much like Explicit, which you can toggle on and off anywhere you like (within reason :) ).  The switch controls a group of redundancy optimizations that the compiler can apply to expressions and move operations during code generation.    It's technically not a user command, since there's no reason to actually turn it off today.  It's basically a legacy feature now, only used when something breaks in code generation, like the string-function assignment problems in V1.64M, for example.

  Much like the type caching changes mentioned above, the standard optimizer features were originally turned off by default in those releases, since they would help some programs but completely break others.    It took some time before those issues were resolved, and I'm expecting the same here.   Once the issues are all ironed out, the cache mode will be on by default.


kevin

#28
  PlayBASIC V1.64N2  Beta #11 - Tuning The Sprite / Map Instruction Sets

     Shifting focus from the 'compiler' back to the runtime instruction set for a few days, namely how the VM traps instructions, with the aim of adding a few batching/structure read/write typed opcodes as well.    Today I've been through all the mapping and sprite libs trying to arrange them more efficiently.   The less VM overhead the better, basically, as over time (9 to 10 years of it) things bloat out.   But with a little cut and paste, the updated libs are performing better in brute-force situations.

    In this example, we're running three separate For/Next loops over 10,000 sprites, pulling/setting and moving the X/Y coords of each sprite, simulating what most game loops will do at one stage or another.

   eg

PlayBASIC Code:
Max=10000
Dim Spr(Max)
For lp=0 to Max
    Spr(lp)=NewSprite(rnd(1000),rnd(1000),1)
next


Do
    Cls

    Frames++

    t=timer()
    For lp=0 to Max
        spr=Spr(lp)
        xpos=getspritex(spr)
        ypos=getspritey(spr)
    next
    tt1#+=Timer()-t


    t=timer()
    For lp=0 to Max
        spr=Spr(lp)
        xpos=getspritex(spr)
        ypos=getspritey(spr)
        positionsprite spr,Xpos,Ypos
    next
    tt2#+=Timer()-t


    t=timer()
    For lp=0 to Max
        spr=Spr(lp)
        movesprite spr,1,1
    next
    tt3#+=Timer()-t


    print "Pull Positions:"+Str$(tt1#/frames)
    print "Pull + Set Positions:"+Str$(tt2#/frames)
    print "Move Sprite:"+Str$(tt3#/frames)
    print "Fps:"+Str$(fps())

    Sync
loop



        The tuning up of the instruction set gives us some more free speed back (about 5 fps on the test system) for no real deep investment of time, so it's basically a free performance gain.   But we've got to remember that this is a brute-force test of 10,000 iterations.    Most PlayBASIC games would be lucky to have 25 active sprites on screen, so obviously such opts will have less impact in everyday situations.  Having said that, the streamlining has been applied across all the SPRITE functions, so if you have a lot of queries/function calls per sprite, you may well see some real-world benefit even with small object counts.   You may not, too, but them's the breaks :)

        The mapping commands have had much the same treatment, with about the same returns really, but slightly better.  Some calls will be much quicker, others will be the same.  A lot of the collision functions had a lot of VM overhead in the caller; much of that has been trimmed away now, flattening out the performance of all the calls in the Map command set.

         Another thing I've been experimenting with in the maps library is a way to batch pairs/groups of calls.   The idea is that if program code is set out so there are two (or more) lines of basically the same function call, the caller can batch them, avoiding some set-up work between operations.  In testing it seems to work OK, actually: in a brute-force test peeking map tiles tens of thousands of times, today's test version of PB runs about 20 fps faster than V1.64N.   Hard to tell if it'd be worth rolling out.

         Where I think it would help is when the user is calling built-in functions to query the properties of an entity, such as a sprite or a map.

          Eg.
            xpos=getspritex(spr)
            ypos=getspritey(spr)

          So if we have code like this, the calls could be batched together, making the second call almost free of overhead.  The more I think about it, this type of concept might well be better suited to the third generation of the runtime.




  PlayBASIC V1.64N2  Beta #11 - Manually Reading Sprite Vertex

         So tonight's session continues on from today's cleaning up of the sprite caller lib in the VM.  Previously I'd only done the GetSprite-styled functions, but I've pretty much gone over the whole thing now.    To benchmark this we have the following snippet from 2004 (you'll need the bubble pic to run it).   The original code runs as is, but with some slight tweaks we can get it running even better in the V1.64N revisions.   No great surprise there, since N is much quicker than the version of PB this example was originally written in.

        Anyway, the really interesting thing is that we're seeing a tangible performance gain from V1.64N2.  It's running around 208 FPS in N2, compared to 185-ish in 1.64N, which equates to roughly a 12% gain.  So in sprite terms, N2 can render more sprites than its older brother at approximately the same speed.   So there's some benefit to be had.

PlayBASIC Code:
; PROJECT : Get_A_Sprites_Vertex
; AUTHOR  : Underware Design
; CREATED : 8/13/2004
; EDITED  : 7/08/2005

; added to cache types
optexpressions 3

randomize 67


` +====================================================*
` >> SPRITE Rotation/Scale Test <<
` +====================================================*

; convert default font #1 from a true type font to a bitmap version
MakeBitmapFont 1,$ffffff,8

; load the bubble image

LoadfxImage CurrentDir$()+"..\..\..\gfx\bubble_64x64.bmp",1
; LoadImage CurrentDir$()+"gfx\bubble_64x64.bmp",1
; PrepareFXImage 1


; get the screen width/height
sw=GetScreenWidth()
sh=GetScreenHeight()

; set starting number of bubbles
BubblesN = 100

; Create the Bubble Type
Type Bubble
    X#,y#,XSpeed#,YSpeed#,Angle#,RotSpeed#
EndType


// ===================================================================
// Init Sprites: dims an array and sets up a bunch of sprites ready for use
// ====================================================================

Dim Bubbles(BubblesN) As Bubble
For starsI=1 To BubblesN
    Bubbles(starsI).x = Rnd(sw)
    Bubbles(starsI).y = Rnd(sh)
    Bubbles(starsI).xspeed = Rnd(5)
    Bubbles(starsI).yspeed = Rnd(5)
    Bubbles(starsI).angle = Rnd(360)
    INC speed#
    If speed#>10 Then speed#=1
    Bubbles(starsI).rotspeed = speed#

    CreateSprite starsI
    SpriteImage starsI,1
    SpriteTransparent starsI,1
    PositionSprite starsI,Bubbles(starsI).x,Bubbles(starsI).y
    SpriteDrawMode starsI,2
    SpriteHandle starsI,(GetImageWidth(1)/-2),(GetImageHeight(1)/-2)
    RotateSprite starsI,Bubbles(starsI).angle
    ScaleSprite starsI,0.15+(Rnd(100.0)/100.0)
Next starsI



; Direct all drawing to the screen (back buffer)
RenderToScreen

; Start of DO/Loop
Do

    cls rgb(80,50,30)

    // =================================================================
    // Run through and update all the sprite positions
    // =================================================================

    lockbuffer
    For starsI=1 To BubblesN

        x#=Bubbles(starsI).x+Bubbles(starsI).xspeed
        y#=Bubbles(starsI).y+Bubbles(starsI).yspeed

        TurnSprite starsI,Bubbles(starsI).rotspeed


        If x#<0 Or x#>sw
            Bubbles(starsI).xspeed=Bubbles(starsI).xspeed*-1
        EndIf

        If y#<0 Or y#>sh
            Bubbles(starsI).yspeed=Bubbles(starsI).yspeed*-1
        EndIf

        PositionSprite starsI,x#,y#

        ; Read the sprite's rotated vertex positions
        x1#=GetSpriteVertexx(starsI,0)
        y1#=GetSpriteVertexy(starsI,0)

        x2#=GetSpriteVertexx(starsI,1)
        y2#=GetSpriteVertexy(starsI,1)

        x3#=GetSpriteVertexx(starsI,2)
        y3#=GetSpriteVertexy(starsI,2)

        x4#=GetSpriteVertexx(starsI,3)
        y4#=GetSpriteVertexy(starsI,3)

        Line x1#,y1#,x2#,y2#
        Line x2#,y2#,x3#,y3#
        Line x3#,y3#,x4#,y4#
        Line x4#,y4#,x1#,y1#

        Bubbles(starsI).x=x#
        Bubbles(starsI).y=y#
    Next starsI
    unlockbuffer


    DrawAllSprites

    Print "Manually Reading the Sprites Rotated Vertex"

    Sync
Loop


     


kevin

#29
 PlayBASIC V1.64N2  Beta #11c - Smoothing Out Type Caching

       Yesterday's sprite/map reshuffling passes ended up going a lot faster than I'd expected.  It's not a lot of hands-on coding work; it's just one of those things that often leads to broken instructions if you're not careful, of which there were a few, but it all went pretty smoothly for once.   Which leads me back for another round at the type caching problem, which thankfully I've had a few more ideas about since the last pass.

        In previous versions, the type cache optimizer is intentionally set up to be rather overprotective: the moment it spots a user or built-in function call within an expression, it turns caching off, regardless of whether the function/command damages the previously cached data or not.   To combat this, the command tables now have a computed safety flag.    The process is a bit like a shotgun, really: any function/command that accepts passed arrays as a parameter is set as potentially unsafe; the rest are tagged as safe.    Now, yeah, I know that just because a function is being passed an array doesn't necessarily mean our cache data will be broken, but better safe than sorry.

        Now that's OK for built-in commands/functions, but for user functions there's really no easy way to accurately guess whether the function destroys our cache data or not, so those calls disable caching until after the call.    The reason this is such an issue is that we're caching a raw pointer to a previously-known-to-exist thing, a type in this case.   If we allow caching across user functions, and that function alters a previously cached type pointer, then we're in for a world of hurt.  The best-case scenario is that it'll pop a runtime error, but I can't check at runtime where the cache data comes from.  So if it's been corrupted unknowingly, then any following accesses might just end up reading/writing the wrong structure of the same type, or some different structure completely, potentially resulting in a horrible death.   It's possible to solve, but way too much work for the return.
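To make the hazard concrete, here's a contrived sketch of why a user function call has to kill the cache. (This assumes, as in the earlier examples, that Dim'd typed variables are visible inside functions; the names are invented for illustration.)

```playbasic
Type tStuff
    X
EndType

Dim Player as tStuff
Player = New tStuff

Player.X = 10        ; the pointer to PLAYER could be cached here

RebuildPlayer()      ; the parser can't know this call replaces PLAYER,
                     ; so it must flush the cache across the call

Player.X = 20        ; a stale cached pointer here would write into the
                     ; old, discarded structure

Function RebuildPlayer()
    Player = New tStuff   ; allocates a new structure; the old pointer dies
EndFunction
```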

        Even so, what we've got now is already able to chomp out slabs of bogus accesses from some of the bigger example programs, such as YAAC and THESIUS XIII.  The tweaked optimizer spots around 100 caches in the YAAC demo (Yet Another Asteroids Clone) and almost 400 in the THESIUS demo, making around a 5-6K saving in the byte code alone.  Saving a few K is not a big deal in itself; it's just that that's 5K fewer memory accesses the runtime has to make in order to do the same thing.   So provided the opts are in routines that are heavily used, we're gaining some more free performance.     The downside is that the THESIUS demo is still a little crash-happy when caching is enabled globally.  YAAC works fine, but it's definitely not 100% at this time.
 
        Anyway, I've also been able to address the changing-variable issue mentioned earlier; now it's just a matter of tracking down those last few annoying issues.



  PlayBASIC V1.64N2  Beta #11c - Download

     Old beta deleted:

   Remember: type caching has to be enabled in these versions.  To do so, add the line  OptExpressions 3  to the start of your program.