PlayBASIC V1.64N2 / V164N3 (Work In Progress) Gallery

Started by kevin, April 22, 2012, 12:07:12 AM

Previous topic - Next topic

kevin

#15
 PlayBASIC V1.64N2  Beta #7 - The Knock-On Effect

     Work continues on expanding the parser to be more 'type cache' aware.   Internal testing shows the proof of concept works; there's no disputing that.  The tricky part is that the parser has to detect situations where previously cached data has become invalid.  Imagine we have a typed high-score array called HighScore.   Now, if you manually insert new values into the high-score table, we often see a section of code that loops from the end down to the insertion point, reading each cell and copying it to the cell above.
 
      A bit like this,


   
     For copylp=ArraySize to InsertPosition step -1
             HighScore(Copylp+1).Score = HighScore(Copylp).Score
     next


   
      What's surprising about this very simple routine is that it's one where caching could easily break runtime execution.   The right-hand side reads HighScore(Copylp), caching that access point, but the write occurs on cell Copylp+1 in the same array.   The parser can detect the differing index patterns and fall back to a standard array write inside the expression.

      But here's where it gets rather interesting.   Now let's imagine a situation where we're reading data from a typed array, but this time we're manipulating the index between accesses on separate lines.
 

   
          print Array(Index).Value
          Index ++

          print Array(index).Value



    And bingo, this won't work.   The parser compares the pattern of each type access, and in both print expressions the pattern is identical.   But clearly the code doesn't want to access the same element in both prints; in this case it wants successive elements of the array.  Even though the situation is fairly contrived, it reveals a hidden complexity in caching array accesses across multiple expressions: a parameter variable might have been altered since the previous access, making any previously cached result invalid.

   While the above situation can indeed be handled, it's far from unique.   So rather than trying to run this project-wide, the next objective is to shore up the caching across self-contained expressions first, as seen in the HighScore example above.   I pretty much spent all Saturday working through the logic of that change.   Previously I'd mainly been testing linked typed variables, where user code tends to be set out in a fairly sequential manner, which creates more potential caching opportunities.   Implementing the cache isn't a lot of code; it's all about tweaking the logic so that what's generated is valid VM code, equivalent to what would normally be produced.

   Now, as with all things parser, you can get knock-on effects when making even tiny changes.  In PlayBASIC, parsing and code generation occur at the same time.  During an expression the parser does all the grunt work stepping through the user code, calling the operation generator to spit out the VM code as it goes.  In long expressions we often need to store temporary results, which end up in temp registers.    Interestingly, as a by-product of adding type caching, the parser can better detect when operations occur between temp registers.     That means we use less temp data, resulting in better memory fetching and fewer locals in user-defined functions.

   What does this mean?  It means some free function-calling speed from a completely unrelated change.   Below are the results of running the test code (below) in V1.64N and V1.64N2 Beta 7, which gives us a few free FPS back for absolutely nothing.



Tests=10000

Do
    cls 0

    inc frames

    // ===================================
    // User Functions VS Projected Subroutines
    // ===================================

    // ==========
    // Test #1
    // ==========

    T=timer()
    For LP=0 to Tests
        result=SomeFunctionCalc(10,lp)
    next
    tt1#=tt1#+(timer()-t)
    Print "Test #1 Average Time:"+Str$(tt1#/frames)

    // ==========
    // Test #2
    // ==========

    T=timer()
    // Call the Psub function
    For LP=0 to Tests
        result=SomesubCalc(10,lp)
    next
    tt2#=tt2#+(timer()-t)
    Print "Test #2 Average Time:"+Str$(tt2#/frames)

    // ==========
    // Test #3
    // ==========

    T=timer()
    For LP=0 to Tests
        result=SomeFunctionCalc(10,lp)
    next
    tt3#=tt3#+(timer()-t)
    Print "Test #3 Average Time:"+Str$(tt3#/frames)

    // ==========
    // Test #4
    // ==========

    T=timer()
    // Call the Psub function
    For LP=0 to Tests
        result=SomesubCalc(10,lp)
    next
    tt4#=tt4#+(timer()-t)
    Print "Test #4 Average Time:"+Str$(tt4#/frames)

    print ""
    print "Fps:"+Str$(fps())

    Sync
loop


Function SomeFunctionCalc(A,B)
    A=A*B
EndFunction A

Psub SomeSubCalc(A,B)
    A=A*B
EndPsub A





PlayBASIC V1.64N2  Beta #7b - Cached Type Reading

      Type caching has been expanded and activated across each expression.  There are still a few gotchas, but we're getting there.     Below (second pair of pictures) we're looking at the results of PB1.64N and PB1.64N2 Beta 7b running this snippet.    Allowing the parser to cache through whole expressions means that any code set out like   ME.X = ME.X + 1 can now benefit from the caching also.   But the really big winner in all this is when you access multi-dimensional typed arrays.    If you look at the results, the caching wins us back about 10 milliseconds in that part of the test, which I think demonstrates just how much work the runtime has to do to read/write safely from an array.




type Vector3D
    x#,y#,z#
EndType

type Stuff
    Name$
    X
    y
    z#
    Table(100)
    V3 as Vector3D
EndType

Dim Me as stuff
Dim test(10,10) as stuff
Dim cool as stuff list

me = new stuff

cool = new stuff

me.v3 = 0

max=25000

Do

    cls
    frames++

    tests=0
    t=timer()
    for lp=0 to max
        me.X+=1
        me.y+=2
        me.z+=3
    next
    tt1#+=timer()-t
    print "Math Short Cuts on Typed Variable"
    PrintTime(tt1#/frames)
    print Me.x
    print Me.y
    print Me.z
    Me.x=0
    Me.y=0
    Me.z=0
    tests++

    t=timer()
    for lp=0 to max
        me.X=me.X+1
        me.y=me.y+2
        me.z=me.z+3
    next
    tt2#+=timer()-t
    print "Long Hand Math Operators on Typed Variable"
    PrintTime(tt2#/frames)
    print Me.x
    print Me.y
    print Me.z
    Me.x=0
    Me.y=0
    Me.z=0
    tests++

    Index=6
    Index2=5

    test(Index,Index2) = New Stuff

    t=timer()
    for lp=0 to max
        test(Index,Index2).x=test(Index,Index2).x+1
        test(Index,Index2).y=test(Index,Index2).y+2
        test(Index,Index2).z=test(Index,Index2).z+3
    next
    tt3#+=timer()-t
    print "Long Hand Math Operators on 2D Typed Array"
    PrintTime(tt3#/frames)
    print test(Index,Index2).x
    print test(Index,Index2).y
    print test(Index,Index2).z
    tests++

    print "BandWidth:"+Str$(max*24*Tests*4)+" Bytes"
    print "FPS:"+Str$(fps())

    Sync
loop

Function PrintTime(T#)
    ink $ffff0000
    print "Time:"+Str$(t#)
    ink -1
EndFunction


   



Just got this hooked up and am testing it now; the second pair of pictures shows the results.




kevin

#16
PlayBASIC V1.64N2 BETA #7 (Retail Compiler Only Beta) - (Avail for Registered Users ONLY)

     PlayBASIC V1.64N2 Beta #7: this revision includes the type caching in the compiler/runtime.

     In the previous beta, types could only be cached within shorthand math operations, which was a nice safe way to introduce the feature; in this version the parser is able to cache type accesses across an entire expression.    This expansion means the compiler can now detect and simplify expressions styled like  Me.X = Me.X + 1  for the runtime.  In fact, if you stack multiple accesses within the one expression, you can take further advantage of caching.   So if we had  Me.X = Me.X + Me.Speed, the runtime pulls the ME type into the cache once, after which it's only accessing the cached version, so at least two of the ME accesses in that expression are cached.  Performance-wise, the more dimensions an array has, the better this works.
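As a rough sketch (the type and field names here are invented for the example, and I'm assuming caching has to be switched on via OptExpressions in these betas), the cached form is just ordinary PB code; the win comes from the compiler fetching the structure pointer once for the whole expression:

```playbasic
OptExpressions 3     ; assumed: type caching must be enabled in these betas

Type tShip
    X#, Speed#
EndType

Dim Me as tShip
Me = New tShip
Me.Speed = 2.5

; The two ME reads and the ME write below can share one cached
; structure fetch, since they all sit inside a single expression.
Me.X = Me.X + Me.Speed

Print Me.X
Sync
WaitKey
```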

     While the parser can cache within each expression, it doesn't support caching accesses across multiple expressions or lines of code.  The reason is that once we do that, we're in a bit of a minefield, because as mentioned above, there are any number of situations where an array index variable could be changed, a list pointer moved, or the flow of code can simply break it.    But we'll see.



Download


     old beta deleted


kevin

#17
 PlayBASIC V1.64N2  Beta #8 - Global Type Caching

     Moving on from the expression type caching of the previous betas, we head into the wild unknown wilderness that is global type caching.   In Beta 7, caching is activated during assignment-styled expressions and then turned off again, to avoid potential logical issues at runtime.   Most of the logical stuff is fairly predictable, where caching could result in runtime errors; it's the unforeseen stuff that worries me most about this feature.  As such, caching is disabled by default, which reverts the parser to its original behaviour.   To enable it, we set bit 1 in OptExpressions.  Bit 0 toggles the standard optimizations (on by default), so to enable both modes we set OptExpressions to an integer value of 3.
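In user code the toggles look something like this (bit 0 = standard optimizations, bit 1 = type caching, as described above):

```playbasic
OptExpressions 3    ; binary %11 : standard optimizations + type caching
; ... code compiled here gets both optimization modes ...

OptExpressions 1    ; binary %01 : back to the default behaviour
; ... code compiled here gets the standard optimizations only ...
```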

    Currently, I'm slowly picking through the parser, tagging the various situations where the cache should be flushed or disallowed completely.   I'd like to say it's all working wonderfully well, but it isn't.  Small stuff seems to work OK, but the bigger apps virtually all die at this point.   Thus we'll focus on the small stuff for now :)  - like, for example, the 'Cached Type Reading' benchmark from above.

    The post above shows the results for PB V1.64N and V1.64N2 Beta 7 running the benchmark; now below we've got today's results.    At first glance you might not see the big deal, but Beta 8 can run a complete fourth test in the time V1.64N takes to run three of them.    What's more interesting is that the fourth test is benching 3D typed array accesses.     Now granted, the test code is set out to cache well, but I didn't think it'd work that well.  To put the test in terms of bandwidth, the demo is pushing around half a gigabyte per second, not too bad for a 10-year-old runtime on almost 7-year-old hardware.



PlayBASIC V1.64N2  Beta #8b - Write Caching

     Previously the cache state was only enabled after a READ access of the typed structure; this has been expanded to support writes also.  This wasn't a big deal performance-wise, since generally if we're only writing to a type, we're probably initializing its fields prior to use in a create-object/character-styled function.   A few extra fetches in a routine that's not being called thousands of times isn't going to kill us, but it adds extra bloat to the runtime code.

     You can get a quick example of the style of routine I'm talking about from the 8-Way Layered Star Field / Asteroids Style example, where there's a function that creates the layers and is virtually all writes initializing the structure.

     This bit,

PlayBASIC Code:
Function CreateLayer(Size,depth#)
    Index=GetFreeCell(Layers())

    Layers(Index)=new tLayer
    Layers(Index).Size=Size
    Layers(Index).x=0
    Layers(Index).Y=0
    Layers(Index).Depth=depth#

    NumberOfDots=rndrange(50,100)
    Layers(Index).Shape=Make_Randomized_Dot_Shape(NumberOfDots,Size)

EndFunction




     So we've got a bunch of successive writes to the same type.  Obviously in this case it's caching the second, third and subsequent writes, but not all of them.  Expressions that include function calls flush the cache, so on the shape creation line, the write to Layers(Index).Shape is an absolute write rather than a cached one.    We can rearrange the code, though, to make sure all successive writes are cached, by pre-computing the function call into a variable and writing that later.
 
     Like this,

PlayBASIC Code:
Function CreateLayer(Size,depth#)
    Index=GetFreeCell(Layers())

    NumberOfDots=rndrange(50,100)
    ThisShape=Make_Randomized_Dot_Shape(NumberOfDots,Size)

    Layers(Index)=new tLayer
    Layers(Index).Size=Size
    Layers(Index).x=0
    Layers(Index).Y=0
    Layers(Index).Depth=depth#
    Layers(Index).Shape=ThisShape

EndFunction




     Both versions are logically the same, but the second one just lets the optimizer do a better job.


kevin

#18
 PlayBASIC V1.64N2  Beta #8c - Vector Stacking

     As with all upgrades, it's not all feature adding; there's an add --> test --> repeat-until-done cycle.  I've ironed out a number of type-cache-related issues.  It's far from 100%, but most of the big programs I was having problems with work again, which is a good thing.  So now it's time to dive deeper and find those nitty-gritty issues through usage, hence knocking up some tech demos.

     One demo I've been messing around with is basically a 3D particle explosion.  We have a linked list of 10,000 points and are plotting each point's path, which is clearly overkill, but it makes for a good test bed for brute-forcing the runtime.  But if we have 10,000 points and there are, say, 15 array accesses per point, then as much as caching helps out, it's not a silver bullet in this type of situation.

     So the pictures below compare the two methods side by side: the first is the native PB version of the particle code (unoptimized); the second uses a type of vector stack, where the user defines a list of operations to apply to a stack of vectors and then executes the list on demand.   While it's a little more work to set up, the results speak for themselves, with the stack version of the particle routine being around three times faster.
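I don't have the actual vector stack API written up here, so the function names below are purely invented for illustration, but the shape of the idea is: build the operation list once, then execute it over the whole stack each frame, instead of running thousands of individually interpreted PB expressions:

```playbasic
; HYPOTHETICAL sketch only - these function names are invented for illustration
CreateVectorStack 1, 10000               ; a stack of 10,000 vectors

; define the operation list once, outside the main loop
StackOp 1, "Position = Position + Velocity"
StackOp 1, "Velocity = Velocity * 0.98"

Do
    Cls 0
    ExecuteVectorStack 1                 ; run the whole list over all vectors
    Sync
Loop
```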

     

ATLUS


kevin


   yeah sometimes we win some, sometimes we lose some, but every gain is nice, no matter how big or small..   


kevin

#22
  It works pretty well, but I'm sure for many it's going to be a pretty abstract concept, as described here (login required).



kevin

#23
   PlayBASIC V1.64N2  Beta #9 - Disassembler

      Work continued on the upgrade over the weekend, but with a shift in focus geared more towards testing than anything else.    I guess the only new addition is a simple parser to convert string-based vector stack expressions into an operations list for the resolve-stack function, making the process a lot easier.    Unlike the PB compiler, vector stack operation lists can be processed at runtime, giving a bit of leeway in how an operation is solved.

      Anyway, in terms of testing, my focus has been on trying to validate where and when the compiler is seeing cacheable type operations.   For some reason, some programs with heavy type usage return little or no successful caches, even though the code is serialized.    After a lot of head scratching, I came to the conclusion that it'd be much easier to see exactly where and when it's failing if I could look at a disassembly of the byte code.     Much older versions of PB had something similar built in, but it was removed long ago.    Luckily I had most of the material just sitting around.

      The disassembler can currently decode about 50% of the PB core instruction set (I don't need all of it).  The initial version just showed the opcode and parameter data, so you didn't get anything even resembling PB code back.   But yesterday I added some 'clean up' routines that try to reconstruct the type of expression that created each opcode, making the output a lot more BASIC-looking and the logic easier to follow.   Some logic can basically be re-sourced with pretty high accuracy; some bits you can't, as the information just doesn't exist, or it would require multiple passes to try and guess the original logic.

       The tool is currently working well enough for me to get an overview of just what the compiler is producing in various situations, and it's already revealing some rather interesting results.  There seems to be the odd occasion where certain expressions generate extra MOVE operations where they could be short-cut.  The odd extra operation is no big deal in code that's executed infrequently, but that weight is magnified in brute-force situations.

       Another oddity appears when initializing the function scope, where it seems to be writing some bogus data; dunno what's up with that.   Perhaps the best thing about the tool so far is that it reveals a few situations where merging some common operations could reduce the VM overhead further.   Some of that, I think, could be done a lot more easily in a dedicated optimization pre-processing pass, but it's well worth putting on the ideas board.



  PlayBASIC V1.64N2  Beta #10 - Download

    Old beta deleted:

   Remember: type caching has to be enabled in these versions.  To do so, add the line  OptExpressions 3  to the start of your program.

   

kevin

#24
  PlayBASIC V1.64N2  Beta #10 - Inner Workings Of Type Caching

       Testing is going pretty well thus far; the more code I look over the disassembly outputs from, the more little potential tweaks appear.   I think most of the problems I've been having with caching relate to array indexes where the index has changed between accesses; the parser isn't smart enough to trap those situations as yet.  As a whole, the compiler takes a pretty nervous approach to caching, which you can see when picking over the code it's producing for various blocks.  If I suss out the problem areas, then it should be possible to relax it a bit more, which in turn will bring further improvements.

        The disassembler tool is working well.  There's still LOTS of stuff it doesn't have a clue about, but for what I'm interested in (the core logic) it works well enough now.   So much so, I've dressed it up a little more since yesterday.   Now it allows us to see reconstructed BASIC-styled versions of the instruction set.  For some stuff there's no obvious BASIC translation, but it's a lot more readable and easier to browse now than it was before.     Attached to this post you can see an example routine with caching enabled (it's one of the examples from earlier in this thread).  The tool colours key lines to help pick out the operations I'm interested in.   Not too pretty to look at, but much easier from a validation standpoint.
         
         

ATLUS

disassembler tool looking pretty good ^__^

monkeybot

what does OptExpressions  do?  i seem to have missed that one.

kevin

#27
  OptExpressions is a compile-time switch, much like Explicit, which you can toggle on and off anywhere you like (within reason :) ).  The switch controls a group of redundancy optimizations that the compiler can apply to expressions and move operations during code generation.    It's technically not a user command, since there's no reason to actually turn it off today.  It's basically a legacy feature now, only used when something breaks in code generation, like the string-function assignment problems in V1.64M, for example.

  Much like the type caching changes mentioned above, the standard optimizer features were originally turned off by default in those releases, since they would help some programs but completely break others.    It took some time before those issues were resolved, and I'm expecting the same here.   Once the issues are all ironed out, the cache mode will be on by default.


kevin

#28
  PlayBASIC V1.64N2  Beta #11 - Tuning The Sprite / Map Instruction Sets

     Shifting focus from the 'compiler' back to the runtime instruction set for a few days, namely how the VM traps instructions, with the aim of adding a few batching/structure read/write typed opcodes as well.    Today I've been through all the mapping and sprite libs trying to arrange them more efficiently.   The less VM overhead the better, basically, as over time (9 to 10 years of it) things bloat out.   But with a little cut and paste, the updated libs are performing better in brute-force situations.

    In this example, we're running three separate For/Next loops over 10,000 sprites, pulling/setting and moving the X/Y coords of each sprite, simulating what most game loops will do at one stage or another.

   eg

PlayBASIC Code:
Max=10000
Dim Spr(Max)
For lp=0 to Max
    Spr(lp)=NewSprite(rnd(1000),rnd(1000),1)
next


Do
    Cls

    Frames++

    t=timer()
    For lp=0 to Max
        spr=Spr(lp)
        xpos=getspritex(spr)
        ypos=getspritey(spr)
    next
    tt1#+=Timer()-t


    t=timer()
    For lp=0 to Max
        spr=Spr(lp)
        xpos=getspritex(spr)
        ypos=getspritey(spr)
        positionsprite spr,Xpos,Ypos
    next
    tt2#+=Timer()-t


    t=timer()
    For lp=0 to Max
        spr=Spr(lp)
        movesprite spr,1,1
    next
    tt3#+=Timer()-t


    print "Pull Positions:"+Str$(tt1#/frames)
    print "Pull + Set Positions:"+Str$(tt2#/frames)
    print "Move Sprite:"+Str$(tt3#/frames)
    print "Fps:"+Str$(fps())

    Sync
loop



        The tuning up of the instruction set gives us some more free speed back (about 5 fps on the test system) for no real deep investment of time, so it's basically a free performance gain.   But we've got to remember that this is a brute-force test of 10,000 iterations.    Most PlayBASIC games would be lucky to have 25 active sprites on screen, so obviously such opts will have less impact in everyday situations.  Having said that, the streamlining has been applied across all the SPRITE functions, so if you have a lot of queries/function calls per sprite, you may well see some real-world benefit even with small object counts.   You may not, too, but them's the breaks :)

        The mapping commands have had much the same treatment, with about the same returns really, but slightly better.  Some calls will be much quicker, others will be the same.  A lot of the collision functions had a lot of VM overhead in the caller; much of that has been trimmed away now, flattening out the performance of all the calls in the Map command set.

         Another thing I've been experimenting with in the maps library is a way to batch pairs/groups of calls.   The idea is that if program code is set out so there are two (or more) lines of basically the same function call, the caller can batch them, avoiding some set-up work between operations.  In testing it seems to work OK, actually: in a brute-force test peeking map tiles tens of thousands of times, today's test version of PB runs about 20 fps faster than V1.64N.   Hard to tell if it'd be worth rolling out.

         Where I think it would help is when the user is calling built-in functions to query the properties of an entity, such as a sprite or a map.

          Eg.
            xpos=getspritex(spr)
            ypos=getspritey(spr)

          So if we have code like this, the calls could be batched together, making the second call almost free of overhead.  The more I think about it, this type of concept might well be better suited to the third generation of the runtime.




  PlayBASIC V1.64N2  Beta #11 - Manually Reading Sprite Vertex

         So tonight's session continues on from today's cleaning up of the sprite caller lib in the VM.  Previously I'd only done the GetSprite-styled functions, but I've pretty much gone over the whole thing now.    To benchmark this we have the following snippet from 2004 (you'll need the bubble pic to run it).   The original code runs as is, but with some slight tweaks we can get it running even better in the V1.64N revisions.   No great surprise there, since N is much quicker than the version of PB this example was originally written in.

        Anyway, the really interesting thing is that we're seeing a tangible performance gain from V1.64N2.  It's running around 208 FPS in N2, compared to 185-ish in 1.64N, which equates to roughly a 12% gain.  So in sprite terms, N2 can render more sprites than its older brother at approximately the same speed.   So there's some benefit to be had.

PlayBASIC Code:
; PROJECT : Get_A_Sprites_Vertex
; AUTHOR  : Underware Design
; CREATED : 8/13/2004
; EDITED  : 7/08/2005

; added to cache types
optexpressions 3

randomize 67


` +====================================================*
` >> SPRITE Rotation/Scale Test <<
` +====================================================*

; convert default font #1 from a true type font to a bitmap version
MakeBitmapFont 1,$ffffff,8

; load the bubble image

LoadfxImage CurrentDir$()+"..\..\..\gfx\bubble_64x64.bmp",1
; LoadImage CurrentDir$()+"gfx\bubble_64x64.bmp",1
; PrepareFXImage 1


; get the screen width/height
sw=GetScreenWidth()
sh=GetScreenHeight()

; set starting number of bubbles
BubblesN = 100

; Create the Bubble Type
Type Bubble
    X#,y#,XSpeed#,YSpeed#,Angle#,RotSpeed#
EndType


// ===================================================================
// Init Sprites: dims an array and sets up a bunch of sprites ready for use
// ====================================================================

Dim Bubbles(BubblesN) As Bubble
For starsI=1 To BubblesN
    Bubbles(starsI).x = Rnd(sw)
    Bubbles(starsI).y = Rnd(sh)
    Bubbles(starsI).xspeed = Rnd(5)
    Bubbles(starsI).yspeed = Rnd(5)
    Bubbles(starsI).angle = Rnd(360)
    INC speed#
    If speed#>10 Then speed#=1
    Bubbles(starsI).rotspeed = speed#

    CreateSprite starsI
    SpriteImage starsI,1
    SpriteTransparent starsI,1
    PositionSprite starsI,Bubbles(starsI).x,Bubbles(starsI).y
    SpriteDrawMode starsI,2
    SpriteHandle starsI,(GetImageWidth(1)/-2),(GetImageHeight(1)/-2)
    RotateSprite starsI,Bubbles(starsI).angle
    ScaleSprite starsI,0.15+(Rnd(100.0)/100.0)
Next starsI



; Direct all drawing to the screen (back buffer)
RenderToScreen

; Start of DO/Loop
Do

    cls rgb(80,50,30)

    // =================================================================
    // Run through and update all the sprite positions
    // =================================================================

    lockbuffer
    For starsI=1 To BubblesN

        x#=Bubbles(starsI).x+Bubbles(starsI).xspeed
        y#=Bubbles(starsI).y+Bubbles(starsI).yspeed

        TurnSprite starsI,Bubbles(starsI).rotspeed


        If x#<0 Or x#>sw
            Bubbles(starsI).xspeed=Bubbles(starsI).xspeed*-1
        EndIf

        If y#<0 Or y#>sh
            Bubbles(starsI).yspeed=Bubbles(starsI).yspeed*-1
        EndIf

        PositionSprite starsI,x#,y#

        ; Read the sprite's rotated vertex positions
        x1#=GetSpriteVertexx(starsI,0)
        y1#=GetSpriteVertexy(starsI,0)

        x2#=GetSpriteVertexx(starsI,1)
        y2#=GetSpriteVertexy(starsI,1)

        x3#=GetSpriteVertexx(starsI,2)
        y3#=GetSpriteVertexy(starsI,2)

        x4#=GetSpriteVertexx(starsI,3)
        y4#=GetSpriteVertexy(starsI,3)

        Line x1#,y1#,x2#,y2#
        Line x2#,y2#,x3#,y3#
        Line x3#,y3#,x4#,y4#
        Line x4#,y4#,x1#,y1#

        Bubbles(starsI).x=x#
        Bubbles(starsI).y=y#
    Next starsI
    unlockbuffer


    DrawAllSprites

    Print "Manually Reading the Sprites Rotated Vertex"

    Sync
Loop


     


kevin

#29
 PlayBASIC V1.64N2  Beta #11c - Smoothing Out Type Caching

       Yesterday's sprite/map reshuffling passes ended up going a lot faster than I'd expected.  It's not a lot of hands-on coding work; it's just one of those things that often leads to broken instructions if you're not careful, of which there were a few, but it all went pretty smoothly for once.   Which leads me back for another round at the type caching problem, which thankfully I've had a few more ideas about since the last pass.

        In previous versions, the type cache optimizer is intentionally set up to be rather overprotective: the moment it spots a user or built-in function call within an expression, it turns caching off, regardless of whether the function/command damages the previously cached data or not.   To combat this, the command tables now have a computed safety flag.    The process is a bit like a shotgun, really: any function/command that accepts passed arrays as a parameter is set as potentially unsafe; the rest are tagged as safe.    Now, yeah, I know that just because a function is being passed an array doesn't necessarily mean our cache data will be broken, but better safe than sorry.

        Now that's OK for built-in commands/functions, but for user functions there's really no easy way to accurately guess whether the function destroys our cache data or not, so those calls disable caching until after the call.    The reason this is such an issue is that we're caching a raw pointer to a previously-known-to-exist thing, a type in this case.   If we allow caching across user functions, and that function alters a previously cached type pointer, then we're in for a world of hurt.  The best-case scenario is that it'll pop a runtime error, but I can't check at runtime where the cache data comes from.  So if it's been corrupted unknowingly, then any following accesses might just end up reading/writing the wrong structure of the same type, or some different structure completely, potentially resulting in a horrible death.   It's possible to solve, but way too much work for the return.
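To make the hazard concrete, here's a contrived sketch of why a user function call has to kill the cache. (This assumes, as in the earlier examples, that Dim'd typed variables are visible inside functions; the names are invented for illustration.)

```playbasic
Type tStuff
    X
EndType

Dim Player as tStuff
Player = New tStuff

Player.X = 10        ; the pointer to PLAYER could be cached here

RebuildPlayer()      ; the parser can't know this call replaces PLAYER,
                     ; so it must flush the cache across the call

Player.X = 20        ; a stale cached pointer here would write into the
                     ; old, discarded structure

Function RebuildPlayer()
    Player = New tStuff   ; allocates a new structure; the old pointer dies
EndFunction
```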

        Even so, what we've got now is already able to chomp out slabs of bogus accesses from some of the bigger example programs, such as YAAC and THESIUS XIII.  The tweaked optimizer spots around 100 caches in the YAAC demo (Yet Another Asteroids Clone) and almost 400 in the THESIUS demo, making around a 5-6K saving in the byte code alone.  Saving a few K is not a big deal in itself; it's just that that's 5K fewer memory accesses the runtime has to make in order to do the same thing.   So provided the opts are in routines that are heavily used, we're gaining some more free performance.     The downside is that the THESIUS demo is still a little crash-happy when caching is enabled globally.  YAAC works fine, but it's definitely not 100% at this time.
 
        Anyway, I've also been able to address the changing-variable issue mentioned earlier; now it's just a matter of tracking down those last few annoying issues.



  PlayBASIC V1.64N2  Beta #11c - Download

     Old beta deleted:

   Remember: type caching has to be enabled in these versions.  To do so, add the line  OptExpressions 3  to the start of your program.