Index ranges - Media Ranges - do we really need 32bit unique indexes ?
wouldn't a 16bit range be enough (1 to 65536.0)
or a 24bit range be enough (1 to 16777200.0)
in a 32bit environment having 32bit unique anything is actually impossible anyway..
this is stuff like MEDIA indexes... So mainly Images/Sprites and another other media
so i assume it is currently 32 bit? would say 16 bit values work better, or is it faster?
if you could explain the advantage / disadvantage please... i am interested to know the difference in how this effects PB code.
The media indexs are limited to about 30bits since they're 32bit integers.. so 2^30 possible slots for EACH media type.. That's a LOT OF MEMORY (4gig in fact)
I don't know off the top of my head, but I'd say that each slot in say the BANKS command set consumes about 32 bytes of info data.. Just to store the info about the BANK, it's size, pointer, and whatever else it needs. Something like a SPRITE would more like 200 (or more) bytes per sprite.. those are just the structures (TYPES) to hold the basic info about the object.
So having big linear ranges is just overkill, given that very few examples thus far have ever used more than a few thousand of any one resource..
Current implementations use big linear arrays to house all of the data from each media library, there's some variations of that theme, but that's basically how it's down in PB/PBFX builds..
But what you could do is use high bits of the MEDIA INDEX as a bank selector, and split the big arrays into smaller chunks..
[pbcode]
; PROJECT : MEDIA INDEX banked demo
; AUTHOR :
; CREATED : 11/05/2021
; EDITED : 11/05/2021
; ---------------------------------------------------------------------
// Allow 256 tables of 2^16 banks of indexes
// so 2^24 possible indexes
dim MediaTable(256)
function AllocIndex(MediaIndex)
// High Bits are the media table
mediaTableindex = MediaIndex >>16
index = MediaIndex & $ffff
Thisbank=MediaTable(MediaTableIndex)
if ThisBank=0
// Here we're Use a PB BANK TO simulate a memory allocation
ThisBank = NewBank( (2^16)*4)
MediaTable(MediaTableIndex)=ThisBANK
endif
// Check if this was previously allocated ?
if PeekBankInt(ThisBank,Index * 4)
// free this since it was in use
FreeIndex(MediaIndex)
endif
// Tag This MEDIA INDEX as in use (this would be a pointer to the resoure in real life)
POkeBankInt ThisBank,Index * 4, true
EndFunction
function FreeIndex(MediaIndex)
// High Bits are the media table
mediaTableindex = MediaIndex >>16
index = MediaIndex & $ffff
Thisbank=MediaTable(MediaTableIndex)
if ThisBank
// here we'd release any data this thing uses
// then reset this index back to off
POkeBankInt ThisBank,Index * 4, false
endif
EndFunction
Psub GetIndexStatus(MediaIndex)
Status =0
// High Bits are the media table
mediaTableindex = MediaIndex >>16
index = MediaIndex & $ffff
Thisbank=MediaTable(MediaTableIndex)
if ThisBank
// check if this is in use or not
Status = PeekBankInt(ThisBank,Index * 4)
endif
EndPsub Status
// Test Library
setfps 60
do
cls
// Randomly Alloc
mediaIndex = rnd($fffff)
AllocIndex(MediaIndex)
// Compute useds Indexs
Count=0
For lp=0 to 255
if MediaTable(lp)
StartIndex = lp * $10000
EndIndex = (lp+1) * $10000
for index=StartIndex to EndIndex-1
Count+=GetIndexStatus(Index)
next
endif
next
print "Media Count:"+str$(Count)
print ""
// Show the used CHUNKS OF indexes
Xpos=0
Ypos_ScreenTop = GetCursorY()
Ypos=Ypos_ScreenTop
TH = GetTextHeight("|")
for lp =0 to 255
ThisBank = MediaTable(lp)
if ThisBANK
baseIndex = lp * $10000
if Ypos+TH => GetScreenHeight()
Xpos =Xpos+125
Ypos = Ypos_ScreenTop
endif
Text Xpos,Ypos, "Range:"+str$(BaseIndex)
Ypos+=TH
endif
next
sync
loop
[/pbcode]
thanks for the demo kev.
i spend a some time going through it, but i initially only got the first 3 functions only. i was gonna mention there is no do-loop, but when i pressed reply to say this, the full code showed up then. now i can see all the code, lol. now it is 2 functions, 1 psub, and do-loop. it makes sense now.
i have already learned more about bits, bytes, bit shifting, and why we have a range of @+2gig to @-2gig. and that was from the first 3 functions i initially got earlier.
now to go through the rest of the code. learning is fun (that is what i tell myself when i am about to rage quit).
Toying with BITS is fun STEVE :) I'm serious.. it's fun and very useful
Anyway.. There's a few annoying things I'm actually concerned about here,
1) if we have all the media data in one big table and that table needs to be resized, there's big bottle neck to expand/shrink the table.. Try setting the Sprites to a 1,000,000 for example.. on older system(s) it chokes big time
[pbcode]
do
cls 0
frames++
spritequantity 1000
// see how long it takes to resize from 1000 to a 1000000
t=timer()
spritequantity 1000000
tt#+=timer()-t
print "Average Resize Time:"+str$(tt#/frames)
sync
loop spacekey()
[/pbcode]
2) if the media resource table is being resized / shrunk, then I can't have a secondary thread read from the table as it's unsafe (see randomly crash)
i am going through the code, step by step, and can now see how handy binary bits are. they are useful.
you used bit shifting to read the higher bits only, and also used & (AND) to only use the lower bits. also hex numbers are handy with binary too.
i wondered why you kept using hex, now i understand. saves typing large binary numbers.
i have a few more questions about the first demo code. i am not sure why you are allocating ((2^16)*4) bytes for the bank size. this is around 1/4 Meg.
16 bits = 2^16 = 65,536 numeric value. then this is multiplied by 4 ? this is done each time you enter a value into the MediaTable(n) array.
well it's late so i may not be thinking straight. the more i learn and understand, the more i want to know. it's a never ending cycle...lol.
Quote
i wondered why you kept using hex, now i understand. saves typing large binary numbers.
just easier to have big numbers when you want groups of bits. You could use integers, binary or hex... but Hex makes more sense in this case.
Quote
i have a few more questions about the first demo code. i am not sure why you are allocating ((2^16)*4) bytes for the bank size. this is around 1/4 Meg.
16 bits = 2^16 = 65,536 numeric value. then this is multiplied by 4 ? this is done each time you enter a value into the MediaTable(n) array.
The Bank represents 2^16 integers.. so the index needs to scaled by 4, as one integer takes 4 bytes.
Here's a version that lets the ranges to be user defined.. it's a bit pompous but you get the idea.
Run in DEBUG mode (F7) and check console for output.
[pbcode]
// use 2^12 bits per block and have 2^8 blocks
constant Index_Low_Bits = 12
constant Index_High_Bits = 8
#print "BIT MASKS"
#print " Low Bit Mask "+bin$(( $ffffffff >> ( 32 - Index_Low_Bits )))
#print "High Bit Mask "+bin$(( $ffffffff >> ( 32 - Index_High_Bits )))
#print ""
for MediaIndex =0 to 100000 step 100
Bank = (MediaIndex >> Index_Low_Bits) & ( $ffffffff >> ( 32 - Index_High_Bits ))
Index = MediaIndex & ( $ffffffff >>Index_Low_Bits ) & ( $ffffffff >> ( 32 - Index_Low_Bits ))
// dump this console
#print "Index#"+str$(MediaIndex)+" Bank["+str$(bank)+"] ["+str$(Index)+"]"
next
Sync
waitkey
[/pbcode]
For a bit of trivia, you can swap the $ffffffff values for decimal -1
well, i read the code above, and actually understood it without my eyes glazing over.
using & + bit shifting sure is handy.
: Bank increased by 1 every 42 loops
: Index reset every 42 loops but with an offset increase +4 (0,4,8,12,16,20,24 etc)
trivia tip is cool. i wondered how the values worked. so the lower 31 binary bits are value 0 to 2gig+, and if the highest bit (32nd bit) is 1, then the number is negative, and a zero means positive?
yay, i'm learning.
Yep, the top bit is the SIGN, it looks a bit weird when you look at the bits but they do make sense for negatives..
-16 %11111111111111111111111111110000
-15 %11111111111111111111111111110001
-14 %11111111111111111111111111110010
-13 %11111111111111111111111111110011
-12 %11111111111111111111111111110100
-11 %11111111111111111111111111110101
-10 %11111111111111111111111111110110
-9 %11111111111111111111111111110111
-8 %11111111111111111111111111111000
-7 %11111111111111111111111111111001
-6 %11111111111111111111111111111010
-5 %11111111111111111111111111111011
-4 %11111111111111111111111111111100
-3 %11111111111111111111111111111101
-2 %11111111111111111111111111111110
-1 %11111111111111111111111111111111
0 %00000000000000000000000000000000
1 %00000000000000000000000000000001
2 %00000000000000000000000000000010
3 %00000000000000000000000000000011
4 %00000000000000000000000000000100
5 %00000000000000000000000000000101
6 %00000000000000000000000000000110
7 %00000000000000000000000000000111
8 %00000000000000000000000000001000
9 %00000000000000000000000000001001
10 %00000000000000000000000000001010
11 %00000000000000000000000000001011
12 %00000000000000000000000000001100
13 %00000000000000000000000000001101
14 %00000000000000000000000000001110
15 %00000000000000000000000000001111
16 %00000000000000000000000000010000
[pbcode]
for lp = -16 to 16
#print str$(lp)+" "+bin$(lp)
next
sync
waitkey
[/pbcode]
so, after all this binary chatting, that i have enjoyed learning, i haven't answered the original question, lol.
i guess a reduction to 16 bit should be fine. i haven't used high values for images/objects/sprites etc.
It's one of those things that once you set a limit people have a instant negative reaction, but the reality is that you've always been limited..
What prompted this question is those ugly truths about having common chunks of big blocks of memory and the subsequent ownership issues that arise
when trying to do even simple memory management into threading. If you have two bits of code looking at the same table without some type of please and thank yous between them you inevitably end up one thread is deleting/creating/resize something that other was using and boom it dies..
Comparing Alternate BANK COMMAND set to internal command sets
So enough talking bout this stuff, how does it work when implemented... Well.. on the surface our test library (a recreation of the bank commands) seems to be 2.9 times faster than the built in commands..
Why is this ?? - Well I doubt that improvement is purely all about the INDEX mapping and structure caching which the newer approach uses solely, but rather some of that gain is in how the runtime calls internal commands compared to externally bound command sets (LInkDLL's), which was vastly improved during the rebuilding of V1.65
So can we just swap them over ? - Not yet, the internal command sets have an Open/Close structure that linked DLL's don't have. meaning that when you start your program the command set is opened (stuff is initialized) and then when you END or quit, then runtime calls the clean up routine to Close each command set, before shutting down the runtime. So we'd need something like that build into the parser / runtimes for external DLLS..
But it does potentially take a step towards a more plug and plug command set.
[pbcode]
Function GUI_TEST_BANKS()
index=10
for BankSize = 000 to 100000+1 step 5000
cls
size=banksize
if Size<1 then size=1
CreateBank index, Size
PB_CreateBank index, Size
print "--[ BANK Info ]-------------------------------"
print "Bank Status:"+str$(PB_GetBankStatus(index))+ " Size:"+str$(PB_GetBankSize(index))+" Ptr:"+str$(PB_GetBankPtr(index))
print ""
maxframes = 10
tt#=0
for frame =0 to MaxFrames
t=timer()
for lp =0 to Size-1
PB_PokebankByte index,lp,lp
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
PB_PokebankByte(index,lp,lp)
next
tt#+=Timer()-t
next
R1#= tt#/Frame
print "Test #1: "+str$(r1#)+" Milliseconds"
tt#=0
for frame =0 to MaxFrames
t=timer()
for lp =0 to Size-1
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
PokebankByte index,lp,lp
next
tt#+=Timer()-t
next
R2#= tt#/Frame
print "Test #2: "+str$(r2#)+" Milliseconds"
print ""
print "--[ RESULTS ]-------------------------------"
print " Result: Test 1 is ["+str$(r2#/r1#)+"] times faster then test 2"
print " Test Count:"+str$(MAxFrames)
print "Writes Per Test:"+str$((20*lp*MaxFrames)/(1000000.0))+" Megs "
; for lp =0 to Size-1
; #print PB_PeekbankByte(index,lp)
; a=b
; next
#print "--[Words]---------------------------"
; for lp =0 to Size-1
; #print Hex$(PB_PeekbankWord(index,lp))
; a=b
; next
#print "--[Integer]---------------------------"
; for lp =0 to Size-1
; #print Hex$(PB_PeekbankInt(index,lp))
; a=b
; next
PB_DeleteBank Index
print ""
print "--[ BANK After DELETE ]-------------------------------"
print "Bank Status:"+str$(PB_GetBankStatus(index))+ " Size:"+str$(PB_GetBankSize(index))+" Ptr:"+str$(PB_GetBankPtr(index))
sync
next
waitkey
waitnokey
EndFunction
[/pbcode]
is this code something your experimenting with, because if i cut and paste the code and add at top the function name to call it, it must need a library/DLL also that you haven't included?
apart from this it's interesting...
this is just some hodge podge test code atm..
interestingly though PB1.64P outperforms 1.65 here for the internal command sets at least and is slower calling external functions.
Computing FREE MEDIA indexes
Our example is becoming more robust as the sessions go on, in older libraries we use a parallel linked list of used and Free indexes to keep track of them. This is cool as it means pulling a FREE or NEW media index of some type.. Ie. GetFReeBank() / NewBANK() or GetFreeSprite() / NewSprite() - Those kinds of functions all use this approach underneath. This gives us a fixed time to find a free resource from the various lists. But that doesn't really fit into our next blocked range solution.. The reason is the list is another big block of parallel data. If we have a 100K quantity of indexes then we have 100K time the size of the data structure just to remember the free order. Which is something I'm trying to avoid here. So unfortunately this means searching linearly though the indexes to find a free one at the moment.
So since we're trailing the Banks library against the internal command set, how does calling GetFreeBank() compare to it's new counter part PB_GetFreeBank() ???
Well, it's slower..
Results
---[ GET FREE FUNCTIONTESTING ] ------------------------------
Creation Time:1
GetFreeBank Time:12.0 For #500000 Tests
PB GetFreeBank Time:779.0 For #500000 Tests
------------> Result: Test 1 is [64.91666] times faster then test 2
---[ GET FREE FUNCTION TESTING DONE ] ------------------------------
Test function
This is the as is build code and there fore is not runable for users.. It's here just so you an see what were testing
[pbcode]
Function GUI_TEST_BANKS_GETFREE()
IndexRange= 10000
#print "" : #print ""
#print "---[ GET FREE FUNCTIONTESTING ] ------------------------------"
#print "" : #print ""
BankQuantity IndexRange
for index=1 to IndexRange
if GetBankStatus(index)
deletebank Index
endif
if PB_GetBankStatus(index)
PB_deletebank Index
endif
next
// Create 1/4 of the range of banks 100 bytes in size
t=timer()
for index=1 to IndexRange/4
createbank index,100
PB_createbank index,100
next
#print "Creation Time:"+str$(Timer() - t)
// TIme the get free functions
MaxTests=100000
t=timer()
for lp =0 to Maxtests
Index=GetFreeBank()
Index=GetFreeBank()
Index=GetFreeBank()
Index=GetFreeBank()
Index=GetFreeBank()
next
t1#=Timer()-t
#print "GetFreeBank Time:"+str$(t1#)+" For #"+str$(lp*5)+" Tests"
t=timer()
for lp =0 to Maxtests
Index=PB_GetFreeBank()
Index=PB_GetFreeBank()
Index=PB_GetFreeBank()
Index=PB_GetFreeBank()
Index=PB_GetFreeBank()
next
t2#=Timer()-t
#print "PB GetFreeBank Time:"+str$(t2#)+" For #"+str$(lp*5)+" Tests"
#print "" : #print ""
#print "------------> Result: Test 1 is ["+str$(t2#/t1#)+"] times faster then test 2"
#print "" : #print ""
#print "---[ GET FREE FUNCTION TESTING DONE ] ------------------------------"
#print "" : #print ""
EndFunction
[/pbcode]
if the new method is better, but slower, it shouldn't matter because free indexes are really only use at the beginning when loading images etc.
Quote from: stevmjon on May 19, 2021, 12:31:52 AM
if the new method is better, but slower, it shouldn't matter because free indexes are really only use at the beginning when loading images etc.
Yeah.. that was my initial gut reaction too, then you start thinking about the Sprites command sets, where coders tend to create / destroy things dynamically a lot across the frame. This test does show worse case though (and it could be a lot worse than this), so it feels a lot like robbing Peter to pay Paul. I did tweak it up a bit more (which was about 15% faster than the test above) but I think i'll have to link the smaller index blocks together. That should give us a result much the same as the existing solution, just annoying :)
Editing that video atm
PlayBASIC LIVE - Media Indexes & Threading Chit Chat - (2021-05-19)
Computing FREE MEDIA indexes - GetFreebank #2
So the GetFreeBank() problem seems to be solved by linking the blocks together. In the test results bellow, the replacement free item function is twice as fast as the old internal function. There's some functional difference from the old approach, where the newest one will work within the sub blocks of indexes. Meaning if the bank quantity is 1024 items, that is then broken down into four blocks of 256 possible indexes, calling the FreeBank function will pull from the first block of 256 first and if that's full, fall down to the next block. Keep them managed within the allocated blocks. There's a penalty though, if a sub block is full, then we have to bump up to the next used sub block. Since we're trying to keep indexes within allocated blocks. So worse case scenario is it'll will be bit slower, but generally it should be about the same speed or faster in the real world.
---[ GET FREE FUNCTIONTESTING ] ------------------------------
Creation Time:1
GetFreeBank Time: 10.0 For #500000 Tests
PB GetFreeBank Time: 645.0 For #500000 Tests
PB GetFreeBank2 Time: 5.0 For #500000 Tests
------------> Result: Test 1 is [64.5] times faster then test 2
------------> Result: Test 1 is [0.5] times faster then test 3
---[ GET FREE FUNCTION TESTING DONE ] ------------------------------
Computing FREE MEDIA indexes - Media Quantities becoming obsolete
Ok, so continuing with the Bank command set prototype, some things are becoming obsolete such as theBankQuantity /GetBankQuantity() which were for resizing the internal command sets tables for the command set. In the replacement libraries these commands are really only for controlling the potential ranges, changing the Quantity just sets that range, there's no allocation happening anymore.
Computing FREE MEDIA indexes - BANK COMMAND SET (REPLACEMENTS)
Picking through the old bank commands looking for odd behaviors and have found more than a few places where some data management & clipping was missing. This mostly refers to how it manages the allocations for you, ideally things like resizing the bank. Which now includes some limited logic for shrinking the buffer, which is handy if your constantly resizing the buffer down. I'll include some logic for expansion too, just haven't got there as yet.
But for today's little diddy.. Now i'm looking at the CopyBankBytes command. Which seems to have some missing clipping functionality, in particular when you write to an address bellow the first byte in the bank, which rejects this copy, whereas if you copy to the end of the buffer it'll clip the target size correctly..
Run this in DEBUG mode F7 and check the console output.
[pbcode]
A=NewBank(20)
B=NewBank(20)
Randomize 12234
// FillBANK(A,255)
RandomizeBank(A)
ShowBANK A
#print " CopyBankBytes A, 0, 5, B, 0"
CopyBankBytes A, 0, 5, B, 0
ShowBANK B
FillBank(B,0)
#print " CopyBankBytes A, 0, 5, B, 1000"
CopyBankBytes A, 0, 5, B, 1000
ShowBANK B
FillBank(B,0)
#print " CopyBankBytes A, 0, 5, B, 17"
CopyBankBytes A, 0, 5, B, 17
ShowBANK B
FillBank(B,0)
#print " CopyBankBytes A, 0, 5, B, -3"
CopyBankBytes A, 0, 5, B, -3
ShowBANK B
FillBank(B,0)
sync
waitkey
Function RandomizeBank(ThisBank)
ptr=getbankptr(ThisBank)
if Ptr
Size=getBankSize(ThisBank)
for lp=0 to size-1
pokebankbyte Thisbank,lp, floor(rnd#(256))
next
endif
EndFunction
Function FillBank(ThisBank,ThisByte)
ptr=getbankptr(ThisBank)
if Ptr
Size=getBankSize(ThisBank)
FillMemory Ptr,Size,ThisByte,1
endif
EndFunction
Function ShowBank(ThisBank)
#print "Bank#"+str$(Thisbank)+" --------------------------"
#print ""
if getBankStatus(ThisBank)
ptr=getbankptr(ThisBank)
if Ptr
Size=getBankSize(ThisBank)
for lp=0 to size-1
thisbyte=PeekBankByte(Thisbank,lp)
s$+=digits$(Thisbyte,3)+","
row++
if row > 100
#print s$
row=0
endif
next
if row
#print s$
endif
endif
else
#print "BAnk Doesn't Exist"+str$(Thisbank)
endif
#print ""
EndFunction
[/pbcode]
Output data
Bank#1 --------------------------
238,243,173,201,175,233,044,041,141,193,164,154,128,220,144,068,039,238,063,051,
CopyBankBytes A, 0, 5, B, 0
Bank#2 --------------------------
238,243,173,201,175,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
CopyBankBytes A, 0, 5, B, 1000
Bank#2 --------------------------
000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
CopyBankBytes A, 0, 5, B, 17
Bank#2 --------------------------
000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,238,243,173,
CopyBankBytes A, 0, 5, B, -3
Bank#2 --------------------------
000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,