Tips: How To Make a Text Adventure - Tokenize String Example - Learn To Code

Started by kevin, September 26, 2022, 11:15:55 PM


kevin



    This code translates the INPUT text from the player into a list of tokens. Each token is either a WORD, a NUMBER, or one of the special characters we want to allow. Everything else gets ignored.

     The benefit of such a model is that we can simplify the user's input by removing unwanted spaces, tabs, or capitalisation from the input data, boiling it down to the stuff we actually care about: what the player is trying to get their character(s) to do.







PlayBASIC Code:
   InputString$="Pick up   shovel & pick   and bounce ball 1000 times"

// This array is where we'll place each word
Dim Words$(256)

// Token will either be ASC values of characters we accept
// or WORD tokens.
Dim Tokens(256)

InputStringSize = len(InputString$)

for LP =1 to InputStringSize

ThisLetter= asc(mid$(InputString$,lp,1))

// Check for characters we'll ignore within the input string
if ThisLetter <=32 then continue


// Trap any characters you need
if ThisLetter = asc(".") or ThisLetter = asc(",") or ThisLetter = asc("&") // etc
// Add this character as ASCII value
Tokens(TokenCOUNT) = ThisLetter
TokenCOUNT++
continue
endif



// -----------------------------------------------------
// If we get down here, it's multi character TOKEN.
// -----------------------------------------------------

// --------------------------------------------------
// Check for word letters (lower or upper case)
// --------------------------------------------------
StartLP=lp
While lp<=InputStringSize
if (ThisLetter=>asc("a") and ThisLetter<=asc("z")) or (ThisLetter=>asc("A") and ThisLetter<=asc("Z"))
// read next letter
lp++
ThisLetter= asc(mid$(InputString$,lp,1))
else
exit
endif
endwhile


if StartLP < LP
Words$(TokenCOUNT) = mid$(INputString$,StartLP,lp-StartLP )
Tokens(TokenCOUNT) = 1000
TokenCOUNT++
// step back one so NEXT lands on the character that ended the word
lp=lp-1
continue
endif


// --------------------------------------------------
// Perhaps it's a number ? form: 123342
// --------------------------------------------------
StartLP=lp
While lp<=InputStringSize
if (ThisLetter=>asc("0") and ThisLetter<=asc("9"))
// read next digit
lp++
ThisLetter= asc(mid$(InputString$,lp,1))
else
exit
endif
endwhile

if StartLP < LP
Words$(TokenCOUNT) = mid$(INputString$,StartLP,lp-StartLP )
Tokens(TokenCOUNT) = 1001
TokenCOUNT++
// step back one so NEXT lands on the character that ended the number
lp=lp-1
continue
endif


next




print "Tokens Found:"+str$(TokenCOUNT)


// once we have the words we could look them up in a dictionary

For Lp=0 to TokenCOUNT-1


select Tokens(lp)



// -------------------
case 1000 // WORD
// -------------------

ThisWord$= Words$(lp)



// -------------------
case 1001 // DIGITS
// -------------------

ThisWord$= Words$(lp)

default
ThisWord$= chr$(Tokens(lp))

endselect

print ThisWord$
next



sync
waitkey
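
For readers following along without PlayBASIC, the same scan can be sketched in Python. This is only an illustration under assumptions, not the code from the post: the names `tokenize`, `is_letter`, `is_digit`, `WORD`, `DIGITS` and `SINGLE_CHARS`, and the `(kind, text)` tuple layout, are made up for this sketch. The kind values 1000 and 1001 mirror the ones used in the listing above.

```python
# Illustrative sketch only: tokenize, is_letter, is_digit, WORD, DIGITS and
# SINGLE_CHARS are hypothetical names chosen for this example.
WORD = 1000            # same marker value the listing above uses for words
DIGITS = 1001          # same marker value the listing above uses for numbers
SINGLE_CHARS = ".,&"   # characters kept as one-character tokens


def is_letter(ch):
    return "a" <= ch <= "z" or "A" <= ch <= "Z"


def is_digit(ch):
    return "0" <= ch <= "9"


def tokenize(text):
    """Return a list of (kind, text) pairs; anything unrecognised is skipped."""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch in SINGLE_CHARS:
            # store the character code as the kind, like the listing does
            tokens.append((ord(ch), ch))
            pos += 1
        elif is_letter(ch) or is_digit(ch):
            # pick the test that matches the first character of the run
            same_kind = is_letter if is_letter(ch) else is_digit
            start = pos
            while pos < len(text) and same_kind(text[pos]):
                pos += 1
            kind = WORD if same_kind is is_letter else DIGITS
            tokens.append((kind, text[start:pos]))
        else:
            pos += 1   # spaces, tabs and anything else are ignored
    return tokens


for kind, word in tokenize("Pick up   shovel & pick   and bounce ball 1000 times"):
    print(kind, word)
```

The inner loop stops at the first character that does not belong to the run without consuming it, so an input such as `shovel&pick` still yields three tokens.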






  Related Examples:

      - Text Adventure (Basic framework)
      - Dictionary Searching



stevmjon

cool kev

what's funny is i was just looking at tokens the last few days, and looking at this has given me more tips.
i didn't realise the if (asc("")=> and <asc("")) could be used. handy.

it's great that you can always learn new things, even after years of coding.
It's easy to start a program, but harder to finish it...

I think that means i am getting old and get side tracked too easy.

kevin

Steve,

Quote: what's funny is i was just looking at tokens the last few days, and looking at this has given me more tips.

 Depending upon what you want to convert, there are a number of them on the forums. The ones I knock up mostly share the same basic structure: a loop through a string that drops the classified tokens into an array, list, or whatever I'm using to store them all.

  If you search for Lexer / Tokenizer / Parser you'll most likely find some more optimal approaches.

 


Quote: i didn't realise the if (asc("")=> and <asc("")) could be used. handy.

 If you give the ASC function a literal string, the call is solved at compile time into the ASCII character code, so you can use it anywhere you would otherwise write the number. That makes

 A = 32
 B = asc(" ")

 produce the same output code.



 Here's a variation of the initial example that uses a table of accepted ASCII character codes to decide what to do with each character it might see within the input string.

PlayBASIC Code:
  InputString$="A B C D Pick up   shovel & pick   and bounce ball 1 22 333 4444 55555 1000 times"


// This array is where we'll place each word
Dim Words$(256)

// Token will either be ASC values of characters we accept
// or WORD tokens.
Dim Tokens(256)




// --------------------------------------------------------
// Set up Accepted Character TABLES
// --------------------------------------------------------
// 0 = Ignore this character
// 1 = Single letter character
// 2 = Alphabet character
// 3 = Digit character

Dim AcceptedCHR(255)

s$=".,:;&?"
for lp=1 to len(s$)
AcceptedCHR(mid(s$,lp))=1
next

// tag the alphabet characters
for lp=asc("a") to asc("z")
AcceptedCHR(lp)=2
next
for lp=asc("A") to asc("Z")
AcceptedCHR(lp)=2
next

// tag the digit characters
for lp=asc("0") to asc("9")
AcceptedCHR(lp)=3
next


// ---------------------------------------------------------------
// ---------------------------------------------------------------
// -------------------->> TOKENIZE STRING <<----------------------
// ---------------------------------------------------------------
// ---------------------------------------------------------------



InputStringSize = len(InputString$)

for LP =1 to InputStringSize

// Grab the ASCII value from the string as integer
ThisLetter= mid(InputString$,lp)



Action= AcceptedCHR(ThisLetter)

// ----------------------------------------------
// ignore any character that returns a zero..
// ----------------------------------------------
if Action=0 then continue


// ----------------------------------------------
// Trap any single characters you need
// ----------------------------------------------
if Action=1
// Add this character as ASCII value
Tokens(TokenCOUNT) = ThisLetter
TokenCOUNT++
continue
endif


// ----------------------------------------------
// Is this a LETTER or WORD ?
// ----------------------------------------------
if Action=2

TypeTYPE = 1000

Parse_MUlti_Chr_Fragment:

StartLP=lp
for ScanLP=lp to len(InputString$)
ThisChrAction= AcceptedCHR(mid(InputString$,ScanLP))
if ThisChrAction<>Action
exit
endif
lp=ScanLP
next

Words$(TokenCOUNT) = mid$(INputString$,StartLP,(lp+1)-StartLP )
Tokens(TokenCOUNT) = TypeTYPE

TokenCOUNT++
continue
endif


// ----------------------------------------------
// Is this some DIGITS ?
// ----------------------------------------------
if Action=3
TypeTYPE = 1001
// jump to the existing routine as they're functionally the same
goto Parse_MUlti_Chr_Fragment
endif


next


// ---------------------------------------------------------------
// ---------------------------------------------------------------
// -------------------->> OUTPUT TOKENIZED FRAGMENTS <<----------------------
// ---------------------------------------------------------------
// ---------------------------------------------------------------

print "Tokens Found:"+str$(TokenCOUNT)

// once we have the words we could look them up in a dictionary
For Lp=0 to TokenCOUNT-1

select Tokens(lp)
// -------------------
case 1000 // WORD
// -------------------
ThisWord$= Words$(lp)

// -------------------
case 1001 // DIGITS
// -------------------
ThisWord$= Words$(lp)

// -------------------
default
// -------------------
ThisWord$= chr$(Tokens(lp))

endselect

print ThisWord$
next

sync
waitkey
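
The table-driven idea above can also be sketched in Python for comparison. Again this is an illustration under assumptions rather than the original source: `build_table`, `action_of`, `tokenize_with_table` and the constant names are hypothetical. The action codes 0 to 3 and the token kinds 1000 and 1001 follow the listing above.

```python
# Illustrative sketch only: build_table, action_of, tokenize_with_table and
# the constant names are hypothetical, chosen for this example.
IGNORE, SINGLE, ALPHA, DIGIT = 0, 1, 2, 3   # same action codes as the table above
WORD, DIGITS = 1000, 1001                   # same token kinds as the listing


def build_table(single_chars=".,:;&?"):
    """Map each character code 0..255 to an action code."""
    table = [IGNORE] * 256
    for ch in single_chars:
        table[ord(ch)] = SINGLE
    for code in range(ord("a"), ord("z") + 1):
        table[code] = ALPHA
    for code in range(ord("A"), ord("Z") + 1):
        table[code] = ALPHA
    for code in range(ord("0"), ord("9") + 1):
        table[code] = DIGIT
    return table


def action_of(table, ch):
    code = ord(ch)
    # characters outside the table (code > 255) are treated as ignored
    return table[code] if code < len(table) else IGNORE


def tokenize_with_table(text, table):
    tokens = []
    pos = 0
    while pos < len(text):
        action = action_of(table, text[pos])
        if action == IGNORE:
            pos += 1
        elif action == SINGLE:
            tokens.append((ord(text[pos]), text[pos]))
            pos += 1
        else:
            # letters and digits share one scanning loop: keep going while
            # the action code stays the same
            start = pos
            while pos < len(text) and action_of(table, text[pos]) == action:
                pos += 1
            kind = WORD if action == ALPHA else DIGITS
            tokens.append((kind, text[start:pos]))
    return tokens


table = build_table()
for kind, word in tokenize_with_table("A B C D Pick up   shovel & pick 1 22 333", table):
    print(kind, word)
```

Adding a new accepted character only means changing the table, not the scanning loop, which is the main appeal of this layout.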


   Related Source Codes:

      - Syntax Highlight PlayBASIC source code as HTML