Beginners Guide To Programming - Part IV File Access

Started by kevin, April 21, 2008, 07:18:14 AM


kevin


TDK's Play Basic Programming For Beginners


Part 4 - File Access

  All but the most basic programs use file access. Although strictly speaking this also encompasses PB's commands for loading media files for your games such as images, sounds and shapes/models, this part of the tutorial series covers saving your program's data to disk and reading it back in again.

 This process is required for reading INI files, saving hiscore tables or creating new file formats for your new world editor.

 The basic process is to open a file for reading or writing, read in (or write out) the data then close the file. What you write is up to you, so long as you read the information back in the same order.

 Now I know many people will argue with me, but I have decided that it's far simpler to write data as ASCII text files when you are learning to program. The main benefit is that you can open up your files after they have been created to see if they actually contain what you thought you had written. Some of PB's save data commands create encrypted files which can't be opened for examination - despite any advantages they may have.

The first thing you have to do is open a file. Assuming that we need to write a file before we are able to read it back in, this is done with WriteFile.



WRITEFILE


This command will create a new file on your hard disk and uses the syntax:

WriteFile Filename$,Channel

  Filename$ is the filename you want to use, and Channel is an integer number which acts like a 'stream' number.


 The channel number is used because you can open more than one channel at a time. For example, you can open channel 1 to read and channel 2 to write simultaneously, allowing you to read from one file and write selected parts of it to a second file at the same time. By including the channel number in all of the commands, PB knows which file to access.

 It's like connecting a pipe from PB to the file on disk. The channel number tells PB which pipe to send the data down when writing and which pipe to take the data from when reading. As long as you number the pipe(s), open the correct valves (READ or WRITE) before using the pipe and remember to close the valves when you are finished, you can use as many pipes as you need.
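
 As a rough illustration of two pipes being open at once, the sketch below copies the non-empty lines of one file into a second file. The filenames SOURCE.TXT and COPY.TXT are made up for the example, the commands used are the ones covered later in this tutorial, and it assumes the source file contains at least one line:

```playbasic
Rem Sketch only: SOURCE.TXT and COPY.TXT are hypothetical filenames
InName$="SOURCE.TXT"
OutName$="COPY.TXT"

If FileExist(InName$)
 Rem WriteFile needs a filename which does not already exist
 If FileExist(OutName$) Then DeleteFile OutName$
 ReadFile InName$,1
 WriteFile OutName$,2
   Repeat
     A$=ReadString$(1)
     Rem Pipe 1 is the source file, pipe 2 is the copy
     If A$<>"" Then WriteString 2,A$
   Until EndOFFile(1)=true
 CloseFile 1
 CloseFile 2
Endif
```

 Every read names channel 1 and every write names channel 2, so PB never mixes the two files up.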

 Filename$ can be a specific filename including the full path like:

+ Code Snippet

C:\Program Files\MyprogData\Mydata.dat


It can also be relative, so using just a filename like 'Mydata.dat', the file will be opened in the current project directory (where your PB program is located).

If you have a directory called 'DATA' in the current directory and wanted to save your data to a new file in there, you would set the filename to 'DATA\Mydata.dat'. The process will fail if the directory DATA does not exist though.

So, to open a file called 'Mydata.dat' in the current directory we would use:

WriteFile "Mydata.dat",1

 This creates an empty file called "Mydata.dat" and connects our 'pipe', which is labelled '1'. However, it is very important that the named file DOES NOT ALREADY EXIST. If it does, you will get an error. To avoid this, you have the FileExist() function.




FILEEXIST()


 So, before creating a new file, you should always check to see if it exists already with:

+ Code Snippet

 If FileExist(Filename$)=1
    Rem Do Something About It
 Endif



 Here, the FileExist() function must be given the exact filename string as is used in the WriteFile command or you may not be checking for the existence of the file in the same location. It therefore makes sense to use a variable for the filename - rather than entering the filename literally:

Filename$="Mydata.dat"

 If the file does exist, then the FileExist() function will return 1 (true) and if it doesn't exist will return 0 (false).  So, in our example, the code between the If and Endif lines will only be carried out if the file does exist.

 I put 'Do Something About It' in the above example because you have two options at this point.  As you cannot open a file to save if it already exists, it HAS to be deleted so you can re-create a new one. But, what if the file is there and contains data which you don't want to lose?

Well, we'll cover that later, but for now we'll assume that it can just be deleted. So, we use DeleteFile:

+ Code Snippet

If FileExist(Filename$)=1
 DeleteFile Filename$
Endif



which can be shortened to:

If FileExist(Filename$) Then DeleteFile Filename$


 Here, if you don't say =1, then it is 'implied' - in other words, PB assumes you are testing for true (=1).  Also, as you only have a single action to carry out - not multiple lines of code, you can add the keyword THEN and include the action on the end of the IF line.
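
 If the existing file has to be kept rather than deleted, one possible approach (sketched here purely as an illustration, not as a built-in PB feature) is to keep trying numbered filenames until FileExist() reports one which is free. The 'Mydata' name and the Index variable are hypothetical:

```playbasic
Rem Sketch only: builds numbered names from "Mydata" until one is unused
Index=0
Repeat
  Inc Index
  Filename$="Mydata"+Str$(Index)+".dat"
Until FileExist(Filename$)=0

Rem Filename$ now names a file which does not exist yet
WriteFile Filename$,1
CloseFile 1
```

 The old data is left untouched on disk and the new data goes into a fresh file alongside it.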

 So, having checked for the existence of the file, deleted it if it was found and opened a new file for writing, we now have to write our data to disk.

 Normally, this data would be variables. If you were writing, say, a world editor, then all of the world data the user has created or altered would be in variables like WorldWidth, WorldHeight, etc. All we need to do is write all these relevant variables to disk.

 Once the file has been opened, there are a number of commands to write different types of data. These include WriteBYTE, WriteFLOAT and WriteINT - each of which writes data in an encrypted format.

 When I say encrypted I simply mean that you can't read the data with anything other than PB's respective READ command.  Use WriteFLOAT and you can only access the data with PB's ReadFLOAT - you can't open it with say Windows Notepad and examine the contents.

There is a way around this though, by using WriteString for everything. As mentioned earlier, when you are learning PB, then I think it's important that you are able to write some data to a file then open it in Notepad and see if it contains what you actually thought you were writing.

The fact that all your output is strings is irrelevant - the same data is still stored and you are still learning how to save data to disk.

So, let's see some WriteString examples:

WriteString 1,"This is a sample text string!"


A$="This is a sample text string!"
WriteString 1,A$


 OK, these both do the same thing. The first example writes the literal string enclosed in the quotes to disk, (but not the actual quotes). You could use this method for the very first line of your file to write a header description of the file so if anyone opened the file to look at it, they would see what the file was for.   For example, if the file is a list of Alien Positions, then the first line might be "ALIEN-POSITIONS" to identify what the file contains.


The second example is what you use to write string variables. But, what if your variables are numeric - not string?

That's not a problem, we just convert them to strings when we write them out. For example:



MatrixWidth=20000
MatrixHeight=20000
TilesX=70
TilesZ=70
FloatVar#=44.82

WriteString 1,"This is the header"
WriteString 1,Str$(MatrixWidth)
WriteString 1,Str$(MatrixHeight)
WriteString 1,Str$(TilesX)
WriteString 1,Str$(TilesZ)
WriteString 1,Str$(FloatVar#)



As you can see, the use of Str$() converts the numeric variables to strings before writing them. The original variables are not altered in any way by this process, and it works with float (real) numbers too. If you opened the resulting file with Notepad you would see:


This is the header
20000
20000
70
70
44.82


Having written our data out, we need to close the file. This is done very simply with:

CloseFile Channel

...where Channel is the channel number used when opening the file.

The complete routine for our example would therefore be:


+ Code Snippet

Filename$="Mydata.dat"
If FileExist(Filename$) Then DeleteFile Filename$
MatrixWidth=20000
MatrixHeight=20000
TilesX=70
TilesZ=70
FloatVar#=44.82

WriteFile Filename$,1
 WriteString 1,"This is the header"
 WriteString 1,Str$(MatrixWidth)
 WriteString 1,Str$(MatrixHeight)
 WriteString 1,Str$(TilesX)
 WriteString 1,Str$(TilesZ)
 WriteString 1,Str$(FloatVar#)
CloseFile 1



 OK, that's written an example file, but what about reading the information back in?




READFILE


 This process is very similar to writing files, but using Read instead of Write. It's probably easier to show you the complete routine for reading the file generated by the above example code and then discuss it afterwards:

+ Code Snippet

Filename$="Mydata.dat"
If FileExist(Filename$)
 ReadFile Filename$,1
   Header$ =ReadString$(1)
   MatrixWidth =Val(ReadString$(1))
   MatrixHeight =Val(ReadString$(1))
   TilesX =Val(ReadString$(1))
   TilesZ =Val(ReadString$(1))
   FloatVar# =Val#(ReadString$(1))
 CloseFile 1
Endif



OK, first of all, we check for the existence of the file we are trying to load. To avoid errors we only open the file if it's there. If it isn't, then we don't attempt to open it. That's why all the reading code is enclosed inside the If FileExist(Filename$)...Endif block.

If the file does exist then we use ReadFile along with ReadString$() to get the data. We know that all the data in the file is of type string, so for numeric data we use Val() (for integers) and Val#() (for floats) to convert the string data into a numeric form which can be assigned to an integer or float variable.

There's no way to detect automatically what type of data is in a file, but as you are reading the same data that you wrote out, you already know what each string you read in has to be converted to - if it isn't actually a string. You just have to make sure that you load data strictly in the same order that you wrote it out or nothing will work!

The first of our data items is a text header. As this is unwanted information, we can ignore it once it is loaded, though it MUST be loaded as it's part of the file.  Data files are sequential so in order to read say the third item in the file, the first two must be loaded first. So the rule is load EVERYTHING and ignore what you don't want!

 The next item in our example is MatrixWidth, which is numeric. Once the string version of the value has been loaded via ReadString$(), it needs to be converted with Val() into a numeric form that we can assign to our variable.

The process is repeated for the remaining numeric variables in the file. The only exception is the last piece of data, which is a float, so this time we need to use the floating point version, Val#(). This converts the string "44.82" to the numeric value 44.82, as long as you use a float type variable to receive it.

 Thus FloatVar#=Val#(ReadString$(1)) will result in FloatVar# containing 44.82, which is what we want. However, if you miss off the # symbol, then FloatVar=Val#(ReadString$(1)) will result in FloatVar equalling 44, because without the # it is an integer variable and you will lose the .82 off the end!

Finally the file is closed.
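
Although the header can simply be discarded, it can also be put to work. As a hedged sketch (this check is an optional extra, not something PB requires), the loader below compares the first line against the header text written by the earlier example and only reads the rest of the file if it matches:

```playbasic
Rem Sketch only: skip loading if the first line is not the expected header
Filename$="Mydata.dat"
If FileExist(Filename$)
 ReadFile Filename$,1
   Header$=ReadString$(1)
   If Header$="This is the header"
     MatrixWidth=Val(ReadString$(1))
     MatrixHeight=Val(ReadString$(1))
     TilesX=Val(ReadString$(1))
     TilesZ=Val(ReadString$(1))
     FloatVar#=Val#(ReadString$(1))
   Else
     Print "Unexpected header: "+Header$
   Endif
 CloseFile 1
Endif
Sync
WaitKey
```

This stops the program from feeding the contents of some unrelated file into your variables.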




Saving Arrays:


  Saving arrays can be done in many ways. However, the method we will discuss next will allow you to save all the arrays you want from your program - all in the same file. This is essential if you want to create your own file format.

 Arrays are no more than simple variables in blocks. Each variable in the array can be accessed by using the array's index number and if you can access a variable, you can save it out to disk.  

 Here's a useful example...


Hiscore Tables

 Creating a hiscore table in your program is easy enough, but if it doesn't write the data to disk, the next time the program is run, all the hiscores are lost.

 So, let's assume that our game has a hiscore table which holds the top 10 hiscores and the names of the players who scored them. For this we need two very simple arrays - Hiscore() and PlayerName$(). Hiscore() is an integer array as the hiscores will be numeric and PlayerName$() is naturally a string array.

These are created with:

Dim Hiscore(10)
Dim PlayerName$(10)


 For these tutorials, once again I am purposely ignoring the fact that element 0 exists in an array as it makes life easier - we can refer to players/hiscores 1 to 10 rather than 0 to 9. The file on disk will be called HISCORE.DAT.

So, when your game runs it checks to see if the file HISCORE.DAT exists. If it's the very first time it has been run, then the file will not exist so it must be created and the arrays written out to disk. At this time they will obviously all be empty or contain 0 (zero).

At this point, the arrays written to disk are the same as in memory. The player plays the game and if their score gets on the hiscore table, the arrays are modified. Obviously the first time the game is played, ANY score will get onto the table so they enter their name and the data is stored in the two arrays.

When the game is exited, the existing file HISCORE.DAT is deleted (we already have a later version in memory) and the new contents of the two arrays written out to the file HISCORE.DAT.

The next time the game is run and it checks to see if the file HISCORE.DAT exists, it will be there, so instead of creating a new one, the old hiscore table is read in. Once in memory, our two arrays can be modified when a new hiscore is attained and on exit the hiscore table is just written out again - regardless of whether or not it has changed since last time.

Writing arrays is very simple. All we have to do is write the data in a loop which matches the size of the array. For...Next loops are ideal for this. So, to write our array Hiscore() to disk with 10 elements, we would use:

+ Code Snippet

For N=1 To 10
  WriteString 1,Str$(Hiscore(N))
Next N



As you can see, Str$() is used as before to convert the numeric array data to string when writing it out to disk.

Reading the array back in is also just as simple:

+ Code Snippet

 For N=1 To 10
    Hiscore(N)=Val(ReadString$(1))
 Next N



When writing string arrays, there is no need to convert the data, so we skip the Str$() section and just use:

+ Code Snippet

 For N=1 To 10
    WriteString 1,PlayerName$(N)
 Next N



Reading the string array back in is done with:

+ Code Snippet

 For N=1 To 10
   PlayerName$(N)=ReadString$(1)
 Next N




Saving Multi-Dimensioned Arrays:

 If the array you want to save is a multi-dimensioned array, then the process is identical - we just alter the loop accordingly. To save a numeric integer array which was created with DIM MultiArray(10,5) we would use:

+ Code Snippet

 For Ny=1 To 5
   For Nx=1 To 10
      WriteString 1,Str$(MultiArray(Nx,Ny))
   Next Nx
 Next Ny



 Here, this nested loop will use Nx to write the 10 Nx array values for every Ny value in the Ny loop. So, the contents of MultiArray() will be written using Nx from 1 to 10 with Ny=1, followed by Nx from 1 to 10 with Ny=2 and so on until Ny=5.

Reading back in is the same as with single dimensioned arrays, but using exactly the same nested loop.

+ Code Snippet

 For Ny=1 To 5
   For Nx=1 To 10
     MultiArray(Nx,Ny)=Val(ReadString$(1))
   Next Nx
 Next Ny



  OK, that's how data in arrays is saved to disk and read back in again. Once again, I will stress that it's very, very important that you read in the information in EXACTLY the same order that it was written out. Failure to do this can cause problems - especially when you realise that it is possible for the data you are reading in to be fed into the wrong variables. Your program will often not error during the load process in cases like this as the routine will load any data into any variables so long as the variable types match - they just won't work properly and the problem could be very difficult to trace.

 So back to our hiscore example...

 What we have to do now is place a small routine at the beginning which checks for the hiscore data file, creates it if it doesn't exist and reads it in if it does:

+ Code Snippet

Dim Hiscore(10)
Dim PlayerName$(10)
Filename$="HISCORE.DAT"

If FileExist(Filename$)
 ReadFile Filename$,1
 For N=1 To 10
   Rem Read in the same order the data was written: score first, then name
   Hiscore(N)=Val(ReadString$(1))
   PlayerName$(N)=ReadString$(1)
 Next N
 CloseFile 1
Else
 WriteFile Filename$,1
   For N=1 To 10
     WriteString 1,Str$(Hiscore(N))
     WriteString 1,PlayerName$(N)
   Next N
 CloseFile 1
Endif



 In your game, you write the code which checks the player's score at the end of each game and, if it's higher than the lowest score in the hiscore table, asks for the player's name and inserts the name and score into the two arrays - pushing the bottom entry off the list.
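
 One way this insertion could be written is sketched below. NewScore and NewName$ are hypothetical variables holding the latest result, the values given to them are made up, and the sketch assumes the table is kept sorted with the highest score in element 1. The two Dim lines are only repeated so that the snippet stands on its own:

```playbasic
Rem Sketch only: NewScore and NewName$ are hypothetical variables
Dim Hiscore(10)
Dim PlayerName$(10)
NewScore=500
NewName$="TDK"

If NewScore>Hiscore(10)
 Rem Start at the bottom of the table and work upwards
 Slot=10
 Repeat
   Moved=0
   If Slot>1
     If NewScore>Hiscore(Slot-1)
       Rem Push the entry above down one place, then move up the table
       Hiscore(Slot)=Hiscore(Slot-1)
       PlayerName$(Slot)=PlayerName$(Slot-1)
       Slot=Slot-1
       Moved=1
     Endif
   Endif
 Until Moved=0
 Rem Slot is now the position the new score belongs in
 Hiscore(Slot)=NewScore
 PlayerName$(Slot)=NewName$
Endif
```

 The old entry 10 is overwritten as the lower scores shuffle down, which is what pushes it off the bottom of the list.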

 On exiting the program, we know that the file definitely exists so we just delete it and create a new file containing the contents of the hiscore arrays currently in memory - ready for being read in the next time the program is run.

+ Code Snippet

DeleteFile Filename$
WriteFile Filename$,1
 For N=1 To 10
   WriteString 1,Str$(Hiscore(N))
   WriteString 1,PlayerName$(N)
 Next N
CloseFile 1





File Formats


As you have seen, you can write many different types of variables while a file is open for writing, so when there is a lot of data to be written it's worth planning what order to write the data.

The structure of your data file is called a 'File Format' and all files created with Windows applications have one. There's a bitmap file format, a Microsoft Word file format and so on.

The file format defines for other users the layout of your file and what information can be found where, so they can add routines to their programs giving them the ability to load files created by your programs.

For example in a graphics file format one part of the file is the header, one is reserved for the colour palette and another part of the file will be the data which makes up the picture. You decide where the data goes in your own file format.

There are no fixed rules for designing a file format; just write the data out sensibly and logically. MatEdit, for example, creates a .MDF file with the Build option. If you were to look at an MDF file you would just see numbers - lots of them. Publishing the file format simply describes to others what these numbers are, what variable types they are and so on.

As a rule of thumb, you should have a description of the file type at the start saying what the file is used with. The numeric and string variables should come next and finally all the array data. Try not to have too much unwanted information like comments scattered about the file as it complicates the load routine - you still have to load all the useless information even though you are immediately going to discard it.
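
The rule of thumb above could be sketched as the following save routine. The filename WORLD.DAT, the "MY-WORLD-FILE" description and the TileMap() array are all made up for the example:

```playbasic
Rem Sketch only: WORLD.DAT, MY-WORLD-FILE and TileMap() are hypothetical
Dim TileMap(10,5)
WorldWidth=10
WorldHeight=5
Filename$="WORLD.DAT"

If FileExist(Filename$) Then DeleteFile Filename$
WriteFile Filename$,1
 Rem 1: description of the file type
 WriteString 1,"MY-WORLD-FILE"
 Rem 2: the simple variables
 WriteString 1,Str$(WorldWidth)
 WriteString 1,Str$(WorldHeight)
 Rem 3: the array data
 For Ny=1 To WorldHeight
   For Nx=1 To WorldWidth
     WriteString 1,Str$(TileMap(Nx,Ny))
   Next Nx
 Next Ny
CloseFile 1
```

Writing the width and height before the array data means a loading routine can read them first and then knows how many array values follow.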




Loading Routines


If you write a program which creates a data file usable by other people you will also need to create a loading routine in PB which is supplied with your program. This will normally be a function (or collection of functions) which users can #Include in their programs so they can use your loading/saving functions when required.

If you write a world editor or sprite editor then you want people to be able to use the creations made with your program in their own PB programs. If you don't provide them with a simple way to do this, then they are not going to want to use your program.





Reading Other Files


 ReadFile isn't just restricted to reading files you created yourself with WriteFile.   It can also be used to read information in from other files too. As long as you know the file format, you can read data in from graphics and text files.

 Some of the easiest files to read in are plain ASCII text files created with a text editor (such as Windows Notepad), as each line is going to be a string.

 The only issue is 'how much data do we read in'? As we didn't create the file, we have no idea how long the file is!




EndOFFile()


 Luckily, PB gives us a function called EndOFFile() which uses the syntax:

 EndOFFile(Channel)

...where Channel is the same as the channel used with ReadFile.  This will return true (1) if the end of the file has been reached or false (0) if there is still more data to be read in.

  Using this function in a loop, we can read all of the data in the file without having to know how much is there first. The data from a string-type file like this is usually read into a string array. You just need to dimension the array with a large enough number of elements before reading in the file or an error will occur while reading. Let's see an example:

+ Code Snippet

Filename$ ="DOCUMENT.TXT"

Dim TextLines$(5000)

If FileExist(Filename$)
 ReadFile Filename$,1
   LineCount=0
   Repeat
     Inc LineCount
     TextLines$(LineCount)= ReadString$(1)
   Until EndOFFile(1)=true
 CloseFile 1
Endif



This example creates a string array with 5000 elements and is thus able to read up to 5000 lines from a text file. The filename is set to DOCUMENT.TXT and we use our usual method of placing the loading code inside an If...Endif which checks to see if the named file exists first.

The important part of this example is that we are not using a For...Next loop any longer as we don't know how many lines there are in the file - and we therefore don't have any start and end values for this kind of loop. Instead we use a Repeat...Until loop which uses EndOFFile() to check if the end of the text file has been reached.

The ReadString$() line reads each piece of data into our TextLines$() array. Each time we read a new line we place it in the array at the position given by our LineCount variable. This variable is increased each time we read a line, so it keeps track of the number of lines we've read.


This loop continues reading lines of text from the text file until there are no more lines to read and then drops out of the loop. At this point, LineCount is equal to the number of lines read in from the text file. Knowing this, we can add a For...Next loop to the end of the program which will print the lines read in to the screen:

+ Code Snippet

Print "Lines In File:"+Str$(LineCount)
For lp=1 to LineCount
Print TextLines$(lp)
next
Sync
Waitkey


And that's all there is to reading a text file.  
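
One caveat: the loading loop shown earlier relies on the file having no more than 5000 lines. As a hedged variation (MaxLines is a hypothetical variable which must match the size used in the Dim line), each line can be read into a temporary string and only stored while there is still room in the array:

```playbasic
Rem Sketch only: MaxLines is a hypothetical variable matching the Dim size
MaxLines=5000
Filename$="DOCUMENT.TXT"
Dim TextLines$(5000)

If FileExist(Filename$)
 ReadFile Filename$,1
   LineCount=0
   Repeat
     A$=ReadString$(1)
     Rem Only store the line if there is still room in the array
     If LineCount<MaxLines
       Inc LineCount
       TextLines$(LineCount)=A$
     Endif
   Until EndOFFile(1)=true
 CloseFile 1
Endif
```

Any lines beyond the limit are still read, because the file is sequential, but they are simply thrown away instead of causing an error.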




 OK, that's it for the File Access tutorial. If there's some aspect of File Access you think I've missed and would like to see covered, then let me know.

TDK_Man