
Binary files

Binary files are the format used by most systems to store information. We will begin with a simple example: an array holding one channel of 2-byte integers (`short') generated by an analog-to-digital converter (ADC). As you know, analog-to-digital converters translate a voltage into an integer. Each converter has a precision that indicates the number of levels available within its conversion range. An 8-bit converter samples with 256 levels (2^8), which is not enough for most neurophysiological applications.

Values coded with 12 or 16 bits (4096 or 65536 levels, respectively) are frequently used. Each value from an 8-bit ADC fits in one byte, but a 12-bit or 16-bit value needs two bytes. So we have to read values from a file that represents each value with two bytes. (By the way, if we want to store 30 seconds of electromyographic activity at 20000 samples per second, each second requires 40000 bytes and the whole segment needs 1200000 bytes, about 1.2 MB. Such a signal can be stored on a floppy disk without any kind of compression.)
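
The arithmetic in the previous paragraph can be checked with a few lines of Scilab. This is only a sketch: the variable names are ours, and the numbers are the ones quoted in the text.

```scilab
// Levels and storage size for the examples quoted in the text
bits   = [8 12 16];                   // converter resolutions
levels = 2 .^ bits;                   // number of levels for each resolution
bytes_per_sample = ceil(bits / 8);    // whole bytes needed per value

rate     = 20000;                     // samples per second
duration = 30;                        // seconds
bytes_per_second = rate * 2;          // 2-byte samples
total_bytes = bytes_per_second * duration;

mprintf("%d bytes per second, %d bytes in total\n", bytes_per_second, total_bytes);
```

A 12-bit value does not fill its two bytes, but it still needs both of them, which is why `ceil' is used for the number of bytes per sample.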

Variables are stored in a way similar to that used by other languages. The following paragraph is taken from the help of `mget':

"l","i","s","ul","ui","us","d","f","c","uc"
               : for reading respectively a long, an int, a short, an
               unsigned long, an unsigned int, an unsigned short, a double, a
               float, a char and an unsigned char.
We are going to begin with `char', which uses one byte per item. Our first step is to create a file containing a single character, the letter `S', which corresponds to ASCII code 83. The file will be called `filechar1.test'. Before saving the variable we have to create the file, and once the character is written we close the file. To read the value back we follow a similar process: open the file, read, and close.


-->fid = mopen ('filechar1.test','w');       // creates file

-->x = 83;

-->mput ( x, 'c', fid)                       // writes x
 ans  =

    83.

-->mclose(fid)                               // closes file
 ans  =

    0.

-->fid = mopen ('filechar1.test','r');       // opens file for reading
      
-->y = mget( 1, 'c', fid)                    // reads one char (first argument is the count)
 y  =

    83.

-->mclose(fid)                               // closes file
 ans  =

    0.
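
The same round trip can be written as a short script. What follows is a minimal sketch and not part of the session above: it assumes that we want the file in Scilab's temporary directory (`TMPDIR'), it adds `b' to the mode string to ask for binary mode, and it uses the second output of `mopen' to check that the file was really opened.

```scilab
// Round trip of one char with binary mode and an error check.
// The location of the file (TMPDIR) is only an illustration.
fname = fullfile(TMPDIR, "filechar1.test");

[fid, err] = mopen(fname, "wb");      // "b" requests binary mode
if err <> 0 then
    error("cannot create " + fname);
end
mput(83, "c", fid);                   // ASCII code of the letter S
mclose(fid);

[fid, err] = mopen(fname, "rb");
if err <> 0 then
    error("cannot open " + fname);
end
y = mget(1, "c", fid);                // first argument: number of items to read
mclose(fid);

disp(ascii(y));                       // converts the code back to a character
```

Asking for the error code keeps the script from going on with an invalid file identifier when the file cannot be opened.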
Now we are going to look indirectly at the structure of the file. We generate an array of one-byte integers (char) from -10 to 10, and we are going to read it in two formats: as `c' (character) and as `uc' (unsigned character). To follow the access to the file we are going to use two functions: `mtell', which reports the position where the next data will be read, and `mseek', which sets that position.


-->fid = mopen ('filechar2.test','w');        // creates file

-->x = [ -10 , -5, 0, 5, 10];                 // defines a vector

-->mput(x,'c',fid);                           // writes the values
                                     
-->mclose(fid)                                // closes file
 ans  =

    0.
We are going to recover the values we previously stored:


-->fid = mopen ('filechar2.test','r');        // opens file for reading
                                     
-->y =mget(1,'c',fid)                         // reads first value
 y  =

  - 10.
 
-->mtell(fid)                                 // tells position in file
 ans  =

    1.

-->meof(fid)                                  // checks not eof
 ans  =

    0.
The function `mtell' indicates the position in the file that is going to be read next, and `meof' indicates whether the end of the file has been reached (0 means that it has not).
 
-->y =mget(1,'c',fid)                        // reads second value
 y  =

  - 5.

-->mtell(fid)                                 // tells position in file
 ans  =

    2.

-->meof(fid)                                  // checks not eof
 ans  =

    0.
Once we read one character, the position advances one byte. Now we are going to read several characters with one command:

 
-->y =mget(3,'c',fid)                        // gets three values 
 y  =

!   0.    5.    10. !

-->mtell(fid)                                 // tells position in file
 ans  =

    6.

-->meof(fid)                                  // checks not eof
 ans  =

    0.
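Five one-byte values were written, so a position of 5 would be expected here. The 6 reported above may be due to newline translation of the byte 10 (line feed) on systems that distinguish text mode from binary mode; opening the file with `wb' and `rb' should avoid it. If we try to continue reading: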


-->y =mget(1,'c',fid)                         // tries to get another value
 y  =

     []

-->mtell(fid)                                 // checks position
 ans  =

    6.

-->meof(fid)                                  // checks whether eof is reached
 ans  =

    16.
We do not get any value: we have reached the end of the file, and `meof' now returns a nonzero value. We can set the position with the command `mseek':


-->mseek(0)                                   // position to first value

-->mtell(fid)                                 // checks position
 ans  =

    0.

-->y =mget(1,'c',fid)                         // gets value
 y  =

  - 10.
 
-->mseek(0)                                   // position to first value

-->mtell(fid)                                 // checks position
 ans  =

    0.
Now we read the same byte again, this time as an unsigned char:

-->y =mget(1,'uc',fid)                        // gets char as unsigned
 y  =

    246.

-->mclose(fid)                                // closes file
 ans  =

    0.
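
The positioning functions can also be combined to find out how many bytes a file holds and to read all of them with one command. The following lines are a sketch under some assumptions of ours: the file is rewritten in binary mode in `TMPDIR', and `mseek' is called with an explicit file identifier and the flags `set' and `end'.

```scilab
// Write five one-byte values, then read the whole file back in two formats
fname = fullfile(TMPDIR, "filechar2.test");   // illustrative location
fid = mopen(fname, "wb");
mput([-10 -5 0 5 10], "c", fid);
mclose(fid);

fid = mopen(fname, "rb");
mseek(0, fid, "end");            // jump to the end of the file
nbytes = mtell(fid);             // the position there is the size in bytes
mseek(0, fid, "set");            // back to the beginning
y  = mget(nbytes, "c", fid);     // all values as signed chars
mseek(0, fid, "set");
yu = mget(nbytes, "uc", fid);    // the same bytes as unsigned chars
mclose(fid);

disp(y);
disp(yu);
```

Only the negative values change between the two readings; 0, 5 and 10 have the same representation as signed and as unsigned bytes.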
Notice that each value is stored in one byte (as `mtell' shows) and that the same byte in the file can be interpreted in different ways: read as signed it gives -10, read as unsigned it gives 246 (256 - 10). Now we are going to repeat a similar sequence of commands with 2-byte storage (short).

-->fid = mopen ('filechar3.test','w');        // creates file

-->x = [ -10 , -5, 0, 5, 10];

-->mput(x,'s',fid);                           // puts vector x as short

-->mclose(fid)                                // closes file
 ans  =

    0.
In this format (`s') integers are stored in two bytes.

-->fid = mopen ('filechar3.test','r');        // opens file

-->mtell(fid)                                 // checks position
 ans  =

    0.

-->y =mget(1,'s',fid)                         // gets first value
 y  =

  - 10.
 
-->mtell(fid)                                 // position (2 bytes right)
 ans  =

    2.
The position is `2' instead of `1', indicating that two bytes have been read:

-->y =mget(1,'s',fid)                         // gets second value
 y  =

  - 5.

-->mtell(fid)                                 // position (2 bytes right)
 ans  =

    4.
We set the position to the beginning.

-->mseek(0);                                  // sets position to 0

-->y =mget(1,'s',fid)                         // gets first value
 y  =

  - 10.

-->mseek(0);                                  // sets position to 0

-->y =mget(1,'us',fid)                        // value as unsigned short
 y  =

    65526.

-->mclose(fid)                                // closes file
 ans  =

    0.
Of course, reading `-10' as an unsigned short produces a different value (65526, that is 65536 - 10) from the one obtained when it is read as an unsigned char (246).
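
The relation between the signed and the unsigned readings can be reproduced without any file. This is a sketch that assumes the wrap-around seen in the sessions above (a negative value shows up as the range size plus the value); `pmodulo' and `bool2s' are standard Scilab functions, and the variable names are ours.

```scilab
// Signed values reinterpreted as unsigned, for one and two bytes
x = [-10 -5 0 5 10];
as_uc = pmodulo(x, 2^8);      // what reading the bytes as 'uc' shows
as_us = pmodulo(x, 2^16);     // what reading the shorts as 'us' shows

// And back: unsigned values in the upper half of the range are negative
back = as_us - 2^16 * bool2s(as_us >= 2^15);

disp(as_uc);
disp(as_us);
disp(back);
```

The first element of `as_uc' and of `as_us' should match the values 246 and 65526 obtained with `mget' in the sessions above.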

We see that each value is represented by two bytes (as `mtell' shows) and that negative values are encoded relative to the two-byte range. We can also mix different kinds of data in the same file:


-->fid = mopen ('filetypes.test','w');        // creates file

-->i = 5                                      // value to be written
 i  =

    5.

-->mput (i,'c',fid);                          // as a character

-->mput (i,'s',fid);                          // as short

-->mput (i,'l',fid);                          // as long

-->mput (i,'f',fid);                          // as float 

-->mput (i,'d',fid);                          // as double

-->mclose(fid)                                // closes file
 ans  =

    0.
We wrote different kinds of data and now we are going to read them back in the same order:

-->fid = mopen ('filetypes.test','r');        // opens file

-->mtell(fid)                                 // checks position
 ans  =

    0.

-->i1 = mget(1,'c',fid)                       // reads value as a character
 i1  =

    5.

-->mtell(fid)                                 // position right 1 byte
 ans  =

    1.
We read `i1' as one-byte data (`c')

-->i2 = mget(1,'s',fid)                       // reads data as short
 i2  =

    5.

-->mtell(fid)                                 // position right 2 bytes
 ans  =

    3.
We read `i2' as two-byte data (`s') and the position in the file is 3.

-->i3 = mget(1,'l',fid)                       // reads data as long
 i3  =

    5.

-->mtell(fid)                                 // position right 4 bytes
 ans  =

    7.
We read `i3' as four-byte data (`l') and the position in the file is 7.

-->f1 = mget(1,'f',fid)                       // reads data as float
 f1  =

    5.

-->mtell(fid)                                 // position right 4 bytes
 ans  =

    11.
Float (`f') is stored as four-byte data and the position in the file is 11.

-->f1 = mget(1,'d',fid)                       // reads data as double
 f1  =

    5.

-->mtell(fid)                                 // position right 8 bytes
 ans  =

    19.
Double (`d') is stored as 8-byte data and the position in the file is 19.

-->mclose(fid)                                // closes file
 ans  =

    0.
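
The size of each type does not need to be memorized: it can be measured by writing one value of each type and asking `mtell' for the position after each write. The lines below are a sketch with names of our own. We make no claim about the result on every system; in particular, the size of `l' may differ between Scilab versions and platforms, so it is worth checking it on yours.

```scilab
// Measure how many bytes each type takes in a file
fname = fullfile(TMPDIR, "filetypes.test");   // illustrative location
types = ["c" "s" "l" "f" "d"];
pos   = zeros(1, size(types, "*"));

fid = mopen(fname, "wb");
for k = 1:size(types, "*")
    mput(5, types(k), fid);       // one value of this type
    pos(k) = mtell(fid);          // position after the write
end
mclose(fid);

sizes = diff([0 pos]);            // bytes taken by each value
for k = 1:size(types, "*")
    mprintf("type %s: %d bytes, position %d\n", types(k), sizes(k), pos(k));
end
```

To read such a file back, the values have to be requested in the same order and with the same types that were used to write them.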
Usually, files contain not only data but also information about the acquisition in some part of the file. As examples of this kind of binary file we are going to treat `wav' files and `edf' files.
j 2003-01-23