Valhalla Legends Archive

Programming => General Programming => Visual Basic Programming => Topic started by: Tontow on April 08, 2004, 12:04 AM

Title: splitting a txt file and setting it to an array
Post by: Tontow on April 08, 2004, 12:04 AM
I know how to open it, but I'm having difficulty when I try to set the .txt file equal to an array.

(I want to open a text file, split it at every vbCrLf (return), and have it returned as an array.
Text file = array, with every line being a different entry in the array.)
Title: Re:splitting a txt file and setting it to an array
Post by: Newby on April 08, 2004, 12:17 AM
If I understand you correctly:

Input a line of the text file, and add it to the array

Loop until you reach the end of the file.

Perhaps show some coding?
Title: Re:splitting a txt file and setting it to an array
Post by: o.OV on April 08, 2004, 12:29 AM
Or you can load the whole file into a temporary string and then use Split.
I don't know which would be best
and I'm not aware of any direct methods.
Title: Re:splitting a txt file and setting it to an array
Post by: Eli_1 on April 08, 2004, 12:34 AM
Quote from: Tontow on April 08, 2004, 12:04 AM
I know how to open it, but I'm having difficulty when I try to set the .txt file equal to an array.

(I want to open a text file, split it at every vbCrLf (return), and have it returned as an array.
Text file = array, with every line being a different entry in the array.)

There are 2 ways (that I would use) to do this:

1.) Input the file line by line into an array

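Dim myArray() As String
ReDim myArray(0)
Open App.Path & "\file.bla" For Input As #1
Do Until EOF(1)
   ' Line Input reads one whole line; plain Input would also split on commas
   Line Input #1, myArray(UBound(myArray))
   ReDim Preserve myArray(UBound(myArray) + 1)
Loop
Close #1

' Drop the unused last element (skipped when the file was empty)
If UBound(myArray) > 0 Then ReDim Preserve myArray(UBound(myArray) - 1)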


Or
2.) Use binary access read to input the whole file, then parse it according to the vbCrLf's with Split

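Dim myArray() As String, buffer As String
Open App.Path & "\file.bla" For Binary Access Read As #1
buffer = Space$(LOF(1))   ' size the buffer to the file length before reading
Get #1, , buffer
Close #1
' A file that ends in vbCrLf leaves one empty element at the end of the array
myArray = Split(buffer, vbCrLf)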


Both are untested so you may have to tweak them some to get it to work. Hope it helps.
Title: Re:splitting a txt file and setting it to an array
Post by: Tontow on April 08, 2004, 12:55 AM
thx, that helped a lot
Title: Re:splitting a txt file and setting it to an array
Post by: Grok on April 08, 2004, 07:02 AM
Maybe TheMinistered or Adron knows how a VB array is constructed in memory, and if it is possible to get a little trickier.  Perhaps loading the whole file into a string, then altering the string to be an array, without having to redim.  I think that redim preserve is going to cause at least linecount copy operations.
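
To put a rough number on that worry, here is a small sketch in C (the language used for the examples further down the thread). It counts element copies under the pessimistic assumption that every resize moves the whole array; whether VB's ReDim Preserve really moves the data each time is not something this thread establishes. The function names are made up for illustration.

```c
#include <stddef.h>

/* Worst case for growing one element at a time, the way a
   "ReDim Preserve arr(UBound(arr) + 1)" loop does: every resize is
   assumed to copy all existing elements to a new block. */
size_t copies_grow_by_one(size_t n)
{
    size_t copies = 0;
    for (size_t size = 1; size < n; size++)
        copies += size;          /* growing from size to size + 1 */
    return copies;
}

/* Same assumption, but the capacity doubles whenever it runs out. */
size_t copies_doubling(size_t n)
{
    size_t copies = 0;
    for (size_t capacity = 1; capacity < n; capacity *= 2)
        copies += capacity;      /* growing from capacity to 2 * capacity */
    return copies;
}
```

Under that assumption, growing by one costs on the order of n*n/2 copies while doubling stays below 2*n, so the growth policy matters far more than the cost of any single resize.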
Title: Re:splitting a txt file and setting it to an array
Post by: Adron on April 08, 2004, 08:23 AM
Quote from: Grok on April 08, 2004, 07:02 AM
Maybe TheMinistered or Adron knows how a VB array is constructed in memory, and if it is possible to get a little trickier.  Perhaps loading the whole file into a string, then altering the string to be an array, without having to redim.  I think that redim preserve is going to cause at least linecount copy operations.

A VB array consists of a number of same-size objects laid out sequentially in memory, just like a C array. An array of String is a bit like a C array of "char*". The pointers will be stored at consecutive locations, but the actual text data may be stored anywhere in memory. This means that you can't turn a long string into an array of strings.

In C, you could do something like:


char buffer[] = "String1\nString2\nString3";
char *strings[3];
strings[0] = strtok(buffer, "\n");
strings[1] = strtok(0, "\n");
strings[2] = strtok(0, "\n");


which would give you 3 strings using the same big buffer. You yourself handle the allocation of memory for the strings, and you know that they all share the same buffer. In VB, the compiler handles allocation of memory for strings, and you can't tell it what memory to use.

If you did some magic to make VB use the same memory buffer for all strings, you'd get errors later when VB tried to free the memory used by each string separately.

If VB isn't stupid, it won't reallocate the memory for each string when you redim the array of strings. It will just move the pointers, which will be a rather fast operation. It should be nearly equivalent in speed to the solution in C above, because there too you need to "redim" the strings array of pointers if you don't know the number of lines beforehand.
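
A minimal sketch of that "redim" in C, assuming the whole file is already in one writable, NUL-terminated buffer. split_lines is a hypothetical helper, not a library function. Only the array of pointers is reallocated; the text stays where it is.

```c
#include <stdlib.h>
#include <string.h>

/* Split buffer in place on '\n' and hand back a malloc'd array of
   pointers into it. Returns the number of lines, or (size_t)-1 if an
   allocation fails. Note that strtok skips empty lines. */
size_t split_lines(char *buffer, char ***out)
{
    size_t count = 0, capacity = 4;
    char **lines = malloc(capacity * sizeof *lines);
    if (lines == NULL)
        return (size_t)-1;

    for (char *tok = strtok(buffer, "\n"); tok != NULL; tok = strtok(NULL, "\n")) {
        if (count == capacity) {
            /* The "redim" step: only pointers move, never the text. */
            char **grown = realloc(lines, 2 * capacity * sizeof *lines);
            if (grown == NULL) {
                free(lines);
                return (size_t)-1;
            }
            lines = grown;
            capacity *= 2;
        }
        lines[count++] = tok;
    }
    *out = lines;
    return count;
}
```

The caller frees the pointer array with free() and must keep the buffer alive for as long as the strings are in use, since every entry points into it.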

In C, you could also turn it into an actual array of strings without doing any more assignments at all, but only if the strings are fixed length. That would look something like this:


char buffer[] = "String1\0String2\0String3";
char (*strings)[8];
strings = (char (*)[8])buffer;


Here you are telling the compiler that "buffer" is actually an N by 8 (N = 3 in this case) matrix of characters. Each line in the matrix is one string. When you're reading the data from the file you have to replace the '\n' at the end of each line by the string-terminator '\0'.
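
The replacement step could look roughly like this. It is a sketch that assumes every record is exactly 8 bytes wide (7 characters plus the newline); record_t and as_records are names invented for the example.

```c
#include <stddef.h>

#define RECORD_WIDTH 8   /* 7 characters plus one terminator, as above */

typedef char record_t[RECORD_WIDTH];

/* Overwrite the '\n' that ends each fixed-width record with '\0', then
   view the same memory as an N by RECORD_WIDTH matrix of characters.
   Assumes size is a multiple of RECORD_WIDTH; nothing is copied. */
record_t *as_records(char *buffer, size_t size, size_t *count)
{
    size_t n = size / RECORD_WIDTH;
    for (size_t i = 0; i < n; i++) {
        char *last = &buffer[i * RECORD_WIDTH + RECORD_WIDTH - 1];
        if (*last == '\n')
            *last = '\0';
    }
    *count = n;
    return (record_t *)buffer;
}
```

After the call, recs[i] is an ordinary C string for each record i, and the returned pointer is the original buffer itself.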

Title: Re:splitting a txt file and setting it to an array
Post by: iago on April 08, 2004, 11:47 AM
Perhaps it would be faster to scan in the file, count the endlines, and then read it in?  I don't know how the second file operation will compare to the redims, but I DO know that the second time you read the file it'll be faster due to caching.
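
A sketch of the count-first idea in C. As a simplification it makes both passes over a buffer that is already in memory rather than reading the file twice, so it says nothing about the caching effect. count_lines and split_exact are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* First pass: count the lines. A final line without a trailing '\n'
   still counts as a line. */
size_t count_lines(const char *text)
{
    size_t n = 0;
    const char *p = text;
    for (; *p != '\0'; p++)
        if (*p == '\n')
            n++;
    if (p != text && p[-1] != '\n')
        n++;
    return n;
}

/* Second pass: allocate the pointer array exactly once, then split the
   NUL-terminated buffer in place. Returns NULL if allocation fails. */
char **split_exact(char *buffer, size_t *count)
{
    size_t n = count_lines(buffer);
    char **lines = malloc((n > 0 ? n : 1) * sizeof *lines);
    if (lines == NULL)
        return NULL;

    size_t i = 0;
    char *start = buffer;
    while (i < n) {
        char *end = strchr(start, '\n');
        lines[i++] = start;
        if (end == NULL)
            break;
        *end = '\0';
        start = end + 1;
    }
    *count = i;
    return lines;
}
```

Unlike a strtok-based split, this keeps empty lines as empty strings, which is closer to what VB's Split does.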
Title: Re:splitting a txt file and setting it to an array
Post by: Adron on April 08, 2004, 12:51 PM
Another possibility would be to have a collection of arrays and add one array at a time, each array larger than the last, then only reallocate it once at the end.

Another possibility would be to check the filesize, guess using some reasonable statistic how many lines there will be, and allocate enough room + some margin for that right away. Then if you hit the limit, you do a new estimate based on the data you've read so far. And at the end, you redim it *down* which should hopefully not involve any copying of data.
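
The estimate-and-shrink variant might be sketched like this in C. The margin of 16 and the guess_len parameter are arbitrary assumptions, split_estimated is a made-up name, and the buffer is assumed to hold size bytes of text followed by a terminating '\0'.

```c
#include <stdlib.h>
#include <string.h>

#define MARGIN 16   /* arbitrary safety margin, chosen for the example */

/* Split buffer in place on '\n', sizing the pointer array from a guessed
   average line length. Returns NULL if allocation fails. */
char **split_estimated(char *buffer, size_t size, size_t guess_len, size_t *count)
{
    if (guess_len == 0)
        guess_len = 1;
    size_t capacity = size / guess_len + MARGIN;
    char **lines = malloc(capacity * sizeof *lines);
    if (lines == NULL)
        return NULL;

    size_t n = 0;
    char *start = buffer, *stop = buffer + size;
    while (start < stop) {
        if (n == capacity) {
            /* Hit the limit: re-estimate from the lines read so far. */
            size_t avg = (size_t)(start - buffer) / n;
            if (avg == 0)
                avg = 1;
            size_t estimate = size / avg + MARGIN;
            if (estimate <= capacity)
                estimate = capacity * 2;
            char **grown = realloc(lines, estimate * sizeof *lines);
            if (grown == NULL) {
                free(lines);
                return NULL;
            }
            lines = grown;
            capacity = estimate;
        }
        lines[n++] = start;
        char *nl = memchr(start, '\n', (size_t)(stop - start));
        if (nl == NULL)
            break;
        *nl = '\0';
        start = nl + 1;
    }

    /* Redim *down* to the real line count at the end. */
    if (n > 0) {
        char **trimmed = realloc(lines, n * sizeof *lines);
        if (trimmed != NULL)
            lines = trimmed;
    }
    *count = n;
    return lines;
}
```

With a good guess the array is allocated once and trimmed once; with a bad guess it costs one or two extra reallocations instead of one per line.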
Title: Re:splitting a txt file and setting it to an array
Post by: Eli_1 on April 08, 2004, 01:29 PM
Quote from: Eli_1 on April 08, 2004, 12:34 AM
1.) Input the file line by line into an array
<codeblock>
Or
2.) Use binary access read to input the whole file, then parse it according to the vbCrLf's with Split
<codeblock>
I was bored, so I ran those two methods *on my crappy computer* against a few different files to see which one was faster. Here are my results (in ms).

On a file with only 56 lines (readme.txt):
   Method with ReDim: 16
   Method with Split   : 11

   Method with ReDim: 17
   Method with Split   : 11

   Method with ReDim: 21
   Method with Split   : 11

On a file with 550 lines (win.ini):
   Method with ReDim: 29
   Method with Split   : 20

   Method with ReDim: 35
   Method with Split   : 10

   Method with ReDim: 33
   Method with Split   : 24

On a file with 2589 lines (list from BrooDat.mpq):
   Method with ReDim: 117
   Method with Split   : 159

   Method with ReDim: 157
   Method with Split   : 181

   Method with ReDim: 172
   Method with Split   : 174

So it seems like the second method is faster on small files (about 11 ms against 16 to 21 ms at 56 lines), but by 2589 lines the first method is as fast or faster. So the second method would be faster for the average config/shitlist/whatever (on my comp.)
Title: Re:splitting a txt file and setting it to an array
Post by: Adron on April 08, 2004, 01:40 PM
It's more important to get good timings for a large list though - no one cares about 20 or 50 ms, but when it's 5000 or 10000 people will start caring...
Title: Re:splitting a txt file and setting it to an array
Post by: Eli_1 on April 08, 2004, 01:41 PM
Then in that case the first method would be a better choice.  :-\