• Welcome to Valhalla Legends Archive.
 

dumpfile

Started by Eli_1, June 13, 2004, 05:30 PM


Eli_1

I was trying to make a program that would read a file, create a hex dump of it, and put it in a .txt file. I tested it on storm.dll, and it works... in 21 minutes...


Const FILE_READ As String = "C:\Program Files\Starcraft\storm.dll"
Const FILE_DUMP As String = "C:\WINDOWS\DESKTOP\hexdump.txt"
Const READ_SIZE As Long = 16

Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" (pDst As Any, pSrc As Any, ByVal ByteLen As Long)
Private Declare Function GetTickCount Lib "kernel32" () As Long

Dim startdump As Long


Private Sub dumpfile(ByVal filename As String, ByVal outfile As String)
   Dim readcnt As Long
   Dim inbuf As String
   Dim tmpbuf As String
   Dim readtime As Double
   
   ' This is to figure out how long the
   ' whole process takes.
   startdump = GetTickCount
   
   
   Close #2
   Open filename For Binary Access Read As #2
   
   If LOF(2) = 0 Then Exit Sub
   
   pbar.Max = LOF(2)
   inbuf = Space$(LOF(2))
       
   Get #2, , inbuf
   While LenB(inbuf) <> 0
       tmpbuf = Left$(inbuf, 16)
       inbuf = Right$(inbuf, Len(inbuf) - 16)
       fappend hexdump(tmpbuf, readcnt), outfile
       readcnt = readcnt + 16

       pbar.Value = readcnt
       Label1.Caption = readcnt & "/" & pbar.Max
       pbar.Refresh
       DoEvents
   Wend
   
   readtime = (GetTickCount - startdump) / 60000
   Close #2
   Label2.Caption = "Finished: ~" & readtime & " mins."
End Sub



Private Function hexdump(ByVal data As String, ByRef loffset As Long) As String
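   Dim tmpbuf As String
   Dim spacebuf As String
   Dim perbuf As String
   Dim tmpoffset As String * 2
   Dim offset As String
   Dim tmphex As String   ' was used below without a declaration
   Dim i As Long          ' loop counter, likewise undeclared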
   
   spacebuf = Space$(16)
   perbuf = String(16, ".")
   
   CopyMemory ByVal tmpoffset, loffset, 2
       
   tmphex = Hex(Asc(Mid$(tmpoffset, 2, 1)))
   offset = IIf(Val("&H" & tmphex) < 16, "0" & tmphex, tmphex)
   tmphex = Hex(Asc(Mid$(tmpoffset, 1, 1)))
   offset = IIf(Val("&H" & tmphex) < 16, offset & "0" & tmphex, offset & tmphex)
       
   tmpbuf = strtohex(data)
   For i = 1 To Len(data)
       If Asc(Mid$(data, i, 1)) <> 32 Then
           If Asc(Mid$(data, i, 1)) < 48 Or Asc(Mid$(data, i, 1)) > 122 Then Mid$(data, i, 1) = "."
       End If
   Next i
   
   hexdump = offset & ": " & tmpbuf & Left$(spacebuf, 16 - Len(data)) & " " & data & Left$(perbuf, 16 - Len(data))
End Function


Private Function strtohex(ByVal data As String, Optional addspace As Byte = 0) As String
   Dim buffer As String
   Dim tmphex As String
   Dim i As Long   ' loop counter, previously undeclared
   
   For i = 1 To Len(data)
       tmphex = Hex(Asc(Mid$(data, i, 1)))
       If Val("&H" & tmphex) < 16 Then tmphex = "0" & tmphex
       
       buffer = IIf(addspace = 0, buffer & tmphex & " ", buffer & tmphex)
   Next i
   
   strtohex = Left$(buffer, Len(buffer) - 1)
End Function

Private Sub fappend(ByVal data As String, ByVal filename As String)
   Close #1
   Open filename For Append As #1
   Print #1, data
   Close #1
End Sub

Private Sub fclear(ByVal filename As String)
   Close #1
   Open filename For Output As #1
   Close #1
End Sub

Private Sub Form_Activate()
   fclear FILE_DUMP
   dumpfile FILE_READ, FILE_DUMP
End Sub



Sample output:
Quote
0000: 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 MZ..............
0010: B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 ........@.......
0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0030: 00 00 00 00 00 00 00 00 00 00 00 00 E0 00 00 00 ................
0040: 0E 1F BA 0E 00 B4 09 CD 21 B8 01 4C CD 21 54 68 ...........L..Th
0050: 69 73 20 70 72 6F 67 72 61 6D 20 63 61 6E 6E 6F is program canno
0060: 74 20 62 65 20 72 75 6E 20 69 6E 20 44 4F 53 20 t be run in DOS
0070: 6D 6F 64 65 2E 0D 0D 0A 24 00 00 00 00 00 00 00 mode............
0080: 24 F7 6A E0 60 96 04 B3 60 96 04 B3 60 96 04 B3 ..j.`...`...`...
0090: 73 9E 59 B3 62 96 04 B3 60 96 05 B3 A2 96 04 B3 s.Y.b...`.......
00A0: E3 9E 59 B3 6B 96 04 B3 E3 8A 0A B3 64 96 04 B3 ..Y.k.......d...
00B0: 60 96 04 B3 43 96 04 B3 34 B5 35 B3 7C 96 04 B3 `...C...4.5.....
00C0: A7 90 02 B3 61 96 04 B3 9F B6 00 B3 61 96 04 B3 ....a.......a...
00D0: 52 69 63 68 60 96 04 B3 00 00 00 00 00 00 00 00 Rich`...........
00E0: 50 45 00 00 4C 01 06 00 A2 FB AB 40 00 00 00 00 PE..L......@....

. . .

FFF0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................


Any idea why it takes so long (~21 minutes)?

Adron

#1
Quote from: Eli_1 on June 13, 2004, 05:30 PM

..
       inbuf = Right$(inbuf, Len(inbuf) - 16)
..      


This is a very slow operation, because it will copy the entire file contents to a new buffer. It makes the algorithm O(n^2). Reading the input file 16 bytes at a time would be faster, or using Mid to extract the data from inbuf.
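
As a rough sketch of the first suggestion (reading the input 16 bytes at a time), the loop below pulls each chunk straight from the file into a Byte array, so no large string is ever re-copied. With the Right$ approach, a file of n bytes takes about n/16 passes that each copy up to n bytes, which is where the O(n^2) comes from. DumpChunks and CHUNK_SIZE are hypothetical names, not part of the original program, and the formatting step is left as a comment.

```vb
' Sketch only: DumpChunks and CHUNK_SIZE are illustrative names.
Private Const CHUNK_SIZE As Long = 16

Private Sub DumpChunks(ByVal inPath As String)
    Dim f As Integer
    Dim total As Long
    Dim pos As Long
    Dim n As Long
    Dim chunk() As Byte

    f = FreeFile
    Open inPath For Binary Access Read As #f
    total = LOF(f)                      ' read the length once

    ReDim chunk(0 To CHUNK_SIZE - 1)
    pos = 0
    Do While pos < total
        n = total - pos
        If n >= CHUNK_SIZE Then
            n = CHUNK_SIZE
        Else
            ReDim chunk(0 To n - 1)     ' short final chunk
        End If

        Get #f, pos + 1, chunk          ' file positions are 1-based
        ' ... format the n bytes in chunk() here ...
        pos = pos + n
    Loop

    Close #f
End Sub
```

Each pass touches only the bytes it needs, so the total work grows linearly with the file size.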


Quote from: Eli_1 on June 13, 2004, 05:30 PM

...
       pbar.Refresh
       DoEvents
...


This is slow; don't do it on every iteration.


Quote from: Eli_1 on June 13, 2004, 05:30 PM

Private Sub fappend(ByVal data As String, ByVal filename As String)
   Close #1
   Open filename For Append As #1
   Print #1, data
   Close #1
End Sub


Do you have On Error Resume Next? I don't see how the first Close succeeds. Opening and closing files is typically slow. Don't open and close files unnecessarily.
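
A minimal sketch of keeping the output file open for the whole dump instead of reopening it for every line. OpenDump, DumpLine, CloseDump and the outFile variable are hypothetical names introduced here, and FreeFile picks the file number rather than hard-coding #1.

```vb
' Sketch only: these names are not from the original program.
Private outFile As Integer              ' module-level file number

Private Sub OpenDump(ByVal filename As String)
    outFile = FreeFile
    ' For Output truncates an existing file, so no separate clear step is needed.
    Open filename For Output As #outFile
End Sub

Private Sub DumpLine(ByVal text As String)
    Print #outFile, text                ' file stays open between calls
End Sub

Private Sub CloseDump()
    Close #outFile
End Sub
```

The idea is to call OpenDump once before the loop, DumpLine for each formatted row, and CloseDump once at the end, so the open/close cost is paid a single time.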

Eli_1

#2
Thanks, Adron. I'll try some of your suggestions and get back to you.

Quote
Reading the input file 16 bytes at a time would be faster ...

That was how the program was originally designed. Both ways seem equally slow.

Eli_1

Quote

..
       inbuf = Right$(inbuf, Len(inbuf) - 16)
..      


This is a very slow operation, because it will copy the entire file contents to a new buffer. It makes the algorithm O(n^2). Reading the input file 16 bytes at a time would be faster, or using Mid to extract the data from inbuf.

I'm using Mid to extract the data now (this made the huge difference).


Quote

...
       pbar.Refresh
       DoEvents
...


This is slow; don't do it on every iteration.

I now do it on every 100th iteration.



The read/dump time on storm.dll is now ~1.59 minutes. This is a HUGE difference from the original 21 minutes. I'm going to try writing to 'hexdump.txt' only on every 100th iteration, along with the progress bar update. I'll post again if it makes a significant change. Thanks, Adron.

Eli_1

#4
The main changes made are here:


...

While readcnt <> LOF(2)
       tmpbuf = Mid$(inbuf, readcnt + 1, 16)
       buffer = buffer & hexdump(tmpbuf, readcnt) & vbCrLf
       readcnt = readcnt + 16
       
       If readcnt Mod (16 * 50) = 0 Or readcnt = LOF(2) Then
           pbar.Value = readcnt
           Label1.Caption = readcnt & "/" & pbar.Max
           pbar.Refresh
           fappend buffer, FILE_DUMP: buffer = vbNullString
           DoEvents
       End If
   Wend

...


The new time is ~.8 minutes, but that's still a long time. How is it that some hex editors, like HexWorkshop, seem to read a file and create identical output almost instantly?

K

#5
This may not make a big difference, but every function call has some overhead associated with it, and I'm not sure what exactly goes on inside LOF() -- it might be a very expensive function.  Try storing the length of the file in a variable when the program first runs and using that variable in place of the LOF() calls.

hismajesty

#6
Quote from: Eli_1 on June 13, 2004, 05:30 PM

Private Sub fappend(ByVal data As String, ByVal filename As String)
   Close #1
   Open filename For Append As #1
   Print #1, data
   Close #1
End Sub



Do you have On Error Resume Next? I don't see how the first Close succeeds. Opening and closing files is typically slow. Don't open and close files unnecessarily.

That wouldn't cause an error and would execute fine, but it's bad practice. He should just do something like:
Dim fFile As Integer
fFile = FreeFile 'Get an unused file number



Eli_1

#7
Quote from: K on June 13, 2004, 08:18 PM
... and I'm not sure what exactly goes on inside LOF() -- it might be a very expensive function.  Try storing the length of the file in a variable when the program first runs and using that variable in place of the LOF() calls.

It did make a small difference. The time is at .7 minutes now.

Adron

Quote from: Eli_1 on June 13, 2004, 07:19 PM
The new time is ~.8 minutes, but that's still a long time. How is it that some hex editors, like HexWorkshop, seem to read a file and create identical output almost instantly?

Hex editors typically don't read the entire file. They put up a window with a scrollbar sized for the entire file (which only needs the file length, i.e. LOF), and then read just as much of the file as they need to display what you see (less than 1 KB).
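
To illustrate the idea, here is a small sketch that reads only the bytes for the visible part of the view, starting at whatever offset the scrollbar points to. ReadWindow and its parameters are hypothetical names; this is not how any particular hex editor is implemented.

```vb
' Sketch only: reads at most maxBytes starting at a 0-based file offset
' into buf() and returns the number of bytes actually read.
Private Function ReadWindow(ByVal path As String, ByVal offset As Long, _
                            ByVal maxBytes As Long, ByRef buf() As Byte) As Long
    Dim f As Integer
    Dim avail As Long

    f = FreeFile
    Open path For Binary Access Read As #f

    avail = LOF(f) - offset             ' bytes left after the offset
    If avail > maxBytes Then avail = maxBytes

    If avail > 0 Then
        ReDim buf(0 To avail - 1)
        Get #f, offset + 1, buf         ' Get uses 1-based positions
        ReadWindow = avail
    Else
        ReadWindow = 0                  ' offset is at or past end of file
    End If

    Close #f
End Function
```

For a view showing 32 rows of 16 bytes, a call such as `n = ReadWindow(FILE_READ, 0, 512, visible)` (with `visible` declared as `Dim visible() As Byte`) would read 512 bytes no matter how large the file is.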

drivehappy

You may want to pass some of your larger variables in as ByRef so that it doesn't need to create a copy of one each time. You need to make sure that you don't change it in the function though.

Eli_1

#10
Quote from: drivehappy on June 14, 2004, 12:13 PM
You may want to pass some of your larger variables in as ByRef so that it doesn't need to create a copy of one each time. You need to make sure that you don't change it in the function though.

I'll try that in just a little bit. It didn't make a noticeable difference.  :(


I got the time down to .4 minutes by using SetPriorityClass.
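
The post does not say which priority class was used or how it was called, so the following is only a guess at the shape of the call: it raises the current process to HIGH_PRIORITY_CLASS for the duration of the dump and restores NORMAL_PRIORITY_CLASS afterwards. RunDumpAtHighPriority is a hypothetical wrapper; dumpfile, FILE_READ and FILE_DUMP are the names from the first post.

```vb
Private Declare Function GetCurrentProcess Lib "kernel32" () As Long
Private Declare Function SetPriorityClass Lib "kernel32" _
    (ByVal hProcess As Long, ByVal dwPriorityClass As Long) As Long

Private Const NORMAL_PRIORITY_CLASS As Long = &H20
Private Const HIGH_PRIORITY_CLASS As Long = &H80

' Sketch only: the choice of HIGH_PRIORITY_CLASS is an assumption.
Private Sub RunDumpAtHighPriority()
    SetPriorityClass GetCurrentProcess(), HIGH_PRIORITY_CLASS
    dumpfile FILE_READ, FILE_DUMP
    SetPriorityClass GetCurrentProcess(), NORMAL_PRIORITY_CLASS
End Sub
```

Raising the priority does not reduce the work done; it only changes how the scheduler shares the CPU with other processes.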

Adron

Quote from: Eli_1 on June 14, 2004, 01:26 PM
I got the time down to .4 minutes by using SetPriorityClass.

SetPriorityClass probably has much the same effect as reducing the frequency of DoEvents calls.