Valhalla Legends Archive

Programming => General Programming => Topic started by: MyndFyre on July 13, 2006, 08:20 PM

Title: Sorted data structure
Post by: MyndFyre on July 13, 2006, 08:20 PM
A B-tree is the fastest standard sorted data structure, right?
Title: Re: Sorted data structure
Post by: dxoigmn on July 13, 2006, 08:59 PM
Depends on how your define "fastest."
Title: Re: Sorted data structure
Post by: MyndFyre on July 13, 2006, 09:32 PM
I have a lot of records sorted by an integral ID.  I don't really care how long it takes to populate the tree, but it needs to be fast to find records.
Title: Re: Sorted data structure
Post by: dxoigmn on July 13, 2006, 09:51 PM
Quote from: MyndFyre[vL] on July 13, 2006, 09:32 PM
I have a lot of records sorted by an integral ID.  I don't really care how long it takes to populate the tree, but it needs to be fast to find records.

Are the IDs unique? Are they constrained to some range? A hash table will be the fastest for lookup (i.e. constant time) but might use too much space. Typically any sort of tree structure will take logarithmic-time to lookup and in the worst case linear time if the tree is not well balanced (i.e. in the case of a vine).
Title: Re: Sorted data structure
Post by: MyndFyre on July 13, 2006, 10:14 PM
Right, but isn't a B-tree balanced so that it should always take logarithmic time to find an item and near-logarithmic time to add one?

Yes, the IDs are unique.
Title: Re: Sorted data structure
Post by: rabbit on July 14, 2006, 07:20 AM
A perfect B-Tree is balanced.  If it isn't a B-tree in the first place, it can take a good deal of time to sort it all out.  But yes, the fastest access tree is binary.  Since the IDs are unique, I'd also suggest a hash map.
Title: Re: Sorted data structure
Post by: MyndFyre on July 14, 2006, 12:28 PM
Quote from: rabbit on July 14, 2006, 07:20 AM
A perfect B-Tree is balanced.  If it isn't a B-tree in the first place, it can take a good deal of time to sort it all out.  But yes, the fastest access tree is binary.  Since the IDs are unique, I'd also suggest a hash map.

Now see, I don't know why I didn't think of a hashtable.

What are the memory considerations between a hash table and a B-tree?

Also, rabbit, there is a distinction between a binary tree and a B tree.  I don't remember what the distinction is - I think a binary tree is just not balanced and so it can get really long leaves.  Right?
Title: Re: Sorted data structure
Post by: dxoigmn on July 14, 2006, 01:00 PM
Quote from: MyndFyre[vL] on July 14, 2006, 12:28 PM
Now see, I don't know why I didn't think of a hashtable.

What are the memory considerations between a hash table and a B-tree?

Also, rabbit, there is a distinction between a binary tree and a B tree.  I don't remember what the distinction is - I think a binary tree is just not balanced and so it can get really long leaves.  Right?

Space(hash table) >> space(b-tree). Of course this all depends on your hash function. If you IDs are unique and constrained to a small range, then the hash table will use less memory than the B-tree.

A B-tree is simply a balanced binary tree.
Title: Re: Sorted data structure
Post by: rabbit on July 14, 2006, 07:30 PM
Yeah.  But if your program is set up for binary trees and not specifically B trees, you can create some problems by assuming the structure is always a B tree.  Anyway:
B(inary) trees - only the exact amount of memory needed is used, access is O(logn)
hash map/table - more memory is used, depending on the hash distances, access is (without collisions) O(1)

Basically, if your hash is perfect (hah!) you have the fastest access time possible, though with a tradeoff in memory usage.  B(inary) trees sacrifice some speed for reduced memory usage.  It's basically what you want more for your program, speed or usage.