A B-tree is the fastest standard sorted data structure, right?
Depends on how your define "fastest."
I have a lot of records sorted by an integral ID. I don't really care how long it takes to populate the tree, but it needs to be fast to find records.
Quote from: MyndFyre[vL] on July 13, 2006, 09:32 PM
I have a lot of records sorted by an integral ID. I don't really care how long it takes to populate the tree, but it needs to be fast to find records.
Are the IDs unique? Are they constrained to some range? A hash table will be the fastest for lookup (i.e. constant time) but might use too much space. Typically any sort of tree structure will take logarithmic-time to lookup and in the worst case linear time if the tree is not well balanced (i.e. in the case of a vine).
Right, but isn't a B-tree balanced so that it should always take logarithmic time to find an item and near-logarithmic time to add one?
Yes, the IDs are unique.
A perfect B-Tree is balanced. If it isn't a B-tree in the first place, it can take a good deal of time to sort it all out. But yes, the fastest access tree is binary. Since the IDs are unique, I'd also suggest a hash map.
Quote from: rabbit on July 14, 2006, 07:20 AM
A perfect B-Tree is balanced. If it isn't a B-tree in the first place, it can take a good deal of time to sort it all out. But yes, the fastest access tree is binary. Since the IDs are unique, I'd also suggest a hash map.
Now see, I don't know why I didn't think of a hashtable.
What are the memory considerations between a hash table and a B-tree?
Also, rabbit, there is a distinction between a binary tree and a B tree. I don't remember what the distinction is - I think a binary tree is just not balanced and so it can get really long leaves. Right?
Quote from: MyndFyre[vL] on July 14, 2006, 12:28 PM
Now see, I don't know why I didn't think of a hashtable.
What are the memory considerations between a hash table and a B-tree?
Also, rabbit, there is a distinction between a binary tree and a B tree. I don't remember what the distinction is - I think a binary tree is just not balanced and so it can get really long leaves. Right?
Space(hash table) >> space(b-tree). Of course this all depends on your hash function. If you IDs are unique and constrained to a small range, then the hash table will use less memory than the B-tree.
A B-tree is simply a balanced binary tree.
Yeah. But if your program is set up for binary trees and not specifically B trees, you can create some problems by assuming the structure is always a B tree. Anyway:
B(inary) trees - only the exact amount of memory needed is used, access is O(logn)
hash map/table - more memory is used, depending on the hash distances, access is (without collisions) O(1)
Basically, if your hash is perfect (hah!) you have the fastest access time possible, though with a tradeoff in memory usage. B(inary) trees sacrifice some speed for reduced memory usage. It's basically what you want more for your program, speed or usage.