• Welcome to Valhalla Legends Archive.
 

Formula for optimizing database items

Started by Spht, May 05, 2006, 02:20 PM

Previous topic - Next topic

Spht

Hi.  This is for my ClanGnome project.

The program is periodically updating its database for every item, and the database grows at an exponentially declining rate.

The items (channels) are either successfully visited (yes_visit) or unsuccessfully (no_visit).  If yes_visit, then various information about that item is updated--most importantly, a calculated average channel volume (average_volume).  In a cycle, each channel is attempted visit once.  At the end of a cycle, all channel's yes_visit, no_visit, average_volume are observed.

It is at this point optimization occurs.  I need a formula for optimizing this database.  The idea is to get rid of channels that have seemed to have gone dead (no_visit has risen considerably in contrast to history), or are very inactive (average_volume low and possible balance of yes_visit to no_visit ratio).

This is what I'm currently using, but it isn't entirely reliable:

if (no_visits) > (yes_visits + ((average_volume + 5) * 2))

When optimization occurs: if no_visits is 0 then yes_visits will be > 0, if yes_visits is 0 then no_visits will be > 0.  if no_visits is 0 then average_volume will be > 0, if yes_visits is 0 then average_volume will be 0.

Suggestions welcome.

Newby

#1
What does it mean by unsuccessful?

Also, when optimized, does it reset all the values for each one? Or does it just remove ones that are considered dead? I'm assuming the latter, since now that I think about it, the former would be counterproductive.

Let me put some thought into this.
- Newby

Quote[17:32:45] * xar sets mode: -oooooooooo algorithm ban chris cipher newby stdio TehUser tnarongi|away vursed warz
[17:32:54] * xar sets mode: +o newby
[17:32:58] <xar> new rule
[17:33:02] <xar> me and newby rule all

Quote<TehUser> Man, I can't get Xorg to work properly.  This sucks.
<torque> you should probably kill yourself
<TehUser> I think I will.  Thanks, torque.

Spht

#2
Quote from: Newby on May 06, 2006, 12:18 AM
What does it mean by unsuccessful?

Also, when optimized, does it reset all the values for each one? Or does it just remove ones that are considered dead? I'm assuming the latter, since now that I think about it, the former would be counterproductive.

Let me put some thought into this.

The unsuccessful count (no_visit) is the amount of times the channel has been empty when the bot attempted to join.  It also keeps a record of the amount of times the channel has been unavailable for other reasons (banned), but that doesn't affect the channel's activeness.

The purpose of optimization is to remove "inactive" clans from the database so that the bot doesn't waste valuable time to try joining them.  If the clan is spotted again, then it'll be added back to the database with all default zero-variables.

Expected behaviour is that after several failed attempts to get in the channel, it is considered inactive, and removed.  If a channel was previously very active (high yes_visit and average_volume) and is now dead, it will likely take more no_visits before the channel is removed from the database.  no_visits should take precedence over low average_volume.  average_volume should never exceed 40.

The optimization formula should be designed around the assumption that each channel is sampled about once every 1 or 2 hours.

rabbit

Wouldn't it be better to list a "dead" channel that was very active as a "try again later" thing if it hasn't been too long?  And does the Gnome take into account members like Clan.Ops[1], Clan.Ops[2], etc...?
Grif: Yeah, and the people in the red states are mad because the people in the blue states are mean to them and want them to pay money for roads and schools instead of cool things like NASCAR and shotguns.  Also, there's something about ketchup in there.

Spht

Quote from: rabbit on May 06, 2006, 01:00 PM
Wouldn't it be better to list a "dead" channel that was very active as a "try again later" thing if it hasn't been too long?

It does.  It should take longer for a very active clan to expire from the database if they become "dead."  If the bot spots someone wearing the clan's tag afterwards, it'll make a new record for the channel and start tracking again.

Quote from: rabbit on May 06, 2006, 01:00 PMAnd does the Gnome take into account members like Clan.Ops[1], Clan.Ops[2], etc...?

What do you mean? average_volume is the average number of people in the clan CHANNEL, aswell it's not just members--it's everyone.  There are other variables like average_w3_volume, average_nonw3_volume, average_member_volume, and so on, but those aren't important for optimization.  The purpose is to track a clan's channel statistics, not specifically its members.

rabbit

I see a lot of clans with ops like
Clan.Ops[5]@Azeroth
Clan.Ops[3]@Azeroth
Clan.Ops[8]@Azeroth
Clan.Ops[1]@Azeroth
Clan.Ops[7]@Azeroth

That's cheating -.-
Grif: Yeah, and the people in the red states are mad because the people in the blue states are mean to them and want them to pay money for roads and schools instead of cool things like NASCAR and shotguns.  Also, there's something about ketchup in there.

Spht

This is a simple simulation I made in VB to test different formulas:

http://www.valhallalegends.com/spht/test_optimize.txt

You should be able to change the extension to .frm and run it if you have VB installed.