Open BotNet Spec 1.0

Started by Banana fanna fo fanna, April 29, 2003, 09:51 PM


Banana fanna fo fanna


Arta

1. How does client A know the addresses of clients B, C, D... etc, without a central server?

2. Even if you get round this problem, it's eew to have to connect each client to every other client. Imagine if there are 200 users online, or 2000. That means each client has to send 200 (or 2000) identical messages instead of sending one to a server. This is not practical. And what happens if you have a crappy client as one node? Other nodes are then not going to have correct information about what's transpiring on the OBN.

3. How do you prevent abuse? This totally open network would have no reliable means to ban or restrict the access of abusive users.

4. Last, but not least, it's *very* exploitable. Imagine client A, a malicious user, adds an entry on client B. Client B then sends a 'register' message to all the other clients. Now imagine that client A floods client B with additions. If there are 200 users on the OBN, you've suddenly multiplied the amount of data traversing client B's connection by a factor of 200 (at least). Dos-in-a-box.

tA-Kane

You most certainly need a way to communicate the protocol version, to ensure that new nodes are compatible with old ones, and vice versa.

How will you combat database corruption by hostile nodes?

I am not very fond of peer-to-peer networks such as this because of how bandwidth-intensive they are; the Gnutella network is a prime example: just 4 "idle" connections will leech the fuck out of my DSL line. Granted, there are thousands more clients there (and, as such, an equal increase in messages transmitted), but the possibility still exists.

One way to cut down on traffic is to have "designated hubs": receiving clients (not receiving hubs) transmit their whole database to the hub, and then transmit changes (adds, modifications, deletes, etc.) to the hub when appropriate. When the designated hub receives a query, it searches the cached databases instead of forwarding it on to each client; if matches are found in the cache, it forwards the reply as if the reply had originated from the matched database's owner, and sends a "query/match found for your database" notice to that client, to let it know that a match was found on its database.
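To make the designated-hub idea concrete, here is a minimal Python sketch of a hub that caches each client's database and answers queries from the cache; all names and structures are illustrative, not part of any spec:

class DesignatedHub:
    def __init__(self):
        self.caches = {}   # client_id -> {key: value}, one cache per client

    def on_full_database(self, client_id, database):
        # A client transmits its whole database when it connects to the hub.
        self.caches[client_id] = dict(database)

    def on_change(self, client_id, key, value=None, delete=False):
        # Afterwards the client only transmits changes (adds, modifications, deletes).
        cache = self.caches.setdefault(client_id, {})
        if delete:
            cache.pop(key, None)
        else:
            cache[key] = value

    def on_query(self, key):
        # Answer from the cached databases instead of forwarding the query on.
        # Returns (owner, value) pairs; the hub would send each reply as if it
        # came from that owner and notify the owner that its database matched.
        return [(owner, cache[key]) for owner, cache in self.caches.items() if key in cache]

hub = DesignatedHub()
hub.on_full_database("bot_A", {"SomeUser": "op"})
hub.on_change("bot_B", "SomeUser", "regular")
print(hub.on_query("SomeUser"))   # [('bot_A', 'op'), ('bot_B', 'regular')]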

Quote2.1 Database exchange
This message is sent to a connecting node once. It is not routed.
What do you mean by "It is not routed"? Do you mean that the client receiving this exchange does not transmit it to any other clients connected to it? If so, then that's very bad... if two networks "merge", then both databases most surely need to be transmitted. Additionally, it would be wise, when merging networks, to have the highest-versioned "hubs" of those networks serve as the link, so that the opposing databases aren't limited by an older protocol.

Just my two cents on what you have, so far.
Macintosh programmer and enthusiast.
Battle.net Bot Programming: http://www.bash.org/?240059
I can write programs. Can you right them?

http://www.clan-mac.com
http://www.eve-online.com

tA-Kane

Quote from: Arta[vL] on April 29, 2003, 11:07 PM2. Even if you get round this problem, it's eew to have to connect each client to every other client.

4. Dos-in-a-box.
Designated hubs would cut down on those two factors.

Skywing

Quote from: tA-Kane on April 29, 2003, 11:11 PM
Quote from: Arta[vL] on April 29, 2003, 11:07 PM2. Even if you get round this problem, it's eew to have to connect each client to every other client.

4. Dos-in-a-box.
Designated hubs would cut down on those two factors.
Then you defeat the whole idea of decentralization.

tA-Kane

Quote from: Skywing on April 30, 2003, 12:06 PM
Quote from: tA-Kane on April 29, 2003, 11:11 PMDesignated hubs would cut down on those two factors.
Then you defeat the whole idea of decentralization.
Not really. If the open BotNet spec allows for other clients' IP address transmission (which the current open spec does not specify, but I assumed it would via a custom message), then it would not be hard to get a list of alternate designated hubs (or even connect to designated hubs through other users, if that user's client allows for such a thing) so you can remain connected to the network when it goes down.

Banana fanna fo fanna

I'm sorry I wasn't clear enough in the spec on how to prevent floods and such.

We'll assume there are more compliant clients than clients that try to disrupt the system. For each register message, a given public key can only be entered in the database once, and message throttling will also be implemented. Since most good clients will comply with these rules, they will refuse messages that violate them.
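A rough sketch of those two rules (one database entry per public key, plus per-peer throttling); the limits here are made up purely for illustration:

import time

class FloodGuard:
    def __init__(self, max_msgs=20, window=10.0):
        self.registered_keys = set()   # public keys already entered in the database
        self.history = {}              # peer -> timestamps of its recent messages
        self.max_msgs = max_msgs       # allow at most max_msgs per window seconds
        self.window = window

    def allow_register(self, public_key):
        # Refuse a register message for a key that is already in the database.
        if public_key in self.registered_keys:
            return False
        self.registered_keys.add(public_key)
        return True

    def allow_message(self, peer):
        # Refuse (and stop forwarding) messages from a peer that is flooding.
        now = time.time()
        recent = [t for t in self.history.get(peer, []) if now - t < self.window]
        recent.append(now)
        self.history[peer] = recent
        return len(recent) <= self.max_msgs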

About finding IP addresses, Arta: I was envisioning either a profile key or a special whisper command to get the IP addresses.

Arta: not every client needs to be connected to every other client. Messages propagate via other nodes, so I can be connected to another node which is connected to two other nodes. I have one connection, and when I send a message, I reach three other nodes.

Connection hubs would be implemented if necessary; however, I believe they may evolve automatically, e.g. the people in vL would all connect to [vL], and the FE people would connect to Fatal-Error. Connecting those two nodes together would bring the network together.

Kane: thanks for pointing that out about database exchange. I didn't consider that. I see three possible solutions:

1) forward to everyone on the network (expensive)
2) don't allow two networks to be connected (perhaps make this version 0)
3) central database (then why would it be p2p?)

The problem of authentication database corruption is an interesting one. I was assuming the good clients would be able to weed it out, but perhaps a malicious client could get high enough in the chain and spoil it. Perhaps some sort of digital signature is required; however, I cannot figure out how a central authority would be determined.
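For illustration of what a digital signature on database entries might look like, here is a minimal sketch assuming the PyNaCl library; the names and structure are made up, and it only shows the signing/verification mechanics, not how trust in the keys themselves would be established:

from nacl.signing import SigningKey, VerifyKey
from nacl.exceptions import BadSignatureError

def make_entry(signing_key, name, data):
    payload = f"{name}:{data}".encode()
    signature = signing_key.sign(payload).signature
    # The owner's public key travels with the entry so any node can verify it.
    return {"name": name, "data": data,
            "pubkey": bytes(signing_key.verify_key), "signature": signature}

def verify_entry(entry, expected_pubkey=None):
    # For an update, expected_pubkey is the key already on record for this entry;
    # a change signed by a different key is rejected.
    if expected_pubkey is not None and entry["pubkey"] != expected_pubkey:
        return False
    payload = f"{entry['name']}:{entry['data']}".encode()
    try:
        VerifyKey(entry["pubkey"]).verify(payload, entry["signature"])
        return True
    except BadSignatureError:
        return False

owner = SigningKey.generate()
entry = make_entry(owner, "SomeBot", "profile text")
print(verify_entry(entry))   # True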

About bandwidth, I believe UDP and hubs could be used to limit it. Instead of broadcasting responses, replies could just be sent via UDP directly to the requesting peer. Firewall and reliability issues would have to be addressed. Central hubs, as I discussed above, may or may not be implemented in software.

The biggest problem I see is with the authentication database. Any suggestions about how faking could be 100% eliminated?

Banana fanna fo fanna

Cuphead proposes: "Ok, suppose we have nodes A, B, and E.  A and B are normal functioning nodes, E (the Evil hacker node) has a corrupted database.  All are functioning independently of one another.  Assume E requests a database merge with A.  A saves its current database into a new file and merges E's database into its own.  A's database is now corrupted with E's bad keys.  Now B requests a connect/merge with A.  A sees an entry that exists in its backup database that coincides with B's, but is modified in A's newly merged (with E) database.  A restores its old database and merges that with B's, since chances are that B's database is intact."
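A rough sketch of Cuphead's backup-and-rollback merge; it is only a heuristic that trusts whichever peer agrees with the pre-merge backup, which works when most peers are honest (all names are illustrative):

def merge(local, incoming):
    merged = dict(local)
    merged.update(incoming)   # incoming entries win on conflict
    return merged

def merge_with_rollback(database, backup, incoming):
    # `backup` was saved just before the previous (possibly hostile) merge.
    # If the new peer agrees with the backup on an entry that the last merge
    # changed, assume the last merge was bad and restore the backup first.
    for key, value in incoming.items():
        if key in backup and backup[key] == value and database.get(key) != value:
            database = dict(backup)
            break
    return merge(database, incoming)

a_db = {"user1": "goodkey"}
backup = dict(a_db)                          # A saves its database before merging with E
a_db = merge(a_db, {"user1": "evilkey"})     # E corrupts the entry
a_db = merge_with_rollback(a_db, backup, {"user1": "goodkey"})   # B agrees with the backup
print(a_db)                                  # {'user1': 'goodkey'}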

To handle distribution of the public key database, perhaps we could build it like we build the information database (query/response)? This would allow propagation, and users with dead accounts wouldn't clog up the network. We'd simply use the public key that gets the most responses.
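A small sketch of "use the public key that gets the most responses": collect the key replies from peers for one account and keep the strict majority (purely illustrative):

from collections import Counter

def resolve_public_key(responses):
    # `responses` is the list of public keys different peers returned for the
    # same account; require a strict majority, otherwise return None.
    if not responses:
        return None
    key, votes = Counter(responses).most_common(1)[0]
    return key if votes > len(responses) // 2 else None

print(resolve_public_key([b"key1", b"key1", b"key2"]))   # b'key1'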

Please make suggestions! ;)

tA-Kane

Another problem I just thought of: what if two clients make different database changes at the same time?

1) Either the packet numbers would be the same, and thus some clients would parse packet A while others parse packet B, causing a database desync.

2) Or, if the packet numbers aren't the same, other clients could still end up with a database desync, because one packet is transmitted one way and the other is transmitted in the opposite direction. Take the following diagram...

A <-> B <-> C <-> D <-> E

Database change 1 (packet 1) gets sent from A
Database change 2 (packet 2) gets sent from E

Packet 1 gets transmitted to B, and packet 2 gets to D. Then B sends to C, and D sends to C. C receives one of them first... then forwards them both on. C's database now reflects whichever change it received last, B's database ends up with E's change, and D's ends up with A's.

QuoteMessages are each assigned a unique routing ID.
Additionally, going back to message numbers ("routing IDs"), this kind of collision could happen with any combination of packet requests/replies... Using the same diagram as above, take the following example...

A sends a request to E. The request first goes to B; B adds the message number to its "received message numbers" list and forwards it on to C. While C is receiving the packet, E transmits a request to B. E sends it to D, and C forwards A's request to D. D, having seen E's packet first, forwards E's request to C and disregards what it received from C. C, receiving E's request with the same message number as A's request, disregards that packet as well.

Now both packets are disregarded. I suppose you could call this a packet number collision...

tA-Kane

Quote from: tA-Kane on April 30, 2003, 03:10 PMNow both packets are disregarded. I suppose you could call this a packet number collision...
Now that I think about it, routing IDs would be best created using the source IP address plus a unique number. That way, a collision would only occur if the source client used a number on one connection for one packet, then the same number on a different connection for a different packet, and those two connections were on the same network.
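A minimal sketch of that routing-ID scheme: pair the source address with a per-node counter so IDs generated by different nodes can never collide (illustrative only):

import itertools

class RoutingIdGenerator:
    def __init__(self, source_ip):
        self.source_ip = source_ip
        self.counter = itertools.count(1)   # never reused by this node

    def next_id(self):
        # The ID is globally unique as long as this node never reuses a number.
        return (self.source_ip, next(self.counter))

gen = RoutingIdGenerator("192.0.2.10")
print(gen.next_id())   # ('192.0.2.10', 1)
print(gen.next_id())   # ('192.0.2.10', 2)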

Camel

Quote from: tA-Kane on April 30, 2003, 01:10 PM
Quote from: Skywing on April 30, 2003, 12:06 PM
Quote from: tA-Kane on April 29, 2003, 11:11 PMDesignated hubs would cut down on those two factors.
Then you defeat the whole idea of decentralization.
Not really. If the open BotNet spec allows for other clients' IP address transmission (which the current open spec does not specify, but I assumed it would via a custom message), then it would not be hard to get a list of alternate designated hubs (or even connect to designated hubs through other users, if that user's client allows for such a thing) so you can remain connected to the network when it goes down.
agreed. consider DNS: the internet is (basically) decentralized. one could set up a whole bunch of hubs that are preset into the client, and if even one connection is made, that user would be able to discover new hubs. it's great in theory, but so is communism.

Skywing

Quote from: Camel on April 30, 2003, 03:33 PM
Quote from: tA-Kane on April 30, 2003, 01:10 PM
Quote from: Skywing on April 30, 2003, 12:06 PM
Quote from: tA-Kane on April 29, 2003, 11:11 PMDesignated hubs would cut down on those two factors.
Then you defeat the whole idea of decentralization.
Not really. If the open BotNet spec allows for other clients' IP address transmission (which the current open spec does not specify, but I assumed it would via a custom message), then it would not be hard to get a list of alternate designated hubs (or even connect to designated hubs through other users, if that user's client allows for such a thing) so you can remain connected to the network when it goes down.
agreed. consider DNS: the internet is (basically) decentralized. one could set up a whole bunch of hubs that are preset into the client, and if even one connection is made, that user would be able to discover new hubs. it's great in theory, but so is communism.
Err.. you do know what the DNS root servers are, don't you?

Banana fanna fo fanna

Quote from: tA-Kane on April 30, 2003, 03:10 PMAnother problem I just thought of: what if two clients make different database changes at the same time? [...] Now both packets are disregarded. I suppose you could call this a packet number collision...

Well, I'm not sure if this is included in the spec, but replies should include a GMT timestamp, which should help with time collisions.
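For illustration, a sketch of resolving simultaneous changes with such a timestamp: every node keeps the change with the newest timestamp (last-writer-wins), so nodes converge on the same value regardless of the order in which the packets arrive. This is an assumption about how the timestamp would be used, not something the spec defines:

def apply_change(database, key, value, timestamp):
    # Compare (timestamp, value) so that even identical timestamps resolve the
    # same way on every node.
    current = database.get(key)
    if current is None or (timestamp, value) > (current[1], current[0]):
        database[key] = (value, timestamp)

db_at_c = {}
apply_change(db_at_c, "user1", "change from A", timestamp=1000)
apply_change(db_at_c, "user1", "change from E", timestamp=1005)
print(db_at_c["user1"])   # ('change from E', 1005), the same on every node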

I don't know what you mean about routing IDs. They are unique (well, as unique as we can get), so there shouldn't be a problem. They're only used so a packet doesn't get endlessly routed.

Arta

Quote
Arta: not every client needs to be connected to every other client. Messages propagate via other nodes, so I can be connected to another node which is connected to two other nodes. I have one connection, and when I send a message, I reach three other nodes.

If client A sends a message to client B, how does client B know whether client C already has it? Say B forwards it to C, and then C unknowingly forwards it to D, but A and B have already sent the message to D. Even with some means to prevent clients from processing the same message twice, the messages are still *sent* - it would be very, very easy to create feedback loops in such a system. You could help prevent that by including some kind of TTL in each packet, but that introduces a new problem: some clients might miss out on messages. Either way, you'll have a lot of redundant traffic floating around.

These kinds of distributed systems only work with some kind of central control mechanism. This has been demonstrated many times: just look at similar distributed systems on the net, Kazaa and the like. All of them have a central controlling node. The lack of central control simply makes it too difficult to keep track of what's going on.

A better system would be to designate one of the nodes as a Master node, and another as a backup node. If the master node was going down intentionally, it could broadcast a reconnect message. If it suddenly disappeared, clients could automatically connect to the backup node, which would also have noticed that the master died, and could thus designate itself as the master node and set a new backup. Even this has its problems, though...

Clients setting themselves as masters when they're not. Who's the master's master? Who has the definitive say?
How does a new node know who the master is if the last master/backup it used are both gone?
What happens if a master goes down and some poor sod on dialup who's designated as the backup node suddenly gets ~200 clients connecting to it?

(that's just a sampling)
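Setting those objections aside for a moment, here is a rough sketch of the master/backup promotion being described; everything in it is illustrative and obviously doesn't address the problems above:

class FailoverNode:
    def __init__(self, name, role="client"):
        self.name = name
        self.role = role       # "master", "backup", or "client"
        self.master = None
        self.backup = None

    def on_master_reconnect(self, new_master):
        # The master broadcast a reconnect message before going down intentionally.
        self.master = new_master

    def on_master_timeout(self, peers):
        # The master disappeared without warning.
        if self.role == "backup" and not any(p.role == "master" for p in peers):
            # The backup promotes itself and designates a new backup.
            self.role = "master"
            candidates = [p for p in peers if p is not self and p.role == "client"]
            if candidates:
                self.backup = candidates[0]
                candidates[0].role = "backup"
        elif self.role == "client":
            # Everyone else reconnects to whoever is the master now.
            self.master = next((p for p in peers if p.role == "master"), None)

nodes = [FailoverNode("A", "backup"), FailoverNode("B"), FailoverNode("C")]
for node in nodes:
    node.on_master_timeout(nodes)
print([(n.name, n.role) for n in nodes])   # A becomes master, B its backup, C a client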

There are solutions to these problems too, but the whole idea is still fraught with difficulties. Ultimately, totally distributed systems like this only work when two (maybe more) conditions are met:

- All nodes are trustworthy
- There exists a separate means for nodes to find the other nodes, should they become isolated.

The second condition *might* be met here (via Battle.net), but the first one *definitely* isn't, as has been demonstrated with the current BotNet on a number of occasions.

Banana fanna fo fanna

I think if *most* nodes are trustworthy, they won't participate with the malicious nodes and will filter their traffic.

Yes, D may receive the same message a few times, but a feedback loop won't occur because it won't process the same message twice.
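To make that concrete, here is a minimal sketch of flood forwarding with a seen-ID set (so the same message is never processed twice) and a TTL (as Arta suggested); the structure is illustrative, not from the spec:

class Node:
    def __init__(self, name):
        self.name = name
        self.peers = []
        self.seen = set()   # routing IDs this node has already processed

    def connect(self, other):
        self.peers.append(other)
        other.peers.append(self)

    def receive(self, routing_id, payload, ttl, sender=None):
        if routing_id in self.seen or ttl <= 0:
            return   # drop duplicates and expired messages instead of re-forwarding
        self.seen.add(routing_id)
        print(f"{self.name} processed {payload!r}")
        for peer in self.peers:
            if peer is not sender:   # don't echo the message straight back
                peer.receive(routing_id, payload, ttl - 1, sender=self)

# A - B - C - D in a line, plus an extra A - D link (a potential feedback loop):
a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
a.connect(b); b.connect(c); c.connect(d); a.connect(d)
a.receive(("A", 1), "hello", ttl=8)   # each node processes the message exactly once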