• Welcome to Valhalla Legends Archive.
 

[MySQL/PHP] Repeat values?

Started by Barabajagal, October 11, 2007, 02:06 PM

Previous topic - Next topic

Barabajagal

I've been working on a php script that logs the amount of users that use my bot. The user's IP address (MD5 Hashed) and exact time of login are stored in the IP and Login fields (respectively). I then display the output in nice graphs, organized in 6 of them: Daily Usage (Count of users that logged in for each hour of the day), Monthly Usage (Count of users that logged in for each day this month), Yearly Usage (Count of users that logged in for each month this year), and the same 3 but counting only unique IP addresses. This can all be seen at http://rcb.realityripple.com/stats/
However, for the yearly usage, which contains quite a bit of data to sort through, the loop takes so long that the server returns Error 500 (Internal Server Error). I need to find a way to request only unique IP addresses from the database, so I don't have to sort through all the results to get unique ones. I've heard of an operator called Distinct, but I don't understand how to use it correctly...

Here's my current code:

Graph.php:
<? /*  Usage Graphing System by RealityRipple Software  (c) 2007  -graph.php-  input:    u          [y/n]         y      displays only unique IPs         n      displays all IPs    a          [d/m/y]         d      displays all the hours of today         m      displays all the days of this month         y      displays all the months of this year   todo:    totals (all years)    improve graphs a bit    perhaps allow for color scheme inputs? */ if (isset($_REQUEST["u"])) {  $uni=$_REQUEST["u"]; } else {  $uni='n'; } if (isset($_REQUEST["a"])) {  $age=$_REQUEST["a"]; } else {  $age='d'; } include('getstats.php'); $user     = "XXXX"; $password = "****"; $database = "dbXXXX"; $host     = "localhost"; mysql_connect($host,$user,$password); @mysql_select_db($database) or die(""); $query    = "SELECT * FROM `RCBUsers`"; if ($age=='d') {  $startTime = date("Y-m-d 0:00:00");  $endTime   = date("Y-m-d H:59:59"); } else if ($age=='m') {  $startTime = date("Y-m-1 00:00:00");  $endTime   = date("Y-m-d 23:59:59"); } else if ($age=='y') {  $startTime = date("Y-1-1 00:00:00");  $endTime   = date("Y-m-t 23:59:59"); } $query.=" WHERE `Login` > '$startTime' AND `Login` < '$endTime' ORDER BY `Login` ASC"; $listing = mysql_query($query); mysql_close(); if ($age=='d') {  for ($i = 0 ; $i < date("H") + 1; $i++)  {   $startTime = date("Y-m-d ".$i.":00:00");   $endTime   = date("Y-m-d ".$i.":59:59");   $times[$i] = getStats($uni, $listing, $startTime, $endTime);  }  $gTitle1='Daily Usage';  $gTitle2='Time of Day';  $xLabel ='time'; } else if ($age=='m') {  for ($i = 1 ; $i < date("j") + 1; $i++)  {   $startTime = date("Y-m-".$i." 00:00:00");   $endTime   = date("Y-m-".$i." 23:59:59");   $times[$i] = getStats($uni, $listing, $startTime, $endTime);  }  $gTitle1='Monthly Usage';  $gTitle2='Date';  $xLabel ='day'; } else if ($age=='y') {  for ($i = 1 ; $i < date("n") + 1; $i++)  {   $startTime = date("Y-".$i."-1 00:00:00");   $endTime   = date("Y-".$i."-t 23:59:59");   $times[$i] = getStats($uni, $listing, $startTime, $endTime);  }  $gTitle1='Yearly Usage';  $gTitle2='Month';  $xLabel ='month'; } include('postgraph.php'); $graph = new PostGraph(400,300); $graph->setGraphTitles($gTitle1, $gTitle2, 'Logins'); $graph->setYNumberFormat('integer'); $graph->setYTicks(0); $graph->setYValueMode(1); $graph->setXLabels($xLabel); $graph->setData($times); $graph->setWhiteColor(array(0,0,0)); $graph->setBackgroundColor(array(0,0,0)); $graph->setTextColor(array(255,0,0)); $graph->setLinesColor(array(192,0,0)); $graph->setBarsColor(array(128,0,0)); $graph->setAboveBarColor(array(255,0,0)); $graph->setInsideBarColor(array(255,128,0)); $graph->drawImage(); $graph->printImage();?>


GetStats.php:
<? /*  Usage Graphing System by RealityRipple Software  (c) 2007  -getstats.php-  input:    u   unique [y/n]         y      displays only unique IPs         n      displays all IPs    listing    [SQL]                sql list of usage    startTime  [date]                starting time of listing    endTime    [date]                ending time of listing   todo:    see if startTime and endTime can be removed? */   function getStats($u, $listing, $startTime, $endTime) {  $num = mysql_numrows($listing);  $IPcount = 0;  $IPs = '';  for ($i = 0; $i < intval($num); $i++)  {   $tmpIP = strtoupper(mysql_result($listing,$i,'IP'));   $tmpTime = mysql_result($listing,$i,'Login');   $bAdd = 'n';   $tStart = strtotime($startTime, 0);   $tTemp  = strtotime($tmpTime, 0);   $tEnd   = strtotime($endTime, 0);   if ($tStart < $tTemp && $tTemp < $tEnd)   {    if ($u == "n")    {     $bAdd = 'y';    }    else    {     $bThere = 'n';     $lIP = explode(",", $IPs);     for ($j = 0; $j < Count($lIP)-1; $j++)     {      if ($tmpIP == $lIP[$j])      {       $bThere = 'y';      }      else      {       $bAdd = 'n';      }     }     if ($bThere == 'n')     {      $bAdd = 'y';     }     else     {      $bAdd = 'n';     }    }   }   else   {    $bAdd = 'n';   }   if ($bAdd == 'y')   {    $IPs.="$tmpIP,";    $IPcount++;   }  }  return $IPcount; }?>


Any other editing I can do to make this more efficient would be greatly appreciated. PHP and DataBases are still not all that familiar to me.

Banana fanna fo fanna

1. In a complex application, your utility files may be include()'d more than once in multiple files, if there are lots of includes. If you're just using it to import functions (which you are doing), as opposed to including to render input, use the require_once() function. This ensures that the file is parsed and imported once and only once.

2. I tend to do <?php ?> instead of <? ?> because "short" delimiters are sometimes not enabled.

3. Use single-quotes for constant strings so PHP doesn't waste time searching them for variable interpolation. Use double quotes when you want to enable it. I also see some places where you are using string concatenation (where you are coming up with the dates) where you can just use variable interpolation.

4. Everyone says you're supposed to use the PEAR DB API instead of the mysql_* functions, but I don't. Then again, I try to stay away from PHP. Consult the PHP documentation.

5. Move database information into a separate config file

6. Move ANYTHING that calls the mysql_* functions into a separate include file that abstracts away the database operations you are performing. This way you can theoretically drop in any other database later. This means you shouldn't pass resultsets around (like you do to getStats()). This isn't important now, but will be when you scale up the size of your project. I don't think it's necessary in this instance

7. Name your include files with a .inc.php

8. When scaling your website up, I like to use the Fusebox 3 framework, but that's just me.

EDIT: just read the beginning of your post, thought you were just looking for comments. Will answer soon.

Banana fanna fo fanna

The query you want begins with:

$query    = "SELECT DISTINCT ipaddress,[your other fields here] FROM `RCBUsers`";

iago

You could also do

SELECT * FROM blah GROUP BY ipaddress


And incidentally, I don't see what the point of md5'ing the IP is -- the domain of possible IP addresses is small enough to bruteforce in fairly reasonable time.
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Barabajagal

Not when they're salted ;D
And thanks for the info, I'll see if I can get this all to work some time...

Barabajagal

SELECT DISTINCT IP, Login FROM `RCBUsers` WHERE `Login` > '2007-10-13 0:00:00' AND `Login` < '2007-10-13 16:59:59' ORDER BY `Login` ASC

Didn't work. still returns repeat values of "IP". How do I use groups in MySQL?

rabbit

Grif: Yeah, and the people in the red states are mad because the people in the blue states are mean to them and want them to pay money for roads and schools instead of cool things like NASCAR and shotguns.  Also, there's something about ketchup in there.

Barabajagal

I realized I've been doing this horribly inefficiently, and with Blake's help (he's making one for JBLS too, since mine was such a good idea ;) ) we're making a much more efficient system.

iago

Quote from: Andy on October 12, 2007, 12:48 PM
Not when they're salted ;D
And thanks for the info, I'll see if I can get this all to work some time...
Salting it shouldn't make any difference when brute forcing. Salting is only designed to prevent pre-computation attacks.
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Barabajagal

How would they brute force a salted IP address? Instead of being 15 characters (XXX.XXX.XXX.XXX), it becomes 30 characters (SSSS.XXX.XXX.SSSS.XXX.XXX.SSSS) [note that that's not the actual way I salt the ips, just a demonstration].

iago

Because, by definition, a salt is a known value.
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Barabajagal


Banana fanna fo fanna

Quote from: Andy on October 14, 2007, 07:50 PM
Known to whom?

If they (bad guys) have access to your hashes, they likely have access to your salts, too.

iago

Exactly. Salts aren't supposed to be a secret.

My main point is, why bother hashing the IPs? That's a silly idea.
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


Barabajagal

User privacy. And no, it's not that hard to get in to a MySQL db. It is a bit harder to get php code.