
Define a word -- Code!

Started by iago, April 13, 2005, 11:19 AM

iago

I'm using this in a JavaOp2 plugin, but it's pretty good on its own, in my opinion:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Vector;

/*
* Created on Apr 12, 2005
* By iago
*/

public class Define
{
    /** Get the list of definitions for the word */
    public static String []define(String word) throws IOException
    {
        word = word.replaceAll(" ", "+");
       
        Vector ret = new Vector();
        String text = getPage(word);
       
        boolean stop = false;
        while(!stop)
        {
            int ddIndex = text.indexOf("<DD>");
            int liIndex = text.indexOf("<LI>");
           
            if(liIndex >= 0 && (ddIndex >= 0 ? liIndex < ddIndex : true))
            {
                text = text.substring(liIndex + 4);
                String def = text.replaceAll("<\\/LI>.*", "");
                ret.add(def.replaceAll("<.*?>", "").trim());
                text = text.substring(text.indexOf("</LI>") + 5);
            }
            else if(ddIndex >= 0 && (liIndex >= 0 ? ddIndex < liIndex : true))
            {
                text = text.substring(ddIndex + 4);
                String def = text.replaceAll("<\\/DD>.*", "");
                ret.add(def.replaceAll("<.*?>", "").trim());
                text = text.substring(text.indexOf("</DD>") + 5);
            }
            else
            {
                stop = true;
            }
        }
       
        return (String []) ret.toArray(new String[ret.size()]);
    }
   
    private static String getPage(String word) throws IOException
    {
        HttpURLConnection conn = (HttpURLConnection) new URL("http", "dictionary.reference.com", 80, "/search?q=" + word).openConnection();

        conn.connect();

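        // Read the status once, before disconnecting, so the error message never queries a closed connection
        int responseCode = conn.getResponseCode();
        if (responseCode != HttpURLConnection.HTTP_OK)
        {
            conn.disconnect();
            throw new IOException("Unable to find the result: Server returned error " + responseCode);
        }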
       
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));

        StringBuffer fullTextBuf = new StringBuffer();
        String line;
        while((line = in.readLine()) != null)
        {
            fullTextBuf.append(line);
        }
       
        conn.disconnect();
       
        return fullTextBuf.toString();
    }
   
    public static void main(String []args) throws IOException
    {
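        // Reuse one reader; wrapping System.in again on every pass can lose buffered input
        BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
        while(true)
        {
            System.out.print("Please enter a word to define --> ");
            String word = stdin.readLine();
            if(word == null)
                break; // end of input (readLine() returns null), so stop instead of crashing in define()
            String []defs = define(word);
           
            System.out.println(defs.length + " definitions found");
            for(int i = 0; i < defs.length; i++)
                System.out.println((i + 1) + ": " + defs[i]);
        }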
    }
}


Here it is being used:
Quote
Please enter a word to define --> donkey
3 definitions found
1: The domesticated ass (Equus asinus).
2: Slang. An obstinate person.
3: Slang. A stupid person.
Please enter a word to define --> squint
13 definitions found
1: To look with the eyes partly closed, as in bright sunlight.
2: To look or glance sideways.
3: To have an indirect reference or inclination.
4: To be affected with strabismus.
5: To cause to squint.
6: To close (the eyes) partly while looking.
7: The act or an instance of squinting.
8: A sideways glance.
9: An oblique reference or inclination.
10: See strabismus.
11: A hagioscope.
12: Looking obliquely or askance.
13: Squint-eyed.
Please enter a word to define --> mooga
0 definitions found

One thing I'm missing -- I'm not encoding the parameters being sent.  I'd like to do that, but I can't remember how.
This'll make an interesting test for broken AV:
QuoteX5O!P%@AP[4\PZX54(P^)7CC)7}$EICAR-STANDARD-ANTIVIRUS-TEST-FILE!$H+H*


dxoigmn

It would probably be better to use the Dictionary Server Protocol (DICT) and connect to one of the many servers for a given language, rather than parsing through ugly HTML that could change at any time.

Edit: need support for ftp:// in [[url]]
Edit2: Oh interesting, there is an [[ftp]] code.
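
For anyone curious what that might look like, here is a rough sketch of a DICT client based on RFC 2229. It is not code from this thread. The class name DictSketch, the host "dict.org", and the choice to join each definition into a single line are all assumptions made for illustration; port 2628 and the DEFINE command come from the RFC. The parser is kept separate from the socket code so it can be tried against a canned reply.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.StringReader;
import java.io.Writer;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

class DictSketch
{
    /** Collect the text of a DEFINE reply, one entry per "151" block. */
    static List<String> parseDefinitions(BufferedReader in) throws IOException
    {
        List<String> defs = new ArrayList<String>();
        String line;
        while((line = in.readLine()) != null)
        {
            if(line.startsWith("151"))
            {
                StringBuilder body = new StringBuilder();
                // The definition text runs until a line holding a single "."
                while((line = in.readLine()) != null && !line.equals("."))
                {
                    // A leading ".." is an escaped "."
                    if(line.startsWith(".."))
                        line = line.substring(1);
                    body.append(line.trim()).append(' ');
                }
                defs.add(body.toString().trim());
            }
            else if(line.startsWith("250") || line.startsWith("5"))
            {
                break; // 250 = finished, 5xx = error such as 552 (no match)
            }
        }
        return defs;
    }

    /** Ask a DICT server for a word. The host is whatever server you choose (assumption: e.g. "dict.org"). */
    static List<String> lookup(String host, String word) throws IOException
    {
        Socket sock = new Socket(host, 2628);
        try
        {
            BufferedReader in = new BufferedReader(new InputStreamReader(sock.getInputStream(), "UTF-8"));
            Writer out = new OutputStreamWriter(sock.getOutputStream(), "UTF-8");
            in.readLine(); // skip the 220 greeting
            // "*" searches every database; a word containing a quote would need escaping
            out.write("DEFINE * \"" + word + "\"\r\n");
            out.flush();
            List<String> defs = parseDefinitions(in);
            out.write("QUIT\r\n");
            out.flush();
            return defs;
        }
        finally
        {
            sock.close();
        }
    }

    public static void main(String[] args) throws IOException
    {
        // Canned reply, so the parser can be tried without a network connection
        String reply = "150 1 definitions retrieved\r\n"
                     + "151 \"donkey\" wn \"WordNet\"\r\n"
                     + "donkey\r\n"
                     + "    n 1: domestic ass\r\n"
                     + ".\r\n"
                     + "250 ok\r\n";
        List<String> defs = parseDefinitions(new BufferedReader(new StringReader(reply)));
        for(int i = 0; i < defs.size(); i++)
            System.out.println((i + 1) + ": " + defs.get(i));
    }
}
```

The upside over scraping is that the reply format is fixed by the protocol, so a site redesign cannot break the parser.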

iago

Hmm, cool, I've never heard of that.

I actually don't parse much HTML -- all definitions on that site are handily enclosed in either <LI> ... </LI> or <DD> ... </DD>, and nothing else is.  But I'll have a look at that protocol when I have some time.
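
To illustrate the <LI>/<DD> idea on its own, here is a minimal sketch that does the same extraction with a single regular expression instead of the indexOf loop. It is an alternative take, not the code posted above; the class name TagExtract and the sample HTML are made up for the example, and it assumes the tags are not nested inside each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagExtract
{
    // Matches <LI>...</LI> or <DD>...</DD>; \1 makes the closing tag agree with the opening one
    private static final Pattern ENTRY =
        Pattern.compile("<(LI|DD)>(.*?)</\\1>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> extract(String html)
    {
        List<String> defs = new ArrayList<String>();
        Matcher m = ENTRY.matcher(html);
        while(m.find())
        {
            // Strip any tags inside the entry, the same way define() does
            defs.add(m.group(2).replaceAll("<.*?>", "").trim());
        }
        return defs;
    }

    public static void main(String[] args)
    {
        // Hypothetical sample, loosely modeled on the donkey output
        String html = "<OL><LI>The domesticated ass.</LI>"
                    + "<LI><I>Slang.</I> An obstinate person.</LI></OL>"
                    + "<DD>A stupid person.</DD>";
        List<String> defs = extract(html);
        for(int i = 0; i < defs.size(); i++)
            System.out.println((i + 1) + ": " + defs.get(i));
    }
}
```

Because find() walks the page from left to right, the definitions come out in page order without having to compare the two indexes by hand.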


dxoigmn

Quote from: iago on April 13, 2005, 11:19 AM
One thing I'm missing -- I'm not encoding the parameters being sent.  I'd like to do that, but I can't remember how.

See URLEncoder.
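
A minimal sketch of how that could slot in, assuming UTF-8 is the right charset for the site (the thread does not say). The class and method names here are made up for the example; the "/search?q=" path is the one from getPage().

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class EncodeQuery
{
    /** Build the request path with the search term form-encoded. */
    static String searchPath(String word) throws UnsupportedEncodingException
    {
        // encode() turns spaces into '+' and escapes reserved characters as %XX
        return "/search?q=" + URLEncoder.encode(word, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException
    {
        System.out.println(searchPath("rock & roll")); // prints /search?q=rock+%26+roll
    }
}
```

Since encode() already converts spaces to '+', it would take the place of the word.replaceAll(" ", "+") call in define() rather than being added on top of it.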

iago

Ah, that would be it.  I know I've used it before, but I couldn't remember what it was called.  Thanks!