
Gmail-style labeling for files: Converting from directories to labels

Started by Yoni, August 18, 2004, 08:28 AM

Yoni

Does anybody know of a (Windows) program that does this, or an ongoing project?
This is an idea I've been thinking about for a while.

Many of you played around with Gmail accounts recently, but for people who haven't, I will quickly explain "Gmail-style labels":
In the "traditional" philosophy of email, every message goes to a certain folder, e.g. Inbox, Sent Mail, your own custom folders, etc.
In Gmail, there are only a few actual folders (Inbox, Sent Mail, Spam, Trash, Archive) and you can't add more. Instead, you can apply labels to messages. The beauty of this is that:
1. You can apply more than one label to a message.
2. You can apply a label to messages in different folders.
I.e., labels and folders are completely independent.
Then, you can perform queries with regard to labels, view all messages under a label, etc.
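
The independence of labels and folders described above can be sketched as a many-to-many mapping kept apart from where an item is stored. This is only an illustrative in-memory model, not how Gmail works internally; `LabelStore`, the message names, and the label names are hypothetical.

```python
from collections import defaultdict


class LabelStore:
    """Hypothetical in-memory label index.

    Items are identified by whatever path or id they already have;
    labels are recorded separately, so an item's folder never changes.
    """

    def __init__(self):
        self._labels_by_item = defaultdict(set)
        self._items_by_label = defaultdict(set)

    def apply(self, item, label):
        # One item may carry any number of labels.
        self._labels_by_item[item].add(label)
        self._items_by_label[label].add(item)

    def items_with(self, label):
        # Return a copy so callers cannot mutate the index.
        return set(self._items_by_label.get(label, set()))

    def labels_of(self, item):
        return set(self._labels_by_item.get(item, set()))


store = LabelStore()
store.apply("Inbox/msg-1", "travel")
store.apply("Inbox/msg-1", "receipts")   # more than one label per message
store.apply("Sent Mail/msg-2", "travel")  # same label across folders
```

Viewing "all messages under a label" is then a single lookup, `store.items_with("travel")`, regardless of which folder each message sits in.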

My thought was to apply the same philosophy to files as well.
I would mainly (only?) use it for photos (JPG files), but this can be generalized.
To be specific:

I have a directory C:\Misc\Netcooler\Pictures\Photos. In this directory I have many, many, many files, all sorted into many directories. And as many as I have, NC has maybe 10 times more.

The directories are all nicely sorted by date and named by event, therefore this is excellent for "sequential scanning". But it is terrible for "random access".

Let's say I want to locate a picture of me sniffing glue. How would I know whether it's in "2003-05-30 Mostly Rotem's House" or "2003-06-29 Rotem's House Kinda" or "2004-02-06,07 At Rotem's"? I don't remember when it was, only where. Also, which file is it? They are all named P#######.JPG. I would have to look through all of them, which is time-consuming. Another example: Say I want to look for the most embarrassing picture of Netcooler ever taken. In this case I would have to look through gigabytes of photos. Extremely time-consuming.

Instead, with the label philosophy, the former would be quickly found by looking under the "sniffing glue" label, and the latter by viewing all images under the "embarrassing pictures of Netcooler" label.

Since I mostly care about this for image viewing, if this is integrated in an image viewer, that's good. If not, that's good too.

Note: While typing this post I realized that this sounds similar to something that may or may not be a feature of WinFS - the Windows File System, or Windows Future Storage, or whatever they call it now. But I'm certainly not going to install Longhorn. (Not when it's released, either.)

warz

What about the thumbnail previews you can do in Windows XP? Just go into the directory and look at all the thumbnails? Sounds easy to me.

Yoni

I'm looking for all pictures that fit under a certain label that could be in any one (or all) of over 150 folders.

No thanks. Gmail style labeling please.

MyndFyre

Quote from: Yoni on August 18, 2004, 03:11 PM
I'm looking for all pictures that fit under a certain label that could be in any one (or all) of over 150 folders.

No thanks. Gmail style labeling please.

I believe that this will be part of the WinFS file system to be shipped with Windows Longhorn. Essentially, the drive will operate as a base of metadata that applies to several different files. For example (this is MS's example), if you have a picture of Bob at the company golf outing, rather than storing the one file in one place as "Bob at Last Year's Company Golf Outing.jpg", you can store it with metadata keywords such as "Bob", "Work", and "Golf", retaining whatever name you want.

Since WinFS hasn't been a feature of the Longhorn preview editions thus far, I can't say how much of a difference this will make from the current searching and storage methods, but I do believe it is what you're looking for.

Sorry though, that you have to wait.  ;)

Adron

The answer is that you use ACDSee. ACDSee allows you to put a description on files. You can then search for files by keywords from the description. Sounds like it will solve your problem perfectly.

Yoni

Quote from: Adron on August 18, 2004, 07:12 PM
The answer is that you use ACDSee. ACDSee allows you to put a description on files. You can then search for files by keywords from the description. Sounds like it will solve your problem perfectly.
Not exactly. I don't want to make a description. I want to apply various labels like in Gmail. I know the difference is small but there's still a difference.

And as for WinFS: I don't have to wait, since I won't be using LH (see last 2 lines of original post). Looking for an alternative now.

Adron

Quote from: Yoni on August 19, 2004, 06:50 AM
Not exactly. I don't want to make a description. I want to apply various labels like in Gmail. I know the difference is small but there's still a difference.

Open ACDSee with a hex editor and change the word "description" to "labels"? :P

Is it about the GUI method for applying the labels to files?

Yoni

It's a different philosophy.

I am not describing my files. I am categorizing them.

j0k3r

Quote from: Yoni on August 19, 2004, 11:33 AM
I am not describing my files. I am categorizing them.
Couldn't you put the categories' names inside the description?

Yoni

Quote from: j0k3r on August 19, 2004, 12:27 PM
Quote from: Yoni on August 19, 2004, 11:33 AM
I am not describing my files. I am categorizing them.
Couldn't you put the categories' names inside the description?
Yes. But that's bad. Ambiguity will follow.

Let's say I want labels:
"Tabasco" - for anything to do with Tabasco.
"Tabasco bottle collection" - for pictures of my collection. Everything labeled "Tabasco bottle collection" will of course also be labeled "Tabasco".

Then, I won't be able to perform a query like "Tabasco and NOT Tabasco bottle collection".

That's just one example, I'm sure I could think of other problems with this approach.
Anyway, it's designed with a different philosophy and achieves a different goal. Using it to do label style stuff would be ugly and bad. No.
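
One way to read the "Tabasco and NOT Tabasco bottle collection" example is as a set operation over discrete labels, which a plain phrase search over a description string cannot express. The sketch below is only an illustration of that reading; the file names, the `query` and `substring_search` helpers, and the flattened descriptions are all hypothetical.

```python
# Hypothetical file names and labels, for illustration only.
labels_by_file = {
    "P0000001.JPG": {"Tabasco"},
    "P0000002.JPG": {"Tabasco", "Tabasco bottle collection"},
    "P0000003.JPG": {"sniffing glue"},
}


def query(labels_by_file, include, exclude=()):
    """Return files carrying every label in `include` and none in `exclude`."""
    include, exclude = set(include), set(exclude)
    return sorted(
        name
        for name, labels in labels_by_file.items()
        if include <= labels and not (exclude & labels)
    )


def substring_search(descriptions, phrase):
    """Free-text search: matches the phrase anywhere in the description."""
    return sorted(name for name, text in descriptions.items() if phrase in text)


# The same information flattened into one description string per file.
descriptions = {
    name: " ".join(sorted(labels)) for name, labels in labels_by_file.items()
}

tabasco_only = query(
    labels_by_file, ["Tabasco"], exclude=["Tabasco bottle collection"]
)
text_hits = substring_search(descriptions, "Tabasco")
```

Here `tabasco_only` contains just `P0000001.JPG`, while `text_hits` contains both Tabasco files: once the labels are flattened into free text, a phrase search for "Tabasco" has no way to exclude the bottle collection.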

Adron

Quote from: Yoni on August 19, 2004, 01:21 PM
Then, I won't be able to perform a query like "Tabasco and NOT Tabasco bottle collection".

That's just one example, I'm sure I could think of other problems with this approach.
Anyway, it's designed with a different philosophy and achieves a different goal. Using it to do label style stuff would be ugly and bad. No.

You didn't specify that you needed to be able to do boolean queries on labels... I still don't see how its design differs: it allows you to store textual data with each image, and it allows you to find images matching certain textual data. Except for the way you choose to use it, I see no difference. Well, apart from the fact that you have now added the requirement to do boolean searches on it, which ACDSee's search function doesn't support.

If I were to make a simple labelling system, I might just store the labels as comma-separated keywords in a string attached to each item, and then do text search on those fields. IIRC, that's how you make labels when you compile .hlp files.
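
A minimal sketch of the comma-separated keyword idea, with one variation: instead of a raw text search on the field, the string is split on commas so that each keyword is matched whole. The helper names and the case-insensitive comparison are assumptions for illustration, not something specified in the thread.

```python
def parse_keywords(field):
    """Split a comma-separated keyword string into a set of normalized labels."""
    return {part.strip().lower() for part in field.split(",") if part.strip()}


def has_keyword(field, keyword):
    """True only if `keyword` matches one whole comma-separated entry."""
    return keyword.strip().lower() in parse_keywords(field)


# Hypothetical keyword strings attached to two items.
field_a = "Tabasco, Tabasco bottle collection"
field_b = "Tabasco bottle collection"
```

With whole-entry matching, `has_keyword(field_b, "Tabasco")` is false even though the word appears inside a longer keyword, whereas a plain substring search on the same field would match.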


Grok

Yoni, this is easy to solve.  The directory structure you use is meaningless to your need to retrieve.  What you need is to store about 2000 files per directory.  Create a new folder every time you reach 2000 (or other arbitrary volume).
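
The "new folder every 2000 files" rule can be sketched as a simple integer division on a running file counter. The root path, the zero-padded folder naming, and the `storage_path` helper are assumptions made for this example.

```python
from pathlib import PureWindowsPath

FILES_PER_DIR = 2000  # the arbitrary volume suggested above


def storage_path(root, file_index, filename, files_per_dir=FILES_PER_DIR):
    """Return where the file with 0-based sequence number `file_index` goes.

    Files 0..1999 land in folder 0000, files 2000..3999 in 0001, and so on.
    The four-digit folder name is a hypothetical convention.
    """
    bucket = file_index // files_per_dir
    return PureWindowsPath(root) / f"{bucket:04d}" / filename


first = storage_path(r"C:\Photos", 0, "P0000001.JPG")
rollover = storage_path(r"C:\Photos", 2000, "P0002001.JPG")
```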

As you store them in their locations, write the full path/filename to a database.  I recommend MSDE if you are using Windows.  In a second table, put the index type, and index value, for each property you want to apply to a given file.  Set the value of each such property, and include the identity value for the filename entry in the first table.

Scripts to create the schema:

if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[files]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[files]
GO

if exists (select * from dbo.sysobjects where id = object_id(N'[dbo].[properties]') and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table [dbo].[properties]
GO

-- One row per stored file; [id] is the identity value referenced by [properties].
CREATE TABLE [dbo].[files] (
   [id] [int] IDENTITY (1, 1) NOT NULL ,
   [filename] [varchar] (300) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
) ON [PRIMARY]
GO

-- One row per (file, property, value); a file may have any number of rows.
CREATE TABLE [dbo].[properties] (
   [file_id] [int] NOT NULL ,
   [property] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
   [value] [varchar] (300) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
) ON [PRIMARY]
GO
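
To show how the two tables might be used for label queries, here is a rough sketch that recreates them in SQLite through Python's standard `sqlite3` module as a stand-in for MSDE. The property name `'label'`, the helper functions, and the sample paths are assumptions, and SQLite compares strings case-sensitively by default, unlike the case-insensitive collation in the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    filename VARCHAR(300) NOT NULL
);
CREATE TABLE properties (
    file_id INTEGER NOT NULL,
    property VARCHAR(20) NOT NULL,
    value VARCHAR(300) NOT NULL
);
""")


def add_file(conn, filename, labels):
    """Insert a file row, then one 'label' property row per label."""
    cur = conn.execute("INSERT INTO files (filename) VALUES (?)", (filename,))
    file_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO properties (file_id, property, value) VALUES (?, 'label', ?)",
        [(file_id, label) for label in labels],
    )
    return file_id


def files_with_label(conn, include, exclude=None):
    """Filenames that have label `include` and, optionally, lack `exclude`."""
    sql = """
        SELECT f.filename FROM files f
        WHERE EXISTS (SELECT 1 FROM properties p
                      WHERE p.file_id = f.id
                        AND p.property = 'label' AND p.value = ?)
    """
    params = [include]
    if exclude is not None:
        sql += """
          AND NOT EXISTS (SELECT 1 FROM properties p
                          WHERE p.file_id = f.id
                            AND p.property = 'label' AND p.value = ?)
        """
        params.append(exclude)
    sql += " ORDER BY f.filename"
    return [row[0] for row in conn.execute(sql, params)]


add_file(conn, r"C:\Photos\0000\P0000001.JPG", ["Tabasco"])
add_file(conn, r"C:\Photos\0000\P0000002.JPG",
         ["Tabasco", "Tabasco bottle collection"])
```

Calling `files_with_label(conn, "Tabasco", exclude="Tabasco bottle collection")` then answers the boolean query raised earlier in the thread, using one property row per label.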