Tag-based file management

Over the years, I have amassed a collection of 20,000+ files in a folder called “cool.” This folder has everything from woodworking ideas to epic photos of aerospike engines. As this collection has grown, I have been running into more and more logical conflicts when trying to organize things within a hierarchical structure. For me, the obvious solution was to use a tag-based system, where files could be in multiple sets simultaneously. Most OSes provide a tool for file tagging; however, I often find these tools to be pitiful, on the quality level of an afterthought. And goodness help you if you want the tags to be portable.

After considering the problem, I have decided to create my own file manager that is completely tag-based. Here’s why:

To start, most tag systems require the user to already know the exact tag name of the file they are summoning. Unlike standard folder views, where category names are presented to the user outright, most file managers leave you guessing in the search box when it comes to tags. To solve this problem, tag names would be displayed to the user outright, each one acting as a browsable subset of the collection. However, this can lead to information overload, since there may be hundreds of individual tags to display. Therefore, it would make sense to display only the tags with the largest sets, or the tags that match a user’s incomplete search.
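As a rough illustration of that display rule, here is a minimal Python sketch. It is not code from Cluster; the names build_index and visible_tags, the limit parameter, the prefix-matching rule, and the sample files are all assumptions made up for this example.

```python
from collections import defaultdict

def build_index(file_tags):
    """Map each tag to the set of files carrying it (a file can sit in many sets)."""
    index = defaultdict(set)
    for path, tags in file_tags.items():
        for tag in tags:
            index[tag].add(path)
    return index

def visible_tags(index, query="", limit=3):
    """Return up to `limit` tag names to display, largest sets first.

    With an empty query every tag is a candidate; otherwise only tags
    that start with the partial query are kept (an assumed matching rule).
    """
    q = query.strip().lower()
    candidates = [tag for tag in index if tag.lower().startswith(q)]
    # Sort by set size (descending), then by name so ties are stable.
    candidates.sort(key=lambda tag: (-len(index[tag]), tag))
    return candidates[:limit]

# Hypothetical sample data standing in for the "cool" folder.
FILES = {
    "aerospike.jpg": {"rocketry", "photo"},
    "dovetail.pdf": {"woodworking", "joinery"},
    "workbench.jpg": {"woodworking", "photo"},
    "nozzle.png": {"rocketry", "photo"},
    "router_jig.txt": {"woodworking"},
}

index = build_index(FILES)
print(visible_tags(index))        # the largest tags overall
print(visible_tags(index, "ro"))  # tags matching an incomplete search
```

With this sample data, the empty query shows “photo,” “woodworking,” and “rocketry,” while typing “ro” narrows the display to “rocketry.”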

This helps a ton, but there is still the common ailment of search-based approaches, and that is the alphabet. Searching is done entirely with text, and most often, search utilities will try to autocomplete your half-done search in the hopes of being helpful. While this works when the user roughly knows the characters involved, I find it to be too narrow of a selection technique. Instead, I propose to implement language processing to generate tag suggestions. This way, suggestions will be based solely on topical relation to the query rather than on spelling. With this fuzzy logic in place, I can query the word “time” and see other tags such as “clock,” “sky,” and “pendulum,” in addition to the tag “time” itself, if it exists.
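One way to picture topical suggestions is to give every tag a vector and rank tags by cosine similarity to the query. The sketch below is only a toy under loud assumptions: the vectors are hand-made for this example (a real version would presumably take them from some word-embedding model), and suggest_tags, TOY_VECTORS, and the 0.5 threshold are hypothetical names and values, not part of Cluster.

```python
import math

# Hypothetical hand-made topic vectors; the axes are loosely
# (time, astronomy, mechanics, wood). Real vectors would come from
# a language model rather than being typed in by hand.
TOY_VECTORS = {
    "time": (1.0, 0.3, 0.3, 0.0),
    "clock": (0.9, 0.0, 0.5, 0.1),
    "pendulum": (0.6, 0.0, 0.8, 0.0),
    "sky": (0.4, 0.9, 0.0, 0.0),
    "dovetail": (0.0, 0.0, 0.2, 1.0),
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def suggest_tags(query, vectors, threshold=0.5):
    """Return tags topically close to `query`, most similar first.

    Tags scoring below `threshold` are dropped. An unknown query
    yields no suggestions in this toy version.
    """
    if query not in vectors:
        return []
    q = vectors[query]
    scored = [(cosine(q, vec), tag) for tag, vec in vectors.items()]
    scored = [(score, tag) for score, tag in scored if score >= threshold]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [tag for _, tag in scored]

print(suggest_tags("time", TOY_VECTORS))
```

Here the query “time” brings back “time,” “clock,” “pendulum,” and “sky,” while “dovetail” falls below the threshold, even though none of the suggested tags share letters with the query.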

I have started a proof-of-concept GitHub repo called Cluster, written in Processing. Later, once I have the design refined, I plan to write the application in C++ using Cinder.

Am I trying to replace hierarchical file structures? No, don’t be silly. This is meant only as a supplement to current filesystems, and is really only applicable when the user has a collection that spans a wide range of topics.

