Tagging Files—Or How to Keep Research Organized
I received the following email from an Academhack reader.
Here’s my situation: I work in education policy, which means I spend a lot of time reading long-ish reports and writing syntheses, papers, policy briefings, etc. What I often find is that Report A will contain potentially useful info about a variety of topics (we’ll call those topics X, Y and Z). Report B will contain info about topics V, W and X; Report C about W, X and Y - you get the idea.
What I would love is software that allows me to take notes in such a way that, when I need to write a paper on Topic Z, I can easily find all the notes I’ve taken on that topic, rather than having to look laboriously through the notes from Report A for info on topic Z, then the notes from Report B, etc. I would assume this sort of software exists, but I’ve asked around and none of my colleagues have any recommendations. I’ve looked online, of course, but without real world recommendations it’s difficult to know where to start or even what exactly I should be looking for. And when I asked an academic I know, she suggested 6×4 notecards - not exactly what I have in mind.
Re platforms, I use mac at home, but, alas, use a pc at work, so my first priority would be finding software for the latter (though I would like to have it at work and home).
Good question. I think I have an answer, or several, and since I have been wanting to write about “metadata” and “tagging” files, this gives me a good opportunity.
Meta-explanation
First let me reframe the issue and explain why you would want to do what is being asked here. Say you are doing research on 20th-century literature, and you have just finished reading an excellent article that discusses David Foster Wallace, Thomas Pynchon, and William Vollmann, all through the lens of Foucault. (Does such a paper exist? I just made this up, but bonus points for anyone who finds such a thing.) Under a typical filing system you could place the paper under Pynchon or under Wallace, but not both, and what do you do about Foucault and Vollmann? You could create four copies of the document and file one under each name. You could file the document under Wallace and put small placeholder documents in the other folders that say “see (Insert Name of Article), located in Wallace.” Or you could create a card catalog for your computer. Personally I think all of these sound pretty futile. The problem stems from organizing computer files as if they were in a file cabinet, when they do not have to be.
I think over the past year or so we have started to see this change on the internet: things are no longer necessarily organized in a hierarchy, but are instead placed in relation to one another. If you think about it, this is how del.icio.us operates. Things aren’t placed in just one bin but in multiple bins, and you can learn a great deal just by looking at the bins.
The problem comes when you move this stuff to your own hard drive and start to organize your own data. You might want to keep the folder hierarchy of the file system but still be able to tag files.
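To make the “multiple bins” idea concrete, here is a minimal sketch in Python. It is only an illustration, not how any of the tools mentioned in this post work internally: the file names and the `build_tag_index` helper are made up, and the topics are borrowed from the emailer’s Report A/B/C example. Each document is stored once, and an index maps every topic to all the documents that carry it.

```python
from collections import defaultdict

# Hypothetical notes files, each listed once with every topic it covers
# (the file names are invented; the topics follow the emailer's example).
documents = {
    "report_a_notes.txt": ["X", "Y", "Z"],
    "report_b_notes.txt": ["V", "W", "X"],
    "report_c_notes.txt": ["W", "X", "Y"],
}

def build_tag_index(docs):
    """Map each tag to the set of documents that carry it."""
    index = defaultdict(set)
    for name, tags in docs.items():
        for tag in tags:
            index[tag].add(name)
    return index

index = build_tag_index(documents)

# Every document tagged with topic X, wherever it happens to be filed.
print(sorted(index["X"]))
# Only Report A's notes cover topic Z.
print(sorted(index["Z"]))
```

The point of the design is that a document never has to be copied or moved to appear under a second topic; it just gets another entry in the index.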
The Solution
The easiest way to solve this problem is to use something like Devon or Yojimbo or, for the PC, as some users have recommended, OneNote. Obviously I am a fan of this method, although it does have its drawbacks. For one, these applications are not cross-platform, and you have to be rigorous about always using the application. They are also not free.
What this emailer is really looking for is a way to tag files, that is, to attach metadata that is searchable. I have a few solutions for this problem.
You can add the metadata, or the tags, to the document itself. Let’s take the same example as above. You could just use something that will search inside all of your documents (more on this later) for Pynchon. The problem is that you will get every document that contains the word Pynchon, when you really only want the documents you “tagged” as having to do with Pynchon. The solution is to create a tagging system with a prefix. I use a tagging system on my hard drive in which every tag starts with an “&.” Back to the above example: you would open the document and, on the last line, add &Pynchon, &Wallace, &Vollmann, &Foucault. Now when you search inside all of your documents, instead of searching for Pynchon you search for &Pynchon, which returns only the documents you have tagged with “&Pynchon.” This has a few drawbacks, namely that you have to be able to search inside the document and you have to be able to edit the document (if you get a .pdf or have saved a webpage, this can get a bit cumbersome). It does work across platforms, though: the tags stay with the document when you move it from a PC to a Mac.
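If you are comfortable with a little scripting, the prefix search can be sketched in a few lines of Python. This is only a rough illustration under some assumptions of mine: the `find_tagged` name is made up, the notes are assumed to be plain-text .txt files, and the “&” prefix is the one described above. A desktop search tool does the same job without any code.

```python
import re
from pathlib import Path

def find_tagged(root, tag, prefix="&"):
    """Return the .txt files under root that contain prefix+tag as a whole tag.

    Assumes plain-text notes; PDFs or saved webpages would need extra handling.
    """
    # (?!\w) keeps "&Pynchon" from also matching a longer tag like "&Pynchonesque".
    pattern = re.compile(re.escape(prefix + tag) + r"(?!\w)")
    matches = []
    for path in sorted(Path(root).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        if pattern.search(text):
            matches.append(path)
    return matches

# Example call, with a hypothetical folder name:
# find_tagged("research_notes", "Pynchon")
```

A file that merely mentions Pynchon in its text is skipped; only files containing the literal tag “&Pynchon” come back.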
If you are on a Mac—a better way
If you are on a Mac this is actually really easy: Spotlight lets you add tags/metadata to any file, and you can then search for them. Lucky for me, someone far more knowledgeable has already done an extensive write-up on how to do this and make it easy, so I can just link to it here. This is an excellent five-part series in which Nick lays out all the tools, tricks, methods, and philosophy of getting this done. It is a really thorough series, complete with screencasts. So if you are a Mac user and you want to tag your files, this will do the trick. Start here, and then go to Part 2, Part 3, Part Four, and finally The Wrap Up. This should pretty much cover how to tag files. If you are only going to work on a Mac and don’t want to use something like Devon or Yojimbo, this is the way to go. It can even be done for free, or there are some cheap apps that Nick features that make it even easier.
If you are on a PC
First, if you are on a PC just go buy a Mac (just kidding, please don’t send me hateful emails, I really am just kidding). As the emailer states, this is probably a problem many in academia face, since you are given PCs by the workplace. Windows doesn’t have the extensive search capabilities of Spotlight, so you have to work around this problem. One easy solution is to get Google Desktop, which will allow you to search the files on your computer as if they were Google pages. If you want to use this, check out Lifehacker for some help. The one problem with this solution is that it still does not allow you to tag files as the above model requires, only to search within them.
If you want to “tag” files on a PC you need Copernic Desktop. This program is also free, and it will do one thing that Google Desktop will not: search inside the Properties pop-up for a file, which allows you to employ the tagging system. To get this done, first read the series of posts above from Nick on the Apple Blog. You don’t need all of the tools, just his general philosophy; at the very least read the first post. Now when you get a new document, place it on your desktop. Before you file it, right-click it, open the Properties => Summary window, and look for “keywords.” In the keywords field place your tags, for example &Pynchon, &Foucault. It is crucial to use the & prefix (or whatever symbol/system you choose), because if you just put in keywords without the prefix you will have trouble distinguishing tagged documents from documents that merely contain the word, as described above. Copernic will search in the keywords field (at least that is what I am told; I have not actually tried this), but Google Desktop will not. Since Copernic is free, it should not be too much of a problem to get it installed on your work computer.
Closing Thought(s)
I think that as more information becomes digital, and as one tries to manage that information, metadata will become ever more crucial for handling research. I still think something like Devon or Yojimbo is the way to go, but the above should work, and unless you have everything in your database manager it is useful to employ a tagging system for your other files (this is what I do).
Thoughts? Questions? Anyone have something to add? (Post in the comments.)
March 12th, 2007 at 12:50 am
Google Notebook ( http://www.google.com/notebook ) might be something to look into that is web based. I use Firefox and the Google Notebook extension ( http://www.google.com/tools/firefox/ ). If I highlight something then right click on it I can add it as a note then get back to it easily later, add my thoughts to it, etc.
March 12th, 2007 at 6:36 am
You can also use the keywords field in BibDesk to do this; I’ve been doing this for several years, and it works great.
March 12th, 2007 at 12:52 pm
Keywords in BibDesk, or Bookends/Sente for that matter, are a good bet. The one problem is what to do with stuff that is not in BibDesk, for example your notes on something, posts you have saved from the internet, etc. But if you don’t have to worry about this, i.e. you are only working with cited material, BibDesk works well. You would still have to find the article file itself once you located the entry in BibDesk, though.
March 12th, 2007 at 7:17 pm
Dave- BibDesk has a “local URL” field (in addition to a URL field) that you can drag & drop a file onto; if you do this once for every paper that you catalog, you can then open files directly from BD- no need to find them again in the Finder.
March 12th, 2007 at 8:50 pm
Gabi, good point, I hadn’t thought about that (Bookends has this too, but I am not sure about Sente). You could also use this to create links to your own notes on a particular work, as long as that field holds two links.
March 12th, 2007 at 10:33 pm
The free and open source Zotero (formerly known as Firefox Scholar) works great, with tagging and searches for anything you access in a web browser (and it doubles as a bibliographic citation program, like EndNote). However, it’s not so good for other files.
http://www.zotero.org/
March 13th, 2007 at 12:38 am
Yeah, big fan of Zotero. You can actually search this site, as I have talked about it a couple of times, but it doesn’t necessarily handle the problem this emailer was having. In general, though, I think Zotero has the potential to become one of the essential tools for scholarship in the digital age. Incidentally, the podcast in the post prior to this one is from the center that developed Zotero and is worth listening to.
March 13th, 2007 at 5:27 am
Since the reader in question seems to be looking for a way to store and tag notes, I’m a little surprised that no one has mentioned Tinderbox. Yes, it’s a touch pricey, and has a bit of a learning curve… but it’s one heck of a tool for storing, organizing, and manipulating notes.
I keep a master file with notes on everything I’ve read that’s even tangentially related to my research. I make multiple notes for each reading, because the problem with tagging entire files is that, while a document may relate to many subjects, it is sometimes hard to find what you need if you have to search through the whole document. NY Times writer Steven Johnson discussed this issue in his well known “Tool for Thought” post as it relates to DevonThink.
As for cross-platform compatibility, Mark Bernstein (the developer of Tinderbox) keeps teasing with the mention that Tinderbox for Windows (TinderWin) is in active development, but no firm release date has been mentioned. Still, the files are XML, so they’re pretty easy to edit in any text editor, and exporting notes out of Tinderbox as HTML or TXT files is also fairly simple, and those files can obviously be used on Windows computers.
March 13th, 2007 at 6:19 pm
I agree that Tinderbox (with the caveat about the learning curve) would actually be ideal for what you are trying to do. Tinderbox does (at least) two things that regular meta-tagging does not. First, it allows you to find emergent structure. Rather than pre-tagging your notes as XYZ and WXY, you can do a search on all of your notes. If they contain references to WXYZ or any defined combination, you can bring them up. You can also do truncated searches and, more important, you can combine a search for XYZ with GHK later, after you’ve realized that GHK is actually relevant. In other words, Tinderbox lets you both see and manage emergent structures in your data. Metatagging goes part of the way towards this, but the problems are that 1) you have to pre-tag, knowing in advance the metatags you need for each note, and then 2) re-tag as your structure changes (adding or deleting). The second feature of Tinderbox is that it very easily allows you to find emergent collections, e.g. XYZ and GHK, and then automatically retag all of these notes. It’s much, much more flexible and dynamic. The price for this is its learning curve. But once you realize that the learning curve has less to do with the software (although it can be picky) than with managing emergent structure, things go a bit easier.
March 13th, 2007 at 8:04 pm
I haven’t tried Tinderbox; I prefer Devon (really a matter of preference, though). These are by far the best solutions, but they don’t solve the PC problem. That aside, I think having a database/information program is the better way to go. Maybe we should do a breakdown of the differences and advantages of each of the three big ones: Devon, Yojimbo, Tinderbox. (My sense is that Devon is the most powerful but hardest to learn, with Yojimbo being the easiest and Tinderbox falling somewhere in the middle.)
March 14th, 2007 at 5:45 pm
[...] So far I have mainly used Safari; Firefox somehow seems slower to me. Unfortunately, wonderful Firefox extensions like Greasemonkey or Firebug do not exist in Safari land. Today, in a comment on academhacks, I came across the Zotero extension; on that: [...]
March 14th, 2007 at 6:16 pm
If the notes are files on the hard drive, try Punakea, a free app that lets you tag anything on your Mac, including bookmarks. It’s very easy to use. Get it here: http://nudgenudge.eu/punakea
BTW, I have both Devon and Yojimbo and find it really difficult to get into using Devon. The interface leaves a lot to be desired, though it’s certainly more powerful than Yojimbo.
April 24th, 2007 at 3:08 pm
Now you can tag every file in Windows Vista. Read more here: http://lifehacker.com/software/vista/geek-to-live–tag-files-and-save-searches-in-windows-vista-232891.php
June 10th, 2007 at 11:27 am
A Windows application called TagAndFacet is available at http://www.tagandfacet.com. It runs on Windows Vista and Windows XP. It enables users to tag all types of files and folders. It also integrates Outlook email tagging and Internet Explorer tagging. For both, an autocompletion option is available to help users keep their tags consistent. Search capability is integrated for all sources (files, emails and URLs).
It is available in pre-release mode for free.