Introducing BioJava!

To put it simply (and in the words of the creators), “BioJava is an open-source project dedicated to providing a Java framework for processing biological data”. It is a collection of Java classes, and a great toolbox for those in the bioinformatics field who want to use Java to do “bioinformatics stuff” on their data, like analyzing and manipulating sequences. You can check out the wiki at http://biojava.org/wiki/Main_Page if you would like more details about BioJava. In this post I will explain how to add BioJava to the Eclipse (Ganymede) platform on Windows Vista. If you have already added library files to Eclipse before, this will probably be a bit of a review for you. The BioJava wiki also offers an installation guide if you would rather follow that, although it only covers adding the .jar files to your machine’s CLASSPATH and does not explain the work that must be done in Eclipse.

So, first and foremost, you must download the packages from the BioJava website, which can be found on the wiki’s download page. There is a complete download that contains all the binaries, documentation, and source files for BioJava, but I didn’t choose it because I’m not familiar with how to unpack the binaries from the file. If you know how (and it would be awesome if you told me), and you want all those extras, then go ahead and download that file. The other downloads are listed below it as individual .jar files with descriptions of their purpose. I downloaded all of them individually, but I think I am really only using the first two (bytecode.jar and commons-cli.jar). After downloading those, follow the install guide to add the files to the CLASSPATH environment variable on your machine. On Windows Vista, an alternative to using the command prompt is to open Advanced System Settings from your computer’s properties and click the Environment Variables button, where you will find CLASSPATH.
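If you want to double-check that the .jar files really ended up on your classpath, a small sketch like the one below can list them. The class name ClasspathJars and the helper jarEntries are my own inventions for illustration, not part of BioJava or the install guide; the only thing it relies on is the standard java.class.path system property.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class (not part of BioJava): lists the .jar entries on a classpath.
class ClasspathJars {

    // Splits a classpath string on the given separator and keeps only entries ending in ".jar".
    static List<String> jarEntries(String classpath, String separator) {
        List<String> jars = new ArrayList<String>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            String trimmed = entry.trim();
            if (trimmed.toLowerCase().endsWith(".jar")) {
                jars.add(trimmed);
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the running JVM was started with.
        String classpath = System.getProperty("java.class.path");
        // File.pathSeparator is ';' on Windows and ':' on Unix-like systems.
        for (String jar : jarEntries(classpath, File.pathSeparator)) {
            System.out.println(jar);
        }
    }
}
```

Run from the command prompt without a -cp option, this should reflect the CLASSPATH environment variable. Run from inside Eclipse, it will show the project’s build path instead, so it is mostly useful for checking the first half of the setup.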

Now for the Eclipse part. I figured this out by browsing around Eclipse and messing with the settings, so if there is a “more correct” way of doing this, please let me know. First, you must create a user library for BioJava. This is easily done by clicking on Window – Preferences. In the left column, expand the Java dropdown, then Build Path, then click User Libraries. On the right you will see the list of User Libraries that have been defined (which is probably empty if you’ve never done this before). Click on New and Eclipse will ask for a user library name. The name isn’t too important, since it is only a label for the library you’re about to add. For example, I named mine “BIOJAVA” for ease. Next, click on Add JARs… and locate the files that were downloaded from the BioJava wiki. I only added bytecode.jar and commons-cli.jar because I was just testing to see if it worked. The wiki explains what each of the files is used for, though I have yet to look into their purposes myself. Now the BioJava library has been created.

So, after creating the library in Eclipse, you have to add it to the projects that need it.

If you’ve already created a Java project, right click on the project name in the Package Explorer and go to Build Path – Add Libraries… Choose User Library from the list, click on User Libraries…, then choose your BioJava library and click OK. It should then show up in your list of User Libraries, where you can select it and click Finish.

If you are starting a new Java project, the method is a little different. When setting your project name and layout, click Next instead of Finish. This will take you to your project’s build settings. Click on the tab labeled Libraries, which shows the list of libraries that the project may use. Click on the Add Library… button on the right and choose User Library. If your BioJava library isn’t shown, click on the User Libraries… button, find your BioJava library, and click OK. Now check the box next to your library and click Finish. The BioJava library and the original JRE System Library will now be listed as the two libraries that can be used by your project. Make any additional changes you need and click Finish. Now you’re ready to use BioJava!
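A quick way to see whether the project can actually find the library is to try loading one of its classes by name. The sketch below is only a smoke test I put together: BioJavaCheck and isOnClasspath are hypothetical names, and the probe class org.biojava.bio.seq.DNATools assumes that the core BioJava 1.x jar is among the JARs in your user library. Swap in any class you expect to be there.

```java
// Hypothetical smoke test (not from the BioJava wiki): checks whether a class can be found.
class BioJavaCheck {

    // Assumption: the core BioJava 1.x jar is in the user library and provides this class.
    static final String PROBE_CLASS = "org.biojava.bio.seq.DNATools";

    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, BioJavaCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but something it depends on is missing or broken.
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(PROBE_CLASS)) {
            System.out.println("Found " + PROBE_CLASS + ", the library looks usable.");
        } else {
            System.out.println("Could not load " + PROBE_CLASS + ", check the build path.");
        }
    }
}
```

If this reports that the class could not be loaded, the first things I would check are whether the user library is ticked for the project and whether the jar that contains the class was included when the library was created.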

Well, I hope this helps anyone who was as confused as I was. I haven’t done much testing with code that uses BioJava, but I grabbed some code from the tutorials on the BioJava wiki and it ran without errors. I’ll be learning more about BioJava and, hopefully, more about bioinformatics through using it.