How to share and store data in an electronic lab notebook

Posted by Rory on September 23rd, 2010 @ 5:16 pm

In this blog I usually look at data sharing from the point of view of the core research unit, the lab.   That was the perspective I adopted a couple of weeks ago in a presentation, Electronic lab notebooks in biomedical research, at the Storing, Accessing and Sharing Data: Addressing the Challenges and Solutions event co-hosted by the Scottish Bioinformatics Forum and S3 in Edinburgh.  I’ll come back to that perspective in a minute, but first I’d like to contrast two very different institutional perspectives on data management described at the conference.

Sanger Institute:  centralized institutional data management

Phil Butcher, head of IT at the Sanger Institute, started with a high-level overview of data management issues at Sanger.  He focussed mainly on the rapid growth in the amount of data generated at Sanger and at the other institutes with which it has large-scale collaborations, and on the issues relating to storing and finding data when there is so much of it.  The impression I came away with is that at Sanger data is viewed as an institutional matter, not something that individual labs or scientists manage or, apparently, have much of a say in.  That makes sense, because the research projects Phil mentioned were all large scale, involving large numbers of scientists and the generation of huge amounts of data.  The title of Phil’s talk, Scaling up Science and IT: Sanger Institute’s Perspective, reflects the centralized approach.

London Research Institute:  decentralized institutional data management

The next speaker, Jeremy Olsen, head of IT at the London Research Institute, started by saying that, based on Phil’s description of Sanger, the London Research Institute was very different indeed, more a collection of individual research groups.  In describing his LRI perspective Jeremy said that he would be sticking up for the “little guy”.  He went on to give a brief overview of how research is carried out at the LRI, introducing the various research groups and their research interests.  The LRI represents a very different paradigm from Sanger; at the LRI decentralization rules, as reflected by the title of Jeremy’s talk, Data Growth and Management in a Diverse Life Sciences Environment.  At the LRI there are fundamental issues relating to getting a handle on what research the various groups are involved in, what data they generate and how they manage it.  Progress would need to be made on understanding these issues before it would be possible even to consider a centralized approach to data management and what that might entail.

The lab: bottom up data management

When it came time for my presentation, I started by saying that if Phil was representing the centralized institutional approach, and Jeremy was looking at the “little guys” from an institutional perspective, I was going to look at the issue of data management and sharing from the point of view of the little guy him/herself, i.e. the PI.  In the academic context, it’s important to note that the Sanger model is the exception and the LRI decentralized model is the rule.  In fact it is almost certainly the case that the LRI, decentralized as it is, is still towards the more organized and centralized end of the spectrum of academic biomedical institutions.  That point was reinforced to me when speaking recently with the IT director of a medium-to-large biomedical research institute in Australia (800 people including 700 scientific staff).  His description of the issues he faced in getting a grip on what data there was in the labs at the institute and how they managed it (if they managed it at all), and his uncertainty about how to help PIs get a better handle on their data, were uncannily reminiscent of Jeremy’s description of the situation at the LRI.

From the perspective of IT managers tasked with, among other things, trying to bring some order to the data generated by the research groups at their institution, to store it in a cost effective fashion and have it archived in a way that is useful in the future, multiple PIs generating ever increasing amounts of data may be a ‘problem’ to be managed or dealt with.  But from the PIs’ point of view it is their data and theirs to manage (or not) as they want.  There is a pretty fundamental difference in outlook here.

Electronic lab notebooks — part of the solution?

In my presentation I asked where electronic lab notebooks might fit into this picture, and whether they could have a role to play in crafting better data management solutions that meet the objectives of both PIs and IT directors.

ELNs tick some of the key boxes IT directors look for in best practice in data storage and sharing, including:

  1. Storing metadata in a structured fashion and ensuring controlled access.
  2. Effectively managing different data types, including attachments and imports.
  3. Allowing improved indexing  and search, through the use of structured metadata.
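To make those three points a little more concrete, here is a minimal sketch in Python of what structured metadata, controlled access and search might look like together.  It is purely illustrative: the Entry class, its fields and the search function are hypothetical names I have made up for the example, not the API of any particular ELN.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A hypothetical notebook entry carrying structured metadata."""
    title: str
    owner: str
    tags: set[str] = field(default_factory=set)            # structured metadata
    attachments: list[str] = field(default_factory=list)   # attached file names
    readers: set[str] = field(default_factory=set)         # users allowed to read, besides the owner

    def can_read(self, user: str) -> bool:
        """Controlled access: the owner and named readers may read."""
        return user == self.owner or user in self.readers


def search(entries: list[Entry], user: str, tag: str) -> list[Entry]:
    """Return the entries tagged `tag` that `user` is allowed to read."""
    return [e for e in entries if tag in e.tags and e.can_read(user)]


# Made-up example data
entries = [
    Entry("Western blot, batch 3", owner="alice", tags={"western", "batch3"},
          attachments=["blot.tiff"], readers={"bob"}),
    Entry("qPCR run", owner="bob", tags={"qpcr", "batch3"}),
]

hits = search(entries, user="bob", tag="batch3")
print([e.title for e in hits])
```

Because the tags are structured rather than buried in free text, the same search that finds an entry can also respect who is allowed to see it.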

Electronic lab notebooks can also solve the key data management problem facing many PIs: coordinating a wide diversity of types of data sets generated by a large number of people within the lab.  They can, that is, if they meet the following key requirements of today’s PIs:

  1. The ELN is flexible and can be set up the way the PI and their lab want it set up.
  2. It’s easy for the lab to transfer to the ELN.
  3. The ELN facilitates better exchange of information between members of the lab and, over time, better archiving.
  4. The ELN is web based and hence accessible anywhere, anytime.

So, electronic lab notebooks can help to solve the key data management issue faced by the core unit in academic institutions: the lab.  And they provide a platform for data management that IT directors looking at the problem from an institutional perspective can work with.  As such they can be part of a solution which benefits both PIs, who are concerned with the research done in their group, and IT directors, who are concerned with the data generated throughout their institution.

5 Things PIs want in an electronic lab notebook — other suggestions?

Posted by Rory on July 28th, 2010 @ 7:00 am

What PIs want in an electronic lab notebook is often different from what postdocs and graduate students want because PIs are looking for a tool for recording the entire lab’s work, rather than an individual note taking tool.  I looked around the web at recent discussions of what PIs are looking for in an ELN, and identified five common themes:

  1. Something that’s easy to learn and easy to use in order to ensure (relatively stress free) lab-wide buy in and take up.  Joshua Shaevitz, at Princeton, has a good description of the considerations that went into adopting an ELN, and the adoption process, in his recent  post on My Lab’s Wiki-based Electronic Lab Notebook System.  He says, “Before implementing our wiki system, I setup a mock wiki ELN on my laptop and presented it during a  lab meeting to show everyone the benefits firsthand. I especially wanted to convince them that the new system would not generate extra work, but would instead make their lives easier.”
  2. Something that’s flexible in terms of providing for, on the one hand, common structures for group records and records that need to be accessed by multiple members of the group, and, on the other hand, scope for individuals to ‘do their own thing’ in terms of both research style and having their own private space.  Joshua Shaevitz again: “I didn’t want to impose too much structure on each lab member, as I think notebook style is very personal thing. But, I also wanted to ensure that the results would be compatible with features such as search and would work well with our archiving strategies.”
  3. Something that facilitates integrated handling of  experimental data (i.e. the lab notebook function) in the same environment as other information the lab deals with, e.g. protocols, meeting notes, etc. Alex Swarbrick at the Garvan Institute: we use our electronic lab notebook “to compile the diverse collections of data that we generate as biologists, such as images and spreadsheets, and to take minutes of meetings.”
  4. Related to the previous point, something that provides the capacity to manage physical inventory as well as data in electronic form, and the ability to link the two together.  This point is brought out by Cameron Neylon in a great recent discussion started by Jonathan Eisen at U.C. Davis, Possible electronic lab notebook systems – update.  In discussing what kinds of data a system needs to be able to handle, Cameron says, “generating, storing, analysing and publishing research objects, explicitly including samples and other physical objects.”  And Alex Swarbrick again: “the ability to link records, reagents and experiments. For example, to connect an experimental mouse with the tube containing its tissues in the freezer, to the 6 different experiments (conducted over a year) that analysed those tissues in different ways. Managing this kind of ‘metadata’ is absolutely essential to our work.”
  5. Something that can “help to deal with information and data overload (sorting and filtering)”, in the words of a scientist interviewed in a recent study of the research practices of seven life sciences research labs, Patterns of information use and exchange: case studies of researchers in the life sciences.
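The kind of linking Alex Swarbrick describes in point 4 (mouse, to tube in the freezer, to six experiments) can be sketched as a small graph of records.  This is only a rough illustration in Python: the record identifiers and the link and connected functions are hypothetical, and a real ELN would store and name these things in its own way.

```python
from collections import defaultdict

# Each record id maps to the set of record ids it is linked to.
# The ids used below are made up for illustration.
links = defaultdict(set)


def link(a: str, b: str) -> None:
    """Record an undirected link between two records."""
    links[a].add(b)
    links[b].add(a)


def connected(start: str) -> set[str]:
    """Return every record reachable from `start` by following links."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in links[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen - {start}


# One mouse, the tube holding its tissues, and six experiments on that tube
link("mouse:M-017", "tube:freezer2/box4/A3")
for n in range(1, 7):
    link("tube:freezer2/box4/A3", f"experiment:{n}")

related = connected("mouse:M-017")
print(sorted(related))
```

Starting from the mouse you reach the tube and all six experiments, and starting from any one experiment you can get back to the mouse, which is the property that makes this sort of ‘metadata’ useful a year later.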

How does this list sound?  Is it an accurate reflection of what others want in an ELN? Is it comprehensive?  Are key requirements missing?  Comments welcome!

What is an electronic lab notebook?

Posted by Rory on July 6th, 2010 @ 9:30 pm

Welcome to the electronic lab notebook blog.  This will be a space for discussing electronic lab notebooks from every angle:  what benefits do they bring? how do they compare with alternatives? what kinds of features do they have and should they have? what issues do people face in using them?  how can you get the most out of them?

In this first post I’m going to start at the beginning:  what is an electronic lab notebook?

Wikipedia defines an electronic lab notebook as “a software program designed to replace paper lab notebooks”.  That could be just about anything; beauty is in the eye of the beholder!  I’ll take a look in a second at who the relevant beholders are and what each of them thinks, but taking Wikipedia as the starting point it’s fair to say first that all of them are looking to move away from the paper lab notebook.

Postdocs and graduate students

So who are the relevant beholders?  They can be divided into two categories.  The first is people looking for an electronic note taking device for themselves.  They tend to be interested in convenience and simplicity: something which is easy to use and also helps them get organized.  But many postdocs and grad students want something which also provides support for research.  Here is one description of the ‘dream app’ over on an Apple forum about electronic lab notebooks:

  • easy copy-pasting/drag ‘n drop
  • ability to re-open the files
  • metadata and search (tags, keywords, …)
  • possibility to link to older notes and graphs
  • store PDF (or TIFF) representation of external files along with the original file: preview files without their originating application
  • automated backup mechanism
  • encryption on disk

The thread following that post contains a good discussion of the varieties of things people want in an electronic lab notebook for their own use, examples of what they have tried, what they like and dislike, and the limitations of the available tools.

PIs

The second category of people looking for an electronic lab notebook want a collaborative tool rather than one aimed at individuals.  They tend to be PIs, or in some cases others in the lab who’ve been asked by the PI to identify a suitable tool.  Like those looking for an individual tool, PIs want something which is simple to use and easy to learn.  But beyond that their needs diverge from those of individual scientists.  Professor Mike Shipston of the University of Edinburgh provides a good summary of the kinds of challenges that drive PIs to look at adopting an electronic lab notebook:

“We generate a wide variety of types of data sets, for example data from molecular analysis, quantitative analysis, for example quantitative RTPCR, gene cloning, through to electrophysiological analysis, for example from confocal images and total internal reflective microscopy right up to behavioral assays in animals.  So it’s really about coordinating those types of data sets that fit together, keeping them contained within projects, because the data sets are derived from different people within the lab.  Also we have a very big extended network both in the UK and across Europe and the US.  It’s about keeping that information together. We have a large number of people coming in and out of the lab, the challenge is keeping track of that data and integrating it in with data from existing projects.”

Well that’s it for this first post — I realize I haven’t yet provided an answer to the question of what an electronic lab notebook is!  The next post is going to look at the tool that has been most widely adopted by labs looking for a collaborative tool, wikis.  I’ll discuss the strengths and limitations of wikis and whether they do – or should – qualify as electronic lab notebooks.  And don’t worry, I promise to come back with a specific answer to the question of what an electronic lab notebook is.