Privacy versus sharing: electronic lab notebooks, Facebook and wikis compared

Posted by Rory on September 29th, 2010 @ 9:08 am

Common misconceptions about sharing and privacy in ELNs

A couple of weeks ago I fielded the following question (assertion, really) at a conference on data sharing and storage in biomedical research:

“An electronic lab notebook is not useful because everyone can see everyone else’s work — there’s no privacy.”

To which I responded that good ELNs have a permissions system that allows records to be kept private.

The person who asked the question, still on the attack, then said something to the effect of, that’s no good because people can’t share their data.

To which I responded that the permissions system in a good ELN allows fine level controls so that any record can be completely private, completely public to the entire universe of users, or accessible only to a particular group of users.   In other words, it supports privacy and sharing.

I was a bit taken aback by the aggressiveness of the questioner, and felt quite pleased with myself in that I had, I thought, successfully countered both lines of his attack  on ELNs.  But reflecting on the exchange afterwards, I began to have second thoughts.  The questioner said that he was in a research support role with a group of academic biomedical researchers.  So presumably his comments reflected concerns/preconceptions the researchers he works with have about ELNs.  And judging by the tack he adopted, the prevailing view about ELNs is not positive — they don’t allow privacy, or they don’t allow  sharing, and in any event they are inflexible.

ELNs:  neither Facebook nor wiki

I don’t know how representative these views are.  Since ELNs have yet to be widely adopted by academic scientists, it’s probably the case that few people have first hand experience with them, so whatever the prevailing view is, it will be based on vague impressions rather than a good set of information.    Many labs have adopted wikis for sharing general information like meeting notes and protocols, and most of these wikis will be inflexible, and not offer scope for keeping private records.  So it’s quite possible that people just assume that electronic lab notebooks are beset by the same restrictions.  It’s also possible that people assume ELNs are only capable of replicating the crude and inflexible privacy/sharing regime you get with your Facebook account.  In other words, many people probably project on to ELNs concerns they have with information sharing applications they are familiar with without any understanding of how sharing actually works in ELNs.

Fine-grained and flexible sharing in ELNs and the benefits it brings

In fact there are some key differences between the sharing/privacy system of Facebook, wikis, and ELNs designed for documenting and sharing experimental data. Here are three of them.

1. Sharing and privacy in ELNs is simpler than on Facebook, and more flexible than in wikis.

When you think about it, sharing on Facebook is very complex!  You’ve got three categories of things you can share — things you share, things on your Wall and things you’re tagged in, and then within each of these a whole variety of subcategories.  And then you’ve got a variety of categories of people you can share with — everyone, friends and friends of friends.    Most people ignore most of the sharing  functionality — the system is just too unwieldy.  It’s also very inflexible — the categories of what you can share and what kinds of groups you can share with are decided by Facebook, not you!

Sharing on wikis is at the other end of the spectrum: exceedingly simple, but even more limiting. The way most wikis are configured, you are part of one or more groups, and the pages in that group or groups can be viewed by everyone in the group. In other words, there is no privacy! And of course no flexibility, since the decision about what group(s) you are in is made by the administrator, not you.

In contrast to both Facebook and wikis, sharing and privacy in the best ELNs are (a) simple, and (b) flexible.  They are simple because they don’t require distinctions between different kinds of things that can be shared or between different categories of people that are involved in the sharing.  For any record in the system sharing is set in the same way. They are flexible because a record can be shared with one other person, with everyone, or with any subset of people  using the system at the discretion of the person setting the permissions, and a different sharing regime can be set for each record if so desired.
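To make the "same setting for every record" idea concrete, here is a minimal sketch in Python. It is not taken from any particular ELN; the `Record` class, its fields and the user names are all hypothetical, and a real system would add groups, roles and edit rights on top.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A notebook record with one per-record sharing setting (hypothetical model)."""
    title: str
    owner: str
    public: bool = False                            # visible to every user
    shared_with: set = field(default_factory=set)   # specific users chosen by the owner

    def can_view(self, user: str) -> bool:
        # The owner always sees the record; anyone else needs public or explicit access.
        return user == self.owner or self.public or user in self.shared_with


# Three records, three sharing regimes, all set the same way.
private_note = Record("Failed western blot", owner="alice")
protocol = Record("Lab protocol: PCR", owner="pi", public=True)
project = Record("Project X results", owner="alice", shared_with={"bob", "pi"})

for rec in (private_note, protocol, project):
    viewers = [u for u in ("alice", "bob", "carol", "pi") if rec.can_view(u)]
    print(rec.title, "->", viewers)
```

The point of the sketch is that privacy and sharing are the same mechanism: a record is private simply because nobody has been added to it.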

2.  ELNs give equal weight to individuals and groups

Facebook, like most social media, is designed around individuals — sharing is about individuals creating groups centering on themselves.  Wikis are just the opposite — they are designed around groups — individuals are slotted in to an environment which is focussed on achieving group objectives.  Neither of these extreme orientations is appropriate to  labs.  When you think about what makes a scientific research lab tick, it’s the fact that it is designed to facilitate both group and individual objectives.  So what a lab really needs is a collaboration and communication tool that has been designed with both individuals and the group in mind. Enter the ELN!  As noted, ELNs allow for some records to be completely private.  So a PhD student, for example, can have their private space where their experiments are accessible to no one but themselves.  But ELNs also allow for the flexible sharing described above, so records which everyone needs to see, e.g. lab protocols and meeting notes, can be made accessible to everyone, and the records in certain projects can be restricted to a specified set of users, e.g. just to the group of students working on the project and the PI.

3.  ELNs  enable  sharing of a particular kind of information — experimental data — in the same environment as other general information.

ELNs bring another kind of benefit to labs engaged in creating and sharing scientific data that is not supported by the sharing regime in either wikis or Facebook.  This is that they are specifically designed to handle sharing of experimental data, the bread and butter of labs engaged in scientific research.    They do this by making it easy to put structure into the research record.  And with structure comes better organization, more targeted search, and better archiving.  So current and future members of the lab can more easily find and use data which they, and other members of the lab, have entered into the ELN.

So that’s a brief overview of how ELNs facilitate both sharing and privacy, and enable labs and lab members to record and share experimental data.    They are superior to wikis in these respects, and they don’t suffer from the sharing and privacy concerns people have as a result of their experience with Facebook.   That’s not too surprising since ELNs have been specifically designed with labs in mind!

Provenance in electronic lab notebooks

Posted by Rory on August 11th, 2010 @ 7:00 am

In this post I’d like to stimulate some discussion about provenance in electronic lab notebooks, and more generally in documenting biomedical research. This issue is of interest to various groups of people, but they usually don’t talk to each other. I’ll begin with observations on the issue from three people. One is a biochemist who is also a leading commentator on documenting and communicating about biomedical research, the second is a thoughtful scientist who works in a lab and is constantly looking for ways to get better organized in capturing her research, and the third is an informatician working on a project to bring the benefits of databases, including provenance, to wikis, with a particular focus on biomedical research. Perhaps this post will stimulate some cross fertilization of ideas that otherwise might not take place.

The first person is Cameron Neylon.  Cameron has written a lot about different aspects of provenance in research, and helped organize a workshop on the issue in April where he delivered a presentation called In your worst nightmare:  how experimental scientists are doing provenance for themselves.  For the purposes of this discussion I’m going to focus on some comments Cameron made recently in a discussion started by Jonathan Eisen about possible electronic lab notebook systems.  Commenting on versioning and provenance, Cameron said,

“. . .versioning systems (generally) fail to provide a good way of capturing or thinking about the process that converts one thing to another. So I think the provenance problem or the process problem is the more interesting one.”

The second person is Kim Martin, who works at the Division of Pathway Medicine at Edinburgh University.  Kim has a strong interest in organizing her research and communicating with colleagues in an efficient way. Like Cameron, she feels that a simple audit trail showing all past versions of a particular record provides only a very limited perspective on the research process.

Kim has developed the idea of a journal view or ‘journalling’ in an electronic lab notebook as a way of being able to look back at the process of her work during a particular period of time.   To do this she wants to be able to very easily create a snapshot of everything she was doing on a particular day.  Here is Kim’s sketch of how such a ‘journal view’ might look:

Kim’s concept is that the electronic lab notebook would, through automatic linking, support the creation with a single click of a ‘journal view’ of research and related activity undertaken on any given day. One of Kim’s key objectives is to gain insights into the process of research which may have been undertaken some time ago, using the view as a mnemonic device. I think she shares this objective with Cameron; it would be interesting to get Cameron’s views on this.
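One way to picture the journal view is as a filter over a timestamped activity log. The sketch below is only an illustration of that idea, not Kim's design: the `Activity` structure, the `journal_view` function and the sample entries are all made up for the example.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Activity:
    """One timestamped event in the notebook (hypothetical structure)."""
    when: datetime
    record: str
    action: str  # e.g. "created", "edited", "linked"


def journal_view(log, day):
    """Return everything that happened on `day`, in chronological order."""
    return sorted((a for a in log if a.when.date() == day), key=lambda a: a.when)


# Sample log, deliberately out of order.
log = [
    Activity(datetime(2010, 8, 10, 14, 5), "Gel image 12", "created"),
    Activity(datetime(2010, 8, 10, 9, 30), "PCR run 7", "edited"),
    Activity(datetime(2010, 8, 9, 16, 0), "Primer order", "created"),
]

for a in journal_view(log, date(2010, 8, 10)):
    print(a.when.strftime("%H:%M"), a.action, a.record)
```

If the notebook logs activity automatically, the "single click" amounts to choosing a date.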

The third person is James Cheney, at the Laboratory for Foundations of Computer Science, Edinburgh University.  I met James when we both spoke at the Biomedical-data day held at Edinburgh University in June. James gave a presentation called Databases + Wikis = Curated Databases.  Among the core areas of expertise of James’ group, which is led by Peter Buneman, is provenance for database queries and updates. They are working on a project aimed at bringing the benefits of databases, including the ability to deploy more sophisticated provenance, to wikis.  The project involves developing a “database wiki” which includes support for provenance and user queries about provenance, including the following planned features:

  • Basic Provenance: Record basic information about changes (userids of logged-in users, IP addresses of unknown users).
  • [DONE 0.2] Copy-Paste Provenance: Record provenance links relating data in consecutive versions of the tree.
  • Provide the ability to import data from other sources (including other DatabaseWikis) while automatically recording source information.
  • Query provenance: Propagate provenance along with queries embedded in pages, to support user queries about provenance.
  • Bulk update provenance: Provide the ability to rearrange data within DBWiki pages or data using bulk updates while automatically recording provenance for these transformations.
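For readers who have not met these terms, the first two features in the list can be sketched roughly as follows. This is a toy illustration in Python and makes no claim about how the database wiki project implements them; the `Store` class, the `ProvenanceEntry` schema and the paths are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceEntry:
    """Who changed what, and where the data came from (hypothetical schema)."""
    version: int
    user: str
    action: str                    # "insert" or "copy"
    target: str                    # path of the node that was written
    source: Optional[str] = None   # path the data was copied from, if any


class Store:
    def __init__(self):
        self.data = {}
        self.log = []

    def insert(self, user, path, value):
        # Basic provenance: record the user and version of every change.
        self.data[path] = value
        self.log.append(ProvenanceEntry(len(self.log) + 1, user, "insert", path))

    def copy(self, user, src, dst):
        # Copy-paste provenance: also record a link back to the source path.
        self.data[dst] = self.data[src]
        self.log.append(ProvenanceEntry(len(self.log) + 1, user, "copy", dst, src))

    def origin(self, path, upto=None):
        """Follow copy links back to the path where the value was first entered."""
        entries = self.log if upto is None else self.log[:upto]
        for i in range(len(entries) - 1, -1, -1):
            entry = entries[i]
            if entry.target == path:
                if entry.action == "copy":
                    # Only look at changes made before this copy happened.
                    return self.origin(entry.source, upto=i)
                return path
        return None


store = Store()
store.insert("kim", "/samples/s1/conc", "2.5 mg/ml")
store.copy("james", "/samples/s1/conc", "/experiments/e1/input")
store.copy("kim", "/experiments/e1/input", "/reports/r1/conc")
print(store.origin("/reports/r1/conc"))
```

A provenance query such as "where did this figure in the report come from?" then becomes a walk back through the recorded links.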

With that background, it would be great to hear more from Cameron, Kim, James and others about:

  1. The nature and details of  the research  ‘process’ that needs to be captured.
  2. Reactions to Kim’s journalling idea — general reactions and also views on whether it provides a good (or at least a useful) angle on the research process, and how it might be modified to capture other aspects of the research process.
  3. Reactions to James’ planned provenance features — e.g. are these features likely to be useful to biomedical researchers, what other kinds of provenance would be useful in capturing the research process?
  4. Other thoughts on process and provenance in biomedical research stimulated by the above.

5 Things PIs want in an electronic lab notebook — other suggestions?

Posted by Rory on July 28th, 2010 @ 7:00 am

What PIs want in an electronic lab notebook is often different from what postdocs and graduate students want because PIs are looking for a tool for recording the entire lab’s work, rather than an individual note taking tool.  I looked around the web at recent discussions of what PIs are looking for in an ELN, and identified five common themes:

  1. Something that’s easy to learn and easy to use in order to ensure (relatively stress free) lab-wide buy in and take up.  Joshua Shaevitz, at Princeton, has a good description of the considerations that went into adopting an ELN, and the adoption process, in his recent  post on My Lab’s Wiki-based Electronic Lab Notebook System.  He says, “Before implementing our wiki system, I setup a mock wiki ELN on my laptop and presented it during a  lab meeting to show everyone the benefits firsthand. I especially wanted to convince them that the new system would not generate extra work, but would instead make their lives easier.”
  2. Something that’s flexible in terms of providing for, on the one hand, common structures for group records and records that need to be accessed by multiple members of the group, and, on the other hand, scope for individuals to ‘do their own thing’ in terms of both research style and having their own private space.  Joshua Shaevitz again: “I didn’t want to impose too much structure on each lab member, as I think notebook style is very personal thing. But, I also wanted to ensure that the results would be compatible with features such as search and would work well with our archiving strategies.”
  3. Something that facilitates integrated handling of  experimental data (i.e. the lab notebook function) in the same environment as other information the lab deals with, e.g. protocols, meeting notes, etc. Alex Swarbrick at the Garvan Institute: we use our electronic lab notebook “to compile the diverse collections of data that we generate as biologists, such as images and spreadsheets, and to take minutes of meetings.”
  4. Related to the previous point, something that provides the capacity to manage physical inventory as well as data in electronic form, and the ability to link the two together.  This point is brought out by Cameron Neylon in a thread accessible in a great recent discussion started by Jonathan Eisen at U.C. Davis, Possible electronic lab notebook systems – update.  In discussing what kinds of data a system needs to be able to handle, Cameron says, “generating, storing, analysing and publishing research objects, explicitly including samples and other physical objects.”  And Alex Swarbrick again: “the ability to link records, reagents and experiments. For example, to connect an experimental mouse with the tube containing its tissues in the freezer, to the 6 different experiments (conducted over a year) that analysed those tissues in different ways. Managing this kind of ‘metadata’ is absolutely essential to our work.”
  5. Something that can “help to deal with information and data overload (sorting and filtering)” — a scientist interviewed in a recent study of the research practices of seven life sciences research labs Patterns of information use and exchange:  case studies of researchers in the life sciences.
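The linking described in point 4 (mouse to tube to experiments) is essentially a graph of records. Here is a minimal sketch of that idea in Python; the record identifiers and the `link` and `connected` helpers are invented for illustration and do not describe any particular product.

```python
from collections import defaultdict, deque

links = defaultdict(set)  # undirected links between record ids


def link(a, b):
    """Connect two records in both directions."""
    links[a].add(b)
    links[b].add(a)


def connected(start):
    """Return every record reachable from `start` by following links."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in links[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


# Hypothetical records: one mouse, one freezer tube, two experiments.
link("mouse:M42", "tube:F3-017")
link("tube:F3-017", "experiment:histology-01")
link("tube:F3-017", "experiment:qpcr-07")

print(sorted(connected("mouse:M42")))
```

Because the links run both ways, starting from an experiment leads back to the tube and the mouse just as easily.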

How does this list sound?  Is it an accurate reflection of what others want in an ELN? Is it comprehensive?  Are key requirements missing?  Comments welcome!

What is an electronic lab notebook III: the benefits of structure

Posted by Rory on July 21st, 2010 @ 8:43 pm

The last post and the one before that looked at different views on who electronic lab notebooks are for — individuals or the lab — and how wikis measure up as environments that  enable lab members to enter and share experimental data.  Notwithstanding their attraction as convenient online tools for sharing general information, wikis lack structure, and it is primarily this which has kept even labs that use wikis wedded to the paper lab notebook for documenting experiments.

In this post we’ll look at why the ability to add structure to research data  is the key enabler permitting the transition from paper lab notebooks to electronic lab notebooks.

Paper lab notebooks support as much structure as you like. You can create sections, paste copies of images, make notes in the margin, draw diagrams; the only limit to adding structure to a paper lab notebook is the scientist’s imagination. Unlike a wiki, an electronic lab notebook allows you to replicate the structure that you put into a paper lab notebook. Why? Because an electronic lab notebook allows the creation of records with different kinds of fields. This supports structuring your research data in two ways. First, the different types of fields support entry of information in differing ways, e.g. by date or time, by entry of text, by number, with radio buttons signalling a series of mutually exclusive options, etc. Second, different classes of records can be put together with different combinations of various kinds of fields, so creating types of records that are appropriate to different aspects of research, e.g. a ChIP experiment, a freezer, a particular protocol, or an antibody. This is in stark contrast to the wiki, which has only one type of record, the wiki page, and an undifferentiated one at that, with no support for separation into different fields.
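As a rough illustration of "records with different kinds of fields", here is a short Python sketch. The record types, field names and sample values are hypothetical and chosen only to mirror the examples in the paragraph above; real ELNs define record types through their own interfaces.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Clonality(Enum):
    """Mutually exclusive options, the analogue of a radio-button field."""
    MONOCLONAL = "monoclonal"
    POLYCLONAL = "polyclonal"


@dataclass
class Antibody:
    name: str              # text field
    clonality: Clonality   # radio-button style field
    dilution: int          # number field, e.g. 1000 for 1:1000


@dataclass
class ChIPExperiment:
    performed_on: date     # date field
    antibody: Antibody     # link to a record of another type
    notes: str = ""        # free text


ab = Antibody("anti-H3K4me3", Clonality.POLYCLONAL, 1000)
experiments = [
    ChIPExperiment(date(2010, 7, 1), ab, "low yield"),
    ChIPExperiment(date(2010, 7, 15), ab),
]

# Because fields are typed, a search can target one field instead of raw page text.
second_half_of_july = [e for e in experiments if e.performed_on >= date(2010, 7, 10)]
print(len(second_half_of_july))
```

A wiki page holding the same information would be a single block of text, so a date-range query like the last one would have to rely on parsing prose.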

The benefits of this structure extend further to the other  things you use in your research like images and spreadsheets.  Like wikis, electronic lab notebooks have the advantage over paper lab notebooks of being able to make links to images and spreadsheets, which can also be inserted into the electronic repository — wiki or electronic lab notebook.  But electronic lab notebooks offer superior structuring capabilities in this respect too, because with an electronic lab notebook, unlike a wiki, you can associate a spreadsheet, image or other electronic item with a particular field of a particular kind within a record.

Making use of an electronic lab notebook’s ability to create records with different kinds of fields allows you to put structure into the record of your research in an online environment, much as you did with a paper lab notebook. At the same time you gain the benefits of associations between bits of information which can only be made in an online environment, so electronic lab notebooks actually take the structuring of research data to a new and higher level. It is this element, the ability to add structure to research data, which explains why electronic lab notebooks, and not wikis, provide the best platform for labs wishing to move from paper to electronic recording and management of their research data. This is the unstated driver that lies behind wikipedia’s definition of electronic lab notebook as “a software program designed to replace paper lab notebooks”. And so, I would revise that definition and say that an electronic lab notebook is an online environment that provides a sufficient capability for structuring research data to enable scientists to document and share their research data in that environment without the need to also resort to a paper lab notebook.

What is an electronic lab notebook II — how do wikis measure up?

Posted by Rory on July 14th, 2010 @ 7:00 am

In the last post I poked a bit of fun at the wikipedia definition of  electronic lab notebook — “a software program designed to replace paper lab notebooks”.  Since that could mean just about anything, beauty is in the eye of the beholder.  The big dividing line is between people — like postdocs and graduate students — looking for a note taking tool for themselves, and others — like PIs — looking for a collaborative research tool for the lab.

This time I’m going to take a look at wikis — how do they measure up to the challenge faced by PIs looking for a collaborative research tool:  organizing and keeping track of a wide range of types of research data generated by lab members, present, past and future?   The first point to make is that there are all sorts of wikis, with varying degrees of sophistication, power and capabilities.    I’m going to use the most developed wikis as a point of comparison here — Confluence and PBWorks are good examples — wikis with a fully developed feature set.

As a recent study looking in depth at the work practices of seven life sciences research labs pointed out, a growing number of labs have turned to wikis as convenient environments for storing and sharing general information like meeting notes and protocols.  Wikis have a number of attractions to labs, in that they are easy to learn and use, online, and provide good support for sharing and collaboration.  In addition, the more sophisticated wikis have integrated messaging systems and some, like PBWorks, even have voice conferencing capabilities.

At the top end, then, wikis are becoming fully fledged knowledge management tools.  But features like voice conferencing are aimed at businesses, not labs.  For labs the key issue is managing their data. The study notes that notwithstanding the trend towards organizing general information in wikis, all the labs studied still maintain paper lab notebooks.  Paper lab notebooks stand out as an island of tradition in the midst of a growing ocean of online information sharing.    An island perhaps but a pretty big island, Australia rather than Fiji if you will, because paper lab notebooks are the repositories for the most important information labs deal with, their research data.

On this evidence wikis are falling short as a software program that replaces paper lab notebooks, and hence are not functioning as electronic lab notebooks per the wikipedia definition. Why are labs staying with paper lab notebooks even as they adopt wikis to share information other than research data? Inertia no doubt is a big part of the reason. But the other big barrier to adoption of tools in labs (they have to be easy to learn and easy to use) is probably less of a factor. Wikis are coming into general use and it’s not the wiki per se that is being resisted, it’s the use of the wiki specifically as a place for entering and sharing research data.

Here’s a hypothesis:  the reason labs are sticking with paper lab notebooks for dealing with experimental data and not moving their experimental data into wikis along with general information like meeting notes and protocols is that wikis are unable to provide structure for the data.  With a wiki all you get is the wiki page.  It has no more support for structure than a Word document, and even less structure than a spreadsheet, without  doubt the most popular electronic repository for experimental research data.  Next time I will look in more detail at how electronic lab notebooks provide support for structuring research data and the benefits this can bring to collaboration and communication in the lab.

What is an electronic lab notebook?

Posted by Rory on July 6th, 2010 @ 9:30 pm

Welcome to the electronic lab notebook blog.  This will be a space for discussing electronic lab notebooks from every angle:  what benefits do they bring? how do they compare with alternatives? what kinds of features do they have and should they have? what issues do people face in using them?  how can you get the most out of them?

In this first post I’m going to start at the beginning:  what is an electronic lab notebook?

Wikipedia defines electronic lab notebook as “a software program designed to replace paper lab notebooks”. That could be just about anything; beauty is in the eye of the beholder! I’ll take a look in a second at who the relevant beholders are and what each of them thinks, but taking wikipedia as the starting point it’s fair to say first that all of them are looking to move away from this:

Postdocs and graduate students

So who are the relevant beholders?  They can be divided into two categories.  The first is people looking for an electronic note taking device for themselves.  They tend to be interested in convenience and simplicity.  Something which is easy to use and also helps them get organized.     But many postdocs and grad students want something which also provides  support for research.  Here is one description of the ‘dream app’ over on an Apple forum about electronic lab notebooks:

  • easy copy-pasting/drag ‘n drop
  • ability to re-open the files
  • metadata and search (tags, keywords, …)
  • possibility to link to older notes and graphs
  • store PDF (or TIFF) representation of external files along with the original file: preview files without their originating application
  • automated backup mechanism
  • encryption on disk

The thread following that post contains a good discussion of the varieties of things people want in an electronic lab notebook for their own use, examples of what they have tried, what they like and dislike, and the limitations of the available tools.

PIs

The second category of people looking for an electronic lab notebook want a collaborative tool rather than one aimed at individuals. They tend to be PIs, or in some cases others in the lab who’ve been asked by the PI to identify a suitable tool. Like those looking for an individual tool, PIs want something which is simple to use and easy to learn. But beyond that their needs diverge from those of individual scientists. Professor Mike Shipston of the University of Edinburgh provides a good summary of the kinds of challenges that drive PIs to look at adopting an electronic lab notebook:

“We generate a wide variety of types of data sets, for example data from molecular analysis, quantitative analysis, for example quantitative RTPCR, gene cloning, through to electrophysiological analysis, for example from confocal images and total internal reflective microscopy right up to behavioral assays in animals.  So it’s really about coordinating those types of data sets that fit together, keeping them contained within projects, because the data sets are derived from different people within the lab.  Also we have a very big extended network both in the UK and across Europe and the US.  It’s about keeping that information together. We have a large number of people coming in and out of the lab, the challenge is keeping track of that data and integrating it in with data from existing projects.”

Well that’s it for this first post — I realize I haven’t yet provided an answer to the question of what an electronic lab notebook is!  The next post is going to look at the tool that has been most widely adopted by labs looking for a collaborative tool, wikis.  I’ll discuss the strengths and limitations of wikis and whether they do – or should – qualify as electronic lab notebooks.  And don’t worry, I promise to come back with a specific answer to the question of what an electronic lab notebook is.