Who is Mendeley for? Researchers? Universities?

Posted by Rory on January 18th, 2013 @ 6:00 pm

Elsevier buying Mendeley?

Techcrunch reported yesterday that Elsevier is in talks to purchase Mendeley for around $100M.  If that’s true it’s a significant milestone not only for Mendeley, but also in the evolution of the way researchers access, manage and share information about publications, their own and others’.  If Mendeley is worth $100M (or thereabouts) to Elsevier, it’s worth asking the question: who is Mendeley for?  To do that it’s first necessary to define briefly what Mendeley is.

What Mendeley does now

Mendeley enables researchers to access and manage their and other researchers’ papers and communicate about the papers, for free. Mendeley also provides  an Institutional Edition which, in the words of Robert Knight, a Mendeley programmer, “provides an analytics dashboard that allows institutions to see what journals, papers, tags etc. are popular amongst Mendeley users at their institution and also the impact that their members’ publications are having across the whole Mendeley userbase.” The Techcrunch post says that early customers include the University of Pittsburgh, the University of Western Ontario, the University of Nevada, Reno, the VTT Technical Research Centre of Finland, the Korea Advanced Institute of Science and Technology, and the Agriculture, Forestry and Fisheries Research Council Japan.

Four possible futures for Mendeley

In an interesting and entertaining post made a few hours ago, Roderic Page discusses the Jekyll and Hyde sides of both Elsevier and Mendeley, and speculates about directions in which Mendeley might be developed by Elsevier.  He posits four possibilities.  I’ve listed each of them below and added in square brackets an example of a company with an analogous business model.

  1. Mendeley becomes the iTunes of papers and charges, say, $1 to access a paper [Apple]
  2. Elsevier uses data from Mendeley to construct impact measures, and sells this data [Google]
  3. Elsevier uses the reference lists in Mendeley to evolve into an authoring platform [Wordpress]
  4. Mendeley evolves into a post-publication review forum by encouraging comments on bookmarked papers [Amazon – Kindle]

All four of these possibilities strike me as plausible – fascinating stuff!  But it is the first two that seem more likely to be explored in the short term because they offer the most immediate routes to monetizing the two main assets Mendeley has built up, i.e. (i)  the researchers who use Mendeley and (ii) data about those researchers.  In both cases there is a clearly identifiable set of customers who would be willing to pay for what Elsevier/Mendeley is offering.  And that brings me back to the question I posed in the title of this post:  who is Mendeley for?

Who is Mendeley for?

In the iTunes model the customers are researchers, who presumably would be willing to pay very small amounts to access papers.  Their institutions, or grant funders, could support this in whole or in part by providing researchers with a budget, but the direct customer would still be the researcher.  In this model Mendeley is for researchers.  You could say it is a more logical model than the prevailing one, because the people who actually use papers — researchers, rather than the institutions they work at — are paying for them.

In the impact measures model the customer would be the institution — which would purchase  information about papers and how researchers use them from Elsevier/Mendeley in order to help evaluate researchers. In this model Mendeley is for institutions, and what institutions are paying for is data.

Mendeley: vehicle for a publishing oligopoly, or a new monopoly?

Both these possible models exist in an undeveloped form in Mendeley right now.  The basis for the iTunes model is the ability to access, manage and share papers, and the basis for the research impact model is charging institutions for information about how their researchers carry out this activity.  What’s different — and it is a transformative difference — about the fully developed versions of these models that could come into existence if Elsevier acquires and develops Mendeley is that Elsevier’s plentiful resources could be used to transform Mendeley from a promising experiment into a dominant platform, i.e. to give Elsevier the same kind of oligopolistic position in the online era that it and a few other publishers established in the print era.  In other words, Mendeley becomes a vehicle that Elsevier can use to maintain that position.

If that analysis has any merit, we can expect other publishers to find ways — through acquiring and developing competing platforms like ResearchGate and academia.edu and/or building their own versions of a transformed Mendeley — to ensure that they retain their seat at the oligopolists’ table.  In that case the market could come to look a bit like the mobile space: Apple (Elsevier), followed by Android (a second publisher) and Windows (a third publisher).  If Elsevier is successful and the others don’t find a way to compete, then Elsevier could conceivably establish, at least until the underlying technology changes, a monopolistic position like the one Google occupies in search.  Both researchers and academic institutions might find the control this enabled Elsevier to wield distasteful, or even unacceptable.  It would be interesting to see what steps they might take to limit, regulate or even take over the service or services that Elsevier had developed.

As Roderic Page says, interesting times indeed!


Electronic lab notebooks in universities — interest is growing

Posted by Rory on November 28th, 2012 @ 6:30 pm

Universities are beginning to take a serious interest in electronic lab notebooks (ELNs).  See, for example, the University of Wisconsin’s recently published study of a major ELN pilot conducted earlier this year, and the University of Otago, which is currently conducting an ELN pilot after getting a thumbs up from researchers in a university-wide survey.

Projects like these represent a big change – until recently ELNs were of interest only to individual researchers and labs.  Why is this change taking place now?  I think there are four drivers of institutional interest in ELNs.

First, as evidenced by the Otago survey, there is growing interest in adopting ELNs among researchers.  Without this grass roots demand from users, universities were never going to be very interested in providing the financial or technical support needed to roll ELNs out to a wide base of researchers.

Second, the newest generation of ELNs — which, unlike the first generation designed for Big Pharma in the 1990s, was developed with academic users in mind — costs an order of magnitude less than the first generation and is affordable for academic institutions.

Third, universities are now set up in such a way that they are able to (a) investigate, and (b) purchase, ‘enterprise’ software solutions.  For evidence of this look at the systematic way in which both Wisconsin and Otago are going about their search for an ELN.  This is a relatively recent development.  Five to ten years ago, how many universities had a CIO, or the research support infrastructure or budget needed to run a pilot and then support product rollout?

Fourth, developments in technology have made it possible for developers to produce, at relatively low cost, ELNs which (a) researchers find easy to use and useful, and (b) IT finds easy to deploy and support.




Facebook’s Groups for Schools – could it become a research tool?

Posted by Rory on April 13th, 2012 @ 9:51 am

On Wednesday Facebook announced Groups for Schools, which allows people with an active school “.edu” email address to join groups at their college or university. What caught my eye is that Groups for Schools also allows sharing of files up to 25mb.  The announcement says that file sharing for these groups “will make it even easier to share lecture notes, sports schedules or class assignments”.  So Facebook clearly intends Groups for Schools to have an educational function in addition to a social role.

Dropbox competitor?

John Constine at Techcrunch wrote a post focusing on the file sharing capability of Groups for Schools, and speculated about whether it could eventually evolve into a competitor to Dropbox.

From teaching to research?

I’d like to take that thought a step further and ask: what would it take for Groups for Schools to evolve into a research tool, i.e. could it be useful for collaborative research in addition to the more limited role in teaching and learning implied by Facebook’s suggestion of how it can be used now?

First, many files used in research are larger than 25mb.  So for Groups for Schools to become a viable tool for sharing research, it would have to accommodate uploading of much larger files.  Since that would involve a cost to Facebook, Facebook would have to charge for large files, just as Dropbox and other file sharing services do. That is not impossible, but it would require a major shift in Facebook’s business model.

Second, most academic researchers are concerned, if not obsessed, with security. Facebook starts at a disadvantage here since (a) it does not have a great reputation for security, and (b) it is associated with ‘social’ exchange rather than serious scholarly endeavor, so many if not most scholars are likely to view Facebook as an inappropriate repository for their data.  In addition to overcoming this negative perception through, presumably, lots of marketing, Facebook would have to add security layers to the file sharing, which again would entail additional cost and complexity, and is not in line with Facebook’s current business model.

Third, many scholars need to share their research not just with people in their own institutions, but also with collaborators in other institutions, hence with other .edu addresses.  So to effectively support file sharing for research collaboration Facebook would need to find a way around this restriction.  Technically that is certainly possible, but from a practical point of view it would run directly counter to the whole thrust of Groups for Schools.


Those three hurdles are each significant, and pose challenges Facebook would have to overcome, but they are not insurmountable.  It will be interesting to watch the evolution of Groups for Schools and to track whether and how Facebook develops its latent collaborative research capabilities.


Is Twitter just about flattery?

Posted by Rory on December 12th, 2011 @ 12:30 pm

Social networks as the new vehicle for brown nosing and sucking up

There is an interesting piece in today’s Financial Times by Lucy Kellaway (for those who don’t know, she is the UK’s most famous ‘corporate agony aunt’) called Social Networks Upend Office Etiquette.  In the piece Kellaway laments the rise of ‘brown nosing’ and ‘sucking up’ to bosses on Facebook and LinkedIn.

Twitter:  just a forum for flattery?

She then turns to Twitter:

“Above all, people flatter each other on Twitter. Indeed, this seems to be the main function of the site: it’s a great big, instant electronic support group. You tweet someone, they do it back to you.”

Twitter in the office versus Twitter for groups with particular interests

Kellaway’s observations center on people using Twitter in an office context, i.e. on the horizontal and vertical relationships that drive the office dynamic.  The people I am engaged with on Twitter are primarily academics and/or ‘independent agents’ — software developers, people working in small organizations, etc.  So their Twitter activity centers around mutual interests — scientific topics, data sharing, science and social media, etc. — rather than office dynamics.  How does the charge of ‘Twitter is just for flattery’ stand up in this quite different context?

I am pleased to say that in my experience brown nosing and sucking up are relatively uncommon.  They do exist, but are certainly not the main drivers of conversations.

Simple flattery is more common, but still far from being the driver of most conversations.

Self-congratulation, another feature that Kellaway highlights, is, however, fairly common, and I have to say it’s not something I like.  In fact I find it an unfortunate aspect of the groups whose conversations I follow and participate in.  I would draw a distinction here between reporting something that you have done or participated in, like a presentation or blog post — to me it’s a good thing to let others know about those kinds of things — and blowing your own horn, which is usually not necessary and does not enhance my impression of the person doing the cheerleading.

Another aspect of the conversations among the groups I follow that I find somewhat troubling — because it is limiting — is that most of the people have basically similar views about the core topics that are discussed, and come from similar work backgrounds.  I suppose that is inevitable, because if they didn’t they wouldn’t want to be part of the conversations.  There is a positive aspect of this, too, in precisely the ‘support group’ aspect of Twitter that Kellaway points to.  The Twitter groups I follow usefully serve to support the people who participate in them, who take heart from the fact that there are others with similar interests and largely similar perspectives.

Still, it would be nice if, for example, people who disagree with ‘open science’ participated more in discussions about open science, and if more scientists working in commercial backgrounds, and more people working in companies that provide services and tools for conducting scientific research, engaged with academic scientists about the use of social media in science.

Enriching the debate through widening the group of participants

For the groups I follow, Twitter is already a lot more than just a ‘great big electronic support group’, but diversifying the participant base by adding people from these other backgrounds would certainly enrich the debate.  And that’s important because, beyond providing support for groups with common interests, Twitter also has the potential to become a forum for even more meaningful discussion and debate about the issues that interest those groups.



Informative guide to electronic lab notebooks from the University of Utah

Posted by Rory on November 28th, 2011 @ 6:13 pm

I recently came across a useful guide to electronic lab notebooks on the University of Utah website http://campusguides.lib.utah.edu/content.php?pid=126157&sid=2131670.

Two things interested me about the guide.

Categorization of electronic lab notebooks

The first is the breakdown of electronic lab notebooks into three categories (each entry is further categorized as being focused on biology, chemistry, general use, etc.):

  1. Vendor ELNs (36 entries, which came from Atrium Research)
  2. Open source ELNs (7 entries)
  3. General note taking and management (14 entries)

This strikes me as a sensible breakdown which accurately captures the three different kinds of electronic lab notebooks scientists are likely to come across.

ELNs have two layers

The second is the characterization of ELNs as having two layers.  The first is a ‘Calculations and Data Manipulation’ layer where researchers access and work with data.

The second is an ‘IP protection or people’ layer where researchers collaborate around the research data.

I found this characterization to be quite intriguing.  First you engage with the data, then you engage with people — collaborators, your PI, etc., about the data.  That’s quite a neat way of thinking about ELNs.  And it’s simply put and presented.

Thanks and well done to the University of Utah for making such a useful, and accessible, overview of electronic lab notebooks available!


The Encyclopedia of Original Research and the data/publication problem

Posted by Rory on November 14th, 2011 @ 12:25 pm

The Encyclopedia of Original Research: an exciting concept

The Encyclopedia of Original Research (EOR) is being proposed by Daniel Mietchen as an open GitHub repository of scientific articles that evolve, through a series of reviews, along with the topics they cover. You can read more about this concept/project, and the philosophy behind it, here and here.  And Daniel has just written a post called How would you fund research: An Open Science Perspective, where he explores, among other things, the relationship between EOR and various open funding models, including the SciFund Challenge.

Daniel and I have had a couple of stimulating email exchanges about EOR.  Our exchanges covered various issues, such as how to incentivize people to contribute to and use EOR, and how EOR will distinguish itself from services in adjacent areas like Mendeley. I thought it would be interesting to build on those exchanges with a post focusing on a particular issue EOR will have to face, namely the interface between ‘evolving publications’ in EOR and the data that lies behind the publications.

How to deal with data used in the research leading to publications and reviews?

Will the data used in the original publication be presented along with the publication? If so, will this be optional, or a condition of having the publication included in EOR?  If the data is presented, will a certain format be required and therefore necessarily supported?  These issues will also arise in relation to reviews, because in some cases the reviewer may want to present their own data.

Data presentation: mandatory or optional?

Behind these simple questions lie a couple of difficult decisions or tradeoffs for EOR. A key objective of EOR is to encourage ongoing discussion of publications, an objective that will be significantly enhanced by making the data used in the underlying research available to readers and reviewers.  But, as we all know and many have pointed out, getting people to provide their data is difficult, for a variety of reasons, probably foremost among them that the data is often (a) scattered among a variety of things like a paper lab notebook, spreadsheets, wikis, etc., and (b) in multiple formats, so that making it available in a single place or repository is not simple or straightforward.  So the first decision or tradeoff is between requiring data to be provided, thereby enhancing the quality of the project, and making it optional, thereby encouraging more people to submit and review publications because it will be easier to do so without having to present the data.

The difficulty of developing an interface for data presentation

My understanding is that EOR is planning to make it possible to present data related to publications, and probably to reviews.  So, EOR will face a secondary, technical issue: how to make that possible for contributors?  I.e. in what format(s) will it be possible to upload data, and what kind of interface will be presented to contributors to enable them to upload data?  Again there is a tradeoff:  a simple interface is easy to use, which should stimulate greater interest in contributing, but if the interface is too simple it won’t be able to accommodate the diverse kinds of data that people will need to include.

This isn’t just EOR’s problem

The issues EOR will face in determining how to interface with data used in research are of course not just EOR’s problem, and they are not issues that EOR will be able to ‘solve’ on its own.  They reflect the broader fact that the interface between data and publication is (a) crucial, but (b) generally ignored both by data collection/sharing apps like electronic notebooks and by publication-focused apps and services.

As I discussed in last week’s post, data is currently collected in such diverse formats and structures — from paper notebooks to spreadsheets to wikis to blogs to electronic lab notebooks to note taking apps like Evernote to databases — that it is not yet possible to develop publication practices that facilitate anything close to convenient and comprehensive replication and verification of results by including data and code along with the publication.

Towards better presentation of data in EOR, and generally in connection with publications

For things to improve, in my view, there needs to be movement from both ends, i.e. from both the ‘publications aggregators’ (i.e. both existing publishers and innovators like EOR) and from the data collection end, i.e. tools, apps and services that scientists use to collect and organize their data (‘data aggregators’).

What publications aggregators can do

For publications aggregators, the first step is to acknowledge the importance of making data available along with publications in a useful and usable form.  EOR, it appears, has taken this first step.  The second step is to develop structures that facilitate easy inclusion of data alongside publications, in preparation for the day when data aggregators have made it easier for scientists to export their data from the data aggregation tools.  Again, EOR seems to be interested in developing along these lines, and I would say that an opportunity exists for EOR to take the lead in this area.

What data aggregators can do

At their end, data aggregators need to make it far easier to export data from their tools and services. It is difficult and in many cases practically impossible to export data from most electronic lab notebooks, and even generic tools like Evernote do not support export very well.  Google is better, and is serving as a model.  Google has a Data Liberation Front team, whose goal is to make it easier to move data in and out of Google products.  Earlier this year Google introduced Google Takeout, a service which lets you take data from multiple Google products at once. Only a few products, including Buzz, are included now, but the plan is to expand Google Takeout to other products going forward.

An offer to collaborate with EOR

In December Axiope will be releasing a new version of our electronic lab notebook and sample management system, eCAT.  With this new version, for the first time, export (other than exporting the XML, which is already possible) will be specifically supported, and it will be possible to export from both the Notebook and the Inventory (i.e. sample management) sides of eCAT.

eCAT Notebook records will be exportable to ODF (Open Document Format, the format used by OpenOffice).  We have chosen that format for two reasons.  The first is that eCAT is platform-agnostic; it runs on Windows, OS X and Linux, and so does ODF.  The second is that ODF supports retention of links and embedded images, so the formatting will be retained after export.

It will be possible to export eCAT Inventory records to CSV.  We have chosen to start with CSV because in our experience scientists like to see sample data in spreadsheets.
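To give a concrete sense of what a flat CSV export like this involves, here is a minimal sketch using Python’s standard library. The record fields (sample_id, name, location, quantity) are illustrative assumptions for the example, not eCAT’s actual export schema:

```python
import csv
import io

# Hypothetical inventory records -- the field names are illustrative,
# not eCAT's actual schema.
samples = [
    {"sample_id": "S-001", "name": "Yeast strain A",
     "location": "Freezer 2, Box 4", "quantity": 12},
    {"sample_id": "S-002", "name": "Plasmid pGAL1",
     "location": "Freezer 1, Box 1", "quantity": 3},
]

def export_inventory_csv(records, fileobj):
    """Write inventory records to CSV with a header row."""
    fieldnames = ["sample_id", "name", "location", "quantity"]
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(records)

buf = io.StringIO()
export_inventory_csv(samples, buf)
print(buf.getvalue())
```

One reason CSV works well here is that the writer automatically quotes fields containing commas, so values like “Freezer 2, Box 4” survive the round trip into a spreadsheet intact.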

We see this initial  export capability as a modest but important starting point for making data from eCAT ‘portable’. We plan a series of future releases with additional kinds of export capabilities.  This means that over the same period that EOR is taking shape, export from eCAT will be developing.  To us, this seems like a great opportunity to explore better publication/data interfaces from both the data aggregation end and the publications aggregation end.  So, here is an open invitation to Daniel and others involved with EOR:

If you keep us advised on the data formats and methods for including data that you think EOR will need, we will do our best to ensure that it’s possible to export data from eCAT in those formats and using those methods.

Reproducibility of data and collaboration: Response to Victoria Stodden with two examples

Posted by Rory on November 9th, 2011 @ 11:24 am

The importance of reproducibility

In a post yesterday, “Disrupt science, but not how you’d think“, Victoria Stodden writes, “I am not necessarily in favor of greater openness during the process of scientific collaboration. I am however necessarily in favor of openness in communication of the discoveries at the time of publication.”  To enable this, she goes on to argue, “we need to establish the routine inclusion of the data and code with publication, such that the results can be conveniently replicated and verified.”

Two examples of making data available using currently available tools

This seems to me to be a very important point, and one that few would dispute.  But how to make reproducibility happen is not so obvious.  As the following two examples make clear, even when the will is there, the tools do not yet exist to make reproducibility convenient and therefore widespread.

Example 1:  Append an entire paper lab notebook to the publication

Gregory I. Lang and David Botstein recently published a paper (A Test of the Coordinated Expression Hypothesis for the Origin and Maintenance of the GAL Cluster in Yeast) to which an entire 101-page lab book containing all the notes, strain construction, methods and raw data that went into producing the paper was appended as supplementary data.  I was so struck by this that I wrote a post about it, pointing out that the authors would have (a) saved themselves a lot of time and (b) made it easier for themselves and others to make use of the data they generated if they had recorded their data electronically.

So how does the inclusion of Greg Lang’s paper notebook stand up to the reproducibility test?  It gets top marks for openness: all the experiments, results, materials used, observations, questions asked, thoughts, etc., the ‘actual scholarship’ in the words of Victoria Stodden’s thesis advisor, are there for all to see.  But, and it is a big but, notwithstanding the openness, this scholarship is actually pretty inaccessible in practical terms.  It’s impossible to search on key terms, there is no linking, internally or with external sources of information, and all the benefits of electronic recording are absent.  It would take a huge effort to plough through the notebook, understand how it all fits together, and pick out the relevant bits, and the magnitude of this effort surely acts as a major barrier to anyone trying to reproduce the research.

Example 2: Document and publish online everything that happens in the lab as it happens

The Roberts lab at the University of Washington focuses on characterizing physiological responses of marine organisms to environmental change.   The lab has adopted a highly innovative way of organizing and communicating its research.  The lab uses a wiki as the home for its research activities and results.  Protocols, the lab calendar, and image and data repositories are housed in the wiki. That in itself is not particularly novel.  What is more interesting is that the wiki is also home to each individual lab member’s online lab notebook.

And this organizational platform is just the starting point for the more radical part of the lab’s innovation, which is the way it presents and communicates its research in real time.  The lab uses Facebook, Tumblr, YouTube, and Flickr to post developments related to the lab’s research and activities.

The lab members also use their Twitter account, @genefish, to push out an auto-feed of all or virtually all notebook entries, blog posts, calendar modifications, photo uploads, etc.  @genefish currently has more than 13,000 tweets!  The core of @genefish is the tweets about the research being documented in each individual’s lab notebook.  Each notebook + the tweets = a step by step account of each lab member’s research as it takes place, and since all the notebooks are included in the lab’s wiki, the lab notebooks + the tweets = a complete record of the collective research in the lab as it takes place.

Like Greg Lang, the Roberts lab gets high marks for openness.  Indeed they have extended the openness to include (a) collaboration between lab members, and (b) open publication of the research as it happens. It’s hard to imagine a more thorough, or well-organized, example of open science than this.  And in terms of adoption of available online resources — Twitter, Facebook, etc., the Roberts lab is on the bleeding edge.

What about reproducibility?  Everything (absolutely everything, it seems!) is there for others to see and access electronically, and there is a record of the process as well as of the results.  These are big plusses in terms of facilitating reproducibility.  There are, however, also some negatives, relating to information overload. There is so much information there that it could be difficult for someone wanting to reproduce the research to zero in on the important and relevant bits.  Another aspect of this is the use of so many different platforms to capture and communicate information — the wiki, Twitter, Facebook, etc.  Although these are all electronic, searching for a term or a key component of the research would probably be even more difficult here than with the paper lab book.

Even with the will, the way is not there yet . . . but it’s coming

Victoria Stodden concludes her post by saying   “. . . It is of primary importance to establish publication practices that facilitate the replication and verification of results by including data and code . . .”  The two examples discussed above demonstrate that currently available tools are nowhere close to being convenient enough or sufficiently fit for purpose to support the development of the kinds of publication practices Victoria Stodden would like to see.   On a brighter note, the examples also reflect the widespread interest in making reproducibility possible, and the wide range of experimentation going on.

Surprising similarities between field biology notetaking and micro notetaking

Posted by Rory on October 26th, 2011 @ 4:48 pm

Two radically different kinds of note taking?

This week I came across ButterflyNet, a mobile capture and access system for field biology research developed by the Stanford HCI group.  I discovered ButterflyNet in a footnote to Finders/Keepers: a Longitudinal Study of People Managing Information Scraps in a Micro-note Tool, a paper on a micro-note-taking tool by David Karger and others at MIT and the University of Southampton.  These are two examples of note taking at the extreme poles of the spectrum: notetaking in the field is complex, with entry of heterogeneous kinds of data over extended periods of time, often in difficult conditions, whereas micro-notes are scraps of information — like ‘to do’ items and web urls — that people want to note down as simply and as quickly as possible.  As you’d expect, there are some differences between what note takers want from tools in the two situations, but what I found more interesting was the number of underlying similarities.


Field biology

Field biology is a complex activity.  It is carried out in extended sessions, involves observing multiple things and activities, some of which change while the observation takes place, often involves taking photographs and/or collecting physical samples, and requires recording location information about the subjects of the study.  This results in the collection and production of often large amounts of heterogeneous data.  The collection of data in the field, moreover, is only one step in an extended research process.  After the data has been collected in the field, it will be organized, analyzed and often tested back in the lab, with results subsequently discussed and reported in meetings, presentations, and publications.

The paper presenting the results from a trial of ButterflyNet points out that this results in some key requirements:

  1. Capturing and accessing heterogeneous data
  2. Transforming and integrating this data
  3. Robust tools

Micro notetaking

Micro notetaking refers to taking quick notes or recording scraps of information. The kind of thing you put down on a Post-it note or its many electronic wannabe equivalents.  The Finders/Keepers authors found that people wanting to take quick electronic notes have three requirements for notetaking tools; the tools must be:

  1. Easy to use.
  2. Organized to let people capture arbitrary small bits of information quickly and easily.
  3. Able to keep these information items readily available in visible locations.


So much for the differences in the two note taking environments, and the resulting differences in the requirements of field biologists and micro notetakers.  But underlying these differences are some fundamental similarities.  These are reflected in the following feedback from the biologists involved in trialling ButterflyNet:

  1. The in-field focus — when time is expensive — is on documentation, rather than interface manipulation.
  2. The top advantage of ButterflyNet was that it would help participants to capture and transcribe more data.
  3. Participants would rather save field time, even if it resulted in more work later.

So it turns out that for field biologists, just like micro note takers, the primary concern is to make data entry as quick, simple and easy as possible, even if this results in more work later on.

Consequences for tool design

These conclusions lead to some important, if difficult, consequences for those of us designing tools for biologists.  The main ones, it seems to me, are:

  • Data entry must be ultra simple and easy — a major challenge given the heterogeneity of the data being entered.
  • This inevitably increases the complexity of dealing with and organizing the data after it has been entered, so it’s a double complexity — heterogeneous data with little structure captured at the time of entry.
  • Users still want the ability to structure data for the purpose of  analyzing and testing it in the lab, and subsequently discussing and reporting data and results in meetings, presentations, and publications.
  • The second challenge is thus building support for organizing and dealing with the data in ways which (a) appear simple to the user, but (b) allow the user to organize, manipulate and present the data, and integrate it with other data (from databases, the web, etc.) in a variety of sophisticated ways.

What kind of field biology notetaking tool would this mean in practice?

ButterflyNet was produced in 2005.  It's safe to describe it as (a) a brave attempt using the technology then available, and (b) way ahead of its time.  The main curiosity, from the perspective of 2011, is ButterflyNet's creative attempt to integrate paper note taking with online data capture and retrieval.  This was highly innovative in 2005, but in the age of the iPad it makes ButterflyNet look extremely clunky.  The question that leaps to mind today is:  why not just get rid of the paper and do it all (a) online and (b) on a tablet?

What would that take?  A tool that

  1. (Like ButterflyNet) Supports entry of heterogeneous kinds of data including notes, photos, audio, video, GPS data, and data relating to physical samples.
  2. Allows simple and quick entry of all these kinds of data, on tablets and offline if necessary (hence dispensing with the need for paper).
  3. Facilitates easy association of the different kinds of data; automatically, or by the user in ways the user determines.
  4. Syncs between tablets and PCs or Macs.
  5. Is designed to be collaborative, i.e. has support for controlled sharing of data.
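To make the requirements above concrete, here is a minimal sketch in Python of what such a tool's core data model might look like. All names here (FieldEntry, FieldNotebook, etc.) are hypothetical illustrations, not the API of any real product:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional
import uuid

@dataclass
class FieldEntry:
    """One captured item: a note, photo, audio clip, GPS fix, or sample record."""
    kind: str                      # e.g. "note", "photo", "audio", "sample"
    payload: str                   # text, or a file path for captured media
    lat: Optional[float] = None    # optional GPS coordinates
    lon: Optional[float] = None
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    entry_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    linked_to: list = field(default_factory=list)   # ids of associated entries

class FieldNotebook:
    """Offline-first store: entries accumulate locally and sync to a shared
    server later (requirement 2: quick entry, offline if necessary)."""
    def __init__(self):
        self.entries = {}
        self.pending_sync = []     # entry ids not yet pushed to the server

    def capture(self, kind, payload, lat=None, lon=None):
        # One call, minimal arguments: entry must be as quick as possible.
        e = FieldEntry(kind, payload, lat, lon)
        self.entries[e.entry_id] = e
        self.pending_sync.append(e.entry_id)
        return e

    def associate(self, a, b):
        """Link two entries both ways, e.g. a photo with the sample it shows
        (requirement 3: easy association of different kinds of data)."""
        a.linked_to.append(b.entry_id)
        b.linked_to.append(a.entry_id)

nb = FieldNotebook()
photo = nb.capture("photo", "IMG_0042.jpg", lat=56.33, lon=-2.79)
sample = nb.capture("sample", "leaf specimen #17")
nb.associate(photo, sample)
```

The key design point, following the ButterflyNet findings, is that capture takes one call with almost no required structure; the richer organization happens later, back in the lab.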


If a tool like that sounds interesting to you, watch this space!

Paper lab book versus electronic lab notebook: a real life comparison

Posted by Rory on October 10th, 2011 @ 12:57 pm

An entire paper lab book appended to a PLoS ONE paper!

Gregory I. Lang and David Botstein recently published a paper (A Test of the Coordinated Expression Hypothesis for the Origin and Maintenance of the GAL Cluster in Yeast) which is attracting attention because the supplementary data consists of a 101-page lab book containing all the notes, strain construction, methods and raw data that went into producing the paper.  As Mark Hahnel pointed out in SciCrunch, this is an admirable example of both open science and thorough science.  But as Mark and several people who commented on his post noted, the authors would have (a) saved themselves a lot of time and (b) made it easier for themselves and others to make use of the data they generated if they had recorded their data electronically.

To bring this point home, I thought it would be interesting to illustrate it in a very concrete way, by showing how the data from the paper could look, and be integrated with other parts of the research, like protocols, constructs, and strains, when gathered and presented in an electronic lab notebook.  First, here’s how it looks using the paper approach.

The paper lab book view

The paper itself follows a standard format with the following headings:

  • Abstract
  • Introduction
  • Results
  • Discussion
  • Materials and Methods
  • Supporting Information
  • Acknowledgements
  • Author Contributions
  • References

And the paper lab book also follows a familiar format, with dated entries.  The entries contain the narrative of what was done, thoughts about results that were emerging, calculations, formulas, materials and constructs used in the experiment, etc.  Here's the first entry:


Benefits of using an electronic lab notebook

Headings mirror publication format

Using an electronic lab notebook would have given the authors a big win right at the outset, because the electronic lab notebook can be set up with the same headings that will be used in the paper where the experimental results are presented and discussed, i.e. Abstract, Introduction, Results, Discussion, Materials and Method, Supporting Information, Acknowledgements, Author Contributions, References.

To illustrate this point, here's how the blank template looks in the eCAT electronic lab notebook:

So  the data for each section of the paper is already organized and available for selective export into the document used for publication.

Presentation stays the same

Here’s what the Notebook field looks like with the initial entry filled in – just like it does in the paper notebook.

Flexible set up

There is plenty of flexibility in formatting the form used to document experiments. In this example it's been set up so that the fields exactly follow the sections in the PLoS ONE publication, and one additional field called Notebook has been added at the top. That's for recording the daily narrative, as in the paper notebook.  But the form can be set up with any fields — e.g. as follows, where certain fields — Abstract, Introduction, Acknowledgements, etc. — have been removed to simplify the experiment form.

An entire working environment

With the electronic lab notebook, the notebook is not isolated from the other parts of the research.  In addition to experiments, protocols, constructs, etc. can also be created in the notebook.  For example, here's a protocol:


A big benefit of the electronic lab notebook over paper is the ability to make links.  So for example, in the Materials and Methods section of the experiment, a link could be added to the protocol used in the experiment, as shown below:

Now, whenever it’s necessary to see the protocol used in the experiment, either while the experiment is active or afterwards, e.g. when it is being written up, just click on the link and you are taken to the protocol. Links can also be made to the constructs used in the experiment, and even the strains. It’s also possible to link to a record showing where the strains or other samples used in the experiment are stored in the freezer, and what has been done to them in the experiment.
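The linking idea can be sketched very simply: records of different types (experiments, protocols, strains, freezer locations) connected by named relations. This is a hypothetical illustration in Python, not eCAT's actual data model:

```python
class Record:
    """A notebook record of any type, with typed links to other records."""
    def __init__(self, record_type, title, body=""):
        self.record_type = record_type   # "experiment", "protocol", "strain", ...
        self.title = title
        self.body = body
        self.links = []                  # list of (relation, target Record)

    def link(self, relation, target):
        # e.g. experiment.link("uses_protocol", transformation_protocol)
        self.links.append((relation, target))

    def follow(self, relation):
        """Resolve all links of one kind: the 'click through' described above."""
        return [t for r, t in self.links if r == relation]

# Illustrative records loosely based on the paper under discussion
protocol = Record("protocol", "High-efficiency yeast transformation")
strain = Record("strain", "GAL cluster deletion strain")
expt = Record("experiment", "Coordinated expression test")
expt.link("uses_protocol", protocol)
expt.link("uses_strain", strain)

# Writing up Materials and Methods: jump straight to the protocol used
methods_protocols = expt.follow("uses_protocol")
```

Because the links are typed, the same mechanism covers protocols, constructs, strains and freezer locations without any special cases.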

Sharing with lab members

PI and postdoc

The article contains the following note explaining the division of labor between the two authors:

“Conceived and designed the experiments: GIL DB. Performed the experiments: GIL. Analyzed the data: GIL DB. Wrote the paper: GIL DB.”

So the paper notebook is Greg’s notebook.  And, as the note says, he is the one who performed the experiments. Presumably David, Greg’s PI, commented and provided advice as the experiments were being carried out, but this is not reflected in the notebook narrative.

The electronic lab notebook opens up new possibilities for collaboration and attribution.  For instance, David could have been given permission to view and edit the experiment form, and could have used the Discussion field to make his comments as the experiments were carried out; these would be recorded as part of the experimental record. For example:

David could do this at any time, because eCAT is a webapp and hence available 24/7. So he could review the experimental narrative not just when he’s looking at Greg’s paper notebook, and not just when he is in the lab; he could review and make comments from home or when travelling.  So it’s much easier to communicate about the experiment.  And the process is more transparent because there would be a dated record of David’s comments.

Bigger groups

That’s useful enough when just a postdoc and a PI are involved, as in this case, but it’s even more useful when a larger group is working together on a series of experiments that are going to be written up in a paper.  In that case the group can take full advantage of the electronic lab notebook’s linking and variable permissions capability by, say, having common access to shared resources like protocols and strains, and selective access to the experiment record, e.g. the key contributors have view and edit permission for the experiment record, and others have only view permission.  Links can also be made to the publication in progress. And of course everyone can access all these records in the electronic lab notebook 24/7; the physical and temporal limitations of the paper lab book are left behind.

Sharing with the community

New possibilities are also opened up for sharing with the community.  Whether it is during the course of the experiment —  for open science advocates — or after the experiment has been written up and published, having the experiment documented in the electronic lab notebook makes it much easier to share it with the interested community beyond the lab.  And that sharing has far higher utility, because it allows the authors, and others, to search more effectively.

Search and archiving

It’s just not practical to search for terms, concepts, etc. in a 100+ page paper notebook.  Not even for the author, much less for others like readers of the publication.  Search is useful for those doing the experiment as they carry it out.  E.g. in Greg’s case it would have been useful to search for ‘strain’ + ‘transformation’ + [a particular strain].  But these kinds of searches are also invaluable after the fact, both for the researcher and for those wanting to understand, recreate or build on the original research. So with the electronic lab notebook you get a searchable, shareable archive of the experimental results and the data associated with them.
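The kind of AND-search described here is trivial once the notebook is electronic. A minimal sketch in Python, with made-up entry text and strain names for illustration:

```python
# Hypothetical dated notebook entries (invented for illustration)
entries = {
    "2011-03-02": "Transformation of strain yGIL104 with GAL80 construct",
    "2011-03-05": "Growth curves for strain yGIL104 on galactose",
    "2011-03-09": "Repeat transformation of strain yGIL037, low efficiency",
}

def search(entries, *terms):
    """Return the entries containing ALL the given terms, case-insensitively,
    i.e. the 'strain' + 'transformation' + [a particular strain] query."""
    wanted = [t.lower() for t in terms]
    return {date: text for date, text in entries.items()
            if all(t in text.lower() for t in wanted)}

hits = search(entries, "strain", "transformation", "yGIL104")
```

Real electronic lab notebooks would index the text rather than scan it, but the user-facing capability, impossible with a 100+ page paper book, is exactly this.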


The electronic lab notebook also keeps an automatic record each time a change is made, so using its audit trail it’s possible to see who made what change when.  This is useful in tracing the evolution of the experiment, and also in determining who contributed and in what way.
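An audit trail of this kind is essentially an append-only log recorded alongside every edit. Here is a minimal sketch in Python; the class and field names are hypothetical, and the sample edits are invented:

```python
from datetime import datetime, timezone

class AuditedNotebook:
    """Every edit appends an immutable log entry: who, what, when."""
    def __init__(self):
        self.fields = {}
        self.audit_log = []    # append-only; entries are never edited or deleted

    def set_field(self, user, field_name, new_value):
        old = self.fields.get(field_name)
        self.fields[field_name] = new_value
        self.audit_log.append({
            "user": user,
            "field": field_name,
            "old": old,
            "new": new_value,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def history(self, field_name):
        """Trace the evolution of one field: who changed it, from what, to what."""
        return [e for e in self.audit_log if e["field"] == field_name]

# Invented edits, using the paper's author initials purely as example users
nb = AuditedNotebook()
nb.set_field("GIL", "Results", "Transformation efficiency ~10^4 per ug DNA")
nb.set_field("DB", "Discussion", "Consider testing unlinked GAL genes as a control")
nb.set_field("GIL", "Results", "Corrected: ~10^3 per ug DNA")
```

Because old values are kept alongside new ones, the log answers both questions in the text: how the experiment evolved, and who contributed which change.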

iPad electronic lab notebook: the best of both worlds

The paper lab notebook has one major advantage over an electronic lab notebook on a PC or Mac: the paper lab notebook is portable.  But electronic lab notebooks that work on mobile devices like the iPad are also portable, so that distinction too becomes a thing of the past.   And the iPad opens up new possibilities for individual mobility and sharing with colleagues.  So you get portability, just like the paper lab book, plus all the benefits of the electronic lab notebook noted above.  It’s the best of both worlds!

4 ways the rise of mobile apps for science will change the way research is done

Posted by Rory on July 6th, 2011 @ 4:48 pm

The rise of apps for science

Last week I wrote a post discussing whether apps will overtake websites in scientific research as they now have done for general usage.  I later discovered that the previous day Antony Williams and Sean Ekins had launched Scientific Mobile Applications, a “community resource for developers and users to share information about the various science apps that are available”.  There is a useful presentation on Slideshare discussing the rise of mobile apps for science and providing additional background to the establishment of Scientific Mobile Applications, which is laid out as an editable wiki.

I wanted to mention that before going on, as I promised to do last week, to discuss how the rise of apps in scientific research is likely to change the way research is done.  First, it’s a nice coincidence that my post and the launch of Scientific Mobile Applications happened on almost the same day!   Second, this coincidence reinforces the theme that Antony, Sean, and I are all stressing, namely the coming of a slew of apps for science.  Third, I would like to thank Antony and Sean for creating what has the potential to become a very useful resource.

Now, on to the main topic — how the rise of apps in science is likely to impact the way science is done.  Here are a few changes that are likely to accompany the rise of apps for research — they are pretty fundamental.

Where you find resources

Currently you probably find new ‘resources’ — services, publications, people, tools, materials, etc. — primarily through (a) Google, and (b) websites, e.g. PubMed, Mendeley, discipline-specific forums and databases, and possibly (c) social media like Twitter and Facebook.  As more and more apps for science become available, people will begin to spend more time looking for resources in a new ‘space’, namely app stores.  At first this may not seem like a big change, since it will just feel like an additional ‘resource space’.  But if science goes the way of general usage and apps overtake websites as the place where scientists spend the majority of their time online, then apps will become the primary place people go to find resources.  When they want to find a new resource their first instinct will be to look for it in the apps they are already using, and if they can’t find it there they will look next in the app store.  Apps will become the primary resource metaphor, and things that used to seem fundamental, like Google and websites, will be seen as just other apps.

Where you get information

The primacy of apps will also impact where you find information.  Currently people think of the  web as the window or portal through which you look or search for information generally, not just resources.    But if the interface to your mobile device has replaced your desktop as the screen you see most often in the lab or the office, then you will be looking at something like this:

When that happens, you are likely to organize your thinking about where to find a bit of information around which app will help you in the quickest and most effective way. And, as with resources, if you don’t have an app that is good at finding whatever it is you are looking for, your next step is likely, again, to be an app store where you can find one that is.

How you work

In future your research is likely to center around a series of apps. Already there are an increasing number of apps for consuming information related to research — for reading and for search.  But as data entry on tablets becomes easier, and more complex forms of data entry become possible, apps for producing and manipulating information will also proliferate, and hence you will find that you are actively ‘producing’ on apps as well as just consuming.  For example, you can currently find apps which allow you to read barcodes, and these can be applied in some cases to barcodes on inventory items used in the lab.  But in future you will be able to enter data into an app about particular samples, and that data entry will be quick and convenient.  So, you will be producing data through the app as well as consuming it.

How you share information

Since most of the information you consume and produce will flow through apps, the information you share with others will also flow through apps.  And because many apps limit sharing outside the app, or with people who don’t have the app, this will constrain the ability to share data.

Next week: the implications, good and bad

So those are some of the changes the rise of apps in research is likely to stimulate. These changes have far-reaching implications, some good and some not so good. Next week I’ll take a look at those implications.




