Keeping Up with the Scientific Literature using Twitterbots: The FlyPapers Experiment

A year ago I created FlyPapers, a simple “twitterbot” for staying on top of the Drosophila literature, which tweets links to new abstracts in PubMed and preprints in arXiv from a dedicated Twitter account (@fly_papers). While most ‘bots on Twitter post spam or creative nonsense, an increasing number of people are exploring the use of twitterbots for more productive academic purposes. For example, Rod Page set up the @evoldir twitterbot way back in 2009 as an alternative to receiving email posts to the Evoldir mailing list, and likewise Gordon McNickle developed the @EcoLog_L twitterbot for the Ecolog-L mailing list. Similar to FlyPapers, others have established twitterbots for domain-specific literature feeds, such as @BioPapers for Quantitative Biology preprints on arXiv, @EcoEvoJournals for publications in the areas of Ecology & Evolution and @PlantEcologyBot for papers on Plant Ecology. More recently, Alberto Acerbi developed the @CultEvoBot to post links to blogs and new articles on the topic of cultural evolution. (I recommend reading posts by Rod, Gordon and Alberto for further insight into how and why they established these twitterbots.) One year in, I thought I’d summarize my thoughts on the FlyPapers experiment and make good on a promise I made to describe my set-up in case others are interested.

First, a few words on my motivation for creating FlyPapers. I have been receiving a daily update of all papers in the area of Drosophila in one form or another for nearly 10 years. My philosophy is that it is relatively easy to keep up on a daily basis with what is being published, but it’s virtually impossible to catch up when you let the river of information flow for too long. I first started receiving daily email updates from NCBI, which cluttered up my inbox and often got buried. Then I migrated to using RSS on Google Reader, which led to a similar problem of many unread posts accumulating that needed to be marked as “read”. Ultimately, I realized that what I want from a personalized publication feed (a flow of links to articles that can be quickly scanned and clicked, but which requires no other action and can be ignored when I’m busy) was better suited to a Twitter client than an RSS reader. Moreover, in the spirit of “maximizing the value of your keystrokes”, it seemed that a feed that was useful for me might also be useful for others, and that Twitter was the natural medium to try sharing this feed since many scientists are already using Twitter to post links to papers. Thus FlyPapers was born.

Setting up FlyPapers was straightforward and required no specialist know-how. I first created a dedicated Twitter account with a “catchy” name. Next, I created an account with dlvr.it, which takes an RSS/Twitter/email feed as input and routes the output to the FlyPapers Twitter account. I then set up an RSS feed from NCBI based on a search for the term “Drosophila” and added this as a source to the dlvr.it route. Shortly thereafter, I added an RSS feed for preprints in arXiv using the search term “Drosophila” and added this to the same dlvr.it route. (Unfortunately, neither PeerJ Preprints nor bioRxiv currently has the ability to set up custom RSS feeds, and thus they are not included in the FlyPapers stream.) NCBI and arXiv only push new articles once a day, and each article is posted automatically as a distinct tweet for ease of viewing, bookmarking and sharing. The only gotcha I experienced in setting the system up was making sure, when creating the PubMed RSS feed, to set the “number of items displayed” high enough (=100). If the number of articles posted in one RSS update exceeds the limit you set when you create the PubMed RSS feed, PubMed will post a URL to a PubMed query for the entire set of papers as one RSS item, rather than post links to each individual paper. (For Gordon’s take on how he set up his Twitterbots, see this thread.) [UPDATE 25/2/14: Rob Lanfear has posted detailed instructions for setting up a twitterbot using the strategy I describe above at https://github.com/roblanf/phypapers. See his comment below for more information.]
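For anyone curious what the RSS-to-tweet step amounts to, here is a minimal sketch in Python using only the standard library. It is not how dlvr.it works internally (that is a hosted service I have no visibility into), and it does not post anything to Twitter. The sample feed, the example.org links, the function names and the 23-character allowance for a link are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TWEET_LIMIT = 140  # Twitter's character limit at the time of writing
LINK_LENGTH = 23   # assumption: characters reserved for the link

# Hypothetical stand-in for one daily RSS update; a real feed would be
# downloaded from the RSS URL created for the "Drosophila" search.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Drosophila search (sample)</title>
    <item>
      <title>A hypothetical paper about Drosophila wing development</title>
      <link>https://example.org/paper/1</link>
    </item>
    <item>
      <title>Another hypothetical Drosophila paper</title>
      <link>https://example.org/paper/2</link>
    </item>
  </channel>
</rss>"""


def items_from_rss(rss_text):
    """Return (title, link) pairs for every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        (item.findtext("title", default="").strip(),
         item.findtext("link", default="").strip())
        for item in root.iter("item")
    ]


def format_tweet(title, link):
    """Trim the title so that title + space + link fits in one tweet."""
    room = TWEET_LIMIT - LINK_LENGTH - 1  # characters left for the title
    if len(title) > room:
        title = title[: room - 1].rstrip() + "…"
    return f"{title} {link}"


def tweets_from_rss(rss_text):
    """Build one tweet per RSS item, i.e. one article per tweet."""
    return [format_tweet(title, link) for title, link in items_from_rss(rss_text)]


if __name__ == "__main__":
    for tweet in tweets_from_rss(SAMPLE_RSS):
        print(tweet)
```

Each RSS item becomes its own tweet, which is why the “number of items displayed” setting matters: if the feed collapses a large update into a single item, a script like this (or a service like dlvr.it) only ever sees that one item.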

So, has the experiment worked? Personally, I am finding FlyPapers a much more convenient way to stay on top of the Drosophila literature than any previous method I have used. Apparently others are finding this feed useful as well.

One year in, FlyPapers now has 333 followers in 16 countries, which is a far bigger and wider following than I would have ever imagined. Some of the followers are researchers I am familiar with in the Drosophila world, but most are students or post-docs I don’t know, which suggests the feed is finding relevant target audiences via natural processes on Twitter. The account has now posted 3,877 tweets, or ~10-11 tweets per day on average, which gives a rough scale for the amount of research being published annually on Drosophila. Around 10% of tweeted papers are getting retweeted (n=386) or favorited (n=444) by at least one person, and the breadth of topics being favorited/retweeted spans virtually all of Drosophila biology. These facts suggest that developing a twitterbot for domain-specific literature can indeed attract substantial numbers of like-minded individuals, and that automatically tweeting links to articles enables a significant proportion of papers in a field to easily be seen, bookmarked and shared.
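The rates quoted above can be checked with a few lines of Python. The counts are the ones given in the text; the only assumption is that “one year in” means exactly 365 days.

```python
# Figures quoted in the post.
total_tweets = 3877
retweeted = 386
favorited = 444
days = 365  # assumption: "one year in" taken as exactly 365 days

tweets_per_day = total_tweets / days
retweeted_share = retweeted / total_tweets
favorited_share = favorited / total_tweets

print(f"{tweets_per_day:.1f} tweets per day")  # 10.6 tweets per day
print(f"{retweeted_share:.1%} retweeted")      # 10.0% retweeted
print(f"{favorited_share:.1%} favorited")      # 11.5% favorited
```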

Overall, I’m very pleased with the way FlyPapers is developing. I had hoped that one of the outcomes of this experiment would be to help promote Drosophila research, and this appears to be working. I had not expected it would act as a general hub for attracting Drosophila researchers who are active on Twitter, which is a nice surprise. One issue I hadn’t considered a year ago was the potential that ‘bots like FlyPapers might have to “game” Altmetrics scores. Frankly, any metric that would be so easily gamed by a primitive bot like FlyPapers probably has no real intrinsic value. However, it is true that this bot does add +1 to the Twitter count for all Drosophila papers. My thoughts on this are that any attempt to correct the potential influence of ‘bots on Altmetrics scores should not unduly penalize the real human engagement bots can facilitate, so I’d say it is fair to -1 the original FlyPapers tweets in an Altmetrics calculation, but retain the retweets created by humans.
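To make the suggested correction concrete, here is a rough sketch in Python. It is only an illustration of the rule “drop the bot’s own tweet, keep what humans do with it”; it says nothing about how Altmetrics scores are actually computed, and the `Mention` record, the `KNOWN_BOTS` set and the account names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of accounts known to be automated.
KNOWN_BOTS = {"fly_papers"}


@dataclass
class Mention:
    """One tweet mentioning a paper (hypothetical record layout)."""
    account: str
    retweet_of: Optional[str] = None  # account being retweeted, if any


def raw_twitter_count(mentions):
    """Every tweet counts, including the bot's automatic one."""
    return len(mentions)


def adjusted_twitter_count(mentions, bots=KNOWN_BOTS):
    """Skip tweets posted by bots, but keep retweets made by other accounts."""
    return sum(1 for m in mentions if m.account.lower() not in bots)


mentions = [
    Mention("fly_papers"),                       # the bot's automatic tweet
    Mention("alice", retweet_of="fly_papers"),   # a human retweeting the bot
    Mention("bob"),                              # an independent human tweet
]

print(raw_twitter_count(mentions))       # 3
print(adjusted_twitter_count(mentions))  # 2
```

The retweet by a human still counts even though it originated from the bot’s tweet, which is the engagement I would not want a correction to throw away.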

One final consequence of putting all new Drosophila literature onto Twitter that I would not have anticipated is that some tweets have been picked up by other social media outlets, including disease-advocacy accounts that quickly pushed basic research findings out to their target audience.

This final point suggests that there may be wider impacts from having more research articles automatically injected into the Twitter ecosystem. Maybe those pesky twitterbots aren’t always so bad after all.


6 thoughts on “Keeping Up with the Scientific Literature using Twitterbots: The FlyPapers Experiment”

  1. I enjoyed this post a lot. I hadn’t thought much about it before, but Twitter really does seem like an ideal venue to keep up with the literature, and to share each other’s efforts. So, inspired by fly_papers, I set up phy_papers: https://twitter.com/phy_papers

    And in an attempt to make the process of doing this crystal clear for others who are interested, I’ve posted a (very) detailed set of step-by-step instructions here:

    https://github.com/roblanf/phypapers

    If anyone has any tips, tricks, or suggestions, feel free to post issues on the repo.

  2. Personally, I don’t like link shorteners, and it doesn’t look like you can opt out of them in dlvr.it. Being cynical, I’m sure it suits the service provider to a tee to have product placement in every tweet…

    The full URL is an important piece of info when it comes down to 140 characters, at least for those “bots” with various sources (not an issue if it’s always from Pubmed I guess). You want to know what journal it is, and some mightn’t want to read Eurekalert for instance

    I use IFTTT (If This Then That) which links together a whole host of services beyond RSS⇒Twitter. RSS⇒Buffer + Buffer⇒Twitter would give the same outcome as what you’re doing with dlvr.it timing-wise, and wouldn’t require the short URL. » ifttt.com/connect/feed/buffer

    The one complaint I’d make (rant time) is that IFTTT has trouble parsing ScienceDirect’s ridiculously long URLs from time to time, meaning that I occasionally have to copy the title which has been tweeted without a link, but I don’t mind as I’d be reading these pieces anyway.

    For FlyPapers it’d avoid the awkward prefix/suffix solution (that might push you over the 140 char limit in some cases and detract from the info per tweet)

    Having informed the support team I’m hoping they’ll fix the SciDir issue (it’ll probably just be some maximum character limit on the URL)

    From what I’ve discussed with friends in CS, it shouldn’t be too hard to embed either Ruby or Python within a .cgi and have it execute periodically on AWS, I’d like to get around to playing with that soon.

    I think it’s a shame that there’s no open source version of dlvr.it or IFTTT. The special interest patient group you mention is one of the great sides to this and something I’ve seen too. This is in part why I’m considering adding semantic info (whatever you want to call “text-mined” tags, author info, DOI, PMID etc.), as it could enable greater sharing of primary lit., to whatever degree, and just to make the archive of abstracts I’ve got a bit more useful.

    Choosing to link the abstracts from my RSS feed to a Tumblr account as well as a Twitter account means it’s
    [a] backed up
    [b] unrestricted to 117 characters (= 140 minus the link) and
    [c] possible to download in its entirety (as XML or whatever else)

    My site naivelocus.com doesn’t *add* much value to what goes into it, but it does let non-tweeting students, researchers and organisations follow it, and being able to filter past articles by journal is a nice feature. Being able to set precisely which parts of the RSS feed are rendered as which attributes e.g. tag, title, link URL beyond just prefix/suffix is good and might be of interest to you or people thinking of doing similar

    Just thought I’d throw all that out there. Any time invested in organising our information environment pays off in the long run I say.

    To those looking for a Twitter⇒RSS feed solution for FlyPapers, it’s worth noting that the support for these services is *really poor* and that’s why I back my posts up with the Tumblr (which provides consistent RSS). The owner of the Twitter account has to sign into IFTTT, so it’d fall to you, Casey, to set such a parallel feed up; otherwise http://www.rssitfor.me/ will work (but they’re asking for donations, there’s no real guarantee of reliability and it’s not encouraging that others don’t seem to have persisted, see http://socialmediaslant.com/twitter-rss/ for example)

    My feed is just for all things biochemistry-related (/genomics /”life sciences”) that I read, one way of keeping abreast of the scientific lit and not having to read it immediately as it flies past on my phone, until I can get to a VPN.

    As you say, if you stop it’s overwhelming – I did for 2 weeks over exams recently and there were some 5000 papers :-| Obviously it’s not feasible to read them all, and unfiltered RSS⇒Twitter for every journal would just be ugly and useless, so I hand pick things of note (RSS⇒Feedly⇒Twitter), and I think it works pretty nicely

    Louis

    P.S. Dead link alert in twitter.com/caseybergman/status/344379023222263808

  3. How to use twitter to follow the latest scientific papers | The Microsized Mind

  4. Protocolli – Ocasapiens - Blog - Repubblica.it
