Saturday, November 21, 2009

Health Internet and the guiding principles for standards

John Halamka writes of his experience of the HIT Standards Committee and sets out some guiding principles for creating health IT standards. They all seem important, but the one which caught my eye was:

Leverage the web for transport whenever possible to decrease complexity & the implementers’ learning curve (“health internet”)
This seems notable to me in the context of the possibilities created by Chrome OS.

Google in 2009

Useful review of Google in 2009 from Matt Cutts. Social search is particularly interesting, though it has been down recently.

The review misses the announcement of the Chrome Operating System, which has the potential to transform user experience in the next 3-5 years. Video below.


Monday, October 26, 2009

More library catalogue woes

By chance, more use of a library catalogue this week. This time a friend just starting a PhD wanted to know how to access journals either on paper or electronically. Finding paper holdings was relatively easy, though as a PhD student she hadn't been able to work it out on her own.

But the electronic journals were a shambles. Part of the problem was that SFX didn't seem to do more than take us to journal home pages, even when we'd asked for a specific issue. Integration with free journals and archives was also poor (we wanted an old BMJ article, which I know is freely available by several routes). And, as the library website acknowledged, ejournal access doesn't work very well unless your computer is directly connected to the university network.

And then my friend said she'd been told to avoid Google. So we tried Google and found several reports and articles, and a book, with full text freely available (well, the book was only partly available, but it was available enough for essay writing purposes). I was disappointed that a library system in a top 10 UK university was falling this far behind the reasonable expectations of modern users, and I am wondering what librarians have been doing to improve the tools of their trade for the last 10 years.

Saturday, October 24, 2009

Using a library catalogue

I used a library catalogue last week in a big academic library, and it was the first time in several years that I've done this. It was clearly the latest version of the system. And maybe that was the problem. It looked Google-esque, but when I searched for a person it took me to an authority list, which was just annoying because I wasn't sure which Smith was mine. I'd rather have gone, Amazon style, to some books, and then homed in on my Smith. In the end I needed to get help from a nice librarian, so I felt a bit of an idiot, being a trained librarian.

Thursday, October 22, 2009

Designing URI Sets for the UK Public Sector

The state of public sector URLs is probably no worse than that of private sector URLs, but anyway they're pretty awful -  just a few examples. So the release of Cabinet Office guidance on URI sets is welcome.

The most radical proposal is to use data.gov.uk, for example transport.data.gov.uk or health.data.gov.uk. I hope it happens. But I wonder how this will happen - it will take a huge change in Whitehall at least.
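To make the idea concrete, here is a minimal sketch of minting URIs under sector subdomains of data.gov.uk. The subdomains come from the guidance as described above; the "/id/concept/reference" path layout, the function name and the example values are assumptions for illustration only, so check the guidance itself for the recommended patterns.

```python
from urllib.parse import quote

BASE_DOMAIN = "data.gov.uk"  # domain proposed in the guidance


def sector_uri(sector: str, concept: str, reference: str) -> str:
    """Build a URI under a sector subdomain of data.gov.uk.

    The "/id/<concept>/<reference>" layout is an assumption made for
    this sketch, not a quotation from the guidance.
    """
    # Normalise the host part and percent-encode each path segment.
    host = f"{sector.strip().lower()}.{BASE_DOMAIN}"
    path = "/".join(quote(part.strip(), safe="") for part in (concept, reference))
    return f"http://{host}/id/{path}"


# Hypothetical concepts and references, purely for illustration.
print(sector_uri("transport", "road", "M5"))
print(sector_uri("health", "hospital", "RXX 01"))
```

The point of a helper like this is only that every department would mint identifiers the same way, which is where the Whitehall change would be needed.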

Wednesday, October 21, 2009

Enabling interoperability in healthcare

Interoperability in healthcare is vital (HL7, SNOMED - http://www.e-health-insider.com/comment_and_analysis/501/personal_view:_tim_benson) but it won't be achieved until the non-technical barriers to achieving it are recognised.

Two non-technical barriers are critical:

- commercially it may not make sense for systems suppliers to make it easy for data to move in and out of their systems

- professionally, it is unclear that the needs and interests of different professional groups point towards shared data.

The answer, I believe, lies in patients owning and holding their health record, with each system and professional holding their own record and able (if permitted by the record owner) to read and write to the master record. The challenge is to create a value proposition for patient-held records. It may be something as simple as electronic prescriptions. I like the approach www.keas.com are taking.
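A rough sketch of the access rule described above, in which the patient owns the master record and other systems read or write only with the owner's permission. The class, method and party names are all hypothetical, and a real system would need authentication, audit trails and much more.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    """A patient-owned record; the owner decides who may read or write."""
    owner: str
    entries: list = field(default_factory=list)
    readers: set = field(default_factory=set)
    writers: set = field(default_factory=set)

    def grant(self, granted_by: str, party: str, write: bool = False) -> None:
        # Only the record owner may hand out permissions.
        if granted_by != self.owner:
            raise PermissionError("only the record owner can grant access")
        self.readers.add(party)
        if write:
            self.writers.add(party)

    def write(self, party: str, entry: str) -> None:
        if party != self.owner and party not in self.writers:
            raise PermissionError(f"{party} may not write to this record")
        self.entries.append((party, entry))

    def read(self, party: str) -> list:
        if party != self.owner and party not in self.readers:
            raise PermissionError(f"{party} may not read this record")
        return list(self.entries)


record = MasterRecord(owner="patient")
record.grant("patient", "gp_system", write=True)   # owner permits the GP system
record.write("gp_system", "electronic prescription issued")
print(record.read("patient"))
```

Each supplier's system keeps its own record as before; only the shared master record is gated by the owner's permissions.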

Thursday, October 08, 2009

Medpedia and Wikipedia

I like Medpedia very much, and the Medpedia page on Atrial Fibrillation is better in some respects than the corresponding Wikipedia page. But it's less up to date: for example, there is no reference to http://circ.ahajournals.org/cgi/content/full/115/24/3050, which challenges some of the content of the article. I don't have the time, inclination or expertise to edit the Medpedia page, by the way.

Tuesday, October 06, 2009

Limitations to evidence based medicine

A limitation of the EBM approach is that as questions get more specific the evidence becomes less available. EBM can be quite good at answering broad-brush questions, but these are rarely what clinicians actually need. Here is a recent example, taken from a NICE guideline on rheumatoid arthritis (pdf):

The available data does not answer the clinical question of whether a patient who is not responding to DMARD therapy should go onto other conventional DMARDs or onto a biological drug. There are no head to head trials of these comparators.
This question, about switching to another low cost DMARD or to a high cost biologic, important as it is, doesn't seem to have an answer. The problem in part is that the underlying research agenda regarding the effectiveness of therapeutics is motivated (of course) by commercial concerns, as we showed 15 years ago in an analysis of NSAIDs for osteoarthritis.

Sunday, September 27, 2009

Mendeley growing rapidly; alternative model for repositories

I've argued that Mendeley and ResearchGATE can add some pep to the repository pattern. Here's some more of the same from Open Access News.

Mendeley growing rapidly; alternative model for repositories: "John MacColl, Mendeley scrobbles your papers, HangingTogether, September 24, 2009.

Mendeley is a social web application for academic authors that has been receiving quite a lot of attention recently. Victor Keegan wrote about it in The Guardian last week, likening it to the streaming music service Last.fm:



How does it work? At the basic level, students can “drag and drop” research papers into the site at mendeley.com which automatically extracts data, keywords, cited references, etc, thereby creating a searchable database and saving countless hours of work. That in itself is great, but now the Last.fm bit kicks in, enabling users to collaborate with researchers around the world, whose existence they might not know about until Mendeley’s algorithms find, say, that they are the most-read person in Japan in their niche specialism. You can recommend other people’s papers and see how many people are reading yours, which you can’t do in Nature and Science. ... There are lots of research archives. For the physical (but not biological) sciences there is ArXiv, with more than half a million e-papers free online – but nothing on the potential scale of Mendeley. Around 60,000 people have already signed up and a staggering 4m scientific papers have been uploaded, doubling every 10 weeks. At this rate it will soon overtake the biggest academic databases, which have around 20m papers.



The site has grown fast, aided by significant investment capital from investors associated with Last.fm, Skype and Warner Music Group. ...



If it realises the potential many people are now predicting, the library community is bound to ask why a web application based on an entertainment model should have proved so much more attractive than the painstakingly built repositories we have been holding under the noses of our academic authors over the last several years?



I think there may be a few reasons for this. First, its appeal is intuitive. Put your papers in our service and we will give you lots of webscale data back on how popular they are. The system can show you instantly how your research profile compares with the average researcher in your field. Second, it is instant. The map of research adjusts daily as new papers are added. Want to find out who is the most popular author in your field today? Mendeley can tell you. ... And third, the demands it makes are low compared to the benefits it provides. A range of simple tools allow you to ship your papers into it. ... [Y]ou can scrobble. Scrobbling is the word Last.fm uses to describe the use of a tool that works invisibly in the background to add your music choices to your Last.fm account. ... In Mendeley, the same notion is applied via the 'Watched Folder' facility. With it, you can designate folders on your hard disk that Mendeley will monitor, and from which it will suck new papers as they appear.



By adopting these approaches, Mendeley has grabbed the attention of users because it understands what they like. They like simplicity. ... What do they not like? Tedious rules about copyright (the Mendeley FAQ, perhaps ironically, quotes the E-prints Self-Archiving FAQ to reassure authors about the extent of Open Access tolerance among publishers). They don’t like rigorous requirements for metadata (Mendeley automatically extracts metadata, and asks users to help it make corrections where it gets things wrong). In other words, the requirements libraries often put up front are almost dismissed as non-issues. ...



Comment. To me, the better analogy may be Napster. I don't necessarily mean that pejoratively: both Napster and Mendeley watch a folder on the user's computer and automatically share files in that folder. That takes the effort out of sharing, which means more documents get shared. It also means that metadata will often be incomplete or inaccurate. In addition, since there's less emphasis on copyright compliance, I'd suspect that some authors may share documents in ways that violate their publisher's contract -- more so than traditional repositories. In short, the Mendeley model seems to have some major advantages over traditional repositories, but also some significant shortcomings vis-à-vis traditional repository goals. I think there's a place for both in a healthy scholarly communications ecosystem, with both competition and collaboration.
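The watched-folder idea discussed above can be sketched in a few lines. This is not Mendeley's implementation, only an assumed polling approach: a hypothetical function that reports PDF files it has not seen before in a designated folder.

```python
from pathlib import Path


def find_new_papers(folder: Path, seen: set) -> list:
    """Return PDF files in `folder` that have not been reported before.

    `seen` holds the file names already handled and is updated in place,
    so calling the function repeatedly only reports new arrivals.
    """
    new_files = sorted(p for p in folder.glob("*.pdf") if p.name not in seen)
    seen.update(p.name for p in new_files)
    return new_files
```

A background task could call `find_new_papers` every minute or so and pass each new file to whatever does the metadata extraction and upload, which is exactly where the incomplete metadata and copyright concerns raised in the comment would arise.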



See also our past posts on Mendeley.

"

Saturday, September 26, 2009

Google doesn’t use the keywords meta tag in web search

From Matt Cutts:


Google doesn’t use the keywords meta tag in web search: "

We went ahead and did this post on the official Google webmaster blog to make it super official, but I wanted to echo the point here as well: Google does not use the keywords meta tag in our web search.


To this day, you still see courts mistakenly believe that meta tags occupy a pivotal role in search rankings. We wanted to debunk that misconception, at least as it regards to Google. Google uses over two hundred signals in our web search rankings, but the keywords meta tag is not currently one of them, and I don’t believe it will be.


In addition to the official blog post, we made a video as well:



I hope this clarifies that the keywords meta tag is not something that you need to worry about, or at least not in Google.
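For anyone unsure what the tag in question looks like, here is a small sketch that pulls the keywords meta tag out of a page using Python's standard `html.parser`. The class name, function name and sample page are made up for illustration; this is a way to audit your own pages, not a description of how any search engine works.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collect the comma-separated values of <meta name="keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            self.keywords.extend(k.strip() for k in content.split(",") if k.strip())


def extract_meta_keywords(html: str) -> list:
    parser = MetaKeywordsParser()
    parser.feed(html)
    return parser.keywords


# Hypothetical sample page.
page = '<html><head><meta name="keywords" content="health, informatics, libraries"></head></html>'
print(extract_meta_keywords(page))
```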


"

Wednesday, September 16, 2009

US Government Opens-Up to OpenID and Information Cards

US Government Opens-Up to OpenID and Information Cards: "

Today at the Gov 2.0 Summit in Washington, DC: the Federal Government is announcing they will be implementing OpenID and Info Cards as part of its open government initiative. The looming adoption of these two standards paves the way for citizens to use existing accounts and online identities (such as their Yahoo or Google accounts) to participate in various government web sites. This also means that citizens can customize their experience on government websites without needing to reveal any personally identifiable information – including passwords.


The collaborative effort, which will initially include participation from a variety of digital identity providers, will be phased in as part of a pilot program implemented by the following three agencies:



  • Center for Information Technology (CIT)

  • National Institutes of Health (NIH)

  • U.S. Department of Health and Human Services (HHS)


And the identity providers participating in the pilot include several major players in the tech industry:



  • Yahoo!

  • PayPal

  • Google

  • Equifax

  • AOL

  • VeriSign

  • Acxiom

  • Citi

  • Privo

  • Wave Systems


In order to ensure a fair opportunity for digital identity providers, companies participating in the pilot are being certified under a collaborative framework set up between the Federal Government and both the OpenID Foundation and the Information Card Foundation. An Open Trust Frameworks for Open Government white paper provides more insight and detail about this effort.


There have been a slew of positive reactions to this new facet of the open government initiative. You can check out the various reactions by government directors, identity providers, and digital identity advocates at the Information Cards blog.


Chris Messina, an active and vocal advocate of open technologies perhaps sums it best:



This effort sets in motion a shift in how individuals can interact with the public sector and makes progress on the Obama administration’s promise for a more open, transparent, and participatory government.


And indeed this new effort is indicative of the fact that changes are underway in the government’s adoption of Web 2.0 technologies, as well as its migration towards providing online services that are more than just digital brochures. The implications for developers are profound as well, as the use of open standards signals another step towards enabling a truly programmable government web. It provides hope and potential for citizens to not only participate as stakeholders, but also to contribute to and assist with government efforts that benefit the public at large.


The Information Cards blog has posted the formal press release for the initiative. As we have covered in previous posts, including our recent post on the latest upgrade to The New York Times Congress API, this is an important and popular topic that is increasingly gathering attention. Marshall Kirkpatrick has posted some additional analysis on the news over at ReadWriteWeb.




"

New Knol developments from Matt Cutts

From Matt Cutts:

New Knol developments: "

Google launched Knol about a year ago. The big worry back then was that Google might favor Knol in our search rankings. I stopped around various places on the net to debunk that idea back then, but I think it’s safe to call this idea fully debunked now. As I said six months ago:



Google Knol does not receive any sort of boost or advantage in Google’s rankings. When Knol launched, some people asked questions about this. I dutifully trundled around the web and said that Knol would not receive any special benefits in our scoring/ranking for search. With the benefit of six months’ worth of hindsight, I hope everyone can agree that Knol doesn’t get some special boost or advantage in Google’s rankings.


I think we can call that idea completely debunked now.


In the mean time, the Knol team hasn’t been standing still. In a recent announcement on the Google Blog, the Public Library of Science is starting up a new website on Knol to publish research results about influenza. PLoS Currents: Influenza will be moderated by an expert group of researchers. With H1N1, it’s important to communicate preliminary results, and this new site provides a way to do that.


I wanted to talk about something else cool that I recently saw on Knol. One of Knol’s strengths is making it easier to add knowledge to the web. For example, the web has fewer documents written in languages such as Arabic. One Google Knol project resulted in a ton of informative pages being added to the web in Arabic. They made a video about it:



Getting more useful content onto the web is a good thing, so I’m glad that Knol can help with that.


"

Tuesday, August 25, 2009

Twitter API Adds Location Data - Tweets Get Realtime Geo

Twitter API Adds Location Data - Tweets Get Realtime Geo: "

Twitter co-founder Biz Stone has just announced a new geolocation API that will be available to developers fairly soon (Twitter API profile). The new API, which will likely be rolled out in Twitter clients before being available on the Twitter site, will allow users and developers to add latitude and longitude to tweets, thereby adding a valuable new layer of meta information to tweets.


According to Biz’s post, the new API has various implications for improving how Twitter is used, including the ability for users to connect with other users based on geographic commonality:



For example, with accurate, tweet-level location data you could switch from reading the tweets of accounts you follow to reading tweets from anyone in your neighborhood or city—whether you follow them or not. It’s easy to imagine how this might be interesting at an event like a concert or even something more dramatic like an earthquake. There will likely be many use cases we haven’t even thought of yet which is part of what makes this so exciting.


The API will first be released as a developer preview, and it is likely that we will see existing apps and mashups built with the Twitter API quickly integrate this new feature. We can only guess that we also will see the rapid emergence of a new breed of geolocation enabled apps and mashups that use this new API in unexpected ways.


Documentation on the new API is not yet available. Developers and Twitter users should note that this is an opt-in feature, especially important with respect to privacy. As Biz describes it:



Folks will need to activate this new feature by choice because it will be off by default and the exact location data won’t be stored for an extended period of time. However, if people do opt-in to sharing location on a tweet-by-tweet basis, compelling context will be added to each burst of information.


The news is certainly leading to quite a buzz on various blogs (TechCrunch, Mashable, and Google Maps Mania) as well as on Twitter itself. O’Reilly’s Brady Forrest points out that there are more technical details in the Twitter developer group, including a source example.


We’re excited to hear about this, and we’re curious to see how this new API will fit into the overall geolocation ecosystem, which includes Yahoo!’s Fire Eagle (our Fire Eagle API Profile) and Google Latitude.
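Since documentation for the API is not yet available, the following is only a sketch of the "tweets from anyone in your neighborhood" idea, assuming each tweet carries a latitude and longitude. `GeoTweet` and `tweets_near` are hypothetical names, not part of the Twitter API; the distance calculation is the standard haversine formula.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class GeoTweet:
    user: str
    text: str
    lat: float
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def tweets_near(tweets, lat, lon, radius_km):
    """Keep only tweets whose location lies within radius_km of (lat, lon)."""
    return [t for t in tweets if distance_km(t.lat, t.lon, lat, lon) <= radius_km]
```

Calling `tweets_near(timeline, 51.5074, -0.1278, 5)` would keep only tweets located within 5 km of central London, whether or not you follow their authors.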






"

Thursday, August 13, 2009

Two projects for civic-minded student programmers

Two projects for civic-minded student programmers: "



One of the key findings of the elmcity project, so far, is that there’s a lot of calendar information online, but very little in machine-readable form. Transforming this implicit data about public events into explicit data is an important challenge. I’ve been invited to define the problem, for students who may want to tackle it as a school project. Here are the two major aspects I’ve identified.


A general scraper for calendar-like web pages



There are zillions of calendar-like web pages, like this one for Harlow’s Pub in Peterborough, NH. These ideally ought to be maintained using calendar programs that publish machine-readable iCalendar feeds which are also transformed and styled to create human-readable web pages. But that doesn’t (yet) commonly happen.



These web pages are, however, often amenable to scraping. And for a while, elmcity curators were making very effective use of FuseCal (1, 2, 3) to transform these kinds of pages into iCalendar feeds.



When that service shut down, I retained a list of the pages that elmcity curators were successfully transforming into iCalendar feeds using FuseCal. These are test cases for an HTML-to-iCalendar service. Anyone who’s handy with scraping libraries like Beautiful Soup can solve these individually. The challenge here is to create, by abstraction and generalization, an engine that can handle a significant swath of these cases.
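The scraping half of such an engine depends on each page, but the output half is uniform. As a minimal sketch, assuming a scraper has already produced (uid, start, summary) tuples, this renders them as an iCalendar feed. The function names, the PRODID string and the sample event are hypothetical, and line folding for long lines is left out.

```python
from datetime import datetime, timezone


def escape_text(value: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def to_icalendar(events, now=None):
    """Render (uid, start, summary) tuples as a minimal iCalendar feed.

    `start` is a naive datetime, written out as floating local time.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0",
             "PRODID:-//example//scraper sketch//EN"]
    for uid, start, summary in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{uid}",
            f"DTSTAMP:{stamp}",
            f"DTSTART:{start.strftime('%Y%m%dT%H%M%S')}",
            f"SUMMARY:{escape_text(summary)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"


# Hypothetical scraped event, for illustration only.
print(to_icalendar([("1@example.org", datetime(2009, 8, 20, 20, 0), "Live music, example venue")]))
```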


A hybrid system for finding implicit recurring events and making them explicit



Lots of implicit calendar data online doesn’t even pretend to be calendar-like, and cannot be harvested using a scraper. Finding one-off events in this category is out of scope for my project. But finding recurring events seems promising. The singular effort required to publish one of these will pay ongoing dividends.



It’s helpful that the language people use to describe these events (“every Tuesday”, “third Saturday of every month”) is distinctive. To begin exploring this domain, I wrote a specialized search robot that looks for these patterns, in conjunction with names of places. Its output is available for all the cities and towns participating in the elmcity project. For example, this page is the output for Keene, NH. It includes more than 2000 links to web pages (or, quite often, PDF files), some fraction of which represent recurring events.
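As an illustration of how distinctive these phrases are, here is a small sketch that finds them with regular expressions and maps each to an iCalendar recurrence rule. It is not the search robot described above; the patterns, names and coverage are assumptions, and real text will contain many variants this misses.

```python
import re

DAYS = {"monday": "MO", "tuesday": "TU", "wednesday": "WE", "thursday": "TH",
        "friday": "FR", "saturday": "SA", "sunday": "SU"}
ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "last": -1}

DAY = "|".join(DAYS)
ORD = "|".join(ORDINALS)

# e.g. "third Saturday of every month", "First Monday of each month"
MONTHLY = re.compile(
    rf"\b({ORD})\s+({DAY})\s+of\s+(?:every|each|the)\s+month\b", re.IGNORECASE)
# e.g. "every Tuesday"
WEEKLY = re.compile(rf"\bevery\s+({DAY})\b", re.IGNORECASE)


def find_recurrences(text):
    """Return (matched phrase, RRULE value) pairs found in free text."""
    found = []
    for m in MONTHLY.finditer(text):
        nth = ORDINALS[m.group(1).lower()]
        day = DAYS[m.group(2).lower()]
        found.append((m.group(0), f"FREQ=MONTHLY;BYDAY={nth}{day}"))
    for m in WEEKLY.finditer(text):
        day = DAYS[m.group(1).lower()]
        found.append((m.group(0), f"FREQ=WEEKLY;BYDAY={day}"))
    return found


print(find_recurrences("Meets every Tuesday, and on the third Saturday of every month"))
```

A match only says that a page mentions a recurring pattern; it says nothing about whether the event still runs.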



In Finding and connecting social capital I showed a couple of cases where the pages found this way did, in fact, represent recurring events that could be added to an iCalendar feed.



To a computer scientist this looks like a problem that you might solve using a natural language parser. And I think it is partly that, but only partly. Let’s look at another example:






At first glance, this looks hopeful:



First Monday of each month: Dads Group, 105 Castle Street, Keene NH



But the real world is almost always messier than that. For starters, that image comes from the Monadnock Men’s Resource Center’s Fall 2004 newsletter. So before I add this to a calendar, I’ll want to confirm the information. The newsletter is hosted at the MMRC site. Investigation yields these observations:




  • The most recent issue of the newsletter was Winter ‘06




  • The last-modified date of the MMRC home page is September 2008




  • As of that date, the Dads Group still seems to have been active, under a slightly different name: Parent Outreach Project, DadTime Program, 355-3082




  • There’s no email address, only a phone number.





So I called the number, left a message, and will soon know the current status.



What kind of software-based system can help us scale this gnarly process? There is an algorithmic solution, surely, but it will need to operate in a hybrid environment. The initial search-driven discovery of candidate events can be done by an automated parser tuned for this domain. But the verification of candidates will need to be done by human volunteers, assisted by software that helps them:




  • Divide long lists of candidates into smaller batches




  • Work in parallel on those batches




  • Evaluate the age and provenance of candidates




  • Verify or disqualify candidates based on discoverable evidence, if possible




  • Otherwise, find appropriate email addresses (preferably) or phone numbers, and manage the back-and-forth communication required to verify or disqualify a candidate




  • Refer event sponsors to a calendar publishing how-to, and invite them to create data feeds that can reliably syndicate
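The first two items in the list above, batching and parallel work, are the easiest to sketch. The function names and the round-robin assignment are assumptions for illustration; the harder items (provenance, verification, correspondence) need human judgement and are not covered.

```python
from itertools import islice


def batches(candidates, size):
    """Yield successive lists of at most `size` candidates."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(candidates)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def assign(candidates, volunteers, size):
    """Deal batches out to volunteers round-robin so they can work in parallel."""
    plan = {v: [] for v in volunteers}
    for i, batch in enumerate(batches(candidates, size)):
        plan[volunteers[i % len(volunteers)]].append(batch)
    return plan
```

With roughly 2000 candidate links for one town, `assign(links, volunteers, 25)` would give each volunteer a series of short, manageable lists.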





Students endowed with the geek gene are likely to gravitate toward the first problem because it’s cleaner. But I hope I can also attract interest in the second problem. We really need people who can hack that kind of real-world messiness.


"

Dismantling IT and urban flu myths

Testing the new Google Reader "share this" feature with this blog post by Hanna from the Informaticist.


Dismantling IT and urban flu myths: "

According to Pulse (Tories unveil plans to ‘dismantle’ NHS IT infrastructure) a new Conservative government would dismantle the national programme for IT in favour of a lovely local version instead, which is a bit similar to arguments surrounding local versus centralised medicine, albeit this has some degree of logic viz quality; somehow IT projects do get stereotyped as unfathomably unwieldy whatever original size/budget. The BMA welcomed the move but wanted the control of patient records to stay with patients not private companies. It seems it will fall into private hands in any case, with Labour they love a good PPI. Perhaps people should walk around with it round their neck on a secure dongle? It is interesting where this will go now that many people are publishing details freely on the internet and not making the connection…


Meanwhile something I don’t think we have mentioned yet on this blog is avian flu or its many other names. The Lancet sped up a systematic review of the use of Tamiflu et al or neuraminidase inhibitors and found that they are not beneficial when given out to the healthy general populace…hmm evidence based policy is rather thin on the ground now anyone can ring up for their antivirals; one GP says public health is now a patronising public nuisance.


And a last thing I came across today is a blind search engine test. Blindsearch, from a Microsoft employee with no much time on his hands, searches across Google, Yahoo and Bing and gets you to vote for the one you prefer when you’ve seen the results, the search engine is revealed once you’ve voted. It seemed to slide off my page but that might be a local problem, otherwise intriguing.
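The blind-test mechanic described above is simple to sketch: shuffle the engines into anonymous columns, collect a vote, then reveal. This is an assumed reconstruction of the idea, not Blindsearch's code, and the function names and sample results are hypothetical.

```python
import random


def blind_columns(results_by_engine, rng=None):
    """Shuffle engines into anonymous columns A, B, C...

    Returns (columns, key): `columns` maps a label to that engine's
    results, and `key` maps the label back to the hidden engine name.
    """
    rng = rng or random.Random()
    engines = list(results_by_engine)
    rng.shuffle(engines)
    labels = [chr(ord("A") + i) for i in range(len(engines))]
    columns = {label: results_by_engine[engine] for label, engine in zip(labels, engines)}
    key = dict(zip(labels, engines))
    return columns, key


def reveal(key, voted_label):
    """Return the engine behind the column the user voted for."""
    return key[voted_label]


# Placeholder results; a real test would fetch these from each engine.
results = {"Google": ["g1", "g2"], "Yahoo": ["y1", "y2"], "Bing": ["b1", "b2"]}
columns, key = blind_columns(results)
print(columns)           # what the user sees before voting
print(reveal(key, "A"))  # shown only after the vote
```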

"

Friday, May 01, 2009

NHS Evidence

NHS Evidence is a very promising start - it's already good enough to be used in favour of Medline and Google for quick searches on clinical effectiveness in the UK. It could do with a really good database of recent systematic reviews because DARE is not up to date.

One odd result so far - a search for BNP (a peptide, not the dark side of UKIP) in heart failure worked well, but when the 'diagnosis' filter was applied the results became less relevant to the use of BNP as a diagnostic tool. It's difficult to get metadata right on search engines (Berry Brothers does a very good job, but that is because of the nature of their business). Maybe it's not even worth it. Just let users put in the words they want and let the technology de-fangle the query.

Wednesday, April 29, 2009

The end of clinical coding

One of the reasons put forward by its advocates is that clinical coding is a vital component of a standardised health record, which is itself a vital pre-requisite to sharing clinical records. So the investment goes on, in initiatives, committees, sub-committees, standards, taxonomies, meta-taxonomies, ontologies, to support the wide range of coding systems used in health.

The effort would be worthwhile if coding was essential to the accurate structuring and transmission of the information in a medical record or medical information more generally. One recent incident has shown how fragile that claim is. And there is nothing to suggest that using SNOMED-CT will improve matters, since SNOMED-CT is itself far too large, complex (and oddly incomplete) to accurately and consistently convey the meaning of free-text or even semi-structured text in a compressed format. (Hint for Google - SNOMED-CT is to health information as the semantic web is to understanding language)

"The use of codes ensures the information derived from them is standardised and comparable" is an often repeated claim. The fact that it does neither continues to be overlooked, and there is still an open question about the value of coding in support of decision making and patient safety. Where coding clearly has a role (HRGs, Read codes) is in support of physician and provider payments. Even this role is of questionable value, and there is always a risk of coding being affected by the level of payment associated with a class.