Saturday, December 24, 2005

Implementing Pan Government Search

Is search the missing "killer app" that will drive usage of e-government? In May 2003, I put together a short business case for a government-wide search engine that could both search all of government and be called by any site to search locally. The economics seemed obvious at the time, but the head of the civil service wanted to see all of the necessary departmental customers signed up first - "shared services" were still in their infancy (there was only one at the time - the government gateway) and funding them (it) had proved enormously challenging. So, we didn't get it done. I've put extracts from the case below (I've stripped out vendor and price details):

"Since work began on e-government, departments, local authorities and agencies have created more than 1,800 separate websites. Some 75% of these sites have a search capability (“a search engine”) that allows the citizen to look for items within that site alone. What we know about search engine usage in government is limited, but there are some important notes gleaned from ukonline data:

- The Government Ad Server (available on ten government sites to date) which eDt put in place late last year includes a “search government” ad which is clicked on seven times more than any other ad, indicating that citizens want access to search.
AM note: the ad server was another one of eDt's projects - we wanted to drive traffic to other sites and give visibility of government campaigns on our own sites (and for no cost)
- Over 60% of ukonline visitors use the search engine (and most of those use only the search engine).
- The top search terms in ukonline usually relate to departments directly: “inland revenue”, “home office”, “health”, “hse” and “dvla” were all in the top ten in April 2003.

It’s apparent from this data that not only is search perhaps the most important route to finding what a citizen needs, but that departmental domain names (www.homeoffice.gov.uk, for instance) are not well understood by many citizens.

A search engine has several components – a software licence, a hardware configuration for installation and then regular maintenance costs to ensure that it remains current and returns appropriate search results. A search engine that does not return the most relevant result high up the list quickly falls into disuse, making the maintenance work essential. Typically, search engines cost from £25,000 to £300,000 to install (including hardware and the one-time setup work that defines business rules and requirements) with 0.5-1 person required to keep them updated. The very smallest sites can use open source engines that, naturally, cost little to license, but do not reduce the business work around rules and maintenance. This initial work should not be underestimated – if done well, it can take weeks and involves assessing important content, tagging it appropriately (“metadata”), defining dictionaries and taxonomies and so on.

If a conservative 1,000 of the current sites have search engines, and all are at the low end of operation, then the installation cost to government was as much as £25,000,000, with between 500 and 1,000 people engaged in keeping them up to date (at an average cost of £20,000 per head), making £10-20 million in annual costs.
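As a rough check of the arithmetic in the estimate above, here is a minimal Python sketch. The inputs are the figures quoted in the case (1,000 sites, a £25,000 low-end install cost, 0.5-1 person per site, £20,000 per head); the function and parameter names are illustrative only and are not part of the original case.

```python
# Back-of-envelope check of the figures quoted in the business case.
# Function and parameter names are hypothetical, chosen for illustration.

def estimate_search_costs(sites, install_cost_per_site, staff_per_site_range, cost_per_head):
    """Return (total install cost, (low annual staff cost, high annual staff cost))."""
    install_total = sites * install_cost_per_site
    low_staff, high_staff = staff_per_site_range
    annual_low = sites * low_staff * cost_per_head
    annual_high = sites * high_staff * cost_per_head
    return install_total, (annual_low, annual_high)


install, (annual_low, annual_high) = estimate_search_costs(
    sites=1_000,                      # "a conservative 1,000 of the current sites"
    install_cost_per_site=25_000,     # low end of the £25,000-£300,000 range
    staff_per_site_range=(0.5, 1.0),  # "0.5-1 person required"
    cost_per_head=20_000,             # "average cost of £20,000 per head"
)

print(f"Install: £{install:,.0f}")
print(f"Annual staffing: £{annual_low:,.0f} to £{annual_high:,.0f}")
```

With these inputs the sketch reproduces the quoted £25 million install cost and the £10-20 million annual staffing range.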
It takes only a few of these sites to have large engines (at the £300,000 or more end of the market) for the cost to quickly double or triple – many departments are presently actively reviewing their search engine with a view to going to procurement for a new, more capable engine.

A single search engine in the centre of government that searches all government websites and that is able to provide both single site and multiple site results would normally be licensed at perhaps £Y based on current market rates for two years – and the licence would typically restrict usage to just one site (historically, ukonline is the only search engine that spans all of government). eDt has negotiated a deal with a supplier that, subject to approval, provides all government websites with access to the central search results, with only small code changes required to each site to gain such access (the e-Delivery team will work with departments to measure their existing search engine performance, demonstrate the potential of the central engine and then jointly handle the migration). Results can be presented both for that site and other government sites, increasing the chance that the citizen will find what they need. All results can be presented in the “look and feel” of the home site – with or without reference to use of the central engine.

In addition to the savings in licence costs and in people maintenance costs, people using the search engine would have an immediate benefit. The more people that use a search engine, the better the results it provides:

- It can log which links people click on and move those that are clicked most frequently to the top of the list, increasing the chance of finding the right link first time.
- Search results can be provided for both the local site and for sites across the rest of government (so, for instance, a citizen arriving at the wrong government site would, on typing something into the search engine, have a good chance of being directed to the right site).
- Search engines can provide personalised search results based on prior usage – so if a citizen regularly searches for common terms, the results can be tailored to give more relevant results.
- The engine can be tailored to also present results from commercial engines, perhaps pointing citizens to related articles on the Internet.

Savings for these latter points fall into the “common good” category. Joining up government functions provides significant benefits to citizens that are difficult to put financial measures against. The search engine will also be used to drive a search function available from a persistent toolbar, driven by the Online Government Store project.
AM note: the "OGS" became direct.gov
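The click-logging benefit listed in the case (move the most-clicked links to the top) can be sketched in a few lines of Python. This is only an illustration of the general idea, not the supplier's actual method, which the case does not describe; the class name, method names and example URLs are all hypothetical.

```python
from collections import Counter


class ClickRanker:
    """Re-orders search results by how often each link was clicked for a query."""

    def __init__(self):
        # Maps (normalised query, url) -> number of recorded clicks.
        self.clicks = Counter()

    def record_click(self, query, url):
        self.clicks[(query.lower(), url)] += 1

    def rerank(self, query, results):
        q = query.lower()
        # sorted() is stable, so links with equal click counts keep the
        # order the underlying engine returned them in.
        return sorted(results, key=lambda url: -self.clicks[(q, url)])


# Hypothetical example: three results for one query, with logged clicks.
ranker = ClickRanker()
ranker.record_click("tax credits", "https://example.org/b")
ranker.record_click("tax credits", "https://example.org/b")
ranker.record_click("Tax Credits", "https://example.org/a")

engine_results = [
    "https://example.org/a",
    "https://example.org/b",
    "https://example.org/c",
]
reranked = ranker.rerank("tax credits", engine_results)
print(reranked)
```

A real engine would blend click counts with its own relevance score rather than sort on clicks alone, since raw counts favour whatever already sits at the top of the list.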
Implementing this search engine could be achieved in a 3-4 month project, with up to one week additional for each government department that connects. Departments (websites) would pay £X per annum to make use of the central search, with the search service more than paying for itself (including hardware, set-up time and maintenance) after 25 sites connect – before the “common good” saves are factored in for citizen benefit. These fees would be recovered through eDt’s planned charging process."

So, I show this not so much as an "I told you so" but so as to help make a wider point and to hopefully, as one commenter says below, put the issue of search to bed:

- Good search is vital. As the comments below note, all of the biggest and best internet businesses (ebay, amazon, yahoo, google etc) centre on helping you find what you need fast, using search.
- The commerce businesses - ebay and amazon - search only their own sites and have good control over how things are tagged and presented. Even then, you will find lots of irrelevant items coming up in response to your search, particularly if it's a one or two word search (and I'm sure the average word count is 1.5-1.7 but I can't remember where I saw that statistic). If, using Amazon, you type in "360" then, of course, you get lots of items related to the xbox 360, but you also get a Kodak camera. If you enter "lost" then you get the DVD and book of the TV series, but you also get books about endangered species. You're smart enough to know which it is that you want and, if you're unsure, a click or two and all will be clear.
- Government search is, I think, different. If google ranks based on "authority" - i.e. you're important and if you link to someone then that makes them important too ad infinitum - then there is a large need for links to sites and articles within those sites to be clear who is the definitive authority.
Government should, by definition, be the authority on the stuff that it does - but if every site owner thinks that they should try to be all things to all people, then you end up with thousands and sometimes hundreds of thousands of references to "tax credits" or "disability living allowance". Many of these references are out of date, a little misleading or sometimes plain wrong. The right job of a search engine is to bring up the definition that has most authority in a world where authority is hard to pin down (e.g. housing benefit, whilst on the surface the same all over the UK, actually is highly locally varied - perhaps it shouldn't be, but it is).

Anyway, great search is a necessary part of driving e-government adoption. I just believe that without significant consolidation of websites, it won't deliver what's needed.

6 comments:

  1. Pan-gov search should be a no-brainer from a business-case perspective for two reasons.

    Firstly, government is big, so can't be navigated through a single information architecture - just like google doesn't try to create one for the web.

    Secondly, government has got itself into such a mess with the proliferation of websites that it needs a single place where people can get to what they want, as it's unlikely they'll find it through clicking.

    The problem about artificial weighting given to heavily linked-to pages is a tough one. If you searched for a word that appeared on the Directgov homepage, then that result is going to shoot to the top, given that there are over 1,800 links to that page (one from every other site's homepage). Also, the fact that lots of webmasters link to a specific page on a certain topic doesn't necessarily make it a/the definitive source.

    Also, google is now getting more personalised and localised. If I search for "fish and chip shop" on google.com, I'm unlikely to find one in my area, but if I use Google Local, my nearest one comes up third. Government has a more pronounced problem, as the majority of most people's dealings will be with local government. Asking "when does my local skip close" is unlikely to give me the results I'm after.

    It would be interesting to get hold of a ranked list of the top 50 search terms hitting each of the 1k+ engines in government and see what the overlap is. Obviously the terms used in HMIR will be wildly different from those used on DH, but I'd have thought that there will be much overlap across local gov. I bet a lot of people are after contact information, building addresses and opening/closing times etc. (If you can get hold of the data, I'm more than willing to do the analysis.)

    Find out the key information people are looking for and use the Directgov CMS to distribute authoring rights for a certain section to people within local government. Have a single "pertinent information" page for each local council. Retain approval rights centrally to ensure that it's not defaced or abused.

  2. Anonymous 9:05 am

    "Search Anorak" .. I love the label .. a couple of anoraks started Google and now look at them .. a few billion dollars richer .. their passion for search has spawned a revolution in the way we use the Internet .. so in terms of moving the debate on, I don't think it has been debated enough !!

  3. egu_insider 9:55 pm

    Search is critical, absolutely. A couple of contributions:

    - From the business case perspective, I'd be more cautious about the resources already deployed in search across public sector sites. 500 - 1,000 FTEs seems really high given the low budget, slap-it-up approach particularly when it comes to site search. That's not to say great pan-government search wouldn't represent a saving, just maybe not quite that much.

    - Would Google-style PageRank solve the problem? Dan's approach feels better, identifying the 95% of requests which represent the volume and making sure there's a strong, effective postcode link to local council services (maybe a smarter, more robust Local Directgov?)

    But there's always the question about where search fits in to the broader citizen relationship with public services. If you want to know when your bin gets collected, fine. But if you want to know what benefits you're entitled to, or have been putting off finding out about school admissions as it's all too much hassle... what you need is the kind of no-nonsense, easy introduction guide that Directgov should offer, with links out for more detail, or at individual or local level.

  4. Hmmm...an insider...

    agreed the numbers might be wrong, but the nice thing about numbers in this space is you have no better data than I have. so i can guess, and you can guess. either of us could be right. the absence of data on cost to set up and ongoing operating cost is pretty scary. even a very basic website is likely to cost £150k to run each year - of course, that's not all search, but i doubt too many folks are paying as little as that.

    I agree that pagerank won't solve the problem - i think i said as much somewhere in one of the many anorak posts i've written on this.

    Bottom line for me ... if you proliferate content, the odds of finding the right data go down, no matter how good the search is. who is to say that fulham's DLA description is better than kensington's? not a pagerank algorithm i suspect.

    But then you move on to transactions and establishing e.g. what benefits are due ... and, last time i looked, direct.gov (of which I am a fan so take this as constructive criticism) doesn't join things up. take a look at phil windley's site for his comments on egovernment maturity:

    http://www.windley.com/archives/2005/12/what_does_this.shtml

  5. Some of you may be happy with your booleans and venn like thinking as you blogdrone on about search

    but for your average mary everest who shops at walmart/lidl/asda, just how much time do you think she or george spends pondering search?

    interesting as search is for blogging, you are still scraping the tepid shallows of abc1 land, which ain't where the majority of population live... ...

    Until eGov ceases to be the playground of the timerich web browsers, it's just a minority resort, however acute your search

  6. "Until eGov ceases to be the playground of the timerich web browsers, it's just a minority resort, however acute your search"

    So contrary mary, what are you suggesting as a plan?
