Thursday, March 31, 2005

Thanks for the Light 'Ogg'

Humans are great at discovering new things.

Since the first caveman discovered fire, the idea of fire has been passed down through the ages. In a way we've all been touched by that very first fire and the idea of it is still burning. We owe 'Ogg' - he or she was one of the first trailblazers.

Fire was a cool discovery but human language is our best discovery yet - a way of infecting others with new discoveries. With language we can pass more than DNA to the next generation: ideas too.

Language is ephemeral though, spoken and lost in a second. Books, cave paintings and other cultural artefacts help but now it's time to get serious.

Billions of discoveries are being made every day and we have the means to pass them on to each other - efficiently and easily. Every time you search for something you make a discovery - so why not share it? Ogg did.

Where a Trail Ends ...

A trail begins when you search on one of the search engines in our database - you'll need a TrailBar too.

That's how trails start. But how do they end? For the BETA version of Trexy:
"A trail is the unbroken sequence of links you click on after entering a search query."
But what breaks the sequence of links?
  • entering a new search on the same or a different engine breaks the trail and starts a new trail
  • submitting a form stops a trail. To protect your privacy we only ever access forms listed in our search engine database.
  • clicking on an encrypted page (i.e., https) stops a trail - trails aren't saved on secure pages
  • clicking on your favourites breaks the trail
  • clicking on the home page button breaks the trail
  • typing in a new URL also breaks a trail
  • closing your browser stops a trail too
So if you notice the green light in the TrailBar switch off while you're trailblazing, you'll know why.
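
The rules above can be read as a small state machine. Here is a minimal Python sketch of that reading. It is not the TrailBar's actual code: the event names (`search`, `link_click`, `form_submit` and so on) and the `TrailRecorder` class are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical event names - the post describes behaviours, not an API.
BREAKING_EVENTS = {
    "form_submit",      # submitting a form
    "favourite_click",  # clicking on a favourite
    "home_button",      # clicking the home page button
    "typed_url",        # typing in a new URL
    "browser_close",    # closing the browser
}


@dataclass
class TrailRecorder:
    query: Optional[str] = None
    links: List[str] = field(default_factory=list)
    recording: bool = False  # stands in for the green light on the TrailBar

    def handle(self, event: str, value: str = "") -> None:
        if event == "search":
            # A new search ends any current trail and starts a fresh one.
            self.query, self.links, self.recording = value, [], True
        elif not self.recording:
            # No trail in progress, so there is nothing to extend or break.
            return
        elif event == "link_click":
            if value.lower().startswith("https://"):
                # Encrypted pages stop the trail and are never saved.
                self.recording = False
            else:
                self.links.append(value)
        elif event in BREAKING_EVENTS:
            self.recording = False
```

In this sketch, a search followed by two http clicks and then an https click leaves `recording` switched off with only the two http links saved.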

Wednesday, March 30, 2005

Privacy Matters

Respecting the privacy of our users is one of the driving forces in the design of Trexy. If our users don't feel confident their privacy is protected they won't blaze search trails and share them with others. So for us, protecting your privacy is paramount.

From a legal point of view we comply with the data protection legislation as set out in the UK Data Protection Act 1998 - check out our Privacy Policy.

From a technical point of view we go to great lengths to maximise your privacy. When you blaze a trail the only information we keep is:
  • an anonymous userid
  • the country flag of the browser (determined by IP address)
  • the links visited (only http - not https)
  • the engine where the trail began
For extra security and privacy we make sure:
  • trails are only ever blazed through our database of valid search engines. The contents of all other forms (e.g., email, login boxes, comment forms) are never accessed.
  • trails are never blazed on encrypted pages (i.e., https)
  • we never view cookies from third party sites
  • we never access files on your machine
  • we never access your browser's cache
At all times we let you know what's happening. The green light on the TrailBar shows when a Trail is being saved.
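
To make the list of stored fields concrete, here is a rough Python sketch of what a saved trail might look like. The `TrailRecord` class, its field names and the `build_record` helper are assumptions for illustration only, not our real schema. The sketch keeps links up to the first non-http page, in line with the rule that trails are never blazed on encrypted pages.

```python
from dataclasses import dataclass
from itertools import takewhile
from typing import Iterable, Tuple
from urllib.parse import urlparse


@dataclass(frozen=True)
class TrailRecord:
    # Hypothetical field names mirroring the four items listed above.
    anonymous_user_id: str
    country: str            # derived from the IP address; the IP itself is not a field
    engine: str             # the engine where the trail began
    links: Tuple[str, ...]  # http links only


def build_record(user_id: str, country: str, engine: str,
                 visited: Iterable[str]) -> TrailRecord:
    # Keep links only up to the first page that is not plain http.
    http_links = takewhile(lambda url: urlparse(url).scheme == "http", visited)
    return TrailRecord(user_id, country, engine, tuple(http_links))
```

Nothing else (cookies, form contents, cache entries) has a place in the record, which is the point of keeping it this small.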

You have full control over the trails you create. At any time you can:
  • delete your trails (search on MyTrails and click delete)
  • turn trailblazing off (go to Preferences and select Trailblazing Off)
  • turn trail sharing off (go to Preferences and select Sharing Off)
  • uninstall the TrailBar
If, in the end, you're just not comfortable with trailblazing, that's fine too - you can search other people's anonymous trails by searching "All Trails" only. If you're still not comfortable - don't use Trexy - no one's forcing you. ;-)

Tuesday, March 29, 2005

A bit glitzy

A reporter from 'The Wharf' - the newspaper for London's Canary Wharf business district - just called. He wants to write an article about Turbo10 and life after winning a London Chamber Business Award.

We were fortunate to win the 'Best Use of eCommerce and Website' award for Turbo10 at the last Docklands awards ceremony in 2003.

To enter the awards, we needed to submit a business overview and then the top 4 entrants were shortlisted to give a presentation before a panel of 6 judges.

The winners were announced at an Oscar style, black-tie dinner. It was a great buzz to hear Turbo10 named the winner in our category.

Nige went to collect it and a big spotlight shone on him as he made his way through the tables to get on stage to collect the trophy. The best bit was the glam music played in the background - 'I've got the Power' by Snap. So kitsch but so right.

Anyway, hopefully an article will be in the next edition of 'The Wharf' this coming Thursday.

Wednesday, March 23, 2005

Server meltdown on the day Turbo10 "Registered"


Friday the 30th of May 2003 is something of a Black Friday for us and Turbo10.

It was the day the world finally registered that we have some serious search technology to offer but sadly it was also the day our one server had a serious meltdown - and there lies the problem.

Meg had done a fantastic job of managing the marketing campaign and the word was starting to get out. But we hadn't counted on receiving such a glowing review in The Register, "Make way for the contender to Google's crown." It took us, and everybody else, by surprise. We were suddenly in the boxing ring with Google. Everyone flocked to Turbo10.

Unfortunately I was on a plane at the time and Meg had to call our data center in Texas to ask where our server had mysteriously gone. You can imagine the Texan drawl:
"I'm sorry MaaM your server is havin' a denial of service attack. I've never seen anythin' like this. I'm lookin' at the ethernet meter and it's gone pure green."
We were going pure green too. It wasn't a DoS attack - just lots of people interested in Turbo10. We were happy, sad, elated, frustrated all at the same time. Happy and elated that our efforts and original ideas had been recognised but sad and frustrated that our search engine was failing under the load.

I fought for two days from the lobby of the hotel against a Portuguese keyboard and a torrent of incoming hits but our weeny server just couldn't handle it. For many visitors Turbo10, the "contender to Google's crown" was a blank page.

Well as a "contender" I felt like a boxer smelling the canvas. But despite our limited resources we got the server back up and four days later The Register reported: Turbo10: Getting back on its feet.

We promptly hit the canvas again! :-(

At the time we didn't have enough money to scale up our cluster. Two months before I'd approached Dell and IBM to see if they could help out with a "Powered By" server sponsorship but sadly they weren't interested.

Lots has changed since then. Instead of just one server we now have a load-balanced cluster of over forty-five servers! :-) But the lessons learnt while smelling the canvas have stayed with us and we're determined it won't happen again.

Tuesday, March 22, 2005

Search Engine Strategies Conference - Munich

Just received confirmation from the organisers of the Search Engine Strategies conference that I'll be speaking at the upcoming event in Munich.

I have been invited to speak in the session: 'Neue Entwicklungen im deutschen Suchmarkt' which means, 'new developments in the German search market'.

We are not yet ready to unveil Trexy, so I'll be speaking about Turbo10.com and how we connect to the Deep Net. I am very excited about it. Ich muss es auch auf Deutsch sagen! (I'll have to say it in German too!)

Some basket cases

Had to say goodbye to a few of the new servers today.

When we add multiple servers we always find a couple of machines that have problems. For example, they might be configured slightly differently or have network problems. We had one machine yesterday that thought it had a different IP address to the one actually assigned. Nige and I call these the 'basket cases'.

Instead of spending too much time trying to figure out the problem we have found it is easier to just let them go. This also helps to keep the set-up across the nodes as homogeneous as possible.

Vannevar's Trails of Association

"The human mind ... operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory."
"As We May Think", Vannevar Bush 1945.

Physiologically he's right on. Our brain records by burning memory engrams or trails between associated neurons. Each time a memory trail is traversed the synaptic gaps fire between neurons and the engram is reinforced. The more times this trail of neurons fires the stronger the memory becomes. If it fires less frequently the memory fades.

Imagine if your trails didn't fade?

Despite the rudimentary computer technology of the day, Vannevar dreamt of a device he called a Memex that recorded trails permanently, clearly - "an enlarged intimate supplement to memory."

Every day we all make personal discoveries and traverse trails from what we're looking for to what we find - but sometimes we forget. What did I search on again? What was I looking for? How did I find that? Where did I see that? We're all in an endless loop of searching, finding, sometimes forgetting - and the procedure repeats.

Wouldn't it be good if you could remember your personal search trails?

MyTrails is the first part of Trexy's design inspired by the Memex.

Monday, March 21, 2005

Adding new search engines - the power is in your hands!

A crucial subsystem of both Turbo10 and Trexy is the ability to add new search engines.

We previously used a fully automated process to connect to new search engines. A robot, 'Amy the Adapter Manager' worked with 'Penelope the Probe Manager' to automatically create adapters that connect to other search engines. The process is described in a WWW2003 conference paper: The Mechanics of a Deep Net Metasearch Engine [PDF]. We achieved a 65% connection success rate and learnt lots about connecting to the myriad of different search engines out there.

But when it comes to processes like this, humans still do it better. The connection success rate is now above 80%! We incorporated much of the technology and lessons learnt from the automated process to create a simple three-step, semi-automated process for connecting to new engines.

The good news is the power is now in your hands. You can quickly add your own search engines and get immediate confirmation that it works. Click here to add a search engine at Trexy or Turbo10. Once it's added you can metasearch it at Turbo10 and blaze trails through its content at Trexy!

Sunday, March 20, 2005

Jukebox Hero

At the moment our favourite coding song is 'Jukebox Hero' from the classic rock group - Foreigner. There's nothing like turning up the volume while we synchronise the cluster!

The cluster is getting bigger. We are adding 25 new servers tomorrow.

Saturday, March 19, 2005

Pressing the flesh

I have almost finished contacting all the people Nige and I met at the Search Engine conference in New York at the beginning of March.

After a couple of years of talking to our American and Canadian partners over the phone and by email, it was good to finally meet people face to face.

We also met up again with search gurus Danny Sullivan and Chris Sherman. They expressed interest in our technology and we're looking forward to demonstrating Trexy to them at the upcoming Search Engine Strategies Conference in London.

Introducing the Robots

Running a large-scale search engine with just two people would not be possible without a lot of help. Our philosophy is to automate everything we can - so whenever something needs doing, instead of doing it ourselves we program a robot to do it for us!

Each robot manages an important subsystem for either Turbo10, Trexy or both. Here are just some of the robots dutifully working behind the scenes:
  • Amy the Adapter Manager - manages and tests connections to Deep Net/Trexy search engines
  • Ray the Replication Checker - monitors database integrity on the cluster
  • Ernie the Exception Checker - alerts us if a bug happens
  • Bill the Bug Catcher - like Ernie but reports really serious bugs
  • Daisy the Daily Reporter - shows daily site stats, click stats etc.
  • Tim the Timer - monitors response time on the cluster - we want to be consistently quick and Tim helps us here
  • Penny the Partner Manager
  • Carla the Account Manager - handles new advertiser accounts
  • Barry the Blocker - helps detect fraudulent clicks to sponsored links
  • Chester the Checksum Checker - helps out Barry
  • Larry the Load Reporter - monitors load levels on the cluster
  • Betty the Back Up Manager - keeps regular backups
  • Kylie the Collections Manager - handles MyCollections for Turbo10
  • Fred the Flusher - flushes the logs before they get too big
  • Terry the Tester - runs the test suite and reports
And there are many more ... but I'll mention these later!

Friday, March 18, 2005

Tick that Box.

Nige and I had a good bug run today. We worked on finalising the TrailBar for both Firefox and IE. We updated some of the buttons and swapped in some better graphics. I had done a small Trexy logo but it was appearing too dark. So we replaced it with a lighter version that matches the logo used on our site.

We also tested the Trail creation component. So far everything works well on Google, MSN, Teoma and Looksmart but we need to do some more testing to get it to work smoothly on Ask Jeeves and on some pages of Yahoo.

We have a simple procedure for handling bugs. I catch a bug, show it to Nige, he approves its bug status, we write it down and draw an empty box next to it. Nige tracks the bug down, exterminates it (hopefully it is an easy bug to kill!) and then, and only then, can we tick the box to signify its death...and then we can move on to the next bug.

Bugs ... Run!!

Apart from being a marketing maestro Meg is also a mean bug tester. Because we're such a small company we've had to roll up our sleeves and do stuff we normally wouldn't.

When it comes to testing Meg is ruthless. ;-) Our rule is: a seen bug is a dead bug. But sometimes it's hard to catch the little buggers in action.

Once a bug is seen it goes onto Meg's hit list - and doesn't get removed until it's 'eliminated'. Every now and again we go on a 'Bug Run' - an intensive day of testing, bug hunting and fixing - today is one of those days.

You can help too: if you see a bug, please email support@trexy.com and it'll go on the hit list. ;-)

Turbo10? Trexy? Why have two engines?

Turbo10 is our sister search engine and the foundation of our business. We've been developing it for over four years now. The first question people ask is, "what makes Turbo10 different?" The short answer is, it helps you browse faster and search deeper. Here's how:
  1. Client-side result caching for faster browsing and fastest-first results display
  2. Client-side relevance ranking
  3. Client-side topic clustering and result filtering
  4. Metasearching the Deep Net - you can metasearch a large database of topic-specific search engines
  5. My Collections - tailor your own collections of engines to search on
  6. Manual Add Search Engine and Testing - the power is in your hands
Turbo10 is the best solution we can come up with for metasearching - universal metasearch in a consistent interface. Click here to find out more about the mechanics behind Turbo10.
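
As a rough illustration of the "fastest-first" idea in point 1, here is a small Python sketch that queries several engines at once and hands back each engine's results as soon as it responds. This is only one reading of the phrase, it is not Turbo10's code (the real feature is client-side), and the `fastest_first` and `fake_engine` names are invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterator, List, Tuple

SearchFn = Callable[[str], List[str]]


def fastest_first(engines: Dict[str, SearchFn],
                  query: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (engine name, results) in the order the engines respond."""
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {pool.submit(fn, query): name for name, fn in engines.items()}
        for future in as_completed(futures):
            yield futures[future], future.result()


def fake_engine(name: str, delay: float) -> SearchFn:
    # Stand-in for a real engine adapter: sleeps, then returns one result.
    def search(query: str) -> List[str]:
        time.sleep(delay)
        return [f"{name}: result for {query}"]
    return search
```

With three fake engines of different delays, looping over `fastest_first(...)` gives you the quickest engine's results first, so a page could start displaying them without waiting for the slowest engine.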

The next question people ask is, "so why are you making a completely different engine?"

Trexy provides a new searching layer on top of all search engines. It's important that Trexy is completely search engine agnostic.

We want users to be free to choose whatever engine they want to blaze trails on! We think Turbo10 is the best tool for metasearching but we're not going to force you to use it. This means you can choose which engines you want to blaze your trails on: Google, Yahoo, AskJeeves, MSN, AOL, Vivisimo ... you name it ... check out the long list of engines that work with Trexy. If your favourite engine isn't listed you can always add a new search engine.

Thursday, March 17, 2005

Firefox TrailBar: Alpha to Beta

Work on the BETA version of the Firefox TrailBar begins today! The current ALPHA version needs a few more tweaks and packaging fixes - but we're 99.99% there already. Fingers crossed there aren't too many technical hitches.

We're launching the Trexy TrailBar with the Firefox community first - this group of early adopters will give great feedback during the BETA phase! Based on this feedback we'll incorporate changes for our general release.

Meg the marketing maestro has a plan to work with http://spreadfirefox.com to promote both the TrailBar and Firefox.

It's great to see grassroots support for a new product - we're hoping the early adopters will adopt Trexy too!

Vannevar's Vision Thang

One of the inspirations for the design and development of Trexy is an article written way back in 1945!

"As We May Think"
by Vannevar Bush (The Atlantic Monthly) 1945.

At the time the article was written computing machines were clunky contraptions - more mechanical than electrical. Despite the physical and computational limitations of the technology, Vannevar Bush, the Chief US Scientist, saw the huge potential to use these rudimentary machines to augment human mental powers individually and collectively.

The idea was wild! The meme contagious.

The influence the article has had on the development of computer science, the Internet, and the World Wide Web cannot be overstated.

Even 60 years on, Vannevar has correctly predicted the future by directly influencing it. There is a clear trail of ideas/memes from "As We May Think" to "As We Do Think". :-)

Looking back at it, what Vannevar wrote can be boiled down to a simple functional specification - but there are still major bits unimplemented.

I'll be uncovering steps in the meme trail from "As We May Think" to "As We Do Think" in future blogs and showing how these have led to the design of Trexy and the implementation of the final and most significant parts of Vannevar's spec.

Stay tuned ...

Wednesday, March 16, 2005

Sparky - a lean, mean webserving machine

Sparky is the name of our new bespoke web server. It does page precompression and lots of caching to deliver results and pages as fast as possible. It's been in development on and off for over three years but hasn't quite managed to oust Apache - that is until now. Recent tests on our cluster show it's faster than Apache and consumes 40% less RAM.

So it's finally bye bye Apache ... and welcome Sparky!
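
For the curious, here is a toy Python sketch of what page precompression plus caching can look like in general. It is not Sparky's code: the `PrecompressedCache` class and its methods are made up to illustrate the idea. Each page is gzipped once when it is stored, then the same bytes are served on every request instead of being compressed again each time.

```python
import gzip
from typing import Dict, Optional


class PrecompressedCache:
    """Hypothetical cache: compress each page once, serve it many times."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def put(self, path: str, html: str) -> None:
        # Pay the compression cost up front, not on every request.
        self._store[path] = gzip.compress(html.encode("utf-8"))

    def get(self, path: str, accepts_gzip: bool) -> Optional[bytes]:
        body = self._store.get(path)
        if body is None:
            return None  # cache miss: the caller would build the page
        # Clients that accept gzip get the stored bytes untouched;
        # others get the page decompressed on the fly.
        return body if accepts_gzip else gzip.decompress(body)
```

The trade-off is some extra work when a page changes in exchange for less CPU per request, which suits pages that are read far more often than they are written.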

Tuesday, March 15, 2005

Cluster Growing Pains

Trexy is powered by a large cluster of Linux servers that also supports our sister search engine: Turbo10.com.

Last weekend the Turbo10 Network came under a large increase in load and as a result we need to scale up fast. Meg and I are adding another 20 servers to our cluster this week.

Turbo10's growing pains are good news for Trexy. As a result of the increased capacity Trexy can handle millions more searches and trail insertions per day.

Introducing Trexy

Trexy our mascot - a cheeky goat who loves making trails - was today positioned on the TrailBar!

Meg worked around some tricky bitmap problems and made some cosmetic changes to Trexy so he still looked good even in a 16px by 16px space. The look and feel for the IE TrailBar is now almost complete! :-)

We spent quite a while coming up with a name for our mascot. It needed to be distinctive and connote a spirit of adventure and discovery, while at the same time paying homage to the source of inspiration for the unique design of the search engine: the Memex and Vannevar Bush's concept of Trails and Trailblazing.

So we coined a distinctive portmanteau word: TRails + MemEX + Y = T R E X Y.

Plus it sounds like 'sexy'.

So there you have it: TREXY the goat is our inquisitive, trailblazing mascot!

Trexy Blog - Open for Business

Welcome to the Trexy Blog!

This is where Meg and I will share our 'trails' and tribulations as we go about building what we hope will be the best search engine in the world!