Content distribution, now for the masses
We’ve known about Akamai, long regarded as the leader in CDN technology, for some time now. As of a few weeks ago, there’s a new kid on the block.
Amazon released CloudFront on the 19th November, and it adds what I think has been lacking in their S3 storage solution, making Amazon a key competitor on the CDN front: what Amazon have termed “Edge Zones”. There’s nothing particularly complex in the definition: you upload your content to S3, and it can be served from Edge Zones in the USA, Europe, Hong Kong and Japan. Amazon will automatically fetch the data from S3, cache it at the Edge Zone location, and from that point forward the transfer time for your users will be blistering!
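To make the fetch-then-cache idea concrete, here is a minimal sketch in Python. It is a toy model, not CloudFront's API: the `Origin` and `EdgeCache` classes and the file name are made up for illustration, and a real CDN would also expire cached content after a while.

```python
class Origin:
    """Stands in for the S3 bucket that holds the master copy of each file."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.fetches = 0  # how many times an edge had to come back to the origin

    def get(self, key):
        self.fetches += 1
        return self.objects[key]


class EdgeCache:
    """Stands in for one edge location (hypothetical, for illustration only)."""

    def __init__(self, name, origin):
        self.name = name
        self.origin = origin
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            # First request at this edge: pull the object from the origin.
            self.cache[key] = self.origin.get(key)
        # Later requests are answered locally, without touching the origin.
        return self.cache[key]


origin = Origin({"logo.png": b"image-bytes"})
europe = EdgeCache("europe", origin)

first = europe.get("logo.png")   # cache miss: fetched from the origin
second = europe.get("logo.png")  # cache hit: served from the edge
```

After both requests the origin has only been asked once, which is where the speed-up for users near that edge comes from.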
I fully expect to see some really innovative uses coming out of this and we’ll keep tabs on it here and keep you abreast of the happenings. If you’ve got a project you’re working on that uses CloudFront, feel free to comment here. It would be really good to see the different uses people have for CDN tech.
For now, that is all. Back later with more fresh goss from the Tech world.
Yesterday, peering to McColo, an ISP accused of providing the transit for 75% of the world’s spam, was shut off. VNUNet and The Washington Post both talk about where the figures come from. I’d like to explain what this can potentially mean to the advertising world, and some realities that need to be checked first.
Connectivity
With up to 75% less spam travelling through the pipes that make up the Internet, there is a huge amount of space freed up for people visiting legitimate sites, chatting on IM services, or sending legitimate email, amongst all the other things that happen on the Internet. That means potentially faster connections to websites that are located elsewhere around the world.
Email advertising
With less spam being sent, emails sent for legitimate marketing purposes will slowly get better and better spam ratings as less spam means it will be easier to spot the rogues coming through. This means more flexibility with email content, and therefore a greater ability to be creative in email campaigns.
Realities
This is great news. One problem though. Botnet owners (the people who “own” the networks that send the spam out) are a bit like parasites. They’ll quickly enough find another ISP to latch on to and reconnect with their zombie machines.
On the other hand, one email provider kindly gave me a graph from their spam filter over the last 7 days. It clearly shows a massive reduction in spam coming through the system in the last couple of days, the time in which McColo has been off-line.

How spam works
Spam is one of the main reasons why spyware (malicious software that sits on your machine) exists. Often, it turns your computer into a zombie machine. Ever found that your downloads are stupidly slow or you just can’t access websites? That may be because you’ve installed something and one of the botnets now controls part of your machine. The zombies call home and await instructions. When someone gives out those instructions, every connected zombie sends mail out. This reduces the workload of the botnet owners and removes the need for them to have a whole host of computers of their own for sending spam. They just use us, like a virus uses us.
Can we expect change?
Maybe, if peering providers (the major ISPs that provide backbone services to the Internet) are quick enough to notify each other of the botnets, and the botnets are cut off at the source. However, if this particular set of botnets has lasted this long, how long would it last with a new provider? I guess we’ll find out soon enough, if spam levels return to normal in the next few weeks.
"It Doesn’t Work"
“The site doesn’t work.”
“What do you mean, the site doesn’t work?”
“It just doesn’t work.”
“So what is it doing that it shouldn’t be doing?”
“When I go to the site, I can’t do X.”
“But when I go to the site, I can do X.”
To a developer this is an all too common dialogue between someone looking over the site – usually someone non-technical, and definitely someone who hasn’t read up on Reporting bugs effectively.
After development comes testing, and during testing, bug reports are made. If the bug reports are unclear in any way, then it takes more time to fix them, because it takes more time to decipher their real meaning. So what this article is about is understanding the core elements of a good bug report, and why “It doesn’t work” is bad.
From the perspective of a developer, a good bug report consists of a specific set of things that must always be there in order for the bug to be replicated. Without the ability to replicate a bug, the chances of it being fixed are very slim. The article mentioned above, by Simon Tatham, a very well regarded developer of PuTTY fame, is probably one of the best groundings I’ve seen for writing good bug reports for software, and it extends to web applications and sites too.
The first of these things is a concise but clear summary of the bug. One line is enough: just something that describes the issue, e.g. “Firefox 3.0.3 prevents upload of multiple files in new messages”. That’s a good summary which, with the additional information described below, forms the core of a good report.
The concept of “Show me how to show myself” is crucial here. What’s needed is a step by step set of directions by which you came to see the bug. The developer may well not have used the application in this way, and may not have expected it to be used that way, but users will always do the unexpected! If the issue is a display issue, take a screenshot, and include the whole window, not just a bit of it. If it’s a browser issue, and this happens a lot on the web, the developer needs to know which browser the issue happens in. If the same issue happens in every browser, one screenshot will do, but a comment to that effect is absolutely necessary. There are also some key facts that should be included with any and all bug reports: Operating System, Browser, Browser version, URL on which the error occurs. Some other useful pieces of information, depending on the bug report, are: Flash version (or that it’s not installed), JavaScript status (on / off). If you think there are any more, there probably are, so feel free to comment on this article.
So, to conclude, we need a good summary (usually marked down in a subject field if you’re using a bug reporting system), a set of facts about the machine you’re seeing the issue on, a clear set of instructions on how to reproduce the bug, any additional information the developer might need to reproduce the bug, and if you can, a screen shot or set of screen shots to show the developer the problem you’re seeing.
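As a rough illustration of that checklist, here is a small Python sketch of a bug report as a data structure that can tell you which parts are still missing. The class, its field names, and the example operating system and URL are all hypothetical; a real bug tracker will have its own fields.

```python
from dataclasses import dataclass, field


@dataclass
class BugReport:
    """The parts of a report described above (field names are illustrative)."""

    summary: str
    operating_system: str
    browser: str
    browser_version: str
    url: str
    steps_to_reproduce: list = field(default_factory=list)
    screenshots: list = field(default_factory=list)  # optional but helpful
    extra: dict = field(default_factory=dict)  # e.g. Flash version, JavaScript on/off

    def missing_parts(self):
        """Return the names of the required parts that have been left empty."""
        required = {
            "summary": self.summary,
            "operating_system": self.operating_system,
            "browser": self.browser,
            "browser_version": self.browser_version,
            "url": self.url,
            "steps_to_reproduce": self.steps_to_reproduce,
        }
        return [name for name, value in required.items() if not value]


# A usable report. The operating system and URL are made-up example values.
good = BugReport(
    summary="Firefox 3.0.3 prevents upload of multiple files in new messages",
    operating_system="Windows XP",
    browser="Firefox",
    browser_version="3.0.3",
    url="http://example.com/messages/new",
    steps_to_reproduce=[
        "Log in",
        "Start a new message",
        "Attach two files and press Send",
    ],
    extra={"javascript": "on"},
)

# The report every developer dreads.
vague = BugReport(
    summary="It doesn't work",
    operating_system="",
    browser="",
    browser_version="",
    url="",
)
```

Calling `good.missing_parts()` returns an empty list, while `vague.missing_parts()` lists everything except the summary, which is exactly the information the developer would otherwise have to chase.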
The development process
It’s maybe an unfortunate consequence of the web being so instantly accessible that the development process is often forgotten about when dealing with clients and their expectations. I’d like to put forward some suggestions, and the reasoning behind them, as to why development processes should be adhered to, even on smaller projects. The aim of this article is to take you through each step of the process and explain why each bit happens and what it achieves. It’s going to be a fairly lengthy article again I think, so bear with me.
After the strategists have done their thing and come up with a conceptual plan for what could be done to improve the brand/ROI/other business related thing, the next step is to decide exactly what they meant by it all. The best way to look at each project is to take each part as a sub-project of a much larger umbrella project.
Scoping
So… The first one is scoping. Scoping doesn’t mean you have to get technical. It just means you have to decide what you want to achieve from the site, and what you want a user to be able to do on it. In actual fact, scoping can take on some technical aspects, and I’d suggest looking at RSpec stories as a good way to describe what it is you want the user to be able to do. Scoping can take anything from a couple of days to a few weeks to get right, depending on the size of the project and the number of people who have a say in the matter. I recommend limiting this to the client, an Interaction Designer, and a technical person of some kind (usually a Senior Developer or a Tech Lead, either of whom would have a breadth of knowledge). If you don’t want to use RSpec stories, you’ll probably want to split this project up into two: Scoping and then Technical Functional Specification. RSpec stories allow you to do both at once though, saving time and money in the process. They’re easily understood by non-technical people and they show the developers exactly what needs to happen later. Deliverables for a successful scoping project would be:
- Project Scope (containing outline business case, the method by which this will be achieved, and the salient parts of the functionality therein)
- RSpec stories for the functionality; giving a bit more detail to the project
- Cost estimate for building the project according to the signed off Scope and Stories.
Note that I added the cost estimate in at this point rather than before the start of the overall project. This is because we usually don’t know the size of the project beforehand. Admittedly it doesn’t always work this way round either, but it’s a good ideal to aim for. If at this point you find the cost is prohibitive, it’s time to push some functionality back into a second phase. One further contributing factor to the cost will be the complexity of the design, though this isn’t dealt with at this stage in the process. The design will need to please the designers, yet keep within the required budget as set out in the initial cost estimate. This can change, as a change request, if the design comes in as more complex than was originally specified (this must be detailed in the scoping document).
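To show why a story written during scoping can be read by both the client and the developers, here is a small Python sketch. The story text is made up and only loosely modelled on the plain-text Given/When/Then style, and the `scenario_steps` helper is hypothetical: it is not part of RSpec, it just shows how easily the steps can be picked out of the text.

```python
# An invented example story in a Given/When/Then style.
STORY = """\
Story: Send a message with attachments
  As a registered user
  I want to attach several files to a new message
  So that I can share documents in one go

  Scenario: attaching two files
    Given I am logged in
    And I am writing a new message
    When I attach two files
    Then I should see both files listed on the message
"""

STEP_KEYWORDS = ("Given", "When", "Then", "And")


def scenario_steps(story_text):
    """Return (keyword, text) pairs for every step line in the story."""
    steps = []
    for line in story_text.splitlines():
        keyword, _, rest = line.strip().partition(" ")
        if keyword in STEP_KEYWORDS and rest:
            steps.append((keyword, rest))
    return steps


steps = scenario_steps(STORY)
for keyword, text in steps:
    print(f"{keyword:<5} {text}")
```

The client signs off the sentences; the developers later attach code to each step, which is how the same document can serve as both scope and specification.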
Interaction Design
A lot of companies are calling this IA, meaning Information Architecture. I think this is a little misleading, as it implies a technical assessment of the data storage and delivery. Those things are dealt with by a separate process later down the line, though the basics will have been thought through during the RSpec story development earlier. Interaction Designers should be highly aware of the various issues surrounding accessibility, web standards and how alternate technologies should be used within the remit of the project. Flash is a good example, and there are countless arguments as to how it should be used on the web. A good ID should understand these fully and design the interaction of a site accordingly. This will take the RSpec Stories into a format that can be visualised by a web designer, and scaffolded (making up a basic page without an overall look and feel) by the developers. Tools like Axure RP Pro can make this a straightforward task. Many IDs like to use Visio, but it can’t show the interaction itself, whereas Axure allows you to model complex scenarios and demo them to the client.
If you’re using Axure, the deliverable for this part of the project will be a model of the site annotated as needed, and signed off by the client.
Web Design
The contentious part! Anyone can design a website, right? Wrong. A good web designer will know (mostly through communicating with web developers and gaining experience) what should, shouldn’t, can and can’t be done. Often, everything can be done that is desired, but it’s the method by which that should be done that will ultimately come out from the design. The designs should always be done in collaboration with the same group of people involved in the initial scoping of the project. Peer review is an incredibly important part in making a project like this successful. Get it wrong now, and future maintenance costs will soar. Get it right, and you’ll have yourself a beautiful site that is a joy to use.
Deliverables will be a set of designs, along with annotations within the Photoshop files, and guidelines in a document, which would detail general branding throughout the site – colours, link decoration (underlines, colour, etc), fonts and sizes used throughout, and anything else that is to be held consistently throughout the site.
Development
Given an accurate Interaction Design, this phase can actually happen while the design is in progress. At Initforthe, where we’re not doing the designs ourselves, we often provide a deadline by which we need to have final signed-off designs. As long as this is met, development can continue, and the project isn’t held up by requiring designs first. There are two aspects to development in this regard: front-end development and back-end development. The former will usually include building the designs once they have been signed off, and the latter involves building the functionality according to the RSpec stories. The back-end developers will usually build the scaffold according to the Interaction Design, and then plug in the front-end when that has been done later on. This allows everyone to get on with their part without affecting the others.
Depending on how the project is being built, parts of it can be put on a development server during build, or working parts can be put up as soon as they are done. These can be shown to the client as working models based on the RSpec stories, which now form an acceptance test, and the wireframes built as the deliverables for the ID. It is important if you’re doing this that the client is made painfully aware that the site has areas that don’t work yet, and that it doesn’t yet look like the design that they’ve signed off. But it does give them some level of confidence in seeing things happening. Each day or couple of days, something new goes up for them to look at.
The key deliverable here is the final working project.
Testing
Testing can be carried out throughout the project. Developers should be testing, and if the languages or frameworks in use provide a way to test the functionality automatically, this should be used. It takes longer to develop, but you can usually be more certain that the bugs won’t be there if it’s done properly. When the tests are written first and the code is then written to make them pass, this is called Test Driven Development. Ruby on Rails encourages this approach by shipping with testing support built in.
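As a small illustration of the test-first idea, here is a sketch using Python’s built-in `unittest` module rather than Rails. The `slugify` function and its expected behaviour are invented for this example; in Test Driven Development the two test methods would be written first, fail, and then `slugify` would be written to make them pass.

```python
import unittest


def slugify(title):
    """Turn a page title into a URL-friendly slug (hypothetical example function)."""
    return "-".join(title.lower().split())


class SlugifyTest(unittest.TestCase):
    """In TDD these tests exist before slugify does, and drive how it is written."""

    def test_lowercases_and_joins_words_with_hyphens(self):
        self.assertEqual(slugify("The Development Process"), "the-development-process")

    def test_collapses_repeated_spaces(self):
        self.assertEqual(slugify("Hosting   part 3"), "hosting-part-3")


def run_tests():
    """Run the test case and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Once tests like these exist, they can be re-run after every change, so a fix in one place that breaks another is caught before the client sees it.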
Final testing should be carried out to check a number of things. The first is that the functionality works in the browser. If the functionality works, the acceptance criteria can be signed off as complete. The second is the look and feel. This should be as per the designs signed off by the clients and peer reviewed by the ID and developers. It should be tested in a multitude of browsers, though at the time of writing this article, I’d recommend the following set: Mozilla Firefox 3, Apple Safari 3, Internet Explorer 7, Opera 9.5. The reason is that each of these browsers will prompt the user to upgrade when there are new versions available, and as there is no reason not to upgrade, only the latest should realistically be supported. My previous article on supporting IE6 details why old browsers shouldn’t be supported any more.
Once you have functionality and look and feel signed off, you have a working product.
Things to remember
There are always a few things you need to keep in mind when you’re building a project of any size. First is the scope. If something falls outside of the scope, it is important to keep track of it as a change request. Some will impact severely on cost, and often also on time, which will push the delivery date back if it’s agreed that they’re needed for launch. Alternatively they should be put into a separate project to be started after the end of the current one.
Next is copy. Signed off copy is what should be delivered to developers for inclusion in the site. If the copy changes at a later date, this is also classed as a change request and should be charged for. There is good reason for keeping a strict rule about this. In offline work, if you change the copy after the artwork has gone to the printers, or you change the credits in a video after sending it to the distribution companies, they’ll charge an arm and a leg to get the new version out on time and to stop the printing presses or distribution of the version you initially sent. Just because the web feels instant doesn’t mean the rule shouldn’t be adhered to in the same fashion. A lot of the time, copy changes are where money is either made or lost on the project.
The final thing to remember is that it always takes longer than the client wants to build what they’re after! So make them aware of it, and make sure they know they may have to drop some functionality to get the project out on time. Further things can always be added later!
Hosting part 3: virtual servers?
Part 3 is here, and it takes a few forms that I’ll discuss.
Virtualisation is the process of taking a physical computer and dividing it up so that it thinks it has lots of computers inside it, each doing different things completely unrelated to each other. Why is this useful? Because you could either take on a small bit of someone else’s real computer without the cost of buying your own, and with the security that comes with having your own, or you could divide up a computer that you already have into smaller chunks to do different things with each chunk.
When it comes to hosting a website, a lot of the time, the server is sitting there doing very little, so it’s almost wasted space in a way. To get the computer working harder, you can make it do more things, but what then happens if those more things conflict with each other somehow? Or you’re using some common parts and someone changes the configuration for one site, and it then breaks all the others! It happens.
Virtualisation is the answer to this: each virtual machine (sometimes known as a virtual private server, or VPS) has its own memory, its own drive space, its own applications and even its own operating system. People who have access to one don’t necessarily have access to any others on the same machine.
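The dividing-up can be pictured with a toy model in Python. This is only a sketch of the bookkeeping, not how any real virtualisation software works; the `Host` class, the machine names and the sizes are all invented for illustration.

```python
class Host:
    """A physical machine with a fixed amount of memory and disk to share out."""

    def __init__(self, memory_mb, disk_gb):
        self.free_memory_mb = memory_mb
        self.free_disk_gb = disk_gb
        self.machines = {}

    def create_vm(self, name, memory_mb, disk_gb, operating_system):
        """Carve a virtual machine out of whatever the host has left."""
        if memory_mb > self.free_memory_mb or disk_gb > self.free_disk_gb:
            raise ValueError(f"not enough free resources for {name}")
        self.free_memory_mb -= memory_mb
        self.free_disk_gb -= disk_gb
        # Each virtual machine gets its own settings, separate from the others.
        self.machines[name] = {
            "memory_mb": memory_mb,
            "disk_gb": disk_gb,
            "operating_system": operating_system,
            "config": {},
        }
        return self.machines[name]


host = Host(memory_mb=8192, disk_gb=500)
shop = host.create_vm("shop", memory_mb=2048, disk_gb=100, operating_system="Linux")
blog = host.create_vm("blog", memory_mb=1024, disk_gb=50, operating_system="Linux")

# Changing the configuration of one machine leaves the other untouched.
shop["config"]["max_upload_mb"] = 20
```

Here a configuration change for the shop cannot break the blog, which is the problem with shared setups described above.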
If you’re looking for a solution for a website that is going to have super heavy load, virtualisation might not cut it for you, unless you decide to create a virtual server farm and dish out the work to different virtual machines. However, for the vast majority of websites, this isn’t the case, and virtualisation could meet the need for data protection and separation, the lack of which can cause so many problems in other cases.
With respect then to virtualisation, the options available are usually something along the lines of:
- Get your own server and virtualise it, using it for many different applications
- Rent space on someone else’s servers
The latter of those two options can be further divided up, as big companies like SliceHost, Amazon, EngineYard and so on provide virtualisation in different ways. Amazon uses virtualisation to provide their EC2 cloud computing platform, whereas SliceHost and EngineYard provide what they call “slices”; in reality just virtual machines on a larger real machine.
At Initforthe, we’ve been using virtual servers for some time to provide our clients with hosting and have built up our own centrally managed platform. Of course, our options aren’t the only solution, but they do offer our clients a way in which everything to do with a given web build can come from one place. Saying that though, we recently embarked on a project that was set to use the entire swathe of services provided by Amazon Web Services (EC2 for computing, S3 for storage, and their newly released EBS to store the database on). What we found is that the reality of virtualisation doesn’t always match what is sold… OK, that’s not true in a lot of cases: SliceHost and EngineYard have impeccable records, and our servers have shown very little downtime themselves. Amazon was a slightly different story though.
Whilst developing the EC2 instances and installing the applications on them, we found that they were rebooting almost randomly, and the access to EBS (the database storage facility) was flaky. You can’t have a website which is up some of the time and not all of it, and you certainly can’t have it go down because your machines can’t access the database any more. To finish this off, we then discovered that actually EC2 had no SLA (Service Level Agreement) to maintain uptime, and so we went with a more familiar hosting method, though still virtualised. So far we’ve had no problems with this and for the time being our recommendation is to steer clear of cloud computing services until such a time as they’ve been thoroughly matured.
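To put the idea of an uptime commitment into numbers, here is a short worked example in Python. The percentages are purely illustrative and are not taken from any provider’s terms; the function name and the 30-day month are assumptions made for the sake of the arithmetic.

```python
HOURS_PER_MONTH = 30 * 24  # assume a 30-day month, for simplicity


def allowed_downtime_hours(uptime_percent, hours=HOURS_PER_MONTH):
    """Hours a service may be down in the period while still meeting the target."""
    return hours * (1 - uptime_percent / 100)


# Illustrative targets only, not any real provider's commitment.
for target in (99.0, 99.9, 99.99):
    downtime = allowed_downtime_hours(target)
    print(f"{target}% uptime allows about {downtime:.2f} hours down per month")
```

Without an SLA there is no such figure to hold the provider to, so random reboots like the ones we saw cannot be measured against anything.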
Back to the usual methods of virtualisation. Why? Who do you use? Why not just use shared hosting? I think I’ve explained most of these, but in finishing this part of the article, I’ll summarise:
Virtual servers are much cheaper than buying lots of hardware. They use the server to its capacity, so you don’t waste the expense that the machine incurs just by being on. They’re stable (most methods anyway!) and keep data far more secure than shared hosting. You have your virtual space, and no one can get in, as long as it’s managed properly.
