I have a major problem with many of the Web 2.0 companies that I meet in my job as a venture capitalist: They lack even the most basic understanding of Internet operations.
I realize that the Web 2.0 community generally views Internet operations and network engineering as router-hugging relics of the past century, desperately clinging to their cryptic, SSH-enabled command line interfaces. But some of my friends working on Web 2.0 applications have recently reminded me that Internet operations can have a major impact on this century’s application performance and operating costs.
So all you agile programmers working on Ruby-on-Rails, Python and AJAX, pay attention: If you want more people to think your application loads faster than Google and do not want to pay more to those ancient phone companies providing your connectivity, learn about your host. It’s called the Internet.
As my first case in point, I was recently contacted by a friend working at a Web 2.0 company that had just launched its application. They were getting pretty good traction and adoption, adding around a thousand unique users per day, but just as the buzz was starting to build, the distributed denial-of-service (DDOS) attack arrived. The DDOS attack was deliberate, malicious and completely crushed their site. This was not an extortion type of DDOS attack (where the attacker contacts the site and extorts money in exchange for not taking the site offline); it was an extraordinarily harmful site performance attack that rendered the site virtually unusable, taking a non-Google-esque time of about three minutes to load.
No one at my friend’s company had a clue as to how to stop the DDOS attack. The basics of securing the Web 2.0 application against threats from its host system, the Internet, were completely lacking. With the help of some other friends, ones who combat DDOS attacks on a daily basis, we were able to configure the routers and firewalls at the company to turn off inbound ICMP echo requests, block inbound high port number UDP packets and enable SYN cookies. We also contacted the upstream ISP and enabled some IP address blocking. These steps, along with a few more tricks, were enough to thwart the DDOS attack until my friend’s company could find an Internet operations consultant to come on board and configure their systems with the latest DDOS prevention software and configurations.
Unfortunately, the poor site performance was not missed by the blogosphere. The application has suffered from a stream of bad publicity; it’s also missed a major window of opportunity for user adoption, which has sloped significantly downward since the DDOS attack and shows no sign of recovering. So if the previous paragraph read like alphabet soup to everyone at your Web 2.0 company, it’s high time you start looking for a router-hugger, or soon your site will be loading as slowly as AOL over a 19.2 Kbps modem.
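For anyone to whom the router and firewall steps in this story did read like alphabet soup, here is a rough sketch of the filtering logic in Python. It is only an illustration of the policy, not the configuration we actually applied: the `Packet` class, the `should_drop` function, the 1024 cutoff for "high" ports and the blocked address range are all assumptions I made up for the example. SYN cookies are not modeled at all, since they live in the TCP stack rather than in a packet filter.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

ICMP_ECHO_REQUEST = 8    # ICMP type 8 is "echo request" (ping)
HIGH_PORT_START = 1024   # assumption: "high port" means above the well-known range

# Hypothetical blocklist. 203.0.113.0/24 is a documentation range,
# standing in for whatever sources the upstream ISP agreed to block.
BLOCKED_NETWORKS = [ip_network("203.0.113.0/24")]


@dataclass
class Packet:
    """A toy inbound packet; real filters inspect far more fields."""
    src_ip: str
    protocol: str        # "icmp", "udp" or "tcp"
    dst_port: int = 0    # unused for ICMP
    icmp_type: int = -1  # unused for UDP and TCP


def should_drop(pkt: Packet) -> bool:
    """Return True if the inbound packet matches one of the drop rules."""
    src = ip_address(pkt.src_ip)
    # Rule 1: source address is on the blocklist.
    if any(src in net for net in BLOCKED_NETWORKS):
        return True
    # Rule 2: inbound ICMP echo request.
    if pkt.protocol == "icmp" and pkt.icmp_type == ICMP_ECHO_REQUEST:
        return True
    # Rule 3: inbound UDP aimed at a high port.
    if pkt.protocol == "udp" and pkt.dst_port >= HIGH_PORT_START:
        return True
    # SYN cookies are not a filter rule; on Linux hosts they are
    # switched on with the net.ipv4.tcp_syncookies sysctl.
    return False
```

Under these assumptions, a ping or a UDP packet aimed at port 40000 is dropped, while ordinary web traffic to port 80 from an unblocked address still gets through.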
Another friend of mine was helping to run Internet operations for a Web 2.0 company with a sizable amount of traffic — about half a gigabit per second. They were running this traffic over a single gigabit Ethernet link to an upstream ISP run by an ancient phone company providing them connectivity to their host, the Internet. As their traffic steadily increased, they consulted the ISP and ordered a second gigabit Ethernet connection.
Traffic increased steadily and almost linearly until it reached about 800 megabits per second, at which point it flattened out, refusing to rise above a gigabit even with the second link in place. The Web 2.0 company began to worry that either their application was limited in its performance or that users were suddenly using it differently.
On a hunch, my friend called me up and asked that I take a look at their Internet operations and configurations. Without going into a wealth of detail, the problem was that while my friend’s company had two routers, each with a gigabit Ethernet link to their ISP, the BGP routing configuration was done horribly wrong and resulted in all traffic using a single gigabit Ethernet link, never both at the same time. (For those interested, both gigabit Ethernet links went to the same upstream eBGP router at the ISP, which meant that my friend’s routers saw the exact same AS-Path lengths, MEDs, and local preferences for all prefixes. So BGP picked the eBGP peer with the lowest IP address for all prefixes and traffic.) Fortunately, a temporary solution was relatively easy: I configured each router to take only half of the prefixes from each upstream eBGP peer, then worked with the ISP to give my friend some real routing diversity.
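For the curious, the tie-break behavior can be sketched in a few lines of Python. This is a heavily simplified model, not real router configuration: actual BGP best-path selection has more steps than the four compared here, the single decision point stands in for the two routers, and the `Route` class, the peer addresses, the prefixes and the split of the address space at 128.0.0.0 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Route:
    """One advertisement of a prefix, as learned from one eBGP peer."""
    prefix: str
    peer_ip: str
    local_pref: int = 100
    as_path_len: int = 3
    med: int = 0


def best_path(candidates):
    """Simplified selection: highest local preference, then shortest
    AS path, then lowest MED, then lowest peer IP address."""
    return min(
        candidates,
        key=lambda r: (-r.local_pref, r.as_path_len, r.med,
                       int(ip_address(r.peer_ip))),
    )


def accept(route, peer_a):
    """Hypothetical inbound filter: take the lower half of the IPv4
    space only from peer A and the upper half only from the other peer."""
    lower_half = ip_network(route.prefix).subnet_of(ip_network("0.0.0.0/1"))
    return lower_half == (route.peer_ip == peer_a)


def chosen_peers(routes, filtered, peer_a):
    """Map each prefix to the peer that wins best-path selection."""
    by_prefix = {}
    for route in routes:
        if filtered and not accept(route, peer_a):
            continue
        by_prefix.setdefault(route.prefix, []).append(route)
    return {p: best_path(c).peer_ip for p, c in by_prefix.items()}


# Both peers advertise every prefix with identical attributes.
PEER_A, PEER_B = "192.0.2.1", "192.0.2.5"
PREFIXES = ["10.0.0.0/8", "100.64.0.0/10", "172.16.0.0/12", "198.18.0.0/15"]
ROUTES = [Route(p, peer) for p in PREFIXES for peer in (PEER_A, PEER_B)]

if __name__ == "__main__":
    print(chosen_peers(ROUTES, filtered=False, peer_a=PEER_A))
    print(chosen_peers(ROUTES, filtered=True, peer_a=PEER_A))
```

Without the filter, every prefix resolves to the peer with the lower address, so one link carries everything. With the filter, the four sample prefixes split evenly between the two peers, which is the spirit of the temporary fix.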
The traffic to my friend’s Web 2.0 company is back on a linear climb – in fact it jumped to over a gigabit as soon as I was done configuring the routers. While the company has their redundancy and connectivity worked out, they did pay their ancient phone company ISP for over four months for a second link that was essentially worthless. I will leave that negotiation up to them, but I’m fairly sure the response from the ISP will be something like, “We installed the link and provided connectivity, sorry if you could not use it properly. Please go pound sand and thank you for your business.” Only by using some cryptic command line interface was I able to enable their Internet operations to scale with their application and get the company some value for the money they were spending on connectivity.
Web 2.0 companies need to get a better understanding of the host entity that runs their business, the Internet. If not, they need to find someone who does, preferably someone they bring in at inception. Failing to do so will inevitably cost these companies users, performance and money.