2010-05-05

Reposted from http://highscalability.com/blog/2010/5/3/mocospace-architecture-3-billion-mobile-page-views-a-month.html

 

This is a guest post by Jamie Hall, Co-founder & CTO of MocoSpace, describing the architecture for their mobile social network. This is a timely architecture to learn from as it combines several hot trends: it is very large, mobile, and social. What they think is especially cool about their system is: how it optimizes for device/browser fragmentation on the mobile Web; their multi-tiered, read/write, local/distributed caching system; selecting PostgreSQL over MySQL as a relational DB that can scale.

MocoSpace is a mobile social network, with 12 million members and 3 billion page views a month, which makes it one of the most highly trafficked mobile Websites in the US. Members access the site mainly from their mobile phone Web browser, ranging from high end smartphones to lower end devices, as well as the Web. Activities on the site include customizing profiles, chat, instant messaging, music, sharing photos & videos, games, eCards and blogs. The monetization strategy is focused on advertising, on both the mobile and Websites, as well as a virtual currency system and a handful of premium feature upgrades.

Stats

  1. 3 billion page views a month
  2. 4th most trafficked mobile website, after MySpace, Facebook and Google (http://www.groundtruth.com/mobile-is-mobile)
  3. 75% mobile Web, 25% Web
  4. 12 million members
  5. 6 million unique visitors a month
  6. 100k concurrent users
  7. 12 million photo uploads a month
  8. 2 million emails received a day, 90% spam, 2.5 million sent a day
  9. Team of 8 developers, 2 QA, 1 sysadmin

Platform

  1. CentOS + Red Hat
  2. Resin application server — Java Servlets, JavaServer Pages, Comet
  3. PostgreSQL
  4. Memcached
  5. ActiveMQ – job + message queue, in Red Hat cluster for high availability
  6. Squid – static content caching, tried Varnish but had stability issues
  7. JQuery + Ajax
  8. Amazon S3 for user photo & video storage (8 TB) and EC2 for photo processing
  9. F5 BigIP load balancers – sticky sessions, gzip compression on all pages
  10. Akamai CDN – 2 TB a day, 250 million requests a day
  11. Monitoring – Nagios for alerts, Zabbix for trending
  12. EMC SAN – high IO performance for databases by RAIDing (RAID 10) lots of disks, replacing with high performance Fusion-io solid-state flash ioDrives, much more cost effective
  13. PowerMTA for mail delivery, Barracuda spam firewalls
  14. Subversion source control, Hudson for continuous integration
  15. FFMPEG for Mobile to Web and Web to mobile video transcoding
  16. Selenium for browser test case automation
  17. Web tier
    1. 5x Dell 1950, 2x dual core, 16G RAM
    2. 5x Dell 6950/R905, 4x dual core, 32G RAM
  18. Database tier
    1. 2x Sun Fire X4600 M2 Server, 8x quad core, 256G RAM
    2. 2x Dell 6950, 4x dual core, 64G RAM

Architecture

  1. All pages are dynamic, with user data and customizations as well as many browser and device specific optimizations. Browser and device fragmentation issues are much greater on mobile than on the Web. Many optimizations, adaptations required based on browser capabilities, limited support for CSS/JavaScript, screen size, etc. Mobile Web traffic is often served via network proxies (gateways), with poor support for Cookies, making session management and user identification a challenge.
  2. A big challenge is handling the device/browser fragmentation on the mobile Web – optimizing for a huge range of device capabilities (everything from iPhones with touchscreens to 5 year old Motorola Razrs), screen sizes, lack of / inconsistent Web standards compliance, etc. We abstract out our presentation layer so we can serve pages to all mobile devices from the same code base, and maintain a large device capabilities database (containing things like screen size, supported file types, maximum allowed page sizes, etc) which is used to drive generation of our pages. The database contains capability details for hundreds of devices and mobile browser types.
  3. Database is sharded based on a user key, with a master lookup table mapping users to shards. We rolled our own query and aggregation layer, allowing us to query and join data across shards, though this is not used frequently. With sharding we sacrifice some consistency, but that’s Ok as long as you’re not running a bank. We perform data consistency checks offline, in batches, with the goal being eventual consistency. Large tables are partitioned into smaller sub tables for more efficient access, reducing time tables are locked for updates as well as operational maintenance activities. Log shipping used for warm standbys.
  4. A multi-tiered caching system is used, with data cached locally within the application servers as well as distributed via Memcached. When making an update we don’t just invalidate the cache and then re-populate after reading again from the database, rather we update Memcached with the new data and save another trip to the database. When updating the cache an invalidation directive is sent via the messaging queue to the local caches on each of the application servers.
  5. A distributed message queue is used for distributed server communication, everything from sending messages in realtime between users to system messages such as local cache invalidation directives.
  6. Dedicated server for building and traversing social graph entirely in memory, used to generate friend recommendations, etc.
  7. Load balancer used for rolling deploys of new versions of the site without affecting performance/traffic.
  8. Release every 2 weeks. Longer release cycles = exponential complexity, more difficult to troubleshoot and roll back. Development team responsible for deploying to and managing production systems – “you built it, you manage it”.
  9. Kickstart used to automate server builds from bare metal. Puppet is used to configure a server to a specific personality i.e. Webserver, database server, cache server etc, as well as to push updated policies to running nodes.
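
The cache update path from item 4 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not MocoSpace's code (their stack is Java): plain dicts stand in for PostgreSQL, Memcached and each application server's local cache, a simple list of subscribers stands in for the message queue, and every class and method name is hypothetical.

```python
class AppServer:
    """One application server with its own local (in-process) cache."""

    def __init__(self, name, shared_cache, database, bus):
        self.name = name
        self.local = {}             # tier 1: local cache on this server
        self.shared = shared_cache  # tier 2: stand-in for Memcached
        self.db = database          # stand-in for the database
        self.bus = bus              # stand-in for the message queue
        bus.append(self)            # subscribe to invalidation directives

    def read(self, key):
        if key in self.local:
            return self.local[key]
        if key in self.shared:
            value = self.shared[key]
        else:
            value = self.db[key]    # only a miss in both tiers hits the DB
            self.shared[key] = value
        self.local[key] = value
        return value

    def write(self, key, value):
        self.db[key] = value
        # Put the new value straight into the shared cache instead of
        # deleting the entry, so the next reader skips a database trip.
        self.shared[key] = value
        self.local[key] = value
        # Tell every other server to drop its stale local copy.
        for server in self.bus:
            if server is not self:
                server.invalidate(key)

    def invalidate(self, key):
        self.local.pop(key, None)


database = {"user:1": "old status"}
memcached = {}
bus = []
a = AppServer("a", memcached, database, bus)
b = AppServer("b", memcached, database, bus)

b.read("user:1")                 # b now holds "old status" locally
a.write("user:1", "new status")  # updates DB and shared cache, invalidates b
result = b.read("user:1")        # served from the shared cache
```

After the write, server `b` no longer trusts its local copy and picks up the fresh value from the shared tier without touching the database.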

Lessons Learned

  1. Make your boxes sweat. Don’t be afraid of high system load as long as response times are acceptable. We pack as many as five application server instances on a single box, each running on a different port. Scale up to the high end of commodity hardware before scaling out. Can pick up used or refurbished powerful 4U boxes for cheap.
  2. Understand where your bottlenecks are in each tier. Are you CPU or IO (disk, network) bound? Database is almost always IO (disk) bound. Once the database doesn’t fit in RAM you hit a wall.
  3. Profile the database religiously. Obsess when it comes to optimizing the database. Scaling Web tier is cheap and easy, database tier is much harder and expensive. Know the top queries on your system, both by execution time and frequency. Track and benchmark top queries with each release, need to catch and address performance issues with the database early. We use the pgFouine log analyzer and new PostgreSQL pg_stat_statements utility for generating profiling snapshots in real-time.
  4. Design to disable. Be able to configure and turn off anything you release instantly, without requiring a code change or deployment. Load and stress testing are important but nothing like testing with live, production traffic via incremental, phased rollouts.
  5. Communicate synchronously only when absolutely necessary. When one component or service fails it shouldn’t bring down other parts of the system. Do everything you can in the background or as a separate thread or process, don’t make the user wait. Update the database in batches wherever possible. Any system making requests outside the network needs aggressive monitoring, timeout settings, and failure handling / retries. For example, we found latency and failure rates can be significant, so we queue failed calls and retry later.
  6. Think about monitoring during design, not after. Every component should produce performance, health, throughput, etc data. Set up alerts when these exceed thresholds. Consolidated graphs showing metrics across all instances, rather than just per instance, are particularly helpful for identifying issues and trends at a glance and should be reviewed daily — if normal operating behavior isn’t well understood it’s impossible to identify and respond to what isn’t. We tried many monitoring systems – Cacti, Ganglia, Hyperic, Zabbix, Nagios, as well as some custom built solutions. Whichever you use the most important thing is to be comfortable with it, otherwise you won’t use it enough. It should be easy, using templates, etc to quickly monitor new boxes and services as you throw them up.
  7. Distributed sessions can be a lot of overhead. Go stateless when you can, but when you can’t consider sticky sessions. If the server fails the user loses their state and may need to re-login, but that’s rare and acceptable depending on what you need to do.
  8. Monitor and beware of full/major garbage collection events in Java, which can lock up the whole JVM for up to 30 seconds. We use Concurrent Mark Sweep (CMS) garbage collector, which introduces some additional system overhead, but have been able to eliminate full garbage collections.
  9. When a site gets large enough it becomes a magnet for spammers and hackers, both on site and from outside via , etc. Captcha and IP monitoring are not enough. Must invest aggressively in detection and containment systems, internal tools to detect suspicious user behavior and alert and/or attempt to automatically contain.
  10. Soft delete whenever possible. Mark data for later deletion, rather than deleting immediately. Deletion can be costly, so queue up for after hours, plus if someone makes a mistake and deletes something they shouldn’t have it’s easy to roll back.
  11. N+1 design. Never less than two of anything.
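
The soft-delete idea from item 10 can be illustrated with a small sketch. It is written in Python for brevity and is only an assumption of how such a scheme might look: the `PhotoStore` class, its fields and the in-memory dict are all hypothetical stand-ins for a real table with a "deleted at" column.

```python
import time


class PhotoStore:
    """In-memory stand-in for a table with a soft-delete marker."""

    def __init__(self):
        # photo_id -> {"data": ..., "deleted_at": None or a timestamp}
        self.rows = {}

    def add(self, photo_id, data):
        self.rows[photo_id] = {"data": data, "deleted_at": None}

    def soft_delete(self, photo_id, now=None):
        # Cheap: only marks the row, nothing is removed yet.
        self.rows[photo_id]["deleted_at"] = time.time() if now is None else now

    def restore(self, photo_id):
        # Rolling back a mistaken delete is just clearing the marker.
        self.rows[photo_id]["deleted_at"] = None

    def visible(self):
        return sorted(pid for pid, row in self.rows.items()
                      if row["deleted_at"] is None)

    def purge(self, older_than):
        # The costly physical delete, meant to run in an off-peak batch.
        doomed = [pid for pid, row in self.rows.items()
                  if row["deleted_at"] is not None
                  and row["deleted_at"] <= older_than]
        for pid in doomed:
            del self.rows[pid]
        return len(doomed)


store = PhotoStore()
store.add(1, "a.jpg")
store.add(2, "b.jpg")
store.add(3, "c.jpg")
store.soft_delete(1, now=100.0)
store.soft_delete(2, now=200.0)
store.restore(2)                        # someone deleted photo 2 by mistake
purged = store.purge(older_than=150.0)  # after-hours batch job
remaining = store.visible()
```

Reads filter on the marker, so users stop seeing a deleted item immediately while the expensive removal waits for the batch run.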

I’d really like to thank Jamie for taking the time to write this experience report. It’s a really useful contribution for others to learn from. If you would like to share the architecture for your fabulous system, please contact me and we’ll get started.

 

^^^^^^^^^^^^^^^^^^^^^^

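A very practical experience report, well worth reposting. The way each platform component is paired with a clear purpose is especially well laid out. Amazon's services seem to be getting used more and more widely: so far I had only noticed them storing Twitter's avatar icons, and now there is MocoSpace as well.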

2010-04-22

http://www.slideshare.net/kevinweil/nosql-at-twitter-nosql-eu-2010

http://www.slideshare.net/ryansking/scaling-twitter-with-cassandra

http://www.slideshare.net/nkallen/q-con-3770885

in Real-Time at Twitter

http://www.slideshare.net/al3x/building-distributed-systems-in-scala

 

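These slides mainly analyze how functional requirements are sensibly matched with what adopting distributed systems can achieve.

Personally I think these are quite good slide decks for getting a basic understanding of Twitter, specifically of the architecture behind it: what each feature and service is built on, and what scale it reaches.

For the details, go and read the slides yourself; you will see once you do.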

 

^^^^^^^^^^^^

Also noting down a book title:

Programming Scala (O’Reilly 2009)

2010-04-03

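I already knew a little about this sort of thing, but today I genuinely ran into it myself. I was really surprised and shaken, a little bewildered, and reminded of how powerful technology is. Of course, all of this is only my own one-sided guess.

This afternoon I posted this message on Twitter:

Does anyone know about the #SXSW 2010 conference? Could you give a rough rundown of this year's highlights? http://sxsw.com about 9 hours ago via Echofon

And then just now… I received this from SlideShare: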

Dear jimeyren,
We know not everyone can make it out to Austin for , so we’ve compiled as many presentations as possible. Hope you enjoy!
Cheers,
SlideShare Team

 

………………………………………………

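I can't call this a coincidence. I can only suspect that SlideShare ran some semantic analysis on the tweets I posted today; otherwise it could not have known what material I was looking for, and yet it sent me a notification. Very eerie.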

2010-03-29

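Author: David Cramer | Visit plugin homepage

My setup here is for time zone +9 (UTC+9); if you are in time zone +8, adjust the values yourself.

Edit \themes\default\main.inc.php

There are five places to change: today, yesterday, and the three timestamp expressions below.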

[code lang="php"]

<?php
$day = '';
if (count($events))
{
$today = date('m d Y', time() - 54000);
$yesterday = date('m d Y', time() -140400);
if ($has_paging)
{
if ($has_prev_page) {
echo '<p><a href="' . $->get_previous_page_url($page) . '">Newer Entries</a></p>';
}
}
?>
<table>
<?php
foreach ($events as $event)
{
$timestamp = $event->get_date();
// 32400 s = 9 h, the same offset applied to every event timestamp
if ($today == date('m d Y', $timestamp - 32400)) $this_day = $->__('Today');
else if ($yesterday == date('m d Y', $timestamp - 32400)) $this_day = $->__('Yesterday');
else $this_day = $->__(ucfirst(htmlentities(date($->get_option('day_format'), $timestamp - 32400))));
if ($day != $this_day)
{
?>
<tr>
<th colspan="2">
<h2><?php echo $this_day; ?></h2>
</th>
</tr>
<?php
$day = $this_day;
}
..............
[/code]
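
The offsets in the snippet are plain second counts. As a quick sanity check of what they amount to in hours, here is a small Python sketch; the helper name is made up for illustration, and how the values should change for another time zone is not spelled out in the post, so that part is left to the reader.

```python
SECONDS_PER_HOUR = 3600


def to_hours(seconds):
    """Convert a whole-second offset into hours."""
    return seconds / SECONDS_PER_HOUR


# Offsets as they appear in the PHP snippet above.
offsets = {"today": 54000, "yesterday": 140400, "timestamp": 32400}
hours = {name: to_hours(value) for name, value in offsets.items()}
day_gap = hours["yesterday"] - hours["today"]
```

The timestamp offset works out to 9 hours, and the "yesterday" offset is the "today" offset plus one full day.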


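A location-stream application like this

For details, see http://foursquare.com/

I wonder whether there could be a timeline+ application?

That is: cover all 24 hours of each day, recording where you were and for how long, and so log each day's itinerary.

From a privacy standpoint this data is not something to publish, but just as cars have in-vehicle recorders, there could be a person-mounted one that logs your own daily itinerary.

Much like the existing practice of video-recording what you do every day, you could dig these records out and reminisce once you retire or have some spare time.

As life gets ever busier and there is less and less spare time to record the little moments, and as society becomes ever more open and ever less private, perhaps such a product will appear?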

2009-12-26

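It is much like Monternet (移动梦网) in mainland China:

a service launched by SK, South Korea's largest mobile carrier.

I had not known about it until now.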

http://twitter.com/natetweeting

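I only found out today when an RT reached me on Twitter:

@tweeting_event: [Something for every Tweeting customer!] A guaranteed Starbucks caffè latte, plus RT this post for a chance at a Sony VAIO netbook or a Haptic AMOLED phone~ http://j.mp/4TbI97

Basically it lures people into joining the Twitter service on offer, then hands out small rewards and discounts.

After joining,

you get 300 free deliveries to your phone per month: it reads the messages on your Twitter and sends them to you. The options are: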

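1. All friends

2. Selected friends

3. Friends turned off

4. Only messages that @mention you

5. Delivery time window settings

6. Temporarily turn the feature off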

 

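Of course it also gives you a number for posting to Twitter:

just send to *1234.

 

Also:

http://hantweet.com/

It looks like this service is going to… be replaced. Though subscribers of KT, among others, should presumably still be able to use it.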

……………………………………………………………………

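Current shortcomings

1. Twitter authentication uses id & pw rather than the now-popular OAuth approach……

2. It has not yet applied for its own API address on Twitter… it still defaults to the plain API mode, which is rather ugly…

 

Advantages:

1. Currently 300 messages a month are delivered to your phone for free, but every delivered message is only an abstract. To read the full text you have to log in from your phone, and that costs money…… data charges, data charges…… that is how the money gets made.

2. hantweet also charges if you want to receive messages.

Monthly plans: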

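  • 4,900 won/month
  • Delivers up to 200 HanTweet replies and DMs per month

 

  • 9,900 won/month
  • Delivers up to 500 HanTweet replies and DMs per month

Per message:

  • 3,300 won
  • Buys 100 tokens

So 100 messages cost 3,300 won, 200 cost 6,600 won, 300 cost 9,900 won, and so on.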
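
To compare the hantweet options, the effective price per delivered message can be worked out from the figures above. This is a rough Python sketch under two assumptions that the post only implies: one token pays for one message, and a monthly plan's quota is used in full.

```python
# (price in won, number of messages covered)
plans = {
    "monthly 4,900 won": (4900, 200),
    "monthly 9,900 won": (9900, 500),
    "100 tokens": (3300, 100),
}

# Effective cost per message, assuming the full quota is used.
per_message = {name: price / count for name, (price, count) in plans.items()}
cheapest = min(per_message, key=per_message.get)
```

Under those assumptions the larger monthly plan is the cheapest per message and the token pack is the most expensive; a light user who leaves most of a monthly quota unused would see a different picture.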

……………………………………………………………………………………
