How to scale Dropbox

What is Dropbox
Scale:
10s of millions of users
100s of millions of file syncs per day

Challenges

  1. Write volume.
  2. ACIDity requirements (Atomicity, Consistency, Isolation, Durability)

Example #1: High-level architecture
Phase 0:
Clients –> The Server
This is the humble beginning of Dropbox.

Phase 1: (The server is overloaded)
Clients –> The Server –> S3/DB(MySQL)

Phase 2: (The server may not keep up with download requests)
Clients –> Meta Server –> Notification Server –> Clients
–> DB

Clients –> BlockServer (upload requests, on AWS EC2) –> S3
–> DB
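
A minimal sketch of the Phase 2 split, in Python. The diagram only says that uploads go through the BlockServer and metadata goes through the Meta Server, which then reaches the Notification Server. Everything else here is an assumption made for illustration: the class and method names, the in-memory dicts standing in for S3 and the DB, and the choice to key blocks by their SHA-256 hash.

```python
import hashlib


class NotificationServer:
    """Tells subscribed clients that a namespace changed (hypothetical API)."""

    def __init__(self):
        self.subscribers = {}

    def subscribe(self, ns_id, callback):
        self.subscribers.setdefault(ns_id, []).append(callback)

    def notify(self, ns_id, filename):
        for callback in self.subscribers.get(ns_id, []):
            callback(ns_id, filename)


class BlockServer:
    """Upload path: writes file bytes to the blob store and records the block."""

    def __init__(self, blob_store, db):
        self.blob_store = blob_store  # stands in for S3
        self.db = db                  # stands in for MySQL

    def upload(self, data):
        # Assumption: blocks are addressed by content hash.
        block_key = hashlib.sha256(data).hexdigest()
        self.blob_store[block_key] = data
        self.db.setdefault("blocks", set()).add(block_key)
        return block_key


class MetaServer:
    """Metadata path: records which block a file points to, then notifies."""

    def __init__(self, db, notifier):
        self.db = db
        self.notifier = notifier

    def commit(self, ns_id, filename, block_key):
        if block_key not in self.db.get("blocks", set()):
            raise KeyError("block was never uploaded")
        self.db.setdefault("files", {})[(ns_id, filename)] = block_key
        self.notifier.notify(ns_id, filename)


# Wire the pieces together and push one file through both paths.
s3, db = {}, {}
notifier = NotificationServer()
block_server = BlockServer(s3, db)
meta_server = MetaServer(db, notifier)

seen = []
notifier.subscribe(1, lambda ns_id, filename: seen.append((ns_id, filename)))

key = block_server.upload(b"hello")
meta_server.commit(1, "notes.txt", key)
```

The point of the split is that large uploads never pass through the Meta Server, so the two paths can be sized and scaled separately.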

Phase 3: (UPL caused by long distance; whole logic encapsulated in the Meta Server; lack of load balancer and memcache)

Clients –> LB –> Meta Server (distributed) –> Notification Server –> Clients
–> memcache
–> DB

Clients –> BlockServer –> LB
–> S3
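
One common way to put memcache in front of the DB is a read-through cache with invalidation on write. The notes do not say which caching policy Dropbox used, so the sketch below is only an assumed policy; `MetaStore`, `db_reads`, and the plain dicts standing in for memcache and MySQL are hypothetical.

```python
class MetaStore:
    """Read-through cache in front of a slower store (both are plain dicts here)."""

    def __init__(self, db):
        self.db = db          # stands in for MySQL
        self.cache = {}       # stands in for memcache
        self.db_reads = 0     # counts how often the DB is actually hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write to the DB first, then drop the cached copy so the next
        # read reloads the fresh value.
        self.db[key] = value
        self.cache.pop(key, None)


store = MetaStore({"ns:1": "v1"})
first = store.get("ns:1")    # miss: goes to the DB
second = store.get("ns:1")   # hit: served from the cache
store.put("ns:1", "v2")      # invalidates the cached entry
third = store.get("ns:1")    # miss again, returns the new value
```

Repeated reads of hot metadata then stop reaching the DB, which is the load the Meta Server was suffering from.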
Phase 4: (High Availability: make servers distributed)

Clients –> LB (distributed)–> Meta Server (distributed) –> Notification Server(distributed) –> Clients
–> memcache(distributed)
–> DB(distributed)

Clients –> BlockServer (distributed) –> LB(distributed)
–> S3
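
A sketch of what "distributed" buys at the LB layer: requests rotate across several Meta Servers, and a server marked down is skipped. Round-robin is an assumed policy, not something the notes state, and the backend names are made up.

```python
class RoundRobinBalancer:
    """Rotates over backends and skips the ones marked unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backend")


lb = RoundRobinBalancer(["meta-1", "meta-2", "meta-3"])
before = [lb.pick() for _ in range(4)]
lb.mark_down("meta-2")
after = [lb.pick() for _ in range(2)]
```

With one server down, traffic keeps flowing to the remaining ones, which is the availability property Phase 4 is after.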

Example #2: server_file_journal
Phase0:

CREATE TABLE `server_file_journal` (
`id` int(10) unsigned,
`filename` varchar(260),
`casepath` varchar(260),
`latest` tinyint(1),
`ns_id` int(10) unsigned,
[...]
PRIMARY KEY (`id`)
) ENGINE = InnoDB;

Phase1: Dropbox does not need casepath anymore

CREATE TABLE `server_file_journal` (
`id` int(10) unsigned,
`filename` varchar(260),
`latest` tinyint(1),
`ns_id` int(10) unsigned,
[...]
PRIMARY KEY (`id`)
) ENGINE = InnoDB;

Phase2: Dropbox needs prev_rev

CREATE TABLE `server_file_journal` (
`id` int(10) unsigned,
`filename` varchar(260),
`latest` tinyint(1),
`ns_id` int(10) unsigned,
`prev_rev` int(10) unsigned,
[...]
PRIMARY KEY (`id`)
) ENGINE = InnoDB;
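
The notes do not say what prev_rev stores. One plausible reading, assumed here, is that it holds the id of the previous journal row for the same file, so a file's history forms a linked chain. The sketch uses Python's sqlite3 as a stand-in for MySQL; `add_revision` and `history` are hypothetical helpers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE server_file_journal (
           id INTEGER PRIMARY KEY,
           filename TEXT,
           latest INTEGER,
           ns_id INTEGER,
           prev_rev INTEGER
       )"""
)


def add_revision(conn, ns_id, filename):
    """Append a journal row that points back at the current latest row."""
    row = conn.execute(
        "SELECT id FROM server_file_journal"
        " WHERE ns_id = ? AND filename = ? AND latest = 1",
        (ns_id, filename),
    ).fetchone()
    prev = row[0] if row else None
    with conn:  # demote the old row and insert the new one in one transaction
        if prev is not None:
            conn.execute(
                "UPDATE server_file_journal SET latest = 0 WHERE id = ?", (prev,)
            )
        cur = conn.execute(
            "INSERT INTO server_file_journal (filename, latest, ns_id, prev_rev)"
            " VALUES (?, 1, ?, ?)",
            (filename, ns_id, prev),
        )
    return cur.lastrowid


def history(conn, rev_id):
    """Walk prev_rev pointers from a revision back to the first one."""
    chain = []
    while rev_id is not None:
        chain.append(rev_id)
        rev_id = conn.execute(
            "SELECT prev_rev FROM server_file_journal WHERE id = ?", (rev_id,)
        ).fetchone()[0]
    return chain


r1 = add_revision(conn, 1, "a.txt")
r2 = add_revision(conn, 1, "a.txt")
r3 = add_revision(conn, 1, "a.txt")
```

Under this reading, `history(conn, r3)` yields the revisions newest first, and only the newest row keeps latest = 1.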

Phase2: As the customer base grows, Dropbox can no longer use id alone as the primary key.

CREATE TABLE `server_file_journal` (
`id` int(10) unsigned,
`filename` varchar(260),
`latest` tinyint(1),
`ns_id` int(10) unsigned,
`prev_rev` int(10) unsigned,
[...]
PRIMARY KEY (`ns_id`, `latest`, `id`)
) ENGINE = InnoDB;
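
InnoDB stores rows in primary-key order, so with the key (ns_id, latest, id) all latest rows of one namespace sit next to each other. The sketch below mimics that ordering with a sorted Python list and a binary search. It is only an illustration of why the key order helps, not how InnoDB is implemented; the sample rows and the `latest_ids` helper are made up.

```python
import bisect

# (ns_id, latest, id) tuples, kept sorted the way a clustered index would be.
rows = sorted([
    (7, 0, 1),
    (7, 1, 2),
    (9, 1, 3),
    (7, 1, 4),
    (9, 0, 5),
])


def latest_ids(rows, ns_id):
    """Return ids of the latest rows of one namespace via a prefix range scan."""
    lo = bisect.bisect_left(rows, (ns_id, 1))  # first row with latest = 1
    hi = bisect.bisect_left(rows, (ns_id, 2))  # one past the last such row
    return [row[2] for row in rows[lo:hi]]


ns7 = latest_ids(rows, 7)
ns9 = latest_ids(rows, 9)
```

The lookup touches one contiguous slice instead of scanning rows scattered by id, which is the usual reason to lead a clustered key with the column most queries filter on.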

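Phase3: In MySQL, a varchar whose values fit in 255 bytes needs only a one-byte length prefix (two bytes otherwise), so varchar(255) costs less space than varchar(260).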

CREATE TABLE `server_file_journal` (
`id` int(10) unsigned,
`filename` varchar(255),
`latest` tinyint(1),
`ns_id` int(10) unsigned,
`prev_rev` int(10) unsigned,
[...]
PRIMARY KEY (`ns_id`, `latest`, `id`)
) ENGINE = InnoDB;
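
The saving can be worked out with a little arithmetic. The sketch assumes a single-byte character set, so that varchar(N) holds at most N bytes; with a multi-byte character set a varchar(255) can exceed 255 bytes and would still need the two-byte prefix. The function names and the row count are made up for the example.

```python
def length_prefix_bytes(max_value_bytes):
    """MySQL uses 1 length byte if values need at most 255 bytes, else 2."""
    return 1 if max_value_bytes <= 255 else 2


def bytes_saved(rows, old_max_bytes, new_max_bytes):
    """Length-prefix bytes saved across `rows` rows by shrinking the column."""
    per_row = length_prefix_bytes(old_max_bytes) - length_prefix_bytes(new_max_bytes)
    return rows * per_row


prefix_260 = length_prefix_bytes(260)   # varchar(260): two-byte prefix
prefix_255 = length_prefix_bytes(255)   # varchar(255): one-byte prefix
saved = bytes_saved(1_000_000_000, 260, 255)
```

One byte per row is small, but on a journal with a very large number of rows it adds up, and it applies to every copy of the column that the table stores.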