Design a tinyurl system

Design a tinyurl system. This system converts a long URL into some small URL, typically people use it when they post something on platforms where there is limit on number of characters like Twitter.

解题思路:单台机器

A Better Solution is to use the integer id stored in database and convert the integer to character string that is at most 6 characters long. This problem can basically seen as a base conversion problem where we have a 10 digit input number and we want to convert it into a 6 character long string.

Below is one important observation about possible characters in URL.

A URL character can be one of the following

  • A lower case alphabet [‘a’ to ‘z’], total 26 characters
  • An upper case alphabet [‘A’ to ‘Z’], total 26 characters
  • A digit [‘0′ to ‘9’], total 10 characters

There are total 26 + 26 + 10 = 62 possible characters.

So the task is to convert a decimal number to base 62 number.

如果短网址由上述的6个字符组成,那么短网址数量为:62^6 ~= 56.8 billions unique strings.

public class UrlShortener {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final int    BASE     = ALPHABET.length();

    public static String encode(int num){
        StringBuilder sb = new StringBuilder();

        while ( num > 0 ){
            sb.append( ALPHABET.charAt( num % BASE ) );
            num /= BASE;
        }

       while (res.length() < 6)  res.append('0');
       return sb.reverse().toString();   
    }

    public static int decode(String str){
        int num = 0;

        for ( int i = 0, len = str.length(); i < len; i++ ){
            num = num * BASE + ALPHABET.indexOf( str.charAt(i) ); 
        }

        return num;
    }   
}

解题思路:多台机器

we can use Distributed Key-Value Datastore.Some distributed datastore (e.g. Amazon's Dynamo) uses Consistent Hashing to hash servers and inputs into integers and locate the corresponding server using the hash value of the input. We can apply base conversion algorithm on the hash value of the input.

The basic process can be:

Insert:

  1. Hash an input long url into a single integer;
  2. Locate a server on the ring and store the key--longUrl on the server;
  3. Compute the shorten url using base conversion (from 10-base to 62-base) and return it to the user.

Retrieve

  1. Convert the shorten url back to the key using base conversion (from 62-base to 10-base);
  2. Locate the server containing that key and return the longUrl.

References

1 System Design for Big Data [tinyurl]

2 Design a tinyurl system

3 URL Shortening: Hashes In Practice

4 How to code a URL shortener?

5 【案例实战】 Design a tinyurl system(System Design Study Group)

results matching ""

    No results matching ""