Zipf’s Law & Spotify Puzzle

Back in 20th century American linguist George Zipf noticed a striking pattern in distribution of words in English language. Though he was not the first to observe that pattern, Zipf formulated roughly accurate theoretical law using math statistics and popularized it.

In short words, Zip’s law states that the frequency of any word is inversely proportional to its rank in frequency table. This way, first most popular word would occur twice as many times as second most popular word and three times more than third word in the ranking.

Generalized form: In collection T of terms {t}, collection frequency cfi of ith term t would be calculated as:

 

 

Zip’s distribution belongs to power law family distribution, where rank and frequency are inversely related quantities.

Not only corpus of written and spoken natural language fall under this distribution, wide range of distributions are Zipfian in nature:

Spotify Puzzle ‘Zipf’s Song’ (Difficulty Level: Medium)

Your slightly pointy-bearded boss has assigned you to write software to find the best songs from different music albums. And the software should be finished in an hour. But don’t panic, you don’t have to solve the problem of writing an AI with good taste. At your disposal is the impeccable taste of a vast horde of long-tailed monkeys. Well, at least almost. The monkeys are not very communicative (or rather, you’re not sure which song “Ook!” is supposed to refer to) so you can’t ask them which songs are the best. What you can do however is to look at which songs the monkeys have listened to and use this information to deduce which songs are the best.

At first, you figure that the most listened to songs must be the best songs. However, you quickly realize that this approach is flawed. Even if all songs of the album are equally good, the early songs are more likely to be listened to more often than the later ones, because monkeys will tend to start listening to the first song, listen for a few songs and then, when their fickle ears start craving something else, stop listening. Instead, if all songs are equal, you expect that their play frequencies should follow Zipf’s Law.

Zipf’s Law is an empirical law originally formulated about word frequencies in natural languages, but it has been observed that many natural phenomena, such as population sizes and incomes, approximately follow the same law. It predicts that the relative frequency of thei’th most common object (in this case, a song) should be proportional to 1/i.

To illustrate this in our setting, suppose we have an album where all songs are equally good. Then by Zipf’s Law, you expect that the first song is listened to twice as often as the second song, and more generally that the first song is listened to i times as often as the i’th song. When some songs are better than others, those will be listened to more often than predicted by Zipf’s Law, and those are the songs your program should select as the good songs. Specifically, suppose that song i has been played fi times but that Zipf’s Law predicts that it would have been played zi times. Then you define the quality of song i to be qi = fi / zi. Your software should select the songs with the highest values of qi.

Input

The first line of input contains two integers n and m (1 ≤ n ≤ 50000, 1 ≤ m ≤ n), the number of songs on the album, and the number of songs to select. Then follow n lines. The i’th of these lines contains an integer fi and string si, where 0 ≤ fi ≤ 10^12 is the number of times the i’th song was listened to, and si is the name of the song. Each song name is at most 30 characters long and consists only of the characters ‘a’-‘z’, ‘0’-‘9’, and underscore (‘_’).

Output

Output a list of the m songs with the highest quality qi, in decreasing order of quality. If two songs have the same quality, give precedence to the one appearing first on the album (presumably there was a reason for the producers to put that song before the other).

Sample input 1
4 2
30 one
30 two
15 three
25 four
Sample output 1
four
two
Sample input 2
15 3
197812 re_hash
78906 5_4
189518 tomorrow_comes_today
39453 new_genious
210492 clint_eastwood
26302 man_research
22544 punk
19727 sound_check
17535 double_bass
18782 rock_the_house
198189 19_2000
13151 latin_simone
12139 starshine
11272 slow_country
10521 m1_a1
Sample output 2
19_2000
clint_eastwood
tomorrow_comes_today

  
Python
Time complexity: O(N lg N)
Space complexity: O(N)

from collections import namedtuple
import operator

def getKTopSongs(k):
    song = namedtuple('Song', 'name freq rank quality')
    songs = list()
    rank = 0

    while True:
        s = input()
        if not s:
            break
        else:
            song_data = s.split(' ')
            rank += 1
            freq = int(song_data[0])
            quality = freq / (1 / rank)
            songs.append(song(song_data[1], freq,
                              rank, quality))

    assert(k <= len(songs))

    songs.sort(key=operator.attrgetter('rank'))
    songs.sort(key=operator.attrgetter('quality'),reverse=True)    

    print(songs[:k])
Advertisements

Ubuntu 14X. “Enable Wireless” option is disabled in network settings


How to approach:

1. On laptop try to press combination FN+F1-F12 (FN+F12 on HP) and see if it enables wireless. Otherwise go to step 2.

2. Check what type of block is on wireless adapter:

 nezdolik@nezdolik-HP-Pavilion:~$ sudo rfkill list
1: hp-wifi: Wireless LAN
	Soft blocked: yes
	Hard blocked: no
2: hp-bluetooth: Bluetooth
	Soft blocked: no
	Hard blocked: no

For soft block issue:

 sudo rfkill unblock wifi

For hard block go to step 3.

3. Restart laptop, launch BIOS (F10), reset to default settings (F9), save and exit (F10).
If issue still not resolved, restart once again and before Ubuntu start up, while on black screen press combination FN+F1-F12 (FN+F12 on HP).

Spotify Puzzles. Reversed Binary Numbers (Difficulty Level: Easy)

Yi has moved to Sweden and now goes to school here. The first years of schooling she got in China, and the curricula do not match completely in the two countries. Yi likes mathematics, but now… The teacher explains the algorithm for subtraction on the board, and Yi is bored. Maybe it is possible to perform the same calculations on the numbers corresponding to the reversed binary representations of the numbers on the board? Yi dreams away and starts constructing a program that reverses the binary representation, in her mind. As soon as the lecture ends, she will go home and write it on her computer. See Spotify Puzzles.

Task

Your task will be to write a program for reversing numbers in binary. For instance, the binary representation of 13 is 1101, and reversing it gives 1011, which corresponds to number 11.

Input

The input contains a single line with an integer N, 1 ≤ N ≤ 1000000000.

Output

Output one line with one integer, the number we get by reversing the binary representation of N.

Sample input 1
13
Sample output 1
11
Sample input 2
47
Sample output 2
61
Java
Time complexity: O(1)
Space complexity: O(1)

public class Solution {
    //full integer range is supported
    public static int reverseBits(int n) {

        if (n == 0) return 0;

        int result = 0;

        while (n != 0) {
            result = (result << 1) + (n & 1); 
            n >>>= 1;
        }

        return result;
    }
}
Go
Time complexity: O(1)
Space complexity: O(1)

package main

import (
	"fmt"
)

func reverseBits(n int) int{
	if n == 0 {
		return 0
	}
	
	res := 0
	for n != 0 {
	        res <<= 1 
            if n & 1 == 1{ 
                res += 1 
            } 
            n >>= 1
	}
	return res
}

K’th Largest Element in Unsorted Array

Given an array and a number k where k is smaller than size of array, find the k’th largest element in the given array. All array elements are distinct.

Examples:

Input: arr[] = {1, 7, 10, 4, 3, 30, 15}
       k = 3
Output: 10

Input: arr[] = {1, 7, 10, 4, 3, 30}
       k = 4
Output: 4

Java (sorting)
Time complexity: O(N log N)
Space complexity: O(1)

public class Solution {
    public int findKthLargest(int[] nums, int k) {
        if (nums == null || nums.length == 0 || k <= 0 || k > nums.length) {
            return -1;
        }

        Arrays.sort(nums);
        
        return nums[nums.length-k];
        
    }
}

Java (heap sort)
Time complexity: O(N log N)
Space complexity: O(N)

public class Solution {
    public int findKthLargest(int[] nums, int k) {
        if (nums == null || nums.length == 0 || k <= 0 || k > nums.length) {
            return -1;
        }

        Queue<Integer> q = new PriorityQueue<>(Collections.reverseOrder());
        for (int n : nums){
            q.offer(n);
        }
        
        for (int i = 1; i < k; ++i){
            q.poll();
        }
        
        return q.poll();
        
    }
}

Java (quick select)
Time complexity: O(N)
Space complexity: O(N)

public class Solution {
    public int findKthLargest(int[] nums, int k) {
        if (nums == null || nums.length == 0 || k <= 0 || k > nums.length) {
            return -1;
        }

        //shuffle to eliminate worst case running time O(N^2)
        shuffle(nums);

        return findKthLargest(nums, k, 0, nums.length - 1);

    }

    private int findKthLargest(int[] nums, int k, int lo, int hi){
       if (lo <= hi){
            int pivot = partition(nums, lo, hi);

            if (hi - pivot == k - 1){
                return nums[pivot];
            } else if (hi - pivot < k - 1){
                return findKthLargest(nums,k-(hi-pivot+1),lo, pivot - 1);
            } else {
                return findKthLargest(nums,k,pivot + 1, hi);
            }
        }

        return -1;
    }

    private int partition(int[] nums, int lo, int hi){
        if (lo <= hi){
            int i = lo + 1;
            int j = lo + 1;

            while (j <= hi){
                if (nums[j] < nums[lo]){
                    swap(i, j, nums);
                    ++i;
                }
                 ++j;
            }

            swap(lo, i-1, nums);
            return i-1;
        }
        return -1;
    }

    private void shuffle(int[] nums){
        Random r = new Random();
        for (int i = 1; i < nums.length; ++i){
            int j = r.nextInt(i+1);
            swap(i, j, nums);
        }

    }


    private void swap(int i, int j, int[] nums){
        int x = nums[i];
        x ^= nums[j];
        nums[j] = x ^ nums[j];
        nums[i] = x ^ nums[j];
    }
}

Interval Scheduling

Input: set of intervals on the line, represented by pairs of points (ends of intervals).

Output: finding the largest set of intervals such that none two of them overlap.

Java
Time complexity: O(N log N)
Space complexity: O(N)

public class IntervalScheduling {

    static class Interval{
        private int start;
        private int finish;

        public Interval(int start, int finish){
            this.start = start;
            this.finish = finish;
        }

        public int getStart(){
            return start;
        }

        public int getFinish(){
            return finish;
        }

        @Override
        public String toString(){
            return "[" + start + ", " + finish + "]";
        }
    }

    public List<Interval> maxNonOverlap(List<Interval> intervals){
        if (intervals == null || intervals.size() == 0){
            return Collections.EMPTY_LIST;
        }

        intervals.sort((Interval i1, Interval i2)->i1.getFinish() - i2.getFinish());

        List<Interval> res = new ArrayList<>();
        res.add(intervals.get(0));
        int ptr = 1;

        while (ptr < intervals.size()){
            Interval curr = res.get(res.size() - 1);
            while (ptr < intervals.size() && intervals.get(ptr).getStart() < curr.getFinish()){
                ++ptr;
            }
            if (ptr < intervals.size()){
                res.add(intervals.get(ptr++));
            }
        }

        return res;

    }

    public static void main(String[] args) {
        List<Interval> intervals = new ArrayList<>();
        intervals.add(new Interval(600, 830));
        intervals.add(new Interval(800, 900));
        intervals.add(new Interval(900, 1100));
        intervals.add(new Interval(900, 1130));
        intervals.add(new Interval(1030, 1400));
        intervals.add(new Interval(1230, 1400));

        List<Interval> nonOverlapIntervals = new IntervalScheduling().maxNonOverlap(intervals);
        System.out.println((Iterable) nonOverlapIntervals);

    }

}

Lintcode. Number of Airplanes in the Sky

Given an interval list which are flying and landing time of the flight. How many airplanes are on the sky at most?

Notice

If landing and flying happens at the same time, we consider landing should happen at first.

 

Java
Time complexity: O(N log N)
Space complexity: O(N)

class Solution {
    /**
     * @param intervals: An interval array
     * @return: Count of airplanes are in the sky.
     */
    public int countOfAirplanes(List<Interval> airplanes) { 
        // write your code here
        if (airplanes == null || airplanes.size() == 0){
            return 0;
        }
        
        int n = airplanes.size();
        int[] start = new int[n];
        int[] land = new int[n];
        
        for (int i = 0; i < n; ++i){
            Interval t = airplanes.get(i);
            start[i] = t.start;
            land[i] = t.end;
        }
        
        Arrays.sort(start);
        Arrays.sort(land);
        
        int maxPlanes = Integer.MIN_VALUE;
        int currPlanes = 0;
        
        int startPtr = 0;
        int landPtr = 0;
        
        while (startPtr < n){
            if (land[landPtr] <= start[startPtr]){
                currPlanes -= 1;
                ++landPtr;
            } else if (start[startPtr] < land[landPtr]){
                currPlanes += 1;
                ++startPtr;    
            } 
            maxPlanes = Math.max(maxPlanes, currPlanes);
        }
        
        return maxPlanes;
        
        
    }
}

Python
Time complexity: O(N log N)
Space complexity: O(N)

class Solution:
    # @param airplanes, a list of Interval
    # @return an integer
    def countOfAirplanes(self, airplanes):
        # write your code here
        if not len(airplanes):
            return 0;
        
        #creates copy of iterable, one may use [].sort() instead   
        start = sorted([plane.start for plane in airplanes])
        land = sorted([plane.end for plane in airplanes])
        
        maxAirplanes = 0
        currAirplanes = 0
        sIdx = 0
        lIdx = 0
        
        while sIdx < len(airplanes):
            if start[sIdx] < land[lIdx]:
                currAirplanes += 1
                sIdx += 1
            else:
                currAirplanes -= 1
                lIdx += 1
                
            maxAirplanes = max(maxAirplanes, currAirplanes)
        
            
        return maxAirplanes;