Saturday, July 27, 2013

Select k random elements from a list of size n

Today, I'm going to talk about a popular problem which is how to select "k" random elements of a list of size "n" and how to do this when the size of the list is unknown like if the list is some kind of an endless stream.


First attempt:

Randomly shuffle the list and return the first k elements. This is an easy solution, however, you would have to either modify the array or use O(N) extra storage where you'll copy the array elements before shuffling them.

How to shuffle the array?

You traverse the array from i = 0 to i = n - 1 and generate a random number from a random number generator between i and n - 1 and swap that element with i.

Code should be something like this (Error checking ommitted for simplicity)


void krandom(int *arr, int *output, int n, int k)
{
   int *cpyArr = new int[n];

   for(int i = 0; i < n; ++i)
      cpyArr[i] = arr[i];

   for(int i = 0; i < n; ++i)
   {
      int j = rand(i, n - 1);
      swap(cpyArr[i], cpyArr[j]);
   }

   for(int i = 0; i < k; ++i)
      output[i] = cpyArr[i];

   delete cpyArr;
} 


Second attempt:

In this attempt, we'll pass through the array once and decide if we should insert  the element into the output array or not.

This is an O(N) solution with no extra storage

let's see how we can implement this.

if we have an array of 100 elements and we want to choose 100 random elements, then each one of these elements should have a 100% percent chance of being in.

if after choosing some random elements we end up having 10 elements to choose from and 5 random elements to choose, then the next element should have a 50% chance of being in.

what if we have 4 elements to choose from and "0" elements choose from these four? then the next element should have a 0% chance of being inserted in the output.

this suggests the following algorithm

void krandom(int *arr, int *output, int n, int k)
{
   // Where to insert the element into output array
   int iOutput = 0;
   int kRest = k;
   int nChoose = n;

   for(int i = 0; i < n && kRest != 0; ++i)
   {
      // Generate a random number between 1 and
      // nChoose inclusive
      int iRand = rand(1, nChoose);
      // give the element a kRest/nChoose chance
      // of being inserted
      if(iRand <= k)
      {
         output[iOutput++] = arr[i];
         // we have one less random item to insert
         --kRest;
         // we have one less element to choose from
         --nChoose;
      }
   }
} 

Third attempt:

This is an easy solution using O(K) extra storage and O(K) time complexity when you have a good pseudo-random generator, this is very efficient when N is very big and K is relatively small.

The solution uses a hashtable to check if the current item was inserted after selecting one random element from the input array.

void krandom(int *arr, int *output, int n, int k)
{
   hash_map<int> tableItems;
   int iOutput = 0;

   while(tableItems.size() != k)
   {
      int iRand = rand(0, n-1);

      if(!tableItems.containsKey(iRand))
      {
         tableItems.insert(iRand);
         output[iOutput++] = arr[iRand];
      }
   }
}

Now what if you don't already know the size of the list? maybe it's a very big linked list that you don't want to traverse twice to know the size of or it's an endless stream, how do you choose k random elements from this list?

First think how would you do it when the list has only k elements? how do you choose 10 random elements from a 10-size list? you just insert all items in the output.

What if you get another element? the 11th element? you should choose if you should replace it with one of the items you already chose or just ignore it. This can be done by generating a number between 1 and 11 and replace the item with one already there if the generated random number is between 1 and 10.

This way items at the beginning of the array will have a big chance of being in and also a big chance of being thrown away later. Items at the end of the last will have a smaller chance of getting in but also a smaller chance of getting out once they get in.

This is known as the reservoir method.

See you in another post!

Thursday, July 25, 2013

UVa 10946 - You want what filled?

UVa Problem

Solution:


#include <iostream>
#include <algorithm>
#include <vector>
#include <set>

using namespace std;

bool comp_1(const pair<char, int>& a,
      const pair<char, int>& b)
{
   return a.first < b.first;
}

bool comp_2(const pair<char, int>& a,
      const pair<char, int>& b)
{
   return b.second < a.second;
}

class Problem
{
public:
   Problem(const vector<vector<char> >& problemMap) :
      mMap(problemMap)
   {

   }

   vector<pair<char, int> > getSolution() const
   {
      return mHoles;
   }

   void solve()
   {
      for(int iRow = 0; iRow < mMap.size(); ++iRow)
      {
         for(int iCol = 0; iCol < mMap[0].size(); ++iCol)
         {
            if(mVisited.count(make_pair(iRow, iCol)) == 0)
            {
               if(mMap[iRow][iCol] != '.')
               {
                  char character = mMap[iRow][iCol];
                  int holeSize = floodfill(iRow, iCol, character);
                  mHoles.push_back(make_pair(character, holeSize));
               }
            }
         }
      }
      sort(mHoles.begin(), mHoles.end(), comp_1);
      stable_sort(mHoles.begin(), mHoles.end(), comp_2);
   }

private:
   bool valid(int row, int col)
   {
      return row >= 0 && row < mMap.size() &&
            col >= 0 && col < mMap[0].size();
   }
   int floodfill(int row, int col, char c)
   {
      int holeSize = 0;

      if(valid(row, col) && mVisited.count(make_pair(row, col)) == 0)
      {
         if(mMap[row][col] == c)
         {
            ++holeSize;
            mVisited.insert(make_pair(row, col));
            holeSize += floodfill(row+1, col, c);
            holeSize += floodfill(row-1, col, c);
            holeSize += floodfill(row, col+1, c);
            holeSize += floodfill(row, col-1, c);
         }
      }

      return holeSize;

   }
private:
   const vector<vector<char> >& mMap;
   set<pair<int, int> > mVisited;
   vector<pair<char, int> > mHoles;
};

int main(int argc, char *argv[])
{
   int numRows, numCols;
   int iProblem = 1;


   while(cin >> numRows)
   {
      cin >> numCols;

      if(numRows == 0 || numCols == 0)
         break;

      vector<vector<char> > problemMap(numRows);

      for(int iRow = 0; iRow < numRows; ++iRow)
      {
         string strLine;
         cin >> strLine;

         for(int iCol = 0; iCol < numCols; ++iCol)
            problemMap[iRow].push_back(strLine[iCol]);
      }

      Problem prob(problemMap);
      prob.solve();

      vector<pair<char, int> > sol = prob.getSolution();

      cout << "Problem " << iProblem << ":" << endl;

      for(int i = 0; i < sol.size(); ++i)
      {
         cout << sol[i].first << ' ' << sol[i].second << endl;
      }

      ++iProblem;
   }
   return 0;
}

UVa 216 - Getting in Line

UVa Problem

Solution:


#include <iostream>
#include <algorithm>
#include <vector>
#include <cstdio>
#include <cmath>
#include <limits>

using namespace std;

struct Point
{
   int x, y;
};

struct Link
{
   Point start, end;
   double distance;
};

class NetworkSolver
{
public:
   NetworkSolver(const vector<Point>& nodes) :
      mNodes(nodes), mTotalLength(numeric_limits<int>::max()),
      mConfig(nodes.size() - 1)
   {
   }

   double getTotalLength()
   {
      return mTotalLength;
   }

   vector<Link> getConfig()
   {
      return mConfig;
   }

   void solve()
   {
      vector<Point> buffer(mNodes.size());
      vector<bool> bFree(mNodes.size(), true);
      doSolve(buffer, bFree, 0);
   }
private:
   void doSolve(vector<Point>& buffer, vector<bool>& bFree, int level)
   {
      if(level == mNodes.size())
      {
         double currDist = 0;

         for(int iNode = 0; iNode < level - 1; ++iNode)
         {
            currDist += (16 + sqrt( (buffer[iNode].x - buffer[iNode+1].x)
                  * (buffer[iNode].x - buffer[iNode+1].x)+
                  (buffer[iNode].y - buffer[iNode+1].y)
                                    * (buffer[iNode].y - buffer[iNode+1].y)));
         }

         if(currDist < mTotalLength)
         {
            mTotalLength = currDist;
            for(int iNode = 0; iNode < buffer.size() - 1; ++iNode)
            {
               mConfig[iNode].start = buffer[iNode];
               mConfig[iNode].end = buffer[iNode+1];
               // Euclidean Distance
               mConfig[iNode].distance =
                     (16 + sqrt( (buffer[iNode].x - buffer[iNode+1].x)
                           * (buffer[iNode].x - buffer[iNode+1].x)+
                             (buffer[iNode].y - buffer[iNode+1].y)
                           * (buffer[iNode].y - buffer[iNode+1].y)));
            }

         }
         return;
      }

      for(int i = 0; i < mNodes.size(); ++i)
      {
         if(bFree[i])
         {
            bFree[i] = false;
            int tmp = buffer.size();
            buffer[level] = mNodes[i];
            tmp = buffer.size();
            doSolve(buffer, bFree, level+1);
            bFree[i] = true;
         }
      }
   }

private:
   const vector<Point>& mNodes;
   vector<Link> mConfig;
   double mTotalLength;
};


int main(int argc, char *argv[])
{
   int iNetwork = 1;
   int numComputers;

   while( (cin >> numComputers) && numComputers)
   {
      vector<Point> nodes;

      for(int iNode = 0; iNode < numComputers; ++iNode)
      {
         Point currNode;
         cin >> currNode.x;
         cin >> currNode.y;
         nodes.push_back(currNode);
      }

      // Node and Distance of result configuration
      vector<Link> links;

      double totalLength;

      NetworkSolver solver(nodes);
      solver.solve();
      links = solver.getConfig();
      totalLength = solver.getTotalLength();


      printf("**********************************************************\n");
      printf("Network #%d\n", iNetwork);
      for(int iLink = 0; iLink < links.size(); ++iLink)
      {
         printf("Cable requirement to connect (%d,%d) to (%d,%d) is %.2f feet.\n",
               links[iLink].start.x,
               links[iLink].start.y,
               links[iLink].end.x,
               links[iLink].end.y,
               links[iLink].distance);
      }
      printf("Number of feet of cable required is %.2f.\n", totalLength);
      ++iNetwork;

   }

   return 0;
}

UVa 167 - The Sultan's Successors

UVa Problem

This is a famous problem of how to arrange 8 queens on a chessboard without having any queen threaten another one (by sharing the same row, column, or diagonal).

This can be solved simply by using recursion and backtracking. We iterate through all rows and try to find a free column to insert the queen into by checking that the cell won't share the same column, left diagonal or right diagonal with another queen, then mark the column, left diagonal and right diagonal as non-free.

this problem has one additional requirement. Each cell on the board is given some value, and as there can be many valid solutions, we wish to find the solution which maximizes the sum of the values of the cells containing the queens.

Solution:


#include <iostream>
#include <algorithm>
#include <vector>
#include <cstdio>

using namespace std;

class GameSolver
{
public:
   GameSolver(const vector<vector<int> >& board) :
      mBoard(board), mBoardSize(board.size()),
      mCurrMaxScore(0)
   {
      mQueens.reserve(mBoardSize);

      for(int iCol = 0;
            iCol < mBoardSize;
            ++iCol)
         mFreeColumns.push_back(true);

      for(int iDiag = 0;
            iDiag < 2 * mBoardSize - 1;
            ++iDiag)
      {
         mFreeLeftDiags.push_back(true);
         mFreeRightDiags.push_back(true);
      }
   }

   int getMaxScore()
   {
      doSolve(0);
      return mCurrMaxScore;
   }
private:

   void doSolve(int iRow)
   {
      if(iRow == mBoardSize)
      {
         int score = 0;
         for(int i = 0; i < mBoardSize; ++i)
         {
            score += mBoard[i][mQueens[i]];
         }

         mCurrMaxScore = max(mCurrMaxScore, score);
         return;
      }

      for(int iCol = 0; iCol < mBoardSize; ++iCol)
      {
         if(mFreeColumns[iCol] &&
               mFreeLeftDiags
               [iRow - iCol + mBoardSize - 1] &&
               mFreeRightDiags[iRow + iCol])
         {
            mFreeColumns[iCol] = false;

            mFreeLeftDiags
            [iRow - iCol + mBoardSize - 1] = false;

            mFreeRightDiags[iRow + iCol] = false;

            mQueens[iRow] = iCol;
            doSolve(iRow+1);

            mFreeColumns[iCol] = true;
            mFreeLeftDiags
            [iRow - iCol + mBoardSize - 1] = true;
            mFreeRightDiags[iRow + iCol] = true;

         }

      }
   }

   int mBoardSize;
   const vector<vector<int> >& mBoard;
   vector<int> mQueens;
   vector<bool> mFreeColumns;
   vector<bool> mFreeLeftDiags;
   vector<bool> mFreeRightDiags;

   int mCurrMaxScore;

};


int main(int argc, char *argv[])
{
   const int boardSize = 8;

   vector<vector<int> > board;

   for(int iRow = 0; iRow < boardSize; ++iRow)
      board.push_back(vector<int>(boardSize, 0));

   int k;
   cin >> k;

   for(int iBoard = 0; iBoard < k; ++iBoard)
   {
      for(int iRow = 0; iRow < boardSize; ++iRow)
         for(int iCol = 0; iCol < boardSize; ++iCol)
            cin >> board[iRow][iCol];

      GameSolver game(board);
      int score = game.getMaxScore();

      printf("%5d\n", score);

   }

}

Wednesday, July 24, 2013

UVa 10344 - 23 out of 5

UVa Problem

You have five numbers (a0 ... a4) as input and you're required to determine if there's an arrangment of the numbers and operators (+, -, *) that can give a total value of "23"

the expression must be of the form  ( ( ( a0 o a1) o a2)  o a3) o a4
every "o" can be replaced by any of the above operators "+, -, *" and you can use any permutation of the numbers (order is not required).

Solution:

I solved the problem using recursion and backtracking

#include <iostream>
#include <algorithm>

using namespace std;

bool isReachable(int *numbers, bool *used,
      int numsTotal, int numsAvailable, int valExpr)
{
   if(numsAvailable <= 0)
      return false;
   if(numsAvailable == 1)
   {
      for(int i = 0; i < numsTotal; ++i)
      {
         if(!used[i])
            return valExpr == numbers[i];
      }
      return false;
   }

   for(int i = 0; i < numsTotal; ++i)
   {
      if(!used[i])
      {
         int currNum = numbers[i];
         used[i] = true;

         if(isReachable(numbers, used, numsTotal,
               numsAvailable - 1, valExpr - currNum))
         {
            return true;
         }

         if(isReachable(numbers, used, numsTotal,
               numsAvailable - 1, valExpr + currNum))
         {
            return true;
         }

         if(currNum == 0 && valExpr == 0)
            return true;
         if((currNum != 0) && (valExpr%currNum == 0) &&
               isReachable(numbers, used,
                     numsTotal, numsAvailable - 1,
                     valExpr / currNum))
         {
            return true;
         }
         used[i] = false;
      }
   }
   return false;
}

int main(int argc, char *argv[])
{
   const int finalValue = 23;
   const int arrSize = 5;

   int arrNums[arrSize];
   bool used[arrSize];

   while(cin >> arrNums[0])
   {
      bool exit = (arrNums[0] == 0);

      for(int i = 1; i < arrSize; ++i)
      {
         cin >> arrNums[i];
         if(arrNums[i] != 0)
            exit = false;
      }
      if(exit)
         break;

      for(int i = 0; i < arrSize; ++i)
      {
         used[i] = false;
      }

      if(isReachable(arrNums, used, arrSize,
            arrSize, finalValue))
         cout << "Possible" << endl;
      else
         cout << "Impossible" << endl;

   }
   return 0;
}

UVa 100 - The 3n + 1 problem

UVa Problem

#include <iostream>
#include <algorithm>

using namespace std;

int main(int argc, char *argv[])
{
   int t1, t2;

   while(cin >> t1)
   {
      cin >> t2;

      int start, end;

      start = t1;
      end = t2;

      if(start > end)
         swap(start, end);

      int maxCycleLen = 1;

      for(int num = start; num <= end; ++num)
      {
         int cycleLen = 1;
         int currNum = num;

         while(currNum != 1)
         {
            if(currNum%2 == 0)
               currNum /= 2;
            else
               currNum = 3 * currNum + 1;

            ++cycleLen;
         }
         maxCycleLen = max(maxCycleLen, cycleLen);
      }

      cout << t1 << " " << t2 << " " << maxCycleLen << endl;
   }

   return 0;
}

Tuesday, July 23, 2013

UVa - 10392 - Factoring Large Numbers

Problem on UVa


Problem F: Factoring Large Numbers

One of the central ideas behind much cryptography is that factoring large numbers is computationally intensive. In this context one might use a 100 digit number that was a product of two 50 digit prime numbers. Even with the fastest projected computers this factorization will take hundreds of years.

You don't have those computers available, but if you are clever you can still factor fairly large numbers.

Input

The input will be a sequence of integer values, one per line, terminated by a negative number. The numbers will fit in gcc's long long int datatype. You may assume that there will be at most one factor more than 1000000.
Output

Each positive number from the input must be factored and all factors (other than 1) printed out. The factors must be printed in ascending order with 4 leading spaces preceding a left justified number, and followed by a single blank line.

#include <iostream>
#include <vector>

const int N = 1000000;

using namespace std;

vector<int> primes;

void sieve(void)
{
   vector<bool> nums(N+1, true);

   for(int i = 2; i * i <= N; ++i)
   {
      if(nums[i])
      {
         for(int j = i * 2; j <= N; j += i)
         {
            nums[j] = false;
         }
      }
   }

   for(int i = 2; i <= N; ++i)
   {
      if(nums[i])
      {
         primes.push_back(i);
      }
   }
}

int main(int argc, char *argv[])
{
   sieve();

   long long num;

   while(cin >> num)
   {
      if(num < 0)
         break;

      for(int i = 0; (i < primes.size()) &&
         (primes[i] <= num); ++i)
      {
         while( (num % primes[i]) == 0)
         {
            cout << "    " << primes[i] << '\n';
            num /= primes[i];
         }
      }

      if(num > N)
      {
         cout << "    " <<  num << '\n';
      }

      cout << endl;
   }

   return 0;
}

UVa - 10533 - Digit Primes

Problem on UVa

A prime number is a positive number, which is divisible by exactly two different integers. A digit prime is a prime number whose sum of digits is also prime. For example the prime number 41 is a digit prime because 4+1=5 and 5 is a prime number. 17 is not a digit prime because 1+7 = 8, and 8 is not a prime number. In this problem your job is to find out the number of digit primes within a certain range less than 1000000.



Input

First line of the input file contains a single integer N (0<N<=500000) that indicates the total number of inputs. Each of the next N lines contains two integers t1 and t2 (0<t1<=t2<1000000).



Output
For each line of input except the first line produce one line of output containing a single integer that indicates the number of digit primes between t1 and t2 (inclusive).

#include <cstdio>

#define N 1000000

bool primes[N+1];
int digitPrimes[N+1];

void sieve(void)
{
    for(int i = 0; i <= N; ++i)
    {
        primes[i] = true;
        digitPrimes[i] = 0;
    }

    primes[0] = primes[1] = false;

    for(int i = 2; i * i <= N; ++i)
    {
        if(primes[i])
        {
            for(int j = 2 * i; j <= N; j += i)
            {
                primes[j] = false;
            }
        }
    }

    int countDigPrimes = 0;

    for(int i = 2; i <= N; ++i)
    {
        if(primes[i])
        {
            int tmp = i;
            int sumDigits = 0;

            while(tmp)
            {
                sumDigits += tmp%10;
                tmp /= 10;
            }

            if(primes[sumDigits])
            {
                ++countDigPrimes;
            }
        }
        digitPrimes[i] = countDigPrimes;
    }
}

int main(int argc, char *argv[])
{
    int numLines;

    int t1, t2;

    sieve();

    scanf("%d", &numLines);

    for(int iLine = 0; iLine < numLines; ++iLine)
    {
        int numDigiPrimes;

        scanf("%d %d", &t1, &t2);

        if(t1 == 0)
            numDigiPrimes = digitPrimes[t2];
        else
            numDigiPrimes = digitPrimes[t2] - digitPrimes[t1 - 1];
        printf("%d\n", numDigiPrimes);
    }

    return 0;
}

UVa - 10168 - Summation of Four Primes

UVa - 10168 - Summation of Four Prime

My solution relies on Goldbach's conjecture which states that "Every even integer greater than two can be expressed as the sum of two primes".

This conjecture has been proved right up until 4 * 10 ^ 80.
In this problem, we'll have an N with a maximum value of 10 ^ 7, so we can use the conjecture safely.

The problem simply gives you an integer N and you're required to find four primes that sum up to that integer or output "Impossible" if you can't find them.

sieve is an algorithm that marks prime numbers from 2 to N. It works this way, starts from 2 and marks all multiples of 2 as "not primes", then it goes to 3 and marks all multiples as "not primes" then "4" but found out that "4" is already marked as composite so it doesn't need to mark its multiples which must be already marked and so on.

#include <vector>
#include <iostream>
#include <algorithm>

using namespace std;

vector<bool>* sieve(int n)
{
    vector<bool>* pNums = new vector<bool>(n+1, true);
    vector<bool>& nums = *pNums;
    nums[0] = nums[1] = false;

    for(int i = 2; i * i <= n; ++i)
    {
        if(nums[i])
        {
            for(int j = 2 * i; j <= n; j += i)
            {
                nums[j] = false;
            }
        }
    }

    return pNums;
}

int main(int argc, char *argv[])
{
    const int N = 10000000;
    vector<bool> *pNums = sieve(N);
    vector<bool> &nums = *pNums;

    int n;

    while(cin >> n)
    {
        if(n < 8)
            cout << "Impossible." << endl;
        else
        {
            int remainingSum;

            if(n%2 == 0)
            {
                cout << "2 2 ";
                remainingSum = n - 4;
            }
            else
            {
                cout << "2 3 ";
                remainingSum = n - 5;
            }

            int i;
            for (i = 2; i < remainingSum;++i)
            {
                if(nums[i] && nums[remainingSum-i])
                {
                    cout << i << " " << remainingSum - i << endl;
                    break;
                }
            }
            if(!(nums[i] && nums[remainingSum-i]))
                cout << "Impossible." << endl;
        }
    }
    delete pNums;
    return 0;
}

Sunday, July 21, 2013

UVa 10219 - Find the ways !

UVa 10219 - Find the ways !

So in this problem, we are given as input (n, k) and we are required to calculate the number of digits in "n choose k".

for instance, "10 choose 5" equals "252", so we should print "3" as "252" has 3 digits.

one simple way to solve this is to calculate "n choose k" using dynamic programming and count the number of digits found. However, this will use a lot of memory and there's a simpler approach to the problem.

We all know that (n choose k) equals n!/(k! * (n - k)!)
We also know that the number of digits of "n" equals log10(n)+1
and we also know that log(a * b) = log (a) + log (b)

then the number of digits in "n choose k" = the number of digits in n!/(k! * (n - k)!) = log(n!) - log(k!) - log (n - k)!

and we know that log(n!) = log(1) + log(2) + log(3) + .... + log(n)

this gives the following solution to the problem

/*
 * Main.cpp
 *
 *  Created on: Jun 7, 2013
 *      Author: usama
 */
#include <vector>
#include <iostream>
#include <cmath>

using namespace std;

const int N = 1000000;

vector<double> cache(N+1, 0);

void digitsFact(int n)
{
    double result = 0;

    for(int i = 2; i <= n; ++i)
    {
        result += log10(i);
        cache[i] = result;
    }
}

int digitsCombin(int n, int k)
{
    int result = 1;
    result += (int)(cache[n] - cache[k] - cache [n - k]);
    return result;
}

int main(int argc, char *argv[])
{
    int n, k;
    digitsFact(N);
    while (cin >> n)
    {
        cin >> k;
        if(n == k)
            cout << "1" << endl;
        else if(n < k)
            cout << "1" << endl;
        else
        {
            if(n >= cache.size())
            {
                double result = cache[cache.size()-1];
                for(int i = cache.size(); i <= n; ++i)
                {
                    result += log10(i);
                    cache.push_back(result);
                }
            }
            cout << digitsCombin(n, k) << endl;
        }
    }
    return 0;
}

Back to Programming!

Since I have some free time, actually plenty of it! I decided to spend some of my time solving problems at UVA and TopCoder, I also decided to use this blog to post my solutions to some of the problems I'm going to solve.

See you soon!
Usama