Cristian HERGHELEGIU

Just another HERGHELEGIU Blogs weblog

Merge sort

Another sort method is merge sort

MERGE-SORT(A,p,r)

T(n) is THETA(n lg n)
Insertion sort run faster for low n because of the constant factors.
We can use insertion sort when subproblems is sufficient small.

MERGE-SORT(A,p,r)
{
    if ( p < r )
    {
        q = (p+r)/2;
        MERGE-SORT(A,p,q);
        MERGE-SORT(A,q+1,r);
        MERGE(A,p,q,r);
    }
}

 

MERGE(A,p,q,r)
{
    int i = p;
    int j = q+1;
    int k = p;
    while( i <=q || j<=r )
    {
        min = 0;
        if ( i<=q && ( !(j<=r) || (j<=r && A[i] < A[j]) ) )
        {
            min = A[i];
            i++;
        }
        else
        {
            min = A[j];
            j++;
        }
 
        B[k] = min;
        k++;
    }
    
    for( i=p; i<=r; i++ )
        A[i] = B[i];
}

 

if ( i<=q && ( !(j<=r) || (j<=r && A[i] < A[j]) ) )

can be rewrite as

if ( i<=q && ( !(j<=r) || (A[i] < A[j]) ) )


Read more on Wikipedia Merge Sort

21 April 2010 at 16:51 - Comments

Heap sort

Another sort method is the HEAP SORT which uses the heap property.
We can use to sort or for priority queues.

So, if we have the following:


A is a heap

A[0..heapsize[A]-1]

Heap property: A[PARENT(i)] >= A[i]

Height of the heap is the height of the root, THETA(lg n)

where:


int PARENT(i)
{
    return (i-1)/2;
}
 
int LEFT(i)
{
    return 2*i+1;
}
 
RIGHT(i)
    return 2*i+2;


First, we must build A as a heap


BUILD-HEAP(A)
{
    for( int i = (length[A]-2)/2; i >= 0; i--)
        HEAPIFY( A, i, length[A] );
}
 
 
void HEAPIFY( A, i, heapSize ) // the left and right are heaps and we make a heap for i
{
    
    int l = LEFT(i);
    int r = RIGHT(i);
    int largest = i;
    
    if ( l <= heapsize && A[l] > A[i] )
    {
        largest = l;
    }
    else
    {
        largest = i;
    }
    
    if ( r <= heapsize && A[r] > A[largest] )
    {
        largest = r;
    }
    
    if (largest != i )
    {
        exchange( A[i] - A[largest] );
        HEAPIFY(A, largest, heapSize );
    }
}


Now we can sort A:

void HEAP-SORT(A) // - time O(n lg n)
{
    BUILD-HEAP(A);
    int heapsize = length[A];
    for( int i = length[A]-1; i>0; i--)
    {
        exchange( A[0] , A[i] );
        heapsize = heapsize-1;
        HEAPIFY(A, 0, heapSize );
    }
}


int HEAP-MAXIMUM(A) //- time THETA(1)
{
    return A[0];
}


HEAP-EXTRACTMAX(A) //- time O(lg n)
{
    if ( heapsize[A] < 1 )
        error "heap underflow"
    
    max = A[0];
    
    A[0] = A[ heapsize[A]-1 ]
    heapsize[A] = heapsize[A]-1;
    
    HEAPIFY(A,0, heapSize[A] );
    
    return max;
}


HEAP-INSERT(A, key) //- time O(lg n)
{
    heapsize[A] = heapsize[A] + 1;
    
    i = heapsize[A];
    while ( i > 0 && A[PARENT(i)] < key )
    {
        A[i] = A[PARENT(i)];
        i = PARENT(i);
    }
    A[i] = key;


Read more about it on Wikipedia

21 April 2010 at 15:00 - Comments

Quick sort

QUICK-SORT(A,p,q) – uses the divide et impera alghoritm

it separates the elements from p to q into 2 partitions from p to r and from r+1 to q
where all the elements from the first partition are smaller than the elements from the second one.

The worst running time is O(n^2) but on average is very efficient THETA(n log n)
For balanced partitioning 9 good and 1 bad -> O(n log n)


 
void QUICKSORT(A,p,r)
{
    if ( p < r )
    {
        q = PARTITION(A, p, r);
        QUICKSORT(A, p, q);
        QUICKSORT(A, q+1, r);
    }
}
 
 
int PARTITION(A, p, r) //- rearrange the A[p..r]
{
    x = A[p]; // pivot element
    i = p-1;
    j = r+1;
 
    while (true)
    {
        for( j--; j>=p && A[j] > x; j-- );
 
        for( i++; i<=r && A[i] < x; i++ );
 
        if ( i < j )
            exchange( A[i] , A[j] );
        else
            return j;
    }
}

 

To avoid the worst time of the algorithm use the randomize partition

 
int RANDOMIZED-PARTITION(A, p, r)
{
    i = RANDOM(p, r);
    exchange A[p]-A[i];
    return PARTITION(A, p, r);
}

Read more on Wikipedia

5 April 2010 at 16:34 - Comments

Linear sort

The most simple sort


a[i] = 1 if i exists and 0 otherwise

5 April 2010 at 16:29 - Comments

Disable Intellisense for VStudio

There is a undocumented way to disable C++ Intellisense:
Rename or delete the following file:

<VS root path>\VC\vcpackages\feacp.dll

After doing this the vba macros will not work.

And reconsider using Visual Assist

5 April 2010 at 15:48 - Comments

SQLite

If you need a database to support your anti spam filter consider SQLite for an embedded database.

Use the following class to execute a query:


    //
    // SQLiteQuery helper
    //
    // use like this
    // SQLiteQuery sqlQuery( lpszPath, szQuery );
    // for( ; SUCCEEDED( sqlQuery.GetStatus() ); sqlQuery.GetNext() )
    // {
    //    CString strValue    = sqlQuery.GetCurrentAsText( 0 );
    // }
    //
    class SQLiteQuery
    {
    public:
    //
    // constructor
    //
    SQLiteQuery( const TCHAR * lpszPath, const TCHAR * szQuery )
    {
        m_pDatabase         = NULL;
        m_pStatementQuery   = NULL;
        m_result            = SQLITE_OK;
     
        Start( lpszPath, szQuery );
    }
     
     
    //
    // destructor
    //
    ~SQLiteQuery()
    {
        Stop();
    }
     
    //
    // get status
    //
    HRESULT GetStatus()
    {
        if ( m_pDatabase && m_pStatementQuery && m_result == SQLITE_ROW )
            return S_OK;
        return E_FAIL;
    }
     
    //
    // get status exec
    //
    HRESULT GetStatusExec()
    {
        if ( m_pDatabase && m_pStatementQuery && m_result == SQLITE_DONE )
            return S_OK;
        return E_FAIL;
    }
     
    //
    // go next
    //
    void GetNext()
    {
        if ( m_pDatabase && m_pStatementQuery )
        {
            m_result = sqlite3_step ( m_pStatementQuery );
            if ( m_result != SQLITE_ROW )
                Stop();
        }
    }
     
    //
    // get column text or column int
    //
    CString GetCurrentAsText( int column = 0 )
    {
        if ( m_pDatabase == NULL || m_pStatementQuery == NULL )
            return CString( _T("") );
     
        TCHAR * szValue   = (TCHAR *) sqlite3_column_text16( m_pStatementQuery, column );
        return CString( szValue );
    }
     
    int GetCurrentAsInt( int column = 0 )
    {
        if ( m_pDatabase == NULL || m_pStatementQuery == NULL )
            return 0;
     
        return (int) sqlite3_column_int( m_pStatementQuery, column );
    }
     
     
    protected:
    //
    // start
    //
    void Start( const TCHAR * lpszPath, const TCHAR * szQuery )
    {
        //
        // open
        //
        m_result    = sqlite3_open16( lpszPath, &m_pDatabase );
        if ( m_result != SQLITE_OK || m_pDatabase == NULL )
        {
            m_pDatabase         = NULL;
            Stop();
            return;
        }
     
        //
        // prepare
        //
        m_result    = sqlite3_prepare16_v2( m_pDatabase, szQuery, -1, &m_pStatementQuery, NULL );
        if ( m_result != SQLITE_OK || m_pStatementQuery == NULL )
        {
            m_pStatementQuery   = NULL;
            Stop();
            return;
        }
     
        //
        // SQLITE_DONE || SQLITE_ROW
        //
        m_result    = sqlite3_step ( m_pStatementQuery );
        if ( m_result != SQLITE_ROW && m_result != SQLITE_DONE )
        {
            Stop();
        }
    }
     
    //
    // stop
    //
    void Stop()
    {
        if ( m_pStatementQuery )
            m_result        = sqlite3_finalize( m_pStatementQuery );
        if ( m_pDatabase )
            m_result        = sqlite3_close( m_pDatabase );
     
        m_pStatementQuery   = NULL;
        m_pDatabase         = NULL;
        m_result            = SQLITE_ERROR;
     
    }
     
    protected:
        sqlite3 *           m_pDatabase;
        sqlite3_stmt *   m_pStatementQuery;
        int                 m_result;
     
    };
4 March 2010 at 14:06 - Comments

Regular expressions

regular expressions
Recently I wanted to test and validate an email address so the best way to do is to use the regular expressions.
The simple way is the use the one from Windows. (in the vbscript.dll)


    // import the second table from vbscript
    #import <vbscript.dll> tlbid(2) no_namespace named_guids raw_interfaces_only

    CComBSTR bstrEmailValidation ( L"^[A-Z0-9._%+-]+@(?:[A-Z0-9-]+\\.)+[A-Z]{2,6}$" );

    CComPtr<IRegExp> m_pRegExp;
    bool bIsValid = false;

    HRESULT m_hrCoInit = CoInitializeEx( NULL, COINIT_APARTMENTTHREADED );
    if ( SUCCEEDED( m_hrCoInit ) )
    {
        HRESULT hr = m_pRegExp.CoCreateInstance( __uuidof(RegExp) );
        if ( SUCCEEDED( hr ) )
        {
            hr = m_pRegExp->put_IgnoreCase ( VARIANT_TRUE );
            hr = m_pRegExp->put_Global ( VARIANT_TRUE );
            hr = m_pRegExp->put_Pattern ( bstrEmailValidation );

            VARIANT_BOOL bMatch = VARIANT_FALSE;
            hr = m_pRegExp->Test( CComBSTR(strText), &bMatch );
            if ( SUCCEEDED( hr ) )
            {
                bIsValid = (bMatch == VARIANT_FALSE) ? false : true;
            }
        }
    }

    // and later uninit
    m_pRegExp = NULL;
    if ( SUCCEEDED( m_hrCoInit ) )
    {
        CoUninitialize();
    }

If you need more info about regular expressions read this “How to Find or Validate an Email Address

1 March 2010 at 17:14 - Comments

Spamfilter links

bullguard spamfilter
Here are the most important links for BullGuard Spamfilter:

The description page at http://www.bullguard.com/why/bullguard-spamfilter.aspx

and the help page at http://media.bullguard.com/is9_help/EN/SF/module_spamfilter_overview.html

1 March 2010 at 17:14 - Comments

Autologin in Win 7

autologin
Follow this stepst to do the autologin in Windows 7 (but is not recommended)

  • Press the Windows key + R on your keyboard to launch the “Run” dialog box.
  • Type in “control userpasswords2” and press Enter.
  • The User Accounts window will display.
  • Uncheck the option “Users must enter a user name and password to use this computer”
    Click “OK”
  • You will then be prompted to enter the current password and confirm it.
  • After doing so, you will no longer be prompted to enter your password upon login.
1 March 2010 at 17:13 - Comments

Spamfilter

spamfilter
The best place to start your research regarding an antispam filter is here

CEAS – Collaboration, Electronic messaging, Anti-abuse and Spam (http://www.ceas.cc)

For bayesian understanding read the Paul Graham article ‘A plan for spam(http://www.paulgraham.com/spam.html)

and then improvements of it with the hidden markov model (the simplest dynamic bayesian filter) (http://en.wikipedia.org/wiki/Hidden_Markov_model)

and the implementation in the Viterbi alghorithm (http://en.wikipedia.org/wiki/Viterbi_algorithm)

also I recommend this books:

  • Ending Spam: Bayesian Content Filtering and the Art of Statistical Language Classification, by Jonathan Zdziarski (see on Amazon)
  • Inside the SPAM Cartel: By Spammer-X (see on Amazon)
  • Spam Wars: Our Last Best Chance to Defeat Spammers, Scammers & Hackers, by Danny Goodman (see on Amazon)
1 March 2010 at 17:11 - Comments