Challenge 02 solved, questions about why it works


#1

This code solves the problem (including filtering the duplicate proper names out of the successful comparisons), but I don’t understand why it works. My thinking was that once a pair made it through the case-insensitive comparison, I would do a simple compare and print the name and word only if they were NOT the same. In other words, don’t log the name Al with the word Al, but do log the name Al with the word al, expressed by:

....
if(![n compare:w]){
     //print n and w
}
....

However, this logged each proper name with the identical word (Name with Name). If I use the code below instead, it logs the proper name with the lowercase word (Name with name).

I cannot figure out why this works opposite of how I’d expect.

//read in files
NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames" encoding:NSUTF8StringEncoding error:NULL];
NSString *wordString = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];

//generate arrays, split at \n
NSArray *names = [nameString componentsSeparatedByString:@"\n"];
NSArray *words = [wordString componentsSeparatedByString:@"\n"];

//iterate through names
for (NSString *n in names){
     //iterate through words
     for (NSString *w in words){
          //case insensitive comparison
          if ([n caseInsensitiveCompare:w] == NSOrderedSame){
               //filter out the duplicate names from the word list
               //this is what works, but is the opposite of what I would expect to work
               if ([n compare:w]){
                    NSLog(@"The name %@ is also the word %@.", n, w);
               }
          }
     }
}
return 0;

#2

phsphi,

//read in files
NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames" encoding:NSUTF8StringEncoding error:NULL];
NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];

I think the second line should be NSString *wordString = […].
Mitch


#3

Ah, yes, thanks. It is correct in the actual code; I was entering the original post on a PC workstation while reading from Xcode on my laptop. I’ve edited the post. I’m still confused about what the heck is happening in the code that returns YES if Name == name, and not if Name == Name, though.


#4

I think the issue might be that

[n compare:w]

does not return a boolean, it returns an NSComparisonResult.
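To make that concrete, here is a minimal sketch (my own illustration, not from the book). NSComparisonResult is an integer enum where NSOrderedSame is 0, so an if statement treats "same" as false and either of the other two results as true. The name Al is just the example from the first post.

```objc
#import <Foundation/Foundation.h>

int main(int argc, const char *argv[])
{
    @autoreleasepool {
        NSString *n = @"Al";

        // compare: returns an NSComparisonResult, an integer enum:
        // NSOrderedAscending = -1, NSOrderedSame = 0, NSOrderedDescending = 1
        NSComparisonResult same = [n compare:@"Al"];
        NSComparisonResult different = [n compare:@"al"];

        NSLog(@"Al vs Al: %ld", (long)same);       // 0, which an if statement treats as false
        NSLog(@"Al vs al: %ld", (long)different);  // non-zero, which an if statement treats as true

        // Spelling the test out avoids relying on the integer value
        if ([n compare:@"al"] != NSOrderedSame) {
            NSLog(@"The strings differ");
        }
    }
    return 0;
}
```

So if ([n compare:w]) really means "if n and w are NOT the same", which is why it behaves the opposite way from what a YES/NO reading suggests.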


#5

I used isEqualToString instead of compare and it worked as expected.

        // Read in a file of proper names and store in array
        NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames" encoding:NSUTF8StringEncoding error:NULL];
        NSArray *names = [nameString componentsSeparatedByString:@"\n"];
        
        // Read in a file of regular words and store in array
        NSString *wordString = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];
        NSArray *words = [wordString componentsSeparatedByString:@"\n"];
        
        for (NSString *n in names) {
            for (NSString *w in words) {
                // Filter out the duplicate proper names and compare the rest
                if (!([n isEqualToString:w]) && ([n caseInsensitiveCompare:w] == NSOrderedSame)){
                    NSLog(@"%@ is a proper name and %@ is also a regular word", n, w);
                }
            }
        }

#6

This last code snippet works for me. I am wondering if there is a faster mechanism in the language than using nested for/in loops.

I am using:

    for (NSString *n in names) {
        for (NSString *w in words) {
            if (!([n isEqualToString:w]) && ([n caseInsensitiveCompare:w] == NSOrderedSame)) {
                NSLog(@"%@ is a proper name and %@ is also a regular word.", n, w);
            }
        }
    }

The book makes a point of mentioning that the for/in loop is a highly tuned looping structure. Even so, the code runs a little slow. Is there a set comparison where we could compare each outer word with a set of words? Or a similar approach …

J


#7

[quote=“srbjwe”]This last code snippet works for me. I am wondering if there is a faster mechanism in the language than using nested for/in loops.

I am using:

    for (NSString *n in names) {
        for (NSString *w in words) {
            if (!([n isEqualToString:w]) && ([n caseInsensitiveCompare:w] == NSOrderedSame)) {
                NSLog(@"%@ is a proper name and %@ is also a regular word.", n, w);
            }
        }
    }

The book makes a point of mentioning that the for/in loop is a highly tuned looping structure. Even so, the code runs a little slow. Is there a set comparison where we could compare each outer word with a set of words? Or a similar approach …

J[/quote]

Someone posted code in one of the other threads for this chapter that supposedly completes in 1 sec. It was a little over my head at this point though.


#8

Try optimising the test expression of the if statement:

        for (NSString *n in names) {
            for (NSString *w in words) {
                // Filter out the duplicate proper names and compare the rest
                if (!([n isEqualToString:w]) && ([n caseInsensitiveCompare:w] == NSOrderedSame)){
                    NSLog(@"%@ is a proper name and %@ is also a regular word", n, w);
                }
            }
        }

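The T1 && T2 expression in the above if statement can be optimised by swapping T1 and T2. Because && short-circuits, in the resulting expression T2 && T1, T1 is evaluated only if T2 evaluates to true. In the original order, T1 (the strings are not identical) is true for almost every pair, so both comparisons run nearly every time; after the swap, T2 (the case-insensitive match) is false for almost every pair, so the second comparison is usually skipped.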

        for (NSString *n in names) {
            for (NSString *w in words) {
                // Filter out the duplicate proper names and compare the rest
                if (([n caseInsensitiveCompare:w] == NSOrderedSame) && !([n isEqualToString:w])){
                    NSLog (@"%@ is a proper name and %@ is also a regular word", n, w);
                }
            }
        }

Try the following Objective-C++ code on your machine:
main.mm (Note the mm suffix)

//  main.mm

#import <Foundation/Foundation.h>
#import <iostream>
#import "MyTiming.h"

int main (int argc, const char * argv[])
{
    @autoreleasepool {
        // Read in a file of proper names and store in array
        NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames" encoding:NSUTF8StringEncoding error:NULL];
        NSArray *names = [nameString componentsSeparatedByString:@"\n"];
        
        // Read in a file of regular words and store in array
        NSString *wordString = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];
        NSArray *words = [wordString componentsSeparatedByString:@"\n"];
        
        {
            my::timing::ElapsedTime T1 ("---> Code-1");
            for (NSString *n in names) {
                for (NSString *w in words) {
                    // Filter out the duplicate proper names and compare the rest
                    if (!([n isEqualToString:w]) && ([n caseInsensitiveCompare:w] == NSOrderedSame)){
                        NSLog (@"%@ is a proper name and %@ is also a regular word", n, w);
                    }
                }
            }
        }
        {
            my::timing::ElapsedTime T2 ("---> Code-2");
            for (NSString *n in names) {
                for (NSString *w in words) {
                    // Filter out the duplicate proper names and compare the rest
                    if (([n caseInsensitiveCompare:w] == NSOrderedSame) && !([n isEqualToString:w])){
                        NSLog (@"%@ is a proper name and %@ is also a regular word", n, w);
                    }
                }
            }
        }
     }
    return 0;
}

MyTiming.h

// MyTiming.h

#ifndef MY_TIMING_H
#define MY_TIMING_H

#include <iostream>   // std::cout, std::ostream
#include <string>     // std::string
#include <sys/resource.h>

namespace my {
    namespace timing {
        struct ElapsedTime
        {
            rusage t1;
            
            std::string prefix;
            bool autoPrint;
            std::ostream &os;
            void (ElapsedTime::*AutoPrint)(std::ostream&) const;
            
            ElapsedTime (const std::string &prefix, bool autoPrint = true)
            : prefix (prefix), autoPrint (autoPrint), os (std::cout)
            {
                Setup (autoPrint);
            }
            
            ElapsedTime (const std::string &prefix, std::ostream &os, bool autoPrint=true)
            : prefix (prefix), autoPrint (autoPrint), os (os)
            {
                Setup (autoPrint);
            }
            
            void Setup (bool autoPrint)
            {
                getrusage (RUSAGE_SELF, &t1);
                if (autoPrint)
                    AutoPrint = &ElapsedTime::Print;
                else
                    AutoPrint = &ElapsedTime::Nop;
            }
            
            ~ElapsedTime ()
            {
                (this->*AutoPrint) (os);
            }
            
            double Value () const
            {
                rusage t2;
                getrusage (RUSAGE_SELF, &t2);
                double v = double (1000000) * double (t2.ru_utime.tv_sec - t1.ru_utime.tv_sec)
                + double (t2.ru_utime.tv_usec - t1.ru_utime.tv_usec);
                return v;
            }
            
            void Print () const
            {
                Print (os);
            }
            
            void Print (std::ostream &os) const
            {
                os << prefix << ": " << Value() << " microseconds" << std::endl;
            }
            void Nop (std::ostream &os) const
            {
            }
        };
    }
}
#endif

The speed-up is roughly twofold (on my machine):

...
---> Code-1: 3.99146e+07 microseconds
...
---> Code-2: 2.10156e+07 microseconds

#9

Great answers … thank you!

I noticed that the system didn’t like

#import <iostream>
#import "MyTiming.h"

I am assuming MyTiming.h is your own file … I have used <iostream> in previous C development, but I haven’t within Xcode. Does anyone know where or how to access this import?
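If the Objective-C++ header is a hurdle, a rough alternative is to time a block in plain Objective-C with NSDate. This is only a sketch under my own assumptions: the summing loop is a stand-in workload, and it measures wall-clock time rather than the user CPU time that MyTiming.h reads from getrusage, so the numbers will not be directly comparable.

```objc
#import <Foundation/Foundation.h>

int main(int argc, const char *argv[])
{
    @autoreleasepool {
        NSDate *start = [NSDate date];

        // Stand-in workload; replace with the nested loops being measured
        NSUInteger total = 0;
        for (NSUInteger i = 0; i < 1000000; i++) {
            total += i;
        }

        // timeIntervalSinceNow is negative for a date in the past, so negate it
        NSTimeInterval elapsed = -[start timeIntervalSinceNow];
        NSLog(@"total = %lu, elapsed = %f seconds", (unsigned long)total, elapsed);
    }
    return 0;
}
```

This builds as an ordinary main.m, so no .mm suffix or extra header is needed.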

I can see that the code certainly runs faster. I am thinking that using sets might also be a faster way to operate. The set stuff looks really fast.

Thanks again!


#10

I found an approach that uses sets and it is significantly faster … try this code.

[code]NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames"
                                                 encoding:NSUTF8StringEncoding
                                                    error:NULL];
NSString *wordString = [NSString stringWithContentsOfFile:@"/usr/share/dict/words"
                                                 encoding:NSUTF8StringEncoding
                                                    error:NULL];
//break it into an array of strings
NSArray *names = [nameString componentsSeparatedByString:@"\n"];
NSArray *words = [wordString componentsSeparatedByString:@"\n"];

NSSet *nameSet = [NSSet setWithArray:names];
NSSet *wordSet = [NSSet setWithArray:words];
for (NSString *n in names) {
    if ([words containsObject:n]) {
        NSLog(@"Array contains %@", n);
    }
}
for (NSString *n in nameSet) {
    if ([wordSet containsObject:[n lowercaseString]]) {
        NSLog(@"%@", [n lowercaseString]);
    }
}

[/code]

I didn’t bother to get specific times, but it does run much faster for me. NSSet looks objects up by hash, so containsObject: on a set does not have to scan every element the way the nested loops do.
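For anyone who wants the set version to give the same kind of output as the nested loops (the name paired with its lowercase word), here is one possible sketch. NamesThatAreAlsoWords is a hypothetical helper name of my own, and the approach assumes the ordinary words in the word list are all lowercase, so it only looks up the lowercase form of each name. The short arrays in main are made-up sample data standing in for the two dictionary files.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the names that differ from their own lowercase
// form and whose lowercase form appears in words.
static NSArray *NamesThatAreAlsoWords(NSArray *names, NSArray *words)
{
    NSSet *wordSet = [NSSet setWithArray:words];
    NSMutableArray *matches = [NSMutableArray array];
    for (NSString *n in names) {
        NSString *lower = [n lowercaseString];
        // Skip empty strings (left by a trailing newline) and names that are
        // already lowercase, since those would only match themselves
        if ([n length] == 0 || [n isEqualToString:lower]) {
            continue;
        }
        if ([wordSet containsObject:lower]) {
            [matches addObject:n];
        }
    }
    return matches;
}

int main(int argc, const char *argv[])
{
    @autoreleasepool {
        // Made-up sample data in place of the two dictionary files
        NSArray *names = [NSArray arrayWithObjects:@"Al", @"Bob", @"Zed", @"", nil];
        NSArray *words = [NSArray arrayWithObjects:@"al", @"Al", @"bob", @"cat", @"", nil];

        for (NSString *n in NamesThatAreAlsoWords(names, words)) {
            NSLog(@"%@ is a proper name and %@ is also a regular word", n, [n lowercaseString]);
        }
    }
    return 0;
}
```

Each name costs one set lookup instead of a pass over the whole word list, which is where the speed-up comes from.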

Jwe