Saturday, July 20, 2013

Grep - Swiss army knife


I have used this command sparingly in the past, and I recently learnt more of its options while solving a couple of my problems. I really don't know whether grep is the right choice here, but hey, it worked :-)

Problem 1: Filtering CSV

I had a very large CSV file (620 MB), a report exported from Salesforce. I had to find the duplicate accounts and contacts that share the same email id.

Excel and Numbers were either crashing or taking light years to respond to any action I performed on the data.

The CSV was in the format:

Email: unique@domain.com

Contact: 1 Sales Contact
Duplicate Contact 1

Email: duplicate-exists@domain.com

Contact: 2 Sales Contacts
Duplicate Contact 1
Duplicate Contact 2

Email: duplicate-exists@another-domain.com

Contact: 3 Sales Contacts
Duplicate Contact 1
Duplicate Contact 2
Duplicate Contact 3

I knew I could see how many contacts have duplicates by simply running grep -i 'contacts' filename.csv | wc -l, but I also wanted the email id associated with those contacts, which is not on the same line.

A quick search of man grep turned up an option to print the lines before a match in addition to the matching line itself. Here is the recipe I used to solve the problem.

# Return the emails of all records that have more than one sales contact
Ξ ~/Desktop → grep -B 2 'Contacts' filename.csv | grep 'Email:'

From Manual 
-B num, --before-context=num
    Print num lines of leading context before each match.  See also the -A and -C options.
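
To see why -B 2 is enough here, below is a minimal sketch that rebuilds the sample layout shown earlier in a temporary file and runs the same recipe on it. The temporary file and the variable names are only for illustration; the sketch assumes the Email: line always sits exactly two lines above the Contact: line, as in the sample.

```shell
# Recreate the sample layout from above in a throwaway file (illustrative data only).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
Email: unique@domain.com

Contact: 1 Sales Contact
Duplicate Contact 1

Email: duplicate-exists@domain.com

Contact: 2 Sales Contacts
Duplicate Contact 1
Duplicate Contact 2

Email: duplicate-exists@another-domain.com

Contact: 3 Sales Contacts
Duplicate Contact 1
Duplicate Contact 2
Duplicate Contact 3
EOF

# The plural 'Contacts' only appears on records with more than one contact.
# -B 2 prints the two lines before each match, which includes the Email: line;
# the second grep keeps just those Email: lines.
duplicate_emails="$(grep -B 2 'Contacts' "$sample" | grep 'Email:')"
echo "$duplicate_emails"

# -c counts the matching lines directly, without piping to wc -l.
duplicate_count="$(grep -c 'Contacts' "$sample")"
echo "$duplicate_count"

rm -f "$sample"
```

If the distance between the Email: line and the Contact: line varies in the real export, a larger -B value still works, since the second grep throws away everything that is not an Email: line. The catch is that a large context can also pull in the Email: line of the previous record.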

 

Problem 2: Objects and APEX classes linked to a particular object

We needed to estimate the effort required for an impact analysis of migrating an existing Salesforce object to a new object. This activity includes identifying the number of classes that reference the object in question and estimating the effort to analyse the source files that reference it.

To arrive at an approximate effort estimate, we needed a list of files and the number of lines in each file.

The Salesforce Schema browser is not very helpful if I just want the _number_ of associated objects, not what the relations are or how one object relates to another. It is also a real pain to collapse and follow the links/pipes in a production environment with a large number of objects.

As on every other occasion, management was asking for the numbers as soon as possible, and developers were doing a Ctrl + F in Eclipse on the classes and counting the number of files. I could not suggest anything to the team or question them, as I have very little knowledge of Salesforce APEX development.

Since I am the author of iForce (a Sublime Text extension to help Salesforce development), I knew that all the files are just text with metadata. So if I knew the pattern to search for, I could use grep to get the list of files and pipe it to wc to get the approximate number we were looking for.

Well, it took more time to finalise the pattern of object usage in APEX code than to find the number. Also, the default payload.xml of iForce doesn't fetch all the required objects, so I quickly replaced the payload.xml in my iForce working copy with the one from the Eclipse workspace.

Once I refreshed my iForce working copy from the server, the answer was just minutes away. Here is the list of commands I ran to get the numbers. I just copied them to Excel and formatted the columns with a bright background colour for the people above me in the food chain to process ;)

Ξ salesforce-sandbox/classes → grep -iE 'new contact|new account' -l *.cls | wc -l

Ξ salesforce-sandbox/triggers → grep -iE 'new contact|new account' -l *.trigger | wc -l

Ξ salesforce-sandbox/components → grep -iE 'new contact|new account' -l *.component | wc -l

P.S.: The count from wc -l is not as meaningful a measure of lines as the one from CLOC, but I was in a hurry, and I have not tried CLOC with APEX code.
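
For the "list of files and number of lines in each file" part, one possible sketch is to hand the file names from grep -l to wc -l through xargs. The class files below are made up purely for illustration, and the approach assumes file names without spaces, which holds for typical .cls names.

```shell
# Throwaway directory with hypothetical APEX classes (names and contents are made up).
workdir="$(mktemp -d)"
printf 'public class A {\n  Contact c = new Contact();\n}\n' > "$workdir/A.cls"
printf 'public class B {\n  Integer n = 0;\n}\n' > "$workdir/B.cls"
printf 'public class C {\n  Account a =\n    new Account();\n}\n' > "$workdir/C.cls"

# grep -l lists only the files that contain a match;
# xargs passes those names to wc -l, which prints a raw line count per file
# followed by a total when there is more than one file.
report="$(cd "$workdir" && grep -ilE 'new contact|new account' *.cls | xargs wc -l)"
echo "$report"

# Number of matching files, as in the commands above.
file_count="$(cd "$workdir" && grep -ilE 'new contact|new account' *.cls | wc -l | tr -d ' ')"
echo "$file_count"

rm -rf "$workdir"
```

The per-file figures are raw line counts, so they still include blank lines and comments.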

Link: This is Linus

Last week Zite suggested this article. I've known about Linus and his rude comments, but reading this article, I realised that being polite is not the right solution for everything. Having worked with many developers over the years, I have met only a few who act on the subtle comments I pass during discussions about code and best practices. The rest just ignore them and do what they have always been doing.



The fact is, people need to know what my position on things are. And I can't just say "please don't do that", because people won't listen. I say "On the internet, nobody can hear you being subtle," and I mean it.

Because if you want me to "act professional," I can tell you that I'm not interested. I'm sitting in my home office wearing a bathrobe. The same way I'm not going to start wearing ties, I'm *also* not going to buy into the fake politeness, the lying, the office politics and backstabbing, the passive aggressiveness, and the buzzwords. Because THAT is what "acting professionally" results in: people resort to all kinds of really nasty things because they are forced to act out their normal urges in unnatural ways.

At times you need to be King Leonidas and kick them hard so they learn and do not repeat the mistake.

Link: Linus Torvalds defends his right to shame Linux kernel developers

Sunday, July 7, 2013

WWDC'13 - Hidden Gems in Cocoa and Cocoa Touch


If you did not have enough time to go through the 50 GB of content from WWDC'13, I highly recommend watching session 228, aptly named "Hidden Gems in Cocoa and Cocoa Touch".

Links: PDF | Video - SD | Video - HD

There were 30+ tips. If you are interested in my score, it is only 8.

For the third year in a row, I have published the compiled list of sessions and download links here. Note: You need iOS developer program credentials to download the files.