Thursday, September 30, 2010

Come Home (Software) Engineer

Everything that comes our way changes us, and many times we do not see the change. The mind is sticky, particularly when it comes to money or company. Money changes our beliefs, makes us risk averse, and puts us on a track that we did not want to be on. An equally powerful influence is the people we work with, the company we keep. I realize that I picked up some of the habits I actually hated. The sum of all this is that one ends up on the wrong track, in the wrong role at work, behaving differently from how one wanted to. Lost home. Hmmmm ... time to go back home.

So, why did I like my job as a software engineer? Here is the list to check and rank:

Solving hard problems
Understanding things
Making Money, financial security
Being powerful
Build world-class products
Being most intelligent in group
Be a leader, create leaders
Business alignment
Build big architectures - like big bridges in civil world


To me this quote shows the spirit of engineer:
Scientists investigate that which already is; Engineers create that which has never been.

To me, creativity, knowledge and problem solving come at the top. And I like solving problems that take some time, not the ones that are just a memory recall away. My day job should be filled with these.

It is worth recalling that we software engineers have a pact with GOD: He doesn't write code and we don't do miracles. It is about moving from being a super-hero to being more humble and low on ego. As Dijkstra says, "The competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague."

Hiring people to write code to sell is not the same as hiring people to design and build durable, usable, dependable software.
--Larry Constantine

It is time to go back to being an engineer rather than a hire.

Tuesday, August 3, 2010

What is Architecture?

Architecture is used with so many adjectives that I find it worthwhile to think about what architecture is. Business architecture, architecture of democracy, etc. give a feeling that architecture is about "essence". To begin with, it looks like an executive summary, and it even stretches to look similar to a theory in science or an axiom in maths. The difference between a mere description of something (say, a specification) and an architecture is not very clear. It is also not clear whether an architecture should be demarcated by the description itself or by the use of that description.

Gregory Chaitin, in his book "META MATH", differentiates between a description and a theorem. In simple terms, a theorem should take less information to encode than the description it stands for. And once we find a more concise theorem, the more elaborate one is no longer needed.

Do we call Einstein's theory of relativity the architecture of astronomy? Perhaps not. The next level is where we group theories and facts together and call it architecture. Do we mean the principles when we say architecture of democracy?

Or, as in a formal system, do the alphabet, production rules and axioms that together describe the system constitute the architecture of that system?

Or do we talk of architecture only when we build systems? That is, if there is nothing to architect, there is no architecture. I can't think of an example where this is not true.

Architecture is a set of principles and techniques, as written in the book "The Art of Systems Architecting".

Anyway, when we discuss this in the Information Technology/Software Engineering field, the definition is a little narrower.

Here is one of the definitions:
Architecture: Orderly arrangement of parts; structure.

Another one from IEEE:
Architecture. The organizational structure of a system or component
– IEEE Glossary of Software Engineering Terminology, 610.12–1990

Or a more precise one from IEEE 1471:

architecture: the fundamental organization of a system embodied in its components, their relationships to each other, and to the environment, and the principles guiding its design and evolution.

This means we need,
A System
Environment
System having components
Components interacting with each other

And we are talking about design and evolution of this system.

Monday, July 12, 2010

GC performance

On a regular single-CPU laptop:

New GC reclaim: 115 MB in 0.0285765 sec -> 4024 MB/sec
New GC scan: 940 MB in 0.0285765 sec -> 32 GB/sec
Full GC: 860 MB in 6.2 sec -> 138 MB/sec
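
The rates above are just size divided by pause time. Here is a minimal sketch that redoes the arithmetic, assuming the pause times are in seconds; the class and method names are only illustrative.

```java
public class GcThroughput {
    /** Throughput in MB/sec for a given amount of memory and pause time. */
    static double mbPerSec(double megabytes, double seconds) {
        return megabytes / seconds;
    }

    public static void main(String[] args) {
        // Figures taken from the measurements in this post.
        System.out.println("New GC reclaim: " + mbPerSec(115, 0.0285765) + " MB/sec");
        System.out.println("New GC scan:    " + mbPerSec(940, 0.0285765) + " MB/sec");
        System.out.println("Full GC:        " + mbPerSec(860, 6.2) + " MB/sec");
    }
}
```

The young-generation scan comes out at roughly 32,900 MB/sec, which is where the 32 GB/sec figure comes from. The full GC is more than two orders of magnitude slower per megabyte.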

Thursday, April 22, 2010

Human Genome sequencing: computational aspect

Business,
… Knome.com, currently provides genome sequencing services but the cost is about $99,500 per genome
… At the end of February 2009, Complete Genomics released a full sequence of a human genome … will contain approximately 80,000-100,000 false positive errors in each genome
… In June 2009, Illumina announced that they were launching their own Personal Full Genome Sequencing Service at a depth of 30X for $48,000 per genome
… In November 2009, Complete Genomics announced that they are now able to sequence a full genome for $1,700… Complete Genomics has previously released statements that it was unable to follow through on
… In March 2010, Pacific Biosciences said they have raised more than $256 million USD in venture capital money and that they will be shipping their first 10 full genome sequencing machines by the end of 2010. … by 2015 … $100 per genome


Technology and open source software
http://www.cbcb.umd.edu/research/assembly_primer.shtml
http://sourceforge.net/apps/mediawiki/amos/index.php?title=AMOS

Problem statement in computing terms
A long string over 4 characters (A, T, G and C) is broken down into many pieces. Some of the pieces may be missing because of biological limitations and human error. The string has to be reassembled by looking for sub-sequences that look the same across pieces. This is error prone, but here is the good news: multiple copies of the long string are shredded, and it has been shown that shredding 8-10 copies is enough to get a very good genome sequence.

State of the Art
Greedy Algorithms
Overlap-layout algorithm – Hamiltonian path
Eulerian path
Align-layout
BAC-by-BAC (hierarchical sequencing)
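
To get a feel for the greedy approach in the list above, here is a toy sketch: it repeatedly merges the two fragments with the longest suffix/prefix overlap. It assumes error-free reads and exact matching, which real assemblers such as AMOS cannot assume. The class name, method names and sample reads are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GreedyAssembler {
    /** Length of the longest suffix of a that is also a prefix of b. */
    static int overlap(String a, String b) {
        int max = Math.min(a.length(), b.length());
        for (int len = max; len > 0; len--) {
            if (a.regionMatches(a.length() - len, b, 0, len)) {
                return len;
            }
        }
        return 0;
    }

    /** Merges fragments greedily; returns the contigs left when no overlaps remain. */
    static List<String> assemble(List<String> reads) {
        List<String> frags = new ArrayList<>(reads);
        while (frags.size() > 1) {
            int bestI = -1, bestJ = -1, bestLen = 0;
            for (int i = 0; i < frags.size(); i++) {
                for (int j = 0; j < frags.size(); j++) {
                    if (i == j) continue;
                    int len = overlap(frags.get(i), frags.get(j));
                    if (len > bestLen) {
                        bestLen = len;
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            if (bestLen == 0) break; // nothing overlaps any more
            String merged = frags.get(bestI) + frags.get(bestJ).substring(bestLen);
            // Remove the higher index first so the lower one stays valid.
            frags.remove(Math.max(bestI, bestJ));
            frags.remove(Math.min(bestI, bestJ));
            frags.add(merged);
        }
        return frags;
    }

    public static void main(String[] args) {
        // Three overlapping pieces of the made-up string ATGCGTACGTTAGC.
        List<String> reads = Arrays.asList("GTACGTTA", "ATGCGTAC", "GTTAGC");
        System.out.println(assemble(reads)); // prints [ATGCGTACGTTAGC]
    }
}
```

Every round compares all pairs of fragments, so this does not scale to the volumes quoted here, and greedy merging can pick the wrong overlap on repeated regions. That is one reason the graph-based formulations in the list exist.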

… June 2008 the quantity of purity-filtered sequence data generated by our Genome Analyzer (Illumina) platforms reached 1 terabase, and our average weekly Illumina production output is currently 64 gigabases…


Looks inspiring?

-bala

Monday, April 19, 2010

Bug in java's System.out.write with large byte array

Here is a program that just writes a lot of "a"s. Compile it and run it with
java B 60000
It doesn't write anything. Now try,
java B 50000
It does write.

----------------------------------------------------------------------------------------
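import java.io.*;

public class B {
    public static void main(String[] args) throws IOException {
        // Number of bytes to write, passed on the command line (e.g. 60000).
        int count = Integer.parseInt(args[0]);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append("a");
        }
        byte[] b = sb.toString().getBytes();
        // write(byte[]) declares IOException, hence the throws clause on main.
        System.out.write(b);
        System.out.println("Number of bytes: " + b.length);
    }
}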

---------------------------------------
My colleague Venkatesh pointed out that some error has occurred; because PrintStream does not throw an exception for it, it sets an internal error flag instead, which can be read with checkError():

System.out.println("Error=" + System.out.checkError());
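
The same behaviour can be reproduced without a real console. Below is a small sketch that wraps a PrintStream around a stream that rejects large writes. LimitedStream and its limit are hypothetical stand-ins for whatever failed underneath System.out in the run above; only PrintStream and checkError() are the real API.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class CheckErrorDemo {
    /** Hypothetical sink that fails when a single write exceeds the limit. */
    static class LimitedStream extends OutputStream {
        private final int limit;

        LimitedStream(int limit) {
            this.limit = limit;
        }

        @Override
        public void write(int b) {
            // Single bytes are always accepted and discarded.
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            if (len > limit) {
                throw new IOException("write too large: " + len);
            }
        }
    }

    /** Returns true if PrintStream recorded an error for a write of the given size. */
    static boolean writeFails(int size, int limit) {
        PrintStream out = new PrintStream(new LimitedStream(limit));
        byte[] data = new byte[size];
        out.write(data, 0, data.length); // never throws; the IOException is swallowed
        return out.checkError();
    }

    public static void main(String[] args) {
        System.out.println("50000 bytes, error=" + writeFails(50000, 55000)); // false
        System.out.println("60000 bytes, error=" + writeFails(60000, 55000)); // true
    }
}
```

So after any bulk write to System.out, checkError() is the only way to find out that the data did not go through.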

Wednesday, March 10, 2010

J2ME jar size

If you build J2ME applications, sooner or later you will hit the J2ME jar size problem. If you use libraries like LWUIT, you start at 90KB. Then you add resources (images, strings) of 130KB. And then your app takes a few hundred KB more. This hits you when you realize that some phones, like the MOTO RAZR, only support app sizes of 400 or 500KB, and some phones even restrict you to 200KB.

Statement-level LOC, methods and classes are the three variables that we played with. Keeping LOC and classes constant, we changed the number of methods. Here are the numbers:

Methods   Jar Size (Bytes)   Delta for additional method
1         1301
2         1320               19
3         1339               19


Keeping LOC constant, we added classes (and with each of the first three additions a new method also came along):

Classes   Jar Size (Bytes)   Delta for additional class
1         1301
2         1738               437
3         2167               429
4         2605               438
5         2913               308 (empty class added)


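So every additional method costs about 19 bytes, and every additional class costs roughly 308 to 420 bytes: 308 bytes for an empty class, and 429 to 438 bytes for a class that brings a new method with it, about 19 bytes of which is the method.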
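
The figures above can be turned into a back-of-the-envelope estimator. This is only a sketch: it assumes the deltas stay linear beyond the handful of data points measured here, and the constant and method names are made up.

```java
public class JarSizeEstimate {
    static final int BASE_BYTES = 1301;              // measured: one class with one method
    static final int METHOD_BYTES = 19;              // measured: extra method in an existing class
    static final int CLASS_WITH_METHOD_BYTES = 435;  // rounded mean of 437, 429 and 438
    static final int EMPTY_CLASS_BYTES = 308;        // measured: empty class

    /** Estimated jar size in bytes, assuming the measured deltas add up linearly. */
    static int estimate(int extraMethods, int extraClassesWithMethod, int extraEmptyClasses) {
        return BASE_BYTES
                + extraMethods * METHOD_BYTES
                + extraClassesWithMethod * CLASS_WITH_METHOD_BYTES
                + extraEmptyClasses * EMPTY_CLASS_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(estimate(2, 0, 0)); // 1339, same as the measured three-method jar
        System.out.println(estimate(0, 3, 1)); // 2914, against a measured 2913
    }
}
```

On these numbers, splitting code into many small classes costs far more jar space than adding methods to an existing class.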

Here is the class that we were playing with,

public class HelloWorld {
    public void say()
    {
        System.out.println("H");
        System.out.println("e");
        System.out.println("l");
        System.out.println("l");
        System.out.println("o");
    }
}

For methods we had,


public class HelloWorld {
    public void say()
    {
        System.out.println("H");
        System.out.println("e");
        System.out.println("l");
        System.out.println("l");
        System.out.println("o");
    }

    public void say1()
    {
        System.out.println("W");
        System.out.println("o");
        System.out.println("r");
    }

    public void say2()
    {
        System.out.println("l");
        System.out.println("d");
        System.out.println("!");
    }
}