Challenges of the Software Age

This week, we got a chance to delve into some open source news sources. The task was to pick two articles from opensource.com, post a personal response, and reflect on each article’s argument or thesis. For my first article, I chose one about Louis C.K., the comedian. I am a lover of comedy, and Louis C.K. is one of my favorites. To be honest, I was a little surprised to see an article about him in a software magazine, but once I read it I fully understood.

The article states its premise outright: “The answer to stabilizing content and price is letting artists retain greater control of their work.” This claim is not unheard of, and it is a sound premise. It is based on Louis C.K. offering a download of his stand-up special for five dollars a pop. He used this model instead of the traditional production model to show that ease of access generates revenue and that the current system of production is antiquated. Using this method, he spent approximately 250,000 dollars and, surprisingly, generated 1,000,000 dollars in revenue. The author of the article argues that this electronic method provided better user-developer relations and allowed Louis to make better quality comedy, because he had control over the entire production.

This argument needs to be heard many times over, and not just in comedy. Software can benefit from this way of thinking as well. Over and over we hear of software revenue lost to piracy. However, as the author of the article says, the issue can be solved simply through ease of access and a reasonable pricing model. Mobile application development is another great example. The rise of mobile apps that are easy to afford and easy to install demonstrates the key principle: price and production affect piracy. The current structure of the software world promotes attacking individuals for sharing files and punishing paying users with inconvenient protection measures. Removing those measures helps customer relations, because the paying user no longer feels punished for choosing the right way.

The article also discusses price. The price of software can reach into the millions of dollars. How much of this cost is purely administrative? How much comes from unnecessary costs such as advertising and publishing? When you look at a sixty-dollar game in the store, a significant chunk of that price is going straight to the publisher, not the developer. By removing these middlemen in the modern internet era, we can reduce the cost of software to a point where it is almost non-existent (open source, anyone?). We can create better software by fostering a more direct relationship between the end user and the developer, and we can offer it at a better price by being in greater control of our development process. This is the point I took from the article. Developers must always be agile in the fast-paced field of technology. So why not start adapting now?

The second article I chose was called “A cure for the common troll”. By troll, the author means patent troll. With the boom in software come new technologies and innovations, all of which can be patented in order to protect the developer’s invention. Companies have arisen whose sole purpose is to collect patents and then sue infringers. This is the art of the patent troll. The results of this trolling are harmful. For one, patent trolling restricts innovation, because smaller companies cannot develop without being sued out of existence. Another problem is that many of these patents have been bought, sold, and traded. Their holders are not the true inventors but people who acquire patents from the inventors for profit. By doing so, they go against the whole concept of a patent to begin with: to protect the inventor. Lastly, many of the patents are handled in an archaic manner. A common analogy is that it is like patenting the door knob or the wheel. These are basic, universal components that should not be patentable, because they are so basic and necessary to software development.

The article suggests a way to deal with these trolls of the modern age:

“First, create a compulsory licensing mechanism for patents whose owners are not making competitive use of the technology in those patents. Patent owners should be required to declare the areas or products that incorporate the patented technology. All other non-practiced areas should be subject to a compulsory license fee. (A non-practiced “area” would be a market or technology sector or activity in which the patent owner is not using or licensing the invention rights, though the owner may be using the patent in other “areas.”) Licensing rates for patents could be set by patent classification or sub-classification based on industry average licensing rates for each such technology. Again, this would only apply to applications where the patent is not being practiced or voluntarily licensed by the patent owner.
Given the vast number of patents issued, an accused party should have a reasonable, set time after receiving notice of a patent within which to pay for the license going forward. Compulsory licenses are authorized by the treaties we have entered into, and we have significant experience with compulsory licensing of copyrighted works from which to develop an analogous patent mechanism. Uniform rates could be set.
Second, cap past damages for trolls at $1 million per patent and eliminate the possibility of obtaining injunctive relief for infringement of patents that are not in use, or are not used commercially, by the patent owner.
Third, a mandatory fee shifting provision should be put in place where the plaintiff is required to pay the defendant’s reasonable defense fees if the plaintiff does not obtain a better recovery than what was offered by the defendant. (Presently, there is such a cost shifting mechanism in place; however, the relevant costs typically are a tiny fraction of the legal fees in a case.)
Fourth, for U.S. domestic defendants, require that suits be brought in the venue where the defendant’s primary place of business is located.
Fifth, if a party wants more than limited discovery from the opposing side, particularly for electronically stored information (ESI), the requesting party should pay the cost of production. For large technology companies, ESI production alone can cost into the seven figures.”

I am a big supporter of all these concepts. I would also add to the list that patents cannot be bought or sold, only inherited or renounced (made open to all). Under that rule, companies that exist only to hold patents would no longer be viable. Each of the author’s other suggestions is a great idea and should be considered in updating our current system of patent application and distribution.

These two articles discussed some hot-button issues, not just in open source development but in all forms of software development. I particularly enjoyed this assignment and found the articles to be both informative and interesting. I look forward to reading more!

Exercises in Unit Testing

Today’s exercise is based on chapter five of our software development book. I have to work through three exercises at the end of the chapter. Based on what I have seen and read, these exercises apply testing and documentation practices within a FOSS project. Let’s get started.

5.1 Examine the RMH Homebase release 1.5 code base and accompanying documentation. Identify at least one instance of the following:

a. Long Method
b. Too Few Comments
c. Data Clumps

This method comes from personEdit.php. It is over 140 lines long, and its purpose is to sanitize and process every field the form submits, as the excerpt below shows. This part of the code alone affects more than 14 variables, which could cause some hard-to-find side effects. There are few comments explaining what each block does, and the function tries to accomplish too much. There is so much going on that it could be its own PHP file. Every piece of data extraction also happens in this one function, keyed on the variable id, and using id across so many different extractions leads to data clumping.

function process_form($id) {
//step one: sanitize data by replacing HTML entities and escaping the ' character
$first_name = trim(str_replace('\\\'','',htmlentities(str_replace('&','and',$_POST['first_name']))));
$last_name = trim(str_replace('\\\'','\'',htmlentities($_POST['last_name'])));
$address = trim(str_replace('\\\'','\'',htmlentities($_POST['address'])));
$city = trim(str_replace('\\\'','\'',htmlentities($_POST['city'])));
$state = trim(htmlentities($_POST['state']));
$zip = trim(htmlentities($_POST['zip']));
$phone1 = trim(str_replace(' ','',htmlentities($_POST['phone1'])));
$clean_phone1 = ereg_replace("[^0-9]", "", $phone1);
$phone2 = trim(str_replace(' ','',htmlentities($_POST['phone2'])));
$clean_phone2 = ereg_replace("[^0-9]", "", $phone2);

$private_notes = trim(str_replace('\\\'','\'',htmlentities($_POST['private_notes'])));
$public_notes = trim(str_replace('\\\'','\'',htmlentities($_POST['public_notes'])));
$my_notes = trim(str_replace('\\\'','\'',htmlentities($_POST['my_notes'])));

$background_check = '';
$shadow = '';
$interview = '';
if($_POST['background_check']=='yes') $background_check = 'yes';
if($_POST['interview']=='yes') $interview = 'yes';
if($_POST['shadow']=='yes') $shadow = 'yes';

$convictions = trim(str_replace('\\\'','\'',htmlentities($_POST['convictions'])));
$wherelived = trim(str_replace('\\\'','\'',htmlentities($_POST['wherelived'])));
$experience = trim(str_replace('\\\'','\'',htmlentities($_POST['experience'])));
$motivation = trim(str_replace('\\\'','\'',htmlentities($_POST['motivation'])));
$specialties = trim(str_replace('\\\'','\'',htmlentities($_POST['specialties'])));
// ... (remainder of the function omitted)

d. Speculative Generality

As for speculative generality, the helpfooter.inc file clutters the code and could easily have been made into a txt file for the user. Better yet, it could have been folded into a single method that deals with all footers, instead of being a repetitive include file.

5.2 For each of these “bad smells” refactor the code to reduce the size of the code base.

a. Long Method
b. Too Few Comments
c. Data Clumps

For these three I had to do a good chunk of work. I separated the process_form into different functions. This made the code more manageable and alleviated the Long Method and Data Clumps issues. This yielded four methods: process_address, process_notes, process_meeting, and process_background. Then I had to add comments to each explaining what they did and why they were there. This de-cluttered the code and turned a long method into a reasonable one.
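To make the shape of that refactoring concrete, here is a minimal sketch of the same idea. It is written in Python rather than PHP, it is not the RMH Homebase code, and the helper names clean_text and clean_phone are my own assumptions. Only process_address, process_notes, and process_background are shown; process_meeting is left out because the fields it covers are not listed above.

```python
import html
import re


def clean_text(value):
    """Escape HTML entities and trim whitespace from one submitted field."""
    return html.escape(value).strip()


def clean_phone(value):
    """Keep only the digits of a phone number."""
    return re.sub(r"[^0-9]", "", value)


def process_address(form):
    """Sanitize the contact fields: name, address, and phone numbers."""
    text_fields = ("first_name", "last_name", "address", "city", "state", "zip")
    cleaned = {key: clean_text(form.get(key, "")) for key in text_fields}
    cleaned["phone1"] = clean_phone(form.get("phone1", ""))
    cleaned["phone2"] = clean_phone(form.get("phone2", ""))
    return cleaned


def process_notes(form):
    """Sanitize the three free-text note fields."""
    note_fields = ("private_notes", "public_notes", "my_notes")
    return {key: clean_text(form.get(key, "")) for key in note_fields}


def process_background(form):
    """Normalize the yes/no checkboxes to 'yes' or an empty string."""
    flags = ("background_check", "interview", "shadow")
    return {key: "yes" if form.get(key) == "yes" else "" for key in flags}


def process_form(form):
    """Combine the helpers; each one owns a single group of fields."""
    result = {}
    result.update(process_address(form))
    result.update(process_notes(form))
    result.update(process_background(form))
    return result
```

Each helper now fits on one screen and can be commented and tested on its own, which is the point of breaking up a long method.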

d. Speculative Generality: I combined helpfooter.inc with footer.inc to remove one include file, and changed the reference to helpfooter.inc within the help.php file.

5.3 Perform unit tests

I ran unit tests on personEdit.php and help.inc that yielded all passes. Overall I am satisfied with the results of my refactoring though I will need to add in more test cases later to make sure I have checked absolutely everything necessary.
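For readers who have not written unit tests before, the sketch below shows the general pattern: small test cases that each pin down one behavior. It is in Python's unittest rather than the PHP test setup I actually used, and clean_phone is a hypothetical helper standing in for the phone-number cleanup inside process_form.

```python
import re
import unittest


def clean_phone(value):
    """Hypothetical helper: keep only the digits of a phone number."""
    return re.sub(r"[^0-9]", "", value)


class CleanPhoneTest(unittest.TestCase):
    def test_strips_punctuation_and_spaces(self):
        self.assertEqual(clean_phone("(555) 123-4567"), "5551234567")

    def test_empty_input_stays_empty(self):
        self.assertEqual(clean_phone(""), "")

    def test_letters_are_dropped(self):
        self.assertEqual(clean_phone("555-CALL"), "555")


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CleanPhoneTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() reports any failures. Adding "more test cases later" then means adding more test_ methods, one per edge case.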

This weekend my team and I will be working on setting up a project schedule for the rest of the semester. See you all then!

Bug Exercises Part II: Patching

Today’s exercises involve one of the more basic skills that all programmers should know: patching. In the open source world patching has become pretty standardized, except for certain minutiae in formatting. So for the exercise my job was to create some test patches and understand the test process using the Unix terminal.

7.2.1

This exercise was to create a basic diff output using none other than the diff command. The output of this diff matched the book’s results and is reprinted below.

steve@ubuntu:~$ diff -u hello.c hello.c.punct
--- hello.c 2012-02-09 23:50:18.659577870 -0500
+++ hello.c.punct 2012-02-09 23:51:37.780125196 -0500
@@ -5,6 +5,6 @@
#include <stdio.h>
int main() {
- printf("Hello, World.\n");
+ printf("Hello, World!\n");
return 0;
}
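The same unified format can be produced programmatically. The sketch below uses Python's standard difflib module. The file contents are an assumption on my part: a minimal hello.c, shorter than the book's file, so the hunk header starts at line 1 instead of line 5, and no timestamps are attached to the file names.

```python
import difflib

# Assumed minimal contents of hello.c; the real file in the book is longer.
ORIGINAL = r'''#include <stdio.h>

int main() {
    printf("Hello, World.\n");
    return 0;
}
'''
PUNCTUATED = ORIGINAL.replace("World.", "World!")


def make_patch(old_text, new_text, old_name, new_name):
    """Return a unified diff between two versions of a file as one string."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff)


patch = make_patch(ORIGINAL, PUNCTUATED, "hello.c", "hello.c.punct")
print(patch)
```

Lines starting with - come from the old file, lines starting with + come from the new one, and the unmarked lines are the surrounding context that lets patch find the right spot.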

7.2.2
The next exercise was to examine what changes when the -u flag is left off. Without -u, diff falls back to its default format, which is more bare: there are no file headers and no surrounding context lines, only the changed line numbers and the changed lines themselves.

steve@ubuntu:~$ diff hello.c hello.c.punct
8c8
< printf("Hello, World.\n");
---
> printf("Hello, World!\n");

7.8
This exercise was the creation of a patch file containing the word bar. First I created the file “foo” containing “bar” and ran diff on it against the null file provided by Unix, /dev/null. I saved the resulting output in a file named “foopatch.patch”. Redundant, I know. I just did not want to lose the foo.c.

--- foo.txt 2012-02-09 23:58:58.439593808 -0500
+++ /dev/null 2012-02-09 23:48:23.021831997 -0500
@@ -1 +0,0 @@
-bar

7.9
Finally, this exercise had me make a patch file using a real program. The package is called coreutils. I took the echo.c file and changed a small snippet of code, then compared it to echo.c.reverse. The following output was created, which matches the output the book says I will get.

--- src/echo.c.reverse 2012-02-10 00:06:29.018333108 -0500
+++ src/echo.c 2012-02-10 00:10:10.938330675 -0500
@@ -258,14 +258,14 @@
}
else
{
- while (argc > 0)
+ while (argc > 0)
{
- fputs (argv[0], stdout);
argc--;
- argv++;
+ fputs (argv[argc], stdout);
if (argc > 0)
putchar (' ');
}
+
}

if (display_return)

These exercises were a great experience. Considering we have our first bug fix due Monday, I was glad to get this practice in before making a patch for my team’s Drupal project. This weekend I will post an update on the results of creating my first official patch for Drupal.

Bug Exercises Part One

While working on the team bug to be submitted Monday, each of us needed to complete some exercises in our open source online textbook. These exercises correlate directly with our team’s project and will prove valuable in becoming better at managing open source projects.

6.4 Find the oldest bug that’s still open in your chosen project. Write a blog entry describing the problem, with a theory about why the bug hasn’t been resolved yet. (Bonus points if you can actually resolve the bug.)

There is a bug that is 6 years and 27 weeks old. The bug is in the bug reporting system itself within a Drupal project. Each bug can be classified in various ways, and when one is classified as “patch (ready to be committed)” it is removed from the issues list. The bug was resolved in the comments but left open because of other issues discussed within those comments.

6.5 Figure out how to create a new account on the bug tracker of your chosen project. You’ll need that account very soon.

I had actually created an account a few days ago to post about a bug I saw. The name of the account is steveo1490.

6.6 Go through your project’s bug tracker and find a bug that you think you might be able to reproduce — and then try to reproduce it in the latest build. Take careful notes. Report your experiences as a comment to the bug. If you can reproduce the bug, great! Give as much information as you can. If you can’t reproduce the bug, great! Give as much information as you can, and ask the original reporter if there are other steps you might be able to take to reproduce the bug.

I was able to successfully reproduce a bug I was working on this past weekend. When the administrative theme is in use, the edit, add, revision, and delete pages will not come up properly if you change the lettering in the path to anything other than all lowercase. I tried quite a few combinations, such as Edit, EdIT, and EDIT. None of them showed the administrative theme, which confirms the bug exists.
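I have not traced the cause in Drupal's source, so the following is only a guess at the kind of comparison that produces this symptom, written as a hypothetical Python sketch. None of these names come from Drupal. A case-sensitive check on the last path segment misses Edit and EDIT, while normalizing the segment to lowercase first catches them.

```python
# Hypothetical set of path endings that should trigger the admin theme.
ADMIN_ACTIONS = {"edit", "add", "revision", "delete"}


def last_segment(path):
    """Return the final component of a slash-separated path."""
    return path.rstrip("/").rsplit("/", 1)[-1]


def uses_admin_theme_strict(path):
    """Case-sensitive check: only an all-lowercase action matches."""
    return last_segment(path) in ADMIN_ACTIONS


def uses_admin_theme_relaxed(path):
    """Case-insensitive check: lowercase the segment before comparing."""
    return last_segment(path).lower() in ADMIN_ACTIONS


for candidate in ("node/1/edit", "node/1/Edit", "node/1/EdIT", "node/1/EDIT"):
    print(candidate, uses_admin_theme_strict(candidate), uses_admin_theme_relaxed(candidate))
```

If the real cause looks anything like the strict version, the fix would be a one-line normalization, but that remains to be confirmed against the actual code.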

6.7 Find five bug reports in the new state, and attempt to triage them according to the rules above. Your goal is to do as much as you possibly can, in a short period of time, to make those bug reports as useful as possible to the developer to whom they are assigned. (Note: be sure to follow any triage rules that your project may have defined. If there are no set triage rules, be sure to announce your intentions on the project’s mailing list, so that developers can provide you some guidelines if they choose.)

After reviewing some of the newer bugs, I noticed that most of them are triaged into the correct categories. The only slip-ups I noticed that might make development and bug fixing more difficult involve prioritizing. The Drupal bug tracker has four priorities: minor, normal, major, and critical. Almost all new bugs were categorized as normal. Some of the major bugs I looked at were either dupes, which I commented on and reduced the priority of, or were just not functional issues. A nonfunctional bug should sit at the normal or minor level at most. I managed to change the priority on a few, which I hope will give developers a clearer picture of which bugs need to be fixed now rather than later.

These exercises gave me a significant amount of experience with the bug tracker, as well as bug tracking organization. The triage techniques were also useful to learn, so I can use them in my future computer science projects that involve code maintenance. This Friday I will be posting about even more exercises that help foster my bug fixing talents.

Reflections: Parallelism via Multithreaded and Multicore CPUs

Any readers who are members of the ACM or IEEE might be familiar with the magazine Computer. This post is a critical look at an article from March 2010 called “Parallelism via Multithreaded and Multicore CPUs”. I will offer my own analysis along with general information about what the article contains.

The article summarizes itself as a “comparison between multicore and multithreaded CPUs currently on the market”. The attributes it focuses on are “design decisions, performance, power efficiency, and software concerns in relation to application and workload characteristics”. This area is really intriguing to me personally, because it is one I have always wanted more clarification on. What do multicore and multithreaded processors actually yield, and how important are those gains, especially when it comes to choosing the right CPU for a new computer?

The article starts off with design decisions. It describes how multithreaded cores keep multiple hardware threads so that switching between threads is easier and more efficient. The most common approach is simultaneous multithreading, which Intel markets as Hyper-Threading. This threading technique uses precoded instructions from only a subset of the threads on the chip. Interestingly, the article also mentions that no commercial CPUs issue more than two threads per core per cycle. This tells me that the way a CPU is threaded makes a negligible difference when deciding which CPU is better. The article further explains that the limits on threading are due to scalability: with more than two threads you pass a “saturation point”, beyond which you get no more use out of executing additional threads. However, there is a way around this dilemma, which is great news: multiple cores. These facts indicate that threading is standard, but how many cores you have does make a big difference in capability.

Another consideration is the cache. There are currently three types of cache: shared, private, and dynamic. The last is very rare, so the article compares the two major types, private and shared. A shared cache is shared between the cores, while a private cache belongs to one core alone and cannot be used by the others. For multicore programs, a shared cache is better if the software threads need to share data. It removes the need to copy data and is more efficient, because cores do not have to reach into each other’s caches indirectly. The drawback is that a shared cache is more unpredictable. The software is less isolated and can therefore end up using much more cache than necessary. It is also difficult to gauge how much service each thread receives, which leads to instability. A private cache is therefore more predictable and gives more control over performance. These findings show some of the tricky decisions in CPU choice. It becomes a question of which trade-off you want to make based on the software you use. Personally, I would prefer the private cache, because stability is often better than speed when it comes to managing memory.

So when it comes to multicore processors, this article suggests that there is no clear-cut choice. It all depends on your software’s specific needs and design. Hardware is always built with certain software designs in mind, and CPUs are no exception. Just like life, there is never one true right answer.

Reflection: The Cathedral and the Bazaar

These are my personal reflections about the article The Cathedral and the Bazaar, written by Eric Raymond.

The premise of the article is that software development has two categories: a cathedral style and a bazaar style. According to Eric, each style has its own purposes and uses that make it unique and effective in its own way. However, most of the article is an anecdote about how he first discovered what is now known as the bazaar style. This anecdote involves his development of the open source program fetchmail. Within it are pieces of programming wisdom that Eric offers to any programmers reading the article.

As for my personal view, I found the premise to be interesting. I think the theme of the paper is accurate and Eric takes a fair shake on it. He mentions that most of the time the core of a program is cathedral style, while the innovation and tools added onto the program are done bazaar style. This blend of the styles is what makes the most sense to me. You want to have skilled programmers who understand the main concept to develop a strong foundation. Then the community can help build the rest.

I also found the anecdote quite entertaining and a good example of how the bazaar style can be effective. I agree with most of his claims, such as releasing early and treating the user as a co-developer. These kinds of concepts were newer to the field back in 1996 but have flourished today, and for good reason. As Eric points out, the most famous example of the bazaar technique was Linus’ work on the Linux system. That system was only the beginning. Today programs such as FileZilla and Firefox are quite competitive in the market and show that bazaar techniques can lead to more stable and better built programs.

There is one area where I have to disagree with Eric, though. I do not think he addresses objections appropriately. He writes a great article about how awesome the bazaar style is but does not anticipate audience response. He may not be going for a purely argumentative or persuasive paper, but at the least I would hope he would counter people who say the bazaar style is a poor way of doing things. Like most open source writers, he does not appropriately address the issue of payment. The thing about a bazaar is that the people in it are making money. They do not come there out of the kindness of their hearts and do everything for free. This is where his analogy seems to be off the mark.

However, I am not suggesting this is a fatal flaw in the argument for open source. As he says near the middle of the article, making the program open source gave him thousands of users looking at bugs and suggesting ways to fix them. I just feel that his analogy incorrectly depicts the relationship between core developers (cathedral builders) and the community (bazaar). A more appropriate way to describe this relationship is a house being built by an architect in a community. The community then volunteers to help their neighbor make some great improvements to the house, motivated either by their liking for the neighbor or by wanting to try out their skills on improving the house.

Either way, this article was entertaining, funny, and convincing. Definitely worth the read.