Professional Code Monkey

Updating GPS coordinates in Android emulator

The Android Eclipse plugin is generally very handy. There are, however, a few limitations. For example, if you are running multiple emulators simultaneously and want to update the GPS coordinates in each of them, you will find that you can load only one KML file at a time. Needless to say, this is a very specific issue, but it is annoying enough when developing applications that depend heavily on GPS functionality and interact with other clients at the same time.

@pauloricardomg wrote a GPS server to be used in his and @navaneethr’s course project. He programmatically updates the coordinates by sending instructions over Telnet to the emulator. For the evaluation of our own project, we needed something similar. Below is a hack which extracts the essential functionality and wraps it in a Python script.

Hope it may come in handy.

https://gist.github.com/967616
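
As a rough illustration of the Telnet approach described above (not the contents of the gist), the sketch below sends the emulator console command `geo fix` over a plain socket. The helper names `geo_fix_command` and `update_location` are made up for this example, and the console ports are assumed to be the usual 5554, 5556 and so on. Newer emulator releases may also ask for an `auth` step before accepting commands, which this sketch leaves out.

```python
import socket


def geo_fix_command(longitude, latitude):
    """Build the console command; note that longitude comes first."""
    return f"geo fix {longitude} {latitude}\r\n".encode("ascii")


def update_location(port, longitude, latitude, host="localhost", timeout=5.0):
    """Send one GPS fix to the emulator console listening on `port`."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.recv(1024)  # discard the console greeting
        conn.sendall(geo_fix_command(longitude, latitude))
        # Return whatever the console answers (normally a short status line).
        return conn.recv(1024).decode("ascii", errors="replace")
```

Calling `update_location(5554, -9.14, 38.71)` and then `update_location(5556, -9.14, 38.71)` would move two running emulators to the same spot, which is exactly what a single KML file in the plugin cannot do.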

    • #JustMigrate
    • #android
    • #geolocation
    • #hack
  • 2 years ago

I have awesome classmates

Who for some mysterious reason share a deep love for Sweden. Last weekend they demonstrated their passion by singing the Swedish national anthem, without any rehearsal or knowledge of Swedish. Love it.

The DHTs are in business!

    • #JustMigrate
    • #education
    • #nontech
    • #swedish
  • 2 years ago

Presenting Cassandra

Today Lalith, Bruno and I presented Cassandra, a distributed database with high availability and partition tolerance, i.e. the A and P of the CAP theorem.

Overall we were pleased with the delivery. We got some good feedback from our professor and our colleagues. One comment addresses something which I try to keep in mind when shaping presentations.

Sometimes your presentation is directed to a public which is familiar with Distributed Systems. This is PADI, but many people don’t have Distributed Systems as their major or minor, but Software Engineering. I’m not familiar with a lot of systems you talk about in Parallel Computing or Cloud Computing.

We should always start with the audience. Who is listening to our talk? What do they know? Why are they here (apart from attendance)? Where do they come from? We should think about this before we start thinking about content, structure, and other presentation mechanisms that we can use in our delivery.

We probably thought about it too, but not in sufficient detail. I thought: “they’re students like us.” Now I know: next time I make a presentation, I should do my audience analysis more carefully and avoid generalising too much.

Thanks for the feedback!

PADI_Cassandra_Presentation.pdf
    • #JustMigrate
    • #distributed systems
    • #presentation
    • #university
  • 2 years ago

Cassandra and its Accrual Failure Detector

For Friday’s presentation of Cassandra, a distributed storage system, I needed to understand how the system is able to detect node failures. In distributed systems a so-called failure detector is sometimes used to simplify an algorithm’s work, and Cassandra uses one called the Accrual Failure Detector. Accrual, for those of you who don’t know, means accumulation, or the act of accumulating over time.

The basic idea is that a node’s state is not only up or down. It is not true or false. Rather, it is an educated guess which takes multiple factors into account. With approximation we can, for example, take slow messages into consideration and, thus, allow ourselves to be wrong. How weird?

A server (node A) suspects that node B is down because it hasn’t received the last two heartbeats from it. Node A assigns a Phi value of (let’s say) 1. Phi denotes the suspicion level that another server might be down. This value can be adjusted dynamically according to local conditions such as load.

Phi is tied to the likelihood that Node A is wrong about Node B’s state: the higher the Phi, the smaller the chance that suspecting Node B is a mistake. So, when a third heartbeat is considered lost Phi increases, and eventually a threshold is reached. When that happens the application will be notified about the failed node. The threshold is a configured value.

Cassandra approximates Phi using an exponential distribution. Thus, the higher the Phi, the bigger the confidence that Node B has failed. I still haven’t found any more detailed explanation than the following as to why exponential is used rather than Gaussian:

Exponential Distribution to be a better approximation, because of the nature of the gossip channel and its impact on latency.

Don’t know if that made sense to anyone else, but I think I get it now.
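
To make the idea a little more concrete, here is a minimal sketch of an accrual-style detector that assumes the gaps between heartbeats follow an exponential distribution. It is not Cassandra’s code: the class name, the window size and the default threshold are all invented for illustration. Under that assumption, the probability that a heartbeat is still on its way after `t` seconds is `exp(-t / mean)`, and Phi is the negative base-10 logarithm of that probability.

```python
import math
from collections import deque


class AccrualFailureDetector:
    """Toy accrual detector; all defaults are arbitrary illustration values."""

    def __init__(self, threshold=8.0, window=1000):
        self.threshold = threshold
        self.intervals = deque(maxlen=window)  # recent gaps between heartbeats
        self.last_heartbeat = None

    def heartbeat(self, now):
        """Record a heartbeat that arrived at time `now` (in seconds)."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Suspicion level: grows the longer we wait for the next heartbeat."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # -log10(exp(-elapsed / mean)) simplifies to elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))

    def is_suspected(self, now):
        """True once the suspicion level reaches the configured threshold."""
        return self.phi(now) >= self.threshold
```

With heartbeats arriving once per second, Phi reaches 1 after about 2.3 seconds of silence and keeps growing linearly, so the threshold decides how much silence is tolerated before the node is reported as failed.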

    • #JustMigrate
    • #distributed systems
    • #university
  • 2 years ago

Breaking it down - Cassandra

The last post was a good exercise. I decided to do the same for the next paper titled: Cassandra - A Decentralized Structured Storage System written by Avinash Lakshman and Prashant Malik, both employed at Facebook. We’re presenting this paper on Friday as part of our Distributed Systems course.

Same premises as before. I write as I read. And the goal of this paper:

Cassandra system was designed to run on cheap commodity hardware and handle high write throughput while not sacrificing read efficiency.

It is immediately noticeable in the introduction of this paper that there is some weight behind their motivation:

Facebook runs the largest social networking platform that serves hundreds of millions users at peak times using tens of thousands of servers located in many data centers around the world.

Maybe it is unfair to compare papers from industry with papers from academia? Not everyone has access to millions of test users… Anyway. Failures are treated as the norm rather than the exception. Moreover, their systems must support continuous growth. It is also noted that Cassandra does not provide anything new; it only draws on what existed before. What’s new is the combination of techniques, and their implementation of it.

It scales to 250 million users as of writing and was initially intended for Inbox Search.

Related work mentions two projects commonly referenced in mobile computing literature: Coda and Ficus. Both replicate files for availability but don’t provide consistency guarantees. They also compare to GFS, the Google File System, and its simple design. It is interesting to note that they make no distinction between academic and industrial projects. Conflict resolution, network partition handling and data schemes differ among all related projects. Dynamo is highlighted for its gossip membership. Each of them falls short of the goal in one aspect or another.

The data model is very similar to Bigtable with the addition of super column families: a family which includes other families. Columns can be sorted by name or time. In general there are very few specifics, only possibilities listed.

The architecture of a storage system that needs to operate in a production setting is complex.

No shit. They limit the system architecture section to: partitioning, replication, membership, failure handling and scaling. A high-level overview is provided.

The principal advantage of consistent hashing is that departure or arrival of a node only affects its immediate neighbors and other nodes remain unaffected.

Immediately afterwards they also provide the drawbacks: non-uniform distribution of load, and no account taken of heterogeneous nodes. The solution is based on being able to make deterministic decisions on load-balancing. Clearly related to the goal set out early in the paper.
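
The neighbour-only property from the quote is easy to check with a toy ring. The sketch below is a generic consistent hashing ring, not Cassandra’s partitioner: the `HashRing` class, the node names and the choice of SHA-256 for ring positions are assumptions made for this example.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a node name or key to a position on the ring."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return int(digest, 16)


class HashRing:
    """Each key belongs to the first node at or after its position."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted node positions
        self._owners = {}     # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove(self, node):
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def owner(self, key):
        pos = ring_position(key)
        # Walk clockwise; wrap around to the first node past the end.
        idx = bisect.bisect_left(self._positions, pos) % len(self._positions)
        return self._owners[self._positions[idx]]
```

Adding a node to this ring only takes keys away from the node that follows it; every other node keeps exactly what it had. The drawbacks mentioned above show up too: with one random position per node, nothing guarantees that the nodes get similar shares of the keys.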

It is also interesting to note that Cassandra makes use of real-life notions:

Cassandra provides various replication policies such as “Rack Unaware”, “Rack Aware” (within a datacenter) and “Datacenter Aware”.

There is a clear separation of concerns. Things not related directly to Cassandra are left out of the implementation and instead Cassandra depends on other tools to perform some tasks for it. For example, the leader-election is done by Zookeeper (developed by Yahoo).

Ironically, there are some mishaps that are not very common to the average programmer:

Data center failures happen due to power outages, cooling failures, network failures, and natural disasters.

Large scale!

They use a failure detector called the Accrual Failure Detector which, instead of sending a boolean value to other nodes in the system, sends a suspicion value. The original authors of the failure detector suggested an approximation using a Gaussian distribution, but it seems an exponential distribution is better in gossip-based settings. I’m not sure I understand this. However, do note that they make use of clearly relevant research and adapt it to their needs, and provide a reason for doing so (“adjust well to network conditions and server load conditions”).

Some optimisations are provided for the local persistence: 

  • Reads are optimised with bloom filters.
  • Column indexes are maintained (256 Kb).
  • Write logs are written to a separate local disk.
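
For the first point, a bloom filter lets a read skip a file that definitely does not contain a key, at the cost of occasional false positives. The following is a generic textbook sketch rather than Cassandra’s implementation; the class name, the sizes and the use of SHA-256 with a seed prefix are assumptions for illustration.

```python
import hashlib


class BloomFilter:
    """Answers "definitely not present" or "possibly present"."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _indexes(self, key):
        # Derive several bit positions from one key by varying a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for i in self._indexes(key):
            self.bits[i] = 1

    def might_contain(self, key):
        # False means the key was never added; True may be a false positive.
        return all(self.bits[i] for i in self._indexes(key))
```

A read path would consult `might_contain` before touching the disk and only open the file when the answer is "possibly present".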

All implementation details are left to a section which is very compact and absolutely loaded with information. It was a bit hard to digest at 00h39. I’m unsure how relevant some of these details are to the understanding of Cassandra.

Evaluation is completely based on experience rather than experimental data. They note that many of the problems that arose were not, and couldn’t have been, foreseen in an experimental environment.

One very fundamental lesson learned was not to add any new feature without understanding the effects of its usage by applications.

Might seem obvious, but awww we so often forget what it really means.

It is also worth noting that they provide key data to some research projects on failure detectors and their scalability. Most of them were unsuitable as the size of clusters grew. They also present the case of Inbox Search. However, no comparison data is provided. Only values, and those don’t tell me much. Not in respect to anything else at least. Their qualitative analysis is a bit weak as it only provides some “interesting” scenarios, but doesn’t do a very good job of linking these to Cassandra’s system architecture. It shows inclinations, or directions, rather than concrete qualitative statements about performance.

In conclusion, this paper is truly different from the previous one presented here. First of all it is a lot shorter, 6 pages compared to 18 pages. Second, there are real motivations provided along with very clear and constructive examples. Third, they make no distinction between academic and industrial projects, and seem to make equal use of both. Whatever works, works. And last, they don’t try to solve everything. It is focused on one thing, and one thing only: decentralised storage. In other words, it is not an “all in one” solution.

    • #JustMigrate
    • #distributed systems
    • #facebook
    • #university
  • 2 years ago

Breaking it down

My previous post was a bit disorganised. I have so many opinions about academia, some very strong, that I couldn’t focus. Thus, I decided to do an analysis of the next paper. By the time I have finished this post, I will have finished reading the paper: “Mobile Computing with the Rover Toolkit” by Anthony D. Joseph, Student Member, IEEE, Joshua A. Tauber, Student Member, IEEE and M. Frans Kaashoek, Member, IEEE. Published in IEEE TRANSACTIONS ON COMPUTERS, VOL. 46, NO. 3, MARCH 1997.

Let’s start with what they set out to show in this paper:

In this paper, we describe the Rover toolkit, a set of software tools that supports applications that operate obliviously to the underlying environment, while also enabling the construction of applications that use awareness of the mobile environment to adapt to its limitations.

Quite high, and again I flag for the “trying to take over the world” stance. Moreover, Lalith just popped in and we got into the discussion: Should researchers really produce applications?

They’re apparently evaluating this by “We illustrate the effectiveness of the toolkit using a number of distributed applications, each of which runs well over networks that differ by three orders of magnitude in bandwidth and latency.” It remains to be seen in which context though. Immediately afterwards the authors go on about a number of assumptions made, and a lengthy description of the characteristics of mobile computing is presented. Too many details, especially regarding data consistency, are presented in my opinion. Eventually they reach some kind of conclusion:

[…] a mobile-aware application can store not only the value of a write, but also the operation associated with the write. That operation can include any relevant context. Storing the operation allows the application to use application-specific semantic and contextual information;

Great. Basically, applications need to be aware of what’s going on underneath (i.e. what network connectivity do we have? How much battery is left?) so that they can optimise performance accordingly.

Here’s a summary of some implementation details:

  • The Rover toolkit uses a client-server model which provides optimistic concurrency control and caching.
  • It is possible to transfer code and data for computation at a remote location.
  • Remote procedure calls can be queued.
  • Servers can also be run on mobile devices.
  • The main challenge for the programmer is to define so-called relocatable dynamic objects, the communication between clients and servers, as well as any conflict resolution.
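
To picture the queued remote procedure calls from the list above, here is a small sketch of the general idea: calls are stored as operations while the device is offline and replayed in order once a connection is available. This is not Rover’s API; `QueuedRpcClient` and the `send` callback are hypothetical names used only for illustration.

```python
from collections import deque


class QueuedRpcClient:
    """Stores calls as (name, args) operations and replays them in order."""

    def __init__(self, send):
        self._send = send        # callable(name, args) doing the real call
        self._pending = deque()  # operations not yet delivered
        self.connected = False

    def call(self, name, *args):
        """Queue the operation; deliver right away if we are connected."""
        self._pending.append((name, args))
        return self.flush() if self.connected else []

    def flush(self):
        """Replay queued operations; a failed send leaves its call queued."""
        results = []
        while self._pending:
            name, args = self._pending[0]
            results.append(self._send(name, args))
            self._pending.popleft()  # only drop the call after it succeeded
        return results

    def pending(self):
        return len(self._pending)
```

Because the queue holds the operation and its arguments rather than just a resulting value, the server side still has the context it needs to resolve conflicts when the calls finally arrive.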

Here it is worth noting: does this really make life simpler for the programmer? Surely, it seems they don’t have to worry about moving the actual code. But how flexible and adaptable is it? What trade-offs do they have to make?

They present four main results:

  1. QRPC is well suited to intermittently connected environments.
  2. Using RDOs enables remote computation of heavy tasks and reduces latency and bandwidth usage.
  3. Porting to Rover apparently requires little work (only three weeks in one case… ehrm… ok?).
  4. Mobile-aware applications using Rover perform better on slow networks compared to their original versions.
  5. (then they mention a fifth despite only saying four results) UIs are faster too.

In related works we find that Rover is, apparently, also the first toolkit to support both the development of mobile-aware applications and so-called proxies that enable untouched applications to benefit from the “mobile-awareness”. (This, once again, feels like a marketing trick… but sure let’s go with it.)

I won’t describe anything from the implementation details. To be honest, I didn’t bother reading them too carefully either. Let me add a thought that emerged from these sections though: there are potentially some ideas that have migrated into “real” products here. The details are also irrelevant since the paper was written in 1997 and much has changed since.

Section 5 presents the programmer with something similar to guidelines for how to port, or integrate, Rover to their mobile applications. It looks like a good overview of what steps are required for using Rover. Question: a significant amount of thinking has gone into this project, so why spoil it with minuscule details like:

“The application developer also must decide which mechanisms to use for notifying users of the cache status of displayed data. In the e-mail application, color is used to distinguish operations that have not been propagated to a server.”

Secondly, that is not very unintrusive. How many users know what a cache is?

A table shows the number of lines changed to integrate/add Rover to existing and new applications. Convincing? It is good though to see how much work is required, or at least get an estimation of it.

Lab tests were carried out for evaluation. There is a concise list of hypotheses that they are evaluating. I like. Unfortunately, there are only internal comparisons. We discussed in class one day that it is hard to do benchmarks (in general) in Computer Science because the field is moving too fast. I believe that if you cannot do quantitative measurements, at least provide qualitative assessments.

Their evaluation obviously shows gains (I have only seen a few papers where a solution was disproved… they were an entertaining read!). I’m mostly concerned about their values; 17% doesn’t sound like a significant gain. Is it worth it? Increase of bugs? Their final graph on speedup shows some promising results. There is a significant increase against the original versions (based on a subset sample of tasks). A 7.5 speedup over slow networks is mentioned in the conclusion. 

However, they do a bad job of connecting to their original goal.

“We have found it quite easy to adapt applications to use these Rover facilities”

What does that mean by the way? Really, when you’re making qualitative statements, provide a solid argument. Don’t make loose relative statements from your quantitative 7.5 speedups.

In practice, we find the combination of the Rover cache, relocatable dynamic objects, and queued remote procedure calls results in a surprisingly useful system.

Surprise! Now, except for you guys, who did/does? Show me!

Once again, the paper presents some cool ideas, probably genuine and innovative at the time of writing. But seriously, even if this is 14 years old, this still happens today. Perhaps even more because of increased competition.

Footnote: I wrote this post as I was reading the paper in an attempt to track my thoughts. In other words, an experiment.

    • #JustMigrate
    • #academia
    • #mobile computing
    • #philosophy
    • #university
  • 2 years ago

Overdoing it

Our course literature for Mobile Computing consists of a lot of academic papers. A superb way of integrating academic work into classes compared to often long and partially irrelevant books. Some papers are well written and easy to understand, usually because the authors have paid careful attention to the structure of the paper. Sometimes though, actually more often than not, I find that researchers are really trying to “take over the world” with their proposed solutions. They offer glory and a remedy for all your problems.

This is obviously not true. Neither do I think the researchers think that it is the case. But it sounds like that. So why spice papers up with sentences like the following?

The Odyssey architecture supports application-aware adaptation while paying careful attention to a variety of practical considerations. Our prototype confirms the feasibility of realizing this architecture, and its ability to support a wide range of applications.

A paper is, inevitably, an argument. It is a space to convince a reader that a proposed solution maps to a problem defined in the introduction. My problem, I think, is that researchers rarely present any realistic and convincing arguments for the problem in the first place. Their solution might be great. It may even be innovative! But if it is not a problem perceived by users, and the proposed solution does not provide added value to the user, it is flawed in the first place.

Odyssey is the first system to simultaneously address the problems of adaptation for mobility, application diversity, and application concurrency. It is the first effort to propose and implement an architecture for application-aware adaptation that pays careful attention to the needs of mobile computing.

The statement above from the related works section of the same paper has some legitimacy. They do build on previous research and manage to show that clearly. I can even see from the progression of their problem statement that this might be “true”. Nevertheless, I interpret it as a marketing trick.

Perhaps I am looking in the wrong place, but I’d like to see better motivations for the research conducted. And I shouldn’t have to resort to reading a ton of surveys before reading a new paper. Although this is only a hypothesis so far, I sense that papers stemming from industry are better at providing realistic and believable scenarios and motivations. Ultimately, conclusions in those papers also hold up better.

How do we know we, and researchers, are spending time on “the right thing”?

[Update: note to self - be more structured next time you write a post]

    • #JustMigrate
    • #mobile computing
    • #nontech
    • #university
  • 2 years ago

Collaborative Editing Love

If there’s one thing Google has pulled off really well it has got to be Docs. Actually, they’ve done many things right, but this has to be one of their most useful day-to-day tools (except Search, duh!). At the moment (obviously I’m not telling my classmates that I’m writing this right now) I’m working with two classmates on a project proposal. Since we’re still not sure about all the details we’re discussing many of them at the same time as we write. And… you guessed it, we’re able to do it all on Docs!

Now, why aren’t more organisations with people distributed across the globe (hint: #jamboree2011) using tools which are built for distributed groups of people? My next hack is going to be an “auto-link all my docs tagged Jamboree2011 to Sharepoint”-script. 

Cheers Google! :) 

    • #JustMigrate
    • #hack
    • #jamboree
    • #productivity
    • #university
  • 2 years ago

What nonsense

This is a snippet from the introduction of our mobile computing project:

Traffic in Lisbon has reached catastrophic proportions, with constant grid-locks and smog. In order to reclaim the city back to its inhabitants, the Mayor has decided to prohibit all vehicles from entering the city. People will be transported using a revolutionary mass transit system, using small flying vehicles.  Each vehicle is able to carry up to 4 persons, and drives autonomous.

Now, since when did my course become a class in literature and storytelling? At least define a project with a relevant scenario (and preferably scope, just don’t cram everything in there because it looks good)!

I expected more from this project.  

Update: We managed to get a “special” project instead. A project which is a lot more interesting! Will be working on porting a JavaME framework, which provides focal point consistency, to Android. Will probably give a longer update in due time.

    • #JustMigrate
    • #mobile computing
    • #university
  • 2 years ago

Some computer science issues in #ubicomp

Did a presentation with Lalith today on ubiquitous computing based on Mark Weiser’s visionary paper published in 1992. The guys over at Xerox PARC did some truly remarkable work about 20 years ago and many of their ideas are today products we use in everyday life. As my colleague Wasif pointed out in a tweet recently:

@lalithsuresh @mljungblad Not the 1st time Apple has taken inspiration from Xerox,check this out:

Anyway, I’m happy with the presentation. The professor did a good job of interrupting us, but equally improving and augmenting the content. Got some interesting discussions in the class and now looking forward to the other presentations and this course.

Good job Lalith!

The presentation is available under a Creative Commons NonCommercial ShareAlike license. Both the Keynote file (soon) and a PDF are available.

ubicomp_web.pdf
    • #JustMigrate
    • #mobile computing
    • #presentation
    • #university
  • 2 years ago

About

Software developer at MEDEA, a research centre at Malmö University. M.Sc. in Computer Science with focus on distributed computing from KTH. Wrote a thesis on scaling recommender systems at Tuenti.

Active Scout for many years, right now leading the Info/PR team for Lägr1.

Hobby photographer, active reader, cautiously enthusiastic, avid traveller, and a big fan of smart ideas.

Found on-line at Github, LinkedIn, Twitter, and Facebook or via e-mail.

All texts are CC-BY.
