Report on ACM Eurosys 2013 Conference

Technical University Library

This year EuroSys took place in Prague, Czech Republic. EuroSys is one of the main systems conferences. This year there were 28 accepted papers out of 143 submissions (a 19% acceptance rate), a bit higher than in previous years, although the statistics excluded submissions that did not fulfill the requirements, which were counted in the past. Only 6 of the accepted papers are European, and there are plenty of MSR papers.

Monday 15th April

Session 1: Large scale distributed computation I

TimeStream: Reliable Stream Computation in the Cloud

Zhengping Qian (Microsoft Research Asia), Yong He (South China University of Technology), Chunzhi Su, Zhuojie Wu, and Hongyu Zhu (Shanghai Jiaotong University), Taizhi Zhang (Peking University), Lidong Zhou (Microsoft Research Asia), Yuan Yu (Microsoft Research Silicon Valley), and Zheng Zhang (Microsoft Research Asia)
Good motivation from MSR on why to use stream processing: real-time heat maps of pairwise latency in a datacenter for network/infrastructure monitoring, and real-time advertising, mapping queries, both current and previous ones, to the adverts presented.

Adverts must be reliable, more so than monitoring (makes sense :) ).

The contribution focuses on resilience and fault tolerance. They build a DAG that is rewritten dynamically, replacing links with hashed ones spanning several nodes. No information is lost: any subgraph can be substituted by an equivalent one, which reloads/recomputes the missing pieces.

Several optimizations are added, such as message batch aggregation and lightweight dependency tracking between inputs and outputs to estimate impact.

Interesting work, and it was well presented.

Optimus: A Dynamic Rewriting Framework for Execution Plans of Data-Parallel Computation

Qifa Ke, Michael Isard, and Yuan Yu (Microsoft Research Silicon Valley)

Motivation: many aspects of large-scale computations cannot be known in advance. How should partition skew be handled, and what is the right number of tasks (e.g. reducers)? Large-scale matrix multiplication, it is argued, can also cause problems with intermediate steps, as can iterative computation or providing fault-tolerance capabilities. The paper proposes to rewrite the EPG (Execution Plan Graph) at runtime to solve these issues.

The solution for reliability is interesting: keeping a 'cache' for intermediate data and choosing either copy of the data for the next step.

BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data

[BEST PAPER AWARD]

Sameer Agarwal (University of California, Berkeley), Barzan Mozafari (Massachusetts Institute of Technology), Aurojit Panda (University of California, Berkeley), Henry Milner (University of California, Berkeley), Samuel Madden (Massachusetts Institute of Technology), and Ion Stoica (University of California, Berkeley)

Motivation: compute aggregate statistics on huge amounts of data.

Queries specify a time bound (SELECT ... WITHIN 2 SECONDS); results come with an error bound and can be refined. The idea is very elegant and original, and was brilliantly presented.

The estimated response time is obtained by querying small samples and extrapolating (the scaling should be linear). So, how well does a sample cover the original query? It depends on which elements it contains; the system computes how complete the answer will be (I did not get this part too well). A sample has a cost and a coverage, and an ILP problem determines which samples to build.
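To make the sampling idea concrete, here is a minimal sketch (my own illustration, not BlinkDB code) of answering an aggregate on a small uniform sample and reporting an error bound; the sample fraction and the 95% z-value are assumptions for the example.

    import math
    import random
    import statistics

    def approx_avg(values, sample_fraction=0.01, z=1.96):
        # Answer AVG(values) from a small uniform sample and report a
        # ~95% confidence half-width that shrinks as the sample grows.
        n = max(2, int(len(values) * sample_fraction))
        sample = random.sample(values, n)
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / math.sqrt(n)
        return mean, half_width

    # e.g. latencies = [...]; estimate, error = approx_avg(latencies, 0.01)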

Session 2: Security and Privacy

IFDB: Decentralized Information Flow Control for Databases

David Schultz and Barbara Liskov (MIT CSAIL)

Information flow control, by tagging database rows with labels stored in an extra column. It's far from my area, but I am surprised this has not been done already. The paper is well explained, and results show that the incurred overhead is small. A problem pointed out in the questions is that it requires manual tagging, which is in general extremely hard to do right.
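A toy sketch of the row-labelling idea as I understood it (my own illustration, not the IFDB implementation): each row carries a label in an extra column, and a query only returns rows whose label the querying principal holds.

    def can_read(principal_labels, row_label):
        # A row with no label is public; otherwise the reader must hold its label.
        return row_label is None or row_label in principal_labels

    def select(rows, principal_labels):
        # rows are dicts with a 'label' column next to the data columns
        return [r for r in rows if can_read(principal_labels, r.get("label"))]

    table = [{"id": 1, "salary": 100, "label": "hr"},
             {"id": 2, "city": "Prague", "label": None}]
    print(select(table, {"hr"}))   # both rows
    print(select(table, set()))    # only the public row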

Process Firewalls: Protecting Processes During Resource Access

 Hayawardh Vijayakumar (The Pennsylvania State University), Joshua Schiffman (Advanced Micro Devices), and Trent Jaeger (The Pennsylvania State University)

There are many threats to file access, and resource access control is very hard: programmers are not going to get it right, and system-call-level protection carries a huge overhead. Idea: reverse the threat-protection approach. With introspection you protect vulnerable processes instead of sandboxing dangerous attackers, declaring which resources are unsafe for a specific process context.

It is interesting that processing firewall rules is much more efficient than implementing the checks manually (which is also prone to errors). The declarative approach apparently wins by a huge margin.
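A toy sketch of the declarative idea (my own hypothetical rule format, not the paper's): rules list resources considered unsafe for a given process context, and the check at resource-access time reduces to a rule lookup.

    # (process, system call, path prefix) combinations declared unsafe
    RULES = {
        ("webserver", "open", "/etc/shadow"),
        ("webserver", "open", "/tmp"),   # e.g. block symlink tricks in /tmp
    }

    def access_allowed(process, syscall, path):
        # Deny the access if any rule marks this context/resource pair unsafe.
        return not any(p == process and c == syscall and path.startswith(prefix)
                       for p, c, prefix in RULES)

    print(access_allowed("webserver", "open", "/var/www/index.html"))  # True
    print(access_allowed("webserver", "open", "/tmp/evil_link"))       # False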

Resolving the conflict between generality and plausibility in verified computation

Srinath Setty, Benjamin Braun, Victor Vu, and Andrew J. Blumberg (UT Austin), Bryan Parno (Microsoft Research Redmond), and Michael Walfish (UT Austin)

They propose a cryptographic technique based on Probabilistically Checkable Proofs (PCPs) for checking whether a server indeed performed some computation. It is not yet practical because of the huge computational cost.

Session 3: Replication

ChainReaction: A Causal+ Consistent Datastore based on Chain Replication

Sergio Almeida, Joao Leitao, and Luís Rodrigues (INESC-ID, Instituto Superior Tecnico, Universidade Tecnica de Lisboa)

ChainReaction is a geo-distributed K/V store that implements the existing Causal+ model for improved read performance over existing causal-replication-based systems. The work builds on FAWN and adds capabilities for a geo-distributed setup. It was well presented, and the design decisions and results are clearly reflected in the paper.

Augustus: Scalable and Robust Storage for Cloud Applications

Ricardo Padilha and Fernando Pedone (University of Lugano, Switzerland)

Byzantine Fault Tolerance (BFT) would be convenient in a cloud environment, but it heavily penalizes latency. The paper proposes single-partition transactions, plus multi-partition read-only transactions. The restrictions in applicability look a bit severe, but it was nicely validated across different workloads. I am not sure the social network workload they generated was representative: they chose an arbitrary 50% chance of a connection being close, with no justification.

MDCC: Multi-Data Center Consistency

Tim Kraska, Gene Pang, and Michael Franklin (UC Berkeley), Samuel Madden (MIT), and Alan Fekete (University of Sydney)

The authors present MDCC, a replication technique that attempts to exploit two main observations about geo-distributed databases: conflicting operations are often commutative, and conflicts are actually rare, as each client mostly updates their own data. Using these, they implement a modified version of Paxos (Multi + Fast) which attempts to lessen latency by reducing the number of phases in several cases. Results from an extensive set of experiments point to a significant performance improvement over other transactional databases.

Session 4: Concurrency and Parallelism

Conversion: Multi-Version Concurrency Control for Main Memory Segments 
Timothy Merrifield and Jakob Eriksson (University of Illinois at Chicago)

Cache coherence has become a main bottleneck in multi-core systems. Proposal: each process works on its own copy of a shared memory segment. If processes can afford to work with a slightly out-of-date copy, performance can be significantly improved.
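A minimal sketch of the general multi-version idea (my own illustration, not the Conversion implementation): writers publish new versions of a segment, and readers keep working on a possibly slightly stale private copy until they explicitly re-sync.

    import copy

    class VersionedSegment:
        def __init__(self, data):
            self.versions = [copy.deepcopy(data)]   # append-only version history

        def checkout(self):
            # Give the caller a private working copy of the latest version.
            v = len(self.versions) - 1
            return v, copy.deepcopy(self.versions[v])

        def commit(self, working_copy):
            # Publish a new version; other processes see it on their next checkout.
            self.versions.append(copy.deepcopy(working_copy))

    seg = VersionedSegment({"x": 0})
    v, local = seg.checkout()   # process A's private copy
    local["x"] = 42
    seg.commit(local)           # other processes keep their old copy until they re-checkout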

Whose Cache Line Is It Anyway? Operating System Support for Live Detection and Repair of False Sharing 
Mihir Nanavati, Mark Spear, Nathan Taylor, Shriram Rajagopalan, Dutch T. Meyer, William Aiello, and Andrew Warfield (University of British Columbia)

Writes to the same cache line from multiple processes force everyone back to main memory, which can have a huge impact on performance in many cases. Idea: split the page into an isolated page holding the conflicting data and an underlay page with no conflicts.

Adaptive Parallelism for Web Search 
Myeongjae Jeon (Rice University), Yuxiong He (Microsoft Research), Sameh Elnikety (Microsoft Research), Alan L. Cox and Scott Rixner (Rice University)

In web search services (e.g. Bing), parallelism involves querying multiple index servers for results and aggregating them with techniques such as PageRank. However, queries within an index server are sequential. The paper discusses the challenges of parallelizing in-server search. I don't think it is novel, but it is well explained.

Tuesday 16th April

Session 1: Large scale distributed computation II

Mizan: A System for Dynamic Load Balancing in Large-scale Graph Processing

Zuhair Khayyat, Karim Awara, and Amani Alonazi (King Abdullah University of Science and Technology), Hani Jamjoom and Dan Williams (IBM T. J. Watson Research Center, Yorktown Heights), and Panos Kalnis (King Abdullah University of Science and Technology)

Mizan is a Pregel-based system, implemented in C++, which optimizes execution time by dynamically migrating vertices at every iteration. Each node's superstep execution is profiled, so that they statistically see which nodes perform slower and, if over a threshold, migrate vertices dynamically. Every worker has a matched worker with less load, to which all its migrations are headed. Vertex locations are handled through a DHT. The technique is interesting in the sense that it completely ignores graph topology: it might incidentally reduce the number of cut edges, but it does so by only looking at runtime statistics. However, the destination is chosen to load-balance CPU, so if the network is the bottleneck it might not be enough.
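A toy sketch of the pairing step as described in the talk (my own illustration, not Mizan's code): overloaded workers, identified from profiled superstep times, are matched with less-loaded workers that will receive their migrated vertices.

    def plan_migrations(superstep_time, threshold=1.2):
        # Workers whose superstep time exceeds the average by `threshold`
        # migrate vertices to a matched, less-loaded worker.
        avg = sum(superstep_time.values()) / len(superstep_time)
        overloaded = sorted((w for w, t in superstep_time.items() if t > threshold * avg),
                            key=superstep_time.get, reverse=True)
        underloaded = sorted((w for w, t in superstep_time.items() if t <= avg),
                             key=superstep_time.get)
        # pair most loaded with least loaded, second most with second least, ...
        return list(zip(overloaded, underloaded))

    print(plan_migrations({"w1": 10.0, "w2": 3.0, "w3": 5.0, "w4": 2.0}))
    # [('w1', 'w4')]  -> w1 sends some of its vertices to w4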

The solution works, although the validation has some problems: they only looked at graphs in the 2M range, which raises the question of why not do it on a single machine. Comparisons were against a specific version of Giraph which, as Greg Malewicz pointed out, had been vastly improved in the last six months.

MeT: Workload aware elasticity for NoSQL

Francisco Cruz, Francisco Maia, Miguel Matos, Rui Oliveira, Joao Paulo, Jose Pereira, and Ricardo Vilaca (HASLab / INESC TEC and U. Minho)

MeT is an HBase extension that performs elastic configuration of slave nodes. Depending on the observed access patterns, it provides replication and load balancing. Monitoring of runtime statistics feeds a decision algorithm that identifies suboptimal configurations from CPU usage. If a problem is detected, a distribution algorithm is run to spread the replicas.

Presto: Distributed Machine Learning and Graph Processing with Sparse Matrices

Shivaram Venkataraman (UC Berkeley), Erik Bodzsar (University of Chicago), and Indrajit Roy, Alvin AuYoung, and Robert S. Schreiber (HP Labs)

Presto is an R library that parallelizes matrix computations across distributed machines (a darray construct, supporting foreach). The API is lightweight and requires minimal modification of R code. When parallelizing sparse matrices, Presto attempts to avoid the impact of skew across sparse matrix partitions.

An online repartitioning scheme profiles each partition to optimize further iterations: if the imbalance ratio is higher than a threshold, the problematic partition is split. It is integrated into R without modifying it, by hacking memory allocation, including object headers. The contribution is not major, but it is well thought out and described.
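A small sketch of the threshold-driven splitting idea (my own illustration; the partition execution times and the 2x ratio are assumptions): any partition whose profiled time is far above the median gets split before the next iteration.

    def partitions_to_split(partition_time, ratio_threshold=2.0):
        # Split partitions whose profiled time exceeds the median by the threshold ratio.
        times = sorted(partition_time.values())
        median = times[len(times) // 2]
        return [p for p, t in partition_time.items() if t > ratio_threshold * median]

    # per-partition times profiled during the previous iteration
    print(partitions_to_split({"p0": 1.0, "p1": 1.2, "p2": 5.5}))   # ['p2']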

Session 2: Operating Systems Implementation

RadixVM: Scalable address spaces for multithreaded applications

Austin T. Clements, Frans Kaashoek, and Nickolai Zeldovich (MIT CSAIL)

Failure-Atomic msync(): A Simple and Efficient Mechanism for Preserving the Integrity of Durable Data

Stan Park (University of Rochester), Terence Kelly (HP Labs), and Kai Shen (University of Rochester)

Composing OS extensions safely and efficiently with Bascule

Andrew Baumann (Microsoft Research), Dongyoon Lee (University of Michigan), Pedro Fonseca (MPI Software Systems), and Jacob R. Lorch, Barry Bond, Reuben Olinsky, and Galen C. Hunt (Microsoft Research)

Session 3: Miscellaneous

Hypnos: Understanding and Treating Sleep Conflicts in Smartphones

Abhilash Jindal, Abhinav Pathak, Y. Charlie Hu, and Samuel Midkiff (Purdue University)

They analyze several sleep conflicts in which the state machine of the smartphone fails and the device is not effectively suspended. Tested on Nexus One and Galaxy S devices (3+ years old). Looks a bit weak.

Prefetching Mobile Ads: Can advertising systems afford it?

Prashanth Mohan (UC Berkeley) and Suman Nath and Oriana Riva (Microsoft Research)

It is MS data of course, but… just go to Breaking for Commercials: Characterizing Mobile Advertising

Maygh: Building a CDN from client web browsers

Liang Zhang, Fangfei Zhou, Alan Mislove, and Ravi Sundaram (Northeastern University)

Maygh is a web-based CDN implemented with HTML5. Content is cached through HTML5 LocalStorage (5 MB, with programmatic control). The novelty is that no client modification at all is needed (just the browser, no plugins such as FireCoral). It is implemented with RTMFP (Flash) and WebRTC; the key is NAT traversal via STUN.

The architecture is based on a proxy, the Maygh coordinator, maintaining a directory of content via hashing. The idea works in principle, but it has not been developed beyond a proof of concept (scalability and security are not addressed properly). An interesting read.
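A toy sketch of the coordinator's directory (my own illustration, not Maygh's protocol): browsers announce the hashes of objects they cache, and the coordinator answers lookups with a peer that can serve the content, falling back to the origin when none exists.

    import hashlib
    from collections import defaultdict

    class Coordinator:
        def __init__(self):
            self.directory = defaultdict(set)   # content id -> browsers caching it

        @staticmethod
        def content_id(data: bytes) -> str:
            return hashlib.sha1(data).hexdigest()

        def announce(self, peer, data):
            self.directory[self.content_id(data)].add(peer)

        def lookup(self, cid):
            peers = self.directory.get(cid, set())
            return next(iter(peers), None)   # None -> fetch from the origin server

    coord = Coordinator()
    coord.announce("browser-A", b"logo.png bytes")
    print(coord.lookup(Coordinator.content_id(b"logo.png bytes")))   # browser-A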

Wednesday 17th April

Session 1: Virtualization

hClock: Hierarchical QoS for Packet Scheduling in a Hypervisor

Jean-Pascal Billaud and Ajay Gulati (VMware, Inc.)

RapiLog: Reducing System Complexity Through Verification

Gernot Heiser, Etienne Le Sueur, Adrian Danis, and Aleksander Budzynowski (NICTA and UNSW) and Tudor-Ioan Salomie and Gustavo Alonso (ETH Zurich)

Application Level Ballooning for Efficient Server Consolidation

Tudor-Ioan Salomie, Gustavo Alonso, and Timothy Roscoe (ETH Zurich) and Kevin Elphinstone (UNSW and NICTA)

Currently many applications (e.g. databases) are not designed to behave fairly in a virtualized environment, where resources such as memory are shared and dynamically assigned. Instead, they grab hold of the resources, whereas in many cases they could work with much less memory and perform similarly.

The paper proposes a technique for ‘ballooning’ applications, so that the amount of memory assigned to them can be expanded, or squeezed. It requires modification of the applications and was developed for MySQL and the OpenJDK JVM, over Xen. Very interesting paper.

Session 2: Scheduling and performance isolation

Omega: flexible, scalable schedulers for large compute clusters

[BEST STUDENT PAPER AWARD]

Malte Schwarzkopf (University of Cambridge Computer Laboratory), Andy Konwinski (University of California Berkeley), and Michael Abd-el-Malek and John Wilkes (Google Inc.)

Omega is the upcoming scheduler for Google datacenters. It performs heterogeneous scheduling, segregating job types, batch and service, with priority for batch jobs (as they are orders of magnitude more numerous). Solution: multiple schedulers with shared state and optimistic concurrency. They had to add some optimizations because constraints cause an enormous number of conflicts when simulating realistic scenarios. The approach allows custom schedulers per application type, and they show an example that improves MapReduce scheduling by playing with patterns in user preferences. Very good paper, a well-deserved award.
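A minimal sketch of the shared-state, optimistic-concurrency idea as I understood it from the talk (my own illustration, not Omega's code): every scheduler works from a snapshot of the whole cell, and a commit only succeeds if the claimed machine has not changed since the snapshot.

    class SharedCellState:
        def __init__(self, machines):
            self.owner = {m: None for m in machines}
            self.version = {m: 0 for m in machines}

        def snapshot(self):
            return dict(self.version)

        def try_commit(self, snapshot, machine, job):
            # Optimistic concurrency: succeed only if the machine is unchanged and free.
            if self.version[machine] == snapshot[machine] and self.owner[machine] is None:
                self.owner[machine] = job
                self.version[machine] += 1
                return True
            return False   # conflict: retry against a fresh snapshot

    cell = SharedCellState(["m1", "m2"])
    snap = cell.snapshot()
    print(cell.try_commit(snap, "m1", "batch-job"))    # True
    print(cell.try_commit(snap, "m1", "service-job"))  # False -> conflict, retry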

Choosy: Max-Min Fair Sharing for Datacenter Jobs with Constraints

Ali Ghodsi, Matei Zaharia, Scott Shenker, and Ion Stoica (UC Berkeley)

Same problem: resource allocation in multi-tenant datacenters. They apply a constrained max-min fairness algorithm: the algorithm recursively maximizes the allocation of the user with the fewest machines. Nice flow-filling model, with an offline optimum and an online approximation. A very interesting contrast with the previous paper: one very practical, based on the real Google workload, and this one much more academic in its treatment of the resource allocation problem.
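A toy progressive-filling sketch of constrained max-min fairness (my own illustration, not Choosy's optimal flow formulation): repeatedly give the most starved, unsatisfied user a free machine it is allowed to run on, preferring machines that few other users can use.

    def constrained_max_min(demands, machines, eligible):
        alloc = {u: set() for u in demands}
        free = set(machines)
        while True:
            wanting = [u for u in demands
                       if len(alloc[u]) < demands[u] and eligible[u] & free]
            if not wanting:
                return {u: len(ms) for u, ms in alloc.items()}
            u = min(wanting, key=lambda x: len(alloc[x]))        # most starved user
            m = min(eligible[u] & free,                          # least contended machine
                    key=lambda m: sum(m in eligible[v] for v in demands))
            alloc[u].add(m)
            free.remove(m)

    # 6 machines; user b may only run on machines 0 and 1
    print(constrained_max_min({"a": 4, "b": 2}, range(6),
                              {"a": set(range(6)), "b": {0, 1}}))
    # {'a': 4, 'b': 2}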

CPI2: CPU performance isolation for shared compute clusters

Xiao Zhang, Eric Tune, Robert Hagmann, Rohit Jnagal, Vrigo Gokhale, and John Wilkes (Google, Inc.)

Performance isolation does not hold perfectly in practice because of contended resources such as cache memory. They have implemented detection of problematic processes by monitoring hardware performance counters, and then throttle the culprits.
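A small sketch of the detection step (my own illustration of the idea, not CPI2's actual pipeline): flag tasks whose cycles-per-instruction is far above that of their peers; in the real system the next step is to correlate such spikes with co-located tasks and throttle the antagonist.

    import statistics

    def cpi_outliers(cpi_by_task, sigma=2.0):
        # Tasks with CPI well above the peer mean are likely suffering interference.
        values = list(cpi_by_task.values())
        mean, stdev = statistics.fmean(values), statistics.pstdev(values)
        return [t for t, cpi in cpi_by_task.items() if cpi > mean + sigma * stdev]

    # per-task CPI gathered from hardware performance counters
    samples = {"t1": 1.0, "t2": 1.1, "t3": 1.05, "t4": 0.95, "t5": 1.0, "t6": 3.5}
    print(cpi_outliers(samples))   # ['t6']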

Internet Measurement Conference 2012: by Hamed Haddadi

IMC 2012. http://www-net.cs.umass.edu/imc2012/program.htm

Paper 1: using the CAIDA telescope to collect scans of SIP services; the scans have a sophisticated pattern which survives even when only a low number of hosts take part in the botnet. The paper is a good demonstration of solid measurement work, like many other IMC papers.
Also an amazing animation (the CAIDA cuttlefish).

Paper 2: prefix hijacking detection has been attracting a lot of attention lately. They form a live fingerprint of the route update distribution patterns, and identify and classify hijacks, failures and route anomalies using threshold techniques and distributed "eyes", with less than 10 seconds of delay.

Paper 4: looking at one-way traffic on the net, which can shed light on a large number of anomalies; they use a large NetFlow dataset to analyse these packets (which never receive a reply). Interestingly, over the 7 years of their data, a significant portion of flows (30–70%) is one-way, but they carry a very low volume of traffic as they are usually small packets.

Paper 3: the paper focuses on concurrent prefix hijacks, where an AS hijacks prefixes of a number of other ASes. These are becoming trendy, as full table leaks are difficult and are detected faster. It is also a big task to filter out individual valid changes in AS prefixes. There are a number of interesting case studies in the paper.

Paper 2 of the morning session: fast classification at wire speed with commodity hardware. The paper has an interesting analysis of the pros and cons of speed vs accuracy, number of cores and amount of memory; they have used synthetic and real traces from CAIDA. Optimal classification is achieved when one core is dedicated per queue.

1. Fathom: A Browser-based Network Measurement Platform (review)
Mohan Dhawan (Rutgers University), Justin Samuel (UC Berkeley), Renata Teixeira (CNRS & UPMC), Christian Kreibich, Mark Allman, and Nicholas Weaver (ICSI), and Vern Paxson (ICSI & UC Berkeley)
An interesting measurement methodology using a Firefox extension.

http://www-net.cs.umass.edu/imc2012/papers/p87.pdf

Transition to IPv6: they have used JavaScript on websites and Flash ads served through Google to estimate the number of observed networks which have IPv6 enabled, though using these has introduced very interesting biases towards Asian and Latin American countries. They notice that no one is taking action on adoption. A high proportion of 6to4 tunnelling is seen, and corporate networks seem to be leading the way in adoption. The findings indicate extra delay for Teredo, hence Microsoft hasn't enabled it by default (http://en.wikipedia.org/wiki/Teredo_tunneling).
The sampling technique used in the paper is very interesting.

3. MAPLE: A Scalable Architecture for Maintaining Packet Latency Measurements (review)
Myungjin Lee (Purdue University), Nick Duffield (AT&T Labs-Research), and Ramana Rao Kompella (Purdue University)
Another tool paper, specific to latency measurements: it moves to per-packet granularity to obtain latency measurements at the packet level rather than the flow level. They use timestamped packets and keep track of them using hash tables and a variant of Bloom filters for efficiency.

4. Can you GET me now? Estimating the Time-to-First-Byte of HTTP transactions with Passive Measurements (review) (short paper)
Emir Halepovic, Jeffrey Pang, and Oliver Spatscheck (AT&T Labs-Research)
The motivation is to measure user-experienced delay, using passive analysis for convenience and representativeness, defining TTFB as the time between the SYN-ACK and the first byte of HTTP data. They show TTFB captures user experience better than RTT.
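A tiny sketch of that definition as I read it (my own code, over an assumed trace format of server-to-client packets): TTFB is the gap between the SYN-ACK and the first packet carrying HTTP payload.

    def time_to_first_byte(server_packets):
        # server_packets: (timestamp, tcp_flags, payload_len) tuples, server -> client only
        syn_ack_ts = first_data_ts = None
        for ts, flags, payload_len in server_packets:
            if {"SYN", "ACK"} <= flags and syn_ack_ts is None:
                syn_ack_ts = ts
            elif payload_len > 0 and syn_ack_ts is not None and first_data_ts is None:
                first_data_ts = ts
        if syn_ack_ts is None or first_data_ts is None:
            return None
        return first_data_ts - syn_ack_ts

    trace = [(30, {"SYN", "ACK"}, 0), (180, {"ACK"}, 1460)]   # timestamps in ms
    print(time_to_first_byte(trace))   # 150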

5. Towards Geolocation of Millions of IP Addresses (review) (short paper)
Zi Hu, John Heidemann, and Yuri Pradkin
Improvements to the popular MaxMind geolocation system, in an open geolocation database format covering all addresses. They use a vantage-point system to triangulate IP address locations. Accuracy is preserved by choosing an appropriate number of vantage points.

1. Evolution of Social-Attribute Networks: Measurements, Modeling, and Implications using Google+ (review)
Neil Zhenqiang Gong (EECS, UC Berkeley), Wenchang Xu (CS, Tsinghua University), Ling Huang (Intel Lab), Prateek Mittal (EECS, UC Berkeley), Vyas Sekar (Intel Lab), and Emil Stefanov and Dawn Song (EECS, UC Berkeley)
The first large-scale study of an OSN's evolution. Using breadth-first-search crawling and differentiating between follower and followee graphs, they design a new model based on the observation that Google+ has a large number of low-degree nodes, with a log-normal distribution. This makes Google+ a hybrid between Facebook and Twitter. They also look at triadic closure models and find them better than preferential attachment.
I am surprised they didn't check the correlation between the number of posts and node degree; maybe attributes such as LinkedIn-style endorsed skills also play a role in this relationship.

2. Evolution of a Location-based Online Social Network: Analysis and Models (review)
Miltiadis Allamanis, Salvatore Scellato, and Cecilia Mascolo (University of Cambridge)
Looking at spatial and location-based social networks. Using daily snapshots of the Gowalla social network and the check-ins of 122k users, they explore global attachment models such as the preferential attachment model, the age model, the distance model and the gravity model. 30% of new edges are between users that have one check-in in common.

3. New Kid on the Block: Exploring the Google+ Social Graph (review)
Gabriel Magno and Giovanni Comarela (Federal University of Minas Gerais), Diego Saez-Trumper (Universitat Pompeu Fabra), Meeyoung Cha (Korea Advanced Institute of Science and Technology), and Virgilio Almeida (Federal University of Minas Gerais)
Another Google+ paper, looking at information sharing and privacy settings in Google+. Some users publish private data such as home and mobile numbers, though these users are known to be more risk-taking. A bunch of other metrics are also discussed; however, the types of users are not. The data also shows a strong geographical correlation of friendship between users, showing that offline relationships are reflected in the data too. I imagine the data may have strong errors, obviously, as some users put premium numbers in the phone field to collect money :)

4. Multi-scale Dynamics in a Massive Online Social Network (review)
Xiaohan Zhao (UC Santa Barbara), Alessandra Sala (Bell Labs, Ireland), Christo Wilson (UC Santa Barbara), Xiao Wang (Renren Inc.), Sabrina Gaito (Università degli Studi di Milano), and Haitao Zheng and Ben Y. Zhao (UC Santa Barbara)
Looking at the evolution of user activity and the growth of the network, using the Chinese Facebook equivalent (Renren), capturing node and edge dynamics over 2 years: network growth, the effect of node age and preferential attachment, and how these change as the network matures. They also look at community formation, communities' lifetimes and their similarity using set intersection and the Jaccard coefficient. The driving force behind edge creation shifts from new nodes to old nodes as the network grows, and the strength of preferential attachment also decays.

Day 2

8:30-10:15 Video On Demand. Session Chair: Mark Allman (ICSI)

1. Watching Video from Everywhere: a Study of the PPTV Mobile VoD System (review)

Zhenyu Li, Jiali Lin, Marc-Ismael Akodjenou-Jeannin, and Gaogang Xie (ICT, CAS), Mohamed Ali Kaafar (INRIA), and Yun Jin and Gang Peng (PPlive)
A dataset of smartphone video viewing from 4M users watching 400K videos over two weeks; the results can be a good guide for those designing wireless provisioning. The trends of watching long versus short videos are plotted against time of day, which is interesting. 3G users are more likely to watch movies, but they often give up at the beginning.

2. Program popularity and viewer behaviour in a large TV on demand system (review)

Henrik Abrahamsson (SICS) and Mattias Nordmark (TeliaSonera)

Looking at TV and video on-demand access patterns, the usual heavy tail and top-100 popularity trends can be seen. They find cacheability to be very high: with the top 5% of videos cached, the hit rate increases by 50%.

Video Stream Quality Impacts Viewer Behavior: Inferring Causality using Quasi-Experimental Designs (review)

S. Shanmuga Krishnan (Akamai Technologies) and Ramesh K. Sitaraman (University of Massachusetts, Amherst and Akamai Technologies)

A nice introduction to video delivery economics, aimed at improving user behaviour and performance. The performance aspect is understood, but the improved "user behaviour" is not clear. A large dataset of video views is presented. Using randomised experiments (Fisher 1937), they look at correlation vs causation for different factors such as geography and content type, by treating users differently, for example with respect to re-buffering of videos and its effect on video abandonment. Patience increases with the length of the video, so short video clips are abandoned fast if they are slow to load. Mobile users are more patient than fiber users, so access technology also plays a role.

Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard (review)

Te-Yuan Huang, Nikhil Handigol, Brandon Heller, Nick McKeown, and Ramesh Johari (Stanford University)

The performance of video rate selection over HTTP/TCP is analysed. Competing flows make the video rate go too low, taking it below an acceptable value. The on-off traffic pattern due to buffering heavily affects TCP's congestion window management because of slow start, and hence leads to bandwidth underestimation. This is due to the video client trying to do TCP's job and estimating bandwidth itself. Perhaps a video-specific protocol is needed?

On the Incompleteness of the AS-level graph: a Novel Methodology for BGP Route Collector Placement (review)
Enrico Gregori (IIT-CNR), Alessandro Improta (University of Pisa / IIT-CNR), Luciano Lenzini (University of Pisa), Lorenzo Rossi (IIT-CNR), and Luca Sani (IMT Lucca)

The paper shows the geographic distribution of feeders and their coverage of the AS topology dataset. They increase accuracy by adding more route collectors; I believe (and Walter Willinger also mentioned) the work could be heavily improved by using IXP data.

Quantifying Violations of Destination-based Forwarding on the Internet (review) (short paper)

Tobias Flach, Ethan Katz-Bassett, and Ramesh Govindan (University of Southern California)

Using reverse traceroute to find destination-based forwarding violations, e.g. by MPLS tunnels or load balancing, using PlanetLab nodes and destinations with spoofed packets along paths. A large portion of violations are caused by load balancing. For 29% of the targeted routers, the router forwards traffic going to a single destination via different next hops, and 1.3% of the routers even select next hops in different ASes.

Revisiting Broadband Performance (review)

Igor Canadi and Paul Barford (University of Wisconsin) and Joel Sommers (Colgate University)

There is growing interest in broadband subscriptions, and FCC interest in investigating broadband speeds and rates. They use Ookla data, a Flash-based performance testing application with over 700 server locations. The paper uses 59 metro areas across the world, segmenting the areas based on geographic diversity. The comparison is made against SamKnows data. Some ISPs are seen to be rate-limiting users to very low speeds.

Obtaining In-Context Measurements of Cellular Network Performance (review)

Aaron Gember and Aditya Akella (University of Wisconsin-Madison) and Jeffrey Pang, Alexander Varshavsky, and Ramon Caceres (AT&T Labs-Research)

Checking the performance of user devices under different conditions: crowdsourcing using 12 volunteers to measure the performance of cellular networks, using speed-test websites to look at latency and loss over different hours of the day. They look at different situations and positions of the phone; however, different data delivery types can affect the results quite heavily.

Cell vs. WiFi: On the Performance of Metro Area Mobile Connections (review)

Joel Sommers (Colgate University) and Paul Barford (University of Wisconsin)

Another mobile performance measurement and speed-test crowdsourced data collection, also from native apps on smartphones. iOS devices show more latency compared to Android devices, perhaps due to poor OS or API design. They find WiFi performance better, but cellular is more consistent.

Network Performance of Smart Mobile Handhelds in a University Campus WiFi Network (review)

Xian Chen and Ruofan Jin (University of Connecticut), Kyoungwon Suh (Illinois State University), and Bing Wang and Wei Wei (University of Connecticut)

An interesting paper comparing CDN performance between Akamai and Google on campus.

1. Breaking for Commercials: Characterizing Mobile Advertising (review)

Narseo Vallina-Rodriguez and Jay Shah (University of Cambridge), Alessandro Finamore (Politecnico di Torino), Hamed Haddadi (Queen Mary, University of London), Yan Grunenberger and Konstantina Papagiannaki (Telefonica Research), and Jon Crowcroft (University of Cambridge)

BEST paper! read it fully! :)

Screen-Off Traffic Characterization and Optimization in 3G/4G Networks (review)(short paper)

Junxian Huang, Feng Qian, and Z. Morley Mao (University of Michigan) and Subhabrata Sen and Oliver Spatscheck (AT&T Labs-Research)

Collecting data from 20 volunteers on Android for 5 months, looking at screen status at 1 Hz. Screen-off traffic consumes half of the energy spent on the network interface, because applications download less and the traffic pattern changes; screen-aware fast dormancy increases the energy savings by 15%.

Configuring DHCP Leases in the Smartphone Era (review) (short paper)

Ioannis Papapanagiotou (North Carolina State University) and Erich M Nahum and Vasileios Pappas (IBM Research)

Using a big trace to look at DHCP lease duration and lifetime in corporate and academic environments.

Video Telephony for End-consumers: Measurement Study of Google+, iChat, and Skype (review)

Yang Xu, Chenguang Yu, Jingjiang Li, and Yong Liu (Polytechnic Institute of NYU)

This actually won the best paper award; I recommend reading it! They show the effect of video and voice processing on end-to-end delay, and also present the techniques used for scalability.

On Traffic Matrix Completion in the Internet (review)

Gonca Gursun and Mark Crovella (Boston University)

The idea is to reverse-engineer traffic matrices to detect invisible flows (going through other networks), using the AS topology and per-AS traffic matrices with a matrix completion method.

DNS to the rescue: Discerning Content and Services in a Tangled Web (review)

Ignacio Bermudez, Marco Mellia, and Maurizio Munafò (Politecnico di Torino) and Ram Keralapura and Antonio Nucci (Narus Inc.)

An interesting paper about the complex content delivery chain on the Internet, and a service which helps classify the type of content.

Beyond Friendship: Modeling User Activity Graphs on Social Network-Based Gifting Applications (review)
Atif Nazir, Alex Waagen, Vikram S. Vijayaraghavan, Chen-Nee Chuah, and Raissa D’Souza (UC Davis) and Balachander Krishnamurthy (AT&T Labs-Research)

Aiming to model user activity on OSNs, using Facebook apps data to look at user activity. Power-law fits are seen for in-degrees, but the out-degree has a strong heavy tail; node activity has to be modelled from connectivity.

Inside Dropbox: Understanding Personal Cloud Storage Services (review)

Idilio Drago (University of Twente), Marco Mellia and Maurizio M. Munafo (Politecnico di Torino), and Anna Sperotto, Ramin Sadre, and Aiko Pras (University of Twente)

Looking at Dropbox's data and file storage system, which splits files into 4 MB chunks and uses encrypted communication, with a separation between storage and control traffic. Dropbox seems to be a very popular app, mainly used through the native client. Experiments using PlanetLab show that generally all clients use the same data centres in the US (an Amazon data centre for data, and control in California). The slicing into chunks means that many of the transfers are too small and do not use the bandwidth efficiently due to TCP slow start, so even for large files it takes longer to fill the channel capacity.
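A back-of-the-envelope sketch of the slow-start point (my own illustration; the MSS, initial window and path capacity are assumed, not measured): a connection that starts or restarts in slow start needs several round trips before its window is large enough to fill a fast path, so per-chunk transfers spend much of their time ramping up.

    def rtts_to_send(chunk_bytes, mss=1460, init_cwnd=10, bdp_segments=700):
        # Count round trips to deliver one chunk, assuming the congestion window
        # doubles each RTT (pure slow start) until it reaches the path's BDP.
        segments = -(-chunk_bytes // mss)          # ceiling division
        cwnd, sent, rtts = init_cwnd, 0, 0
        while sent < segments:
            sent += min(cwnd, segments - sent)     # one window of data per RTT
            cwnd = min(cwnd * 2, bdp_segments)
            rtts += 1
        return rtts

    print(rtts_to_send(4 * 1024 * 1024))   # a 4 MB chunk: 10 RTTs
    print(rtts_to_send(64 * 1024))         # a small transfer: 3 RTTs, all ramp-up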

Content delivery and the natural evolution of DNS (review)

John S. Otto, Mario A. Sánchez, John P. Rula, and Fabián E. Bustamante (Northwestern University)

Discusses the use of DNS for dynamic request redirection, and the use of OpenDNS and Google DNS for these purposes. CDNs depend on the user's DNS to direct requests, so different redirections mean different performance. Try out namehelp for proactive caching.

Measuring the Deployment of IPv6: Topology, Routing and Performance (review)

Amogh Dhamdhere, Matthew Luckie, Bradley Huffaker, and kc Claffy (CAIDA), Ahmed Elmukashfi (Simula), and Emile Aben (RIPE)

IPv4 addresses have run out. IPv6 has been around but is not used, as it is not backwards compatible; hence tunnelling has been the main growth area. They used measurement data from BGP, AS relationships and lots of other data, and classify ASes into transit providers, content/access/hosting providers and enterprise customers. They find that IPv6 is strong at the core but lagging at the edge. They then measured AS-level paths from 7 vantage points towards dual-stacked ASes. They find the IPv4 network maturing, and transit providers deploying IPv6, as are content providers; the edge is lagging, with Europe and Asia leading.

Report on EPSRC UBHAVE workshop by Hamed Haddadi

Opportunities and Challenges in Interdisciplinary Research

Today I attended this excellent workshop arranged by the EPSRC UBHAVE project team, bringing together a number of excellent computer scientists, engineers, medics, industry members, psychologists and HCI experts from across the world, discussing the challenges faced by interdisciplinary researchers, from convincing patients to carry around monitors to setting the right interface/sampling rate/data collection strategy for devices and sensors. The interesting projects, the range of smartphone apps, and the adoption of technology in the form of peculiar mixes of software and hardware made for a mesmerising atmosphere. A number of challenges were highlighted:

 

  • difficulty in establishing what method and which collaborator is right
  • privacy issues and security of devices
  • economic and personal incentives to use technology
  • large delay between research grant cycle and industrial advancements
  • Lower academic rewards (promotion/etc) for interdisciplinary research
  • understanding each other!

I worked for 2 years on the Huntington's Disease project, and indeed, communication between biologists, engineers, mathematicians and computer scientists is a SERIOUSLY challenging issue; however, it is just as vital to work through, as otherwise we face the classic problem that we (i.e., engineers and computer scientists) constantly face: systems designed by geeks, approved in conferences by geeks, adopted by geeks, and often failing to make it to mass markets. At the other extreme, our technology offers scale, speed and accuracy and, more importantly, the ability to monitor in situ, capturing contextual data and relieving the sociologist and biologist from privacy-intrusive and cumbersome ethnography, monitoring, lab experiments and interviews.

 

Rather than poisoning your fresh brains with my rants, I’ll let you have a look at the program yourself and click on the links ! :)


Conference Programme

 

 

Morning Session: “Making Multidisciplinary Research Work”

 

10.30 – 10.40am         Welcome and Introduction

                                    Professor Lucy Yardley, University of Southampton, UK

                                    Professor Susan Michie, University College London, UK

 

10.40 – 11.20am         “Engaging the Users in Multidisciplinary Projects: How to find them, what to do with them, and where to go next”          

                                    Professor Torben Elgaard Jensen, Technical University of Denmark

 

11.20 – 12.00pm         “Multilevel and Reciprocal Behaviour Change: The Role of Mobile and Social Technologies”

                                    Professor Kevin Patrick, University of California, San Diego, USA

 

12.00 – 12.50pm         Panel Discussion led by:

                                    Dr Niels Rosenquist, Massachusetts General Hospital

 

12.50 – 1.40pm           Buffet Lunch (First floor, South Corridor Foyer)

 

Afternoon Session 1: “The Potential of Digital Technology for Assessing and Changing Behaviour”

(Small Meeting House)

 

1.40 – 2.20pm             “Behavioural Intervention Technologies for Depression”

                                    Professor David Mohr, Northwestern University, USA

 

2.20 – 3.00pm             “My Smartphone told me I’m Stressed”   

                                    Professor Andrew Campbell, Dartmouth College, USA

 

3.00 – 3.40pm             “UBhave: Addressing the question of how best to use phones to measure and change behaviour”

                                    Professor Lucy Yardley, University of Southampton, UK

                                    Dr Cecilia Mascolo, University of Cambridge, UK

 

3.40 – 4.10pm             Coffee (First floor, South Corridor Foyer)

 

4.10 – 5.00pm             Panel Discussion led by:

                                    Professor Susan Michie, University College London, UK

                                   

5.00 – 5.30pm             Close

 

Afternoon Session 2: “Challenges of User Led Innovation for Energy Technologies”

(First floor, Room 2)

 

1.30 – 5.30pm             Led by Dr Alastair Buckley, University of Sheffield, UK


Report on SIGCOMM 2012, Helsinki (by Steve Uhlig, Networks)

http://conferences.sigcomm.org/sigcomm/2012/

1st day: the HotSDN workshop attracted a lot of submissions as well as a large attendance (100+).
This reflects how topical SDN is and the growing community working on it.

2nd day:
--------

Keynote and SIGCOMM award: Nick McKeown, Stanford University.
Title: "Mind the gap".
The talk explained the "gap between theory and practice", how to improve practice especially by
putting pressure on Industry. Nick explained his strategy when trying to find out the right topic:
meet industry, and when they get angry, you might be onto something. Of course that does not mean
the topic is a good research one, but at least it is potentially relevant. Nick said that 
papers should be written in the same way as documentation, to explain what we are doing: make
the problem and the story the lead. The marketing and the flow of the paper itself
should not have the lead. From a technical perspective, Nick explained that the point of SDN
for him is to reduce the complexity of network management (this was questioned by Dina Papagiannaki
during questions). His research strategy is to make everything public, so as to allow others 
to reproduce his work. The point is to stand on each other's shoulders rather than compete.
He offered some criticism of the SIGCOMM conference, which is in his opinion too small
(30 papers) and narrow in audience (500). He suggests making it more like SIGGRAPH, with
50% acceptance and more than 2000 attendees, including plenty of industry.

The test of time award went to the "Tussle in cyberspace" paper from SIGCOMM 2002. We believe
that in the future there should be a presentation because such papers have high impact for different
reasons. Hearing from the authors about the intention of the paper and what it turned out to have
impact on would be interesting for the community.

Session 1: Middlebox and Middleware
-----------------------------------
- Multi-resource Fair Queuing: Best paper award. Not clear why frankly. Old topic.
- Making middleboxes someone else's problem: Putting middleboxes in the Cloud. Such
an idea was expected, and seems to work pretty well for some specific applications.
- HyperDex: an interesting searchable key-value store. The talk did not do justice to the
applicability of the paper to networking.

Session 2: Wireless Communication (by Cigdem Sengul from T-labs)
---------------------------------

- Picasso: Flexible RF and Spectrum Slicing, by Hong et al. from Stanford University, looks
at full-duplexing (i.e., receiving and transmitting at the same time) in adjacent bands. They also
presented a demo on Day 2, which I unfortunately missed seeing due to having a demo at the
same time. For more information on Picasso: http://www.stanford.edu/~hsiying/Picasso.html
- Spinal codes, by Perry et al. from MIT, introduces a family of rateless codes that gets close
to Shannon capacity. For more information on their research: http://nms.csail.mit.edu/spinal/
- Efficient and reliable low-power backscatter networks: treats nodes as virtual senders
and relies on collision patterns as codes.

Poster and demo session
------------------------
Lot of variety in the topics and quite interesting. The sessions were attended by a lot of
people, very good for visibility!

Session 3: Data Centers: Latency
---------------------------------
- Deadline aware datacenter TCP: changing TCP to improve how it meets deadlines when latency is
an issue.
- Finishing flows quickly: Same goal as previous paper but through flow scheduling.
- DeTail: reducing flow completion time tail: again the same story, but now with a cross-layer approach...

Session 4: Measuring Networks (mostly by Cigdem Sengul from T-labs)
-----------------------------
- Inferring visibility: inference techniques to try to guess which paths cross or do not cross
a given AS. Very focused on the inference techniques, not very much on whether they work.
- Anatomy of a Large European IXP: The paper emphasizes once more how the Internet is not
what we think it is: with hypergiants and CDNs (content delivery networks), it is getting flatter.
What is interesting is that data from a single IXP captures all we know of the Internet
(from several BGP-based studies and measurement data) and adds to it by showing the
vast number of connections in the Internet.
- Measuring and Fingerprinting Click-Spam in Ad Networks, by Vacha Dave et al. from UT Austin and
MSR India. For me, this was the best and most enjoyable presentation of the entire conference.
In this work, the authors presented a measurement methodology for identifying click-spam in
advertisement networks, and dug into the data to identify fraudulent activities. Their impressive
results show the pervasiveness of click-spam, especially in the mobile advertising context,
which is interesting not only for an Internet researcher but also for an Internet user. The
authors also warn that this is an open problem which they do not expect to go away for a long time.

Session 5: Data Centers: Resources Management
---------------------------------------------
- FairCloud: sharing the network in Cloud computing
- The only constant in change: incorporating time-varying network reservations in data centers
- It's not easy being green: tradeoff between access latency, carbon footprint, and electricity 
costs.

Session 6: Wireless and Mobile Networking
-----------------------------------------
- Skipped.

Poster and demo session (2)
---------------------------
Again posters and demo's.

Session 7: Best of CCR
-----------------------
This session was dedicated to the best of CCR talks, where the best papers from CCR were
presented by their authors. The papers were (1) Forty Data Communication Research Questions by
Craig Partridge, (2) Extracting Benefit from Harm: Using Malware Pollution to Analyze the 
Impact of Political and Geophysical Events on the Internet by Albert Dainotti et al and (3) 
The Collateral Damage of Internet Censorship by DNS Injection, an anonymous submission presented 
by Philip Levis.

Session 7: Network Formalism and Algorithmics
---------------------------------------------
- Perspectives on Network Calculus: nice tutorial and update on the state-of-the-art.
- Abstractions for network update: using a fine-grained flow-level abstraction to apply
network updates in such a way that packets won't be stuck between network states.
- Pre-classifier to reduce TCAM consumption: the main drawback of TCAMs is their power consumption.
This paper relies on a pre-classifier to be able to switch off part of the regions of the TCAM.

Session 8: Streaming and Content Networking
-------------------------------------------
- ShadowStream: Adding performance evaluation to the capabilities of a streaming platform.
- Case for Coordinated Internet-scale control plane: Conviva marketing their data and
selling the case for a black-box control plane based on this data. Audience did not buy it
from the questions.
- Optimizing cost and performance for multihoming: redirecting users to improve QoE. Not
very convincing, as it is again as black-box as the previous one.

Session 9: Routing
------------------
- Private and verifiable interdomain routing decisions: a system to help the peers of a network
prove that it propagates the wrong routes.
- LIFEGUARD: practical repair of persistent route failures: Assuming that one is able to
locate connectivity failures, the paper proposes to help ISPs through poisoning of the
faulty paths. Very incremental compared to previous work and hard to buy...
- On Chip Networks from a Networking Perspective: Congestion and Scalability in Many Core
Interconnects, by George Nychis et al. from CMU and Microsoft Asia. The paper is indeed
interesting and shows how certain networking solutions do or do not apply to on-chip
networks. However, I am still wondering what the exact takeaways might be for the networking
community.

Session 10: Data Centers: Network Resilience
--------------------------------------------
- NetPilot: automating datacenter network failure detection: deactivate and restart offending
equipment.
- Surviving failures in bandwidth-constrained datacenters: exploiting traffic patterns to
improve behavior under failure.
- Mirror Mirror on the Ceiling: Flexible Wireless Links for Data Centers, by Xia Zhou et al. from
UC Santa Barbara. Again an interesting marriage between different topics: the use of 60 GHz
wireless links. However, while the mirror idea for reducing interference in these networks is
very interesting for wireless networks, it is not clear how data centers will benefit
from such comparably low-capacity wireless links. Furthermore, scheduling of multiple concurrent
links seems to be an unsolved issue.

Overall, while SIGCOMM papers are extremely strong in their evaluation and execution, few are
really inspiring or tackle fundamental issues in data communications. Part of the problem
might be that talks are too centered on the marketing of the paper itself, and not enough on
the challenges in the area.

NoiseFloor 2012, Staffordshire University

From the 2nd to the 4th of May I attended NoiseFloor 2012, a festival of experimental electronic music and sonic art which took place for the third time at Staffordshire University, Stafford, from the 1st to the 4th of May. The schedule consisted of 9 concerts featuring acousmatic music and live electronics as well as two paper sessions related to performance and composition practices and music computing. The festival provided a good opportunity to establish contacts in the area of electronic composition and music technology. In terms of my PhD research at QMUL, I organised a meeting with Dr Eric Lyon from SARC, Belfast, an expert on Max/MSP, the graphical programming language my collaborative interactive music system is based on. Eric, who is currently working on a book about Max programming, gave me some expert insight into advanced practices, including the programming of custom objects in Java and C++. Also, in his paper presentation on an image-to-spatialisation algorithm used in a composition presented in one of the concerts, he demonstrated his impressive ability to combine different audio programming languages in skilful ways. Moreover, I had the opportunity to discuss my research on collaborative music making using interactive technology with several composers and performers attending the festival, such as BEER (Birmingham Ensemble for Electroacoustic Research). Besides that, I presented a paper on musical improvisation entitled "Improvising with a Black Box – A Perspective on No-Input Music" that resulted from a research project at my previous institute, IEM Graz.

Future Network Technologies Research and Innovation in HORIZON2020

Our faculty have been invited to present their vision at a workshop that will take place on 29th June in Brussels, gathering ideas for HORIZON2020 Future Networks Research.

Prof. Steve Uhlig & Dr. Hamed Haddadi, Queen Mary, University of London, UK.

Innovation for the Internet: the need to engage all stakeholders

 ABSTRACT

The Internet is evolving at a significant pace due to new usage trends and platforms such as mobile devices, social media, streaming networks and content delivery platforms. Within the next EU framework, researchers need to focus on future trends, devices and usage habits, and strategically align their research to support those needs. In this document, we propose a number of challenges related to the new interactions between different stakeholders. We also discuss how today's Internet ecosystem requires revisiting not only the functionalities of the network, but also rethinking the different business models that will shape the future Internet. We further suggest that the societal relevance of the Internet should be better supported by the Horizon 2020 agenda, and encourage future projects to have wider and more specific public engagement and community outreach plans, engaging all stakeholders such as user communities, industrial bodies, the research community, policy makers and the Internet governing bodies.

 

Motivation: Today’s changing Internet ecosystem

Today’s Internet [1] differs significantly from the one that is described in popular textbooks [2], [3], [4]. The early commercial Internet had a strongly hierarchical structure, with large transit Internet Service Providers (ISPs) providing global connectivity to a multitude of national and regional ISPs [5].  Most of the applications/content was delivered by client-server applications that were largely centralized. With the recent advent of large-scale content distribution networks (CDNs), e.g., Akamai, Youtube, Yahoo, Limelight, and One Click Hosters (OCHs), e.g., Rapidshare, MegaUpload, the way the Internet is structured and traffic is delivered has fundamentally changed [1].

 

Today, the key players in the application and content delivery ecosystem, e.g., cloud providers, CDNs, OCHs, data centers and content sharing websites such as Google and Facebook, often have direct peerings with Internet Service Providers or are co-located within ISPs. Application and content delivery providers rely on massively distributed architectures based on data centers to deliver their content to the users. Therefore, the Internet structure is not as strongly hierarchical as it used to be [1].

 

These fundamental changes in application and content delivery and Internet structure have deep implications on how the Internet will look like in the future. Hereafter, we describe how we believe that three different aspects of the Internet may lead to significant changes in the way we need to think about the forces that shape the flow of traffic in the Internet. Specifically, we first describe how central DNS has become as a focal point between application/content providers and ISPs. Next, we discuss how software-defined networking may change the ability of many stakeholders to influence the path that the traffic belonging to specific flows will follow across the network infrastructure. Finally, we discuss how the distributed nature of existing application and content delivery networks will, together with changes within the forwarding/routing, enable much more advanced handling of the traffic, on a much finer granularity compared to the current Internet.

 

Challenge 1: DNS and Server Redirection

 

The Domain Name System (DNS) was originally intended to provide a naming service, i.e., one-to-one mappings between a domain name and an IP address. Since then, DNS has evolved into a highly scalable system that fulfils the very stringent needs of applications in terms of its responsiveness [6,7,8]. Today, the DNS system is a commodity infrastructure that allows applications and content providers to map individual users to servers. This behaviour diverges from the original purpose of deploying DNS [10]. As application and content delivery infrastructures control how DNS is used to map end-users to their servers, the transport network, namely ISPs, has very limited control as to how traffic flows across the Internet [31]. Note that the case of DNS is a specific instance of a more general class of mapping systems for networked applications, such as trackers used in P2P or Locator/ID split approaches, e.g., LISP. Whatever the actual mapping system being used, the use of DNS by application/content providers is a sign that network-aware application optimization approaches are needed. P4P as well as Application-layer Traffic Optimization (ALTO) are possible solutions for this. Direct CDN-ISP collaboration is another way of ensuring that the application side and the network collaborate to provide the best possible service to the end-users in a cost-efficient manner [32].

 

Challenge 2: Software-defined networking

 

Applications and content are not the only place where an Internet (r)evolution is taking place. Thanks to a maturing market that is now close to "carrier grade" [13,14,15,16,17], the deployment of open-source-based routers has significantly increased during the last few years. While these devices do not compete with commercially available high-end switches and routers with respect to reliability, availability and density, they are fit to address specialized tasks within enterprise and ISP networks. Even PC-based routers with open source routing software are evolving fast enough to foresee their use outside research and academic environments [18,19,20].

 

The success of open-source routing software is being paralleled by increasing virtualization, not only on the server side, but also inside network devices. Server virtualization is now followed by network virtualization, made possible thanks to software-defined networking, e.g., OpenFlow [21], which exposes the data path logic to the outside world. The model of network devices controlled by proprietary software tied to specific hardware will slowly but surely be made obsolete. Innovation within the network infrastructure will then be possible. A decade ago, IP packets strictly followed the paths decided by routing protocols. Tomorrow, together with the paths chosen by traditional routing protocols, a wide range of possibilities will arise to customize not only the path followed by specific traffic, but also the processing that this traffic undergoes. Indeed, specific actions that are statically performed today by specialized middleboxes placed inside the network, e.g., NAT, encryption, DPI, will be implemented on-path if processing capabilities happen to exist; otherwise the traffic will be dynamically redirected to close-by computational resources. This opens a wide range of applications that could be implemented almost anywhere inside the network infrastructure.

 

Fusing the transport network and applications/content

 

As content is moving closer to the end-user for improved quality of experience and the infrastructure opens up to unprecedented control and flexibility, the old business model of hierarchical providers and customer-provider relationships is hardly viable. Nowadays, delivering applications and content to end-users is becoming a less and less profitable business, except for the few able to capitalize on the revenues from advertising, e.g., Google, Facebook. On the other side, network infrastructure providers struggle to provide the necessary network bandwidth and low latency for these applications, at reasonable costs. The consequence of more and more limited ISP profit margins is a struggle between content providers and the network infrastructure to gain control of the traffic.

 

This struggle stems from fundamental differences in the business models of application/content providers and ISPs. Today, application/content providers, for example through DNS tweaking, decide about the flow of the traffic by properly selecting the server from which a given user fetches some content [8,22,23]. This makes application/content delivery extremely dynamic and adaptive. On the ISP side, most traffic engineering relies on changing the routing configuration [24,25,26]. Tweaking existing routing protocols is not only dangerous, due to the risk of misconfigurations [27], routing instabilities [28] and convergence problems [29,30], but is simply not adequate for choosing paths at the granularity of applications and content.

 

Industry and academia must join forces to address the challenges posed by the evolving Internet. We believe that the three research areas above need critical input from the community in order to enable a truly content-centric Internet. First, even after more than two decades of deployment and evolution, the DNS is still poorly understood: it is much more than a naming system, it is a critical mapping system and a critical point in the application/content distribution arena. Second, software-defined networking opens a wide range of possibilities that would transform the current dumb pipes of the Internet core into a flexible and versatile infrastructure. It also gives researchers the ability to inject intelligence inside the network without having to think about how it will affect a whole range of legacy protocols.

 

One way forward is to enable the different stakeholders to work together, e.g., to let ISPs collaborate with application/content providers [31,32]. This can be achieved, for example, by exploiting the diversity in content location to ensure that the ISP’s network engineering is not made obsolete by content-provider decisions [31,32], or the other way around. Another option in which we believe is to leverage the flexibility of network virtualization to make the infrastructure much more adaptive than today’s static provisioning [33].
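A minimal sketch of the collaboration idea, with invented numbers only: the content provider treats several replicas as equivalent, the ISP supplies a path-cost ranking (e.g., derived from its IGP view), and the provider uses that ranking to break ties instead of ignoring it. This is a toy illustration of the kind of cooperation advocated in [31,32], not their actual system.

# Candidate replicas the content provider considers equivalent (hypothetical).
candidates = ["10.0.1.5", "10.0.2.5", "10.0.3.5"]

# Path cost per candidate as seen by the ISP, e.g. derived from its IGP view
# (illustrative values only).
isp_cost = {"10.0.1.5": 30, "10.0.2.5": 5, "10.0.3.5": 12}

# Current load per replica as seen by the content provider (illustrative).
provider_load = {"10.0.1.5": 0.4, "10.0.2.5": 0.9, "10.0.3.5": 0.5}

def pick_replica(candidates, isp_cost, provider_load, max_load=0.8):
    """Prefer the ISP's cheapest path among replicas that are not overloaded."""
    usable = [c for c in candidates if provider_load[c] <= max_load]
    pool = usable or candidates      # if everything is loaded, ignore the cap
    return min(pool, key=lambda c: isp_cost[c])

print(pick_replica(candidates, isp_cost, provider_load))  # -> "10.0.3.5"

This way neither side’s engineering is undone by the other: the provider keeps control of load balancing, while the ISP’s view of path cost still influences where the traffic lands.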

 

New Internet business models and privacy

 

The networking research community has been witnessing explosive growth in the adoption of wireless devices such as smartphones and tablets. This fertile new market has been fueled by applications and games brought through multiple marketplaces of third-party developers. These marketplaces today rely on “App Stores” provided and controlled by device or operating-system manufacturers such as Apple and Google, recently joined by Facebook. At the heart of this trade lies a particular revenue model: provide attractive content and applications, and in return benefit from a trusted ecosystem built from a large number of users. The majority of these ecosystems revolve around targeted advertising and the use of personal information. Several recent proposals from the networking and social computing research community aim at enabling marketplaces for personal information [34,35].

 

It has been suggested that personal data is the new currency of the Internet. This highlights the urgent need to understand privacy issues, which requires engagement with policy makers and investment in new methods to create federated marketplaces for resources and data.

 

Engaging all stakeholders

 

The deep changes we discussed create unprecedented opportunities for industry and researchers to develop new solutions that will address not only relevant operational challenges, but also potentially business-critical ones. The ossification of the Internet protocols does not mean that the Internet is not evolving. The Internet has changed enormously over the last decade, and will continue to do so, no matter what. What we observe today is a convergence of applications/content and network infrastructure that questions a model of the Internet that used to separate two stakeholders: application/content infrastructures on the one side and a dumb transport network on the other.

 

The fundamental changes in the Internet lead to fundamental questions about the possible directions in which the Internet might be going, not only at a technical level but also from a business perspective. These are societal questions that demand answers for the sake of Internet governance, and to ensure that the infrastructure serves the purposes of society as a whole rather than of a few business players. Emphasis must also be placed on engaging users as the focal point of the ecosystem, not only business stakeholders.

 

 

Active Engagement with the European Community and Beyond

 

Traditionally, EU projects in the networking area have not been strongly urged to engage with the public, focusing their attention instead on the impact for European industry. Given the societal relevance of the Internet in supporting the digital economy, we encourage future projects to have wider and more specific public engagement and community outreach plans, engaging user communities, industrial bodies, the research community, policy makers and the Internet governing bodies. This would encourage working beyond the usual outputs in the form of periodic reports and standard workshops that do not reach the relevant audience. Re-focusing the dissemination and impact criteria during project evaluation would incentivize projects to target long-term growth and innovation in Europe. We feel that today impact and dissemination mostly serve short-term industrial or business use-cases, heavily biased by industrial partners during the review of project proposals.

 

Lastly, we encourage the inclusion of research and development organisations in China, India, Brazil and similar developing countries, which are shaping future network usage trends. We now live in a globalized world, meaning that EU projects should compete with their US and Chinese counterparts, both in terms of agenda and in terms of reach and impact.

 

[1] C. Labovitz, S. Iekel-Johnson, D. McPherson, J. Oberheide, and F. Jahanian, “Internet Inter-Domain Traffic,” in Proc. of ACM SIGCOMM, 2010.

[2] K. Claffy, H. Braun, and G. Polyzos, “Traffic Characteristics of the T1 NSFNET backbone,” in Proc. of IEEE INFOCOM, 1993.

[3] K. Thompson, G. Miller, and R. Wilder, “Wide-Area Internet Traffic Patterns and Characteristics,” IEEE Network Magazine, 11(6), November/December 1997.

[4] W. Fang and L. Peterson, “Inter-AS Traffic Patterns and their Implications,”  in Proc. of IEEE Global Internet Symposium, 1999.

[5]  L. Subramanian, S. Agarwal, J. Rexford, and R. Katz, “Characterizing  the Internet Hierarchy from Multiple Vantage Points,” in Proc. of IEEE INFOCOM, 2002.

[6] B. Krishnamurthy, C. Wills, and Y. Zhang, “On the Use and Performance of Content Distribution Networks,” in Proc. of ACM IMW, 2001.

[7] R. Krishnan, H. Madhyastha, S. Srinivasan, S. Jain, A. Krishnamurthy, T. Anderson, and J. Gao, “Moving Beyond End-to-end Path Information to Optimize CDN Performance,” in Proc. of ACM Internet Measurement Conference, 2009.

[8] T. Leighton, “Improving Performance on the Internet,” Communications of the ACM, 52(2):44–51, 2009.

[9] J. Jung, E. Sit, H. Balakrishnan, and R. Morris, “DNS Performance and the Effectiveness of Caching,” IEEE/ACM Trans. Netw., 10(5):589–603, 2002.

[10] P. Vixie, “What DNS is Not,” Commun. of the ACM, vol. 52, no. 12, 2009.

[11] B. Ager, W. Muehlbauer, G. Smaragdakis, and S. Uhlig, “Comparing DNS Resolvers in the Wild,” in Proc. of ACM Internet Measurement Conference, 2010.

[12] C. Contavalli, W. van der Gaast, S. Leach, and D. Rodden, “Client IP Information in DNS requests,” IETF draft, work in progress, draft-vandergaast-edns-client-ip-00.txt, Jan 2010.

[13] “Quagga Routing Suite,” http://www.quagga.net.

[14] M. Handley, O. Hodson, and E. Kohler, “XORP: an Open Platform for Network Research,” ACM Comp. Comm. Rev., vol. 33, no. 1, 2003.

[15] J. Edwards, “Enterprises Cut Costs with Open-source Routers,” http://www.computerworld.com/s/article/9133851, 2009.

[16] “IP Infusion ZebOS,” http://www.ipinfusion.com/.

[17] Arista Networks, “EOS: An Extensible Operating System,” www.aristanetworks.com/en/EOS, 2009.

[18] E. Kohler, R. Morris, B. Chen, J. Jannotti, and F. Kaashoek, “The Click Modular Router,” ACM Trans. Comput. Syst., 18(3):263– 297, August 2000.

[19] N. Egi, A. Greenhalgh, M. Handley, M. Hoerdt, F. Huici, and L. Mathy, “Towards High Performance Virtual Routers on Commodity Hardware,” in Proc. of ACM CoNEXT, 2008.

[20] M. Dobrescu, N. Egi, K. Argyraki, B. Chun, K. Fall, G. Iannaccone, A. Knies, M. Manesh, and S. Ratnasamy, “RouteBricks: Exploiting Parallelism to Scale Software Routers,” in Proc. of ACM SOSP, 2009.

[21] N. McKeown, T. Anderson, H. Balakrishnan, G. Parulkar, L. Peterson, J. Rexford, S. Shenker, and J. Turner, “OpenFlow: Enabling Innovation in Campus Networks,” ACM Comp. Comm. Rev., 2008.

[22] C. Huang, A. Wang, J. Li, and K. W. Ross, “Measuring and Evaluating Large-scale CDNs,” in Proc. of ACM Internet Measurement Conference, 2008. Paper withdrawn at Microsoft request.

[23] S. Triukose, Z. Al-Qudah, and M. Rabinovich, “Content Delivery Networks: Protection or Threat?” in Proc. of ESORICS, 2009.

[24] B. Fortz and M. Thorup, “Internet Traffic Engineering by Optimizing OSPF Weights,” in Proc. of IEEE INFOCOM, 2000.

[25] B. Fortz and M. Thorup, “Optimizing OSPF/IS-IS Weights in a Changing World,” IEEE Journal in Selected Areas in Communications, 20(4):756–767, 2002.

[26] Y. Wang, Z. Wang, and L. Zhang, “Internet Traffic Engineering Without Full Mesh Overlaying,” in Proc. of IEEE INFOCOM, 2001.

[27] R. Mahajan, D. Wetherall, and T. Anderson, “Understanding BGP Misconfigurations,” in Proc. of ACM SIGCOMM, 2002.

[28] C. Labovitz, G. R. Malan, and F. Jahanian, “Internet Routing Instability,” in Proc. of ACM SIGCOMM, 1997.

[29]  T. Griffin and G. Wilfong, “An Analysis of BGP Convergence Properties,” in Proc. of ACM SIGCOMM, 1999.

[30] C. Labovitz, A. Ahuja, A. Bose, and F. Jahanian, “Delayed Internet Routing Convergence,” in Proc. of ACM SIGCOMM, 2000.

[32] Ingmar Poese, Benjamin Frank, Bernhard Ager, Georgios Smaragdakis, Steve Uhlig, Anja Feldmann, “Improving Content Delivery with PaDIS,” IEEE Internet Computing, 16(3):46-52, May-June 2012.

[33]  J. He, R. Zhang-Shen, Y. Li, C.-Y. Lee, J. Rexford, and M. Chiang,  “DaVinci: Dynamically Adaptive Virtual Networks for a Customized  Internet,” in Proc. of ACM CoNEXT, 2008.

[34] Hamed Haddadi, Richard Mortier, Steven Hand, Ian Brown, Eiko Yoneki, Derek McAuley and Jon Crowcroft: “Privacy Analytics”. ACM SIGCOMM Computer Communication Review, April 2012.

[35] Christina Aperjis and Bernardo A. Huberman. A Market for Unbiased Private Data: Paying Individuals According to their Privacy Attitudes. Available at http://dx.doi.org/10.2139/ssrn.2046861, April 2012.

Report from IEEE WCNC 2012

IEEE WCNC is the world’s premier wireless event, bringing together industry professionals, academics, and individuals from government agencies and other institutions to exchange information and ideas on the advancement of wireless communications and networking technology. The 2012 edition of the conference was held in Paris, France, from 1 to 4 April 2012. The paper “Dynamic Frequency Allocation and Network Reconfiguration on Relay Based Cellular Network” by Dr. Haibo Mei, Dr. John Bigham and Dr. Peng Jiang was accepted into the Mobile and Wireless Networks program of the conference. The abstract of this paper is in Appendix 1.

During the conference trip it was exciting to meet many outstanding researchers and developers in the wireless communications field. There were a number of keynote speeches and presentation sessions, and the author received valuable comments during the presentation. All these experiences will help the author’s further research in wireless communications. The works “On Fractional Frequency Reuse in Imperfect Cellular Grids”, “Energy-Efficient Subchannel Allocation Scheme Based on Adaptive Base Station Cooperation in Downlink Cellular Networks” and “Optimized Dual Relay Deployment for LTE-Advanced Cellular Systems” were of particular interest to the author.

This conference trip was valuable; it is always fantastic to have the chance to exchange ideas with other researchers.

 

Appendix 1:

Abstract—Relay Based Cellular Networks (RBCNs) are a key development in cellular networking technology. However, because of ever increasing demand and base station failure, RBCNs still suffer from user congestion and low resilience problems. This paper proposes two competing solutions: dynamic frequency allocation and antenna tilting to those problems. Firstly a new dynamic fractional frequency allocation algorithm and a heuristic antenna tilting algorithm are designed. The comparative benefits of each algorithm are investigated. Secondly, the additional benefits of applying the two algorithms sequentially or iteratively are evaluated. The benefits of iteratively integrating the two algorithms are more interesting. Such integration solution allows the two algorithms to be applied cooperatively. The evaluations are based on high demand scenarios and base station failure scenarios. The results show that for the high demand scenarios the new dynamic fractional frequency allocation algorithm is very powerful, and the advantage of antenna tilting is not large though present. However, for the BS failure case there is a marked additional benefit in antenna tilting. The integrated solution achieves significantly more benefit than simple sequential application of the two algorithms.

Report from EACL 2012

13th Conference of the European Chapter of the Association for Computational Linguistics

Day 1:

Workshop on Semantic Analysis in Social Media (SASN2012)

The first half of the day had some pretty interesting talks: on unsupervised part-of-speech tagging for social media, emotional stability on Twitter, speech act tagging for Twitter, topic classification for blogs. The second half was a bit less interesting (IMHO) as it focused more on tools/software, but note this one on predicting Dutch election results. (full workshop proceedings available online here)

Day 2:

Workshop on Computational Models of Language Acquisition and Loss

Mark Steedman’s keynote on CCG grammar induction from semantics was very interesting – there’s little information in the workshop abstract but see the related EACL conference paper. (full workshop proceedings available here).

Workshop on Unsupervised and Semi-Supervised Learning in NLP

Generally interesting for techniques, but mostly applied to standard text tasks (parsing, coreference resolution, etc., which I find hard to get excited about). But note the one on child language acquisition. (full workshop proceedings available here).

Main Conference:

The keynote speeches were great: Martin Cooke on how to make speech more intelligible without necessarily making it louder; Regina Barzilay on using reinforcement learning to learn language/semantics directly from task success; Ray Mooney on learning language from context (although I missed that one to come back & give revision lectures …)

Some other highlights for me: Heriot-Watt’s demo of their most recent POMDP dialogue system; Potsdam/Bielefeld’s experiments on improving NLU by using incremental pragmatic information; some nice stuff on unsupervised learning of semantic roles; and possibly the worst talk I have ever had the misfortune to sit through (I won’t link to it but it’s on paraphrase generation via machine translation, if you really want to find it. I’m sure the paper’s excellent).

Oh, and my & Stuart’s paper on Twitter emotion detection of course.

Full proceedings available here.

Report from the Passive and Active Measurement Conference 2012

http://pam2012.ftw.at/

PAM is the oldest Internet measurement conference, started in 2000. This year’s edition took place in Vienna in March 2012.

The keynote was given by Yuval Shavitt (Tel Aviv University) on “Internet Topology Measurement: Past, Present, Future”. Topology measurements are still an active area of research, as our visibility of the Internet topology is still limited, and subject to significant biases whose impact is still being investigated by the research community.

Session on Malicious Behavior
- Detecting Pedophile Activity in BitTorrent Networks: P2P networks are used for various illegal activities, including the distribution of pedophile content.
- Re-Wiring Activity of Malicious Networks: Malicious networks tend to lose their connectivity when their activities are spotted. This study looks at the visibility of network re-wiring.

Session on Traffic Evolution and Analysis
- Unmasking the Growing UDP Traffic in a Campus Network: UDP traffic is becoming more and more popular; in China, for example, it has been reported to be as high as 80% in some networks. This paper provides similar evidence from Korea.
- Investigating IPv6 Traffic—What happened at the World IPv6 Day? Despite efforts such as World IPv6 Day, there is still little IPv6 traffic in the Internet. This paper studies what happened during the World IPv6 Day based on two vantage points, a campus network in the US and a large IXP in Germany.
- An End-Host View on Local Traffic at Home and Work: This paper compares local and wide-area traffic from end-hosts connected to different home and work networks.
- Comparison of User Traffic Characteristics on Mobile-Access versus Fixed-Access Networks: Mobile traffic is growing, and we still do not know much about how it differs from traffic seen on wired networks.

Session on Evaluation Methodology
- SyFi: A Systematic Approach for Estimating Stateful Firewall Performance: Firewalls are pervasive in today’s Internet given the need to protect networks from attacks. This paper builds a predictive model of the throughput achieved by commercial firewalls.
- OFLOPS: An Open Framework for OpenFlow Switch Evaluation: Current OpenFlow implementations have different levels of maturity. This work proposes a framework to test OpenFlow implementations capabilities.
- Probe and Pray: Using UPnP for Home Network Measurements: UPnP is nowadays becoming a popular active measurement platform. This paper studies the limitations as well as the usefulness of this platform.

Session on Large Scale Monitoring
- BackStreamDB: A Distributed System for Backbone Traffic Monitoring Providing Arbitrary Measurements in Real-Time. Distributed approaches for traffic monitoring are the future. Nice piece of work.
- A Sequence-oriented Stream Warehouse Paradigm for Network Monitoring Applications. Mining large-scale data is a pain, and networking is no exception. This paper proposes SQL extensions to ease the monitoring of networks by allowing to express sequence-oriented queries in a declarative language.
- PFQ: a Novel Engine for Multi-Gigabit Packet Capturing With Multi-Core Commodity Hardware. Unleashing the power of multi-cores to deliver amazing packet copying to user-space apps!

Session on New Measurement Initiatives
- Difficulties in Modeling SCADA Traffic: A Comparative Analysis: One of the few measurements of M2M traffic. Nothing surprising in terms of traffic properties, given the small scale of these traffic traces.
- Characterizing delays in Norwegian 3G networks: Yet another study of 3G networks. The talk generated heated comments on the limitations of the methodology.
- On 60 GHz Wireless Link Performance in Indoor Environments: Studies the use of 60 GHz wireless indoors and the conditions under which it works (LOS and NLOS). Pretty positive results…
- Geolocating IP Addresses in Cellular Data Networks: Very interesting geolocation paper that confirms previous work on the issues with geolocation databases.

Session on Reassessing Tools and Methods
- Speed Measurements of Residential Internet Access. How do bandwidth probing tools compare when using them to measure the available bandwidth of residential users?
- One-way Traffic Monitoring with iatmon. Using one-way delay measurements to track changes in traffic behavior and classifying different traffic sources.
- A Hands-on Look at Active Probing using the IP Prespecified Timestamp Option. IP options are not very widely used, despite their potential applicability. This work shows more evidence for the discrepancy between RFC and implementations.

Application Protocols
- Xunlei: Peer-Assisted Download Acceleration on a Massive Scale. A must-read: one of the pieces of the future in content delivery platforms!
- Pitfalls in HTTP Traffic Measurements and Analysis. What you should not trust from packet-level data when analyzing HTTP traces.
- A Longitudinal Characterization of Local and Global BitTorrent Workload Dynamics. Nice study of different types of content delivered through BitTorrent (file size, throughput, type of content).

Perspectives on Internet Structure and Services
- Exposing a Nation-Centric View on the German Internet – A Change in Perspective on the AS Level. Trying to define the AS-level ecosystem of Germany, still unclear whether it makes any sense, even though many defense agencies would like to be able to define it.
- Behavior of DNS’ Top Talkers, a .com/.net View. First ever analysis of the .com and .net TLD servers. Very interesting observations about IPv6 DNS, as well as the set of DNS resolvers that are the top talkers with the TLD servers. Must-read for DNS.
- The BIZ Top-Level Domain: Ten Years Later. Thinking about what will happen with the .biz domain given the last 10 years of its use, especially defensive registrations. Must-read for DNS.

Report from IEEE INFOCOM 2012

http://www.ieee-infocom.org/program.html

Day 1: Mini-conference
Mini-conference papers are those that were discussed but not accepted into the main conference. They are borderline papers, though sometimes more thought-provoking than papers accepted at the main conference. The variety of topics among these mini-conference papers is also wider than among those accepted at the main conference.

Day 2: Conference opening, keynote, and afternoon sessions
The conference received about 1,500 submissions, of which fewer than 300 were accepted. A few awards were given during the opening. The keynote was given by Broadcom’s CTO on the topic of data centers, and was very focused on Broadcom’s switch product line. Very little was said about management or newer topics such as OpenFlow, despite their interest to the research community.

The main conference was organized into 6 parallel tracks. Ten sessions addressed sensor network design, showing the increased importance of this topic, which is strongly related to the Internet of Things. A significant fraction of the papers deal with non-wired communications, such as sensor, wireless and mobile communications. Hot topics such as data centers, cloud/grid, social computing, energy efficiency and software-defined radio are of course getting more attention than they used to. Surprisingly, the only missing topic is optical communications.

Day 3
The sessions on cloud/grid were interesting, covering many aspects of the issues in cloud computing. INFOCOM being a rather applied-theory conference, most of the papers address topics from an optimization, game-theory or performance-evaluation viewpoint. The session on network optimization was the most interesting of the day in my opinion, with 3 papers from Google about traffic engineering on the Google network that are worth reading.

Day 4
This last day of the conference was very interesting, with sessions on Internet measurement, Future Internet architectures, and Internet routing and router design. Multiple very interesting papers, such as:
- A Hybrid IP Lookup Architecture with Fast Updates: this paper proposes to speed up IP lookups by using both TCAM and SRAM/FPGA, ensuring that route updates have limited disruptive impact on lookups (see the sketch after this list).
- Transparent acceleration of software packet forwarding using netmap: bypassing the TCP/IP stack through simplified drivers that allow applications to speak directly with the NICs.
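For orientation only, the core operation the first of these papers accelerates is longest-prefix match on the forwarding table. Below is a minimal software binary trie in Python; it shows the semantics of the lookup, not the paper’s TCAM/SRAM/FPGA split nor netmap’s driver-level fast path. Prefixes and next hops are invented.

import ipaddress

class LpmTrie:
    """A plain binary trie keyed on IPv4 prefix bits."""

    def __init__(self):
        self.root = {}   # node = {"0": child, "1": child, "hop": next_hop}

    def insert(self, prefix: str, next_hop: str) -> None:
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[: net.prefixlen]
        node = self.root
        for b in bits:
            node = node.setdefault(b, {})
        node["hop"] = next_hop

    def lookup(self, addr: str):
        # Walk the trie, remembering the last next hop seen: that is the
        # longest (most specific) matching prefix.
        bits = format(int(ipaddress.ip_address(addr)), "032b")
        node, best = self.root, self.root.get("hop")
        for b in bits:
            if b not in node:
                break
            node = node[b]
            best = node.get("hop", best)
        return best

fib = LpmTrie()
fib.insert("10.0.0.0/8", "if1")
fib.insert("10.1.0.0/16", "if2")
print(fib.lookup("10.1.2.3"))   # -> "if2" (more specific prefix wins)
print(fib.lookup("10.9.9.9"))   # -> "if1"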