Domenico Vicinanza and Alessandra Scicchitano

Heavy Data, Light Measurements – network troubleshooting in a distributed computing environment 

Complex, heavily distributed computing environments like clouds and grids are increasingly reliant on complex network setups that consist of multiple layers, span several domains and use different technologies.

Network issues can have diverse impacts, with severe consequences for the actual user experience, depending on where they occur in the chain between the user application and the physical connections.

Network troubleshooting is therefore a crucial task in managing a distributed computing centre or a grid/cloud infrastructure. The reliability of the computing facilities is directly related to the underlying network, which has to be properly designed and set up, but also equipped with adequate monitoring and troubleshooting tools. This is particularly true when the data centre of the computing infrastructure is involved in capturing, processing, storing, sharing and transferring large and complex data sets (Big Data).

Network monitoring and troubleshooting activities in a high-performance computing infrastructure should be as non-disruptive as possible: they should not take up precious resources or interfere with the traffic generated by the computing facility. Lightweight network measurement is therefore the preferred strategy.

A network measurement can be defined as “lightweight” if it does not compete with the real network traffic, i.e. with the user-generated data being transferred or shared by users or applications. Examples of lightweight measurements are:
- 1-way delay/RTT
- 1-way delay variation (jitter)
- Packet loss
The presentation will focus on the measurement of these three metrics and their impact on network performance as perceived by end users.
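As a rough illustration of how these three metrics relate to each other, the sketch below derives them from a small set of timestamped probe packets. It is only a minimal example under stated assumptions and not the method used by any particular tool: the function name `summarize_probes` and the input layout are hypothetical, sender and receiver clocks are assumed to be synchronised (a prerequisite for meaningful 1-way delay), and delay variation is taken here as the mean absolute difference between consecutive delays, which is only one of several possible definitions.

```python
from statistics import mean


def summarize_probes(probes):
    """Summarise lightweight probe results.

    probes: list of (send_time, recv_time) pairs in seconds, where
    recv_time is None for a probe that never arrived (hypothetical layout).
    Assumes sender and receiver clocks are synchronised.
    """
    if not probes:
        raise ValueError("at least one probe is required")

    # 1-way delay of every probe that was actually received
    delays = [rx - tx for tx, rx in probes if rx is not None]

    # Packet loss: fraction of probes that never arrived
    loss = 1 - len(delays) / len(probes)

    # Delay variation (jitter): mean absolute difference between
    # consecutive 1-way delays (one possible definition)
    if len(delays) > 1:
        jitter = mean(abs(b - a) for a, b in zip(delays, delays[1:]))
    else:
        jitter = 0.0

    return {
        "mean_delay": mean(delays) if delays else None,
        "jitter": jitter,
        "loss": loss,
    }


# Four probes sent one second apart; the third one is lost.
example = [(0.0, 0.010), (1.0, 1.012), (2.0, None), (3.0, 3.011)]
result = summarize_probes(example)
```

A handful of small probe packets like these adds negligible load to the link, which is what keeps this kind of measurement from competing with user traffic.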

The talk will also present and discuss tools and monitoring infrastructures such as perfSONAR MDM.

Domenico Vicinanza and Alessandra Scicchitano's Biographies

Domenico Vicinanza
Domenico Vicinanza works at DANTE, Cambridge, UK as a product manager. He received his MSc and PhD degrees in Physics and is a professional music composer. He worked for seven years as a Research Associate at the University of Salerno and Roma Tre and as a Scientific Associate at CERN. His activities during this time included LHC Computing Grid site administration, system administration, IT resources management, network troubleshooting, grid computing, services support and teaching. He is also involved in the application of distributed computing and advanced networking technologies to music and visual arts as the technical coordinator of the ASTRA (Ancient instrument Sound/Timbre Reconstruction Application) and Lost Sounds Orchestra projects for the reconstruction of musical instruments on GÉANT and EUMEDCONNECT.

Alessandra Scicchitano
Alessandra Scicchitano received a Dr.Ing. degree in Computer Engineering and a Ph.D. degree in System and Networking in 2004 and 2007 respectively from UniCal, Italy. From 2005 to 2007, she was a visiting researcher at Politecnico di Torino in a joint program for her PhD studies, working on scheduling algorithms for IQ switches. From 2007 to 2009 she held a PostDoc position at IBM Zurich Research Lab, working on the IEEE 802.1au standard and on algorithms for adaptive routing in HPC systems. Today she is part of the Peta Solutions team at SWITCH, where her main focus is on virtualization and E2E performance.